[Author Prev][Author Next][Thread Prev][Thread Next][Author Index][Thread Index]
Re: [tor-bugs] #29734 [Obfuscation/Snowflake]: Broker should receive country stats information from Proxy and Client
#29734: Broker should receive country stats information from Proxy and Client
-------------------------------------+--------------------------------
Reporter: cohosh | Owner: cohosh
Type: enhancement | Status: needs_revision
Priority: Medium | Milestone:
Component: Obfuscation/Snowflake | Version:
Severity: Normal | Resolution:
Keywords: snowflake, geoip, stats | Actual Points: 2
Parent ID: #29207 | Points: 1
Reviewer: ahf | Sponsor: Sponsor19
-------------------------------------+--------------------------------
Comment (by cohosh):
Replying to [comment:24 dcf]:
> I have just a few further changes to recommend.
Updated branch: https://github.com/cohosh/snowflake/compare/geoip
>
> *
https://github.com/cohosh/snowflake/blob/fb01c2d1c9d402cc4d96f01df2e6fb6cb37bc9a6/broker/broker.go#L221
> {{{
> strings.Split(r.RemoteAddr, ":")[0]
> }}}
> This will fail for IPv6 addresses. Better to use net.SplitHostPort and
check the error return. The host–port splitting could also happen inside
of `Metrics.UpdateCountryStats`.
Ack >.< thanks. Done!
> *
https://github.com/cohosh/snowflake/blob/fb01c2d1c9d402cc4d96f01df2e6fb6cb37bc9a6/broker/geoip.go#L196
> {{{
> _, err = io.WriteString(hash, scanner.Text())
> }}}
> A better way to do this may be to do `hashedFile :=
io.TeeReader(geoipFile, hash)` and then `scanner :=
bufio.NewScanner(hashedFile)` so that the hash is calculated as a side
effect of reading the file. There's no need to handle errors when writing
to a `hash.Hash` because it is documented to never error.
Done
> *
https://github.com/cohosh/snowflake/blob/fb01c2d1c9d402cc4d96f01df2e6fb6cb37bc9a6/broker/geoip.go#L14
> {{{
> Recognized line formats for IPv4 are:
> INTIPLOW,INTIPHIGH,CC
> and
> "INTIPLOW","INTIPHIGH","CC","CC3","COUNTRY NAME"
> }}}
> It looks like the code doesn't recognize the 5-element syntax, so it
should be omitted here, or, if it's a common format, documented as not
being supported.
Documented as unsupported so we know what our line of thought was in case
we want to support it later
>
> Okay. Let's replace the panic with a `log.Fatal` so the failure gets
logged.
>
Done
> > > * The `parseEntry` functions need error checking on
`geoipStringToIP` and `net.ParseIP`, or else they will store a `nil` that
will result in a panic later on.
[https://github.com/cohosh/snowflake/blob/4de81ed039ce28264a98a040a0fde2237844e779/broker/geoip.go#L93
This error] will be more useful if it includes a line number. Let me
suggest a different factoring of the geoip parsing code. Have `parseEntry`
be a plain function, not a method on `GeoIPv4Table`, and have it return
`(GeoIPEntry, error)` rather than inserting into the table itself.
GeoIPLoadFile can do the entry insertion, count lines and annotate the
error result from `parseEntry`.
> > The difficulty in refactoring it this way is that ipv4 and ipv6 tables
have differently formated entries that need to be parse differently.
> > I added error checking to parseEntry and have it return `(GeoIPEntry,
error)` as suggested, but leave it as a method on GeoIPv4Table and
GeoIPv6Table.
>
> I didn't explain this well. I meant that there should be separate
functions for the two formats e.g. `parseEntryIPv4` and `parseEntryIPv6`.
The existing `parseEntry` methods never refer to `table`, so they don't
need to be methods. But leaving them as methods is fine too.
Okay, I'm going to leave them as methods to avoid having two different
LoadFile functions for the different types of table.
>
> IMO annotating the error message with the problematic line should be
done in `GeoIPLoadFile`, not in `parseEntry`. This will eliminate the
duplication of the common `"Provided geoip file is incorrectly formatted"`
string and ensure that all the error paths get annotated. What I mean is,
in the `scanner.Scan()` loop, you can replace `return err` with `return
fmt.Errorf("Provided geoip file is incorrectly formatted: %s. Line is:
%+q")`.
Done, is there a better way of handling the error stubs in `parseEntry`
other than returning `fmt.Errorf("")`?
Replying to [comment:25 dcf]:
> Oh and it looks like country stats don't get incremented whenever
`GetCountryByAddr` doesn't find a match. I'm afraid this will make
analysis difficult if a large fraction of proxies aren't covered by the
geoip database, or the database is improperly installed, or something.
Could we count them with a country code of `"??"` or something?
>
>
https://github.com/cohosh/snowflake/blob/fb01c2d1c9d402cc4d96f01df2e6fb6cb37bc9a6/broker/geoip.go#L213
> `GetCountryByAddr` returns `(string, error)`, but failure to find an
entry in a database is not really an "error". It makes it seem like there
are three return stats possible (found, not found, error) when really
there are only two (found, not found). How about changing the signature to
> {{{
> func GetCountryByAddr(table GeoIPTable, ip net.IP) (string, bool)
> }}}
> where the second return value is an `ok` flag, like when indexing a map
or reading from a channel. The caller can then do the `"??"` substitution
whenever `!ok`.
Okay, so I've changed the GetCountryByAddr signature to what you've
suggested and then just removed the error return value from
UpdateCountryStats. I've also added setting the country to `"??"` in
UpdateCountryStats.
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/29734#comment:27>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
_______________________________________________
tor-bugs mailing list
tor-bugs@xxxxxxxxxxxxxxxxxxxx
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs