Re: [tor-dev] Proposal: Check Maxmind GeoIP DB before distributing



Hi,

On 30.06.2018 13:53, Jaskaran Singh wrote:
> 5. Dealing with false positives
> Maxmind calculates geolocation of an IP addr using WHOIS records,
> Reverse DNS etc. It claims to have precision rate of 99.5% on country
> level. The other 0.5% is more likely to be those IP addresses for which
> neither WHOIS record nor Reverse DNS are setup.
>
> A very large percentage of Tor Nodes are run from datacenters, which
> usually have all their records set up. It's highly unlikely for an IP
> address belonging to a datacenter to be mapped to a wrong location.
>
> Hence, false positives would be very few, and can be safely ignored
> after a simple manual/scripted investigation.

We measured Tor relay locations a while ago using ICMP RTT measurements from multiple servers located in Europe, North America, Asia, and Oceania. Taking the minimum RTT for each connection*, we applied multilateration to estimate the location of each relay. Although this approach is noisy because network conditions and routes vary, it still gives a good estimate of a relay's actual position.

We compared our estimated ICMP relay locations with the GeoIP information:
- our test set consisted of a full consensus
- we conducted the measurements within 5 days and repeated reference experiments a month later to test the stability of results
- we sent 500 pings per relay from 8 remote servers and repeated the measurements multiple times
- we use the minimum RTT as input for the multilateration
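
To illustrate the multilateration step, here is a minimal Python sketch. It is not the code we ran: the conversion factor KM_PER_RTT_MS (roughly two thirds of the speed of light, halved for the round trip), the least-squares cost and the coarse-to-fine grid search are all assumptions chosen for readability, and the names haversine_km and multilaterate are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0
# Assumption: ~200 km per ms one-way in fibre, i.e. 100 km of distance per ms of RTT.
KM_PER_RTT_MS = 100.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def multilaterate(landmarks, min_rtts_ms):
    """Estimate (lat, lon) from landmark positions and the minimum RTT to each.

    landmarks: list of (lat, lon) of the measurement servers.
    min_rtts_ms: minimum RTT in ms observed from each server, same order.
    """
    ranges_km = [rtt * KM_PER_RTT_MS for rtt in min_rtts_ms]

    def cost(lat, lon):
        # Sum of squared differences between geometric and RTT-derived distance.
        return sum(
            (haversine_km(lat, lon, la, lo) - r) ** 2
            for (la, lo), r in zip(landmarks, ranges_km)
        )

    best = (0.0, 0.0)
    half_lat, half_lon, step = 90.0, 180.0, 5.0
    while step >= 0.01:
        n_lat = round(half_lat / step)
        n_lon = round(half_lon / step)
        candidates = [
            (best[0] + i * step, best[1] + j * step)
            for i in range(-n_lat, n_lat + 1)
            for j in range(-n_lon, n_lon + 1)
            if -90.0 <= best[0] + i * step <= 90.0
        ]
        best = min(candidates, key=lambda p: cost(*p))
        # Refine: search two old steps around the current best with a finer grid.
        half_lat = half_lon = 2 * step
        step /= 5.0
    return best
```

With real data the RTT-derived ranges tend to overestimate the distance (routes are not straight lines), so a weighted or constrained fit would probably be a better choice than this plain least-squares version.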

Results can be summarized as follows:
- the median location error is about 440 km
- 287 outliers are more than 2654 km away from the position that GeoIP suggests; this is ~4.6% of the tested relays
- the 75th percentile of the location error is above 1000 km, i.e. a quarter of the nodes differ from their GeoIP position by more than 1000 km
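
For reference, summary figures like these can be derived from a list of per-relay error distances along the following lines. This is only a sketch: the function name summarize_location_errors is hypothetical, errors_km is assumed to hold the distance between the estimated and the GeoIP position of each relay, and the outlier threshold is passed in by the caller because how it was chosen is not covered here.

```python
import statistics


def summarize_location_errors(errors_km, outlier_threshold_km):
    """Summarize per-relay location errors (km) against a fixed outlier threshold."""
    median_km = statistics.median(errors_km)
    # quantiles(n=4) returns the three quartile cut points; index 2 is the 75th percentile.
    p75_km = statistics.quantiles(errors_km, n=4, method="inclusive")[2]
    outliers = [e for e in errors_km if e > outlier_threshold_km]
    return {
        "median_km": median_km,
        "p75_km": p75_km,
        "outlier_count": len(outliers),
        "outlier_share": len(outliers) / len(errors_km),
    }
```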

We are currently repeating the experiments with 16 instead of 8 servers and are working on the evaluation to improve the location estimate.

We cannot take these results as ground truth, since the majority of GeoIP locations already give the actual country and continent a relay is in. Nevertheless, this is a good way to add an independent verification step. The location error of the outliers shows that some nodes actually run on a different continent, which is an important security issue if users want to avoid a certain country. The same applies to the relays beyond the 75th percentile, where the measurements also lead to updated country information for a significant set of relays.

We can conclude that yes, a large percentage of Tor nodes have OK records. But the number of false positives is not that low and, in my opinion, cannot be ignored. Besides adding an independent verification step, for which I suggest timing measurements and multilateration, location errors that lead to a different country code should be treated as updates (or the respective nodes should be flagged).

*This follows from the fact that no transmission can be faster than a certain physical limit, so the minimum RTT is the closest we can get to that limit.
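
As a small illustration of the footnote, the minimum RTT gives a hard upper bound on how far away a relay can be. The sketch below assumes the speed of light in vacuum as the limit; the function name max_distance_km is hypothetical, and a realistic bound would use a lower propagation speed.

```python
SPEED_OF_LIGHT_KM_PER_MS = 299.792458  # speed of light in vacuum, km per millisecond


def max_distance_km(rtt_samples_ms, speed_km_per_ms=SPEED_OF_LIGHT_KM_PER_MS):
    """Upper bound on the distance to a host, derived from its smallest observed RTT."""
    min_rtt_ms = min(rtt_samples_ms)
    # Half the round trip is the one-way time; nothing travels faster than the given speed.
    return (min_rtt_ms / 2.0) * speed_km_per_ms
```

For example, a relay whose minimum RTT from a server is 20 ms cannot be more than about 3000 km away from that server.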


Cheers,
Katharina