
[vidalia-svn] r1311: Add a geoip-spec.txt from arma. (trunk/doc)



Author: edmanm
Date: 2006-10-07 23:59:59 -0400 (Sat, 07 Oct 2006)
New Revision: 1311

Added:
   trunk/doc/geoip-spec.txt
Log:
Add a geoip-spec.txt from arma.


Added: trunk/doc/geoip-spec.txt
===================================================================
--- trunk/doc/geoip-spec.txt	                        (rev 0)
+++ trunk/doc/geoip-spec.txt	2006-10-08 03:59:59 UTC (rev 1311)
@@ -0,0 +1,116 @@
+$Id$
+
+0. Introduction.
+
+   Vidalia tracks geographic coordinates of the IP addresses for Tor
+   servers, so it can display them in its Network Map window. This
+   document describes the actual lookup and caching mechanism, and also
+   lays out some of the security questions and future directions.
+
+1. Fetching and caching.
+
+   When we learn one or more new descriptors, we check whether we have a
+   cached mapping from each server's IP address to a geographic location.
+   An example mapping is:
+
+       206.124.149.146,Bellevue,WA,US,47.6051,-122.1134
+
+   If we don't have a cached answer, we send an HTTP request to our
+   Perl script either on pasiphae or at CMU, asking about one or more
+   IP addresses.
+   [XXX URLs? URL to source code? Who operates them? What is their
+   logging policy? What is the format of the request and response?]
+   [Where does the data for the answers come from?]
+
+   The script returns a line like the one above. We cache it in an
+   unsorted text file called "~/.vidalia/geoip-cache" along with a
+   Unix timestamp:
+   [Is this filename correct on OS X/Windows too?]
+
+       206.124.149.146,Bellevue,WA,US,47.6051,-122.1134:1159123625
+
+   We load the cache file on startup, discard all entries that are
+   over a month old, and write a new version. After that the cache
+   file is append-only.
+
+   The requests are done over Tor, as ordinary SOCKS requests to the
+   local Tor client. Vidalia assumes that Tor is listening on port 9050.
+
+   [Todo: use GETCONF SocksPort SocksBindAddress instead of assuming 9050.]
+
+   Once we queue a request for geographic information, we wait
+   RESOLVE_QUEUE_DELAY (three seconds) before actually launching the
+   connection. That is, we queue requests until three seconds go by with no
+   more queuing. This way we lump them together and send one request
+   instead of tons of little ones.
+
+   If we don't get an answer, we don't retry -- if we don't have
+   geographic information for a server, it simply doesn't get mapped.
+
+2. Security and anonymity questions.
+
+   First, can the operator of the above URLs track the popularity and
+   spread of servers? Yes. Does this buy him anything? I'm not sure.
+
+   Second, because no end-to-end encryption/authentication is used, the
+   exit node can discover what is being requested -- and can modify the
+   answers that are sent back. What are the partitioning opportunities
+   in this scenario -- both passive partitioning to discover patterns
+   of behavior based on which descriptors have just been fetched, and
+   active partitioning to mislead the user into believing a given server
+   is at a certain set of coordinates?
+
+   Third, Vidalia 0.0.8 will accept and cache responses to questions
+   that it didn't ask. This probably aids partitioning attacks.
+
+3. Future directions.
+
+3.1. Encryption to/from the coordinate servers.
+
+   It would be smart to encrypt the queries and responses, to at least
+   limit the exposure. This could be done simply by running a Tor server
+   near each geoip service, and asking for the address
+
+       geoip.vidalia-project.net.foo.exit:80
+
+   Of course, this approach introduces more points of failure. A more
+   complex scheme would be for Vidalia to check first whether the
+   preferred exit server is running, and modify the address only when
+   it is.
+
+3.2. Tor servers could include geoip data in network statuses.
+
+   Rather than having separate geoip services that Vidalia maintains,
+   we could instead integrate the geoip data into the Tor network
+   status documents. The Tor directory authorities would learn this
+   information and the users would learn it through their ordinary
+   directory downloads.
+
+   This would make life easier for Vidalia, but it would also increase
+   the bandwidth overhead of network-status downloads -- rather than
+   caching the geoip information, users would fetch it at every update.
+
+3.3. Map networks, not individual IP addresses.
+
+   We should stop mapping individual IP addresses. For servers that have
+   dynamic IP addresses, we end up with something like
+
+       68.179.33.128,Toronto,ON,CA,43.6667,-79.4168:1155698575
+       68.179.33.129,Toronto,ON,CA,43.6667,-79.4168:1155696448
+       68.179.33.131,Toronto,ON,CA,43.6667,-79.4168:1157209955
+
+   Instead we should just cache
+
+       68.179.33.0/24,Toronto,ON,CA,43.6667,-79.4168:1155698575
+
+   Maxmind supposedly has a free IP-to-ASN database, but I'm not sure
+   if it includes prefix information. If not, yeah, just matching on
+   /24 might be easiest and sufficient.
+
+3.4. What else is geoip information for?
+
+   What other uses do we have for this information? Is it only useful
+   for drawing maps of the Tor network?
+
+   Once we start letting users control their circuits based on geographic
+   data, the security questions in Section 2 become more challenging.

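The batching rule in Section 1 of geoip-spec.txt (wait RESOLVE_QUEUE_DELAY
after the most recent queued request, then send everything at once) can be
sketched as follows. This is a minimal illustration, not Vidalia's code:
the class name ResolveQueue, its methods, and the duplicate suppression are
assumptions, and the caller supplies the clock so the logic stays testable.

```python
class ResolveQueue:
    """Collects IP addresses and releases them as one batch after a quiet period."""

    RESOLVE_QUEUE_DELAY = 3.0  # seconds, the value named in the spec

    def __init__(self):
        self._pending = []     # addresses waiting to be sent, in arrival order
        self._deadline = None  # earliest time at which the batch may be sent

    def enqueue(self, ip, now):
        # Assumption: asking twice about the same address is pointless,
        # so duplicates are dropped.
        if ip not in self._pending:
            self._pending.append(ip)
        # Every newly queued request restarts the quiet period.
        self._deadline = now + self.RESOLVE_QUEUE_DELAY

    def poll(self, now):
        """Return the batch if the quiet period has elapsed, else an empty list."""
        if self._deadline is None or now < self._deadline:
            return []
        batch, self._pending = self._pending, []
        self._deadline = None
        return batch
```

A caller would invoke poll() from a timer and, when it returns a non-empty
list, issue a single HTTP request covering every address in it.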

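The cache format and startup pruning described in Section 1 can be sketched
like this. It is an illustration under stated assumptions, not the actual
implementation: "a month" is taken to mean 30 days, no field is expected to
contain a comma, only IPv4 addresses appear, and the names CacheEntry,
parse_cache_line, and prune are hypothetical.

```python
from typing import NamedTuple, Optional

MAX_AGE_SECONDS = 30 * 24 * 60 * 60  # assumption: "a month" means 30 days


class CacheEntry(NamedTuple):
    ip: str
    city: str
    region: str
    country: str
    latitude: float
    longitude: float
    timestamp: int  # Unix time at which the answer was cached


def parse_cache_line(line: str) -> Optional[CacheEntry]:
    """Parse 'ip,city,region,country,lat,lon:timestamp'; None if malformed."""
    record, sep, stamp = line.strip().rpartition(":")
    fields = record.split(",")
    if not sep or len(fields) != 6:
        return None
    try:
        return CacheEntry(fields[0], fields[1], fields[2], fields[3],
                          float(fields[4]), float(fields[5]), int(stamp))
    except ValueError:
        return None


def prune(lines, now):
    """Keep well-formed entries that are at most MAX_AGE_SECONDS old."""
    entries = (parse_cache_line(line) for line in lines)
    return [e for e in entries
            if e is not None and now - e.timestamp <= MAX_AGE_SECONDS]
```

On startup, the surviving entries would be written back as the new cache
file, after which new answers are only appended.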
Property changes on: trunk/doc/geoip-spec.txt
___________________________________________________________________
Name: svn:keywords
   + Id
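
The /24 idea in Section 3.3 of geoip-spec.txt can be sketched with Python's
standard ipaddress module. This is only an illustration: the function names
are hypothetical, and keeping the newest timestamp for each network is an
assumption (the example in the spec carries the first entry's timestamp).

```python
import ipaddress


def network_key(ip: str, prefix_len: int = 24) -> str:
    """Return the enclosing IPv4 network of ip in CIDR form."""
    return str(ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False))


def collapse(cache_lines):
    """Keep one cache line per /24, preferring the newest timestamp seen."""
    newest = {}  # network in CIDR form -> (location fields, timestamp)
    for line in cache_lines:
        record, _, stamp = line.strip().rpartition(":")
        ip, _, location = record.partition(",")
        key = network_key(ip)
        if key not in newest or int(stamp) > newest[key][1]:
            newest[key] = (location, int(stamp))
    return [f"{key},{location}:{stamp}"
            for key, (location, stamp) in newest.items()]
```

Applied to the three Toronto lines from Section 3.3, collapse() yields a
single 68.179.33.0/24 entry. A lookup for a new address would then compute
network_key() first and consult the cache with that key.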