[or-cvs] r9786: propose a plan for 104-short-descriptors (tor/trunk/doc/spec/proposals)



Author: arma
Date: 2007-03-09 17:55:35 -0500 (Fri, 09 Mar 2007)
New Revision: 9786

Modified:
   tor/trunk/doc/spec/proposals/104-short-descriptors.txt
Log:
propose a plan for 104-short-descriptors


Modified: tor/trunk/doc/spec/proposals/104-short-descriptors.txt
===================================================================
--- tor/trunk/doc/spec/proposals/104-short-descriptors.txt	2007-03-09 22:49:15 UTC (rev 9785)
+++ tor/trunk/doc/spec/proposals/104-short-descriptors.txt	2007-03-09 22:55:35 UTC (rev 9786)
@@ -34,7 +34,9 @@
   Another possible solution would be to drop these fields from descriptors,
   and have them uploaded as a part of a separate "bandwidth report" to the
   authorities.  This could help prevent the mistake of using long descriptors
-  in the place of short ones.
+  in the place of short ones. It could also be generalized later to be an
+  overall status report, to include sanitized GeoIP information and whatever
+  else comes up.
 
 Other disposable fields:
 
@@ -49,11 +51,15 @@
     accept
   (Apparently, exit polices are highly compressible.)
 
+  [Does size-on-disk matter to anybody? Some clients and servers don't
+   have much disk, or have really slow disk (e.g. USB). And we don't
+   store caches compressed right now. -RD]
+
 Issues:
 
   Indexing long descriptor or bandwidth reports presents an issue: right now
   the way to make sure you have the same copy of a descriptor as everyone
-  else is to request the descriptor by its digest, and to make sure to that
+  else is to request the descriptor by its digest, and to make sure that
   the digest you request is the one that the authorities like.
 
   Authorities should presumably list the digests of short descriptors, since
@@ -62,19 +68,21 @@
   with information nobody wants.
 
   Possible solutions are:
-    - Drop the property that you can be sure of having the same long
-      descriptor as others.  This seems unoptimal.
-    - Have a separate extra-information-status that also gets generated by the
+   1) Drop the property that you can be sure of having the same long
+      descriptor as others.  This seems unoptimal, but if nobody caches
+      long descriptors so you have to go to the authority to get them,
+      maybe it's not so bad.
+   2) Have a separate extra-information-status that also gets generated by the
       authorities; use it to tell which long descriptors others have.  Also a
       pain.
-    - Have short descriptors include a hash of the corresponding long
+   3) Have short descriptors include a hash of the corresponding long
       descriptor/extra-info.  This would keep the same order of magnitude
       performance increase (~59.2% savings as opposed to 61% savings.)
       This would require longdesc/extra-info downloaders to fetch
       router data before they could know which longdescs/extra info to fetch.
-    - Have each authority make a signed concatenated "extra info" document,
+   4) Have each authority make a signed concatenated "extra info" document,
       and hope we never need to reconcile them.
-    - ????
+   5) ????
 
 Migration:
 
@@ -83,12 +91,20 @@
        * Authorities should accept both, now, and silently drop short
          descriptors.
        * Routers should upload both once authorities accept them.
-       * There should be a "long descriptor" url and the current "normal" URL.
+       * There should be a "long descriptor" url named
+         /tor/server/fp-detailed/ and the current "normal" URL.
          Authorities should serve long descriptors from both URLs.
+         There's no such thing as asking for a long descriptor by
+         its digest.
      * Once tools that want long descriptors support fetching them from the
        "long descriptor" URL:
        * Have authorities remember short descriptors, and serve them from the
          'normal' URL.
+       These tools include:
+         lefkada's exit.py script.
+         tor26's noreply script and general directory cache.
+         https://nighteffect.us/tns/ for its graphs
+         and check with or-talk for the rest, once it's time.
 
   For bandwidth info approach:
      * First:
@@ -99,3 +115,30 @@
      * Once tools that want bandwidth info support fetching it:
        * Have routers stop including bandwidth info in their router
          descriptors.
+
+Discussion:
+
+  Solution 4 seems like a nice plan: in many cases, the external services
+  that use read-history and write-history are directory authorities
+  themselves, so they just use their local opinion.
+
+  Roger thinks we should go with the long/short descriptor plan, along
+  with solution 4. We don't want to just upload a bandwidth message,
+  because that involves new data structures for every new piece of
+  information we decide to upload. I suspect we'll realize once this
+  is deployed that there is other info we want to put in the long
+  descriptors.
+
+  This won't solve the future sanitized GeoIP uploading question, but
+  who knows where we'll actually want to send that data, and whether
+  we'll want to handle it with the same privacy constraints as this data,
+  so let's not try to solve that yet.
+
+  However, we may still need some basic reconciling algorithms between
+  authorities -- otherwise, if a router uploads to four authorities
+  and fails to reach the fifth, then that fifth will never have the new
+  descriptor. This will mean that the best strategy for external tools
+  is to fetch full concatenated-style long-descriptor lists from every
+  single authority, and merge them locally. So each authority should
+  periodically fetch the list from the others and take the new ones.
+
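The reconciliation step in the last Discussion paragraph (each authority
periodically fetches the long-descriptor list from the others and takes the
new ones) might look roughly like the Python sketch below. It is an
illustration only, not code from Tor: the LongDescriptor type, its fields,
and merge_descriptor_lists are hypothetical names, and it assumes that "new"
means a later publication time for the same router fingerprint, which the
proposal does not specify.

```python
from typing import Dict, NamedTuple


class LongDescriptor(NamedTuple):
    """Hypothetical stand-in for one router's long descriptor."""
    fingerprint: str   # router identity fingerprint (assumed key)
    published: int     # publication time, seconds since epoch (assumed)
    body: str          # descriptor text


def merge_descriptor_lists(
    local: Dict[str, LongDescriptor],
    *remote_lists: Dict[str, LongDescriptor],
) -> Dict[str, LongDescriptor]:
    """Return a copy of `local` updated with any newer remote descriptors.

    For each fingerprint, the descriptor with the latest `published`
    value wins; ties keep the copy already held.
    """
    merged = dict(local)  # do not mutate the caller's view
    for remote in remote_lists:
        for fingerprint, desc in remote.items():
            current = merged.get(fingerprint)
            if current is None or desc.published > current.published:
                merged[fingerprint] = desc
    return merged
```

An external tool could apply the same function to the lists it fetches from
every authority, which is the local merge the paragraph describes.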