
[or-cvs] r18172: {tor} move my microdescriptors proposal into slot 158 (in tor/trunk/doc/spec/proposals: . ideas)



Author: arma
Date: 2009-01-18 13:57:20 -0500 (Sun, 18 Jan 2009)
New Revision: 18172

Added:
   tor/trunk/doc/spec/proposals/158-microdescriptors.txt
Removed:
   tor/trunk/doc/spec/proposals/ideas/xxx-microdescriptors.txt
Modified:
   tor/trunk/doc/spec/proposals/000-index.txt
Log:
move my microdescriptors proposal into slot 158


Modified: tor/trunk/doc/spec/proposals/000-index.txt
===================================================================
--- tor/trunk/doc/spec/proposals/000-index.txt	2009-01-18 18:56:28 UTC (rev 18171)
+++ tor/trunk/doc/spec/proposals/000-index.txt	2009-01-18 18:57:20 UTC (rev 18172)
@@ -80,6 +80,7 @@
 155  Four Improvements of Hidden Service Performance [FINISHED]
 156  Tracking blocked ports on the client side [OPEN]
 157  Make certificate downloads specific [ACCEPTED]
+158  Clients download consensus + microdescriptors [OPEN]
 
 
 Proposals by status:
@@ -99,6 +100,7 @@
    146  Add new flag to reflect long-term stability [for 0.2.1.x]
    149  Using data from NETINFO cells [for 0.2.1.x]
    156  Tracking blocked ports on the client side [for 0.2.?]
+   158  Clients download consensus + microdescriptors
  ACCEPTED:
    110  Avoiding infinite length circuits [for 0.2.1.x] [in 0.2.1.3-alpha]
    117  IPv6 exits [for 0.2.1.x]

Copied: tor/trunk/doc/spec/proposals/158-microdescriptors.txt (from rev 18171, tor/trunk/doc/spec/proposals/ideas/xxx-microdescriptors.txt)
===================================================================
--- tor/trunk/doc/spec/proposals/158-microdescriptors.txt	                        (rev 0)
+++ tor/trunk/doc/spec/proposals/158-microdescriptors.txt	2009-01-18 18:57:20 UTC (rev 18172)
@@ -0,0 +1,207 @@
+Filename: 158-microdescriptors.txt
+Title: Clients download consensus + microdescriptors
+Version: $Revision$
+Last-Modified: $Date$
+Author: Roger Dingledine
+Created: 17-Jan-2009
+Status: Open
+
+1. Overview
+
+  This proposal replaces section 3.2 of proposal 141, which was
+  called "Fetching descriptors on demand". Rather than modifying the
+  circuit-building protocol to fetch a server descriptor inline at each
+  circuit extend, we instead put all of the information that clients need
+  either into the consensus itself, or into a new set of data about each
+  relay called a microdescriptor. The microdescriptor is a direct
+  transform from the relay descriptor, so relays don't even need to know
+  this is happening.
+
+  Descriptor elements that are small and frequently changing should go
+  in the consensus itself, and descriptor elements that are small and
+  relatively static should go in the microdescriptor. If we ever end up
+  with descriptor elements that aren't small yet clients need to know
+  them, we'll need to resume considering some design like the one in
+  proposal 141.
+
+2. Motivation
+
+  See
+  http://archives.seul.org/or/dev/Nov-2008/msg00000.html and
+  http://archives.seul.org/or/dev/Nov-2008/msg00001.html and especially
+  http://archives.seul.org/or/dev/Nov-2008/msg00007.html
+  for a discussion of the options and why this is currently the best
+  approach.
+
+3. Design
+
+  There are three pieces to the proposal. First, authorities will list in
+  their votes (and thus in the consensus) what relay descriptor elements
+  are included in the microdescriptor, and also list the expected hash
+  of the microdescriptor for each relay. Second, directory mirrors will serve
+  microdescriptors. Third, clients will ask for them and cache them.
+
+3.1. Consensus changes
+
+  V3 votes should include a new line:
+    microdescriptor-elements bar baz foo
+  listing each descriptor element (sorted alphabetically) that authority
+  included when it calculated its expected microdescriptor hashes.
+
+  We also need to include the hash of each expected microdescriptor in
+  the routerstatus section. I suggest a new "m" line for each stanza,
+  with the base64 of the hash of the elements that the authority voted
+  for above.
+
+  The consensus microdescriptor-elements and "m" lines are then computed
+  as described in Section 3.1.2 below.
+
+  I believe that means we need a new consensus-method "6" that knows
+  how to compute the microdescriptor-elements and add "m" lines.
+
+3.1.1. Descriptor elements to include for now
+
+  To start, the element list that authorities suggest should be
+    family onion-key
+
+  (Note that the or-dev posts above only mention onion-key, but if
+  we don't also include family then clients will never learn it. It
+  seemed like it should be relatively static, so putting it in the
+  microdescriptor is smarter than trying to fit it into the consensus.)
+
+  We could imagine a config option "family,onion-key" so authorities
+  could change their voted preferences without needing to upgrade.
+
+3.1.2. Computing consensus for microdescriptor-elements and "m" lines
+
+  One approach is for the consensus microdescriptor-elements line to
+  include every element listed by a majority of authorities, sorted. The
+  problem here is that it will no longer be deterministic what the correct
+  hash for the "m" line should be. We could imagine telling the authority
+  to go look in its descriptor and produce the right hash itself, but
+  we don't want consensus calculation to be based on external data like
+  that. (Plus, the authority may not have the descriptor that everybody
+  else voted to use.)
+
+  The better approach is to take the exact set that has the most votes
+  (breaking ties by the set that has the most elements, and breaking
+  ties after that by whichever is alphabetically first). That will
+  increase the odds that we actually get a microdescriptor hash that
+  is both a) for the descriptor we're putting in the consensus, and b)
+  over the elements that we're declaring it should be for.
+
+  Then the "m" line for a given relay is the one that gets the most votes
+  from authorities that both a) voted for the microdescriptor-elements
+  line we're using, and b) voted for the descriptor we're using.
+
+  (If there's a tie, use the smaller hash. But really, if there are
+  multiple such votes and they differ about a microdescriptor, we caught
+  one of them lying or being buggy. We should log it to track down why.)
+
+  If there are no such votes, then we leave out the "m" line for that
+  relay. That means clients should avoid it for this time period. (As
+  an extension it could instead mean that clients should fetch the
+  descriptor and figure out its microdescriptor themselves. But let's
+  not get ahead of ourselves.)
+
+  It would be nice to have a more foolproof way to agree on what
+  microdescriptor hash each authority should vote for, so we can avoid
+  missing "m" lines. Just switching to a new consensus-method each time
+  we change the set of microdescriptor-elements won't help though, since
+  each authority will still have to decide what hash to vote for before
+  knowing what consensus-method will be used.
+
+  Here's one way we could do it. Each vote / consensus includes
+  the microdescriptor-elements that were used to compute the hashes,
+  and also a preferred-microdescriptor-elements set. If an authority
+  has a consensus from the previous period, then it should use the
+  consensus preferred-microdescriptor-elements when computing its votes
+  for microdescriptor-elements and the appropriate hashes in the upcoming
+  period. (If it has no previous consensus, then it just writes its
+  own preferences in both lines.)
+
+3.2. Directory mirrors serve microdescriptors
+
+  Directory mirrors should then read the microdescriptor-elements line
+  from the consensus, and learn how to answer requests. (Directory mirrors
+  continue to serve normal relay descriptors too, a) to serve old clients
+  and b) to be able to construct microdescriptors on the fly.)
+
+  The microdescriptors with hashes <D1>,<D2>,<D3> should be available at:
+    http://<hostname>/tor/micro/d/<D1>+<D2>+<D3>.z
+
+  All the microdescriptors from the current consensus should also be
+  available at:
+    http://<hostname>/tor/micro/all.z
+  so a client that's bootstrapping doesn't need to send a 70KB URL just
+  to name every microdescriptor it's looking for.
+
+  The format of a microdescriptor is the header line
+  "microdescriptor-header"
+  followed by each element (keyword and body), alphabetically. There's
+  no need to mention what hash it's for, since it's self-identifying:
+  you can hash the elements to learn this.
+
+  (Do we need a footer line to show that it's over, or is the next
+  microdescriptor line or EOF enough of a hint? A footer line wouldn't
+  hurt much. Also, no fair voting for the microdescriptor-element
+  "microdescriptor-header".)
+
+  The hash of the microdescriptor is simply the hash of the concatenated
+  elements -- not counting the header line or hypothetical footer line.
+  Unless you prefer that?
+
+  Is there a reasonable way to version these things? We could say that
+  the microdescriptor-header line can contain arguments which clients
+  must ignore if they don't understand them. Any better ways?
+
+  Directory mirrors should check to make sure that the microdescriptors
+  they're about to serve match the right hashes (either the hashes from
+  the fetch URL or the hashes from the consensus, respectively).
+
+  We will probably want to consider some sort of smart data structure to
+  be able to quickly convert microdescriptor hashes into the appropriate
+  microdescriptor. Clients will want this anyway when they load their
+  microdescriptor cache and want to match it up with the consensus to
+  see what's missing.
+
+3.3. Clients fetch them and cache them
+
+  When a client gets a new consensus, it looks to see if there are any
+  microdescriptors it needs to learn. If it needs to learn more than
+  some threshold of the microdescriptors (half?), it requests 'all',
+  else it requests only the missing ones.
+
+  Clients maintain a cache of microdescriptors along with metadata like
+  when each was last referenced by a consensus. They keep a microdescriptor
+  until it hasn't been mentioned in any consensus for a week. Future
+  clients might cache them for longer or shorter times.
+
+3.3.1. Information leaks from clients
+
+  If a client asks you for a set of microdescs, then you know she didn't
+  have them cached before. How much does that leak? What about when
+  we're all using our entry guards as directory guards, and we've seen
+  that user make a bunch of circuits already?
+
+  Fetching "all" when you need at least half is a good first order fix,
+  but might not be all there is to it.
+
+  Another future option would be to fetch some of the microdescriptors
+  anonymously (via a Tor circuit).
+
+4. Transition and deployment
+
+  Phase one, the directory authorities should start voting on
+  microdescriptors and microdescriptor elements, and putting them in the
+  consensus. This should happen during the 0.2.1.x series, and should
+  be relatively easy to do.
+
+  Phase two, directory mirrors should learn how to serve them, and learn
+  how to read the consensus to find out what they should be serving. This
+  phase could be done either in 0.2.1.x or early in 0.2.2.x, depending
+  on how messy it turns out to be and how quickly we get around to it.
+
+  Phase three, clients should start fetching and caching them instead
+  of normal descriptors. This should happen post 0.2.1.x.
+
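
Editor's sketch (not part of the commit): the selection rules in Section 3.1.2 of the proposal above could look roughly like the Python below. The function names and the shape of the vote data are hypothetical. Two details are assumptions, because the proposal does not pin them down: "alphabetically first" is taken to mean lexicographic comparison of the sorted element tuples, and "the smaller hash" is taken to mean plain string comparison of the encoded hashes.

```python
from collections import Counter


def choose_elements(votes):
    """Pick the consensus microdescriptor-elements set.

    votes: one iterable of element names per authority (hypothetical shape).
    Rule from 3.1.2: the exact set with the most votes wins; ties go to the
    set with the most elements, then to the alphabetically first set.
    """
    counts = Counter(tuple(sorted(v)) for v in votes)
    return min(counts, key=lambda s: (-counts[s], -len(s), s))


def choose_m_line(relay_votes, chosen_elements, chosen_descriptor):
    """Pick the "m" hash for one relay, or None if no vote is eligible.

    relay_votes: (elements, descriptor_digest, md_hash) per authority.
    Only authorities that voted for both the chosen elements line and the
    chosen descriptor count. Ties go to the smaller hash (assumed here to
    be plain string order).
    """
    eligible = Counter(
        md_hash
        for elements, descriptor, md_hash in relay_votes
        if tuple(sorted(elements)) == chosen_elements
        and descriptor == chosen_descriptor
    )
    if not eligible:
        return None  # leave out the "m" line for this relay
    return min(eligible, key=lambda h: (-eligible[h], h))
```

A real implementation would presumably also log the case where eligible votes disagree, since the proposal treats that as a sign of a lying or buggy authority.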


Property changes on: tor/trunk/doc/spec/proposals/158-microdescriptors.txt
___________________________________________________________________
Added: svn:keywords
   + Author Date Id Revision
Added: svn:mergeinfo
   + 
Added: svn:eol-style
   + native

Deleted: tor/trunk/doc/spec/proposals/ideas/xxx-microdescriptors.txt
===================================================================
--- tor/trunk/doc/spec/proposals/ideas/xxx-microdescriptors.txt	2009-01-18 18:56:28 UTC (rev 18171)
+++ tor/trunk/doc/spec/proposals/ideas/xxx-microdescriptors.txt	2009-01-18 18:57:20 UTC (rev 18172)
@@ -1,207 +0,0 @@
-Filename: xxx-microdescriptors.txt
-Title: Clients download consensus + microdescriptors
-Version: $Revision$
-Last-Modified: $Date$
-Author: Roger Dingledine
-Created: 17-Jan-2009
-Status: Open
-
-1. Overview
-
-  This proposal replaces section 3.2 of proposal 141, which was
-  called "Fetching descriptors on demand". Rather than modifying the
-  circuit-building protocol to fetch a server descriptor inline at each
-  circuit extend, we instead put all of the information that clients need
-  either into the consensus itself, or into a new set of data about each
-  relay called a microdescriptor. The microdescriptor is a direct
-  transform from the relay descriptor, so relays don't even need to know
-  this is happening.
-
-  Descriptor elements that are small and frequently changing should go
-  in the consensus itself, and descriptor elements that are small and
-  relatively static should go in the microdescriptor. If we ever end up
-  with descriptor elements that aren't small yet clients need to know
-  them, we'll need to resume considering some design like the one in
-  proposal 141.
-
-2. Motivation
-
-  See
-  http://archives.seul.org/or/dev/Nov-2008/msg00000.html and
-  http://archives.seul.org/or/dev/Nov-2008/msg00001.html and especially
-  http://archives.seul.org/or/dev/Nov-2008/msg00007.html
-  for a discussion of the options and why this is currently the best
-  approach.
-
-3. Design
-
-  There are three pieces to the proposal. First, authorities will list in
-  their votes (and thus in the consensus) what relay descriptor elements
-  are included in the microdescriptor, and also list the expected hash
-  of the microdescriptor for each relay. Second, directory mirrors will serve
-  microdescriptors. Third, clients will ask for them and cache them.
-
-3.1. Consensus changes
-
-  V3 votes should include a new line:
-    microdescriptor-elements bar baz foo
-  listing each descriptor element (sorted alphabetically) that authority
-  included when it calculated its expected microdescriptor hashes.
-
-  We also need to include the hash of each expected microdescriptor in
-  the routerstatus section. I suggest a new "m" line for each stanza,
-  with the base64 of the hash of the elements that the authority voted
-  for above.
-
-  The consensus microdescriptor-elements and "m" lines are then computed
-  as described in Section 3.1.2 below.
-
-  I believe that means we need a new consensus-method "6" that knows
-  how to compute the microdescriptor-elements and add "m" lines.
-
-3.1.1. Descriptor elements to include for now
-
-  To start, the element list that authorities suggest should be
-    family onion-key
-
-  (Note that the or-dev posts above only mention onion-key, but if
-  we don't also include family then clients will never learn it. It
-  seemed like it should be relatively static, so putting it in the
-  microdescriptor is smarter than trying to fit it into the consensus.)
-
-  We could imagine a config option "family,onion-key" so authorities
-  could change their voted preferences without needing to upgrade.
-
-3.1.2. Computing consensus for microdescriptor-elements and "m" lines
-
-  One approach is for the consensus microdescriptor-elements line to
-  include every element listed by a majority of authorities, sorted. The
-  problem here is that it will no longer be deterministic what the correct
-  hash for the "m" line should be. We could imagine telling the authority
-  to go look in its descriptor and produce the right hash itself, but
-  we don't want consensus calculation to be based on external data like
-  that. (Plus, the authority may not have the descriptor that everybody
-  else voted to use.)
-
-  The better approach is to take the exact set that has the most votes
-  (breaking ties by the set that has the most elements, and breaking
-  ties after that by whichever is alphabetically first). That will
-  increase the odds that we actually get a microdescriptor hash that
-  is both a) for the descriptor we're putting in the consensus, and b)
-  over the elements that we're declaring it should be for.
-
-  Then the "m" line for a given relay is the one that gets the most votes
-  from authorities that both a) voted for the microdescriptor-elements
-  line we're using, and b) voted for the descriptor we're using.
-
-  (If there's a tie, use the smaller hash. But really, if there are
-  multiple such votes and they differ about a microdescriptor, we caught
-  one of them lying or being buggy. We should log it to track down why.)
-
-  If there are no such votes, then we leave out the "m" line for that
-  relay. That means clients should avoid it for this time period. (As
-  an extension it could instead mean that clients should fetch the
-  descriptor and figure out its microdescriptor themselves. But let's
-  not get ahead of ourselves.)
-
-  It would be nice to have a more foolproof way to agree on what
-  microdescriptor hash each authority should vote for, so we can avoid
-  missing "m" lines. Just switching to a new consensus-method each time
-  we change the set of microdescriptor-elements won't help though, since
-  each authority will still have to decide what hash to vote for before
-  knowing what consensus-method will be used.
-
-  Here's one way we could do it. Each vote / consensus includes
-  the microdescriptor-elements that were used to compute the hashes,
-  and also a preferred-microdescriptor-elements set. If an authority
-  has a consensus from the previous period, then it should use the
-  consensus preferred-microdescriptor-elements when computing its votes
-  for microdescriptor-elements and the appropriate hashes in the upcoming
-  period. (If it has no previous consensus, then it just writes its
-  own preferences in both lines.)
-
-3.2. Directory mirrors serve microdescriptors
-
-  Directory mirrors should then read the microdescriptor-elements line
-  from the consensus, and learn how to answer requests. (Directory mirrors
-  continue to serve normal relay descriptors too, a) to serve old clients
-  and b) to be able to construct microdescriptors on the fly.)
-
-  The microdescriptors with hashes <D1>,<D2>,<D3> should be available at:
-    http://<hostname>/tor/micro/d/<D1>+<D2>+<D3>.z
-
-  All the microdescriptors from the current consensus should also be
-  available at:
-    http://<hostname>/tor/micro/all.z
-  so a client that's bootstrapping doesn't need to send a 70KB URL just
-  to name every microdescriptor it's looking for.
-
-  The format of a microdescriptor is the header line
-  "microdescriptor-header"
-  followed by each element (keyword and body), alphabetically. There's
-  no need to mention what hash it's for, since it's self-identifying:
-  you can hash the elements to learn this.
-
-  (Do we need a footer line to show that it's over, or is the next
-  microdescriptor line or EOF enough of a hint? A footer line wouldn't
-  hurt much. Also, no fair voting for the microdescriptor-element
-  "microdescriptor-header".)
-
-  The hash of the microdescriptor is simply the hash of the concatenated
-  elements -- not counting the header line or hypothetical footer line.
-  Unless you prefer that?
-
-  Is there a reasonable way to version these things? We could say that
-  the microdescriptor-header line can contain arguments which clients
-  must ignore if they don't understand them. Any better ways?
-
-  Directory mirrors should check to make sure that the microdescriptors
-  they're about to serve match the right hashes (either the hashes from
-  the fetch URL or the hashes from the consensus, respectively).
-
-  We will probably want to consider some sort of smart data structure to
-  be able to quickly convert microdescriptor hashes into the appropriate
-  microdescriptor. Clients will want this anyway when they load their
-  microdescriptor cache and want to match it up with the consensus to
-  see what's missing.
-
-3.3. Clients fetch them and cache them
-
-  When a client gets a new consensus, it looks to see if there are any
-  microdescriptors it needs to learn. If it needs to learn more than
-  some threshold of the microdescriptors (half?), it requests 'all',
-  else it requests only the missing ones.
-
-  Clients maintain a cache of microdescriptors along with metadata like
-  when each was last referenced by a consensus. They keep a microdescriptor
-  until it hasn't been mentioned in any consensus for a week. Future
-  clients might cache them for longer or shorter times.
-
-3.3.1. Information leaks from clients
-
-  If a client asks you for a set of microdescs, then you know she didn't
-  have them cached before. How much does that leak? What about when
-  we're all using our entry guards as directory guards, and we've seen
-  that user make a bunch of circuits already?
-
-  Fetching "all" when you need at least half is a good first order fix,
-  but might not be all there is to it.
-
-  Another future option would be to fetch some of the microdescriptors
-  anonymously (via a Tor circuit).
-
-4. Transition and deployment
-
-  Phase one, the directory authorities should start voting on
-  microdescriptors and microdescriptor elements, and putting them in the
-  consensus. This should happen during the 0.2.1.x series, and should
-  be relatively easy to do.
-
-  Phase two, directory mirrors should learn how to serve them, and learn
-  how to read the consensus to find out what they should be serving. This
-  phase could be done either in 0.2.1.x or early in 0.2.2.x, depending
-  on how messy it turns out to be and how quickly we get around to it.
-
-  Phase three, clients should start fetching and caching them instead
-  of normal descriptors. This should happen post 0.2.1.x.
-
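
Editor's sketch (not part of the commit): the client behaviour in Sections 3.2 and 3.3 of the proposal could be approximated as below. The function names, the cache layout (hash mapped to the time a consensus last referenced it) and the constants are hypothetical. The threshold follows the "more than ... half?" wording of 3.3, and hashes are treated as opaque strings that are safe to join with "+" in a URL, which the proposal does not specify.

```python
from datetime import datetime, timedelta

KEEP_FOR = timedelta(days=7)   # "for a week" in section 3.3
ALL_THRESHOLD = 0.5            # "half?" in section 3.3; an open question


def plan_fetch(consensus_hashes, cache, hostname):
    """Return the URL a client would request, or None if nothing is missing.

    consensus_hashes: microdescriptor hashes listed in the new consensus.
    cache: dict mapping hash -> datetime it was last referenced.
    """
    missing = [h for h in consensus_hashes if h not in cache]
    if not missing:
        return None
    if len(missing) > ALL_THRESHOLD * len(consensus_hashes):
        # Missing too many: ask for everything rather than naming each one.
        return f"http://{hostname}/tor/micro/all.z"
    return f"http://{hostname}/tor/micro/d/{'+'.join(missing)}.z"


def expire(cache, now):
    """Drop entries that no consensus has mentioned for a week or more."""
    return {h: seen for h, seen in cache.items() if now - seen < KEEP_FOR}
```

Requesting "all" above the threshold also blunts the leak described in 3.3.1, since the mirror learns less about which microdescriptors the client already had.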