[or-cvs] r18172: {tor} move my microdescriptors proposal into slot 158 (in tor/trunk/doc/spec/proposals: . ideas)
Author: arma
Date: 2009-01-18 13:57:20 -0500 (Sun, 18 Jan 2009)
New Revision: 18172
Added:
tor/trunk/doc/spec/proposals/158-microdescriptors.txt
Removed:
tor/trunk/doc/spec/proposals/ideas/xxx-microdescriptors.txt
Modified:
tor/trunk/doc/spec/proposals/000-index.txt
Log:
move my microdescriptors proposal into slot 158
Modified: tor/trunk/doc/spec/proposals/000-index.txt
===================================================================
--- tor/trunk/doc/spec/proposals/000-index.txt 2009-01-18 18:56:28 UTC (rev 18171)
+++ tor/trunk/doc/spec/proposals/000-index.txt 2009-01-18 18:57:20 UTC (rev 18172)
@@ -80,6 +80,7 @@
155 Four Improvements of Hidden Service Performance [FINISHED]
156 Tracking blocked ports on the client side [OPEN]
157 Make certificate downloads specific [ACCEPTED]
+158 Clients download consensus + microdescriptors [OPEN]
Proposals by status:
@@ -99,6 +100,7 @@
146 Add new flag to reflect long-term stability [for 0.2.1.x]
149 Using data from NETINFO cells [for 0.2.1.x]
156 Tracking blocked ports on the client side [for 0.2.?]
+ 158 Clients download consensus + microdescriptors
ACCEPTED:
110 Avoiding infinite length circuits [for 0.2.1.x] [in 0.2.1.3-alpha]
117 IPv6 exits [for 0.2.1.x]
Copied: tor/trunk/doc/spec/proposals/158-microdescriptors.txt (from rev 18171, tor/trunk/doc/spec/proposals/ideas/xxx-microdescriptors.txt)
===================================================================
--- tor/trunk/doc/spec/proposals/158-microdescriptors.txt (rev 0)
+++ tor/trunk/doc/spec/proposals/158-microdescriptors.txt 2009-01-18 18:57:20 UTC (rev 18172)
@@ -0,0 +1,207 @@
+Filename: 158-microdescriptors.txt
+Title: Clients download consensus + microdescriptors
+Version: $Revision$
+Last-Modified: $Date$
+Author: Roger Dingledine
+Created: 17-Jan-2009
+Status: Open
+
+1. Overview
+
+ This proposal replaces section 3.2 of proposal 141, which was
+ called "Fetching descriptors on demand". Rather than modifying the
+ circuit-building protocol to fetch a server descriptor inline at each
+ circuit extend, we instead put all of the information that clients need
+ either into the consensus itself, or into a new set of data about each
+ relay called a microdescriptor. The microdescriptor is a direct
+ transform from the relay descriptor, so relays don't even need to know
+ this is happening.
+
+ Descriptor elements that are small and frequently changing should go
+ in the consensus itself, and descriptor elements that are small and
+ relatively static should go in the microdescriptor. If we ever end up
+ with descriptor elements that aren't small yet clients need to know
+ them, we'll need to resume considering some design like the one in
+ proposal 141.
+
+2. Motivation
+
+ See
+ http://archives.seul.org/or/dev/Nov-2008/msg00000.html and
+ http://archives.seul.org/or/dev/Nov-2008/msg00001.html and especially
+ http://archives.seul.org/or/dev/Nov-2008/msg00007.html
+ for a discussion of the options and why this is currently the best
+ approach.
+
+3. Design
+
+ There are three pieces to the proposal. First, authorities will list in
+ their votes (and thus in the consensus) what relay descriptor elements
+ are included in the microdescriptor, and also list the expected hash
+ of the microdescriptor for each relay. Second, directory mirrors will serve
+ microdescriptors. Third, clients will ask for them and cache them.
+
+3.1. Consensus changes
+
+ V3 votes should include a new line:
+ microdescriptor-elements bar baz foo
+ listing each descriptor element (sorted alphabetically) that the authority
+ included when it calculated its expected microdescriptor hashes.
+
+ We also need to include the hash of each expected microdescriptor in
+ the routerstatus section. I suggest a new "m" line for each stanza,
+ with the base64 of the hash of the elements that the authority voted
+ for above.
+
+ The consensus microdescriptor-elements and "m" lines are then computed
+ as described in Section 3.1.2 below.
+
+ I believe that means we need a new consensus-method "6" that knows
+ how to compute the microdescriptor-elements and add "m" lines.
+
+3.1.1. Descriptor elements to include for now
+
+ To start, the element list that authorities suggest should be
+ family onion-key
+
+ (Note that the or-dev posts above only mention onion-key, but if
+ we don't also include family then clients will never learn it. It
+ seemed like it should be relatively static, so putting it in the
+ microdescriptor is smarter than trying to fit it into the consensus.)
+
+ We could imagine a config option with a value like "family,onion-key",
+ so authorities could change their voted preferences without upgrading.
+
+3.1.2. Computing consensus for microdescriptor-elements and "m" lines
+
+ One approach is for the consensus microdescriptor-elements line to
+ include every element listed by a majority of authorities, sorted. The
+ problem here is that it will no longer be deterministic what the correct
+ hash for the "m" line should be. We could imagine telling the authority
+ to go look in its descriptor and produce the right hash itself, but
+ we don't want consensus calculation to be based on external data like
+ that. (Plus, the authority may not have the descriptor that everybody
+ else voted to use.)
+
+ The better approach is to take the exact set that has the most votes
+ (breaking ties by the set that has the most elements, and breaking
+ ties after that by whichever is alphabetically first). That will
+ increase the odds that we actually get a microdescriptor hash that
+ is both a) for the descriptor we're putting in the consensus, and b)
+ over the elements that we're declaring it should be for.
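The selection rule above can be sketched as follows. This is a minimal illustration, not code from Tor: the function name and the vote representation (one list of element keywords per authority) are assumptions, and "alphabetically first" is read here as an element-by-element comparison of the sorted lists.

```python
from collections import Counter


def choose_elements(votes):
    """Pick the consensus microdescriptor-elements set.

    votes: one list of element keywords per authority (hypothetical input
    shape). Returns the winning set as a sorted tuple, or None if nobody voted.
    """
    # Normalize each vote to a sorted tuple so identical sets compare equal.
    counts = Counter(tuple(sorted(set(v))) for v in votes)
    if not counts:
        return None
    # Most votes wins; ties go to the set with the most elements, and any
    # remaining tie goes to whichever set sorts first.
    return min(counts, key=lambda s: (-counts[s], -len(s), s))
```

Because the whole set is voted on as a unit, the result is always a set that at least one authority actually hashed over, which is the point of preferring this approach to a per-element majority.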
+
+ Then the "m" line for a given relay is the one that gets the most votes
+ from authorities that both a) voted for the microdescriptor-elements
+ line we're using, and b) voted for the descriptor we're using.
+
+ (If there's a tie, use the smaller hash. But really, if there are
+ multiple such votes and they differ about a microdescriptor, we caught
+ one of them lying or being buggy. We should log it to track down why.)
+
+ If there are no such votes, then we leave out the "m" line for that
+ relay. That means clients should avoid it for this time period. (As
+ an extension it could instead mean that clients should fetch the
+ descriptor and figure out its microdescriptor themselves. But let's
+ not get ahead of ourselves.)
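A sketch of the per-relay "m" line rule described above, under stated assumptions: each vote is modelled as a hypothetical (elements, descriptor digest, microdescriptor hash) triple, and "smaller hash" is taken to mean the base64 string that sorts first, which the proposal does not pin down.

```python
import logging
from collections import Counter


def choose_m_line(votes, elements, descriptor_digest):
    """Pick the "m" value for one relay, or None to leave the line out.

    votes: iterable of (elements, descriptor_digest, m_hash) triples, one per
    authority that listed this relay (hypothetical representation).
    """
    wanted = tuple(sorted(elements))
    # Only count authorities that voted for both the elements line and the
    # descriptor that the consensus is using.
    tally = Counter(
        m_hash
        for elems, desc, m_hash in votes
        if tuple(sorted(elems)) == wanted and desc == descriptor_digest
    )
    if not tally:
        return None  # no eligible votes: omit the "m" line for this relay
    if len(tally) > 1:
        # Eligible authorities disagree, so one is lying or buggy.
        logging.warning("conflicting microdescriptor hashes: %s", sorted(tally))
    # Most votes wins; a tie goes to the smaller hash.
    return min(tally, key=lambda m: (-tally[m], m))
```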
+
+ It would be nice to have a more foolproof way to agree on what
+ microdescriptor hash each authority should vote for, so we can avoid
+ missing "m" lines. Just switching to a new consensus-method each time
+ we change the set of microdescriptor-elements won't help though, since
+ each authority will still have to decide what hash to vote for before
+ knowing what consensus-method will be used.
+
+ Here's one way we could do it. Each vote / consensus includes
+ the microdescriptor-elements that were used to compute the hashes,
+ and also a preferred-microdescriptor-elements set. If an authority
+ has a consensus from the previous period, then it should use the
+ consensus preferred-microdescriptor-elements when computing its votes
+ for microdescriptor-elements and the appropriate hashes in the upcoming
+ period. (If it has no previous consensus, then it just writes its
+ own preferences in both lines.)
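The carry-over idea in the paragraph above might look like this. The function name and the return shape are illustrative assumptions; only the two line keywords come from the text.

```python
def elements_for_next_vote(own_preferred, previous_consensus_preferred=None):
    """Return the two element lists an authority would put in its next vote.

    own_preferred: this authority's locally configured preference.
    previous_consensus_preferred: the preferred-microdescriptor-elements set
    from the previous consensus, or None if the authority has no consensus.
    """
    if previous_consensus_preferred is None:
        # No previous consensus: write our own preferences in both lines.
        used = own_preferred
    else:
        # Hash over what everyone agreed to prefer last period.
        used = previous_consensus_preferred
    return {
        "microdescriptor-elements": sorted(used),
        "preferred-microdescriptor-elements": sorted(own_preferred),
    }
```

With this rule a change in preferences takes one voting period to propagate, but every authority that saw the previous consensus hashes over the same element set.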
+
+3.2. Directory mirrors serve microdescriptors
+
+ Directory mirrors should then read the microdescriptor-elements line
+ from the consensus, and learn how to answer requests. (Directory mirrors
+ continue to serve normal relay descriptors too, a) to serve old clients
+ and b) to be able to construct microdescriptors on the fly.)
+
+ The microdescriptors with hashes <D1>,<D2>,<D3> should be available at:
+ http://<hostname>/tor/micro/d/<D1>+<D2>+<D3>.z
+
+ All the microdescriptors from the current consensus should also be
+ available at:
+ http://<hostname>/tor/micro/all.z
+ so a client that's bootstrapping doesn't need to send a 70KB URL just
+ to name every microdescriptor it's looking for.
+
+ The format of a microdescriptor is the header line
+ "microdescriptor-header"
+ followed by each element (keyword and body), alphabetically. There's
+ no need to mention what hash it's for, since it's self-identifying:
+ you can hash the elements to learn this.
+
+ (Do we need a footer line to show that it's over, or is the next
+ microdescriptor line or EOF enough of a hint? A footer line wouldn't
+ hurt much. Also, no fair voting for the microdescriptor-element
+ "microdescriptor-header".)
+
+ The hash of the microdescriptor is simply the hash of the concatenated
+ elements -- not counting the header line or hypothetical footer line.
+ Unless you'd prefer to include them?
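To make the format and hash concrete, here is a minimal sketch. Several things in it are assumptions rather than part of the proposal: the hash function (SHA-256 is used only as a placeholder, since the text does not name one), the representation of a descriptor as a dict from keyword to its full text, and silently skipping elements that a relay's descriptor does not contain.

```python
import base64
import hashlib

HEADER = "microdescriptor-header"


def build_microdescriptor(descriptor_elements, wanted):
    """Return (full_text, hashed_body) for one relay.

    descriptor_elements: dict mapping an element keyword to its complete text
    (keyword line plus body, newline-terminated); a hypothetical parsed form.
    wanted: the keywords from the consensus microdescriptor-elements line.
    """
    # Elements appear alphabetically; ones the descriptor lacks are skipped.
    body = "".join(
        descriptor_elements[k] for k in sorted(wanted) if k in descriptor_elements
    )
    return HEADER + "\n" + body, body


def microdescriptor_hash(body):
    """Base64 of the hash over the concatenated elements, excluding the header."""
    digest = hashlib.sha256(body.encode("ascii")).digest()  # placeholder hash
    return base64.b64encode(digest).decode("ascii")


def matches(body, expected_hash):
    """Check a mirror could run before serving a microdescriptor."""
    return microdescriptor_hash(body) == expected_hash
```

Keeping the header out of the hashed body means that arguments could later be added to the header line without changing any hash in the consensus.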
+
+ Is there a reasonable way to version these things? We could say that
+ the microdescriptor-header line can contain arguments which clients
+ must ignore if they don't understand them. Any better ways?
+
+ Directory mirrors should check to make sure that the microdescriptors
+ they're about to serve match the right hashes (either the hashes from
+ the fetch URL or the hashes from the consensus, respectively).
+
+ We will probably want to consider some sort of smart data structure to
+ be able to quickly convert microdescriptor hashes into the appropriate
+ microdescriptor. Clients will want this anyway when they load their
+ microdescriptor cache and want to match it up with the consensus to
+ see what's missing.
+
+3.3. Clients fetch them and cache them
+
+ When a client gets a new consensus, it looks to see if there are any
+ microdescriptors it needs to learn. If it needs to learn more than
+ some threshold of the microdescriptors (half?), it requests 'all',
+ else it requests only the missing ones.
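The client's choice between the two URLs from Section 3.2 can be sketched as below. The function name and the default threshold of one half are assumptions (the text itself only says "half?"). The sketch also treats hashes as opaque strings that are safe to place in a URL, which plain base64 is not, since it can contain "+" and "/"; the proposal does not say how that should be resolved.

```python
def plan_fetch(consensus_hashes, cached_hashes, hostname, threshold=0.5):
    """Return the URL a client would request, or None if nothing is missing.

    consensus_hashes: microdescriptor hashes listed in the new consensus.
    cached_hashes: set of hashes already in the client's cache.
    """
    missing = [h for h in consensus_hashes if h not in cached_hashes]
    if not missing:
        return None
    # Needing more than the threshold fraction: ask for everything.
    if len(missing) > threshold * len(consensus_hashes):
        return f"http://{hostname}/tor/micro/all.z"
    # Otherwise name only the missing ones, joined with "+".
    return f"http://{hostname}/tor/micro/d/{'+'.join(missing)}.z"
```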
+
+ Clients maintain a cache of microdescriptors along with metadata like
+ when each was last referenced by a consensus. They keep a microdescriptor
+ until it hasn't been mentioned in any consensus for a week. Future
+ clients might cache them for longer or shorter times.
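A minimal in-memory sketch of that cache policy follows. The class and method names are hypothetical, and a real client would also persist the cache to disk. A dict keyed by hash doubles as the quick hash-to-microdescriptor lookup that Section 3.2 asks for.

```python
WEEK = 7 * 24 * 60 * 60  # seconds


class MicrodescCache:
    """Maps microdescriptor hash -> [body, time last listed in a consensus]."""

    def __init__(self, max_age=WEEK):
        self._entries = {}
        self._max_age = max_age

    def add(self, md_hash, body, now):
        self._entries[md_hash] = [body, now]

    def get(self, md_hash):
        entry = self._entries.get(md_hash)
        return entry[0] if entry else None

    def note_consensus(self, consensus_hashes, now):
        """Refresh timestamps for cached entries; return hashes still missing."""
        missing = []
        for h in consensus_hashes:
            if h in self._entries:
                self._entries[h][1] = now
            else:
                missing.append(h)
        return missing

    def expire(self, now):
        """Drop entries no consensus has mentioned for max_age; return them."""
        stale = [h for h, (_, seen) in self._entries.items()
                 if now - seen >= self._max_age]
        for h in stale:
            del self._entries[h]
        return stale
```

The list returned by `note_consensus` is what the client would then fetch.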
+
+3.3.1. Information leaks from clients
+
+ If a client asks you for a set of microdescs, then you know she didn't
+ have them cached before. How much does that leak? What about when
+ we're all using our entry guards as directory guards, and we've seen
+ that user make a bunch of circuits already?
+
+ Fetching "all" when you need at least half is a good first-order fix,
+ but might not be all there is to it.
+
+ Another future option would be to fetch some of the microdescriptors
+ anonymously (via a Tor circuit).
+
+4. Transition and deployment
+
+ Phase one, the directory authorities should start voting on
+ microdescriptors and microdescriptor elements, and putting them in the
+ consensus. This should happen during the 0.2.1.x series, and should
+ be relatively easy to do.
+
+ Phase two, directory mirrors should learn how to serve them, and learn
+ how to read the consensus to find out what they should be serving. This
+ phase could be done either in 0.2.1.x or early in 0.2.2.x, depending
+ on how messy it turns out to be and how quickly we get around to it.
+
+ Phase three, clients should start fetching and caching them instead
+ of normal descriptors. This should happen post 0.2.1.x.
+
Property changes on: tor/trunk/doc/spec/proposals/158-microdescriptors.txt
___________________________________________________________________
Added: svn:keywords
+ Author Date Id Revision
Added: svn:mergeinfo
+
Added: svn:eol-style
+ native
Deleted: tor/trunk/doc/spec/proposals/ideas/xxx-microdescriptors.txt
===================================================================
--- tor/trunk/doc/spec/proposals/ideas/xxx-microdescriptors.txt 2009-01-18 18:56:28 UTC (rev 18171)
+++ tor/trunk/doc/spec/proposals/ideas/xxx-microdescriptors.txt 2009-01-18 18:57:20 UTC (rev 18172)
@@ -1,207 +0,0 @@
-Filename: xxx-microdescriptors.txt
-Title: Clients download consensus + microdescriptors
-Version: $Revision$
-Last-Modified: $Date$
-Author: Roger Dingledine
-Created: 17-Jan-2009
-Status: Open
-
-1. Overview
-
- This proposal replaces section 3.2 of proposal 141, which was
- called "Fetching descriptors on demand". Rather than modifying the
- circuit-building protocol to fetch a server descriptor inline at each
- circuit extend, we instead put all of the information that clients need
- either into the consensus itself, or into a new set of data about each
- relay called a microdescriptor. The microdescriptor is a direct
- transform from the relay descriptor, so relays don't even need to know
- this is happening.
-
- Descriptor elements that are small and frequently changing should go
- in the consensus itself, and descriptor elements that are small and
- relatively static should go in the microdescriptor. If we ever end up
- with descriptor elements that aren't small yet clients need to know
- them, we'll need to resume considering some design like the one in
- proposal 141.
-
-2. Motivation
-
- See
- http://archives.seul.org/or/dev/Nov-2008/msg00000.html and
- http://archives.seul.org/or/dev/Nov-2008/msg00001.html and especially
- http://archives.seul.org/or/dev/Nov-2008/msg00007.html
- for a discussion of the options and why this is currently the best
- approach.
-
-3. Design
-
- There are three pieces to the proposal. First, authorities will list in
- their votes (and thus in the consensus) what relay descriptor elements
- are included in the microdescriptor, and also list the expected hash
- of microdescriptor for each relay. Second, directory mirrors will serve
- microdescriptors. Third, clients will ask for them and cache them.
-
-3.1. Consensus changes
-
- V3 votes should include a new line:
- microdescriptor-elements bar baz foo
- listing each descriptor element (sorted alphabetically) that authority
- included when it calculated its expected microdescriptor hashes.
-
- We also need to include the hash of each expected microdescriptor in
- the routerstatus section. I suggest a new "m" line for each stanza,
- with the base64 of the hash of the elements that the authority voted
- for above.
-
- The consensus microdescriptor-elements and "m" lines are then computed
- as described in Section 3.1.2 below.
-
- I believe that means we need a new consensus-method "6" that knows
- how to compute the microdescriptor-elements and add "m" lines.
-
-3.1.1. Descriptor elements to include for now
-
- To start, the element list that authorities suggest should be
- family onion-key
-
- (Note that the or-dev posts above only mention onion-key, but if
- we don't also include family then clients will never learn it. It
- seemed like it should be relatively static, so putting it in the
- microdescriptor is smarter than trying to fit it into the consensus.)
-
- We could imagine a config option "family,onion-key" so authorities
- could change their voted preferences without needing to upgrade.
-
-3.1.2. Computing consensus for microdescriptor-elements and "m" lines
-
- One approach is for the consensus microdescriptor-elements line to
- include every element listed by a majority of authorities, sorted. The
- problem here is that it will no longer be deterministic what the correct
- hash for the "m" line should be. We could imagine telling the authority
- to go look in its descriptor and produce the right hash itself, but
- we don't want consensus calculation to be based on external data like
- that. (Plus, the authority may not have the descriptor that everybody
- else voted to use.)
-
- The better approach is to take the exact set that has the most votes
- (breaking ties by the set that has the most elements, and breaking
- ties after that by whichever is alphabetically first). That will
- increase the odds that we actually get a microdescriptor hash that
- is both a) for the descriptor we're putting in the consensus, and b)
- over the elements that we're declaring it should be for.
-
- Then the "m" line for a given relay is the one that gets the most votes
- from authorities that both a) voted for the microdescriptor-elements
- line we're using, and b) voted for the descriptor we're using.
-
- (If there's a tie, use the smaller hash. But really, if there are
- multiple such votes and they differ about a microdescriptor, we caught
- one of them lying or being buggy. We should log it to track down why.)
-
- If there are no such votes, then we leave out the "m" line for that
- relay. That means clients should avoid it for this time period. (As
- an extension it could instead mean that clients should fetch the
- descriptor and figure out its microdescriptor themselves. But let's
- not get ahead of ourselves.)
-
- It would be nice to have a more foolproof way to agree on what
- microdescriptor hash each authority should vote for, so we can avoid
- missing "m" lines. Just switching to a new consensus-method each time
- we change the set of microdescriptor-elements won't help though, since
- each authority will still have to decide what hash to vote for before
- knowing what consensus-method will be used.
-
- Here's one way we could do it. Each vote / consensus includes
- the microdescriptor-elements that were used to compute the hashes,
- and also a preferred-microdescriptor-elements set. If an authority
- has a consensus from the previous period, then it should use the
- consensus preferred-microdescriptor-elements when computing its votes
- for microdescriptor-elements and the appropriate hashes in the upcoming
- period. (If it has no previous consensus, then it just writes its
- own preferences in both lines.)
-
-3.2. Directory mirrors serve microdescriptors
-
- Directory mirrors should then read the microdescriptor-elements line
- from the consensus, and learn how to answer requests. (Directory mirrors
- continue to serve normal relay descriptors too, a) to serve old clients
- and b) to be able to construct microdescriptors on the fly.)
-
- The microdescriptors with hashes <D1>,<D2>,<D3> should be available at:
- http://<hostname>/tor/micro/d/<D1>+<D2>+<D3>.z
-
- All the microdescriptors from the current consensus should also be
- available at:
- http://<hostname>/tor/micro/all.z
- so a client that's bootstrapping doesn't need to send a 70KB URL just
- to name every microdescriptor it's looking for.
-
- The format of a microdescriptor is the header line
- "microdescriptor-header"
- followed by each element (keyword and body), alphabetically. There's
- no need to mention what hash it's for, since it's self-identifying:
- you can hash the elements to learn this.
-
- (Do we need a footer line to show that it's over, or is the next
- microdescriptor line or EOF enough of a hint? A footer line wouldn't
- hurt much. Also, no fair voting for the microdescriptor-element
- "microdescriptor-header".)
-
- The hash of the microdescriptor is simply the hash of the concatenated
- elements -- not counting the header line or hypothetical footer line.
- Unless you prefer that?
-
- Is there a reasonable way to version these things? We could say that
- the microdescriptor-header line can contain arguments which clients
- must ignore if they don't understand them. Any better ways?
-
- Directory mirrors should check to make sure that the microdescriptors
- they're about to serve match the right hashes (either the hashes from
- the fetch URL or the hashes from the consensus, respectively).
-
- We will probably want to consider some sort of smart data structure to
- be able to quickly convert microdescriptor hashes into the appropriate
- microdescriptor. Clients will want this anyway when they load their
- microdescriptor cache and want to match it up with the consensus to
- see what's missing.
-
-3.3. Clients fetch them and cache them
-
- When a client gets a new consensus, it looks to see if there are any
- microdescriptors it needs to learn. If it needs to learn more than
- some threshold of the microdescriptors (half?), it requests 'all',
- else it requests only the missing ones.
-
- Clients maintain a cache of microdescriptors along with metadata like
- when it was last referenced by a consensus. They keep a microdescriptor
- until it hasn't been mentioned in any consensus for a week. Future
- clients might cache them for longer or shorter times.
-
-3.3.1. Information leaks from clients
-
- If a client asks you for a set of microdescs, then you know she didn't
- have them cached before. How much does that leak? What about when
- we're all using our entry guards as directory guards, and we've seen
- that user make a bunch of circuits already?
-
- Fetching "all" when you need at least half is a good first order fix,
- but might not be all there is to it.
-
- Another future option would be to fetch some of the microdescriptors
- anonymously (via a Tor circuit).
-
-4. Transition and deployment
-
- Phase one, the directory authorities should start voting on
- microdescriptors and microdescriptor elements, and putting them in the
- consensus. This should happen during the 0.2.1.x series, and should
- be relatively easy to do.
-
- Phase two, directory mirrors should learn how to serve them, and learn
- how to read the consensus to find out what they should be serving. This
- phase could be done either in 0.2.1.x or early in 0.2.2.x, depending
- on how messy it turns out to be and how quickly we get around to it.
-
- Phase three, clients should start fetching and caching them instead
- of normal descriptors. This should happen post 0.2.1.x.
-