[Author Prev][Author Next][Thread Prev][Thread Next][Author Index][Thread Index]

[or-cvs] r15552: Add proposal 143: Improvements of Distributed Storage for To (tor/trunk/doc/spec/proposals)



Author: nickm
Date: 2008-06-28 15:25:39 -0400 (Sat, 28 Jun 2008)
New Revision: 15552

Added:
   tor/trunk/doc/spec/proposals/143-distributed-storage-improvements.txt
Modified:
   tor/trunk/doc/spec/proposals/000-index.txt
Log:
Add proposal 143: Improvements of Distributed Storage for Tor Hidden Service Descriptors

Modified: tor/trunk/doc/spec/proposals/000-index.txt
===================================================================
--- tor/trunk/doc/spec/proposals/000-index.txt	2008-06-28 18:47:05 UTC (rev 15551)
+++ tor/trunk/doc/spec/proposals/000-index.txt	2008-06-28 19:25:39 UTC (rev 15552)
@@ -65,6 +65,7 @@
 140  Provide diffs between consensuses [OPEN]
 141  Download server descriptors on demand [DRAFT]
 142  Combine Introduction and Rendezvous Points [OPEN]
+143  Improvements of Distributed Storage for Tor Hidden Service Descriptors [OPEN]
 
 
 Proposals by status:
@@ -83,6 +84,7 @@
    137  Keep controllers informed as Tor bootstraps
    140  Provide diffs between consensuses
    142  Combine Introduction and Rendezvous Points
+   143  Improvements of Distributed Storage for Tor Hidden Service Descriptors
  NEEDS-REVISION:
    110  Avoiding infinite length circuits
    117  IPv6 exits

Added: tor/trunk/doc/spec/proposals/143-distributed-storage-improvements.txt
===================================================================
--- tor/trunk/doc/spec/proposals/143-distributed-storage-improvements.txt	                        (rev 0)
+++ tor/trunk/doc/spec/proposals/143-distributed-storage-improvements.txt	2008-06-28 19:25:39 UTC (rev 15552)
@@ -0,0 +1,195 @@
+Filename: 143-distributed-storage-improvements.txt
+Title: Improvements of Distributed Storage for Tor Hidden Service Descriptors
+Version: $Revision$
+Last-Modified: $Date$
+Author: Karsten Loesing
+Created: 28-Jun-2008
+Status: Open
+
+Change history:
+
+  28-Jun-2008  Initial proposal for or-dev
+
+Overview:
+
+  An evaluation of the distributed storage for Tor hidden service
+  descriptors and subsequent discussions have brought up a few improvements
+  to proposal 114. All improvements are backwards compatible to the
+  implementation of proposal 114.
+
+Design:
+
+  1. Report Bad Directory Nodes
+
+  Bad hidden service directory nodes could deny existence of previously
+  stored descriptors. A bad directory node that does this with all stored
+  descriptors causes harm to the distributed storage in general, but
+  replication will cope with this problem in most cases. However, an
+  adversary that attempts to make a specific hidden service unavailable by
+  running relays that become responsible for all of a service's
+  descriptors poses a more serious threat. The distributed storage needs to
+  defend against this attack by detecting and removing bad directory nodes.
+
+  As a countermeasure hidden services try to download their descriptors
+  every hour at random times from the hidden service directories that are
+  responsible for storing it. If a directory node replies with 404 (Not
+  found), the hidden service reports the supposedly bad directory node to
+  a random selection of half of the directory authorities (with version
+  numbers equal to or higher than the first version that implements this
+  proposal). The hidden service posts a complaint message using HTTP 'POST'
+  to a URL "/tor/rendezvous/complain" with the following message format:
+
+    "hidden-service-directory-complaint" identifier NL
+
+      [At start, exactly once]
+
+      The identifier of the hidden service directory node to be
+      investigated.
+
+    "rendezvous-service-descriptor" descriptor NL
+
+      [At end, Excatly once]
+
+      The hidden service descriptor that the supposedly bad directory node
+      does not serve.
+
+  The directory authority checks if the descriptor is valid and the hidden
+  service directory responsible for storing it. It waits for a random time
+  of up to 30 minutes before posting the descriptor to the hidden service
+  directory. If the publication is acknowledged, the directory authority
+  waits another random time of up to 30 minutes before attempting to
+  request the descriptor that it has posted. If the directory node replies
+  with 404 (Not found), it will be blacklisted for being a hidden service
+  directory node for the next 48 hours.
+
+  A blacklisted hidden service directory is assigned the new flag BadHSDir
+  instead of the HSDir flag in the vote that a directory authority creates.
+  In a consensus a relay is only assigned a HSDir flag if the majority of
+  votes contains a HSDir flag and no more than one third of votes contains
+  a BadHSDir flag. As a result, clients do not have to learn about the
+  BadHSDir flag. A blacklisted directory node will simply not be assigned
+  the HSDir flag in the consensus.
+
+  In order to prevent an attacker from setting up new nodes as replacement
+  for blacklisted directory nodes, all directory nodes in the same /24
+  subnet are blacklisted, too. Furthermore, if two or more directory nodes
+  are blacklisted in the same /16 subnet concurrently, all other directory
+  nodes in that /16 subnet are blacklisted, too. Blacklisting holds for at
+  most 48 hours.
+
+  2. Publish Fewer Replicas
+
+  The evaluation has shown that the probability of a directory node to
+  serve a previously stored descriptor is 85.7% (more precisely, this is
+  the 0.001-quantile of the empirical distribution with the rationale that
+  it holds for 99.9% of all empirical cases). If descriptors are replicated
+  to x directory nodes, the probability of at least one of the replicas to
+  be available for clients is 1 - (1 - 85.7%) ^ x. In order to achieve an
+  overall availability of 99.9%, x = 3.55 replicas need to be stored. From
+  this follows that 4 replicas are sufficient, rather than the currently
+  stored 6 replicas.
+
+  Further, the current design stores 2 sets of descriptors on 3 directory
+  nodes with consecutive identities. Originally, this was meant to
+  facilitate replication between directory nodes, which has not been and
+  will not be implemented (the selection criterion of 24 hours uptime does
+  not make it necessary). As a result, storing descriptors on directory
+  nodes with consecutive identities is not required. In fact it should be
+  avoided to enable an attacker to create "black holes" in the identifier
+  ring.
+
+  Hidden services should store their descriptors on 4 non-consecutive
+  directory nodes, and clients should request descriptors from these
+  directory nodes only. For compatibility reasons, hidden services also
+  store their descriptors on 2 consecutive directory nodes. Hence, 0.2.0.x
+  clients will be able to retrieve 4 out of 6 descriptors, but will fail
+  for the remaining 2 descriptors, which is sufficient for reliability. As
+  soon as 0.2.0.x is deprecated, hidden services can stop publishing the
+  additional 2 replicas.
+
+  3. Change Default Value of Being Hidden Service Directory
+
+  The requirements for becoming a hidden service directory node are an open
+  directory port and an uptime of at least 24 hours. The evaluation has
+  shown that there are 300 hidden service directory candidates in the mean,
+  but only 6 of them are configured to act as hidden service directories.
+  This is bad, because those 6 nodes need to serve a large share of all
+  hidden service descriptors. Optimally, there should be hundreds of hidden
+  service directories. Having a large number of 0.2.1.x directory nodes
+  also has a positive effect on 0.2.0.x hidden services and clients.
+
+  Therefore, the new default of HidServDirectoryV2 should be 1, so that a
+  Tor relay that has an open directory port automatically accepts and
+  serves v2 hidden service descriptors. A relay operator can still opt-out
+  running a hidden service directory by changing HidServDirectoryV2 to 0.
+  The additional bandwidth requirements for running a hidden service
+  directory node in addition to being a directory cache are negligible.
+
+  4. Make Descriptors Persistent on Directory Nodes
+
+  Hidden service directories that are restarted by their operators or after
+  a failure will not be selected as hidden service directories within the
+  next 24 hours. However, some clients might still think that these nodes
+  are responsible for certain descriptors, because they work on the basis
+  of network consensuses that are up to three hours old. The directory
+  nodes should be able to serve the previously received descriptors to
+  these clients. Therefore, directory nodes make all received descriptors
+  persistent and load previously received descriptors on startup.
+
+  5. Store and Serve Descriptors Regardless of Responsibility
+
+  Currently, directory nodes only accept descriptors for which they think
+  they are responsible. This may lead to problems when a directory node
+  uses an older or newer network consensus than hidden service or client
+  or when a directory node has been restarted recently. In fact, there are
+  no security issues in storing or serving descriptors for which a
+  directory node thinks it is not responsible. To the contrary, doing so
+  may improve reliability in border cases. As a result, a directory node
+  does not pay attention to responsibilty when receiving a publication or
+  fetch request, but stores or serves the requested descriptor. Likewise,
+  the directory node does not remove descriptors when it thinks it is not
+  responsible for them any more.
+
+  6. Avoid Periodic Descriptor Re-Publication
+
+  In the current implementation a hidden service re-publishes its
+  descriptor either when its content changes or an hour elapses. However,
+  the evaluation has shown that failures of hidden service directory nodes,
+  i.e. of nodes that have not failed within the last 24 hours, are very
+  rare. Together with making descriptors persistent on directory nodes,
+  there is no necessity to re-publish descriptors hourly.
+
+  The only two events leading to descriptor re-publication should be a
+  change of the descriptor content and a new directory node becoming
+  responsible for the descriptor. Hidden services should therefore consider
+  re-publication every time they learn about a new network consensus
+  instead of hourly.
+
+  7. Discard Expired Descriptors
+
+  The current implementation lets directory nodes keep a descriptor for two
+  days before discarding it. However, with the v2 design, descriptors are
+  only valid for at most one day. Directory nodes should determine the
+  validity of stored descriptors and discard them one hour after they have
+  expired (to compensate wrong clocks on clients).
+
+  8. Shorten Client-Side Descriptor Fetch History
+
+  When clients try to download a hidden service descriptor, they memorize
+  fetch requests to directory nodes for up to 15 minutes. This allows them
+  to request all replicas of a descriptor to avoid bad or failing directory
+  nodes, but without querying the same directory node twice.
+
+  The downside is that a client that has requested a descriptor without
+  success, will not be able to find a hidden service that has been started
+  during the following 15 minutes after the client's last request.
+
+  This can be improved by shortening the fetch history to only 5 minutes.
+  This time should be sufficient to complete requests for all replicas of a
+  descriptor, but without ending in an infinite request loop.
+
+Compatibility:
+
+  All proposed improvements are compatible to the currently implemented
+  design as described in proposal 114.
+


Property changes on: tor/trunk/doc/spec/proposals/143-distributed-storage-improvements.txt
___________________________________________________________________
Name: svn:keywords
   + Author Date Id Revision
Name: svn:eol-style
   + native