[freehaven-dev] Initial musings on the next Free Haven (part one of two)



P2P file sharing systems such as Free Haven can be considered as three
component layers: the transport layer (how to communicate things),
the PKI or lookup layer (how to keep track of things), and the backend
(what to do).
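
To pin the split down, here is a rough sketch in Python -- my own
framing, not actual Free Haven code, and the method names are just
placeholders -- of what each layer is responsible for:

  class Transport:          # how to communicate things
      def open_stream(self, peer): ...       # anonymous channel to a peer

  class Lookup:             # how to keep track of things (the "PKI" layer)
      def find_nodes(self): ...              # who is in the network right now?
      def find_shares(self, doc_key): ...    # where do shares of this doc live?

  class Backend:            # what to do
      def publish(self, document): ...       # split into shares, place them
      def retrieve(self, doc_key): ...       # gather enough shares, reassemble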

Our goals for Free Haven are threefold:

1) Strong Anonymity -- for publisher, for reader, for server.
2) Flexibility -- the network is dynamic (eg no static list of nodes).
3) Document Persistence -- we are a storage network, not a caching
   network.

In the original Free Haven (maybe I should start calling it Free Haven
classic), we addressed each layer as follows:

* Transport layer: the mythical ideal mix-net. The current mix-net
  infrastructure has unbearably high latency and is rumored to have
  low reliability. In any case, it's hard to use and not very well set
  up. The amount of actual privacy gained from the current implementations
  is not well understood. And reply blocks (eg for pseudonymous servers)
  are not reliable in the long term, since they tend to die as mix-net
  nodes die.

* PKI layer: broadcast. Everybody keeps track of everybody. That way
  anonymity is easier, nobody needs to keep track of where shares live,
  and we can make sure to keep the network connected. But this notion
  does not scale: with n servers, each one maintains state about all n
  others, and every query or membership change touches the whole network.

* Backend: a complex system of trades and trust metrics designed to
  confuse people about where shares currently live while still
  maintaining enough accountability to make sure that people aren't
  cheating 'too much'. This is hard to do well and difficult to model,
  so the only way to know whether it works is to try it and see.

It's time to try to replace the above solutions. They all suck.

Allow me to give a brief overview of a few primitives which might be
useful to us down the road.

* Tarzan: a Decentralized Stream-based Anonymizing Network.
  Tarzan employs a decentralized network which emphasizes simplicity
  and extensibility, and allows anonymous streaming connections between
  peers. Tarzan is designed to be a low-level infrastructure which
  provides either one-way or two-way streams, where one or both endpoints
  can be behind firewalls. http://freehaven.net/tarzan/doc/tarzan.ps
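
  To make the shape of that interface concrete, here is a purely
  hypothetical sketch -- not Tarzan's actual API, and the names are
  invented -- of what our backend would want from a Tarzan-like layer:

    import socket

    class AnonymousTransport:
        """Stand-in for a Tarzan-like layer: relay a stream through a
        chain of peers so neither endpoint learns the other's address."""
        def __init__(self, entry_host, entry_port):
            # our first-hop relay (an assumption made for this illustration)
            self.entry = (entry_host, entry_port)

        def open_stream(self, peer_pseudonym):
            # a real implementation would build a multi-hop circuit with
            # layered encryption; here we just hand the request to the relay
            s = socket.create_connection(self.entry)
            s.sendall(("CONNECT %s\n" % peer_pseudonym).encode())
            return s   # the backend speaks its own protocol over this stream

  The point is that the backend never sees IP addresses, only streams
  to pseudonymous peers.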

* Chord. Chord is a key/value lookup system. It runs over a network of
  n nodes, and any lookup or update touches only O(log n) of them. With
  caching and replication, typical lookups touch far fewer nodes than
  that. In the simplest sense, Chord can
  be used as a decentralized PKI replacement. That is, we can use it
  to keep track of which nodes are participating in the system, and
  whatever metadata we want to track about nodes (eg, what subsets
  of the hashspace they're responsible for). Similarly, we can use
  it to track the location of shares of documents stored in the
  system. http://theory.lcs.mit.edu/~karger/chord.ps
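
  As a toy illustration of the kind of lookup this gives us -- this is
  just consistent hashing on one identifier ring, not the real Chord
  code; real Chord adds finger tables so a lookup touches O(log n)
  nodes instead of scanning a full membership list like this does:

    import hashlib
    from bisect import bisect_left

    def ident(name, bits=160):
        # map a node name or document key onto the identifier ring
        return int(hashlib.sha1(name.encode()).hexdigest(), 16) % (2 ** bits)

    class ToyRing:
        def __init__(self, node_names):
            # sorted (node_id, name) pairs -- our stand-in for the PKI layer
            self.ring = sorted((ident(n), n) for n in node_names)
            self.ids = [nid for nid, _ in self.ring]

        def successor(self, key):
            # the responsible node is the first one whose id >= ident(key),
            # wrapping around the ring
            i = bisect_left(self.ids, ident(key)) % len(self.ring)
            return self.ring[i][1]

    ring = ToyRing(["server-a", "server-b", "server-c", "server-d"])
    print(ring.successor("share:somedoc:3"))   # who should track this share?

  The same trick works whether the keys are node identities or
  document/share names, which is why it can stand in for both halves
  of the PKI layer.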

* Entanglements. This is an idea due to Marc Waldman and David Mazieres,
  and it's not published yet so I'm only going to outline it here.
  Basically, when you're doing secret sharing to produce shares for a
  new document, you "reuse" old shares from other previously published
  documents, for half of the new shares. Thus you make your new document
  *depend* on those other shares, and thus on those other documents. So
  a given share will survive based on the popularity of the most popular
  document it's associated with (give or take). This also
  produces a new notion of plausible deniability, since there is no longer
  a deterministic link between fetching a given share and intending to
  fetch a specific document.
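
  Since the paper isn't out, here is only a much-simplified illustration
  of the dependency entanglement creates, using plain XOR all-or-nothing
  sharing rather than whatever threshold scheme the real design uses;
  the names and block size are mine:

    import os

    BLOCK = 1024   # bytes per share; fixed size assumed for this sketch

    def xor(a, b):
        return bytes(x ^ y for x, y in zip(a, b))

    def entangle(document_block, old_share_1, old_share_2):
        # reuse two shares that already exist in the network, generate one
        # fresh random share, and compute one new share so that XORing all
        # four recovers the block.  Half the shares are literally old shares
        # from other documents, so this document now depends on them.
        random_share = os.urandom(BLOCK)
        new_share = xor(xor(document_block, old_share_1),
                        xor(old_share_2, random_share))
        return [old_share_1, old_share_2, random_share, new_share]

    def recover(shares):
        block = bytes(BLOCK)          # all zeroes
        for s in shares:
            block = xor(block, s)
        return block

  Fetching old_share_1 is now consistent with wanting either document,
  which is where the extra deniability comes from.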

Stay tuned for part two of these musings, entitled "a straw man free
haven design," wherein I try to convince myself that we should replace
the (Mix-net, broadcast, trust/trade) triplet with (Tarzan, Chord,
Entanglements).
(In several days, at this rate.)

--Roger