
Re: Hidden Descriptor and DHT



On Mon, May 01, 2006 at 01:30:04PM -0500, Krishna Sankar wrote:

Hi, Krishna, and thanks for your interest!  I've tried to answer your
questions, and I've written a bit more about the security issues that
exist, and about Tor's current scalability issues.

> Am looking into couple of solutions to separate the storage/lookup
> system - all using DHT, possibly a set of primitives on top of
> Chord. Couple of quick questions:
> 	a)	Is this still a project to work on ? Who else is
> 	working on this ?

As far as I know, nobody is working on this; it could be neat to get
results, but see below.

> 	b)	How many servers and how many entries would normally
> 	be there ? Would help if I have some order of magnitude - back
> 	of the envelope type calculations

It's hard to say; right now, we've got over 500 Tor servers, of which
most could be directory caches.  We also have 5 directory authorities.
The reason it's hard to say how many entries and servers we'd have in
a DHT scenario is that the whole point of changing our directory
structure would be to scale to more servers than we have now.

> 	c)	What kind of redundancy could we plan on ? Say 3
> 	distinct copies of entries ?

You'd have to decide on that based on reliability and anonymity
requirements.

> 	d)	How esoteric do we want this to be ? Is it Ok to take
> 	time to rebalance ?

Again, this depends on requirements.

> 	e)	Any thoughts on SCTP as an inter-server protocol ?
> 	Better heartbeat, multiple data and control channels et al

Some thoughts.  I wrote up an evaluation of SCTP here:
  http://archives.seul.org/or/dev/Sep-2004/msg00002.html

There are severe crypto and anonymity issues involved; instead of
encryption inside SCTP, we'd need SCTP-inside-encryption.  More
critically, there isn't enough SCTP support in the wild for us to
realistically require servers to have it installed.

> 	f)	Haven't yet focused on the security aspects, which is
> 	my next TBD. Thoughts ?

The security implications are *critical*; you shouldn't even be
thinking about stuff like rebalancing until you have those settled.
It is far easier to come up with a DHT algorithm that works than a DHT
algorithm that works in the presence of a strong attacker trying to
break it, or to use it in order to subvert users' anonymity.

You should probably start by understanding the current Tor directory
protocol, at:
      http://tor.eff.org/cvs/doc/dir-spec.txt

This protocol is adequate for security, but it has a few critical
problems that impede scalability:

   1. It requires every client to know about every server.
   2. It expects every directory cache to know about every server.
   3. It expects every server to potentially connect to every server.

Changing (1) is important.  If we move to a more P2P model where (say)
every client is potentially a server, we simply can't have every
client know every server.  But it's hard to change (1) without enabling
partitioning attacks, where an attacker uses what he knows about each
client's view of the network to deduce which circuits might or might
not have come from a particular client.  It's also hard to change (1)
without introducing scenarios where hostile directory caches can
influence clients' knowledge in order to influence their choice of
servers.  I bet a solution is possible, though.
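
To make the partitioning concern concrete, here is a toy sketch.  The
client names, server names, and the candidate_clients helper are all
hypothetical; none of this is Tor code.  It assumes the attacker
already knows which subset of servers each client has learned about,
and shows how a circuit's hops then narrow down who could have built it:

```python
# Toy model of a partitioning attack (all names here are made up).
# Each client knows only a subset of the servers.
client_views = {
    "alice": {"s1", "s2", "s3", "s4"},
    "bob": {"s3", "s4", "s5", "s6"},
    "carol": {"s1", "s4", "s5", "s6"},
}


def candidate_clients(circuit, views):
    """Return the clients whose known-server set covers every hop."""
    hops = set(circuit)
    return sorted(name for name, known in views.items() if hops <= known)


# A hop that everyone knows tells the attacker nothing.
print(candidate_clients(["s4"], client_views))
# Hops known to only some clients shrink the anonymity set.
print(candidate_clients(["s5", "s6", "s4"], client_views))
print(candidate_clients(["s1", "s2", "s3"], client_views))
```

When every client knows every server, as in the current design, every
circuit is consistent with every client, so this filter removes nobody.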

Fixing (3) is also important, if only because you can't have 200K TCP
ports in this sad world of ours.  But once we go to a non-clique P2P
topology, many of our assumptions about eavesdropping and traceability
get questioned.  Also, we'll need to figure out how to pick the
non-clique topology in question.
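
A quick sketch of why the clique doesn't scale, assuming the worst case
where every server really does hold a connection to every other server
(the clique_links helper is hypothetical, just for the arithmetic):

```python
def clique_links(n):
    """Connections needed if each of n servers connects to all the others."""
    per_server = n - 1            # one connection to each other server
    total = n * (n - 1) // 2      # each connection is shared by two servers
    return per_server, total


for n in (500, 100_000, 200_000):
    per_server, total = clique_links(n)
    print(f"{n:>7} servers: {per_server:>7} per server, {total:>14,} total")
```

At 200K servers, each server would potentially need 199,999
connections, well past the 65,536 values a 16-bit TCP port number can
take.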

(2) could be addressed by a DHT-like server lookup mechanism, but IMO
it isn't as critical as (1) and (3).  Suppose (in an insanely
successful case) we had 100K servers: that makes about 260 MB for the
whole directory; 130 MB compressed.  That's a lot of bytes, but it's
not so
much that every directory cache couldn't bittorrent the whole thing
every, say, 24 hours.  (Yes, this would kinda suck, but not so bad as
requiring the clients to bittorrent it as well.)
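
The back-of-the-envelope numbers above can be unpacked as follows.
This only rearranges the figures already given (100K servers, 260 MB,
130 MB compressed, one fetch per 24 hours); treating 1 MB as 10^6 bytes
and spreading the download evenly over the day are assumptions made
for the sketch:

```python
SERVERS = 100_000                 # the "insanely successful" case
DIR_BYTES = 260 * 10**6           # ~260 MB uncompressed (1 MB = 10^6 bytes assumed)
COMPRESSED_BYTES = 130 * 10**6    # ~130 MB compressed
SECONDS_PER_DAY = 24 * 60 * 60    # one full fetch every 24 hours

# Implied average size of one server's directory entry.
bytes_per_server = DIR_BYTES / SERVERS

# Average download rate per cache if the fetch is spread over the whole day.
avg_rate = COMPRESSED_BYTES / SECONDS_PER_DAY

print(f"implied entry size: {bytes_per_server:.0f} bytes per server")
print(f"average cache download rate: {avg_rate / 1000:.2f} KB/s")
```

That works out to roughly 2.6 KB per server entry and about 1.5 KB/s
of sustained download per directory cache.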

You should probably also read sections 5.2 and 4.5 of
      http://tor.eff.org/cvs/doc/design-paper/challenges.pdf .
They aren't right, but they're more right than they are wrong. 

> 
> Cheers
> Krishna
> 
> P.S: For many reasons, I have tried to send this mail a few times
> and obviously, have not yet succeeded ;o( Will keep on trying ...

I hope this helps; please let us know if you've got any questions.

-- 
Nick Mathewson
