On Mon, May 01, 2006 at 01:30:04PM -0500, Krishna Sankar wrote:

Hi, Krishna, and thanks for your interest! I've tried to answer your
questions, and I've written a bit more about the security issues that
exist, and about Tor's current scalability issues.

> Am looking into couple of solutions to separate the storage/lookup
> system - all using DHT, possibly a set of primitives on top of
> Chord. Couple of quick questions:
> a) Is this still a project to work on ? Who else is
> working on this ?

As far as I know, nobody is working on this; it could be neat to get
results, but see below.

> b) How many servers and how many entries would normally
> be there ? Would help if I have some order of magnitude - back
> of the envelope type calculations

It's hard to say; right now, we've got over 500 Tor servers, of which
most could be directory caches. We also have 5 directory authorities.

The reason it's hard to say how many entries and servers we'd have in
a DHT scenario is that the whole point of changing our directory
structure would be to scale to more servers than we have now.

> c) What kind of redundancy could we plan on ? Say 3
> distinct copies of entries ?

You'd have to decide on that based on reliability and anonymity
requirements.

> d) How esoteric do we want this to be ? Is it Ok to take
> time to rebalance ?

Again, this depends on requirements.

> e) Any thoughts on SCTP as an inter-server protocol ?
> Better heartbeat, multiple data and control channels et al

Some thoughts. I did an evaluation for SCTP here:
  http://archives.seul.org/or/dev/Sep-2004/msg00002.html

There are severe crypto and anonymity issues involved; instead of
encryption inside SCTP, we'd need SCTP-inside-encryption. More
critically, there isn't enough SCTP support in the wild for us to
realistically require servers to have it installed.

> f) Haven't yet focused on the security aspects, which is
> my next TBD. Thoughts ?

The security implications are *critical*; you shouldn't even be
thinking about stuff like rebalancing until you have those settled.
It is far easier to come up with a DHT algorithm that works than a DHT
algorithm that works in the presence of a strong attacker trying to
break it, or to use it in order to subvert users' anonymity.

You should probably start by understanding the current Tor directory
protocol, at:
  http://tor.eff.org/cvs/doc/dir-spec.txt

This protocol is adequate for security, but it has a few critical
problems that impede scalability:

  1. It requires every client to know about every server.
  2. It expects every directory cache to know about every server.
  3. It expects every server to potentially connect to every server.

Changing (1) is important. If we move to a more P2P model where (say)
every client is potentially a server, we simply can't have every
client know every server. But it's hard to change (1) without enabling
partitioning attacks, where an attacker exploits knowledge of a
particular client's knowledge to deduce which circuits might or might
not have come from a particular client. It's also hard to change (1)
without introducing scenarios where hostile directory caches can
influence clients' knowledge in order to influence their choice of
servers. I bet a solution is possible, though.

Fixing (3) is also important, if only because you can't have 200K TCP
ports in this sad world of ours. But once we go to a non-clique P2P
topology, many of our assumptions about eavesdropping and traceability
get questioned. Also, we'll need to figure out how to pick the
non-clique topology in question.

Changing (2) would be addressed by a DHT-like server lookup mechanism,
but it isn't IMO so critical as (1) and (3). Suppose (an insanely
successful case) 100K servers: that makes about 260 MB for the whole
directory; 130 MB compressed. That's a lot of bytes, but it's not so
much that every directory cache couldn't bittorrent the whole thing
every, say, 24 hours.
(Yes, this would kinda suck, but not so bad as requiring the clients
to bittorrent it as well.)

You should probably also read sections 5.2 and 4.5 of
  http://tor.eff.org/cvs/doc/design-paper/challenges.pdf .
They aren't right, but they're more right than they are wrong.

>
> Cheers
> Krishna
>
> P.S: For many reasons, I have tried to send this mail a few times
> and obviously, have not yet succeeded ;o( Will keep on trying ...

I hope this helps; please let us know if you've got any questions.

--
Nick Mathewson
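
For anyone who wants to redo the back-of-the-envelope numbers for the
100K-server case above, here is a rough sketch in Python. The server
count, the 260 MB / 130 MB sizes, and the 24-hour refresh come from the
message; the function names are made up for illustration, 1 MB is
assumed to mean 10**6 bytes, and the clique count simply assumes one
link per pair of servers. Treat it as a sanity check, not a model of
how Tor actually behaves.

```python
# Figures quoted in the message above.
SERVERS = 100_000        # the "insanely successful" case
DIRECTORY_MB = 260       # whole directory, uncompressed
COMPRESSED_MB = 130      # whole directory, compressed
REFRESH_HOURS = 24       # "every, say, 24 hours"

BYTES_PER_MB = 1_000_000  # assumption: decimal megabytes


def bytes_per_descriptor(total_mb, servers):
    """Average per-server descriptor size implied by a directory size."""
    return total_mb * BYTES_PER_MB / servers


def average_kbit_per_second(total_mb, hours):
    """Average rate needed to fetch total_mb once every `hours` hours."""
    bits = total_mb * BYTES_PER_MB * 8
    return bits / (hours * 3600) / 1000


def clique_links(servers):
    """Number of server pairs if every server may connect to every other."""
    return servers * (servers - 1) // 2


if __name__ == "__main__":
    print("bytes per descriptor:",
          bytes_per_descriptor(DIRECTORY_MB, SERVERS))
    print("kbit/s per cache (compressed):",
          round(average_kbit_per_second(COMPRESSED_MB, REFRESH_HOURS), 2))
    print("peers per server in a clique:", SERVERS - 1)
    print("server pairs in a clique:", clique_links(SERVERS))
```

The point of the sketch is only that the per-cache download rate is
modest next to the per-server peer count in a clique, which is why (2)
looks less urgent than (1) and (3).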