
Re: 0.0.8pre1 works: now what?



In my eyes, there are three big issues remaining. We need to:
1) let clients describe how comfortable they are using unverified servers
in various positions in their paths --- and recommend good defaults.
2) detect which servers are suitable for given streams.
3) reduce dirserver bottlenecks.

---------------------------------------------------------------------
Part 1: configuring clients; and good defaults.

Nodes should have an 'AllowUnverified' config option, which takes
combinations of entry/exit/middle, or any.
I agree it's difficult to decide on a reasonable default value. Using clients only as exit nodes increases a user's risk only slightly, because the exit node can't learn much without colluding with a server node that may be picked as the first hop. Exit plus middle nodes as clients is also quite safe. Picking clients as entries is risky, since the client may own or observe the web (or whatever) server. Using clients as both entry and exit in the same path is the highest risk, because then the adversary no longer needs to own or observe the web server at all.

The quite paranoid default could therefore be "use clients as exit and/or middle nodes in a path"; the less paranoid one is "use clients as entry/middle/exit, but never as both entry and exit in the same path". Other choices do not make much sense in my opinion.
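As a concrete sketch of how a client might enforce such a policy (Python; the policy set and all helper names are hypothetical, not actual Tor code):

```python
# "Quite paranoid" default: unverified nodes may only serve as middle or exit.
ALLOW_UNVERIFIED = {"middle", "exit"}

def position_ok(position, verified):
    """Allow an unverified node only in positions the policy permits."""
    return verified or position in ALLOW_UNVERIFIED

def path_ok(path):
    """path: list of (position, verified) pairs, entry first, exit last."""
    if not all(position_ok(pos, ver) for pos, ver in path):
        return False
    # Even under a looser policy, never let unverified nodes hold both
    # the entry and the exit slot of the same circuit.
    if not path[0][1] and not path[-1][1]:
        return False
    return True
```

With the looser "entry/middle/exit, but never entry and exit together" policy, only the ALLOW_UNVERIFIED set changes; the final check stays the same.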

What's funny is that the really paranoid user wants others to use his node as an entry node, yet picking clients as entries is relatively high risk. Similarly, most users would use clients as exits, but only a few clients will be willing to act as exits. Could this mean that the potential to "unload" traffic onto the clients is quite small? Or, to put it differently, can we somehow shift the incentive for clients from entry (which currently gives you better anonymity) to exit (which currently could give you trouble)? I don't have an answer, but I believe this is a key problem to be solved on the way towards a hybrid Tor where a significant share (in fact most) of the traffic is handled by clients.


---------------------------------------------------------------------
Part 2: choosing suitable servers.

If we want to maintain the high quality of the Tor network, we need a
way to determine and indicate bandwidth (aka latency) and reliability
properties for each server.

Approach one: ask people to only sign up if they're high-quality nodes,
and also require them to send us an explanation in email so we can approve
their server. This works quite well, but if we take the required email
out of the picture, bad servers might start popping out of the woodwork.
(It's amazing how many people don't follow instructions.)
Maybe not a bad choice to start with, at least until you get very many e-mails a day.

Approach two: nodes track their own uptime, and estimate their max
bandwidth. The way they track their max bandwidth right now is by
recording whenever bytes go in or out, and remembering a rolling average
over the past ten seconds, and then also the maximum rolling-average
observed in the past 12 hours. Then the estimated bandwidth is the smaller
of the in-max and the out-max. They report this in the descriptor they
upload, rounding it down to the nearest 10KB, and capping anything over
100KB to 100KB. Clients could be more likely to choose nodes with higher
bandwidth entries (maybe from a linear distribution, maybe something
else -- thoughts?).
Sounds reasonable. Simply picking clients at random (bandwidth-dependent) for a circuit may not be the best option, though. Better would be classifying clients according to their usefulness for specific applications. E.g., that 10KB node (or even less) that is online 23+ hours a day is a great choice for remote logins, and this 100KB client that is usually online just one hour a day is just fine for web browsing. The disadvantage is that this requires circuit setup to be application-aware.
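The advertised-bandwidth calculation and the bandwidth-weighted choice described above could be sketched like this (Python; function names are made up, and the 10-second rolling-average bookkeeping is omitted, with only the 12-hour in/out maxima taken as inputs):

```python
import random

ROUND = 10 * 1024   # round down to the nearest 10KB
CAP = 100 * 1024    # cap anything over 100KB at 100KB

def advertised_bandwidth(max_in_rate, max_out_rate):
    """Smaller of the 12-hour in-max and out-max (bytes/sec),
    rounded down to 10KB and capped at 100KB."""
    bw = min(max_in_rate, max_out_rate)
    bw = (bw // ROUND) * ROUND
    return min(bw, CAP)

def pick_node(nodes):
    """Choose a node with probability proportional to its advertised
    bandwidth (the 'linear distribution' idea). nodes: name -> bandwidth."""
    total = sum(nodes.values())
    r = random.uniform(0, total)
    for name, bw in nodes.items():
        r -= bw
        if r <= 0:
            return name
    return name  # guard against floating-point edge cases
```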

Since uptime is published too, some streams (such as irc or aim) prefer
reliability to latency. Maybe we should prefer latency by default,
and have a Config option StreamPrefersReliability to specify by port
(or by addr:port, or anything exit-policy-style), that looks at uptime
rather than advertised bandwidth.
OK, that's about what I meant above :-)
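A per-port preference along these lines might look like the following sketch (Python; the port set and field names are illustrative, and StreamPrefersReliability is just the proposed option name, not an existing one):

```python
# Ports whose streams should prefer reliability (uptime) over latency,
# roughly what a StreamPrefersReliability option might express.
RELIABILITY_PORTS = {6667, 5190}  # e.g. irc, aim

def exit_score(node, dest_port):
    """Rank exits by uptime for long-lived ports, by bandwidth otherwise.
    node: dict with 'uptime' (seconds) and 'bandwidth' (bytes/sec)."""
    if dest_port in RELIABILITY_PORTS:
        return node["uptime"]
    return node["bandwidth"]

def choose_exit(nodes, dest_port):
    """Pick the best-scoring exit for this destination port."""
    return max(nodes, key=lambda n: exit_score(n, dest_port))
```

Extending the match from a bare port to addr:port or full exit-policy-style patterns would only change the RELIABILITY_PORTS lookup.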

And of course, notice that we're trusting the servers to not lie. We
could do spot-checking by the dirservers, but I'm not sure what we would
do if one dirserver thought something was up, and the others were fine.
At least for jerks the dirservers can agree about, maybe we could
configure the dirservers to blacklist descriptors from certain IP spaces.
In general, although I fully agree that making use of the clients is a necessary step, including clients is extremely risky. First of all, new users may soon get frustrated and leave if performance is poor because of a few unreliable clients (I agree that we have the same problem if the Tor core, i.e. the servers, gets overloaded). This means that the clients offering service should offer really good service, which will be extremely hard to guarantee. The other issue is that some users may find it funny to disrupt the QoS. I'm not so much concerned about adversaries running many nodes (not until Tor really gets big), but more about clients that randomly drop the circuits of others. This would easily reduce Tor to its core again, because nobody would use other clients.

In any case, all your questions are extremely difficult to answer. I believe the right way to go is to come up with a reasonable design to start with (it won't be perfect; only time and experience will tell) and give it a try, the same way Tor has done since its first public release. Falling back to the Tor core only is always an option. What follows depends on the popularity of Tor. If 100 clients act as relays, not many problems should arise (despite poor service for the other 99900 clients... I'm very curious about how many of the clients will relay data for others). Node discovery must likely change if 1000s of clients are offering to relay data for others. Maybe there will be too many clients offering poor service and Tor will perform poorly. It may then be time to think about a reputation scheme again...

--Marc