
Re: Exit Balancing Patch



On Thursday 26 July 2007 11:57:44 Mike Perry wrote:
>[..]
>
> Currently, there is a LOT of unused bandwidth in the top 15% of the
> nodes in the network due to load balancing issues. I've prepared some
> charts for my Defcon talk to illustrate this imbalancing.
>
Hi Mike,

Apologies to everyone if I'm restating the obvious here, but I'm trying to 
piece together an understanding of the issue from this and other threads. 
Everyone else is familiar with the issues so assumptions are made that have 
to be decoded by the novice.

You don't say so in your post, but I take it the percentile ranges were taken 
in descending order of advertised bw?

So if I understand this correctly:

* Overall network bw is poorly utilized because high-capacity nodes aren't 
participating in the large slice of the network's circuits that their bw 
warrants.

* Tor needs to be careful (and clip at some sane bw of X mb/s) because 
otherwise too much traffic will concentrate on a small group of nodes, 
harming anonymity.
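
A minimal sketch of the tension between those two points, assuming nodes are
picked with probability proportional to their (possibly clipped) advertised
bw. The bandwidth figures, the clip value and the `selection_shares` helper
are all made up for illustration; this is not Tor's actual path selection code.

```python
def selection_shares(bandwidths, clip=None):
    """Return each node's chance of being picked when selection is
    weighted by advertised bandwidth, optionally clipped at `clip`."""
    weights = [bw if clip is None else min(bw, clip) for bw in bandwidths]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical advertised bandwidths (KB/s): two big nodes, three small ones.
bandwidths = [10000, 5000, 500, 300, 200]

clipped = selection_shares(bandwidths, clip=1500)   # assumed clip value
unclipped = selection_shares(bandwidths)

# The largest node holds 10000/16000 of the total capacity, but with the
# clip in place it is picked far less often than that share would suggest.
print(f"largest node, clipped:   {clipped[0]:.3f}")
print(f"largest node, unclipped: {unclipped[0]:.3f}")
```

Raising or removing the clip moves the largest node's share toward its real
capacity share, which is exactly the concentration the second point warns
about.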

Given that a Tor circuit is only as fast as its slowest node, isn't it just as
likely that, for some reasonable percentage of the time, the user's circuits
will still be problematically slow (e.g. fast guard, slow relay, fast exit)?

If that's true then increasing the clipping point alone will not improve
overall performance, so does your proposal have to be two-pronged in order
to work?

( 1 ) adopt two-hop paths as a user option, because those will mostly, but
not always, be fast if ( 2 ) high-capacity nodes get their 'fair' share of
the user's circuits once punitive clipping is removed. This is because
guards have a guaranteed high-ish bw and we've reduced our odds of getting a
slow node on the rest of the circuit because we're only choosing one more,
rather than two more, AND we've allowed fast nodes to throw their weight
around a bit more in the random selection.
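
A back-of-the-envelope version of that argument, under loud assumptions: the
guard is always fast, every other hop is chosen independently, and each such
hop is slow with the same probability `p_slow`. The value 0.3 is invented and
`p_slow_circuit` is a hypothetical helper.

```python
def p_slow_circuit(p_slow, extra_hops):
    """Chance that at least one of `extra_hops` independently chosen
    non-guard hops is slow, given each is slow with chance `p_slow`."""
    return 1 - (1 - p_slow) ** extra_hops

p = 0.3  # assumed chance that a randomly selected non-guard node is slow

two_hop = p_slow_circuit(p, 1)    # guard + one more node
three_hop = p_slow_circuit(p, 2)  # guard + two more nodes

print(f"two-hop slow:   {two_hop:.2f}")
print(f"three-hop slow: {three_hop:.2f}")
```

In this model the second prong corresponds to lowering `p` itself: if fast
nodes carry more weight in the random selection, a slow node is picked less
often on any single hop.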

Is there more to it or am I missing something?

As a side note:
Wouldn't a useful metric be nodes' CPU usage or even capacity (e.g. make and
GHz)?  Is there any research or intuition on the relationship between CPU
make/GHz and bw capacity that illustrates how much of that bw is just
notional once the node is participating in x circuits at y kb/s? Would it
be worth including this in descriptors?