Re: peculiar server "bandwidth" posted by server "mnl" and possible new type of attack

On Tue, Sep 09, 2008 at 05:15:15AM -0500, Scott Bennett wrote:
> bandwidth 5242880 10485760 52239166
>                            ^^^^^^^^ ---> ~48.8 MB/s (!)

Wow. Nice! :)

And, as you say, unlikely to be true.

>      All of the above leads me to suspect two things.  One is that there may
> be some bug triggered only under exceedingly odd conditions that leads to
> reporting of data rates grossly out of line with reality.

Yep. Probably an integer underflow somewhere.

>      The second is that when a server claims such disproportionately high
> capacity, the overall performance of the tor server network can be compromised.

There has been quite a bit of work on this one, at least.

First, clients use the min of the bandwidthrate and the measured
bandwidth, so in the above "5242880 10485760 52239166" example clients
will load-balance based on 5242880. So this bug doesn't have any real
effect, at least in this case.

Also, clients cap the number they'll believe at 10 MB/s, so an evil relay
can only sucker people so much.
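
To make those two rules concrete, here is a rough sketch in Python (not
Tor's actual C code). It assumes the three numbers on the descriptor's
"bandwidth" line are rate, burst, and observed bandwidth in bytes per
second, and it reads the 10 MB cap as 10,000,000 bytes/s. The function
and constant names are made up for illustration.

```python
# Illustrative sketch only: names are hypothetical, not Tor's identifiers.
# Assumption: the cap of "10 MB" is taken to mean 10,000,000 bytes/s.
MAX_BELIEVABLE_BANDWIDTH = 10_000_000


def parse_bandwidth_line(line):
    """Split a descriptor 'bandwidth' line into (rate, burst, observed)."""
    keyword, rate, burst, observed = line.split()
    if keyword != "bandwidth":
        raise ValueError("not a bandwidth line: %r" % line)
    return int(rate), int(burst), int(observed)


def believed_bandwidth(rate, burst, observed, cap=MAX_BELIEVABLE_BANDWIDTH):
    """Return the bytes/s figure a client would load-balance on.

    Rule 1: take the min of the configured rate and the observed bandwidth.
    Rule 2: never believe more than the cap.
    The burst value is parsed but not used by either rule.
    """
    advertised = min(rate, observed)
    return min(advertised, cap)


if __name__ == "__main__":
    rate, burst, observed = parse_bandwidth_line(
        "bandwidth 5242880 10485760 52239166")
    print(believed_bandwidth(rate, burst, observed))
```

With the line from the descriptor above, the bogus observed value never
matters because the rate is smaller. A relay that inflated all three
numbers would still be held to the cap.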

There are research designs for doing community-measured bandwidth
numbers, which would make it much harder for an evil relay to sucker
people. See for example http://freehaven.net/anonbib/#snader08
Unfortunately, all of these designs have downsides currently, so they're
not ready for deployment yet.

>      That brings us back to something I've already posted on OR-TALK, namely,
> the apparent slowdown in tor traffic that has reduced the traffic through my
> tor server by at least 30% and, judging from the reduced peaks shown for a lot
> of the high-volume servers listed on the torstatus page, the tor network at
> large.

We're working on plans to start gathering more methodical data about
how the network has run and is running, with the goal of being able to
answer questions like this more usefully.

Another example I'd love to investigate is the apparent feedback effect --
once a relay happens to have a high-volume flow through it, it tends to
continue to advertise high bandwidth numbers for weeks after, whereas if
it doesn't start with any high-volume flows, it never seems to attract
them later.

My guess is that the current slowdown is due to a combination of fewer
file-sharing users (which, if true, will alas probably pass) and Tor's
not-as-optimal-as-it-should-be load balancing algorithm. In particular,
our load balancing algorithm produces high variance. Or, put more
plainly, the speed you get is wildly unpredictable, sometimes really
good and sometimes really awful -- and we need to cut down on the
frequency of 'really awful'.

So, stay tuned, but be patient. I'm hoping that we'll have that "metrics
and measurements" plan ramped up sometime in early 2009 -- with people
paid to focus on it, even. :)

In the meantime, perhaps we can get some more intuition about this
particular slowdown. For example, does it happen for 0.2.0.x relays and
not 0.1.2.x relays?