
Re: peculiar server "bandwidth" posted by server "mnl" and possible new type of attack

     On Tue, 9 Sep 2008 06:42:02 -0400 Roger Dingledine <arma@xxxxxxx> wrote:
>On Tue, Sep 09, 2008 at 05:15:15AM -0500, Scott Bennett wrote:
>> bandwidth 5242880 10485760 52239166
>>                            ^^^^^^^^ ---> ~48.8 MB/s (!)
>Wow. Nice! :)
>And, as you say, unlikely to be true.
>>      All of the above leads me to suspect two things.  One is that there may
>> be some bug triggered only under exceedingly odd conditions that leads to
>> reporting of data rates grossly out of line with reality.
>Yep. Probably an integer underflow somewhere.

     That would certainly be nice.  I hope it's so easy.
>>      The second is that when a server claims such disproportionately high
>> capacity, the overall performance of the tor server network can be compromised.
>There has been quite a bit of work on this one, at least.
>First, clients use the min of the bandwidthrate and the measured
>bandwidth, so in the above "5242880 10485760 52239166" example clients
>will load-balance based on 5242880. So this bug doesn't have any real
>effect, at least in this case.
>Also, clients cap the number they'll believe at 10MB, so an evil relay
>can only sucker people so much.
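     [For anyone following along, the client-side rule Roger describes
could be sketched roughly like this -- a minimal illustration, not Tor's
actual code; the function name is mine, and the values are bytes/second
as in the descriptor line quoted above:]

```python
# Sketch of the client-side weighting rule described above.
# Hypothetical function, not Tor's implementation; units are bytes/second.
MAX_BELIEVABLE_BANDWIDTH = 10 * 1024 * 1024  # clients cap claims at 10 MB/s

def effective_bandwidth(bandwidth_rate, bandwidth_burst, observed):
    """Clients load-balance on the minimum of the configured rate and the
    measured (observed) bandwidth, capped at 10 MB/s, so neither a buggy
    nor a lying relay can inflate its weight past the cap."""
    return min(bandwidth_rate, observed, MAX_BELIEVABLE_BANDWIDTH)

# The "bandwidth 5242880 10485760 52239166" descriptor line from above:
print(effective_bandwidth(5242880, 10485760, 52239166))  # -> 5242880
```

     [So the bogus 52239166 observed value is ignored in favor of the
5242880 rate, exactly as Roger says.]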

     Whew.  Thank you so much, Roger!  That's exactly the kind of thing
I wanted for reassurance.
     However...hmmm...how much disruption could, say, a dozen i486's on
10 Mb/s ethernet connections claiming to be 5 MB/s servers cause?  That
ought to be fairly cheap, well within the coffee budget of an NSA office,
for example.
>There are research designs for doing community-measured bandwidth
>numbers, which would make it much harder for an evil relay to sucker
>people. See for example http://freehaven.net/anonbib/#snader08
>Unfortunately, all of these designs have downsides currently, so they're
>not ready for deployment yet.
>>      That brings us back to something I've already posted on OR-TALK, namely,
>> the apparent slowdown in tor traffic that has reduced the traffic through my
>> tor server by at least 30% and, judging from the reduced peaks shown for a lot
>> of the high-volume servers listed on the torstatus page, the tor network at
>> large.
>We're working on plans to start gathering more methodical data about
>how the network has run and is running, with the goal of being able to
>answer questions like this more usefully.

     Sounds good.  I'll look forward to that.
>Another example I'd love to investigate is the apparent feedback effect --
>once a relay happens to have a high-volume flow through it, it tends to
>continue to advertise high bandwidth numbers for weeks after, whereas if
>it doesn't start with any high-volume flows, it never seems to attract
>them later.

     Yes, indeed.  That is something I've posted about twice recently and
gotten no responses.
     Actual changes in the physical capacity of a server and its connection
to the Internet are relatively rare events, most of which would involve a
restart of the tor server anyway.  The published actual usage (a.k.a.
observed "bandwidth" [jeez, what a misappropriated term that is]) is used
as a means of determining what that physical limit may be, which is then
used by clients everywhere.  Ideally, a sequence of such reports from a
server should constitute a successive approximation that converges on the
actual physical limitation.  However, in tor's current method of reporting
reduced values for reduced *usage* and having those reduced values then
understood to represent *capacity*, the series never converges.  So instead
of seeing a series of reported usages that increases asymptotically toward
the capacity, tor sees fictionally varying *capacities*.  The feedback you
mention would be fine if not permitted to create nonlinear responses.  This
problem could be eliminated by having tor always publish the greater of the
last published value and the highest value from the reporting period
(currently the previous 24 hours, right?).
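     [The monotone rule I'm proposing amounts to one line -- a sketch
with a hypothetical function name, assuming the current 24-hour
reporting period:]

```python
def next_published_bandwidth(last_published, period_peak):
    """Publish the greater of the last published value and the highest
    rate observed in the current reporting period, so the series of
    reports converges upward toward the true capacity instead of
    tracking fluctuating usage."""
    return max(last_published, period_peak)

# A slow week no longer drags the advertised capacity back down:
period_peaks = [3000000, 4500000, 2000000, 4000000, 5100000]
published = 0
for peak in period_peaks:
    published = next_published_bandwidth(published, peak)
print(published)  # -> 5100000
```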
     A second potential issue under the current implementation is that the
default sampling rate is 1.333.../day, so if there are shorter term, but
repetitive, variations (e.g., daily fluctuations), tor has no possibility
of detecting and recognizing them.  1.333.../day gives a Nyquist frequency
of .666.../day, so any normal oscillations in traffic volume that happen
faster than that (i.e., with a period shorter than 1.5 days = 36 hours)
cannot be detected, but *would* be aliased in the measured data, thereby
bollixing any mechanism put into tor to adapt to periodic changes.  This
situation also argues in favor of a linearization like what I proposed above.
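     [To make the aliasing concrete, here's a back-of-the-envelope
computation -- my own illustration, using the standard frequency-folding
formula -- of what a clean daily traffic cycle looks like when sampled
at 1.333.../day:]

```python
# Aliasing of a daily traffic cycle under the ~1.333/day sampling rate.
sample_rate = 24.0 / 18.0      # one report per 18 hours -> 1.333.../day
nyquist = sample_rate / 2.0    # 0.666.../day; periods under 36 h undetectable

signal_freq = 1.0              # a daily fluctuation: 1 cycle/day
# Frequency folding: the apparent frequency is the distance to the
# nearest integer multiple of the sampling rate.
alias_freq = abs(signal_freq - round(signal_freq / sample_rate) * sample_rate)
print(round(1.0 / alias_freq, 6))  # -> 3.0 (a daily cycle masquerades as a 3-day one)
```

     [So any adaptive mechanism fed these samples would "see" a slow
three-day swell where the real load actually cycles once a day.]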
     OTOH, this aspect might be handled differently.  For example, a separate,
more frequent measurement of peak data rate could be made.  In essence, the
15-minute totals reported in the old descriptors and now in the extra-info
documents could be used that way, though they would be up to 18 hours out
of date.  They aren't quite the same as 10-second peaks, so maybe those
could also be recorded in an extra-info or similar document.
     But the linearization method is much simpler and eliminates any need
to adapt to periodic changes in traffic loads.  It's also a server-side
change that clients don't need to be aware of in order to get the benefit.
The more servers reporting only monotonically increasing peak rates, which
approach the unknown-to-tor true capacity rates, the more accurate the
networkwide picture of capacity should be, and the better the overall tor
performance ought to be.
>My guess is that the current slowdown is due to a combination of fewer
>file-sharing users (which, if true, will alas probably pass) and Tor's
>not-as-optimal-as-it-should-be load balancing algorithm. In particular,

     What would have caused there to be fewer file-sharers?  Especially
now that fall semester has started, and students are back under the partial
concealment of the anti-RIAA defenses of their schools?  >:-} (I must have
missed something in the news...)

>our load balancing algorithm produces high variance. Or said more
>normally, the speed you get is wildly unpredictable, sometimes really
>good and sometimes really awful -- and we need to cut down on the
>frequency of 'really awful'.

     A part of the cause may be what we've discussed just above, because it
means the clients are basing their route selection upon servers' wildly
varying usages rather than upon their capacities.  But that's obviously only
a piece
of this problem.
>So, stay tuned, but be patient. I'm hoping that we'll have that "metrics
>and measurements" plan ramped up sometime in early 2009 -- with people

     I can hardly wait.  I've wanted that kind of thing almost since I first
started using tor.

>paid to focus on it, even. :)

     Yeah?  Very cool! :-)  Hats off to whoever pulled that one off!
>In the meantime, perhaps we can get some more intuition about this
>particular slowdown. For example, does it happen for 0.2.0.x relays and
>not 0.1.2.x relays?
     I'm afraid I can't help there because there's only a handful of servers
for which I have any recollection of their previously typical rates.  You
would probably have a better chance of getting useful responses if individual
server operators would report that kind of thing about their own servers
because in most cases they are the people most likely to be familiar with
their servers' typical loads.  Too bad we don't have historical records of
the torstatus page contents.
     Roger, thanks again very much for your lightning-quick response.

                                  Scott Bennett, Comm. ASMELG, CFIAG
* Internet:       bennett at cs.niu.edu                              *
* "A well regulated and disciplined militia, is at all times a good  *
* objection to the introduction of that bane of all free governments *
* -- a standing army."                                               *
*    -- Gov. John Hancock, New York Journal, 28 January 1790         *