Re: Exit Balancing Patch

On Thu, Jul 26, 2007 at 11:27:18PM -0700, Mike Perry wrote:
> > Well, I think we want *some* clipping -- otherwise people can advertise
> > What new number would capture most of the current nodes we're
> > seeing? 5MB/s?
> I think we should put it well above the current fastest node, which is
> blutmaggie at 5.6MB/sec. From what I can tell, this node actually does
> have that much capacity. It has 0 failures over 18 fetches averaging
> 120KB/sec. Kudos to them!
> I see no reason not to put the limit at 10MB/sec. All we really want
> to do is prevent someone from claiming their bandwidth is infinity. It
> should never actually clip a legitimate node's bandwidth if we can
> help it.

Right, I think it's important to recognize that there are
two different vulnerabilities we're worried about.

The first vulnerability is that fast nodes attract a disproportionate
amount of Tor traffic. If we actually had a node that could handle 1GBit,
and we added it to the network, then suddenly there would be a single
point where an attack would yield really good results. So we want to
avoid making the network "too" lopsided -- and one of the open research
questions here is finding the right tradeoff between how lopsided we
can afford the network to be and how much of a performance penalty we
should force on users to ensure that they get a diverse network.

The second vulnerability is that somebody who wants to attract Tor
traffic can run a small node, claim to have a whole lot of bandwidth,
and be in an improved position to attack Tor. Our current defense here
is a MAX_BELIEVABLE_BANDWIDTH cutoff (currently at 1.5MB/s), which makes
sure we avoid allocating more traffic than a certain cap to any given
node. Future proposed defenses include limiting the number of Tor servers
on a given IP address (proposal 109), and doing spot checks to see if a
given server is performing much worse than expected -- one metric would be
"is his performance much worse than other servers at the same level?"
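
(To make the cutoff concrete, here is a minimal sketch of how a client
might clip advertised bandwidth before weighting its node choices. It is
an illustration, not the code in Tor. The function names and the example
nodes are hypothetical, and it assumes "1.5MB/s" means 1500000 bytes per
second.)

```python
import random

# Assumed reading of the "1.5MB/s" cap discussed above, in bytes per second.
MAX_BELIEVABLE_BANDWIDTH = 1_500_000


def believable_bandwidth(advertised, cap=MAX_BELIEVABLE_BANDWIDTH):
    """Never believe more than `cap`, whatever the node claims."""
    return min(advertised, cap)


def selection_weights(advertised_by_node, cap=MAX_BELIEVABLE_BANDWIDTH):
    """Return each node's share of traffic after clipping its claim."""
    clipped = {name: believable_bandwidth(bw, cap)
               for name, bw in advertised_by_node.items()}
    total = sum(clipped.values())
    return {name: bw / total for name, bw in clipped.items()}


def pick_node(advertised_by_node, cap=MAX_BELIEVABLE_BANDWIDTH, rng=random):
    """Pick one node at random, weighted by clipped bandwidth."""
    names = list(advertised_by_node)
    weights = [believable_bandwidth(advertised_by_node[n], cap) for n in names]
    return rng.choices(names, weights=weights, k=1)[0]


# Hypothetical nodes: one honest, one claiming an absurd amount of bandwidth.
nodes = {"honest": 500_000, "liar": 10**12}
shares = selection_weights(nodes)
```

(With the cap in place the lying node gets three quarters of the weight
in this two-node example rather than essentially all of it. Raising the
cap widens that gap, which is why the number matters for both
vulnerabilities.)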

Now, the 1.5MB/s cap is relevant to both of these vulnerabilities.
Regarding the second issue, I agree that it is a good idea to raise the
number, and I don't think anybody will object. Regarding the first issue,
I think you're right, but I don't think we've explored it well enough yet;
we should do that at some point.

> Even though there is no automated scanning yet, I can quickly
> verify performance from random IPs that are difficult to anticipate,
> and we can raise the limit accordingly.

We should think about how we can get automated scanning into place. The
fact that we *can* do a check is great, but we need to work on the next
step which is having it automatically and constantly look for anomalies.

What is the limiting factor here? Code? High-speed network connections?
Humans to oversee it? Ways to present the results usefully?
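
(As a starting point for the code side, here is a rough sketch of the
"much worse than other servers at the same level" check. Everything in it
is an assumption made for illustration: the grouping of servers into
power-of-two bandwidth levels, the 25% threshold, the minimum group size
of three, and the input format of (name, advertised, measured) tuples.)

```python
import math
from collections import defaultdict
from statistics import median


def flag_underperformers(nodes, fraction=0.25, min_group=3):
    """Flag servers measured far below peers with similar advertised bandwidth.

    `nodes` is an iterable of (name, advertised_bw, measured_bw) tuples.
    Servers are grouped by floor(log2(advertised_bw)); within each group a
    server is flagged if its measured rate is below `fraction` times the
    group median. Groups smaller than `min_group` are skipped, since a
    median over one or two servers says little.
    """
    levels = defaultdict(list)
    for name, advertised, measured in nodes:
        level = int(math.log2(advertised)) if advertised > 0 else -1
        levels[level].append((name, measured))

    flagged = []
    for members in levels.values():
        if len(members) < min_group:
            continue
        typical = median(measured for _, measured in members)
        for name, measured in members:
            if measured < fraction * typical:
                flagged.append(name)
    return sorted(flagged)


# Hypothetical measurements, in bytes per second.
sample = [
    ("a", 1_100_000, 100_000),
    ("b", 1_200_000, 110_000),
    ("c", 1_300_000, 90_000),
    ("d", 1_500_000, 5_000),   # claims a lot, delivers very little
    ("e", 50_000, 4_000),      # alone at its level, so not judged
]
suspects = flag_underperformers(sample)
```

(A real scanner would also need repeated measurements from hard-to-guess
vantage points, as Mike describes, before treating a flag as meaningful.)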

> But we should set it high
> enough that we do not have to raise it for a good while.

This is an important enough point that I'll elaborate on it. It's not
just a matter of trying to avoid needing this discussion again in 6
months when 8MB/s servers are common. Rather, we're trying to pick a
number to give all the clients right now, so that in 6 months today's
clients won't be back to harming the load balancing again. So we should
try to pick a number that will still not be reached at the end of the
expected lifetime of today's Tor release.

(There's also a minor partitioning attack here, in the form of a
statistical leak based on whether you're weighting your node choices like
the people with the new number or like the people with the old number; but
I don't think that's a big deal compared to the other big deals we have.)

> Also, I think the exit weighting formula is wrong. If that portion of
> my patch is not to be applied, I would like some sort of justification
> as to why what is currently there is better than what I proposed (or
> is even correct), but that is the least of my concerns, since exit
> bandwidth is usually scarce.

I've asked Nick to look at it and compare it to the equation he produced
earlier. His initial response was similar -- that he wanted some kind
of justification for why this new one was better.

My plan is to sit down sometime and look at both equations and try
to figure out how they compare, but it's low on my priority list, so
hopefully before then one or the other of you will get beyond the "well,
I wrote one, and that other one is confusing to me, so why not use mine"
stage. :)

> > Also, should we raise the default rate limit from 3MB/6MB too? It's
> > been a while since we raised it, and I imagine it's clipping a few
> > servers. (Does somebody want to count how many?)
> Oh, yeah. Definitely should change this.
> How does this manifest itself? The first element in the bandwidth line
> of the descriptor is 3MB and the second is 6MB? or is it even possible
> to tell? the 3rd number is cut at the lower of the observed vs limit..
> Should I just check to see who has exactly 3MB as their reported
> capacity? Will that work? Is it exactly 3MB, or is it 3000000
> bytes?

Look for BandwidthRate (first number) of 3145728 and BandwidthBurst
(second number) of 6291456. Then look for a third number that exceeds
3145728. On briefly looking through moria2's cached-routers list, I found
threeish such servers. So I don't think this is a big deal yet either way.
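
(For anyone who wants to repeat the count, a small sketch of the check
described above. It assumes each descriptor contains a "router" line whose
second field is the nickname and a "bandwidth" line carrying the three
numbers in the order rate, burst, observed, all in bytes per second. The
function name and the sample descriptors are made up for illustration.)

```python
DEFAULT_RATE = 3 * 1024 * 1024    # 3145728, the default BandwidthRate
DEFAULT_BURST = 6 * 1024 * 1024   # 6291456, the default BandwidthBurst


def find_clipped_servers(descriptor_text):
    """Return (nickname, observed) for servers that look clipped by defaults.

    A server counts as clipped when it advertises exactly the default rate
    and burst, and its observed bandwidth exceeds the default rate.
    """
    clipped = []
    nickname = None
    for line in descriptor_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "router" and len(fields) >= 2:
            nickname = fields[1]
        elif fields[0] == "bandwidth" and len(fields) >= 4:
            try:
                rate, burst, observed = (int(f) for f in fields[1:4])
            except ValueError:
                continue  # skip malformed lines rather than guess
            if (rate == DEFAULT_RATE and burst == DEFAULT_BURST
                    and observed > DEFAULT_RATE):
                clipped.append((nickname, observed))
    return clipped


# Made-up descriptors, trimmed to the two lines this check reads.
sample_descriptors = """\
router alpha 10.0.0.1 9001 0 0
bandwidth 3145728 6291456 4000000
router beta 10.0.0.2 9001 0 0
bandwidth 3145728 6291456 200000
router gamma 10.0.0.3 9001 0 0
bandwidth 5242880 10485760 5000000
"""
clipped_servers = find_clipped_servers(sample_descriptors)
```

(To run it against a real cache, read the cached-routers file into a
string and pass it in. Note that the defaults are 3 and 6 times 1048576
bytes, not 3000000 and 6000000.)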