
Re: Proposal 161: Computing Bandwidth Adjustments



Thus spake Nick Mathewson (nickm@xxxxxxxxxxxxxx):

> On Tue, May 12, 2009 at 09:46:05PM -0700, Mike Perry uttered:
> > Thus spake Nick Mathewson (nickm@xxxxxxxxxxxxxx):
>  [...]
> > > There are some magic numbers here: Why 3%?  How large are these files
> > > and how often are they downloaded?  Why "several hundred"?  Is the
> > > "several hundred" a function of the "45" above?
> > 
> > Currently no. I've been meaning to do some statistics on how long it
> > takes to converge. Eyeballing seems to indicate the filtered values
> > converge after about 100 fetches or so, or about 3-4 per node. I've
> > been doing runs of 250 just for good measure.
> 
> As discussed on IRC, we should see if we can get away with fetching
> smaller files to test the weaker slices. If we can, hooray for us!  If
> it doesn't work, too bad. :/
> 
> As for the slice size, it seems that maybe a quasi-constant slice size
> would work better than having it be fixed at 3% of the network.  That
> way, we always send the same amount of traffic per node, rather than
> getting sparser and sparser as the slices get bigger but our traffic
> stays the same.

Ok, it should be fairly easy to set this at slices of 50 nodes each.
It gets a bit more complicated if I need to grow the slices to
include at least N exit nodes, but we seem to have at least 10 exit
nodes in every 50-node slice anyway.
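
To make that concrete, the slicing could look something like the
following sketch (the router list, the flags attribute, and the
function name are just illustrative here, not the current TorFlow
code):

    SLICE_SIZE = 50
    MIN_EXITS = 10

    def make_slices(routers, slice_size=SLICE_SIZE, min_exits=MIN_EXITS):
        """routers: list of routers sorted by advertised bandwidth."""
        slices = []
        i = 0
        while i < len(routers):
            j = i + slice_size
            # Grow the slice until it contains enough exits (or we run out
            # of routers). In practice every 50-node slice already seems to
            # have 10+ exits, so this loop should rarely fire.
            while (j < len(routers) and
                   sum(1 for r in routers[i:j] if "Exit" in r.flags) < min_exits):
                j += 1
            slices.append(routers[i:j])
            i = j
        return slices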

Also, with respect to the 250 fetches, it should be possible to write
a new node generator type that attempts to keep the chosen-circuit
count equal for each node, and it should also be fairly
straightforward to stop the scan once every node has been chosen for
at least 7 circuits.
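
Roughly like this sketch, where the generator interface and the
circ_count bookkeeping are assumptions of mine rather than the
existing metatroller code:

    import random

    MIN_CIRCS_PER_NODE = 7

    class EqualizingNodeGenerator:
        """Prefer the nodes chosen for the fewest circuits so far, so the
        chosen-circuit count stays roughly equal across the slice."""
        def __init__(self, nodes):
            self.circ_count = dict((n, 0) for n in nodes)

        def next_pair(self):
            # Pick the two least-chosen nodes, breaking ties randomly.
            nodes = sorted(self.circ_count,
                           key=lambda n: (self.circ_count[n], random.random()))
            pair = nodes[:2]
            for n in pair:
                self.circ_count[n] += 1
            return pair

        def done(self):
            # Stop the scan once every node has been used for >= 7 circuits.
            return min(self.circ_count.values()) >= MIN_CIRCS_PER_NODE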

> > > I want to see a calculation for what fraction of the network capacity
> > > we expect to use for this testing.  I wouldn't expect it to be over a
> > > percent, but I'd like to see the math to prove it.
> > 
> > This is a function of how much we parallelize and how many scanners we
> > have. Right now, there is 0 parallelization built in, which means that
> > each scanner takes up approximately 20-50Kbytes/sec of two nodes at a
> > time. This is compared to the 120Mbytes/sec of exit capacity Tor
> > currently has.
> >  
> > > Also, how long does it take to run these tests over the network?  If
> > > it takes much longer than a consensus interval, we should think about
> > > what happens if a node's advertised capacity changes greatly, and
> > > whether we want to treat newer measurements as more accurate.
> > 
> > Each slice takes around 3-6 hours. The entire network can take as long
> > as a week.. Hence the desire to parallelize..
> 
> My intuition says that unless we scan the whole network every two days
> (or ideally less), our data will likely be useless.
> 
> What if, instead of running in a batch mode, we were constantly
> scanning and iteratively refining our guesses for ratios, discounting
> old data as newer data arrived?  We'd still want to parallelize some,
> but we wouldn't run into the problem of having some nodes' estimated
> values be far more accurate than others.

Hrmm.. So the continually-updating idea is not that hard, thanks to
our SQL-based stat store.

We can then fully parallelize and output a first pass with filtered
averages for each node. A second script can then compute a
network-wide average of those filtered stream averages, and use that
to produce ratios and, from the ratios, bandwidths.

This would then solve both our desire for quick scans and for
network-wide ratios.
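
The second pass would then boil down to roughly the following (the
data structures here are made up for illustration; the real
bookkeeping lives in the SQL stat store):

    def compute_bandwidths(filtered_avg, advertised_bw):
        """filtered_avg: node -> filtered average stream bandwidth (pass one).
           advertised_bw: node -> bandwidth the node currently advertises.
           Returns node -> suggested new bandwidth value."""
        # Network-wide average of the per-node filtered stream averages.
        net_avg = float(sum(filtered_avg.values())) / len(filtered_avg)

        new_bw = {}
        for node, avg in filtered_avg.items():
            ratio = avg / net_avg   # >1 means the node outperforms its peers
            new_bw[node] = ratio * advertised_bw[node]
        return new_bw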

But note this won't be cheap memory-wise. We'd need a new tor
client instance, a new speedracer instance, and a new metatroller
instance for each slice. That's at least 12MB + 5MB + 50MB for each
slice, which quickly adds up to around 2GB for doing the entire
network in parallel. I can see about reducing the metatroller
overhead, but with all the statistics we need to load in and out of
the SQL tables, it's probably not going to be easy.
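
(For the record, that's (12 + 5 + 50)MB = 67MB per slice, and at
roughly 30 slices in parallel it comes out to right around 2GB.)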

It will also require about 600Kbytes/sec, or almost 5Mbit/sec up+down
per authority (based on an average stream capacity of 20Kbytes/sec
and 30 slices).
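
(That is, 20Kbytes/sec per slice times 30 slices is 600Kbytes/sec,
which works out to about 4.8Mbit/sec, or roughly 0.5% of the
120Mbytes/sec of exit capacity mentioned above.)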

We can try to reach some middle ground where each parallel scanner
scans roughly 15% of the network, as opposed to 3%, but the way the
restrictions get set up, we can only scan a particular 3% slice at a
time. By the time we wrap around through the 15%, we would probably
just want to discard all the prior scan results and start fresh.

> > > (Crazy idea 1: If the values change too fast or slow, we could
> > > exponentiate the ratios out of the algorithm before using them.
> > > Slightly less crazy idea 2: if the values change too fast, we could do
> > > a standard time-weighted average, where we compute the new declared
> > > bandwidth BW' as a weighted average of the old declared bandwidth and
> > > the new measured bandwidth.)
> > 
> > Yeah, I like idea 2, basically what you mentioned in 160. This would
> > be done Tor-end, not scanner end, right?
> 
> I don't have a strong feeling about that.  To me, it seems easier to
> do it at the scanner end, or as a processing step between the scanner
> end and the authorities... but that's just because I'm thinking of the
> scanner stuff as easier to change if we think of a better algorithm.

Ok.
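
For reference, the scanner-side version of idea 2 could be as simple
as something like the following, with the weighting factor alpha
being a knob we would still have to pick:

    def smooth_bandwidth(old_bw, measured_bw, alpha=0.5):
        """Time-weighted average of the previously declared bandwidth and
        the newly measured one, so values can't swing too fast between
        scans. alpha near 1.0 trusts the new measurement more."""
        return alpha * measured_bw + (1 - alpha) * old_bw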

-- 
Mike Perry
Mad Computer Scientist
fscked.org evil labs
