[Author Prev][Author Next][Thread Prev][Thread Next][Author Index][Thread Index]

Re: revised Proposal 161: Computing bandwidth adjustments



Thus spake Nick Mathewson (nickm@xxxxxxxxxxxxxx):

> I've pushed Mike's revisions to proposal 161 based on discussion
> here.  I have a few questions on the revised version:
> 
> >   Two hop circuits are built using nodes from the same slice, and a large
> >   file is downloaded via these circuits. For nodes in the first 15% of the
> >   network, a 500K file will be used. For nodes in the next 15%, a 250K file
> >   will be used. For nodes in next 15%, a 100K file will be used. The 
> >   remainder of the nodes will fetch a 75K file.[1]
> 
> The "first" 15%?   Do you mean the "fastest"?  And what does the [1]
> refer to?

Yes, fastest.

I pushed a new version to a "bandwidth-proposals" branch on my public
repo last night (I think :) that added this footnote, which says that
I will perform measurements to determine these cutoffs properly (which
I'm doing now), and that gives some detail on parallelizing scans so
they can finish in a reasonable time.

> >   This process is repeated 250 times, and average stream capacities are 
> >   assigned to each node from these results. 
> >   
> >   In the future, a node generator type can be created to ensure that
> >   each node is chosen to participate in an equal number of circuits,
> >   and the selection will continue until every live node is chosen
> >   to participate in at least 7 circuits.
> 
> By the way, what does the algorithm do with non-running nodes now?
> Does it retry them as it goes?  Does a circuit that failed to build
> mean anything for the node's capacity?  Does it count to the number of
> circuits built?

Non-running nodes recieve a ratio of None right now. I think we'll
just not report one and leave the authorities to abstain from voting
on a measured bandwidth for them.

Circuit failures do not affect ratios. Built circuits are not counted
as a progress metric as that is hard to enforce and still make
progress. Some nodes fail circuits endlessly.
 
> Also, we'll want to think more about this "until every live node is chosen
> to participate in at least 7 circuits" thing before we do it; if 7
> random circuits per live node are necessary and sufficient to measure
> the nodes' capacities, then we can get away with less (or will have to
> handle more) circuits than we're currently planning.  [For example, if
> all 50 nodes are live, we'll need about 354 circuits on average.  If
> only 30 are live, we'll expect to be done after 201 circuits.]

I picked 7 because we see around 10-20% circuit failure rate, and I
tended to notice node bandwidth didn't change after 3 or so
measurements. Not yet scientific.

7 each should require 7*(50-number_of_exits circuits), which is
something like 7*35 for most slices.

I also wrote the TorFlow NodeGenerator to do this last night as well,
but I still need to test it.

> >   This will produce a new bandwidth value that will be output into a 
> >   file consisting of lines of the form:
> > 
> >   node_id=<idhex> SP bw=<Bw_new> NL
> >   
> >   This file can be either copied or rsynced into a directory readable
> >   by the directory authority.
> 
> When documenting and building this format, remember to note that
> programs parsing it MUST ignore unrecognized line formats and
> unrecognized fields on the lines they do recognize.

Ok.
 
> Otherwise, this looks fine to me.

-- 
Mike Perry
Mad Computer Scientist
fscked.org evil labs

Attachment: pgp55Pvsr68R8.pgp
Description: PGP signature