Thus spake Nick Mathewson (nickm@xxxxxxxxxxxxxx): > I've pushed Mike's revisions to proposal 161 based on discussion > here. I have a few questions on the revised version: > > > Two hop circuits are built using nodes from the same slice, and a large > > file is downloaded via these circuits. For nodes in the first 15% of the > > network, a 500K file will be used. For nodes in the next 15%, a 250K file > > will be used. For nodes in next 15%, a 100K file will be used. The > > remainder of the nodes will fetch a 75K file.[1] > > The "first" 15%? Do you mean the "fastest"? And what does the [1] > refer to? Yes, fastest. I pushed a new version to a "bandwidth-proposals" branch on my public repo last night (I think :) that added this footnote, which says that I will perform measurements to determine these cutoffs properly (which I'm doing now), and that gives some detail on parallelizing scans so they can finish in a reasonable time. > > This process is repeated 250 times, and average stream capacities are > > assigned to each node from these results. > > > > In the future, a node generator type can be created to ensure that > > each node is chosen to participate in an equal number of circuits, > > and the selection will continue until every live node is chosen > > to participate in at least 7 circuits. > > By the way, what does the algorithm do with non-running nodes now? > Does it retry them as it goes? Does a circuit that failed to build > mean anything for the node's capacity? Does it count to the number of > circuits built? Non-running nodes recieve a ratio of None right now. I think we'll just not report one and leave the authorities to abstain from voting on a measured bandwidth for them. Circuit failures do not affect ratios. Built circuits are not counted as a progress metric as that is hard to enforce and still make progress. Some nodes fail circuits endlessly. > Also, we'll want to think more about this "until every live node is chosen > to participate in at least 7 circuits" thing before we do it; if 7 > random circuits per live node are necessary and sufficient to measure > the nodes' capacities, then we can get away with less (or will have to > handle more) circuits than we're currently planning. [For example, if > all 50 nodes are live, we'll need about 354 circuits on average. If > only 30 are live, we'll expect to be done after 201 circuits.] I picked 7 because we see around 10-20% circuit failure rate, and I tended to notice node bandwidth didn't change after 3 or so measurements. Not yet scientific. 7 each should require 7*(50-number_of_exits circuits), which is something like 7*35 for most slices. I also wrote the TorFlow NodeGenerator to do this last night as well, but I still need to test it. > > This will produce a new bandwidth value that will be output into a > > file consisting of lines of the form: > > > > node_id=<idhex> SP bw=<Bw_new> NL > > > > This file can be either copied or rsynced into a directory readable > > by the directory authority. > > When documenting and building this format, remember to note that > programs parsing it MUST ignore unrecognized line formats and > unrecognized fields on the lines they do recognize. Ok. > Otherwise, this looks fine to me. -- Mike Perry Mad Computer Scientist fscked.org evil labs
Attachment:
pgp55Pvsr68R8.pgp
Description: PGP signature