I figure that this probably should be reviewed on or-dev, since it is a relatively small patch to smartlist_choose_by_bandwidth, but with potentially large implications for load balancing on Tor (which I believe is broken). Since balancing is key to usability and thus anonymity, I believe this patch should go into both 0.1.2.x and 0.2.0. Presently, Tor attempts to balance exit node usage thusly (from path-spec.txt): For non-exit positions on "fast" circuits, we pick routers as above, but we weight the clipped advertised bandwidth of Exit-flagged nodes depending on the fraction of bandwidth available from non-Exit nodes. Call the total clipped advertised bandwidth for Exit nodes under consideration E, and the total clipped advertised bandwidth for non-Exit nodes under consideration N. If E<N/2, we do not consider Exit-flagged nodes. Otherwise, we weight their bandwidth with the factor (E-N/2)/(N+E-N/2) == (2E - N)/(2E + N). This ensures that bandwidth is evenly distributed over nodes in 3-hop paths. It is possible this formula is vaguely/empirically correct, but it is so obtuse I think it should be replaced with the more straightforward calculation with T being the Total network bandwidth (exit+nonexit): - if E < T/3 we do not consider exit-flagged nodes (and never choose them for entry and middle positions) - else (if E > T/3) we take the leftover E-T/3, divide it by E and get exit_weight=(E-T/3)/E. This will be used to weight the bandwidth of all exit-flagged nodes. It is easy to demonstrate that the sum of the weighted total available exit bandwidth is E-T/3 using this formula, which is the "leftover" bandwidth after exits are used in one third of all node choices (assuming the network is otherwise balanced). The original formula may also provably correct, but I'm not sure how to go about it. It seems wrong to me though. This formula is also much more readily adaptable to 2 and 4 hop paths than the above formula. Additionally, the patch does the following two things: 1. Expand MAX_BELIEVABLE_BANDWIDTH from 1.5MBytes/sec to 10Mbytes/sec. We already have several Tor nodes with near 5Mbytes/sec. Capping them at 1.5MBytes/sec artifically dumps load on slower nodes. There are effective ways to detect/prevent liars. This #define is not one of them. If any nodes exceed the limit, the Tor client will print out a notice (mostly just to tell us we need to raise it again). 2. Explicitly ensure we don't return exits if one wasn't requested or the exit_weight=0. It may be a wasted check though, since exit_weight should prevent the tmp variable from exceeding the selected bw for exit nodes. -- Mike Perry Mad Computer Scientist fscked.org evil labs
Index: src/or/routerlist.c =================================================================== --- src/or/routerlist.c (revision 10850) +++ src/or/routerlist.c (working copy) @@ -1145,7 +1145,7 @@ /** Do not weight any declared bandwidth more than this much when picking * routers by bandwidth. */ -#define MAX_BELIEVABLE_BANDWIDTH 1500000 /* 1.5 MB/sec */ +#define MAX_BELIEVABLE_BANDWIDTH 10000000 /* 10 MB/sec */ /** Helper function: * choose a random element of smartlist <b>sl</b>, weighted by @@ -1155,7 +1155,8 @@ * routerinfo_t's. Otherwise it's a list of routerstatus_t's. * * If <b>for_exit</b>, we're picking an exit node: consider all nodes' - * bandwidth equally regardless of their Exit status. If not <b>for_exit</b>, + * bandwidth equally regardless of their Exit status, since there may be + * some in the list because they exit to obscure ports. If not <b>for_exit</b>, * we're picking a non-exit node: weight exit-node's bandwidth less * depending on the smallness of the fraction of Exit-to-total bandwidth. */ @@ -1164,7 +1165,7 @@ { int i; routerinfo_t *router; - routerstatus_t *status; + routerstatus_t *status=NULL; int32_t *bandwidths; int is_exit; uint64_t total_nonexit_bw = 0, total_exit_bw = 0, total_bw = 0; @@ -1202,8 +1203,22 @@ this_bw = router_get_advertised_bandwidth(router); } /* if they claim something huge, don't believe it */ - if (this_bw > MAX_BELIEVABLE_BANDWIDTH) + if (this_bw > MAX_BELIEVABLE_BANDWIDTH) { + char fp[HEX_DIGEST_LEN+1]; + if (status) { + base16_encode(fp, sizeof(fp), + status->identity_digest, DIGEST_LEN); + } else if (router) { + base16_encode(fp, sizeof(fp), + router->cache_info.identity_digest, DIGEST_LEN); + } + log_notice(LD_DIR, + "Bandwidth %d for router %s (%s) exceeds allowed max %d, capping", + this_bw, router ? router->nickname : "(null)", + status || router ? fp : "0", + MAX_BELIEVABLE_BANDWIDTH); this_bw = MAX_BELIEVABLE_BANDWIDTH; + } if (is_known) { bandwidths[i] = (int32_t) this_bw; // safe since MAX_BELIEVABLE<INT32_MAX if (is_exit) @@ -1250,20 +1265,22 @@ /* Figure out how to weight exits. */ if (for_exit) { - /* If we're choosing an exit node, exit bandwidth counts fully. */ + /* If we're choosing an exit node, presumably all nodes in the list + * can access our port. Count total */ exit_weight = 1.0; total_bw = total_exit_bw + total_nonexit_bw; - } else if (total_exit_bw < total_nonexit_bw / 2) { + } else if (total_exit_bw < (total_nonexit_bw+total_exit_bw) / 3) { /* If we're choosing a relay and exits are greatly outnumbered, ignore * them. */ exit_weight = 0.0; total_bw = total_nonexit_bw; } else { - /* If we're choosing a relay and exits aren't outnumbered use the formula - * from path-spec. */ - uint64_t leftover = (total_exit_bw - total_nonexit_bw / 2); - exit_weight = U64_TO_DBL(leftover) / - U64_TO_DBL(leftover + total_nonexit_bw); + /* If the total exit bandwidth is greater than ONE THIRD of the + * TOTAL network bandwidth then the leftover = E-T/3. The weight ratio + * is then (E-T/3)/E. */ + uint64_t leftover = total_exit_bw-(total_exit_bw+total_nonexit_bw)/3; + exit_weight = U64_TO_DBL(leftover)/U64_TO_DBL(total_exit_bw); + total_bw = total_nonexit_bw + DBL_TO_U64(exit_weight * U64_TO_DBL(total_exit_bw)); } @@ -1292,6 +1309,11 @@ tmp += ((uint64_t)(bandwidths[i] * exit_weight)); else tmp += bandwidths[i]; + + /* Don't return exits if not requested and they are scarce */ + if (is_exit && !(for_exit || exit_weight > 0.0)) + continue; + if (tmp >= rand_bw) break; } Index: doc/spec/path-spec.txt =================================================================== --- doc/spec/path-spec.txt (revision 10850) +++ doc/spec/path-spec.txt (working copy) @@ -182,17 +182,16 @@ proportional to its advertised bandwidth [the smaller of the 'rate' and 'observed' arguments to the "bandwidth" element in its descriptor]. If a router's advertised bandwidth is greater than MAX_BELIEVABLE_BANDWIDTH - (1.5 MB/s), we clip to that value. + (10 MB/s), we clip to that value. For non-exit positions on "fast" circuits, we pick routers as above, but we weight the clipped advertised bandwidth of Exit-flagged nodes depending on the fraction of bandwidth available from non-Exit nodes. Call the total clipped advertised bandwidth for Exit nodes under consideration E, - and the total clipped advertised bandwidth for non-Exit nodes under - consideration N. If E<N/2, we do not consider Exit-flagged nodes. - Otherwise, we weight their bandwidth with the factor (E-N/2)/(N+E-N/2) == - (2E - N)/(2E + N). This ensures that bandwidth is evenly distributed over - nodes in 3-hop paths. + and the total clipped advertised bandwidth for all nodes under + consideration T. If E<T/3, we do not consider Exit-flagged nodes. + Otherwise, we weight their bandwidth with the factor (E-T/3)/E. This + ensures that bandwidth is evenly distributed over nodes in 3-hop paths. Additionally, we may be building circuits with one or more requests in mind. Each kind of request puts certain constraints on paths:
Attachment:
pgpZn1SbedIgV.pgp
Description: PGP signature