
Re: [tor-bugs] #25687 [Core Tor/Tor]: over-report of observed / self-measure bandwidth on fast hardware -- important to torflow / peerflow



#25687: over-report of observed / self-measure bandwidth on fast hardware --
important to torflow / peerflow
-------------------------------------------------+-------------------------
 Reporter:  starlight                            |          Owner:  (none)
     Type:  defect                               |         Status:  new
 Priority:  Medium                               |      Milestone:  Tor:
                                                 |  unspecified
Component:  Core Tor/Tor                         |        Version:  Tor:
                                                 |  0.2.6.10
 Severity:  Normal                               |     Resolution:
 Keywords:  tor-bwauth, needs-research, needs-   |  Actual Points:
  proposal?                                      |
Parent ID:                                       |         Points:
 Reviewer:                                       |        Sponsor:
-------------------------------------------------+-------------------------

Comment (by starlight):

 Replying to [comment:14 teor]:
 > > Averaging all relay measurements to a single value appears too simple
 in the face of reality.  My suggestion was and is to apply some variation
 of a moving average in such a way that relay biases are determined
 relative to relays of similar capacity rather than to all relays.  Note
 this is not at all the same as the "slice" scheme used by Torflow to group
 relays for measurement and I agree eliminating it was a good idea.
 > >
 > > https://en.wikipedia.org/wiki/Moving_average
 >
 > The ticket for implementing moving averages is #27789.

 BTW what I mean here is a moving average where the X-axis is not a
 time series of measurements for a single relay, but where the X-axis is,
 for a single point-in-time consensus calculation, the ordered set of
 "observed bandwidth" measurements.  I believe this could improve the
 behavior of consensus voting dramatically.  The idea is that superfast
 relays should have their self-measure adjusted relative to the performance
 of superfast relays, that medium speed relays be compared with other
 medium speed relays, and low-capacity relays likewise are compared with
 other similarly slow relays.  The width of the averaging window could vary
 depending on the X value of observed bandwidth, with the behavior easily
 parameterized any number of ways. Please note my emphasis on "observed
 bandwidth" specifically, rather than the effective self advertised
 bandwidth that might be limited by "maximum advertised" bandwidth.
 Certainly the max advertised value should be used in calculating the vote,
 but for comparing similar relays actual capacity seems more appropriate.
 But, this could also vary depending on a control parameter and it could be
 tried both ways.
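
 A minimal sketch of the idea, under assumptions of my own for
 illustration: each relay is a hypothetical (observed_bw, measured_bw)
 pair, and the window half-width is a fraction (rel_width) of the relay's
 own observed bandwidth, so the window widens as capacity grows.  Neither
 the data layout nor the names come from Torflow or any existing scanner.

```python
from bisect import bisect_left, bisect_right
from math import isinf


def windowed_ratios(relays, rel_width=0.5):
    """Ratio of each relay's measurement to the mean of its capacity peers.

    relays    -- list of (observed_bw, measured_bw) pairs (hypothetical layout)
    rel_width -- window half-width as a fraction of observed_bw, so the
                 window widens with capacity; float("inf") compares every
                 relay against the whole network.
    """
    # Order relays along the X-axis: observed bandwidth at one point in time.
    order = sorted(range(len(relays)), key=lambda i: relays[i][0])
    obs_sorted = [relays[i][0] for i in order]
    # Prefix sums of measured_bw in sorted order give each window sum in O(1).
    prefix = [0.0]
    for i in order:
        prefix.append(prefix[-1] + relays[i][1])

    ratios = []
    for obs, meas in relays:
        if isinf(rel_width):
            lo, hi = 0, len(relays)
        else:
            lo = bisect_left(obs_sorted, obs * (1.0 - rel_width))
            hi = bisect_right(obs_sorted, obs * (1.0 + rel_width))
        # The window always contains the relay itself, so hi - lo >= 1.
        peer_mean = (prefix[hi] - prefix[lo]) / (hi - lo)
        ratios.append(meas / peer_mean if peer_mean > 0 else 0.0)
    return ratios
```

 With two slow relays (100, 50) and (110, 100) and two fast relays
 (10000, 4000) and (11000, 8000), rel_width=0.5 gives each pair ratios of
 about 0.67 and 1.33 against its own peers, whereas an infinite window
 gives the slowest relay about 0.016 and the fastest about 2.63.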

 Of course the above will bring the high end of the measurement offset
 ratio down from 8.0 to somewhere in the neighborhood of 1.0, but the
 polynomial control bias curve can be used to restore the emphasis on fast
 relays over slower relays though perhaps at less than the current extreme.

 Also I believe it is important to look at individual relays when
 evaluating results as well as graphical comparisons of all relays.  A few
 gnarly spread-sheets similar in concept to the ones I have posted will
 help illuminate both the micro and macro outcomes of different approaches.

 I am an engineer rather than a mathematician, and everything I've
 suggested falls into the category of "empirical trial-and-error".  The
 problem of balancing the Tor network is so complex I believe it defies
 deterministic analysis in the way that weather does.  The crux of it is
 taking an idea that does work (Torflow section 2 logic) and refining it via
 cautious experiment while keeping it reasonably simple. Nothing I've
 suggested requires more than the sort of medium complex math learned in
 advanced high-school course work.  I was taught the basics of polynomial
 expressions in middle school (by an unusually gifted instructor whom I
 vividly recall, no doubt).  Balancing the consensus is a dynamic feedback-
 loop control system of the sort where prediction modeling with any
 semblance of precision is impossible.

 Instead of writing proposals, why not spend two or three weeks just coding
 it all up and then try it out in various mix/match combinations of the
 three general suggestions along with other research currently under way?
 Similar to how most Internet protocols came into existence.  More often
 than not RFCs were written after the code was working and was already
 deployed in limited production.

 Finally, I should reiterate that everything suggested can be "turned off"
 simply by applying parameter values.  The polynomial bias curve can be
 disabled with 1 * x^0 (i.e. a line where y=1), the moving average
 disabled by setting the window width to infinity, and classful grouping of
 relays disabled in a fashion similar to Torflow with an on/off toggle.  In
 particular the bias curve and moving averages can be tested gently in
 increments and hasty retreats made when outcomes arrive in a manner not
 desired.
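
 As a rough illustration of the bias-curve "off switch", here is a sketch
 in which the curve is a list of polynomial coefficients and the default
 is the neutral 1 * x^0.  The function names, and the choice of x as some
 normalized measure of relay capacity, are assumptions for the example
 only.

```python
def bias(x, coeffs=(1.0,)):
    """Evaluate coeffs[0] + coeffs[1]*x + coeffs[2]*x**2 + ... (Horner's rule).

    The default (1.0,) is the neutral curve 1 * x^0, i.e. y = 1.
    """
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def biased_ratio(ratio, x, coeffs=(1.0,)):
    """Scale a relay's measurement ratio by the bias curve at x (hypothetical)."""
    return ratio * bias(x, coeffs)
```

 Leaving coeffs at the default returns every ratio unchanged; a small
 first-order term such as (1.0, 0.1) nudges the emphasis toward larger x
 and can be raised or withdrawn in increments.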

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/25687#comment:18>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online