Re: [tor-bugs] #25687 [Core Tor/Tor]: over-report of observed / self-measure bandwidth on fast hardware -- important to torflow / peerflow
#25687: over-report of observed / self-measure bandwidth on fast hardware --
important to torflow / peerflow
-------------------------------------------------+-------------------------
Reporter: starlight | Owner: (none)
Type: defect | Status: new
Priority: Medium | Milestone: Tor:
| unspecified
Component: Core Tor/Tor | Version: Tor:
| 0.2.6.10
Severity: Normal | Resolution:
Keywords: tor-bwauth, needs-research, needs- | Actual Points:
proposal? |
Parent ID: | Points:
Reviewer: | Sponsor:
-------------------------------------------------+-------------------------
Comment (by starlight):
Replying to [comment:14 teor]:
> > Averaging all relay measurements to a single value appears too simple
> > in the face of reality. My suggestion was and is to apply some variation
> > of a moving average in such a way that relay biases are determined
> > relative to relays of similar capacity rather than to all relays. Note
> > this is not at all the same as the "slice" scheme used by Torflow to group
> > relays for measurement and I agree eliminating it was a good idea.
> >
> > https://en.wikipedia.org/wiki/Moving_average
>
> The ticket for implementing moving averages is #27789.
BTW, what I mean here is a moving average where the X-axis is not a
time series of measurements for a single relay but, for a single
point-in-time consensus calculation, the ordered set of "observed
bandwidth" measurements. I believe this could improve the
behavior of consensus voting dramatically. The idea is that superfast
relays should have their self-measure adjusted relative to the performance
of superfast relays, that medium-speed relays should be compared with
other medium-speed relays, and that low-capacity relays should likewise
be compared with other similarly slow relays. The width of the averaging
window could vary depending on the X value of observed bandwidth, with
the behavior easily parameterized in any number of ways. Please note my
emphasis on "observed bandwidth" specifically, rather than the effective
self-advertised bandwidth that might be limited by the "maximum
advertised" bandwidth. Certainly the max advertised value should be used
in calculating the vote, but for comparing similar relays actual capacity
seems more appropriate. But this could also vary depending on a control
parameter, and it could be tried both ways.
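A minimal sketch of the rank-ordered moving average described above, in Python. It is illustrative only: the `Relay` record, the `measured_bw` field, the fixed `half_width` (counted in neighbours rather than in bandwidth units), and the choice to express the result as a ratio to the local mean are all assumptions, not anything taken from Torflow or from an existing implementation.

```python
from typing import Dict, List, NamedTuple


class Relay(NamedTuple):
    name: str
    observed_bw: float  # relay's self-measured "observed bandwidth"
    measured_bw: float  # bandwidth seen by a scanner (hypothetical input)


def windowed_ratios(relays: List[Relay], half_width: int) -> Dict[str, float]:
    """Compare each relay with its neighbours in observed-bandwidth order.

    The X-axis is the relay's position in the list sorted by observed_bw,
    not time. Each relay's measured_bw is divided by the mean measured_bw
    of the relays within half_width positions of it.
    """
    ordered = sorted(relays, key=lambda r: r.observed_bw)
    ratios: Dict[str, float] = {}
    for i, relay in enumerate(ordered):
        lo = max(0, i - half_width)
        hi = min(len(ordered), i + half_width + 1)
        window = ordered[lo:hi]
        local_mean = sum(r.measured_bw for r in window) / len(window)
        # Guard against an all-zero window; treat it as "no adjustment".
        ratios[relay.name] = relay.measured_bw / local_mean if local_mean > 0 else 1.0
    return ratios


relays = [Relay("slow", 1.0, 2.0), Relay("mid", 2.0, 4.0), Relay("fast", 3.0, 6.0)]
local = windowed_ratios(relays, half_width=1)
```

A window wide enough to cover every relay reduces this to a comparison against the single network-wide mean. A window whose width depends on the observed bandwidth itself, as suggested above, would replace the fixed `half_width` with a function of `relay.observed_bw`.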
Of course the above will bring the high end of the measurement offset
ratio down from 8.0 to somewhere in the neighborhood of 1.0, but the
polynomial control bias curve can be used to restore the emphasis on fast
relays over slower relays, though perhaps at less than the current extreme.
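As a rough illustration of how a polynomial bias curve could restore emphasis on faster relays, the sketch below multiplies a base weight by a polynomial evaluated at a normalized position x. The function names, the coefficients, and the assumption that x lies in [0, 1] (for example a relay's rank by observed bandwidth divided by the number of relays) are hypothetical and chosen only for the example.

```python
from typing import Sequence


def bias(x: float, coeffs: Sequence[float]) -> float:
    """Evaluate c0 + c1*x + c2*x**2 + ... using Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def biased_weight(base_weight: float, x: float, coeffs: Sequence[float]) -> float:
    """Scale a relay's base weight by the bias curve at position x."""
    return base_weight * bias(x, coeffs)


# Hypothetical curve 1 + 2*x**2: slow relays (x near 0) are left almost
# untouched, while the fastest relays (x near 1) get up to three times the weight.
coeffs = [1.0, 0.0, 2.0]
mid_relay = biased_weight(1000.0, 0.5, coeffs)
top_relay = biased_weight(1000.0, 1.0, coeffs)
```

How steep the curve should be is exactly the kind of thing that would have to be settled by experiment; the coefficients here carry no recommendation.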
Also I believe it is important to look at individual relays when
evaluating results as well as graphical comparisons of all relays. A few
gnarly spreadsheets similar in concept to the ones I have posted will
help illuminate both the micro and macro outcomes of different approaches.
I am an engineer rather than a mathematician, and everything I've
suggested falls into the category of "empirical trial and error". The
problem of balancing the Tor network is so complex I believe it defies
deterministic analysis in the way that weather does. The crux of it is
taking an idea that does work (Torflow section 2 logic) and refining it via
cautious experiment while keeping it reasonably simple. Nothing I've
suggested requires more than the sort of moderately complex math learned in
advanced high-school course work. I was taught the basics of polynomial
expressions in middle school (by an unusually gifted instructor whom I
vividly recall, no doubt). Balancing the consensus is a dynamic feedback-
loop control system of the sort where prediction modeling with any
semblance of precision is impossible.
Instead of writing proposals, why not spend two or three weeks just coding
it all up and then try it out in various mix/match combinations of the
three general suggestions along with other research currently under way?
That is similar to how most Internet protocols came into existence. More often
than not RFCs were written after the code was working and was already
deployed in limited production.
Finally, I should reiterate that everything suggested can be "turned off"
simply by applying parameter values. The polynomial bias curve can be
disabled with 1 * x^0 (i.e. the line y = 1), the moving average
disabled by setting the window width to infinity, and classful grouping of
relays disabled in a fashion similar to Torflow with an on/off toggle. In
particular the bias curve and moving averages can be tested gently in
increments and hasty retreats made when outcomes arrive in a manner not
desired.
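One way to picture the "turned off by parameter values" property is a parameter record whose defaults disable all three mechanisms. This is a sketch only: the `BalanceParams` class and its field names are hypothetical and do not correspond to any existing Tor or Torflow configuration.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class BalanceParams:
    # Coefficients c0, c1, c2, ... of the bias polynomial; (1.0,) is 1 * x^0.
    bias_coeffs: Tuple[float, ...] = (1.0,)
    # Width of the moving-average window; infinity means "average over all relays".
    window_width: float = math.inf
    # On/off toggle for classful grouping of relays.
    classful_grouping: bool = False

    def is_baseline(self) -> bool:
        """True when every mechanism is effectively disabled."""
        flat_curve = self.bias_coeffs[:1] == (1.0,) and all(
            c == 0 for c in self.bias_coeffs[1:]
        )
        return flat_curve and math.isinf(self.window_width) and not self.classful_grouping


baseline = BalanceParams()
gentle_trial = BalanceParams(bias_coeffs=(1.0, 0.1), window_width=500)
```

Starting from `baseline` and moving one field at a time in small steps matches the incremental testing described above, and reverting is a matter of restoring the defaults.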
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/25687#comment:18>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online