[Author Prev][Author Next][Thread Prev][Thread Next][Author Index][Thread Index]
Re: [tor-bugs] #28868 [Core Tor/sbws]: best_priority() can starve the worker threads of good relays
#28868: best_priority() can starve the worker threads of good relays
---------------------------+-----------------------------------
Reporter: teor | Owner: juga
Type: defect | Status: needs_review
Priority: Medium | Milestone: sbws: 1.0.x-final
Component: Core Tor/sbws | Version: sbws: 1.0.2
Severity: Normal | Resolution:
Keywords: | Actual Points:
Parent ID: #28663 | Points:
Reviewer: | Sponsor:
---------------------------+-----------------------------------
Comment (by teor):
Replying to [comment:3 juga]:
> Replying to [ticket:28868 teor]:
> > `best_priority()` tries to measure unmeasured and failing relays
first.
> >
> > But if `fraction_relays` or `min_relays` always fail, those relays
will always end up first in the priority queue. (More precisely, those
relays will end up first in the priority queue, until the results of the
good relays ~~time out~~ are discarded for being too old.)
> >
> > Thinking about starvation is complicated, because of the
`freshness_reduction_factor` on some errors.
> >
> > Here's a very simple algorithm that avoids starving good relays for
failed relays:
> > 1. Count the number of times that sbws has attempted to get a result
from each relay.
>
> This is already done when writing the results: ResultError and
ResultSuccess keep that.
But some failures do not write a ResultError.
> > 2. Test the relays with the lowest number of attempts first. (Don't
check if the attempt succeeded or failed.)
>
> This's what i was proposing by commenting the part where it prioritizes
ResultError measurements.
I don't understand what you mean here.
Can you link to the comment, or quote it?
> >
> > For this priority rule to work, every time a relay is queued, it must
get a result. Here's how we can make that happen"
> > * Modify `result_putter_error()` to store an error result to the
queue.
>
> result_putter already writes ResultError.
But result_putter_error() is called when there is an exception in
apply_async(), and it does not write ResultError.
> Here there're two other bugs, result_putter_error, only happens when:
> 1. The relay being measured, doesn't have a descriptor (#28870)
> 2. The operator hits KeyboardInterrupt (#28869)
>
> AFAIK, there're no other cases where the error callback is called.
The code is complicated, so it could throw other exceptions that you
haven't seen yet. Future code changes could also add more exceptions.
> > * Make sure timeouts store an error result to the queue.
> > * Add a unit test and integration test that makes sure every queued
relay has a result.
>
> Testing this is hard, but i'll see.
Replying to [comment:5 juga]:
> > Add a unit test and integration test that makes sure every queued
relay has a result.
> Maybe this could be done as part of #28566 instead?.
>
> #28933 runs the actual scanner. It is not counting that all the relays
get measured, though in the test network this is the case.
>
> Created PR without the tests:
https://github.com/torproject/sbws/pull/328
You'll also need to update the documentation:
https://github.com/torproject/sbws/blob/master/docs/source/specification.rst
#simple-relay-prioritization
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/28868#comment:6>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
_______________________________________________
tor-bugs mailing list
tor-bugs@xxxxxxxxxxxxxxxxxxxx
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs