Re: [tor-bugs] #32265 [Metrics/Exit Scanner]: MS: Format an exit list from a previous exit list and exitmap output
#32265: MS: Format an exit list from a previous exit list and exitmap output
----------------------------------+--------------------------------
Reporter: irl | Owner: irl
Type: task | Status: needs_revision
Priority: Medium | Milestone:
Component: Metrics/Exit Scanner | Version:
Severity: Normal | Resolution:
Keywords: | Actual Points:
Parent ID: #29654 | Points:
Reviewer: karsten | Sponsor:
----------------------------------+--------------------------------
Comment (by irl):
Replying to [comment:5 karsten]:
> Glad to see that the rewrite is progressing so quickly!
>
> Couple remarks/questions:
> - Why 48 hours and not 24 hours? Doesn't the current exit scanner keep
> scan results for 24 hours? I might be wrong, though. Let's use whatever
> the current scanner does.
See the exit list spec:
https://2019.www.torproject.org/tordnsel/exitlist-spec.txt
It discards relays that have not been seen in a consensus in the last
48 hours.
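
A minimal sketch of that 48-hour rule, not the actual scanner code. It
assumes each entry is a dict with a hypothetical `last_status` key holding
the time the relay was last seen in a consensus; the function name and the
sample fingerprints are made up for illustration.

```python
from datetime import datetime, timedelta

# 48-hour window as described above; the name MAX_AGE is an assumption.
MAX_AGE = timedelta(hours=48)


def prune_stale(entries, now, max_age=MAX_AGE):
    """Keep only entries last seen in a consensus within max_age of now."""
    return [e for e in entries if now - e["last_status"] <= max_age]


# Placeholder data: one relay seen 1 hour ago, one seen 49 hours ago.
now = datetime(2019, 10, 30, 12, 0, 0)
entries = [
    {"fingerprint": "A" * 40, "last_status": datetime(2019, 10, 30, 11, 0, 0)},
    {"fingerprint": "B" * 40, "last_status": datetime(2019, 10, 28, 11, 0, 0)},
]
kept = prune_stale(entries, now)
```

Here only the first entry survives, because the second was last seen 49
hours before `now`.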
> - Rather than downloading exit lists from CollecTor, wouldn't it be
> sufficient to just read the latest exit list previously written by this
> scanner? And if there's none, just assume that no previous scans have
> happened. In theory, this should be all we need to learn.
Probably, but this was a handy way to get test data and I wanted to try
out the new Stem functionality. It would be nice to have a method to
bootstrap a new scanner, but this could just mean manually downloading the
latest exit list and putting it in the right place.
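
Reading the previous list from disk could look roughly like the sketch
below. It uses only the standard library rather than Stem, and it assumes
the `ExitNode` / `Published` / `LastStatus` / `ExitAddress` keywords from
the exit list spec. The function names, the dict layout, and the sample
entry (a documentation-range address and a placeholder fingerprint) are
assumptions for illustration.

```python
from datetime import datetime
from pathlib import Path

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_exit_list(text):
    """Parse exit list text into a list of dicts, one per ExitNode."""
    entries = []
    current = None
    for line in text.splitlines():
        keyword, _, rest = line.partition(" ")
        rest = rest.strip()
        if keyword == "ExitNode":
            current = {"fingerprint": rest, "addresses": []}
            entries.append(current)
        elif current is None:
            continue  # ignore anything before the first ExitNode line
        elif keyword in ("Published", "LastStatus"):
            current[keyword] = datetime.strptime(rest, TIME_FORMAT)
        elif keyword == "ExitAddress":
            address, _, when = rest.partition(" ")
            current["addresses"].append(
                (address, datetime.strptime(when, TIME_FORMAT))
            )
    return entries


def load_previous(path):
    """Return entries from the last written list, or [] if there is none."""
    p = Path(path)
    if not p.exists():
        return []  # no previous scans: start from an empty list
    return parse_exit_list(p.read_text())


# Placeholder entry, not real relay data.
SAMPLE = (
    "ExitNode " + "A" * 40 + "\n"
    "Published 2019-10-30 10:00:00\n"
    "LastStatus 2019-10-30 11:00:00\n"
    "ExitAddress 192.0.2.1 2019-10-30 11:30:00\n"
)
entries = parse_exit_list(SAMPLE)
```

Bootstrapping a new scanner would then only mean placing a downloaded
exit list at whatever path `load_previous` is given.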
> - It seems that `LastStatus` is only taken from exit lists downloaded
> from CollecTor but never set by new measurements. We should make a plan
> what to do with this field. Take it out? Populate it with consensus
> valid-after times?
Right, this is the tricky bit. Do you know if anything consumes the
LastStatus or Published timestamps? Ideally we could just drop these, but
for now I'm synthesizing them from the timestamp of the last measurement,
which could be close enough for the consumers.
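
As a rough illustration of that synthesis (a sketch, not the code under
review): both `Published` and `LastStatus` are filled with the timestamp of
the most recent measurement. The function name, the `(address, datetime)`
pair layout, and the sample values are assumptions.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def format_entry(fingerprint, measurements):
    """Format one exit list entry from (address, datetime) measurements.

    Published and LastStatus are not measured; both are synthesized from
    the most recent measurement time.
    """
    last = max(when for _, when in measurements)
    stamp = last.strftime(TIME_FORMAT)
    lines = [
        f"ExitNode {fingerprint}",
        f"Published {stamp}",   # synthesized, not from a descriptor
        f"LastStatus {stamp}",  # synthesized, not from a consensus
    ]
    for address, when in sorted(measurements, key=lambda m: m[1]):
        lines.append(f"ExitAddress {address} {when.strftime(TIME_FORMAT)}")
    return "\n".join(lines) + "\n"


# Placeholder fingerprint and documentation-range addresses.
entry = format_entry(
    "A" * 40,
    [
        ("192.0.2.1", datetime(2019, 10, 30, 11, 30, 0)),
        ("192.0.2.2", datetime(2019, 10, 30, 12, 0, 0)),
    ],
)
```

If consumers turn out to need real values, only the two synthesized lines
would have to change.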
> - Does exitmap with the plugin use previous scans as input to decide
> which relays to scan? I believe that it uses some logic to avoid scanning
> relays too frequently. This has two effects: it doesn't generate more load
> on the network and on single relays than necessary, and it ensures that
> new relays are scanned sooner. As a result, the new scanner could be run
> once or twice per hour, rather than every 2 or 3 hours (at 45 minutes
> runtime).
No. It scans the entire network every time. It does this asynchronously
and doesn't try to prioritize anything: whichever circuits are built
first are tested first. I was even thinking it could run continuously.
If exit relays cannot cope with two HTTP requests an hour, perhaps they
shouldn't be exit relays.
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/32265#comment:6>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
_______________________________________________
tor-bugs mailing list
tor-bugs@xxxxxxxxxxxxxxxxxxxx
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs