Re: [tor-bugs] #32265 [Metrics/Exit Scanner]: MS: Format an exit list from a previous exit list and exitmap output

To: undisclosed-recipients: ;
Subject: Re: [tor-bugs] #32265 [Metrics/Exit Scanner]: MS: Format an exit list from a previous exit list and exitmap output
From: "Tor Bug Tracker & Wiki" <blackhole@xxxxxxxxxxxxxx>
Date: Wed, 20 Nov 2019 14:11:26 -0000
Auto-submitted: auto-generated
Delivered-to: archiver@xxxxxxxx
Delivery-date: Wed, 20 Nov 2019 09:11:39 -0500
In-reply-to: <043.db4dd86b8431bb0e23e8ec1295ccc978@torproject.org>
List-archive: <http://lists.torproject.org/pipermail/tor-bugs/>
List-help: <mailto:tor-bugs-request@lists.torproject.org?subject=help>
List-id: "auto: Tor bug tracker status mails" <tor-bugs.lists.torproject.org>
List-post: <mailto:tor-bugs@lists.torproject.org>
List-subscribe: <https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs>, <mailto:tor-bugs-request@lists.torproject.org?subject=subscribe>
List-unsubscribe: <https://lists.torproject.org/cgi-bin/mailman/options/tor-bugs>, <mailto:tor-bugs-request@lists.torproject.org?subject=unsubscribe>
References: <043.db4dd86b8431bb0e23e8ec1295ccc978@torproject.org>
Reply-to: no-reply@xxxxxxxxxxxxxx, tor-assistants@xxxxxxxxxxxxxx
Sender: "tor-bugs" <tor-bugs-bounces@xxxxxxxxxxxxxxxxxxxx>

#32265: MS: Format an exit list from a previous exit list and exitmap output
----------------------------------+------------------------------
 Reporter:  irl                   |          Owner:  irl
     Type:  task                  |         Status:  needs_review
 Priority:  Medium                |      Milestone:
Component:  Metrics/Exit Scanner  |        Version:
 Severity:  Normal                |     Resolution:
 Keywords:                        |  Actual Points:
Parent ID:  #29654                |         Points:
 Reviewer:  karsten               |        Sponsor:
----------------------------------+------------------------------

Comment (by karsten):

 Replying to [comment:9 irl]:
 > Replying to [comment:8 karsten]:
 > > Actually, I think it's harmful to download exit lists from CollecTor
 and merge them with the scanner's own measurements. We should instead
 merge new scan results with previous local results. It's also yet another
 dependency to download something from CollecTor that is not really needed.
 I'd say kill this code.
 >
 > Ok, it's gone.

 But it's still merging with the last-written local exit list?

 > > Well, the spec says what these fields are being used for: `Published`
 is used to skip relays that haven't published a new descriptor since the
 one in the current consensus, and `LastStatus` is used to know when to
 throw out relays from the list. This is all under the assumption that the
 scanner reads its previous exit list from disk before making measurements.
 > >
 > > My suggestion would be to use the consensus valid-after time as
 `LastStatus` time. It's pretty much the same as the `published` time in a
 version 2 status, and it would work for this purpose.
 >
 > I saw what TorDNSEL is using it for, but I wonder if people use exit
 lists in ways we haven't anticipated. I guess we can synthesize the
 valid-after time from the measurement time, but our plugin is not directly
 handling consensuses or server descriptors. It would take changes to
 exitmap internals to get this data out.

 I don't think we're using it (I'd have to check), nor do I know about
 others using it. But I'd be careful removing it or filling it with
 approximately correct data.

 Can we somehow access the consensus used for scanning and fill in these
 fields as part of the merge script? Maybe we can extend exitmap to dump
 that consensus to disk at the time of making a list of relays to scan?
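
 A minimal sketch of how a merge script might fill in these fields, assuming
 exitmap were extended to dump the consensus it scanned from to disk. The
 function names and the way the `Published` time reaches the script are
 hypothetical; only the `valid-after` consensus line and the exit list
 keywords (`ExitNode`, `Published`, `LastStatus`, `ExitAddress`) come from
 the existing formats.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_valid_after(consensus_text):
    """Return the valid-after time of a consensus document as a datetime."""
    prefix = "valid-after "
    for line in consensus_text.splitlines():
        if line.startswith(prefix):
            return datetime.strptime(line[len(prefix):], TIME_FORMAT)
    raise ValueError("no valid-after line found in consensus")


def format_exit_entry(fingerprint, published, last_status, exit_addresses):
    """Format one exit list entry.

    `last_status` is meant to be the consensus valid-after time, and
    `exit_addresses` is a list of (address, measurement time) pairs.
    Where `published` comes from is an assumption left to the caller.
    """
    lines = [
        "ExitNode " + fingerprint,
        "Published " + published.strftime(TIME_FORMAT),
        "LastStatus " + last_status.strftime(TIME_FORMAT),
    ]
    for address, measured in exit_addresses:
        lines.append(
            "ExitAddress %s %s" % (address, measured.strftime(TIME_FORMAT))
        )
    return "\n".join(lines) + "\n"
```

 With this shape, the merge script would call `parse_valid_after()` once per
 scan and pass the result as `last_status` for every relay measured in that
 scan.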

 > > > No. It scans the entire network every time. It does this
 asynchronously, and doesn't try to prioritize anything. Just whichever
 circuits are built first will be tested first. I was even thinking it
 could run continuously. If exit relays cannot cope with two HTTP requests
 an hour, perhaps they shouldn't be exit relays.
 > >
 > > Ideally, we would change as few variables at the same time as
 possible, in order to compare the new results with the old ones. Changing
 the scheduling from "only scan relays with changed descriptors" to "scan
 all relays once per hour" seems like a major design change that we could
 make at a later time.
 >
 > This could add a lot of time to the project. The exitmap architecture
 doesn't really have a way to do this, so it would take changes to the
 internals there. I guess we can perform the measurements and then throw
 them away as a shortcut option, but once we've done the measurement anyway
 that seems wasteful.

 I see. Then let's keep this in mind when comparing results. (This is
 mostly a note to myself. ;))

 One question though: If scanning takes 45 minutes right now, can we
 schedule scans in a way that they will still work when scanning takes 75
 minutes (larger network) or 15 minutes (fewer/faster exits)? For example,
 we should avoid concurrent runs, and if we do scans continuously, we
 should avoid scanning too frequently.
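
 One possible way to get both properties, sketched under the assumption that
 the scanner runs as a single long-lived process and that one scan start per
 hour is the target rate. `MIN_INTERVAL`, `delay_before_next_scan` and
 `run_forever` are hypothetical names, not part of exitmap.

```python
import time

MIN_INTERVAL = 3600  # assumed target: at most one scan start per hour


def delay_before_next_scan(started, finished, min_interval=MIN_INTERVAL):
    """Seconds to wait after a scan ends before starting the next one.

    A scan that took longer than min_interval is followed immediately by
    the next one; a shorter scan is padded so that starts are at least
    min_interval apart.
    """
    return max(0.0, min_interval - (finished - started))


def run_forever(scan, clock=time.monotonic, sleep=time.sleep):
    """Run scans one after another, so two scans never overlap."""
    while True:
        started = clock()
        scan()
        sleep(delay_before_next_scan(started, clock()))
```

 With these numbers, a 45-minute scan is followed by a 15-minute pause, a
 75-minute scan by no pause, and a 15-minute scan by a 45-minute pause. If
 scans were started from cron instead, a lock file would be needed to rule
 out concurrent runs.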

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/32265#comment:10>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
