Re: [tor-bugs] #29624 [Metrics/ExoneraTor]: New version of exit list format
#29624: New version of exit list format
-------------------------------------------------+-------------------------
Reporter: irl | Owner: karsten
Type: task | Status: accepted
Priority: Medium | Milestone:
Component: Metrics/ExoneraTor | Version:
Severity: Normal | Resolution:
Keywords: metrics-exit-list-project metrics-roadmap-2019-q2 | Actual Points:
Parent ID: | Points:
Reviewer: | Sponsor:
-------------------------------------------------+-------------------------
Comment (by karsten):
Replying to [comment:2 irl]:
> Replying to [comment:1 karsten]:
> > > * Source ASN
> > > * Source country
> >
> > I'm not sure about these. We would basically include these by having
> > the source look up its IP address in a database. But then the result
> > depends on which database (version) the source uses. Of course, whoever
> > uses this information could as well look up the source IP address in the
> > database (version) of their choice and discard these two fields. Maybe
> > this means we shouldn't put too much effort into the source's ability to
> > include these two fields. Or we could just omit them from the spec. Not
> > sure!
>
> I would rather have them as optional if you think that they would not be
> required. I would expect this to be either declared by the user, who
> should know best, or looked up via RIPEstat.

We can specify these. What format would we expect the country and ASN to
be in?

> > This one is tricky. I don't think that the current scanner includes
> > scans that ended with unknown failures or timeouts. It includes, for each
> > found exit IP address, the latest scan time of a successful run resulting
> > in that IP address. It probably omits IP addresses after a given number of
> > hours, but we'd have to look at the code in order to know.
>
> I would like to have one line per measurement, whether it succeeds, has
> a duplicate result, fails, or whatever. This helps us to understand how
> the tool is performing and doesn't hide information that would be really
> useful in debugging.

I see your point. However, this would be a backward-incompatible change to
the current document format, in which each IP address appears in at most
one `ExitAddress` line for any given router. It also might not scale if we
add many scans that all end with the same result. Unclear.

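To make the compatibility question concrete, here is a minimal sketch of how a consumer could collapse per-measurement records back into the current one-line-per-address shape. The `(address, scan_time, succeeded)` tuple layout and both function names are hypothetical and not part of any spec; dropping failed scans mirrors what the current scanner is believed to do, which has not been checked against the code.

```python
from datetime import datetime


def latest_scan_per_address(measurements):
    """Collapse per-measurement records to one entry per exit address.

    `measurements` is an iterable of (address, scan_time, succeeded)
    tuples. This layout is hypothetical, chosen only for illustration.
    """
    latest = {}
    for address, scan_time, succeeded in measurements:
        if not succeeded:
            # Assumption: failed or timed-out scans are dropped.
            continue
        if address not in latest or scan_time > latest[address]:
            latest[address] = scan_time
    return latest


def format_exit_address_lines(latest):
    """Emit one `ExitAddress <address> <scan time>` line per address."""
    return [
        f"ExitAddress {address} {scan_time:%Y-%m-%d %H:%M:%S}"
        for address, scan_time in sorted(latest.items())
    ]
```

With something like this, a per-measurement format would still let consumers derive the current view, at the cost of larger documents.
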
> > > I think we should probably have one line per measurement, so IPv4
> > > and IPv6 results would be listed separately, not on the same line. In the
> > > future we may have differing transports to consider (TCP/QUIC/something
> > > else) so maybe we should not just have IPv4 vs IPv6 but some numeric
> > > identifier that is later extensible.
> >
> > Agreed on the IPv4/IPv6 distinction. I was thinking to simply include
> > a new `ExitAddress6` line for IPv6 addresses and continue using
> > `ExitAddress` for IPv4 addresses. And I'd probably simply add another
> > keyword for the next transport or address version. What else do you have
> > in mind?
>
> This could also work, but we should do it in a way that we have defined
> a generalised format for the measurement result and then we have specifics
> for IPv4 and IPv6 which should just be that the expected address format is
> different.

Sounds good.

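As a rough illustration of "one generalised line format, with only the address syntax differing", here is a sketch of a parser. It assumes the line shape `<keyword> <address> <YYYY-MM-DD> <HH:MM:SS>` and an unbracketed IPv6 address; `ExitAddress6` is only proposed above, and the `ADDRESS_KEYWORDS` table and `parse_address_line` name are hypothetical.

```python
import ipaddress
from datetime import datetime

# Keyword -> expected IP version. `ExitAddress6` is only a proposal from
# this ticket; nothing here is specified yet.
ADDRESS_KEYWORDS = {"ExitAddress": 4, "ExitAddress6": 6}


def parse_address_line(line):
    """Parse '<keyword> <address> <date> <time>' into a tuple."""
    keyword, address, date, time = line.split()
    expected_version = ADDRESS_KEYWORDS[keyword]  # KeyError if unknown
    parsed = ipaddress.ip_address(address)
    if parsed.version != expected_version:
        raise ValueError(
            f"{keyword} requires an IPv{expected_version} address"
        )
    scan_time = datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S")
    return keyword, parsed, scan_time
```

Adding a keyword for another address version or transport would then only mean adding a table entry plus whatever address check that keyword needs.
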
> > Relatedly, I'd want to include `OrAddress` and `OrAddress6` for the
> > addresses found in the consensus. Background is that I'd like to use exit
> > lists as the single input document type for ExoneraTor in the future.
>
> Perhaps we are describing Internet Address Lists and not Exit Lists?

Possibly. Maybe this won't scale, either. Unclear.

> > > Exit lists are not currently included in torspec but probably should
> > > be. The specification should cover the existing format, and then also the
> > > new format. We should expect that we will later extend the new format with
> > > a signature. Maybe we should just figure that out now also.
> >
> > Turns out that specifying the existing format is not trivial. Right
> > now I'm looking at metrics-lib only, but I think I'll have to look at
> > other code that produces/consumes these lists. For example, it would be
> > great to know whether `Published` and `LastStatus` in the current format
> > are considered required or optional fields, because it would be very
> > convenient to lose them in version 2. What other code should I be looking
> > at?
>
> I've not thought about this yet, but why would it be convenient to lose
> these in version 2?

Both are rather implementation-specific pieces of information that are not
really relevant for exit lists. `Published` is used to avoid doing another
scan until the next descriptor arrives, and `LastStatus` is used to decide
when to discard a router. Both are contained in exit lists because exit
lists are not primarily an output format but an internal state file used by
TorDNSEL.

It doesn't hurt to have these lines, except that they eat up space.
However, declaring them as required means we can never remove them from
future formats without making a backward-incompatible change. But maybe
this ship has sailed, and we need to consider them required, because they
have always been there.

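For illustration only, this is roughly what a consumer looks like if it treats `Published` and `LastStatus` as optional. The `ExitNode` keyword as the per-router identifier line is an assumption about the current format that should be checked against real exit lists, and `parse_router_entry` is a hypothetical helper, not code from metrics-lib.

```python
def parse_router_entry(lines):
    """Parse one router's block of an exit list into a dict.

    Only `ExitNode` is treated as required here; `Published` and
    `LastStatus` are accepted but may be absent. That choice is an
    assumption for illustration, not what any spec says.
    """
    entry = {
        "ExitNode": None,
        "Published": None,
        "LastStatus": None,
        "ExitAddress": [],
    }
    for line in lines:
        keyword, _, rest = line.strip().partition(" ")
        if keyword == "ExitAddress":
            entry["ExitAddress"].append(rest)
        elif keyword in entry:
            entry[keyword] = rest
        # Unknown keywords are ignored so later additions do not break
        # this parser.
    if entry["ExitNode"] is None:
        raise ValueError("missing ExitNode line")
    return entry
```

A parser written like this keeps working whether or not version 2 drops the two lines; one that insists on them would not.
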
P.S.: Do we need a new Metrics/* subcomponent for this exit list work?
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/29624#comment:3>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online