Re: [tor-bugs] #18798 [CollecTor]: analysis of descriptor completeness
#18798: analysis of descriptor completeness
-----------------------+-----------------------------------
Reporter: iwakeh | Owner: iwakeh
Type: task | Status: needs_information
Priority: Medium | Milestone:
Component: CollecTor | Version:
Severity: Normal | Resolution:
Keywords: ctip | Actual Points:
Parent ID: | Points:
Reviewer: | Sponsor:
-----------------------+-----------------------------------
Comment (by karsten):
Thanks for starting this analysis!
Let me first speculate about the cause for those daily patterns that you
found there. And let me start by giving an example from recent logs to
explain the format of `M-` lines:
{{{
M-2016-04-11T22:00:00Z -> D-38F20E16457647CCFF5BD131692D5FCA129E87DC210B456DA983AB291141C85D (0.0279 -> 0.0279)
M-2016-04-11T23:00:00Z -> D-38F20E16457647CCFF5BD131692D5FCA129E87DC210B456DA983AB291141C85D (0.0279 -> 0.0279)
M-2016-04-11T23:00:00Z -> D-597C4455AF049B147337BBFF35CE4817676339FF5C94E971A05D416FD1A2DD95 (0.0279 -> 0.0558)
M-2016-04-12T00:00:00Z -> D-38F20E16457647CCFF5BD131692D5FCA129E87DC210B456DA983AB291141C85D (0.0280 -> 0.0558)
M-2016-04-12T00:00:00Z -> D-597C4455AF049B147337BBFF35CE4817676339FF5C94E971A05D416FD1A2DD95 (0.0280 -> 0.0558)
}}}
1. The first line means that the microdescriptor consensus with
valid-after time `2016-04-11 22:00:00` references a microdescriptor with
digest `38F2..` that we're missing. That missing microdescriptor adds a
value of `0.0279` to the total missing descriptor count, which is then
`0.0279`. The idea is to only warn if that total value exceeds `1.0`.
1. The second line says that the same missing microdescriptor is also
referenced from the microdescriptor consensus with valid-after time
`2016-04-11 23:00:00`. Since we shouldn't double-count that missing
descriptor, the total count stays at `0.0279`.
1. The third line mentions another missing microdescriptor, with digest
`597C..`, which is referenced from the microdescriptor consensus with
valid-after time `2016-04-11 23:00:00`. That one raises the total count
by another `0.0279` to `0.0558`.
1. I guess the remaining two lines are self-explanatory at this point.
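The counting rule described above can be sketched in a few lines of Python. This is only an illustration, not CollecTor's actual code: it assumes each entry occupies a single line in the log file in the layout `M-<valid-after> -> D-<digest> (<weight> -> <total>)`, and it assumes that a digest contributes the weight from the first line that mentions it. The names `M_LINE`, `total_missing` and `sample` are made up for this example.

```python
import re

# Assumed single-line layout (not confirmed against CollecTor's source):
#   M-<valid-after> -> D-<digest> (<weight> -> <running total>)
M_LINE = re.compile(
    r"^M-(?P<valid_after>\S+) -> D-(?P<digest>[0-9A-Fa-f]+) "
    r"\((?P<weight>[0-9.]+) -> (?P<total>[0-9.]+)\)$"
)


def total_missing(lines):
    """Add up the weight of each missing microdescriptor once per digest."""
    seen = set()
    total = 0.0
    for line in lines:
        match = M_LINE.match(line.strip())
        if match is None:
            continue  # not an M- line in the assumed layout
        digest = match.group("digest")
        if digest not in seen:
            # Assumption: the weight from the first mention is the one counted.
            seen.add(digest)
            total += float(match.group("weight"))
    return total


sample = [
    "M-2016-04-11T22:00:00Z -> D-38F20E16457647CCFF5BD131692D5FCA129E87DC210B456DA983AB291141C85D (0.0279 -> 0.0279)",
    "M-2016-04-11T23:00:00Z -> D-38F20E16457647CCFF5BD131692D5FCA129E87DC210B456DA983AB291141C85D (0.0279 -> 0.0279)",
    "M-2016-04-11T23:00:00Z -> D-597C4455AF049B147337BBFF35CE4817676339FF5C94E971A05D416FD1A2DD95 (0.0279 -> 0.0558)",
    "M-2016-04-12T00:00:00Z -> D-38F20E16457647CCFF5BD131692D5FCA129E87DC210B456DA983AB291141C85D (0.0280 -> 0.0558)",
    "M-2016-04-12T00:00:00Z -> D-597C4455AF049B147337BBFF35CE4817676339FF5C94E971A05D416FD1A2DD95 (0.0280 -> 0.0558)",
]

# A warning would only be due once the total exceeds 1.0.
should_warn = total_missing(sample) > 1.0
```

For the five example entries this gives a total of about `0.0558`, matching the last value in the log, so no warning would be due.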
Now, what could be the reason for the daily pattern you found there?
First of all, this has to do with the Tor network growing and shrinking
over the day (surprise!). My guess is that quite a few of the relays of
which we're missing microdescriptors leave the network during some part of
the day and rejoin at a later time. So, when your numbers go up again,
those are microdescriptors that we're still missing at that point, not
newly missing microdescriptors. At least that's my guess, I didn't
confirm it with real data.
Another reason for the high increase could be that you're double-counting
missing descriptors by counting a descriptor that's referenced from n
consensuses n times. Again, I didn't check whether that's how you're
counting things, I'm just guessing.
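Both guesses could be checked with a small script along these lines. It is a rough sketch, not something I ran on real data: it assumes the `M-` lines have already been reduced to `(valid_after, digest)` pairs, and the function name `summarize` is made up. "Newly missing" here only means "not referenced from any earlier consensus in the same input".

```python
from collections import defaultdict


def summarize(references):
    """Split missing digests per consensus into newly and still missing.

    references: iterable of (valid_after, digest) pairs, one per M- line.
    Returns (rows, naive_count, distinct_count), where each row is
    (valid_after, newly_missing, still_missing).
    """
    by_consensus = defaultdict(set)
    for valid_after, digest in references:
        by_consensus[valid_after].add(digest)

    seen = set()
    rows = []
    # ISO 8601 timestamps in one format sort chronologically as strings.
    for valid_after in sorted(by_consensus):
        digests = by_consensus[valid_after]
        newly = digests - seen
        rows.append((valid_after, len(newly), len(digests) - len(newly)))
        seen |= digests

    # Naive count: every reference counts, so n consensuses give n hits.
    naive_count = sum(len(d) for d in by_consensus.values())
    return rows, naive_count, len(seen)


refs = [
    ("2016-04-11T22:00:00Z", "38F2"),
    ("2016-04-11T23:00:00Z", "38F2"),
    ("2016-04-11T23:00:00Z", "597C"),
    ("2016-04-12T00:00:00Z", "38F2"),
    ("2016-04-12T00:00:00Z", "597C"),
]
rows, naive_count, distinct_count = summarize(refs)
```

With the five references from the example above, the naive count is 5 while only 2 distinct microdescriptors are missing. If the daily peaks mostly consist of "still missing" entries, that would support the first guess.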
Regarding your other question of which lines to look at, the `M-` lines
are only a small part of what we're interested in. In theory, everything
after `Missing referenced descriptors:` is relevant for this analysis.
Each of those lines lists a descriptor that references another descriptor
that we're missing, which includes lines starting with:
- `S-`: a server descriptor references an extra-info descriptor that is
missing,
- `V-`: a vote references a server descriptor that we're missing,
- `C-`: a consensus references a server descriptor that we're missing,
and
- `M-`: a microdescriptor consensus references a microdescriptor that is
missing (see above).
I guess it would be interesting to have statistics on all four types of
missing descriptors (or three if we count a server descriptor referenced
from a vote or a consensus as the same). Did I only give you three days
of logs? If so, I should give you at least a month of logs. In
particular the disk-full problem would skew the results a bit.
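A first tally of the four line types could look roughly like the following sketch. It makes several assumptions that would need checking against the real logs: that the header text appears somewhere on its own log line, that the relevant section runs to the end of the input, and that the two-character prefix is at the start of each line. The sample lines are placeholders, not real log output, and the function counts references (lines), not distinct descriptors.

```python
from collections import Counter

HEADER = "Missing referenced descriptors:"
PREFIXES = ("S-", "V-", "C-", "M-")


def tally_reference_types(lines, merge_votes_and_consensuses=False):
    """Count S-, V-, C- and M- lines that follow the header line."""
    counts = Counter()
    in_section = False
    for line in lines:
        line = line.strip()
        if not in_section:
            # Assumption: everything after the header belongs to the section.
            in_section = HEADER in line
            continue
        prefix = line[:2]
        if prefix not in PREFIXES:
            continue
        if merge_votes_and_consensuses and prefix in ("V-", "C-"):
            # Votes and consensuses both reference server descriptors.
            prefix = "V-/C-"
        counts[prefix] += 1
    return counts


# Placeholder lines only; the real S-, V- and C- layouts are not shown here.
sample_log = [
    "S-placeholder (before the header, ignored)",
    "Missing referenced descriptors:",
    "S-placeholder",
    "V-placeholder",
    "C-placeholder",
    "C-placeholder",
    "M-placeholder",
]
four_types = tally_reference_types(sample_log)
three_types = tally_reference_types(sample_log, merge_votes_and_consensuses=True)
```

Passing `merge_votes_and_consensuses=True` gives the three-type variant, where a server descriptor referenced from a vote or a consensus is counted under one label.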
Thanks!
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/18798#comment:2>