
Re: [tor-bugs] #23367 [Metrics/Metrics website]: Onion address counts ignore descriptor upload overlap



#23367: Onion address counts ignore descriptor upload overlap
-------------------------------------+------------------------------
 Reporter:  teor                     |          Owner:  metrics-team
     Type:  defect                   |         Status:  new
 Priority:  Medium                   |      Milestone:
Component:  Metrics/Metrics website  |        Version:
 Severity:  Normal                   |     Resolution:
 Keywords:                           |  Actual Points:
Parent ID:                           |         Points:
 Reviewer:                           |        Sponsor:
-------------------------------------+------------------------------

Comment (by teor):

 Replying to [comment:1 karsten]:
 > I'll have to dive deeper into this topic, but here are some quick
 thoughts:
 >
 >  - I don't think we're including anything from v3 in these statistics,
 but we'd have to ask asn and dgoulet to be certain.

 No, we're not. We might end up collecting the v3 statistics using
 PrivCount in Tor.

 >  - I believe we're taking descriptor overlap periods into account for
 v2. See Section 5, "Extrapolating network totals" of the linked report:
 "As an approximation, we assume that a hidden service publishes its
 descriptor to ''twelve'' directories over a 24-hour period: the service
 stores ''two'' replicas per descriptor using different descriptor
 identifiers,  both descriptor replicas get stored to ''three'' different
 hidden-service directories each, and the service changes descriptor
 identifiers once every 24 hours which leads to ''two'' different
 descriptor identifiers per replica." And later in that section we say how
 this is just an approximation.
 >
 > Do you think there's a defect in the v2 code?

 Yes. In each 24-hour period, there is a 1-hour overlap where descriptors
 are posted to the current and next HSDirs. So services with addresses that
 correspond to the first or last hour (initial bytes 00-0B and F4-FF) can
 be seen at 6 or 18 directories, not 12. But this probably balances out
 over time.
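
 To illustrate the byte-to-time mapping: a minimal sketch, assuming the
 rend-spec v2 formula time-period = (now + first-byte * 86400 / 256) /
 86400 with integer division. Each first-byte step shifts the rotation by
 337.5 seconds, so one hour covers roughly 10 to 11 byte values, in line
 with the approximate ranges above. The function names are mine, and the
 sketch does not model where a statistics interval starts or ends.

```python
import base64

SECONDS_PER_DAY = 86400


def first_byte(onion_v2):
    """First byte of the permanent ID encoded in a 16-char v2 address."""
    return base64.b32decode(onion_v2.upper())[0]


def v2_time_period(now, byte):
    """Time period number for a service whose permanent ID starts with byte."""
    return (now + byte * SECONDS_PER_DAY // 256) // SECONDS_PER_DAY


def rotation_second_of_day(byte):
    """Second of the UTC day at which the descriptor ID changes."""
    offset = byte * SECONDS_PER_DAY // 256
    return (SECONDS_PER_DAY - offset) % SECONDS_PER_DAY


def bytes_rotating_near_midnight(window=3600):
    """First-byte values whose rotation falls within window seconds of 00:00 UTC."""
    return [b for b in range(256)
            if rotation_second_of_day(b) < window
            or rotation_second_of_day(b) >= SECONDS_PER_DAY - window]


if __name__ == "__main__":
    for b in bytes_rotating_near_midnight():
        s = rotation_second_of_day(b)
        print("%02X rotates at %02d:%02d:%02d UTC" % (b, s // 3600, s % 3600 // 60, s % 60))
```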

 This is how I fixed it in experimental PrivCount (there might be bugs):
 https://github.com/privcount/privcount/pull/423/commits/4f1fb9191c9f3c5dc0ccbfe43c2b021a213a0c78

 I also wonder if you need to account for the 1-2 hour delay between a
 consensus being produced, and clients downloading and using it. But the
 variance is probably small.

 > And, independent of that question, is there anything in particular that
 we should keep in mind when extending this code to v3?

 * There is an overlap for 12 hours per day, from when the client receives
 the 0000 consensus, for 36 hours (that is, approximately 0100-0200 for 36
 hours)
 * The hash ring changes every 24 hours based on the SRV
 * You need the ed25519 relay ids from descriptors to calculate the hash
 ring (they're not in the consensus)
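
 The last two points can be sketched as follows. This is a rough
 illustration, assuming my reading of rend-spec-v3 is right: the time
 period number is (minutes since epoch - 12*60) / 1440, and a relay's
 position is SHA3-256("node-idx" | ed25519 id | SRV | INT_8(period_num) |
 INT_8(period_length)). The function names and the relay and SRV inputs
 are placeholders; please check the formulas against the spec before
 relying on them.

```python
import hashlib
import struct

PERIOD_LENGTH_MIN = 1440        # assumed default time period length, in minutes
ROTATION_OFFSET_MIN = 12 * 60   # assumed offset: 12 voting periods of one hour


def v3_time_period_num(unix_time):
    """Time period number for a Unix timestamp, per the assumed formula."""
    minutes = unix_time // 60
    return (minutes - ROTATION_OFFSET_MIN) // PERIOD_LENGTH_MIN


def hsdir_index(ed25519_id, srv, period_num):
    """Position of one relay on the hash ring (32-byte digest)."""
    data = (b"node-idx" + ed25519_id + srv
            + struct.pack(">Q", period_num)
            + struct.pack(">Q", PERIOD_LENGTH_MIN))
    return hashlib.sha3_256(data).digest()


def hash_ring(relay_ids, srv, period_num):
    """Relay ed25519 ids sorted by their position on the ring."""
    return sorted(relay_ids, key=lambda r: hsdir_index(r, srv, period_num))


if __name__ == "__main__":
    relays = [bytes([i]) * 32 for i in range(5)]   # placeholder ed25519 ids
    period = v3_time_period_num(1460545200)        # 2016-04-13 11:00 UTC
    for srv in (b"\x01" * 32, b"\x02" * 32):       # placeholder SRVs
        print(period, [r[0] for r in hash_ring(relays, srv, period)])
```

 The ring order depends on both the SRV and the ed25519 ids, which is why
 the consensus alone is not enough to rebuild it.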

 There are a few more minor things that affect v2 and v3. I added a list to
 experimental PrivCount's position weights script:
 https://github.com/privcount/privcount/pull/423/commits/e4d5786469b12781a10b1c875d9228d65a17b2d9#diff-a5cebcf3ce45960e58426e68588e82e1R41

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/23367#comment:2>