
[tor-bugs] #9183 [Onionoo]: Avoid parsing server descriptors more than once



#9183: Avoid parsing server descriptors more than once
-------------------------+--------------------------------------------------
 Reporter:  karsten      |          Owner:  karsten
     Type:  enhancement  |         Status:  new    
 Priority:  major        |      Milestone:         
Component:  Onionoo      |        Version:         
 Keywords:               |         Parent:         
   Points:               |   Actualpoints:         
-------------------------+--------------------------------------------------
 In the past 48 hours, Onionoo's hourly cronjob has taken between 6 and 61
 minutes.  The latter number is particularly problematic, because a run
 that takes longer than 60 minutes is still going when the next one starts,
 and two cronjobs must not overlap, in theory.

 An analysis of substeps shows that I/O-heavy steps have the highest variance.
 For example, relay and bridge server descriptors are parsed in three
 places in the code, which takes between 0:16 and 18:18 minutes, between
 0:08 and 22:34 minutes, and between 0:11 and 14:30 minutes, respectively.

 We can save some time here by not parsing server descriptors more than
 once.  In the mentioned cases, we simply parse all server descriptors
 published in the last 72 hours.  What we should do instead is keep a parse
 history to only parse those descriptors published in the last hour, and
 read contents of older descriptors from our own state files.
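
 A minimal sketch of how such a parse history could work, written in Python
 for brevity and not taken from the Onionoo code.  The function name
 plan_run, the digest strings, and the timestamps are all made up for
 illustration; the 72-hour window is the one mentioned above.  The idea is
 to parse only descriptors that are not yet in the history, reuse state
 files for the rest, and expire history entries that fall out of the
 window.

```python
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(hours=72)  # window taken from the ticket text


def plan_run(available, history, now):
    """Split descriptors into those to parse and those to reuse.

    available: dict mapping descriptor digest -> publication time
    history:   dict mapping digest -> publication time, from earlier runs
    Returns (to_parse, reuse, new_history).
    """
    cutoff = now - MAX_AGE
    to_parse, reuse = [], []
    for digest, published in sorted(available.items()):
        if published < cutoff:
            continue  # outside the 72-hour window, ignore entirely
        if digest in history:
            reuse.append(digest)     # contents come from our state files
        else:
            to_parse.append(digest)  # first time we see this descriptor
    # Keep history entries that are still inside the window and add the
    # newly parsed ones, so the history does not grow without bound.
    new_history = {d: t for d, t in history.items() if t >= cutoff}
    new_history.update({d: available[d] for d in to_parse})
    return to_parse, reuse, new_history


# Hypothetical example data: three descriptors on disk, one already parsed.
now = datetime(2013, 7, 1, 12, 0, tzinfo=timezone.utc)
available = {
    "aaa": now - timedelta(hours=80),    # older than 72 hours
    "bbb": now - timedelta(hours=30),    # parsed in an earlier run
    "ccc": now - timedelta(minutes=20),  # new since the last run
}
history = {"bbb": available["bbb"], "old": now - timedelta(hours=100)}
to_parse, reuse, new_history = plan_run(available, history, now)
```

 With this example data, only "ccc" would be parsed, "bbb" would be read
 from state files, and the expired "old" entry would be dropped from the
 history.  A real implementation would also have to persist the history
 between runs and decide what to do when a state file is missing.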

 Start with WeightsDataWriter and then tweak DetailsDataWriter.

-- 
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/9183>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
_______________________________________________
tor-bugs mailing list
tor-bugs@xxxxxxxxxxxxxxxxxxxx
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs