Re: [tor-bugs] #18910 [Metrics/CollecTor]: distributing descriptors accross CollecTor instances
#18910: distributing descriptors accross CollecTor instances
-------------------------------+---------------------------------
Reporter: iwakeh | Owner: iwakeh
Type: enhancement | Status: needs_review
Priority: High | Milestone: CollecTor 1.1.0
Component: Metrics/CollecTor | Version:
Severity: Normal | Resolution:
Keywords: ctip | Actual Points:
Parent ID: | Points:
Reviewer: | Sponsor:
-------------------------------+---------------------------------
Comment (by iwakeh):
1. Valid points about file-io. In my local runs the network, not file-io,
was the bottleneck during the initial sync to an empty CollecTor instance.
Subsequent runs were much shorter. The mirror, running as a CollecTor
instance, needed only 20 min for the first sync-run and now needs far less
(1 to 3 min).
But anyway, it's true that some of the copying could and should be
reduced.
2. Ideally index.json should be a true picture of 'recent', but it'll
always only be a snapshot, even if it's updated with each change, because
the syncing instance cannot update index.json continuously. So,
CollecTor's sync should accommodate the possible differences, which it
does currently, I think.
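To illustrate what "accommodating the differences" can mean, here is a
minimal sketch in Python. It is not CollecTor's actual sync code: the
(path, last_modified) entries, the function name sync_from_snapshot, and
the fetch callback are all hypothetical and do not reflect the real
index.json schema. The idea is only that a path listed in the snapshot but
no longer served gets skipped instead of failing the run, and that files
already held locally cause no network or file-io at all.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical snapshot entry: (path, last_modified). This is NOT the real
# index.json schema, only an illustration.
IndexEntry = Tuple[str, int]


def sync_from_snapshot(
    snapshot: List[IndexEntry],
    local: Dict[str, int],
    fetch: Callable[[str], Optional[bytes]],
) -> Tuple[List[str], List[str]]:
    """Fetch entries that are new or newer than the local copy.

    Returns (fetched, skipped). An entry is skipped when the remote side no
    longer serves it, i.e. the snapshot was already stale for that path.
    """
    fetched: List[str] = []
    skipped: List[str] = []
    for path, last_modified in snapshot:
        if local.get(path, -1) >= last_modified:
            continue  # already up to date, no network or file I/O needed
        data = fetch(path)
        if data is None:
            skipped.append(path)  # listed in the snapshot, but gone by now
            continue
        local[path] = last_modified  # record what we now hold
        fetched.append(path)
    return fetched, skipped


# Made-up data: "recent/b" is in the snapshot but no longer on the remote.
remote_files = {"recent/a": b"descriptor a", "recent/c": b"descriptor c"}
snapshot = [("recent/a", 10), ("recent/b", 11), ("recent/c", 12)]
local_state = {"recent/a": 10}

fetched, skipped = sync_from_snapshot(snapshot, local_state, remote_files.get)
print(fetched, skipped)
```

With the made-up data above, only the missing file is fetched and the
stale entry is reported rather than treated as an error.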
How to proceed?
Do you think this should block the release?
I think it can be released as is, because the current set-up increases
descriptor availability a lot and is tested.
I'm wary of tuning it now without a release delay. And, for both writing
and parsing, there are duplicate and triplicate implementations in the
code-base, which should be streamlined and can be tuned in that process.
I'd suggest releasing now and opening new tickets (which will be part of
the other tickets for planning the streamlining, modularization, and other
improvements):
1. Streamline writing all over the code-base with an emphasis on reducing
file-io for CollecTor;
2. Make index.json as close to the current state as necessary and
feasible, which includes considering how accurate it needs to be for the
given use-cases. Maybe have a clean-up module before the index-run.
Is that an ok plan?
--
Ticket URL: <https://troodi.torproject.org/projects/tor/ticket/18910#comment:89>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
_______________________________________________
tor-bugs mailing list
tor-bugs@xxxxxxxxxxxxxxxxxxxx
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs