Re: [tor-bugs] #25161 [Metrics/CollecTor]: Fix another memory problem with the webstats bulk import
#25161: Fix another memory problem with the webstats bulk import
-------------------------------+--------------------------
Reporter: karsten | Owner: karsten
Type: defect | Status: assigned
Priority: Medium | Milestone:
Component: Metrics/CollecTor | Version:
Severity: Normal | Resolution:
Keywords: | Actual Points:
Parent ID: | Points:
Reviewer: | Sponsor:
-------------------------------+--------------------------
Changes (by iwakeh):
* owner: iwakeh => karsten
* status: accepted => assigned
Comment:
Providing plenty of RAM for the import shortens the processing time quite
a bit because less time is spent on GC. The 85 min needed with 16G for the
entire available archives of meronense and weschniakowii together (reported
[https://trac.torproject.org/projects/tor/ticket/25100#comment:18 here])
drop to just 65 min with 30G (of which only 22G were actually used at
peak time, 10G most of the time). Of course, timing depends highly on
the available cores (here only four were available) and, to a lesser
extent, on the type of CPU.
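To put the two timings side by side, here is a small sketch of the
arithmetic. Only the 85 min and 65 min figures come from the runs above;
the helper name relative_reduction is made up for illustration.

```python
def relative_reduction(before_min, after_min):
    """Fraction of processing time saved going from before_min to after_min."""
    return (before_min - after_min) / before_min


# Timings reported above: 85 min with 16G of RAM, 65 min with 30G.
minutes_16g = 85
minutes_30g = 65

saved = minutes_16g - minutes_30g
share = relative_reduction(minutes_16g, minutes_30g)
print(f"saved {saved} min ({share:.1%} of the 16G run)")
```

So on this four-core machine the extra RAM saved 20 minutes, a bit less
than a quarter of the original run time.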
If a machine with 64G is available, the import can simply be run on the
entire 'out' folder of webstats.tp.o and should be fine with 48-56G
(assuming that weschniakowii represents one of the hosts with the heavier
log load).
In case the import gets interrupted, the logs will clearly indicate which
hosts were processed successfully. This information should be used to move
the already completed imports out of the import directory to save
processing time. It is not a problem if that is forgotten: CollecTor won't
re-add or overwrite anything, but scanning the already imported files
again will make the run take longer.
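The clean-up after an interrupted run could be scripted roughly as below.
This is only a sketch and not part of CollecTor: it assumes one
subdirectory per host directly under the import directory, and the list
of completed hosts has to be taken from the logs by hand. The function
name and all paths are hypothetical.

```python
import shutil
from pathlib import Path


def move_completed_hosts(import_dir, done_dir, completed_hosts):
    """Move directories of already imported hosts out of import_dir.

    Assumes (hypothetically) one subdirectory per host under import_dir.
    Returns the list of host names that were actually moved.
    """
    import_dir = Path(import_dir)
    done_dir = Path(done_dir)
    done_dir.mkdir(parents=True, exist_ok=True)

    moved = []
    for host in completed_hosts:
        src = import_dir / host
        dst = done_dir / host
        # Skip hosts that are missing or were already moved earlier.
        if src.is_dir() and not dst.exists():
            shutil.move(str(src), str(dst))
            moved.append(host)
    return moved


# Example call (paths are placeholders):
# move_completed_hosts("import/webstats", "import-done", ["meronense"])
```

Hosts that are left in place are simply scanned again on the next run,
which costs time but, as noted above, does not change the result.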
The CollecTor properties should be set to single run, with limits turned
off, for importing the already existing sanitized logs.
I used metrics-lib commit 9f2db9a19 and collector commit 06d1a81d4 and
performed some manual checks that the resulting sanitized logs stay the
same except for the intended changes (e.g. removal of '?' etc.). All
seemed fine.
Assigning to 'karsten' as the import seems ready to go.
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/25161#comment:9>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online