Re: [tor-bugs] #13720 [Ooni]: Investigate possible performance improvements to the ooni-pipeline
#13720: Investigate possible performance improvements to the ooni-pipeline
-----------------------------+---------------------
Reporter: hellais | Owner: hellais
Type: enhancement | Status: new
Priority: normal | Milestone:
Component: Ooni | Version:
Resolution: | Keywords:
Actual Points: | Parent ID:
Points: |
-----------------------------+---------------------
Comment (by dcf):
For what it's worth, I was also struggling with the slowness of the Python
yaml module (in the context of
[https://lists.torproject.org/pipermail/ooni-dev/2015-June/000288.html this project]
to find server-side blocking of Tor in OONI reports). For me,
yaml.CSafeLoader is ''way'' faster, about 30×.
These are the times to parse 1.5 GB of gzip files, consisting of
http_requests reports between 2015-06-16 and 2015-06-24:
{{{
yaml.safe_load_all(f)
real 138m29.467s
user 138m27.808s
sys 0m6.356s
yaml.load_all(f, Loader=yaml.CSafeLoader)
real 4m40.021s
user 5m21.960s
sys 0m7.428s
}}}
I had tried optimizing the HTML parsing and gzip decompression; the YAML
decoding was the bottleneck by far.
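
As a minimal sketch of the loader swap timed above (not the exact script I ran): this assumes PyYAML is built with the libyaml bindings, and falls back to the pure-Python SafeLoader when it is not. The helper name iter_report_docs is made up for illustration.

```python
import gzip

import yaml

# yaml.CSafeLoader exists only when PyYAML was built against libyaml;
# otherwise fall back to the (much slower) pure-Python SafeLoader.
Loader = getattr(yaml, "CSafeLoader", yaml.SafeLoader)


def iter_report_docs(fileobj):
    """Yield each YAML document from a gzip-compressed report stream.

    `fileobj` is any binary file-like object holding gzip data
    (hypothetical helper, named here only for illustration).
    """
    with gzip.GzipFile(fileobj=fileobj, mode="rb") as gz:
        # load_all is lazy, so documents are parsed one at a time
        # rather than holding a whole report in memory.
        for doc in yaml.load_all(gz, Loader=Loader):
            yield doc
```

Called on a file opened with open(path, "rb"), this yields the report header and then each entry. Both loaders are restricted to plain YAML tags, so the swap should change only the speed, not what gets constructed.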
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/13720#comment:2>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online