Re: [tor-bugs] #13600 [Onionoo]: Improve bulk imports of descriptor archives

Subject: Re: [tor-bugs] #13600 [Onionoo]: Improve bulk imports of descriptor archives
From: "Tor Bug Tracker & Wiki" <blackhole@xxxxxxxxxxxxxx>
Date: Sun, 28 Jun 2015 00:38:32 -0000
Auto-submitted: auto-generated
Delivered-to: archiver@xxxxxxxx
Delivery-date: Sat, 27 Jun 2015 20:38:41 -0400
In-reply-to: <047.5337b9bf2309c919e1f7f906c4bc3f51@xxxxxxxxxxxxxx>
List-archive: <http://lists.torproject.org/pipermail/tor-bugs/>
List-help: <mailto:tor-bugs-request@lists.torproject.org?subject=help>
List-id: "auto: Tor bug tracker status mails" <tor-bugs.lists.torproject.org>
List-post: <mailto:tor-bugs@lists.torproject.org>
List-subscribe: <https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs>, <mailto:tor-bugs-request@lists.torproject.org?subject=subscribe>
List-unsubscribe: <https://lists.torproject.org/cgi-bin/mailman/options/tor-bugs>, <mailto:tor-bugs-request@lists.torproject.org?subject=unsubscribe>
References: <047.5337b9bf2309c919e1f7f906c4bc3f51@xxxxxxxxxxxxxx>
Reply-to: tor-assistants@xxxxxxxxxxxxxx
Sender: "tor-bugs" <tor-bugs-bounces@xxxxxxxxxxxxxxxxxxxx>

#13600: Improve bulk imports of descriptor archives
-----------------------------+-----------------
     Reporter:  karsten      |      Owner:
         Type:  enhancement  |     Status:  new
     Priority:  normal       |  Milestone:
    Component:  Onionoo      |    Version:
   Resolution:               |   Keywords:
Actual Points:               |  Parent ID:
       Points:               |
-----------------------------+-----------------

Comment (by leeroy):

 Thank you for clearing up the slight differences mentioned. I was hoping
 those were minor. There were other differences, but they were clearly
 trivial (like omission of rDNS, or use of the IP address for unresolved
 rDNS). I'll take another look at the NodeStatus code.

 __Input validation:__ Excellent, I was thinking this too! If extra
 validation is going to be performed, it's also worth looking into
 streaming data directly from the archives. I suspect this would be a
 significant advantage, since the uncompressed tarball would no longer
 take up extra disk space.
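
 A minimal sketch of what streaming reads from an archive could look like.
 It is written in Python for brevity and is not Onionoo or metrics-lib
 code; the function name iter_archive_members is hypothetical. It relies on
 the standard tarfile module's stream mode ("r|*"), which reads a
 compressed tarball sequentially without unpacking it to disk first.

```python
import io
import tarfile


def iter_archive_members(fileobj, mode="r|*"):
    """Yield (name, content) for each regular file in a compressed tarball.

    Hypothetical helper: "r|*" opens the archive as a forward-only stream
    with transparent compression detection, so nothing is extracted to disk
    and members must be consumed in archive order.
    """
    with tarfile.open(fileobj=fileobj, mode=mode) as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories, links, and other special entries
            extracted = archive.extractfile(member)
            if extracted is None:
                continue
            # Only the current member's bytes are held in memory here.
            yield member.name, extracted.read()
```

 Each descriptor file could then be validated and parsed as it is yielded,
 so the only extra space needed is for the member currently being read.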

 __Parsing archives:__ Sounds good. I was thinking of at least warning the
 operator about an accumulation of archives, but with #16424 this isn't as
 much of a problem.

 __Importing multiple months:__ I tested this while also trying to
 reproduce the smaller directory for parsed data. I got the out-of-memory
 (heap) error while using --update-only with '''two''' months. It occurred
 at approx. 80% (based on time), during consensus parsing (based on the
 stack trace). So parsing itself is very sensitive to heap memory. I have
 some thoughts on how to solve this. Besides disk-based data structures to
 reduce heap dependency, I'll take another look at metrics-lib to see if it
 can benefit from lexer-parser improvements. The heap dependency during
 parsing could be reduced, while making maintenance easier, by using a
 grammar-based recognizer, streaming reads (from archives), and lock-free
 (CAS) lists. Done right, this creates a parse stage that scales with I/O
 and combines parse and write, reducing the heap requirement.
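
 To illustrate the idea of combining parse and write, here is a rough
 sketch in Python (not metrics-lib code). It recognizes only a heavily
 simplified subset of consensus lines ("r" and "s" lines, stopping at
 "directory-footer"), and the function names and the JSON-lines output
 format are assumptions made for the example. It does not cover the
 lock-free list part. The point is that each entry is written out as soon
 as it is recognized, so heap use is bounded by one entry rather than by
 the whole document.

```python
import io
import json


def iter_status_entries(lines):
    """Yield one dict per status entry, keeping only the current entry in memory.

    Simplified, hypothetical recognizer: an "r" line starts a new entry and
    an "s" line supplies its flags; every other keyword is ignored.
    """
    entry = None
    for raw in lines:
        keyword, _, rest = raw.rstrip("\n").partition(" ")
        if keyword == "r":
            if entry is not None:
                yield entry  # the previous entry is complete
            entry = {"nickname": rest.split(" ")[0], "flags": []}
        elif keyword == "s" and entry is not None:
            entry["flags"] = rest.split()
        elif keyword == "directory-footer":
            break  # no status entries follow the footer
    if entry is not None:
        yield entry


def parse_and_write(lines, out):
    """Write each entry as one JSON line as it is recognized; return the count."""
    count = 0
    for entry in iter_status_entries(lines):
        out.write(json.dumps(entry, sort_keys=True) + "\n")
        count += 1
    return count
```

 Because both functions work on iterators, the input could come straight
 from a streamed archive member and the output could go straight to disk.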

 __Parsing archives:__ Due to the out-of-memory error I restarted this test
 using a smaller data set. I also hope it's harmless, but having seen it I
 don't want to rule it out until I can prove it. I'll notify you here once
 I know for sure.

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/13600#comment:11>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
_______________________________________________
tor-bugs mailing list
tor-bugs@xxxxxxxxxxxxxxxxxxxx
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs
