Re: [tor-bugs] #7828 [Stem]: Run descriptor parser over all prior descriptors
#7828: Run descriptor parser over all prior descriptors
-------------------------+--------------------------------------------------
 Reporter:  atagar       |          Owner:  karsten
     Type:  task         |         Status:  accepted
 Priority:  normal       |      Milestone:
Component:  Stem         |        Version:
 Keywords:  descriptors  |         Parent:
   Points:               |   Actualpoints:
-------------------------+--------------------------------------------------
Changes (by karsten):
* status: new => accepted
* owner: atagar => karsten
Comment:
Replying to [comment:3 atagar]:
> Thanks! Here's a script that should do the trick. Just fill in the
> 'LOG_FILE' with the destination for the output, and provide the descriptor
> paths to the reader. The DescriptorReader's paths can be either files or
> directories.
Okay, I started running this on serra. This will take a few days to run.
Good thing serra is bored anyway.
> Are the descriptors in text files or tarballs? I'm hoping for the former
> since I suspect that we still have performance concerns around tarballs,
> but there's no rush on this so as long as it finishes eventually I'm
> happy.
I'm feeding it decompressed tarballs, since that's what's fastest with
metrics-lib. Do you know if that's different for stem? If so, can we do
anything to improve parsing of decompressed tarballs? That's the most
convenient input for all sorts of analyses. (Extracting years of descriptor
tarballs is somewhat painful, particularly if you accidentally include
those directories in a backup.)
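
For illustration, here is a minimal sketch of reading a decompressed
(uncompressed, but not extracted) tarball member by member, without
unpacking it to disk, and logging parse failures to a file. It uses only
Python's standard tarfile module and is not the script from comment:3.
The names iter_tar_members and check_tarball are hypothetical, as is the
'parse' callable, which stands in for whatever descriptor parser is being
exercised and is assumed to raise an exception on malformed input.

```python
import tarfile
from typing import Callable, Iterator, Tuple


def iter_tar_members(tar_path: str) -> Iterator[Tuple[str, bytes]]:
    """Yield (member name, raw bytes) for each regular file in a tarball."""
    # Mode "r:" accepts only an uncompressed archive; "r:*" would also
    # accept gzip/bz2/xz compressed ones.
    with tarfile.open(tar_path, mode="r:") as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories, links, and so on
            handle = archive.extractfile(member)
            if handle is None:
                continue
            with handle:
                yield member.name, handle.read()


def check_tarball(
    tar_path: str,
    parse: Callable[[bytes], object],  # hypothetical parser callable
    log_path: str,
) -> Tuple[int, int]:
    """Run parse over every member and return (parsed, failed) counts.

    Each failure is appended to log_path as "<tarball>:<member>: <error>".
    """
    parsed = failed = 0
    with open(log_path, "a", encoding="utf-8") as log:
        for name, data in iter_tar_members(tar_path):
            try:
                parse(data)
                parsed += 1
            except Exception as exc:
                failed += 1
                log.write(f"{tar_path}:{name}: {exc}\n")
    return parsed, failed
```

Reading members in archive order keeps disk access sequential and avoids
creating years' worth of small files, though whether this is actually
faster than reading extracted files would need to be measured.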
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/7828#comment:4>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online