Re: [tor-bugs] #2687 [Torperf]: Write Python version of filter.R to parse Torperf's new .mergedata format (was: Update filter.R to parse Torperf's new .mergedata format)
#2687: Write Python version of filter.R to parse Torperf's new .mergedata format
-------------------------+--------------------------------------------------
Reporter: karsten | Owner:
Type: enhancement | Status: assigned
Priority: major | Milestone:
Component: Torperf | Version:
Keywords: | Parent:
Points: 4 | Actualpoints:
-------------------------+--------------------------------------------------
Changes (by karsten):
* owner: karsten =>
* status: needs_review => assigned
Comment:
Replying to [comment:18 tomb]:
> Tentative conclusion: R is ill suited to significant string
> manipulation
> Tentative recommendation: Let R crunch numbers and stats, but do the
> string manipulation in a different language.
Okay. I didn't expect R to be incapable of handling this data format,
because R is really fast at parsing CSV files, tables, and so on. But I
agree with you. Let's stop trying to use R for this task.
> Why not move the string manipulation into the programs that provide the
> .data and .mergedata?
You mean why not produce both the .mergedata format and another format
that R can handle more easily? Why would we need the .mergedata format
then? We should agree on a single data format that describes Torperf
data.
If we find another format that R can handle more easily, we should only
use that format. But we want to make sure that the data format can be
extended easily. For example, if we want to add another parameter like
CBT, we want to do that without breaking old stuff. Or we might want to
have some fields show up in only some of the measurements, like hidden
service substeps, but without writing NA for them in all other
measurements. And we want to be able to remove fields that we don't need
anymore. The key=value format seems more flexible than CSV here. See
#3036 for the Torperf data format discussion.
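To make the flexibility argument concrete, here is a minimal Python
sketch of reading key=value lines. It assumes each measurement is one
line of space-separated KEY=value pairs; the field names and values
(START, DATACOMPLETE, CBT) are made up for illustration and are not
taken from the actual .mergedata files.

```python
def parse_line(line):
    """Parse one line of space-separated key=value pairs into a dict."""
    record = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if not sep:
            # Skip tokens that are not key=value pairs.
            continue
        record[key] = value
    return record


# Hypothetical sample lines; field names and values are illustrative only.
lines = [
    "START=1300000000.00 DATACOMPLETE=1300000002.50",
    "START=1300000100.00 DATACOMPLETE=1300000101.75 CBT=3500",
]
records = [parse_line(line) for line in lines]


def completion_time(record):
    """Seconds between the (assumed) START and DATACOMPLETE fields."""
    return float(record["DATACOMPLETE"]) - float(record["START"])


# Code that only knows about the old fields keeps working when a new
# field such as CBT shows up in some lines.
times = [completion_time(r) for r in records]

# Fields present in only some measurements come back as None here,
# so no line has to carry an explicit NA.
cbt_values = [r.get("CBT") for r in records]
```

With a fixed-column CSV, adding or dropping a column changes every row
and every reader. Here a reader looks fields up by name, so it is only
affected by changes to the fields it actually uses.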
So, what can we do about this ticket? How about we rewrite filter.R in
Python? The rest of Torperf is written in Python, so we can expect
people to have it available. I'm changing the ticket summary
accordingly. If you disagree about Python or rewriting filter.R in it, we
can always change it to something else.
Thanks!
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/2687#comment:19>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online