Re: [tor-bugs] #16424 [metrics-lib]: Support parsing of .xz compressed tarballs
#16424: Support parsing of .xz compressed tarballs
-----------------------------+---------------------
Reporter: karsten | Owner: karsten
Type: enhancement | Status: new
Priority: normal | Milestone:
Component: metrics-lib | Version:
Resolution: | Keywords:
Actual Points: | Parent ID:
Points: |
-----------------------------+---------------------
Comment (by leeroy):
Which other improvements do you mean: the parsing improvement or the read
from archive? If it's the parsing improvement, that's a surprise. As far as
the read from archives is concerned, I'll paraphrase in pseudo-pseudocode
what metrics-lib does versus what I do in the benchmark.
__Metrics-lib__
1. (8k reads) get BufferedInputStream(get TarArchiveInputStream(get
FileInputStream from archive))
1. (set position) get a tar entry corresponding to a contained descriptor
file
1. construct an unbounded ByteArrayOutputStream of unknown size, whose
growth the JVM has to manage
1. read 1k from the BufferedInputStream into a separate 1k buffer
1. copy this 1k buffer into the ByteArrayOutputStream
1. repeat (4) and (5) until all bytes of the tar entry are read
1. now the ByteArrayOutputStream holds a copy of the tar entry; convert
it to a byte array (make another copy)
1. parse the copy
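As a rough illustration of the copy pattern in the steps above, here is a minimal Java sketch. It is not the metrics-lib source: the class name `CopyLoopRead` and the method `readAll` are hypothetical, and a `ByteArrayInputStream` stands in for the tar entry stream so that the example runs with only the JDK.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical sketch of the "copy loop" read; not taken from metrics-lib.
final class CopyLoopRead {

    // Drains the stream through a 1 KiB scratch buffer into a growable
    // ByteArrayOutputStream, then copies the collected bytes out once more.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream(); // final size unknown
        byte[] chunk = new byte[1024];
        int len;
        while ((len = in.read(chunk, 0, chunk.length)) != -1) {
            baos.write(chunk, 0, len); // copy the chunk into the growing buffer
        }
        return baos.toByteArray();     // another full copy of the entry
    }

    public static void main(String[] args) throws IOException {
        byte[] entry = new byte[5000]; // stand-in for the bytes of one tar entry
        for (int i = 0; i < entry.length; i++) {
            entry[i] = (byte) i;
        }
        // BufferedInputStream uses an 8 KiB internal buffer by default.
        InputStream in = new BufferedInputStream(new ByteArrayInputStream(entry));
        byte[] copy = readAll(in);
        System.out.println("read " + copy.length + " bytes");
    }
}
```

Each entry's bytes end up being copied several times in this pattern: into the scratch buffer, into the output stream (which may also reallocate as it grows), and finally into the array returned by `toByteArray()`.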
__Benchmark__
1. (8k reads) get ArchiveInputStream(get BufferedInputStream(get
FileInputStream from archive))
1. (set position) get an archive entry corresponding to a contained
descriptor file
1. construct a bounded byte array from the known size of the archive
entry, a primitive array which the JVM only has to GC
1. read the archive entry into the byte array
1. parse the byte array
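The bounded read in the steps above might look roughly like the following Java sketch. It is not the actual benchmark code: the class name `BoundedRead` and the method `readEntry` are hypothetical, and a `ByteArrayInputStream` stands in for the archive stream so that the example needs only the JDK. With Commons Compress, the size would presumably come from the entry's `getSize()`.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical sketch of the "bounded" read; not taken from the benchmark.
final class BoundedRead {

    // Reads exactly `size` bytes straight into one preallocated array.
    // A single read() call may return fewer bytes than requested, so loop.
    static byte[] readEntry(InputStream in, int size) throws IOException {
        byte[] data = new byte[size]; // sized once from the known entry length
        int off = 0;
        while (off < size) {
            int n = in.read(data, off, size - off);
            if (n == -1) {
                throw new EOFException("stream ended after " + off + " of " + size + " bytes");
            }
            off += n;
        }
        return data;
    }

    public static void main(String[] args) throws IOException {
        byte[] entry = new byte[5000]; // stand-in for the bytes of one archive entry
        for (int i = 0; i < entry.length; i++) {
            entry[i] = (byte) i;
        }
        InputStream in = new BufferedInputStream(new ByteArrayInputStream(entry));
        byte[] data = readEntry(in, entry.length);
        System.out.println("read " + data.length + " bytes");
    }
}
```

Because the array is allocated at its final size, there is no intermediate scratch buffer and no second copy before parsing.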
Maybe it doesn't matter much, which is why I'll next be doing some testing
of metrics-lib performance. I'll post some results once I've had a chance to
run the tests on metrics-lib. The read is mostly only noticeable for large
archive entries. If you want to try it out, put the partial 2015-07.tar.xz
archives in the same folder as the benchmark, and make sure you have
metrics-lib plus its dependencies (libcommon-codec, libcommon-compress),
then compile/run, making sure to set your classpath:
javac -cp "/usr/share/java/*:.:descriptor.jar" Benchmark16424.java
java -cp "/usr/share/java/*:.:descriptor.jar" Benchmark16424
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/16424#comment:3>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
_______________________________________________
tor-bugs mailing list
tor-bugs@xxxxxxxxxxxxxxxxxxxx
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs