
[tor-bugs] #25329 [Metrics/Library]: Enable metrics-lib to process large (> 2G) logfiles

From: "Tor Bug Tracker & Wiki" <blackhole@xxxxxxxxxxxxxx>
Date: Wed, 21 Feb 2018 18:45:58 -0000

#25329: Enable metrics-lib to process large (> 2G) logfiles
---------------------------------+--------------------------
     Reporter:  iwakeh           |      Owner:  metrics-team
         Type:  enhancement      |     Status:  new
     Priority:  Medium           |  Milestone:
    Component:  Metrics/Library  |    Version:
     Severity:  Normal           |   Keywords:
Actual Points:                   |  Parent ID:  #25317
       Points:                   |   Reviewer:
      Sponsor:                   |
---------------------------------+--------------------------
 Metrics-lib receives compressed logs, usually with compressed sizes below
 600kB.  As logs of that size can be dealt with in memory, this ticket is
 about handling the logs that decompress to large files (approx. 2G).

 Commons Compress doesn't provide methods for determining the decompressed
 content size for xz (as the command line tool xz does).  Other compression
 types that metrics-lib supports offer this option, but using it would also
 require more changes.

 Compression can be very effective, so any cut-off based on the compressed
 size is fairly arbitrary.  An example for xz compression: a log that
 decompresses to 3G has a compressed input array of length 589492; with
 extreme compression it even shrinks to a length of 405480.  On the other
 hand, a file that decompresses to 64M can have an input array of length
 509212.
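
 The point that compressed size says little about decompressed size can be
 illustrated with a minimal sketch.  It assumes gzip from java.util.zip as a
 stand-in for xz, only so that the example runs without extra libraries; the
 class and method names are hypothetical and not part of metrics-lib.  The
 decompressed size is found by streaming through a fixed-size buffer, so the
 content is never held in memory as a whole.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper for illustration; not part of metrics-lib. */
final class DecompressedSizeSketch {

  /** Counts the decompressed bytes by streaming through a small buffer. */
  static long decompressedSize(byte[] compressed) throws IOException {
    long total = 0L;
    byte[] buffer = new byte[8192];
    try (InputStream in =
        new GZIPInputStream(new ByteArrayInputStream(compressed))) {
      int read;
      while ((read = in.read(buffer)) != -1) {
        total += read;
      }
    }
    return total;
  }

  /** Gzip-compresses the given bytes in memory (demonstration input only). */
  static byte[] gzip(byte[] plain) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
      gz.write(plain);
    }
    return out.toByteArray();
  }

  public static void main(String[] args) throws IOException {
    // Highly repetitive input: 1 MiB of zero bytes compresses to very little.
    byte[] compressed = gzip(new byte[1 << 20]);
    System.out.println("compressed length:   " + compressed.length);
    System.out.println("decompressed length: " + decompressedSize(compressed));
  }
}
```

 Counting this way costs a full decompression pass, which is one reason a
 size-based switch between code paths is unattractive.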

 Handling larger log files with metrics-lib will require some interface
 changes.  Here is a suggestion:

 {{{

  public interface LogDescriptor extends Descriptor {

    /**
 -   * Returns the decompressed raw descriptor bytes of the log.
 +   * Returns the compressed raw descriptor bytes of the log.
 +   *
 +   * <p>For access to the log's decompressed bytes
 +   * use method {@code decompressedByteStream}.</p>
 +   *
     * @since 2.2.0
     */

    public byte[] getRawDescriptorBytes();

    /**
 +   * Returns the decompressed raw descriptor bytes of the log as stream.
 +   *
 +   * @since 2.2.0
 +   */
 +  public InputStream decompressedByteStream();
 +

 }}}


 I think this might be easiest to understand and use; and of course the
 implementation wouldn't need to process large and 'normal' logs
 differently.  It also avoids having to decide how to determine whether a
 file is large or not.
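
 A rough sketch of how a caller might use the suggested stream method,
 reading the log line by line instead of materializing the decompressed
 bytes.  Everything here is an assumption for illustration: the nested
 interface only mirrors the two methods from the suggestion above, the
 gzip-backed class is a hypothetical stand-in (gzip instead of xz, to stay
 within the JDK), and none of it is the actual metrics-lib implementation.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class StreamingLogSketch {

  /** Stand-in mirroring the two methods from the suggestion. */
  interface LogDescriptorSketch {
    byte[] getRawDescriptorBytes();        // compressed bytes
    InputStream decompressedByteStream();  // decompressed content as stream
  }

  /** Hypothetical implementation backed by gzip-compressed bytes. */
  static final class GzipLog implements LogDescriptorSketch {
    private final byte[] compressed;

    GzipLog(byte[] compressed) {
      this.compressed = compressed.clone();
    }

    @Override
    public byte[] getRawDescriptorBytes() {
      return compressed.clone();
    }

    @Override
    public InputStream decompressedByteStream() {
      try {
        // Decompression happens lazily as the caller reads from the stream.
        return new GZIPInputStream(new ByteArrayInputStream(compressed));
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
    }
  }

  /** Counts log lines while holding only one line in memory at a time. */
  static long countLines(LogDescriptorSketch log) throws IOException {
    long lines = 0L;
    try (BufferedReader reader = new BufferedReader(new InputStreamReader(
        log.decompressedByteStream(), StandardCharsets.UTF_8))) {
      while (reader.readLine() != null) {
        lines++;
      }
    }
    return lines;
  }

  /** Gzip-compresses a string in memory (demonstration input only). */
  static byte[] gzip(String text) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
      gz.write(text.getBytes(StandardCharsets.UTF_8));
    }
    return out.toByteArray();
  }

  public static void main(String[] args) throws IOException {
    GzipLog log = new GzipLog(gzip("line one\nline two\nline three\n"));
    System.out.println("lines: " + countLines(log));
  }
}
```

 The same caller code works regardless of how large the decompressed log
 is, which matches the goal of not treating large and 'normal' logs
 differently.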

 Thoughts?

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/25329>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

