Re: [tor-bugs] #25161 [Metrics/CollecTor]: Fix another memory problem with the webstats bulk import
#25161: Fix another memory problem with the webstats bulk import
-------------------------------+--------------------------
 Reporter:  karsten            |          Owner:  iwakeh
     Type:  defect             |         Status:  assigned
 Priority:  Medium             |      Milestone:
Component:  Metrics/CollecTor  |        Version:
 Severity:  Normal             |     Resolution:
 Keywords:                     |  Actual Points:
Parent ID:                     |         Points:
 Reviewer:                     |        Sponsor:
-------------------------------+--------------------------
Comment (by karsten):
Looking at the stack trace and the input log files, I noticed that two log
files are larger than 2G when decompressed:
{{{
3.2G in/webstats/archeotrichon.torproject.org/dist.torproject.org-access.log-20160531
584K in/webstats/archeotrichon.torproject.org/dist.torproject.org-access.log-20160531.xz
2.1G in/webstats/archeotrichon.torproject.org/dist.torproject.org-access.log-20160601
404K in/webstats/archeotrichon.torproject.org/dist.torproject.org-access.log-20160601.xz
}}}
I just ran another bulk import with only those two files as input and ran
into the same exception.
It seems like we shouldn't attempt to decompress these files into a
`byte[]` in `FileType.decompress`, because Java arrays can hold at most
2^31 - 1 elements (roughly 2 billion):
https://en.wikipedia.org/wiki/Criticism_of_Java#Large_arrays . Maybe we
should work with streams there instead of `byte[]`.
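For illustration, here's a minimal sketch of what a stream-based approach
could look like. It assumes Apache Commons Compress is on the classpath
for XZ support; the `decompressedStream` method and class name are
hypothetical, not CollecTor's actual API:
{{{
import java.io.BufferedInputStream;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

import org.apache.commons.compress.compressors.xz.XZCompressorInputStream;

public class StreamingDecompressExample {

  /* Returns a decompressing stream over the given .xz file without ever
   * materializing the decompressed content in memory (hypothetical
   * replacement for a decompress() method returning byte[]). */
  static InputStream decompressedStream(Path xzFile) throws IOException {
    return new XZCompressorInputStream(
        new BufferedInputStream(Files.newInputStream(xzFile)));
  }

  public static void main(String[] args) throws IOException {
    Path logFile = Paths.get(args[0]);
    long lines = 0;
    /* Process the log line by line; memory use stays roughly constant
     * even for files whose decompressed size exceeds 2 GiB. */
    try (BufferedReader reader = new BufferedReader(new InputStreamReader(
        decompressedStream(logFile), StandardCharsets.UTF_8))) {
      while (reader.readLine() != null) {
        lines++;
      }
    }
    System.out.println(lines + " lines");
  }
}
}}}
With something like this, memory use no longer depends on the decompressed
size, whereas a `byte[]` necessarily fails once the decompressed content
exceeds the array size limit.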
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/25161#comment:12>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
_______________________________________________
tor-bugs mailing list
tor-bugs@xxxxxxxxxxxxxxxxxxxx
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs