Re: [tor-bugs] #29697 [Internal Services]: archive.tpo is soon running out of space
#29697: archive.tpo is soon running out of space
-------------------------------+--------------------------
Reporter: boklm | Owner: anarcat
Type: defect | Status: assigned
Priority: Medium | Milestone:
Component: Internal Services | Version:
Severity: Normal | Resolution:
Keywords: | Actual Points:
Parent ID: | Points:
Reviewer: | Sponsor:
-------------------------------+--------------------------
Comment (by dcf):
Replying to [comment:5 anarcat]:
> If git-annex is too complicated, we can talk to IA directly. I would
> recommend, however, against using their web-based upload interface which,
> even they acknowledge, is terrible and barely useable. I packaged the
> [https://tracker.debian.org/pkg/python-internetarchive internetarchive]
> python client in Debian to work around that problem and it works much
> better.
>
> Moving files to IA only shifts the problem, in my opinion: then we have
> only a single copy, elsewhere and while we don't need to manage that space
> anymore, we also don't manage backups and will never know if they drop
> stuff on us (and they do, sometimes, either deliberately or by mistake). I
> would propose that if stuff moves out of our "backed-up" infrastructure,
> it should be stored in at least two administratively distinct locations.
Recently I had the idea to archive some early flash proxy/pyobfsproxy
browser bundles from circa 2013--some of them were only ever present under
!https://people.torproject.org/~dcf/ and so what I have locally is a
superset of what's at archive.torproject.org (for this specific group of
packages). The problem I'm encountering with IA is the automatic malware
scan--as soon as I upload a self-extracting Windows .exe package, the
virus scan returns positive and automatically darks (hides) the entire
item. Here are some attempted uploads that got darked:
* https://archive.org/details/tor-flashproxy-browser-2.4.6-alpha-1
* https://archive.org/details/tor-flashproxy-browser-2.4.6-alpha-2
* https://archive.org/details/tor-flashproxy-pyobfsproxy-browser-2.4.7-alpha-1
Here's a
[https://www.virustotal.com/gui/file/358c6b2b96ad4d8137a835b987a6a0bc7ab85f5b8a010863e101a7f2a40c74f4/detection
sample report] from the upload log. Notice some of the matches say
"Not-a-virus" and are simply reporting the presence of tor, but it's
enough to fail the IA check.
* Kaspersky: Not-a-virus:NetTool.Win32.Tor.k
* Qihoo-360: Win32/Virus.NetTool.c06
* Microsoft: PUA:Win32/Presenoker
* ZoneAlarm by Check Point: Not-a-virus:NetTool.Win32.Tor.k
It seems that I can avoid the virus check by ordering the uploads: first
upload all files except the .exe, let them be virus scanned, then upload
the .exe. The upload log says "item already had a curatenote indicating it
had been checked, no need to update" and the item remains undarked. But
this is no solution; besides being an apparent bug in the malware scanning
system, it'll only work until the next time someone runs a batch scan or
something, and then the items will disappear. For the sake of example,
here are items I managed to upload in that way:
* https://archive.org/details/tor-flashproxy-pyobfsproxy-browser-2.4.7-test-1
* https://archive.org/details/tor-pluggable-transports-browser-2.4.11-alpha-1
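For illustration only, here is a minimal Python sketch of the upload ordering described above. It only decides which files go in which batch; the `upload` and `wait_for_scan` callables are hypothetical placeholders supplied by the caller, not part of any IA API, and nothing here changes the fact that the workaround is fragile.

```python
from pathlib import PurePath
from typing import Callable, Iterable, List, Sequence, Tuple


def split_upload_batches(
    paths: Iterable[str],
    late_suffixes: Sequence[str] = (".exe",),
) -> Tuple[List[str], List[str]]:
    """Split paths into (upload first, upload last) by file suffix.

    Files whose suffix (case-insensitive) is in late_suffixes go into
    the second batch; everything else goes into the first.
    """
    first: List[str] = []
    last: List[str] = []
    for path in paths:
        if PurePath(path).suffix.lower() in late_suffixes:
            last.append(path)
        else:
            first.append(path)
    return first, last


def staged_upload(
    identifier: str,
    paths: Iterable[str],
    upload: Callable[[str, List[str]], None],
    wait_for_scan: Callable[[str], None],
) -> None:
    """Upload non-.exe files, wait for the scan, then upload the .exe files.

    upload and wait_for_scan are hypothetical hooks (assumptions of
    this sketch): the first sends a batch of files to the item, the
    second blocks until the item's virus scan has finished, for
    example by having a person check the upload log.
    """
    first, last = split_upload_batches(paths)
    if first:
        upload(identifier, first)
    wait_for_scan(identifier)
    if last:
        upload(identifier, last)
```

With the internetarchive Python client mentioned earlier in the thread, `upload` could plausibly wrap that library's upload call, but I have not checked how the scan's completion could be detected programmatically, so `wait_for_scan` is left to the caller.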
TL;DR: archiving at IA will probably require talking to someone there and
getting them to make a special collection for us with
[https://archive.org/services/docs/api/metadata-schema/index.html#viruscheck viruscheck]
disabled.
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/29697#comment:12>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online