Re: [tor-bugs] #7828 [Stem]: Run descriptor parser over all prior descriptors
#7828: Run descriptor parser over all prior descriptors
-------------------------+--------------------------------------------------
Reporter: atagar | Owner: karsten
Type: task | Status: accepted
Priority: normal | Milestone:
Component: Stem | Version:
Keywords: descriptors | Parent:
Points: | Actualpoints:
-------------------------+--------------------------------------------------
Comment(by karsten):
There's a problem, but I can't track it down right now:
{{{
karsten@serra:~/tasks/task-7828/stem$ ./parse.py
ParsingFailure!
Exception in thread Descriptor Reader:
Traceback (most recent call last):
  File "/usr/lib/python2.6/threading.py", line 532, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.6/threading.py", line 484, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/home/karsten/tasks/task-7828/stem/stem/descriptor/reader.py", line 434, in _read_descriptor_files
    self._handle_walker(walker, new_processed_files)
  File "/home/karsten/tasks/task-7828/stem/stem/descriptor/reader.py", line 462, in _handle_walker
    self._handle_file(os.path.join(root, filename), new_processed_files)
  File "/home/karsten/tasks/task-7828/stem/stem/descriptor/reader.py", line 515, in _handle_file
    self._handle_archive(target)
  File "/home/karsten/tasks/task-7828/stem/stem/descriptor/reader.py", line 571, in _handle_archive
    self._notify_skip_listeners(target, ParsingFailure(exc))
  File "/home/karsten/tasks/task-7828/stem/stem/descriptor/reader.py", line 586, in _notify_skip_listeners
    listener(path, exception)
  File "./parse.py", line 22, in <lambda>
    lambda path, exc: LOGGER.warning(" skipped %s due to '%s' (type: %s)" % (path, exc, type(exc), ))
UnicodeEncodeError: 'ascii' codec can't encode characters in position 34-35: ordinal not in range(128)
^C^Z
[3]+ Stopped ./parse.py
karsten@serra:~/tasks/task-7828/stem$
karsten@serra:~/tasks/task-7828/stem$ git diff
diff --git a/stem/descriptor/__init__.py b/stem/descriptor/__init__.py
index 25b180b..395cbe6 100644
--- a/stem/descriptor/__init__.py
+++ b/stem/descriptor/__init__.py
@@ -331,11 +331,14 @@ class _UnicodeReader(object):
def readline(self):
return stem.util.str_tools.to_unicode(self.wrapped_file.readline())
- def readlines(self, sizehint = 0):
+ def readlines(self, sizehint = None):
# being careful to do in-place conversion so we don't accidently double our
# memory usage
- results = self.wrapped_file.readlines(sizehint)
+ if sizehint is not None:
+ results = self.wrapped_file.readlines(sizehint)
+ else:
+ results = self.wrapped_file.readlines()
for i in xrange(len(results)):
results[i] = stem.util.str_tools.to_unicode(results[i])
diff --git a/stem/descriptor/reader.py b/stem/descriptor/reader.py
index 0125a49..55ef886 100644
--- a/stem/descriptor/reader.py
+++ b/stem/descriptor/reader.py
@@ -126,8 +126,8 @@ class ParsingFailure(FileSkipped):
def __init__(self, parsing_exception):
super(ParsingFailure, self).__init__(parsing_exception)
self.exception = parsing_exception
- print "ParsingFailure: %s" % (parsing_exception, )
-
+ print "ParsingFailure!"
+ #print "ParsingFailure: %s" % (parsing_exception.encode('ascii', 'ignore'), )
class UnrecognizedType(FileSkipped):
"""
karsten@serra:~/tasks/task-7828/stem$ git log | head
commit 3fd28f26a86e6e071906d77c5bc8d6f6c6fb52aa
Merge: 8615af1 be9a532
Author: Karsten Loesing <karsten@xxxxxxxxxxxxxxxxxxxx>
Date: Tue Feb 26 11:58:50 2013 +0000
Merge branch 'master' of https://git.torproject.org/stem
commit be9a5323a37ea0f1b7d497d7fc33e101453eb2cf
Author: Karsten Loesing <karsten.loesing@xxxxxxx>
Date: Wed Feb 20 12:26:29 2013 +0100
karsten@serra:~/tasks/task-7828/stem$ ls data/
extra-infos-2007-08.tar extra-infos-2010-09.tar server-descriptors-2006-11.tar server-descriptors-2009-12.tar
extra-infos-2007-09.tar extra-infos-2010-10.tar server-descriptors-2006-12.tar server-descriptors-2010-01.tar
extra-infos-2007-10.tar extra-infos-2010-11.tar server-descriptors-2007-01.tar server-descriptors-2010-02.tar
extra-infos-2007-11.tar extra-infos-2010-12.tar server-descriptors-2007-02.tar server-descriptors-2010-03.tar
extra-infos-2007-12.tar extra-infos-2011-01.tar server-descriptors-2007-03.tar server-descriptors-2010-04.tar
extra-infos-2008-01.tar extra-infos-2011-02.tar server-descriptors-2007-04.tar server-descriptors-2010-05.tar
extra-infos-2008-02.tar extra-infos-2011-03.tar server-descriptors-2007-05.tar server-descriptors-2010-06.tar
extra-infos-2008-03.tar extra-infos-2011-04.tar server-descriptors-2007-06.tar server-descriptors-2010-07.tar
extra-infos-2008-04.tar extra-infos-2011-05.tar server-descriptors-2007-07.tar server-descriptors-2010-08.tar
extra-infos-2008-05.tar extra-infos-2011-06.tar server-descriptors-2007-08.tar server-descriptors-2010-09.tar
extra-infos-2008-06.tar extra-infos-2011-07.tar server-descriptors-2007-09.tar server-descriptors-2010-10.tar
extra-infos-2008-07.tar extra-infos-2011-08.tar server-descriptors-2007-10.tar server-descriptors-2010-11.tar
extra-infos-2008-08.tar extra-infos-2011-09.tar server-descriptors-2007-11.tar server-descriptors-2010-12.tar
extra-infos-2008-09.tar extra-infos-2011-10.tar server-descriptors-2007-12.tar server-descriptors-2011-01.tar
extra-infos-2008-10.tar extra-infos-2011-11.tar server-descriptors-2008-01.tar server-descriptors-2011-02.tar
extra-infos-2008-11.tar extra-infos-2011-12.tar server-descriptors-2008-02.tar server-descriptors-2011-03.tar
extra-infos-2008-12.tar extra-infos-2012-01.tar server-descriptors-2008-03.tar server-descriptors-2011-04.tar
extra-infos-2009-01.tar extra-infos-2012-02.tar server-descriptors-2008-04.tar server-descriptors-2011-05.tar
extra-infos-2009-02.tar extra-infos-2012-03.tar server-descriptors-2008-05.tar server-descriptors-2011-06.tar
extra-infos-2009-03.tar extra-infos-2012-04.tar server-descriptors-2008-06.tar server-descriptors-2011-07.tar
extra-infos-2009-04.tar extra-infos-2012-05.tar server-descriptors-2008-07.tar server-descriptors-2011-08.tar
extra-infos-2009-05.tar extra-infos-2012-06.tar server-descriptors-2008-08.tar server-descriptors-2011-09.tar
extra-infos-2009-06.tar extra-infos-2012-07.tar server-descriptors-2008-09.tar server-descriptors-2011-10.tar
extra-infos-2009-07.tar extra-infos-2012-08.tar server-descriptors-2008-10.tar server-descriptors-2011-11.tar
extra-infos-2009-08.tar extra-infos-2012-09.tar server-descriptors-2008-11.tar server-descriptors-2011-12.tar
extra-infos-2009-09.tar extra-infos-2012-10.tar server-descriptors-2008-12.tar server-descriptors-2012-01.tar
extra-infos-2009-10.tar extra-infos-2012-11.tar server-descriptors-2009-01.tar server-descriptors-2012-02.tar
extra-infos-2009-11.tar server-descriptors-2005-12.tar server-descriptors-2009-02.tar server-descriptors-2012-03.tar
extra-infos-2009-12.tar server-descriptors-2006-02.tar server-descriptors-2009-03.tar server-descriptors-2012-04.tar
extra-infos-2010-01.tar server-descriptors-2006-03.tar server-descriptors-2009-04.tar server-descriptors-2012-05.tar
extra-infos-2010-02.tar server-descriptors-2006-04.tar server-descriptors-2009-05.tar server-descriptors-2012-06.tar
extra-infos-2010-03.tar server-descriptors-2006-05.tar server-descriptors-2009-06.tar server-descriptors-2012-07.tar
extra-infos-2010-04.tar server-descriptors-2006-06.tar server-descriptors-2009-07.tar server-descriptors-2012-08.tar
extra-infos-2010-05.tar server-descriptors-2006-07.tar server-descriptors-2009-08.tar server-descriptors-2012-09.tar
extra-infos-2010-06.tar server-descriptors-2006-08.tar server-descriptors-2009-09.tar server-descriptors-2012-10.tar
extra-infos-2010-07.tar server-descriptors-2006-09.tar server-descriptors-2009-10.tar server-descriptors-2012-11.tar
extra-infos-2010-08.tar server-descriptors-2006-10.tar server-descriptors-2009-11.tar
}}}
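One possible reading of the traceback, not yet verified against the data: the last frame is the skip listener in parse.py, so the UnicodeEncodeError may come from formatting the exception into an ASCII byte string for the log message rather than from the parser itself. A minimal sketch of a listener that escapes non-ASCII characters before logging follows. The names ascii_safe and log_skip are hypothetical, and LOGGER here is only a stand-in for the one in parse.py.

```python
import logging

LOGGER = logging.getLogger("parse")  # stand-in for the LOGGER used in parse.py


def ascii_safe(value):
    """Render value as ASCII-only text, escaping any non-ASCII characters."""
    try:
        text = unicode(value)  # Python 2: avoid the implicit ASCII encode of str()
    except NameError:
        text = str(value)  # Python 3: str is already a unicode string
    return text.encode("ascii", "backslashreplace").decode("ascii")


def log_skip(path, exc):
    # Hypothetical replacement for the lambda skip listener in parse.py.
    # Only the exception text is escaped; the path is passed through as-is.
    LOGGER.warning(" skipped %s due to '%s' (type: %s)",
                   path, ascii_safe(exc), type(exc).__name__)
```

With something like this in place the offending message should show up in the log in escaped form, which would at least reveal which descriptor text triggers the failure.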
Want to get an account on serra and try parsing descriptors yourself? I
might not be able to look into this in the next week or two, or I'll run
into trouble with deliverables. :/
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/7828#comment:16>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online