Re: [tor-bugs] #8050 [Stem]: Stem's DescriptorReader should provide an option to provide statuses vs. status entries
#8050: Stem's DescriptorReader should provide an option to provide statuses vs.
status entries
----------------------------+-----------------------------------------------
    Reporter:  karsten      |       Owner:  atagar
        Type:  enhancement  |      Status:  closed
    Priority:  normal       |   Milestone:
   Component:  Stem         |     Version:
  Resolution:  implemented  |    Keywords:
      Parent:               |      Points:
Actualpoints:               |
----------------------------+-----------------------------------------------
Changes (by atagar):
* status: new => closed
* resolution: => implemented
Comment:
Hi Karsten. I just pushed something that should make everyone happy...
https://gitweb.torproject.org/stem.git/commitdiff/ea0b73a5aa221fadafc2ba718a0ef42e151e5ad6
The DescriptorReader and parse_file() now have a 'document_handler'
argument that has three options:
* give me router status entries
* give me a document with the router status entries
* give me a document *without* reading the router status entries
https://stem.torproject.org/api/descriptor/descriptor.html#stem.descriptor.__init__.DocumentHandler
To use this simply provide one of the enum values. For instance...
{{{
from stem.descriptor import parse_file, DocumentHandler

with open('/path/to/my/cached-consensus') as document_file:
  # parse_file() returns an iterator; with DocumentHandler.DOCUMENT it holds
  # a single document, so next() fetches it.
  document = next(parse_file(
    document_file,
    "network-status-consensus-3 1.0",
    document_handler = DocumentHandler.DOCUMENT,
  ))

  print("document version %i, had %i routers" % (document.version, len(document.routers)))
}}}
The 'next()' call is because parse_file() gives you an iterator, in this
case containing a single value that's a NetworkStatusDocumentV3 instance.
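
To picture the three modes and why 'next()' is needed, here is a toy model. It is only a sketch and not stem's implementation: Handler, ToyDocument and toy_parse() are hypothetical stand-ins, and the router names are made up.

```python
from enum import Enum


class Handler(Enum):
    # Hypothetical stand-in for the three document_handler options.
    ENTRIES = 'entries'
    DOCUMENT = 'document'
    BARE_DOCUMENT = 'bare document'


class ToyDocument(object):
    # Hypothetical stand-in for a parsed network status document.
    def __init__(self, version, routers):
        self.version = version
        self.routers = routers


def toy_parse(version, router_names, handler=Handler.ENTRIES):
    """Generator, so callers get an iterator back whichever mode they pick."""
    if handler == Handler.ENTRIES:
        # One value per router status entry.
        for name in router_names:
            yield name
    elif handler == Handler.DOCUMENT:
        # A single value: the document, with its entries attached.
        yield ToyDocument(version, list(router_names))
    elif handler == Handler.BARE_DOCUMENT:
        # A single value: the document, without reading the entries.
        yield ToyDocument(version, [])
    else:
        raise ValueError('unrecognized handler: %r' % (handler,))


names = ['routerA', 'routerB', 'routerC']  # made-up names
entries = list(toy_parse(3, names))
document = next(toy_parse(3, names, Handler.DOCUMENT))
bare = next(toy_parse(3, names, Handler.BARE_DOCUMENT))

print("%i entries, document with %i routers, bare document with %i routers"
      % (len(entries), len(document.routers), len(bare.routers)))
```

In the two document modes the iterator holds exactly one value, which is why a single 'next()' is enough.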
Feel free to reopen if this isn't what you wanted.
> The alternative, to iterate over status entries and look at every
> referenced status document to see if I saw that before or not, seems
> complicated.
Not really. Every entry from a given document held a reference to the same
document object, so you could have simply kept a set...
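{{{
seen_documents = set()

for entry in my_descriptor_reader:
  # Entries from the same document share one document object, so each
  # document only gets added (and handled) once.
  if entry.document not in seen_documents:
    seen_documents.add(entry.document)
    # ... do stuff...
}}}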
> It probably doesn't even work for bandwidth weights which are parsed
> after the status entries.
As mentioned in our email exchange, this is wrong. The parser reads the
header and footer, *then* the router status entries in the middle.
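
Here is a rough sketch of that reading order on a toy consensus. This is not stem's actual code: read_until() and toy_parse() are hypothetical helpers, and the sample text (including the bandwidth weight values) is made up. The point is only that the footer is already in hand before any entry gets parsed.

```python
import io

# Made-up, heavily trimmed stand-in for a consensus.
TOY_CONSENSUS = (
    "network-status-version 3\n"
    "vote-status consensus\n"
    "r routerA\n"
    "s Fast Running\n"
    "r routerB\n"
    "s Running\n"
    "directory-footer\n"
    "bandwidth-weights Wbd=285 Wbe=0\n"
)


def read_until(handle, keywords):
    """Collects lines until one starts with a keyword, leaving that line unread."""
    lines = []

    while True:
        position = handle.tell()
        line = handle.readline()

        if not line:
            break  # end of file

        if line.split(' ')[0].strip() in keywords:
            handle.seek(position)  # rewind so the keyword line is read next
            break

        lines.append(line)

    return lines


def toy_parse(handle):
    # 1. header: everything before the first entry (or the footer)
    header = read_until(handle, ('r', 'directory-footer'))
    entries_start = handle.tell()

    # 2. skip past the entries and read the footer
    read_until(handle, ('directory-footer',))
    footer = handle.readlines()

    # 3. only now go back and parse the entries in the middle
    handle.seek(entries_start)
    entries = []

    for line in read_until(handle, ('directory-footer',)):
        if line.startswith('r '):
            entries.append([line.strip()])  # an 'r' line starts a new entry
        else:
            entries[-1].append(line.strip())

    return header, footer, entries


header, footer, entries = toy_parse(io.StringIO(TOY_CONSENSUS))
print("footer: %s" % footer[-1].strip())
print("%i entries" % len(entries))
```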
Cheers! -Damian
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/8050#comment:1>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online