[Author Prev][Author Next][Thread Prev][Thread Next][Author Index][Thread Index]
Re: [tor-bugs] #8255 [Stem]: Parse exit list entry
#8255: Parse exit list entry
-------------------------+--------------------------------------------------
Reporter: atagar | Owner: atagar
Type: enhancement | Status: needs_revision
Priority: minor | Milestone:
Component: Stem | Version:
Keywords: | Parent: #8252
Points: | Actualpoints:
-------------------------+--------------------------------------------------
Changes (by atagar):
* status: needs_review => needs_revision
Comment:
Hi arlolra, sorry about the delay and thanks for the patch!
This looks fantastic, only a few minor questions and comments...
{{{
+"""
+Parsing for TorDNSEL files.
+
+Defined in https://www.torproject.org/tordnsel/exitlist-spec.txt
+"""
}}}
Thanks for including the link. However, we probably just need it once and
it's already in TorDnsel's pydocs so just only include it there.
{{{
+ _read_until_keywords("ExitNode", tordnsel_file, skip = True)
+ while True:
+ contents = _read_until_keywords("ExitAddress", tordnsel_file)
+ contents += _read_until_keywords("ExitNode", tordnsel_file)
+ if contents:
+ yield TorDnsel(bytes.join(b"", contents), validate, **kwargs)
+ else:
+ break # done parsing file
}}}
This will skip anything prior to the first 'ExitNode'. The TorDNSEL spec
is a tad less formal than other spec types so this is probably appropriate
though in most other cases extra content like this is disallowed.
The spec says entries are...
{{{
(ExitNode | Published | LastStatus | ExitAddress*)*
}}}
Which indicates seems to indicate that the ExitAddress appears zero or
more times. Won't the above have problems if ExitAddress doesn't appear
for an entry?
{{{
+class TorDnsel(Descriptor):
}}}
Mind if we rename this class to TorDNSEL?
{{{
+ :var list exit_addresses: list of (str address, datetime date) tuples
consisting of the found exit address and the time
}}}
This should have an asterisk too since the list will always exist (it has
a default value of being empty), and lets also mention that it's IPv4
addresses.
{{{
+ for keyword, values in entries.items():
+ value, _ = values[0]
}}}
Lets balk with a validation error if we get any block content.
{{{
+ elif keyword == "ExitAddress":
+ for value, _ in values:
+ address, date = value.split(" ", 1)
+ if validate and not
stem.util.connection.is_valid_ipv4_address(address):
+ raise ValueError("ExitAddress isn't a valid IPv4 address: %s"
% address)
+ try:
+ date = datetime.datetime.strptime(date, "%Y-%m-%d %H:%M:%S")
+ except ValueError:
+ if validate:
+ raise ValueError("ExitAddress found time wasn't parsable:
%s" % value)
+ self.exit_addresses.append((address, date))
}}}
This worries me a little bit because if 'validate' is False then the date
we append could be either a datetime or str. Lets continue (skip the
entry) when date can't be parsed and validate is False.
{{{
+ descriptors = list(_parse_file(io.BytesIO(desc_text)))
+ self.assertEqual(3, len(descriptors))
+ self.assertTrue(isinstance(descriptors[0], TorDnsel))
+ self.assertEqual(3, len(descriptors[1].exit_addresses))
}}}
Would you mind some assertions about the parsed values? Also, a test or
two with invalid content would be nice. For instance, we should probably
have a test that parses content without any ExitAddress considering the
issue mentioned earlier.
Cheers! -Damian
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/8255#comment:5>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
_______________________________________________
tor-bugs mailing list
tor-bugs@xxxxxxxxxxxxxxxxxxxx
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs