Re: [tor-bugs] #6471 [Metrics Utilities]: Design file format and Python/Java library for multiple GeoIP or AS databases
#6471: Design file format and Python/Java library for multiple GeoIP or AS
databases
-------------------------------+--------------------------------------------
Reporter: karsten              | Owner:
Type: enhancement              | Status: needs_revision
Priority: normal               | Milestone:
Component: Metrics Utilities   | Version:
Keywords:                      | Parent:
Points:                        | Actualpoints:
-------------------------------+--------------------------------------------
Changes (by atagar):
* status: needs_review => needs_revision
Comment:
> def address_ston(address_string):
>   try:
>     address_struct = socket.inet_pton(socket.AF_INET, address_string)
>   except socket.error:
>     raise ValueError
>   return struct.unpack('!I', address_struct)[0]
I'm sure that you don't want to add a dependency just for this sort of
functionality, but just as an FYI, I have a few IP utilities that you
might find helpful...
https://gitweb.torproject.org/stem.git/blob/HEAD:/stem/util/connection.py
I wrote them to support exit policies, in particular checking if a given
endpoint falls under a particular IP/mask...
https://gitweb.torproject.org/stem.git/blob/HEAD:/stem/exit_policy.py
get_address_binary() can do something similar to what you have here...
{{{
>>> import socket
>>> import struct
>>> address_struct = socket.inet_pton(socket.AF_INET, "127.0.0.1")
>>> struct.unpack('!I', address_struct)[0]
2130706433
>>> from stem.util import connection
>>> address_bin = connection.get_address_binary("127.0.0.1")
>>> print address_bin
01111111000000000000000000000001
>>> int(address_bin, 2)
2130706433
}}}
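If you'd rather stay dependency-free, the IP/mask check mentioned above can be done directly on the integer form. This is only a rough sketch in Python 3 syntax, not how stem implements it; address_to_int() and is_in_network() are hypothetical names.

```python
import socket
import struct


def address_to_int(address_string):
    """Convert a dotted-quad IPv4 string to an unsigned 32-bit integer."""
    try:
        packed = socket.inet_pton(socket.AF_INET, address_string)
    except socket.error:  # alias of OSError on Python 3
        raise ValueError("not an IPv4 address: %r" % (address_string,))
    return struct.unpack('!I', packed)[0]


def is_in_network(address_string, network_string, mask_bits):
    """Hypothetical helper: True if the address falls under network/mask_bits."""
    if not 0 <= mask_bits <= 32:
        raise ValueError("mask_bits must be between 0 and 32")
    # Build a mask with the top mask_bits set, e.g. 8 -> 0xFF000000.
    mask = (0xFFFFFFFF << (32 - mask_bits)) & 0xFFFFFFFF
    address = address_to_int(address_string)
    network = address_to_int(network_string)
    return (address & mask) == (network & mask)
```

For example, is_in_network("192.168.1.42", "192.168.0.0", 16) compares only the top 16 bits of the two addresses.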
> def __str__(self):
>   return "%s,%s,%s,%s,%s" % \
>       (Database.address_ntos(self.start_address),
>        Database.address_ntos(self.end_address),
>        self.code,
>        Database.date_ntos(self.start_date),
>        Database.date_ntos(self.end_date))
Any advantage to this versus just saving the 'line' argument we were
constructed from?
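Roughly what I have in mind, as a sketch only. The Range class here is a hypothetical stand-in for yours, the field order is taken from the format string above, and I'm leaving the fields as plain strings rather than guessing at your conversions.

```python
class Range(object):
    """Hypothetical stand-in for the class under review."""

    def __init__(self, line):
        # Keep the stripped source line so __str__ can return it unchanged.
        self.line = line.strip()
        fields = self.line.split(',')
        if len(fields) != 5:
            raise ValueError("expected 5 comma-separated fields: %r" % (self.line,))
        # Assumption: fields stay as strings; real code would convert them.
        (self.start_address, self.end_address, self.code,
         self.start_date, self.end_date) = fields

    def __str__(self):
        return self.line
```

The trade-off is that the stored line costs a bit of memory per entry and won't reflect attributes that are changed after construction, so this only makes sense if the entries are effectively immutable.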
> def date_ston(date_string):
> def address_ntos(address):
> def date_kton(key):
I haven't a clue what any of these acronyms mean. My understanding is that
shortening function names to some arcane, overly cramped abbreviation is
an artifact of old-time C development where saving every byte of space
mattered. Mind coming up with more descriptive names?
> return int(date_datetime.strftime('%s')) / 86400
It took me a second to figure out where the 86400 came from. You might
want to comment that.
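A named constant would document the 86400 for you. Here's a minimal sketch with a more descriptive (hypothetical) function name. Note that I've swapped strftime('%s') for calendar.timegm(), since '%s' isn't a documented Python format code and depends on the platform's C library; that swap assumes the dates are meant to be UTC.

```python
import calendar
import datetime

# Number of seconds in one day: 24 hours * 60 minutes * 60 seconds.
SECONDS_PER_DAY = 24 * 60 * 60


def datetime_to_days(date_datetime):
    """Return the whole days between the Unix epoch and date_datetime.

    Assumption: a naive datetime is interpreted as UTC.
    """
    seconds_since_epoch = calendar.timegm(date_datetime.utctimetuple())
    # Floor division keeps the result an integer on Python 2 and 3 alike.
    return seconds_since_epoch // SECONDS_PER_DAY
```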
> for line in input_file.readlines():
File objects are themselves iterable over their lines. Calling
readlines() here builds a list of every line up front, so the whole file
ends up in memory twice (once for this list and once for what we add
while processing it).
{{{
>>> with open('/tmp/foo') as input_file:
...   for line in input_file:
...     print line.strip()
...
pepperjack is
very tasty cheese!
}}}
Cheers! -Damian
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/6471#comment:16>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online