Re: [tor-relays] Reimbursement of Exit Operators
On 9/18/13 2:53 AM, Damian Johnson wrote:
>> Unless maybe stem already does exactly this for us?
>
> Yup, stem parses the extrainfo descriptors...
>
> https://stem.torproject.org/api/descriptor/extrainfo_descriptor.html#stem.descriptor.extrainfo_descriptor.ExtraInfoDescriptor
>
> The only pesky bit is that you'll need to download a lot of
> descriptors from metrics (I assume you need the entries published over
> a long period of time?).
Parsing extra-info descriptors is only step one. The byte histories in
successive descriptors from the same relay can overlap quite
substantially, so the same interval may show up in many descriptors.
You'll need a database or an efficient file format to avoid
over-counting. For example, assume you have an extra-info descriptor
with these lines:
extra-info torrelayfishsticks 9FD2E81F27FB2628B3FEABEB2E66854984E48ABB
write-history 2013-09-03 01:35:10 (900 s) [...] 37888,37888,61440,786432
A simple but expensive solution would be to write one line per interval
to a file, each stamped with the end time of its interval:
9FD2E81F27FB2628B3FEABEB2E66854984E48ABB,2013-09-03 01:35:10,w,786432
9FD2E81F27FB2628B3FEABEB2E66854984E48ABB,2013-09-03 01:20:10,w,61440
9FD2E81F27FB2628B3FEABEB2E66854984E48ABB,2013-09-03 01:05:10,w,37888
9FD2E81F27FB2628B3FEABEB2E66854984E48ABB,2013-09-03 00:50:10,w,37888
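For illustration, here is a minimal Python sketch of how one history
line could be expanded into the per-interval lines above. The function
name expand_history and its arguments are made up for this example. It
assumes you have already pulled the end time, interval length, and
value list out of the descriptor (with stem or otherwise), and that the
last value belongs to the interval ending at the given timestamp, as in
the lines above.

```python
from datetime import datetime, timedelta

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"

def expand_history(fingerprint, end, interval_s, values, direction):
    """Return one CSV line per history interval, newest first.

    Hypothetical helper: 'end' is the timestamp from the history line,
    'interval_s' the interval length in seconds, 'values' the byte
    counts in descriptor order (oldest first), and 'direction' is
    'r' or 'w'.
    """
    end_time = datetime.strptime(end, TIME_FORMAT)
    lines = []
    # Walk backwards from the newest value; each step is one interval older.
    for age, value in enumerate(reversed(values)):
        interval_end = end_time - timedelta(seconds=age * interval_s)
        lines.append("%s,%s,%s,%d" % (
            fingerprint, interval_end.strftime(TIME_FORMAT), direction, value))
    return lines

if __name__ == "__main__":
    # Only the last four values of the example line; the rest is elided there.
    for line in expand_history(
            "9FD2E81F27FB2628B3FEABEB2E66854984E48ABB",
            "2013-09-03 01:35:10", 900,
            [37888, 37888, 61440, 786432], "w"):
        print(line)
```

You would call this once per read-history and write-history line and
append the result to the same file.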
Once you have that, you sort that file, throw out duplicate lines, and
sum up values by fingerprint, date, and read/write.
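A rough Python sketch of that sort, de-duplicate, and sum step, assuming
lines in the fingerprint,timestamp,direction,bytes format shown above.
The function name sum_histories is made up for this example, and it
holds everything in memory; for files that don't fit, an external
"sort -u" followed by a streaming sum would be the closer equivalent.

```python
from collections import defaultdict

def sum_histories(lines):
    """Sum bytes per (fingerprint, date, direction), ignoring duplicates.

    Hypothetical helper: each input line looks like
    'FINGERPRINT,YYYY-MM-DD HH:MM:SS,w,12345'.
    """
    totals = defaultdict(int)
    # set() drops lines that are exact duplicates, i.e. the same interval
    # reported by more than one descriptor; sorted() mirrors the sort step.
    for line in sorted(set(line.strip() for line in lines if line.strip())):
        fingerprint, timestamp, direction, value = line.split(",")
        date = timestamp.split(" ")[0]
        totals[(fingerprint, date, direction)] += int(value)
    return dict(totals)

if __name__ == "__main__":
    fp = "9FD2E81F27FB2628B3FEABEB2E66854984E48ABB"
    sample = [
        fp + ",2013-09-03 01:35:10,w,786432",
        fp + ",2013-09-03 01:20:10,w,61440",
        fp + ",2013-09-03 01:20:10,w,61440",  # duplicate from an overlapping descriptor
        fp + ",2013-09-03 01:05:10,w,37888",
        fp + ",2013-09-03 00:50:10,w,37888",
    ]
    for key, total in sorted(sum_histories(sample).items()):
        print(key, total)
```

Note that the two 37888 lines are kept, because they have different
timestamps; only lines that match exactly are treated as duplicates.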
This approach works fine if you need to evaluate byte histories once per
month or so and if it's okay for the job to run for a few hours. If you
want to do this more often, you might want to use a database instead.
See https://gitweb.torproject.org/metrics-tasks.git/tree/HEAD:/task-8462
for a related approach. The file-based approach is much simpler, though.
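To give an idea of what a database variant might look like, here is a
small sketch using Python's built-in sqlite3 module. This is not the
code from the task linked above; the table name bwhist, the column
names, and the helper functions are all assumptions made for this
example. The idea is to let a primary key on (fingerprint, interval
end, direction) reject intervals that were already imported.

```python
import sqlite3

def open_db(path=":memory:"):
    """Open the database and create the (hypothetical) bwhist table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS bwhist ("
        " fingerprint TEXT NOT NULL,"
        " interval_end TEXT NOT NULL,"
        " direction TEXT NOT NULL,"
        " bytes INTEGER NOT NULL,"
        " PRIMARY KEY (fingerprint, interval_end, direction))")
    return conn

def add_interval(conn, fingerprint, interval_end, direction, value):
    """Insert one interval; an already known interval is silently skipped."""
    conn.execute(
        "INSERT OR IGNORE INTO bwhist VALUES (?, ?, ?, ?)",
        (fingerprint, interval_end, direction, value))

def daily_totals(conn):
    """Return (fingerprint, date, direction, total bytes) rows."""
    return conn.execute(
        "SELECT fingerprint, substr(interval_end, 1, 10) AS day,"
        " direction, SUM(bytes)"
        " FROM bwhist GROUP BY fingerprint, day, direction"
        " ORDER BY fingerprint, day, direction").fetchall()

if __name__ == "__main__":
    fp = "9FD2E81F27FB2628B3FEABEB2E66854984E48ABB"
    conn = open_db()
    add_interval(conn, fp, "2013-09-03 01:35:10", "w", 786432)
    add_interval(conn, fp, "2013-09-03 01:20:10", "w", 61440)
    # Same interval again, as it would arrive from an overlapping descriptor.
    add_interval(conn, fp, "2013-09-03 01:20:10", "w", 61440)
    conn.commit()
    print(daily_totals(conn))
```

With this layout you can import new descriptors incrementally and
re-run the sum whenever you like, instead of re-sorting one big file.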
All the best,
Karsten