
Re: [tor-relays] Metrics for assessing EFF's Tor relay challenge?



Hi Karsten,

On 04/05/2014 09:58 AM, Karsten Loesing wrote:
> On second thought, and after sleeping over this, I'm less convinced that we should use an external library for the caching. We should rather start with a simple dict in memory and flush it based on some simple rules. That would allow us to tweak the caching specifically for our use case. And it would mean avoiding a dependency. We can think about moving to onion-py at a later point. That gives you the opportunity to unspaghettize your code, and once that is done we'll have a better idea what caching needs we have for the challenger tool to decide whether to move to onion-py or not. Would you still want to help write the simple caching code for challenger? 
I cleaned up the caching code and added a simple in-memory dict caching provider to onion-py that has no further dependencies. (It has no provisions for eviction/flushing yet, but I will add those next. Right now everything is cached forever, though a new response from Onionoo replaces an old one.)
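For what it's worth, the dict-based approach could look roughly like this. This is a hypothetical sketch: the class name, the max-age rule, and the lazy flushing are my assumptions, not the actual challenger or onion-py code.

```python
import time

class DictCache:
    """Minimal in-memory cache: a plain dict plus a max-age flush rule."""

    def __init__(self, max_age=600, clock=time.time):
        self.max_age = max_age  # seconds before an entry is considered stale
        self.clock = clock      # injectable clock, handy for testing
        self._store = {}        # key -> (timestamp, value)

    def set(self, key, value):
        # A newer response simply replaces the old one.
        self._store[key] = (self.clock(), value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        timestamp, value = entry
        if self.clock() - timestamp > self.max_age:
            del self._store[key]  # flush stale entries lazily on access
            return None
        return value
```

The injectable clock keeps the eviction rule testable without sleeping; the real code would of course just use `time.time`.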

I can write the Onionoo API code and caching code for challenger if I can use Python 3 and the requests library (see below).
Of course I'd really like onion-py to actually have a user, since that would provide the feedback and polish needed to push the library to version 1.0, but I understand if that isn't appropriate for this project.
>>  I don't really understand what the code does. What is meant by
>> "combining" documents? What exactly are we trying to measure? Once I
>> know that and have thought of a sensible way to integrate it into
>> onion-py I'm confident I can in fact write that glue code :)
> Right now, the script sums up all graphs contained in Onionoo's
> bandwidth, clients, uptime, and weights documents.  It also limits the
> range of the new graphs to max(first) to max(last) of given input graphs.
>
> For example, assume we want to know the total bandwidth provided by the
> following 2 relays participating in the relay challenge:
>
> datetime:  0, 1, 2, 3, 4, 5, ...
>
> relay 1:     [5, 4, 5, 6]
> relay 2:  [4, 3, 5, 4]
>
> combined:    [8, 9, 9, 6]
>
> This is not perfect for various reasons, but it's the best I came up
> with yesterday.  Also, as we all know, perfect is the enemy of good.
>
> (If you're curious, reason #1: the graph goes down at the end, and we
> can't say whether it's because relay 2 disappeared or did not report
> data yet; reason #2: we're weighting both relays' B/s equally, though
> relay 1 might have been online 24/7 and relay 2 only long enough that
> Onionoo doesn't put in null; there may be more reasons.)
Ah, I see! :) So for scalar attributes of relays (such as consensus_weight_fraction) it's just a sum, and for histories it's the graphs combined as you just outlined. That makes sense, thank you!
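The combination rule above could be sketched like this. The function name and the (first_index, values) representation of a graph are assumptions for illustration; the real script works on Onionoo history documents.

```python
def combine_histories(histories):
    """Sum per-relay graphs, limited to max(first)..max(last).

    Each history is a (first_index, values) pair, where first_index is
    the abstract datetime of the first value, as in the example above.
    """
    first = max(f for f, _ in histories)
    last = max(f + len(v) - 1 for f, v in histories)
    combined = []
    for t in range(first, last + 1):
        total = 0
        for f, values in histories:
            i = t - f
            if 0 <= i < len(values):
                total += values[i]  # relays with no data point contribute 0
        combined.append(total)
    return first, combined
```

Running this on the example (relay 1 starting at datetime 1, relay 2 at datetime 0) reproduces the combined graph [8, 9, 9, 6], including the drop at the end caused by relay 2's missing last data point.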
> I'm also not sure about Python 3.  Whatever we write needs to run on
> Debian Wheezy with whatever libraries are present there.  If they're all
> Python 3, great.  If not, can't do.

I would strongly prefer to use Python 3. I understand wanting to stick with Debian stable (I use it myself), but Python 3 is six years old, and Python 2 is on its way out and no longer recommended for new projects.
The only mandatory dependency for onion-py, and for me, is requests (I really dislike using urllib* directly; if you want to know why, see https://gist.github.com/kennethreitz/973705). Unfortunately, the python3-requests package in Wheezy is from 2012, and there is no python3-flask package at all. :-(

Is there anything standing in the way of using pip (the python3-pip package) to install requests and flask from PyPI?
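For reference, the requests-based fetching code would be about this simple. The helper names and URL building below are my assumptions; only the Onionoo base URL is real.

```python
import requests  # the one mandatory dependency discussed above

ONIONOO = "https://onionoo.torproject.org"

def onionoo_url(doc_type, **params):
    """Build the URL for an Onionoo document such as 'details' or 'bandwidth'."""
    query = "&".join("%s=%s" % (k, v) for k, v in sorted(params.items()))
    return "%s/%s%s" % (ONIONOO, doc_type, "?" + query if query else "")

def fetch_document(doc_type, **params):
    """Fetch an Onionoo document and return the decoded JSON response."""
    response = requests.get(onionoo_url(doc_type, **params))
    response.raise_for_status()  # surface HTTP errors instead of bad data
    return response.json()
```

Compare that with the urllib equivalent in the gist linked above; this is exactly why I'd rather pull requests in from PyPI than go without it.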
>
> Thanks for your feedback!
>
> All the best,
> Karsten
Cheers,
Luke


_______________________________________________
tor-relays mailing list
tor-relays@xxxxxxxxxxxxxxxxxxxx
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-relays