Re: [tor-dev] Tor Censorship Detector: Can I help?



On 5/7/13 12:44 AM, Sam Burnett wrote:
> Hi,

Hi Sam,

> I'd like to help improve the Tor Censorship Detector. I've read some
> background material and think I understand the basics of George Danezis'
> detection algorithm [1, 2].

Great!  Trivial nitpick: here's a better URL for George's tech report:

https://research.torproject.org/techreports/detector-2011-09-09.pdf

> Is anyone still working on this? Two tickets from a year ago talk about
> experimenting with various detection algorithms and turning one of them
> into a standalone utility [3, 4]. Has anything happened since then?

I don't think that anyone made progress on the detection algorithm or
tool.  We're still running this code:

https://gitweb.torproject.org/metrics-tasks.git/tree/HEAD:/task-2718

What did change, however, is that we'll soon have better input data
available for a new detection algorithm:

https://metrics.torproject.org/csv/userstats.csv

This file contains by-country statistics for directly connecting users
and for bridge users.  Here are the first five lines, to give you an idea:

date,node,country,transport,version,frac,users
2013-05-06,relay,,,,22,798301
2013-05-06,relay,??,,,22,10045
2013-05-06,relay,a1,,,22,692
2013-05-06,relay,a2,,,22,204
2013-05-06,relay,ad,,,22,162

- The date column is, well, the ISO 8601 date.

- The node column contains either 'relay' or 'bridge'.

- The country column contains either the empty string for all countries
combined, an ISO 3166 two-letter lower-case country code, a
MaxMind-specific code such as 'a1' or 'a2', or '??' for unknown.

- You can safely ignore the transport and version columns for the
moment.  These are for pluggable transport users and for users by IP
version.  In the future it may be interesting to see sudden changes in
usage by transport, but so far these values are not stable enough.

- You can also ignore the frac column.  It says what fraction of relays or
bridges we're basing our estimate on, from 0 to 100.  A value of 10
should be sufficient for the censorship detector, because we want it to
warn as early as possible.

- The users column is the estimated number of users.
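
To make the column descriptions above concrete, here is a minimal
sketch of how a detector might load this file into per-country time
series.  It is not the code we run.  The function name
load_user_series, the MIN_FRAC constant, and the choice to skip rows
with a non-empty transport or version are assumptions made for
illustration, as is treating frac and users as integers (which is what
the sample rows show).  The sample data is just the five rows quoted
above; a real run would read the downloaded userstats.csv instead.

```python
import csv
import io
from collections import defaultdict

# The first lines of userstats.csv as quoted above.
SAMPLE = """\
date,node,country,transport,version,frac,users
2013-05-06,relay,,,,22,798301
2013-05-06,relay,??,,,22,10045
2013-05-06,relay,a1,,,22,692
2013-05-06,relay,a2,,,22,204
2013-05-06,relay,ad,,,22,162
"""

# Assumption: use the "value of 10" mentioned above as a lower bound on frac.
MIN_FRAC = 10


def load_user_series(lines, node="relay", min_frac=MIN_FRAC):
    """Return {country: {date: users}} for one node type.

    Rows with a non-empty transport or version column are skipped
    (assumed to be per-transport or per-IP-version breakdowns), as are
    rows whose frac value is below min_frac.
    """
    series = defaultdict(dict)
    for row in csv.DictReader(lines):
        if row["node"] != node:
            continue
        if row["transport"] or row["version"]:
            continue
        if int(row["frac"]) < min_frac:
            continue
        # The empty-string country key holds the all-countries total.
        series[row["country"]][row["date"]] = int(row["users"])
    return dict(series)


series = load_user_series(io.StringIO(SAMPLE))
for country, by_date in sorted(series.items()):
    label = country or "(all)"
    for date, users in sorted(by_date.items()):
        print(date, label, users)
```

Keying the result by country and then by date keeps each country's
series together, which is the shape a per-country anomaly test would
want as input.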

If you want to learn more about how we compute these estimates, here are
the code and the tech report that the code is based on:

https://trac.torproject.org/projects/tor/ticket/8462

https://research.torproject.org/techreports/counting-daily-bridge-users-2012-10-24.pdf

Just keep in mind that this is still work in progress.

> My background: I'm a graduate student at Georgia Tech studying network
> censorship circumvention and measurement. Although I've met Tor
> developers on various occasions, I haven't directly contributed to the
> project; I'd like to change that.

Cool!  Let me know if I can give you any more details or provide any
assistance.  Thanks for working on the censorship detector!

Best,
Karsten


> Thanks!
> 
> Sam
> 
> [1] https://lists.torproject.org/pipermail/tor-dev/2011-September/002923.html
> [2] https://metrics.torproject.org/papers/detector-2011-08-11.pdf
> [3] https://trac.torproject.org/projects/tor/ticket/3718
> [4] https://trac.torproject.org/projects/tor/ticket/4180
> _______________________________________________
> tor-dev mailing list
> tor-dev@xxxxxxxxxxxxxxxxxxxx
> https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
> 
