
Re: [tor-talk] Tor and HTTPS graphic



On Wed, Mar 7, 2012 at 9:54 PM, Mike Perry <mikeperry@xxxxxxxxxxxxxx> wrote:

> You know, in hindsight, I don't want to sound like I'm hating on Steven
> or his work. His work was quite clear along all of the dimensions I am
> talking about, and was excellent research.
>
> He in fact did even compare 500 flows/hour to 50 flows/hour and found
> that the success rate did drastically improve, implicitly acknowledging
> and measuring the relationship between event rate and accuracy.

Yes. Murdoch's work was quite informative, one of the more palatable
dumpster morsels I've happened across.

If you draw a line straight down figure 5(a) of [1] at 10k packets,
you actually can see the effect of the base rate fallacy right there.
As his concurrent flow count increases, the P(M|C) (which he calls
P(correct target)) rate drops rather quickly. I bet if you got the
actual P(C|M) values and adjusted the units appropriately, you'd find
a 1/M^2 in there.
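
A rough sketch of the base-rate effect described above, using Bayes'
rule. This is not Murdoch's method or data. It assumes a uniform prior
P(C) = 1/n_flows (any given candidate pair is equally likely to be the
truly correlated one), and the names posterior_correlated, tpr and fpr,
as well as the rates 0.99 and 0.001, are hypothetical values chosen only
for illustration. It does not try to reproduce the 1/M^2 figure; it only
shows the direction of the effect.

```python
def posterior_correlated(n_flows, tpr, fpr):
    """Return P(C|M): probability a reported match is truly correlated.

    Assumptions (illustrative, not from the paper):
      - uniform prior P(C) = 1 / n_flows
      - tpr = P(M|C), the detector's true positive rate
      - fpr = P(M|not C), the detector's false positive rate
    """
    prior = 1.0 / n_flows
    # Total probability of seeing a match, correlated or not.
    p_match = tpr * prior + fpr * (1.0 - prior)
    return tpr * prior / p_match


# Same detector quality throughout; only the number of flows changes.
for n in (10, 100, 1000, 10000):
    print(n, round(posterior_correlated(n, tpr=0.99, fpr=0.001), 4))
```

Even with the detector held fixed, the printed P(C|M) shrinks as the
number of concurrent flows grows, because the prior shrinks while the
false positive rate does not.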

George Danezis claimed in [2] that the best-match decision process of
modern classifiers eliminates the quadratic 1/M^2 drop-off, but I
don't believe that to be the case. I think that experimentally you'll
find that your best-match classifier performs worse when you throw
more items at it, just as Murdoch did. This effect is also seen in
authorship classification work: the more authors you try to correlate,
the worse your rankings get. In fact, the last time I checked,
state-of-the-art text classification broke down at around just 100
authors, using a best-match classifier.
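
A toy simulation of the best-match point above. It is not Murdoch's or
Danezis's classifier and does not settle the disagreement; it only
illustrates how picking the single highest-scoring candidate can get
less reliable as candidates are added. The model is an assumption: the
true candidate's score is drawn from a normal distribution with mean
`signal`, every decoy's score from a normal distribution with mean 0,
both with unit variance. The function name and all parameter values are
hypothetical.

```python
import random


def best_match_accuracy(n_candidates, signal=2.0, trials=2000, seed=0):
    """Fraction of trials in which the true candidate has the top score.

    Toy model (assumed, not from either reference): the true candidate
    scores N(signal, 1), each of the n_candidates - 1 decoys scores N(0, 1).
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    hits = 0
    for _ in range(trials):
        true_score = rng.gauss(signal, 1.0)
        # Highest decoy score; -inf when there are no decoys at all.
        best_decoy = max(
            (rng.gauss(0.0, 1.0) for _ in range(n_candidates - 1)),
            default=float("-inf"),
        )
        if true_score > best_decoy:
            hits += 1
    return hits / trials


for n in (2, 10, 100):
    print(n, best_match_accuracy(n))
```

In this model the per-candidate score quality never changes; accuracy
falls only because more decoys get a chance to outscore the true match.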


[1]. http://www.cl.cam.ac.uk/~sjm217/papers/pet07ixanalysis.pdf
[2]. https://conspicuouschatter.wordpress.com/2008/09/30/the-base-rate-fallacy-and-the-traffic-analysis-of-tor/