Thus spake Maxim Kammerer (mk@xxxxxx): > On Sat, Aug 25, 2012 at 1:12 AM, Mike Perry <mikeperry@xxxxxxxxxxxxxx> wrote: > > The Raccoon has made a believer out of me, but there are some limits to > > both of his/her proofs.. The full proofs can still be found here: > > http://web.archive.org/web/20100416150300/http://archives.seul.org/or/dev/Sep-2008/msg00016.html > > Wrt. the first proof, it seems to me that the assumed correlation > accuracy rate of 99.9% is incredibly low, and I think that the Raccoon > recognized that by referring to sampling and retention at the end of > his post. With the targeted attack that's similar to âExample 3â in > Raccoon's post that I described in my previous comment here, where one > analyzes all exit traffic without missing packets, I would expect the > correlation accuracy (and as a result, match confidence) to > exponentially approach 100% very quickly with the number of relevant > packets seen, and extremely quickly if the traffic is interactive > (i.e., browsing). > > Actually, c/n of 30% in âExample 3â is close to the 25% that's > discussed in the OP here, so let's redo the example with c/n=25% and > different correlation accuracies (leaving the other numbers intact): > > (using âbc -lâ) > ca = 0.999 > pm = (1/5000) * (0.25)^2 > ca*pm / (pm*ca + (1-pm)*(1-ca)) > > ca = 0.999 > .01233363786760166917 > ca = 0.9999 > .11110246894375430565 > ca = 0.99999 > .55555617284636495961 > ca = 0.999999 > .92592671467910125759 > ca = 0.9999999 > .99206358969515668554 > ca = 0.99999999 > .99920064946444143613 > ca = 0.999999999 > .99992000739924807495 > > So reducing correlation accuracy error to 10^-9 will give you 99.99% > confidence in end-to-end correlation match. I suspect that a few > seconds of interactive traffic will give you a correlation accuracy > that's much better than a 10^-9 error. Well, the argument over correlation accuracy comes down to observation resolution, feature extraction ability, and academic lab conditions versus reality. For an example, let's assume that the adversary cannot see inside of Guard TLS connections. With this assumption: if at any point there's concurrent Guard TLS activity from a single client (either other circuit activity, directory fetch activity, or circuit pre-building activity), then some or all of your fine-grained timing and size information features go out the window. To see the effects of this currently, consider: Is it *really* the case that only one connection in *a billion* experiences incidental concurrent activity that interferes with or obliterates high-resolution feature extraction? I think the actual rate of random (or deliberate) concurrent activity is much higher than that, especially for heavily used tor clients, and even more so if they are serving as bridges or relays. But, against high-resolution adversaries, the really interesting question is: How little real cover traffic is actually needed to obscure timing and size information to the point where the remaining features are insufficient for high rates of correlation success? And over how many observations can such activity be expected to survive for a given base rate of similar activity? I suspect that for relatively short-lived bursts of traffic like web site views and random webapp AJAX activity, we can actually do pretty well with very little effort and overhead. Especially against the one-ended version of the correlation attack: the website fingerprinting attack, but probably against both. But for long-lived or otherwise atypical connections, you're absolutely right. There's just a whole lot of information encoded there.. Almost any level of observation will likely be able to correlate such flows eventually, and it's also hard to imagine generalized padding techniques that could blend these flows with web traffic. Unfortunately, because academia has mostly concluded that this work is uninteresting and that all forms of this problem are generally "impossible", we have no solid answers to these types of questions wrt what can be done in practice. Perhaps it is merely because defense work is less sexy than attack work when it comes to getting publications. I don't know for sure. I haven't yet figured out exactly why CS academia is broken. There's a whole lot of symptoms, though... But anyway, failing real research, there's always the botnets, the drug war, and the aliens to guide us... Can I get three cheers for Big Data? After all, I'm sure we can trust Them to tell us how the science shakes out in the end, amirite? ;). -- Mike Perry
Attachment:
signature.asc
Description: Digital signature
_______________________________________________ tor-talk mailing list tor-talk@xxxxxxxxxxxxxxxxxxxx https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-talk