Re: Tor seems to have a huge security risk--please prove me wrong!

On Sun, Aug 29, 2010 at 12:54:59AM -0700, Mike Perry wrote:
> Thus spake Paul Syverson (syverson@xxxxxxxxxxxxxxxx):
> > > For those who want more background, you can read more at item #1 on
> > > https://www.torproject.org/research.html.en#Ideas
> > > (I hoped to transition
> > > https://www.torproject.org/volunteer.html.en#Research over to that new
> > > page, but haven't gotten around to finishing)
> > 
> > Yes. Exploring defensive techniques would be good. Unlike correlation,
> > fingerprinting seems more likely to be amenable to traffic shaping;
> > although the study of this for countering correlation (as some of us
> > recently published at PETS ;>) may be an OK place to build on.
> > Personally I still think trust is going to play a bigger role as an
> > effective counter than general shaping, but one place we seem to be in
> > sync is that it all needs more study.
> Yeah, though again I want to point out that what we are actually
> looking at when we intuitively believe fingerprinting to be easier to
> solve than correlation is the event rate from the base rate fallacy.
> Otherwise, they really are the same problem. Correlation is merely the
> act of taking a live fingerprint and extracting a number of bits from
> it, and adding these bits to the number of bits obtained from a window
> of time during which the event was supposed to have occurred.
> Or, to put it in terms of event rates, it is merely the case that much
> fewer potentially misclassified events happen during the very small
> window of time provided by correlation, as opposed to the much larger
> number of events that happen during a dragnet fingerprinting attempt.
> Any classifier needs enough bits to differentiate between two
> potentially coincident events. This is also why Tor's fixed packet
> size performs better against known fingerprinting attacks. Because
> we've truncated the lower 8 bits off of all signatures that use size
> as a feature in their fingerprint classifiers. They need to work to
> find other sources of bits.
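
[A minimal sketch of the event-rate argument quoted above. The classifier rates and population sizes are made-up illustrative numbers, and the function name `precision` is hypothetical; none of it comes from a measured attack. It only shows how the same classifier looks very different against a small correlation window and against a large dragnet.]

```python
import math


def precision(true_positive_rate, false_positive_rate, num_candidates, num_targets=1):
    """Fraction of flagged events that are real matches (Bayes' rule on expected counts)."""
    true_hits = true_positive_rate * num_targets
    false_hits = false_positive_rate * (num_candidates - num_targets)
    return true_hits / (true_hits + false_hits)


# Assumed classifier quality: 99% detection, 0.1% false positives.
TPR, FPR = 0.99, 0.001

# Assumed event counts: few coincident events in a short correlation
# window, many events in a dragnet fingerprinting attempt.
window_events = 100
dragnet_events = 1_000_000

window_precision = precision(TPR, FPR, window_events)
dragnet_precision = precision(TPR, FPR, dragnet_events)

# Bits a classifier must extract to single out one event among N.
window_bits = math.log2(window_events)
dragnet_bits = math.log2(dragnet_events)

print(f"window:  precision={window_precision:.3f}, bits needed={window_bits:.1f}")
print(f"dragnet: precision={dragnet_precision:.6f}, bits needed={dragnet_bits:.1f}")
```

[Under these assumed numbers, most flagged events in the small window are real matches, while almost all flagged events in the dragnet are false positives.]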

I disagree. Most of what you say about base rates etc. is valid and
should be taken into account, but that is not the only thing that is
going on. First, you have just stated one reason that correlation
should be easier than fingerprinting but then tried to claim it as
some sort of methodological flaw. Truncating the lower 8 bits does
have a significant impact on fingerprinting but little impact on
correlation because of the windows and datasets, just like you said.
But way more importantly, fingerprinting is inherently a passive
attack. You are sifting through a pile of known fingerprints looking
for matches and that's all you can do as an attacker. But it's easy to
induce any timing signature you want during a correlation attack. (It
seems to be completely unnecessary because of point 1, but it would be
trivial to add that if you wanted to.) Tor's current design has no
mechanism to counter active correlation. Proposed techniques, such as
in the recent paper by Aaron, Joan, and me, are clearly too expensive
and iffy at this stage of research. This is totally different for
fingerprinting. One could have an active attack similar to
fingerprinting in which one tries to alter a fingerprint to make it
more unique and then look for that fingerprint.  I don't want to get
into a terminological quibble, but that is not what I mean by
fingerprinting and would want to call it something else or start
calling fingerprinting 'passive fingerprinting', something like that.
Then there is the whole question of how effective this would be,
plus a lot more details to say what "this" is, but anyway I think
we have good reason to treat fingerprinting and correlation as different
but related problems unless we want to say something trivial like
"They are both just instances of pattern recognition."
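
[A toy sketch of the active/passive distinction drawn above, under strong simplifying assumptions: flows are modeled as per-slot packet counts, there is no network delay or loss, and the attacker can gate the target's traffic on and off in a pattern of their choosing. All names and numbers are hypothetical. It is not the technique from any particular paper, and it does not model Tor itself.]

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
SLOTS = 200             # number of time slots observed
FLOWS = 50              # flows visible at the far end

# The attacker picks an arbitrary on/off pattern to impose at one end.
pattern = [rng.randint(0, 1) for _ in range(SLOTS)]


def background_flow():
    """Unrelated traffic: packet counts independent of the pattern."""
    return [rng.randint(0, 10) for _ in range(SLOTS)]


def marked_flow():
    """Target traffic: passes mostly in 'on' slots, plus a little noise."""
    return [p * rng.randint(5, 10) + rng.randint(0, 2) for p in pattern]


flows = [background_flow() for _ in range(FLOWS)]
target_index = 17
flows[target_index] = marked_flow()

# At the other end, score every flow against the induced pattern.
scores = [pearson(pattern, flow) for flow in flows]
best = max(range(FLOWS), key=scores.__getitem__)

print(f"best match: flow {best}, score {scores[best]:.2f}")
```

[The point of the sketch is only that the attacker chooses the signal here. A passive fingerprinting attacker gets no such choice and has to work with whatever the traffic already looks like.]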

> Personally, I believe that it may be possible to develop fingerprint
> resistance mechanisms good enough to also begin to make inroads
> against correlation, *if* the network is large enough to provide an
> extremely high event rate. Say, the event rate of an Internet-scale
> anonymity network.
> For this reason, I think it is very important for academic research to
> clearly state their event rates, and the entropy of their feature
> extractors and classifiers. As well as source code and full data
> traces, so that their results can be reproduced on larger numbers of
> targets and with larger event rates, as I mentioned in my other reply.
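
[A small sketch of what "entropy of a feature extractor" could mean for a packet-size feature. The sample sizes are invented, `CELL` is an assumed fixed cell size, and the helper names are hypothetical. It estimates the Shannon entropy of the size feature before and after padding every size up to a multiple of the cell size.]

```python
import math
from collections import Counter


def shannon_entropy_bits(values):
    """Empirical Shannon entropy, in bits, of the observed values."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


CELL = 512  # assumed fixed cell size in bytes


def pad_to_cell(size):
    """Round a size up to the next multiple of CELL."""
    return -(-size // CELL) * CELL


# Invented sample of observed packet sizes in bytes.
sizes = [60, 1500, 52, 576, 1500, 40, 1200, 52]

raw_bits = shannon_entropy_bits(sizes)
padded_bits = shannon_entropy_bits([pad_to_cell(s) for s in sizes])

print(f"raw size feature:    {raw_bits:.2f} bits")
print(f"padded size feature: {padded_bits:.2f} bits")
```

[Reporting a number like this alongside the event rate would let a reader see how many bits a given feature contributes, which is the kind of statement being asked for above.]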

We don't have the luxury that chemistry has, or even behavioral fields
like the population biology of some species of fish, of just handing out
full traces. There's this pesky little thing called user privacy that
creates a tension for us that those fields don't have. We could also argue more
about the nature of research and publication criteria, but I suspect
that we will quickly get way off topic in such a discussion, indeed
have already started.
