
Re: Tor seems to have a huge security risk--please prove me wrong!



Thus spake Paul Syverson (syverson@xxxxxxxxxxxxxxxx):

> On Sun, Aug 29, 2010 at 12:54:59AM -0700, Mike Perry wrote:
> > Any classifier needs enough bits to differentiate between two
> > potentially coincident events. This is also why Tor's fixed packet
> > size performs better against known fingerprinting attacks. Because
> > we've truncated the lower 8 bits off of all signatures that use size
> > as a feature in their fingerprint classifiers. They need to work to
> > find other sources of bits.
> 
> I disagree. Most of what you say about base rates etc. is valid and
> should be taken into account, but that is not the only thing that is
> going on. First, you have just stated one reason that correlation
> should be easier than fingerprinting but then tried to claim it as
> some sort of methodological flaw. Truncating the lower 8 bits does
> have a significant impact on fingerprinting but little impact on
> correlation because of the windows and datasets, just like you said.
> But way more importantly, fingerprinting is inherently a passive
> attack. You are sifting through a pile of known fingerprints looking
> for matches and that's all you can do as an attacker. But it's easy to
> induce any timing signature you want during a correlation attack. (It
> seems to be completely unnecessary because of point 1, but it would be
> trivial to add that if you wanted to.) Tor's current design has no
> mechanism to counter active correlation. Proposed techniques, such as
> in the recent paper by Aaron, Joan, and me, are clearly too expensive
> and iffy at this stage of research. This is totally different for
> fingerprinting. One could have an active attack similar to
> fingerprinting in which one tries to alter a fingerprint to make it
> more unique and then looks for that fingerprint.  I don't want to get
> into a terminological quibble, but that is not what I mean by
> fingerprinting and would want to call it something else or start
> calling fingerprinting 'passive fingerprinting', something like that.
> Then there is the whole question of how effective this would be,
> plus a lot more details to say what "this" is, but anyway I think
> we have good reason to treat fingerprinting and correlation as different
> but related problems unless we want to say something trivial like
> "They are both just instances of pattern recognition."

Ah, of course. What I meant to say then was that "passive
fingerprinting" really is the same problem as "passive correlation". 
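To make the cell-size point from the quote concrete: padding traffic up
to fixed-size cells discards the low-order size bits a classifier would
otherwise use as features. A minimal sketch (the page traces below are
made up for illustration; Tor's actual cell handling is more involved
than rounding):

```python
import math

CELL_SIZE = 512  # Tor's fixed cell size; quantization drops the low size bits

def cell_sizes(packet_sizes):
    """Round each raw packet size up to a whole number of fixed cells,
    discarding the low-order bits a size-based classifier relies on."""
    return [math.ceil(s / CELL_SIZE) * CELL_SIZE for s in packet_sizes]

# Two pages with distinct raw size signatures...
page_a = [371, 1460, 1460, 842]
page_b = [460, 1337, 1502, 900]

# ...become indistinguishable by size after cell quantization.
print(cell_sizes(page_a))  # [512, 1536, 1536, 1024]
print(cell_sizes(page_b))  # [512, 1536, 1536, 1024]
```

A classifier restricted to sizes then has to fall back on other feature
sources, such as timing or total cell counts.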

I don't spend a whole lot of time worrying about the "global *active*
adversary", because I don't believe that such an adversary can really
exist in practical terms. However, it is good that your research
considers active adversaries in general, because they can and do exist
on more localized scales.

I do believe the "global external passive adversary" exists, though
(via the AT&T secret rooms that splice cables and copy off traffic in
transit), and I think that the techniques used against "passive
fingerprinting" can be very useful against that adversary. I also
think a balance can be found: defenses against the "global external
passive adversary" that drive its success rate low enough that its
incentive shifts to becoming a "local internal adversary", which has
to actually run Tor nodes to get enough information to perform its
attacks.
 
This is definitely a terminological quibble, but I think it is useful
to consider these different adversary classes and attacks, and how
they relate to one another. I think it is likely that we can easily
defeat most cases of dragnet surveillance with very good
passive fingerprinting defenses, but that various types of active
surveillance may remain beyond our (practical) reach for quite some
time.
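For concreteness, "passive correlation" by such an external observer
can be as simple as matching per-window traffic volumes seen at the two
ends of a path. A toy sketch (the traces and one-second windows are
invented for illustration):

```python
def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

entry_flow = [120, 4800, 300, 5200, 90, 4100]   # bytes per 1s window, entry
exit_flow  = [140, 4700, 280, 5300, 110, 4000]  # same flow seen at the exit
other_flow = [3000, 200, 3500, 150, 2800, 400]  # an unrelated flow

print(pearson(entry_flow, exit_flow))   # close to +1: same flow
print(pearson(entry_flow, other_flow))  # strongly negative: different flow
```

An active attacker does not even need this: by inducing a distinctive
timing signature and watching for it, the match becomes trivial, which
is why active correlation is the harder problem to defend against.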
 
> > Personally, I believe that it may be possible to develop fingerprint
> > resistance mechanisms good enough to also begin to make inroads
> > against correlation, *if* the network is large enough to provide an
> > extremely high event rate. Say, the event rate of an Internet-scale
> > anonymity network.
> > 
> > For this reason, I think it is very important for academic research to
> > clearly state their event rates, and the entropy of their feature
> > extractors and classifiers. As well as source code and full data
> > traces, so that their results can be reproduced on larger numbers of
> > targets and with larger event rates, as I mentioned in my other reply.
> 
> We don't have the luxury of chemistry or even behavioral stuff like
> population biology of some species of fish to just hand out full
> traces. There's this pesky little thing called user privacy that
> creates a tension for us that those fields don't have. We could also argue more
> about the nature of research and publication criteria, but I suspect
> that we will quickly get way off topic in such a discussion, indeed
> have already started.
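The base-rate point raised earlier in this thread is worth making
concrete, since it is what separates closed-world lab results from
dragnet deployment. A quick illustration with Bayes' rule (all the
rates below are invented for the sake of the example):

```python
def posterior(tpr, fpr, base_rate):
    """P(target | classifier fires), by Bayes' rule.

    tpr:       true-positive rate of the fingerprint classifier
    fpr:       false-positive rate
    base_rate: fraction of observed events that are actually the target
    """
    p_fire = tpr * base_rate + fpr * (1 - base_rate)
    return tpr * base_rate / p_fire

# A classifier with 95% TPR and 1% FPR looks strong when targets
# make up half the test set...
print(posterior(0.95, 0.01, 0.5))    # ~0.99

# ...but at a realistic event rate of 1 target visit in 10,000,
# almost every alarm is a false positive.
print(posterior(0.95, 0.01, 1e-4))   # ~0.009
```

This is why reporting event rates alongside classifier accuracy
matters: the same classifier swings from nearly always right to nearly
always wrong as the base rate drops.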

In most cases, we pretty intensely frown on these attacks on the live
Tor network, even for research purposes, so I don't think anyone is
asking for live user traces. However, most of this research is done in
simulation, and the source code for the attack setup and the
simulation traces are rarely, if ever, provided.

As I said before, it would be great if we could develop a common
gold-standard simulator that we could use for all of this research.
The UCSD people may be building something like this, but I also seem
to recall Steven Murdoch being interested in providing some model or
corpus for base-line comparison of all timing attack literature.

I don't think this is too off-topic, because I am saying that this
openness is what we need to be able to study timing attacks and
defenses effectively. I don't think it will be possible to succeed
without it.


-- 
Mike Perry
Mad Computer Scientist
fscked.org evil labs
