
Re: Low-Cost Traffic Analysis of Tor



On Tue, Mar 15, 2005 at 01:21:51PM -0600, Mike Perry wrote:
> Why do some target nodes yield such drastic results? For example,
> were there any particular properties of C, K, and L (and perhaps A)
> in Figure 4 that might explain why the technique worked so much better
> on them?

We are not quite sure what causes some nodes to be more affected by
our attack than others, which is why we didn't say much about this in
the paper. One hypothesis is that nodes near their capacity will be
more badly affected, and our results do support this (the nodes you
mention are believed to be either on DSL connections or were being
DDoSed). However, we didn't survey enough nodes or collect enough
samples for this to be particularly convincing evidence.

> What is your estimate on the minimum amount of time the pseudorandom
> traffic from the corrupt server must run in order for results to be
> picked up by the probe server? 

This is difficult to say. Typically these attacks never give an
absolute yes or no as to whether a particular address is the
originator, but instead give a probability. This will initially be
close to random, but gradually converge on the correct answer, so the
time a user is safe depends on how accurate the attacker needs to be.
I don't think our results can give an answer to that, firstly because
we don't have enough samples and secondly because there may be better
correlation functions which give more accurate results. Actually I am
pretty sure this can be improved - see Figure 3b in the paper. Another
problem with measuring anonymity is that even if one test only shows a
slightly better than random chance that Alice is the person making the
connection, when Alice looks at the website the next day, the attack
can be repeated and the probability of a false positive is reduced.
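
To make the point about repetition concrete, here is a minimal sketch.
It is not taken from the paper: the function names and the per-test
rates are hypothetical, and it assumes the repeated tests are
independent and that every one of them flags the same user.

```python
def false_positive_after_repeats(p_single: float, repeats: int) -> float:
    """Chance that an innocent user is flagged in all `repeats` tests.

    Assumes the tests are independent (an assumption, not a measured fact).
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be a probability")
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    return p_single ** repeats


def posterior_after_repeats(prior: float, true_positive: float,
                            false_positive: float, repeats: int) -> float:
    """Posterior probability that the flagged user is the originator.

    Bayes' rule in odds form: each positive test multiplies the odds by
    the likelihood ratio true_positive / false_positive.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    if false_positive <= 0.0:
        raise ValueError("false_positive must be positive")
    odds = prior / (1.0 - prior)
    odds *= (true_positive / false_positive) ** repeats
    return odds / (1.0 + odds)


if __name__ == "__main__":
    # Hypothetical numbers: a test only slightly better than chance.
    for n in (1, 5, 20):
        print(n, posterior_after_repeats(0.01, 0.6, 0.4, n))
```

Even with a weak single test, the posterior climbs as the attack is
repeated, which is why a per-test accuracy figure alone says little
about how long a user stays safe.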

> What is the nature of the 'echos' that might cause false positives to
> which you refer? It would seem to me that echos would be lulls in
> traffic propagating to other nodes that would normally be receiving
> non-generated relayed data. In this case, would they really cause
> false positives? Or am I confused on the nature of the echo effect?

A possibility I was thinking of is, say A is the victim, and B the
corrupt server, and their stream is going through Tor nodes M1, M2,
M3. Each of M1, M2, M3 will be detected as they carry the primary
signal. Then say there are also C and D, communicating through M1, M4,
M5. Now when B is sending the pulse, M1, M2 and M3 will slow down the
other connections going through them (this is exactly what we detect).
But now the C-D connection will be slowed down, due to M1 being
common, and so M4 and M5 will be less loaded. A third
stream, E-F, going through M4, M6, M7 can now go faster (due to M4
being common), causing M6 and M7 to become more heavily loaded. M6 and
M7 may thus be detected by our algorithm, and this is the nature of
the echos we thought about. 
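
The chain of effects above can be sketched as a toy model. This is only
an illustration under strong assumptions, not our detection algorithm:
load is reduced to a sign (+1 more loaded, -1 less loaded), the sign
flips each time the effect crosses into another stream via a shared
node, and the names propagate_echo and EXAMPLE_CIRCUITS are made up
for this example.

```python
from collections import deque

# Streams from the example above, each mapped to the Tor nodes it uses.
EXAMPLE_CIRCUITS = {
    "A-B": ["M1", "M2", "M3"],
    "C-D": ["M1", "M4", "M5"],
    "E-F": ["M4", "M6", "M7"],
}


def propagate_echo(circuits, primary):
    """Return {node: +1 or -1} for every node reached by the echo.

    Nodes on the primary stream are marked +1 (more loaded). A stream
    sharing a node with an already-marked stream has its remaining
    nodes marked with the opposite sign of the shared node.
    """
    load = {node: 1 for node in circuits[primary]}
    queue = deque(load.items())
    seen_streams = {primary}
    while queue:
        node, sign = queue.popleft()
        for name, path in circuits.items():
            if name in seen_streams or node not in path:
                continue
            seen_streams.add(name)
            for other in path:
                if other not in load:
                    # The effect inverts as it crosses to this stream.
                    load[other] = -sign
                    queue.append((other, -sign))
    return load


if __name__ == "__main__":
    print(propagate_echo(EXAMPLE_CIRCUITS, "A-B"))
```

In this model M6 and M7 end up with the same sign as M1, M2 and M3,
which is the kind of echo that could in principle be mistaken for the
primary signal.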

George says on this:
"One would expect to see echos if the traffic was indeed mixed
together, ie. if different streams were interacting together so that
they each pick up each other's characteristics. If there is perfect
propagation and mixing of the different streams that is good news
since we would not be able to tell them apart. On the other hand we
found that there is enough to be able to remotely monitor nodes, but
not enough to produce false positives."

> Thanks for posting this paper, it was very interesting and
> illuminating.

I'm glad you found it useful.

Thanks,
Steven.
