[Author Prev][Author Next][Thread Prev][Thread Next][Author Index][Thread Index]

[freehaven-cvs] Revise abstract and introduction



Update of /home/freehaven/cvsroot/doc/e2e-traffic
In directory moria.mit.edu:/tmp/cvs-serv10891

Modified Files:
	e2e-traffic.tex 
Log Message:
Revise abstract and introduction

Index: e2e-traffic.tex
===================================================================
RCS file: /home/freehaven/cvsroot/doc/e2e-traffic/e2e-traffic.tex,v
retrieving revision 1.11
retrieving revision 1.12
diff -u -d -r1.11 -r1.12
--- e2e-traffic.tex	21 Jan 2004 01:34:46 -0000	1.11
+++ e2e-traffic.tex	21 Jan 2004 02:09:41 -0000	1.12
@@ -12,7 +12,7 @@
 \newcommand\emailaddr{\begingroup \def\UrlLeft{<}\def\UrlRight{>}\urlstyle{tt}\Url}
 \newcommand\XXXX[1]{{\small\bf [XXXX #1]}}
 
-\newcommand\PMIX{P_{\mbox{MIX}}}
+\newcommand\PMIX{P_{\mbox{\scriptsize MIX}}}
 \newcommand\V[1]{\overrightarrow{#1}}
 \newcommand\B[1]{\overline{#1}}
 
@@ -26,9 +26,7 @@
 
 \begin{document}
 \title{Practical Traffic Analysis: \\
-       Extending statistical disclosure attacks toward real-world mix-nets}
-% XXXX Okay; maybe these mix-nets aren't quite real-world.  Any suggestion
-% XXXX for a better title?
+       Extending and resisting statistical disclosure}
 
 \author{Nick Mathewson and Roger Dingledine}
 \institute{The Free Haven Project\\
@@ -38,44 +36,47 @@
 \centerline{\LARGE\bf *DRAFT* --- not for publication}
 %======================================================================
 \begin{abstract}
-We extend earlier work on passive long-term end-to-end traffic analysis.
-Whereas earlier work addressed cases in which a global passive eavesdropper
-was trying to learn sender-receiver connections at a single batch mix whose
-senders had known behavior patterns, this paper describes how an analogous
-attack can reveal sender-receiver connections, even when the eavesdropper is
-non-global, the sender behavior is unknown, the substrate is a network of
-timed dynamic-pool mixes, and senders use padding messages.  Each weakening
-of earlier assumptions increases the time over which the attacker must
-observe the mix-net, but has little effect on the attacker's storage or
-computational requirements.  Additionally, we describe how such an attacker
-can exploit extra information about message linkability in order to reduce
-the number of rounds required to link senders to recipients.  We present
-simulated results for each of our attacks, focusing on the amount of
-information needed to link senders and recipients.  Finally, we discuss
-some mechanisms for partially resisting these attacks.
+We extend earlier research on mounting and resisting passive
+long-term end-to-end traffic analysis against anonymous message systems.
+%%Whereas earlier work addressed cases in which a global passive eavesdropper
+%%was trying to learn sender-receiver connections at a single batch mix whose
+%%senders had known behavior patterns, this paper describes how an analogous
+%%attack can reveal sender-receiver connections, even when the eavesdropper is
+%%non-global, the sender behavior is unknown, the substrate is a network of
+%%timed dynamic-pool mixes, and senders use padding messages.
+We loosen the assumptions of earlier attacks by describing how an eavesdropper
+can learn sender-receiver connections even when senders are complex, the
+substrate is a network of pool mixes, senders use padding messages, and the
+attacker is non-global.
+%%of earlier assumptions increases the time over which the attacker must
+%%observe the mix-net, but has little effect on the attacker's storage or
+%%computational requirements.
+Additionally, we describe how an attacker can use extra information about
+message linkability in order to reduce the amount of traffic needed to link
+senders to recipients.  Finally, we simulate our attacks for a variety of
+scenarios, focusing on the amount of information needed to link senders and
+recipients.
+%XXXX We should say something about 'we found some good countermeasures.'
 \end{abstract}
 
 %======================================================================
 \section{Introduction}
 \label{sec:intro}
 Since the introduction of mix-nets \cite{chaum-mix} in 1981, many
-attacks against these anonymity systems have been proposed and discussed.
-%\cite{back01,langos02,disad-free-routes,desmedt,mitkuro,raymond00,trickle02}
-While some of these attacks have been partially
+attacks against these anonymity systems have been proposed.
+While some of these attacks have been
 addressed by improved mix-net designs,
-%\cite{abe,babel,flash-mix,kesdogan,pfitzmann85,shuffle,hybrid-mix,mixmaster-spec,minion-design}
-% let's leave the cites until section 2 -RD
-one class of attack that remains effective against current deployable
+one attack that remains effective against current deployable
 mix-net designs is the \emph{long-term intersection attack}. In this
-attack a passive eavesdropper observes a large volume of network traffic
+attack, a passive eavesdropper observes a large volume of network traffic
 and notices over time that certain recipients are more likely to
 receive messages after given senders have transmitted messages.  Although
-these correlations tend to be slight, given enough time an attacker can
+these correlations are slight, given enough time an attacker can
 use them to deduce which senders are communicating with which recipients.
-Categorical defenses against long-term intersection attacks tend to require
-either an impractically large amount of cover traffic (by requiring senders
-to send dummy messages to every possible recipient whenever they want to send
-a single message to anyone) or a set of senders
+Previous researchers have believed that ong-term intersection attacks can
+only be stopped by using an impractically large amount of cover traffic (so
+that senders send dummy messages to every possible recipient whenever they
+want to send a real message to anyone) or a set of senders
 with perfect uptimes who send messages continually (so that the attacker can
 never learn how the network behaves in the absence of particular senders).
 %
@@ -99,19 +100,23 @@
 % wouldn't be a categorical defense unless *all* senders, not just 'a
 % set of senders', behaved this way; the ones who didn't wouldn't be
 % protected. -RD
+In this paper, we present several variations on a version of the long-term
+intersection attack, and examine their simulated performance.  Preliminary
+results indicate that these attacks can be resisted with significantly less
+overhead than previously supposed.
 
 Kesdogan, Agrawal, and Penz propose an example of a long-term
 intersection attack in \cite{limits-open}, and expand on this attack in
 \cite{agrawal03}. Their {\it disclosure attack}
 assumes a fairly strict model of sender behavior and works against
-only a single batch mix (a batch mix waits until it receives $b$
-messages, then reorders and retransmits them all).  Additionally, the
+only a single batch mix.  (A batch mix waits until it receives $b$
+messages, then reorders and retransmits them all.)  Additionally, the
 disclosure attack requires an attacker to solve an NP-complete
 problem to reveal the connections between senders and recipients.
 
 Danezis presents an algorithmically simpler {\it statistical
 disclosure attack} \cite{statistical-disclosure} that requires far
-less computation on the part of the attacker.  But while this attack
+less computation.  But while this attack
 is easier to describe and implement, it assumes the same
 restrictive sender and network models as the original disclosure
 attack.
@@ -127,37 +132,35 @@
   and can send multiple messages in a single batch.
 \item The attacker lacks {\it a priori} knowledge of the network's
   average behavior when Alice is not sending messages.
-\item Mixes use a different batching algorithm, such as the timed dynamic-pool
-  algorithm \cite{trickle02} used by Mixmaster \cite{mixmaster-spec}, or the
+\item Mixes use a different batching algorithm, such as Mixmaster's
+  dynamic-pool
+  algorithm \cite{trickle02,mixmaster-spec}, or the
   generalized mix algorithm proposed by \cite{pet2003-diaz}.  (Rather than
-  the normal ``batch'' mix behavior of relaying all messages when $b$
+  the ``batch'' mix behavior of relaying all messages when $b$
   messages have arrived, these algorithms hold messages in a ``pool'' for
   a random number of rounds based on the number of messages in the pool.)
 \item Alice uses a mix network (of any topology, with synchronous or
   asynchronous batching) to relay her messages through a succession of
   mixes, instead of using just a single mix.
 \item Alice sends some traffic padding to be dropped by some mix node in the
-  network,  to disguise when she is sending real messages.
+  network, to disguise when she is sending real messages.
 \item The attacker can only view a subset of the messages entering and
   leaving the network, so long as this subset includes some messages
   from Alice and some messages to Alice's recipients.
 \item The cover traffic generated by other senders changes
   slowly over time.  (We do not address this case completely).
 \end{tightlist}
-Each of these variations from the original statistical disclosure attack
-model increases the amount of traffic the attacker must observe to
-learn Alice's regular recipients.
+Each of these deviations from the original
+%%statistical disclosure attack
+model reduces the rate at which the attacker learns Alice's recipients, and
+increases the amount of traffic the attacker must observe.
 
 Additionally, we show how an attacker can exploit additional knowledge to
 speed up these attacks.  Such knowledge includes:
 \begin{tightlist}
-\item Statistical linkability between messages.  For example, a pair
-  of messages written in the same language is likelier to have been
-  written by a single sender than is a pair of messages written in
-  different languages.
-\item Full linkability between messages.  For example, if messages
-  are pseudonymous, all messages from the same pseudonym are almost
-  certainly written by a single sender.
+\item Linkability between messages.  For example if messages are written in
+  the same language, or signed by the same pseudonym, the attacker can deduce
+  that they are likelier to have been written by the same sender.
 \item {\it A priori} suspicion of certain messages having originated
   or not originated from Alice.  For example, messages written in a
   language Alice doesn't speak are unlikely to have been written
@@ -169,18 +172,19 @@
 \item Alice's behavior is not consistent over time.  If Alice does not
   produce enough traffic with the same group of regular recipients,
   the attacker cannot learn Alice's behavior.
-  We will quantify what `enough' means below.
+  %%We will quantify what `enough' means below.
     % if it's true that we will, we should probably shout it louder
     % in the abstract. guess we'll wait to see if we do. -RD
 \item The attacker cannot observe how the network behaves when Alice isn't
   sending messages. If Alice always sends the same number of messages, in
   every round, forever, the attacker may not be able to learn who receives
-  messages in Alice's absence.
+  messages in Alice's absence.  (Our preliminary results suggest that this
+  effect can be achieved with far less padding.)
 \item The attacker cannot tell when the sender is originating
   messages.
-  % For example, the sender may be running her own mix
-  % node and injecting messages into it directly.
-  %% why leave this out? it sounds important. -RD
+  %% For example, the sender may be running her own mix
+  %% node and injecting messages into it directly.
+  % why leave this out? it sounds important. -RD
 \end{tightlist}
 
 We begin in section \ref{sec:previous-work} by presenting a brief
@@ -203,7 +207,9 @@
 in turn decrypts, delays, and re-orders messages, before relaying them
 toward their destinations.
 
-% terms to possible define: mixmaster, mixminion, anonymity set
+% Must say something about mixnets here.
+
+% terms to possible define: mixmaster, mixminion, anonymity set.
 % mention that there are a bunch of mix network designs, but they're
 % all the same to us.
 
@@ -364,7 +370,7 @@
 remove the assumption that the attacker has full knowledge of the
 distribution $\V{u}$ of cover traffic sent by users other than Alice.
 
-We model Alice's behavior with a probability distribution $P_m$ such
+We model Alice's behavior with a probability function $P_m$ such
 that in every round Alice sends $n$ messages with a probability
 $P_m(n)$, and with a behavior vector $\V{v}$ of the recipients to
 which Alice sends.  Thus, Alice's expected contribution to each round
@@ -420,7 +426,7 @@
 messages it is currently storing.  All messages in the mix's pool at the end
 of a round have an equal probability of being included in that round's batch.
 Thus, we can characterize the mix's batching algorithm as a probability
-distribution $\PMIX(b|s)$---the probability that the mix relays $b$ messages
+function $\PMIX(b|s)$---the probability that the mix relays $b$ messages
 when it has $s$ messages in the pool.  Clearly, $\sum_{b=0}^{s}\PMIX(b|s) =
 1$, for all values of s.
 

***********************************************************************
To unsubscribe, send an e-mail to majordomo@seul.org with
unsubscribe freehaven-cvs       in the body. http://freehaven.net/