[Author Prev][Author Next][Thread Prev][Thread Next][Author Index][Thread Index]

[freehaven-cvs] clean and tighten sec 3.1



Update of /home/freehaven/cvsroot/doc/e2e-traffic
In directory moria.mit.edu:/home2/arma/work/freehaven/doc/e2e-traffic

Modified Files:
	e2e-traffic.tex 
Log Message:
clean and tighten sec 3.1


Index: e2e-traffic.tex
===================================================================
RCS file: /home/freehaven/cvsroot/doc/e2e-traffic/e2e-traffic.tex,v
retrieving revision 1.25
retrieving revision 1.26
diff -u -d -r1.25 -r1.26
--- e2e-traffic.tex	24 Jan 2004 21:40:33 -0000	1.25
+++ e2e-traffic.tex	24 Jan 2004 22:53:55 -0000	1.26
@@ -324,10 +324,10 @@
 
 Danezis also derives a precondition that the attack will only succeed when
 \( m < \frac{N}{b-1} \), and calculates the expected number of rounds to
-succeed (with $95\%$ confidence for security parameter $\ell=2$ and $99\%$
-confidence for $\ell=3$):
+succeed (with $95\%$ confidence for security parameter $\l=2$ and $99\%$
+confidence for $l=3$):
 
-\[t > \left[ m\ell \left(\sqrt{\frac{m-1}{m}} +
+\[t > \left[ ml \left(\sqrt{\frac{m-1}{m}} +
                        \sqrt{\frac{N-1}{N^2}(b-1)} \right) \right]^2 \]
 
 %======================================================================
@@ -335,13 +335,13 @@
 \label{sec:extending}
 \subsection{Broadening the attack}
 \label{subsec:broadening}
-In this subsection, we examine ways to extend Danezis's Statistical
-Disclosure Attack to systems more closely resembling real-world mix-nets.  In
+Here we examine ways to extend Danezis's statistical
+disclosure attack to systems more closely resembling real-world mix-nets.  In
 Section~\ref{sec:simulation}, we examine the time and information
 requirements for these attacks against simulated networks.
 
-\subsubsection{Complex senders, unknown background traffic}
-\label{subsubsec:complex-senders}
+\subsubsection{Complex senders, unknown background traffic:}
+% \label{subsubsec:complex-senders}
 First, we relax the requirements related to sender behavior.
 We allow Alice to choose among her recipients with non-uniform
 probability, and to send multiple messages in a single batch.  We also
@@ -361,8 +361,9 @@
 $t'$ of batches to
 which Alice has {\it not} contributed any messages.\footnote{The attack can
   still proceed if few such Alice-free batches exist, so long as Alice
-  contributes more to some batches than to others.  Specifically, the approach
-  described in Section~\ref{subsubsec:complex-mix} can exploit differences
+  contributes more to some batches than to others.  Specifically, the
+  approach described on the next page (against pool mixes and mix-nets)
+  can exploit differences
   between low-Alice and high-Alice batches to infer background behavior.}
 For each such
 batch $i$, the attacker constructs a vector $\V{u_i}$, whose elements are
@@ -384,32 +385,33 @@
 \[\V{v} \approx \frac{1}{\B{m}}
             \left[ b\B{O} - (b-\B{m})\B{U} \right] \]
 
-\subsubsection{Attacking pool mixes and mix-nets}
-\label{subsubsec:complex-mix}
+\subsubsection{Attacking pool mixes and mix-nets:}
+% \label{subsubsec:complex-mix} can't usefully label subsubsections.
 Most mix-net designs have already abandoned fixed-batch mixes in
 favor of other algorithms that better hide the relation
 between senders and recipients.  Such improved algorithms include
 timed dynamic-pool mixes, generalized mixes, and randomized versions of
-each \cite{trickle02}\cite{pet2003-diaz}.  Rather than reordering and
-relaying all of their messages whenever they have received a fixed number
-$b$, these algorithms store received messages in a {\it pool}, and at fixed
-intervals relay a {\it fraction} of the pooled messages, based on the pool's
+each \cite{pet2003-diaz,trickle02}.  Rather than reordering and
+relaying all the messages whenever a fixed number $b$ have arrived,
+these algorithms store received messages in a {\it pool}, and at fixed
+intervals relay a {\it fraction} of the pooled messages based on the pool's
 current size.
 
 When attacking such a mix, the attacker no longer knows for certain
 which batches of recipients contain a message from Alice.  Instead,
-the attacker can only estimate, for each batch of exiting messages,
+the attacker can only estimate, for each batch of output messages,
 the probability that the batch includes one of Alice's messages.
 
 Following D\'iaz and Serjantov's approach in \cite{pet2003-diaz}, we treat
 these mixing algorithms generically as follows: a mix relays a
-number of messages at the end of each round, depending on the number of
+number of messages at the end of each round, depending on how many
 messages it is currently storing.  All messages in the mix's pool at the end
 of a round have an equal probability of being included in that round's batch.
 Thus, we can characterize the mix's batching algorithm as a probability
 function $\PMIX(b|s)$---the probability that the mix relays $b$ messages
-when it has $s$ messages in the pool.  Clearly, $\sum_{b=0}^{s}\PMIX(b|s) =
-1$, for all values of s.
+when it has $s$ messages in the pool.  Clearly, $\forall s,
+\sum_{b=0}^{s}\PMIX(b|s) = 1$: the mix will always output between $0$
+and $s$ messages.
 
 %In the case of a timed dynamic-pool mix, this distribution is:
 %\[ P_{TDP}(b|s) =
@@ -432,21 +434,21 @@
 We assume that the attacker has a fair estimate of $P_R$.\footnote{The
 attacker can estimate $P_R$ by
 sending test messages through the mix, or by counting the
-messages entering and leaving the mix and deducing the pool size.
-Dummy messages can hamper these techniques to an extent discussed
-in \cite{dummy-msgs}.}
+messages entering and leaving the mix and deducing the pool size.}
+%Dummy messages can hamper these techniques to an extent discussed
+%in \cite{dummy-msgs}.
 Now, when Alice sends a message in round
 $i$, the attacker observes rounds $i$ through some later round $i+k$,
 choosing $k$ so that $\sum_{j=k+1}^{\infty} P_R^i(j)$ is negligible.
 The attacker then uses $P_R$ to
-computes a weighted mean $\B{O_w}$ of the observations from all of these
-rounds, weighting by the expected number of messages from Alice that will exit
+compute $\B{O_w}$, the mean of the observations from all of these rounds
+weighted by the expected number of messages from Alice that will exit
 that round:
 \[ \B{O_w} = \sum_i \sum_{r=0}^k P_R^i(r) m_i \V{o_{i+r}}
    \approx \frac{\B{m}\V{v} + (\B{n}-\B{m})\V{u}}{\B{n}} \]
 
 To solve for Alice's behavior $\V{v}$, the attacker now needs a good estimate
-for estimate for the background $\V{u}$.  The attacker gets this by
+for the background $\V{u}$.  The attacker gets this by
 averaging observations $\V{u_i}$ from batches with a negligible probability of
 including messages from Alice.  Such batches, however, are not
 essential: If the attacker chooses a set of $\V{u_i}$ such that each
@@ -454,21 +456,22 @@
 Alice, the attacker will obtain
 \[\B{U'} \approx \frac{\delta_a}{\B{n}} \V{v} +
                    \frac{1-\delta_a}{\B{n}} \V{u} \]
-and can thus substitute
+and can thus solve again for $\V{v}$ in the earlier equation for
+$\B{O_w}$, now using
 \[\V{u} \approx \frac{1}{1-\delta_a}
                \left[ \B{n} \B{U'} - \delta_a \V{v} \right] \]
-into the earlier equation for $\B{O_w}$ and again solve for $\V{v}$.
 
-Another way senders behave differently from the model of the original
-disclosure attack is by directing their messages through a chosen path
+Senders can also behave differently from the original model
+by directing their messages through a chosen path
 in a network of mixes.  While using a mix-net increases the effort an
 attacker must spend to observe all messages leaving the system, it
 has no additional effect on intersection attacks beyond changing the
 delaying characteristics $P_R$ of the anonymity system as introduced
 above.
 
-Assume for the sake of simplicity that all mixes share a single
-$P_R$, and that Alice chooses a path of length $\ell_0$.  The chance of
+Assume for the sake of simplicity that all mixes have the same
+expected delay $P_R$, and that Alice chooses a path of length $\ell_0$.
+The chance of
 the message being delayed by a further $d$ rounds is now
 \[  P_R'(\ell_0+d) = \binom{\ell_0+d-1}{d} (1-P_D)^{\ell_0} P_D^d \]
 If Alice chooses her path length probabilistically according to
@@ -481,16 +484,16 @@
 Danezis has independently extended statistical disclosure to pool mixes
 \cite{gd-thesis}.
 
-\subsubsection{Dummy traffic}
-\label{subsubsec:dummy-traffic}
-One strategy for reducing the impact of traffic analysis is for Alice
-to periodically send messages into the network that arrive at no
-recipient, or to periodically send messages into the network that
+\subsubsection{Dummy traffic:}
+%\label{subsubsec:dummy-traffic}
+Alice can also reduce the impact of traffic analysis by
+periodically sending messages into the network that are dropped inside
+the network, or by periodically sending messages into the network that
 arrive at recipients other than those with whom she wants to
 communicate.
 
 Although these methods can succeed in slowing or stopping the attacker (as
-discussed below in section \ref{sec:simulation}), the change in the attack
+discussed below in Section \ref{sec:simulation}), the change in the attack
 itself is trivial: Alice's behavior
 vector $\V{v}$ no longer adds to $1$, since there is now a probability that a
 message from Alice will not reach any recipient.  Aside from this, the
@@ -530,43 +533,42 @@
 %% regime results in traffic quadratic in the number of participants in
 %% the system, and thus scales poorly.
 
-\subsubsection{Partial observation}
-\label{subsubsec:partial}
+\subsubsection{Partial observation:}
+%\label{subsubsec:partial}
 Until now, we have required that the attacker, as a global passive
-adversary, observe all the messages entering and leaving the system.
-(At least, all the messages sent by Alice, and all the messages
-reaching Alice's recipients.)  This requirement is not so difficult
-as it might seem: in order to be a ``global'' adversary against
+adversary, observe all the messages entering and leaving the system
+(at least, all the messages sent by Alice, and all the messages
+reaching Alice's recipients).  This requirement is not so difficult
+as it might seem: to be a ``global'' adversary against
 Alice, an attacker need only eavesdrop upon Alice, and upon the mixes
 that deliver messages to recipients. (Typically, not all mixes do
-so.)  In this section, we examine the consequences for a non-global
-attacker.
+so.)
 
 Depending on which points of the network the attacker can observe, a
 non-global attacker has different characteristics.  If an attacker
 eavesdrops on a fraction of the {\it mixes} in the system, the
 attacker receives a random sample\footnote{But possibly a biased
   sample, depending on Alice's path selection algorithm.} of the
-messages entering or leaving the system.  Such an attacker
-will still converge on the same $\B{O}$ and thus the same
-estimation of Alice's behavior, but will require more rounds of
-observation to do so.
+messages entering or leaving the system. If such an attacker can see some
+messages from Alice and some messages to her recipients, he will still
+converge on the same $\B{O}$ and thus the same estimation of Alice's
+behavior, but the attack will require more rounds of observation.
 
 Alternatively, an attacker who eavesdrops on a fraction of the {\it
   users} in the system receives {\it all} of the messages sent to or
-from those users but no messages sent to or from other users.  So long as
-one of these users is Alice, the attack can still proceed: To such an
-attacker, the network is as if the messages sent by Alice to
+from those users but no messages sent to or from other users.  So long
+as one of these users is Alice, the network (to such an attacker) is as
+if the messages sent by Alice to
 unobserved recipients were dummy messages.  Now the attack converges
 as before, but with only information concerning the observed
 recipients:  the attacker learns which of the observed recipients
 receive messages from Alice, and which do not.
 
-\subsubsection{Time-variant background traffic}
-\label{subsubsec:time-variant}
+\subsubsection{Time-variant background traffic:}
+%\label{subsubsec:time-variant}
 If Alice's behavior changes completely and radically over time, a long-term
 intersection attack cannot proceed: the attacker cannot make enough
-observations of any version of Alice's behavior in order to converge
+observations of any version of Alice's behavior to converge
 on a $\B{v}$ for any of them.
 
 On the other hand, if Alice's behavior $\V{v}$ remains consistent
@@ -582,14 +584,17 @@
 So if an attacker can get good local estimates to $\V{u}$, the
 intersection attack proceeds as before.
 
-\subsubsection{Attacking recipients}
-\label{subsubsec:recipients}
-As a final (and somewhat anticlimactic) extension to the original
-attack, we note that an attacker can mount attacks against recipients
-as well as senders with only slightly higher storage, and no increase
-in computational cost.
+\subsubsection{Attacking recipients:}
+%\label{subsubsec:recipients}
+%As a final (and somewhat anticlimactic) extension to the original
+%attack,
+Finally,
+we note that an attacker can % mount attacks against 
+find recipients
+as well as senders with slightly higher storage and the same %no increase in
+computational cost.
 
-In this model, the attacker wishes to know which senders are sending
+The attacker wishes to know which senders are sending
 anonymous messages to a given recipient Bob.  The analysis remains the
 same: the attacker compares sender behavior in rounds from which Bob
 probably receives messages with behavior in rounds from which Bob
@@ -597,7 +602,7 @@
 attacker cannot tell in advance when Bob will receive a message.
 Therefore, the attacker must remember a window of recent observations
 at all times,
-such that if Bob later receives a message, the probability is negligible that
+such that if Bob later receives a message, the chance is negligible that
 the message was sent before the first round in the window.
 
 \subsection{Strengthening the attack}
@@ -613,8 +618,8 @@
 increased traffic, we discuss ways to reduce the amount of traffic required
 for the attack by incorporating additional information.
 
-\subsubsection{Exploiting message partitioning}
-\label{subsubsec:full-linkability}
+\subsubsection{Exploiting message partitioning:}
+%\label{subsubsec:full-linkability}
 The attacker's work can be greatly simplified if some messages leaving
 the system are
 {\it linkable}.  Two messages are said to be {\it linkable} if they are
@@ -667,8 +672,8 @@
 %  to $<$Word6, Bob$>$, she is more likely to send to $<$Word6,
 %  Carol$>$ than to $<$Word2K, Carol$>$.}  -NM
 
-\subsubsection{Exploiting {\it a priori} suspicion}
-\label{subsubsec:suspicion}
+\subsubsection{Exploiting {\it a priori} suspicion:}
+%\label{subsubsec:suspicion}
 Finally, the attacker may have reason to believe that some messages
 are more likely to have been sent by the target user than others.  For
 example, if we believe that Alice speaks Urdu but not Arabic, or
@@ -697,7 +702,7 @@
 model of the original statistical disclosure attack, then against more
 sophisticated models.  We describe our simulations and present results below.
 
-\subsubsection{The original statistical disclosure attack}
+\subsubsection{The original statistical disclosure attack:}
 %trial1
 First, we simulated Danezis's original statistical disclosure attack,
 varying the parameters $N$ (the number of recipients), $m$ (the number
@@ -729,7 +734,7 @@
 recipients (large $m$); when there are fewer recipients for Alice to hide
 hers among (small $N$); or when batch sizes are small (small $b$).
 
-\subsubsection{Complex sender behavior and unknown background traffic}
+\subsubsection{Complex sender behavior and unknown background traffic:}
 % trial2
 The next simulation examines the consequences of a more complex model for
 background traffic, and of several related models for Alice's behavior.
@@ -793,7 +798,7 @@
 more than they did before.
 %XXX say more? -NM
 
-\subsubsection{Attacking pool mixes and mix-nets}
+\subsubsection{Attacking pool mixes and mix-nets:}
 \label{subsec:sim-complex-mixes}
 %trials 3,4
 Pooling slows an attacker by increasing the number of output messages
@@ -843,8 +848,8 @@
 % XXX can you explain this better? -NM
 
 
-\subsubsection{The impact of dummy traffic}
-\label{subsec:sim-dummies}
+\subsubsection{The impact of dummy traffic:}
+%\label{subsec:sim-dummies}
 %trials 5a,5b
 Several proposals exist for using dummy messages to frustrate traffic
 analysis.  Although several of them have been examined in the context of
@@ -894,8 +899,8 @@
 ``imperfect threshold padding'' (Alice always tries to pad up to a threshold
 of $M$ messages per round, but is sometimes offline).
 
-\subsubsection{The impact of partial observation}
-\label{subsec:sim-partial}
+\subsubsection{The impact of partial observation:}
+%\label{subsec:sim-partial}
 %trial 6
 Finally, we examine the degree to which a non-global passive adversary can
 mount the statistical disclosure attack.  Again, we base our simulation on
@@ -937,8 +942,8 @@
 closing, we offer suggestions for mix-net designs, and suggest several open
 questions for future work.
 
-\subsubsection{Implications for mix-net design}
-\label{subsubsec:implications}
+\subsubsection{Implications for mix-net design:}
+%\label{subsubsec:implications}
 If we were to design a mix network based on our findings here, what steps
 should we take in order to frustrate intersection attack?
 
@@ -985,8 +990,8 @@
 % connection with an attacker on each end compromises a sender--receiver
 % link.
 
-\subsubsection{Questions for future work}
-\label{subsubsec:future-work}
+\subsubsection{Questions for future work:}
+%\label{subsubsec:future-work}
 Many questions remain to be answered before the effectiveness of long-term
 intersection attacks can be considered a closed problem.
 

***********************************************************************
To unsubscribe, send an e-mail to majordomo@seul.org with
unsubscribe freehaven-cvs       in the body. http://freehaven.net/