[Author Prev][Author Next][Thread Prev][Thread Next][Author Index][Thread Index]
[freehaven-cvs] Make math look better

To: freehaven-cvs@freehaven.net
Subject: [freehaven-cvs] Make math look better
From: nickm@seul.org (Nick Mathewson)
Date: Tue, 20 Jan 2004 20:04:08 -0500 (EST)
Delivered-to: archiver@seul.org
Delivered-to: freehaven-cvs-outgoing@seul.org
Delivered-to: freehaven-cvs@seul.org
Delivery-date: Tue, 20 Jan 2004 20:04:43 -0500
Reply-to: freehaven-dev@freehaven.net
Sender: owner-freehaven-cvs@freehaven.net
Update of /home/freehaven/cvsroot/doc/e2e-traffic
In directory moria.mit.edu:/tmp/cvs-serv8284

Modified Files:
	e2e-traffic.tex 
Log Message:
Make math look better

Index: e2e-traffic.tex
===================================================================
RCS file: /home/freehaven/cvsroot/doc/e2e-traffic/e2e-traffic.tex,v
retrieving revision 1.9
retrieving revision 1.10
diff -u -d -r1.9 -r1.10
--- e2e-traffic.tex	21 Jan 2004 00:47:26 -0000	1.9
+++ e2e-traffic.tex	21 Jan 2004 01:04:06 -0000	1.10
@@ -13,6 +13,8 @@
 \newcommand\XXXX[1]{{\small\bf [XXXX #1]}}
 
 \newcommand\PMIX{P_{\mbox{MIX}}}
+\newcommand\V[1]{\overrightarrow{#1}}
+\newcommand\B[1]{\overline{#1}}
 
 \newenvironment{tightlist}{\begin{list}{$\bullet$}{
   \setlength{\itemsep}{0mm}
@@ -120,8 +122,6 @@
 pattern over time, we show how an attacker can learn Alice's regular
 recipients, even when we relax the original disclosure attack's model
 in the following ways:
-%in the presence of one or more of these deviations from the
-%original disclosure attack's model:
 \begin{tightlist}
 \item Alice chooses non-uniformly among her communication partners,
   and can send multiple messages in a single batch.
@@ -136,10 +136,8 @@
 \item Alice uses a mix network (of any topology, with synchronous or
   asynchronous batching) to relay her messages through a succession of
   mixes, instead of using just a single mix.
-\item Alice sends some intra-network traffic padding delivered to no
-  particular recipient, to disguise when she is sending real messages.
-% XXX is 'delivered to no particular recipient' the same as 'delivered
-%     to some mix node in the network'? -RD
+\item Alice sends some traffic padding to be dropped by some mix node in the
+  network,  to disguise when she is sending real messages.
 \item The attacker can only view a subset of the messages entering and
   leaving the network, so long as this subset includes some messages
   from Alice and some messages to Alice's recipients.
@@ -314,25 +312,25 @@
 implement in terms of storage, speed, and algorithmic complexity.
 
 In the statistical disclosure attack, we model Alice's behavior as an
-unknown vector $\vec{v}$ whose elements correspond to the
+unknown vector $\V{v}$ whose elements correspond to the
 probability of Alice sending a message to each of the $N$ recipients
-in the system.  The elements of $\vec{v}$ corresponding to Alice's $m$
-recipients will be $1/m$; the other $N-m$ elements of $\vec{v}$ will
+in the system.  The elements of $\V{v}$ corresponding to Alice's $m$
+recipients will be $1/m$; the other $N-m$ elements of $\V{v}$ will
 be $0$.  We model the behavior of the cover traffic sent by other users
-as a known vector $\vec{u}$ each of whose $N$ elements is $1/N$.
+as a known vector $\V{u}$ each of whose $N$ elements is $1/N$.
 
 The attacker derives from each output round $i$ an observation vector
-$\vec{o_i}$, each of whose elements corresponds to the probability of
+$\V{o_i}$, each of whose elements corresponds to the probability of
 Alice's having sent a message to each particular recipient in that round. (In
-a round $i$ where Alice has send a message, each element of $\vec{o_i}$ will
+a round $i$ where Alice has send a message, each element of $\V{o_i}$ will
 have value $1/b$ if it corresponds to a recipient who has received a message,
 and $0$ if it does not.
-Taking the arithmetic mean $\bar{O}$ of a large set of the observation
+Taking the arithmetic mean $\B{O}$ of a large set of the observation
 vectors gives (by the law of large numbers):
-\[\bar{O} = \frac{1}{t}\sum_{i=i}^{t} \vec{o_i} \approx
-  \frac{\vec{v} + (b-1)\vec{u}}{b} \]
+\[\B{O} = \frac{1}{t}\sum_{i=i}^{t} \V{o_i} \approx
+  \frac{\V{v} + (b-1)\V{u}}{b} \]
 From this, the attacker estimates
-\[\vec{v} \approx b\frac{\sum_{i=1}^t \vec{o_i}}{t} - (b-i)\vec{u}\]
+\[\V{v} \approx b\frac{\sum_{i=1}^t \V{o_i}}{t} - (b-i)\V{u}\]
 
 % XXXX Maybe add a simple example?
 
@@ -364,40 +362,40 @@
 We allow Alice to choose among her recipients with non-uniform
 probability, and to send multiple messages in a single batch.  We also
 remove the assumption that the attacker has full knowledge of the
-distribution $\vec{u}$ of cover traffic sent by users other than Alice.
+distribution $\V{u}$ of cover traffic sent by users other than Alice.
 
-We model Alice's behavior with a probability distribution $p_m$ such
+We model Alice's behavior with a probability distribution $P_m$ such
 that in every round Alice sends $n$ messages with a probability
-$p_m(n)$, and with a behavior vector $\vec{v}$ of the recipients to
+$P_m(n)$, and with a behavior vector $\V{v}$ of the recipients to
 which Alice sends.  Thus, Alice's expected contribution to each round
-is $\vec{v} \sum_{n=0}^{\infty} n p_m(n)$.
+is $\V{v} \sum_{n=0}^{\infty} n P_m(n)$.
 
 To mount the attack, the attacker first obtains an
-estimate of $\vec{u}$ by observing a large number $t'$ of batches to
+estimate of $\V{u}$ by observing a large number $t'$ of batches to
 which Alice has {\it not} contributed any messages.\footnote{The attack can
   still proceed if few such Alice-free batches exist, so long as Alice
   contributes more to some batches than to others.  Specifically, the approach
   described in section \ref{subsubsec:complex-mix} can exploit differences
   between low-Alice and high-Alice batches to infer background behavior.}
 For each such
-batch $i$, the attacker constructs a vector $\vec{u_i}$ containing
+batch $i$, the attacker constructs a vector $\V{u_i}$ containing
 $1/b$ for recipients that have received a message in that batch, and
 $0$ for recipients that have not.  The attacker then estimates
-$\vec{u}$ as:
-\[\vec{u} \approx \bar{U} = \frac{1}{t'}\sum_{i=1}^{t'}\vec{u_i} \]
+$\V{u}$ as:
+\[\V{u} \approx \B{U} = \frac{1}{t'}\sum_{i=1}^{t'}\V{u_i} \]
 
 The attacker then observes, for each round $i$ in which Alice {\it does}
 send a message, the number of messages $m_i$ sent by Alice, and
-computes $\vec{o_i}$ as before.  Computing the arithmetic mean
+computes $\V{o_i}$ as before.  Computing the arithmetic mean
 gives us
-\[\bar{O} = \frac{1}{t}\sum_{i=1}^{t}{\vec{o_i}}
+\[\B{O} = \frac{1}{t}\sum_{i=1}^{t}{\V{o_i}}
             \approx
-              \frac{\bar{m}\vec{v} + (b-\bar{m})\bar{U}}{b}
-              \mbox{\ where\ } \bar{m} = \frac{1}{t}\sum{m_i} \]
+              \frac{\B{m}\V{v} + (b-\B{m})\B{U}}{b}
+              \mbox{\ where\ } \B{m} = \frac{1}{t}\sum{m_i} \]
 
 From this, the attacker estimates Alice's behavior as
-\[\vec{v} \approx \frac{1}{\bar{m}}
-            \left[ b\bar{O} - (b-\bar{m})\bar{U} \right] \]
+\[\V{v} \approx \frac{1}{\B{m}}
+            \left[ b\B{O} - (b-\B{m})\B{U} \right] \]
 
 \subsubsection{Attacking pool mixes}
 \label{subsubsec:complex-mix}
@@ -417,8 +415,7 @@
 the probability that the batch includes one of Alice's messages.
 
 Following D\'iaz and Serjantov's approach in \cite{pet2003-diaz}, we treat
-these mixmin algorithms generically as follows: a mix relays a
-% XXX 'mixmin'? do you mean 'mixing'?
+these mixing algorithms generically as follows: a mix relays a
 number of messages at the end of each round, depending on the number of
 messages it is currently storing.  All messages in the mix's pool at the end
 of a round have an equal probability of being included in that round's batch.
@@ -457,24 +454,24 @@
 Now, when Alice sends a message in round
 $i$, the attacker observes rounds $i$ through some later round $i+k$,
 choosing $k$ so that $P_R^i(k)$ is negligible.  The attacker then uses $P_R$ to
-computes a weighted mean $\bar{O_w}$ of the observations from all of these
+computes a weighted mean $\B{O_w}$ of the observations from all of these
 rounds, weighting by the expected number of messages from Alice that will exit
 that round:
-\[ \bar{O_w} = \sum_i \sum_{r=0}^k P_R^i(r) m_i \vec{o_{i+r}}
-   \approx \frac{\bar{m}\vec{v} + (\bar{n}-\bar{m})\vec{u}}{\bar{n}} \]
+\[ \B{O_w} = \sum_i \sum_{r=0}^k P_R^i(r) m_i \V{o_{i+r}}
+   \approx \frac{\B{m}\V{v} + (\B{n}-\B{m})\V{u}}{\B{n}} \]
 
-To estimate $\vec{u}$ in this case, the simplest approach is to
-average $\vec{u_i}$ from batches with a negligible probability of
+To estimate $\V{u}$ in this case, the simplest approach is to
+average $\V{u_i}$ from batches with a negligible probability of
 including messages from Alice.  Such batches, however, are not
-essential: If the attacker chooses a set of $\vec{u_i}$ such that each
+essential: If the attacker chooses a set of $\V{u_i}$ such that each
 round contains (on average) a small number $\delta_a>0$ of messages from
 Alice, the attacker will obtain
-\[\bar{U'} \approx \frac{\delta_a}{\bar{n}} \vec{v} +
-                   \frac{1-\delta_a}{\bar{n}} \vec{u} \]
+\[\B{U'} \approx \frac{\delta_a}{\B{n}} \V{v} +
+                   \frac{1-\delta_a}{\B{n}} \V{u} \]
 and can thus substitute
-\[\vec{u} \approx \frac{1}{1-\delta_a}
-               \left[ \bar{n} \bar{U'} - \delta_a \vec{v} \right] \]
-into the earlier equation for $\bar{O_w}$ and again solve for $\vec{v}$.
+\[\V{u} \approx \frac{1}{1-\delta_a}
+               \left[ \B{n} \B{U'} - \delta_a \V{v} \right] \]
+into the earlier equation for $\B{O_w}$ and again solve for $\V{v}$.
 
 \subsubsection{Attacking mix-nets}
 \label{subsubsec:mix-nets}
@@ -506,7 +503,7 @@
 communicate.
 
 In the first case, the change is trivial: Alice's behavior vector
-$\vec{v}$ no longer adds to $1$, since there is now a probability that
+$\V{v}$ no longer adds to $1$, since there is now a probability that
 a message from Alice will not reach any recipient.  Aside from this, the
 attack can proceed as before, so long as Alice sends more messages (including
 dummies) in some rounds than in others.
@@ -524,15 +521,15 @@
   only the recipient of each dummy message can distinguish it from a
   real message.}.  Suppose that every message Alice sends has a
 probability $P_c$ of being a cover message, and that recipients for
-cover messages are chosen from a distribution vector $\vec{v_c}$.
+cover messages are chosen from a distribution vector $\V{v_c}$.
 Now, instead of converging on Alice's true pattern of recipients
-$\vec{v}$, the attacker instead learns
-\[ \vec{v'} = (1-P_c)\vec{v} + P_c\vec{v_c} \]
+$\V{v}$, the attacker instead learns
+\[ \V{v'} = (1-P_c)\V{v} + P_c\V{v_c} \]
 In other words, although the attacker cannot deduce which recipients
 receive messages from Alice, he can instead learn a superset of
-Alice's recipients. 
+Alice's recipients.
 
-Clearly, if Alice chooses $\vec{v_c}$ such that she sends messages with
+Clearly, if Alice chooses $\V{v_c}$ such that she sends messages with
 equal probability to all recipients, she can completely hide the
 actual recipients of her messages.  In practice, however, this
 requirement in unrealistic:  doing so will require that she send
@@ -542,7 +539,7 @@
 the system, and thus scales poorly.
 
 \XXXX{say more here.  Dummy policies other than N-squared may help a
-  little, by making it harder to learn $\vec{u}$.}
+  little, by making it harder to learn $\V{u}$.}
 
 \subsubsection{Partial observation}
 \label{subsubsec:partial}
@@ -562,7 +559,7 @@
 attacker receives a random sample\footnote{But possibly a biased
   sample, depending on Alice's path selection algorithm} of the
 messages entering or leaving the system.  An attacker in this case
-will still converge on the same $\bar{O}$ and thus the same
+will still converge on the same $\B{O}$ and thus the same
 estimation of Alice's behavior, but will require more rounds of
 observation to do so.
 
@@ -583,19 +580,19 @@
 If Alice's behavior changes completely and radically over time, a long
 term intersection cannot proceed: the attacker cannot make enough
 observations of any version of Alice's behavior in order to converge
-on a $\bar{v}$ for any of them.
+on a $\B{v}$ for any of them.
 
-On the other hand, if Alice's behavior $\vec{v}$ remains consistent
-while the behavior of the steady-state traffic $\vec{u}$ changes slowly, the
-attacker still has some hope.  Rather than estimating a single $\bar{U}$
+On the other hand, if Alice's behavior $\V{v}$ remains consistent
+while the behavior of the steady-state traffic $\V{u}$ changes slowly, the
+attacker still has some hope.  Rather than estimating a single $\B{U}$
 from observations to which Alice does not contribute, the attacker
-estimates a series of successive $\bar{u_i}$ values based on the
+estimates a series of successive $\B{u_i}$ values based on the
 average behavior of the network during a comparatively shorter
-duration of time.  Now the attacker observes $\vec{o_i}$ as before and
-computes the average of $\vec{o_i} - \vec{u_i}$, as before.  Now,
-\[ \vec{v} \propto \frac{1}{t}\sum_{i=1}^t \vec{o_i} - \bar{u_i}
+duration of time.  Now the attacker observes $\V{o_i}$ as before and
+computes the average of $\V{o_i} - \V{u_i}$, as before.  Now,
+\[ \V{v} \propto \frac{1}{t}\sum_{i=1}^t \V{o_i} - \B{u_i}
 \]
-So if an attacker can get good local estimates to $\vec{u}$, the
+So if an attacker can get good local estimates to $\V{u}$, the
 intersection attack proceeds as before.
 
 \XXXX{(Okay, not quite -- there's a constant factor there too.)}
@@ -643,12 +640,12 @@
 could trivially use this information to link senders and recipients.
 One way to do this
 is by treating classes of fully linkable messages as virtual message
-destinations.  Instead of collecting observations $\vec{o_i}$ of
+destinations.  Instead of collecting observations $\V{o_i}$ of
 recipients who receive messages in round $i$, the attacker now
-collects observations $\vec{o_i}$ of linkable classes
+collects observations $\V{o_i}$ of linkable classes
 (e.g. pseudonyms) that receive in round $i$.  Since two distinct
 senders don't produce messages in the same linkability class, the
-elements of Alice's $\vec{v}$ and the background $\vec{u}$ are now
+elements of Alice's $\V{v}$ and the background $\V{u}$ are now
 disjoint, and thus easier for the attacker to separate.
 
 \XXXX{Need to analyze below the circumstances under which this works;
@@ -710,11 +707,11 @@
 
 To exploit this knowledge, an attacker can (as suggested in the
 original Statistical Disclosure paper \cite{statistical-disclosure})
-modify the estimated probabilities in $\vec{o_i}$ of Alice having sent
+modify the estimated probabilities in $\V{o_i}$ of Alice having sent
 each delivered message.  For example, if 100 messages are sent in a
 round, and the attacker judges that 50 of them are twice as likely to
 have been sent by Alice than the other 50, the attacker assigns
-$2/150$ to each element of $\vec{o_i}$ corresponding to a likely
+$2/150$ to each element of $\V{o_i}$ corresponding to a likely
 message, and $1/150$ to each element corresponding to an unlikely
 message.
 

***********************************************************************
To unsubscribe, send an e-mail to majordomo@seul.org with
unsubscribe freehaven-cvs       in the body. http://freehaven.net/
Prev by Author: [freehaven-cvs] Add another test to try to figure out how much paddi...
Next by Author: [freehaven-cvs] Revise abstract and introduction
Previous by thread: [freehaven-cvs] Mostly added a robustness discussion
Next by thread: [freehaven-cvs] fix a cite
Index(es):
- Author
- Thread