
[freehaven-cvs] patches for preproc version of sync-batching paper.



Update of /home/freehaven/cvsroot/doc/sync-batching
In directory moria.mit.edu:/home2/arma/work/freehaven/doc/sync-batching

Modified Files:
	model.tex sync-batching.pdf sync-batching.tex 
Log Message:
patches for preproc version of sync-batching paper.
we have until Monday evening for more patches, if you have any.


Index: model.tex
===================================================================
RCS file: /home/freehaven/cvsroot/doc/sync-batching/model.tex,v
retrieving revision 1.10
retrieving revision 1.11
diff -u -d -r1.10 -r1.11
--- model.tex	27 Jan 2004 14:20:30 -0000	1.10
+++ model.tex	3 May 2004 05:42:33 -0000	1.11
@@ -7,6 +7,7 @@
 \newcommand{\EE}{\mathcal{E}}
 
 \section{Modeling Methodology}
+\label{sec:model}
 
 The basic model underlying our comparative study of mix network topologies
 is \emph{mixing as probabilistic permutation}.  At the cost of a
@@ -34,15 +35,16 @@
 observations, too.  Let $m'_1,\ldots,m'_N$ be the (virtual) output array.
 Due to the mixing performed by the network, it may or may not be the case
 that $m'_i=m_i$, \ie, the messages have been probabilistically permuted
-by the network.  We will refer to the following discrete probability
-distribution as the \emph{mixing distribution} of the network:
-\begin{equation}
-\label{maindist}
-\begin{array}{l}
-\PP_1\quad\ldots\quad\PP_N \\
-\mbox{where}\ \PP_i=\mathit{Prob}(m'_i=m)
-\end{array}
-\end{equation}
+by the network.  We will refer to the discrete probability
+distribution $\PP_1,\ldots,\PP_N$, where $\PP_i=\mathit{Prob}(m'_i=m)$,
+as the \emph{mixing distribution} of the network.
+%\begin{equation}
+%\label{maindist}
+%\begin{array}{l}
+%\PP_1\quad\ldots\quad\PP_N \\
+%\mbox{where}\ \PP_i=\mathit{Prob}(m'_i=m)
+%\end{array}
+%\end{equation}
 Informally, each $\PP_i$ is the probability that the targeted
 message $m$ re-appears in the $i$th position of the output buffer.
 
@@ -78,7 +80,7 @@
 can be interpreted as $m$'s position before passing through the mix,
 and $s'$ as its position afterwards.
 
-For the purposes of computing distribution (\ref{maindist}), we are
+For the purposes of computing the mixing distribution $\PP_1,\ldots,\PP_N$, we are
 interested in deadlock states, \ie, those corresponding to the situation
 in which $m$ has passed through all mixes in its path and exited the
 mix network with no further transitions possible.  Suppose a special
@@ -104,7 +106,7 @@
 mix can be interpreted as permutation in a virtual buffer of size $K$.
 In particular, the targeted message $m$ appears in any of the $K$
 output positions with equal probability after passing through the mix.
-Therefore, each mix can be modeled by a simple Markov chain as below
+Therefore, each honest mix can be modeled by a simple Markov chain as below
 (recall that state $s$ represents the current position of message $m$,
 and let $t$ be the sequential number of the current hop). However,
 the compromised
@@ -135,7 +137,7 @@
 in Section~\ref{singlemix}.
 The main difference between topologies
 is how the targeted message moves through the network, resulting in
-different mixing distributions in~(\ref{maindist}).
+different mixing distributions $\PP_1,\ldots,\PP_N$.
 
 \subsubsection{Assumptions.}
 
@@ -150,14 +152,22 @@
         \mbox{mix $x$ was selected as entry point}
         )]$.
 
-We simplify the model by assuming that message load on each internal link
-within the network is exactly equal to the statistically expected load
-given a particular network topology.  This assumption is approximated
-with very high probability when the number of messages in a single batch
-is significantly higher than the number of network nodes (see
-Section~\ref{subsec:average-actual} for a detailed discussion), and
-randomness involved in message routing is obtained from unbiased coin
-tosses in the same way for all messages.
+Note that we must consider two sources of uncertainty. The first is the
+distribution of compromised nodes in the network, which we address by
+assuming a fixed probability that any given node is bad.  Thus we are
+calculating \emph{prior} distributions---effectively the average of all
+possible occurrences of compromised nodes in the network. (In contrast,
+\cite{Diaz02,Serj02} consider \emph{posterior} distributions, where
+certain nodes are known to be bad.) The second uncertainty is the users'
+selection of message routes, which we address by treating the message
+load on each internal link within the network as exactly equal to
+the statistically expected load given a particular network topology.
+This assumption is approximated with very high probability when the
+number of messages in a single batch is significantly higher than the
+number of network nodes (see Section~\ref{subsec:average-actual} for
+discussion). %, and randomness
+%involved in message routing is obtained from unbiased coin tosses in
+%the same way for all messages.
 
 Intuitively, suppose there are four mixes in the first layer of the
 network, and batch size is 128.  We will analyze the average-case
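As a side note on the mixing-distribution model patched above: the entropy the paper computes over $\PP_1,\ldots,\PP_N$ is ordinary Shannon entropy, and an honest mix of batch size K yields the uniform distribution. A minimal illustrative sketch (not code from the paper; the uniform and skewed distributions are made-up examples):

```python
import math

def entropy(dist):
    # Shannon entropy (in bits) of a mixing distribution P_1..P_N.
    return -sum(p * math.log2(p) for p in dist if p > 0)

# An honest mix with batch size K permutes uniformly: the targeted
# message appears in each of the K output positions with probability 1/K.
K = 16
uniform = [1.0 / K] * K
print(entropy(uniform))  # log2(16) = 4.0 bits

# A skewed distribution (e.g. a partly compromised path) gives less entropy.
skewed = [0.5] + [0.5 / (K - 1)] * (K - 1)
print(entropy(skewed))
```

This is just the definition; the paper's model checker computes the distribution itself from the Markov chains, then applies this formula.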

Index: sync-batching.pdf
===================================================================
RCS file: /home/freehaven/cvsroot/doc/sync-batching/sync-batching.pdf,v
retrieving revision 1.20
retrieving revision 1.21
diff -u -d -r1.20 -r1.21
Binary files /tmp/cvs99vxTb and /tmp/cvsWhzxJj differ

Index: sync-batching.tex
===================================================================
RCS file: /home/freehaven/cvsroot/doc/sync-batching/sync-batching.tex,v
retrieving revision 1.54
retrieving revision 1.55
diff -u -d -r1.54 -r1.55
--- sync-batching.tex	29 Apr 2004 08:55:19 -0000	1.54
+++ sync-batching.tex	3 May 2004 05:42:33 -0000	1.55
@@ -123,7 +123,7 @@
 model. Section~\ref{sec:related} relates previous work
 to synchronous batching, including a response to each of the arguments
 from~\cite{disad-free-routes}. Section~\ref{sec:scenarios} presents the
-three topologies, and describes calculating their entropy
+three topologies, and Section~\ref{sec:model} describes their entropy
 (average anonymity the sender expects from the network). We use a model
 checker to compute entropy for networks with 16 nodes:
 we present our results and assess the assumptions behind them in
@@ -202,7 +202,7 @@
 synchronous batching. We refer to that paper for a detailed discussion
 of the timing model, how to handle loosely synchronized clocks, and the
 step-by-step instructions for senders and mixes to use the network and
-judge whether messages have arrived in time.
+judge whether messages have arrived on time.
 
 That paper also describes a receipt and witness system by which senders
 and mixes can prove that a given mix failed to pass on or accept a given
@@ -225,9 +225,9 @@
 
 {\bf{Position in mix route:}} This attack partitions messages that go
 through a given honest node based on how many hops each message has
-traveled so far. If the adversary owns all other nodes in the network,
+travelled so far. If the adversary owns all other nodes in the network,
 he can distinguish messages at different positions in their path (say,
-one has passed two mixes already, and another has passed three), and
+one has traversed two mixes already, and another has traversed three), and
 thus learn the sender and recipient of each message. The authors note:
 \emph{Eventually, a message is only unobservable in that group of messages
 which have this mix on the same routing position.} But in the synchronous
@@ -307,7 +307,7 @@
 trivially link them to their messages). Mixes ensure their messages
 come from distinct senders, so Alice can expect good mixing at each
 honest node in her path~\cite{web-mix}. For cascades this approach is
-clearly efficient, because Alice only needs tickets for her chosen
+clearly efficient because Alice only needs tickets for her chosen
 cascade~\cite{disad-free-routes}, but her anonymity set is still
 limited to that one cascade. We conjecture that other topologies
 can give equivalent anonymity while only obtaining tickets from a
@@ -417,7 +417,7 @@
 layers of disjoint nodes such that messages may pass from any node at
 one layer to any node at the next layer; and a 16x4 free-route mixnet,
 in which all nodes may receive messages at all layers. Note that
-because free-route nodes are reused, `16x4' does not imply 64 nodes.
+because free-route nodes are reused, `16x4' does not mean 64 nodes.
 
 Examples of a 2x2 cascade mixnet, a 2x2 stratified mixnet, and a 4x2
 free-route mixnet are illustrated in Figures~\ref{fig:casc-2x2},
@@ -543,16 +543,16 @@
 e.g.~based on trust. A skewed distribution of messages only at the entry
 or exit of the network should not impact entropy too much---we see from
 Figures \ref{fig:caschops}-\ref{fig:freehops} that much of each network's
-entropy is achieved after just a few hops.
+entropy is achieved from just a few hops.
 
-Reputation systems, on the other hand, encourage users to preferentially
-choose nodes \emph{throughout} the network. Further, reputation
+Reputation systems, on the other hand, encourage users to prefer certain
+nodes at \emph{each layer} of the network. Further, reputation
 information can be exploited by an adversary to reduce anonymity,
 for example by predicting the user's behavior based on reputation
 statistics, or by attracting more traffic by building a strong reputation
 or degrading the reputation of others. Placing nodes with similar reputation
-in the same layer of a stratified or cascade network, or placing them
-in the same cascade, can complicate these attacks, but employed naively,
+in the same layer of a stratified network, or placing them
+in the same cascade, might complicate these attacks, but employed naively,
 this can facilitate other attacks~\cite{casc-rep}. This
 topic merits further investigation.
 
@@ -614,7 +614,7 @@
 
 In Section \ref{subsec:blending} we talk about techniques to discourage
 a mix from dropping or substituting messages in the batch. But
-what if the adversary simply submits more messages to the batch?
+what if the adversary simply submits \emph{more} messages to the batch?
 
 It turns out that as long as $k$ of the $n$ input messages come from
 honest senders, Alice will still be assured that she can expect entropy
@@ -726,19 +726,6 @@
 %It may also attract attackers to such cabals so that the net anonymity
 %does not increase.
 
-Like stop-and-go mixes~\cite{stop-and-go}, we may be able to get improved
-anonymity by allowing Alice to choose to delay her message at a given
-hop until the next batch. That is, the node would delay her message by
-$t_\mathrm{batch}$ and re-introduce it at the same point in the path.
-If each message is either delayed once or not delayed, that gives us
-a latency of 3 to 6 hours for non-delayed messages, 6 to 9 hours for
-delayed messages, and a 6-hour anonymity set (unless the attacker knows
-that someone never sends or receives delayed messages, in which case the
-anonymity set for those users is still 3 hours; also, if the attacker
-owns the node Alice chooses, he may be able to speculate about which
-senders would choose to delay messages). We leave further investigation
-to future work.
-
 \subsection{Robustness of Message Delivery}
 
 Better entropy can be achieved by longer routes: e.g., if we
@@ -765,8 +752,7 @@
 message for a given size mixnet batch. We must be sure to consider
 robustness when comparing topologies.
 
-\begin{table} \caption{Percent of messages delivered vs number of crashed nodes}
-\label{table:expected-delivery}
+\begin{table}
 \renewcommand{\arraystretch}{1.3}
 \begin{center}
 \begin{tabular}[b]{| l | l || c | c | c | c |}
@@ -795,10 +781,11 @@
 \hline
 
 \end{tabular}
+\caption{Percent of messages delivered vs number of crashed nodes}
+\label{table:expected-delivery}
 \end{center}
 \end{table}
 
-
 Table~\ref{table:expected-delivery} shows that
 4x4 cascades and 4x4 stratified networks do roughly the same on
 average, but this is for very different reasons. The chance that the
@@ -832,8 +819,7 @@
 though, we can consider the effect on anonymity of simple node
 crash, since this is the most straightforward way to actively shrink
 anonymity. Also, as discussed in Section~\ref{subsec:blending}, there
-are techniques
-to detect and/or punish more selective attacks.
+are techniques to detect and/or deter more selective attacks.
 
 The threat model we consider here is an extension of the
 one in Section~\ref{sec:scenarios}. As before, the adversary can watch
@@ -873,7 +859,7 @@
 used at multiple layers, a message that reaches a crashed node at a
 given layer could not have been routed through that node at any earlier
 layer. Further, the attacker may gain additional information by crashing
-nodes only at certain hops! Even worse, as the ratio of input messages to
+nodes only at certain layers! Even worse, as the ratio of input messages to
 width of a layer shrinks, it becomes more likely that nodes at a given
 layer will only receive messages from a subset of nodes at the previous
 layer or, in any case, that the layer distribution will be unbalanced
@@ -881,7 +867,7 @@
 
 On the other hand, because nodes are recycled for use at multiple layers,
 it is much harder to plan an attack. If nodes can't crash and then come
-back between hops (perhaps it's hard to do undetectably), crashing an
+back in a single batch (perhaps it's hard to do undetectably), crashing an
 entry node to reduce the anonymity of a message at another node may cause
 that message to be blocked when it must traverse the crashed node at a
 later layer.  But it will generally be hard to predict when to come up
@@ -918,13 +904,13 @@
 
 We have shown synchronous free-routes can provide good anonymity, but we
 must also begin comparing this design to more traditional asynchronous
-free-route designs like Mixminion. Synchronous batching does away with
-the need for a replay cache (each message is labeled with its batch),
-removes partitioning attacks from key rotation, and generally provides
-clearer anonymity guarantees.
+free-route designs like Mixminion. Synchronous batching needs no replay
+cache (each message is labeled with its batch), weakens partitioning
+attacks from blending and key rotation, and generally provides clearer
+anonymity guarantees.
 
 On the other hand, because Mixminion's pool batching strategy
-spreads out message probability distributions between batches, our
+spreads out message distributions between batches, our
 design may fall more quickly to long-term statistical disclosure
 attacks~\cite{statistical-disclosure}. Our design is also less robust
 to transient failures: a late Mixminion message still arrives, whereas
@@ -933,9 +919,21 @@
 the lowest chance of being down in a hop period they are needed, but
 free-route networks lose proportionally fewer messages from a single
 down node.)  But our design can tell the user for sure whether his mail
-was delivered in the batch (and he can resend if not), whereas late
-messages in Mixminion always leave the user wondering if it will come
+was delivered in the batch (and he can resend if not), whereas Mixminion's
+unpredictability always leaves the user wondering if it will come
 out sometime.
+
+Like stop-and-go mixes~\cite{stop-and-go}, we may be able to get improved
+anonymity by allowing Alice to choose to delay her message at a given
+hop until the next batch. That is, the node would delay her message by
+$t_\mathrm{batch}$ and re-introduce it at the same point in the path.
+If each message is either delayed once or not delayed, that gives us
+a latency of 3 to 6 hours for non-delayed messages, 6 to 9 hours for
+delayed messages, and a 6-hour anonymity set (unless the attacker knows
+that someone never sends or receives delayed messages, in which case the
+anonymity set for those users is still 3 hours; also, if the attacker
+owns the node Alice chooses, he may be able to speculate about which
+senders would choose to delay messages).
 We leave further comparison to future work.
 
 \section{Summary}
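For intuition on the robustness comparison patched above (Table~\ref{table:expected-delivery} and the cascade vs. free-route discussion), here is a hypothetical back-of-the-envelope sketch. It is an assumption-laden simplification, not the paper's calculation: it takes 16 nodes, crashes c of them uniformly at random, partitions the cascade network into four disjoint 4-node cascades, and treats each free-route layer choice as independent and uniform:

```python
from math import comb

N = 16  # total nodes in the network

def cascade_delivery(c, width=4):
    # A message survives only if none of the `width` nodes in its
    # cascade are among the c crashed nodes (nodes crash at random).
    return comb(N - width, c) / comb(N, c)

def free_route_delivery(c, hops=4):
    # Free-route approximation: at each of `hops` layers the message
    # independently picks one of N nodes, surviving if it is up.
    return ((N - c) / N) ** hops

for c in range(4):
    print(c, round(cascade_delivery(c), 3), round(free_route_delivery(c), 3))
```

Under these assumptions a single crashed node delivers 12/16 = 75% for cascades but (15/16)^4 ≈ 77% for the free route, matching the observation that free-route networks lose proportionally fewer messages from a single down node.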
