[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[freehaven-cvs] patches and a few questions



Update of /home/freehaven/cvsroot/doc/batching-taxonomy
In directory moria.seul.org:/home/arma/work/freehaven/doc/batching-taxonomy

Modified Files:
	taxonomy.bib taxonomy.tex 
Log Message:
patches and a few questions


Index: taxonomy.bib
===================================================================
RCS file: /home/freehaven/cvsroot/doc/batching-taxonomy/taxonomy.bib,v
retrieving revision 1.3
retrieving revision 1.4
diff -u -d -r1.3 -r1.4
--- taxonomy.bib	15 Aug 2002 15:50:44 -0000	1.3
+++ taxonomy.bib	22 Aug 2002 08:47:26 -0000	1.4
@@ -397,7 +397,7 @@
 
 @Proceedings{FGJP98,
   title = {Comparison of Commitment Schemes Used in Mix-Mediated
-  ANonymous Communication for Preventinf Pool-Mode Attacks},
+  Anonymous Communication for Preventing Pool-Mode Attacks},
   year = 	 {1998},
   OPTkey = 	 {},
   booktitle = {3rd Australasian Conference on Information Security and Privacy (ACISP'98},

Index: taxonomy.tex
===================================================================
RCS file: /home/freehaven/cvsroot/doc/batching-taxonomy/taxonomy.tex,v
retrieving revision 1.18
retrieving revision 1.19
diff -u -d -r1.18 -r1.19
--- taxonomy.tex	15 Aug 2002 15:53:26 -0000	1.18
+++ taxonomy.tex	22 Aug 2002 08:47:26 -0000	1.19
@@ -32,12 +32,12 @@
 
 \begin{abstract}
 The literature contains a variety of different mixes, some of which
-have also been used in deployed anonymity systems. We explore their
+have been used in deployed anonymity systems. We explore their
 anonymity and message delay properties, and show how to mount active
 attacks against them by altering the traffic between the mixes. We
 show that if certain mixes are used, such attacks cannot destroy the
 anonymity of a particular message completely.  We work out the cost of
-these attacks in terms of the number of messages that the attacker
+these attacks in terms of the number of messages the attacker
 must insert into the network and the time he must spend.  We discuss
 advantages and disadvantages of these mixes and the settings in which
 their use is appropriate.  Finally, we look at dummy traffic and SG
@@ -77,7 +77,7 @@
 a mix can manipulate the batch of messages entering that mix so the
 only message unknown to him in the batch is the target message
 \cite{Cott94,babel}.  This manipulation may involve delaying or
-dropping all other incoming messages (a \emph{trickle} attack),
+dropping most or all other incoming messages (a \emph{trickle} attack),
 flooding the batch with attacker messages (a \emph{flooding}
 attack), or some combination of the two which we call the
 \emph{blending} attack.
@@ -94,7 +94,7 @@
 
 In the past, many anonymity systems have been concerned with
 protecting their users against passive adversaries, either global or
-local, ususally citing the $n-1$ or the blending attack as a
+local, usually citing the $n-1$ or the blending attack as a
 vulnerability, with (quoting \cite{Berthold00}) ``no general
 applicable method to prevent this attack''. In this paper we discuss
 ways of reducing this vulnerability.
@@ -109,22 +109,16 @@
 of building fast hardware to insert or delay messages. We further
 assume our attacker can send messages from many different source
 addresses; indeed, infrastructures to ensure sender authentication
-have proved difficult to build. Thus, the threat of the active
+have proved difficult to build.
+%cite?
+Thus, the threat of the active
 attacker comes from him having the power to log and manipulate
 messages on the links.
 
 Note that the global active attacker can be viewed as a combination of
 two separate attackers: one who can only insert messages (global
 inserting attacker) and one who can only delay messages (global
-delaying attacker).\footnote{In fact, in theory only insertion of
-messages is necessary given assumptions such as predictable bounds on
-network link capacities. This is because the global inserting
-adversary can insert arbitrarily many attacker messages between any
-two good messages.  This reduction also assumes that it is never
-necessary to cause a reordering of legitimate messages.  However, it
-is more realistic to model attacks as a combination of delays and
-insertions, and we will not make further mention of this potential
-reduction.}
+delaying attacker).
 
 %\framebox{\parbox{11cm}{Paul says:\\
 %    Note the footnote.}}
@@ -183,7 +177,7 @@
 \end{itemize}
 
 Although it may appear that the ``vulnerability'' of the mixes goes up
-as you go down the list, this is not necessarily the case -- the cost
+as we go down the list, this is not necessarily the case -- the cost
 of the attack is important to consider as well. For example, suppose
 the anonymity
 of a message going through Mix 1 under active attack is proportional
@@ -220,7 +214,7 @@
 \item The mixes have limited physical memory and so can only contain a
 finite number of messages.
 
-\item Mixes deal with message replays.
+\item Mixes prevent message replays.
 
 %  \framebox{\parbox{11cm}{Paul says:\\
 %      This seemed implicit in the paper, and I think was explicit in
@@ -237,6 +231,9 @@
 
 \item When we talk about anonymity, we mean sender anonymity. Similar
 ideas can be applied to receiver anonymity as well.
+%\item When we talk about anonymity, we mean sender unlinkability ---
+%given a message the adversary cannot determine its sender. Similar ideas
+%can be applied to receiver unlinkability as well.
 
 \end{itemize}
 
@@ -262,7 +259,7 @@
 The attack proceeds as outlined in Section \ref{attacks} and is
 usually referred to as the $n-1$ or the flooding attack.  It takes
 $\epsilon$ time as inserting the required messages and the (usually
-two) firings of the mix mix take a constant time. The attack takes a
+two) firings of the mix take a constant time. The attack takes a
 minimum of $n-1$ (the attacker waits until the mix is empty, forwards
 the target message to the mix along with $n-1$ attacker messages to
 flush it out) and a maximum of $2n-2$ messages (the attacker does not
@@ -272,20 +269,25 @@
 The above attack seemingly requires both the global delaying attacker
 capabilities and the global inserting attacker capabilities. However,
 this is not necessarily so. If the global inserting attacker is able
-to insert $n-1$ of its own messages between each of the ``good''
+to insert $n-1$ of his own messages between each of the ``good''
 messages going into a mix, he effectively mounts an $n-1$ attack on
-each of them.
+each of them.\footnote{In fact, in theory only insertion of messages
+is necessary, assuming there are predictable bounds on network link
+capacities, and that it is never necessary to cause a reordering of
+legitimate messages.  However, it is more realistic to model attacks
+as a combination of delays and insertions, and we will not pursue this
+potential reduction.}
 
 Another possible attack scenario is when an attacker owns a mix in a
 free route mix network. He has the capability to delay all the
 messages going through this mix until the conditions are ``just
-right'', i.e. the mix the target message is headed for contains $n-1$
+right'', i.e. the next mix contains $n-1$
 of the attacker's messages and will fire as soon as the target message
 reaches it. Thus, he has the capability to attack the next mix in the
 route of each of the messages going through the mix he controls. (As
 a result, if the attacker own all the mixes but one,
 he is able to compromise anonymity of any
-message).
+message.)
 
 %(AAS comment: Verification comments should go here!)
 \subsection{Timed Mix}
@@ -298,7 +300,7 @@
 
 \paragraph*{Message Delay:} 
 The minimum delay is $\epsilon$, which occurs if the message arrives
-just before the mix is due to be fire. The maximum delay is
+just before the mix is due to fire. The maximum delay is
 $t-\epsilon$, which is the case when the message arrives just
 after. The mean delay is $\frac{t}{2}$.
 
@@ -332,7 +334,7 @@
 \paragraph*{Blending Attack Behaviour:} 
 The attack is exact and certain and proceeds as follows: The adversary
 delays the target message for a maximum of time $t$ until the mix
-firess. He then forwards in the target message and blocks all other
+fires. He then delivers the target message and blocks all other
 incoming messages.  After another $t$ seconds the mix fires again
 producing the target message on its own. This takes a maximum of
 $2t-\epsilon$ and a minimum of $\epsilon$ seconds (when the mix was
@@ -352,7 +354,7 @@
 \subsection{Threshold Or Timed Mix}
 
 \paragraph*{Parameters:} 
-threshold, $n$; period, $t$.
+$n$, threshold; $t$, period.
 
 \paragraph*{Flushing Algorithm:}
 The mix fires every $t$ seconds or when $n$ messages accumulate in
@@ -380,7 +382,7 @@
 \subsection{Threshold And Timed Mix}
 
 \paragraph{Parameters:}
-threshold, $n$; period, $t$.
+$n$, threshold; $t$, period.
 
 \paragraph*{Flushing Algorithm:}
 A mix fires every $t$ seconds but only when at least $n$ messages have
@@ -397,7 +399,7 @@
 \paragraph{Blending Attack Behaviour:} 
 The $n-1$ attack is still exact and uses a combination of the
 attacks on the threshold and the timed mixes. It takes a maximum of
-$2t-\epsilon$ and a minimum of $\epsilon$ seconds, a maximum of
+$2t-\epsilon$ and a minimum of $\epsilon$ seconds; and a maximum of
 $2(n-1)$ and a minimum of $n-1$ messages.
 
 \paragraph{Adversaries:}
@@ -431,31 +433,6 @@
 %messages between each ``good'' message. If the ``good'' messages
 %arrive at a rate of something like 1 every 5 seconds, this is entirely
 %feasible even if n is large (it's easy to send messages very fast).
-%
-%[Actually, it's easily possibly to delay a message if you're the
-% mix sending it to the next hop. In fact, it's very plausible since an
-% adversary owning several mixes may be tracking a message and its next
-% hop would be an honest node. So the adversary would wait to send it
-% until the conditions are just right. -RRD]
-%[That delaying operation is harder to do if the mix is somehow
-% accountable; eg, if it's a threshold mix and people are verifying
-% its output. So the pool (and other complex variants you use below) makes
-% it less (not?) possible to confirm that the mix has misbehaved. -RRD]
-%
-%[Link-level encryption offers some limited protection against traffic
-%analysis. Encrypted links between honest nodes prevent an adversary from
-%recognizing even his own messages; but without link padding, he is still
-%able to measure how much traffic is being transmitted. -RRD]
-%
-%[Aside: it seems a timed mix is an awful thing to use in a cascade,
-% since the adversary could be the preceding hop. Actually, more
-% generally, cascades are very vulnerable to n-1 attacks of all sorts,
-% since the previous hop can just rig the batch. Cascade protocols must
-% take this into account, and that's really hard to do (Come to think
-% of it, casc-rep doesn't defend adequately against this. Detecting the
-% attack afterwards is nice because it prevents wholescale anonymity
-% compromise, but it still doesn't cut it for that poor message.) -RRD]
-
 
 \section{Pool Mixes}
 
@@ -467,6 +444,7 @@
 \subsection{Threshold Pool Mix}
 
 \paragraph*{Parameters:} $n$, threshold; $f$, pool.
+% why did we choose the variable $f$? -RRD
 
 \paragraph*{Flushing Algorithm:} 
 The mix fires when $n+f$ messages accumulate in the mix. A pool of $f$
@@ -474,8 +452,8 @@
 in the mix. The other $n$ are forwarded on.
 
 \paragraph*{Message Delay:} 
-The minimum delay is $\epsilon$, the maximum delay is infinite --- if
-no messages arrive, the mix does not fire. Note, however, that even if
+The minimum delay is $\epsilon$, the maximum delay is infinite --- until
+$n$ new messages arrive, the mix does not fire. Note, however, that even if
 there is a constant flow of messages and the mix is firing
 periodically, there is still a small but non-zero probability of the
 message remaining in the mix for an arbitrarily long time. The mean
@@ -491,7 +469,12 @@
 anonymity of the message going through a mix depends on the entire
 history of events that have happened in this mix. Thus, we achieve
 the maximum anonymity $A_{max}$ when all of the messages that have ever
-passed through the mix came from different senders.
+passed through the mix come from different senders.
+% What about making the threshold a number of distinct senders? Would
+% help against a non-global adversary. Would totally break in a cascade
+% environment. Would have side-effects even in free-route systems, like
+% biasing mixes towards having more traffic from users and less from other
+% mixes? Hrm. Complex. Probably not worth it.
 Serjantov and Danezis carried out this analysis in \cite{Serj02}.
 
 \[
@@ -522,20 +505,23 @@
 The attacker delays the target message, fills up the pool mix and
 flushes it. If $j$ of the attacker messages don't come out, he knows that
 $f-j$ good messages remain inside the pool mix. He now delays all the
-other incoming good messages and tries to flush them out of the mix
-(he can detect all good messages coming out). Of course, this takes
+other incoming good messages and tries to flush the remaining good
+messages out of the mix
+(he can distinguish them as they come out). Of course, this takes
 more messages than to flush out a threshold mix (see below for
 details), but the attack is still exact (if/when the attacker succeeds
-in getting all the messages to leave the mix, he knows it). Once the
-mix fires, he just sends in the target message, and flushes the mix as
+in getting all the messages to leave the mix, he knows it). When all
+good messages have been flushed, he just sends in the target message,
+and flushes the mix as
 before until the target message comes out. Because the attacker is not
 guaranteed to flush out the mix completely or to flush out the target
 message, the attack is uncertain.
 
-Note that this attack crucially depends on the attacker being able to
-delay messages 
-%for a some time
-and/or send messages at very high rates.
+% redundant? removing. -rd
+%Note that this attack crucially depends on the attacker being able to
+%delay messages 
+%%for a some time
+%and/or send messages at very high rates.
 
 \paragraph*{Analysis:} 
 Flushing out the mix: after $k$ rounds, the probability that a
@@ -576,7 +562,7 @@
 \paragraph*{Flushing Algorithm:} 
 The mix fires every $t$ seconds. A pool of $f$ messages chosen
 uniformly at random is retained in the mix. The others are forwarded
-on. If fewer then $f$ messages are in the mix, it does not fire.
+on. If $f$ or fewer messages are in the mix, it does not fire.
 (Thus, strictly speaking, this is a threshold and timed pool mix, for
 which the threshold is $0$.)
 
@@ -603,20 +589,25 @@
 case, a mix's history includes the senders of each message and also
 the number of messages that got mixed together in each batch (and,
 potentially, a whole host of other features). Of course, in practice,
-a record of the only last few rounds gives a good approximation.
+a record of only the last few rounds gives a good approximation.
 
 The minimum anonymity of a timed pool mix is clearly smaller
-than that of the threshold pool mix. We note that in pool mixes the
+than that of the threshold pool mix. We stress that in pool mixes the
 bulk of the anonymity comes from mixing the target message with the
 batch of incoming messages, not from mixing it with the messages in
 the pool. Because the timed mix does not always have this batch of
 messages for mixing the target message, its minimum anonymity should
 be considered to be very much worse than that of the threshold pool
 mix.
+% can we talk about this a bit more? i want some better intuition
+% here. it will let us answer questions like "for the cottrell mix,
+% when you have more than $min$ messages but less than $min*frac$,
+% do you fire, even tho you'd end up with less than $min$ in the
+% pool? or do you fire but always maintain $min$ in the pool?"
 
 \paragraph*{Blending Attack Behaviour:}
-Two flavours of blending attack are possible on this mix.  Because the
-adversary has no control over when the mix fires, he can add many or
+Two flavours of blending attack are possible on this mix.  The adversary
+has no control over when the mix fires, but he can choose to add many or
 just a few messages to each batch. (He prevents all other messages
 from reaching the mix).
 
@@ -626,7 +617,10 @@
 messages do not come out with the flush. This approach is very
 inefficient in terms of the number of messages added by the attacker
 ($b$) since the probability of flushing the mix by this method is
-$\frac{f^2}{b+f}$ (supposing there were $f$ good messages in the mix
+$\frac{f^2}{b+f}$ 
+% can we put a more obvious version of this fraction, followed by an
+% equal sign, followed by the above?
+(supposing there were $f$ good messages in the mix
 initially).\footnote{As previously mentioned, in practice there is an
 upper limit on $b$ due to the finite memory capacity of the mix.}
 However, this aims to flush the good messages out of the mix in one
@@ -649,7 +643,7 @@
 
 \subsection{Cottrell Mix (Timed Dynamic-Pool Mix)} 
 \paragraph*{Parameters:} 
-$t$, period; $min$, pool; $frac$, fraction; ($0$ threshold).
+$t$, period; $min$, pool; $frac$, fraction; ($0$, threshold).
 
 \paragraph*{Flushing Algorithm:} 
 The mix fires every $t$ seconds.  A fraction $frac$ of randomly
@@ -708,14 +702,11 @@
 Finally, the number of messages inside the pool mix may be arbitrarily
 high and cannot be reduced to below $min$ in one round.  Therefore, if
 we wish to send a message
-%and decrease the probability of it
-%getting tracked,
-that is harder to track, we should send it at the time of higher
+that is harder to track, we should send it at a time of higher
 traffic --- thereby increasing the cost (in terms of messages or time)
-of attempted attacks on it.\footnote{Of course this assumes that we
-  can determine times of higher legitimate traffic. An attacker
-  capable of arbitrary message insertions, as we have been assuming,
-  will make that more complicated.}
+of attempted attacks on it.\footnote{Unfortunately, an attacker capable
+of arbitrary message insertions, as we have been assuming, will make it
+hard to determine times of higher legitimate traffic.}
 %\footnote{At times of high traffic, the mix will contain
 %many more messages than $min$, and to flush it free of good messages
 %will require either more time or more messages from the attacker.}.
@@ -730,18 +721,15 @@
 $\frac{6}{10}$ requires 5 rounds for the expected number of good
 messages remaining in the mix to fall below 1. If the period is 5
 minutes, the overall flushing time is 25 mins. 500 messages will be
-used in the flushing part of the attack. 
+used in the first part of the attack. % and how many to flush the
+% target message out?
 \end{exa}
 
-%FIXME Andrei, you should actually write the example :)
-% I think this is enough. If you think more is required, either fill
-% in or kill it.
-
 This example shows that the protection this mix provides against
 blending attacks is still weak.  Indeed, the probability of a
 successful attack is proportional to $\frac{1}{frac^k}$ where $k$ is
 the number of rounds. Ideally we would like a mix for which the anonymity
-does not go to $0$ as the cost of the attack goes up.
+does not go to $0$ even as the cost of the attack goes up.
 
 We would also be happier with a mix where the probability of a
 successful attack is proportional to $\frac{1}{P(k)}$ where $P$ is a
@@ -787,14 +775,15 @@
 rid of.  This is easily possible by choosing the number of messages to
 flush from a binomial probability distribution (flipping a biased coin
 once for each message to decide whether it is to be flushed or
-not). The weight of a coin is to be chosen by an appropriate function
+not). The weight of the coin is chosen by an appropriate function
 of the number of messages in the mix. Thus, it would take the attacker
 several rounds to establish with any confidence the number of
-good messages in the mix initially. Of course, he can never be exactly
-sure of that figure. Thus, introducing randomness is a promising
-direction, but note that this would not increase the cost of the
-attack significantly (if at all). Thus, we shall not pursue it
-further.  
+good messages in the mix initially, and he can never be exactly
+sure of that figure.
+
+While introducing randomness is a promising direction, note that it would
+not increase the cost of the attack significantly (if at all). Thus,
+we shall not pursue it further.
 
 %\subsection{Alternative Active Attack Scenario}
 %
@@ -845,11 +834,13 @@
 and witness scheme of \cite{mix-acc} works because a mix loses
 reputation if it hasn't obtained a receipt from the next hop within
 the allowed timeframe. All of these systems are designed to ensure
-that messages will leave each mix predictably. Yet more verification
+that messages will leave each mix predictably.
+%roger -- remember to integrate this paragraph better
+Yet more verification
 schemes which work in a system with pool mixes are described in
 \cite{Jer2000,FGJP98}. There the authors propose a commitment scheme
-which ensures that the mix comits to a decision before he receives the
-target message, and its actions are verifiable by the sender of the
+which ensures that the mix commits to a decision before he receives the
+target message, and his actions are verifiable by the sender of the
 target message.
 
 In some topologies, such as free route mix networks, a widespread
@@ -864,7 +855,7 @@
 detect an attack afterwards, e.g. by checking for successful delivery
 of test messages at the end of the cascade \cite{casc-rep}, can help
 detect misbehaving mixes --- but they still allow the adversary to
-uncover a few messages before he has noticed.  As we design new
+uncover a few messages before he is noticed.  As we design new
 protocols and batching approaches, we must consider this tradeoff
 between protection from blending attacks and verifiability.
 
@@ -894,12 +885,15 @@
 pass through more mixes) until he can detect that one message has gone
 to a recipient. That is the target message since all the dummy
 messages end in a mix. This attack is exact, but much more expensive
-than anything we have described previously. The situation can be
-further improved by making the mix choose its route length for the
+than anything we have described previously. We can make the adversary's
+job harder by making the mix choose its route length for the
 dummies from a uniform distribution rather than always choosing
-$5$. We could get further protection by allowing mixes to send cover
+$5$. We can get further protection by allowing mixes to send cover
 traffic to the users (possibly in some format that can be
-automatically discarded).  However, providing complete protection with
+automatically discarded
+% no -- because then we make them distinguishable by the adversary
+% too. this problem is very very hard.
+).  However, providing complete protection with
 this approach is very hard. Each mix must know all the users in the
 system: if a mix only delivers dummies to a subset of the users, an
 adversary can distinguish with better than even probability between a
@@ -914,6 +908,12 @@
 scale of the attack without changing the properties. For example, even
 with the Mixmaster cover traffic policy, a threshold mix network can
 be attacked in $\epsilon$ time (but a very large number of messages).
+% What if we always filled the outgoing batch up to some threshold?
+% Then the issue wouldn't be orthogonal, right? Hrm. Padding question;
+% hard. But I still don't buy that dummies are completely orthogonal
+% to the blending attack properties, since mixes can generate dummies
+% and thus make it harder to attack them (and it's easier to generate
+% dummies in some mix models than others).
 
 \subsection{Making Messages Unrecognizable}
 
@@ -942,18 +942,15 @@
 most of these have not been rigorously analysed or evaluated
 \cite{Cott94,babel,Jer2000,Kesdogan98}.
 
-We will concentrate our attention on one particular proposal -- Stop-
-and-Go Mixes \cite{Kesdogan98}. The authors outline several techniques
-that
-could be used to protect mix systems against active attacks %similar to
-like
-the ones considered in this work.
+We will concentrate our attention on one particular proposal ---
+Stop-and-Go Mixes \cite{Kesdogan98}. The authors outline several
+techniques that could be used to protect mix systems against active
+attacks like the ones considered in this work.
 
 The first is a scheme for an anonymity system based on a mix cascade
-that requires authentication. This is based on a stronger set of
-assumptions -- that the attacker cannot mount a distributed $n-1$
+that requires authentication. This is based on a stronger 
+assumption --- that the attacker cannot mount a distributed $n-1$
 attack where the hostile messages come from different users. 
-%It cannot be used in mix networks.
 Unfortunately, it seems very hard to reuse this idea in free route
 networks, since there is no centralized input location at which we can do
 authentication; compromised mixes could claim to have done authentication
@@ -966,7 +963,7 @@
 discarded. Therefore the attacker's ability to delay messages is much
 more limited, so active attacks are harder.
 
-The attack is uncertain, the authors argue that the probability of
+The attack is uncertain, and the authors argue that the probability of
 executing the attack successfully is very low.
 
 However, the scheme relies on the users being able to accurately
@@ -986,13 +983,13 @@
 the system have not yet been worked out.
 
 \section{Conclusion}
-In this paper we have presented a set of mixes and examined their
+We present a set of mixes and examine their
 anonymity, message delay and blending attack properties. In
-particular, we suggested that it is useful to partition the mixes into
-certain categories of vulnerability but emphasized that the cost and
+particular, we suggest that it is useful to partition the mixes into
+certain categories of vulnerability but emphasize that the cost and
 worst case attack scenario are important qualities to consider.
 
-Although we have shown the mixes to be rather vulnerable to active
+Although we show the mixes to be rather vulnerable to active
 attacks, some avenues still have hope. The first of these is
 verification schemes.
 

***********************************************************************
To unsubscribe, send an e-mail to majordomo@seul.org with
unsubscribe freehaven-cvs       in the body. http://freehaven.net/