[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[freehaven-cvs] fixes and corrections up through section 5



Update of /home/freehaven/cvsroot/doc/batching-taxonomy
In directory moria.seul.org:/home/arma/work/freehaven/doc/batching-taxonomy

Modified Files:
	taxonomy.bib taxonomy.tex 
Log Message:
fixes and corrections up through section 5



Index: taxonomy.bib
===================================================================
RCS file: /home/freehaven/cvsroot/doc/batching-taxonomy/taxonomy.bib,v
retrieving revision 1.1
retrieving revision 1.2
diff -u -d -r1.1 -r1.2
--- taxonomy.bib	6 May 2002 22:11:01 -0000	1.1
+++ taxonomy.bib	7 May 2002 07:55:42 -0000	1.2
@@ -1,6 +1,29 @@
 % Note that this file requires a \usepackage{url} which seems to be
 % one of the standard latex packages
 
+@InProceedings{flash-mix,
+   author =      {Markus Jakobsson},
+   title =       {Flash {M}ixing},
+   booktitle =   {Principles of Distributed Computing - {PODC} '99},
+   year =        {1999},
+   publisher =   {ACM},
+   note =        {\url{http://citeseer.nj.nec.com/jakobsson99flash.html}},
+}
+
+@Misc{mix-acc,
+   author =      {Roger Dingledine and Michael J. Freedman and David Hopwood and David Molnar},
+   title =       {A {R}eputation {S}ystem to {I}ncrease {MIX}-net {R}eliability},
+   howpublished = {Proceedings of the Information Hiding Workshop 2001},
+   note =        {\url{http://www.freehaven.net/papers.html}},
+}
+
+@Misc{casc-rep,
+   author =      {Roger Dingledine and Paul Syverson},
+   title =       {Reliable {MIX} {C}ascade {N}etworks through {R}eputation},
+   howpublished = {Proceedings of Financial Cryptography 2002},
+   note =        {\newline \url{http://www.freehaven.net/papers.html}},
+}
+
 @article{chaum81untraceable,
     author = "David Chaum",
     title = "Untraceable electronic mail, return addresses and digital pseudonyms",

Index: taxonomy.tex
===================================================================
RCS file: /home/freehaven/cvsroot/doc/batching-taxonomy/taxonomy.tex,v
retrieving revision 1.3
retrieving revision 1.4
diff -u -d -r1.3 -r1.4
--- taxonomy.tex	7 May 2002 05:56:06 -0000	1.3
+++ taxonomy.tex	7 May 2002 07:55:42 -0000	1.4
@@ -538,7 +538,8 @@
 Intuitively, the attack ceases to be certain. It proceeds as follows:
 
 The attacker delays the target message, fills up the pool mix and
-flushes it. If $j$ of the attacker messages came out, he knows that
+flushes it. If $j$ of the attacker messages don't come out, he knows that
+% critical fix -- i added a "don't". true? -RRD
 $f-j$ good messages remain inside the pool mix. He now delays all
 the other incoming good messages and tries to flush them out of the
 mix (he can detect all good messages coming out). Of course, this
@@ -551,8 +552,9 @@
 flush out the target message, the attack is uncertain.
 
 Note that this attack crucially depends on the attacker being able to
-delay messages for a some time and/or send messages at very high
-rates.
+delay messages 
+%for a some time
+and/or send messages at very high rates.
 
 \paragraph*{Analysis:} 
 Flushing out the mix: after $k$ rounds, the probability that a
@@ -576,7 +578,7 @@
 \begin{exa}
 Consider a pool mix with a threshold of 100 and a pool of 60 messages.
 The average delay of a message through this mix is 1.6
-rounds. Assuming the attacker starts off with an mix with 60 messages,
+rounds. Assuming the attacker targets a mix with 60 messages,
 the expected number of good messages remaining in the mix falls below
 1 after 5 rounds or 500 messages.
 \end{exa}
@@ -586,12 +588,12 @@
 messages at the cost of increasing the average delay of a message
 passing through the mix.
 
-\subsection{Timed Pool}
+\subsection{Timed Pool Mix}
 
 \paragraph*{Parameters:} $t$, period; $f$, pool; ($f$, threshold).
 
 \paragraph*{Flushing Algorithm:} 
-A mix is flushed every $t$ seconds. A pool of $f$ messages chosen
+The mix is flushed every $t$ seconds. A pool of $f$ messages chosen
 uniformly at random is retained in the mix. The others are forwarded
 on. If fewer then $f$ messages are in the mix, it does not fire.
 (Thus, strictly speaking, this is a threshold and timed pool mix, for
@@ -599,88 +601,90 @@
 
 \paragraph*{Message Delay:} 
 The minimum message delay is $\epsilon$, and there is no maximum delay
-(if no messages arrive, the messages which are in the pool will
-forever remain in the mix). In addition to that, just like in the case
-of the threshold pool mix, there is a small but non-zero probability
-that any given message remain in the mix for an arbitrarily long time
-even if there is a flow of messages through the mix.
+(if no messages arrive, the messages that are in the pool will
+never leave the mix). Like in the case of the threshold pool mix, there
+is again a small but non-zero probability
+that any given message could remain in the mix for an arbitrarily
+long time even if there are messages flowing through the mix.
 
 
 \paragraph*{Anonymity:}
-The timed nature of this mix allows arbitrarily large number (only
+The timed nature of this mix allows an arbitrarily large number (only
 limited by the memory capacity of the mix) of messages to be mixed
 together.  If we assume a constant rate $r$ of message arrival, the
 anonymity provided by this mix can be calculated in just the same way
-as for the threshold case. However, we will not pursue this further
-here.
+as for the threshold case; but we leave this calculation for future work.
 
 Alternatively, the anonymity of a message going through this mix can
-be calculated from the history of the mix -- a record of the operation
-of the mix ever since it was started which includes, just like in the
-threshold case, the senders of each message and also the number of
-messages which got mixed together (and, potentially, a whole host of
-other features). Of course, in practice, only a record of the last few
+be calculated from the history of the mix --- a record of the operation
+of the mix ever since it was started. Like in the threshold case, a
+mix's history includes the senders of each message and also the number
+of messages which got mixed together in each batch (and, potentially,
+a whole host of
+other features). Of course, in practice, a record of the only last few
 rounds gives a good approximation.
 
 Assuming a threshold pool mix has a threshold larger
 than the pool size, the minimum anonymity of a timed pool mix is
 clearly smaller than that for the threshold pool mix. We note that in
-the pool mixes it is the mixing of the target message with the batch
-of incoming messages that provides the bulk of the anonymity, not the
-mixing of it with the messages in the pool. Because the timed mix does
-not always have this batch of messages that the target message is
-mixed with, its minimum anonymity should be considered to be very much
+pool mixes the bulk of the anonymity comes from mixing the target message
+with the batch of incoming messages, not from
+mixing it with the messages in the pool. Because the timed mix does
+not always have this batch of messages for mixing the target message,
+its minimum anonymity should be considered to be very much
 worse than that of the threshold pool mix.
 
 \paragraph*{Blending Attack Behaviour:}
-There are two flavours of the blending attack possible on this mix.
-Because the adversary has no control of when the mix flushes, he can
-add as many or as few messages to each flush, (while preventing all
-the other good messages from reaching the mix).
+Two flavours of blending attack are possible on this mix.
+Because the adversary has no control over when the mix flushes, he can
+add many or just a few messages to each flush. (He prevents all
+other messages from reaching the mix).
 
-The former involves the attacker adding as many messages as possible
-in one round trying to maximize the probability that after one flush
-only the attacker messages will be left in the pool\footnote{Note that
-  the attacker knows he has succeeded if $f$ of his messages did not
-  come out after the flush.}. This is very inefficient in terms of the
-number of messages added by the attacker ($b$) as the probability of
+By adding as many messages as possible in one round, he maximizes
+the probability that after one flush
+only the attacker messages will be left in the pool. He knows
+that he has succeeded if $f$ of his messages do not come out with the
+flush. This approach is very inefficient in terms of the
+number of messages added by the attacker ($b$) since the probability of
 flushing the mix by this method is $\frac{f^2}{b+f}$ (supposing there
 were $f$ good messages in the mix initially).\footnote{As previously
   mentioned, in practice there is an upper limit on $b$ due to the
   finite memory capacity of the mix.}  However, this aims to flush the
 good messages out of the mix in one round, and therefore a maximum of
-$t$ seconds. This allows the entire attack to be executed in less than
+$t$ seconds. Thus the entire attack can be executed in less than
 $2t$ seconds with an arbitrarily high probability of success.
 
-The latter involves the attacker adding just one message to each round
+Alternatively, the attacker can add just one message to each round
 for many rounds. This is very efficient in the number of
-messages\footnote{Indeed, this is a much more efficient attack than
-  the one on the threshold pool mix in terms of messages, but clearly
-  not in terms of time.} $b$ -- the probability of flushing the mix is
-$f\left(\frac{f}{f+1}\right)^b$, but delays all the messages to the
+messages $b$ (indeed, this attack is much more efficient than
+the one on the threshold pool mix in terms of messages, but clearly
+not in terms of time). The probability of flushing the mix is
+$f\left(\frac{f}{f+1}\right)^b$, but this approach delays all the messages
+to the
 mix by $tb$ seconds, which is highly observable to the users of the
 anonymity system.
 
-Depending on whether the attacker is able to send out many messages or
-delay the messages to a mix for a long time, he may choose either or a
-combination of these options.
-
-%Read through and corrected up to here.
+The attacker can choose either or a combination of these approaches,
+based on whether he can send out many messages or delay messages to a
+mix for a long time.
 
 \subsection{Cottrell Mix (Timed Dynamic-Pool Mix)} 
 \paragraph*{Parameters:} $t$, period; $min$, pool; $frac$, fraction; ($min$ threshold).
 
 \paragraph*{Flushing Algorithm:} 
-A mix gets flushed every $t$ seconds.  Random fraction $frac$ of
-messages inside the mix at the time remain in the mix. But never less
-than $min$.
+The mix gets flushed every $t$ seconds.  A fraction $frac$ of randomly
+chosen messages inside the mix at the time remain in the mix. But if
+there are $min$ or fewer messages in the pool, it outputs no messages.
 
-Note that if there are $min$ messages in the mix, a flush does not cause
-any messages to be put out onto the network. But it is still a logical
-flush. This mix has been used in the Mixmaster system for a number of
+%Note that if there are $min$ messages in the mix, a flush does not cause
+%any messages to be put out onto the network. But it is still a logical
+%flush.
+% commented out for now. if we're using this to introduce dummy traffic
+% later, we can say it then -RRD
+This mix has been used in the Mixmaster remailer system for a number of
 years, but has not been described or evaluated before.
 
-\paragraph*{Message Delay:} Just as in the other pool mixes, the
+\paragraph*{Message Delay:} Like the other pool mixes, the
 minimum delay is $\epsilon$ and there is no upper limit on the delay.
 The mean delay depends on the future rate of arrival of messages into
 the mix. If the rate is high enough that exactly $n*frac$ messages are
@@ -688,32 +692,37 @@
 This is in general higher than that of the timed pool mix.
 
 \paragraph*{Anonymity:}
-The minimum and maximum anonymity of this mix are identical to that of
-the timed pool mix.  Similarly, a log of the activity of the mix would
-allow us to calculate the anonymity of a message passing through the
-mix (although the calculation would be slightly different).  Also, we
+The Cottrell mix has identical minimum and maximum anonymity properties
+compared to a timed pool mix. We could similarly use
+a log of the mix's activity
+to calculate the anonymity of a message passing through it
+(although the calculation would be slightly different).  Also, we
 note that the anonymity provided by the mix would be higher than that
 provided by a timed mix (as the number of messages in the mix goes up,
 $frac$ keeps the chance of the message remaining in the mix constant,
-while it decreases in the case of the timed pool mix).
+whereas it decreases in the case of the timed pool mix).
 
 \paragraph*{Blending Attack Behaviour:}
 The introduction of the dynamic parameter $frac$ has several
-consequences (as compared to the timed pool mix).
+new consequences compared to the timed pool mix.
 
 Firstly, the maximum probability of flushing the mix of good messages
-in one flush is $frac$, therefore there is no possibility of flushing
-the mix with high probability in one flush as was possible in the
-timed pool mix case.
+in one flush is $frac$. Therefore there is no possibility of flushing
+the mix with high probability in one flush: the first of the two blending
+attacks on timed pool mixes is blocked.
 
 Secondly, the attacker has to find out how many messages are in the
 mix.  Of course, the number of good messages in the mix is easy to
 calculate from the number of messages which come out. 
+%FIXME can you elaborate more on this, andrei?
 
-Finally, the number of messages inside the pool mix can be
+Finally, the number of messages inside the pool mix may be
 arbitrarily high and cannot be reduced to below $min$ in one round.
-Thus, if we wish to send a message and decrease the probability of it
-getting tracked, we should send it at the time of higher traffic
+Therefore, if we wish to send a message 
+%and decrease the probability of it
+%getting tracked,
+that is harder to track,
+we should send it at the time of higher traffic ---
 thereby increasing the cost (in terms of messages or time) of attempted
 attacks on it%\footnote{At times of high traffic, the mix will contain
 %many more messages than $min$, and to flush it free of good messages
@@ -724,7 +733,7 @@
   insertions, as we have been assuming, will make that more
   complicated.}
 
-Thus, the Cottrell mix requires the attacker to be able to delay all
+Thus the Cottrell mix requires the attacker to be able to delay all
 the traffic to the mix for a substantial number of rounds, and
 therefore for a substantial time.
 
@@ -732,31 +741,51 @@
 %Example demonstrating above
 %\end{exa}
 
+%FIXME Andrei, you should actually write the example :)
+
 This example shows that the protection this mix provides against
 blending attacks is still weak.  Indeed, the probability of a
 successful attack is proportional to $\frac{1}{frac^k}$ where $k$ is
-the number of rounds. Ideally we would like a mix the anonymity of
-which does not go to $0$ as the cost of the attack goes up resistant
-mix.
+the number of rounds. Ideally we would like a mix for which the anonymity
+does not go to $0$ as the cost of the attack goes up.
 
 We would also be happier with a mix where the probability of a
 successful attack is proportional to $\frac{1}{P(k)}$ where $P$ is a
 polynomial of a small degree.
 
-However, we note from the example above that we can attempt to design
-verifiability schemes to detect very long delays of messages through
-the mix and act appropriately. In the case of a system which needs to
-guarantee a certain level of anonymity, all messages arising from a
-mix which has not responded for a long time would be discarded thereby
-not giving away their destination to the attacker. This, in our
-opinion could lead to a very much stronger system.
+%However, we note from the example above that we can attempt to design
+%verifiability schemes to detect very long delays of messages through
+%the mix and act appropriately. In the case of a system which needs to
+%guarantee a certain level of anonymity, all messages arising from a
+%mix which has not responded for a long time would be discarded thereby
+%not giving away their destination to the attacker. This, in our
+%opinion could lead to a very much stronger system.
 
-\section{More on Mixes}
+We note that the minimum anonymity of the Cottrell mix is still very
+much worse than that of the threshold pool mix (where the threshold
+size is much larger than the pool size). Therefore, we propose adding
+a threshold to a Cottrell mix to improve its minimum anonymity
+properties. We believe
+%We shall not go into any more detail other than to say
+%that it is intuitively clear that 
+this modification will not have
+adverse effects on the blending attack properties, although will
+slightly decrease the efficiency of this mix.
 
-All of the mixes presented above were vulnerable to exact attacks. In
-other words, the attacker was always able to find out if his attack
-was successful, and possibly expend more resources to try again if it
-was not.
+%But there is more fun to be had: we can
+%have not just the number, but also the *fraction* of the messages sent
+%out can depend on the number of messages inside the mix. This smells
+%like differentiation. Furthermore, we can have a non-linear dependence
+%of these two variables.
+
+\section{More on resisting blending attacks}
+
+\subsection{Resisting exact attacks}
+
+All of the mixes presented above are vulnerable to exact attacks. In
+other words, the attacker is always able to find out if his attack
+is successful, and possibly expend more resources to try again if
+not.
 
 However, we could do better. If we make it hard to determine the
 number of good messages in the mix, it will be harder for the attacker
@@ -766,30 +795,51 @@
 once for each message to decide whether it is to be flushed or
 not). The weight of a coin is to be chosen by an appropriate function
 of the number of messages in the mix. Thus, it would take the attacker
-some number of rounds to establish with any certainty the number of
+several rounds to establish with any confidence the number of
 good messages in the mix initially. Of course, he can never be exactly
 sure of that figure. Thus, introducing randomness is a promising
 direction, but note that this would not increase the cost of the
 attack significantly (if at all). Thus, we shall not pursue it
 further.  
 
-[But at the same time, hiding this number makes verifiability even
-harder. -RRD]
+\subsection{But resisting blending can prevent verification!}
 
-We note that the minimum anonymity of the Cottrell mix is still very
-much worse than that of the threshold pool mix (where the threshold
-size is much larger than the pool size). Therefore, we propose adding
-a threshold to a Cottrell mix to improve its minimum anonymity
-properties. We shall not go into any more detail other than to say
-that it is intuitively clear that this modification will not have
-adverse effects on the blending attack properties, although will
-slightly decrease the efficiency of this mix.
+The tools that we give honest mixes to protect against these blending
+attacks are precisely the tools that dishonest MIXes use to launch
+them. \emph{Uncertainty} comes from not letting observers know when
+a given message might leave the mix; \emph{inexactness} comes from
+not letting observers know how many messages are currently inside the
+mix. This flexibility allows a mix to prevent the adversary from learning
+the details of message flow, but at the same time it allows a compromised
+mix to manipulate message flow, for example by delaying a message until
+it's ready to launch a trickle attack on the downstream hop.
 
-%But there is more fun to be had: we can
-%have not just the number, but also the *fraction* of the messages sent
-%out can depend on the number of messages inside the mix. This smells
-%like differentiation. Furthermore, we can have a non-linear dependence
-%of these two variables.
+Many basic mix verification schemes, such as that proposed in
+\cite{chaum81untraceable} where senders observe the batches exiting
+a mix to confirm it processed their messages, require schemes where
+the messages come out with the next mix flush. More complex
+robustness schemes \cite{flash-mix} enforce timely message processing
+if the adversary does not own a threshold of participating mixes. The
+receipt and witness scheme of \cite{mix-acc} works because a mix loses
+reputation if it hasn't obtained a receipt from the next hop within
+the allowed timeframe. All of these systems are designed to ensure that
+messages will leave each mix predictably.
+
+In some topologies, such as free-route mix-nets, a widespread delaying
+attack is difficult to perform because the adversary must control all
+mixes that are sending messages to the target mix. In this case we may
+gain more by allowing mixes to protect against blending attacks than we
+lose by permitting compromised nodes to manipulate messages. At the other
+extreme, in a cascade topology, a single compromised mix has absolute
+control over the downstream message batch. In this case verification of
+correct processing is critical, to prevent wholescale (unobservable!)
+anonymity compromise. Methods to detect an attack afterwards, e.g. by
+checking for successful delivery of test messages at the end of the
+cascade \cite{casc-rep}, can help detect misbehaving mixes --- but they
+still allow the adversary to uncover a few messages before he's noticed.
+As we design new protocols and batching approaches, we must consider
+this tradeoff between protection from blending attacks and protection
+from verifiability.
 
 \section{Cover Traffic}
 
@@ -898,5 +948,4 @@
 \bibliography{taxonomy}
 
 \end{document}
-
 

***********************************************************************
To unsubscribe, send an e-mail to majordomo@seul.org with
unsubscribe freehaven-cvs       in the body. http://freehaven.net/