
[freehaven-cvs] more edits, and remove redundant comments at the end



Update of /home/freehaven/cvsroot/doc/alpha-mixing
In directory moria:/home/arma/work/freehaven/doc/alpha-mixing

Modified Files:
	alpha-mixing.tex 
Log Message:
more edits, and remove redundant comments at the end


Index: alpha-mixing.tex
===================================================================
RCS file: /home/freehaven/cvsroot/doc/alpha-mixing/alpha-mixing.tex,v
retrieving revision 1.31
retrieving revision 1.32
diff -u -d -r1.31 -r1.32
--- alpha-mixing.tex	11 Mar 2006 05:01:17 -0000	1.31
+++ alpha-mixing.tex	11 Mar 2006 06:06:38 -0000	1.32
@@ -151,13 +151,6 @@
 unpredictable without assumptions about the rate of incoming messages. On
 the other hand, threshold mixes can provide minimum anonymity properties.
 
-As we will see, one of the virtues of alpha mixing is that
-the timed/threshold distinction for mixes can blur, and it
-becomes more a distinction for firing strategies of individual
-messages than of mixes. For our initial analysis we will assume
-a steady-state network with constant rate of incoming messages, which
-means against a passive adversary the anonymity properties are equivalent.
-
 ``Deterministic'' here refers to the algorithm by which messages
 change $\alpha$ after each mix firing. Later we will consider
 algorithms that change $\alpha$ probabilistically, for example based on
@@ -187,6 +180,13 @@
 net are waiting and nearly full while messages are being accumulated
 at relatively empty mixes.)
 
+As we will see, one of the virtues of alpha mixing is that
+the timed/threshold distinction for mixes can blur: it
+becomes more a distinction between firing strategies of individual
+messages than between mixes. For our initial analysis we will assume
+a steady-state network with a constant rate of incoming messages, which
+means that against a passive adversary the timed and threshold variants
+provide equivalent anonymity properties.
+
 It is also possible to have a threshold-or-timed alpha mix in which
 all messages are decremented in the alpha stack if either $t$ seconds
 have passed or $n$ messages have arrived.
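+
+To make the firing rule concrete, here is a rough sketch in Python
+(illustration only: the \texttt{AlphaMix} name, its parameters, and the
+dictionary representation of the alpha stack are ours, not part of the
+design above). On each firing it flushes the messages at $\alpha = 0$
+and lowers the stack by one level.
+\begin{verbatim}
+import random
+import time
+
+class AlphaMix:
+    """Sketch of a threshold-or-timed deterministic-alpha mix."""
+
+    def __init__(self, threshold_n, period_t):
+        self.threshold_n = threshold_n   # fire after n arrivals ...
+        self.period_t = period_t         # ... or after t seconds
+        self.stack = {}                  # alpha level -> list of messages
+        self.arrivals_since_fire = 0
+        self.last_fire = time.time()
+
+    def receive(self, message, alpha):
+        self.stack.setdefault(alpha, []).append(message)
+        self.arrivals_since_fire += 1
+        # A real mix would also fire from a timer even with no arrivals.
+        if (self.arrivals_since_fire >= self.threshold_n
+                or time.time() - self.last_fire >= self.period_t):
+            return self.fire()
+        return []
+
+    def fire(self):
+        # Send everything at alpha = 0, then decrement all other levels.
+        batch = self.stack.pop(0, [])
+        random.shuffle(batch)            # hide arrival order within the batch
+        self.stack = {a - 1: msgs for a, msgs in self.stack.items()}
+        self.arrivals_since_fire = 0
+        self.last_fire = time.time()
+        return batch
+\end{verbatim}
+Replacing the \texttt{or} with an \texttt{and} in \texttt{receive} yields
+the threshold-and-timed variant discussed below.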
@@ -199,11 +199,10 @@
 anonymity against a local passive adversary}
 \label{sec:passive-adversary-anonymity}
 
-We will describe anonymity for a threshold mix. As we have already
-noted, we assume a steady-state network in which messages arrive with
-various alphas at a regular rate, and the mix fires
-at regular intervals. Thus the threshold mix is indistinguishable by a
-local external passive adversary from a timed mix.
+Here we describe the anonymity of a threshold alpha mix during
+steady-state operation (messages arrive with various alphas at a regular
+rate, and the mix fires at regular intervals). In this case the threshold
+mix is indistinguishable from a timed mix to a local external passive
+adversary.
 
 We assume the adversary does not know the specific alpha of any
 message entering the mix, e.g., that this is provided to the mix
@@ -543,8 +542,12 @@
 This can be viewed as a commons: everybody will hope that somebody
 else takes the latency hit.
 
-Not all users have the same sensitivity level, however. In fact users
-will have different sensitivity levels for different communications.
+There are, however, two ways to mitigate this risk.
+First, note that not all users have the same sensitivity level: some
+users favor performance and others favor anonymity. Three factors are
+most important in characterizing the utility function for our users:
+their need for anonymity, their willingness to accept delay, and
+their guess at (expectation of) the current alpha levels in the network.
 In~\cite{econymics} it was shown that there can be optimal levels of
 free riding: more-sensitive users have incentive to provide ``free''
 communications service for less-sensitive users by running network
@@ -552,15 +555,23 @@
 better anonymity protection for the more-sensitive users. This can
 provide adequate incentive even if there are many others running
 nodes.  Similarly, while the existence of higher $\alpha$ traffic may
-reduce my incentive to set higher $\alpha$ levels for my own traffic,
+reduce Alice's incentive to set higher $\alpha$ levels for her own traffic,
 it does not eliminate that incentive.
 
-Choosing alpha range based on the sensitivity and timeliness
-constraints of one's own messages increases the autonomy and control
-over one's own security and utility. Indeed, if an adversary can make
-reasonable guesses about a choice of alpha range for a message, then
-increased alpha for other messages in a mix might actually decrease
-anonymity for a target message. For example, consider a mix containing a
+Second, when Alice chooses her alpha range based on her sensitivity
+and timeliness constraints for her own messages, she gains increased
+autonomy and control over her own security and utility.
+%against \emph{some}
+%attackers it is still beneficial to have higher alphas: if an attacker
+%watching Alice's traffic guesses that she chose $\alpha = 0$ in hopes that
+%other people would improve her anonymity for her, then he can ignore the
+%higher-alpha traffic. That is, to get the maximum benefit of blending with
+%lots of different traffic, Alice actually does need to choose her alphas
+%from a wide range at least sometimes.
+Indeed, if an adversary can make reasonable guesses about a choice of
+alpha range for a message, then much higher or much lower alphas for
+other messages in a mix might actually decrease the anonymity set for
+a target message. For example, consider a mix containing a
 target message with low alpha and an ancillary message that is either from
 about the same alpha range or from a much larger alpha range than the
 target message.  If the adversary learns that the second message has a
@@ -570,7 +581,7 @@
 doesn't depend on the strategic behavior of others. Users of the
 system are not likely to have such fine-tuned knowledge of the system,
 the behavior of others, and their own needs. Thus if we can prescribe
-recommendations for choice of alpha, e.g., based on analysis and
+recommendations for choice of alpha, for example, based on analysis and
 observed patterns within the network, we can expect most people to
 heed them. (Although they may not follow them --- we can expect
 hyperbolic discounting of risk, disregard of risk for expedience,
@@ -591,7 +602,7 @@
 \section{Beta Alpha: Variations on Alpha Mixing}
 \label{sec:beta-alpha}
 
-In the previous sections, we have investigated and analysed some basic
+In the previous sections, we investigated and analysed some basic
 alpha mixing designs and the incentive questions and attacks that arise
 from them. In this section we introduce and briefly discuss some more
 complex designs that are harder to analyse fully but may provide better
@@ -644,7 +655,7 @@
 In the basic threshold deterministic-alpha mix, if there are
 $\mbox{\emph{threshold}} = n$ messages in each of alpha levels $0$ through
 $\ell$, all of the messages in levels $0$ through $\ell$ will be sent at
-once; however, the different levels will not be mixed together.
+once; however, messages from the different levels will not be mixed together.
 The mix will send all messages
 with $\alpha = 0$, lower the stack, send the next batch of messages
 that now have $\alpha = 0$, etc. An adversary may not know exactly
@@ -660,7 +671,7 @@
 way up to $\alpha = \ell$ while also learning the alpha level of most of
 the messages.
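+
+To make the stack dump concrete, here is a rough sketch (illustration
+only: the \texttt{flush\_full\_stack} helper and the dictionary
+representation of the alpha stack are ours). Each level that has reached
+the threshold leaves as its own batch, which is precisely what lets an
+observer attribute messages to alpha levels.
+\begin{verbatim}
+def flush_full_stack(stack, threshold_n):
+    """One reading of the cascade described above: send level 0,
+    lower the stack, send the new level 0, and so on while each
+    level still holds a full threshold of messages."""
+    batches = []
+    level = 0
+    while len(stack.get(level, [])) >= threshold_n:
+        batches.append(stack.pop(level))
+        level += 1
+    # Lower the stack: every surviving message drops by `level` alphas.
+    lowered = {a - level: msgs for a, msgs in stack.items()}
+    stack.clear()
+    stack.update(lowered)
+    return batches               # one batch per fired level, in level order
+\end{verbatim}
+The fix discussed next amounts to concatenating and shuffling these
+batches before anything leaves the mix.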
 
-The simplest solution is to simply mix all messages that emerge at
+The simplest solution is to mix together all messages that emerge at
 once. This will prevent an adversary from watching the order in which
 messages exit during a flush and thus learning about their sensitivity.
 The stronger attack we worry about is the blending attack:
@@ -668,10 +679,13 @@
 highest reasonably expected level, trickling in a message, then
 flooding with $\alpha = 0$ messages repeatedly to learn the
 sensitivity of that message and its next destination.
-Batching all outgoing messages together, combined with the minimal dummy
-scheme presented in Section~\ref{sec:dummies}, would substantially reduce
+Batching all outgoing messages together, combined with the dummy
+schemes presented in Section~\ref{sec:dummies}, would substantially reduce
 the risk from blending.
 
+%An alternate threshold alpha mixing scheme would only fire when
+%$n$ messages of $\alpha = 0$ have arrived. That is, 
+
 We could also require that the firing of the mix be
 threshold-and-timed, which would prevent the adversary from triggering
 an alpha-stack dump by only allowing messages of one alpha level to
@@ -688,7 +702,7 @@
 protections together.
 
 There are various ways to have realtime and threshold properties together
-in one mix design. We will note two of them.
+in one mix design. We note two of them next.
 
 \subsection{Dynamic-$\alpha$ mixing}
 
@@ -718,14 +732,15 @@
 batch of messages that a given message will be sent with, after (or
 possibly also together with) the messages in the alpha levels that are
 below it in the stack. This lends itself naturally to the batching
-concept familiar in the mix literature. Threshold batching would seem
-to imply unpredictable delays since we don't know how long it will
+concept familiar in the mix literature. Intuitively, threshold batching
+implies unpredictable delays since we don't know how long it will
 take for a threshold number of messages to accumulate at $\alpha = 0$.
-Timed mixing on the other hand will allow a predictable delay, but
-this is both a lower and an upper bound on the latency of delivery.
-Threshold batching would seem to allow messages to potentially be
+Timed mixing, on the other hand, allows a predictable delay by providing
+an upper bound on latency.
+But because timed mixing also imposes a \emph{lower} bound on latency,
+threshold batching can be faster: it allows messages to be
 processed as quickly as they arrive, provided the batch size does not
-get in the way. 
+get in the way.
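+
+For intuition, under idealized assumptions (steady arrivals at an average
+rate of $\lambda$ messages per second, rather than anything we measure
+here): a threshold-$n$ mix fires on average every $n/\lambda$ seconds,
+but with no guarantee on any particular interval, while a timed mix with
+interval $t$ fires every $t$ seconds regardless of load. A message that
+arrives just after a timed firing therefore waits nearly $t$ even if the
+mix is already full, whereas under threshold batching it leaves as soon
+as the batch fills.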
 
 This is the idea behind tau mixing: a message $M$ arrives at a mix
 with an associated threshold $\tau_{M}$ of how many other messages
@@ -743,15 +758,18 @@
 Messages that are more sensitive should be assigned a $\sum
 \tau^{(i)}_{M}$ from a private distribution on a range that
 increases with sensitivity. Many of the same features of alpha mixing
-apply, e.g., the dummy strategy, how to distribute $\sum \tau$ across
-the mixes in the path, etc. If taus are purely threshold values, then
-an adversary that is powerful enough to perform a sustained flush of
-the entire network will be able to conduct end-to-end timing
-correlations on more sensitive messages (assuming we stick to a purely
-internally routed dummy scheme). However, both a threshold and a
+apply, including the dummy strategy discussion, the techniques for
+allocating $\sum \tau$ across
+the mixes in the path, and so on.
+
+If taus are purely threshold values, then an adversary that is powerful
+enough to perform a sustained flush of the entire network will be able
+to conduct end-to-end timing correlations on more sensitive messages
+(assuming we stick to a purely internally routed dummy scheme). To
+address this attack, both a threshold and a
 random minimum delay at each mix can be given as security parameters.
 This will prevent effective flushing unless the adversary can also
-perform sustained blocking of all inputs to the mixnet. Even then
+perform sustained blocking of all inputs to the mixnet, and even then
 the attack will be substantially slowed.
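+
+As a rough sketch of how these two parameters might combine (illustration
+only: the \texttt{TauMix} name is ours, and we assume the natural reading
+that $\tau_M$ counts how many other messages must pass through the mix
+after $M$ arrives before $M$ may leave):
+\begin{verbatim}
+import random
+import time
+
+class TauMix:
+    """Sketch: a message leaves once (a) tau further messages have
+    passed through the mix and (b) its own random minimum hold time
+    has elapsed."""
+
+    def __init__(self, min_delay, max_delay):
+        self.min_delay = min_delay
+        self.max_delay = max_delay
+        self.pending = []   # entries: (message, remaining_tau, release_time)
+
+    def receive(self, message, tau):
+        now = time.time()
+        hold_until = now + random.uniform(self.min_delay, self.max_delay)
+        # Every waiting message has now seen one more "other" message.
+        self.pending = [(m, t - 1, r) for (m, t, r) in self.pending]
+        self.pending.append((message, tau, hold_until))
+        return self._release(now)
+
+    def _release(self, now):
+        # A real mix would also call this from a timer, not only on arrivals.
+        ready = [m for (m, t, r) in self.pending if t <= 0 and r <= now]
+        self.pending = [(m, t, r) for (m, t, r) in self.pending
+                        if t > 0 or r > now]
+        random.shuffle(ready)
+        return ready
+\end{verbatim}
+Because flooding cannot make the random hold times expire any sooner, an
+adversary who flushes the network still has to wait out each target
+message's minimum delay.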
 
 \section{Conclusion}
@@ -795,29 +813,11 @@
 %%%% Stuff with 4s is stuff from alpha strategy section
 %%%% that I didn't want to toss just yet
 
-%4 some of the users favor performance and some favor anonymity. Three
-%4 factors are most important here:
-%4 
-%4   - their interest in anonymity.
-%4 
-%4   - their willingness to accept delay.
-%4 
-%4   - their guess at (expectation of) the current alpha levels
-%4 
-%4     in the network.
-%4 
-%4 
 %4 C.f. Andrei's complaints about stop-and-go -- in that design you had
 %4 to anticipate the network load to get good anonymity, and in this
 %4 design you need to anticipate the other users' anonymity/latency
 %4 requirements to get what you want.
 %4 
-%4 
-%4 To help people guess current alpha levels, we could run a node and use
-%4 that to estimate the average $\Sigma \alpha$ in use in the network.
-%4 We could publish it to the users, which would cause some sort of
-%4 feedback loop.
-%4 
 %4 Speaking of which, how does the network bootstrap? See the econymics
 %4 paper where we argue that people who don't care about anonymity will
 %4 be the early adopters, since the people who do care won't be willing
@@ -825,19 +825,10 @@
 %4 choose $\alpha$ of 0, and then sometime later the users willing to try
 %4 a non-zero $\alpha$ will arrive?
 %4 
-%4 
 %4 The attacker's a priori expectation of Alice's paranoia improves his
 %4 attack.
 %4 
 
-%***************\\
-%Where Paul stopped on Wednesday. Below is unchanged from before today.
-%***************\\
-
-%***************\\
-%Where Andrei stopped on Thursday. Below is unchanged from before.
-%***************\\
-
 %\section{Assumptions}
 %
 %Assume that in most cases we are analyzing a steady-state network.
@@ -907,18 +898,6 @@
 %messages, higher alpha messages, and dynamic pool messages all living
 %in the same mix.
 
-%\section{Related work}
-%
-%Traditional mixes.
-%
-%Traditional onion routers.
-%
-%Batching taxonomy. Pool mixes.
-%
-%Stop-and-go.
-%
-%Econymics and usability-is-a-security-parameter.
-%
 %\section{Analysis}
 %
 %In this section we look at the two mixes in more detail, their
@@ -944,73 +923,6 @@
 %state what's the diff). The point is that this little change is a win
 %all around.
 %
-%The properties here, of course, depend on what the attacker knows
-%about the security parameter of the users.
-%
-%Sender anonymity, one mix, illustrated on two rounds only (equivalently,
-%suppose maximum alpha is 1 just to illustrate the idea)
-%
-%Round 1: $I_1 = i_{1,1} \ldots i_{n,1}$ came in, messages $o_{1,1} \ldots
-%o_{x,1}$ came out.
-%
-%Round 2: $I_2 = i_{1,2} \ldots i_{m,2}$ came in, messages $o_{1,2} \ldots
-%o_{y,2}$ came out.
-%
-%$\alpha(x)$ is the set of possible alphas of message $x$ as known by
-%the attacker. Note that if the attacker knows nothing, then $\forall x
-%\alpha(x) = {0,1}$
-%
-%Our target message is $o_{1,2}$. Anonymity set (in messages): 
-%
-%\[
-%\{x | x \in I_1 \wedge 1 \in \alpha(x)\} \cup \{y | y \in I_2 \wedge 0
-%\in \alpha(y)\}
-%\]
-%
-%Hence (almost) any knowledge of alphas by the attacker degrades
-%anonymity. Note that complete knowledge of alphas by the attacker
-%\emph{may} leave the message with no anonymity, however, this is
-%extremely unlikely (or amounts to a variant of the trickle attack, a
-%rather stupid one when you come to think of it).
-%
-%Now, the anonymity probability distribution is also an easy thing to
-%work out, but first we need a little more formalization of the
-%assumptions. Essentially, where we allowed the attacker possibilistic
-%knowledge about the alphas of the messages, we now allow him (better)
-%probabilistic knowledge.
-%
-%Notation: call $x_{\alpha}$ the alpha in message $x$. Hence the
-%attacker knows the probability distributions $P(x_{\alpha}=y)$ for
-%every message $x$, y ranging from 0 to $y_max$, [PFS: huh?] in our case 1.
-%
-%Now, the anonymity probability distribution in the case above is:
-%
-%\[
-%\mbox{Normalise}(\{p | x \in I_1 \wedge p = P(x_{\alpha}=0)\} \cup 
-%\{p | x \in I_2 \wedge p = P(x_{\alpha}=1))
-%\]
-%
-%AAS: Paul, what's the notation for a finite probability distribuiton
-%like this??? Can't think off hand.
-%
-%Clearly, there is an easy generalisation to N rounds. (To be inserted)
-%
-%Show by equations:
-%
-%part one: anonymity gotten by all users when every $\alpha is 0.$
-%(AAS: depends what the attacker knows, done above, in a steady state,
-%unless the attacker can show it > threshold)
-%
-%part two: anonymity gotten by Alice when she sets her $\alpha to 1$.
-%(AAS: ditto)
-%
-%part three: anonymity gotten by Alice when $\exists$ some other
-%sender's message of $\alpha 1$.
-%(AAS: ditto)
-%
-%(AAS: actually, I don't think the above 3 cases are what we want, I
-%think the examples ought to be rather different. 
-%
 %--------------------------------- AAS: Not quite sure where this fits
 %in, but it is quite good.
 %
@@ -1052,37 +964,14 @@
 %Performance is a function of network load. No guaranteed minimum delays --
 %but guaranteed minimum anonymity sets, assuming no blending attacks?
 %
-%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-%
-%  Following section is now all in rewritten
-%  Dummies section above, no? -PFS
-%
-%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-%
 %\section{Dummies}
 %
-%Just as with trickle02 and generalized-batching, we want to
-%provide uncertainty even in edge cases.
-%
-%The active attacker can arrange an edge case via blending attacks, but
-%a passive attacker can also simply wait for an edge case to occur.
-%
-%We need to always apply the dummy strategy because we can't know
-%if we're under attack. Dummies should be routed back to ourselves,
-%a la heartbeat-traffic.
-%
-%
 %What fraction of the traffic is dummies, then? We can graph this for
 %various strategies. Paul says (for timed): always send out at least
 %2 messages, and always add at least one dummy. In this case the dummy
 %overhead is bounded: between NL and 2NL for N mixes. (This sounds
 %suspicious now that I see it again -- is it wrong? -RD)
 %
-%Note that we can achieve Andrei's goal of strict uncertainty with
-%way more efficiency just by having a single dummy anywhere in the
-%buckets.
-
-%\section{Game Theory}
 %\section{Strategic Choice of Alpha}
 %
 %As observed in section~\ref{passive-adversary-anonymity}, the
@@ -1094,68 +983,11 @@
 %This can be viewed as a commons: everybody will hope that somebody else
 %takes the latency hit.
 %
-%Not all users have the same sensitivity level, however. In fact users
-%will have different sensitivity levels for different communications.
-%In~\cite{econymics} it was shown that there can be optimal levels of
-%free riding: more-sensitive users have incentive to provide ``free''
-%communications service for less-sensitive users by running network
-%nodes because this will still provide additional value in the form of
-%better anonymity protection for the more-sensitive users. This can
-%provide adequate incentive even if there are many others running nodes.
-%Similarly, while the existence of higher $\alpha$ traffic may reduce
-%my incentive to set higher $\alpha$ levels for my own traffic, it
-%does not eliminate that incentive.
-%
-%Choosing alpha range based on the sensitivity and timeliness
-%constraints of one's own messages increases the autonomy and control
-%over one's own security and utility. Indeed, if an adversary can make
-%reasonable guesses about a choice of alpha range for a message, then
-%increased alpha for other messages in a mix might not increase
-%anonymity for a target message, e.g., if that increased alpha would
-%only vary the range that is after the target message is expected to
-%have left the mix.
-%
-%More significantly, however, security is hard to get right when it
-%doesn't depend on the strategic behavior of others. Users of the
-%system are not likely to have such finetuned knowledge of the system,
-%the behavior of others, and their own needs. Thus if we can
-%prescribe recommendations for choice of alpha, e.g., based on
-%analysis and observed patterns within the network, we can expect
-%most people to heed them. (Although they may not follow them. We
-%can expect hyperbolic discounting of risk, disregard of risk
-%for expedience, etc.~\cite{acquisti04}.)
-%
-%Alpha mixing itself is likely to affect the applications that can
-%be securely used and how, so recommendations are likely to evolve.
-%Initial recommendations can be guided by existing anonymity networks.
-%Traffic that must arrive in realtime obiously must have $\sum \alpha = 0$.
-%For more sensitive traffic, we might initially try to follow 
-%networks such as Mixminion. But how to do that since mixing*****
-%
-%
-%some of the users favor performance and some favor anonymity. Three
-%factors are most important here:
-%
-%  - their interest in anonymity.
-%
-%  - their willingness to accept delay.
-%
-%  - their guess at (expectation of) the current alpha levels
-%
-%    in the network.
-%
-%
 %C.f. Andrei's complaints about stop-and-go -- in that design you had
 %to anticipate the network load to get good anonymity, and in this
 %design you need to anticipate the other users' anonymity/latency
 %requirements to get what you want.
 %
-%
-%To help people guess current alpha levels, we could run a node and use
-%that to estimate the average $\Sigma \alpha$ in use in the network.
-%We could publish it to the users, which would cause some sort of
-%feedback loop.
-%
 %Speaking of which, how does the network bootstrap? See the econymics
 %paper where we argue that people who don't care about anonymity will
 %be the early adopters, since the people who do care won't be willing
@@ -1163,42 +995,9 @@
 %choose $\alpha$ of 0, and then sometime later the users willing to try
 %a non-zero $\alpha$ will arrive?
 %
-%
 %The attacker's a priori expectation of Alice's paranoia improves his
 %attack.
 %
-%\section{Distributing your alphas over the path}
-%
-%******************************************\\
-%Presumably Andrei will incorporate this with the section above
-%(\label{sec:distributing-alpha}) when he picks it up later today.\\
-%******************************************\\
-%
-%The first node (or second, or...) can learn about Alice's paranoia
-%level. So given $\Sigma \alpha$ which represents Alice's concern for
-%this message's anonymity, we need an algorithm for allocating which
-%$\alpha$ goes where in the path.
-%
-%If we allocate all the $\alpha$'s uniformly, each node can compute the
-%expected $\Sigma \alpha$ for that message. If we allocate them
-%non-uniformly, we're in better shape. Need high expected variance for
-%better uncertainty too.
-%
-%AAS: Any knowledge of alpha degrades anonymity.... So agree with all
-%the above. Note the similarity between picking an alpha and message
-%splitting \cite{SM05} -- in both cases they are distributions over
-%partitions. The best you can do is precisely to pick from a uniform
-%distribution over the partitions of $\Sigma \alpha$ into $l$ buckets
-%where the buckets themselves are indistinguishable. The number of such
-%partitions are given by 
-%
-%\[
-%\sum_k=1^l Q(\Sigma \alpha, k)
-%\]
-%
-%where $Q(n,k)$ denotes the number of ways of partitioning $n$ into
-%exactly $k$ distinct parts. 
-%
 %\section{Example scenarios}
 %
 %Work a couple of examples scenarios with sample numbers of types of
