[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: header-swap isn't perfectly indistinguishable (was: problem in 3.2 "Replies")
-----BEGIN PGP SIGNED MESSAGE-----
Zooko wrote:
> David Hopwood wrote:
> > I'm not convinced there is really any problem with forcing all chains to be
> > the same length; the extra hops for forward messages still contribute to
> > anonymity, so they are not wasted.
> [...]
> > Consider any synchronous design (e.g. a pure cascade, or a batch synchronous
> > design [...]
> > Note also that increasing the path length in this case does not have much
> > effect on latency, which is primarily determined by the batch period (and how
> > many times a message is delayed to the next batch). It does have an effect on
> > reliability for forward-only messages, but the reliability is no worse than
> > for replies.
>
> [...]
> From your comments about latency it appears that you are envisioning a
> batching strategy that enforces a very high added constant latency to each
> message.
Don't pay too much attention to the specific times I gave (perhaps I'm just
more paranoid, and/or thinking of applications where that much latency would
be acceptable).
What matters is the trade-off between latency and anonymity - I'm claiming
that synchronous networks can offer an improved trade-off, because the
latency distribution can be made closer to uniform, and is independent of
the number of nodes on the path. For asynchronous networks with random
deferment, the latency follows approximately a normal distribution, which
is not really what we want.
[Following the Babel paper, I'll use the verb "defer" for when a message
is deliberately deferred to a later batch, and "delay" only when a message
is removed and then reintroduced later by an attacker.]
> I hope that is not necessary, as I wish to use Mixminion for
> applications with latency on the order of minutes, not of days.
If the number of messages going through the network in a few minutes is large
enough to provide a reasonable anonymity set, you can set t_batch to a few
minutes.
For the source-timed variant, it's possible to use the same network with
different latencies for different applications, although that obviously splits
the anonymity sets to some extent.
Let's compare an asynchronous network (using random deferment), with a simple
synchronous batching network (not using deferment, to begin with). Assume that
paths are free routes chosen by the same method.
The latency distribution in the asynchronous network is approximately a normal
distribution (it is the sum of several uniform-random deferment times). In the
synchronous network it is uniform on [\ell*t_hop, t_batch + \ell*t_hop], where
\ell is the number of nodes on a path, and \ell*t_hop < t_batch.
Let this distribution be L(t) = Pr[latency = t]. I think that when users
judge how slow a network is, they will consider the worst case (or the 90th
percentile of L, say) to be a more significant measure than the average case.
Anyway, suppose that parameters are chosen so that users perceive the two
networks to be equally fast or slow.
Then:
- the asynchronous network will be vulnerable to the attacks in Section 4
of [disad-free-routes]; the synchronous network will not.
- for the synchronous network, users know in advance exactly when their
message will be sent from the last hop, which is a significant practical
advantage. For the asynchronous network, users never know exactly how
long their messages will take to get through the network.
- consider a global passive adversary, Eve, who watches the network inputs
and outputs. Assume that messages enter the network at a roughly constant
rate. Eve sees a message M enter at time t_enter. It could leave the
network at any later time up to the maximum latency - but all latencies
are not equally likely. If Eve has no other information, then if she sees
k messages that *could* correspond to M leave the network at times t_1..t_k,
she will weight the probabilities that they actually correspond to M
proportionally to L(t_i - t_enter). Intuitively, the "flatter" the
distribution L is for a given maximum latency, the less information Eve
gains - which favours the synchronous network.
(We could make this more precise by considering how much Shannon
information Eve receives, or some similar metric.)
- in both cases the bandwidth usage is (network throughput * \ell), and
the reliability is p^\ell.
(OTOH, the synchronous network needs the path length \ell to be twice
as long for forward-only messages, since it has to be the same as for
anonymised replies. Note that this does not affect latency, only
bandwidth and reliability. Also, the security benefit of the longer
path is not lost.)
- for the asynchronous network, an extra mechanism like "swap", "strip",
or "two payloads" is needed to provide indistinguishability between
forward messages and replies. For the synchronous network this comes
for free: only first-half and last-half messages can be distinguished,
but the security analysis assumes that anyway, and it is not a problem.
(SPRP encryption of payloads should be used in either case. For the
synchronous network, it is not relied on as much, because a tagging
attack on a forward message will not work on the first half of the
path, which is long enough to provide anonymity. However, it should be
used anyway if we want the security benefit of the longer path.)
The above is for a simple synchronous network without deferment. More
generally, we can use a source-timed network, where each hop header specifies
the time that a message must be received from the previous hop, and the time
it must be sent to the next hop.
(Actually it is which messages appear in which mini-batches that matters,
not the precise timing, but it is simpler to label the mini-batches with
timestamps.)
Source-timed deferment allows different users to make different latency/
anonymity trade-offs. In fact *any* latency distribution can be chosen,
not just normal, binomial or uniform distributions. To show that a network
using source-timed deferment is a strict improvement over one with random
deferment, note that it can approximate the latter by having the client
choose deferment times independently for each hop (as the nodes would do
for random deferment), and by setting other parameters the same.
In that case there are only two differences:
1. The network with random deferment is subject to the [disad-free-routes]
attacks, and the source-timed network isn't. (This is because an attacker
cannot delay messages in a source-timed network, it can only permanently
remove them.)
2. A source-timed network cannot use threshold batching strategies (where
the timing of batch flushes and/or the number of messages that are
retained in a pool depends on the number of messages received). However,
that's not really a problem, because it can use the following strategy
("interval batching with minimum output set") instead:
If the number of messages in a batch, k, is less than a bound k_min,
then the node adds k_min - k dummy messages to the output. These dummies
follow the remainder of a path for this batch, chosen in the same way
as a normal message path, except that they are discarded at the last
node.
This prevents simple blocking attacks, and cases where too few messages
are sent without any active attack, but it does not prevent blending
attacks. However, threshold batching doesn't prevent blending attacks
either, so nothing has been lost.
(There is still a blocking attack if the attacker owns all nodes on the
path after the one that has its input blocked - but as long as at least
two nodes on the path are honest, this attack fails.)
A source-timed network with deferment is not quite as easy to analyse
as one where batches are completely independent:
- it's more difficult to make sure that changes to the network meta-
information appear to take effect instantly. For example, suppose that
this information consists simply of a global set of reliable nodes that
are used to construct paths at random. Also suppose that a node N is
removed from this set starting from batch T. Any messages still going
through the network that reach N in batches >= T, must have been deferred
from a previous batch, and so their anonymity set is reduced.
This could be solved by saying that changes in the network meta-
information must be signalled long enough in advance that they can take
effect for all messages in the same period (i.e. one maximum latency
time in advance). I don't think this would be a significant limitation
in practice.
- depending on the chosen latency distribution, the probability of a
message being deferred at one node of a path might not be independent
of the probability that it is deferred at other nodes. For instance,
suppose the strategy is to first choose from a uniform distribution
how many times the message will be deferred, and then decide at random
how those deferments are assigned to nodes on the path. In this case
if a node sees a deferred message, it knows that it has a higher
probability of being deferred at other nodes (and conversely, if it is
not deferred at this node, then it has a lower probability of being
deferred at other nodes).
I don't think this leads to any significant attack, though.
So, my suggestion is that we should use:
- source-timed synchronous batching with deferment (and discuss
further what the latency distribution should be),
- interval batching with minimum output set, and
- the simplest option for doing replies, as described in
<http://archives.seul.org/mixminion/dev/Apr-2002/msg00079.html> and
<http://archives.seul.org/mixminion/dev/Apr-2002/msg00084.html>
(what we called "distinguishable" before, although that's not an
accurate name in this context).
- --
David Hopwood <david.hopwood@zetnet.co.uk>
Home page & PGP public key: http://www.users.zetnet.co.uk/hopwood/
RSA 2048-bit; fingerprint 71 8E A6 23 0E D3 4C E5 0F 69 8C D4 FA 66 15 01
Nothing in this message is intended to be legally binding. If I revoke a
public key but refuse to specify why, it is because the private key has been
seized under the Regulation of Investigatory Powers Act; see www.fipr.org/rip
-----BEGIN PGP SIGNATURE-----
Version: 2.6.3i
Charset: noconv
iQEVAwUBPN7iuTkCAxeYt5gVAQHlKQgAh67HyU9G2fmnr4vxR1HnJBh2G2Dp2dx7
sBfQTn9vOamPLf5HMNNNm4gAymSN/obkYqlUy3nS9ZBBiKrLsrXx+ztlpax6Rf3w
sh9rqIAc0JV9z66f4lY8HmVSmqGIbmWKvSF323BoqWWgghBcDcMa6DFMLMhqkABT
lzhC5Y4YKUGP0eTHZ7skDYyNyEEcYtVTGi+vbIlgv6fVmTCrsOr69YQPFTkmWpBF
nZb/SR83++8qupCW+y2wQLgojSPzgwlwoaBFJWF/65eeMSZ/TGZ2wfvmjEsx+Sny
utFc0Bmx/4KQnPUUXMPKxuh6D8T2vyj3Mue1VtNZenGh/Uy4cEbzeA==
=NqPh
-----END PGP SIGNATURE-----