[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[freehaven-dev] Re: request for comments on Mixnet Reputations paper


[sorry to anyone who receives this more than once]

Roger Dingledine wrote:
> We (Michael Freedman, David Molnar, and I) have been writing a submission
> for the Info Hiding Workshop (http://chacs.nrl.navy.mil/IHW2001/),
> and we'd be interested to have you take a look and give us your
> reactions. I give the abstract below, and the paper itself is available
> from http://freehaven.net/doc/mix-acc/mix-acc.ps (or .tex). Our submission
> deadline is Dec 12 (Tuesday).

# One common way to prevent raters from cheating is to design some means of
# verifying transactions. In many systems, transaction verification is done
# by having both parties exchange signed digital receipts, or by performing
# the transaction publically. We extend the latter notion to develop a mix
# design in which all messages in the MIX­net are publicly known. That is,
# MIX nodes publish their input and output messages. This information is
# exactly what would be seen anyway by a passive adversary with complete
# network access. We retain unlinkability between message sender and
# recipient due to the batching and reordering that each MIX performs
# internally.

This seems like a very strong assumption to be making about the security of
the mix-net. It is possible to send messages between mixes using secure
channels (which would be quite desirable in order to turn potential passive
attacks into attacks that require compromising some nodes), so the assumption
that a global passive adversary would see all messages anyway is not
necessarily correct for all mix-net protocols.

# This idea was originally due to Chaum in [4]. The reason we obtain
# anonymity is because the messages coming into the MIX are encrypted. The
# MIX decrypts them and sends out the decryptions, which are themselves
# encrypted messages (except for the last hop). If we have a semantically
# secure cryptosystem, then an adversary has only negligible probability of
# determining which incoming message corresponds to which outgoing message.

Semantic security is definitely not sufficient to protect against an
active attacker.

For example, I've been working on a class of attacks that I'll call
"marking" attacks, where an active adversary changes a message in such a
way that it can be identified later in the route. Although Mixmaster
and many other mix-net protocols have message integrity checks, they do
not cover the whole message (which is quite difficult to do because of
the random padding to a fixed length).

By making a change to a specific part of the message, the adversary can
ensure that it is only be detected as invalid n hops later, where n is
determined by the adversary. If the adversary also controls that later
node (and possibly even if he does not - see below), then he can identify
the message. In particular if the message is marked before the first
node, and identified at the last node, the whole path is compromised.

It may also be possible to perform this attack even when the last node
is not controlled by the adversary. Suppose that a node always has the
same number of output messages as input messages in a given time period.
In that case it can be detected whether the node has dropped a message,
because the number of output messages will be one less than expected.
This attack can also be extended to the case where (inputs - outputs) is
random but varies only by a small amount.

Publishing the inputs and outputs potentially aids this attack (as
well as traffic analysis and probably other attacks), because it makes
any adversary into a global passive adversary, i.e. there is no need to
eavesdrop on the inputs and outputs of the last node, or know which
node this is in advance.

It is possible to prevent marking attacks by making sure that a change
to the ciphertext is always detected by the next hop. This is quite
complicated, and requires the use of deterministic rather than random
padding to a fixed length. I've been working on a mix-net protocol that
does this, as a refinement of Babel (it also has some nice features like
supporting reply messages that are as secure against replay attacks as
outgoing messages, which I don't think any other protocol does).

# However, we should keep in mind that traffic analysis research is still
# in its infancy, and it may be that hiding even the information that a
# passive adversary would see is crucial to real anonymity for remailer
# networks; see Section 5.8 for more consideration of this point. 
# In general, a system to verify transactions should provide verification
# of both successful and failed transactions. However, due to the
# pseudonymous nature of a mixnet, successful transactions can easily be
# spoofed (via pseudospoofing, as described in Section 5.5) and thus do
# not provide an accurate metric for MIX reliability.

If the mix-net supports reply blocks then it can also support receipts,
which allow successful message transmissions to be verified by the

This does not allow the sender (Alice) to prove to anyone else that the
message was or was not sent correctly, and it doesn't allow Alice to
determine precisely which mix on the path failed. However, Alice could
potentially send more messages, with paths that cross the previous
path, to determine how reliable each of those mixes is.

Also, it is still possible for reviewers/scorers to test reliability,
either for real messages or for dummy messages that a reviewer sends
to itself. These will be indistinguishable from real messages if the
network is working as intended, so they will be no more or less reliable
than real messages on average. I.e. sections 5.5 to 5.7 of the paper
apply even when inputs and outputs are not published.

Another possibility is for Alice to be able to request information
about whether a message she sent that failed to arrive was received
by each mix on the path. Call a mix "dishonest" if *either* it lies in
response to one of these requests, or it dropped a message. If all of
the mixes on the path say they did receive the message, then the last
mix is dishonest. Otherwise, let i be the index of the first mix that
says it did not receive the message; then either N_{i-1} or N_i or
both are dishonest. This is not as good as being able to blame a
particular mix, but it does not require publishing messages. We can
enforce that only Alice can obtain this information (so that it is
still hidden from passive attackers attempting to do traffic analysis,
etc.) by having Alice prove to the mix, over a secure channel, that
she knows a value that was included in the relevant onion layer. If so
then the answer is sent back over the same secure channel.

Unfortunately even when Alice knows that either N_{i-1} or N_i is
dishonest, she can't prove it to anyone else.

# In our notation, we will explicitly include the random value used in
# the encryption, so E_1(M, r) means the encryption of message M and
# random value r under the public key of N_1.
# [footnote] Note that while E_1(M) is a random variable ranging over
# all the possible encryptions of M, E_1(M, r) is a particular string. 

This should be E_1(r, M), to be consistent with the notation in the
rest of the paper.

# This section describes the protocol for sending a message from Alice
# to Bob, using our mixnet design with verifiable failures. 
# 1. Alice chooses a recipient Bob.
# 2. Alice chooses k MIX nodes N_1, ..., N_k to form a "MIX path".
# 3. Alice picks k + 1 random padding values r1, r2, ..., rk, rbob.
# 4. Alice creates an initial packet P, defined as
#    P = E_1(r1, 2, E_2(r2, 3, ... E_k(rk, Bob, E_Bob(rbob, M)) ...)) 

It would be slightly clearer to use
     P = E_1(r1, I_2, E_2(r2, I_3, ...

where I_j is the identity or name of node N_j.

# We now define ciphertext collision resistance in a manner similar to
# hash functions. Call a public key cryptosystem weakly ciphertext
# collision resistant if it is infeasible for a probabilistic
# polynomial­time adversary, given a target ciphertext C, to produce a
# public key pk, message M, and random padding value r such that
# C = E_pk(r, M). A public key cryptosystem is then strongly ciphertext
# collision resistant if it is infeasible for a polynomial­time
# adversary to produce any collision whatsoever, not necessarily a
# targeted one. Somewhat surprisingly, common cryptosystems are not
# ciphertext collision resistant.

Actually they are in the sense required.

Definition: a "public key encryption scheme with unambiguous decryption"
is a public key encryption scheme (G, E, D) where

  G : Coins -> PubKey x PrivKey  (G is probabilistic)
  E : PubKey x Seed x PT -> CT
  D : PrivKey x CT -> PT \union {INVALID}

such that for all (pk, sk) = G(coins), there is a *one-to-many*
relation R_pk : PT x CT -> {0,1}, and a set ValidCT_pk, defined by

  R_pk(M, C) = 1, if \exists r \in Seed s.t. E(pk, r, M) = C
             = 0, otherwise
  ValidCT_pk = { C \in CT : \exists M s.t. R_pk(M, C) = 1 }

such that D(sk, C) = the unique M s.t. R_pk(M, C) = 1, if C \in ValidCT_pk,
                   = INVALID, otherwise.

A ciphertext in ValidCT_pk is called a "valid ciphertext for pk".
E is supposed to be used by choosing a seed at random from the set
Seed for each message.

(To visualise this, draw a diagram with each plaintext in PT mapping
to a collection of ciphertexts in CT, so that all the collections in
CT are disjoint. All the remaining points in CT decrypt to INVALID.)

RSA-OAEP, DHAES, and almost all other PK encryption schemes have
unambiguous decryption. The only example I can immediately think of
that doesn't is the Rabin encryption scheme (which is one reason why
it isn't commonly used).

# Call a public key cryptosystem fixed­key strongly ciphertext collision
# resistant if for a fixed public key pk, it is infeasible for a
# probabilistic polynomial time adversary to find a tuple
# (M, r, M0, r0) such that E_pk(r, M) = E_pk(r0, M0) and M != M0 or
# r != r0. The "fixed­key" here means that we restrict ourselves to
# ciphertext collisions only over a single fixed key.

This is not really the right property; we do not need to care about
collisions where r != r0 but M = M0.

Call a public key cryptosystem fixed­key ciphertext collision-free
if for a fixed public key pk generated by G, there does not exist a
tuple (M, r, M0, r0) such that E_pk(r, M) = E_pk(r0, M0) and M != M0.

Because R_pk is one-to-many, there cannot be ciphertext collisions
for different messages under a given public key generated by G, so all
encryption schemes with unambiguous decryption are fixed-key ciphertext
collision-free, and this is sufficient for the proofs.

About the Elgamal example (reference 34),

David Wagner wrote:
> David A Molnar wrote:
> > Suppose I have a message M and a "target" ciphertext C = E_PK(M,r) for
> > some randomized public key encryption and padding scheme E_PK with
> > padding value r. How difficult is it to find a different M',r' such that
> > E_PK(M',r') = C ?
> For El Gamal, it is trivial.  If C = E(M) and C' = E(M'), then
> C*C' = E(M*M'), so generate C' = E(1) [a random encryption of "1"],
> and then multiply C by C'.

The ciphertext in Elgamal is (g^r, M*g^xr), where r is the random seed
and x is the private key. I think David Wagner is talking about the second
part of the ciphertext; it is not possible to find a fixed-key collision
(i.e. with M' != M) for the whole ciphertext, because that would mean
that decryption was ambiguous.

It is certainly possible to find collisions with r' != r and M' = M;
for Elgamal, r' = r + ord(g) is a collision in that sense. But this is
completely harmless for the delivery failure proof: exhibiting *any*
r and M such that E(pk, r, M) = C proves that C \in ValidCT_pk, and
therefore D_sk(C) = M.

[Collisions for different keys are not really relevant, because the
key of the node that is being accused of failure can be assumed to be
known to any verifier. We can also assume that this key is valid (i.e.
was generated by G) and secure, because the node chose it, and
has no incentive to generate an invalid or insecure key. This is
important because unambiguous decryption does not necessarily hold
for invalid keys.]

#    OAEP(m, r) = m0^k1 XOR G(r) || r XOR H(m0^k1 XOR G(r))

Note that the version of OAEP that has been standardised (in IEEE
Std 1363-2000 and PKCS #1 v2.0) and is now commonly used, is

     OAEP(r, M, P) = DB XOR G(r) || r XOR H(DB XOR G(r))
     where DB = h(P) || 0^k || 1 || M
            P = encoding parameters (often zero length).

[In fact when G and H are constructed using "MGF1" with the hash
function h, as specified in the above standards, then RSA-OAEP could
be proven fixed-key strongly ciphertext collision resistant assuming
only that h is collision resistant, i.e. without using the random
oracle model - but this is academic because fixed-key strong
ciphertext collision resistance isn't required.]

Next problem: the proofs of Goals 2 and 3 are flawed (although they
can be fixed).

In the proof of Goal 3, the string that Alice presents is called
D_k(foo), when it is not known to be of that form. In fact there's
another minor mistake here - the decryption of foo would include
(k+1) - but the main point I want to make is that foo is not known
a priori to be a valid ciphertext, and the D_k(foo) notation makes
an implicit assumption that foo can be validly decrypted. Also the
verifier cannot compute the function D_k, so cannot check directly
that this value is equal to D_k(foo) (checking by re-encryption is
not the same thing).

Suppose, then, that Alice presents rk, I_{k+1}, and bar, such that
E_k(rk, I_{k+1}, bar) = foo. The proof would then continue as follows:

  If N_k finds that foo is not a valid ciphertext for pk, then it
  would ignore the packet. But if Alice can provide any plaintext
  that encrypts to foo under N_k's public key, then foo must have
  been a valid ciphertext for pk. So D_k(foo) != INVALID. Since
  we are using an encryption scheme with unambiguous decryption
  and since N_k's public key can be assumed to be valid,
  E_k(rk, I_{k+1}, bar) = foo implies that D_k(foo) = (I_{k+1}, bar).

  Thus N_k can obtain (I_{k+1}, bar) from foo. If such an entry was
  on the ledger (who wrote it there is irrelevant), and N_k did not
  later post "To N_{k+1}: bar" to the ledger, then N_k really did
  fail. Alice's claim is therefore true.

  Therefore, we satisfy Goal 3.

Goal 2 gave me a real headache, until I realised that it is not
necessary. The proof of Goal 3 already shows that no-one can exhibit
(rk, I_{k+1}, bar) showing that N_k dropped a message which is
valid up to and including the N_k layer, unless N_k actually
dropped it. So it is sufficient to merge Goal 2 and Goal 3 into:

# Goal 2, Reject false claims: Any participant (including Alice)
# cannot claim that node N_j failed to pass on a well-formed message
# sent by Alice to N_{j+1} (and possibly onward from there to Bob),
# unless it really did fail to send such a message.

Saying "a well-formed message sent by Alice to N_{j+1}",
instead of "a well-formed message sent by Alice to Bob", is
important because the protocol does not prove whether the parts
of the message encrypted to N_{j+1}, ... Bob are well-formed or
not. In fact we do not care whether those parts are well-formed:
if Alice shows that N_j failed to pass on a message to N_{j+1},
but the later parts of the message were ill-formed, then N_j has
still failed. Similarly, we don't care about how the message
to N_j got onto the ledger; it is considered well-formed iff
the layer encrypted to N_j's public key is correct.

So the whole proof for the merged Goal 2 (which addresses soundness,
not completeness) can be written as:

  False claims are rejected

  Sketch  We wish to show that any participant cannot claim that
  node N_j failed to pass on a well-formed message sent by Alice
  to N_{j+1}, unless it really did fail to send such a message.
  Without loss of generality we will assume that the prover is
  also Alice (this allows the prover to use information that
  only Alice has, e.g. the random seed values).

   - If Alice posted a real message P which includes N_j as normal,
     then a well-behaving, honest N_j will have forwarded it on and
     produced an intermediate message. Any verifier will see this
     intermediate message when Alice claims N_j failed and
     disbelieve Alice. By assumption, Alice cannot delete the
     intermediate message from the ledger, so N_j is protected
     from such a false claim.

   - In order for Alice to claim that N_j failed to pass on her
     message as above, Alice would need:
     - an entry "To N_j: foo" present on the ledger,
     - knowledge of some string bar, seed rj, and destination
       N_{j+1}, such that E_j(rj, I_{j+1}, bar) = foo.

     If N_j finds that foo is not a valid ciphertext for pk, then
     it would ignore the packet. But if Alice can provide any
     plaintext (I_{j+1}, bar) that encrypts to foo under N_j's
     public key, then foo must have been a valid ciphertext for pk.
     So D_j(foo) is not INVALID. Since we are using an encryption
     scheme with unambiguous decryption, E_j(rj, I_{j+1}, bar) = foo
     implies that D_j(foo) = (I_{j+1}, bar).

     (We can assume that N_j's public key is validly generated,
     because it is chosen by N_j, who would have no incentive
     to generate an invalid key.)

     Thus N_j can obtain (I_{j+1}, bar) from foo. If such an entry
     was on the ledger (who wrote it there is irrelevant), and N_j
     did not later post "To N_{j+1}: bar" to the ledger, then N_j
     really did fail. Alice's claim is therefore true.

  Therefore, we satisfy Goal 2. Note that this also covers the
  case where Alice attempts to send N_j ill-formed messages, since
  (I_{j+1}, bar) is not ill-formed, and we have proven that when
  Alice's claim is accepted, N_j must have received a message with
  that as the plaintext. Any other messages that N_j may have received
  are irrelevant.

It is also necessary to show that the cryptosystem remains secure
when random seeds are published, but this is true because the seeds
are only published by someone who encrypts to the public key, and does
not know the private key. In that case, revealing the seed can only
affect the secrecy of the specific message encrypted under that seed
(which is revealed anyway at the same time).

I think that was all :-) Good luck with the deadline.

- -- 
David Hopwood <hopwood@zetnet.co.uk>

Home page & PGP public key: http://www.users.zetnet.co.uk/hopwood/
RSA 2048-bit; fingerprint 71 8E A6 23 0E D3 4C E5  0F 69 8C D4 FA 66 15 01
Nothing in this message is intended to be legally binding. If I revoke a
public key but refuse to specify why, it is because the private key has been
seized under the Regulation of Investigatory Powers Act; see www.fipr.org/rip

Version: 2.6.3i
Charset: noconv