[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
End to end Mixminion issues, with proposed solutions
Hello, everyone!
I wrote this piece up to try to resolve a bunch of nagging points in the
current spec. I propose the following combined solution. (Comments
would be really neat.)
PART I: END-TO-END MIXMINION: Or, "Weren't there supposed to be clients?"
Right now, we've got several aspects of the system in a half-specified
state, and it turns out that they all cover a similar area: end-to-end
communication. These half-specified aspects are:
1) End-to-end encryption on forward messages
[Alice wants to send a message to Bob that only Bob can read. Alice
knows Bob's identity, but wants to remain anonymous herself.]
2) 'Stateless' reply blocks
[Alice wants to generate a bunch of SURBs, but not have to store
separate keys for each one.]
3) Recognizing message sizes
[Bob has just received a message from Alice. How does he know where
it ends?]
4) Recognizing forward/reply messages as a recipient
[Bob has just received a message from Alice. Must he decode it?
Should he try his public key, or is the message a reply, or is it
unencrypted?]
5) The ad-hoc use of the TAG field.
(To abbreviate, I'm going to refer to the following 'message types':
FWD (unencrypted forward message)
EFWD (encrypted forward message)
REPL (Reply block delivery)
SREPL ('Stateless' reply: requires minimal client-side storage.)
JUNK (Any other message type, with payload tagged after crossover.)
)
Here are some design goals:
1) A recipient should be able to decrypt all incoming messages
with a minimum of effort.
2) Although it will always be more secure for a recipient to run
her own exit node, a recipient should be able to receive FWD
and S?REPL messages with without running her own exit node and
still get acceptable anonymity. [This is desirable because
some recipients are likely to have unreliable network
connections or no fixed IP address.]
3) Minimal exit distinguishability: as much as possible, an
adversary observing an exit node should not be able to tell the
different 'message types' apart. (Obviously, because FWD
messages are delivered as plaintext, they can't be mistaken for
EFWD,REPL,SREPL, or JUNK. The others, though, should be
indistinguishable to anybody but the recipient.) [This is
desirable to keep an edge-watching adversary from partitioning
the traffic more than necessary.]
4) FWD recipients must still be able to read their messages
without any client support.
5) All message types must be deliverable by SMTP.
6) As much as possible, recipients should be able to tell whether
they have decrypted a message correctly.
And here's the design. It's got some weaknesses, but I think it's a
good start. Please let me know what's wrong, and how to fix it.
_META_
A) All exit types have their Routing-Info field begin with a 20-byte
Tag. In all cases, the contents of a Tag field should be a 20-byte
random number, with MSB=0. (I.e., 0 <= Tag < 2**(159): Tag is really
only 159 bits.)
[See C below to find out why the MSB=0.]
(Remember, the Tag is in the header; the sender generates it for
forward messages, and the recipient generates it for replies.)
SENDER:
B) For FWD messages, the sender generates a 159-bit random number for
the Tag. For a text message, the sender pads the payload with 00
bytes. For a binary message, the sender sends 2 size bytes,
the payload, and random padding up to 28K.
C) For EFWD messages, the sender generates a 128-bit temporary key K at
random, and RSA-OAEP encrypts it with the recipient's public key.
If the encrypted key EK has its MSB=1, the sender generates a new
key and tries again. Otherwise, the sender uses the first 20
bytes of EK as the Tag. The payload contains: The remaining bytes
of EK, followed by the LIONESS encryption (with K) of:
2 size bytes,
the message
Random padding up to 28K.
{{This is the iffy part. I know that RSA-encrypted data is
distinguishable from random numbers, since the encrypted message
is never greater than the public key modulus. I _think_ that
discarding results with MSB=1 forces the encrypted data to 'look
random', but I've really got no clue.
Does this work at all? How well? If this doesn't work, what might?
Efficiency shouldn't be too bad: On average, a sender only has
to do 1 extra RSA public-key encryptions per EFWD message.}}
D) REPL/SREPL messages: The sender does nothing to the Tag field;
that's part of the SURB header, and under the recipient's
control. The recipient can't tell whether the SURB is 'stateless'
or not.
Each SURB includes an encryption key K. The sender generates a
payload containing the string 'Mix Reply', two bytes of size
data, the message, and random padding up to 28K. The sender then
encrypts this payload with K and sends it.
SMTP DELIVERY:
E) If the message is to be delivered over a text medium such as
SMTP, the exit node checks whether the message is contains only
'printing' characters (hex 07..0D, 20..FF), and
ends with a string of 00 characters. If so, the exit node strips
the 00 characters at the end, and delivers the message as-is.
Otherwise, the exit node sends the entire payload, base-64
encoded.
In either case, the exit node includes the tag as part of the
SMTP body, base-64 encoded.
{This way, senders can send text-only FWD messages to
recipients without special software. The problem is, I think
this fails in the presence of UTF-8 and friends. Is this okay?}
SURB GENERATION:
F) Generating a 'stateful' reply block:
The SURB generator generates a set of shared secrets: one for
each hop of the SURB, and one for end-to-end encryption. The
generator also creates a 159-bit identifier, and stores the keys
on disk, indexed by this identifier. The generator uses the
identifier as the Tag field.
G) Generating a 'stateless' reply block:
The SURB generator must remember a secure password, or have a
long-term secret stored on disk. We call this secret SEC.
The SURB generator searches for a 159-bit seed S such that
H(S|SEC|"Validate") ends with a 0 byte. This seed becomes the
Tag. The SURB generator then uses H(S|SEC|"Generate") to seed a
PRNG to generate shared secrets as above.
{See J below to learn why we search for H(S|SEC|"Validate")
ending with 0.}
RECEIVING A MESSAGE:
H) First, if the message is a text FWD message, the sender will
probably know what to do with it; it's obviously text.
(Else go to step I.)
I) Second, if a recipient uses 'stateful' reply blocks, she checks
whether the messages Tag field matches the index of any of her
stored key sets. If so, she uses that key set to decrypt the
message. If the end result does not start with 'Mix Reply',
then the payload has been junked.
(If the Tag field is *not* the index of a stored keyset, go to
step J.)
J) If a recipient uses 'stateless' reply blocks, she checks
whether H(Tag|SEC|"Validate") ends with 0. If so, she uses
H(Tag|SEC|"Generate") to reconstruct the key set. If the end
result does not start with 'Mix Reply', then the payload has
been junked, _OR_ the message was not actually generated with a
reply block.
{Why the ends-with-0 step? Performance. Decrypting a 16-hop
message takes as much as 95 ms on my laptop, because of the 17
LIONESS steps. On the other hand, trying 256 SHA-1 hashes to
find one ending with 0 takes 1 ms. By spending 1 ms per SURB
generation, we save save an average of 94.6 ms every time we
receive a message _not_ send with SREPL.}
(If this step fails, go to step K.)
K) If the recipient has a public key of size N, she appends the
tag to the first N-20 bytes of the payload, and tries to
decrypt the result. (If the OAEP padding doesn't check, the
message is junk, or was not encrypted: go to step L.) The
result is a key; the recipient uses it to decrypt the rest of
the message.
L) The message is either unencrypted data, or junk.
Open questions:
1) We should figure out compression at some point.
2) Likewise for fragments.
3) Likewise for k-of-n and friends.
4) Did I really get indistinguishability for everything but FWD?
5) Will my scary hack really make RSA-encrypted data hard to distinguish
from random numbers? If not, do we have any other choices?
Part II: WHAT'S AN ADDRESS?
Another end-to-end issue is: how can clients specify addresses?
To send a forward message, or to generate a SURB, the program must know:
1) The final routing type.
2) The non-Tag portion of the routing info.
3) If the sender wants to send a EFWD message, the recipient's
public key.
4) If the routing type depends on a particular node (as does
MBOX), a valid sever descriptor for that node.
If the sender knows (3) and (4), we're okay. But -- as George asks --
how is a sender supposed to find them out? If the sender just gets them
off a keyserver, an observer (or the keyserver) can tell who wants to
talk to whom.
Possible solution: just use the Mix system to query the keyserver! The
would-be recipient uses Minion to sent the keyserver a list of users and
a SURB to receive the reply. (Alternatively, just send a non-encrypted
anonymous mail to the user and ask for his key.)
Of course, we still have some pretty severe usability problems. When I
want to use this thing from the command line, what do I need to tell the
client? For now, I'm going to make users specify public keys and
'final' nodes on the command line, but that'll probably need to change.
Otherwise, it'll be too easy to accidentally send a reply to the wrong
server.