
Keeping anonymity when sending/receiving large messages



The question of how to send messages with more than ~60k of payload has been
bugging us for a while. I'm going to start some discussion on it. This
is really tricky stuff, so some of the below should be considered just
a first proposal. Please break it. :)

Let's say Johnny is retrieving a file from Free Haven. He knows that Bob,
a Free Haven server, has his file. The file is 1228800 bytes. Johnny
has a working reply block for Bob.

So he picks five nodes in the mix-net, and builds a bunch of reply blocks.
Specifically, at 60k (60*1024) payload, he's going to need Bob to send
the file in 20 different packets, so he builds 20 reply blocks [1]. At
640 bytes per reply block (128 bytes for each of the 5 hops), that totals
12800 bytes, which easily fits into a single message to Bob.
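
As a quick sanity check of the arithmetic above, here is a small Python
sketch. The constant and function names are made up for illustration; the
numbers (60k payload, 5 hops, 128 bytes per hop, 1228800-byte file) are the
ones from this message.

```python
# Sizes taken from the message; names are hypothetical.
PAYLOAD_BYTES = 60 * 1024   # usable payload per mix-net message
HOPS = 5                    # nodes in each reply-block path
BYTES_PER_HOP = 128         # reply-block bytes contributed by each hop
FILE_BYTES = 1228800        # size of Johnny's file


def reply_blocks_needed(file_bytes, payload_bytes=PAYLOAD_BYTES):
    """One reply block per packet, rounding up for a partial last packet."""
    return (file_bytes + payload_bytes - 1) // payload_bytes


def reply_block_bytes(count, hops=HOPS, bytes_per_hop=BYTES_PER_HOP):
    """Total bytes taken up by `count` reply blocks."""
    return count * hops * bytes_per_hop


blocks = reply_blocks_needed(FILE_BYTES)
total = reply_block_bytes(blocks)
print(blocks, total, total <= PAYLOAD_BYTES)
```

This ignores any per-message overhead and the ascii-armoring question
raised in [1].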

Johnny wants to give reply blocks that prevent people, even Bob, from
learning his identity. He could make a different path for each of the
20 reply blocks; but an adversary who owns all the nodes in *any*
of the paths has now learned his identity -- without any work at all.

By using the same path for all 20 reply blocks, Johnny is more likely
to maintain his unlinkability. On the other hand, an adversary able
to watch quite a bit of the mix-net can still use flooding and timing
attacks to watch the pile of reply messages traverse that path; we must
hope the honest nodes will rearrange and obfuscate streams of messages
enough to foil these attacks.
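
To get a rough feel for the trade-off, here is a toy model in Python. It
assumes each node is adversary-owned independently with probability p_node
and that paths are picked independently of each other; both are
simplifications not taken from this message, and the model says nothing
about the flooding and timing attacks mentioned above. The function names
and the example value of p_node are made up.

```python
def p_path_owned(p_node, hops=5):
    """Chance that every node on one path is adversary-owned (toy model)."""
    return p_node ** hops


def p_any_path_owned(p_node, n_paths, hops=5):
    """Chance that at least one of n independently chosen paths is fully owned."""
    return 1 - (1 - p_path_owned(p_node, hops)) ** n_paths


p_node = 0.3                              # assumed fraction of bad nodes
single = p_path_owned(p_node)             # one path reused for all 20 blocks
many = p_any_path_owned(p_node, 20)       # a different path per reply block
print(f"same path: {single:.5f}  20 paths: {many:.5f}")
```

Under these assumptions, 20 distinct paths give the adversary close to 20
times as many chances to own a complete path as a single reused path does.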

The basic heuristic here is:
* If you want to prevent people from learning your identity, use the
same path over and over until it breaks, then use a second one, etc.
* If you want to prevent people from profiling your behavior, use a
new path for each transaction.
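
The two rules above could be sketched as a small path chooser. This is
only an illustration: the class, its method names, and the node list are
hypothetical and not part of any existing mix-net code.

```python
import random


class PathChooser:
    """Toy path selection following the two rules above (hypothetical API)."""

    def __init__(self, nodes, hops=5, rng=None):
        self.nodes = list(nodes)
        self.hops = hops
        self.rng = rng or random.Random()
        self._sticky = None

    def fresh_path(self):
        """Anti-profiling rule: a new random path for each transaction."""
        return tuple(self.rng.sample(self.nodes, self.hops))

    def sticky_path(self):
        """Identity-protection rule: keep reusing one path until it breaks."""
        if self._sticky is None:
            self._sticky = self.fresh_path()
        return self._sticky

    def mark_broken(self):
        """Forget the current sticky path so the next call picks a second one."""
        self._sticky = None


chooser = PathChooser([f"node{i}" for i in range(10)], rng=random.Random(0))
first = chooser.sticky_path()
# All 20 reply blocks for one file would reuse the same path.
paths_for_file = [chooser.sticky_path() for _ in range(20)]
```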

In this case Johnny is worried about leaking identity, so he should use
the same path for all the reply blocks. For the same reason, Bob should
anonymize his replies using one path for all pieces.

Should Bob use that same path for transactions with other people? If so,
then basically he has an 'exit point' which is where all traffic from
Bob comes out onto the mix-net. An evil Johnny could build a reply block
where he owns the first hop, to learn Bob's exit point. Then Johnny can
observe Bob's exit point to see how much traffic Bob is putting out (exactly
20 messages to a node means Bob is serving that same file to somebody else).
Perhaps this is ok; it's definitely something to keep in mind.

So how does this discussion influence the design of the mix-net itself?
The main way is in the observation that we must not do any redundancy
operations at the level of the mix-net itself. If people have a long
message to send, then they should be able to simply break it into a set
of 60k chunks and send each of them in a mixminion message. If they want
to get redundancy, they should get it at a higher level -- say, breaking
their file into more chunks, and sending more mixminion messages so not
all of them have to get through. And realistically, if people use the
same path for all of the chunks, then usually either all of them will
get through or none of them will.
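
The "simply break it into a set of 60k chunks" step is small enough to
sketch directly. The function name is made up, and nothing here touches a
real mixminion interface; each chunk would be handed to whatever sends a
single message.

```python
CHUNK_BYTES = 60 * 1024  # payload size per message, as used in this message


def split_into_chunks(data, chunk_bytes=CHUNK_BYTES):
    """Split data into consecutive pieces of at most chunk_bytes bytes."""
    return [data[i:i + chunk_bytes] for i in range(0, len(data), chunk_bytes)]


data = bytes(1228800)                 # stand-in for Johnny's file
chunks = split_into_chunks(data)
print(len(chunks), len(chunks[-1]))
```

Any redundancy (extra chunks so that not all of them have to arrive) would
be layered on top of this by the sender, not by the mix-net.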

And another nasty point -- what about when Johnny wants a file that's
larger than 5.5 megs? That's the point where he can't fit enough reply
blocks in a single mixminion message. One approach is to limit the size
of large messages, and declare that anything larger should be thought
of as separate transactions (e.g., separate files at the Free Haven
level). Another idea is to use a "self-addressed stamped envelope"
approach to get more reply blocks from somebody. Hm.
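
For reference, the size limit can be estimated with the same numbers as
before. This sketch assumes the entire 60k payload of the request is filled
with 640-byte reply blocks, which is an upper bound: a real request would
also need room for the request itself. The names are made up.

```python
PAYLOAD_BYTES = 60 * 1024     # payload per message
REPLY_BLOCK_BYTES = 128 * 5   # 640 bytes per 5-hop reply block

# Upper bound: the whole request payload is nothing but reply blocks.
max_reply_blocks = PAYLOAD_BYTES // REPLY_BLOCK_BYTES
max_file_bytes = max_reply_blocks * PAYLOAD_BYTES
max_file_mib = max_file_bytes / (1024 * 1024)
print(max_reply_blocks, max_file_bytes, max_file_mib)
```

That gives 96 reply blocks and a bit over 5.6 megs, in the same ballpark
as the figure above once some space is set aside for the request.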

Anyway, I'm out of steam here. Am I on crack? Is this all there is to say
on the subject?

--Roger

[1] Actually, would the file need to be ASCII-armored? How much does
that lose?
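
One way to put a number on [1], assuming the armoring were plain base64
with no line breaks or headers (an assumption; nothing in this message says
which encoding would be used): base64 turns every 3 raw bytes into 4
characters, so a 60k payload carries 45k of raw data.

```python
import base64

PAYLOAD_BYTES = 60 * 1024
FILE_BYTES = 1228800

# Raw bytes that fit in one payload after base64 (3 bytes -> 4 characters).
raw_per_packet = PAYLOAD_BYTES // 4 * 3
armored_len = len(base64.b64encode(bytes(raw_per_packet)))

# Packets (and reply blocks) needed for the armored file, rounding up.
armored_packets = (FILE_BYTES + raw_per_packet - 1) // raw_per_packet
print(raw_per_packet, armored_len, armored_packets)
```

Under that assumption the 20 packets become 27, and the reply blocks grow
from 12800 to 17280 bytes, which still fits in one message.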