
Re: End to end mixminion rationalized, with compression and K-of-N



Summary: I start out disagreeing with Zooko, believing that our failure
model renders the dangers of the chunking approach harmless in terms of
reliability.  A little math, however, convinces me that the non-chunking
approach is harmless in terms of performance.  I close by recommending a
wait-and-see approach.

On Fri, 2002-10-11 at 08:27, Zooko wrote:
> 
> No time to chat, but Mojo Nation used "chunking", exactly as you describe, and 
> it proved to be a major reliability problem.  This is because it incurs the 
> P^K reliability problem, where P is the reliability of a chunk (the 
> probability of recovering a chunk), and K is the number of chunks.

I don't know Mnet's failure model, but are you sure that it's the same
for Mixminion? 

I'm guessing that most failures in a remailer network are not single
packet failures, but rather server-level failures where a given server's
unavailability causes all packets sent through that server to be delayed
or dropped.  Thus, the number of packets doesn't matter so much as the
number of paths. 

Math time: If failures *are* packet-level, then assuming a modest
90% success rate and an EXF of 1.5 or so, a 20-chunk (8960KB) message
will arrive with 99.4% probability.  Success probability only drops
below 99% once a message grows past roughly 31 chunks (about 14MB).
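To make the packet-level arithmetic reproducible, here's a minimal sketch.  I'm assuming each 448KB chunk is K=16 packets of ~28KB payload; those parameters are my assumption, chosen to be consistent with the figures above:

```python
from math import comb

def binom_tail(p, low, n):
    """P(at least `low` of n independent trials succeed, each w.p. p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(low, n + 1))

# Assumed: K=16 data packets per chunk, EXF=1.5 (so N=24 packets sent),
# per-packet success P=0.90, 20 chunks total.
P, K, EXF, CHUNKS = 0.90, 16, 1.5, 20
N = round(K * EXF)
chunk_ok = binom_tail(P, K, N)   # each chunk needs any K of its N packets
msg_ok = chunk_ok ** CHUNKS      # and all 20 chunks must be recoverable
```

With these parameters, msg_ok comes out close to the 99.4% quoted above.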

On the other hand, if failures are mainly path-level, the actual number
of packets is irrelevant; only K, N, and the number of paths matter.
Assume that 20% of paths fail.  An EXF of 1.6 or so will divide each
chunk into enough pieces so that we can reconstruct messages on the exit
nodes.

 [...]
> The next generation data structure in Mnet will not do *any* chunking, which 
> we can get away with by specifying variable-length and large packet sizes.

Actually, variable-length packet sizes were the first recommendation
that Justin made to me; they're a great choice for non-anonymous
systems.  Just hold K constant and let the packet size equal Len(Msg)/K.
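That sizing rule is one line of code; the function name and the K=16 default below are just illustrative:

```python
def packet_size(msg_len, k=16):
    """Hold K constant; each packet carries ceil(msg_len / K) bytes."""
    return -(-msg_len // k)  # ceiling division via floor division
```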

> 
> For your purposes, you might consider (a) using larger K's (how do you know 
> that K > 32 performs badly?),

Hm.  Let me see. The VDM code takes time equal to:

          C * K * payload-size * reconstruct

where C is about 7.7 msec/MB on an 800MHz Pentium III, and 'reconstruct'
is the number of packets in the range 1..K that are missing and must be
reconstructed.  

Assuming packets arrive in a random order and that we reconstruct each
chunk greedily, the expected value of 'reconstruct' will be (N-K)*K/N.
Thus, we're looking at (C * K^2 * (N-K)/N * payload-size) seconds to
reconstruct a chunk.

Wait a second.  We're already decoding at least K packets for every
chunk, and even the simplest packet takes 17.8 msec on similar
hardware.  Thus, assuming EXF=1.5, reconstruction time won't take more
than 10% of decoding time until K=24; won't take more than 20% of
decoding time until K=48; won't take more than 30% of decoding time
until K=72; and so on.  
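The break-even arithmetic can be checked with a rough cost model.  The ~28KB packet payload is my assumption; C and the 17.8 msec decode cost are taken from the figures above:

```python
# Rough cost model for K-of-N reconstruction vs. packet decoding.
C = 7.7e-3           # seconds per MB for the VDM code (800MHz PIII)
DECODE = 17.8e-3     # seconds to decode one packet, simplest case
PAYLOAD_MB = 28.0 / 1024   # assumed ~28KB packet payload

def reconstruct_time(k, exf=1.5):
    """Expected seconds per chunk: C * K^2 * (N-K)/N * payload-size."""
    n = k * exf
    return C * k**2 * ((n - k) / n) * PAYLOAD_MB

def decode_time(k):
    """Lower bound: at least K packets must be decoded per chunk."""
    return k * DECODE

def ratio(k):
    return reconstruct_time(k) / decode_time(k)
```

Since reconstruction cost grows as K^2 while decode cost grows as K, the ratio is linear in K: ratio(24) lands near 10%, ratio(48) near 20%, roughly matching the breakpoints above.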

(This doesn't mean that K-of-N is fast; it just means that crypto is
slow.  Decoding 1MB worth of packets will take: 
           658 msec in the plaintext-forward case,
          1221 msec in the encrypted-forward case,
     and  2719 msec when using an 8-hop reply block. 
Compared to numbers like these, reconstruction doesn't hurt much.  (We'd
save a lot of time by using BEAR instead of LIONESS; see my message of
last month.))

Gosh.  Perhaps this means that we can afford to set K much higher.  Of
course, this is only useful if our failure model is packet-based rather
than path-based.  In a path-based system, you get very little by setting
N higher than the actual number of paths you're using.

>              or (b) defining it to fragment into pieces of 
> whatever size for K = 16 then recursively fragment those pieces again until 
> the packets are small enough.  (That would get complicated I think.)

Let's run the numbers here too.  I'm only going to worry about the
two-level case, since K^2 packets should be enough for anybody. :) In
this case, instead of EXF overhead, we have EXF^2 overhead. In return,
we get more reliability... but how much?  And at what performance cost?

Define BinomTail(P,low,max)=
      _max_
      >      Choose(max,i) P^i (1-P)^(max-i)
      -----
      i=low 
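That definition translates directly into code; here's a straightforward sketch of the sum above:

```python
from math import comb

def binom_tail(p, low, n):
    """BinomTail(P, low, max): sum over i = low..max of
    Choose(max, i) * P^i * (1-P)^(max-i)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(low, n + 1))
```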

If packets are lost independently, we'll receive each level-one fragment
with probability

            BinomTail(P, K, N)

and reconstruct the message with probability

            BinomTail(BinomTail(P, K, N), K, N)

On the other hand, the time to reconstruct the message becomes:

             C * K^2 * (N-K)/N * [K*payloadsize]  (for top-level)
       + K * C * K^2 * (N-K)/N * [payloadsize]    (for each sub-level)

If K=30, EXF=SQRT(1.5), and P=.90, we send 1350 packets to deliver a
900-packet message, and reconstruct the message with P=.999996 for
independent packet loss and P=fixed for path-based packet loss.  It
takes us 2.2 seconds to reconstruct the message, out of at least 16
seconds to decode all the packets.

On the other hand, a chunking approach with K=30, EXF=1.5 takes the same
number of packets, and succeeds with P=.999989.  Reconstruction now
requires 1.94 seconds.
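Here's a sketch comparing the two schemes.  The two-level N comes out fractional (~36.7), so rounding up to 37 is my assumption, and the ~28KB payload is assumed as before -- which is why these numbers only approximate the ones above:

```python
from math import comb

def binom_tail(p, low, n):
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(low, n + 1))

P, K = 0.90, 30
C, PAYLOAD_MB = 7.7e-3, 28.0 / 1024   # payload size is my assumption

# Two-level fragmentation: EXF = sqrt(1.5) per level; N ~= 36.7, rounded to 37.
N2 = 37
frag_ok = binom_tail(P, K, N2)              # one level-one fragment arrives
two_level_ok = binom_tail(frag_ok, K, N2)   # whole message reconstructed

# Reconstruction time: one top-level pass over K*payload, plus K sub-levels.
per_chunk = C * K**2 * ((N2 - K) / N2) * PAYLOAD_MB
two_level_time = per_chunk * K + K * per_chunk   # top + sub-levels, seconds

# Single-level chunking for comparison: EXF = 1.5, 30 chunks of K = 30.
N1 = 45
chunk_ok = binom_tail(P, K, N1) ** 30
```

With these assumptions two_level_time comes out near the 2.2 seconds quoted above.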

===============

Thus, in any case, we can afford to raise our maximum K.  I'll do that.

What to do after that depends on what we think happens to lost packets: 
Chunking is the best we can do if most failures take down whole paths.
We prefer other approaches if most failures are single-packet.

I'm going to suggest that we delay the actual choice here until we can
measure network reliability, including:
       Ppkt (the probability of a single packet arriving), 
   and Psvr (the probability of a server staying up)

Yours,
--
Nick