Re: [or-cvs] Implemented link padding and receiver token buckets



On Mon, Jul 15, 2002 at 09:12:18PM -0400, Roger Dingledine wrote:
> Speaking of too full, if you run three servers at 100kB/s with -l debug,
> it spends too much time printing debugging messages to be able to keep
> up with the cells. The outbuf ultimately fills up and it kills that
> connection. If you run with -l err, it works fine up through 500kB/s and
> probably beyond. Down the road we'll want to teach it to recognize when
> an outbuf is getting full, and back off.

On more thought, this is an absolutely insane design.

Let me tell you a bit more about the implementation:

The outbuf has three parameters:
* len is the number of bytes allocated for the outbuf.
* datalen is the number of bytes ready to send.
* flushlen is the number of bytes I would like to push onto the network
  now.

Currently I queue a cell for every click (the frequency of clicks is
determined by bandwidth -- so for 10240B/s, a click should be every
12.5ms), regardless of datalen. That is, I always insist on sending out
'bandwidth' per second. The current implementation is: for every click,
we increase flushlen appropriately, and if it exceeds datalen, we queue
a padding cell. Whenever we can write onto the network, we try to push
out as much of flushlen as possible.
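
A rough sketch of the click arithmetic and the per-click rule described
above, in Python rather than the real code. The 128-byte cell size is an
assumption inferred from the figures given (10240B/s at one cell per
12.5ms); Outbuf, buflen (standing in for len), click_interval_ms and
on_click are hypothetical names used only for illustration.

```python
from dataclasses import dataclass

CELL_SIZE = 128  # bytes; assumed, inferred from 10240 B/s <-> 12.5 ms per click


def click_interval_ms(bandwidth: int) -> float:
    """Milliseconds between clicks when one cell is queued per click."""
    return 1000.0 * CELL_SIZE / bandwidth


@dataclass
class Outbuf:
    buflen: int        # bytes allocated for the outbuf ("len" in the text)
    datalen: int = 0   # bytes ready to send
    flushlen: int = 0  # bytes we would like to push onto the network now


def on_click(buf: Outbuf) -> bool:
    """Uncapped per-click rule. Returns True if a padding cell was queued."""
    buf.flushlen += CELL_SIZE
    if buf.flushlen > buf.datalen:
        # Not enough real data to cover flushlen: queue one padding cell.
        if buf.datalen + CELL_SIZE > buf.buflen:
            raise OverflowError("outbuf full")
        buf.datalen += CELL_SIZE
        return True
    return False
```

Nothing in this rule bounds flushlen, so if writes stall, every click
queues another padding cell until the buffer is full.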

But the problem here is that if the network is full, then flushlen keeps
increasing, and we keep adding padding cells. So first of all, when the
connection slows down we end up with more and more cells to send out,
eventually ending up with datalen == len. Bad news. Secondly, when a
new data cell arrives, it needs to wait until all that padding is flushed
before it can go onto the network.

Solution: we don't let flushlen exceed a certain size, say 10ms worth
of cells or 10 cells, whichever is larger. Thus we'll never have more
than one flushlen worth of padding cells on the outbuf, ever.
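
To illustrate the cap, here is a small Python simulation of a fully
stalled connection (nothing is ever written), with and without the limit.
It is a sketch under assumptions: the 128-byte cell size is inferred from
the 10240B/s and 12.5ms figures, the names flushlen_cap and simulate_stall
are hypothetical, and enforcing the cap by skipping the click is one
possible reading of "don't let flushlen exceed a certain size".

```python
CELL_SIZE = 128  # bytes; assumed cell size


def flushlen_cap(bandwidth: int) -> int:
    """Larger of 10 ms worth of bytes and 10 cells, in bytes."""
    ten_ms_bytes = bandwidth * 10 // 1000
    return max(ten_ms_bytes, 10 * CELL_SIZE)


def simulate_stall(bandwidth: int, clicks: int, capped: bool) -> int:
    """Run `clicks` clicks while the network accepts nothing.

    Returns the final datalen, which here consists only of padding.
    """
    datalen = 0
    flushlen = 0
    cap = flushlen_cap(bandwidth)
    for _ in range(clicks):
        if capped and flushlen + CELL_SIZE > cap:
            continue  # flushlen is at its cap: this click adds nothing
        flushlen += CELL_SIZE
        if flushlen > datalen:
            datalen += CELL_SIZE  # queue one padding cell
    return datalen


if __name__ == "__main__":
    print(simulate_stall(10240, 1000, capped=False))  # grows with every click
    print(simulate_stall(10240, 1000, capped=True))   # bounded by the cap
```

With the cap in place, the padding queued during a stall stays bounded by
one flushlen, so a newly arrived data cell waits behind at most that much
padding.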

Our padding still successfully hides whether cells are data or padding,
because we stick to the rule "whenever we can send a cell onto the network
we do, but not to exceed 'bandwidth'". And our datalen will only grow
large if we have lots of legitimate data cells queued.

As an aside, there are probably timing attacks here: right now
we pre-crypt data cells, but we crypt-on-demand padding cells. So
an observer might guess which type of cell it is based on how long
it takes us to get it onto the wire. Similarly, if we add and crypt
padding cells _right after write() finishes_, which would allow us to
treat them indistinguishably later on, then an observer might learn
whether we already have data cells ready, based on how long it takes us
to process his input.

This latter approach feels better, but I'm going to ignore this timing
attack issue for now. We can change the code appropriately in the
future. I'm going to go put the rest of this mail into effect.

--Roger