[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Bandwidth throttling (was Re: Padding)



On Fri, 5 Jul 2002, Roger Dingledine wrote:

> The problem is that padding cells don't trigger rebroadcasts. So if
> there's only one circuit open, the only people broadcasting will be
> people who are part of that circuit.
Hmmm - you're right. But can we not make padding cells trigger broadcasts 
as well (again impose a cut-off point where we just pad the whole link to 
make this scalable)?

I think that's still better than doing full-scale padding outright? What 
worries me is that when I explain this to non-anonymity computer 
scientists the first reaction I invariably get is - oh please this is 
coimpletely useless, look at all the bandwidth you are wasting.

> Instead, when a new onion arrives and we're full of circuits, we should
> choose the "most blocked circuit" and tear it down. We can choose this
> by "number of milliseconds we've been blocked writing to it", probably
> modified by some fudge factor that notices "nearly blocked" circuits
> (where the adversary reads one byte just often enough to not count as
> blocked), or by "number of bytes queued but unwritable."
Yes but what about genuine slow connections? I can also (alas this is much 
more expensive) fill up the router with high-bandwidth connections and 
this will kill the little guy who is transfering data to his dial-up pal ...

I agree in that dropping cells shifts the buffering problem to some other 
"victim" but we can make the onion proxy do this and not bother the core 
routing nodes?

> I don't think that dropping cells protects us as well as dropping the
> whole circuit -- if you drop cells but somebody earlier in the stream
> still has to buffer them, then we're just shifting the DoS problem to
> elsewhere in the network. As long as we make it hard for the adversary
> to target a connection and force us to tear it down (and it is hard,
> because he has to run many circuits that don't block), I think we're ok.

True, this sort of attack is expensive and (I think) pretty hard to do in 
practice.

> 
> As for giving us too many onions, we can either start dropping new
> ones when we're too full (which means the adversary can deny service
> to legitimate users), or we can require some sort of recipient-specific
> hashcash (which means we need to keep a list of recently-used hashcash
> tokens, and check against it every time we receive a new onion).
> 
> Take a look at http://freehaven.net/doc/oreilly/accountability-ch16.html
> for more discussion of responses to resource allocation and DoS problems.
> 
Will have a look.

> Then I watch the connections coming out of the onion routing network,
> ordering them by
> a) how excited I would be if I learned Alice was talking with that Bob
> b) how correlated the data stream to that Bob seems to be (number and
>    timing of data) with the data cell stream I'm observing out of Alice.
> I take my remaining roving adversary nodes and put them at the onion
> routers that are talking to each suspicious Bob, in order of suspicion.
> This allows me to match timing and frequency patterns of data cells ---
> if I don't control the exit router, I just see the text from the payloads,
> not the cells themselves.
Yes that's a bitch. But even if you don't ocontrol any onion routers you 
can still confirm the relationship between Alice and Bob by observing the 
network for a longer period of time and corrleating the times of 
connections.

> 
> I can make my job easier, once I control Alice's onion router, by holding
> her data cells there and then releasing them in bursts, one at a time,
> etc, in a way that makes the pattern different from normal traffic
> (and thus easier to recognize on the other end).
Yep. But the shaping has to be considerable so that an honest node in 
between can't screw things up for you with reshaping it. Easy to do 
unfortunately, but will make it easier to spot that something dodgy is 
going on.

> 
> Another attack is to wait until an opportune moment, then close the
> circuit from Alice's end. Alice was talking to one of the Bobs who
> disconnects soon after. Even if I have only one roving node (and use it
> to watch Alice's router), I can correlate the time the circuit closes
> on Alice's end to the time Bob's connection closes.
> 
Again, with heavy traffic on the network you have to do this several times 
to confirm the relationship with some certainty. In general, timing 
attacks are a killer - do you think they are plausible attacks in 
practice?

> Since there are no checksums on the payloads, another attack is to modify
> the payloads on data cells Alice sends, or even to simply fabricate a
> huge pile of data cells and send them downstream. Bob will suddenly start
> getting garbage text. (Note that the circuit will be unrecoverable at
> this point because the stream ciphers will get out of sync with Alice's
> [actually, that points out a difference in my code compared to yours --
> currently I encrypt and decrypt the entire payload, even at the endpoints
> of the conversation. From there I use cell->length to decide how much
> to use. This makes it harder to get out of sync as above; are there any
> reasons we might not want to do it this way?])
Absolutely none :-). None I can think of anyway.

Your attack here is very plausible - but if a router wants to 
destroy/corrupt a circuit going through it, it can simply refuse to 
forward the data. We need some sort of redundancy for this. We'll probably 
have to tackle this at some point down the line.

> 
> Assuming we're pretty good at ordering Bobs based on suspicion, this sort
> of attack is very effective. Can anybody improve on my attacks? Are they
> attacks we're willing to live with?
As yo usuggest below there are ways of dealing with the attacks - the 
one that worries me is the simple "observe the connection times and 
correlate" - how do other anonymity systems deal with that, if at all?

> 
> Here are some approaches to making the attack harder, mainly through
> leaking less information at the entry node:
> * Eventually we should mix and delay 'create' and 'destroy' cells. This
>   will make our network less usable overall; so it's low priority.
> * We should implement the "long-range padding cells that look like data
>   cells" design that Matej is developing, and that Paul already knows
>   about but is too busy to notice we're talking about. ;) Pipenet
>   proposes making all dummy-data messages end at the exit router; but in
>   our case the exit router could then correlate number and timing of cells
>   with the entry router, regardless of whether they're data or dummy-data.
>   So we must drop dummy-data cells at random nodes in the path.
Yes you're right - the way I imagine this to work (and Paul, feel free to 
jump in any time ;-) is that the onion proxy randomly inserts dummy cells 
and makes them end at a random point in the circuit. 

> * If we put a hash into the payload, such that Alice crafts her payloads
>   in reverse order getting the hash right at each step, then the bad guy
>   can't insert bogus data cells, because they'll be dropped at the first
>   honest node. (Perhaps that's the right way to do dummy-data messages
>   too -- just give them a bad hash.) Since we're only worried about the
>   chance that the adversary would get lucky and create a data cell that
>   would be valid *at every hop*, we really only need a few bytes of hash
>   (say, 4) per hop.
>   But true to form (this is the same problem we had with Mixminion,
>   and it was a bitch to solve), this doesn't work for return data. The
>   exit node (which builds data cells for the return path) doesn't know
>   the keys which will be used for crypting along the circuit, and so he
>   can't get the hash right. Come to think of it, does this also mean that
>   he can't generate dummy-data cells at all? It seems hard to design a
>   dummy-data protocol that the exit node can use too. Is it useful to
>   use dummy-data cells if we can only use them on the forward path?
>   And I won't even mention the phrase 'reply onion' here. :)
> 
I like this approach. But I don't think it would work for long-range dummy 
cells - the first node in the circuit can still just drop the bad-hash 
cells. Since they come from the onion proxy which is unlikely to corrupt 
its own data stream, it can assume that they are all dummy padding.

With regards to return data - 
Dummy cells :
I think we can tell each node how to generate dummy cells (embed this in 
the onion) - the other nodes won't be able to recognize them and they will 
get forwarded all the way to the onion proxy (which will discard them) but 
this isn't much of a problem. It will still prevent other nodes (with the 
exception of the exit node) from learning the true number of cells being 
transmitted. The nly way to beat this is to control all the nodes in a 
circuit, which is a killer anyway.

Doing the hashes correctly is a more difficult problem - I will have a 
think.

Matej
> --Roger
> 
> 

-- 
Matej Pfajfar

GPG Public Keys @ http://matejpfajfar.co.uk/keys