[Author Prev][Author Next][Thread Prev][Thread Next][Author Index][Thread Index]

Re: Proposal: Bring Back PathlenCoinWeight



On Wed, Apr 18, 2007 at 02:35:04PM -0700, Mike Perry wrote:
>   The idea is that users should be able to choose a weight which
>   probabilistically chooses their path lengths to be 2 or 3 hops. This
>   weight will essentially be a biased coin that indicates an
>   additional hop (beyond 2) with probability P. The user should be
>   allowed to choose 0 for this weight to always get 2 hops and 1 to
>   always get 3. 
>
>   This value should be modifiable from the controller, and should be 
>   available from Vidalia.

To give some context, the PathlenCoinWeight config option was originally
there as a geometric distribution to add to some default minimum path
length. Its goal was to stop your first hop from knowing for sure that
it was your first hop. That is, if path lengths are unpredictable,
then no hop ever knows his distance from Alice for sure.

(I say 'was originally there' rather than 'was originally introduced'
because we inherited it from the previous onion routing designs.)

We dropped it for three reasons:

- Timing attacks probably give statistical confidence to a given node
  about its position in the circuit.
- Our use of entry guards, and our use of CREATE_FAST cells,
  automatically give it away in many cases, especially for non-relays.
- Given the first two points, if three hops is adequate for security,
  then more than three hops just slows you down with not much benefit.

So this proposal isn't about bringing back the old behavior. It appears
rather to be some combination of:

a) We should let people choose two hops if they want, for better
   performance and to reduce the overall load on the network.
b) We should make path length have a random component, so it's hard
   to tell if a given user prefers two-hop paths or three-hop paths.
c) We should change our policy for dropping guards based on when they
   fail.

Letting each user choose his own random weight is probably a recipe
for disaster, in that it adds complexity to the system and to analysis
and doesn't really provide any clear win. So giving people a couple of
weights to choose from (0, 1, and maybe a few more) seems a smarter move.

>   Furthermore, when blocking resistance measures insert an extra relay
>   hop into the equation, 4 hops will certainly be completely unusable 
>   for these users, especially since it will be considerably more
>   difficult to balance the load across a dark relay net than balancing
>   the load on Tor itself (which today is still not without its flaws).

I believe the blocking-resistance plan is to have 3 hops also -- the
bridge acts quite like an entry guard, and then two more. But all of
that is up for more thorough analysis too.

>   Therefore, the user has little benefit from the extra hop, assuming
>   the adversary does timing correlation on their nodes. The real
>   protection is the probability of getting both the first and last hop, 
>   and this is constant whether the client chooses 2 hops, 3 hops, or 42.

My intuition agrees -- against e2e timing correlation attacks, three
hops is not (much) better than two.

>   Partitioning attacks form another concern. Since Tor uses telescoping
>   to build circuits, it is possible to tell a user is constructing only
>   two hop paths at the entry node. It is questionable if this data is
>   actually worth anything though, especially if the majority of users
>   have easy access to this option, and do actually choose their path
>   lengths semi-randomly.

Agreed. The possibility that an entry guard could learn that its user
is a 2-hop user doesn't bother me very much. At least not until it
turns into a deeper attack.

>   Perhaps most seriously, two hop paths do allow malicious guards 
>   to easily fail circuits if they do not extend to their colluding peers
>   for the exit hop. Since guards can detect the number of hops in a
>   path, they could always fail the 3 hop circuits and focus on
>   selectively failing the two hop ones until a peer was chosen.

Yep. In fact, assuming Alice uses Tor a lot, it doesn't even need to fail
every single "safe" circuit. It can just take advantage of the circuits
where it wins, and/or maybe fail a few to help that along. Note that it's
trivial for it to recognize which circuits are the winners, so there's
no synchronization or communication needed except when it's a sure bet.

Another (passive) variant of this attack is that the entry guard could
notice the exit node Alice chooses. If she chooses one with an unusual
exit policy (e.g. it only exits to a few IP addresses, or only to a few
ports) that gives away a lot about Alice's destination. Does this mean
Alice needs to compare the exit policy of her chosen exit to all the other
exits, detect if it's giving away "too much" and add an extra hop if so?
In any case, since Alice keeps using this entry guard, it can build up a
profile via intersection attacks of what types of exit she prefers. How
bad is this? I'm not sure; but it does complicate things.

The inverse of this attack is an exit node building a profile of Alice's
entry guards, if Alice is one of a few people connecting to an unusual
destination. It would use this profile to recognize when a new connection
is "more likely" to have come from Alice.

This in turn plays into a worry that Nick raised a while ago: if a
set of entry guards is a good enough fingerprint, and Alice is on a
laptop and moves around a lot, what if she moves to a hostile network
and somebody has is looking for a fingerprint of preferred entry guards
that matches hers?

It's hard to know how to weight or rank these concerns. But they do make
it less clear that two hops provides equivalent security to three.

>   I believe currently guards are rotated if circuits fail, which does 
>   provide some protection,

Actually, no. Guards are marked as inactive if the initial connection
to them fails, or the create cell to them fails. If a circuit through
them fails, we don't mark anything.

Guards are only actually dropped from the guard list if they are inactive
for many weeks. Otherwise we try them again as soon as the directory
authorities tell us they're running again. Section 5 of path-spec.txt
has a few more details.

> but this could be changed so that an entry 
>   guard is completely abandoned after a certain number of extend or 
>   general circuit failures, though perhaps this also could be gamed 
>   to increase guard turnover.

Right. Remember that we're not only worried about an attacking guard
colluding with an exit node or destination website; we're also worried
about a local network colluding with an exit node or destination website.

But I guess in that second context it doesn't really help the adversary
to force Alice to rotate entry guards. Hm.

> Such a game would be much more noticeable 
>   than an individual guard failing circuits, though, since it would
>   affect all clients, not just those who chose a particular guard.

I'm not sure I follow this. You mean if a middle hop decides to fail
connections from certain guards to encourage users to move away from them?
If the attacker has already learned that Alice is using that guard, that
would make sense; but he'd have to own an awful lot of middle hops for
that attack to be practical.

Another worry is the local network triggering repeated failures only
for Alice. But actually, Alice should be able to distinguish failures
on her local network from authenticated messages from the guards saying
that an extend failed.

Hm. What other issues are there with abandoning guards after failure? One
issue is that we could more quickly rotate away from honest guards onto
bad guards; so we would need to make sure that the bar for failure is
sufficiently high that it doesn't trigger except during one of these
attacks. This part needs more thought.

> Why not fix Pathlen=2?:
> 
>   The main reason I am not advocating that we always use 2 hops is that 
>   in some situations, timing correlation evidence by itself may not be 
>   considered as solid and convincing as an actual, uninterrupted, fully 
>   traced path. Are these timing attacks as effective on a real network 
>   as they are in simulation? Would an extralegal adversary or authoritarian 
>   government even care? In the face of these situation-dependent unknowns, 
>   it should be up to the user to decide if this is a concern for them or not.

Fair enough. How do the above anonymity-breaking attacks look to you, in
this light?

Thanks!
--Roger