
Re: Comments on Proposal 105 -- handshake-revision



On Wed, Oct 17, 2007 at 11:48:54PM +0100, Steven Murdoch wrote:
> I would suggest the proposal specify precisely the behaviour of a node
> on receiving the addresses in the NETINFO cell. Currently it is vague.
> I also worry about a host behind NAT. It will send out the internal IP
> address in the NETINFO cell, so will not match the IP address that the
> receiver sees. How would this case be handled?

It's been a while since we wrote that MITM part. I'm not sure I still
believe in it.

The primary vulnerability we worry about is due to the fact that Tor
combines multiple circuits inside the same TCP connection, and it assumes
it's got the right connection just based on the fellow on the other side
having the right key.

So the attack is that a malicious client somewhere guesses that there's no
connection yet between server1 and server2, builds a circuit to server1,
then asks server1 to extend the circuit to decoyIP:decoyPort specifying
the identity fingerprint of server2. Then decoyIP turns out to just
bounce traffic to server2. The handshake works, and everybody's happy
-- except the new Alice who comes along and asks server1 to extend her
circuit to server2, at which point server1 says "oh, I already have a
connection to server2, I'll just use that." (Alice also has problems if
she builds a circuit to server2 and asks it to extend to server1.)

One answer is this complex NETINFO cell that tells both sides who they're
actually talking to in an authenticated manner.

How about if we just remember the IP address we connected to (or the
IP address we got the connection from), and when somebody asks us to
extend to a given identity fingerprint and specifies an IP:port that
doesn't currently have a connection open, we open a new one? That
way we refuse to combine circuits that think they're going to different
places, but we're free to combine circuits that think they're going to
the same place.
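
A minimal sketch of the reuse rule described above, not how any Tor
release actually does it. The names (ConnectionTable, record,
get_for_extend) are hypothetical, connections are stand-in strings, and
the table is keyed on (fingerprint, IP) only, on the assumption stated
further down that the IP address is all that matters:

```python
from dataclasses import dataclass, field


@dataclass
class ConnectionTable:
    """Hypothetical per-server table of open connections."""
    # (identity fingerprint, peer IP) -> connection handle
    conns: dict = field(default_factory=dict)
    opened: int = 0  # how many new connections we had to open

    def record(self, fingerprint, ip, conn):
        # Remember the IP we connected to (or got the connection from).
        self.conns[(fingerprint, ip)] = conn

    def get_for_extend(self, fingerprint, ip, port):
        # Reuse a connection only if it was made to the IP the extend
        # request names; a matching key alone is not enough.
        key = (fingerprint, ip)
        conn = self.conns.get(key)
        if conn is None:
            conn = self._open(fingerprint, ip, port)
            self.conns[key] = conn
        return conn

    def _open(self, fingerprint, ip, port):
        # Stand-in for a real TLS connect plus handshake.
        self.opened += 1
        return f"conn#{self.opened} to {ip}:{port} as {fingerprint}"
```

With this, the attacker's connection via decoyIP is never handed to a
later Alice who names server2's real address; she gets a fresh
connection, and circuits naming the same address still share one.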

We become vulnerable to an attacker who actually can redirect traffic
en route between server1 and server2, but we're in bad shape against
this attacker anyway.

This "you wanted server2, I see you got his IP address wrong, I'll do what
you meant instead of what you asked for" behavior is designed, if I recall
correctly, to tolerate servers on dynamic IP addresses, so even if you
have an old descriptor it still might work. It's also leftover from the
days of "server twins", where two servers in different locations shared
the same identity key. The design choice is 5 years old at this point,
and probably should have been taken out when we took out server twins.

(I don't know how our NETINFO cell is going to handle this situation any
better, because there's no easy way to distinguish between a malicious
neighbor with a similar IP address and a server that just got a new
lease.)

If we ever give servers the ability to advertise multiple IP addresses,
they should specify a "primary one", and clients should ask for this
primary one when sending an extend cell from one server to another
server. (I make the assumption that multiple advertised IP addresses
are helpful only to clients who are trying to get around some firewall.)
(I also make the assumption that IP address is all that matters -- if
the attacker shares the host computer and can "be" other ports on the
legitimate IP address, more power to him.)

Ok, the next stumbling block: what about those servers who advertise one
address yet bind to a different one for outbound connections? One option
for them would be to have them detect the address they actually use,
and write it in their descriptor. Then server1 can make an exception for
"well, you asked for server2's primary IP address, but I see I have a
connection to another one that server2 promised was acceptable too. You'll
be ok with that."
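
The exception above could look roughly like the following. This is only
a sketch: acceptable_existing_conn and descriptor_addrs are made-up
names, and it assumes server1 already holds server2's descriptor with
every address server2 says it may use:

```python
def acceptable_existing_conn(requested_ip, conn_ip, descriptor_addrs):
    """May a connection seen at conn_ip serve a request for requested_ip?

    descriptor_addrs is the (hypothetical) set of addresses server2
    listed in its descriptor, including the primary one.
    """
    if conn_ip == requested_ip:
        return True
    # Different address: allow it only if server2 vouched for both.
    return requested_ip in descriptor_addrs and conn_ip in descriptor_addrs
```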

(If I'm not mistaken, we would need to do something like this for the
NETINFO approach too, since we'll have exactly the same problem, right?)

Ok, here's one advantage of the NETINFO approach, in the case of servers
that have multiple IP addresses: the NETINFO cell can communicate all
of these IP addresses. Otherwise we either divide the anonymity set
(in practice, there will be one TCP connection for circuits initiated by
server1 to server2, and a second TCP connection for circuits initiated by
server2 to server1), or we assume that server1 has a copy of server2's
descriptor so he can learn that both IP addresses are legit. (As a
hack, we could just declare that any IP address on the same /24 as the
official one is safe enough. This may cut down on the number of duplicate
connections enough that we're ok with what's left.) I'm guessing that
since we've been trying to maintain the "servers don't need to know about
the network" goal, the NETINFO cell approach will end up winning. Nick,
was this what you've been thinking all along, or did I just clarify some
of our assumptions?
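
For what it's worth, the /24 hack mentioned above is cheap to check.
A sketch using Python's standard ipaddress module, assuming IPv4 only;
same_slash24 is a made-up helper name:

```python
import ipaddress


def same_slash24(official_ip: str, observed_ip: str) -> bool:
    """True if observed_ip is in the same /24 as official_ip (IPv4)."""
    # strict=False lets us pass a host address and get its enclosing /24.
    network = ipaddress.ip_network(f"{official_ip}/24", strict=False)
    return ipaddress.ip_address(observed_ip) in network
```

Note that this accepts a malicious neighbor on the same /24 just as
readily as a server that got a new lease, which is the tradeoff the
hack makes.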

--Roger