Re: Proposal 163: Detecting whether a connection comes from a client



On Fri, May 22, 2009 at 02:59:53AM -0400, Nick Mathewson wrote:
>    There are at least two reasons for which Tor servers want to tell
>    which connections come from clients and which come from other
>    servers:
> 
>      1) Some exits, proposal 152 notwithstanding, want to disallow
>         their use as single-hop proxies.
>      2) Some performance-related proposals involve prioritizing
>         traffic from relays, or limiting traffic per client (but not
>         per relay).

We should think about these two reasons separately.

In particular, I think #2 is a distraction. In performance.pdf I proposed
that one option to improve Tor's speed is to rate-limit clients (say,
10KB bandwidthrate and 500KB bandwidthburst). Nick wants to do this
rate limiting on relays, because it would be too easy for a client to
cheat and remove the self-limiting. But can't a BitTorrent user cheat
the relay-side limiting too, by saying "numentryguards 100" and spreading
traffic over a hundred guards? What we'd really want (if we decide it's a
good idea in the first place) is to limit the total traffic the client
can induce.
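
To put rough numbers on that, here is a back-of-the-envelope sketch in
Python. It assumes each relay enforces the limit independently per client
connection and that relays don't coordinate; the function names are made
up for illustration, and only the 10KB/500KB figures come from the text
above.

```python
# Hypothetical sketch, not Tor code: what a client can pull in aggregate
# when every relay applies the same per-client limit on its own.

PER_CLIENT_RATE_KB = 10    # per-second rate, the "bandwidthrate" figure above
PER_CLIENT_BURST_KB = 500  # burst size, the "bandwidthburst" figure above


def aggregate_rate_kb(num_entry_guards, per_relay_rate_kb=PER_CLIENT_RATE_KB):
    """Sustained KB/s available if each guard limits the client separately."""
    return num_entry_guards * per_relay_rate_kb


def aggregate_burst_kb(num_entry_guards, per_relay_burst_kb=PER_CLIENT_BURST_KB):
    """Total burst in KB available across all guards."""
    return num_entry_guards * per_relay_burst_kb


if __name__ == "__main__":
    for guards in (3, 100):
        print(guards, "guards:",
              aggregate_rate_kb(guards), "KB/s sustained,",
              aggregate_burst_kb(guards), "KB burst")
```

The per-relay limit scales linearly with the number of guards the client
chooses, which is why a limit on total induced traffic is the thing
we'd actually want.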

So I think if we want to do this proposal, it should be because of
reason #1 above. That also helps us clarify our threat model and security
requirements.

>    When a node or circuit tries to use server privileges, if it is
>    "definitely a client" as per above, we can refuse it immediately.
> 
>    If it's "probably a server" as per above, we can accept it.

So far so good.

>    Otherwise, we have either a client, or a server that is neither
>    listed in any consensus or used by any other clients -- in other
>    words, a new or private server.
> 
>    For these servers, we should attempt to build one or more test
>    circuits through them.  If enough of the circuits succeed, the
>    node is a real relay.  If not, it is probably a client.
> 
>    While we are waiting for the test circuits to succeed, we should
>    allow a short grace period in which server privileges are
>    permitted.  When a test is done, we should remember its outcome
>    for a while, so we don't need to do it again.

I think Sebastian had an important point here: we're not doing very
well at the arms race if the attacker just needs to pass the test once
or twice per time period. In fact, if our goal is
   To make grabbing relay privileges at least as difficult as just
   running a relay.
then what is the hard part of running a relay? One answer is "setting
up port forwarding". Another answer is "transiting lots of bandwidth
and potentially getting your ISP upset at you". I would argue that the
arms race you propose focuses on the former, and to stay ahead of the
game it ought to focus on the latter.

> [... a variety of complex and fragile steps in the arms race ...]

Here's a counterproposal: if it's listed in the consensus, then it's
a relay, and if it's not, then it isn't.

This could have some false negatives. In particular:

1) Won't we refuse to exit for relays that just joined the consensus? I
think these are fine actually: assuming exit relays fetch the consensus
at the accelerated "directory mirror" rate, then they should always hear
about new relays before ordinary clients do.

2) What about relays that are dropped from the most recent consensus
(say because they failed recent reachability tests), yet they're still
up and clients are still using them? I guess that argues for remembering
(summaries of) the last few consensuses so you can guess better. Yuck.

3) What about exit relays that just started up, and don't magically know
the past few consensuses? First, it will be a few cycles before clients
learn about the exit relay, so they'll rarely be totally without history.
Second, perhaps those are acceptable degradations -- as long as the begin
cell is refused with one of the reasons in edge_reason_is_retriable(),
then the client will try somewhere else. If the reason is EXITPOLICY,
the client will avoid that exit until it gets a new descriptor; I'm not
sure if that means we should make use of that or not.

4) What about the glorious future when Tor has scaled and it's harder
for every exit to know all consensuses? One answer is that we can tackle
this problem then. Another answer is that the alternative arms races
start to suck then too, because too many people are in the "we have to
do active checking" edge case.
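
For concreteness, the "listed in the consensus" rule, together with the
history from point 2 and the retriable refusal from point 3, could be
sketched roughly as below. This is a minimal Python sketch under my own
assumptions, not Tor code: RecentConsensuses, the keep=3 window, and the
"accept"/"refuse_retriable" strings are all hypothetical names.

```python
from collections import deque


class RecentConsensuses:
    """Remembers the relay fingerprints from the last few consensuses."""

    def __init__(self, keep=3):
        # keep is an illustrative window size, not a value from any spec.
        self._history = deque(maxlen=keep)

    def add_consensus(self, fingerprints):
        """Record a new consensus; the oldest one falls off when full."""
        self._history.append(frozenset(fingerprints))

    def is_relay(self, fingerprint):
        """True if the fingerprint appears in any remembered consensus."""
        return any(fingerprint in consensus for consensus in self._history)


def handle_begin(history, previous_hop_fingerprint):
    """Decide what an exit does with a begin cell.

    Unknown previous hops get a refusal the client is expected to retry
    elsewhere, so a freshly started exit with no history degrades softly.
    """
    if history.is_relay(previous_hop_fingerprint):
        return "accept"
    return "refuse_retriable"
```

A relay that was dropped from the newest consensus but listed in one of
the remembered ones is still accepted, which covers case 2 at the cost
of keeping the summaries around.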

While I'm at it, here's another design: the authorities sign a little
token that a relay can use when connecting to another relay to prove
its relayness. No fresh token, you're not a relay. A problem with this
design though is that the incentives aren't lined up right: why does
the middle hop care whether he gets a token? He never sees any harm to
himself by not having one. I guess we could just make it part of the
protocol, and he'll never even need to know it's happening. Is this a
direction worth exploring?
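
A rough sketch of the freshness check, to show the shape of the idea.
Big caveat: this uses an HMAC with a shared demo key as a stand-in for
the authorities' signature, purely so the example runs with the Python
standard library. A real design would need a public-key signature so
that relays can verify tokens without being able to mint them. The token
layout, the one-hour lifetime, and all names here are my assumptions.

```python
import hashlib
import hmac

# Stand-in for an authority signing key (demo only; see caveat above).
AUTHORITY_KEY = b"demo-only-secret"
TOKEN_LIFETIME = 3600  # seconds; hypothetical freshness window


def _mac(fingerprint, issued):
    msg = f"{fingerprint}|{issued}".encode()
    return hmac.new(AUTHORITY_KEY, msg, hashlib.sha256).hexdigest()


def issue_token(fingerprint, now):
    """Authority side: bind a relay fingerprint to an issue time."""
    issued = str(int(now))
    return f"{fingerprint}|{issued}|{_mac(fingerprint, issued)}"


def token_is_fresh(token, fingerprint, now):
    """Relay side: accept only a valid, unexpired token for this peer."""
    try:
        fp, issued, mac = token.split("|")
        issued_at = int(issued)
    except ValueError:
        return False  # malformed token
    expected = _mac(fp, issued)
    if not hmac.compare_digest(mac.encode(), expected.encode()):
        return False  # forged or tampered
    return fp == fingerprint and 0 <= now - issued_at <= TOKEN_LIFETIME
```

"No fresh token, you're not a relay" then amounts to refusing relay
privileges whenever token_is_fresh() returns False.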

I guess another option is to stop trying to complexify things and accept
the attack. My main reason for not letting people do path selection
in a non-standard way was that our load balancing algorithm couldn't
handle it. It looks like mikeperry's proposal 161 will do much better
in this regard. Maybe that's good enough?

--Roger