Re: [tor-dev] Proposal 259: New Guard Selection Behaviour

On 25 Mar 2016, at 22:26, George Kadianakis <desnacked@xxxxxxxxxx> wrote:

Tim Wilson-Brown - teor <teor2345@xxxxxxxxx> writes:


On 25 Mar 2016, at 00:31, George Kadianakis <desnacked@xxxxxxxxxx> wrote:

Tim Wilson-Brown - teor <teor2345@xxxxxxxxx> writes:


On 24 Mar 2016, at 22:55, George Kadianakis <desnacked@xxxxxxxxxx> wrote:

<snip>

I think Reinaldo et al. were also thinking of incorporating the
ReachableAddresses logic in there, so that DYSTOPIC_GUARDS changes based on the
reachability settings of the client. I'm not sure exactly how that would work,
especially when the user can change ReachableAddresses at any moment. I think
we should go for the simplest thing possible here, and improve our heuristics
in the future based on testing.

I suggest that we compose the set of UTOPIC guards based on addresses that are reachable and preferred (or, if there are no guards with preferred addresses, those guards that are reachable). I suggest that we use the same mechanism with DYSTOPIC guards, but add a port restriction to 80 & 443 to all the other restrictions. (This may result in the empty set.)
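To make the suggestion concrete, here is a minimal sketch of one reading of it. The Guard fields and the function names are hypothetical, not tor's actual data structures: "reachable" stands in for the ReachableAddresses check and "preferred" for whatever address preference the client has configured.

```python
from dataclasses import dataclass

# Ports assumed reachable behind restrictive firewalls (from the thread: 80 & 443).
DYSTOPIC_PORTS = frozenset({80, 443})


@dataclass(frozen=True)
class Guard:
    # Hypothetical record; real tor tracks far more per relay.
    name: str
    port: int
    reachable: bool   # passes the client's ReachableAddresses check
    preferred: bool   # has an address the client prefers


def compose(guards, ports=None):
    """Reachable and preferred guards; if none are preferred, all reachable ones.

    If `ports` is given, the port restriction is applied on top of the
    other restrictions, so the result may be empty.
    """
    candidates = [
        g for g in guards
        if g.reachable and (ports is None or g.port in ports)
    ]
    preferred = [g for g in candidates if g.preferred]
    return preferred or candidates


def utopic_guards(guards):
    return compose(guards)


def dystopic_guards(guards):
    return compose(guards, DYSTOPIC_PORTS)
```

Note that in this reading the dystopic set is not simply the utopic set filtered by port: if no preferred guard is on 80/443, it falls back to reachable guards on those ports.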

Alright, this seems like a good process here. We should do it like that.

What happens if a utopic guard suddenly is not included in the
ReachableAddresses anymore? Maybe we mark it as 'bad' (the same way we mark
relays that leave the consensus).

Yes, if it's not available, it doesn't really matter why.

(Having different behaviours for different reasons would complicate guard selection.)

<snip>

I think the current proposal tries to balance this, by enabling this heuristic
only after Alice exhausts her utopic guardlist. Also, keep in mind that the
utopic guardlist might contain 80/443 guards as well. So if Alice is lucky, she
got an 80/443 guard in her utopic guard list, and she will still bootstrap
before the dystopic heuristic triggers.

There are various ways to make this heuristic more "intelligent", but I would
like to maintain simplicity in our design (both simple to understand and to
implement). For example, we could imagine that we always put some 80/443 guards
as our primary guards, or in the utopic guardlist. Or, that we reduce the 2%
requirement so that we trigger the dystopic heuristic sooner.

Or that tor can get a hint about which ports it can access based on which ports it used to bootstrap.
(See below for details.)

Yes, could be.

How would that work though?

We pass the port(s) that we've successfully bootstrapped on to the guard selection algorithm as an initial hint.

The algorithm ensures that X% of the relays it selects are on those port(s).

The problem with this approach is that it biases guard selection towards the DirPorts that authorities and fallback directories are on.

So X% must be high enough to ensure we can continue to load descriptors if all other ports are blocked, but low enough not to overload guards on those ports.
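A rough sketch of the hint idea, under some loud assumptions: candidates are plain (name, port) pairs, selection is uniform rather than bandwidth-weighted, and pick_with_port_hint is a made-up name, not anything in tor.

```python
import random


def pick_with_port_hint(candidates, hint_ports, n, x_percent, rng=random):
    """Pick up to n guards, with at least x_percent of n on the hinted ports.

    candidates: list of unique (name, port) tuples.
    hint_ports: ports that worked during bootstrap.
    If too few candidates are on the hinted ports, we take all of them.
    """
    hinted = [c for c in candidates if c[1] in hint_ports]
    # Integer ceiling of n * x_percent / 100, capped by what is available.
    need = min((x_percent * n + 99) // 100, len(hinted))
    chosen = rng.sample(hinted, need)
    # Fill the remainder from everything not yet chosen (any port).
    rest = [c for c in candidates if c not in chosen]
    chosen += rng.sample(rest, min(n - need, len(rest)))
    return chosen
```

The bias problem shows up directly here: the higher x_percent is, the more of the client's guards land on the same ports as the authorities and fallback directories.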

And what happens if the network changes? How does the hint work then?

There are two scenarios:

If the network changes after a short period of downtime (<24 hours), the consensus will still be current, and we won't bootstrap again.

Some of our guards will fail, and we will choose other guards from the original list.

If the network changes after a long period of downtime (>=24 hours), the consensus will expire, and we will bootstrap again, and get a new hint.

We will check if the original list contains Y% guards on these new ports (where Y <= X).

If it doesn't, we can augment the list with new guards on those ports, or create an entirely new list.

There is a risk here that the list grows without bound if Y% is high, and we regularly switch between N sites that each allow a small number of different ports.
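For the augmenting case, the arithmetic can be sketched like this. The function name and the list-of-ports representation are assumptions for illustration only; it counts how many guards on the newly hinted ports would have to be added so that they make up at least Y% of the augmented list.

```python
def guards_to_add(list_ports, new_hint_ports, y_percent):
    """Return the smallest k such that adding k guards on the hinted ports
    makes at least y_percent of the (original + k) list use those ports.

    list_ports: the port of each guard already on the list.
    """
    if not 0 <= y_percent < 100:
        raise ValueError("y_percent must be in [0, 100)")
    n = len(list_ports)
    h = sum(1 for p in list_ports if p in new_hint_ports)
    k = 0
    # Need (h + k) / (n + k) >= y_percent / 100, kept in integers.
    while (h + k) * 100 < y_percent * (n + k):
        k += 1
    return k
```

With a list of 10 guards of which 1 is on the new ports, Y = 20 needs 2 more guards. Every switch to a site with a different small set of allowed ports adds guards again, which is the unbounded growth risk mentioned above.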

Currently, I'm hoping that we will understand the value of this heuristic
better when we implement it, and test it on real networks...

Any suggestions?

There's a whole lot of my thoughts below.

Why such a large list of guards?

Apart from the fingerprinting issue (which I think gets worse with a larger list, at least if it's tried in order), I wonder why we bother trying such a large UTOPIC guardlist.
Surely after you've tried 10 guards, the chance that the 11th is going to connect is vanishingly small.
(Unless it's on a different port or netblock, I guess.)
And if our packets are reaching the guard, and being dropped on the way back, we have to consider the load this places on the network.

Indeed, I also feel that 80 guards is a lot of guards to try before switching to dystopic mode.

I would be up for reducing it. I wonder what's the right number here.

My fear with having a small number of sampled guards in a guardlist is that if
all of them go down at the same time, then that guardlist is useless.

I would imagine that the probability of 10+ guards going down at the same time is minuscule, unless the network has major issues, or a port is blocked on the client side.

Also, this reminds me that the proposal does not precisely specify what happens
when guards in SAMPLED_UTOPIC_GUARDS become bad (they drop out of the
consensus). Do we keep them on the list but marked as bad? What happens if
lots of them become bad? When do we add new guards? Currently the proposal only
says:

     It will be filled in by the algorithm if it's empty, or if it contains
     less than SAMPLE_SET_THRESHOLD guards after winnowing out older
     guards. It should be filled by using NEXT_BY_BANDWIDTH with UTOPIC_GUARDS
     as an argument.

I think we should be more specific here.
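As a strawman for what "more specific" could look like, here is a sketch of one possible answer: bad guards stay on the list but don't count towards the threshold, and new guards are drawn by bandwidth from the remaining utopic guards. The threshold value, the is_bad predicate and the dict-of-bandwidths representation are all assumptions, not what the proposal specifies.

```python
import random

SAMPLE_SET_THRESHOLD = 3  # hypothetical value, not taken from the proposal


def next_by_bandwidth(pool, rng):
    """Pick one guard name with probability proportional to its bandwidth.

    pool: dict mapping guard name -> positive bandwidth weight.
    """
    names = list(pool)
    weights = [pool[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


def refill_sampled(sampled, utopic, is_bad, rng=random):
    """Return the sampled list, extended until it holds at least
    SAMPLE_SET_THRESHOLD guards that are not bad (or utopic runs out).

    Bad guards are kept on the list, so they are never re-sampled as
    'new' guards if they come back into the consensus.
    """
    live = [g for g in sampled if not is_bad(g)]
    result = list(sampled)
    pool = {g: bw for g, bw in utopic.items() if g not in sampled}
    while len(live) < SAMPLE_SET_THRESHOLD and pool:
        g = next_by_bandwidth(pool, rng)
        del pool[g]  # sample without replacement
        result.append(g)
        live.append(g)
    return result
```

Whether bad guards should stay on the list at all, and for how long, is exactly the part the proposal would still need to pin down.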

Yes, we need to behave sensibly if lots of guards become bad.

It's most likely that we're effectively blocked from Tor.

Or that we can only use a small number of ports to get out.

In this case, it would help to add guards with known good ports (from a recent bootstrap hint).

Client Bootstrap

The proposal ignores client bootstrap.

There are a limited number of hard-coded authorities and fallback directories available during client bootstrap.
The client doesn't select guards until it has bootstrapped from one of the 9 authorities or 20-200 fallback directories.

What do you think should be mentioned here?

That clients bootstrap before selecting guards.

That loading a consensus takes additional time during initial bootstrap (5-30s?) that's not counted in these calculations.

Bootstrap / Launch Time

The proposal calculates bootstrap and launch time incorrectly.

The proposal assumes that Tor attempts to connect to each guard and waits for it to fail before trying another. But this isn't how Tor actually works - it sometimes tries multiple connections simultaneously. So summing the times for individual connection attempts to each guard doesn't provide an accurate picture of the actual connection time.

When bootstrapping in 0.2.7 and earlier, tor will try an authority, wait up to 10 seconds for it to fail, then try another.
Then there's a 60 second wait before the third authority, but at that point the user has likely lost interest.

In 0.2.8, tor connects to authorities and fallbacks concurrently. It will try 3 fallbacks and 1 authority in the first 10 seconds, and download from whichever one connects first. So 0.2.8 is far more likely to connect within a few seconds.

In all current versions, tor then downloads the consensus (~1.5MB, could take 10 seconds or more), and chooses directory guards.
Then it simultaneously connects to 3 directory guards to download certificates and descriptors.
The time tor spends working out whether a connection to one directory guard has succeeded overlaps with the timeouts for the other directory guards.

So under this proposal, it would really take tor:
10 seconds for initial bootstrap
20 seconds (or more) to download the consensus
600 seconds / 3 directory guards = 200 seconds to exhaust its UTOPIC guardlist

Where does the "600 seconds" figure come from here?

It's the existing figure in the proposal. 10 seconds x 60 guards in the UTOPIC list.

Although if tor is building preemptive paths at the same time, the calculation could well be:

600 seconds / (3 directory guards + 1 OR guard) = 150 seconds to exhaust its UTOPIC guardlist

(tor skips the first two phases if it has a live consensus)
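The arithmetic above, written out as a small sketch. The figures are the ones quoted in this thread (10 second timeout, 60 guards, 3 or 4 concurrent connections); the function name is made up, and it assumes every attempt runs to the full timeout and that the concurrent slots stay busy.

```python
def seconds_to_exhaust(guard_count, timeout_s, concurrent):
    """Worst-case time to try every guard when `concurrent` attempts
    run in parallel and each one fails only at the timeout."""
    return guard_count * timeout_s / concurrent


INITIAL_BOOTSTRAP_S = 10      # figure from this thread
CONSENSUS_DOWNLOAD_S = 20     # "20 seconds (or more)"

dir_only = seconds_to_exhaust(60, 10, 3)        # 3 directory guards
with_or_guard = seconds_to_exhaust(60, 10, 4)   # plus 1 OR guard

# Total from a cold start, including the two phases that are skipped
# when tor already has a live consensus.
cold_start_total = INITIAL_BOOTSTRAP_S + CONSENSUS_DOWNLOAD_S + dir_only
```

Here dir_only is 200 seconds and with_or_guard is 150 seconds, against 600 seconds for the one-at-a-time walk the proposal assumes.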

Can we revise the proposal to take this into account?

Are you talking about section 4? Yes, that could be rewritten a bit.

However, I think that section does not specifically talk about bootstrap as you
seem to be doing.

So, if you have Tor running and you move your laptop to a network with
FascistFirewall, you will not be bootstrapping again with 3 directory
guards. Instead, you are going to be walking over the guard list with a single
guard. So in that case section 4 will be more accurate.

Or am I wrong?