[tor-dev] Faster Bootstrap - Prop #210 (Revised)

Filename: 210-faster-headless-consensus-bootstrap.txt

Title: Faster Headless Consensus Bootstrapping

Author: Mike Perry, Tim Wilson-Brown, Peter Palfrader

Created: 01-10-2012

Last Modified: 02-10-2015

Status: Open

Target: 0.2.8.x+

Overview and Motivation

This proposal describes a way for clients to fetch the initial

consensus more quickly in situations where some or all of the directory

authorities are unreachable. This proposal is meant to describe a

solution for bug #4483.

Design: Bootstrap Process Changes

The core idea is to attempt to establish bootstrap connections in

parallel during the bootstrap process, and download the consensus from

the first connection that completes.

Connection attempts will be performed on an exponential backoff basis.

Initially, connections will be performed to a randomly chosen hard

coded directory mirror and a randomly chosen canonical directory

authority. If neither of these connections complete, additional mirror

and authority connections are tried. Mirror connections are tried at

a faster rate than authority connections.

We specify that mirror connections retry after one second, and then

double the retry time with every connection:

0, 1, 2, 4, 8, 16, 32, ...

We specify that directory authority connections retry after 10 seconds,

and then double the retry time with every connection:

0, 10, 20, ...

If the client has both an IPv4 and IPv6 address, we try IPv4 and IPv6

mirrors and authorities on the following schedule:

IPv4, IPv6, IPv4, IPv6, ...

We try IPv4 first to avoid overloading IPv6-enabled authorities and

mirrors. Mirrors and auths get a separate IPv4/IPv6 schedule. This

ensures that we try an IPv6 authority within the first 10 seconds.

This helps implement #8374 and related tickets.
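
The alternation can be sketched as follows. This is an illustration
only, not Tor code: the helper name with_address_family is hypothetical,
and the attempt times are copied from the schedules listed above.

```python
from itertools import cycle


def with_address_family(times, families=("IPv4", "IPv6")):
    """Pair each attempt time with an address family, IPv4 first."""
    return list(zip(times, cycle(families)))


# Mirrors and authorities each get their own alternation.
mirror_plan = with_address_family([0, 1, 2, 4, 8])
authority_plan = with_address_family([0, 10, 20, 40])

# Time of the first IPv6 authority attempt under this reading.
first_ipv6_authority = next(t for t, fam in authority_plan if fam == "IPv6")
print(first_ipv6_authority)  # 10
```

Because the authority alternation is independent of the mirror one, the
second authority attempt (at 10 seconds) is the first IPv6 one.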

The maximum retry time for both timers is 3 days + 1 hour. This places a

small load on the mirrors and authorities, while allowing a client that

regains a network connection to eventually download a consensus.
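
A minimal sketch of the two schedules and the cap, for illustration
only. The name attempt_times is hypothetical, and the sketch assumes
that the listed values are times since the first attempt and that the
maximum applies to each value; the proposal does not spell out either
point.

```python
from itertools import islice

MAX_RETRY_SECONDS = 3 * 24 * 3600 + 3600  # 3 days + 1 hour


def attempt_times(first_retry, max_retry=MAX_RETRY_SECONDS):
    """Yield 0, first_retry, then double each time, capped at max_retry."""
    yield 0
    t = first_retry
    while True:
        yield min(t, max_retry)
        t *= 2


mirror_times = list(islice(attempt_times(1), 7))
authority_times = list(islice(attempt_times(10), 3))
print(mirror_times)     # [0, 1, 2, 4, 8, 16, 32]
print(authority_times)  # [0, 10, 20]
```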

The retry timers must reset on HUP and any network reachability events,

so that clients that have unreliable networks can recover from network

failures.

[ TODO: do we have network reachability events? ]

The first connection to complete will be used to download the consensus

document and the others will be closed, after which bootstrapping will

proceed as normal.

A benefit of connecting to directory authorities is that clients are

warned if their clock is wrong. Therefore, when closing a directory

authority connection, we check to see if we have successfully connected

to an authority during this run of the Tor client. If not, we allow the

authority TLS connection to complete, then close the connection.

We expect the vast majority of clients to succeed within 4 seconds,

after making up to 4 connection attempts to mirrors and 1 connection

attempt to an authority. Clients which can't connect in the first

10 seconds will try 1 more mirror, then try to contact another

directory authority. We expect almost all clients to succeed within

10 seconds. This is a much better success rate than the current Tor

implementation, which fails k/n of clients if k of the n directory

authorities are down. (Or, if the connection fails in certain ways,

(k/n)^2.)

If at any time the total number of outstanding bootstrap connection attempts

exceeds 10, no new connection attempts are to be launched until an

existing connection attempt experiences full timeout. The retry time

is not doubled when a connection is skipped.
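
The skip rule might look like the sketch below. The class BackoffTimer
and its fields are hypothetical. The text here says "exceeds 10" while
the implementation notes say "10 or greater"; the sketch assumes the
latter, so that no more than 10 attempts are ever outstanding.

```python
MAX_OUTSTANDING = 10  # limit on outstanding bootstrap connection attempts


class BackoffTimer:
    """Hypothetical model of one retry timer (mirror or authority)."""

    def __init__(self, first_retry):
        self.first_retry = first_retry
        self.retry_time = 0  # the first attempt is immediate

    def on_fire(self, outstanding):
        """Return True if a new connection attempt should be launched."""
        if outstanding >= MAX_OUTSTANDING:
            # Skipped attempt: leave retry_time unchanged.
            return False
        if self.retry_time == 0:
            self.retry_time = self.first_retry
        else:
            self.retry_time *= 2
        return True


timer = BackoffTimer(first_retry=1)
timer.on_fire(outstanding=0)             # launched, retry_time is now 1
timer.on_fire(outstanding=1)             # launched, retry_time is now 2
skipped = not timer.on_fire(outstanding=10)
print(skipped, timer.retry_time)         # True 2
```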

Design: Fallback Dir Mirror Selection

The set of hard coded directory mirrors from #572 shall be chosen using

the 100 Guard nodes with the longest uptime.

The fallback weights will be set using each mirror's fraction of

consensus bandwidth out of the total of all 100 mirrors, adjusted to

ensure no fallback directory sees more than 10% of clients. We will

also exclude fallback directories that are less than 1/1000 of the

consensus weight, as they are not large enough to make it worthwhile

including them.
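
One way to compute such weights is sketched below. The proposal does not
say how the 10% adjustment is made; this sketch assumes that mirrors
over the limit are clamped to it and that the remaining share is
redistributed among the others in proportion to bandwidth. The function
name, its parameters, and the sample data are all hypothetical.

```python
def fallback_weights(bandwidths, total_consensus_weight,
                     max_fraction=0.10, min_share=1 / 1000):
    """Map mirror name -> fraction of clients (fractions sum to 1).

    bandwidths: name -> consensus bandwidth (assumed positive).
    """
    # Drop mirrors below 1/1000 of the total consensus weight.
    free = {n: bw for n, bw in bandwidths.items()
            if bw >= min_share * total_consensus_weight}
    if len(free) * max_fraction < 1:
        raise ValueError("too few mirrors to keep every share under the cap")

    capped = {}
    remaining = 1.0  # share of clients not yet assigned to capped mirrors
    while True:
        total = sum(free.values())
        over = [n for n, bw in free.items()
                if bw / total * remaining > max_fraction + 1e-12]
        if not over:
            break
        for n in over:
            capped[n] = max_fraction
            del free[n]
        remaining = 1.0 - max_fraction * len(capped)

    total = sum(free.values())
    weights = dict(capped)
    weights.update({n: bw / total * remaining for n, bw in free.items()})
    return weights


# Made-up example: one large mirror, nineteen equal ones, one tiny one.
sample = {"m0": 1000, "tiny": 1}
sample.update({"m%d" % i: 100 for i in range(1, 20)})
weights = fallback_weights(sample, total_consensus_weight=10000)
```

In this made-up example the large mirror is clamped to 10%, the tiny one
is excluded, and the other nineteen share the remaining 90% equally.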

This list of fallback dir mirrors should be updated with every

major Tor release. In future releases, the number of dir mirrors

should be set at 20% of the current Guard nodes (approximately 200 as

of October 2015), rather than fixed at 100.

Performance: Additional Load with Current Parameter Choices

This design and the connection count parameters were chosen such that

no additional bandwidth load would be placed on the directory

authorities. In fact, the directory authorities should experience less

load, because they will not need to serve the consensus document for a

connection in the event that one of the directory mirrors completes its

connection before the directory authority does.

However, the scheme does place additional TLS connection load on the

fallback dir mirrors. Because bootstrapping is rare and all but one of

the TLS connections will be very short-lived and unused, this should not

be a substantial issue.

The dangerous case is in the event of a prolonged consensus failure

that induces all clients to enter into the bootstrap process. In this

case, the number of TLS connections to the fallback dir mirrors within

the first second would be 2*C/100, or 40,000 for C=2,000,000 users. If

no connections complete before the 10 retries, 7 of which go to

mirrors, this could reach as high as 140,000 connection attempts, but

this is extremely unlikely to happen in full aggregate.

However, in the no-consensus scenario today, the directory authorities

would already experience 2*C/9 or 444,444 connection attempts. (Tor

currently tries 2 authorities, before delaying the next attempt.) The

10-retry scheme, 3 of which go to authorities, increases their total

maximum load to about 666,666 connection attempts, but again this is

unlikely to be reached in aggregate. Additionally, with this scheme,

even if the dirauths are taken down by this load, the dir mirrors

should be able to survive it.
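
The figures above follow from simple division, reproduced here as a
check. The constant and function names are illustrative; the values of
C, the 100 mirrors, the 9 authorities, and the attempt counts are taken
from the text.

```python
C = 2_000_000     # clients entering bootstrap at once
MIRRORS = 100     # hard coded fallback dir mirrors
AUTHORITIES = 9   # directory authorities


def load_per_server(clients, attempts_per_client, servers):
    """Connection attempts landing on each server, rounded down."""
    return clients * attempts_per_client // servers


print(load_per_server(C, 2, MIRRORS))      # 40000
print(load_per_server(C, 7, MIRRORS))      # 140000
print(load_per_server(C, 2, AUTHORITIES))  # 444444
print(load_per_server(C, 3, AUTHORITIES))  # 666666
```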

Implementation Notes: Code Modifications

The implementation of the bootstrap process is unfortunately mixed

in with many types of directory activity.

The process starts in update_consensus_networkstatus_downloads(),

which initiates a single directory connection through

directory_get_from_dirserver(). Depending on bootstrap state,

a single directory server is selected and a connection is

eventually made through directory_initiate_command_rend().

There appear to be a few options for altering this code to retry multiple

simultaneous connections. Without refactoring, one approach would be to

set a connection retry helper function timer in

directory_initiate_command_routerstatus() from

directory_get_from_dirserver() if the purpose is

DIR_PURPOSE_FETCH_CONSENSUS and the only directory servers available

are the authorities and the fallback dir mirrors. (That is, there is no

valid consensus.) The retry helper function would check the list of

pending connections and, if it is 10 or greater, skip the connection

attempt, and leave the retry time constant.

The code in directory_initiate_command_rend() would then need to be

altered to maintain a list of the dircons created for this purpose as

well as avoid immediately queuing the directory_send_command() request

for the DIR_PURPOSE_FETCH_CONSENSUS purpose. A flag would need to be set

on the dircon to be checked in connection_dir_finished_connecting().

The function connection_dir_finished_connecting() would need to be

altered to examine the list of pending dircons, determine if this one is

the first to complete, and if so, then call directory_send_command() to

download the consensus and close the other pending dircons.

connection_dir_finished_connecting() would also cancel the timer.
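
The first-to-complete logic can be modelled in a few lines. This is a
simplified illustration and not Tor's C code: the class BootstrapRace
and its methods are hypothetical stand-ins for the pending dircon list
and the proposed check in connection_dir_finished_connecting().

```python
class BootstrapRace:
    """Hypothetical model of the list of pending bootstrap dircons."""

    def __init__(self):
        self.pending = []
        self.winner = None
        self.timer_active = True

    def launch(self, name):
        """Record a new connection attempt unless the race is over."""
        if self.winner is None:
            self.pending.append(name)

    def finished_connecting(self, name):
        """Handle a completed connection; return the connections to close."""
        if self.winner is not None or name not in self.pending:
            return [name]  # late or unknown connection: just close it
        self.winner = name  # this one would fetch the consensus
        losers = [c for c in self.pending if c != name]
        self.pending = []
        self.timer_active = False  # cancel the retry timer
        return losers


race = BootstrapRace()
for conn in ("mirror-a", "auth-1", "mirror-b"):
    race.launch(conn)
closed = race.finished_connecting("mirror-b")
print(race.winner, closed)  # mirror-b ['mirror-a', 'auth-1']
```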

Reliability Analysis

We make the pessimistic assumptions that 50% of connections to directory

mirrors fail, and that 20% of connections to authorities fail. (Actual

figures depend on relay churn, age of the fallback list, and authority

uptime.)

We expect the first 10 connection retry times to be:

Mirror:   0s   1s   2s   4s     8s             16s              32s
Auth:     0s                           10s             20s
Success:  90%  95%  97%  98.7%  99.4%  99.89%  99.94%  99.988%  99.994%

97% of clients succeed in the first 2 seconds.

99.4% of clients succeed without trying a second authority.

99.89% of clients succeed in the first 10 seconds.

0.11% of clients remain, but in this scenario, 2 authorities are down,

so the client is most likely blocked from the Tor network.
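
The success figures can be recomputed as below, assuming that connection
failures are independent (an assumption the analysis appears to make but
does not state). The names are illustrative.

```python
MIRROR_FAIL = 0.5  # assumed failure rate for mirror connections
AUTH_FAIL = 0.2    # assumed failure rate for authority connections

# (time in seconds, kind) for the first 10 connection attempts
ATTEMPTS = [(0, "mirror"), (0, "auth"), (1, "mirror"), (2, "mirror"),
            (4, "mirror"), (8, "mirror"), (10, "auth"), (16, "mirror"),
            (20, "auth"), (32, "mirror")]


def cumulative_success(attempts):
    """Map time -> probability that at least one attempt so far succeeded."""
    all_failed = 1.0
    result = {}
    for t, kind in sorted(attempts, key=lambda a: a[0]):
        all_failed *= MIRROR_FAIL if kind == "mirror" else AUTH_FAIL
        result[t] = 1.0 - all_failed
    return result


for t, p in cumulative_success(ATTEMPTS).items():
    print("%2ds: %.4f%%" % (t, p * 100))
```

The Success row appears to round or truncate these values; for example,
the sketch gives 97.5% at 2 seconds and 99.875% at 10 seconds.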

The current implementation makes 1 or 2 authority connections within the

first second, depending on exactly how the first connection fails. Under

the 20% authority failure assumption, these clients would have a success

rate of either 80% or 96% within a few seconds. The scheme above has a

greater success rate in the first few seconds, while spreading the load

among a larger number of directory mirrors. In addition, if all the

authorities are blocked, current clients will inevitably fail, as they

do not have a list of directory mirrors.