[tor-bugs] #9946 [Tor]: Bootstrapping Tor client hangs up on its chosen guards
#9946: Bootstrapping Tor client hangs up on its chosen guards
------------------------+------------------------------------
 Reporter:  arma        |           Owner:
     Type:  defect      |          Status:  new
 Priority:  normal      |       Milestone:  Tor: 0.2.4.x-final
Component:  Tor         |         Version:
 Keywords:  tor-client  |   Actual Points:
Parent ID:              |          Points:
------------------------+------------------------------------
Here are some log snippets from a freshly bootstrapping Tor client:
{{{
Sep 27 03:24:11.369 [info] add_an_entry_guard(): Chose
$3DD2523F1B241F01D54818F327714CDA7F54238B~wannabe2 at 144.76.110.103 as
new entry guard.
Sep 27 03:24:11.369 [info] log_entry_guards(): wannabe2
[3DD2523F1B241F01D54818F327714CDA7F54238B] (up never-contacted)
Sep 27 03:24:11.371 [info] add_an_entry_guard(): Chose
$821FAFA24651419FAE76A5C80BA79EA7E0743C4D=servbr3 at 212.83.158.171 as new
entry guard.
Sep 27 03:24:11.371 [info] log_entry_guards(): wannabe2
[3DD2523F1B241F01D54818F327714CDA7F54238B] (up never-contacted),servbr3
[821FAFA24651419FAE76A5C80BA79EA7E0743C4D] (up never-contacted)
Sep 27 03:24:11.373 [info] add_an_entry_guard(): Chose
$CFB4CB595C2504444E8F8F76CAC42F0C97200734=SECxFreeBSD64 at 195.169.125.226
as new entry guard.
Sep 27 03:24:11.373 [info] log_entry_guards(): wannabe2
[3DD2523F1B241F01D54818F327714CDA7F54238B] (up never-contacted),servbr3
[821FAFA24651419FAE76A5C80BA79EA7E0743C4D] (up never-contacted),
SECxFreeBSD64 [CFB4CB595C2504444E8F8F76CAC42F0C97200734] (up
never-contacted)
}}}
So now I have my three guards, and they're all considered up (the
consensus said so) and never-contacted (that's how they start out).
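For reference, those two state bits reduce to a flag and a timestamp per
guard. Here's a minimal standalone model of what log_entry_guards() is
printing; only the made_contact flag is confirmed by the Tor code quoted
further down, the other names are my own:
{{{
#include <time.h>

/* Toy model of the per-guard state printed as "(up never-contacted)".
 * Only made_contact is confirmed by the quoted Tor code; the rest is
 * illustrative naming, not Tor's actual struct. */
typedef struct guard_model_t {
  char nickname[20];
  char identity_hex[41];     /* fingerprint, e.g. 3DD2...238B */
  unsigned made_contact : 1; /* 0 => printed as "never-contacted" */
  time_t unreachable_since;  /* 0 => printed as "up" */
} guard_model_t;
}}}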
{{{
Sep 27 03:24:11.362 [info] launch_descriptor_downloads(): Launching 45
requests
for 4084 microdescs, 92 at a time
}}}
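As an aside, the request count is just batching arithmetic: 4084 digests
at 92 per request need ceil(4084/92) = 45 requests. A trivial check:
{{{
#include <stdio.h>

int main(void) {
  int microdescs = 4084, per_request = 92;
  int requests = (microdescs + per_request - 1) / per_request; /* ceiling */
  printf("%d requests\n", requests);  /* prints 45 */
  return 0;
}
}}}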
Then choose_random_entry_impl() runs this code:
{{{
if (!entry->made_contact) {
/* Always start with the first not-yet-contacted entry
* guard. Otherwise we might add several new ones, pick
* the second new one, and now we've expanded our entry
* guard list without needing to. */
goto choose_and_finish;
}
}}}
so all 45 of those microdescriptor requests decide to ask wannabe2, and
the client queues all 45 directory attempts on that one guard.
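Here's a standalone sketch of why every request lands on the same guard:
with the quoted made_contact check, the scan stops at the first
never-contacted entry, which here is always wannabe2. The names below are
mine, not Tor's; a toy model of the quoted logic, not the real chooser.
{{{
#include <stdio.h>

struct guard { const char *name; int made_contact; };

static const struct guard *pick_guard(const struct guard *guards, int n) {
  for (int i = 0; i < n; i++) {
    if (!guards[i].made_contact)
      return &guards[i];  /* the quoted "goto choose_and_finish" */
  }
  return n > 0 ? &guards[0] : 0;  /* real Tor picks among usable guards */
}

int main(void) {
  const struct guard guards[] = {
    { "wannabe2", 0 }, { "servbr3", 0 }, { "SECxFreeBSD64", 0 },
  };
  for (int req = 1; req <= 45; req++)  /* one per microdesc request */
    printf("request %2d -> %s\n", req, pick_guard(guards, 3)->name);
  return 0;
}
}}}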
So it connects to wannabe2:
{{{
Sep 27 03:24:11.373 [debug] directory_initiate_command_rend(): Initiating
microdescriptor fetch
Sep 27 03:24:11.373 [info] connection_ap_make_link(): Making internal
direct tunnel to 144.76.110.103:443 ...
[...]
Sep 27 03:24:11.373 [debug] circuit_handle_first_hop(): Looking for
firsthop '144.76.110.103:443'
Sep 27 03:24:11.373 [info] circuit_handle_first_hop(): Next router is
$3DD2523F1B241F01D54818F327714CDA7F54238B~3DD2523F1B241F01D54 at
144.76.110.103: Not connected. Connecting.
[...]
Sep 27 03:24:11.373 [debug] connection_connect(): Connecting to
"144.76.110.103":443.
Sep 27 03:24:11.373 [debug] connection_connect(): Connection to
"144.76.110.103":443 in progress (sock 13).
Sep 27 03:24:11.373 [debug] connection_add_impl(): new conn type OR,
socket 13, address 144.76.110.103, n_conns 7.
[...]
Sep 27 03:24:11.373 [debug] circuit_handle_first_hop(): connecting in
progress (or finished). Good.
[...]
Sep 27 03:24:11.484 [debug] connection_or_finished_connecting(): OR
connect() to router at 144.76.110.103:443 finished.
Sep 27 03:24:11.485 [debug] connection_tls_start_handshake(): starting TLS
handshake on fd 13
[...]
Sep 27 03:24:11.937 [info] channel_tls_process_versions_cell(): Negotiated
version 4 with 144.76.110.103:443; Waiting for CERTS cell
Sep 27 03:24:11.937 [debug] connection_or_process_cells_from_inbuf(): 13:
starting, inbuf_datalen 1473 (0 pending in tls object).
Sep 27 03:24:11.938 [info] channel_tls_process_certs_cell(): Got some good
certificates from 144.76.110.103:443: Authenticated it.
[...]
Sep 27 03:24:11.938 [debug] circuit_n_chan_done(): chan to
NULL/144.76.110.103:443, status=1
Sep 27 03:24:11.938 [debug] circuit_n_chan_done(): Found circ, sending
create cell.
[...]
Sep 27 03:24:11.938 [info] circuit_send_next_onion_skin(): First hop:
finished sending CREATE_FAST cell to
'$3DD2523F1B241F01D54818F327714CDA7F54238B~wannabe2 at 144.76.110.103'
Sep 27 03:24:11.938 [info] channel_tls_process_netinfo_cell(): Got good
NETINFO cell from 144.76.110.103:443; OR connection is now open, using
protocol version 4. Its ID digest is
3DD2523F1B241F01D54818F327714CDA7F54238B. Our address is apparently
76.99.61.6.
}}}
But wait, what's that? We send our CREATE_FAST cells before we've
received the NETINFO cell? I'm not sure whether that is a bug in its own
right.
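For what it's worth, the log itself says the OR connection only counts as
"open" once the NETINFO cell arrives, so one might expect the CREATE_FAST
to be gated on channel-open state, roughly like this hedged sketch (names
are mine, not Tor's):
{{{
#include <stdio.h>

enum chan_state { CHAN_HANDSHAKING, CHAN_OPEN /* NETINFO received */ };

/* One might expect a gate like this before any CREATE cell goes out;
 * in the log above, CREATE_FAST is sent while still handshaking. */
static int ok_to_send_create_fast(enum chan_state s) {
  return s == CHAN_OPEN;
}

int main(void) {
  printf("send during handshake? %d\n",
         ok_to_send_create_fast(CHAN_HANDSHAKING));  /* prints 0 */
  return 0;
}
}}}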
In any case, time passes, and we give up on all those circuits:
{{{
Sep 27 03:25:12.150 [notice] No circuits are opened. Relaxed timeout for
circuit 3 (a General-purpose client 1-hop circuit in state doing
handshakes with channel state open) to 60000ms. However, it appears the
circuit has timed out anyway. 0 guards are live.
Sep 27 03:25:12.150 [info] circuit_expire_building(): Abandoning circ 3
144.76.110.103:443:-2147451209 (state 0,0:doing handshakes, purpose 5, len
1)
Sep 27 03:25:12.150 [info] internal circ (length 1, last hop
3DD2523F1B241F01D54818F327714CDA7F54238B):
$3DD2523F1B241F01D54818F327714CDA7F54238B(waiting for keys)
[...]
Sep 27 03:25:12.150 [info] circuit_build_failed(): Our circuit failed to
get a response from the first hop (144.76.110.103:443). I'm going to try
to rotate to a better connection.
Sep 27 03:25:12.150 [info] entry_guard_register_connect_status(): Unable
to connect to entry guard 'wannabe2'
(3DD2523F1B241F01D54818F327714CDA7F54238B). Marking as unreachable.
}}}
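The state change behind that "Marking as unreachable" line is simple to
model; a hedged standalone sketch (the field name is mine, only the
logged behavior is assumed):
{{{
#include <stdio.h>
#include <time.h>

struct guard_state {
  time_t unreachable_since;  /* 0 = "up"; nonzero = marked unreachable */
};

static void register_connect_failure(struct guard_state *g, time_t now) {
  if (!g->unreachable_since)
    g->unreachable_since = now;  /* "Marking as unreachable." */
}

int main(void) {
  struct guard_state wannabe2 = { 0 };
  register_connect_failure(&wannabe2, time(NULL));
  printf("unreachable_since=%ld\n", (long)wannabe2.unreachable_since);
  return 0;
}
}}}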
And then we give up on the 45 directory requests:
{{{
Sep 27 03:25:12.150 [info] connection_ap_fail_onehop(): Closing one-hop
stream to '$3DD2523F1B241F01D54818F327714CDA7F54238B/144.76.110.103'
because the OR conn just failed.
Sep 27 03:25:12.150 [info] connection_ap_fail_onehop(): Closing one-hop
stream to '$3DD2523F1B241F01D54818F327714CDA7F54238B/144.76.110.103'
because the OR conn just failed.
[and 43 more]
}}}
So let's try again! We marked our first guard down, so now we'll ask our
second guard:
{{{
Sep 27 03:25:14.151 [debug] launch_descriptor_downloads(): There are
enough downloadable microdescs to launch requests.
Sep 27 03:25:14.151 [info] launch_descriptor_downloads(): Launching 45
requests for 4084 microdescs, 92 at a time
}}}
{{{
Sep 27 03:25:14.159 [info] onion_pick_cpath_exit(): Using requested exit
node '$821FAFA24651419FAE76A5C80BA79EA7E0743C4D~821FAFA24651419FAE7 at
212.83.158.171'
[...]
Sep 27 03:25:14.160 [info] circuit_handle_first_hop(): Next router is
$821FAFA24651419FAE76A5C80BA79EA7E0743C4D~821FAFA24651419FAE7 at
212.83.158.171: Not connected. Connecting.
[...]
Sep 27 03:25:14.160 [debug] connection_connect(): Connecting to
"212.83.158.171":443.
Sep 27 03:25:14.160 [debug] connection_connect(): Connection to
"212.83.158.171":443 in progress (sock 13).
Sep 27 03:25:14.160 [debug] connection_add_impl(): new conn type OR,
socket 13, address 212.83.158.171, n_conns 6.
[...]
Sep 27 03:25:14.261 [debug] connection_or_finished_connecting(): OR
connect() to router at 212.83.158.171:443 finished.
[...]
Sep 27 03:25:14.683 [info] channel_tls_process_versions_cell(): Negotiated
version 4 with 212.83.158.171:443; Waiting for CERTS cell
[...]
Sep 27 03:25:14.683 [info] channel_tls_process_certs_cell(): Got some good
certificates from 212.83.158.171:443: Authenticated it.
}}}
And here's where it gets good:
{{{
Sep 27 03:25:14.683 [info] entry_guard_register_connect_status():
Connected to new entry guard 'servbr3'
(821FAFA24651419FAE76A5C80BA79EA7E0743C4D). Marking earlier entry guards
up. 0/3 entry guards usable/new.
Sep 27 03:25:14.683 [info] log_entry_guards(): wannabe2
[3DD2523F1B241F01D54818F327714CDA7F54238B] (up made-contact),servbr3
[821FAFA24651419FAE76A5C80BA79EA7E0743C4D] (up made-contact),SECxFreeBSD64
[CFB4CB595C2504444E8F8F76CAC42F0C97200734] (up never-contacted)
Sep 27 03:25:14.683 [debug] channel_do_open_actions(): New entry guard was
reachable, but closing this connection so we can retry the earlier entry
guards.
Sep 27 03:25:14.683 [debug] circuit_n_chan_done(): chan to
NULL/212.83.158.171:443, status=0
Sep 27 03:25:14.683 [info] circuit_n_chan_done(): Channel failed; closing
circ.
Sep 27 03:25:14.683 [info] circuit_build_failed(): Our circuit died before
the first hop with no connection
Sep 27 03:25:14.683 [info] entry_guard_register_connect_status(): Unable
to connect to entry guard 'servbr3'
(821FAFA24651419FAE76A5C80BA79EA7E0743C4D). Marking as unreachable.
}}}
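That is the crux: first contact with a new guard marks the earlier
(unreachable) guards up again and closes the fresh, working connection so
they can be retried. Here's a hedged standalone model of that success
path, with the behavior taken from the log messages above and the names
my own:
{{{
#include <stdio.h>

struct guard { const char *name; int made_contact; int unreachable; };

/* Returns 1 if the fresh connection should be closed so earlier guards
 * can be retried first ("New entry guard was reachable, but closing
 * this connection so we can retry the earlier entry guards."). */
static int register_connect_success(struct guard *guards, int idx) {
  int close_and_retry = 0;
  if (!guards[idx].made_contact) {   /* first contact with a new guard */
    guards[idx].made_contact = 1;
    for (int i = 0; i < idx; i++) {
      if (guards[i].unreachable) {
        guards[i].unreachable = 0;   /* "Marking earlier entry guards up." */
        close_and_retry = 1;
      }
    }
  }
  return close_and_retry;
}

int main(void) {
  struct guard g[] = {
    { "wannabe2", 1, 1 },      /* contacted, timed out, marked unreachable */
    { "servbr3", 0, 0 },       /* the new guard we just reached */
    { "SECxFreeBSD64", 0, 0 },
  };
  if (register_connect_success(g, 1))
    printf("closing connection to %s to retry earlier guards\n", g[1].name);
  return 0;
}
}}}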
And then we close all the directory requests that we'd queued for it
because we closed the connection:
{{{
Sep 27 03:25:14.683 [info] connection_ap_fail_onehop(): Closing one-hop
stream to '$821FAFA24651419FAE76A5C80BA79EA7E0743C4D/212.83.158.171'
because the OR conn just failed.
[and 44 more]
}}}
And then my client launches 45 new directory requests, which spread
across the three available guards, and things bootstrap OK.
It's the "New entry guard was reachable, but closing this connection so we
can retry the earlier entry guards." behavior that is the bug here. I
think the fix might be to mark newly selected guards as already-contacted
when you first pick them at the beginning of time.
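A standalone sketch of that proposal, reusing the toy names from the
models above (an untested idea, not a patch):
{{{
#include <stdio.h>

struct guard { const char *name; int made_contact; int unreachable; };

static void add_an_entry_guard_model(struct guard *g, const char *name) {
  g->name = name;
  g->unreachable = 0;
  /* The proposed change: a guard starts life as made-contact, so the
   * !made_contact fast-path in the chooser never pins every pending
   * request to one brand-new guard, and first contact with a later
   * guard never triggers the close-and-retry dance. */
  g->made_contact = 1;
}

int main(void) {
  struct guard g;
  add_an_entry_guard_model(&g, "wannabe2");
  printf("%s made_contact=%d\n", g.name, g.made_contact);  /* prints 1 */
  return 0;
}
}}}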
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/9946>