[Author Prev][Author Next][Thread Prev][Thread Next][Author Index][Thread Index]
Re: [tor-bugs] #675 [Tor Client]: update_consensus_networkstatus_downloads() & router_pick_directory_server() rarely called
#675: update_consensus_networkstatus_downloads() & router_pick_directory_server()
rarely called
--------------------------------+-------------------------------------------
Reporter: jasemandude | Type: defect
Status: new | Priority: major
Milestone: Tor: 0.2.2.x-final | Component: Tor Client
Version: 0.2.0.25-rc | Resolution: None
Keywords: | Parent:
--------------------------------+-------------------------------------------
Old description:
> Possible solution to #648
>
> Machine was left on for several hours with TOR running but no dial up
> network, then dial up network was brought
> back up. TOR spends half an hour in a loop where it can't find nodes.
> Throughout this time, using TORbutton to
> switch off proxies in Firefox allowed me to browse the web with no
> problems, so I know the network is working fine.
>
> After waiting 30 minutes, killing and relaunching TOR seemed to fix the
> problem, though from the logs below it looks
> like TOR had finally made some circuits just before I killed it.
>
> Taking a closer look at the logs, I can see it wasn't able to make
> circuits for a whole 30 minutes until it called
> update_consensus_networkstatus_downloads() and
> router_pick_directory_server() - then all of a sudden everything
> started to work.
>
> Are these functions rarely called because of bandwidth issues? Could some
> logic be added so that if we get
> "[Warning] No available nodes when trying to choose node. Failing." then
> these functions get called?
>
> Here are the log messages:
>
> Apr 26 18:28:46.820 [Info] circuit_get_open_circ_or_launch(): No safe
> circuit (purpose 5) ready for edge connection; delaying.
> Apr 26 18:28:46.822 [Info] circuit_get_open_circ_or_launch(): No safe
> circuit (purpose 5) ready for edge connection; delaying.
> Apr 26 18:28:46.824 [Info] circuit_predict_and_launch_new(): Have 0 clean
> circs (0 internal), need another exit circ.
> Apr 26 18:28:46.825 [Info] choose_good_exit_server_general(): Found 74
> servers that might support 11/12 pending connections.
> Apr 26 18:28:46.828 [Info] choose_good_exit_server_general(): Chose exit
> server 'oemloi'
> Apr 26 18:28:46.830 [Info] router_choose_random_node(): We couldn't find
> any live, guard routers; falling back to list of all routers.
> Apr 26 18:28:46.832 [Warning] No available nodes when trying to choose
> node. Failing.
> Apr 26 18:28:46.834 [Warning] No available nodes when trying to choose
> node. Failing.
> Apr 26 18:28:46.837 [Info] router_choose_random_node(): We couldn't find
> any live, guard routers; falling back to list of all routers.
> Apr 26 18:28:46.839 [Warning] No available nodes when trying to choose
> node. Failing.
> Apr 26 18:28:46.842 [Warning] No available nodes when trying to choose
> node. Failing.
> Apr 26 18:28:46.844 [Info] router_choose_random_node(): We couldn't find
> any live, guard routers; falling back to list of all routers.
> Apr 26 18:28:46.847 [Warning] No available nodes when trying to choose
> node. Failing.
> Apr 26 18:28:46.850 [Warning] No available nodes when trying to choose
> node. Failing.
> Apr 26 18:28:46.853 [Warning] Failed to find node for hop 0 of our path.
> Discarding this circuit.
> Apr 26 18:28:46.856 [Info] onion_populate_cpath(): Generating cpath hop
> failed.
> Apr 26 18:28:47.860 [Info] choose_good_exit_server_general(): Found 74
> servers that might support 11/12 pending connections.
> Apr 26 18:28:47.864 [Info] choose_good_exit_server_general(): Chose exit
> server 'gashmish'
> Apr 26 18:28:47.899 [Info] router_choose_random_node(): We couldn't find
> any live, guard routers; falling back to list of all routers.
> Apr 26 18:28:47.903 [Warning] No available nodes when trying to choose
> node. Failing.
>
> [... and so on for 30 minutes, still no nodes ...]
>
> Apr 26 18:58:16.130 [Warning] No available nodes when trying to choose
> node. Failing.
> Apr 26 18:58:16.177 [Warning] Failed to find node for hop 0 of our path.
> Discarding this circuit.
> Apr 26 18:58:16.219 [Info] onion_populate_cpath(): Generating cpath hop
> failed.
> Apr 26 18:58:16.266 [Info] circuit_get_open_circ_or_launch(): No safe
> circuit (purpose 5) ready for edge connection; delaying.
>
> [... but suddenly ...]
>
> Apr 26 18:58:43.588 [Info] circuit_get_open_circ_or_launch(): No safe
> circuit (purpose 5) ready for edge connection; delaying.
> Apr 26 18:58:43.994 [Info] circuit_predict_and_launch_new(): Have 0 clean
> circs (0 internal), need another exit circ.Apr 26 18:58:44.258 [Info]
> update_consensus_router_descriptor_downloads(): 0 router descriptors
> downloadable. 0 delayed; 1782 present (0 of those were in old_routers); 0
> would_reject; 391 wouldnt_use; 0 in progress.
> Apr 26 18:58:44.329 [Info] routerlist_remove_old_routers(): We have 2211
> live routers and 0 old router descriptors. At most 2173 must be retained
> because of networkstatuses.
> Apr 26 18:58:44.398 [Info] update_consensus_networkstatus_downloads():
> Launching networkstatus consensus download.
> Apr 26 18:58:44.467 [Info] router_pick_directory_server(): No reachable
> router entries for dirservers. Trying them all again.
> Apr 26 18:58:44.567 [Debug] directory_initiate_command(): anonymized 0,
> use_begindir 1.
> Apr 26 18:58:44.640 [Debug] directory_initiate_command(): Initiating
> consensus network-status fetch
> Apr 26 18:58:44.709 [Info] connection_ap_make_link(): Making internal
> anonymized tunnel to [scrubbed]:443 ...
> Apr 26 18:58:44.778 [Debug] connection_add(): new conn type Socks, socket
> -1, n_conns 11.
> Apr 26 18:58:44.847 [Info] circuit_get_open_circ_or_launch(): No safe
> circuit (purpose 5) ready for edge connection; delaying.
> Apr 26 18:58:44.916 [Info] connection_ap_make_link(): ... application
> connection created and linked.
> Apr 26 18:58:44.985 [Debug] connection_add(): new conn type Directory,
> socket -1, n_conns 12.
> Apr 26 18:58:45.054 [Info] circuit_get_open_circ_or_launch(): No safe
> circuit (purpose 5) ready for edge connection; delaying.
> Apr 26 18:58:45.124 [Info] circuit_get_open_circ_or_launch(): No safe
> circuit (purpose 5) ready for edge connection; delaying.
>
> Apr 26 18:58:45.893 [Info] circuit_predict_and_launch_new(): Have 0 clean
> circs (0 internal), need another exit circ.
> Apr 26 18:58:45.964 [Debug] new_route_len(): Chosen route length 3 (2211
> routers available).
> Apr 26 18:58:46.034 [Info] choose_good_exit_server_general(): Found 119
> servers that might support 6/7 pending connections.
> Apr 26 18:58:46.105 [Debug] smartlist_choose_by_bandwidth(): Total
> weighted bw = 17053612, exit bw = 23966189, nonexit bw = 1614269, exit
> weight = 1.000000 (for exit == 1), guard bw = 19886442, nonguard bw =
> 5694016, guard weight = 0.571224 (for guard == 0)Apr 26 18:58:46.176
> [Info] choose_good_exit_server_general(): Chose exit server 'dotplex1'
> Apr 26 18:58:46.247 [Debug] onion_extend_cpath(): Path is 0 long; we want
> 3
> Apr 26 18:58:46.320 [Info] router_choose_random_node(): We couldn't find
> any live, guard routers; falling back to list of all routers.Apr 26
> 18:58:46.391 [Info] add_an_entry_guard(): Chose 'dannenberg' as new entry
> guard.
> Apr 26 18:58:46.462 [Info] log_entry_guards(): BrainwashEducation (up
> made-contact),GuyMontag (up made-contact) [...]
>
> Apr 26 18:58:46.540 [Debug] onion_extend_cpath(): Chose router dannenberg
> for hop 1 (exit is dotplex1)
> Apr 26 18:58:46.621 [Debug] onion_extend_cpath(): Path is 1 long; we want
> 3
> Apr 26 18:58:46.698 [Debug] choose_good_middle_server(): Contemplating
> intermediate hop: random choice.
> Apr 26 18:58:46.774 [Debug] smartlist_choose_by_bandwidth(): Total
> weighted bw = 70934573, exit bw = 23971981, nonexit bw = 107505582, ex
> it weight = 0.000000 (for exit == 0), guard bw = 114082699, nonguard bw =
> 17394864, guard weight = 0.615841 (for guard == 0)
> Apr 26 18:58:46.851 [Debug] onion_extend_cpath(): Chose router
> RentalSponge for hop 2 (exit is dotplex1)
> Apr 26 18:58:46.928 [Debug] onion_extend_cpath(): Path is 2 long; we want
> 3
> Apr 26 18:58:47.006 [Debug] onion_extend_cpath(): Chose router dotplex1
> for hop 3 (exit is dotplex1)
> Apr 26 18:58:47.084 [Debug] onion_extend_cpath(): Path is complete: 3
> steps long
>
> [Automatically added by flyspray2trac: Operating System: OSX 10.5
> Leopard]
New description:
Possible solution to #648
Machine was left on for several hours with TOR running but no dial up
network, then dial up network was brought
back up. TOR spends half an hour in a loop where it can't find nodes.
Throughout this time, using TORbutton to
switch off proxies in Firefox allowed me to browse the web with no
problems, so I know the network is working fine.
After waiting 30 minutes, killing and relaunching TOR seemed to fix the
problem, though from the logs below it looks
like TOR had finally made some circuits just before I killed it.
Taking a closer look at the logs, I can see it wasn't able to make
circuits for a whole 30 minutes until it called
update_consensus_networkstatus_downloads() and
router_pick_directory_server() - then all of a sudden everything
started to work.
Are these functions rarely called because of bandwidth issues? Could some
logic be added so that if we get
"[Warning] No available nodes when trying to choose node. Failing." then
these functions get called?
Here are the log messages:
Apr 26 18:28:46.820 [Info] circuit_get_open_circ_or_launch(): No safe
circuit (purpose 5) ready for edge connection; delaying.
Apr 26 18:28:46.822 [Info] circuit_get_open_circ_or_launch(): No safe
circuit (purpose 5) ready for edge connection; delaying.
Apr 26 18:28:46.824 [Info] circuit_predict_and_launch_new(): Have 0 clean
circs (0 internal), need another exit circ.
Apr 26 18:28:46.825 [Info] choose_good_exit_server_general(): Found 74
servers that might support 11/12 pending connections.
Apr 26 18:28:46.828 [Info] choose_good_exit_server_general(): Chose exit
server 'oemloi'
Apr 26 18:28:46.830 [Info] router_choose_random_node(): We couldn't find
any live, guard routers; falling back to list of all routers.
Apr 26 18:28:46.832 [Warning] No available nodes when trying to choose
node. Failing.
Apr 26 18:28:46.834 [Warning] No available nodes when trying to choose
node. Failing.
Apr 26 18:28:46.837 [Info] router_choose_random_node(): We couldn't find
any live, guard routers; falling back to list of all routers.
Apr 26 18:28:46.839 [Warning] No available nodes when trying to choose
node. Failing.
Apr 26 18:28:46.842 [Warning] No available nodes when trying to choose
node. Failing.
Apr 26 18:28:46.844 [Info] router_choose_random_node(): We couldn't find
any live, guard routers; falling back to list of all routers.
Apr 26 18:28:46.847 [Warning] No available nodes when trying to choose
node. Failing.
Apr 26 18:28:46.850 [Warning] No available nodes when trying to choose
node. Failing.
Apr 26 18:28:46.853 [Warning] Failed to find node for hop 0 of our path.
Discarding this circuit.
Apr 26 18:28:46.856 [Info] onion_populate_cpath(): Generating cpath hop
failed.
Apr 26 18:28:47.860 [Info] choose_good_exit_server_general(): Found 74
servers that might support 11/12 pending connections.
Apr 26 18:28:47.864 [Info] choose_good_exit_server_general(): Chose exit
server 'gashmish'
Apr 26 18:28:47.899 [Info] router_choose_random_node(): We couldn't find
any live, guard routers; falling back to list of all routers.
Apr 26 18:28:47.903 [Warning] No available nodes when trying to choose
node. Failing.
[... and so on for 30 minutes, still no nodes ...]
Apr 26 18:58:16.130 [Warning] No available nodes when trying to choose
node. Failing.
Apr 26 18:58:16.177 [Warning] Failed to find node for hop 0 of our path.
Discarding this circuit.
Apr 26 18:58:16.219 [Info] onion_populate_cpath(): Generating cpath hop
failed.
Apr 26 18:58:16.266 [Info] circuit_get_open_circ_or_launch(): No safe
circuit (purpose 5) ready for edge connection; delaying.
[... but suddenly ...]
Apr 26 18:58:43.588 [Info] circuit_get_open_circ_or_launch(): No safe
circuit (purpose 5) ready for edge connection; delaying.
Apr 26 18:58:43.994 [Info] circuit_predict_and_launch_new(): Have 0 clean
circs (0 internal), need another exit circ.Apr 26 18:58:44.258 [Info]
update_consensus_router_descriptor_downloads(): 0 router descriptors
downloadable. 0 delayed; 1782 present (0 of those were in old_routers); 0
would_reject; 391 wouldnt_use; 0 in progress.
Apr 26 18:58:44.329 [Info] routerlist_remove_old_routers(): We have 2211
live routers and 0 old router descriptors. At most 2173 must be retained
because of networkstatuses.
Apr 26 18:58:44.398 [Info] update_consensus_networkstatus_downloads():
Launching networkstatus consensus download.
Apr 26 18:58:44.467 [Info] router_pick_directory_server(): No reachable
router entries for dirservers. Trying them all again.
Apr 26 18:58:44.567 [Debug] directory_initiate_command(): anonymized 0,
use_begindir 1.
Apr 26 18:58:44.640 [Debug] directory_initiate_command(): Initiating
consensus network-status fetch
Apr 26 18:58:44.709 [Info] connection_ap_make_link(): Making internal
anonymized tunnel to [scrubbed]:443 ...
Apr 26 18:58:44.778 [Debug] connection_add(): new conn type Socks, socket
-1, n_conns 11.
Apr 26 18:58:44.847 [Info] circuit_get_open_circ_or_launch(): No safe
circuit (purpose 5) ready for edge connection; delaying.
Apr 26 18:58:44.916 [Info] connection_ap_make_link(): ... application
connection created and linked.
Apr 26 18:58:44.985 [Debug] connection_add(): new conn type Directory,
socket -1, n_conns 12.
Apr 26 18:58:45.054 [Info] circuit_get_open_circ_or_launch(): No safe
circuit (purpose 5) ready for edge connection; delaying.
Apr 26 18:58:45.124 [Info] circuit_get_open_circ_or_launch(): No safe
circuit (purpose 5) ready for edge connection; delaying.
Apr 26 18:58:45.893 [Info] circuit_predict_and_launch_new(): Have 0 clean
circs (0 internal), need another exit circ.
Apr 26 18:58:45.964 [Debug] new_route_len(): Chosen route length 3 (2211
routers available).
Apr 26 18:58:46.034 [Info] choose_good_exit_server_general(): Found 119
servers that might support 6/7 pending connections.
Apr 26 18:58:46.105 [Debug] smartlist_choose_by_bandwidth(): Total
weighted bw = 17053612, exit bw = 23966189, nonexit bw = 1614269, exit
weight = 1.000000 (for exit == 1), guard bw = 19886442, nonguard bw =
5694016, guard weight = 0.571224 (for guard == 0)Apr 26 18:58:46.176
[Info] choose_good_exit_server_general(): Chose exit server 'dotplex1'
Apr 26 18:58:46.247 [Debug] onion_extend_cpath(): Path is 0 long; we want
3
Apr 26 18:58:46.320 [Info] router_choose_random_node(): We couldn't find
any live, guard routers; falling back to list of all routers.Apr 26
18:58:46.391 [Info] add_an_entry_guard(): Chose 'dannenberg' as new entry
guard.
Apr 26 18:58:46.462 [Info] log_entry_guards(): BrainwashEducation (up
made-contact),GuyMontag (up made-contact) [...]
Apr 26 18:58:46.540 [Debug] onion_extend_cpath(): Chose router dannenberg
for hop 1 (exit is dotplex1)
Apr 26 18:58:46.621 [Debug] onion_extend_cpath(): Path is 1 long; we want
3
Apr 26 18:58:46.698 [Debug] choose_good_middle_server(): Contemplating
intermediate hop: random choice.
Apr 26 18:58:46.774 [Debug] smartlist_choose_by_bandwidth(): Total
weighted bw = 70934573, exit bw = 23971981, nonexit bw = 107505582, ex
it weight = 0.000000 (for exit == 0), guard bw = 114082699, nonguard bw =
17394864, guard weight = 0.615841 (for guard == 0)
Apr 26 18:58:46.851 [Debug] onion_extend_cpath(): Chose router
RentalSponge for hop 2 (exit is dotplex1)
Apr 26 18:58:46.928 [Debug] onion_extend_cpath(): Path is 2 long; we want
3
Apr 26 18:58:47.006 [Debug] onion_extend_cpath(): Chose router dotplex1
for hop 3 (exit is dotplex1)
Apr 26 18:58:47.084 [Debug] onion_extend_cpath(): Path is complete: 3
steps long
[Automatically added by flyspray2trac: Operating System: OSX 10.5 Leopard]
--
Comment(by nickm):
Just upping the required num_present value up to 3 wouldn't help
completely: we could have 3 routers that we weren't able to use in some
particular circuit for some other reason.
So perhaps we should just make the choose_random_node() code do something
in the case when it says "No available nodes when trying to choose node.
Failing" that makes us willing to retry downloads.
We don't want to have it always just make router_have_minimum_dir_info()
return 0, because although we lack the directory information we need to
build that one circuit, we don't necessarily lack enough directory info to
build _any_ multihop circuits.
Perhaps when choose_random_node fails on a _general_ circuit we should say
"no, not enough directory info"? Or perhaps we should split
router_have_minimum_dir_info() into two functions, one of which means "try
really hard to get more dir info" and another of which means "don't even
bother trying circuits"?
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/675#comment:11>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
_______________________________________________
tor-bugs mailing list
tor-bugs@xxxxxxxxxxxxxxxxxxxx
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs