[tor-bugs] #13718 [Tor]: Reachability Tests aren't conducted if there are no exit nodes
#13718: Reachability Tests aren't conducted if there are no exit nodes
--------------------+---------------------
Reporter: tom | Owner:
Type: defect | Status: new
Priority: normal | Milestone:
Component: Tor | Version:
Keywords: | Actual Points:
Parent ID: | Points:
--------------------+---------------------
Context:
* https://lists.torproject.org/pipermail/tor-dev/2014-October/007613.html
* https://lists.torproject.org/pipermail/tor-dev/2014-October/007654.html
On 22 October 2014 05:48, Roger Dingledine <arma@xxxxxxx> wrote:
>> What I had to do was make one of my Directory Authorities an exit -
>> this let the other nodes start building circuits through the
>> authorities and upload descriptors.
>
> This part seems surprising to me -- directory authorities always publish
> their dirport whether they've found it reachable or not, and relays
> publish their descriptors directly to the dirport of each directory
> authority (not through the Tor network).
>
> So maybe there's a bug that you aren't describing, or maybe you are
> misunderstanding what you saw?
>
> See also https://trac.torproject.org/projects/tor/ticket/11973
>
>> Another problem I ran into was that nodes couldn't conduct
>> reachability tests when I had exits that were only using the Reduced
>> Exit Policy - because it doesn't list the ORPort/DirPort! (I was
>> using nonstandard ports actually, but indeed the reduced exit policy
>> does not include 9001 or 9030.) Looking at the current consensus,
>> there are 40 exits that exit to all ports, and 400-something exits
>> that use the ReducedExitPolicy. It seems like 9001 and 9030 should
>> probably be added to that for reachability tests?
>
> The reachability tests for the ORPort involve extending the circuit to
> the ORPort -- which doesn't use an exit stream. So your relays should
> have been able to find themselves reachable, and published a descriptor,
> even with no exit relays in the network.
Okay, so the behavior I saw, and reproduced, is that reachability tests
didn't succeed (and therefore descriptors weren't uploaded) when there
were no exits. I think I may have figured out why, but there are some
internals I haven't completely figured out. I'm going to lay out what I
think and then the parts I'm not completely sure about.
First off, you're (obviously) correct that I misunderstood how the circuit
is extended: no Exit stream is needed. But I still think the lack of Exits
stopped the reachability tests from succeeding.
== too long; didn't read ==
I don't think reachability tests happen when there are no Exit nodes
because of a quirk in the bootstrapping process, where we never think we
have a minimum of directory information.
== target function: consider_testing_reachability ==
A reachability test is conducted from `consider_testing_reachability`. (I
think this is the only place, although there may be other situations I
haven't found.) `consider_testing_reachability` is called from
`circuit_send_next_onion_skin`, `circuit_testing_opened`,
`run_scheduled_events`, and `directory_info_has_arrived`.
== call site #1: directory_info_has_arrived ==
This is called very frequently on router startup. But
`consider_testing_reachability` will not be called if
`router_have_minimum_dir_info` returns false:
{{{
void directory_info_has_arrived(time_t now, int from_cache)
{ //...
if (!router_have_minimum_dir_info()) {
//...
return;
} else { /* ... */ }
if (server_mode(options) && !net_is_disabled() && !from_cache &&
(can_complete_circuit || !any_predicted_circuits(now)))
consider_testing_reachability(1, 1);
}
}}}
`router_have_minimum_dir_info` returns the static variable
`have_min_dir_info`. This variable is only set to 1 in
`update_router_have_minimum_dir_info` and then only if there are Exits!
Specifically, we will trigger `paths <
get_frac_paths_needed_for_circs(options,consensus)` because we have 0% of
the Exit Bandwidth, as shown by this error message:
{{{
Nov 09 22:10:26.000 [notice] I learned some more directory information,
but not enough to build a circuit: We need more descriptors: we have 5/5,
and can only build 0% of likely paths. (We have 100% of guards bw, 100% of
midpoint bw, and 0% of exit bw.)
}}}
{{{
update_router_have_minimum_dir_info(void)
{ //...
char *status = NULL;
int num_present=0, num_usable=0;
double paths = compute_frac_paths_available(consensus, options, now,
&num_present, &num_usable,
&status);
if (paths < get_frac_paths_needed_for_circs(options,consensus)) {
tor_snprintf(dir_info_status, sizeof(dir_info_status),
"We need more %sdescriptors: we have %d/%d, and "
"can only build %d%% of likely paths. (We have %s.)",
using_md?"micro":"", num_present, num_usable,
(int)(paths*100), status);
//...
res = 0;
goto done;
}
res = 1;
}
done:
if (res && !have_min_dir_info) { /* ... */ }
if (!res && have_min_dir_info) {
int quiet = directory_too_idle_to_fetch_descriptors(options, now);
tor_log(quiet ? LOG_INFO : LOG_NOTICE, LD_DIR,
"Our directory information is no longer up-to-date "
"enough to build circuits: %s", dir_info_status);
/* a) make us log when we next complete a circuit, so we know when Tor
* is back up and usable, and b) disable some activities that Tor
* should only do while circuits are working, like reachability tests
* and fetching bridge descriptors only over circuits. */
can_complete_circuit = 0;
control_event_client_status(LOG_NOTICE, "NOT_ENOUGH_DIR_INFO");
}
have_min_dir_info = res;
}
}}}
(The exact source line is in `frac_nodes_with_descriptors`, called by
`compute_frac_paths_available`:)
{{{
/** For all nodes in <b>sl</b>, return the fraction of those nodes, weighted
 * by their weighted bandwidths with rule <b>rule</b>, for which we have
 * descriptors. */
double
frac_nodes_with_descriptors(const smartlist_t *sl,
bandwidth_weight_rule_t rule)
{
//...
if (smartlist_len(sl) == 0)
return 0.0;
}}}
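With no Exits the exit list is empty, so the exit fraction is 0.0,
`have_min_dir_info` stays 0, and the early return shown earlier is always
taken. This prevents a reachability test from being started from
`directory_info_has_arrived`.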
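To make the arithmetic behind that log message concrete, here is a minimal sketch. It is not Tor's code: `sketch_node_t` and every `sketch_*` function are hypothetical names invented for illustration. It assumes that the path fraction is the product of the guard, midpoint and exit fractions, which matches the "100%, 100%, 0% gives 0% of likely paths" message above but is an assumption about `compute_frac_paths_available`. It also simplifies the zero-total-bandwidth case to return 0.0.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a relay; not a Tor type. */
typedef struct {
  double bandwidth;     /* weight of this relay for the position */
  int have_descriptor;  /* nonzero if we hold its descriptor */
} sketch_node_t;

/* Bandwidth-weighted fraction of nodes for which we have descriptors.
 * Mirrors the early return quoted above: an empty list yields 0.0. */
double
sketch_frac_with_descriptors(const sketch_node_t *nodes, size_t n)
{
  double total = 0.0, present = 0.0;
  size_t i;
  if (n == 0)
    return 0.0;
  assert(nodes != NULL);
  for (i = 0; i < n; ++i) {
    total += nodes[i].bandwidth;
    if (nodes[i].have_descriptor)
      present += nodes[i].bandwidth;
  }
  if (total <= 0.0)
    return 0.0;  /* simplification made for this sketch only */
  return present / total;
}

/* Assumption: the fraction of buildable paths is the product of the
 * per-position fractions, so one empty position zeroes the result. */
double
sketch_frac_paths(double f_guard, double f_mid, double f_exit)
{
  return f_guard * f_mid * f_exit;
}

/* The quoted check fails when paths < needed; this is its negation. */
int
sketch_have_min_dir_info(double paths, double needed)
{
  return paths >= needed;
}
```

With five relays that all have descriptors and an empty exit list, the guard and midpoint fractions are 1.0 and the exit fraction is 0.0, so the product is 0.0 and the check fails for any positive threshold, however many descriptors arrive.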
== call site #2: run_scheduled_events (and call site #3) ==
There's a litany of conditions that must hold before `run_scheduled_events`
calls `consider_testing_reachability`. In particular, there's
`can_complete_circuit`:
{{{
if (time_to_check_descriptor < now && !options->DisableNetwork) {
//...
/* also, check religiously for reachability, if it's within the first
* 20 minutes of our uptime. */
if (is_server &&
(can_complete_circuit || !any_predicted_circuits(now)) &&
!we_are_hibernating()) {
if (stats_n_seconds_working <
TIMEOUT_UNTIL_UNREACHABILITY_COMPLAINT) {
consider_testing_reachability(1, dirport_reachability_count==0);
}}}
`can_complete_circuit` is only set to 1 in `circuit_send_next_onion_skin`,
and then only if a circuit has been built and it is not a
`circ->build_state->onehop_tunnel`. I _think_ this means the circuit is a
full circuit, complete with Exit. Right?
{{{
int circuit_send_next_onion_skin(origin_circuit_t *circ)
{ //...
if (circ->cpath->state == CPATH_STATE_CLOSED) {
// ...
} else {
//...
hop = onion_next_hop_in_cpath(circ->cpath);
if (!hop) {
//...
if (!can_complete_circuit && !circ->build_state->onehop_tunnel) {
can_complete_circuit=1;
/* FFFF Log a count of known routers here */
log_notice(LD_GENERAL,
"Tor has successfully opened a circuit. "
"Looks like client functionality is working.");
//...
if (server_mode(options) && !check_whether_orport_reachable()) {
inform_testing_reachability();
consider_testing_reachability(1, 1);
}}}
This is also the third place `consider_testing_reachability` is called -
there is only one left:
== call site #4: circuit_testing_opened ==
{{{
/** A testing circuit has completed. Take whatever stats we want.
* Noticing reachability is taken care of in onionskin_answer(),
* so there's no need to record anything here. But if we still want
* to do the bandwidth test, and we now have enough testing circuits
* open, do it.
*/
static void
circuit_testing_opened(origin_circuit_t *circ)
{
if (have_performed_bandwidth_test ||
!check_whether_orport_reachable()) {
/* either we've already done everything we want with testing circuits,
* or this testing circuit became open due to a fluke, e.g. we picked
* a last hop where we already had the connection open due to an
* outgoing local circuit. */
circuit_mark_for_close(TO_CIRCUIT(circ), END_CIRC_AT_ORIGIN);
} else if (circuit_enough_testing_circs()) {
router_perform_bandwidth_test(NUM_PARALLEL_TESTING_CIRCS, time(NULL));
have_performed_bandwidth_test = 1;
} else
consider_testing_reachability(1, 0);
}
}}}
But as far as I can tell, a testing circuit is only used for two things:
conducting a reachability test and conducting a bandwidth self-test. The
only place a bandwidth self-test is started is inside
`circuit_testing_opened`. So this call to `consider_testing_reachability`
is a chicken-and-egg problem: it can only run after a testing circuit has
already been opened.
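The four call sites can be summarised as a small model of the flags that gate them. This is only a sketch of my reading of the excerpts, not Tor code: `sketch_flags_t` and the `sketch_*` functions are hypothetical. It assumes we are in server mode, the network is enabled, we are not hibernating, and the ORPort has not yet been found reachable. The value of `any_predicted_circuits(now)` is taken as an input, since nothing above establishes what it was in the failing setup.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the flags discussed in this ticket. */
typedef struct {
  bool have_min_dir_info;       /* router_have_minimum_dir_info() */
  bool can_complete_circuit;    /* set once a non-onehop circuit is built */
  bool any_predicted_circuits;  /* any_predicted_circuits(now) */
  bool multihop_circuit_built;  /* a non-onehop circuit just finished */
  bool testing_circuit_opened;  /* a testing circuit just opened */
} sketch_flags_t;

/* Gate shared by directory_info_has_arrived and run_scheduled_events. */
static bool
sketch_circuit_gate(const sketch_flags_t *f)
{
  return f->can_complete_circuit || !f->any_predicted_circuits;
}

/* Call site #1: needs minimum directory info and the shared gate. */
bool
sketch_from_directory_info_has_arrived(const sketch_flags_t *f)
{
  return f->have_min_dir_info && sketch_circuit_gate(f);
}

/* Call site #2: needs the shared gate. */
bool
sketch_from_run_scheduled_events(const sketch_flags_t *f)
{
  return sketch_circuit_gate(f);
}

/* Call site #3: needs a completed circuit that is not a onehop tunnel. */
bool
sketch_from_circuit_send_next_onion_skin(const sketch_flags_t *f)
{
  return f->multihop_circuit_built;
}

/* Call site #4: needs a testing circuit to have opened already. */
bool
sketch_from_circuit_testing_opened(const sketch_flags_t *f)
{
  return f->testing_circuit_opened;
}

/* True if at least one call site would reach the reachability test. */
bool
sketch_any_call_site_fires(const sketch_flags_t *f)
{
  return sketch_from_directory_info_has_arrived(f) ||
         sketch_from_run_scheduled_events(f) ||
         sketch_from_circuit_send_next_onion_skin(f) ||
         sketch_from_circuit_testing_opened(f);
}
```

In the no-Exit case described in this ticket, every flag is false except `any_predicted_circuits`, and no call site fires. Once a multi-hop circuit can be built, call sites #2 and #3 both fire in this model. The model also shows that the gate would pass if `any_predicted_circuits` were false, so the analysis only holds while circuits are predicted.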
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/13718>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>