Re: [tor-bugs] #12595 [Tor]: Think of better data structures for guard nodes
#12595: Think of better data structures for guard nodes
------------------------+--------------------------------
Reporter: asn | Owner:
Type: defect | Status: new
Priority: normal | Milestone: Tor: 0.2.6.x-final
Component: Tor | Version:
Resolution: | Keywords: tor-guard
Actual Points: | Parent ID:
Points: |
------------------------+--------------------------------
Comment (by asn):
Replying to [comment:7 nickm]:
> Reading through all of the above stuff, in fact, I think that the right
> way to approach design here might be to step back from the "what is the
> right data structure" question and ask ourselves, "what interface must
> these algorithms expose, and how should they implement it?" IOW, can we
> enumerate their inputs and outputs, the events that affect them, and the
> operations they need to perform? If we figure that out, we should be able
> to examine some ideas in pseudocode and converge on something clever.
I agree that this will be helpful.
When I started thinking about this problem, I wanted to formalize it
but I quickly found myself unsure of how much formalization is
actually productive here.
I found it helpful to think of this as an OOP system. I thought of a
class called 'Guard' and a class called 'GuardList':
A 'Guard' is basically a Tor node plus some metadata about it. Using
the metadata you can answer questions like "is it fast?", "is it
dir?", "was it unreachable?", "should we retry?", "is it too old?".
A 'GuardList' contains many 'Guard' objects and knows how to select
between them. It basically acts as a conveyor belt for them.
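To make the two classes a bit more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the field names (`is_fast`, `is_dir`, `unreachable_since`, `added_at`) and the time-based helpers are hypothetical and do not correspond to existing Tor structures.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Guard:
    """A Tor node plus metadata about it (all fields are hypothetical)."""
    nickname: str
    is_fast: bool = True
    is_dir: bool = False
    unreachable_since: Optional[float] = None  # None means "reachable"
    added_at: float = 0.0                      # when we first picked it

    def was_unreachable(self) -> bool:
        return self.unreachable_since is not None

    def should_retry(self, now: float, retry_interval: float) -> bool:
        # Retry only guards that failed at least `retry_interval` ago.
        return (self.unreachable_since is not None
                and now - self.unreachable_since >= retry_interval)

    def is_too_old(self, now: float, max_age: float) -> bool:
        return now - self.added_at > max_age


@dataclass
class GuardList:
    """Ordered "conveyor belt" of guards; earlier entries are preferred."""
    guards: List[Guard] = field(default_factory=list)

    def add_guard(self, guard: Guard) -> None:
        self.guards.append(guard)

    def get_num_guards(self) -> int:
        return len(self.guards)
```

The point is only that each question about a guard becomes a small query on its metadata, while ordering and selection live in the list.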
The main interface that Tor needs from the guard subsystem is the set of
functions that return entry guards and directory guards. In the
current codebase, this is choose_random_entry_impl().
We can imagine this function as a method of the GuardList that does
something similar to what was described in comment:9: it takes as
input some restrictions (the type of the circuit, the exit node
family, whether bridges are used etc.) and outputs a node that can be
used (or NULL if nothing could be found). The method has side-effects,
since the guard list might need to be extended to find a suitable
node.
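A rough Python sketch of such a selection method follows. It is not how choose_random_entry_impl() works: the `Restrictions` fields, the `candidates` pool, and the in-order (rather than randomly weighted) extension of the list are all simplifying assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Optional


@dataclass
class Restrictions:
    need_dir: bool = False                 # caller wants a directory guard
    excluded: FrozenSet[str] = frozenset() # e.g. nodes in the exit's family


@dataclass
class Node:
    name: str
    is_dir: bool = False


def usable(node: Node, r: Restrictions) -> bool:
    return node.name not in r.excluded and (node.is_dir or not r.need_dir)


@dataclass
class GuardList:
    guards: List[Node] = field(default_factory=list)
    # Hypothetical pool of nodes we could still promote to guards.
    candidates: List[Node] = field(default_factory=list)

    def choose(self, r: Restrictions) -> Optional[Node]:
        # Prefer guards we already have, in list order.
        for g in self.guards:
            if usable(g, r):
                return g
        # Side effect: extend the guard list until something fits.
        while self.candidates:
            new = self.candidates.pop(0)
            self.guards.append(new)
            if usable(new, r):
                return new
        return None  # nothing suitable could be found
```

Note that a failed search still leaves the list extended, which is exactly the kind of side effect that the interface needs to make explicit.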
Some more methods of the guard list could be:
- load_guards(), which loads guards from the torrc and state file.
- save_guards(), which saves guards to the state file.
- refresh(), which kills old/bad guards based on a newly arrived
  consensus (see entry_guards_compute_status())
- network_is_back_up(), which is the "network up" trigger (see comment:5)
- stuff like add_guard(), remove_guard(), get_num_guards(), etc.
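As a sketch of two of these methods, the Python below models refresh() and network_is_back_up(). The consensus is reduced to a mapping from node name to a set of flags, and the behaviour of network_is_back_up() (clearing unreachability marks so guards get retried) is my guess at what the trigger would do, not something taken from the code.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Guard:
    name: str
    unreachable_since: Optional[float] = None  # None means "reachable"


@dataclass
class GuardList:
    guards: List[Guard] = field(default_factory=list)

    def refresh(self, consensus: Dict[str, Set[str]]) -> List[Guard]:
        """Drop guards the consensus no longer lists with the Guard flag.

        Returns the dropped guards so the caller can log or persist them.
        """
        kept: List[Guard] = []
        dropped: List[Guard] = []
        for g in self.guards:
            if "Guard" in consensus.get(g.name, set()):
                kept.append(g)
            else:
                dropped.append(g)
        self.guards = kept
        return dropped

    def network_is_back_up(self) -> None:
        # Assumption: forget past failures so every guard is retried.
        for g in self.guards:
            g.unreachable_since = None
```

Whether refresh() should remove guards outright or only mark them as bad is an open design choice; the sketch removes them to keep the example short.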
Continuing with comment:9, the main methods of the Guard class would
be:
`bool should_be_used_as_dir_guard(restrictions)` and
`bool should_be_used_as_circuit_guard(restrictions)`
These methods take a look at the latest consensus, the metadata of
each guard and the restrictions imposed, to figure out whether a Guard
can be used as a guard *right now*. These methods will be used by the
GuardList when choosing a guard node.
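The two predicates could look roughly like the Python sketch below. The `Restrictions` fields and the cached `has_guard_flag` and `is_dir` values (standing in for a lookup in the latest consensus) are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Restrictions:
    exit_family: FrozenSet[str] = frozenset()     # names in the exit's family
    excluded_nodes: FrozenSet[str] = frozenset()  # e.g. from ExcludeNodes


@dataclass
class Guard:
    name: str
    has_guard_flag: bool = True   # stand-in for a consensus lookup
    is_dir: bool = True           # stand-in for a consensus lookup
    unreachable_since: Optional[float] = None

    def _usable_at_all(self, r: Restrictions) -> bool:
        # Checks shared by both kinds of use.
        return (self.has_guard_flag
                and self.unreachable_since is None
                and self.name not in r.excluded_nodes)

    def should_be_used_as_circuit_guard(self, r: Restrictions) -> bool:
        # A circuit guard must not share a family with the exit.
        return self._usable_at_all(r) and self.name not in r.exit_family

    def should_be_used_as_dir_guard(self, r: Restrictions) -> bool:
        # A directory guard must currently be a directory.
        return self._usable_at_all(r) and self.is_dir
```

Splitting out the shared checks keeps the two public predicates down to the one condition that actually differs between them.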
Now, here are various events that need to be taken into account:
- The network is down! (comment:5)
- Directory guard does not have descriptor X (comment:8)
- Top circuit guard is not a directory.
- Guard used to be OK, but now:
  - is not a guard anymore
  - is not a directory
  - is path bias disabled
  - is in the same family as exit
  - is too old
  - is an excluded node
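One way to pin this list down is to write the events as explicit enumerations, as in the Python sketch below. The type and member names are hypothetical, and `reasons_not_ok()` simply assumes the caller has already computed a boolean per reason.

```python
from enum import Enum, auto
from typing import List, Mapping


class GuardEvent(Enum):
    NETWORK_DOWN = auto()
    DIR_GUARD_LACKS_DESCRIPTOR = auto()
    TOP_CIRCUIT_GUARD_NOT_DIR = auto()
    GUARD_NO_LONGER_OK = auto()


class NotOkReason(Enum):
    NOT_A_GUARD = "is not a guard anymore"
    NOT_A_DIRECTORY = "is not a directory"
    PATH_BIAS_DISABLED = "is path bias disabled"
    SAME_FAMILY_AS_EXIT = "is in the same family as exit"
    TOO_OLD = "is too old"
    EXCLUDED = "is an excluded node"


def reasons_not_ok(status: Mapping[str, bool]) -> List[NotOkReason]:
    """Return every reason whose flag is set in `status`.

    Keys of `status` are NotOkReason member names; missing keys count as False.
    """
    return [r for r in NotOkReason if status.get(r.name, False)]
```

Having the reasons as data makes it easier to see which ones are permanent (excluded node) and which may clear on the next consensus (lost the Guard flag).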
Any suggestions on how I should be looking at this better?
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/12595#comment:10>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online