This proposal aims to allow us to load balance properly between Guard,
Middle, and Exit nodes with the addition of padding traffic to the
network.
...
1. Overview
For padding overhead due to Proposals 251 and 254, and changes to hidden
service path selection in Proposal 247, it will be useful to be able to
specify a pair of parameters that represents the additional traffic
present on Guard and Middle nodes due to these changes.
I don't know if it's worth noting that proposals 252 and 260 (the Single Onion
Services variants) will reduce traffic to guard and middle nodes, as they
remove the nodes between the hidden service and client-controlled endpoint
(rendezvous point in Rendezvous Single Onion Services, extend point in
Single Onion Services).
I think these will just be part of the overhead parameters?
...
2. Simplified Load Balancing Equations
Recall that the point of the load balancing equations in section 3.8.3
of dir-spec.txt is to ensure that an equal amount of client traffic is
distributed between Guards, Middles, Exits, and Guard+Exits, where each
flag type can occupy one or more positions in a path. This allocation is
accomplished by solving a system of equations for weights for flag
position selection to ensure equal allocation of client traffic for each
position in a circuit.
If we ignore overhead for the moment and treat Guard+Exit nodes as Exit
nodes, then this allows the simplified system of equations to become:
Wgg*G == M + Wme*E + Wmg*G # Guard position == middle position
Wgg*G == Wee*E # Guard position == equals exit position
Wmg*G + Wgg*G == G # Guard allocation weights sum to 1
Wme*E + Wee*E == E # Exit allocation weights sum to 1
This system has four equations and four unknowns, and by transitivity we
ensure that allocated capacity for guard, middle, and exit positions are
all equal. Unlike the equations in 3.8.3 of dir-spec.txt, there are no
special cases to the solutions of these equations because there is no
shortage of constraints and no decision points for allocation based on
scarcity. Thus, there is only one solution. Using SymPy's symbolic
equation solver (see attached script) we obtain:
E + G + M E + G + M 2*E - G - M 2*G - E - M
Wee: ---------, Wgg: ---------, Wme: -----------, Wmg: ------------
3*E 3*G 3*E 3*G
For the rest of the flags weights, we will do the following:
Dual-flagged (Guard+Exit) nodes should be treated as Exits:
Wgd = 0, Wmd = Wme, Wed = Wee
Directory requests use middle weights:
Wbd=Wmd, Wbg=Wmg, Wbe=Wme, Wbm=Wmm
Handle bridges and strange exit policies:
Wgm=Wgg, Wem=Wee, Weg=Wed
I think this analysis assumes that all paths in the Tor network are 3-hop paths.
What about the 4-hop paths created by circuit cannibalisation?
(We don't actually know what proportion of the Tor network has 3-hop vs 4-hop paths.)
Depending on whether an exit or internal circuit is cannibalised, they can look like:
G M E E
G M M E
And what about hidden service paths (paths that include two middle nodes?)
G M M
Or, if cannibalised from an exit or internal circuit:
G M E M
G M M M
Again, I think these will just be part of the overhead parameters?