Hey,

> > This algorithm keeps track of the unreachability status for guards
> > in state private to the algorithm - this is re-initialized every time
> > START is called.
> >
>
> Hmm, didn't we decide to persist the unreachability status over runs, right?
> Or not?

Yeah, I think we did decide to persist it between runs, but not more
permanently. I've changed it now.

> > SAMPLED_UTOPIC_GUARDS
> > This is a set that contains all guards that should be considered
> > for connection under utopic conditions. This set should be
> > persisted between runs. It will be filled in by the algorithm if
> > it's empty, or if it contains less than SAMPLE_SET_THRESHOLD
> > guards after winnowing out older guards. It should be filled by
> > using NEXT_BY_BANDWIDTH with UTOPIC_GUARDS as an argument.
> >
>
> Should we use UTOPIC_GUARDS or REMAINING_UTOPIC_GUARDS as the
> argument?

It should be UTOPIC_GUARDS, since REMAINING_UTOPIC_GUARDS will always
be a subset of SAMPLED_UTOPIC_GUARDS.

> I guess you mean SAMPLED_DYSTOPIC_GUARDS.

Yep, thanks. Fixed.

> > REMAINING_UTOPIC_GUARDS
> > This is a running set of the utopic guards we have not yet tried
> > to connect to. It should be initialized to be SAMPLED_UTOPIC_GUARDS
> > without USED_GUARDS.
> >
>
> Maybe here we should also mention that we will reinsert guards that we have not
> tried in a long time (GUARDS_RETRY_TIME) as specified by 2.2.2?

Yep, good clarification. I've added that.

> > [XXX defining "was not possible to connect" as "entry is not live" according
> > to current definition of "live entry guard" in tor source code, seems
> > to improve success rate on the flaky network scenario.
> > See: https://github.com/twstrike/tor_guardsim/issues/1#issuecomment-187374942]
> >
>
> Hmm, I'm not sure what this XXX means exactly. I believe we should actually try
> to _connect_ to those primary guards and not just check if we think
> they are live.

Yeah, I don't know where it comes from either - @rjunior, care to
expand on it?

> > §2.2.2. The STATE_TRY_UTOPIC state
> >
> > In order to give guards that have been marked as unreachable a
> > chance to come back, add all entries in TRIED_GUARDS that were
> > marked as unreachable more than GUARDS_RETRY_TIME minutes ago back
> > to REMAINING_UTOPIC_GUARDS.
> >
>
> I'm a bit puzzled by this mechanism. Maybe its benefits can be explained a bit
> more clearly?
>
> When we add guards back to REMAINING_UTOPIC_GUARDS, do we also remove them from
> TRIED_GUARDS?

Well, TRIED_GUARDS doesn't really do much at the moment. In fact, it
might be easier to just remove it. I've done that and it simplifies
things as well.

> Now that we have persistent SAMPLED_UTOPIC_GUARDS is this still useful? Won't
> we have fully populated our SAMPLED_*_GUARDS structures by the point this rule
> triggers?

Agree, I've removed it. Much nicer and neater now! =D

> > §2.2.5. ON_NEW_CONSENSUS
> >
> > First, ensure that all guard profiles are updated with information
> > about whether they were in the newest consensus or not. If not, the
> > guard is considered bad.
> >
>
> Maybe instead of "If not" we could say "If a guard is not included in the newest
> consensus" to make it a bit clearer.

Good clarification, done.

> > [XXX Does "add it back in the place it should have been in PRIMARY_GUARDS
> > if it had been non-bad" imply keeping original order?]
> >
>
> If I understand correctly, I think the answer to this XXX is
> "Ideally, yes.".

Yes, that is definitely the answer.

> I'm curious to see how this mechanism will be implemented because it's important
> and it would be nice if it's done cleanly.

I can see a few different ways to do it easily. One of them would be
to just rerun the original primary guard selection algorithm until we
find the guard we want to insert.

> Also, we should be careful about when we count 'bad' guards. After a few weeks
> of operation, the USED_GUARDS list can accumulate multiple bad guards, and we
> should make sure we don't count them when we do our threshold
> checks.

Absolutely.

> Just a reminder that we also discussed adding the "Retry primary guards if we
> have looped over the whole guardlist" heuristic somewhere here. Because in many
> cases the network can go down and then back up in less than a minute.

Actually, that retry heuristic is there. Or maybe I misunderstand the
point.

> IIUC, if the guard is not in USED_GUARDS it should be added *last* (that is,
> with lowest priority).

Yep, added that.

> We should decide if we want to actually use a dynamic percentage here, or just
> set the threshold to a constant value.
>
> A dynamic percentage might give us better security and reachability as the
> network evolves, but might also cause unpredictable behaviors if we suddenly
> get too many guards or too many of them disappear.
>
> I don't have a strong opinion here.

Me neither. I think a percentage is a good starting point - it feels
easier to tweak in different ways.

> It seems to me that the value 20 here could get reduced to something like 5 or
> even less. Of course 5 is also an arbitrary value and to actually find out the
> "best" number here we should test the algorithm ourselves in various network
> types.

Arbitrarily changed to 5. =)

Cheers
--
 Ola Bini (https://olabini.se)

 "Yields falsehood when quined" yields falsehood when quined.
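
A rough sketch of the reinsertion idea mentioned above (putting a guard back
into PRIMARY_GUARDS at the place the original selection would have given it).
This is only an illustration, not the proposal's implementation: it assumes
the order produced by the original primary guard selection can be reproduced
as a plain list, and the names reinsert_guard, primary_guards and
original_order are hypothetical.

```python
def reinsert_guard(primary_guards, original_order, guard):
    """Return a new list with `guard` put back where the original
    selection order would have placed it (hypothetical helper).

    Assumes every guard involved appears in `original_order`.
    """
    if guard in primary_guards:
        # Nothing to do: the guard was never removed.
        return list(primary_guards)

    # Rank of each guard in the order the original selection produced them.
    rank = {g: i for i, g in enumerate(original_order)}

    result = list(primary_guards)
    for idx, current in enumerate(result):
        # Insert before the first guard that was originally picked later.
        if rank[current] > rank[guard]:
            result.insert(idx, guard)
            return result

    # The guard was originally picked after everything still in the list.
    result.append(guard)
    return result
```

For example, with an original order of a, b, c, d and a current list of a, c,
reinserting b gives a, b, c, so the original relative order is kept.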
_______________________________________________
tor-dev mailing list
tor-dev@xxxxxxxxxxxxxxxxxxxx
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev