
Re: Snakes On A Tor Scanner - 0.0.3



Thus spake Roger Dingledine (arma@xxxxxxx):

> > Lastly, the metatroller currently does not subscribe to router info or
> > (non-existent) network status events, so it should be restarted
> > periodically. When network-status events are available in 0.1.2.x I'll
> > support them.
> 
> If you could help us get moving on that, that would be great. Some sort
> of spec patch and preliminary code patch would be fabulous.

Ok, I'll try to get to this. Might be a while though. In the short
term I'm going to hack on the reason stats so we can figure out why
these circuits are failing, but Perl's data structure limitations are
proving even that to be a bit too much...

> > 3. SOAT is not likely to work optimally if you are using the same Tor
> > client for other things. In some cases this can cause the exit to change
> > between the time that SOAT uses it and the time that it detects an
> > error and asks Metatroller what exit was used.
> 
> When you extendcircuit, you can specify purpose=controller, and
> then Tor won't ever touch those circuits on its own.

Yes, but in that case Tor doesn't close the circuits when they are
old/unused, and I would have to track that myself. I'm opting to let
Tor handle tearing them down for me.
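
For illustration, the bookkeeping that purpose=controller would push
onto the controller could look roughly like the Python sketch below.
It is a sketch under assumptions, not metatroller code: the class
name, the 600-second idle limit, and the choice to return CLOSECIRCUIT
command strings instead of sending them are all hypothetical.

```python
import time


class ControllerCircuits:
    """Remembers when each controller-owned circuit was last used.

    Hypothetical helper: Tor will not expire purpose=controller
    circuits, so the controller has to decide when they are stale.
    """

    def __init__(self, max_idle_seconds=600.0, clock=time.monotonic):
        # 600 seconds is an assumed value, not a Tor default.
        self.max_idle_seconds = max_idle_seconds
        self._clock = clock
        self._last_used = {}  # circuit id -> time of last use

    def touch(self, circ_id):
        """Record that a circuit was just built or used."""
        self._last_used[circ_id] = self._clock()

    def forget(self, circ_id):
        """Drop a circuit that Tor reported as closed or failed."""
        self._last_used.pop(circ_id, None)

    def close_commands(self):
        """Return control-port commands for circuits idle too long."""
        now = self._clock()
        stale = [c for c, t in self._last_used.items()
                 if now - t > self.max_idle_seconds]
        for c in stale:
            del self._last_used[c]
        return ["CLOSECIRCUIT %s" % c for c in stale]
```

The caller would still need to send those commands over the control
port and call touch() whenever it attaches a stream to a circuit.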

However, that's not the problem with concurrent SOAT+normal usage. The
problem is mainly that if you try to connect to some port that the
current SOAT exit can't connect to, a new circuit will be built and a
new exit will be chosen. Right now there is no notion of circuits in
SOAT, so it just asks for the last exit used. Hence, if you caused
metatroller to build a new circuit while SOAT was using an old one,
and there is an MD5 error for some URL, the wrong exit will be blamed
for it.
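
The misattribution can be shown with a small Python sketch. It assumes
the controller sees which circuit each stream is attached to and the
path of each built circuit; the class and method names are
hypothetical and do not come from SOAT or metatroller.

```python
class ExitTracker:
    """Contrasts 'last exit used' with a per-stream exit lookup."""

    def __init__(self):
        self._exit_by_circ = {}    # circuit id -> exit node name
        self._circ_by_stream = {}  # stream id -> circuit id
        self.last_exit = None      # the "last exit used" shortcut

    def circuit_built(self, circ_id, path):
        # The exit is the final hop of the circuit's path.
        self._exit_by_circ[circ_id] = path[-1]

    def stream_attached(self, stream_id, circ_id):
        self._circ_by_stream[stream_id] = circ_id
        self.last_exit = self._exit_by_circ.get(circ_id)

    def exit_for_stream(self, stream_id):
        """Exit that actually carried this stream, or None if unknown."""
        circ_id = self._circ_by_stream.get(stream_id)
        return self._exit_by_circ.get(circ_id)


tracker = ExitTracker()
tracker.circuit_built(10, ["guardA", "midA", "exitA"])
tracker.stream_attached(1, 10)   # SOAT's fetch goes out through exitA
tracker.circuit_built(11, ["guardB", "midB", "exitB"])
tracker.stream_attached(2, 11)   # unrelated traffic forces a new circuit

# An MD5 error on stream 1 is now blamed on the wrong node if we only
# ask for the last exit, but not if we look the stream up.
print(tracker.last_exit)           # exitB
print(tracker.exit_for_stream(1))  # exitA
```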

> > I'm also suspicious of the 7/8 node cutoff for "fast" nodes.  I think
> > that perhaps it should be raised to 65% or so, but I have no hard data
> > as of yet to illustrate this cutoff point. Since adoption is critical
> > to anonymity, and regular people won't use Tor if they think it is
> > slow, I believe it is far more important that we have known reliable,
> > fast nodes than lots of slow ones that are prone to dropping circuits.
> > Hopefully we can discover these cutoffs using this tool.
> 
> That's an interesting question -- do the slow ones drop circuits more
> often? I'd be curious to hear some data on that.
> 
> More generally, while using a fraction of the nodes (7/8 or 65%) lets
> us adapt better to whatever network we have available, it may still not
> be the right approach if our goal is to have high chances of getting a
> non-sucky circuit. On the other hand, people who sign up to be relays
> but never get used may be sad. On the third hand, so what? Hm.

Yeah, we'll need to wait for me to do stream bandwidth statistics to
best figure this out. This may be a while out, but I thought I'd throw
it out there for consideration.
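
To make the two cutoffs concrete, here is a small Python sketch. It
assumes the cutoff means "keep the top fraction of nodes when ranked
by bandwidth"; the function name and the bandwidth figures are
invented for illustration and are not measurements.

```python
def fast_nodes(bandwidth_by_node, fraction):
    """Return the top `fraction` of nodes, fastest first.

    Assumption: the count is rounded down, so a partial node is
    never included.
    """
    ranked = sorted(bandwidth_by_node, key=bandwidth_by_node.get,
                    reverse=True)
    keep = int(len(ranked) * fraction)
    return ranked[:keep]


# Made-up bandwidths (KB/s) for eight hypothetical nodes.
nodes = {"a": 800, "b": 700, "c": 600, "d": 500,
         "e": 400, "f": 300, "g": 200, "h": 100}

print(len(fast_nodes(nodes, 7 / 8)))  # 7
print(len(fast_nodes(nodes, 0.65)))   # 5
```

With eight nodes, the 7/8 cutoff drops only the slowest one, while a
65% cutoff drops the slowest three.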

> While we're at it, would it be interesting to look into adding a country
> code to the network-status list, saying our best guess based on whois
> or whatever of where the node is? As more and more tools hardcode "fetch
> it from serifos, unauthenticated and with a single point of failure",
> it might be nice to offer a better option.

Yeah. If that shows up, I will make use of it. I'm a bit
over-committed to do this myself though.


Here's my task list:

1. Failure stats based on reason codes
2. Network/routerinfo status events
3. Node stats for stream bandwidth
4. Statistics on reasonable cutoff %-age
5. Not get fired from my day job 

A "Rewrite in Python" task may be inserted in there anywhere from 0-5,
depending on how many brick walls Perl presents. #1 alone is getting
extremely annoying because of limitations on thread-shared structures.
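
As a rough idea of what task #1 might look like after such a rewrite,
here is a minimal Python sketch of a thread-shared failure table. The
class name and layout are hypothetical, and the router names and
reason strings in the usage note are only placeholders.

```python
import threading
from collections import Counter, defaultdict


class FailureStats:
    """Per-router counts of failure reasons, safe to share across threads."""

    def __init__(self):
        self._lock = threading.Lock()
        self._reasons = defaultdict(Counter)  # router -> reason -> count

    def record(self, router, reason):
        """Count one failure for `router` with the given reason code."""
        with self._lock:
            self._reasons[router][reason] += 1

    def snapshot(self):
        """Return a plain-dict copy that is safe to read without the lock."""
        with self._lock:
            return {r: dict(c) for r, c in self._reasons.items()}
```

An event-handling thread would call something like
stats.record("nodeA", "TIMEOUT") for each failed circuit, and a
reporting thread would call stats.snapshot() to print the totals.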

Due to Task 5, other tasks may experience arbitrary delays ;)

-- 
Mike Perry
Mad Computer Scientist
fscked.org evil labs