
Snakes On A Tor

So I've hacked up some crazy perl to scan for changes in md5sums of
urls and SSL certs by exit nodes. Doesn't scrape google yet. Maybe I
will get to that, maybe not. Probably not in the immediate future. My
attention span is running out on this project. I'm thinking it's in
the "good enough" state, and I did manage to produce some fun toys for
y'all up to this point.

Here are some excerpts from the README:

The project essentially consists of two perl files. One called
'soat.pl', which is the scanner, and the other called 'metatroller.pl'
which is a meta-controller for Tor that can do all sorts of neat
tricks. The Metatroller can be run without SOAT, but SOAT requires the
Metatroller.

These both have been tested on perl v5.8 on Linux. The Metatroller has
also been tested on Windows ActivePerl 5.8.8 and does in fact play
nice with Vidalia. However SOAT itself depends on wget, openssl, and
md5sum, which are not likely to be present outside of Cygwin.

Once the Metatroller is up and running, you can telnet localhost 9052
and type HELP. You should get this:

220 Welcome to the Tor Metatroller v0.0.1! Try HELP for Info

214 Commands:
214   - Pick a two letter country code to select exits from, or ALL
214   - List countries that have tor nodes (not necessarily full exits)
214   - What % of the network is considered 'fast' for node selection
214  BWCUTOFF <#>
214   - Minimum observed bandwidth (KB) that a node must have to be selected
214  UNIFORM <1|0>
214   - Should selection among fast nodes be uniform (or bandwidth-biased)?
214  ORDEREXITS <1|0>
214   - Should exits be chosen one after another instead of randomly?
214  FASTEXITS <1|0>
214   - Should exits be chosen from 'fast' nodes or all nodes?
214  PATHLEN <#>
214   - What should the path length of circuits be?
214   - Reports the current exit
214   - Throw away all circuits and choose a new exit

Hopefully this is self-explanatory. I particularly like the COUNTRY
option. Since I'm a xenophobic American who's never left the country
except for the occasional visit to America Junior, it's like my own
little international travel simulator, but without all the laptop
scanning, stolen carry-on items, and cattle prodding.
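
If you would rather script the control port than telnet to it, something
like the following might work. It is only a sketch: the reply layout (a
"214" prefix, a command line, then a "- description" line) is inferred
from the HELP excerpt above and nothing else, and the CRLF line ending
and read-until-timeout loop in fetch_help() are assumptions about how
the Metatroller talks. Both function names are made up for illustration.

```python
import socket


def parse_help(reply_lines):
    """Pair each command line with the '- description' line after it.

    Assumes the layout seen in the HELP excerpt: every useful line
    starts with the code 214, and descriptions start with '- '.
    """
    commands = {}
    current = None
    for line in reply_lines:
        code, _, text = line.partition(" ")
        text = text.strip()
        if code != "214" or not text or text == "Commands:":
            continue
        if text.startswith("- "):
            # A description with no command line before it is skipped.
            if current is not None:
                commands[current] = text[2:]
                current = None
        else:
            current = text
    return commands


def fetch_help(host="localhost", port=9052, timeout=5.0):
    """Send HELP and collect whatever comes back before the timeout.

    Assumption: commands end in CRLF and the reply has no explicit
    terminator, so we simply read until the socket goes quiet.
    """
    data = b""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"HELP\r\n")
        try:
            while True:
                chunk = sock.recv(4096)
                if not chunk:
                    break
                data += chunk
        except socket.timeout:
            pass
    return data.decode("ascii", errors="replace").splitlines()
```

With a Metatroller listening, parse_help(fetch_help()) would give a dict
keyed by the usage string, e.g. "PATHLEN <#>".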


--------------  BIG FAT WARNING  -----------------

While many of these options may seem desirable at first glance, a
consideration of how Tor works can actually reveal that they are
extremely dangerous to your anonymity.

The major risk factors are:

PATHLEN - This is a bad idea all around to mess with if you need
anonymity. Since Tor uses telescoping to build circuits, it is
trivial for the first hop to fingerprint you based on the number of
packets that traverse it during circuit construction. If you are
the only one on the network building 6 hop circuits, you are
essentially giving nodes a unique fingerprint to track you with.
Likewise short paths are dangerous because it becomes trivial for a
node or pair of nodes to trace your exact path through the network.

BWCUTOFF/PERCENTFAST - These options may seem like a good way to
improve Tor speed, but in my experience they don't help you much once
you get beyond about 60% or so. This is where Tor node bandwidths
start to grow beyond 50kb/sec apiece. After this point, not only are
you skewing the load balancing on the Tor network, you are starting to
seriously limit the number of nodes your client chooses from, making
it much more likely that high-bandwidth adversaries observe the first
and last node in your circuit and can correlate them.
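
To make the shrinking-candidate-set problem concrete, here is a rough
sketch of this kind of node filtering. It is not the Metatroller's
actual selection code. It assumes PERCENTFAST means "keep the top N
percent of nodes by bandwidth", that BWCUTOFF is applied on top of
that, and that UNIFORM switches between a plain random pick and a
bandwidth-weighted one. The function names and the node list are
invented for the example.

```python
import random


def candidate_nodes(nodes, percent_fast=100, bw_cutoff_kb=0):
    """Filter (nickname, bandwidth_kb) pairs down to the 'fast' set.

    Assumed semantics: keep the top percent_fast percent by bandwidth,
    then drop anything below bw_cutoff_kb.
    """
    ranked = sorted(nodes, key=lambda n: n[1], reverse=True)
    keep = max(1, len(ranked) * percent_fast // 100)
    return [n for n in ranked[:keep] if n[1] >= bw_cutoff_kb]


def pick_node(candidates, uniform=True, rng=random):
    """Pick one node, uniformly or weighted by bandwidth."""
    if uniform:
        return rng.choice(candidates)
    weights = [bw for _, bw in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]


# Hypothetical network of five nodes.
nodes = [("a", 500), ("b", 200), ("c", 80), ("d", 40), ("e", 10)]
fast = candidate_nodes(nodes, percent_fast=40)
print(len(fast), "of", len(nodes), "nodes left to choose from")
```

The tighter the filter, the fewer nodes are left, and the bigger the
share of your circuits that any one high-bandwidth operator gets to see.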

Also, note that the metatroller itself may introduce weird timing
and/or circuit usage signature patterns that may or may not give you
away. I did my best to make the defaults look as much like Tor as
possible, but there may be subtle differences that can be picked up
on. Perhaps the most obvious one is the fact that wget and not Tor is
being used to fetch directory information. I have set the user agents
to be identical, but there may be other differences.

Another possible giveaway is that I do not use uptime information in
the node selection process. Nodes may be able to tell you are a
Metatroller client if one of their neighbors for that circuit has
extremely low uptime.

A couple of caveats/rough edges with SOAT:

1. You should customize the list of exe's yourself to add some random
stuff, various document/image URLs, and so on. This list shouldn't be
published if you intend to post results, except in the case of
corruption. You should change it every once in a while. Future
versions may automate this by scraping shit off google/sourceforge.

2. Some nodes simply return "Connection: close" for URLs, perhaps
because of a malfunctioning upstream squid proxy or who knows what.
I've decided to let this trigger the MD5 warning, because it is
freaking annoying when I use Tor normally, and it might be nice to
build a list of nodes that do this to either fix them or simply
shitlist them.

3. Some SSL websites (for example citibank.com) actually have a whole
collection of SSL certificates for some unknown reason. Even saving
SSL certs specific to each IP of a round-robin DNS host still isn't
good enough.  Ugh.

4. When you exit via multiple control-C's, the last couple MD5s will
show up as bogus because the system() function causes the SIGINT to be
delivered to wget and not perl. The fix for this listed in the perlfaq
did not work. :(

5. SOAT is likely to not work optimally if you are using the same Tor
client for other things. In some cases this can cause the exit to
change between the time that SOAT uses it and the time that it detects
an error and asks Metatroller what exit was used. It is probably best
to run a secondary Tor client with a different control port just for
SOAT and the Metatroller. You probably want this for anonymity reasons
also, especially since the default path length used by SOAT is only 2.
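
For anyone who wants the gist of the MD5 check without reading the perl,
here is a minimal sketch. SOAT itself shells out to wget and md5sum;
this version just hashes bytes you hand it. The baseline dict, the
function names, and the "first fetch becomes the baseline" rule are all
assumptions made for the example, not a description of how soat.pl
stores its sums.

```python
import hashlib


def md5_hex(data):
    """Hex MD5 digest of a bytes object."""
    return hashlib.md5(data).hexdigest()


def check_fetch(url, body, baseline, exit_name):
    """Compare a fetched body against the stored digest for its URL.

    Returns a warning string on mismatch, otherwise None. The first
    body seen for a URL is recorded as the baseline (an assumption).
    """
    digest = md5_hex(body)
    expected = baseline.get(url)
    if expected is None:
        baseline[url] = digest
        return None
    if digest != expected:
        return "MD5 mismatch for %s via %s: expected %s, got %s" % (
            url, exit_name, expected, digest)
    return None
```

Note that an empty body, such as what you get from a node that just
answers "Connection: close", hashes differently from the baseline and
so trips the warning, which matches caveat 2 above.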

Note that Tor node operators can conceivably run SOAT on their Tor
nodes with a path length of 1, since for them scanned nodes won't be
able to tell for sure if they are the originators, or just relaying
another circuit.


I will be running this thing myself. If I notice anything interesting,
I'll post it to the list. Of course my own exit node is always clean
and never ever ever injects malicious code. So no one needs to scan it
at all. You can all trust me. Nobody else should scan. ;)

So far my "Connection: close" list is:

- baphomet
- err
- moulticastfrsrv
- ni
- pax

Anyone know what causes this? They don't do it all the time. Just

Mike Perry
Mad Computer Scientist
fscked.org evil labs