Thus spake gojosan@xxxxxxxxxxxxx (gojosan@xxxxxxxxxxxxx):

> I just noticed this talk at the Security and Privacy Day from May 2008.
> While I understand that Tor's threat model does not defend against a GPA,
> I am still curious what effect this attack can have against the current,
> real Tor network?
>
> Simulating a Global Passive Adversary for Attacking Tor-like Anonymity
> Systems
> http://web.crypto.cs.sunysb.edu/spday/

A handful of comments about the paper (many of these they themselves
brought up, but some they did not):

0. They are really an active adversary here. They need to control a
website in order to unmask users, or have control of some exit nodes or
their upstream routers, and they must modulate the bandwidth of TCP
connections considerably (+/- 30-40KB/sec or more, for a period of 10
minutes).

1. Their results for path recognition (25% false negatives, 10% false
positives) are based on a limited sample of only 13 trial circuits
through a small set of nodes that must be geographically close to, and
well-peered with, their pinging 'vantage point(s)'. I suspect reality is
a lot less forgiving than this small sample suggests once more nodes and
trials are involved using the real Tor path selection algorithm.

2. The bandwidth estimation technique they use (based on TCP
Westwood/CapProbe/PacketPair) is very sensitive to any queuing and
congestion between the target and the 'vantage point(s)'. As soon as
congestion happens along the path, these estimators report a large
amount of excess capacity (rather than no capacity), because the
acks/responses get compressed together in queues. The way this has been
'fixed' in TCP Westwood+ is to filter out the high estimates and perform
weighted averaging to smooth fluctuations (precisely what they are
trying to measure). It would have been nice if they had provided some
more realistic testing of their bandwidth estimation consistency using
real-world nodes, as opposed to the lab results on half-duplex ethernet.
(A toy sketch of why queuing inflates these estimates follows the list
below.)

3. Based on my measurements last year, only the top ~5-10% of nodes are
capable of transmitting this much data on an individual stream, and only
if all of the nodes in your path come from this set. Furthermore, as
load balancing improves (and we still have more work to do here beyond
my initial improvements last year), these averages should in theory come
down for these nodes (but increase for slower nodes). So how they will
fare once we figure out the bottlenecks of the network is unknown. They
could do better in this case, but it is probably more likely that the
average stream capacity for most nodes will drop below their detection
threshold.

4. Right now these few fast nodes carry about 60% of the network
traffic. A rough back-of-the-envelope calculation based on our selection
algorithm says that only ~22% (.6*.6*.6) of the paths in the network
have this property for normal traffic, and only ~4.5% of hidden service
paths (which are 6 hops). (The arithmetic is spelled out just after the
list.)

5. Their error rates get pretty high once they start trying to trace the
stream back to its ISP (on top of the rates for path recognition alone).
Any other fluctuation in traffic is going to add error to this step, and
I imagine traffic fluctuates like crazy along these paths. They also
assume full a priori knowledge of these routes, which in practice means
a full map of all of the peering agreements of the Internet, and
'vantage point(s)' that have no queuing delay to any of them.
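To illustrate point 2: the core of these techniques is a packet-pair
estimate, where capacity is inferred from the spacing between two
back-to-back probe responses. The toy sketch below (all numbers are
invented, and this is not the paper's code) shows why a congested queue
that compresses that spacing makes the estimator report extra capacity
rather than none:

    # Toy packet-pair capacity estimator, to illustrate why queuing
    # inflates (rather than deflates) this class of estimate.
    # All numbers here are invented for illustration only.

    PROBE_SIZE_BYTES = 1500  # size of each back-to-back probe packet

    def packet_pair_estimate(arrival_gap_sec):
        """Capacity estimate in KB/sec from the spacing of a probe pair."""
        return (PROBE_SIZE_BYTES / 1024.0) / arrival_gap_sec

    # Uncongested path: the pair spacing reflects the true bottleneck rate.
    print(packet_pair_estimate(0.030))   # ~49 KB/sec

    # Congested path: both responses sit in the same queue and pop out
    # nearly back-to-back ("ack compression"), so the gap shrinks and the
    # estimator reports far more capacity, not less.
    print(packet_pair_estimate(0.002))   # ~730 KB/sec of phantom capacity

    # The Westwood+ 'fix' is to discard the high outliers and low-pass
    # filter (EWMA) the samples, smoothing away exactly the fluctuations
    # this attack needs to observe.
    def ewma_filter(samples, alpha=0.1):
        est = samples[0]
        for s in samples[1:]:
            est = (1 - alpha) * est + alpha * s
        return est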
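And to spell out the back-of-the-envelope numbers in point 4, assuming
(as a simplification of the real bandwidth-weighted selection) that each
hop independently has a 0.6 chance of landing on one of the fast nodes:

    # Odds that *every* hop of a path is drawn from the fast set,
    # assuming a 0.6 chance per hop (a simplification of the real
    # bandwidth-weighted path selection).
    p_fast_hop = 0.6

    p_normal_path = p_fast_hop ** 3   # 3-hop circuit: ~0.216, i.e. ~22%
    p_hidden_path = p_fast_hop ** 6   # 6-hop hidden service path: ~0.047

    print(p_normal_path, p_hidden_path)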
A couple of countermeasures are possible:

1. Nodes that block ICMP and filter closed TCP ports are less
susceptible to this attack, since they force the adversary to measure
the capacity changes at upstream routers instead (which will have
additional noise introduced by other peers utilizing the link). I am
wondering if this means we should scan the network to see how many of
these top nodes allow ICMP or send TCP resets, and whether it is
feasible to notify their operators that they may want to consider
improving their firewalls, since we're only talking about 100-150 IPs
here. There are a lot more critical things to scan for, though, so this
is probably lower priority. (A rough probe sketch is appended at the end
of this mail.)

2. Roger pointed out that clients can potentially protect themselves by
setting 'BandwidthRate 25KB' and setting 'BandwidthBurst' to some high
value, so that short-lived streams still get high capacity if it is
available, but once streams approach the 10-20 minute lifetime needed
for this attack to work, they should fall below the detectable
threshold. I think this is a somewhat ugly hack, and it should probably
be governed by a "High Security Mode" setting that is specifically tuned
to this purpose (and that collects other hacks that protect against
various attacks at the expense of performance/usability). (An
illustrative torrc fragment is appended at the end of this mail.)

All this aside, this is a very clever attack, and further evidence that
we should more closely study the capacity, reliability, queuing, and
general balancing properties of the network.

-- 
Mike Perry
Mad Computer Scientist
fscked.org evil labs
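For countermeasure 1, a scan could be as simple as the sketch below. It
uses scapy; the relay list, the assumed-closed port, and the timeouts
are placeholders, and this is just one way such a probe might be
written, not an existing Tor tool:

    # Which fast relays still answer ICMP echo or emit TCP RSTs for
    # closed ports (and are therefore directly measurable by the
    # attack's 'vantage points')?  Requires root privileges and scapy.
    from scapy.all import IP, ICMP, TCP, sr1

    RELAYS = ["192.0.2.1", "192.0.2.2"]   # placeholder: pull from the directory
    CLOSED_PORT = 44444                   # assumed to be closed on the relay

    for ip in RELAYS:
        icmp_reply = sr1(IP(dst=ip) / ICMP(), timeout=2, verbose=0)
        syn_reply = sr1(IP(dst=ip) / TCP(dport=CLOSED_PORT, flags="S"),
                        timeout=2, verbose=0)
        answers_icmp = icmp_reply is not None
        sends_rst = (syn_reply is not None and syn_reply.haslayer(TCP)
                     and bool(syn_reply[TCP].flags & 0x04))   # RST bit set
        print("%s icmp:%s rst:%s" % (ip, answers_icmp, sends_rst))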
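And for countermeasure 2, the client's torrc fragment might look
something like the following. Only the 25KB rate comes from the
discussion above; the burst value is an arbitrary illustration:

    # Long-lived streams get squeezed below the ~30-40KB/sec modulation
    # the attack needs, while a large BandwidthBurst keeps short-lived
    # streams fast.  The 10 MB burst is an arbitrary illustrative value.
    BandwidthRate 25 KB
    BandwidthBurst 10 MB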