Re: Exit node connection statistics

Sebastian Hahn:
But you are right. Maybe top 100 is too much and I should switch to a top 20 or so?

No, you should turn it off. Having those statistics doesn't add any value to the Tor network. You cannot even make broad statements like "30% of all traffic in Tor goes to xy.com", because you see only a tiny fraction and the real usage is likely to be entirely different - think about how different exit policies etc. come into play. It's generally recommended not to log unless you have a reason (for example, a bug you're trying to find).

The question is not whether it adds value to Tor, but whether it adds value in general. Whether that is the case I cannot tell yet, and I claim you can't either. It's just a first idea.

The stats are port-specific, so they are independent of exit policies. Since I assume most users don't pick specific exit nodes, I believe it's a fair assumption that the stats are more or less representative.

So it doesn't tell you anything that flickr.com, for example, made up more than 5% over the last few days, while the next host is below 1%? In my eyes, massive abuse is as much a reason as a bug.
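
To make concrete what kind of numbers are meant here, below is a minimal sketch of how port-specific top-N host shares could be computed. It is not the script actually running on the relay; it assumes the input has already been reduced to (port, host) pairs with no source information, and the function name, the sample data and the cutoff n are made up for illustration.

```python
from collections import Counter, defaultdict


def top_hosts_per_port(records, n=20):
    """Return {port: [(host, share), ...]} for the n most frequent hosts per port.

    records: iterable of (port, host) pairs (hypothetical input format);
    nothing about the client side is kept.
    share: the host's fraction of all connections counted on that port.
    """
    counts = defaultdict(Counter)
    for port, host in records:
        counts[port][host] += 1

    result = {}
    for port, counter in counts.items():
        total = sum(counter.values())
        # most_common(n) yields the n hosts with the highest counts, descending
        result[port] = [(host, c / total) for host, c in counter.most_common(n)]
    return result


# Made-up sample data, for illustration only.
sample = (
    [(80, "a.example")] * 6
    + [(80, "b.example")] * 3
    + [(80, "c.example")]
    + [(443, "a.example")] * 2
)
stats = top_hosts_per_port(sample, n=2)
```

Only aggregated counts per destination host survive in the result, and hosts below the cutoff are dropped entirely, which is why lowering n from 100 to 20 publishes less.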

The less verbose your logs are, the less likely it is that someone will find them interesting and make you hand them over. This applies to the whole community of relay operators - if it is a well-known fact that most of them log, adversaries might become more persuasive when they ask for logs.

I doubt this "well-known fact" depends on whether somebody is publishing stats. You always have to assume that a Tor relay might be logging, and so do the investigators. Whether they become active then depends on whether they were successful before in getting useful logs. My logs are not useful for backtracing, so I don't contribute to this effect.

Generally, Tor exit nodes must always be assumed to be malicious, but this of course doesn't mean that once it's a proven fact that an exit is malicious, it will be excluded.

Define "malicious". The key feature of Tor is that it doesn't rely on the trustworthiness of the relay operators; otherwise it would be useless. So I think the log issue is being overrated.

So, a personal question: What is your motive? Do you feel you have a right to know what people are doing? Because this is where the ice gets really thin...

My motive is that of any researcher: learn something. And yes, I do feel that I have the right to know what people are doing, but I don't have the right to know what a person is doing. That's a big difference. The ice gets thin if the Tor-FAQ argues: "we feel that we're doing pretty well at striking a balance currently", although we don't have any idea how much abuse is currently happening. (You cannot estimate it by the number of complaints.)

There are always side effects, so what side effects does Tor have? Maybe Tor in the end reduces privacy instead of improving it, if you look at the big picture? (For example, because it enables data-miners to anonymously break their privacy policies?) If we don't dare to look at what actually happens on the wire, with the excuse that Tor is about anonymity, we risk doing the wrong thing. And the good thing is: most of the transport-layer data is already anonymized. If you conduct studies in the normal carrier networks, you always have to make a big effort to anonymize the data before giving anything out. With Tor exit connections that's a lot easier, since the source is already unknown.

One could even take up this provocative position: Everybody can operate a Tor node. So everything that a Tor node sees is public by definition, as it can be seen by a random non-trustworthy person. So from a security point of view it makes no difference if any information about the traffic is made public. What becomes public then is information which is "lost" anyhow. P2P encryption is essential for sensitive data, with Tor even more so, and making all info public would just make that very clear to everybody.