
Re: eliminating bogus port 43 exits



On 6/12/2009 3:29 AM, Scott Bennett wrote:
> In other words, by restricting just port 43 exits to only the legitimate whois
> IP addresses, I eliminated at least 70% of *all* exits through my tor node,
> which suggests to me that the vast, overwhelming majority of exits from the
> tor network are illegitimate and place a terribly taxing load upon the tor
> network as a whole.

Scott,

Thanks for your continued analysis; this is interesting information.
However, the list of WHOIS servers you mentioned (and I snipped for
brevity) is by no means a complete set of "the legitimate WHOIS IP
addresses".  In fact, it's much too small to draw any significant
conclusions, for at least two major reasons:

1) Any .com or .net WHOIS queries that hit whois.verisign-grs.com (aka
whois.internic.net in your list) with a legitimate domain name will
result in a referral to an individual registrar's WHOIS server, which
will often be followed by the client, and would not be allowed by your
exit policy.  There are potentially tens of thousands of these registrar
WHOIS servers out there.

2) Your list excludes all ccTLD WHOIS servers.  While the number of
domains registered in ccTLDs is not significant compared to .com/.net,
their use is quite popular in a number of places, particularly in some
where Tor is also quite popular, e.g. Germany.
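
As a rough illustration of the referral problem in point 1, here is a
minimal Python sketch of what a referral-following WHOIS client does.
The field labels in the regular expression ("Whois Server" and
"Registrar WHOIS Server") are assumptions about what a thin-registry
response looks like, and the function names are hypothetical; the
query itself is just the plain RFC 3912 exchange on TCP port 43.

```python
import re
import socket

# Labels assumed to introduce the referral in a thin-registry response.
# The exact label varies by registry, so treat this pattern as illustrative.
_REFERRAL_RE = re.compile(
    r"^[ \t]*(?:Registrar WHOIS Server|Whois Server):[ \t]*(\S+)[ \t\r]*$",
    re.IGNORECASE | re.MULTILINE,
)


def extract_referral(response_text):
    """Return the referred WHOIS host named in a response, or None."""
    match = _REFERRAL_RE.search(response_text)
    return match.group(1).lower() if match else None


def whois_query(server, query, timeout=10.0):
    """Send one WHOIS query: the query plus CRLF, then read until close."""
    with socket.create_connection((server, 43), timeout=timeout) as sock:
        sock.sendall(query.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

A client would call whois_query() against the registry, pass the
result to extract_referral(), and then open a second port 43
connection to whatever host comes back.  That second connection is the
one an exit policy limited to a short list of registry addresses would
refuse.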

I'd be interested in seeing a comparison done with a more complete
list.  I understand you feel very strongly about not sampling the
contents of the traffic, and that's perfectly understandable and
appropriate, but sampling is probably the only way to actually make a
firm determination of how much of this exit traffic really is WHOIS,
without crafting a VERY large exit policy.  It may be possible, with
appropriately engineered tools, to sample the traffic in a suitably
anonymous way but still draw some conclusions, perhaps by simply
attempting to determine if the TCP session involves mostly text or
binary data.  That may still be a bit too intrusive, so I suppose we
might just never know.
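
To make the text-versus-binary idea a little more concrete, here is a
minimal Python sketch of such a check.  It is only an assumption about
how a tool like this might work: the function names and the 0.9
threshold are hypothetical, and it treats only printable ASCII plus
tab, CR, and LF as "text", so non-ASCII text would count against the
score.  It computes a single ratio from a buffer and keeps nothing
else.

```python
# Bytes counted as "text": printable ASCII plus tab, LF, and CR.
_TEXT_BYTES = frozenset(range(0x20, 0x7F)) | {0x09, 0x0A, 0x0D}


def text_fraction(payload):
    """Return the fraction of bytes in payload that look like plain text."""
    if not payload:
        return 0.0
    text_count = sum(1 for byte in payload if byte in _TEXT_BYTES)
    return text_count / len(payload)


def looks_like_text(payload, threshold=0.9):
    """Guess whether a payload is mostly text; the threshold is arbitrary."""
    return text_fraction(payload) >= threshold
```

A real WHOIS exchange (a short query line followed by a plain-text
reply) should score close to 1.0, while compressed or encrypted data
should score far lower.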

Given these shortcomings in the list, I definitely wouldn't suggest that
such a list be considered a "default", as you'll be blocking a
potentially significant amount of legitimate WHOIS traffic.

If you do attempt to dig up a more complete list of WHOIS servers, I'd
certainly be interested to see what you come up with, but of course
understand you're doing this all on your own time and dime, and would
never suggest that you're by any means obligated to do so. :)

Best Regards,
Tim