
Proposal: MapAddress wildcards [*]



Probably should have gone to or-dev, not or-talk...

Many sites these days have multiple hosts in their domains. These
sites may have various administrative, logging or restrictive
policies. The same goes for the path to them if the user is unfortunate
enough to reside in strange lands.

As a pure example, take myspace. They have myspace.com, full of
subdomains and hosts. There is also myspacecdn.com and a couple of
others. They all work together to deliver the full content, images,
etc. This is common for load balancing, service segmentation and
so on.

Problem:
(A) Tor makes the use of MapAddress with sites that use multiple
hosts like these difficult and insufficient because:

1 - Each host requires another MapAddress statement.
2 - It is impossible to know all the hosts the site uses beforehand.
3 - The sites commonly change hosts on a whim.

Missing a mapping for any of these reasons could affect either the
user or the site in unintended ways. Mapping should be a bit smarter
and able to do the right thing. Users commonly want to 'send all my
traffic for site x via exit y and make it just work'.

Solution:
(B) So the following feature is proposed. Allow wildcards in the
MapAddress function such that:

1 - MapAddress google.com=google.com.<exit>.exit
 Is now, and should remain, single host specific as usual.

2 - MapAddress *.google.com=*.google.com.<exit>.exit
 Matches any third level domain such as www.google.com, but not
 google.com itself, as that is handled by (1) above. The name must
 have exactly three levels to match.

3 - MapAddress **.google.com=**.google.com.<exit>.exit
 Matches any third or deeper level domain such as a.b.c.d.google.com.
 This is a sensible hack. It is meant to allow future expansion of
 MapAddress to use some form of regex. Since '**' isn't really used
 in regexes, it is a useful glob for this purpose of allowing
 everything to match... which the user would _really_ want to have
 happen easily, without resorting to the obvious further nonsense
 in (4), which would be subject to the same problems in (A) above.

4 - MapAddress *.*.google.com=*.*.google.com.<exit>.exit
 Matches any third and fourth level together. Only four-level names
 would match. This is a NON-proposal.

5 - *google.com
 This is also a NON-proposal. It is too far down the path of some
 form of regex for the quick fix this proposal is meant to be. And
 it would obviously match all sorts of undesirables. The dots are
 important in DNS.
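
The matching rules in (1) through (3) can be sketched in plain sh. This is
only an illustration of the proposed semantics, not code from Tor; the
function name match_rule is made up for this example, and it assumes
hostnames are already lowercased with no trailing dot.

```shell
#!/bin/sh
# match_rule PATTERN HOST
# Hypothetical helper: returns 0 if HOST matches PATTERN under the
# proposed rules, 1 otherwise.
#   name       exact host only
#   *.name     exactly one extra label in front of name
#   **.name    one or more extra labels in front of name
match_rule() {
    pat=$1
    host=$2
    case $pat in
        '**.'*)
            base=${pat#'**.'}
            case $host in
                # at least one character, then ".base" as a literal suffix
                ?*".$base") return 0 ;;
            esac
            return 1 ;;
        '*.'*)
            base=${pat#'*.'}
            case $host in
                ?*".$base")
                    # what is left in front of ".base" must be one label
                    label=${host%".$base"}
                    case $label in
                        *.*) return 1 ;;
                        *)   return 0 ;;
                    esac ;;
            esac
            return 1 ;;
        *)
            # no wildcard: single host specific, as in (1)
            [ "$pat" = "$host" ] ;;
    esac
}
```

With this sketch, match_rule '*.google.com' www.google.com succeeds, while
the same pattern fails for google.com and for a.b.google.com. The '**.'
form accepts both www.google.com and a.b.c.d.google.com. Because the dot
before the base name is part of the literal suffix, a name like
notgoogle.com never matches, which is the concern raised in (5).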

Note that having globs on the right side of the '=' doesn't make
sense from a routing point of view, but it's not supposed to. It
is done so that scripts can continue to keep track and do things
like:

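#!/bin/sh
 # $src and $exit are set by the caller. The control port command
 # below is an assumption (nc to 127.0.0.1 port 9051); use your own.
 ctl="nc 127.0.0.1 9051"
 # add map
 dst=$src.$exit.exit
 printf 'authenticate "foo"\r\nmapaddress %s=%s\r\nquit\r\n' "$src" "$dst" | $ctl
 # remove map (map the address back to itself)
 dst=$src
 printf 'authenticate "foo"\r\nmapaddress %s=%s\r\nquit\r\n' "$src" "$dst" | $ctl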

And of course the below command should list the mappings as usual.
Both the static mapping that was entered, and the dynamic ones that
result from it...

getinfo address-mappings/all

google.com google.com.<exit>.exit NEVER
**.google.com **.google.com.<exit>.exit NEVER
google.com.<exit>.exit <ip.ad.dr.ess>.<exit>.exit "2009-06-05 18:01:20"
mail.google.com.<exit>.exit <ip.ad.dr.ess>.<exit>.exit "2009-06-05 18:01:22"
a.b.google.com.<exit>.exit <ip.ad.dr.ess>.<exit>.exit "2009-06-05 18:01:24"

At this time, it is unimportant which rule the dynamic entry resulted
from as that is not denoted in the current versions of Tor. A simple
numeric tag in the first column of every rule would suffice for
that in the future.
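
Until such a tag exists, a script can at least separate the static rules
from the dynamic entries by the expiry column. A rough sketch, assuming the
listing has already been reduced to the three-column "from to expiry" form
shown above (the raw control port reply carries extra framing that this
does not handle); the function names are made up for this example:

```shell
#!/bin/sh
# static_rules: read a "from to expiry" listing on stdin and print the
# from/to pairs whose expiry field is NEVER (the rules entered by hand).
static_rules() {
    awk '$3 == "NEVER" { print $1, $2 }'
}

# dynamic_rules: print the from/to pairs that carry an expiry timestamp
# (the entries that resulted from a lookup).
dynamic_rules() {
    awk 'NF >= 3 && $3 != "NEVER" { print $1, $2 }'
}
```

Feeding the example listing through static_rules would leave only the
google.com and **.google.com lines; dynamic_rules would leave the three
timestamped ones.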