
(FWD) Re: Proposal 171 (revised): Separate streams across circuits by connection metadata

[Forwarding because Nikita isn't subscribed at this address. -RD]

----- Forwarded message from owner-or-dev@xxxxxxxxxxxxx -----

From: Nikita Borisov <nikita@xxxxxxxxxxxx>
Date: Fri, 21 Jan 2011 16:00:44 -0600
Subject: Re: Proposal 171 (revised): Separate streams across circuits by
 connection metadata
To: or-dev@xxxxxxxxxxxxx

I have a suggestion: streams that have been explicitly designated for
isolation by the use of different ports or usernames should also use a
different set of guard nodes.  My thinking is that there have been
attacks proposed in the past that can profile the set of guard nodes
used by a client over time, as long as it's possible to externally
link the connections (e.g., the connections contain a pseudonymous
username in the cleartext).  If these attacks are used to profile two
sets of externally linkable connections (i.e., two pseudonyms) and
they come up with the same set of guards, that is a pretty strong
indication that the pseudonyms are in fact linked to each other.  If I
used a different port to separate the two pseudonyms, however, and Tor
used a different guard set for each, this would not be a problem.
Conversely, the advantage of using (the same set of) guard nodes
disappears for streams that are not externally linkable, since the
guards do not change the overall probability that each individual
stream will be compromised.

(I think it's harder to make the case that you want to do this based
on implicit session indicators, since there's a chance that those
streams will still be somehow linked, particularly if the indicators
are short-lived, such as PIDs or source ports.)
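
The suggestion above can be sketched as a lookup from an explicit isolation key (a listener port or a SOCKS username) to its own guard set. This is only an illustration: the class name, the key format, and the uniform sampling are assumptions made for the example, not anything Tor implements, and two independently sampled sets can still overlap by chance.

```python
import random


class GuardSetManager:
    """Hypothetical helper: one guard set per explicit isolation key."""

    def __init__(self, candidate_guards, guards_per_set=3, rng=None):
        self.candidates = list(candidate_guards)
        self.guards_per_set = guards_per_set
        # Assumption: uniform sampling; a real client would weight its choice.
        self.rng = rng if rng is not None else random.SystemRandom()
        self._sets = {}

    def guards_for(self, isolation_key):
        """Return the guard set for this key, sampling it on first use."""
        if isolation_key not in self._sets:
            chosen = self.rng.sample(self.candidates, self.guards_per_set)
            self._sets[isolation_key] = frozenset(chosen)
        return self._sets[isolation_key]
```

Each pseudonym keeps a stable guard set over time, while profiling the guards of one pseudonym says nothing certain about the guards of another.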

- Nikita

On Tue, Dec 7, 2010 at 10:02 AM, Nick Mathewson <nickm@xxxxxxxxxxxxxx> wrote:
> Hi, all.  I'm trying to get the proposal-171 discussion settled down a
> bit more, so I revised the proposal to try to say more like what I
> think it should say.  Here goes:
>
> Filename: 171-separate-streams.txt
> Title: Separate streams across circuits by connection metadata
> Author: Robert Hogan, Jacob Appelbaum, Damon McCoy, Nick Mathewson
> Created: 21-Oct-2008
> Modified: 7-Dec-2010
> Status: Open
>
> Summary:
>  We propose a new set of options to isolate unrelated streams from one
>  another, putting them on separate circuits so that semantically
>  unrelated traffic is not inadvertently made linkable.
>
> Motivation:
>  Currently, Tor attaches regular streams (that is, ones not carrying
>  rendezvous or directory traffic) to circuits based only on whether the
>  circuit's current exit node supports the destination, and whether the
>  circuit has been dirty (that is, in use) for too long.
>
>  This means that traffic that would otherwise be unrelated sometimes
>  gets sent over the same circuit, allowing the exit node to link such
>  streams with certainty, and allowing other parties to link such
>  streams probabilistically.
>
>  Older versions of onion routing tried to address this problem by
>  sending every stream over a separate circuit; performance issues made
>  this unfeasible.  Moreover, in the presence of a localized adversary,
>  separating streams by circuits increases the odds that, for any given
>  linked set of streams, at least one will go over a compromised
>  circuit.
>
>  Therefore we ought to look for ways to allow streams that ought to be
>  linked to travel over a single circuit, while keeping streams that
>  ought not be linked isolated to separate circuits.
>
> Discussion:
>  Let's call a series of inherently-linked streams (like a set of
>  streams downloading objects from the same webpage, or a browsing
>  session where the user requests several related webpages) a "Session".
>
>  "Sessions" are necessarily a fuzzy concept.  While users typically
>  consider some activities as wholly unrelated to each other ("My IM
>  session has nothing to do with my web browsing!"), the boundaries
>  between activities are sometimes hard to determine.  If I'm reading
>  lolcats in one browser tab and reading about treatments for an
>  embarrassing disease in another, those are probably separate sessions.
>  If I search for a forum, log in, read it for a while, and post a few
>  messages on unrelated topics, that's probably all the same session.
>
>  So with the proviso that no automated process can identify sessions
>  100% accurately, let's see which options we have available.
>
>  Generally, all the streams in a session come from a single
>  application.  Unfortunately, isolating streams by application
>  automatically isn't feasible, given the lack of any nice
>  cross-platform way to tell which local process originated a given
>  connection.  (Yes, lsof works.  But a quick review of the lsof code
>  should be sufficient to scare you away from thinking there is a
>  portable option, much less a portable O(1) option.)  So instead, we'll
>  have to use some other aspect of a Tor request as a proxy for the
>  application.
>
>  Generally, traffic from separate applications is not in the same
>  session.
>
>  With some applications (IRC, for example), each stream is a session.
>
>  Some applications (most notably web browsing) can't be meaningfully
>  split into sessions without inspecting the traffic itself and
>  maintaining a lot of state.
>
>  How well do ports correspond to sessions?  Early versions of this
>  proposal focused on using destination ports as a proxy for
>  application, since a connection to port 22 for SSH is probably not in
>  the same session as one to port 80.  This works better with some
>  applications than with others, though: while SSH users typically
>  know when they're on port 22 and when they aren't, a web browser can
>  be coaxed (through img URLs or any number of related tricks) into
>  connecting to any port at all.  Moreover, when Tor gets a DNS lookup
>  request, it doesn't know in advance which port the resulting address
>  will be used to connect to.
>
>  So in summary, each kind of traffic wants to follow different rules,
>  and assuming the existence of a web browser and a hostile web page or
>  exit node, we can't tell one kind of traffic from another by simply
>  looking at the destination:port of the traffic.
>
>  Fortunately, we're not doomed.
>
> Design:
>  When a stream arrives at Tor, we have the following data to examine:
>    1) The destination address
>    2) The destination port (unless this is a DNS lookup)
>    3) The protocol used by the application to send the stream to Tor:
>       SOCKS4, SOCKS4A, SOCKS5, or whatever local "transparent proxy"
>       mechanism the kernel gives us.
>    4) The port used by the application to send the stream to Tor --
>       that is, the SOCKSListenAddress or TransListenAddress that the
>       application used, if we have more than one.
>    5) The SOCKS username and password, if any.
>    6) The source address and port for the application.
>
>  We propose to use 3, 4, and 5 as a backchannel for applications to
>  tell Tor about different sessions.  Rather than running only one
>  SOCKSPort, a Tor user who would prefer better session isolation should
>  run multiple SOCKSPorts/TransPorts, and configure different
>  applications to use separate ports.  Applications that support SOCKS
>  authentication can further be separated on a single port by their
>  choice of username/password.  Streams sent to separate ports or using
>  different authentication information should never be sent over the
>  same circuit.  We allow each port to have its own settings for
>  isolation based on destination port, destination address, or both.
>
>  Handling DNS can be a challenge.  We can get hostnames by one of three
>  means:
>    A) A SOCKS4a request, or a SOCKS5 request with a hostname.  This
>       case is handled trivially using the rules above.
>    B) A RESOLVE request on a SOCKSPort.  This case is handled using the
>       rules above, except that port isolation can't work to isolate
>       RESOLVE requests into a proper session, since we don't know which
>       port will eventually be used when we connect to the returned
>       address.
>    C) A request on a DNSPort.  We have no way of knowing which
>       address/port will be used to connect to the requested address.
>
>  When B or C is required but problematic, we could favor the use of
>  AutomapHostsOnResolve.
>
> Interface:
>  We propose that {SOCKS,Natd,Trans,DNS}ListenAddress be deprecated in
>  favor of an expanded {SOCKS,Natd,Trans,DNS}Port syntax:
>
>  ClientPortLine = OptionName SP (Addr ":")? Port (SP Options)?
>  OptionName = "SOCKSPort" / "NatdPort" / "TransPort" / "DNSPort"
>  Addr = An IPv4 address / an IPv6 address surrounded by brackets.
>         If omitted, we default to localhost.
>  Port = An integer from 1 through 65535 inclusive
>  Options = Option
>  Options = Options SP Option
>  Option = IsolateOption / GroupOption
>  GroupOption = "SessionGroup=" UINT
>  IsolateOption = OptNo ("IsolateDestPort" / "IsolateDestAddr" /
>         "IsolateSOCKSUser" / "IsolateClientProtocol" /
>         "IsolateClientAddr") OptPlural
>  OptNo = "No" ?
>  OptPlural = "s" ?
>  SP = " "
>  UINT = An unsigned integer
>
>  All options are case-insensitive.
>
>  The "IsolateSOCKSUser" and "IsolateClientAddr" options are on by
>  default; "NoIsolateSOCKSUser" and "NoIsolateClientAddr" respectively
>  turn them off.  The IsolateDestPort and IsolateDestAddr and
>  IsolateClientProtocol options are off by default.  NoIsolateDestPort,
>  NoIsolateDestAddr and NoIsolateClientProtocol have no effect.
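
As a rough illustration of the grammar and defaults above, here is a minimal Python sketch of a ClientPortLine parser. The function name and the returned dictionary layout are made up for the example, and it is more lenient than the grammar in one respect: it accepts any run of whitespace where the grammar requires a single SP. It does not validate the address itself.

```python
import re

OPTION_NAMES = {n.lower(): n for n in ("SOCKSPort", "NatdPort", "TransPort", "DNSPort")}

# Defaults as stated in the proposal text: two flags on, three off.
DEFAULT_FLAGS = {
    "IsolateSOCKSUser": True,
    "IsolateClientAddr": True,
    "IsolateDestPort": False,
    "IsolateDestAddr": False,
    "IsolateClientProtocol": False,
}
FLAG_NAMES = {n.lower(): n for n in DEFAULT_FLAGS}

# OptNo flag-name OptPlural, matched case-insensitively.
FLAG_RE = re.compile(r"(no)?(" + "|".join(FLAG_NAMES) + r")s?", re.IGNORECASE)
GROUP_RE = re.compile(r"sessiongroup=(\d+)", re.IGNORECASE)


def parse_client_port_line(line):
    """Parse one ClientPortLine into a dict (hypothetical layout)."""
    tokens = line.split()
    if len(tokens) < 2:
        raise ValueError("expected: OptionName [Addr:]Port [Options]")
    name = OPTION_NAMES.get(tokens[0].lower())
    if name is None:
        raise ValueError("unknown option name: %r" % tokens[0])

    # Split on the last ":" so a bracketed IPv6 address stays intact.
    addr, _, port_text = tokens[1].rpartition(":")
    if not (port_text.isascii() and port_text.isdigit()):
        raise ValueError("bad port: %r" % port_text)
    port = int(port_text)
    if not 1 <= port <= 65535:
        raise ValueError("port out of range: %d" % port)

    flags = dict(DEFAULT_FLAGS)
    session_group = None
    for tok in tokens[2:]:
        m = GROUP_RE.fullmatch(tok)
        if m:
            session_group = int(m.group(1))
            continue
        m = FLAG_RE.fullmatch(tok)
        if m is None:
            raise ValueError("unknown option: %r" % tok)
        # A leading "No" turns the flag off; otherwise it is turned on.
        flags[FLAG_NAMES[m.group(2).lower()]] = m.group(1) is None

    return {"name": name, "addr": addr or None, "port": port,
            "session_group": session_group, "flags": flags}
```

For example, `parse_client_port_line("SOCKSPort 9051 IsolateDestAddr IsolateDestPort")` yields a record with both destination flags on and the two default-on flags untouched.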
>
>  Given a set of ClientPortLines, streams must NOT be placed on the same
>  circuit if ANY of the following hold:
>
>    * They were sent to two different client ports, unless the two
>      client ports both specify a "SessionGroup" option with the same
>      integer value.
>    * At least one was sent to a client port with IsolateDestPort
>      active, and they have different destination ports.
>    * At least one was sent to a client port with IsolateDestAddr
>      active, and they have different destination addresses.
>    * At least one was sent to a client port with IsolateClientProtocol
>      active, and they use different protocols (where SOCKS4, SOCKS4A,
>      SOCKS5, TransPort, NatdPort, and DNS are the protocols in question).
>    * At least one was sent to a client port with IsolateSOCKSUser
>      active, and they have different SOCKS username/password values.
>      (For the purposes of this option, the username/password pair of
>      ""/"" is distinct from SOCKS without authentication, and both are
>      distinct from any non-SOCKS client's non-authentication.)
>    * At least one was sent to a client port with IsolateClientAddr
>      active, and they came from different client addresses.  (For the
>      purpose of this option, any local interface counts as the same
>      address.  So if the host is configured with several addresses,
>      192.0.32.10 among them, then traffic from those addresses could
>      leave on the same circuit, but traffic from an address that is not
>      local to the host could not share a circuit with any of them.)
>
>  These rules apply regardless of whether the streams are active at the
>  same time.  In other words, if the rules say that streams A and B must
>  not be on the same circuit, and stream A is attached to circuit X,
>  then stream B must never be attached to circuit X, even if stream A is
>  closed first.
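
The rules above can be read as a pairwise predicate over streams, plus a per-circuit history so that closed streams still count. The Python sketch below is one way to write that down. All class, field, and constant names are invented for the example, and it assumes client addresses have already been normalized so that every local interface maps to the same value.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union

# Hypothetical markers for the two "no credentials" cases the rules keep
# distinct from each other and from the ("", "") username/password pair.
SOCKS_NO_AUTH = "socks-no-auth"
NON_SOCKS = "non-socks"


@dataclass(frozen=True)
class ClientPort:
    port: int
    session_group: Optional[int] = None
    isolate_dest_port: bool = False
    isolate_dest_addr: bool = False
    isolate_client_protocol: bool = False
    isolate_socks_user: bool = True
    isolate_client_addr: bool = True


@dataclass(frozen=True)
class Stream:
    client_port: ClientPort
    dest_addr: str
    dest_port: Optional[int]          # None for a DNS lookup
    protocol: str                     # "SOCKS4", "SOCKS5", "TransPort", ...
    auth: Union[str, Tuple[str, str]]
    client_addr: str                  # assumed already normalized


def must_isolate(a, b):
    """True if ANY rule forbids streams a and b from sharing a circuit."""
    pa, pb = a.client_port, b.client_port
    if pa != pb:
        same_group = (pa.session_group is not None
                      and pa.session_group == pb.session_group)
        if not same_group:
            return True

    def either(flag):
        # "At least one was sent to a client port with <flag> active."
        return getattr(pa, flag) or getattr(pb, flag)

    return ((either("isolate_dest_port") and a.dest_port != b.dest_port)
            or (either("isolate_dest_addr") and a.dest_addr != b.dest_addr)
            or (either("isolate_client_protocol") and a.protocol != b.protocol)
            or (either("isolate_socks_user") and a.auth != b.auth)
            or (either("isolate_client_addr") and a.client_addr != b.client_addr))


class Circuit:
    """Remembers every stream ever attached, including closed ones."""

    def __init__(self):
        self.history = []

    def can_attach(self, stream):
        return not any(must_isolate(stream, old) for old in self.history)

    def attach(self, stream):
        if not self.can_attach(stream):
            raise ValueError("stream must be isolated from this circuit")
        self.history.append(stream)
```

Keeping the full history on the circuit is what enforces the last paragraph: closing stream A does not remove it from the list that stream B is checked against.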
>
> Alternative Interface:
>  We're cramming a lot onto one line in the design above.  Perhaps
>  instead it would be a better idea to have grouped lines of the form:
>
>    StreamGroup 1
>    SOCKSPort 9050
>    TransPort 9051
>    IsolateDestPort 1
>    IsolateClientProtocol 0
>    EndStreamGroup
>
>    StreamGroup 2
>    SOCKSPort 9052
>    DNSPort 9053
>    IsolateDestAddr 1
>    EndStreamGroup
>
>  This would be equivalent to:
>
>   SOCKSPort 9050 SessionGroup=1 IsolateDestPort NoIsolateClientProtocol
>   TransPort 9051 SessionGroup=1 IsolateDestPort NoIsolateClientProtocol
>   SOCKSPort 9052 SessionGroup=2 IsolateDestAddr
>   DNSPort   9053 SessionGroup=2 IsolateDestAddr
>
>  But it would let us extend the range of allowed options later without
>  having client port lines grow without bound.  For example, we might
>  give different circuit building parameters to different session
>  groups.
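
The equivalence between the two syntaxes can be shown mechanically. The sketch below expands the grouped form into single-line form. It is only an illustration of the mapping described above: the function name is invented, the grouped syntax itself is just a proposal, and the sketch assumes every non-port line inside a group is an isolation flag followed by 0 or 1.

```python
PORT_OPTIONS = ("SOCKSPort", "NatdPort", "TransPort", "DNSPort")


def expand_stream_groups(text):
    """Turn StreamGroup ... EndStreamGroup blocks into ClientPortLines."""
    out = []
    group = None
    ports, flags = [], []
    for raw in text.splitlines():
        parts = raw.split()
        if not parts:
            continue
        key = parts[0]
        if key == "StreamGroup":
            group, ports, flags = int(parts[1]), [], []
        elif group is None:
            raise ValueError("line outside a StreamGroup: %r" % raw)
        elif key == "EndStreamGroup":
            # Every port in the group gets the group's id and all its flags.
            for name, port in ports:
                out.append(" ".join([name, port, "SessionGroup=%d" % group] + flags))
            group = None
        elif key in PORT_OPTIONS:
            ports.append((key, parts[1]))
        else:
            # "Flag 1" becomes "Flag"; "Flag 0" becomes "NoFlag".
            flags.append(key if parts[1] == "1" else "No" + key)
    if group is not None:
        raise ValueError("missing EndStreamGroup")
    return out
```

Fed the two groups from the example, it produces the same four lines listed as the equivalent form, apart from the alignment spaces on the DNSPort line.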
>
> Example of use:
>  Suppose that we want to use a web browser, an IRC client, and an SSH
>  client all at the same time.  Let's assume that we want web traffic to
>  be isolated from all other traffic, even if the browser makes
>  connections to ports usually used for IRC or SSH.  Let's also assume
>  that IRC and SSH are both used for relatively long-lived connections,
>  and we want to keep all IRC/SSH sessions separate from one another.
>
>  In this case, we could say:
>
>    SOCKSPort 9050
>    SOCKSPort 9051 IsolateDestAddr IsolateDestPort
>
>  We would then configure our browser to use 9050 and our IRC/SSH
>  clients to use 9051.
>
> Advanced example of use, #1:
>  Suppose that we have a bunch of applications, and we launch them all
>  using torsocks, and we want to keep each application isolated from
>  the others.  We just create a shell script, "torlaunch":
>
>    #!/bin/bash
>    export TORSOCKS_USERNAME="$1"
>    exec torsocks "$@"
>
>  And we configure our SOCKSPort with IsolateSOCKSUser.
>
>  Or if we're on Linux and we want to isolate by application invocation,
>  we would change the TORSOCKS_USERNAME line to:
>
>    export TORSOCKS_USERNAME="`cat /proc/sys/kernel/random/uuid`"
>
> Advanced example of use, #2:
>  Now suppose that we want to achieve the benefits of the first example
>  of use, but we are stuck using transparent proxies.  Let's suppose
>  this is Linux.
>
>    TransPort 9090
>    TransPort 9091 IsolateDestAddr IsolateDestPort
>    DNSPort 5353
>    AutomapHostsOnResolve 1
>
>  Here we use the iptables --cmd-owner filter to distinguish which
>  command is originating the packets, directing traffic from our IRC
>  client and our SSH client to port 9091, and directing other traffic to
>  9090.  Using AutomapHostsOnResolve will confuse ssh in its default
>  configuration; we'll need to find a way around that.
>
> Security Risks:
>  Disabling IsolateClientAddr is a pretty bad idea.
>
>  Setting up a set of applications to use this system effectively is a
>  big problem.  It's likely that lots of people who try to do this will
>  mess it up.  We should try to see which setups are sensible, and see
>  if we can provide good feedback to explain which streams are isolated
>  how.
>
> Performance Risks:
>  This proposal will result in clients building many more circuits than
>  they do today.  To avoid accidentally hammering the network, we should
>  have in-process limits on the maximum circuit creation rate and the
>  total maximum client circuits.
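
One way to picture the two limits named above (a maximum creation rate and a cap on total client circuits) is a token bucket combined with a counter. The proposal does not say how the limits should be implemented, so the Python sketch below is an assumption from top to bottom: the class name, the parameters, and the choice of a token bucket are all illustrative.

```python
import time


class CircuitLaunchLimiter:
    """Hypothetical limiter: token bucket for rate, counter for the cap."""

    def __init__(self, rate_per_sec, burst, max_circuits, clock=time.monotonic):
        self.rate = rate_per_sec
        self.burst = burst
        self.max_circuits = max_circuits
        self.clock = clock
        self.tokens = float(burst)
        self.last = clock()
        self.open_circuits = 0

    def _refill(self):
        # Add tokens for the elapsed time, never exceeding the burst size.
        now = self.clock()
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def try_launch(self):
        """Return True and account for one circuit if a launch is allowed."""
        self._refill()
        if self.open_circuits >= self.max_circuits or self.tokens < 1.0:
            return False
        self.tokens -= 1.0
        self.open_circuits += 1
        return True

    def circuit_closed(self):
        if self.open_circuits > 0:
            self.open_circuits -= 1
```

A stream that needs a fresh circuit would call `try_launch()` first and wait or fail if it returns False, so a misconfigured application that generates many isolation keys cannot build circuits without bound.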
>
> Specification:
>  The Tor client circuit selection process is not entirely specified.
>  Any client circuit specification must take these changes into account.
>
> Implementation notes:
>  The more obvious ways to implement the "find a good circuit to attach
>  to" part of this proposal involve doing an O(n_circuits) operation
>  every time we have a stream to attach.  We already do such an
>  operation, so it's not as if we need to hunt for fancy ways to make it
>  O(1).  What will be harder is implementing the "launch circuits as
>  needed" part of the proposal.  Still, it should come down to "a simple
>  matter of programming."
>
>  The SOCKS4 spec has the client provide authentication info when it
>  connects; accepting such info is no problem.  But the SOCKS5 spec has
>  the client send a list of known auth methods, then has the server send
>  back the authentication method it chooses.  We'll need to update the
>  SOCKS5 implementation so it can accept user/password authentication if
>  it's offered.
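
The SOCKS5 exchange described above looks roughly like this on the wire. The method numbers and message layouts come from RFC 1928 and RFC 1929; the function names and the policy of preferring username/password whenever the client offers it are assumptions made for this Python sketch, not a description of Tor's code.

```python
SOCKS5_VERSION = 0x05
METHOD_NO_AUTH = 0x00
METHOD_USERPASS = 0x02
METHOD_NONE_ACCEPTABLE = 0xFF
USERPASS_SUBNEG_VERSION = 0x01


def choose_method(greeting):
    """Build the server's method-selection reply to a client greeting.

    Greeting layout (RFC 1928): VER, NMETHODS, METHODS[NMETHODS].
    """
    if len(greeting) < 2 or greeting[0] != SOCKS5_VERSION:
        raise ValueError("not a SOCKS5 greeting")
    n = greeting[1]
    methods = greeting[2:2 + n]
    if len(methods) != n:
        raise ValueError("truncated method list")
    if METHOD_USERPASS in methods:
        chosen = METHOD_USERPASS      # accept user/password if it's offered
    elif METHOD_NO_AUTH in methods:
        chosen = METHOD_NO_AUTH
    else:
        chosen = METHOD_NONE_ACCEPTABLE
    return bytes([SOCKS5_VERSION, chosen])


def parse_userpass(request):
    """Extract (username, password) bytes from an RFC 1929 request.

    Layout: VER (0x01), ULEN, UNAME[ULEN], PLEN, PASSWD[PLEN].
    """
    if len(request) < 2 or request[0] != USERPASS_SUBNEG_VERSION:
        raise ValueError("not a username/password request")
    ulen = request[1]
    if len(request) < 3 + ulen:
        raise ValueError("truncated username")
    username = request[2:2 + ulen]
    plen = request[2 + ulen]
    password = request[3 + ulen:3 + ulen + plen]
    if len(password) != plen:
        raise ValueError("truncated password")
    return username, password
```

The returned pair would then serve purely as an isolation key; nothing in the proposal requires Tor to check the credentials against anything.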
>
>  If we use the second syntax for describing these options, we'll want
>  to add a new "section-based" entry type for the configuration parser.
>  Not a huge deal; we already have kludged up something similar for
>  hidden service configurations.
>
>  Opening circuits for predicted ports has the potential to get a little
>  more complicated; we can probably start with the existing algorithm,
>  though, see where its weak points are, and look for better ones.
>
>  Perhaps we can get our next-gen HTTP proxy to communicate browser tab
>  or session info to Tor via authentication, or have Torbutton do it
>  directly.  More design is needed here, though.
>
> Alternative designs:
>  The implementation of this option may want to consider cases where the
>  same exit node is shared by two or more circuits and
>  IsolateStreamsByPort is in force.  Since one possible use of the option
>  is to reduce the opportunity of Exit Nodes to attack traffic from the
>  same source on multiple ports, the implementation may need to ensure
>  that circuits reserved for the exclusive use of given ports do not
>  share the same exit node.  On the other hand, if our goal is only that
>  streams should be unlinkable, deliberately shunting them to different
>  exit nodes is unnecessary and slightly counterproductive.
>
>  Earlier versions of this design included a mechanism to isolate
>  _particular_ destination ports and addresses, so that traffic sent to,
>  say, port 22 would never share a circuit with any traffic *not* sent
>  to port 22.  You can achieve this here by having all applications that
>  send traffic to one of these ports use a separate SOCKSPort, and
>  then setting IsolateDestPorts on that SOCKSPort.
>
> Lingering questions:
>  I suspect there are issues remaining with DNS and TransPort users, and
>  that my "just use AutomapHostsOnResolve" suggestion may be
>  insufficient.

Nikita Borisov - http://hatswitch.org/~nikita/
Assistant Professor, Electrical and Computer Engineering
Tel: (217) 903-4401, Office: 460 CSL

----- End forwarded message -----