Proposal: GETINFO controller option for connection information
- To: or-dev@xxxxxxxxxxxxx
- Subject: Proposal: GETINFO controller option for connection information
- From: Damian Johnson <atagar1@xxxxxxxxx>
- Date: Wed, 14 Apr 2010 09:16:04 -0700
- Reply-to: or-dev@xxxxxxxxxxxxx
- Sender: owner-or-dev@xxxxxxxxxxxxx
Time to take the defibrillator paddles to this proposal once again. As per Nick's request this is a bit more focused on the motivation for getting connection related information. The proposed use cases are just some naive examples I've come up with. If anyone with a stronger security background (which wouldn't take much...) has the time I'd love comments like "WTF?!? This idiot's looking for the completely wrong things! This is obviously worthless if he doesn't look for X."
Also, could we move forward on the other (less controversial) items? For instance, bandwidth totals tend to be a very highly requested piece of information and pipe's already provided a nice patch to get it (http://www.mail-archive.com/or-talk@xxxxxxxxxxxxx/msg13085.html). For reference, here are the not-so-controversial GETINFO options I proposed:
"info/relay/bw-limit" -- Effective relayed bandwidth limit (currently
RelayBandwidthRate if set, otherwise BandwidthRate).
"info/relay/burst-limit" -- Effective relayed burst limit.
"info/relay/read-total" -- Total bytes relayed (download).
"info/relay/write-total" -- Total bytes relayed (upload).
"info/uptime-process" -- Total uptime of the tor process (in seconds).
"info/uptime-reset" -- Time since last reset (startup or sighup signal, in
seconds).
"info/descriptor-used" -- Count of file descriptors used.
"info/descriptor-limit" -- File descriptor limit (getrlimit results).
"ns/authority" -- Router status info (v2 directory style) for all
recognized directory authorities, joined by newlines.
I'm not planning on converting the following to the customary 80-character width until it's at least past being a first draft, for a couple of reasons:
1. I find editing fixed-width documents to be a time consuming pain in the ass.
2. I've yet to hear why we do this. Is it just to cater to mail clients too dumb to know how to line wrap?
That said, I'm keeping my fingers crossed that this starts going somewhere! -Damian
PS. For previous discussions of this proposal see:
http://marc.info/?t=126101683100002&r=1&w=1
----------------------------------------
Filename: xxx-connection-getinfo-option.txt
Title: GETINFO controller option for connection information
Author: Damian Johnson
Created: 14-Apr-2010
Status: Draft
Overview:
This details an additional GETINFO option for tor controllers that would provide information concerning a relay's current connections.
Motivation:
All Internet facing applications (tor included) are possible vectors for attack on the operator's system. With hundreds of connections to relatively unknown destinations tor is already the bane of any network based IDS, and unless tor can be proved infallible and bug free (which would be quite a feat!) it cannot be blindly trusted.
While it is impossible to guard against every potential future vulnerability, controllers can attempt to mitigate this threat by both auditing tor's behavior and providing indicators of its activity to savvy users. Connection related information is a useful tool for both of these purposes.
In terms of auditing, the following are some conditions controllers can check for with connection information:
- Persistent unestablished circuits. For instance, a circuit that has an outbound connection without a corresponding inbound counterpart. If such a connection were active (had substantial traffic) this would be troubling enough to alert the user.
- Relatively asymmetric traffic on circuits. For example, if the controller sees 10 KB/s inbound on a circuit and 5 MB/s outbound, this could be a good indicator that someone's using tor to issue a DoS, fetch data from the local system, etc.
- Any connections to the local network when ExitPolicyRejectPrivate is set, indicating that tor's being used to proxy connections to the local LAN.
- Peculiar patterns of connections, for instance numerous outbound connections to a single IP, or 99% of all bandwidth belonging to a single circuit.
- Scrubbed connection data limits our ability to check for obedience to the exit policy, but for strictly non-exit relays we can still alert the user if any non-relay outbound connections occur.
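As a rough illustration of the asymmetric traffic check in the list above, a controller could compare per-circuit byte counts. This is only a sketch: the `CircuitTraffic` record, the ratio threshold of 100, and the minimum-traffic floor are assumptions picked for the example, not values from this proposal.

```python
from dataclasses import dataclass


@dataclass
class CircuitTraffic:
    """Hypothetical per-circuit totals a controller might track."""
    circ_id: str
    bytes_in: int   # bytes received on the inbound side
    bytes_out: int  # bytes sent on the outbound side


def is_asymmetric(traffic, ratio=100, min_bytes=1_000_000):
    """Return True if one direction carries at least `ratio` times the other.

    Circuits with less than `min_bytes` of total traffic are ignored so
    that idle or newly built circuits do not trigger alerts. Both
    thresholds are made-up defaults.
    """
    total = traffic.bytes_in + traffic.bytes_out
    if total < min_bytes:
        return False
    low = min(traffic.bytes_in, traffic.bytes_out)
    high = max(traffic.bytes_in, traffic.bytes_out)
    # max(low, 1) avoids dividing by zero when one direction is silent.
    return high / max(low, 1) >= ratio
```

What a sensible threshold looks like in practice would need measurements from real relays.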
Of course, if we're working from the assumption that tor has been compromised, then the information provided from the control port cannot be blindly trusted. Hence connection data should be verifiable against the system's connection querying utilities (netstat, ss, lsof, etc - which are more likely to be under a host based IDS, if present). That way an attacker would need to have completely compromised the system (elevated permissions), rather than just tor, before controllers could be tricked.
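The cross-check described here boils down to a set comparison. A minimal sketch, assuming both sources have already been reduced to (ip, port) tuples for the tor process; how the system-side set is gathered (netstat, ss, lsof) is left out, and the function name is hypothetical.

```python
def compare_connections(reported, observed):
    """Compare control-port data against the system's view.

    reported: set of (ip, port) tuples from the control port, with
              scrubbed entries already dropped (assumption).
    observed: set of (ip, port) tuples from a system utility.

    Returns (missing, unreported): connections tor claims to have that
    the system does not show, and connections the system shows that tor
    did not list. The second set will also contain scrubbed client or
    exit connections, so only the first is a direct inconsistency.
    """
    missing = reported - observed
    unreported = observed - reported
    return missing, unreported
```

Since the two snapshots are taken at slightly different times, a real check would probably want to tolerate short-lived differences before alerting.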
While automated detection is handy for detecting known behavior that might indicate issues, visualization gives us the possibility of finding much more thanks to our tinfoil hat wearing user base. A clear display of tor's current behavior gives assurance that tor's functioning as it should, plus a level of transparency desirable from anyone with even the slightest bit of paranoia. Tor is a guest process in the system of relay operators and we should not hide what it does without legitimate reason.
Another (albeit unintended) benefit of visualizing tor's behavior is that it becomes a helpful tool in puzzling out how tor works. For instance, tor spawns numerous client connections at startup (even if unused as a client). As a newcomer to tor these asymmetric (outbound only) connections mystified me for quite a while until Roger explained their use to me. The proposed TYPE_FLAGS would let controllers clearly label them as being client related, making their purpose a bit clearer.
At the moment connection data can only be retrieved via commands like netstat, ss, and lsof. However, fetching it via the control port provides several advantages:
- scrubbing for private data
Raw connection data has no notion of what's sensitive and what is not. The relay's flags and cached consensus can be used to take educated guesses concerning which connections could possibly belong to client or exit traffic, but this is both difficult and inaccurate.
- additional information
All connection querying commands strictly provide the IP address and port of connections, and nothing else. However, for auditing and visualization the far more interesting attributes are the connection's bandwidth usage, uptime, and the circuit to which it belongs.
- improved performance
Querying connection data is an expensive activity, especially for busy relays or low end processors (such as mobile devices). Tor already internally knows its circuits and connections, allowing for vastly quicker lookups.
- cross platform capability
The connection querying utilities mentioned above not only aren't available under Windows, but differ widely among different *nix platforms. FreeBSD in particular takes a very unique approach, dropping important options from netstat and assigning ss to a spreadsheet application instead. A controller interface, however, would provide a uniform means of retrieving this information.
Security Implications:
The original version of this proposal left the responsibility of scrubbing connection data with client applications (vidalia, arm, etc). However, this was deemed unacceptable by Sebastian and Nick in previous discussions. The proposal now includes dropping the IP address/port of client and exit connections from the controller's response. That said, I think it's a mistake to drop those connections entirely since some of their attributes *are* of legitimate usefulness:
- Existence
At the very least it'd be nice if Tor indicated their existence (ie, I'd say "yea, an exit connection exists on this circuit but we won't tell you where it goes."). This would be useful, for instance, if the relay operator has misconfigured their firewall to block some of the outbound ports permitted by their exit policy (arm would show this as RELAY -> YOU -> UNESTABLISHED, and provide a warning to indicate the issue).
- Bandwidth
For auditing the most interesting attribute of connections, imho, is the bandwidth. If, say, 10 KB/s is coming in and 1 MB/s is going out on a circuit, that's a good indicator that something is *very* wrong (I'd start suspecting a security issue, personally). If we rounded all bandwidth measurements (say, to the nearest KB) would this be sufficient to prevent entry/exits from correlating this data to attack anonymity?
- Uptime
If connections are being cycled abnormally quickly (say, all connection longevity is under thirty seconds) this could indicate that the ISP (or another middleman like the Great Firewall) is sending reset packets to kill the relay's attempts to make exit connections.
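For concreteness, the rounding floated under Bandwidth above could look like the following. This is a sketch only: it takes 1 KB to be 1024 bytes and rounds halves up, both of which are assumptions, and it says nothing about whether this granularity is actually enough to prevent correlation.

```python
def round_to_kb(byte_count, unit=1024):
    """Round a byte count to the nearest multiple of `unit` (halves round up)."""
    return ((byte_count + unit // 2) // unit) * unit
```

With this, a reported total of 1500 bytes would be exposed as 1024, and anything under 512 bytes as 0.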
Specification:
The following addition would be made to the control-spec's GETINFO section:
"conn/<Circuit identity>/<Connection identity>" -- Provides entry for the
associated connection, formatted as:
CONN_ID CIRC_ID OR_ID IP PORT L_PORT TYPE_FLAGS READ WRITE UPTIME
None of the parameters contain whitespace, and additional results must be
ignored to allow for future expansion. Parameters are defined as follows:
CONN_ID - Unique identifier associated with this connection.
CIRC_ID - Unique identifier for the circuit this belongs to (0 if this
doesn't belong to any circuit). At most there may be two connections
(one inbound, one outbound) with any given CIRC_ID except in the case
of exit connections.
OR_ID - Relay fingerprint, 0 if connection doesn't belong to a relay.
IP/PORT - IP address and port used by the associated connection, 0 if
connection is used for relaying client or exit traffic.
L_PORT - Local port used by the connection, 0 if connection is used for
relaying client or exit traffic.
TYPE_FLAGS - Single character flags indicating directionality and type
of the connection (consists of one from each category, may become
longer for future expansion).
Connection Directionality:
I: inbound, i: listening (unestablished inbound),
O: outbound, o: unestablished outbound
Usage Type:
C: client traffic, R: relaying traffic,
X: control, H: hidden service, D: directory
Destination:
T: inter-tor connection, t: outside the tor network
For instance, "IRt" would indicate that this was an established
1st-hop (or bridged) relay connection.
READ/WRITE - Total bytes read/written over the life of this connection.
UPTIME - Time the connection's been established in seconds.
"conn/all" -- Newline separated listing of all current connections.
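To show how a controller might consume these entries, here is a rough parser for the line format above. It is a sketch under a few assumptions: identifiers are treated as opaque strings (their format is not specified here), unknown flag characters are mapped to "unknown" rather than rejected, and the names `ConnEntry`, `parse_conn_entry` and `parse_conn_all` are hypothetical.

```python
from dataclasses import dataclass

# Flag meanings copied from the TYPE_FLAGS definition above.
DIRECTION = {"I": "inbound", "i": "listening",
             "O": "outbound", "o": "unestablished outbound"}
USAGE = {"C": "client", "R": "relay", "X": "control",
         "H": "hidden service", "D": "directory"}
DESTINATION = {"T": "inter-tor", "t": "outside tor"}


@dataclass
class ConnEntry:
    conn_id: str
    circ_id: str
    or_id: str
    ip: str          # "0" when scrubbed
    port: int        # 0 when scrubbed
    l_port: int      # 0 when scrubbed
    direction: str
    usage: str
    destination: str
    read: int
    written: int
    uptime: int


def parse_conn_entry(line):
    """Parse one 'CONN_ID CIRC_ID OR_ID IP PORT L_PORT TYPE_FLAGS READ WRITE UPTIME' line."""
    fields = line.split()
    if len(fields) < 10:
        raise ValueError("expected at least 10 fields, got %d" % len(fields))
    # Extra trailing fields are ignored to allow for future expansion.
    (conn_id, circ_id, or_id, ip, port, l_port,
     flags, read, written, uptime) = fields[:10]
    if len(flags) < 3:
        raise ValueError("TYPE_FLAGS needs one flag per category: %r" % flags)
    return ConnEntry(
        conn_id=conn_id, circ_id=circ_id, or_id=or_id,
        ip=ip, port=int(port), l_port=int(l_port),
        # Only the first three flags are read; later ones may be added.
        direction=DIRECTION.get(flags[0], "unknown"),
        usage=USAGE.get(flags[1], "unknown"),
        destination=DESTINATION.get(flags[2], "unknown"),
        read=int(read), written=int(written), uptime=int(uptime),
    )


def parse_conn_all(reply):
    """Parse a newline separated 'conn/all' style listing, skipping blank lines."""
    return [parse_conn_entry(l) for l in reply.splitlines() if l.strip()]
```

For example, a made-up scrubbed entry such as `42 7 0 0 0 0 IRt 10240 20480 360` would parse as an inbound relay connection from outside the tor network with 10240 bytes read over 360 seconds.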