
Re: Control Spec Addition First Draft



Hi Sebastian, thanks for the feedback!

> As always, I'm very uncomfortable with giving away users'/destinations' ip addresses or ports. I do realize that the same information can be obtained from netstat and friends, but I still think we should actively discourage the use and acquisition of this data. I realize that this is against the intentions of this proposal, but I hope that it is still useful even without client/destination identifying information.

Disagree for the following reasons:
- As mentioned on IRC: all Internet-facing applications (browsers, email clients, tor) are attack vectors for my system. Tor's developers are good, but I'm not so sure that they're infallible (sorry Nick), and hence the process can't be blindly trusted. That's why I think transparency is the best way to go. With hundreds of connections to relatively unknown destinations, tor is already the bane of network-based IDS, so it would be nice if we could provide some accounting to system administrators that tor is behaving as it should. For instance, say the tor process claims a big outbound connection taking 90% of your bandwidth that can't be accounted for as belonging to a circuit. If you aren't using it as a client, that would be... bad.

- I agree that for correlation attacks this data is of concern in the event that numerous relays store or share this information. However, for an individual relay operator having this data shouldn't pose *any* threat to tor users (if it does... we have an issue). From what I can tell this proposal doesn't do anything that makes correlation attacks more dangerous since netstat running in a cron job is all they need (assuming they own a big chunk of the relays).

- Tor was designed with a certain level of distrust of relays. Beyond that, the best we can do is discourage them from risky behaviour (e.g., running outdated versions, looking at exit traffic, sharing connection data, etc). By including connection types, controllers will have the opportunity to tell relay operators "Oi! Please don't look at these exit connections unless you have a damn good reason." As it stands, I don't have a way of telling them apart, and hence can't even hide them by default.

- As you mentioned, we can't (and imho shouldn't) prevent relay operators from seeing the connections made to/from their own system. This proposal doesn't seem to exacerbate any privacy issues while providing some nice benefits (performance and some handy bits of extra data that'll make security anomalies far easier to detect).
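
To make the accounting example from the first point a bit more concrete, here's a rough sketch of the kind of check a controller could run. It assumes the controller already has each entry's CIRC_ID, TYPE_FLAGS, WRITE and UPTIME values plus the "info/relay/bw-limit" result from the proposal further down. The function name, the tuple layout, and the use of WRITE/UPTIME as an average rate are my own assumptions for illustration; the 0.9 threshold just mirrors the 90% figure above.

```python
def unaccounted_heavy_conns(entries, bw_limit, share=0.9):
    """Return outbound connections without a circuit that use a large
    share of the bandwidth limit.

    entries:  iterable of (circ_id, type_flags, written_bytes, uptime_secs)
    bw_limit: effective bandwidth limit in bytes per second
    share:    fraction of bw_limit treated as suspicious (hypothetical default)
    """
    suspects = []
    for circ_id, flags, written, uptime in entries:
        if uptime <= 0:
            continue  # no meaningful average rate yet
        rate = written / uptime  # average bytes/second over the connection's life
        no_circuit = circ_id == "0"      # CIRC_ID of 0: not part of any circuit
        outbound = flags.startswith("O")  # 'O': established outbound
        if no_circuit and outbound and rate >= share * bw_limit:
            suspects.append((circ_id, flags, rate))
    return suspects


# Made-up sample values, not real tor output.
sample = [
    ("0", "ORtE", 9_500_000, 100),  # no circuit, heavy outbound traffic
    ("7", "ORTE", 9_500_000, 100),  # heavy, but accounted for by circuit 7
    ("0", "ORtE", 1_000, 100),      # no circuit, negligible traffic
]
suspects = unaccounted_heavy_conns(sample, bw_limit=100_000)
```

Only the first sample entry gets reported, since it's the only one that is both unaccounted for and near the limit. A lifetime average is a crude measure, so a real controller would probably want to sample over shorter windows.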

> This is, I think, a misunderstanding of what a connection is. More below.

No, the hidden service question isn't. I'm assuming that when hosting hidden services there are some connections dedicated to providing that service. If so, a TYPE_FLAG should probably be included, since they don't really belong to any of the other groups. I've changed the proposal to include one until someone tells me this is wrong.

> Here, the connection identity needs to either include the CIRC_ID, or this is ambiguous...

Thanks for the catch! Made the following three corrections:

- Changed the signature to "conn/<Circuit identity>/<Connection identity>" to avoid ambiguity. I'm assuming that in general people will use "conn/all" to discover the circuit/connection ids (actually, I can't think of a use for getting a single connection; I'm just including it to conform with other control-spec GETINFO options).

- Noted that more than two connections could have the same circuit ID in the case of exit connections.

- Included an L_PORT (local port) parameter. This wasn't mentioned, but leaving it out was definitely an oversight.
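
As a rough illustration of why the circuit/connection pair is the key rather than the connection id alone, here's a small sketch of how a controller might index a "conn/all" listing. It assumes each line starts with CONN_ID followed by CIRC_ID, as in the revised format, and that a connection carrying several circuits shows up once per circuit. That second point is my reading of the proposal, not something tor does today. The function names and sample lines are made up.

```python
def getinfo_key(circ_id, conn_id):
    """Build the GETINFO keyword for a single circuit/connection pair."""
    return f"conn/{circ_id}/{conn_id}"


def index_by_pair(listing):
    """Index a newline separated listing by (circ_id, conn_id).

    Each line is expected to begin with CONN_ID then CIRC_ID; lines that
    are too short to contain both are skipped.
    """
    entries = {}
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        conn_id, circ_id = fields[0], fields[1]
        entries[(circ_id, conn_id)] = fields
    return entries


# Made-up listing: connection 12 carries circuits 7 and 8.
listing = (
    "12 7 0 203.0.113.5 9001 50123 IRtE 1024 2048 360 0\n"
    "12 8 0 203.0.113.5 9001 50123 IRtE 512 256 360 0\n"
)
entries = index_by_pair(listing)
keys = sorted(getinfo_key(circ, conn) for circ, conn in entries)
```

Keyed by connection id alone, the second line would overwrite the first. Keyed by the pair, both entries survive and each maps to its own GETINFO keyword.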

> These flags seem to be mostly redundant. Again, they don't necessarily work because a connection can be used for many things. As for the Ee flag, I don't really see the purpose, we certainly shouldn't look at exit traffic going through the connection to decide if it is encrypted or not.

Yea, I wasn't sure if they should be like argument flags (given a default if excluded) or always explicitly stated. I opted for the latter since in general explicit is better than implicit, and this way implementers (like TorCtl) won't need to hard-code any defaults. Both are minor points and I'm glad to discuss more if people disagree.

Yes, if this was only associated with a connection it wouldn't work, but circuit/connection combinations should be unique so issue fixed there.

As for the Ee flag, I suspect it would be useful for client connections, since any unencrypted traffic there is sniffable. This isn't important to the use cases I care about, so we can drop it if others think it's a bad idea.

Here's the revised proposal:

-------------------------------------------------------------------------------

  "conn/<Circuit identity>/<Connection identity>" -- Provides entry for the
    associated connection, formatted as:
      CONN_ID CIRC_ID OR_ID IP PORT L_PORT TYPE_FLAGS READ WRITE UPTIME BUFF

    none of the parameters contain whitespace, and additional results must be
    ignored to allow for future expansion. Parameters are defined as follows:
      CONN_ID - Unique identifier associated with this connection.
      CIRC_ID - Unique identifier for the circuit this belongs to (0 if this
        doesn't belong to any circuit). At most there may be two connections
        (one inbound, one outbound) with any given CIRC_ID except in the case
        of exit connections.
      OR_ID - Relay fingerprint, 0 if connection doesn't belong to a relay.
      IP/PORT - IP address and port used by the associated connection.
      L_PORT - Local port used by the connection.
      TYPE_FLAGS - Single character flags indicating directionality and type
        of the connection (consists of one from each category, may become
        longer for future expansion).
          I: inbound, i: listening (unestablished inbound),
            O: outbound, o: unestablished outbound
          C: client related, R: relay related, X: control, H: hidden service,
            D: directory
          T: inter-tor connection, t: outside the tor network
          E: encrypted traffic, e: unencrypted traffic
        For instance, "IRtE" would indicate that this was an established
        1st-hop (or bridged) relay connection.
      READ/WRITE - Total bytes read/written over the life of this connection.
      UPTIME - Time the connection's been established in seconds.
      BUFF - Bytes of data buffered for this relay connection.

  "conn/all" -- Newline separated listing of all current connections.

  "info/relay/bw-limit" -- Effective relayed bandwidth limit (currently
    RelayBandwidthRate if set, otherwise BandwidthRate).

  "info/relay/burst-limit" -- Effective relayed burst limit.

  "info/relay/read-total" -- Total bytes relayed (download).

  "info/relay/write-total" -- Total bytes relayed (upload).

  "info/relay/buffer-cap" -- Maximum buffer size for relay connections.

  "info/uptime-process" -- Total uptime of the tor process (in seconds).

  "info/uptime-reset" -- Time since last reset (startup or sighup signal, in
    seconds).

  "info/descriptor-used" -- Count of file descriptors used.

  "info/descriptor-limit" -- File descriptor limit (getrlimit results).

  "ns/authority" -- Router status info (v2 directory style) for all
    recognized directory authorities, joined by newlines.

-------------------------------------------------------------------------------
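
For anyone implementing the controller side, here's a minimal sketch of parsing one entry in the format above. The field order and flag meanings come straight from the proposal; the class and function names, the choice to parse the counters as integers, and the sample line are my own assumptions for illustration.

```python
from typing import NamedTuple


class ConnEntry(NamedTuple):
    conn_id: str
    circ_id: str
    or_id: str
    ip: str
    port: int
    l_port: int
    type_flags: str
    read: int
    write: int
    uptime: int
    buff: int


# One lookup table per TYPE_FLAGS category, in the order the proposal lists them.
FLAG_CATEGORIES = (
    {"I": "inbound", "i": "listening", "O": "outbound",
     "o": "unestablished outbound"},
    {"C": "client related", "R": "relay related", "X": "control",
     "H": "hidden service", "D": "directory"},
    {"T": "inter-tor connection", "t": "outside the tor network"},
    {"E": "encrypted traffic", "e": "unencrypted traffic"},
)


def parse_conn_entry(line):
    """Parse a single whitespace separated connection entry."""
    parts = line.split()
    if len(parts) < 11:
        raise ValueError("expected at least 11 fields, got %d" % len(parts))
    # Additional trailing results are ignored to allow for future expansion.
    (conn_id, circ_id, or_id, ip, port, l_port,
     flags, read, write, uptime, buff) = parts[:11]
    return ConnEntry(conn_id, circ_id, or_id, ip, int(port), int(l_port),
                     flags, int(read), int(write), int(uptime), int(buff))


def describe_flags(flags):
    """Translate TYPE_FLAGS into one description per known category."""
    if len(flags) < len(FLAG_CATEGORIES):
        raise ValueError("expected one flag per category: %r" % flags)
    # zip() stops after the known categories, so flags added later are ignored.
    return tuple(table.get(ch, "unknown")
                 for table, ch in zip(FLAG_CATEGORIES, flags))


# Made-up sample entry with one extra trailing field.
entry = parse_conn_entry("12 7 0 203.0.113.5 9001 50123 IRtE 1024 2048 360 0 extra")
meaning = describe_flags(entry.type_flags)
```

Both functions tolerate extra trailing data, which matches the "additional results must be ignored" rule and the note that TYPE_FLAGS may grow longer.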

Cheers! -Damian

On Sat, Dec 19, 2009 at 11:43 PM, Sebastian Hahn <hahn.seb@xxxxxx> wrote:
Hi Damian,

please find my comments inline below.

On Dec 17, 2009, at 3:24 AM, Damian Johnson wrote:

[snip]
>  - Anything dangerous? Doubt it, but the bandwidth measurements should probably
>  either be rounded or provided occasionally (say, every second) to address
>  correlation attacks. I'm sure Sebastian will enthusiastically sink some
>  paranoia into this later. ;)

As always, I'm very uncomfortable with giving away users'/destinations' ip addresses or ports. I do realize that the same information can be obtained from netstat and friends, but I still think we should actively discourage the use and acquisition of this data. I realize that this is against the intentions of this proposal, but I hope that it is still useful even without client/destination identifying information.

> - When hosting hidden services I'd imagine some connections are dedicated to
>  them. If so, let's add a flag to indicate them.

This is, I think, a misunderstanding of what a connection is. More below.

[snip]
>    "conn/<Connection identity>" -- Provides entry for the associated
>      connection, formatted as:
>        CONN_ID CIRC_ID OR_ID IP PORT TYPE_FLAGS READ WRITE UPTIME BUFF
>
>      none of the parameters contain whitespace, and additional results must be
>      ignored to allow for future expansion. Parameters are defined as follows:
>        CONN_ID - Unique identifier associated with this connection.
>        CIRC_ID - Unique identifier for the circuit this belongs to (0 if this
>          doesn't belong to any circuit). At most there may be two connections
>          (one inbound, one outbound) with any given CIRC_ID.

Here, the connection identity needs to either include the CIRC_ID, or this is ambiguous. Tor multiplexes many circuits over the same connection, so there is no way to infer the circuit id from a connection id. Also, for exit connections, there may be more than two connections with the same circuit id. What this means: We either want a separate query to learn about circuits, or we want the conn_id to list all the circuits that it has attached, or we want to only allow queries of this kind when circ id and conn id are both known to the controller.

>        OR_ID - Relay fingerprint, 0 if connection doesn't belong to a relay.
>        IP/PORT - IP address and port used by the associated connection.
>        TYPE_FLAGS - Single character flags indicating directionality and type
>          of the connection (consists of one from each category, may become
>          longer for future expansion).
>            I: inbound, i: listening (unestablished inbound),
>              O: outbound, o: unestablished outbound
>            C: client related, R: relay related, X: control, D: directory
>            T: inter-tor connection, t: outside the tor network
>            E: encrypted traffic, e: unencrypted traffic
>          For instance, "IRtE" would indicate that this was an established
>          1st-hop (or bridged) relay connection.

These flags seem to be mostly redundant. Again, they don't necessarily work because a connection can be used for many things. As for the Ee flag, I don't really see the purpose, we certainly shouldn't look at exit traffic going through the connection to decide if it is encrypted or not.

>        READ/WRITE - Total bytes read/written over the life of this connection.
>        UPTIME - Time the connection's been established in seconds.
>        BUFF - Bytes of data buffered for this relay connection.
>
>    "conn/all" -- Newline separated listing of all current connections.
>
>    "info/relay/bw-limit" -- Effective relayed bandwidth limit (currently
>      RelayBandwidthRate if set, otherwise BandwidthRate).
>
>    "info/relay/burst-limit" -- Effective relayed burst limit.
>
>    "info/relay/read-total" -- Total bytes relayed (download).
>
>    "info/relay/write-total" -- Total bytes relayed (upload).
>
>    "info/relay/buffer-cap" -- Maximum buffer size for relay connections.
>
>    "info/uptime-process" -- Total uptime of the tor process (in seconds).
>
>    "info/uptime-reset" -- Time since last reset (startup or sighup signal, in
>      seconds).
>
>    "info/descriptor-used" -- Count of file descriptors used.
>
>    "info/descriptor-limit" -- File descriptor limit (getrlimit results).
>
>    "ns/authority" -- Router status info (v2 directory style) for all
>      recognized directory authorities, joined by newlines.
>

These all sound sane.


Sebastian