[Author Prev][Author Next][Thread Prev][Thread Next][Author Index][Thread Index]
[or-cvs] r9226: Make the "Next Version" of the Tor protocol called "v2", not (in tor/trunk: . doc)
- To: or-cvs@xxxxxxxxxxxxx
- Subject: [or-cvs] r9226: Make the "Next Version" of the Tor protocol called "v2", not (in tor/trunk: . doc)
- From: nickm@xxxxxxxx
- Date: Sun, 31 Dec 2006 14:31:51 -0500 (EST)
- Delivered-to: archiver@seul.org
- Delivered-to: or-cvs-outgoing@seul.org
- Delivered-to: or-cvs@seul.org
- Delivery-date: Sun, 31 Dec 2006 14:32:08 -0500
- Reply-to: or-talk@xxxxxxxxxxxxx
- Sender: owner-or-cvs@xxxxxxxxxxxxx
Author: nickm
Date: 2006-12-31 14:31:45 -0500 (Sun, 31 Dec 2006)
New Revision: 9226
Added:
tor/trunk/doc/tor-spec-v2.txt
Removed:
tor/trunk/doc/tor-spec-v0.txt
Modified:
tor/trunk/
tor/trunk/doc/tor-spec.txt
Log:
r11775@Kushana: nickm | 2006-12-31 14:27:02 -0500
Make the "Next Version" of the Tor protocol called "v2", not "v1". Make tor-spec.txt canonical and current again; make tor-spec-v2.txt be the "splufty next version" document.
Property changes on: tor/trunk
___________________________________________________________________
svk:merge ticket from /tor/trunk [r11775] on c95137ef-5f19-0410-b913-86e773d04f59
Deleted: tor/trunk/doc/tor-spec-v0.txt
===================================================================
--- tor/trunk/doc/tor-spec-v0.txt 2006-12-31 06:18:16 UTC (rev 9225)
+++ tor/trunk/doc/tor-spec-v0.txt 2006-12-31 19:31:45 UTC (rev 9226)
@@ -1,734 +0,0 @@
-$Id$
-
- Tor Protocol Specification
-
- Roger Dingledine
- Nick Mathewson
-
-Note: This document specifies Tor as currently implemented in versions
-0.1.2.1-alpha and earlier. Current protocol designs are described in
-tor-spec.txt.
-
-0. Preliminaries
-
-0.1. Notation and encoding
-
- PK -- a public key.
- SK -- a private key.
- K -- a key for a symmetric cypher.
-
- a|b -- concatenation of 'a' and 'b'.
-
- [A0 B1 C2] -- a three-byte sequence, containing the bytes with
- hexadecimal values A0, B1, and C2, in that order.
-
- All numeric values are encoded in network (big-endian) order.
-
- H(m) -- a cryptographic hash of m.
-
-0.2. Security parameters
-
- Tor uses a stream cipher, a public-key cipher, the Diffie-Hellman
- protocol, and a hash function.
-
- KEY_LEN -- the length of the stream cipher's key, in bytes.
-
- PK_ENC_LEN -- the length of a public-key encrypted message, in bytes.
- PK_PAD_LEN -- the number of bytes added in padding for public-key
- encryption, in bytes. (The largest number of bytes that can be encrypted
- in a single public-key operation is therefore PK_ENC_LEN-PK_PAD_LEN.)
-
- DH_LEN -- the number of bytes used to represent a member of the
- Diffie-Hellman group.
- DH_SEC_LEN -- the number of bytes used in a Diffie-Hellman private key (x).
-
- HASH_LEN -- the length of the hash function's output, in bytes.
-
- CELL_LEN -- The length of a Tor cell, in bytes.
-
-0.3. Ciphers
-
- For a stream cipher, we use 128-bit AES in counter mode, with an IV of all
- 0 bytes.
-
- For a public-key cipher, we use RSA with 1024-bit keys and a fixed
- exponent of 65537. We use OAEP padding, with SHA-1 as its digest
- function. (For OAEP padding, see
- ftp://ftp.rsasecurity.com/pub/pkcs/pkcs-1/pkcs-1v2-1.pdf)
-
- For Diffie-Hellman, we use a generator (g) of 2. For the modulus (p), we
- use the 1024-bit safe prime from rfc2409, (section 6.2) whose hex
- representation is:
-
- "FFFFFFFFFFFFFFFFC90FDAA22168C234C4C6628B80DC1CD129024E08"
- "8A67CC74020BBEA63B139B22514A08798E3404DDEF9519B3CD3A431B"
- "302B0A6DF25F14374FE1356D6D51C245E485B576625E7EC6F44C42E9"
- "A637ED6B0BFF5CB6F406B7EDEE386BFB5A899FA5AE9F24117C4B1FE6"
- "49286651ECE65381FFFFFFFFFFFFFFFF"
-
- As an optimization, implementations SHOULD choose DH private keys (x) of
- 320 bits. Implementations that do this MUST never use any DH key more
- than once.
-
- For a hash function, we use SHA-1.
-
- KEY_LEN=16.
- DH_LEN=128; DH_GROUP_LEN=40.
- PK_ENC_LEN=128; PK_PAD_LEN=42.
- HASH_LEN=20.
-
- When we refer to "the hash of a public key", we mean the SHA-1 hash of the
- DER encoding of an ASN.1 RSA public key (as specified in PKCS.1).
-
- All "random" values should be generated with a cryptographically strong
- random number generator, unless otherwise noted.
-
- The "hybrid encryption" of a byte sequence M with a public key PK is
- computed as follows:
- 1. If M is less than PK_ENC_LEN-PK_PAD_LEN, pad and encrypt M with PK.
- 2. Otherwise, generate a KEY_LEN byte random key K.
- Let M1 = the first PK_ENC_LEN-PK_PAD_LEN-KEY_LEN bytes of M,
- and let M2 = the rest of M.
- Pad and encrypt K|M1 with PK. Encrypt M2 with our stream cipher,
- using the key K. Concatenate these encrypted values.
- [XXX Note that this "hybrid encryption" approach does not prevent
- an attacker from adding or removing bytes to the end of M. It also
- allows attackers to modify the bytes not covered by the OAEP --
- see Goldberg's PET2006 paper for details. We will add a MAC to this
- scheme one day. -RD]
-
-0.4. Other parameter values
-
- CELL_LEN=512
-
-1. System overview
-
- Tor is a distributed overlay network designed to anonymize
- low-latency TCP-based applications such as web browsing, secure shell,
- and instant messaging. Clients choose a path through the network and
- build a ``circuit'', in which each node (or ``onion router'' or ``OR'')
- in the path knows its predecessor and successor, but no other nodes in
- the circuit. Traffic flowing down the circuit is sent in fixed-size
- ``cells'', which are unwrapped by a symmetric key at each node (like
- the layers of an onion) and relayed downstream.
-
-2. Connections
-
- There are two ways to connect to an onion router (OR). The first is
- as an onion proxy (OP), which allows the OP to authenticate the OR
- without authenticating itself. The second is as another OR, which
- allows mutual authentication.
-
- Tor uses TLS for link encryption. All implementations MUST support
- the TLS ciphersuite "TLS_EDH_RSA_WITH_DES_192_CBC3_SHA", and SHOULD
- support "TLS_DHE_RSA_WITH_AES_128_CBC_SHA" if it is available.
- Implementations MAY support other ciphersuites, but MUST NOT
- support any suite without ephemeral keys, symmetric keys of at
- least KEY_LEN bits, and digests of at least HASH_LEN bits.
-
- An OP or OR always sends a two-certificate chain, consisting of a
- certificate using a short-term connection key and a second, self-
- signed certificate containing the OR's identity key. The commonName of the
- first certificate is the OR's nickname, and the commonName of the second
- certificate is the OR's nickname, followed by a space and the string
- "<identity>".
-
- All parties receiving certificates must confirm that the identity key is
- as expected. (When initiating a connection, the expected identity key is
- the one given in the directory; when creating a connection because of an
- EXTEND cell, the expected identity key is the one given in the cell.) If
- the key is not as expected, the party must close the connection.
-
- All parties SHOULD reject connections to or from ORs that have malformed
- or missing certificates. ORs MAY accept or reject connections from OPs
- with malformed or missing certificates.
-
- Once a TLS connection is established, the two sides send cells
- (specified below) to one another. Cells are sent serially. All
- cells are CELL_LEN bytes long. Cells may be sent embedded in TLS
- records of any size or divided across TLS records, but the framing
- of TLS records MUST NOT leak information about the type or contents
- of the cells.
-
- TLS connections are not permanent. An OP or an OR may close a
- connection to an OR if there are no circuits running over the
- connection, and an amount of time (KeepalivePeriod, defaults to 5
- minutes) has passed.
-
- (As an exception, directory servers may try to stay connected to all of
- the ORs -- though this will be phased out for the Tor 0.1.2.x release.)
-
-3. Cell Packet format
-
- The basic unit of communication for onion routers and onion
- proxies is a fixed-width "cell". Each cell contains the following
- fields:
-
- CircID [2 bytes]
- Command [1 byte]
- Payload (padded with 0 bytes) [CELL_LEN-3 bytes]
- [Total size: CELL_LEN bytes]
-
- The CircID field determines which circuit, if any, the cell is
- associated with.
-
- The 'Command' field holds one of the following values:
- 0 -- PADDING (Padding) (See Sec 6.2)
- 1 -- CREATE (Create a circuit) (See Sec 4.1)
- 2 -- CREATED (Acknowledge create) (See Sec 4.1)
- 3 -- RELAY (End-to-end data) (See Sec 4.5 and 5)
- 4 -- DESTROY (Stop using a circuit) (See Sec 4.4)
- 5 -- CREATE_FAST (Create a circuit, no PK) (See Sec 4.1)
- 6 -- CREATED_FAST (Circuit created, no PK) (See Sec 4.1)
- 7 -- HELLO (Introduce the OR) (See Sec 7.1)
-
- The interpretation of 'Payload' depends on the type of the cell.
- PADDING: Payload is unused.
- CREATE: Payload contains the handshake challenge.
- CREATED: Payload contains the handshake response.
- RELAY: Payload contains the relay header and relay body.
- DESTROY: Payload contains a reason for closing the circuit.
- (see 4.4)
- Upon receiving any other value for the command field, an OR must
- drop the cell.
-
- The payload is padded with 0 bytes.
-
- PADDING cells are currently used to implement connection keepalive.
- If there is no other traffic, ORs and OPs send one another a PADDING
- cell every few minutes.
-
- CREATE, CREATED, and DESTROY cells are used to manage circuits;
- see section 4 below.
-
- RELAY cells are used to send commands and data along a circuit; see
- section 5 below.
-
- HELLO cells are used to introduce parameters and characteristics of
- Tor clients and servers when connections are established.
-
-4. Circuit management
-
-4.1. CREATE and CREATED cells
-
- Users set up circuits incrementally, one hop at a time. To create a
- new circuit, OPs send a CREATE cell to the first node, with the
- first half of the DH handshake; that node responds with a CREATED
- cell with the second half of the DH handshake plus the first 20 bytes
- of derivative key data (see section 4.2). To extend a circuit past
- the first hop, the OP sends an EXTEND relay cell (see section 5)
- which instructs the last node in the circuit to send a CREATE cell
- to extend the circuit.
-
- The payload for a CREATE cell is an 'onion skin', which consists
- of the first step of the DH handshake data (also known as g^x).
- This value is hybrid-encrypted (see 0.3) to Bob's public key, giving
- an onion-skin of:
- PK-encrypted:
- Padding padding [PK_PAD_LEN bytes]
- Symmetric key [KEY_LEN bytes]
- First part of g^x [PK_ENC_LEN-PK_PAD_LEN-KEY_LEN bytes]
- Symmetrically encrypted:
- Second part of g^x [DH_LEN-(PK_ENC_LEN-PK_PAD_LEN-KEY_LEN)
- bytes]
-
- The relay payload for an EXTEND relay cell consists of:
- Address [4 bytes]
- Port [2 bytes]
- Onion skin [DH_LEN+KEY_LEN+PK_PAD_LEN bytes]
- Identity fingerprint [HASH_LEN bytes]
-
- The port and address field denote the IPV4 address and port of the next
- onion router in the circuit; the public key hash is the hash of the PKCS#1
- ASN1 encoding of the next onion router's identity (signing) key. (See 0.3
- above.) (Including this hash allows the extending OR verify that it is
- indeed connected to the correct target OR, and prevents certain
- man-in-the-middle attacks.)
-
- The payload for a CREATED cell, or the relay payload for an
- EXTENDED cell, contains:
- DH data (g^y) [DH_LEN bytes]
- Derivative key data (KH) [HASH_LEN bytes] <see 4.2 below>
-
- The CircID for a CREATE cell is an arbitrarily chosen 2-byte integer,
- selected by the node (OP or OR) that sends the CREATE cell. To prevent
- CircID collisions, when one OR sends a CREATE cell to another, it chooses
- from only one half of the possible values based on the ORs' public
- identity keys: if the sending OR has a lower key, it chooses a CircID with
- an MSB of 0; otherwise, it chooses a CircID with an MSB of 1.
-
- Public keys are compared numerically by modulus.
-
- As usual with DH, x and y MUST be generated randomly.
-
-4.1.1. CREATE_FAST/CREATED_FAST cells
-
- When initializing the first hop of a circuit, the OP has already
- established the OR's identity and negotiated a secret key using TLS.
- Because of this, it is not always necessary for the OP to perform the
- public key operations to create a circuit. In this case, the
- OP MAY send a CREATE_FAST cell instead of a CREATE cell for the first
- hop only. The OR responds with a CREATED_FAST cell, and the circuit is
- created.
-
- A CREATE_FAST cell contains:
-
- Key material (X) [HASH_LEN bytes]
-
- A CREATED_FAST cell contains:
-
- Key material (Y) [HASH_LEN bytes]
- Derivative key data [HASH_LEN bytes] (See 4.2 below)
-
- The values of X and Y must be generated randomly.
-
- [Versions of Tor before 0.1.0.6-rc did not support these cell types;
- clients should not send CREATE_FAST cells to older Tor servers.]
-
-4.2. Setting circuit keys
-
- Once the handshake between the OP and an OR is completed, both can
- now calculate g^xy with ordinary DH. Before computing g^xy, both client
- and server MUST verify that the received g^x or g^y value is not degenerate;
- that is, it must be strictly greater than 1 and strictly less than p-1
- where p is the DH modulus. Implementations MUST NOT complete a handshake
- with degenerate keys. Implementations MUST NOT discard other "weak"
- g^x values.
-
- (Discarding degenerate keys is critical for security; if bad keys
- are not discarded, an attacker can substitute the server's CREATED
- cell's g^y with 0 or 1, thus creating a known g^xy and impersonating
- the server. Discarding other keys may allow attacks to learn bits of
- the private key.)
-
- (The mainline Tor implementation, in the 0.1.1.x-alpha series, discarded
- all g^x values less than 2^24, greater than p-2^24, or having more than
- 1024-16 identical bits. This served no useful purpose, and we stopped.)
-
- If CREATE or EXTEND is used to extend a circuit, the client and server
- base their key material on K0=g^xy, represented as a big-endian unsigned
- integer.
-
- If CREATE_FAST is used, the client and server base their key material on
- K0=X|Y.
-
- From the base key material K0, they compute KEY_LEN*2+HASH_LEN*3 bytes of
- derivative key data as
- K = H(K0 | [00]) | H(K0 | [01]) | H(K0 | [02]) | ...
-
- The first HASH_LEN bytes of K form KH; the next HASH_LEN form the forward
- digest Df; the next HASH_LEN 41-60 form the backward digest Db; the next
- KEY_LEN 61-76 form Kf, and the final KEY_LEN form Kb. Excess bytes from K
- are discarded.
-
- KH is used in the handshake response to demonstrate knowledge of the
- computed shared key. Df is used to seed the integrity-checking hash
- for the stream of data going from the OP to the OR, and Db seeds the
- integrity-checking hash for the data stream from the OR to the OP. Kf
- is used to encrypt the stream of data going from the OP to the OR, and
- Kb is used to encrypt the stream of data going from the OR to the OP.
-
-4.3. Creating circuits
-
- When creating a circuit through the network, the circuit creator
- (OP) performs the following steps:
-
- 1. Choose an onion router as an exit node (R_N), such that the onion
- router's exit policy includes at least one pending stream that
- needs a circuit (if there are any).
-
- 2. Choose a chain of (N-1) onion routers
- (R_1...R_N-1) to constitute the path, such that no router
- appears in the path twice.
-
- 3. If not already connected to the first router in the chain,
- open a new connection to that router.
-
- 4. Choose a circID not already in use on the connection with the
- first router in the chain; send a CREATE cell along the
- connection, to be received by the first onion router.
-
- 5. Wait until a CREATED cell is received; finish the handshake
- and extract the forward key Kf_1 and the backward key Kb_1.
-
- 6. For each subsequent onion router R (R_2 through R_N), extend
- the circuit to R.
-
- To extend the circuit by a single onion router R_M, the OP performs
- these steps:
-
- 1. Create an onion skin, encrypted to R_M's public key.
-
- 2. Send the onion skin in a relay EXTEND cell along
- the circuit (see section 5).
-
- 3. When a relay EXTENDED cell is received, verify KH, and
- calculate the shared keys. The circuit is now extended.
-
- When an onion router receives an EXTEND relay cell, it sends a CREATE
- cell to the next onion router, with the enclosed onion skin as its
- payload. The initiating onion router chooses some circID not yet
- used on the connection between the two onion routers. (But see
- section 4.1. above, concerning choosing circIDs based on
- lexicographic order of nicknames.)
-
- When an onion router receives a CREATE cell, if it already has a
- circuit on the given connection with the given circID, it drops the
- cell. Otherwise, after receiving the CREATE cell, it completes the
- DH handshake, and replies with a CREATED cell. Upon receiving a
- CREATED cell, an onion router packs it payload into an EXTENDED relay
- cell (see section 5), and sends that cell up the circuit. Upon
- receiving the EXTENDED relay cell, the OP can retrieve g^y.
-
- (As an optimization, OR implementations may delay processing onions
- until a break in traffic allows time to do so without harming
- network latency too greatly.)
-
-4.4. Tearing down circuits
-
- Circuits are torn down when an unrecoverable error occurs along
- the circuit, or when all streams on a circuit are closed and the
- circuit's intended lifetime is over. Circuits may be torn down
- either completely or hop-by-hop.
-
- To tear down a circuit completely, an OR or OP sends a DESTROY
- cell to the adjacent nodes on that circuit, using the appropriate
- direction's circID.
-
- Upon receiving an outgoing DESTROY cell, an OR frees resources
- associated with the corresponding circuit. If it's not the end of
- the circuit, it sends a DESTROY cell for that circuit to the next OR
- in the circuit. If the node is the end of the circuit, then it tears
- down any associated edge connections (see section 5.1).
-
- After a DESTROY cell has been processed, an OR ignores all data or
- destroy cells for the corresponding circuit.
-
- To tear down part of a circuit, the OP may send a RELAY_TRUNCATE cell
- signaling a given OR (Stream ID zero). That OR sends a DESTROY
- cell to the next node in the circuit, and replies to the OP with a
- RELAY_TRUNCATED cell.
-
- When an unrecoverable error occurs along one connection in a
- circuit, the nodes on either side of the connection should, if they
- are able, act as follows: the node closer to the OP should send a
- RELAY_TRUNCATED cell towards the OP; the node farther from the OP
- should send a DESTROY cell down the circuit.
-
- The payload of a RELAY_TRUNCATED or DESTROY cell contains a single octet,
- describing why the circuit is being closed or truncated. When sending a
- TRUNCATED or DESTROY cell because of another TRUNCATED or DESTROY cell,
- the error code should be propagated. The origin of a circuit always sets
- this error code to 0, to avoid leaking its version.
-
- The error codes are:
- 0 -- NONE (No reason given.)
- 1 -- PROTOCOL (Tor protocol violation.)
- 2 -- INTERNAL (Internal error.)
- 3 -- REQUESTED (A client sent a TRUNCATE command.)
- 4 -- HIBERNATING (Not currently operating; trying to save bandwidth.)
- 5 -- RESOURCELIMIT (Out of memory, sockets, or circuit IDs.)
- 6 -- CONNECTFAILED (Unable to reach server.)
- 7 -- OR_IDENTITY (Connected to server, but its OR identity was not
- as expected.)
- 8 -- OR_CONN_CLOSED (The OR connection that was carrying this circuit
- died.)
-
- [Versions of Tor prior to 0.1.0.11 didn't send reasons; implementations
- MUST accept empty TRUNCATED and DESTROY cells.]
-
-4.5. Routing relay cells
-
- When an OR receives a RELAY cell, it checks the cell's circID and
- determines whether it has a corresponding circuit along that
- connection. If not, the OR drops the RELAY cell.
-
- Otherwise, if the OR is not at the OP edge of the circuit (that is,
- either an 'exit node' or a non-edge node), it de/encrypts the payload
- with the stream cipher, as follows:
- 'Forward' relay cell (same direction as CREATE):
- Use Kf as key; decrypt.
- 'Back' relay cell (opposite direction from CREATE):
- Use Kb as key; encrypt.
- Note that in counter mode, decrypt and encrypt are the same operation.
-
- The OR then decides whether it recognizes the relay cell, by
- inspecting the payload as described in section 5.1 below. If the OR
- recognizes the cell, it processes the contents of the relay cell.
- Otherwise, it passes the decrypted relay cell along the circuit if
- the circuit continues. If the OR at the end of the circuit
- encounters an unrecognized relay cell, an error has occurred: the OR
- sends a DESTROY cell to tear down the circuit.
-
- When a relay cell arrives at an OP, the OP decrypts the payload
- with the stream cipher as follows:
- OP receives data cell:
- For I=N...1,
- Decrypt with Kb_I. If the payload is recognized (see
- section 5.1), then stop and process the payload.
-
- For more information, see section 5 below.
-
-5. Application connections and stream management
-
-5.1. Relay cells
-
- Within a circuit, the OP and the exit node use the contents of
- RELAY packets to tunnel end-to-end commands and TCP connections
- ("Streams") across circuits. End-to-end commands can be initiated
- by either edge; streams are initiated by the OP.
-
- The payload of each unencrypted RELAY cell consists of:
- Relay command [1 byte]
- 'Recognized' [2 bytes]
- StreamID [2 bytes]
- Digest [4 bytes]
- Length [2 bytes]
- Data [CELL_LEN-14 bytes]
-
- The relay commands are:
- 1 -- RELAY_BEGIN [forward]
- 2 -- RELAY_DATA [forward or backward]
- 3 -- RELAY_END [forward or backward]
- 4 -- RELAY_CONNECTED [backward]
- 5 -- RELAY_SENDME [forward or backward]
- 6 -- RELAY_EXTEND [forward]
- 7 -- RELAY_EXTENDED [backward]
- 8 -- RELAY_TRUNCATE [forward]
- 9 -- RELAY_TRUNCATED [backward]
- 10 -- RELAY_DROP [forward or backward]
- 11 -- RELAY_RESOLVE [forward]
- 12 -- RELAY_RESOLVED [backward]
-
- Commands labelled as "forward" must only be sent by the originator
- of the circuit. Commands labelled as "backward" must only be sent by
- other nodes in the circuit back to the originator. Commands marked
- as either can be sent either by the originator or other nodes.
-
- The 'recognized' field in any unencrypted relay payload is always set
- to zero; the 'digest' field is computed as the first four bytes of
- the running digest of all the bytes that have been destined for
- this hop of the circuit or originated from this hop of the circuit,
- seeded from Df or Db respectively (obtained in section 4.2 above),
- and including this RELAY cell's entire payload (taken with the digest
- field set to zero).
-
- When the 'recognized' field of a RELAY cell is zero, and the digest
- is correct, the cell is considered "recognized" for the purposes of
- decryption (see section 4.5 above).
-
- (The digest does not include any bytes from relay cells that do
- not start or end at this hop of the circuit. That is, it does not
- include forwarded data. Therefore if 'recognized' is zero but the
- digest does not match, the running digest at that node should
- not be updated, and the cell should be forwarded on.)
-
- All RELAY cells pertaining to the same tunneled stream have the
- same stream ID. StreamIDs are chosen arbitrarily by the OP. RELAY
- cells that affect the entire circuit rather than a particular
- stream use a StreamID of zero.
-
- The 'Length' field of a relay cell contains the number of bytes in
- the relay payload which contain real payload data. The remainder of
- the payload is padded with NUL bytes.
-
- If the RELAY cell is recognized but the relay command is not
- understood, the cell must be dropped and ignored. Its contents
- still count with respect to the digests, though. [Before
- 0.1.1.10, Tor closed circuits when it received an unknown relay
- command. Perhaps this will be more forward-compatible. -RD]
-
-5.2. Opening streams and transferring data
-
- To open a new anonymized TCP connection, the OP chooses an open
- circuit to an exit that may be able to connect to the destination
- address, selects an arbitrary StreamID not yet used on that circuit,
- and constructs a RELAY_BEGIN cell with a payload encoding the address
- and port of the destination host. The payload format is:
-
- ADDRESS | ':' | PORT | [00]
-
- where ADDRESS can be a DNS hostname, or an IPv4 address in
- dotted-quad format, or an IPv6 address surrounded by square brackets;
- and where PORT is encoded in decimal.
-
- [What is the [00] for? -NM]
- [It's so the payload is easy to parse out with string funcs -RD]
-
- Upon receiving this cell, the exit node resolves the address as
- necessary, and opens a new TCP connection to the target port. If the
- address cannot be resolved, or a connection can't be established, the
- exit node replies with a RELAY_END cell. (See 5.4 below.)
- Otherwise, the exit node replies with a RELAY_CONNECTED cell, whose
- payload is in one of the following formats:
- The IPv4 address to which the connection was made [4 octets]
- A number of seconds (TTL) for which the address may be cached [4 octets]
- or
- Four zero-valued octets [4 octets]
- An address type (6) [1 octet]
- The IPv6 address to which the connection was made [16 octets]
- A number of seconds (TTL) for which the address may be cached [4 octets]
- [XXXX Versions of Tor before 0.1.1.6 ignore and do not generate the TTL
- field. No version of Tor currently generates the IPv6 format.
-
- Tor servers before 0.1.2.0 set the TTL field to a fixed value. Later
- versions set the TTL to the last value seen from a DNS server, and expire
- their own cached entries after a fixed interval. This prevents certain
- attacks.]
-
- The OP waits for a RELAY_CONNECTED cell before sending any data.
- Once a connection has been established, the OP and exit node
- package stream data in RELAY_DATA cells, and upon receiving such
- cells, echo their contents to the corresponding TCP stream.
- RELAY_DATA cells sent to unrecognized streams are dropped.
-
- Relay RELAY_DROP cells are long-range dummies; upon receiving such
- a cell, the OR or OP must drop it.
-
-5.3. Closing streams
-
- When an anonymized TCP connection is closed, or an edge node
- encounters error on any stream, it sends a 'RELAY_END' cell along the
- circuit (if possible) and closes the TCP connection immediately. If
- an edge node receives a 'RELAY_END' cell for any stream, it closes
- the TCP connection completely, and sends nothing more along the
- circuit for that stream.
-
- The payload of a RELAY_END cell begins with a single 'reason' byte to
- describe why the stream is closing, plus optional data (depending on
- the reason.) The values are:
-
- 1 -- REASON_MISC (catch-all for unlisted reasons)
- 2 -- REASON_RESOLVEFAILED (couldn't look up hostname)
- 3 -- REASON_CONNECTREFUSED (remote host refused connection) [*]
- 4 -- REASON_EXITPOLICY (OR refuses to connect to host or port)
- 5 -- REASON_DESTROY (Circuit is being destroyed)
- 6 -- REASON_DONE (Anonymized TCP connection was closed)
- 7 -- REASON_TIMEOUT (Connection timed out, or OR timed out
- while connecting)
- 8 -- (unallocated) [**]
- 9 -- REASON_HIBERNATING (OR is temporarily hibernating)
- 10 -- REASON_INTERNAL (Internal error at the OR)
- 11 -- REASON_RESOURCELIMIT (OR has no resources to fulfill request)
- 12 -- REASON_CONNRESET (Connection was unexpectedly reset)
- 13 -- REASON_TORPROTOCOL (Sent when closing connection because of
- Tor protocol violations.)
-
- (With REASON_EXITPOLICY, the 4-byte IPv4 address or 16-byte IPv6 address
- forms the optional data; no other reason currently has extra data.
- As of 0.1.1.6, the body also contains a 4-byte TTL.)
-
- OPs and ORs MUST accept reasons not on the above list, since future
- versions of Tor may provide more fine-grained reasons.
-
- [*] Older versions of Tor also send this reason when connections are
- reset.
- [**] Due to a bug in versions of Tor through 0095, error reason 8 must
- remain allocated until that version is obsolete.
-
- --- [The rest of this section describes unimplemented functionality.]
-
- Because TCP connections can be half-open, we follow an equivalent
- to TCP's FIN/FIN-ACK/ACK protocol to close streams.
-
- An exit connection can have a TCP stream in one of three states:
- 'OPEN', 'DONE_PACKAGING', and 'DONE_DELIVERING'. For the purposes
- of modeling transitions, we treat 'CLOSED' as a fourth state,
- although connections in this state are not, in fact, tracked by the
- onion router.
-
- A stream begins in the 'OPEN' state. Upon receiving a 'FIN' from
- the corresponding TCP connection, the edge node sends a 'RELAY_FIN'
- cell along the circuit and changes its state to 'DONE_PACKAGING'.
- Upon receiving a 'RELAY_FIN' cell, an edge node sends a 'FIN' to
- the corresponding TCP connection (e.g., by calling
- shutdown(SHUT_WR)) and changing its state to 'DONE_DELIVERING'.
-
- When a stream in already in 'DONE_DELIVERING' receives a 'FIN', it
- also sends a 'RELAY_FIN' along the circuit, and changes its state
- to 'CLOSED'. When a stream already in 'DONE_PACKAGING' receives a
- 'RELAY_FIN' cell, it sends a 'FIN' and changes its state to
- 'CLOSED'.
-
- If an edge node encounters an error on any stream, it sends a
- 'RELAY_END' cell (if possible) and closes the stream immediately.
-
-5.4. Remote hostname lookup
-
- To find the address associated with a hostname, the OP sends a
- RELAY_RESOLVE cell containing the hostname to be resolved. (For a reverse
- lookup, the OP sends a RELAY_RESOLVE cell containing an in-addr.arpa
- address.) The OR replies with a RELAY_RESOLVED cell containing a status
- byte, and any number of answers. Each answer is of the form:
- Type (1 octet)
- Length (1 octet)
- Value (variable-width)
- TTL (4 octets)
- "Length" is the length of the Value field.
- "Type" is one of:
- 0x00 -- Hostname
- 0x04 -- IPv4 address
- 0x06 -- IPv6 address
- 0xF0 -- Error, transient
- 0xF1 -- Error, nontransient
-
- If any answer has a type of 'Error', then no other answer may be given.
-
- The RELAY_RESOLVE cell must use a nonzero, distinct streamID; the
- corresponding RELAY_RESOLVED cell must use the same streamID. No stream
- is actually created by the OR when resolving the name.
-
-6. Flow control
-
-6.1. Link throttling
-
- Each node should do appropriate bandwidth throttling to keep its
- user happy.
-
- Communicants rely on TCP's default flow control to push back when they
- stop reading.
-
-6.2. Link padding
-
- Currently nodes are not required to do any sort of link padding or
- dummy traffic. Because strong attacks exist even with link padding,
- and because link padding greatly increases the bandwidth requirements
- for running a node, we plan to leave out link padding until this
- tradeoff is better understood.
-
-6.3. Circuit-level flow control
-
- To control a circuit's bandwidth usage, each OR keeps track of
- two 'windows', consisting of how many RELAY_DATA cells it is
- allowed to package for transmission, and how many RELAY_DATA cells
- it is willing to deliver to streams outside the network.
- Each 'window' value is initially set to 1000 data cells
- in each direction (cells that are not data cells do not affect
- the window). When an OR is willing to deliver more cells, it sends a
- RELAY_SENDME cell towards the OP, with Stream ID zero. When an OR
- receives a RELAY_SENDME cell with stream ID zero, it increments its
- packaging window.
-
- Each of these cells increments the corresponding window by 100.
-
- The OP behaves identically, except that it must track a packaging
- window and a delivery window for every OR in the circuit.
-
- An OR or OP sends cells to increment its delivery window when the
- corresponding window value falls under some threshold (900).
-
- If a packaging window reaches 0, the OR or OP stops reading from
- TCP connections for all streams on the corresponding circuit, and
- sends no more RELAY_DATA cells until receiving a RELAY_SENDME cell.
-[this stuff is badly worded; copy in the tor-design section -RD]
-
-6.4. Stream-level flow control
-
- Edge nodes use RELAY_SENDME cells to implement end-to-end flow
- control for individual connections across circuits. Similarly to
- circuit-level flow control, edge nodes begin with a window of cells
- (500) per stream, and increment the window by a fixed value (50)
- upon receiving a RELAY_SENDME cell. Edge nodes initiate RELAY_SENDME
- cells when both a) the window is <= 450, and b) there are less than
- ten cell payloads remaining to be flushed at that edge.
-
Added: tor/trunk/doc/tor-spec-v2.txt
===================================================================
--- tor/trunk/doc/tor-spec-v2.txt 2006-12-31 06:18:16 UTC (rev 9225)
+++ tor/trunk/doc/tor-spec-v2.txt 2006-12-31 19:31:45 UTC (rev 9226)
@@ -0,0 +1,942 @@
+$Id$
+
+ Tor Protocol Specification
+
+ Roger Dingledine
+ Nick Mathewson
+
+Note: This document aims to specify Tor as implemented in 0.1.2.1-alpha-dev
+and later. Future versions of Tor will implement improved protocols, and
+compatibility is not guaranteed.
+
+THIS DOCUMENT IS UNSTABLE. Right now, we're revising the protocol to remove
+a few long-standing limitations. For the most stable current version of the
+protocol, see tor-spec-v0.txt; current versions of Tor are backward-compatible.
+
+This specification is not a design document; most design criteria
+are not examined. For more information on why Tor acts as it does,
+see tor-design.pdf.
+
+TODO for v2 revision:
+ - Fix onionskin handshake scheme to be more mainstream, less nutty.
+ Can we just do
+ E(HMAC(g^x), g^x) rather than just E(g^x) ?
+ No, that has the same flaws as before. We should send
+ E(g^x, C) with random C and expect g^y, HMAC_C(K=g^xy).
+ Better ask Ian; probably Stephen too.
+ - Versioned CREATE and friends
+ - Length on CREATE and friends
+ - Versioning on circuits
+ - Versioning on create cells
+ - SHA1 is showing its age
+ - Not being able to upgrade ciphersuites or increase key lengths is
+ lame.
+
+TODO:
+ - REASON_CONNECTFAILED should include an IP.
+ - Copy prose from tor-design to make everything more readable.
+ - Spec when we should rotate which keys (tls, link, etc)?
+
+0. Preliminaries
+
+0.1. Notation and encoding
+
+ PK -- a public key.
+ SK -- a private key.
+ K -- a key for a symmetric cypher.
+
+ a|b -- concatenation of 'a' and 'b'.
+
+ [A0 B1 C2] -- a three-byte sequence, containing the bytes with
+ hexadecimal values A0, B1, and C2, in that order.
+
+ All numeric values are encoded in network (big-endian) order.
+
+ H(m) -- a cryptographic hash of m.
+
+0.2. Security parameters
+
+ Tor uses a stream cipher, a public-key cipher, the Diffie-Hellman
+ protocol, and a hash function.
+
+ KEY_LEN -- the length of the stream cipher's key, in bytes.
+
+ PK_ENC_LEN -- the length of a public-key encrypted message, in bytes.
+ PK_PAD_LEN -- the number of bytes added in padding for public-key
+ encryption, in bytes. (The largest number of bytes that can be encrypted
+ in a single public-key operation is therefore PK_ENC_LEN-PK_PAD_LEN.)
+
+ DH_LEN -- the number of bytes used to represent a member of the
+ Diffie-Hellman group.
+ DH_SEC_LEN -- the number of bytes used in a Diffie-Hellman private key (x).
+
+ HASH_LEN -- the length of the hash function's output, in bytes.
+
+ PAYLOAD_LEN -- The longest allowable cell payload, in bytes. (509)
+
+ CELL_LEN -- The length of a Tor cell, in bytes.
+
+0.3. Ciphers
+
+ For a stream cipher, we use 128-bit AES in counter mode, with an IV of all
+ 0 bytes.
+
+ For a public-key cipher, we use RSA with 1024-bit keys and a fixed
+ exponent of 65537. We use OAEP padding, with SHA-1 as its digest
+ function. (For OAEP padding, see
+ ftp://ftp.rsasecurity.com/pub/pkcs/pkcs-1/pkcs-1v2-1.pdf)
+
+ For Diffie-Hellman, we use a generator (g) of 2. For the modulus (p), we
+ use the 1024-bit safe prime from rfc2409 section 6.2 whose hex
+ representation is:
+
+ "FFFFFFFFFFFFFFFFC90FDAA22168C234C4C6628B80DC1CD129024E08"
+ "8A67CC74020BBEA63B139B22514A08798E3404DDEF9519B3CD3A431B"
+ "302B0A6DF25F14374FE1356D6D51C245E485B576625E7EC6F44C42E9"
+ "A637ED6B0BFF5CB6F406B7EDEE386BFB5A899FA5AE9F24117C4B1FE6"
+ "49286651ECE65381FFFFFFFFFFFFFFFF"
+
+ As an optimization, implementations SHOULD choose DH private keys (x) of
+ 320 bits. Implementations that do this MUST never use any DH key more
+ than once.
+ [May other implementations reuse their DH keys?? -RD]
+ [Probably not. Conceivably, you could get away with changing DH keys once
+ per second, but there are too many oddball attacks for me to be
+ comfortable that this is safe. -NM]
+
+ For a hash function, we use SHA-1.
+
+ KEY_LEN=16.
+ DH_LEN=128; DH_SEC_LEN=40.
+ PK_ENC_LEN=128; PK_PAD_LEN=42.
+ HASH_LEN=20.
+
+ When we refer to "the hash of a public key", we mean the SHA-1 hash of the
+ DER encoding of an ASN.1 RSA public key (as specified in PKCS.1).
+
+ All "random" values should be generated with a cryptographically strong
+ random number generator, unless otherwise noted.
+
+ The "hybrid encryption" of a byte sequence M with a public key PK is
+ computed as follows:
+ 1. If M is less than PK_ENC_LEN-PK_PAD_LEN, pad and encrypt M with PK.
+ 2. Otherwise, generate a KEY_LEN byte random key K.
+ Let M1 = the first PK_ENC_LEN-PK_PAD_LEN-KEY_LEN bytes of M,
+ and let M2 = the rest of M.
+ Pad and encrypt K|M1 with PK. Encrypt M2 with our stream cipher,
+ using the key K. Concatenate these encrypted values.
+ [XXX Note that this "hybrid encryption" approach does not prevent
+ an attacker from adding or removing bytes to the end of M. It also
+ allows attackers to modify the bytes not covered by the OAEP --
+ see Goldberg's PET2006 paper for details. We will add a MAC to this
+ scheme one day. -RD]
+
+0.4. Other parameter values
+
+ CELL_LEN=512
+
+1. System overview
+
+ Tor is a distributed overlay network designed to anonymize
+ low-latency TCP-based applications such as web browsing, secure shell,
+ and instant messaging. Clients choose a path through the network and
+ build a ``circuit'', in which each node (or ``onion router'' or ``OR'')
+ in the path knows its predecessor and successor, but no other nodes in
+ the circuit. Traffic flowing down the circuit is sent in fixed-size
+ ``cells'', which are unwrapped by a symmetric key at each node (like
+ the layers of an onion) and relayed downstream.
+
+1.1. Protocol Versioning
+
+ The node-to-node TLS-based "OR connection" protocol and the multi-hop
+ "circuit" protocol are versioned quasi-independently. (Certain versions
+ of the circuit protocol may require a minimum version of the connection
+ protocol to be used.)
+
+ Version numbers are incremented for backward-incompatible protocol changes
+ only. Backward-compatible changes are generally implemented by adding
+ additional fields to existing structures; implementations MUST ignore
+ fields they do not expect.
+
+ Parties negotiate OR connection versions as described below in sections
+ 4.1 and 4.2.
+
+2. Connections
+
+ Tor uses TLS for link encryption. All implementations MUST support
+ the TLS ciphersuite "TLS_EDH_RSA_WITH_DES_192_CBC3_SHA", and SHOULD
+ support "TLS_DHE_RSA_WITH_AES_128_CBC_SHA" if it is available.
+ Implementations MAY support other ciphersuites, but MUST NOT
+ support any suite without ephemeral keys, symmetric keys of at
+ least KEY_LEN bits, and digests of at least HASH_LEN bits.
+
+ Even though the connection protocol is identical, we think of the
+ initiator as either an onion router (OR) if it is willing to relay
+ traffic for other Tor users, or an onion proxy (OP) if it only handles
+ local requests. Onion proxies SHOULD NOT provide long-term-trackable
+ identifiers in their handshakes.
+
+ The connection initiator always sends a two-certificate chain,
+ consisting of a
+ certificate using a short-term connection key and a second, self-
+ signed certificate containing the OR's identity key. The commonName of the
+ first certificate is the OR's nickname, and the commonName of the second
+ certificate is the OR's nickname, followed by a space and the string
+ "<identity>".
+
+ Implementations running Protocol 1 and earlier use an
+ organizationName of "Tor" or "TOR". Future implementations (which
+ support the version negotiation protocol in section 4.1) MUST NOT
+ have either of these values for their organizationName.
+
+ All parties receiving certificates must confirm that the identity key is
+ as expected. (When initiating a connection, the expected identity key is
+ the one given in the directory; when creating a connection because of an
+ EXTEND cell, the expected identity key is the one given in the cell.) If
+ the key is not as expected, the party must close the connection.
+
+ All parties SHOULD reject connections to or from ORs that have malformed
+ or missing certificates. ORs MAY accept or reject connections from OPs
+ with malformed or missing certificates.
+
+ Once a TLS connection is established, the two sides send cells
+ (specified below) to one another. Cells are sent serially. All
+ cells are CELL_LEN bytes long. Cells may be sent embedded in TLS
+ records of any size or divided across TLS records, but the framing
+ of TLS records MUST NOT leak information about the type or contents
+ of the cells.
+
+ TLS connections are not permanent. Either side may close a connection
+ if there are no circuits running over it and an amount of time
+ (KeepalivePeriod, defaults to 5 minutes) has passed.
+
+ (As an exception, directory servers may try to stay connected to all of
+ the ORs -- though this will be phased out for the Tor 0.1.2.x release.)
+
+3. Cell Packet format
+
+ The basic unit of communication for onion routers and onion
+ proxies is a fixed-width "cell".
+
+ On a version 1 connection, each cell contains the following
+ fields:
+
+ CircID [2 bytes]
+ Command [1 byte]
+ Payload (padded with 0 bytes) [PAYLOAD_LEN bytes]
+
+ On a version 2 connection, each cell contains the following fields:
+
+ CircID [3 bytes]
+ Command [1 byte]
+ Payload (padded with 0 bytes) [PAYLOAD_LEN bytes]
+
+ The CircID field determines which circuit, if any, the cell is
+ associated with.
+
+ The 'Command' field holds one of the following values:
+ 0 -- PADDING (Padding) (See Sec 7.2)
+ 1 -- CREATE (Create a circuit) (See Sec 5.1)
+ 2 -- CREATED (Acknowledge create) (See Sec 5.1)
+ 3 -- RELAY (End-to-end data) (See Sec 5.5 and 6)
+ 4 -- DESTROY (Stop using a circuit) (See Sec 5.4)
+ 5 -- CREATE_FAST (Create a circuit, no PK) (See Sec 5.1)
+ 6 -- CREATED_FAST (Circuit created, no PK) (See Sec 5.1)
+ 7 -- VERSIONS (Negotiate versions) (See Sec 4.1)
+ 8 -- NETINFO (Time and MITM-prevention) (See Sec 4.2)
+
+ The interpretation of 'Payload' depends on the type of the cell.
+ PADDING: Payload is unused.
+ CREATE: Payload contains the handshake challenge.
+ CREATED: Payload contains the handshake response.
+ RELAY: Payload contains the relay header and relay body.
+ DESTROY: Payload contains a reason for closing the circuit.
+ (see 5.4)
+ Upon receiving any other value for the command field, an OR must
+ drop the cell. [XXXX Versions prior to 0.1.0.?? logged a warning
+ when dropping the cell; this is bad behavior. -NM]
+
+ The payload is padded with 0 bytes.
+
+ PADDING cells are currently used to implement connection keepalive.
+ If there is no other traffic, ORs and OPs send one another a PADDING
+ cell every few minutes.
+
+ CREATE, CREATED, and DESTROY cells are used to manage circuits;
+ see section 4 below.
+
+ RELAY cells are used to send commands and data along a circuit; see
+ section 5 below.
+
+ VERSIONS cells are used to introduce parameters and characteristics of
+ Tor clients and servers when connections are established.
+
+4, Connection management
+
+ Upon establishing a TLS connection, both parties immediately begin
+ negotiating a connection protocol version and other connection parameters.
+
+4.1. VERSIONS cells
+
+ When a Tor connection is established, both parties normally send a
+ VERSIONS cell before sending any other cells. (But see below.)
+
+ NumVersions [1 byte]
+ Versions [NumVersions bytes]
+
+ "Versions" is a sequence of NumVersions link connection protocol versions,
+ each one byte long. Parties should list all of the versions which they
+ are able and willing to support. Parties can only communicate if they
+ have some connection protocol version in common.
+
+ Version 0.1.x.y-alpha and earlier don't understand VERSIONS cells,
+ and therefore don't support version negotiation. Thus, waiting until
+ the other side has sent a VERSIONS cell won't work for these servers:
+ if they send no cells back, it is impossible to tell whether they
+ have sent a VERSIONS cell that has been stalled, or whether they have
+ dropped our own VERSIONS cell as unrecognized. Thus, immediately after
+ a TLS connection has been established, the parties check whether the
+ other side has an obsolete certificate (organizationName equal to "Tor"
+ or "TOR"). If the other party presented an obsolete certificate,
+ we assume a v0 connection. Otherwise, both parties send VERSIONS
+ cells listing all their supported versions. Upon receiving the
+ other party's VERSIONS cell, the implementation begins using the
+ highest-valued version common to both cells. If the first cell from
+ the other party is _not_ a VERSIONS cell, we assume a v0 protocol.
+
+ Implementations MUST discard cells that are not the first cells sent on a
+ connection.
+
+4.2. MITM-prevention and time checking
+
+ If we negotiate a v1 connection or higher, the first cell we send SHOULD
+ be a NETINFO cell. Implementations SHOULD NOT send NETINFO cells at other
+ times.
+
+ A NETINFO cell contains:
+ Timestamp [4 bytes]
+ This OR's address [variable]
+ Other OR's address [variable]
+
+ Timestamp is the OR's current Unix time, in seconds since the epoch. If
+ an implementation receives time values from many validated ORs that
+ indicate that its clock is skewed, it SHOULD try to warn the
+ administrator.
+
+ Each address contains Type/Length/Value as used in Section 6.4. The first
+ address is the address of the interface the party sending the VERSIONS cell
+ used to connect to or accept connections from the other -- we include it
+ to block a man-in-the-middle attack on TLS that lets an attacker bounce
+ traffic through his own computers to enable timing and packet-counting
+ attacks.
+
+ The second address is the one that the party sending the VERSIONS cell
+ believes the other has -- it can be used to learn what your IP address
+ is if you have no other hints.
+
+5. Circuit management
+
+5.1. CREATE and CREATED cells
+
+ Users set up circuits incrementally, one hop at a time. To create a
+ new circuit, OPs send a CREATE cell to the first node, with the
+ first half of the DH handshake; that node responds with a CREATED
+ cell with the second half of the DH handshake plus the first 20 bytes
+ of derivative key data (see section 5.2). To extend a circuit past
+ the first hop, the OP sends an EXTEND relay cell (see section 5)
+ which instructs the last node in the circuit to send a CREATE cell
+ to extend the circuit.
+
+ The payload for a CREATE cell is an 'onion skin', which consists
+ of the first step of the DH handshake data (also known as g^x).
+ This value is hybrid-encrypted (see 0.3) to Bob's public key, giving
+ an onion-skin of:
+ PK-encrypted:
+ Padding padding [PK_PAD_LEN bytes]
+ Symmetric key [KEY_LEN bytes]
+ First part of g^x [PK_ENC_LEN-PK_PAD_LEN-KEY_LEN bytes]
+ Symmetrically encrypted:
+ Second part of g^x [DH_LEN-(PK_ENC_LEN-PK_PAD_LEN-KEY_LEN)
+ bytes]
+
+ The relay payload for an EXTEND relay cell consists of:
+ Address [4 bytes]
+ Port [2 bytes]
+ Onion skin [DH_LEN+KEY_LEN+PK_PAD_LEN bytes]
+ Identity fingerprint [HASH_LEN bytes]
+
+ The port and address field denote the IPV4 address and port of the next
+ onion router in the circuit; the public key hash is the hash of the PKCS#1
+ ASN1 encoding of the next onion router's identity (signing) key. (See 0.3
+ above.) (Including this hash allows the extending OR verify that it is
+ indeed connected to the correct target OR, and prevents certain
+ man-in-the-middle attacks.)
+
+ The payload for a CREATED cell, or the relay payload for an
+ EXTENDED cell, contains:
+ DH data (g^y) [DH_LEN bytes]
+ Derivative key data (KH) [HASH_LEN bytes] <see 5.2 below>
+
+ The CircID for a CREATE cell is an arbitrarily chosen 2-byte integer,
+ selected by the node (OP or OR) that sends the CREATE cell. To prevent
+ CircID collisions, when one OR sends a CREATE cell to another, it chooses
+ from only one half of the possible values based on the ORs' public
+ identity keys: if the sending OR has a lower key, it chooses a CircID with
+ an MSB of 0; otherwise, it chooses a CircID with an MSB of 1.
+
+ Public keys are compared numerically by modulus.
+
+ As usual with DH, x and y MUST be generated randomly.
+
+[
+ To implement backward-compatible version negotiation, parties MUST
+ drop CREATE cells with all-[00] onion-skins.
+]
+
+5.1.1. CREATE_FAST/CREATED_FAST cells
+
+ When initializing the first hop of a circuit, the OP has already
+ established the OR's identity and negotiated a secret key using TLS.
+ Because of this, it is not always necessary for the OP to perform the
+ public key operations to create a circuit. In this case, the
+ OP MAY send a CREATE_FAST cell instead of a CREATE cell for the first
+ hop only. The OR responds with a CREATED_FAST cell, and the circuit is
+ created.
+
+ A CREATE_FAST cell contains:
+
+ Key material (X) [HASH_LEN bytes]
+
+ A CREATED_FAST cell contains:
+
+ Key material (Y) [HASH_LEN bytes]
+ Derivative key data [HASH_LEN bytes] (See 5.2 below)
+
+ The values of X and Y must be generated randomly.
+
+ [Versions of Tor before 0.1.0.6-rc did not support these cell types;
+ clients should not send CREATE_FAST cells to older Tor servers.]
+
+ If an OR sees a circuit created with CREATE_FAST, the OR is sure to be the
+ first hop of a circuit. ORs SHOULD reject attempts to create streams with
+ RELAY_BEGIN exiting the circuit at the first hop: letting Tor be used as a
+ single hop proxy makes exit nodes a more attractive target for compromise.
+
+5.2. Setting circuit keys
+
+ Once the handshake between the OP and an OR is completed, both can
+ now calculate g^xy with ordinary DH. Before computing g^xy, both client
+ and server MUST verify that the received g^x or g^y value is not degenerate;
+ that is, it must be strictly greater than 1 and strictly less than p-1
+ where p is the DH modulus. Implementations MUST NOT complete a handshake
+ with degenerate keys. Implementations MUST NOT discard other "weak"
+ g^x values.
+
+ (Discarding degenerate keys is critical for security; if bad keys
+ are not discarded, an attacker can substitute the server's CREATED
+ cell's g^y with 0 or 1, thus creating a known g^xy and impersonating
+ the server. Discarding other keys may allow attacks to learn bits of
+ the private key.)
+
+ (The mainline Tor implementation, in the 0.1.1.x-alpha series, discarded
+ all g^x values less than 2^24, greater than p-2^24, or having more than
+ 1024-16 identical bits. This served no useful purpose, and we stopped.)
+
+ If CREATE or EXTEND is used to extend a circuit, the client and server
+ base their key material on K0=g^xy, represented as a big-endian unsigned
+ integer.
+
+ If CREATE_FAST is used, the client and server base their key material on
+ K0=X|Y.
+
+ From the base key material K0, they compute KEY_LEN*2+HASH_LEN*3 bytes of
+ derivative key data as
+ K = H(K0 | [00]) | H(K0 | [01]) | H(K0 | [02]) | ...
+
+ The first HASH_LEN bytes of K form KH; the next HASH_LEN form the forward
+ digest Df; the next HASH_LEN 41-60 form the backward digest Db; the next
+ KEY_LEN 61-76 form Kf, and the final KEY_LEN form Kb. Excess bytes from K
+ are discarded.
+
+ KH is used in the handshake response to demonstrate knowledge of the
+ computed shared key. Df is used to seed the integrity-checking hash
+ for the stream of data going from the OP to the OR, and Db seeds the
+ integrity-checking hash for the data stream from the OR to the OP. Kf
+ is used to encrypt the stream of data going from the OP to the OR, and
+ Kb is used to encrypt the stream of data going from the OR to the OP.
+
+5.3. Creating circuits
+
+ When creating a circuit through the network, the circuit creator
+ (OP) performs the following steps:
+
+ 1. Choose an onion router as an exit node (R_N), such that the onion
+ router's exit policy includes at least one pending stream that
+ needs a circuit (if there are any).
+
+ 2. Choose a chain of (N-1) onion routers
+ (R_1...R_N-1) to constitute the path, such that no router
+ appears in the path twice.
+
+ 3. If not already connected to the first router in the chain,
+ open a new connection to that router.
+
+ 4. Choose a circID not already in use on the connection with the
+ first router in the chain; send a CREATE cell along the
+ connection, to be received by the first onion router.
+
+ 5. Wait until a CREATED cell is received; finish the handshake
+ and extract the forward key Kf_1 and the backward key Kb_1.
+
+ 6. For each subsequent onion router R (R_2 through R_N), extend
+ the circuit to R.
+
+ To extend the circuit by a single onion router R_M, the OP performs
+ these steps:
+
+ 1. Create an onion skin, encrypted to R_M's public key.
+
+ 2. Send the onion skin in a relay EXTEND cell along
+ the circuit (see section 5).
+
+ 3. When a relay EXTENDED cell is received, verify KH, and
+ calculate the shared keys. The circuit is now extended.
+
+ When an onion router receives an EXTEND relay cell, it sends a CREATE
+ cell to the next onion router, with the enclosed onion skin as its
+ payload. The initiating onion router chooses some circID not yet
+ used on the connection between the two onion routers. (But see
+ section 5.1. above, concerning choosing circIDs based on
+ lexicographic order of nicknames.)
+
+ When an onion router receives a CREATE cell, if it already has a
+ circuit on the given connection with the given circID, it drops the
+ cell. Otherwise, after receiving the CREATE cell, it completes the
+ DH handshake, and replies with a CREATED cell. Upon receiving a
+ CREATED cell, an onion router packs it payload into an EXTENDED relay
+ cell (see section 5), and sends that cell up the circuit. Upon
+ receiving the EXTENDED relay cell, the OP can retrieve g^y.
+
+ (As an optimization, OR implementations may delay processing onions
+ until a break in traffic allows time to do so without harming
+ network latency too greatly.)
+
+5.4. Tearing down circuits
+
+ Circuits are torn down when an unrecoverable error occurs along
+ the circuit, or when all streams on a circuit are closed and the
+ circuit's intended lifetime is over. Circuits may be torn down
+ either completely or hop-by-hop.
+
+ To tear down a circuit completely, an OR or OP sends a DESTROY
+ cell to the adjacent nodes on that circuit, using the appropriate
+ direction's circID.
+
+ Upon receiving an outgoing DESTROY cell, an OR frees resources
+ associated with the corresponding circuit. If it's not the end of
+ the circuit, it sends a DESTROY cell for that circuit to the next OR
+ in the circuit. If the node is the end of the circuit, then it tears
+ down any associated edge connections (see section 6.1).
+
+ After a DESTROY cell has been processed, an OR ignores all data or
+ destroy cells for the corresponding circuit.
+
+ To tear down part of a circuit, the OP may send a RELAY_TRUNCATE cell
+ signaling a given OR (Stream ID zero). That OR sends a DESTROY
+ cell to the next node in the circuit, and replies to the OP with a
+ RELAY_TRUNCATED cell.
+
+ When an unrecoverable error occurs along one connection in a
+ circuit, the nodes on either side of the connection should, if they
+ are able, act as follows: the node closer to the OP should send a
+ RELAY_TRUNCATED cell towards the OP; the node farther from the OP
+ should send a DESTROY cell down the circuit.
+
+ The payload of a RELAY_TRUNCATED or DESTROY cell contains a single octet,
+ describing why the circuit is being closed or truncated. When sending a
+ TRUNCATED or DESTROY cell because of another TRUNCATED or DESTROY cell,
+ the error code should be propagated. The origin of a circuit always sets
+ this error code to 0, to avoid leaking its version.
+
+ The error codes are:
+ 0 -- NONE (No reason given.)
+ 1 -- PROTOCOL (Tor protocol violation.)
+ 2 -- INTERNAL (Internal error.)
+ 3 -- REQUESTED (A client sent a TRUNCATE command.)
+ 4 -- HIBERNATING (Not currently operating; trying to save bandwidth.)
+ 5 -- RESOURCELIMIT (Out of memory, sockets, or circuit IDs.)
+ 6 -- CONNECTFAILED (Unable to reach server.)
+ 7 -- OR_IDENTITY (Connected to server, but its OR identity was not
+ as expected.)
+ 8 -- OR_CONN_CLOSED (The OR connection that was carrying this circuit
+ died.)
+ 9 -- FINISHED (The circuit has expired for being dirty or old.)
+ 10 -- TIMEOUT (Circuit construction took too long)
+ 11 -- DESTROYED (The circuit was destroyed w/o client TRUNCATE)
+ 12 -- NOSUCHSERVICE (Request for unknown hidden service)
+
+ [Versions of Tor prior to 0.1.0.11 didn't send reasons; implementations
+ MUST accept empty TRUNCATED and DESTROY cells.]
+
+5.5. Routing relay cells
+
+ When an OR receives a RELAY cell, it checks the cell's circID and
+ determines whether it has a corresponding circuit along that
+ connection. If not, the OR drops the RELAY cell.
+
+ Otherwise, if the OR is not at the OP edge of the circuit (that is,
+ either an 'exit node' or a non-edge node), it de/encrypts the payload
+ with the stream cipher, as follows:
+ 'Forward' relay cell (same direction as CREATE):
+ Use Kf as key; decrypt.
+ 'Back' relay cell (opposite direction from CREATE):
+ Use Kb as key; encrypt.
+ Note that in counter mode, decrypt and encrypt are the same operation.
+
+ The OR then decides whether it recognizes the relay cell, by
+ inspecting the payload as described in section 6.1 below. If the OR
+ recognizes the cell, it processes the contents of the relay cell.
+ Otherwise, it passes the decrypted relay cell along the circuit if
+ the circuit continues. If the OR at the end of the circuit
+ encounters an unrecognized relay cell, an error has occurred: the OR
+ sends a DESTROY cell to tear down the circuit.
+
+ When a relay cell arrives at an OP, the OP decrypts the payload
+ with the stream cipher as follows:
+ OP receives data cell:
+ For I=N...1,
+ Decrypt with Kb_I. If the payload is recognized (see
+ section 6..1), then stop and process the payload.
+
+ For more information, see section 6 below.
+
+6. Application connections and stream management
+
+6.1. Relay cells
+
+ Within a circuit, the OP and the exit node use the contents of
+ RELAY packets to tunnel end-to-end commands and TCP connections
+ ("Streams") across circuits. End-to-end commands can be initiated
+ by either edge; streams are initiated by the OP.
+
+ The payload of each unencrypted RELAY cell consists of:
+ Relay command [1 byte]
+ 'Recognized' [2 bytes]
+ StreamID [2 bytes]
+ Digest [4 bytes]
+ Length [2 bytes]
+ Data [CELL_LEN-14 bytes]
+
+ The relay commands are:
+ 1 -- RELAY_BEGIN [forward]
+ 2 -- RELAY_DATA [forward or backward]
+ 3 -- RELAY_END [forward or backward]
+ 4 -- RELAY_CONNECTED [backward]
+ 5 -- RELAY_SENDME [forward or backward] [sometimes control]
+ 6 -- RELAY_EXTEND [forward] [control]
+ 7 -- RELAY_EXTENDED [backward] [control]
+ 8 -- RELAY_TRUNCATE [forward] [control]
+ 9 -- RELAY_TRUNCATED [backward] [control]
+ 10 -- RELAY_DROP [forward or backward] [control]
+ 11 -- RELAY_RESOLVE [forward]
+ 12 -- RELAY_RESOLVED [backward]
+ 13 -- RELAY_BEGIN_DIR [forward]
+
+ Commands labelled as "forward" must only be sent by the originator
+ of the circuit. Commands labelled as "backward" must only be sent by
+ other nodes in the circuit back to the originator. Commands marked
+ as either can be sent either by the originator or other nodes.
+
+ The 'recognized' field in any unencrypted relay payload is always set
+ to zero; the 'digest' field is computed as the first four bytes of
+ the running digest of all the bytes that have been destined for
+ this hop of the circuit or originated from this hop of the circuit,
+ seeded from Df or Db respectively (obtained in section 5.2 above),
+ and including this RELAY cell's entire payload (taken with the digest
+ field set to zero).
+
+ When the 'recognized' field of a RELAY cell is zero, and the digest
+ is correct, the cell is considered "recognized" for the purposes of
+ decryption (see section 5.5 above).
+
+ (The digest does not include any bytes from relay cells that do
+ not start or end at this hop of the circuit. That is, it does not
+ include forwarded data. Therefore if 'recognized' is zero but the
+ digest does not match, the running digest at that node should
+ not be updated, and the cell should be forwarded on.)
+
+ All RELAY cells pertaining to the same tunneled stream have the
+ same stream ID. StreamIDs are chosen arbitrarily by the OP. RELAY
+ cells that affect the entire circuit rather than a particular
+ stream use a StreamID of zero -- they are marked in the table above
+ as "[control]" style cells. (Sendme cells are marked as "sometimes
+ control" because they can take include a StreamID or not depending
+ on their purpose -- see Section 7.)
+
+ The 'Length' field of a relay cell contains the number of bytes in
+ the relay payload which contain real payload data. The remainder of
+ the payload is padded with NUL bytes.
+
+ If the RELAY cell is recognized but the relay command is not
+ understood, the cell must be dropped and ignored. Its contents
+ still count with respect to the digests, though. [Before
+ 0.1.1.10, Tor closed circuits when it received an unknown relay
+ command. Perhaps this will be more forward-compatible. -RD]
+
+6.2. Opening streams and transferring data
+
+ To open a new anonymized TCP connection, the OP chooses an open
+ circuit to an exit that may be able to connect to the destination
+ address, selects an arbitrary StreamID not yet used on that circuit,
+ and constructs a RELAY_BEGIN cell with a payload encoding the address
+ and port of the destination host. The payload format is:
+
+ ADDRESS | ':' | PORT | [00]
+
+ where ADDRESS can be a DNS hostname, or an IPv4 address in
+ dotted-quad format, or an IPv6 address surrounded by square brackets;
+ and where PORT is encoded in decimal.
+
+ [What is the [00] for? -NM]
+ [It's so the payload is easy to parse out with string funcs -RD]
+
+ Upon receiving this cell, the exit node resolves the address as
+ necessary, and opens a new TCP connection to the target port. If the
+ address cannot be resolved, or a connection can't be established, the
+ exit node replies with a RELAY_END cell. (See 6.4 below.)
+ Otherwise, the exit node replies with a RELAY_CONNECTED cell, whose
+ payload is in one of the following formats:
+ The IPv4 address to which the connection was made [4 octets]
+ A number of seconds (TTL) for which the address may be cached [4 octets]
+ or
+ Four zero-valued octets [4 octets]
+ An address type (6) [1 octet]
+ The IPv6 address to which the connection was made [16 octets]
+ A number of seconds (TTL) for which the address may be cached [4 octets]
+ [XXXX Versions of Tor before 0.1.1.6 ignore and do not generate the TTL
+ field. No version of Tor currently generates the IPv6 format.
+
+ Tor servers before 0.1.2.0 set the TTL field to a fixed value. Later
+ versions set the TTL to the last value seen from a DNS server, and expire
+ their own cached entries after a fixed interval. This prevents certain
+ attacks.]
+
+ The OP waits for a RELAY_CONNECTED cell before sending any data.
+ Once a connection has been established, the OP and exit node
+ package stream data in RELAY_DATA cells, and upon receiving such
+ cells, echo their contents to the corresponding TCP stream.
+ RELAY_DATA cells sent to unrecognized streams are dropped.
+
+ Relay RELAY_DROP cells are long-range dummies; upon receiving such
+ a cell, the OR or OP must drop it.
+
+6.2.1. Opening a directory stream
+
+ If a Tor server is a directory server, it should respond to a
+ RELAY_BEGIN_DIR cell as if it had received a BEGIN cell requesting a
+ connection to its directory port. RELAY_BEGIN_DIR cells ignore exit
+ policy, since the stream is local to the Tor process.
+
+ If the Tor server is not running a directory service, it should respond
+ with a REASON_NOTDIRECTORY RELAY_END cell.
+
+ Clients MUST generate an all-zero payload for RELAY_BEGIN_DIR cells,
+ and servers MUST ignore the payload.
+
+ [RELAY_BEGIN_DIR was not supported before Tor 0.1.2.2-alpha; clients
+ SHOULD NOT send it to routers running earlier versions of Tor.]
+
+6.3. Closing streams
+
+ When an anonymized TCP connection is closed, or an edge node
+ encounters error on any stream, it sends a 'RELAY_END' cell along the
+ circuit (if possible) and closes the TCP connection immediately. If
+ an edge node receives a 'RELAY_END' cell for any stream, it closes
+ the TCP connection completely, and sends nothing more along the
+ circuit for that stream.
+
+ The payload of a RELAY_END cell begins with a single 'reason' byte to
+ describe why the stream is closing, plus optional data (depending on
+ the reason.) The values are:
+
+ 1 -- REASON_MISC (catch-all for unlisted reasons)
+ 2 -- REASON_RESOLVEFAILED (couldn't look up hostname)
+ 3 -- REASON_CONNECTREFUSED (remote host refused connection) [*]
+ 4 -- REASON_EXITPOLICY (OR refuses to connect to host or port)
+ 5 -- REASON_DESTROY (Circuit is being destroyed)
+ 6 -- REASON_DONE (Anonymized TCP connection was closed)
+ 7 -- REASON_TIMEOUT (Connection timed out, or OR timed out
+ while connecting)
+ 8 -- (unallocated) [**]
+ 9 -- REASON_HIBERNATING (OR is temporarily hibernating)
+ 10 -- REASON_INTERNAL (Internal error at the OR)
+ 11 -- REASON_RESOURCELIMIT (OR has no resources to fulfill request)
+ 12 -- REASON_CONNRESET (Connection was unexpectedly reset)
+ 13 -- REASON_TORPROTOCOL (Sent when closing connection because of
+ Tor protocol violations.)
+ 14 -- REASON_NOTDIRECTORY (Client sent RELAY_BEGIN_DIR to a
+ non-directory server.)
+
+ (With REASON_EXITPOLICY, the 4-byte IPv4 address or 16-byte IPv6 address
+ forms the optional data; no other reason currently has extra data.
+ As of 0.1.1.6, the body also contains a 4-byte TTL.)
+
+ OPs and ORs MUST accept reasons not on the above list, since future
+ versions of Tor may provide more fine-grained reasons.
+
+ [*] Older versions of Tor also send this reason when connections are
+ reset.
+ [**] Due to a bug in versions of Tor through 0095, error reason 8 must
+ remain allocated until that version is obsolete.
+
+ --- [The rest of this section describes unimplemented functionality.]
+
+ Because TCP connections can be half-open, we follow an equivalent
+ to TCP's FIN/FIN-ACK/ACK protocol to close streams.
+
+ An exit connection can have a TCP stream in one of three states:
+ 'OPEN', 'DONE_PACKAGING', and 'DONE_DELIVERING'. For the purposes
+ of modeling transitions, we treat 'CLOSED' as a fourth state,
+ although connections in this state are not, in fact, tracked by the
+ onion router.
+
+ A stream begins in the 'OPEN' state. Upon receiving a 'FIN' from
+ the corresponding TCP connection, the edge node sends a 'RELAY_FIN'
+ cell along the circuit and changes its state to 'DONE_PACKAGING'.
+ Upon receiving a 'RELAY_FIN' cell, an edge node sends a 'FIN' to
+ the corresponding TCP connection (e.g., by calling
+ shutdown(SHUT_WR)) and changing its state to 'DONE_DELIVERING'.
+
+ When a stream in already in 'DONE_DELIVERING' receives a 'FIN', it
+ also sends a 'RELAY_FIN' along the circuit, and changes its state
+ to 'CLOSED'. When a stream already in 'DONE_PACKAGING' receives a
+ 'RELAY_FIN' cell, it sends a 'FIN' and changes its state to
+ 'CLOSED'.
+
+ If an edge node encounters an error on any stream, it sends a
+ 'RELAY_END' cell (if possible) and closes the stream immediately.
+
+6.4. Remote hostname lookup
+
+ To find the address associated with a hostname, the OP sends a
+ RELAY_RESOLVE cell containing the hostname to be resolved. (For a reverse
+ lookup, the OP sends a RELAY_RESOLVE cell containing an in-addr.arpa
+ address.) The OR replies with a RELAY_RESOLVED cell containing a status
+ byte, and any number of answers. Each answer is of the form:
+ Type (1 octet)
+ Length (1 octet)
+ Value (variable-width)
+ TTL (4 octets)
+ "Length" is the length of the Value field.
+ "Type" is one of:
+ 0x00 -- Hostname
+ 0x04 -- IPv4 address
+ 0x06 -- IPv6 address
+ 0xF0 -- Error, transient
+ 0xF1 -- Error, nontransient
+
+ If any answer has a type of 'Error', then no other answer may be given.
+
+ The RELAY_RESOLVE cell must use a nonzero, distinct streamID; the
+ corresponding RELAY_RESOLVED cell must use the same streamID. No stream
+ is actually created by the OR when resolving the name.
+
+7. Flow control
+
+7.1. Link throttling
+
+ Each node should do appropriate bandwidth throttling to keep its
+ user happy.
+
+ Communicants rely on TCP's default flow control to push back when they
+ stop reading.
+
+7.2. Link padding
+
+ Link padding can be created by sending PADDING cells along the
+ connection; relay cells of type "DROP" can be used for long-range
+ padding.
+
+ Currently nodes are not required to do any sort of link padding or
+ dummy traffic. Because strong attacks exist even with link padding,
+ and because link padding greatly increases the bandwidth requirements
+ for running a node, we plan to leave out link padding until this
+ tradeoff is better understood.
+
+7.3. Circuit-level flow control
+
+ To control a circuit's bandwidth usage, each OR keeps track of
+ two 'windows', consisting of how many RELAY_DATA cells it is
+ allowed to package for transmission, and how many RELAY_DATA cells
+ it is willing to deliver to streams outside the network.
+ Each 'window' value is initially set to 1000 data cells
+ in each direction (cells that are not data cells do not affect
+ the window). When an OR is willing to deliver more cells, it sends a
+ RELAY_SENDME cell towards the OP, with Stream ID zero. When an OR
+ receives a RELAY_SENDME cell with stream ID zero, it increments its
+ packaging window.
+
+ Each of these cells increments the corresponding window by 100.
+
+ The OP behaves identically, except that it must track a packaging
+ window and a delivery window for every OR in the circuit.
+
+ An OR or OP sends cells to increment its delivery window when the
+ corresponding window value falls under some threshold (900).
+
+ If a packaging window reaches 0, the OR or OP stops reading from
+ TCP connections for all streams on the corresponding circuit, and
+ sends no more RELAY_DATA cells until receiving a RELAY_SENDME cell.
+[this stuff is badly worded; copy in the tor-design section -RD]
+
+7.4. Stream-level flow control
+
+ Edge nodes use RELAY_SENDME cells to implement end-to-end flow
+ control for individual connections across circuits. Similarly to
+ circuit-level flow control, edge nodes begin with a window of cells
+ (500) per stream, and increment the window by a fixed value (50)
+ upon receiving a RELAY_SENDME cell. Edge nodes initiate RELAY_SENDME
+ cells when both a) the window is <= 450, and b) there are less than
+ ten cell payloads remaining to be flushed at that edge.
+
+
+A.1. Differences between spec and implementation
+
+- The current specification requires all ORs to have IPv4 addresses, but
+ allows servers to exit and resolve to IPv6 addresses, and to declare IPv6
+ addresses in their exit policies. The current codebase has no IPv6
+ support at all.
+
+B. Things that should change in a later version of the Tor protocol
+
+B.1. ... but which will require backward-incompatible change
+
+ - Circuit IDs should be longer.
+ - IPv6 everywhere.
+ - Maybe, keys should be longer.
+ - Maybe, key-length should be adjustable. How to do this without
+ making anonymity suck?
+ - Drop backward compatibility.
+ - We should use a 128-bit subgroup of our DH prime.
+ - Handshake should use HMAC.
+ - Multiple cell lengths.
+ - Ability to split circuits across paths (If this is useful.)
+ - SENDME windows should be dynamic.
+
+ - Directory
+ - Stop ever mentioning socks ports
+
+B.1. ... and that will require no changes
+
+ - Mention multiple addr/port combos
+ - Advertised outbound IP?
+ - Migrate streams across circuits.
+
+B.2. ... and that we have no idea how to do.
+
+ - UDP (as transport)
+ - UDP (as content)
+ - Use a better AES mode that has built-in integrity checking,
+ doesn't grow with the number of hops, is not patented, and
+ is implemented and maintained by smart people.
+
Property changes on: tor/trunk/doc/tor-spec-v2.txt
___________________________________________________________________
Name: svn:keywords
+ Author Date Id Revision
Name: svn:eol-style
+ native
Modified: tor/trunk/doc/tor-spec.txt
===================================================================
--- tor/trunk/doc/tor-spec.txt 2006-12-31 06:18:16 UTC (rev 9225)
+++ tor/trunk/doc/tor-spec.txt 2006-12-31 19:31:45 UTC (rev 9226)
@@ -5,38 +5,14 @@
Roger Dingledine
Nick Mathewson
-Note: This document aims to specify Tor as implemented in 0.1.2.1-alpha-dev
-and later. Future versions of Tor will implement improved protocols, and
+Note: This document aims to specify Tor as implemented in 0.1.2.x
+and earlier. Future versions of Tor may implement improved protocols, and
compatibility is not guaranteed.
-THIS DOCUMENT IS UNSTABLE. Right now, we're revising the protocol to remove
-a few long-standing limitations. For the most stable current version of the
-protocol, see tor-spec-v0.txt; current versions of Tor are backward-compatible.
-
This specification is not a design document; most design criteria
are not examined. For more information on why Tor acts as it does,
see tor-design.pdf.
-TODO for v2 revision:
- - Fix onionskin handshake scheme to be more mainstream, less nutty.
- Can we just do
- E(HMAC(g^x), g^x) rather than just E(g^x) ?
- No, that has the same flaws as before. We should send
- E(g^x, C) with random C and expect g^y, HMAC_C(K=g^xy).
- Better ask Ian; probably Stephen too.
- - Versioned CREATE and friends
- - Length on CREATE and friends
- - Versioning on circuits
- - Versioning on create cells
- - SHA1 is showing its age
- - Not being able to upgrade ciphersuites or increase key lengths is
- lame.
-
-TODO:
- - REASON_CONNECTFAILED should include an IP.
- - Copy prose from tor-design to make everything more readable.
- - Spec when we should rotate which keys (tls, link, etc)?
-
0. Preliminaries
0.1. Notation and encoding
@@ -82,9 +58,9 @@
0 bytes.
For a public-key cipher, we use RSA with 1024-bit keys and a fixed
- exponent of 65537. We use OAEP padding, with SHA-1 as its digest
- function. (For OAEP padding, see
- ftp://ftp.rsasecurity.com/pub/pkcs/pkcs-1/pkcs-1v2-1.pdf)
+ exponent of 65537. We use OAEP-MGF1 padding, with SHA-1 as its digest
+ function. We leave optional the "Label" parameter unset. (For OAEP
+ padding, see ftp://ftp.rsasecurity.com/pub/pkcs/pkcs-1/pkcs-1v2-1.pdf)
For Diffie-Hellman, we use a generator (g) of 2. For the modulus (p), we
use the 1024-bit safe prime from rfc2409 section 6.2 whose hex
@@ -100,6 +76,9 @@
320 bits. Implementations that do this MUST never use any DH key more
than once.
[May other implementations reuse their DH keys?? -RD]
+ [Probably not. Conceivably, you could get away with changing DH keys once
+ per second, but there are too many oddball attacks for me to be
+ comfortable that this is safe. -NM]
For a hash function, we use SHA-1.
@@ -143,21 +122,6 @@
``cells'', which are unwrapped by a symmetric key at each node (like
the layers of an onion) and relayed downstream.
-1.1. Protocol Versioning
-
- The node-to-node TLS-based "OR connection" protocol and the multi-hop
- "circuit" protocol are versioned quasi-independently. (Certain versions
- of the circuit protocol may require a minimum version of the connection
- protocol to be used.)
-
- Version numbers are incremented for backward-incompatible protocol changes
- only. Backward-compatible changes are generally implemented by adding
- additional fields to existing structures; implementations MUST ignore
- fields they do not expect.
-
- Parties negotiate OR connection versions as described below in sections
- 4.1 and 4.2.
-
2. Connections
Tor uses TLS for link encryption. All implementations MUST support
@@ -181,8 +145,8 @@
certificate is the OR's nickname, followed by a space and the string
"<identity>".
- Implementations running 0.2.1.0-alpha-dev and earlier used an
- organizationName of "Tor" or "TOR". Current implementations (which
+ Implementations running Protocol 1 and earlier use an
+ organizationName of "Tor" or "TOR". Future implementations (which
support the version negotiation protocol in section 4.1) MUST NOT
have either of these values for their organizationName.
@@ -215,19 +179,13 @@
The basic unit of communication for onion routers and onion
proxies is a fixed-width "cell".
- On a version 0 connection, each cell contains the following
+ On a version 1 connection, each cell contains the following
fields:
CircID [2 bytes]
Command [1 byte]
Payload (padded with 0 bytes) [PAYLOAD_LEN bytes]
- On a version 1 connection, each cell contains the following fields:
-
- CircID [3 bytes]
- Command [1 byte]
- Payload (padded with 0 bytes) [PAYLOAD_LEN bytes]
-
The CircID field determines which circuit, if any, the cell is
associated with.
@@ -239,8 +197,6 @@
4 -- DESTROY (Stop using a circuit) (See Sec 5.4)
5 -- CREATE_FAST (Create a circuit, no PK) (See Sec 5.1)
6 -- CREATED_FAST (Circuit created, no PK) (See Sec 5.1)
- 7 -- VERSIONS (Negotiate versions) (See Sec 4.1)
- 8 -- NETINFO (Time and MITM-prevention) (See Sec 4.2)
The interpretation of 'Payload' depends on the type of the cell.
PADDING: Payload is unused.
@@ -265,72 +221,8 @@
RELAY cells are used to send commands and data along a circuit; see
section 5 below.
- VERSIONS cells are used to introduce parameters and characteristics of
- Tor clients and servers when connections are established.
+4. [This section deliberately left blank.]
-4, Connection management
-
- Upon establishing a TLS connection, both parties immediately begin
- negotiating a connection protocol version and other connection parameters.
-
-4.1. VERSIONS cells
-
- When a Tor connection is established, both parties normally send a
- VERSIONS cell before sending any other cells. (But see below.)
-
- NumVersions [1 byte]
- Versions [NumVersions bytes]
-
- "Versions" is a sequence of NumVersions link connection protocol versions,
- each one byte long. Parties should list all of the versions which they
- are able and willing to support. Parties can only communicate if they
- have some connection protocol version in common.
-
- Version 0.1.x.y-alpha and earlier don't understand VERSIONS cells,
- and therefore don't support version negotiation. Thus, waiting until
- the other side has sent a VERSIONS cell won't work for these servers:
- if they send no cells back, it is impossible to tell whether they
- have sent a VERSIONS cell that has been stalled, or whether they have
- dropped our own VERSIONS cell as unrecognized. Thus, immediately after
- a TLS connection has been established, the parties check whether the
- other side has an obsolete certificate (organizationName equal to "Tor"
- or "TOR"). If the other party presented an obsolete certificate,
- we assume a v0 connection. Otherwise, both parties send VERSIONS
- cells listing all their supported versions. Upon receiving the
- other party's VERSIONS cell, the implementation begins using the
- highest-valued version common to both cells. If the first cell from
- the other party is _not_ a VERSIONS cell, we assume a v0 protocol.
-
- Implementations MUST discard cells that are not the first cells sent on a
- connection.
-
-4.2. MITM-prevention and time checking
-
- If we negotiate a v1 connection or higher, the first cell we send SHOULD
- be a NETINFO cell. Implementations SHOULD NOT send NETINFO cells at other
- times.
-
- A NETINFO cell contains:
- Timestamp [4 bytes]
- This OR's address [variable]
- Other OR's address [variable]
-
- Timestamp is the OR's current Unix time, in seconds since the epoch. If
- an implementation receives time values from many validated ORs that
- indicate that its clock is skewed, it SHOULD try to warn the
- administrator.
-
- Each address contains Type/Length/Value as used in Section 6.4. The first
- address is the address of the interface the party sending the VERSIONS cell
- used to connect to or accept connections from the other -- we include it
- to block a man-in-the-middle attack on TLS that lets an attacker bounce
- traffic through his own computers to enable timing and packet-counting
- attacks.
-
- The second address is the one that the party sending the VERSIONS cell
- believes the other has -- it can be used to learn what your IP address
- is if you have no other hints.
-
5. Circuit management
5.1. CREATE and CREATED cells
@@ -904,36 +796,3 @@
addresses in their exit policies. The current codebase has no IPv6
support at all.
-B. Things that should change in a later version of the Tor protocol
-
-B.1. ... but which will require backward-incompatible change
-
- - Circuit IDs should be longer.
- - IPv6 everywhere.
- - Maybe, keys should be longer.
- - Maybe, key-length should be adjustable. How to do this without
- making anonymity suck?
- - Drop backward compatibility.
- - We should use a 128-bit subgroup of our DH prime.
- - Handshake should use HMAC.
- - Multiple cell lengths.
- - Ability to split circuits across paths (If this is useful.)
- - SENDME windows should be dynamic.
-
- - Directory
- - Stop ever mentioning socks ports
-
-B.1. ... and that will require no changes
-
- - Mention multiple addr/port combos
- - Advertised outbound IP?
- - Migrate streams across circuits.
-
-B.2. ... and that we have no idea how to do.
-
- - UDP (as transport)
- - UDP (as content)
- - Use a better AES mode that has built-in integrity checking,
- doesn't grow with the number of hops, is not patented, and
- is implemented and maintained by smart people.
-