
[freehaven-cvs] rip out most of the design sec, integrate some of or...



Update of /home/freehaven/cvsroot/doc/rta04
In directory moria.mit.edu:/home2/arma/work/freehaven/doc/rta04

Modified Files:
	nato-rta04.tex 
Log Message:
rip out most of the design sec, integrate some of original intro


Index: nato-rta04.tex
===================================================================
RCS file: /home/freehaven/cvsroot/doc/rta04/nato-rta04.tex,v
retrieving revision 1.4
retrieving revision 1.5
diff -u -d -r1.4 -r1.5
--- nato-rta04.tex	8 Jan 2004 05:46:12 -0000	1.4
+++ nato-rta04.tex	8 Jan 2004 06:39:15 -0000	1.5
@@ -177,9 +177,11 @@
 comparatively inexpensive and typically requires only symmetric
 encryption.  Because a circuit crosses several servers, and each
 server only knows the adjacent servers in the circuit, no single
-server can link a user to her communication partners.  There have been
-many circuit-based designs, making a variety of design choices; we again
-refer the reader to \cite{tor-design} for more information.
+server can link a user to her communication partners.
+
+There are many other circuit-based designs, that make a variety of
+design choices; we again refer the reader to \cite{tor-design} for
+more information.
 
 \section{Design goals and assumptions}
 \label{sec:assumptions}
@@ -196,7 +198,7 @@
 were DoD users, then traffic patterns of individuals, enclaves, and
 commands might be protected. However, any traffic emerging from the
 Onion Routing network to the Internet would still be recognized as coming
-from the DoD, since the network would only carry DoD traffic.  
+from the DoD, since the network would only carry DoD traffic.
 Therefore, it is necessary that the Onion Routing
 network carry traffic of a broader class of users. Similarly, having
 onion routers run by diverse entities, including nonmilitary entities
@@ -332,227 +334,103 @@
 
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 
-\section{Overview of the Tor Design}
+\section{Highlights of the Tor Design}
 \label{sec:design}
 
-The Tor network is an overlay network; each onion router (OR) 
-runs as a normal
-user-level process without any special privileges.
-Each onion router maintains a long-term TLS \cite{TLS}
-connection to every other onion router.
-%(We discuss alternatives to this clique-topology assumption in
-%Section~\ref{sec:maintaining-anonymity}.)
-% A subset of the ORs also act as
-%directory servers, tracking which routers are in the network;
-%see Section~\ref{subsec:dirservers} for directory server details.
-Each user
-runs local software called an onion proxy (OP) to fetch directories,
-establish circuits across the network,
-and handle connections from user applications.  These onion proxies accept
-TCP streams and multiplex them across the circuits. The onion
-router on the other side 
-of the circuit connects to the destinations of
-the TCP streams and relays data.
-
-Each onion router uses three public keys: a long-term identity key, a
-short-term onion key, and a short-term link key.  The identity
-key is used to sign TLS certificates, to sign the OR's \emph{router
-descriptor} (a summary of its keys, address, bandwidth, exit policy,
-and so on), and (by directory servers) to sign directories. Changing
-the identity key of a router is considered equivalent to creating a
-new router. The onion key is used to decrypt requests
-from users to set up a circuit and negotiate ephemeral keys. Finally,
-link keys are used by the TLS protocol when communicating between
-onion routers. Each short-term key is rotated periodically and
-independently, to limit the impact of key compromise.
-
-Section~\ref{subsec:cells} presents the fixed-size
-\emph{cells} that are the unit of communication in Tor. We describe
-in Section~\ref{subsec:circuits} how circuits are
-built, extended, truncated, and destroyed. Section~\ref{subsec:tcp}
-describes how TCP streams are routed through the network.  We address
-integrity checking in Section~\ref{subsec:integrity-checking},
-and resource limiting in Section~\ref{subsec:rate-limit}.
-Finally,
-Section~\ref{subsec:congestion} talks about congestion control and
-fairness issues.
-
-\subsection{Cells}
-\label{subsec:cells}
-
-Onion routers communicate with one another, and with users' OPs, via
-TLS connections with ephemeral keys.  Using TLS conceals the data on
-the connection with perfect forward secrecy, and prevents an attacker
-from modifying data on the wire or impersonating an OR.
+The Tor network is an overlay network; each onion router (OR) runs as
+a normal user-level process without any special privileges. Each onion
+router maintains a long-term TLS \cite{TLS} connection to every other
+onion router. Using TLS conceals the data on the connection with perfect
+forward secrecy, and prevents an attacker from modifying data on the wire
+or impersonating an OR. Each user runs local software called an onion
+proxy (OP) to fetch directories, establish circuits across the network,
+and handle connections from user applications. These onion proxies accept
+TCP streams and multiplex them across the circuits. The onion router on
+the other side of the circuit connects to the destinations of the TCP
+streams and relays data.
 
-Traffic passes along these connections in fixed-size cells.  Each cell
-is 256 bytes (but see Section~\ref{sec:conclusion} for a discussion of
-allowing large cells and small cells on the same network), and
-consists of a header and a payload. The header includes a circuit
-identifier (circID) that specifies which circuit the cell refers to
-(many circuits can be multiplexed over the single TLS connection), and
-a command to describe what to do with the cell's payload.  (Circuit
+Traffic passes along these connections in fixed-size cells.  Each cell is
+512 bytes, and consists of a header and a payload. The header includes
+a circuit identifier (circID) that specifies which circuit the cell
+refers to (many circuits can be multiplexed over each TLS connection),
+and a command to describe what to do with the cell's payload. (Circuit
 identifiers are connection-specific: each single circuit has a different
-circID on each OP/OR or OR/OR connection it traverses.)
-Based on their command, cells are either \emph{control} cells, which are
-always interpreted by the node that receives them, or \emph{relay} cells,
-which carry end-to-end stream data.   The control cell commands are:
-\emph{padding} (currently used for keepalive, but also usable for link
-padding); \emph{create} or \emph{created} (used to set up a new circuit);
-and \emph{destroy} (to tear down a circuit).
-
-Relay cells have an additional header (the relay header) after the
-cell header, containing a stream identifier (many streams can
-be multiplexed over a circuit); an end-to-end checksum for integrity
-checking; the length of the relay payload; and a relay command.  
-The entire contents of the relay header and the relay cell payload 
-are encrypted or decrypted together as the relay cell moves along the
-circuit, using the 128-bit AES cipher in counter mode to generate a
-cipher stream.
-The
-relay commands are: \emph{relay
-data} (for data flowing down the stream), \emph{relay begin} (to open a
-stream), \emph{relay end} (to close a stream cleanly), \emph{relay
-teardown} (to close a broken stream), \emph{relay connected}
-(to notify the OP that a relay begin has succeeded), \emph{relay
-extend} and \emph{relay extended} (to extend the circuit by a hop,
-and to acknowledge), \emph{relay truncate} and \emph{relay truncated}
-(to tear down only part of the circuit, and to acknowledge), \emph{relay
-sendme} (used for congestion control), and \emph{relay drop} (used to
-implement long-range dummies).
+circID on each OP/OR or OR/OR connection it traverses.) Based on their
+command, cells are either \emph{control} cells, which are always
+interpreted by the node that receives them, or \emph{relay} cells,
+which carry end-to-end stream data.
 
+Relay cells have an additional header (the relay header) after the cell
+header, containing a stream identifier (many streams can be multiplexed
+over a circuit); an end-to-end checksum for integrity checking; the
+length of the relay payload; and a relay command.  The entire contents of
+the relay header and the relay cell payload are encrypted or decrypted
+together as the relay cell moves along the circuit, using the 128-bit
+AES cipher in counter mode to generate a cipher stream.
 
-\subsection{Circuits and streams}
-\label{subsec:circuits}
+In Tor, just as each connection can be shared by many circuits, each
+circuit can be shared by many application-level TCP streams. To avoid
+delays, users construct circuits preemptively. To limit linkability
+among their streams, users' OPs build a new circuit periodically if
+the previous one has been used, and expire old used circuits that no
+longer have any open streams. OPs consider making a new circuit once a
+minute: thus even heavy users spend negligible time building circuits,
+but a limited number of requests can be linked to each other through
+a given exit node. Also, because circuits are built in the background,
+OPs can recover from failed circuit creation without delaying streams
+(which would harm user experience).
 
-Onion Routing originally built one circuit for each
-TCP stream.  Because building a circuit can take several tenths of a
-second (due to public-key cryptography and network latency),
-this design imposed high costs on applications like web browsing that
-open many TCP streams.
+The full Tor design paper \cite{tor-design} describes the Onion Routing
+protocol in detail; we highlight a few of its properties here:
 
-In Tor, each circuit can be shared by many TCP streams.  To avoid
-delays, users construct circuits preemptively.  To limit linkability
-among their streams, users' OPs build a new circuit
-periodically if the previous one has been used,
-and expire old used circuits that no longer have any open streams.
-OPs consider making a new circuit once a minute: thus
-even heavy users spend negligible time
-building circuits, but a limited number of requests can be linked
-to each other through a given exit node. Also, because circuits are built
-in the background, OPs can recover from failed circuit creation
-without delaying streams and thereby harming user experience.\\
+\begin{itemize}
 
-\noindent{\large\bf Constructing a circuit}
-\label{subsubsec:constructing-a-circuit}\\
-%\subsubsection{Constructing a circuit}
-A user's OP constructs circuits incrementally, negotiating a
-symmetric key with each OR on the circuit, one hop at a time. To begin
-creating a new circuit, the OP (call her Alice) sends a
-\emph{create} cell to the first node in her chosen path (call him Bob).  
-(She chooses a new
-circID $C_{AB}$ not currently used on the connection from her to Bob.)
-The \emph{create} cell's
-payload contains the first half of the Diffie-Hellman handshake
-($g^x$), encrypted to the onion key of the OR (call him Bob). Bob
-responds with a \emph{created} cell containing the second half of the
-DH handshake, along with a hash of the negotiated key $K=g^{xy}$.
+\item {\bf{Perfect forward secrecy:}} Onion Routing was originally vulnerable
+to a single hostile node recording traffic and later compromising
+successive nodes in the circuit and forcing them to decrypt it. Rather
+than using a single multiply encrypted data structure (an \emph{onion})
+to lay each circuit, Tor now uses an incremental or \emph{telescoping}
+path-building design, where the initiator negotiates session keys
+with each successive hop in the circuit.  Once these keys are deleted,
+subsequently compromised nodes cannot decrypt old traffic. As a side
+benefit, onion replay detection is no longer necessary, and the process
+of building circuits is more reliable, since the initiator knows when
+a hop fails and can then try extending to a new node.
 
-Once the circuit has been established, Alice and Bob can send one
-another relay cells encrypted with the negotiated
-key.\footnote{Actually, the negotiated key is used to derive two
-  symmetric keys: one for each direction.}  More detail is given in
-\cite{tor-design}.\\
+\item {\bf{Leaky-pipe circuit topology:}} Through in-band signaling within
+the circuit, Tor initiators can direct traffic to nodes partway down the
+circuit. This novel approach allows traffic to exit the circuit from the
+middle---possibly frustrating traffic shape and volume attacks based on
+observing the end of the circuit. (It also allows for long-range padding
+if future research shows this to be worthwhile.)
 
+\item {\bf{End-to-end integrity checking:}} The original Onion Routing design
+did no integrity checking on data. Any node on the circuit could change
+the contents of data cells as they passed by---for example, to alter a
+connection request so it would connect to a different webserver, or to
+`tag' encrypted traffic and look for corresponding corrupted traffic at
+the network edges \cite{minion-design}.  Tor hampers these attacks by
+checking data integrity before it leaves the network.
 
-\noindent{\large\bf Relay cells}\\
-%\subsubsection{Relay cells}
-%
-Once Alice has established the circuit (so she shares keys with each
-OR on the circuit), she can send relay cells.  Recall that every relay
-cell has a streamID that indicates to which
-stream the cell belongs.  This streamID allows a relay cell to be
-addressed to any OR on the circuit.  Upon receiving a relay
-cell, an OR looks up the corresponding circuit, and decrypts the relay
-header and payload with the session key for that circuit.
-If the cell is headed downstream (away from Alice) the OR then checks
-whether the decrypted streamID is recognized---either because it
-corresponds to an open stream at this OR for the given circuit, or because
-it is the control streamID (zero).  If the OR recognizes the
-streamID, it accepts the relay cell and processes it as described
-below.  Otherwise, 
-the OR looks up the circID and OR for the
-next step in the circuit, replaces the circID as appropriate, and
-sends the decrypted relay cell to the next OR.  (If the OR at the end
-of the circuit receives an unrecognized relay cell, an error has
-occurred, and the cell is discarded.)
-\\ \\
-\noindent{\large\bf Opening and closing streams}\\
-\label{subsec:tcp}
-When Alice's application wants a TCP connection to a given
-address and port, it asks the OP (via SOCKS) to make the
-connection. The OP chooses the newest open circuit (or creates one if
-none is available), and chooses a suitable OR on that circuit to be the
-exit node (usually the last node, but maybe others due to exit policy
-conflicts; see Section~\ref{subsec:exitpolicies}.) The OP then opens
-the stream by sending a \emph{relay begin} cell to the exit node,
-using a streamID of zero (so the OR will recognize it), containing as
-its relay payload a new randomly generated streamID, the destination
-address, and the destination port.  Once the
-exit node completes the connection to the remote host, it responds
-with a \emph{relay connected} cell.  Upon receipt, the OP sends a
-SOCKS reply to notify the application of its success. The OP
-now accepts data from the application's TCP stream, packaging it into
-\emph{relay data} cells and sending those cells along the circuit to
-the chosen OR.
-\\
-\noindent{\large\bf Integrity checking on circuits}
-\label{subsec:integrity-checking}
-\\
-Because the old Onion Routing design used a stream cipher, traffic was
-vulnerable to a malleability attack: though the attacker could not
-decrypt cells, any changes to encrypted data
-would create corresponding changes to the data leaving the network.
-(Even an external adversary could do this, despite link encryption, by
-inverting bits on the wire.)
+\item {\bf{Improved robustness to failed nodes:}} A failed node in the
+old design meant that circuit building failed, but thanks to Tor's
+step-by-step circuit building, users notice failed nodes while building
+circuits and route around them. Additionally, liveness information from
+directories allows users to avoid unreliable nodes in the first place.
 
-This weakness allowed an adversary to change a padding cell to a destroy
-cell; change the destination address in a \emph{relay begin} cell to the
-adversary's webserver; or change an FTP command from
-{\tt dir} to {\tt rm~*}. Any OR or external adversary
-along the circuit could introduce such corruption in a stream, if it
-knew or could guess the encrypted content.
+\item {\bf{Congestion control:}} Even with bandwidth rate limiting, we still
+need to worry about congestion, either accidental or intentional. If
+enough users choose the same OR-to-OR connection for their circuits, that
+connection can become saturated. For example, an attacker could send a
+large file through the Tor network to a webserver he runs, and then refuse
+to read any of the bytes at the webserver end of the circuit. Without some
+congestion control mechanism, these bottlenecks can propagate back through
+the entire network. We don't need to reimplement full TCP windows (with
+sequence numbers, the ability to drop cells when we're full and retransmit
+later, and so on), because TCP already guarantees in-order delivery of
+each cell. Tor provides both circuit and stream level throttling.
 
-Tor prevents external adversaries from mounting this attack by
-using TLS on its links, which provides integrity checking.
-Addressing the insider malleability attack, however, is
-more complex. Detail is given in \cite{tor-design}.
-\\ \\
-\noindent{\large\bf Rate limiting and fairness}
-\label{subsec:rate-limit}
-\\
-Volunteers are generally more willing to run services that can limit
-their own bandwidth usage. To accommodate them, Tor servers use a
-token bucket approach \cite{tannenbaum96} to 
-enforce a long-term average rate of incoming bytes, while still
-permitting short-term bursts above the allowed bandwidth. Current bucket
-sizes are set to ten seconds' worth of traffic.
-\\ \\
-\noindent{\large\bf Congestion control}
-\label{subsec:congestion}
-\\
-Even with bandwidth rate limiting, we still need to worry about
-congestion, either accidental or intentional. If enough users choose
-the same OR-to-OR connection for their circuits, that connection can
-become saturated. For example, an attacker could send a large file
-through the Tor network to a webserver he runs, and then refuse to
-read any of the bytes at the webserver end of the circuit. Without
-some congestion control mechanism, these bottlenecks can propagate
-back through the entire network. We don't need to reimplement full TCP
-windows (with sequence numbers, the ability to drop cells when we're
-full and retransmit later, and so on), because TCP already guarantees
-in-order delivery of each cell. Tor provides both circuit and stream
-level throttling. See \cite{tor-design} for more details.
+\end{itemize}
 
 \section{Other design decisions}
 

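Illustrative note on the cell framing described in the revised
"Highlights of the Tor Design" text above: the sketch below packs and
unpacks a fixed-size 512-byte cell made of a header (circID plus
command) and a payload. It is only a sketch under stated assumptions,
not the Tor implementation. The field widths (2-byte circID, 1-byte
command), the numeric command values, and the names pack_cell and
unpack_cell are all hypothetical; the diff gives only the total cell
size and the names of the header fields.

```python
import struct

CELL_LEN = 512                 # cell size stated in the revised text
HEADER_FMT = ">HB"             # ASSUMPTION: 2-byte circID, 1-byte command
HEADER_LEN = struct.calcsize(HEADER_FMT)
PAYLOAD_LEN = CELL_LEN - HEADER_LEN

# ASSUMPTION: numeric command values are invented for illustration.
CMD_PADDING = 0
CMD_RELAY = 3


def pack_cell(circ_id, command, payload=b""):
    """Build one fixed-size cell: header, then zero-padded payload."""
    if len(payload) > PAYLOAD_LEN:
        raise ValueError("payload too large for one cell")
    header = struct.pack(HEADER_FMT, circ_id, command)
    return header + payload.ljust(PAYLOAD_LEN, b"\x00")


def unpack_cell(cell):
    """Split a cell into (circ_id, command, padded_payload)."""
    if len(cell) != CELL_LEN:
        raise ValueError("cells must be exactly %d bytes" % CELL_LEN)
    circ_id, command = struct.unpack_from(HEADER_FMT, cell)
    return circ_id, command, cell[HEADER_LEN:]


if __name__ == "__main__":
    cell = pack_cell(7, CMD_RELAY, b"hello")
    circ_id, command, payload = unpack_cell(cell)
    print(len(cell), circ_id, command, payload[:5])
```

Because every cell is padded to the same length, the returned payload
still includes the padding. In the design described above, the relay
header carries the payload length, so a receiver would trim the padding
only after decrypting and parsing that header.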