[or-cvs] First cut at HACKING document



Update of /home/or/cvsroot/doc
In directory moria.mit.edu:/tmp/cvs-serv24603

Modified Files:
	HACKING 
Log Message:
First cut at HACKING document

Index: HACKING
===================================================================
RCS file: /home/or/cvsroot/doc/HACKING,v
retrieving revision 1.3
retrieving revision 1.4
diff -u -d -r1.3 -r1.4
--- HACKING	3 Oct 2003 19:37:38 -0000	1.3
+++ HACKING	9 Oct 2003 02:05:13 -0000	1.4
@@ -1,11 +1,418 @@
+			 Guide to Hacking Tor
 
-0. Intro.
-Onion Routing is still very much in development stages. This document
-aims to get you started in the right direction if you want to understand
-the code, add features, fix bugs, etc.
+(As of 8 October 2003, this was all accurate.  If you're reading this in
+the distant future, stuff may have changed.)
 
-Read the README file first, so you can get familiar with the basics.
+0. Intro and required reading
+
+  Onion Routing is still very much in development stages. This document
+  aims to get you started in the right direction if you want to understand
+  the code, add features, fix bugs, etc.
+
+  Read the README file first, so you can get familiar with the basics of
+  installing and running an onion router.
+
+  Then, skim some of the introductory materials in tor-spec.txt,
+  tor-design.tex, and the Tor FAQ to learn more about how the Tor protocol
+  is supposed to work.  This document will assume you know about Cells,
+  Circuits, Streams, Connections, Onion Routers, and Onion Proxies.
+
+1. Code organization
+
+1.1. The modules
+
+  The code is divided into two directories: ./src/common and ./src/or.
+  The "common" directory contains general purpose utility functions not
+  specific to onion routing.  The "or" directory implements all
+  onion-routing and onion-proxy specific functionality.
+
+  Files in ./src/common:
+
+     aes.[ch] -- Implements the AES cipher (with 128-bit keys and blocks),
+        and a counter-mode stream cipher on top of AES.  This code is
+        taken from the main Rijndael distribution.  (We include this
+        because many people are running older versions of OpenSSL without
+        AES support.)
+
+     crypto.[ch] -- Wrapper functions to present a consistent interface to
+        public-key and symmetric cryptography operations from OpenSSL.
+
+     fakepoll.[ch] -- Used on systems that don't have a poll() system call;
+        reimplements poll() using the select() system call.
+
+     log.[ch] -- Tor's logging subsystem.
+
+     test.h -- Macros used by unit tests.
+
+     torint.h -- Provides missing [u]int*_t types for environments that
+        don't have stdint.h.
+
+     tortls.[ch] -- Wrapper functions to present a consistent interface to
+        TLS, SSL, and X.509 functions from OpenSSL.
+
+     util.[ch] -- Miscellaneous portability and convenience functions.
+
+  Files in ./src/or:
+  
+   [General-purpose modules]
+
+     or.h -- Common header file: includes everything, defines everything.
+
+     buffers.c -- Implements a generic buffer interface.  Buffers are
+        fairly opaque string holders that can be filled from or flushed
+        to memory, file descriptors, or TLS connections.
+
+        Also implements parsing functions to read HTTP and SOCKS commands
+        from buffers.
+
+     tree.h -- A splay tree implementation by Niels Provos.  Used only by
+        dns.c.
+
+     config.c -- Code to parse and validate the configuration file.
+
+   [Background processing modules]
+
+     cpuworker.c -- Implements a separate 'CPU worker' process to perform
+        CPU-intensive tasks in the background, so as not to interrupt the
+        onion router.  (OR only)
+
+     dns.c -- Implements a farm of 'DNS worker' processes to perform DNS
+        lookups for onion routers and cache the results.  [This needs to
+        be done in the background because of the lack of a good,
+        ubiquitous asynchronous DNS implementation.] (OR only)
+
+   [Directory-related functionality.]
+
+     directory.c -- Code to send and fetch directories and router
+        descriptors via HTTP.  Directories use dirserv.c to generate the
+        results; clients use routers.c to parse them.
+
+     dirserv.c -- Code to manage directory contents and generate
+        directories. [Directory only] 
+
+     routers.c -- Code to parse directories and router descriptors; and to
+        generate a router descriptor corresponding to this OR's
+        capabilities.  Also presents some high-level interfaces for
+        managing an OR or OP's view of the directory.
+
+   [Circuit-related modules.]
+
+     circuit.c -- Code to create circuits, manage circuits, and route
+        relay cells along circuits.
+
+     onion.c -- Code to generate and respond to "onion skins".
+
+   [Core protocol implementation.]
 
+     connection.c -- Code used in common by all connection types.  See
+        1.2. below for more general information about connections.
+
+     connection_edge.c -- Code used only by edge connections.
+
+     command.c -- Code to handle specific cell types. [OR only]
+
+     connection_or.c -- Code to implement cell-speaking connections.
+
+   [Toplevel modules.]
+
+     main.c -- Toplevel module.  Initializes keys, handles signals,
+        multiplexes between connections, implements main loop, and drives
+        scheduled events.
+
+     tor_main.c -- Stub module containing a main() function.  Allows unit
+        test binary to link against main.c.
+
+   [Unit tests]
+
+     test.c -- Contains unit tests for many pieces of the lower level Tor
+        modules.
+
+1.2. All about connections
+
+  All sockets in Tor are handled as different types of nonblocking
+  'connections'.  (What the Tor spec calls a "Connection", the code refers
+  to as a "Cell-speaking" or "OR" connection.)
+  
+  Connections are implemented by the connection_t struct, defined in or.h.
+  Not every kind of connection uses all the fields in connection_t; see 
+  the comments in or.h and the assertions in assert_connection_ok() for
+  more information.
+
+  Every connection has a type and a state.  Connections never change their
+  type, but can go through many state changes in their lifetime.
+
+  The connection types break down as follows:
+
+     [Cell-speaking connections]
+       CONN_TYPE_OR -- A bidirectional TLS connection transmitting a
+          sequence of cells.  May be from an OR to an OR, or from an OP to
+          an OR.
+
+     [Edge connections]
+       CONN_TYPE_EXIT -- A TCP connection from an onion router to a
+          Stream's destination. [OR only]
+       CONN_TYPE_AP -- A SOCKS proxy connection from the end user to the
+          onion proxy.  [OP only]
+
+     [Listeners]
+       CONN_TYPE_OR_LISTENER [OR only]
+       CONN_TYPE_AP_LISTENER [OP only]
+       CONN_TYPE_DIR_LISTENER [Directory only]
+          -- Bound network sockets, waiting for incoming connections.
+
+     [Internal]
+       CONN_TYPE_DNSWORKER -- Connection from the main process to a DNS
+          worker. [OR only]
+       
+       CONN_TYPE_CPUWORKER -- Connection from the main process to a CPU
+          worker. [OR only]
+
+   Connection states are documented in or.h.
+
+   Every connection has an associated input buffer and output buffer.
+   Listeners don't use them.  With other connections, incoming data is
+   appended to conn->inbuf, and outgoing data is taken from the front of
+   conn->outbuf.  Connections differ primarily in the functions called
+   to fill and drain these buffers.
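The buffer discipline described above (incoming bytes appended at the
tail, outgoing bytes taken from the front) can be sketched as follows.
This is only an illustration: sketch_buf_t, sketch_buf_append, and
sketch_buf_take are hypothetical names, and the fixed capacity is an
assumption.  The real buffer code lives in buffers.c and may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_BUF_CAP 64   /* assumed fixed capacity, for illustration */

/* Hypothetical stand-in for a connection buffer; not Tor's own type. */
typedef struct sketch_buf_t {
  char data[SKETCH_BUF_CAP];
  size_t len;               /* number of bytes currently held */
} sketch_buf_t;

/* Append len bytes at the tail.  Return 0 on success, or -1 if the
 * bytes do not fit. */
static int sketch_buf_append(sketch_buf_t *buf, const char *src, size_t len)
{
  if (len > SKETCH_BUF_CAP - buf->len)
    return -1;
  memcpy(buf->data + buf->len, src, len);
  buf->len += len;
  return 0;
}

/* Remove up to max bytes from the front into dest.  Return the number
 * of bytes removed. */
static size_t sketch_buf_take(sketch_buf_t *buf, char *dest, size_t max)
{
  size_t n = buf->len < max ? buf->len : max;
  memcpy(dest, buf->data, n);
  memmove(buf->data, buf->data + n, buf->len - n);
  buf->len -= n;
  return n;
}
```

In this sketch a read handler would call sketch_buf_append on the input
buffer, and a write handler would call sketch_buf_take on the output
buffer before handing the bytes to the network.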
+
+1.3. All about circuits.
+
+   A circuit_t structure fills two roles.  First, a circuit_t links two
+   connections together: either an edge connection and an OR connection,
+   or two OR connections.  (When joined to an OR connection, a circuit_t
+   affects only cells sent to a particular ACI on that connection.  When
+   joined to an edge connection, a circuit_t affects all data.)
+
+   Second, a circuit_t holds the cipher keys and state for sending data
+   along a given circuit.  At the OP, it has a sequence of ciphers, each
+   of which is shared with a single OR along the circuit.  Separate
+   ciphers are used for data going "forward" (away from the OP) and
+   "backward" (towards the OP).  At the OR, a circuit has only two stream
+   ciphers: one for data going forward, and one for data going backward.
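The difference between the OP's view and an OR's view of a circuit can
be illustrated with a toy model.  Everything here is an assumption made
for illustration: the names are hypothetical, only the forward direction
is shown, and a one-byte XOR stands in for the real stream cipher.  It
is not secure and is not how circuit.c is written.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_HOPS 3

/* Toy stand-in for a stream cipher: XOR with a one-byte key.
 * NOT secure; it only shows the layering. */
static void sketch_crypt(unsigned char *data, size_t len, unsigned char key)
{
  size_t i;
  for (i = 0; i < len; ++i)
    data[i] ^= key;
}

/* OP side: one forward key per OR along the circuit. */
typedef struct sketch_op_circuit_t {
  unsigned char forward_keys[SKETCH_MAX_HOPS];
  size_t n_hops;
} sketch_op_circuit_t;

/* OP side: wrap the payload in one layer per hop. */
static void sketch_op_encrypt_forward(const sketch_op_circuit_t *circ,
                                      unsigned char *data, size_t len)
{
  size_t hop = circ->n_hops;
  while (hop-- > 0)          /* the last hop's layer goes on first */
    sketch_crypt(data, len, circ->forward_keys[hop]);
}

/* OR side: strip the single forward layer this OR shares with the OP. */
static void sketch_or_relay_forward(unsigned char forward_key,
                                    unsigned char *data, size_t len)
{
  sketch_crypt(data, len, forward_key);
}
```

After every OR on the path has called sketch_or_relay_forward with its
own key, the payload is back in the clear at the last hop.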
+
+1.4. Asynchronous IO and the main loop.
+
+   Tor uses the poll(2) system call [or a substitute based on select(2)]
+   to handle nonblocking (asynchronous) IO.  If you're not familiar with
+   nonblocking IO, check out the links at the end of this document.
+        
+   All asynchronous logic is handled in main.c.  The functions
+   'connection_add', 'connection_set_poll_socket', and 'connection_remove'
+   manage an array of connection_t*, and keep it in sync with the array of
+   struct pollfd required by poll(2).  (This array of connection_t* is
+   accessible via get_connection_array, but users should generally call
+   one of the 'connection_get_by_*' functions in connection.c to look up
+   individual connections.)
+
+   To trap read and write events, connections call the functions
+   'connection_{is|stop|start}_{reading|writing}'.
+
+   When connections get events, main.c calls conn_read and conn_write.
+   These functions dispatch events to connection_handle_read and
+   connection_handle_write as appropriate.
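One iteration of a poll(2)-driven loop of the kind described above
might look like the sketch below.  The names sketch_loop_once,
sketch_handler_fn, and sketch_read_one_byte are hypothetical, and the
dispatch is simplified; it is not the code in main.c.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical handler type: returns 0 on success, -1 on error. */
typedef int (*sketch_handler_fn)(int fd);

/* Example read handler: consume a single byte from fd. */
static int sketch_read_one_byte(int fd)
{
  char c;
  return read(fd, &c, 1) == 1 ? 0 : -1;
}

/* Run one iteration of the loop over nfds descriptors.  POLLIN events
 * go to on_read and POLLOUT events go to on_write.  Return the number
 * of handler calls made, or -1 if poll() itself fails.  (A fuller loop
 * would also act on a handler's -1 return value.) */
static int sketch_loop_once(struct pollfd *fds, nfds_t nfds, int timeout_ms,
                            sketch_handler_fn on_read,
                            sketch_handler_fn on_write)
{
  int dispatched = 0;
  nfds_t i;
  if (poll(fds, nfds, timeout_ms) < 0)
    return -1;
  for (i = 0; i < nfds; ++i) {
    if ((fds[i].revents & POLLIN) && on_read) {
      (void) on_read(fds[i].fd);
      ++dispatched;
    }
    if ((fds[i].revents & POLLOUT) && on_write) {
      (void) on_write(fds[i].fd);
      ++dispatched;
    }
  }
  return dispatched;
}
```

The caller fills in each struct pollfd's events field (POLLIN, POLLOUT,
or both) to say which events it wants, which plays the role that the
start/stop reading and writing functions play in the text above.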
+
+   When connections need to be closed, they can respond in two ways.  Most
+   simply, they can make connection_handle_* return an error (-1),
+   which will make conn_{read|write} close them.  But if the connection
+   needs to stay around [XXXX explain why] until the end of the current
+   iteration of the main loop, it marks itself for closing by setting
+   conn->connection_marked_for_close.
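The "mark now, close at the end of the iteration" pattern can be
pictured with the sketch below.  The struct, its field names, and the
sweep function are hypothetical, and the sketch assumes the sweep runs
once per pass of the main loop; it does not describe what main.c
actually does with marked connections.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, stripped-down connection record. */
typedef struct sketch_conn_t {
  int fd;
  int marked_for_close;  /* nonzero: remove at the end of this iteration */
} sketch_conn_t;

/* Remove every marked connection from conns[0..n), keeping the order
 * of the survivors.  Return the new number of connections. */
static size_t sketch_sweep_marked(sketch_conn_t *conns, size_t n)
{
  size_t i, kept = 0;
  for (i = 0; i < n; ++i) {
    if (!conns[i].marked_for_close)
      conns[kept++] = conns[i];
  }
  return kept;
}
```

Handlers only set the flag; the array itself is modified in one place,
after all events for the current iteration have been dispatched.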
+
+   The main loop handles several other operations: First, it checks
+   whether any signals have been received that require a response (HUP,
+   KILL, USR1, CHLD).  Second, it calls prepare_for_poll to handle recurring
+   tasks and compute the necessary poll timeout.  These recurring tasks
+   include periodically fetching the directory, timing out unused
+   circuits, incrementing flow control windows and re-enabling connections
+   that were blocking for more bandwidth, and maintaining statistics.
+
+   A word about TLS: Using TLS on OR connections complicates matters in
+   two ways.  First, a TLS stream has its own read buffer independent of
+   the connection's read buffer.  (TLS needs to read an entire frame from
+   the network before it can decrypt any data.  Thus, trying to read 1
+   byte from TLS can require that several KB be read from the network and
+   decrypted.  The extra data is stored in TLS's decrypt buffer.)  Second,
+   the TLS stream's events do not correspond directly to network events:
+   sometimes, before a TLS stream can read, the network must be ready to
+   write -- or vice versa.
+
+   [XXXX describe the consequences of this for OR connections.]
+
+1.5. How data flows (An illustration.)
+
+   Suppose an OR receives 50 bytes along an OR connection.  These 50 bytes
+   complete a data relay cell, which gets decrypted and delivered to an
+   edge connection.  Here we give a possible call sequence for the
+   delivery of this data.
+
+   (This may be outdated quickly.)
+
+   do_main_loop -- Calls poll(2), receives a POLLIN event on a struct
+                 pollfd, then calls:
+    conn_read -- Looks up the corresponding connection_t, and calls:
+     connection_handle_read -- Calls:
+      connection_read_to_buf -- Notices that it has an OR connection so:
+       read_to_buf_tls -- Pulls data from the TLS stream onto conn->inbuf.
+      connection_process_inbuf -- Notices that it has an OR connection so:
+       connection_or_process_inbuf -- Checks whether conn is open, and calls:
+        connection_process_cell_from_inbuf -- Notices it has enough data for
+                 a cell, then calls:
+         connection_fetch_from_buf -- Pulls the cell from the buffer.
+         cell_unpack -- Decodes the raw cell into a cell_t
+         command_process_cell -- Notices it is a relay cell, so calls:
+          command_process_relay_cell -- Looks up the circuit for the cell,
+                 makes sure the circuit is live, then passes the cell to:
+           circuit_deliver_relay_cell -- Passes the cell to each of: 
+            relay_crypt -- Strips a layer of encryption from the cell and
+                 notices that the cell is for local delivery.
+            connection_edge_process_relay_cell -- extracts the cell's
+                 relay command, and makes sure the edge connection is
+                 open.  Since it has a DATA cell and an open connection,
+                 calls:
+             circuit_consider_sending_sendme -- [XXX]
+             connection_write_to_buf -- To place the data on the outgoing
+                 buffer of the correct edge connection, by calling:
+              connection_start_writing -- To tell the main poll loop about
+                 the pending data.
+              write_to_buf -- To actually place the outgoing data on the
+                 edge connection.
+             connection_consider_sending_sendme -- [XXX]
+
+   [In a subsequent iteration, main notices that the edge connection is
+    ready for writing.]
+
+   do_main_loop -- Calls poll(2), receives a POLLOUT event on a struct
+                 pollfd, then calls:
+    conn_write -- Looks up the corresponding connection_t, and calls:
+     connection_handle_write -- This isn't a TLS connection, so calls:
+      flush_buf -- Delivers data from the edge connection's outbuf to the
+                 network.
+      connection_wants_to_flush -- Reports that all data has been flushed.
+      connection_finished_flushing -- Notices the connection is an exit,
+                 and calls:
+       connection_edge_finished_flushing -- The connection is open, so it
+                 calls:
+        connection_stop_writing -- Tells the main poll loop that this
+                 connection has no more data to write.
+        connection_consider_sending_sendme -- [XXX]
+
+1.6. Routers, descriptors, and directories
+
+   All Tor processes need to keep track of a list of onion routers, for
+   several reasons:
+       - OPs need to establish connections and circuits to ORs.
+       - ORs need to establish connections to other ORs.
+       - OPs and ORs need to fetch directories from directory servers.
+       - ORs need to upload their descriptors to directory servers.
+       - Directory servers need to know which ORs are allowed onto the
+         network, what the descriptors are for those ORs, and which of
+         those ORs are currently live.
+
+   Thus, every Tor process keeps track of a list of all the ORs it knows
+   in a static variable 'directory' in the routers.c module.  This
+   variable contains a routerinfo_t object for each known OR. On startup,
+   the directory is initialized to a list of known directory servers (via
+   router_get_list_from_file()).  Later, the directory is updated via
+   router_get_dir_from_string().  (OPs and ORs retrieve fresh directories
+   from directory servers; directory servers generate their own.)
+
+   Every OR must periodically regenerate a router descriptor for itself.
+   The descriptor and the corresponding routerinfo_t are stored in the
+   'descriptor' and 'desc_routerinfo' static variables in routers.c,
+   respectively.
+
+   Additionally, a directory server keeps track of a list of the
+   router descriptors it knows in a separate list in dirserv.c.  It
+   uses this list, plus the open connections in main.c, to build
+   directories.
+
+1.7. Data model
+  
+  [XXX]
+
+1.8. Flow control
+
+  [XXX]
+
+2. Coding conventions
+
+2.1. Details
+
+  Use tor_malloc, tor_strdup, and tor_gettimeofday instead of their
+  generic equivalents.  (They always succeed or exit.)
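A "succeed or exit" wrapper of the kind described above could be
written roughly as follows.  The names sketch_malloc and sketch_strdup
are hypothetical, and the error message and exit code are assumptions;
this is not the source of tor_malloc or tor_strdup.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Allocate size bytes, or print a message and exit.  Never returns NULL. */
static void *sketch_malloc(size_t size)
{
  void *result;
  if (size == 0)
    size = 1;                 /* malloc(0) is allowed to return NULL */
  result = malloc(size);
  if (!result) {
    fprintf(stderr, "Out of memory. Exiting.\n");
    exit(1);
  }
  return result;
}

/* Duplicate a NUL-terminated string using sketch_malloc. */
static char *sketch_strdup(const char *s)
{
  size_t len = strlen(s) + 1;
  char *dup = sketch_malloc(len);
  memcpy(dup, s, len);
  return dup;
}
```

Callers of such wrappers do not need to check the result for NULL,
which removes an error path from every allocation site.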
+
+  Use INLINE instead of 'inline', so that we work properly on Windows.
+
+2.2. Calling and naming conventions
+
+  Whenever possible, functions should return -1 on error and 0 on
+  success.
+
+  For multi-word identifiers, use lowercase words combined with
+  underscores (e.g., "multi_word_identifier").  Use ALL_CAPS for macros and
+  constants.
+
+  Typenames should end with "_t".
+
+  Function names should be prefixed with a module name or object name.  (In
+  general, code to manipulate an object should be a module with the same
+  name as the object, so it's hard to tell which convention is used.)
+
+  Functions that do things should have imperative-verb names
+  (e.g. buffer_clear, buffer_resize); functions that return booleans should
+  have predicate names (e.g. buffer_is_empty, buffer_needs_resizing).
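The conventions in this section, applied to an invented object, might
look like the sketch below.  The "widget" module and all of its names
are hypothetical; nothing like it exists in the Tor source.

```c
#include <assert.h>
#include <stddef.h>

#define WIDGET_MAX_ITEMS 4            /* ALL_CAPS for constants */

typedef struct widget_t {             /* type names end with "_t" */
  int items[WIDGET_MAX_ITEMS];
  size_t n_items;
} widget_t;

/* Predicate name: returns a boolean. */
static int widget_is_full(const widget_t *widget)
{
  return widget->n_items == WIDGET_MAX_ITEMS;
}

/* Imperative-verb name: does something.  Returns 0 on success and -1
 * on error. */
static int widget_add_item(widget_t *widget, int item)
{
  if (widget_is_full(widget))
    return -1;
  widget->items[widget->n_items++] = item;
  return 0;
}

/* Imperative-verb name; cannot fail, so it returns nothing. */
static void widget_clear(widget_t *widget)
{
  widget->n_items = 0;
}
```

Every function is prefixed with the name of the object it manipulates,
so a reader can tell from the call site which module to look in.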
+
+2.3. What To Optimize
+
+  Don't optimize anything if it's not in the critical path.  Right now,
+  the critical path seems to be AES, logging, and the network itself.
+  Feel free to do your own profiling to determine otherwise.
+
+2.4. Log conventions
+
+  Log convention: use only these four log severities.
+
+    ERR is if something fatal just happened.
+    WARNING is something bad happened, but we're still running. The
+      bad thing is either a bug in the code, an attack or buggy
+      protocol/implementation of the remote peer, etc. The operator should
+      examine the bad thing and try to correct it.
+    (No error or warning messages should be expected during normal OR or OP
+      operation. I expect most people to run on -l warning eventually. If a
+      library function is currently called such that failure always means
+      ERR, then the library function should log WARNING and let the caller
+      log ERR.)
+    INFO means something happened (maybe bad, maybe ok), but there's nothing
+      you need to (or can) do about it.
+    DEBUG is for everything louder than INFO.
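The four severities and a "-l warning" style threshold can be modelled
as below.  The enum names, their numeric order, and the filter function
are assumptions made for illustration; see log.[ch] for the real
interface.

```c
#include <assert.h>

/* Hypothetical severity scale, quietest-to-report first.  The numeric
 * values are an assumption. */
typedef enum sketch_severity_t {
  SKETCH_LOG_DEBUG = 0,
  SKETCH_LOG_INFO,
  SKETCH_LOG_WARNING,
  SKETCH_LOG_ERR
} sketch_severity_t;

/* Return 1 if a message of severity msg should be emitted when the
 * operator asked for min_severity, and 0 otherwise. */
static int sketch_log_should_emit(sketch_severity_t min_severity,
                                  sketch_severity_t msg)
{
  return msg >= min_severity;
}
```

Under this model, an operator running at the WARNING threshold sees
only WARNING and ERR messages, which matches the expectation above that
normal operation produces no output at that level.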
+
+  [XXX Proposed convention: every message of severity INFO or higher should
+  either (A) be intelligible to end-users who don't know the Tor source; or
+  (B) somehow inform the end-users that they aren't expected to understand
+  the message (perhaps with a string like "internal error").  Option (A) is
+  to be preferred to option (B). -NM]
+
+3. References
+
+  About Tor
+
+     See http://freehaven.net/tor/
+         http://freehaven.net/tor/cvs/doc/tor-spec.txt
+         http://freehaven.net/tor/cvs/doc/tor-design.tex
+         http://freehaven.net/tor/cvs/doc/FAQ
+
+  About anonymity
+
+     See http://freehaven.net/anonbib/
+
+  About nonblocking IO
+
+     [XXX insert references]
+
+
+# ======================================================================
+# Old HACKING document; merge into the above, move into tor-design.tex,
+# or delete.
+# ======================================================================
 The pieces.
 
   Routers. Onion routers, as far as the 'tor' program is concerned,
@@ -99,20 +506,6 @@
   Currently the code tries for the primary router first, and if it's down,
   chooses the first available twin.
 
-Coding conventions:
 
- Log convention: use only these four log severities.
 
-  ERR is if something fatal just happened.
-  WARNING is something bad happened, but we're still running. The
-    bad thing is either a bug in the code, an attack or buggy
-    protocol/implementation of the remote peer, etc. The operator should
-    examine the bad thing and try to correct it.
-  (No error or warning messages should be expected. I expect most people
-    to run on -l warning eventually. If a library function is currently
-    called such that failure always means ERR, then the library function
-    should log WARNING and let the caller log ERR.)
-  INFO means something happened (maybe bad, maybe ok), but there's nothing
-    you need to (or can) do about it.
-  DEBUG is for everything louder than INFO.