Stream starvation in Tor (with patch)
or-dev folks,
A number of groups (the UW people, the UCSD people, Roger(?) and
possibly others) have noticed that when many active streams (more than
about 4) are opened on a single circuit, 3 or 4 of them get service and
the rest starve until those are finished. Note that, for example, most
web browsers will open multiple streams at once to fetch parts of web
pages.
Here's a plot of 9 active streams on one circuit:
https://thunk.cs.uwaterloo.ca/mashael/broken.png
(This graph is from measurements on a private Tor network on PlanetLab,
running 0.2.3.0-alpha-dev; the same behaviour was observed previously on
0.2.2.*.)
Here's Mashael's writeup about the problem.
Our network consists of five nodes: an authoritative directory, a client
(also running as an Onion Proxy), three Tor onion routers, and a server. We
refer to the three onion routers as Entry, Middle and Exit. Our client
constructs a circuit through the network, where the first hop is Entry,
the second is Middle and the third is Exit. The constructed circuit is
used to create multiple streams to the server. Each stream basically
carries a request from the client to the server, which replies with
10,000 cells. We repeated this experiment for 2, 5, 9 and 13 streams.
The common result is that the last three streams the client created
"almost" starved the earlier streams until those three had finished
transferring their 10,000 cells.
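To make that workload concrete, here is a rough sketch (plain C, not our
actual measurement harness) of a server that answers every connection
with roughly 10,000 relay cells' worth of data. The port number and the
lack of any real request parsing are just illustrative choices; 498
bytes is approximately one relay cell payload.

/* Rough sketch only -- not the measurement harness used above.  Each
 * accepted connection is answered with about 10,000 relay cells' worth
 * of data; a child process per connection keeps the streams concurrent. */
#include <stdio.h>
#include <string.h>
#include <signal.h>
#include <unistd.h>
#include <netinet/in.h>
#include <sys/socket.h>

#define NUM_CELLS     10000
#define CELL_PAYLOAD  498     /* roughly one relay cell payload */
#define PORT          8000    /* arbitrary */

int main(void)
{
  char buf[CELL_PAYLOAD];
  struct sockaddr_in addr;
  int listener = socket(AF_INET, SOCK_STREAM, 0);
  int one = 1;

  signal(SIGCHLD, SIG_IGN);           /* don't leave zombie children */
  memset(buf, 'x', sizeof(buf));
  memset(&addr, 0, sizeof(addr));
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl(INADDR_ANY);
  addr.sin_port = htons(PORT);

  setsockopt(listener, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));
  if (bind(listener, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
      listen(listener, 16) < 0) {
    perror("bind/listen");
    return 1;
  }

  for (;;) {
    int fd = accept(listener, NULL, NULL);
    if (fd < 0)
      continue;
    if (fork() == 0) {                /* one child per stream */
      int i;
      /* Ignore whatever request the client sent; just stream the reply.
       * (Short writes are ignored for brevity.) */
      for (i = 0; i < NUM_CELLS; i++)
        if (write(fd, buf, sizeof(buf)) < 0)
          break;
      close(fd);
      _exit(0);
    }
    close(fd);
  }
}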
This problem only occurs when we have a bottleneck OR in the path. We
continue to see the problem as long as the bandwidth offered by the
bottleneck router is less than approximately 80% of the bandwidth that
is offered by the other routers in the circuit. For example, if the
BandwidthRate of Entry and Exit is 630 KB/s and 720 KB/s, respectively,
the problem is still visible even when the BandwidthRate of Middle is as
high as 550 KB/s. We verified this by setting the BandwidthRate option
for Middle to a low value (20 KB/s) and then gradually increasing it
until the stream starvation disappeared from the resulting graphs, at
approximately 600 KB/s.
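For reference, the throttling above is just the standard BandwidthRate
torrc option on the Middle router. A minimal illustrative fragment (not
our exact configuration; BandwidthBurst is shown here only to keep the
cap tight):

## Throttle Middle to create the bottleneck
BandwidthRate 550 KBytes
BandwidthBurst 550 KBytes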
The "streams problem" occurs because of the complicated interaction
between Tor's congestion control and libevent. At some point during the
experiment, the circuit window is exhausted, which blocks all edge
streams. When a circuit-level sendme is received at Exit, it resumes
edge reading by looping over the linked list of edge streams and
calling connection_start_reading() to inform libevent to resume reading.
When the streams are activated again, Tor only gets the chance to
service the first three streams it activated before the circuit window
is exhausted again, which causes all the streams to be blocked once
more. As an experiment, we reversed the order in which the streams are
activated, and indeed the first three streams created, rather than the
last three, got service, while the others starved.
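To see the pattern in isolation, here is a deliberately oversimplified
toy model (plain C, not Tor code) of what happens when the window
reopens and streams are always activated from the head of the list. The
window size and the per-stream cap are made-up numbers; the cap just
stands in for however much each stream manages to package before the
window closes again.

/* Toy model of the starvation pattern: every time the circuit window
 * reopens, streams are serviced from the head of the list, so the later
 * streams make no progress until the earlier ones finish. */
#include <stdio.h>

#define NUM_STREAMS   9
#define CELLS_PER_REQ 10000   /* cells each stream wants to deliver */
#define WINDOW        1000    /* cells the circuit may send per "reopen" */
#define PER_STREAM    350     /* cap per stream per reopen (illustrative) */
#define ROUNDS        30      /* number of times the window reopens */

int main(void)
{
  int remaining[NUM_STREAMS], delivered[NUM_STREAMS];
  int i, round;

  for (i = 0; i < NUM_STREAMS; i++) {
    remaining[i] = CELLS_PER_REQ;
    delivered[i] = 0;
  }

  for (round = 0; round < ROUNDS; round++) {
    int window = WINDOW;
    /* Always start from the head of the list: the first few streams
     * with data eat the whole window, then everything blocks again. */
    for (i = 0; i < NUM_STREAMS && window > 0; i++) {
      int n = remaining[i];
      if (n > PER_STREAM) n = PER_STREAM;
      if (n > window) n = window;
      remaining[i] -= n;
      delivered[i] += n;
      window -= n;
    }
  }

  for (i = 0; i < NUM_STREAMS; i++)
    printf("stream %d: %d cells delivered\n", i, delivered[i]);
  return 0;
}

Running this prints almost all of the traffic on the first three streams
while the later ones get little or nothing, which is the same starvation
shape as in the graph above (in the real code the head of the list
evidently holds the most recently created streams, which is why it was
the last three streams that got service).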
Our solution is to change the order in which streams are activated. We
choose a random edge connection from the linked list, and then we
activate streams starting from that chosen stream. When we reach the end
of the list, we continue from the head of the list back to the chosen
stream (treating the linked list as circular). It would
probably be better to actually remember which streams have received
service recently, but this way is simple and effective.
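The selection step is the usual one-pass trick for picking a uniform
element from a singly linked list whose length you don't know in
advance: the k-th node seen replaces the current choice with probability
1/k, which leaves every node equally likely once the pass is over. A
self-contained sketch of that selection plus the wrap-around activation
order (made-up types; activate() stands in for the
connection_start_reading() call in the patch):

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

typedef struct node {
  int id;
  struct node *next;
} node_t;

static void activate(node_t *n)
{
  /* Stand-in for connection_start_reading(TO_CONN(conn)). */
  printf("activating stream %d\n", n->id);
}

int main(void)
{
  node_t nodes[9];
  node_t *head = &nodes[0], *chosen = NULL, *cur;
  int i, seen = 0;

  srandom((unsigned) time(NULL));
  for (i = 0; i < 9; i++) {
    nodes[i].id = i;
    nodes[i].next = (i + 1 < 9) ? &nodes[i + 1] : NULL;
  }

  /* One pass: the k-th node replaces the choice with probability 1/k,
   * so the final choice is uniform over the whole list. */
  for (cur = head; cur; cur = cur->next) {
    seen++;
    if (random() % seen == 0)
      chosen = cur;
  }

  /* Activate from the chosen node to the tail of the list... */
  for (cur = chosen; cur; cur = cur->next)
    activate(cur);
  /* ...then wrap around and do the ones we skipped. */
  for (cur = head; cur != chosen; cur = cur->next)
    activate(cur);
  return 0;
}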
Here's the graph again, with the patched Exit:
https://thunk.cs.uwaterloo.ca/mashael/fixed.png
The patch is attached.
- Ian and Mashael
--- src/or/relay.c.orig 2010-11-19 11:00:22.140368267 -0500
+++ src/or/relay.c 2010-11-19 14:58:24.432368619 -0500
@@ -1507,6 +1507,8 @@
int packaged_this_round;
int cells_on_queue;
int cells_per_conn;
+ int num_streams = 0;
+ edge_connection_t *chosen_stream = NULL;
/* How many cells do we have space for? It will be the minimum of
* the number needed to exhaust the package window, and the minimum
@@ -1524,7 +1526,38 @@
/* Count how many non-marked streams there are that have anything on
* their inbuf, and enable reading on all of the connections. */
n_streams = 0;
- for (conn=first_conn; conn; conn=conn->next_stream) {
+ /* This used to start listening on the streams in the order they
+ * appeared in the linked list. That leads to starvation in the
+ * event that, for example, our circuit window is almost full, and
+ * there are lots of streams. Then the first few streams will have
+ * data read from them, and then the window will close again. When
+ * it reopens, we would enable reading from the beginning of the list
+ * again. Instead, we just pick a random stream on the list, and
+ * enable reading for streams starting at that point (and wrapping
+ * around as if the list were circular). It would probably be better
+ * to actually remember which streams we've serviced in the past, but
+ * this is simple and effective. */
+
+ /* Select a stream uniformly at random from the linked list. We
+ * don't need cryptographic randomness here. */
+ for (conn = first_conn; conn; conn = conn->next_stream) {
+ num_streams++;
+ if ((random() % num_streams) == 0) chosen_stream = conn;
+ }
+ /* Activate reading starting from the chosen stream */
+ for (conn=chosen_stream; conn; conn = conn->next_stream) {
+ /* Start reading for the streams starting from here */
+ if (conn->_base.marked_for_close || conn->package_window <= 0)
+ continue;
+ if (!layer_hint || conn->cpath_layer == layer_hint) {
+ connection_start_reading(TO_CONN(conn));
+
+ if (connection_get_inbuf_len(TO_CONN(conn)) > 0)
+ ++n_streams;
+ }
+ }
+ /* Go back and do the ones we skipped, circular-style */
+ for(conn = first_conn; conn != chosen_stream; conn = conn->next_stream) {
if (conn->_base.marked_for_close || conn->package_window <= 0)
continue;
if (!layer_hint || conn->cpath_layer == layer_hint) {