Re: Effect of Tor window size on performance
Roger Dingledine wrote:
On Wed, Feb 04, 2009 at 10:25:31AM +0100, Csaba Kiraly wrote:
http://disi.unitn.it/locigno/preprints/TR-DISI-08-041.pdf
[snip]
[snip]
So the next question is an implementation one. Right now the window sizes
are hard-coded at both ends. I've been meaning to extend the protocol
so sendme cells have a number in them, and so the initial window sizes
are specified in the 'create' and 'created' cells for circuits and the
'begin' and 'connected' cells for streams. But we haven't really fleshed
out the details of those designs, or how they could be phased in and still
handle clients and relays that don't use (or know about) the numbers.
So the big deployment question is: is it worth it to work on a design
for the above, and then either shrink the default window sizes or do
something smarter like variable window sizes, or should we just be
patient until a UDP-based solution is more within reach?
One answer is that if you were interested in working on a design proposal
and patch, it would be much more likely to get implemented. :)
We are doing verifications on this. Our lab experiments (the ones in the
tech report) show that there is a huge gain on the user side in delays,
while throughput is untouched. Throughput is capped with a static window
size, but I think the cap can be chosen better than what it is now.
There should also be a big gain in the memory consumption of ORs,
although we haven't measured it yet. Since the Tor network is kind of
overloaded all the time, memory usage should decrease almost linearly
with the window size.
Currently we are verifying one-side modification of the circuit, i.e.
whether one side of the connection can reduce the window size on its own,
without explicitly notifying the other side. From the code it seems to
me that this will work, and if so, phasing in a smaller window size in a
new release should not be a problem.
Hey, that's a really good point. We don't have to change much code at
all if we want to use a *smaller* package window than we are allowed. We
simply pretend that the package window started out smaller, and either
side can do that independently!
Do you have a patch in mind for this? It would seem that if we
change init_circuit_base() so it sets circ->package_window to some
lower number x, and change connection_exit_begin_conn() so it sets
nstream->package_window to some even lower number y, that should be it.
The client side will send sendme's like normal, and the only difference
is that no more than y cells will ever be 'in flight' for a given stream.
It's not even that "complicated"; you already have the defines in or.h :-)
#define CIRCWINDOW_START 1000
#define CIRCWINDOW_INCREMENT 100
#define STREAMWINDOW_START 500
#define STREAMWINDOW_INCREMENT 50
In our test code we've also changed some other things to have better
control of the test environment, but the only modification actually
needed is to change these.
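
The effect of a smaller starting window can be illustrated with a toy model. This is only a sketch under stated assumptions, not Tor code: max_in_flight() is a hypothetical helper, the sender is assumed to transmit as fast as its window allows, and sendmes are assumed to reach it only once its window is exhausted (the worst case for delay). Only STREAMWINDOW_INCREMENT is taken from the defines quoted above.

```c
#include <assert.h>

#define STREAMWINDOW_INCREMENT 50 /* value from the or.h defines quoted above */

/* Hypothetical helper, not part of Tor: simulate a sender whose package
 * window starts at start_window, against a receiver that emits one sendme
 * per STREAMWINDOW_INCREMENT cells received. Returns the largest number of
 * sent-but-unacknowledged cells observed while sending total_cells. */
static int max_in_flight(int start_window, int total_cells)
{
    int package_window = start_window; /* cells the sender may still package */
    int since_sendme = 0;              /* cells received since the last sendme */
    int pending_sendmes = 0;           /* sendmes emitted, not yet seen by sender */
    int in_flight = 0;                 /* cells sent and not yet acknowledged */
    int max_seen = 0;
    int sent = 0;

    assert(start_window >= STREAMWINDOW_INCREMENT);
    assert(total_cells >= 0);

    while (sent < total_cells) {
        if (package_window > 0) {
            /* Package one cell and count it against the window. */
            package_window--;
            sent++;
            in_flight++;
            if (in_flight > max_seen)
                max_seen = in_flight;
            /* Receiver side: one sendme per full increment of cells. */
            if (++since_sendme == STREAMWINDOW_INCREMENT) {
                since_sendme = 0;
                pending_sendmes++;
            }
        } else {
            /* Window exhausted: the sender waits for one sendme. */
            assert(pending_sendmes > 0);
            pending_sendmes--;
            package_window += STREAMWINDOW_INCREMENT;
            in_flight -= STREAMWINDOW_INCREMENT;
        }
    }
    return max_seen;
}
```

In this model a sender that starts at 100 never has more than 100 cells outstanding, and one that starts at 500 never has more than 500, while the receiver behaves identically in both cases. That is the sense in which one side can shrink its window on its own.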
So here are the questions we need to consider:
1) What values of x and y are best? Presumably as we reduce them from
1000 and 500, the performance gets better, but at some point they become
so low that performance gets worse (because each round-trip through the
network takes extra time). As sample numbers, if we start x at 100 and
y at 50, then we need another round-trip for any stream that delivers
more than 24900 bytes, and/or for every 49800 bytes on the circuit.
Should our choices be influenced by the 'typical' Tor stream that the
network is seeing right now? (Not that we have those numbers, but maybe
we should get them.) What other factors are there?
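
The byte figures above follow from multiplying the window size in cells by the relay cell payload. A minimal sketch of that arithmetic, assuming a payload of 498 bytes per relay cell (inferred from 24900 = 50 * 498 and 49800 = 100 * 498); the helper names are hypothetical and not taken from the Tor source:

```c
#include <assert.h>

/* Assumption: bytes of stream data carried by one relay cell, inferred
 * from the 24900 and 49800 byte figures in the message above. */
#define RELAY_PAYLOAD_SIZE 498

/* Hypothetical helper: bytes that fit in one full window of cells. */
static long bytes_per_window(int window_cells)
{
    assert(window_cells >= 0);
    return (long)window_cells * RELAY_PAYLOAD_SIZE;
}

/* Hypothetical helper: nonzero if a transfer of nbytes does not fit in the
 * initial window, so the sender depends on at least one sendme coming back
 * before it can finish. */
static int needs_sendme_round_trip(long nbytes, int window_cells)
{
    return nbytes > bytes_per_window(window_cells);
}
```

With y = 50, bytes_per_window(50) gives the 24900 bytes mentioned above; with the current STREAMWINDOW_START of 500 the same formula gives 249000 bytes.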
Note that changing y would also change our overhead. Currently with 100
it is 1% (1 send_me sent back for every 100 cells received). This extra
traffic is on the "backward" path, so in the typical download scenario,
the overhead would be on the ADSL uplink. Increasing it too much would
be bad, but I think 5% is still fine.
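
The overhead figure can be checked with a one-line formula: one send_me per N received cells costs 1/N of the forward cell count on the backward path. A rough sketch, where sendme_overhead_percent() is a hypothetical helper and send_me cells are assumed to be the only backward traffic counted:

```c
#include <assert.h>

/* Hypothetical helper: backward-path overhead, in percent of forward
 * cells, when one send_me is returned per `increment` cells received. */
static double sendme_overhead_percent(int increment)
{
    assert(increment > 0);
    return 100.0 / (double)increment;
}
```

Under this model an increment of 100 gives 1%, an increment of 50 gives 2%, and the 5% mentioned above corresponds to one send_me every 20 cells.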
An argument for reducing y is that with a lower y the traffic pumped
into the circuit by the exit node is less bursty, which typically
improves performance.
Another thing to consider is the x/y ratio. Currently it is 10. Let's
look at some considerations:
- 2 seems a bit low: Say A and B are the two ends of the circuit and
traffic flows mainly from A to B. Assuming that one-way delays are equal
in both directions (this is reasonable since we have asymmetry only in
the A-B traffic, but overlay links of the same path are used by other
circuits in the reverse direction as well), the send_me cell is sent
from B when half of the cells have arrived. The send_me arrives back at A
when all the cells have arrived at B. At this point A can start pumping
in new cells, which might be a bit late.
- 10 might be a bit large; I don't think we have asymmetries between OR
nodes in the network that would justify such a high factor.
I think 500 is a safe bet for x to start with, but from our measurements
it seems that even 200 would do fine with nice performance improvements.
Of course we are working on more measurements to be on the safe side.
2) What are the effects of this patch in a hybrid Tor network? If some
exit relays use package windows of 1000, and newer ones use package
windows of 100, will the newer ones get 'clobbered' by the old ones?
That is, should we expect to see even more slow-down in the network
while the relays transition? Is there anything we can do about that,
since it will take a year or more for everybody to transition?
We've already done measurements on what we call the asymmetric scenario,
i.e. where the client is the old one and the exit node is the new one.
It seems that performance improvements are there. We are currently
thinking of some "fairness" tests; we have some ideas, but it is not yet
exactly clear how to perform them. Anyway, from current numbers it seems
that the 'clobbered' ones would be the old ones, so at least we have one
more incentive for people to change version :-)
Note that in this scenario y is practically 100 (client side matters),
while we have done tests with x (server side) from 10000 down to 200. We
will add 100, just to see how badly things go with that.
3) Should we reduce the package_windows on the client side too? It would
be easy to do. How do we decide whether it's worth it?
I think it would be better to do it symmetrically, not differentiating
between the two sides. On one hand, with the typical client use, i.e.
web browsing, upload traffic is small so the value does not really
matter. Browsers do some pipelining of HTTP requests in 1.1, but that
should still be a small amount of data. On the other hand, if a "client"
starts to do some nasty things, or just uploads a file, a lower window
value should be beneficial, just like in the exit node's case. I don't
think there are any asymmetries in the circuits. As said before, they
are bottlenecked because of the OR-OR connections traversed, and these
are bottlenecked both ways because of the many circuits passing through
them.
Anything I missed?
Thanks!
--Roger
I think that's all. We are doing some more measurements and will come up
with the results soon. As I said, the code modification is at the level
of changing defines and a simple recompile, but we had better verify
more scenarios before making the change.
Thanks for the questions and ideas!
Csaba