> From: Yawning Angel <yawning@xxxxxxxxxxxxxxx>
> Subject: Re: [tor-dev] gettimeofday() Syscall Issues
>
> On Thu, 01 Jan 2015 23:42:42 -0500
> Libertas <libertas@xxxxxxxxxxx> wrote:
>
>> The first two account for the bulk of the calls, as they are in the
>> core data relaying logic.
>>
>> Ultimately, the problem seems to be that the caching is very weak. At
>> most, only half of the calls to tor_gettimeofday_cached_monotonic()
>> use the cache. It appears in the vomiting print statements that
>> loading a single simple HTML page
>> (http://www.openbsd.org/faq/ports/guide.html to be exact) will cause > 30
>> gettimeofday() syscalls. You can imagine how that would accumulate for
>> an exit carrying 800 KB/s if the caching doesn't improve much with
>> additional circuits.
>
> So while optimization is cool and all, I'm not seeing why this
> specifically is the underlying issue.
>
> Each cell can contain 498 bytes of user payload. Looking at things
> simplistically this is 800 KiB/s -> 1644 cells/sec, leaving you with
> approximately 608 microseconds of processing time per cell.
>
> On my i5-4250U box, gettimeofday() takes 22 ns on Linux, and 2441 ns on
> FreeBSD. I'm not sure how accurate the FreeBSD results are as it was
> in a VirtualBox VM (getpid() on the same VM takes 124 ns). If someone
> has a OpenBSD box they should benchmark gettimeofday() and see how long
> the call takes.
>
> Taking the FreeBSD case (since we know that tor works fine on Linux), a
> single gettimeofday() call takes approximately, 0.39% of the per-cell
> processing budget.

IPredator has complained that tor on Linux spends too much time calling
time() when pushing 500 Mbit/s, which is an issue for them under 3.x
series kernels, but not kernel 2.6.
https://ipredator.se/guide/torserver#performance

> For reference (assuming gettimeofday() in *BSD really is this shit
> performance wise), 7000 calls to gettimeofday() is 17.09 ms worth of
> calls.
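For anyone who wants to replay the per-cell budget arithmetic quoted above, here is a minimal sketch. It only restates the figures from the quoted message (498 payload bytes per cell, 800 KiB/s, 22 ns and 2441 ns per call); the function names are made up for illustration and are not part of tor. With these inputs the FreeBSD share comes out at about 0.40%, in line with the roughly 0.39% quoted.

```python
# Figures taken from the quoted message; everything else is illustrative.
CELL_PAYLOAD_BYTES = 498              # user payload per cell
THROUGHPUT_BYTES_PER_SEC = 800 * 1024  # 800 KiB/s


def cells_per_second(throughput=THROUGHPUT_BYTES_PER_SEC,
                     payload=CELL_PAYLOAD_BYTES):
    """Cells needed per second to carry `throughput` bytes of payload."""
    return throughput / payload


def per_cell_budget_us(throughput=THROUGHPUT_BYTES_PER_SEC,
                       payload=CELL_PAYLOAD_BYTES):
    """Processing time available per cell, in microseconds."""
    return 1e6 / cells_per_second(throughput, payload)


def call_share_percent(call_ns, budget_us):
    """Share of the per-cell budget consumed by one call of `call_ns`."""
    return 100.0 * (call_ns / 1000.0) / budget_us


def total_call_time_ms(calls, call_ns):
    """Total time spent in `calls` calls of `call_ns` each, in ms."""
    return calls * call_ns / 1e6


if __name__ == "__main__":
    budget = per_cell_budget_us()
    print("cells/sec:", int(cells_per_second()))
    print("budget per cell (us):", round(budget))
    print("FreeBSD share (%):", round(call_share_percent(2441, budget), 2))
    print("Linux share (%):", round(call_share_percent(22, budget), 4))
    print("7000 calls (ms):", round(total_call_time_ms(7000, 2441), 2))
```

Changing THROUGHPUT_BYTES_PER_SEC shows how quickly the budget shrinks at higher relay rates, which is where the number of calls per cell starts to matter.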
>
> The clock code in tor does need love, so I wouldn't object to cleanup,
> but I'm not sure it's in the state where it's causing the massive
> performance degradation that you are seeing.
>

Yawning/Libertas,

I just reviewed my profiling of an exit relay running chutney verify
with 200 MB of random data. This is on OS X 10.9.5 with tor
0.2.6.2-alpha-dev running the chutney basic-min network.

The three leaf functions that take the most time in the call graph are:
* channel_timestamp_recv
* channel_timestamp_active
* time

Each of these functions takes around 16% of the execution time; the next
nearest function is sha1_block_data_order_avx at 4%.

While I understand that OS X, BSD, and Linux syscalls aren't necessarily
identical, we now have results for the following platforms suggesting
that calling time() too often has a performance impact:
* Linux kernel 3.x
* OpenBSD
* OS X 10.9

My results suggest a maximum performance improvement of 15% on OS X if
we reduced the calls to time() to a reasonable number per second.

teor

pgp 0xABFED1AC
hkp://pgp.mit.edu/
https://gist.github.com/teor2345/d033b8ce0a99adbc89c5
http://0bin.net/paste/Mu92kPyphK0bqmbA#Zvt3gzMrSCAwDN6GKsUk7Q8G-eG+Y+BLpe7wtmU66Mx
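To illustrate what "a reasonable number per second" of time() calls could look like, here is a rough sketch of one common approach: read the clock once per event-loop iteration and hand the cached value to every per-cell timestamp update. This is not tor's code. CoarseClock, refresh(), now() and relay_cells() are hypothetical names, and the clock function is injectable so the call count can be checked.

```python
import time


class CoarseClock:
    """Caches one clock reading until refresh() is called again.

    Hypothetical helper for illustration; not how tor implements caching.
    """

    def __init__(self, clock=time.time):
        self._clock = clock      # the expensive call we want to avoid
        self._cached = None
        self.real_calls = 0      # how often the real clock was read

    def refresh(self):
        """Read the real clock; intended to run once per loop iteration."""
        self._cached = self._clock()
        self.real_calls += 1
        return self._cached

    def now(self):
        """Return the cached reading, reading the clock only if empty."""
        if self._cached is None:
            return self.refresh()
        return self._cached


def relay_cells(clock, iterations, cells_per_iteration):
    """Simulate a relay loop that timestamps every cell it handles."""
    stamps = []
    for _ in range(iterations):
        clock.refresh()                  # one real clock read per iteration
        for _ in range(cells_per_iteration):
            stamps.append(clock.now())   # per-cell timestamp uses the cache
    return stamps


if __name__ == "__main__":
    clock = CoarseClock()
    stamps = relay_cells(clock, iterations=3, cells_per_iteration=100)
    print(len(stamps), "timestamps from", clock.real_calls, "clock reads")
```

The trade-off is timestamp resolution: every cell handled in the same iteration gets the same time, which is fine for second-granularity bookkeeping but not for fine-grained measurements.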
_______________________________________________
tor-dev mailing list
tor-dev@xxxxxxxxxxxxxxxxxxxx
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev