On 2 Jan 2015, at 23:18, teor <teor2345@xxxxxxxxx> wrote:
>
>> From: Yawning Angel <yawning@xxxxxxxxxxxxxxx>
>> Subject: Re: [tor-dev] gettimeofday() Syscall Issues
>>
>> On Thu, 01 Jan 2015 23:42:42 -0500
>> Libertas <libertas@xxxxxxxxxxx> wrote:
>>
>>> The first two account for the bulk of the calls, as they are in the
>>> core data relaying logic.
>>>
>>> Ultimately, the problem seems to be that the caching is very weak. At
>>> most, only half of the calls to tor_gettimeofday_cached_monotonic()
>>> use the cache. It appears in the vomiting print statements that
>>> loading a single simple HTML page
>>> (http://www.openbsd.org/faq/ports/guide.html to be exact) will cause
>>> >30 gettimeofday() syscalls. You can imagine how that would
>>> accumulate for an exit carrying 800 KB/s if the caching
>>> doesn't improve much with additional circuits.
>>
>> So while optimization is cool and all, I'm not seeing why this
>> specifically is the underlying issue.
>>
>> Each cell can contain 498 bytes of user payload. Looking at things
>> simplistically this is 800 KiB/s -> 1644 cells/sec, leaving you with
>> approximately 608 microseconds of processing time per cell.
>>
>> On my i5-4250U box, gettimeofday() takes 22 ns on Linux, and 2441 ns on
>> FreeBSD. I'm not sure how accurate the FreeBSD results are as it was
>> in a VirtualBox VM (getpid() on the same VM takes 124 ns). If someone
>> has a OpenBSD box they should benchmark gettimeofday() and see how long
>> the call takes.
>>
>> Taking the FreeBSD case (since we know that tor works fine on Linux), a
>
> IPredator has complained that tor on Linux spends too much time calling
> time() when pushing 500Mbit/s, which is an issue for them under 3.x
> series kernels, but not kernel 2.6.
>
> https://ipredator.se/guide/torserver#performance
>
>> single gettimeofday() call takes approximately, 0.39% of the per-cell
>> processing budget.
>>
>> For reference (assuming gettimeofday() in *BSD really is this shit
>> performance wise), 7000 calls to gettimeofday() is 17.09 ms worth of
>> calls.
>>
>> The clock code in tor does need love, so I wouldn't object to cleanup,
>> but I'm not sure it's in the state where it's causing the massive
>> performance degradation that you are seeing.
>>
>
> Yawning/Libertas,
>
> I just reviewed my profiling of an exit relay running chutney verify
> with 200MB of random data.
> This is on OS X 10.9.5 with tor 0.2.6.2-alpha-dev running the chutney
> basic-min network.
>
> The three leaf functions that take the most time in the call graph are:
> * channel_timestamp_recv
> * channel_timestamp_active
> * time
>
> Each of these functions takes around 16% of the execution time, the
> next nearest function is sha1_block_data_order_avx on 4%.
>
> While I understand that OS X, BSD, and Linux syscalls aren't
> necessarily identical, we now have results for the following platforms
> suggesting that calling time() too often has a performance impact:
> * Linux kernel 3.x
> * OpenBSD
> * OS X 10.9
>
> My results suggest a maximum performance improvement of 15% on OS X if
> we reduced the calls to time() to a reasonable number per second.

Oh dear, I was working with an un-optimised build.
Now calls to time() are a much more reasonable 4%.

teor

pgp 0xABFED1AC
hkp://pgp.mit.edu/

https://gist.github.com/teor2345/d033b8ce0a99adbc89c5
http://0bin.net/paste/Mu92kPyphK0bqmbA#Zvt3gzMrSCAwDN6GKsUk7Q8G-eG+Y+BLpe7wtmU66Mx
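
For anyone who wants to re-run the per-cell budget arithmetic quoted above with their own numbers, here is a minimal sketch. The figures (498 bytes of payload per cell, 800 KiB/s, 2441 ns per call, 7000 calls) are taken from the thread. The function names are made up for illustration and are not part of tor. The mean_call_ns() helper is only a rough stand-in for a real benchmark: it times a Python callable, so interpreter overhead is included and the result is not comparable to a C-level gettimeofday() measurement.

```python
import time

CELL_PAYLOAD_BYTES = 498  # user payload per cell, as stated in the thread


def cells_per_second(throughput_bytes_per_s, payload=CELL_PAYLOAD_BYTES):
    """Cells needed per second to carry the given payload throughput."""
    return throughput_bytes_per_s / payload


def per_cell_budget_ns(throughput_bytes_per_s):
    """Wall-clock time available per cell, in nanoseconds."""
    return 1e9 / cells_per_second(throughput_bytes_per_s)


def budget_fraction(call_ns, throughput_bytes_per_s):
    """Share of the per-cell budget consumed by one call costing call_ns."""
    return call_ns / per_cell_budget_ns(throughput_bytes_per_s)


def mean_call_ns(fn, iterations=100_000):
    """Rough mean cost of calling fn(), including Python loop overhead."""
    start = time.perf_counter_ns()
    for _ in range(iterations):
        fn()
    return (time.perf_counter_ns() - start) / iterations


if __name__ == "__main__":
    rate = 800 * 1024  # 800 KiB/s, the exit throughput used in the thread
    print("cells/sec:          %.0f" % cells_per_second(rate))
    print("budget per cell:    %.0f us" % (per_cell_budget_ns(rate) / 1e3))
    print("2441 ns call share: %.2f%%" % (100 * budget_fraction(2441, rate)))
    print("7000 calls:         %.2f ms" % (7000 * 2441 / 1e6))
    print("time.time() here:   %.0f ns (incl. interpreter overhead)"
          % mean_call_ns(time.time))
```

Swapping in a measured per-call cost for 2441 shows how much of the per-cell budget the clock calls would take on a given platform, under the same simplistic one-cell-at-a-time model used in the thread.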
_______________________________________________ tor-dev mailing list tor-dev@xxxxxxxxxxxxxxxxxxxx https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev