I also completely forgot to mention this comment: https://trac.torproject.org/projects/tor/ticket/14056#comment:6 I'll look into doing the replacements. If anyone has concerns about whether approx_time is exact enough, please let me know - I had never heard of it previously. I'll look into it myself as well. On 01/01/2015 11:42 PM, Libertas wrote: > On 01/01/2015 03:47 PM, Nick Mathewson wrote: >> What part of tor is accounting for all these time/gettimeofday calls? > > I set up an OpenBSD 5.6 VM and built Tor 0.2.5.10 on it. I couldn't get > chutney working because of a number of strange errors that seemed > platform-specific, so I went for a less thorough solution. I can't run a > Tor relay from home, so the only activity the Tor process had was some > Firefox traffic I created. > > There are surprisingly few syscall analysis tools available on OpenBSD, > and I'm new to syscall/interrupt tuning, so I eventually ended up > selecting the advanced "printf()" technique. > > As a quick refresher to those not familiar, there is a linear heirarchy > of time functions in the Tor code, each calling the one below it: > > gettimeofday() < tor_gettimeofday() < tor_gettimeofday_cached() < > tor_gettimeofday_cached_monotonic() > > The first two will always make the syscall, while the other two will > first try the cache. I applied print statements to instances of all four. > > There was a steady stream of tor_gettimeofday() calls from > circuit_expire_building() - about two per second with a single idle TCP > connection. > > When there was active traffic flow, there were many more gettimeofday() > calls that were almost exclusively the product of > tor_gettimeofday_cached_monotonic(). This function is called in three > places: > > * once in src/or/relay.c > * once in src/or/buffers.c > * once in src/or/circuitlist.c > > The first two account for the bulk of the calls, as they are in the core > data relaying logic. > > Ultimately, the problem seems to be that the caching is very weak. At > most, only half of the calls to tor_gettimeofday_cached_monotonic() use > the cache. It appears in the vomiting print statements that loading a > single simple HTML page (http://www.openbsd.org/faq/ports/guide.html to > be exact) will cause >30 gettimeofday() syscalls. You can imagine how > that would accumulate for an exit carrying 800 KB/s if the caching > doesn't improve much with additional circuits. > > If I understand correctly, this implies either a bug in the caching > logic or the simple fact that libevent calls back and clears the cache > too much for the caching to be very useful. > > The cache is only cleared in two places (both pretty self-explanatory, > and both triggered by libevent callbacks): > > * connection_handle_read() > * connection_handle_write() > > Apparently, the above functions are called >30 times when fetching a > JS-less non-HTTPS HTML page. I don't know enough about TCP to comment on > whether this makes sense. > > Sorry for the magnum opus. I thought I'd try to clarify the context for > those interested, and typing this helped organize my thoughts. > > Let me know if I've made any mistakes thus far. I'll start looking into > potential improvements. I'm eager to work with someone on this if > anyone's interested. Also, if you have any ideas or suggestions, please > share. > > Thanks for reading! > > Libertas > > > > _______________________________________________ > tor-dev mailing list > tor-dev@xxxxxxxxxxxxxxxxxxxx > https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev >
Attachment:
signature.asc
Description: OpenPGP digital signature
_______________________________________________ tor-dev mailing list tor-dev@xxxxxxxxxxxxxxxxxxxx https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev