Re: hardware acceleration available for Tor ? On FreeBSD ?

On Mon, 12 Oct 2009, Wyllys Ingersoll wrote:

"tor is actually cpu-bound rather than ram-bound on the fast relays i
think you should be able to push 10MB/s in 1G of ram"

So crypto-acceleration appears to be useful.

     The symmetric-key processing is very fast and takes up little CPU time.
The apparent hangup on the high-rate relays is the asymmetric-key processing
(i.e., onion-skin encrypting/decrypting).  FWIW, when I was running a relay,
it could be running at rates over 300 KB/s while using less than 1% of the
CPU when it was simply passing cells back and forth among the various
connections.  When new onion skins came in to be decrypted was when tor would
suddenly use much more CPU time for a moment or two.

You discuss the performance benefits of doing AES CTR in hardware below, but before I get to that: did you see the same behavior as what is quoted just above ? That is, does the real CPU load occur during "the asymmetric-key processing (i.e., onion-skin encrypting/decrypting)", and is that the only area where we can attempt to speed things up with hardware ?

- Is anyone _actually_ testing Tor, and more specifically, hardware crypto
acceleration of Tor, in high speed (gigabit) test environments ?

I did testing with the Niagara 2 chips on some Sun systems running Solaris and got good results.
The critical operation is not necessarily the SSL, but rather the AES CTR mode algorithms.
I did not test this on a gigabit test network though.

The problem I discovered was that just getting accelerated AES from hardware
was not giving much improvement if the CTR mode operations had to be done
in software.  The N2 supports AES CTR in hardware so you can pass
the entire buffer to be encrypted at once instead of doing 16 bytes at a time
and updating the counters in software.
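
To make the per-block overhead described above concrete, here is a minimal sketch of CTR mode driven from software. It is only an illustration under stated assumptions: block_fn is a hypothetical stand-in built from SHA-256 (it is NOT AES and not what Tor or OpenSSL use), and ctr_per_block is a name invented for this example. The point is the shape of the loop: one cipher call and one counter update for every 16 bytes.

```python
import hashlib

BLOCK = 16  # AES block size in bytes


def block_fn(key: bytes, counter_block: bytes) -> bytes:
    # Hypothetical stand-in for one AES block encryption (NOT real AES):
    # a keyed hash truncated to 16 bytes, used only to show the CTR structure.
    return hashlib.sha256(key + counter_block).digest()[:BLOCK]


def ctr_per_block(key: bytes, iv: bytes, data: bytes):
    """CTR mode with the counter handled in software, 16 bytes at a time.

    Returns the output bytes and the number of block-cipher calls made.
    """
    counter = int.from_bytes(iv, "big")
    out = bytearray()
    calls = 0
    for off in range(0, len(data), BLOCK):
        # One trip to the block cipher per 16-byte block.
        keystream = block_fn(key, counter.to_bytes(BLOCK, "big"))
        calls += 1
        chunk = data[off:off + BLOCK]
        out.extend(a ^ b for a, b in zip(chunk, keystream))
        # Counter update done in software after every block.
        counter = (counter + 1) % (1 << 128)
    return bytes(out), calls


key = b"k" * 16
iv = bytes(16)
cell = bytes(range(256)) * 2  # a 512-byte buffer, roughly one Tor cell

ciphertext, calls = ctr_per_block(key, iv, cell)
plaintext, _ = ctr_per_block(key, iv, ciphertext)  # CTR decrypts with the same operation
print(calls)  # 32 cipher calls for a single 512-byte buffer
```

If each of those 32 calls has to cross into a driver or an accelerator, the per-call overhead can easily outweigh the gain from faster AES. A device that implements CTR itself takes the whole buffer in one request, which matches the difference described above.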

Thank you very much for your testing, blog post, and this exchange.

So, if I understand correctly, the HardwareAccel option can be turned on by anyone, regardless of OS or hardware platform. It will then (probably) just do AES CTR with OpenSSL (since most people use OpenSSL) and just get no benefit because OpenSSL won't do AES CTR through hardware.

So, even if I got a BCM5825 working with the ubsec driver on FreeBSD, and used HardwareAccel, it would just be wasted with OpenSSL.

On the other hand, there are Solaris-specific routines (crypto framework APIs (PKCS#11)) built into Solaris that Tor can call instead of OpenSSL, which _will_ do AES CTR in hardware, yielding a huge gain in performance (you mention 25x).

Do I have all of that correct ?

If so, some follow-up questions that I hope will be of help to others:

- Are there analogues to the "crypto framework APIs (PKCS#11)" in other OSes ? What is this layer called for Linux, for instance ? For FreeBSD ? Or is hardware AES CTR with HardwareAccel something we're only going to see on Solaris for now ?

- How does the T2 (Niagara 2) compare to dedicated hardware such as the Sun Crypto Accelerator 6000, which is currently available ? Presumably the crypto framework APIs will use whatever is available, whether it be an SCA-6000 or a Broadcom-based card or ... ?

- Am I correct that if a new version of OpenSSL appeared with AES CTR hardware support, an end user could just proceed blindly with card insertion, driver install, setting HardwareAccel 1 in torrc, and poof! Yes ?

If I am correct in the above, it would appear that there is a nice, platform-specific feature in Solaris to complement a specific CPU feature in T2 (and later, I hope) processors, and that the correct path for the rest of the world is to lobby OpenSSL to implement hardware AES CTR so we can just plug and play.

Thanks very much for your help and contributions.