Re: AES performance results



On Mon, Feb 26, 2007 at 05:05:23PM -0800, Adam Langley wrote:
> On 2/26/07, Nick Mathewson <nickm@xxxxxxxxxxxxx> wrote:
> >METHODOLOGY: I wrote a stupid benchmark function in aes.c to encrypt a
> >million cell-sized chunks using our aes_crypt function, and timed it
> >with the unix "time" command.  I did this twice for each
> >(computer,code) pair, I took the median of three runs.
> 
> You have to be very careful of cache issues with micro-benchmarks like
> that. I think that you're ok because the cache profile of an AES
> function is probably pretty much fixed (it walks the input and the
> output and the tables are of fixed size I'm guessing). But if the
> faster impl uses different sized tables etc (or more code, looking at
> FULL_UNROLL) you might find that, when running with the rest of the
> Tor code, the results are rather different.

Right; I'm pretty confident of the 40% improvement from switching to
OpenSSL's assembly implementation where available, but less confident
of other improvements.

A couple more developments on this front, BTW:

  * I tried OpenSSL 0.9.8e on an x86_64 machine, and found out that
    either the i586 assembly code isn't used on x86_64, or it is used
    but offers no speed benefit over 0.9.7f.

  * It looks like OpenSSL 0.9.9 (or whatever they're calling the next
    one) will probably add assembly implementations for ARM, x86_64,
    and sparc.  Neat!

  * We suffer a bit because our AES_CTR implementation has to work
    on unaligned data.  I did an experiment using 508-byte cell
    payloads instead of 509-byte cell payloads, and xoring uint32_ts
    rather than chars: it knocked about 10% off my benchmark.  This is
    probably something to look at when we redesign the cell format.
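
A minimal sketch of the two XOR loops compared in the last item, for
illustration only: this is not the code in aes.c, and the names
xor_bytes and xor_words are hypothetical.  The word version uses
memcpy for its loads and stores so that it stays well-defined on
unaligned buffers; compilers typically turn those fixed-size copies
into plain 32-bit moves.  With a length that is a multiple of 4 (such
as 508), the tail loop is never entered.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte-at-a-time XOR of a keystream into a buffer.
 * Works for any length and any alignment. */
static void xor_bytes(uint8_t *dst, const uint8_t *keystream, size_t len)
{
  size_t i;
  for (i = 0; i < len; ++i)
    dst[i] ^= keystream[i];
}

/* Word-at-a-time XOR: processes 4 bytes per iteration, then finishes
 * any 1-3 leftover bytes (e.g. the last byte of a 509-byte payload)
 * one at a time. */
static void xor_words(uint8_t *dst, const uint8_t *keystream, size_t len)
{
  size_t i = 0;
  for (; i + 4 <= len; i += 4) {
    uint32_t a, b;
    memcpy(&a, dst + i, sizeof(a));       /* load 4 bytes of data */
    memcpy(&b, keystream + i, sizeof(b)); /* load 4 bytes of keystream */
    a ^= b;
    memcpy(dst + i, &a, sizeof(a));       /* store the result */
  }
  for (; i < len; ++i)                    /* leftover tail bytes */
    dst[i] ^= keystream[i];
}
```

Both functions produce identical output for the same inputs; any
speed difference depends on the compiler, the CPU, and the alignment
of the buffers, so it would need to be measured rather than assumed.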

peace,
-- 
Nick Mathewson
