Re: tor callgrinds
On Wed, Feb 21, 2007 at 03:02:15AM -0500, Nick Mathewson wrote:
> Hi! One more question I'd like an answer for, when you have the time:
>
> On Sun, Feb 18, 2007 at 02:34:55AM -0800, Christopher Layne wrote:
> > On Sat, Feb 17, 2007 at 04:01:32PM -0500, Nick Mathewson wrote:
> [...]
> > > 3. To what extent does -O3 help over -O2? Most users seem to
> > > compile with -O2, so we should probably change our flags if the
> > > difference is nontrivial.
> >
> > I've found O3 to always benefit over O2. Primarily for:
> >
> > -finline-functions
>
> Can you quantify whether this improves Tor performance? If it does,
> I'll enable it for more recent GCCs. I have heard numerous rants
> about bad inlining decisions from 3.x gcc series, and numerous claims
> that things were never that bad, and numerous claims that things have
> gotten better. At this point, I will believe nothing but quantitative
> data. ;)
Yeah, for some reason it takes at least 20 gcc releases before people drop
old grudges.
Well, personally I'm using gcc 4.1.1 and 4.2.0, so that kind of throws 3.x
out with the bathwater on this host. But I've been trying to run the gamut
on -O2, -O3, and -Os. -Os has serious potential, particularly on my host,
which only has a 16kB L1 and 256kB L2.
I've just recently set up OProfile, so that's another source of information
available to us. Currently, I'm restructuring some things on the host, but
basically we just have to collect more data.
Also, if you're running callgrind on your own, I recommend using
--simulate-cache to have access to cache statistics. These are arguably
more important than ifetches.
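For reference, a callgrind run with cache simulation might look roughly like
the sketch below. The binary path ./src/or/tor and the TOR_BIN/RUN variables
are assumptions for illustration, not something fixed by this thread. Newer
Valgrind releases spell the option --cache-sim=yes, so check
"valgrind --tool=callgrind --help" for your version.

```shell
#!/bin/sh
# Sketch only: TOR_BIN is a hypothetical path; point it at your own build.
TOR_BIN="${TOR_BIN:-./src/or/tor}"

# --simulate-cache=yes is the spelling used in this thread;
# newer Valgrind releases use --cache-sim=yes instead.
CMD="valgrind --tool=callgrind --simulate-cache=yes $TOR_BIN"

# Print the command by default; set RUN=1 to actually start the profile run.
if [ "${RUN:-0}" = "1" ]; then
    $CMD
else
    echo "$CMD"
fi
```

The resulting callgrind.out.<pid> file can then be summarized with
callgrind_annotate, which reports the cache-miss columns alongside the
instruction fetch counts.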
-cl