Re: [f-cpu] Instruction census
- To: <f-cpu@seul.org>
- Subject: Re: [f-cpu] Instruction census
- From: <cyrano@nerim.net>
- Date: Wed, 15 Jan 2003 14:20:11 CET
- Delivery-date: Wed, 15 Jan 2003 08:21:46 -0500
- Reply-to: f-cpu@seul.org
- Sender: owner-f-cpu@seul.org
Would it be possible to measure the typical function call depth, and to count the number of arguments of each function?
nicO
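
One way the call-depth question could be approached at run time is sketched below. It assumes the compiler supports GCC's `-finstrument-functions` option, which makes every function call the two `__cyg_profile_func_*` hooks; whether the F-CPU port handles this is not confirmed here. The counters and `call_depth_max` are hypothetical names, and counting arguments per function is not covered by this sketch.

```c
#include <assert.h>

/* Hypothetical counters, not part of any F-CPU tool. */
static unsigned long cur_depth;
static unsigned long max_depth;

/* GCC calls these hooks on every function entry and exit when the
   program is built with -finstrument-functions. The attribute keeps
   the hooks themselves from being instrumented. */
__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    (void)this_fn;
    (void)call_site;
    if (++cur_depth > max_depth)
        max_depth = cur_depth;
}

__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    (void)this_fn;
    (void)call_site;
    assert(cur_depth > 0);   /* every exit must match an earlier enter */
    --cur_depth;
}

/* Deepest nesting seen so far. */
__attribute__((no_instrument_function))
unsigned long call_depth_max(void)
{
    return max_depth;
}
```

Linking this file into a benchmark built with `-finstrument-functions` and printing `call_depth_max()` at exit would give the maximum depth reached; a histogram of `cur_depth` values would be needed for a "typical" depth.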
Michael Riepe <michael@stud.uni-hannover.de> wrote:
> On Wed, Jan 15, 2003 at 07:34:11PM +0000, cedric wrote:
>
> > > 98104 instructions total
> > > 13079 move 13.3% ( 13.3% total)
> > > 10846 load 11.0% ( 24.3% total)
> > > 3545 storei 3.6% ( 78.1% total)
> > > 3533 loadi 3.6% ( 81.7% total)
> > > 3459 store 3.5% ( 85.3% total)
> >
> > > 21383 lsu 21.7% ( 21.7% total)
> >
> > Did you use the new register allocator for this test? I hope it will change
> > a lot of results for RISC CPUs, and perhaps it will remove some loadi/storei,
> > but I don't know the impact on load/store. (Do you have any idea why we have
> > 3 times more loads than stores, while loadi and storei are very close?)
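
The figures behind that question can be checked directly from the quoted census. The sketch below only redoes the arithmetic: the four memory-access counts add up to exactly the 21383 reported for the lsu, load/store comes out at about 3.1, and loadi/storei is just under 1. The names `count_ratio` and `lsu_sum` are made up for this illustration.

```c
#include <assert.h>

/* Counts copied from the quoted census. */
enum {
    N_LOAD   = 10846,
    N_STOREI = 3545,
    N_LOADI  = 3533,
    N_STORE  = 3459,
    N_LSU    = 21383
};

/* Ratio of two instruction counts as a double. */
double count_ratio(long a, long b)
{
    assert(b != 0);
    return (double)a / (double)b;
}

/* Sum of the four memory-access instruction counts. */
long lsu_sum(void)
{
    return (long)N_LOAD + N_STOREI + N_LOADI + N_STORE;
}
```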
>
> Which register allocator? Did I miss something? I used what Martin
> provided, plus my own bug fixes and backend extensions.
>
> > > Goals for optimization (IMHO):
> > > - reduce number of load/store instructions
> >
> > Perhaps it's easier for loadi/storei, but I really want to know where all
> > these loads come from.
>
> Most of it comes from the code itself; with -O -fomit-frame-pointer,
> save/restore instructions are reduced to a minimum.
>
> > > - increase number of conditional moves (in favor of jmp{cc})
> > > - avoid shift-and-add where mul/mac is faster
> >
> > Hmm, what about a "mac"shift instruction?
>
> Martin proposed `shadd[i]' which calculates `r1 += r2
> or similar. But that is rather hard to do with separate SHL and ASU
> execution units, and won't be faster than explicit shift and add
> instructions. In fact, explicit instructions are more flexible.
> But an immediate version of `amac' would make sense, IMHO. I'll add
> one and see what happens.
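
To make the shift-and-add versus multiply trade-off concrete, here is a small C model. It is only an illustration: the shift operand of the proposed `shadd` is missing from the archived message, so `shadd_model` assumes the form `acc += x << n`, and all function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed semantics of the proposed shadd: acc += x << n.
   The shift operand is missing from the archived message, so this
   is an illustration only. */
uint32_t shadd_model(uint32_t acc, uint32_t x, unsigned n)
{
    assert(n < 32);          /* shifting by >= 32 is undefined in C */
    return acc + (x << n);
}

/* x * 10 written as shift-and-add: 10 = 8 + 2. */
uint32_t times10_shift_add(uint32_t x)
{
    return shadd_model(x << 1, x, 3);
}

/* The same product with a plain multiply. */
uint32_t times10_mul(uint32_t x)
{
    return x * 10u;
}
```

Both versions give the same result modulo 2^32. Written with explicit instructions, the first needs two shifts and an add, which is the kind of sequence the census goal suggests replacing where mul/mac is faster.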
>
> > > - make use of divrem[s] instruction
> > > - make use of SIMD instructions
> >
> > I think that gcc supports SIMD only for string functions; it's really hard to
> > give gcc real SIMD support.
>
> Gcc 3.x has real SIMD support, but you'll have to use builtin functions
> explicitly (or use the interface, where available). E.g.
> with my patched version of the latest official fcpu-gcc release,
> `fcpu-gcc -S -O -fomit-frame-pointer' translates
>
> /* use vectors that consist of two floats */
> typedef float __v2sf __attribute__((__mode__(__V2SF__)));
>
> __v2sf
> sfmac_f(__v2sf a, __v2sf b, __v2sf c) {
>     return __builtin_addv2sf(a, __builtin_mulv2sf(b, c));
> }
>
> into
>
> .p2align 5
> .global sfmac_f
> sfmac_f:
> sfmac.32 r3,r2,r1
> jmp r63
>
> which is all you can expect.
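
For comparison, the same multiply-add can be written without target-specific builtins by using GCC's generic vector extension (the `vector_size` attribute, which Clang also accepts). This is a sketch under that assumption, not the F-CPU port's interface; `v2sf`, `fmac_v2sf` and `v2sf_store` are names invented here, and whether a given backend emits a single fused instruction for it depends on the target.

```c
#include <assert.h>
#include <string.h>

/* Two packed floats, using GCC's generic vector extension
   (also accepted by Clang). */
typedef float v2sf __attribute__((vector_size(8)));

/* Element-wise a + b * c; the compiler picks the instructions. */
v2sf fmac_v2sf(v2sf a, v2sf b, v2sf c)
{
    return a + b * c;
}

/* Copy the two lanes out into a plain array. */
void v2sf_store(float out[2], v2sf v)
{
    assert(sizeof v == 2 * sizeof(float));
    memcpy(out, &v, sizeof v);
}
```

Vectors can be built with a brace initializer, e.g. `v2sf a = {1.0f, 2.0f};`, and read back with `v2sf_store`.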
>
> --
> Michael "Tired" Riepe <Michael.Riepe@stud.uni-hannover.de>
> "All I wanna do is have a little fun before I die"
> *************************************************************
> To unsubscribe, send an e-mail to majordomo@seul.org
> with
> unsubscribe f-cpu in the body. http://f-cpu.seul.org/