Re: [f-cpu] Instruction census



Would it be possible to count the typical function call depth, and also the number of arguments of each function?

nicO

Michael Riepe <michael@stud.uni-hannover.de> wrote:

> On Wed, Jan 15, 2003 at 07:34:11PM +0000, cedric wrote:
> 
> > >   98104 instructions total
> > >   13079 move            13.3% ( 13.3% total)
> > >   10846 load            11.0% ( 24.3% total)
> > >    3545 storei           3.6% ( 78.1% total)
> > >    3533 loadi            3.6% ( 81.7% total)
> > >    3459 store            3.5% ( 85.3% total)
> > 
> > >   21383 lsu             21.7% ( 21.7% total)
> > 
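
For reference, the percentage columns in the census above can be reproduced with integer arithmetic. This is only a sketch: the names `census_row`, `share_tenths` and `print_census` are made up for illustration, and rounding down to a tenth of a percent is an assumption that happens to match the quoted figures.

```c
#include <stdio.h>

/* One row of an instruction census: a mnemonic (or unit name) and its count.
 * Hypothetical type, not part of any F-CPU tool. */
struct census_row {
    const char *name;
    long long count;
};

/* Share of `count` in `total`, in tenths of a percent, rounded down.
 * Rounding down (rather than to nearest) is an assumption; it is what
 * reproduces the figures quoted in the census. */
long long share_tenths(long long count, long long total)
{
    return count * 1000 / total;
}

/* Print each row with its own share and the running (cumulative) share. */
void print_census(const struct census_row *rows, int n, long long total)
{
    long long running = 0;

    printf("%7lld instructions total\n", total);
    for (int i = 0; i < n; i++) {
        long long own = share_tenths(rows[i].count, total);
        long long cum;

        running += rows[i].count;
        cum = share_tenths(running, total);
        printf("%7lld %-12s %3lld.%lld%% (%3lld.%lld%% total)\n",
               rows[i].count, rows[i].name,
               own / 10, own % 10, cum / 10, cum % 10);
    }
}
```

With the numbers above, `share_tenths(13079, 98104)` yields 133 (13.3%), and the four memory-access rows (load, storei, loadi, store) add up to 21383, which is the lsu figure.
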
> > Did you use the new register allocator for this test? I hope it will
> > change a lot of results for RISC CPUs, and perhaps it will remove some
> > loadi/storei, but I don't know the impact on load/store. (Do you have
> > any idea why we have 3 times more loads than stores, while loadi and
> > storei are very close?)
> 
> Which register allocator?  Did I miss something?  I used what Martin
> provided, plus my own bug fixes and backend extensions.
> 
> > > Goals for optimization (IMHO):
> > > 	- reduce number of load/store instructions
> > 
> > Perhaps it's easier for loadi/storei, but I really want to know where
> > all these loads come from.
> 
> Most of it comes from the code itself; with -O -fomit-frame-pointer,
> save/restore instructions are reduced to a minimum.
> 
> > > 	- increase number of conditional moves (in favor of jmp{cc})
> > > 	- avoid shift-and-add where mul/mac is faster
> > 
> > Hum, what about a "mac"shift instruction ?
> 
> Martin proposed `shadd[i]' which calculates`r1 += r2 
> or similar.  But that is rather hard to do with separate SHL and ASU
> execution units, and won't be faster than explicit shift and add
> instructions.  In fact, explicit instructions are more flexible.
> But an immediate version of `amac' would make sense, IMHO.  I'll add
> one and see what happens.
> 
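
To make the shift-and-add versus mul/mac point concrete, here is a small C sketch of a constant multiply written both ways, plus a multiply-accumulate with a constant factor. The function names are invented for illustration, and which instructions a given backend actually emits for them is not something this sketch can show.

```c
#include <stdint.h>

/* x * 10 expanded by hand into shifts and adds, using 10 = 8 + 2.
 * Unsigned arithmetic keeps the shifts well defined and wraps mod 2^32. */
uint32_t mul10_shift_add(uint32_t x)
{
    return (x << 3) + (x << 1);
}

/* The same product, left to the multiplier. */
uint32_t mul10_mul(uint32_t x)
{
    return x * 10u;
}

/* Multiply-accumulate with a constant factor: acc + x * 10.
 * This is the shape an immediate form of a mac instruction could cover
 * in one operation (an assumption, not a statement about the F-CPU ISA). */
uint32_t mac10(uint32_t acc, uint32_t x)
{
    return acc + x * 10u;
}
```

Both multiply variants give the same result for every 32-bit input, so the choice between them is purely a question of which is cheaper on the target.
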
> > > 	- make use of divrem[s] instruction
> > > 	- make use of SIMD instructions
> > 
> > I think that gcc supports SIMD only for string functions; it's really
> > hard to give real SIMD support to gcc.
> 
> Gcc 3.x has real SIMD support, but you'll have to use builtin functions
> explicitly (or use the  interface, where available).  E.g.
> with my patched version of the latest official fcpu-gcc release,
> `fcpu-gcc -S -O -fomit-frame-pointer' translates
> 
> 	/* use vectors that consist of two floats */
> 	typedef float __v2sf __attribute__((__mode__(__V2SF__)));
> 
> 	__v2sf
> 	sfmac_f(__v2sf a, __v2sf b, __v2sf c) {
> 	    return __builtin_addv2sf(a, __builtin_mulv2sf(b, c));
> 	}
> 
> into
> 
> 		.p2align 5
> 		.global sfmac_f
> 	sfmac_f:
> 		sfmac.32 r3,r2,r1
> 		jmp r63
> 
> which is all you can expect.
> 
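
For comparison, the same two-float multiply-add can be written with the generic `vector_size` attribute that GCC and Clang provide, without target-specific builtins. This is a sketch under assumptions: it does not use the `__builtin_addv2sf`/`__builtin_mulv2sf` builtins from the example above, the names `v2sf`, `fmac_v2sf` and `fmac_pairs` are illustrative, and whether a given fcpu-gcc build turns it into a single `sfmac.32` is untested.

```c
#include <string.h>

/* Vector of two floats (8 bytes) using the generic GCC/Clang vector
 * extension instead of a machine-mode attribute. */
typedef float v2sf __attribute__((vector_size(8)));

/* Element-wise a + b * c on both lanes. */
static v2sf fmac_v2sf(v2sf a, v2sf b, v2sf c)
{
    return a + b * c;
}

/* Convenience wrapper for plain float arrays. memcpy is used to move
 * data in and out of the vector type without aliasing problems. */
static void fmac_pairs(const float a[2], const float b[2],
                       const float c[2], float out[2])
{
    v2sf va, vb, vc, vr;

    memcpy(&va, a, sizeof va);
    memcpy(&vb, b, sizeof vb);
    memcpy(&vc, c, sizeof vc);
    vr = fmac_v2sf(va, vb, vc);
    memcpy(out, &vr, sizeof vr);
}
```

Calling `fmac_pairs` with a = {1, 2}, b = {3, 4}, c = {5, 6} stores 1 + 3*5 and 2 + 4*6 in `out`.
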
> -- 
>  Michael "Tired" Riepe <Michael.Riepe@stud.uni-hannover.de>
>  "All I wanna do is have a little fun before I die"
> *************************************************************
> To unsubscribe, send an e-mail to majordomo@seul.org
> with
> unsubscribe f-cpu       in the body. http://f-cpu.seul.org/


