
Re: Tr:[f-cpu] usage of 64 registers & ILP



> First, during my development days I have never seen an algorithm
> (except unrolled loops) that can use 64 regs within one stack
> frame.
> 
> >>> Most of the time, if you want to put n (pipeline depth) cycles
> between the write of a register and its read, and if you use the cmove
> trick (to avoid jumps), you will have great pressure on the register
> set, so you will need that many registers (there is no OOO in fcpu).
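
For readers unfamiliar with the cmove trick mentioned in the quote, here is a
minimal sketch in portable C. It is only an illustration, not F-CPU code: the
function names select_i64, max_branch and max_branchless are made up for this
example, and whether a compiler turns either form into a real conditional move
depends on the target. The point is that the branch-free form needs both
candidate values live in registers at the same time, which is where the extra
register pressure comes from.

```c
#include <stdint.h>

/* Branchy version: only one of a or b is needed after the test,
   but the jump can stall an in-order pipeline. */
int64_t max_branch(int64_t a, int64_t b)
{
    if (a > b)
        return a;
    return b;
}

/* Hypothetical helper: branch-free select, the software analogue of a
   conditional move. Both x and y must already be computed and held in
   registers before the select executes. */
int64_t select_i64(int cond, int64_t x, int64_t y)
{
    /* mask is all ones when cond is non-zero, all zeros otherwise */
    uint64_t mask = (uint64_t)0 - (uint64_t)(cond != 0);
    return (int64_t)(((uint64_t)x & mask) | ((uint64_t)y & ~mask));
}

/* Same result as max_branch, without a jump in the source. */
int64_t max_branchless(int64_t a, int64_t b)
{
    return select_i64(a > b, a, b);
}
```

With many such selects in flight, each one keeps two inputs plus the condition
alive, so the number of live registers grows quickly.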

OK, so is the assumption below true?
- To keep the pipeline full, the program must exhibit an ILP of at
  least 5 at every point.

If so, does it mean that binary tree or linked list handling
will cause bubbles of about 4 cycles in the pipeline? :-0
I have had a hard time optimizing the QoS queue in the Linux kernel
for gigabit Ethernet flows; the code is full of list and tree searches ...
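
To make the concern concrete, here is a small C sketch contrasting the two
cases. It is an illustration under assumptions, not the kernel QoS code: the
struct node layout and the function names sum_list and sum_array_ilp4 are
invented for this example. In the list walk, the address of each load comes
from the previous load, so an in-order pipeline has nothing independent to
issue while it waits. In the array version, the four accumulators do not depend
on each other, so their operations can overlap.

```c
#include <stddef.h>

/* Hypothetical node type, for illustration only. */
struct node {
    long value;
    struct node *next;
};

/* Serial dependency chain: p->next must be loaded before the next
   iteration can even compute its address, so the available ILP here
   is close to 1. */
long sum_list(const struct node *head)
{
    long sum = 0;
    for (const struct node *p = head; p != NULL; p = p->next)
        sum += p->value;
    return sum;
}

/* Four independent accumulators: the adds inside one iteration have
   no data dependency on each other, so an in-order pipeline can
   overlap them. The cost is four live accumulator registers. */
long sum_array_ilp4(const long *a, size_t n)
{
    long s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }
    /* Handle the remaining 0..3 elements. */
    for (; i < n; i++)
        s0 += a[i];

    return s0 + s1 + s2 + s3;
}
```

The list walk cannot be unrolled this way, because the next element is unknown
until the current load completes; that is the case I am worried about.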

By the way, does anybody know the granularity of the IA32, IA64 and 21256
pipelines?

have a nice day,
devik

*************************************************************
To unsubscribe, send an e-mail to majordomo@seul.org with
unsubscribe f-cpu       in the body. http://f-cpu.seul.org/