[f-cpu] GCC and jmpz vs. jmpl



Following up on my previous mail about zero-extending, there is
one other thing to discuss.
When we do the compares < > >= <= we get results as 1 or -1.
This is very nice, as gcc can then often eliminate the jump.
I set 1/-1 as the scc value and it now knows that:
if (a>b) b++; else b--;
can be compiled as:
cmpg r1,r2,r3
add r2,r3,r2

and "return a>b" becomes
cmpg r1,r2,r3
neg r3,r1
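
The first transformation above (the if/else folded into one compare plus one add) can be modelled in C. This is only a sketch: cmpg_model, update_branch and update_branchless are made-up names, and the polarity of the compare result (+1 when a > b, -1 otherwise) is an assumption about what "1 or -1" means, not something taken from the F-CPU manual.

```c
#include <stdint.h>

/* Hypothetical model of the compare result: +1 when a > b, -1 otherwise.
   The polarity is an assumption made for this sketch only. */
static int64_t cmpg_model(int64_t a, int64_t b)
{
    return (a > b) ? 1 : -1;
}

/* Branching form, as written in the C source. */
static int64_t update_branch(int64_t a, int64_t b)
{
    if (a > b) b++; else b--;
    return b;
}

/* Branch-free form: one compare followed by one add. */
static int64_t update_branchless(int64_t a, int64_t b)
{
    return b + cmpg_model(a, b);
}
```

Under that assumption both functions return the same value for every input that does not overflow, which is the property the compiler relies on when it drops the jump.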

For other modes we can add a truncation, e.g.
for "return (long)a > (long)b":
cmpg.64 r1,r2,r3
neg.32 r3,r1

or an extension, for "return (char)a > (char)b":
cmpg.8 r1,r2,r3
neg.8 r3,r1
widen.8 r1,r1 // what does this widen to? to 64 bits?

For these cases we could teach gcc's combiner that
if (a > b)
can use jmpl. This works because we know that the ordering
comparisons (< > >= <=) store their result in bit 0 regardless
of operand size, and it is possibly faster than jmpz on FCPUs
with wider data types, where the zero flag computation
can take a long time. It also relieves us of the problems
with zero-extending all results.

On the other hand, there is a big problem with == and !=.
Right now I use xor and then jmpz/nz. If I want to use scc
I need to emit "cmple 0" to convert the result to the 1/-1 notation.
If we would like to use jmpl, the problem is the same.
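
To make the cost argument concrete, here is a small C model of the two tests. It is a software sketch only, not a description of the F-CPU hardware, and the function names are made up. It shows that xor gives a zero word exactly when the operands are equal, that deciding "is it zero" has to combine every bit of the word (written here as an OR-reduction), and that a low-bit test looks at a single bit.

```c
#include <stdint.h>

/* xor leaves a zero word exactly when the two operands are equal. */
static uint64_t xor_diff(uint64_t a, uint64_t b)
{
    return a ^ b;
}

/* Zero test written as an OR-reduction: after the six folding steps,
   bit 0 holds the OR of all 64 input bits. Wider words need more steps. */
static int is_zero(uint64_t x)
{
    x |= x >> 32;
    x |= x >> 16;
    x |= x >> 8;
    x |= x >> 4;
    x |= x >> 2;
    x |= x >> 1;
    return (int)(~x & 1u);
}

/* Low-bit test: depends on one bit only, whatever the word width. */
static int low_bit(uint64_t x)
{
    return (int)(x & 1u);
}
```

For example, is_zero(xor_diff(a, b)) is 1 exactly when a == b, but it needs the whole reduction, while low_bit() is only meaningful on a value that already carries its answer in bit 0. That is why == and != need an extra conversion step before a low-bit jump can be used.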

So I'd like to ask: is it a big problem to perform
"cmpe" in the increment unit too? I know it is done by
xor.and, but it is still not certain whether that will be there,
or whether it will support more than 8 bits, and even if so
it will take 2 cycles.
It could save cycles if the zero flag is slow (if it is
possible at all!).

devik

PS: Will a stall occur here (due to the zero flag computation)?
cmplei 2,r1,r2
nop
movez r2,r0,r1
