Re: [f-cpu] GCC and jmpz vs. jmpl
hi !
Michael Riepe wrote:
On Tue, Jan 07, 2003 at 01:57:04PM +0100, devik wrote:
<snip>
For these cases we could teach gcc's combiner that
if (a > b)
can use jmpl. This works because we know that the non-equality
operator stores its result in bit 0 regardless of operand
size, and it is possibly faster than jmpz on FCPUs
with wider data types, where the zero flag computation
can take a long time. It also relieves us from the problem
of zero-extending all results.
That will work fine if the compare and jmpl instructions are paired.
a bypass path between ADD, ROP2 and the decoder will be made if possible.
But this can only be done when the core is complete.
If the condition comes from somewhere else (e.g. as a function parameter),
you'll have to compare with zero explicitly (usually, a zero extend
operation will be sufficient).
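To make the bit 0 versus zero-test point concrete, here is a small Python model (not F-CPU code). It assumes a 64-bit register, and it assumes that a chunk-sized compare writes only the low chunk and leaves the upper bits of the destination as they were. All function names are made up for this sketch.

```python
REG_BITS = 64                     # assumed register width for this model
REG_MASK = (1 << REG_BITS) - 1

def cmp_greater(a, b, size_bits, old_dest=0):
    """Model of a chunk-sized unsigned 'a > b' compare.

    Assumption: the result (all ones or zero) is written to the low
    size_bits only; the upper bits of the destination keep their old value.
    """
    chunk_mask = (1 << size_bits) - 1
    result = chunk_mask if (a & chunk_mask) > (b & chunk_mask) else 0
    return (old_dest & ~chunk_mask & REG_MASK) | result

def lsb_set(reg):
    """What a jmpl-style branch looks at in this model: bit 0 only."""
    return (reg & 1) == 1

def is_zero(reg):
    """What a jmpz-style branch looks at in this model: the whole register."""
    return (reg & REG_MASK) == 0

def zero_extend(reg, size_bits):
    """Clear everything above the low size_bits."""
    return reg & ((1 << size_bits) - 1)
```

With stale data in the upper bits, a false 8-bit compare still has bit 0 clear, but the whole-register zero test only gives the right answer after `zero_extend(reg, 8)`.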
On the other hand, there is a big problem with == and !=.
Yep, I know...
Right now I use xor and then jmpz/jmpnz.
that's the best way, AFAIK.
If I want to use scc,
I need to emit "cmple 0" to convert it to 1/-1 notation.
If we would like to use jmpl, it is the same problem.
So I'd like to ask: is it a big problem to perform
"cmpe" in the increment unit too? I know it can be done by
xor.and., but it is still not certain whether it will be there
and whether it will support more than 8 bits, and even if so
it will take 2 cycles.
One solution would be a `chunk-size' logical operation that zero-extends
the result. If we really had `xor.b', you could just write
// beq r1, r2, r4
xor.b r1, r2, r3
jmpz r3, r4
// bne r1, r2, r4
xor.b r1, r2, r3
jmpnz r3, r4
because the high part of r3 would be guaranteed to be zero.
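As a rough illustration of why the zero-extending form matters for jmpz, here is a Python model (not F-CPU code). It assumes a 64-bit register and models the hypothetical `xor.b' as "xor the low 8 bits, clear the rest"; the function names are invented for this sketch.

```python
REG_MASK = (1 << 64) - 1          # assumed 64-bit registers

def xor_full(a, b):
    """Plain full-width xor: upper bits of the operands show up in the result."""
    return (a ^ b) & REG_MASK

def xor_b(a, b):
    """Assumed behaviour of the proposed xor.b: 8-bit xor, zero-extended."""
    return (a ^ b) & 0xFF

def jmpz_taken(reg):
    """jmpz in this model: branch if the whole register is zero."""
    return reg == 0
```

If r1 holds 0x1122334455667742 and r2 holds 0x42, the low bytes are equal: `jmpz_taken(xor_b(r1, r2))` is true, while `jmpz_taken(xor_full(r1, r2))` is false because of the leftover upper bits.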
Look at the file FORMAT.txt in the snapshot : this is what is done.
I don't know what is in the latest manuals (shame on me)
but this is at least what i am going to implement.
But if you want to compile something like `return (int)a == (int)b',
you must use
// return a == b
xnor.and.q r1, r2, r3 // -1 if they're equal
andi $1, r3, r1
Just one question : is GCC bound to this "1/0" scheme ?
F-CPU is based on "0 / not 0", so in some cases we can drop the andi :-)
Or, if the result is not used in a computation or mask, the LSB
condition can be used.
// return a != b
xor.or.q r1, r2, r3 // -1 if they're different
andi $1, r3, r1
or, if `xor.or.q' is not available,
// return a == b
xor r1, r2, r3
cmple.q r0, r3, r3
andi $1, r3, r1
// return a != b
xor r1, r2, r3
cmpg.q r0, r3, r3
andi $1, r3, r1
but that will stall for three cycles (one for xor, two for cmpg).
yup, and fortunately it is not necessary :-)
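The difference between the "0 / not 0" result and the "1/0" value that C expects can be sketched in Python as follows. This is only a model: it assumes that `.q' denotes a 32-bit chunk and that the combine forms return all ones or zero, as the comments in the snippets above say. The function names are made up.

```python
Q_MASK = 0xFFFFFFFF               # assumption: .q is a 32-bit chunk

def xnor_and_q(a, b):
    """Model of xnor.and.q: all ones if every bit of a and b matches, else 0."""
    bits = ~(a ^ b) & Q_MASK
    return Q_MASK if bits == Q_MASK else 0

def xor_or_q(a, b):
    """Model of xor.or.q: all ones if any bit of a and b differs, else 0."""
    bits = (a ^ b) & Q_MASK
    return Q_MASK if bits != 0 else 0

def andi1(reg):
    """Model of 'andi $1': keep bit 0, giving the 1/0 value C expects."""
    return reg & 1
```

Here `andi1(xnor_and_q(7, 7))` gives 1 and `andi1(xnor_and_q(7, 8))` gives 0. When the value only feeds a branch, the all-ones/zero result is already enough and the final masking step can be skipped.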
bye,
YG