Re: [f-cpu] IEEE FP exceptions
On Sat, Mar 01, 2003 at 11:27:00AM +0100, Yann Guidon wrote:
> however the F-CPU has no "FPU status flag register".
Not as a general register, that's right. But...
> It can be implemented as a SR (which is OK because
> it enforces ordering of the instructions, good for future architectures)
> and/or by a specific instruction that acts as a "fence" (ok too,
> because it would then only serialize the FP pipeline in a split arch.)
Exactly.
For FPU status, we may choose a serializing instruction or a special
register (or both). An `fstat' instruction is probably more convenient.
In order to be most useful, it should perform an atomic swap operation:
    fstat rx, ry    // ry := status; status := rx
where r0 in either slot means `don't': `fstat r0, ry' will only read
the status, `fstat rx, r0' will only write (clear?) it. The instruction
should also take an optional `w' suffix (for `wait') that lets us turn
the fence on and off as needed:
    fstatw rx, ry   // complete all outstanding FP operations first
    fstat  rx, ry   // execute and return immediately
Note that `fstatw r0, r0' may be used as an FP serializing instruction
with no further effect.
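To make the swap semantics concrete, here is a rough C model of how an emulator might execute `fstat'. It is only a sketch: `cpu_t', `fp_drain' and the field names are made up for illustration, the status word layout is still undefined, and I assume that "write" means storing rx as-is rather than clearing.

```c
#include <stdint.h>

/* Hypothetical emulator state; names are illustrative only. */
typedef struct {
    uint64_t r[64];      /* general registers (6-bit register fields); r0 is the "don't" slot */
    uint64_t fp_status;  /* hidden FP status word, contents not yet defined */
} cpu_t;

/* Placeholder for draining the FP pipeline. A sequential model has
 * nothing in flight, so there is nothing to wait for. */
static void fp_drain(cpu_t *cpu) { (void)cpu; }

/* fstat[w] rx, ry:  ry := status; status := rx  (r0 in a slot means "don't") */
static void fstat(cpu_t *cpu, unsigned rx, unsigned ry, int wait)
{
    if (wait)
        fp_drain(cpu);               /* `w' suffix: finish outstanding FP ops first */

    uint64_t old = cpu->fp_status;   /* read before write, so rx == ry is a real swap */
    if (rx != 0)
        cpu->fp_status = cpu->r[rx]; /* assumption: "write" stores rx unchanged */
    if (ry != 0)
        cpu->r[ry] = old;
}
```

With this model, `fstat(&cpu, 0, 0, 1)' does nothing except wait, which matches the `fstatw r0, r0' fence described above.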
There is, however, a big problem with FP status: Since instructions
complete out-of-order, it's undefined which instruction the status
comes from. You can only query the status of a series of FP instructions
collectively, unless you serialize the instruction stream after every
FP operation (which is going to be slooooooooow).
With or without exceptions, we'll need something to control the operating
modes of the FPU (set the default rounding mode, for example). That is,
there should be a (hidden) control register that can be read and written
with an `fcntl' instruction that doesn't have to wait for the previous
FP instruction to complete: When an ordinary FP operation starts,
it will (at least virtually) copy the current settings; that is, an
immediately following `fcntl' instruction won't affect its result.
Syntax and semantics could be similar to those of `fstat', except that
a different register is affected.
Note: these instructions have nothing to do with the Unix `fstat()' and
`fcntl()' system calls ;-)
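The "copy the current settings at start" rule for `fcntl' could be modelled roughly like this. Again just a sketch under assumptions: `fpu_t', `fp_op_t', `fp_issue' and `fp_fcntl' are hypothetical names, and the control word is treated as an opaque 64-bit value because its contents are not defined yet.

```c
#include <stdint.h>

/* Hypothetical model of the hidden FP control register. */
typedef struct {
    uint64_t fp_control;   /* opaque control word (rounding mode etc., layout undefined) */
} fpu_t;

/* An FP operation in flight carries the settings it was started with. */
typedef struct {
    double   a, b;         /* operands */
    uint64_t control;      /* snapshot taken at issue time */
} fp_op_t;

/* Start an FP operation: capture the control word in effect right now. */
static fp_op_t fp_issue(const fpu_t *fpu, double a, double b)
{
    fp_op_t op = { a, b, fpu->fp_control };
    return op;
}

/* fcntl-style swap: install new settings and return the old ones,
 * without waiting for operations that are already in flight. */
static uint64_t fp_fcntl(fpu_t *fpu, uint64_t new_control)
{
    uint64_t old = fpu->fp_control;
    fpu->fp_control = new_control;
    return old;
}
```

An operation issued before `fp_fcntl' keeps its snapshot, so the change only affects operations issued afterwards.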
> And of course there is a condition code left, we have
> zero, LSB and MSB, and the remaining code can be assigned
> to NaN or error condition, by reading a specific bit in the register
> (so it's not the same as a "FPU status register", in the principle,
> because there is no such thing as "sticky bits" and hacks like that).
I currently support the `nan' condition in the emulator, but its use
is limited because it only works with 64-bit data, similar to the `msb'
condition. I'd rather add an `ftest' instruction that checks its operand
and returns a `type indicator' as follows:
    bit 0:
        0 => sign == 0
        1 => sign == 1
    bit 1:
        0 => mantissa == 0
        1 => mantissa != 0
    bit 3...2:
        00 => exponent == 0...0
        01 => exponent == 1...1
        1x => exponent == any other value (*)
    bit max...4 are reserved (set to 0).
This nicely translates to:
    3210  exponent  mant  sign  type
    =======================================
    0000  e==0...0  m==0  s==0  +zero
    0001  e==0...0  m==0  s==1  -zero
    0010  e==0...0  m!=0  s==0  +denormal
    0011  e==0...0  m!=0  s==1  -denormal
    0100  e==1...1  m==0  s==0  +infinity
    0101  e==1...1  m==0  s==1  -infinity
    0110  e==1...1  m!=0  s==0  +nan(m)
    0111  e==1...1  m!=0  s==1  -nan(m)
    1xx0  e==other  any   s==0  +normal (*)
    1xx1  e==other  any   s==1  -normal (*)
    (*) may be further subdivided in later versions
That can be done with a handful of and/or gates, and can also be
implemented for 32-bit (float) and SIMD modes. Of course it could also
be emulated easily with ordinary integer instructions. In either case,
it can be used to implement the ISO C99 `fpclassify', `isfinite', `isinf',
`isnan', `isnormal' and `signbit' functions.
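As an illustration of the "emulated with ordinary integer instructions" route, here is a small C sketch of `ftest' for doubles, plus some of the C99-style predicates expressed on the type indicator. The function names are made up, it assumes `double' is IEEE 754 binary64, and for normal numbers it leaves the don't-care bits 2...1 at zero, which is just one possible choice.

```c
#include <stdint.h>
#include <string.h>

/* Sketch of the proposed type indicator for a 64-bit double. */
static unsigned ftest_d(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);              /* reinterpret the bit pattern */

    unsigned sign = (unsigned)(bits >> 63);
    unsigned exp  = (unsigned)(bits >> 52) & 0x7FF;
    uint64_t mant = bits & 0x000FFFFFFFFFFFFFull;

    if (exp != 0 && exp != 0x7FF)
        return 0x8 | sign;                       /* 1xxs: normal (xx left at 00 here) */

    unsigned r = sign;                           /* bit 0: sign */
    if (mant != 0)    r |= 0x2;                  /* bit 1: mantissa != 0 */
    if (exp == 0x7FF) r |= 0x4;                  /* bits 3...2 = 01: exponent all ones */
    return r;
}

/* Predicates on the type indicator t = ftest_d(x). */
static int t_signbit(unsigned t)  { return t & 1; }
static int t_isnormal(unsigned t) { return (t & 0x8) != 0; }
static int t_isinf(unsigned t)    { return (t & 0xE) == 0x4; }
static int t_isnan(unsigned t)    { return (t & 0xE) == 0x6; }
static int t_isfinite(unsigned t) { return (t & 0xC) != 0x4; }
```

Each predicate is a single mask-and-compare on the result, which is the point of encoding the classes this way.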
We also need a decent FP compare instruction (the one documented in the
manual is crap). Here's a more reasonable definition:
    fcomp r3, r2, r1
    Contents of r1:
        bit 0 = 1 if r2 < r3
        bit 1 = 1 if r2 == r3
        bit 2 = 1 if r2 > r3
        bit max...3 are reserved (set to 0)
The following additional rules shall apply:
- if any operand is a NAN, the result is 0
(a NAN equals nothing, not even itself)
- zero equals zero, regardless of sign
- any other value equals itself
- the order of values is:
+inf > +normal > +denormal > zero > -denormal > -normal > -inf
This allows us to test for all possible conditions in a uniform manner
(the masks are written in binary):
    (r1 & 001) != 0 => r2 < r3
    (r1 & 010) != 0 => r2 == r3
    (r1 & 011) != 0 => r2 <= r3
    (r1 & 100) != 0 => r2 > r3
    (r1 & 101) != 0 => r2 != r3 (less or greater; false if unordered)
    (r1 & 110) != 0 => r2 >= r3
    (r1 & 111) != 0 => ordered (that is, any of the above)
    (r1 & 111) == 0 => unordered (at least one operand was NAN)
The `less' condition can also be evaluated directly by a `jmpl' or
`movel' instruction.
Of course `fcomp' will also have float and SIMD variants.
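For reference, the compare rules above map directly onto C's comparison operators, so a software model of `fcomp' for doubles can be very short. This is a sketch only; `fcomp_d' and the `FC_*' / `fc_*' names are invented for the example.

```c
#include <stdint.h>

enum { FC_LT = 1, FC_EQ = 2, FC_GT = 4 };    /* bits 0, 1 and 2 of r1 */

/* Sketch of r1 = fp_compare(r2, r3). C comparisons already behave as
 * required: any comparison with a NaN is false, and +0 == -0. */
static unsigned fcomp_d(double r2, double r3)
{
    unsigned r1 = 0;
    if (r2 <  r3) r1 |= FC_LT;
    if (r2 == r3) r1 |= FC_EQ;
    if (r2 >  r3) r1 |= FC_GT;
    return r1;                                /* 0 means unordered */
}

/* A few of the mask tests on the result. */
static int fc_le(unsigned r1)        { return (r1 & (FC_LT | FC_EQ)) != 0; }
static int fc_ne(unsigned r1)        { return (r1 & (FC_LT | FC_GT)) != 0; }
static int fc_unordered(unsigned r1) { return (r1 & 7) == 0; }
```

One thing to keep in mind: `fc_ne' is "less or greater", so it is false for unordered operands, whereas C's `!=' is true when an operand is a NaN. A compiler would have to test `(r1 & 010) == 0' to get the C meaning.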
Note that we have to implement most checks anyway:
- all FP operations must check if their operands are NAN
- many instructions also handle ±INF specially
- fiaprx/fdiv must check for x != 0
- fsqrtiaprx/fsqrt must check for x >= 0
- flog must check for x > 0
and so on. The hardware will already be there, we just have to make
use of it.
To cut a long story short: I have added `fcomp' and `ftest' to the
FC tools (Cedric, are you still listening?):
========= fcomp (FP compare) instruction =========
    [s]fcomp[.f|.d] r3, r2, r1
        r1 = fp_compare(r2, r3)
        Result encoded as described above (please cut&paste)
    Instruction encoding:
        31...24  OP_FCOMP
        23...22  FP size bits (00 = float, 01 = double, 1x = reserved)
        21       SIMD flag
        20...18  unused (0)
        17...12  r3
        11... 6  r2
         5... 0  r1
========= ftest (FP test) instruction =========
    [s]ftest[.f|.d] r2, r1
        r1 = fp_classify(r2)
        Result encoded as described above (please cut&paste)
    Instruction encoding:
        31...24  OP_FTEST
        23...22  FP size bits (00 = float, 01 = double, 1x = reserved)
        21       SIMD flag
        20...12  unused (0)
        11... 6  r2
         5... 0  r1
========= end of manual pieces =========
Note: there is no `x' flag because these instructions never fail.
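In case it helps whoever updates an assembler, the two encodings can be packed as in the following C sketch. Note that the opcode numbers `OP_FCOMP_HYP' and `OP_FTEST_HYP' are placeholders invented for this example (the real OP_FCOMP / OP_FTEST values are not stated here), and the function names are hypothetical as well.

```c
#include <stdint.h>

/* Placeholder opcodes, NOT the real assignments. */
#define OP_FCOMP_HYP 0xF0u
#define OP_FTEST_HYP 0xF1u

enum fp_size { FP_FLOAT = 0, FP_DOUBLE = 1 };   /* bits 23...22; 1x is reserved */

/* [s]fcomp[.f|.d] r3, r2, r1 */
static uint32_t encode_fcomp(enum fp_size size, int simd,
                             unsigned r3, unsigned r2, unsigned r1)
{
    return ((uint32_t)OP_FCOMP_HYP << 24)       /* bits 31...24: opcode */
         | ((uint32_t)size << 22)               /* bits 23...22: FP size */
         | ((uint32_t)(simd != 0) << 21)        /* bit 21: SIMD flag */
         | ((uint32_t)(r3 & 63u) << 12)         /* bits 17...12 (20...18 stay 0) */
         | ((uint32_t)(r2 & 63u) << 6)          /* bits 11...6 */
         |  (uint32_t)(r1 & 63u);               /* bits 5...0 */
}

/* [s]ftest[.f|.d] r2, r1 */
static uint32_t encode_ftest(enum fp_size size, int simd,
                             unsigned r2, unsigned r1)
{
    return ((uint32_t)OP_FTEST_HYP << 24)
         | ((uint32_t)size << 22)
         | ((uint32_t)(simd != 0) << 21)        /* bits 20...12 stay 0 */
         | ((uint32_t)(r2 & 63u) << 6)
         |  (uint32_t)(r1 & 63u);
}
```

With the placeholder opcodes, `encode_fcomp(FP_DOUBLE, 0, 3, 2, 1)' yields 0xF0403081.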
I'll also add `fcntl' and `fstat' if there are no objections, but I
haven't fully designed the details yet (in particular, the contents of
the status and control registers are still undefined).
Finally, back to condition codes. Since the `msb' and `nan' conditions
increasingly turn out to be useless (and are also badly designed), why
don't we drop them? It also turned out during the gcc port that the
`zero' condition is particularly hard to handle (it sometimes needs extra
zero-extend instructions), so we may as well redesign the whole thing.
My suggestion is to check only the appropriate chunk, and to also take
SIMD into account. With only two conditions (`lsb' and `zero', and
of course their negations) we could use the third bit as an `all/any'
indicator:
    if SIMD = 0 then
        branch if condition is true for chunk 0
    elsif all = 1 then
        branch if condition is true for all chunks
    else
        branch if condition is true for any chunk
    end if
Of course this will add size and simd flags to all conditional operations
(in particular, jmpcc). On the other hand, conditional instructions
would become more flexible, especially with wide (> 64 bits) registers.
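Here is how the proposed condition check might look in an emulator, written as a C sketch. It assumes a 64-bit register and chunk sizes of 8, 16, 32 or 64 bits; `cond_taken' and the enum names are hypothetical, and wider registers would simply loop over more chunks.

```c
#include <stdint.h>

enum cond { COND_ZERO, COND_LSB };    /* the two remaining conditions */

/* Returns nonzero if the branch is taken.
 *   chunk_bits: 8, 16, 32 or 64 (from the size flag)
 *   negate:     test the negated condition instead
 *   simd, all:  the SIMD flag and the all/any indicator */
static int cond_taken(uint64_t reg, unsigned chunk_bits, enum cond c,
                      int negate, int simd, int all)
{
    uint64_t mask = chunk_bits == 64 ? ~0ull : ((1ull << chunk_bits) - 1);
    unsigned n = simd ? 64 / chunk_bits : 1;    /* non-SIMD: look at chunk 0 only */
    unsigned hits = 0;

    for (unsigned i = 0; i < n; i++) {
        uint64_t chunk = (reg >> (i * chunk_bits)) & mask;
        int t = (c == COND_ZERO) ? (chunk == 0) : (int)(chunk & 1);
        if (t != (negate != 0))
            hits++;                             /* condition holds for this chunk */
    }
    return (simd && all) ? hits == n : hits > 0;
}
```

Because only the selected chunk is examined in the non-SIMD case, a 32-bit value would not need a zero-extend before a `zero' test.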
> hoping to revive this list,
I guess that was enough revival for a single mail, wasn't it? ;)
--
Michael "Tired" Riepe <Michael.Riepe@stud.uni-hannover.de>
"All I wanna do is have a little fun before I die"