
- *To*: F-CPU Mailing List <f-cpu@seul.org>
- *Subject*: [f-cpu] manual 0.2.5 quirks
- *From*: Michael Riepe <michael@stud.uni-hannover.de>
- *Date*: Mon, 3 Jun 2002 04:23:10 +0200
- *Delivered-To*: archiver@seul.org
- *Delivered-To*: f-cpu-outgoing@seul.org
- *Delivered-To*: f-cpu@seul.org
- *Delivery-Date*: Mon, 03 Jun 2002 13:03:54 -0400
- *Reply-To*: f-cpu@seul.org
- *Sender*: owner-f-cpu@seul.org

I'm currently busy with something different (in case you're nosy:
symbolic verification of the F-CPU execution units) but I stopped to
have a look at the PDF version of the latest manual...

... and I still found some quirks in the instruction set part:

- Register naming is inconsistent in sub and div:

      sub r3, r2, r1 => r1 = r2 - r3
      div r3, r2, r1 => r1 = r3 / r2

  The `sub' variant is more logical.

- The current ASU sets the `borrow' register to 1, not -1.
  See vhdl/eu_asu/iadd.vhdl, line 344ff.

- The description of `subi' is wrong (registers swapped):

      subi imm8, r2, r1 => r2 = r1 - imm8

  In general, <op>i and <op> should be consistent. That is, the
  instruction is supposed to calculate `r1 = r2 - imm8'.

- Registers swapped in `mod' description. Should read:

      r1 = r2 % r3

  BTW: What this instruction really computes is the *remainder* of
  the division r2 / r3, *not* the modulus (this makes a difference
  when the operands are signed numbers). It therefore should be
  called `rem', not `mod'. Analogously, `divm' should be called
  `divr', or maybe `divrem' (just like `addsub'). And while we're at
  it, `sort' could have `minmax' as an alias (since that's what it
  does: compute the minimum and maximum of its operands at the same
  time).

- The `alternate result register' is still named `r1+1' throughout
  the manual. Didn't we change that to `r1^1'?

- The description of `cmpl' is missing:

      r1 = r2 < r3 ? -1 : 0

- With all compare/max/sort instructions, the reference to IEEE
  floating-point is misleading. Even if the instructions work with
  signed integers, they will NEVER compare IEEE floats correctly.
  Adding signed integer comparison, on the other hand, should not be
  too hard (remember: just invert both sign bits before comparison).

- The description of `cmple' is missing as well. It performs:

      r1 = r2 <= r3 ? -1 : 0

- Syntax for `scmpli' and `scmplei' is wrong. Where did the immediate
  operand go? The descriptions are missing, too.
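To make the remainder-versus-modulus point above concrete, here is a minimal Python sketch. It assumes that the divider truncates toward zero (as C99 does), which is what gives a remainder whose sign follows the dividend, and it takes "modulus" to mean the floored variant that Python's `%` operator implements. The helper names `trunc_div`, `rem` and `mod` are made up for this illustration and are not F-CPU mnemonics.

```python
def trunc_div(a: int, b: int) -> int:
    """Integer division rounding toward zero (assumed divider behaviour)."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def rem(a: int, b: int) -> int:
    """Remainder of truncating division: the sign follows the dividend."""
    return a - b * trunc_div(a, b)


def mod(a: int, b: int) -> int:
    """Floored modulus (Python's % operator): the sign follows the divisor."""
    return a % b


# The two only differ when the operands have different signs.
for a, b in [(7, 3), (-7, 3), (7, -3), (-7, -3)]:
    print(f"a = {a:2}, b = {b:2}: rem = {rem(a, b):2}, mod = {mod(a, b):2}")
```

For operands of equal sign both functions agree; for -7 and 3 the remainder is -1 while the floored modulus is 2, which is the difference the naming suggestion is about.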
- The `bitop' instruction is still listed with the bit shuffling ops,
  and contains a reference to the SHL unit. We *could* make this
  true, but that means we'll have to add a pipeline stage to the SHL
  unit (or route the output to the ROP2 unit, which will take an
  extra Xbar cycle). IIRC we decided that the function (F) is to be
  encoded in the opcode, and that the immediate for bitopi should be
  8 bits wide. The correct descriptions are:

      bitop r3, r2, r1 => r1 = F(r2, 1 << r3)
      bitopi imm8, r2, r1 => r1 = F(r2, 1 << imm8)

- The `bitrev' instruction performs:

      bitrev r3, r2, r1 => r1 = reverse(r2) >> (size - r3 % size - 1)

  or, if you like that better:

      bitrev r3, r2, r1 => r1 = reverse(r2) >> (~r3 % size)

  That is, you always get ((r3 % size) + 1) result bits. The
  two-operand form (r3 = 0) makes no sense; it's essentially the same
  as `andi 1, r2, r1'. The -o suffix is unsupported unless we add a
  pipeline stage to the SHL unit (or similar, see `bitop'). The SIMD
  variant `sbitrev' is undocumented. Is it really useless?

- The double-word shifts are missing. We currently have

      dshiftl r3, r2, r1  =>  r1   = r2 << r3
                              r1^1 = r2 >> (size - r3)  (*)
      dshiftr r3, r2, r1  =>  r1   = (unsigned)r2 >> r3
                              r1^1 = r2 << (size - r3)  (*)
      dshiftra r3, r2, r1 =>  r1   = (signed)r2 >> r3
                              r1^1 = r2 << (size - r3)  (*)
      dbitrev r3, r2, r1  =>  r1   = reverse(r2) >> (size - r3 % size - 1)
                              r1^1 = reverse(r2) << (r3 % size + 1)  (*)

  (immediate and SIMD versions also available).

  (*) result will be zero if the shift count equals the chunk size.

- The sshift*/srot*/sbitrev ops are available in `full-SIMD' and
  `half-SIMD' modes. The latter performs an implicit `sdup' on the
  shift count. The manual should state which is which (and also
  mention the other variant if we're going to support it).

- In the drawings for mix/expand, it's still not clear whether
  `source #1' is r2 and `source #2' is r3, or vice versa.
  I suggest that the least significant chunk of the result should
  always come from r2 (that is, source #1 is r2 and source #2 is r3).

- In the description of the logic operators, the registers are named
  inconsistently with the rest of the manual (r3 is destination).
  F is to be encoded in the opcode in 3-bit form, and the immediate
  for `logici' is 8 bits wide (see `bitop' above). It's still not
  clear which operand is inverted when `andn' or `orn' is performed
  (I suggest r3, for symmetry). And finally: there is no `not'
  instruction.

- Floating-point compare is completely broken. With IEEE floats, two
  numbers can be less, equal or greater with respect to each other,
  but they can also be *unordered* (if one or both of them is NaN).
  To make things even more complicated, +0.0 and -0.0 compare equal,
  although they have different representations, while NaNs *never*
  compare equal even if their representations are identical.

- `fdiv' calculates r1 = r2 / r3, NOT r1 = r3 / r2.

- `fsqrt' has only two operands: r1 = fsqrt(r2).

- `flog'/`fexp' still have three-operand form which requires you to
  preload the logarithm's base into r3. If we *really* need
  three-operand forms, they should calculate something like

      r1 = log2(r2) / r3

  and

      r1 = exp2(r2 * r3)

  with two-operand forms that implicitly set r3 to 1.0. If you set r3
  to log2(n), you'll get log<n>(x) and <n>**x (for n > 0.0) without
  making the implementation of log() and exp() more complex than
  necessary.

- `faddsub' is supposed to calculate r2 + r3 and r2 - r3 (NOT r3 - r2).

- `move r0, r0' is no longer an alias for `nop'. There is a real
  `nop' instruction (opcode 0) now, and `move' has a different
  opcode. The textual description claims that `r2 is copied to r3',
  but the target is r1.

- While I suggested that `loadcons imm, r1' should be an assembler
  shortcut for a sequence of `loadcons.n imm16, r1' instructions, I
  didn't mean that the constant to load *must* be 64 bits wide.
  If it's only 32 bits wide, the assembler might use a shorter
  instruction sequence (e.g. a loadcons followed by a loadconsx). The
  difference between `loadcons imm, r1' and `loadconsx imm, r1'
  should be that the assembler sign-extends the constant in the
  latter case.

-- 
Michael "Tired" Riepe <Michael.Riepe@stud.uni-hannover.de>
"All I wanna do is have a little fun before I die"

*************************************************************
To unsubscribe, send an e-mail to majordomo@seul.org with
unsubscribe f-cpu in the body. http://f-cpu.seul.org/
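The floating-point compare item in the message above (unordered results, signed zeros, and why an integer compare cannot stand in for an IEEE compare) can be checked with a small Python sketch. It assumes IEEE 754 binary64 values, which is what Python's `float` uses on common platforms. The helper names `bits`, `fcompare` and `int_less` are invented for this illustration and do not correspond to any F-CPU instruction.

```python
import math
import struct


def bits(x: float) -> int:
    """Raw 64-bit pattern of a binary64 value."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def fcompare(a: float, b: float) -> str:
    """Four-way IEEE comparison: less, equal, greater or unordered."""
    if math.isnan(a) or math.isnan(b):
        return "unordered"
    if a < b:
        return "less"
    if a > b:
        return "greater"
    return "equal"


def int_less(a: float, b: float) -> bool:
    """Result of a signed 64-bit integer 'less than' on the raw bit patterns."""
    def signed(x: float) -> int:
        v = bits(x)
        return v - (1 << 64) if v >> 63 else v
    return signed(a) < signed(b)


nan = float("nan")
print(fcompare(0.0, -0.0), hex(bits(0.0)), hex(bits(-0.0)))
print(fcompare(nan, nan), bits(nan) == bits(nan))
print(-2.0 < -1.0, int_less(-2.0, -1.0))
```

The last line shows the integer compare getting two negative floats the wrong way round: IEEE floats are stored as sign and magnitude, so a larger bit pattern means a more negative value once the sign bit is set.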

**Follow-Ups**:

- **Re: [f-cpu] manual 0.2.5 quirks** *From:* Cedric BAIL <cedric.bail@free.fr>
- **Re: [f-cpu] manual 0.2.5 quirks** *From:* Yann Guidon <whygee@f-cpu.org>
