
[f-cpu] Partial Writes Considered Harmful



This may come a little late in the design and development process, but...

I suggest we drop the `partial write' feature. In many cases, it makes
no sense to keep the upper part of a register when only a fraction of
it is written. E.g. when you load a byte from memory and want to use
it as a full-width integer (or boolean - remember that jmp/move always
look at the *whole* register), you either have to

    - `zero out' the destination register
    - load the LSByte

or

    - load the LSByte
    - mask off any other bit
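
To make the cost concrete, here is a small Python model of both sequences.
It is only a sketch: the 64-bit register width and the function names are my
assumptions for illustration, not part of any F-CPU definition.

```python
WIDTH = 64                      # assumed register width
MASK = (1 << WIDTH) - 1

def load_byte_partial(reg, byte):
    """Partial write: replace only the least significant byte of reg."""
    return (reg & ~0xFF & MASK) | (byte & 0xFF)

def zero_then_load(reg, byte):
    """First variant: zero out the destination, then load the LSByte."""
    reg = 0
    return load_byte_partial(reg, byte)

def load_then_mask(reg, byte):
    """Second variant: load the LSByte, then mask off every other bit."""
    reg = load_byte_partial(reg, byte)
    return reg & 0xFF

stale = 0xDEADBEEF00000000      # leftover contents of the destination register
print(hex(load_byte_partial(stale, 0x2A)))   # upper bits survive the load
print(hex(zero_then_load(stale, 0x2A)))
print(hex(load_then_mask(stale, 0x2A)))
```

Either way, two operations are needed before the byte can be used as a
full-width value.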

Additionally, partial writes make the register set and the scoreboard
logic (zero detection) much more complex than necessary - in fact, they
make some promising register set implementations almost impossible. The
only thing they make easier is constant loading. Indeed, that is the *only*
operation that needs the partial write ability *at all*. Doesn't it?

Currently, we need 8 constant loading instructions: loadcons and loadconsx
with four variants each (in order to be able to load up to 256 bits -
if there is ever an F-CPU with registers wider than 256 bits, we
will need *even more* instructions!). On the other hand, two slightly
different instructions would be sufficient for *all* word sizes:

    loadcons $imm17, reg    // similar to the original `loadconsx'
    => reg := sign_extend(imm17)

    loadconsp $imm16, reg   // `p' means `partial'
    => reg := shift_left(reg, 16) | imm16

Values between -65536 and 65535, inclusive, can be loaded with a
single instruction, 32-bit values need two instructions, and so on.
This solution is more general than the original loadcons[x] instructions
and IMHO also much more elegant.
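
For those who want to play with it, here is a small Python model of the two
instructions, plus a helper that splits a signed constant into one `loadcons'
followed by as many `loadconsp' steps as needed. This is a sketch only: the
64-bit width and the helper names `load_constant' and `run' are my
assumptions; the instruction semantics are the two definitions above.

```python
WIDTH = 64                      # assumed register width
MASK = (1 << WIDTH) - 1

def loadcons(imm17):
    """reg := sign_extend(imm17), imm17 being a 17-bit two's-complement field."""
    imm17 &= 0x1FFFF
    if imm17 & 0x10000:         # sign bit set
        imm17 -= 0x20000
    return imm17 & MASK

def loadconsp(reg, imm16):
    """reg := shift_left(reg, 16) | imm16"""
    return ((reg << 16) | (imm16 & 0xFFFF)) & MASK

def load_constant(value):
    """Return the (mnemonic, immediate) sequence that builds signed `value`."""
    chunks = []
    # Peel off 16-bit chunks until the remainder fits into 17 signed bits.
    while not -65536 <= value <= 65535:
        chunks.append(value & 0xFFFF)
        value >>= 16            # arithmetic shift, keeps the sign
    seq = [("loadcons", value & 0x1FFFF)]
    seq += [("loadconsp", c) for c in reversed(chunks)]
    return seq

def run(seq):
    """Execute a sequence on a single register and return its final value."""
    reg = 0
    for op, imm in seq:
        reg = loadcons(imm) if op == "loadcons" else loadconsp(reg, imm)
    return reg

for op, imm in load_constant(0x12345678):
    print(op, hex(imm))
```

The same loop handles wider registers without any new instructions; only
the number of `loadconsp' steps grows.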

Since we need 8 bits for the opcode and 6 bits for the destination
register, we can encode all variants using only a single opcode (compared
to 8 opcodes for loadcons[x]):

         8   + 1 + 1 +   16  +  6  = 32 bits
    +--------+---+---+-------+-----+
    | opcode | P | S | imm16 | reg |
    +--------+---+---+-------+-----+

        P=0 => load full register; S is the sign bit
        P=1 => shift the register left by 16 bits and load imm16 into
               the least significant 16 bits; S is ignored
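
The field layout can be checked with a few lines of Python. Note the
assumptions: I read the diagram left to right as most significant to least
significant bit (opcode in bits 31..24, reg in bits 5..0), and the opcode
value used in the example is made up.

```python
def encode(opcode, p, s, imm16, reg):
    """Pack the five fields into one 32-bit instruction word."""
    if not (0 <= opcode < 256 and p in (0, 1) and s in (0, 1)
            and 0 <= imm16 < 65536 and 0 <= reg < 64):
        raise ValueError("field out of range")
    return (opcode << 24) | (p << 23) | (s << 22) | (imm16 << 6) | reg

def decode(word):
    """Split a 32-bit instruction word back into its fields."""
    return {
        "opcode": (word >> 24) & 0xFF,
        "p":      (word >> 23) & 1,
        "s":      (word >> 22) & 1,
        "imm16":  (word >> 6) & 0xFFFF,
        "reg":    word & 0x3F,
    }

def immediate(fields):
    """Value the decoder sends along: S:imm16 as a signed 17-bit number
    when P=0, the plain 16-bit immediate when P=1 (S is ignored)."""
    if fields["p"]:
        return fields["imm16"]
    return fields["imm16"] - (fields["s"] << 16)

word = encode(0x12, 0, 1, 0xFFFF, 5)    # 0x12 is a hypothetical opcode
print(hex(word), decode(word), immediate(decode(word)))
```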

In case you didn't notice it: the same encoding is used by `loadaddri[d]'.

Implementing the new `loadcons' is simple: the decoder sign-extends the
immediate value and sends it along. `loadconsp' is a little more tricky
because it needs a `feedback loop' from one of the register set's read
ports to one of the write ports. Fortunately, the left shift and the
`or' operations take almost no time (we need an extra mux, the rest is
just a bunch of wires).

Without partial writes, other (non-SIMD) instructions that operate on
partial words shall set the upper bits of the result to zero (= simple
AND operation). Sign extension can either be performed by `move[s]' or
by a separate `sext' instruction; the `widen' instruction is no longer
necessary (it was an ugly kludge anyway).
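
In Python terms, the proposed behaviour looks roughly like this. Again a
sketch: the 64-bit width and the name `add_partial' are assumptions chosen
for illustration, and `sext' only models what such an instruction would
compute, not how it would be encoded.

```python
WIDTH = 64                      # assumed register width

def add_partial(a, b, size_bits):
    """Non-SIMD add on the low size_bits; upper result bits are set to zero."""
    return (a + b) & ((1 << size_bits) - 1)

def sext(value, size_bits, width=WIDTH):
    """Sign-extend the low size_bits of value to the full register width."""
    low = value & ((1 << size_bits) - 1)
    if low >> (size_bits - 1):  # sign bit of the partial word
        low -= 1 << size_bits
    return low & ((1 << width) - 1)

print(hex(add_partial(0x123400F0, 0x20, 8)))   # stale upper bits do not leak
print(hex(sext(0x80, 8)))
```

The result of a partial-word operation is then always a valid full-width
unsigned value, and a signed one is a single `sext' away.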

Ok, that had to be said.
Now it's your turn...
-- 
 Michael "Tired" Riepe <Michael.Riepe@stud.uni-hannover.de>
 "All I wanna do is have a little fun before I die"