[f-cpu] Partial Writes Considered Harmful
This may come a little late in the design and development process, but...
I suggest we drop the `partial write' feature. In many cases, it makes
no sense to keep the upper part of a register when only a fraction of
it is written. E.g. when you load a byte from memory and want to use
it as a full-width integer (or boolean - remember that jmp/move always
look at the *whole* register), you either have to
- `zero out' the destination register
- load the LSByte
or
- load the LSByte
- mask off any other bit
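To make the problem concrete, here is a rough Python sketch of the two workarounds. The 64-bit register width, the helper name `partial_write` and the stale register contents are assumptions chosen for illustration, not F-CPU definitions.

```python
WIDTH = 64                       # assumed register width, illustration only
FULL_MASK = (1 << WIDTH) - 1

def partial_write(reg, value, bits):
    """Hypothetical model: write the low `bits` bits, keep the rest of reg."""
    low_mask = (1 << bits) - 1
    return (reg & ~low_mask & FULL_MASK) | (value & low_mask)

stale = 0xDEADBEEF00000000       # leftover contents of the destination
byte = 0x00                      # byte fetched from memory (a `false' boolean)

# Plain partial load: the upper bits survive, so a whole-register
# zero test sees a non-zero value although the byte is zero.
naive = partial_write(stale, byte, 8)

# Workaround 1: zero out the destination, then load the LSByte.
way1 = partial_write(0, byte, 8)

# Workaround 2: load the LSByte, then mask off any other bit.
way2 = partial_write(stale, byte, 8) & 0xFF
```

Either workaround costs an extra instruction per load, which is the overhead the proposal wants to get rid of.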
Additionally, partial writes make the register set and the scoreboard
logic (zero detection) much more complex than necessary - in fact, they
make some promising register set implementations almost impossible. The
only thing they make easier is constant loading. In fact, that is the *only*
operation that needs the partial write ability *at all*. Doesn't it?
Currently, we need 8 constant loading instructions: loadcons and loadconsx
with four variants each (in order to be able to load up to 256 bits -
if there will ever be an F-CPU with registers wider than 256 bits, we
will need *even more* instructions!). On the other hand, two slightly
different instructions would be sufficient for *all* word sizes:
    loadcons  $imm17, reg    // similar to the original `loadconsx'
        => reg := sign_extend(imm17)
    loadconsp $imm16, reg    // `p' means `partial'
        => reg := shift_left(reg, 16) | imm16
Values between -65536 and 65535, inclusive, can be loaded with a
single instruction, 32-bit values need two instructions (one loadcons
followed by one loadconsp), and so on.
This solution is more general than the original loadcons[x] instructions
and IMHO also much more elegant.
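The semantics of the two proposed instructions can be modelled in a few lines of Python. This is only a sketch: the 64-bit register width and the helper `to_signed` are assumptions added for illustration, and registers are represented as unsigned bit patterns.

```python
WIDTH = 64                     # assumed register width
MASK = (1 << WIDTH) - 1

def loadcons(imm17):
    """reg := sign_extend(imm17), returned as an unsigned bit pattern."""
    imm17 &= 0x1FFFF
    if imm17 & 0x10000:        # bit 16 is the sign bit
        imm17 -= 0x20000
    return imm17 & MASK

def loadconsp(reg, imm16):
    """reg := shift_left(reg, 16) | imm16, truncated to the register."""
    return ((reg << 16) | (imm16 & 0xFFFF)) & MASK

def to_signed(reg):
    """Interpret a register bit pattern as a two's complement integer."""
    return reg - (1 << WIDTH) if reg >> (WIDTH - 1) else reg

# One instruction covers -65536 .. 65535.
lowest = to_signed(loadcons(0x10000))
highest = to_signed(loadcons(0x0FFFF))

# A 32-bit constant takes two instructions.
r = loadcons(0x1234)
r = loadconsp(r, 0x5678)

# A 64-bit constant takes four: one loadcons, three loadconsp.
q = loadcons(0x0123)
for chunk in (0x4567, 0x89AB, 0xCDEF):
    q = loadconsp(q, chunk)
```

Note that the 17th bit is what lets an unsigned 32-bit value with the top bit set be loaded in two steps: the first chunk goes in with a zero sign bit, so nothing is smeared into the upper half of the register.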
Since we need 8 bits for the opcode and 6 bits for the destination
register, we can encode all variants using only a single opcode (compared
to 8 opcodes for loadcons[x]):
    | opcode | P | S | imm16 | reg |
        8    + 1 + 1 +  16   +  6  = 32 bits

    P=0 => load full register; S is the sign bit (bit 16 of imm17)
    P=1 => shift the register left by 16 bits and load imm16 into its
           least significant 16 bits; S is ignored
In case you didn't notice it: the same encoding is used by `loadaddri[d]'.
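For illustration, the 32-bit instruction word could be packed and unpacked as in the Python sketch below. The bit positions assume the fields are laid out from the most significant bit downwards in the order given above, and the opcode value 0x42 is purely hypothetical; neither is taken from the F-CPU manual.

```python
OPCODE_LOADCONS = 0x42           # hypothetical opcode value

def encode(opcode, p, s, imm16, reg):
    """Pack | opcode(8) | P(1) | S(1) | imm16(16) | reg(6) | into 32 bits."""
    assert 0 <= opcode < 256 and p in (0, 1) and s in (0, 1)
    assert 0 <= imm16 < (1 << 16) and 0 <= reg < 64
    return (opcode << 24) | (p << 23) | (s << 22) | (imm16 << 6) | reg

def decode(word):
    """Split a 32-bit instruction word back into its fields."""
    return {
        "opcode": (word >> 24) & 0xFF,
        "p": (word >> 23) & 1,
        "s": (word >> 22) & 1,
        "imm16": (word >> 6) & 0xFFFF,
        "reg": word & 0x3F,
    }

# loadcons $-2, r5: P=0, and S carries bit 16 of the 17-bit immediate.
word = encode(OPCODE_LOADCONS, 0, 1, 0xFFFE, 5)
fields = decode(word)
imm17 = (fields["s"] << 16) | fields["imm16"]
```

With P=0 the decoder would rebuild the 17-bit immediate from S and imm16 before sign-extending it; with P=1 it would pass imm16 through unchanged.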
Implementing the new `loadcons' is simple: the decoder sign-extends the
immediate value and sends it along. `loadconsp' is a little more tricky
because it needs a `feedback loop' from one of the register set's read
ports to one of the write ports. Fortunately, the left shift and the
`or' operations take almost no time (we need an extra mux, the rest is
just a bunch of wires).
Without partial writes, other (non-SIMD) instructions that operate on
partial words shall set the upper bits of the result to zero (= simple
AND operation). Sign extension can either be performed by `move[s]' or
by a separate `sext' instruction; the `widen' instruction is no longer
necessary (it was an ugly kludge anyway).
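As a sketch of what this means for a partial-word operation, the Python below models a sized add that clears the upper result bits, plus a separate sign extension step. The 64-bit width and the function name `add_sized` are assumptions; `sext` only mirrors the name of the proposed instruction and is not its definition.

```python
WIDTH = 64                       # assumed register width
MASK = (1 << WIDTH) - 1

def add_sized(a, b, bits):
    """Add the low `bits` bits of a and b; upper result bits are zero."""
    return (a + b) & ((1 << bits) - 1)

def sext(reg, bits):
    """Sign-extend the low `bits` bits of reg to the full register."""
    low_mask = (1 << bits) - 1
    low = reg & low_mask
    if low >> (bits - 1):        # sign bit of the partial word is set
        low |= MASK & ~low_mask
    return low

# 8-bit add: whatever was in the upper bits of the operands is gone.
result = add_sized(0xDEAD00F0, 0x20, 8)

# If the byte is meant to be signed, extend it explicitly afterwards.
signed_result = sext(add_sized(0x70, 0x10, 8), 8)
```

The result of every such operation is a well-defined full-width value, so jmp/move can test the whole register without any preparation.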
Ok, that had to be said.
Now it's your turn...
Michael "Tired" Riepe <Michael.Riepe@stud.uni-hannover.de>
"All I wanna do is have a little fun before I die"