Re: [f-cpu] New SHL EU
On Mon, May 13, 2002 at 11:59:00PM +0200, I wrote:
[...]
> The shifter now supports both `semi' and `full' SIMD operations -- that
> is, `chunk(a) op b' *and* `chunk(a) op chunk(b)'. I also hope that I
> got the bitrev instruction right this time. It should perform the
> function `y = rotate_right(bit_reverse(a), chunksize-b-1)'; but one
> never knows.
... as I said: one never knows. Of course the function is
y = shift_right(bit_reverse(a, chunksize), chunksize-b-1)
Instructions summarized (pseudo-C):
// In bitwise operations, B is always modulo the chunk size.
// That means:
// 0 <= B < chunksize
// But also:
// 0 < B + 1 <= chunksize
// 0 < chunksize - B <= chunksize
// 0 <= chunksize - B - 1 < chunksize
// You've been warned.
// In `full-SIMD' mode, B comes from the appropriate chunk.
// In `semi-SIMD' mode, B is taken from chunk 0 (zero).
// Immediate mode is *always* `semi-SIMD', but that's probably
// handled elsewhere (also for addi/subi/muli and so on).
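The shift-count rules above can be sketched in plain C. This is only an illustration, not the VHDL: it assumes a 64-bit register, chunk sizes of 8, 16, 32 or 64 bits, and that chunk 0 is the least significant one. The helper names get_chunk and shift_count are made up for this example.

```c
#include <stdint.h>

/* Extract chunk number idx from a 64-bit register.
   Assumption: chunk 0 is the least significant chunk.
   chunksize is the chunk width in bits (8, 16, 32 or 64). */
static uint64_t get_chunk(uint64_t reg, unsigned chunksize, unsigned idx)
{
    uint64_t mask = (chunksize == 64) ? ~UINT64_C(0)
                                      : ((UINT64_C(1) << chunksize) - 1);
    return (reg >> (idx * chunksize)) & mask;
}

/* Pick the shift count B for chunk idx.
   full_simd != 0: B comes from the matching chunk of b.
   full_simd == 0: B comes from chunk 0 of b (semi-SIMD).
   In both modes B is reduced modulo the chunk size. */
static unsigned shift_count(uint64_t b, unsigned chunksize,
                            unsigned idx, int full_simd)
{
    uint64_t raw = get_chunk(b, chunksize, full_simd ? idx : 0);
    return (unsigned)(raw % chunksize);
}
```

With 32-bit chunks and b = 0x0000000300000021, chunk 1 shifts by 3 in full-SIMD mode, but by 1 in semi-SIMD mode (0x21 modulo 32).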
(s)shiftl:
    // regular output:
    Y = A << B;
    // alternate output ("leftovers"):
    // use this for a `double-width' shift
    Y2 = A >> (chunksize - B);
(s)shiftr/(s)shiftra:
    // regular output:
    // `shiftra' will duplicate the MSB (signed shift)
    // `shiftr' zero-fills (unsigned shift)
    Y = A >> B;
    // alternate output ("leftovers"):
    // use this for a `double-width' shift
    Y2 = A << (chunksize - B);
(s)bitrev:
    // regular output:
    Y = bit_reverse(A, chunksize) >> (chunksize - B - 1);
    // alternate output ("leftovers"):
    // use this for a `double-width' bitrev
    Y2 = bit_reverse(A, chunksize) << (B + 1);
    // Notes:
    // - the SIMD operation `sbitrev' is not yet documented.
    // - the documented `bitrevo' instruction is not supported.
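To make the regular bitrev output concrete, here is a small C model of a single chunk. It is a sketch under assumptions: the chunk lives in the low bits of a uint64_t, and the function names are chosen for this example only. The net effect is that the low B+1 bits of A come out in reversed order.

```c
#include <stdint.h>

/* Reverse the low `chunksize` bits of a (chunksize <= 64).
   Bits above the chunk are ignored. */
static uint64_t bit_reverse(uint64_t a, unsigned chunksize)
{
    uint64_t r = 0;
    for (unsigned i = 0; i < chunksize; i++) {
        r = (r << 1) | (a & 1);   /* move the current LSB of a into r */
        a >>= 1;
    }
    return r;
}

/* Regular output of bitrev for one chunk:
   Y = bit_reverse(A, chunksize) >> (chunksize - B - 1).
   The shift amount is always in 0 .. chunksize-1, so it is
   well-defined in C as well. */
static uint64_t bitrev(uint64_t a, unsigned b, unsigned chunksize)
{
    b %= chunksize;
    return bit_reverse(a, chunksize) >> (chunksize - b - 1);
}
```

For example, with 8-bit chunks, A = 0x06 (binary 110) and B = 2, the result is 0x03 (binary 011).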
The 2r2w `double-width' shifts still need to be documented and assigned
opcodes, mode flags and so on. Did we already choose a name for them, BTW?
I suggest `(s)dshiftXY' and `(s)dbitrev' (although there's probably not
much use for the latter).
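How the "leftovers" output might be used for a double-width shift, sketched in C for a single 64-bit chunk. This assumes that a shift by the full chunk size yields zero (in real C that is undefined behaviour, hence the explicit guard). The names shl_result, shiftl64 and shiftl128 are invented for this example.

```c
#include <stdint.h>

/* Both outputs of a 64-bit shiftl. */
typedef struct {
    uint64_t y;    /* regular output:  A << B                 */
    uint64_t y2;   /* leftovers:       A >> (chunksize - B)   */
} shl_result;

static shl_result shiftl64(uint64_t a, unsigned b)
{
    shl_result r;
    b %= 64;                           /* B is modulo the chunk size */
    r.y  = a << b;
    r.y2 = b ? a >> (64 - b) : 0;      /* assumed: shift by 64 gives 0 */
    return r;
}

/* Shift the 128-bit value hi:lo left by b (0..63) bits.
   The leftovers of the low half are ORed into the high half. */
static void shiftl128(uint64_t *hi, uint64_t *lo, unsigned b)
{
    shl_result l = shiftl64(*lo, b);
    shl_result h = shiftl64(*hi, b);
    *lo = l.y;
    *hi = h.y | l.y2;
}
```

The right shifts work the same way with the roles of the two halves swapped.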
(s)rotl:
    // regular output:
    Y = (A << B) | (A >> (chunksize - B));
    // alternate output is currently unused
(s)rotr:
    // regular output:
    Y = (A >> B) | (A << (chunksize - B));
    // alternate output is currently unused
If you can find any use for the second output port in the rotl/rotr
modes, tell me. Note that it is impossible to perform both rotl and
rotr at the same time, however.
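A C model of the rotates for one chunk. Note that the pseudo-C above shifts by the full chunk size when B is zero, which is undefined in real C, so this sketch handles that case separately. It assumes the chunk sits in the low bits of a uint64_t; the function names are made up for this example.

```c
#include <stdint.h>

/* Mask covering the low `chunksize` bits (8, 16, 32 or 64). */
static uint64_t chunk_mask(unsigned chunksize)
{
    return (chunksize == 64) ? ~UINT64_C(0)
                             : ((UINT64_C(1) << chunksize) - 1);
}

/* Rotate one chunk left by b bits (b taken modulo chunksize). */
static uint64_t rotl_chunk(uint64_t a, unsigned b, unsigned chunksize)
{
    uint64_t mask = chunk_mask(chunksize);
    a &= mask;
    b %= chunksize;
    if (b == 0)
        return a;                       /* avoid shifting by chunksize */
    return ((a << b) | (a >> (chunksize - b))) & mask;
}

/* Rotate one chunk right by b bits (b taken modulo chunksize). */
static uint64_t rotr_chunk(uint64_t a, unsigned b, unsigned chunksize)
{
    uint64_t mask = chunk_mask(chunksize);
    a &= mask;
    b %= chunksize;
    if (b == 0)
        return a;                       /* avoid shifting by chunksize */
    return ((a >> b) | (a << (chunksize - b))) & mask;
}
```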
// In bytewise operations, there is no `semi-SIMD' mode.
// There is no `shift count', either :)
(s)byterev:
    // regular output:
    Y = byterev(A, chunksize);
    // alternate output (second channel):
    Y2 = byterev(B, chunksize);
sdup:
    // regular output:
    Y = sdup(A, chunksize);
    // alternate output (second channel):
    Y2 = sdup(B, chunksize);
mix:
    // regular output:
    Y = mixl(A, B, chunksize);
    // alternate output:
    Y2 = mixh(A, B, chunksize);
expand:
    // regular output:
    Y = expandl(A, B, chunksize);
    // alternate output:
    Y2 = expandh(A, B, chunksize);
It's not yet clear which source operand is which in the mix and expand
instructions. But I can change that easily.
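For the two single-operand bytewise operations, byterev and sdup, a possible C model looks like this. The semantics are assumptions for the sake of illustration: byterev is taken to reverse the byte order inside every chunk, sdup is taken to copy chunk 0 (assumed least significant) into all chunks, and the register is assumed to be 64 bits wide.

```c
#include <stdint.h>

/* Assumed semantics: reverse the byte order within each chunk
   of a 64-bit register. chunksize is 8, 16, 32 or 64 bits. */
static uint64_t byterev(uint64_t a, unsigned chunksize)
{
    unsigned bytes = chunksize / 8;     /* bytes per chunk */
    uint64_t y = 0;
    for (unsigned i = 0; i < 8; i++) {
        unsigned chunk = i / bytes;     /* chunk this byte belongs to   */
        unsigned pos   = i % bytes;     /* position inside that chunk   */
        unsigned src   = chunk * bytes + (bytes - 1 - pos);
        uint64_t byte  = (a >> (8 * src)) & 0xff;
        y |= byte << (8 * i);
    }
    return y;
}

/* Assumed semantics: replicate chunk 0 into every chunk. */
static uint64_t sdup(uint64_t a, unsigned chunksize)
{
    if (chunksize == 64)
        return a;                       /* only one chunk, nothing to do */
    uint64_t chunk = a & ((UINT64_C(1) << chunksize) - 1);
    uint64_t y = 0;
    for (unsigned i = 0; i < 64; i += chunksize)
        y |= chunk << i;
    return y;
}
```

With 8-bit chunks byterev is the identity, and with one 64-bit chunk it is a plain endian swap.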
I could also tear mixl/mixh and expandl/expandh apart and make them
separate instructions, and at the same time restrict byterev and sdup
to one channel. That would save some space, but I like it better this way.
We should also add 2r2w `mix' and `expand' instructions (but keep the
2r1w versions).
The `widen' (integer sign/zero extension) instruction is still missing.
I'm not sure whether it was such a good idea.
Ok, that's it for now.
Have fun,
--
Michael "Tired" Riepe <Michael.Riepe@stud.uni-hannover.de>
"All I wanna do is have a little fun before I die"