
Re: scatter/gather op ( was:Re: [f-cpu] New EU_SHL Instruction)



hi,

nico wrote:

On Thu, 9 Jan 2003 14:01:51 +0100
Michael Riepe <michael@stud.uni-hannover.de> wrote:

On Thu, Jan 09, 2003 at 01:59:35AM +0100, Yann Guidon wrote:
[...]

and_reduce (or "combine" as written in ROP2) is not possible
for very wide data.

Furthermore, the xorn.and trick is useful for "detecting" that a
byte matches, but if you need to find the index of the
character, the "obvious" answer is to loop over the register.
If you have a result of 0x00FF000000000000, that's not a good
solution. So the idea is to "transpose" the bits in the word, which
would become 0x4040404040404040, and the last byte can then be
binary encoded in INC (if it's implemented).

Wouldn't it be sufficient to `collapse' each chunk into a single bit?

That's an intra-chunk gather operation. (Such gather ops are lacking
throughout the f-cpu ISA, because inter-chunk operations were designed
for a 64-bit CPU instead of with a 256-bit version in mind.)
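
To make the "collapse" idea concrete, here is a minimal C sketch of what such an intra-chunk gather would compute for 8-bit chunks in a 64-bit word. The function names collapse_bytes and first_index are hypothetical, chosen for this illustration only; they are not f-cpu instructions, and the loops only model the result, not how hardware would produce it.

```c
#include <stdint.h>

/* Collapse each byte of x into one bit: bit i of the result is set
 * when byte i of x is non-zero. (Hypothetical helper.) */
static unsigned collapse_bytes(uint64_t x)
{
    unsigned bits = 0;
    for (int i = 0; i < 8; i++) {
        if ((x >> (8 * i)) & 0xFF)
            bits |= 1u << i;
    }
    return bits;
}

/* Index of the lowest set bit among the 8 collapsed bits,
 * or -1 when no chunk matched. (Hypothetical helper.) */
static int first_index(unsigned bits)
{
    for (int i = 0; i < 8; i++) {
        if (bits & (1u << i))
            return i;
    }
    return -1;
}
```

For the match mask 0x00FF000000000000 quoted above, collapse_bytes gives 0x40 and first_index gives 6, which agrees with the 0x40 byte of the "transposed" word.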

An add gather could be useful too!

gather.add.64 V1 V2 R3

R3 = V1[0]+V1[1]+V1[2]+V1[3]
   + V2[0]+V2[1]+V2[2]+V2[3]

(big tree adder ?)

This is easily "emulated" with a logarithmic shift/add sequence:
srhi 8, r1, r2
add.8 r1, r2, r1
srhi 16, r1, r2
add.16 r1, r2, r1
srhi 32, r1, r2
add.16 r1, r2, r1
and it works with any kind of instruction (boolean, arithmetic, FP, etc.)
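
For readers without an f-cpu at hand, the same logarithmic reduction can be sketched in portable C. This is an illustration under assumptions, not the f-cpu semantics: add8 is a hypothetical stand-in for a lane-wise 8-bit add (no carry between bytes), and all three steps use 8-bit lanes for simplicity. Only the lowest byte of the result is meaningful, and the sum wraps modulo 256.

```c
#include <stdint.h>

/* Lane-wise 8-bit add on a 64-bit word: carries never cross a byte
 * boundary. (Hypothetical helper standing in for a SIMD byte add.) */
static uint64_t add8(uint64_t a, uint64_t b)
{
    const uint64_t H = 0x8080808080808080ULL;
    /* Add the low 7 bits of each lane, then fix up bit 7 with XOR. */
    return ((a & ~H) + (b & ~H)) ^ ((a ^ b) & H);
}

/* Sum of all eight bytes, modulo 256, in log2(8) = 3 steps. */
static uint8_t hsum_bytes(uint64_t r1)
{
    r1 = add8(r1, r1 >> 8);   /* byte 0 = b0 + b1 */
    r1 = add8(r1, r1 >> 16);  /* byte 0 = b0 + b1 + b2 + b3 */
    r1 = add8(r1, r1 >> 32);  /* byte 0 = b0 + ... + b7 */
    return (uint8_t)r1;
}

/* Same shape with a boolean operation: OR of all eight bytes. */
static uint8_t hor_bytes(uint64_t r1)
{
    r1 |= r1 >> 8;
    r1 |= r1 >> 16;
    r1 |= r1 >> 32;
    return (uint8_t)r1;
}
```

A plain 64-bit add would not do in hsum_bytes, because carries out of the upper bytes of one partial sum would leak into the lanes consumed by the next step; the boolean variant has no such problem.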

Any comments? (except "it is slow")

YG
