Re: EU Report (was: Re: [f-cpu] Register set revised)



On Thu, Mar 20, 2003 at 10:26:32AM +0100, devik wrote:
> > What exactly is the SAD insn supposed to do?
> 
> Sum of Absolute Differences. It takes two SIMD words, computes
> abs(a-b) for each pair of corresponding bytes and adds up all these
> differences.

I see. It's a little heavy for a single instruction, but it can be calculated
if there is a `byte adder'.  The abs(a-b) part can be calculated with
instructions from the current instruction set, either as `abs(a-b)' or
as `max(a,b)-min(a,b)', whichever is more convenient.
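
To pin down the semantics, here is a minimal reference sketch in C (not
F-CPU code).  It assumes 64-bit words holding eight unsigned bytes and uses
the `max(a,b)-min(a,b)' form; the name `sad8' is made up for illustration.

```c
#include <stdint.h>

/* Sum of absolute differences over the eight unsigned bytes of a and b.
 * Hypothetical reference model, not an F-CPU instruction definition. */
uint64_t sad8(uint64_t a, uint64_t b)
{
    uint64_t sum = 0;
    for (int i = 0; i < 8; i++) {
        uint8_t x = (uint8_t)(a >> (8 * i));   /* i-th byte of a */
        uint8_t y = (uint8_t)(b >> (8 * i));   /* i-th byte of b */
        uint8_t hi = x > y ? x : y;            /* max(x, y) */
        uint8_t lo = x > y ? y : x;            /* min(x, y) */
        sum += (uint64_t)(hi - lo);            /* abs(x - y), never negative */
    }
    return sum;
}
```

The final accumulation into `sum' is the part that needs the `byte adder'.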

> This is used when computing motion vectors in an MPEG encoder.
> I've seen it in several instruction sets. However, I don't know where
> it is usable outside of MPEG.

The byte (or chunk) adder will also be useful in vector computations.
But I doubt that we will have a chunk adder that works with FP numbers.

In any case, the chunks of a word can be combined by using `mix':

	mix.8 r0, r1, r2	// distributes the chunks across r2 and r3
	add.16 r2, r3, r1
	mix.16 r0, r1, r2
	add.32 r2, r3, r1
	mix.32 r0, r1, r2
	add.64 r2, r3, r1	// gotcha!

This will also work with other commutative operations, e.g. mul.  A `chunk
add' insn may be more convenient, however (and will also be much faster).
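
The same log-step reduction can be sketched in plain C.  This is only an
illustration of the idea, assuming a 64-bit word of eight unsigned bytes:
masks and shifts stand in for `mix', so the chunks are paired differently
than in the sequence above, but each step likewise doubles the chunk width
and halves the number of partial sums.  The name `sum_bytes' is made up.

```c
#include <stdint.h>

/* Add up the eight bytes of x in three widening steps (8 -> 16 -> 32 -> 64). */
uint64_t sum_bytes(uint64_t x)
{
    const uint64_t m8  = 0x00FF00FF00FF00FFULL;
    const uint64_t m16 = 0x0000FFFF0000FFFFULL;
    const uint64_t m32 = 0x00000000FFFFFFFFULL;

    x = (x & m8)  + ((x >> 8)  & m8);   /* four 16-bit partial sums */
    x = (x & m16) + ((x >> 16) & m16);  /* two 32-bit partial sums  */
    x = (x & m32) + (x >> 32);          /* one 64-bit total         */
    return x;
}
```

Since every partial sum is held in a chunk twice as wide as its inputs,
no step can overflow.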

> > 	tmp = A ± (B + lsb(C))
> > 	Y = tmp % pow(2, chunksize)
> > 	Z = tmp / pow(2, chunksize)
> >
> > Since that's a 3r2w operation, don't expect it to be implemented in FC0.
> > It would be quite useful for adding/subtracting big numbers, however.
> 
> Seems useful. Big number libs (gmp, rsalib...) often need to resort
> to tricks, mostly related to carry propagation. The GMP manual
> has a lot of information on the topic.

Yep, I know.  I've done things like that in the emulator, too.

The problem with this instruction is that we only have three register
number fields in the instruction word.  r1 and r1^1 will be the outputs
Y and Z, r2 and r3 will be A and B, respectively -- but where does C
come from?  My best guess is to use r1 for that as well -- but then
we'll have to move away r1^1 (which typically contains the result of
the last chunk computed) first.
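
For reference, the `+' case of the operation can be modelled in C as
follows.  This is a sketch only, assuming a chunk size of 64 bits; the names
`addc64' and `bignum_add' are made up, and the register allocation problem
described above does not show up at this level.

```c
#include <stddef.h>
#include <stdint.h>

/* Y = (A + B + lsb(C)) mod 2^64 is returned, Z = carry out is stored in *z. */
uint64_t addc64(uint64_t a, uint64_t b, uint64_t c, uint64_t *z)
{
    uint64_t t = b + (c & 1);      /* B + lsb(C) */
    uint64_t carry = t < b;        /* wrapped only if B was all ones */
    uint64_t y = a + t;
    carry += y < a;                /* at most one of the two adds can wrap */
    *z = carry;
    return y;
}

/* Add two n-chunk numbers (least significant chunk first); returns the
 * final carry.  Each iteration feeds Z of the previous chunk back in as C. */
uint64_t bignum_add(uint64_t *y, const uint64_t *a, const uint64_t *b, size_t n)
{
    uint64_t carry = 0;
    for (size_t i = 0; i < n; i++)
        y[i] = addc64(a[i], b[i], carry, &carry);
    return carry;
}
```

The loop shows why C is naturally the Z output of the previous chunk.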

> Would it not be possible to do CMP in the ASU too? Both work
> by "propagating" information from LSB toward MSB ...

The ASU could compare operands (at least in unsigned mode), but it
won't perform the other operations that EU_CMP supports: min/max/sort
and msb0/msb1 aren't possible with just an adder.  And since the adder
is busy enough (remember the instruction census?), it's probably better
to leave these operations where they are.  That does not mean that you
can't use `subb' to compare operands for unsigned-less, of course.
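
In C terms, the unsigned-less test via the borrow of a subtraction looks
roughly like this (a sketch assuming 64-bit operands; the function name is
made up, and the wrapped difference stands in for the borrow output):

```c
#include <stdint.h>

/* Returns 1 if a < b (unsigned), using only a subtraction: the difference
 * wraps around, i.e. a borrow occurs, exactly when a < b. */
int unsigned_less(uint64_t a, uint64_t b)
{
    uint64_t d = a - b;   /* computed modulo 2^64 */
    return d > a;         /* true only if the subtraction borrowed */
}
```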

> Another question: is the new source available? I'd like to
> take a look :)

I'll make a new release of my VHDL sources as soon as I've finished my
CeBIT reports for next month's iX issue.  That is, by the end of March.

-- 
 Michael "Tired" Riepe <Michael.Riepe@stud.uni-hannover.de>
 "All I wanna do is have a little fun before I die"