Re: [f-cpu] More broth in the Alphabet Soup

On Thu, Jun 20, 2002 at 12:15:54AM +0200, Yann Guidon wrote:
> > > > a scatter/gather instruction would be ideally performed using a "base"
> > > > pointer (checked the usual way) and a SIMD "offset", so every SIMD offset
> > > > chunk is parallelly checked against the maximum allowed offset (size of page
> > > > in TLB ?) and the TLB doesn't need as many ports as there are chunks...
> > >
> > > Sounds good. Especially if we require the base pointer to be page
> > > aligned :)
> > >
> > What about dumb things like video displays that have multiple byte sizes?
> > Here your gather would need to be for bytes, words, and longs on a long boundary.
> hmm i see, that's the offset size problem...
> if you need to access more than 64K 16-bit words, then you can mask off
> the LSB and perform more 32-bit accesses. The results are shifted/masked
> to return the wanted part.

What's scatter/gather good for in a video framebuffer?

Scatter/gather in the form outlined above is most useful for parallel
(SIMD) table lookups - encode/decode small values (bytes or maybe
16-bit words), grab coefficients from a table and so on.

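Ideally, scatter/gather will simply OR the base pointer with each of the
offsets, instead of adding them. As long as the base pointer is page
aligned and every offset is smaller than the page size, the two never
share a set bit, so the OR yields the same address as an add would.
If you really need a finer granularity for the base pointer, you'll
have to do the addition manually. I.e. instead of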

	gather.d r2, r1, r3	// r1 = base, r2 = 16-bit offsets, r3 = destination

you'll write:

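	loadcons PAGE_MASK, r3	// mask selecting the page-number bits
	and r3, r1, r4	// aligned pointer
	andn r3, r1, r3	// common offset
	sdup.d r3, r3	// copy the common offset into every 16-bit chunk
	sadd r3, r2, r3	// add offsets
	gather.d r3, r4, r3	// r4 = aligned base, r3 = offsets in, result out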

The reason is, of course, to keep the latency of scatter/gather as low
as possible. If the instruction had to perform up to eight adds first,
it would be unacceptably slow.
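
To make the difference between OR and add concrete, here is a rough Python
model of the idea. It is only a sketch, not F-CPU semantics: the 4 KiB page
size, the PAGE_MASK value and the function names gather_or and
gather_unaligned are all assumptions made up for illustration, and memory is
just a byte string. The range check stands in for the per-chunk offset check
mentioned earlier in the thread.

```python
PAGE_SIZE = 4096              # assumed page size, not given in the thread
PAGE_MASK = ~(PAGE_SIZE - 1)  # assumed: keeps the page-number bits


def gather_or(memory, base, offsets):
    """Hypothetical model of gather: address = base OR offset, no adder."""
    for off in offsets:
        # Every chunk is checked against the maximum allowed offset.
        if not 0 <= off < PAGE_SIZE:
            raise ValueError("offset outside the page")
    return [memory[base | off] for off in offsets]


def gather_unaligned(memory, base, offsets):
    """Manual workaround for a base pointer that is not page aligned."""
    aligned = base & PAGE_MASK             # and:  aligned pointer
    common = base & ~PAGE_MASK             # andn: common offset
    summed = [common + off for off in offsets]  # sdup + sadd
    return gather_or(memory, aligned, summed)   # gather on the aligned base


# Small demonstration with a base pointer 0x10 bytes into the second page.
memory = bytes(i % 251 for i in range(3 * PAGE_SIZE))
base = PAGE_SIZE + 0x10
offsets = [0, 3, 7, 0x20]
wanted = [memory[base + off] for off in offsets]
```

With these values gather_unaligned(memory, base, offsets) returns the same
list as wanted, while calling gather_or directly on the unaligned base with
an offset of 0x10 reads the wrong byte, because bit 4 is already set in the
base and the OR cannot carry. The sum of common offset and chunk offset still
has to stay inside the page, otherwise the check fails.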

 Michael "Tired" Riepe <Michael.Riepe@stud.uni-hannover.de>
 "All I wanna do is have a little fun before I die"