
Re: [f-cpu] Problem with load/store size flags on >64bit F-CPU



On Fri, Jan 10, 2003 at 10:39:02AM +0100, devik wrote:
> Hi,
> 
> assume you have a 256-bit CPU. Then you certainly want to
> load a whole register at once (a whole cache line :))).
> 
> But as was already said, we don't need (or want to
> implement) more than a 64-bit chunk size.

That's true for arithmetic operations, because they don't scale well
(the size of the multiplier, for example, grows quadratically with the
word size, and most EUs would have a higher latency). Loads, on the
other hand, could be any size as long as the cache lines are long
enough.

> But then when one sets the size flags to 8/16/32/64 bits
> (which is a very useful combination imho) he can't do it.
> When he sets 8/16/32/256 he can't use 256 for anything other
> than LSU ops.

We can, by way of the special registers. In a `wide' F-CPU, you can re-map
the chunk sizes to, say, 8/256/32/64 (I expect 16-bit mode to be used
rather infrequently). It just isn't supported in the assembler/emulator yet.

> Wouldn't it be better to have separate SRs for the LSU so that
> they can use a different size meaning?

<RFC>

The current SR scheme is debatable anyway. Instead of 4 SRs (one for
each size), we should use a single one, which is faster to save/restore.
With a single SR_SIZE with four 16-bit chunks, we could address chunk
sizes up to 65536 bits if the sizes are stored in a linear fashion,
or 2^65535 if the values are logarithmic. I vote for the latter, btw.
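
To make the packing concrete, here is a minimal Python sketch of a single
SR_SIZE with four 16-bit fields. It rests on assumptions that are not part
of any F-CPU specification: field i sits at bit offset 16*i, and each field
holds log2 of the chunk size in bytes (inferred from the example in this
message, where the value 5 stands for 256 bits). The names pack_sr_size and
chunk_bits are hypothetical.

```python
# Hypothetical SR_SIZE layout: four 16-bit fields in one 64-bit value.
# Field i holds log2(chunk size in bytes) for size code i (assumption).
FIELD_BITS = 16
FIELD_MASK = (1 << FIELD_BITS) - 1


def pack_sr_size(log2_bytes):
    """Pack four log2(byte size) codes into one integer, field 0 lowest."""
    if len(log2_bytes) != 4:
        raise ValueError("exactly four size codes are required")
    value = 0
    for index, code in enumerate(log2_bytes):
        if not 0 <= code <= FIELD_MASK:
            raise ValueError("size code does not fit in 16 bits")
        value |= code << (index * FIELD_BITS)
    return value


def chunk_bits(sr_size, index):
    """Return the chunk size in bits selected by size code `index`."""
    code = (sr_size >> (index * FIELD_BITS)) & FIELD_MASK
    return 8 << code  # 2**code bytes, expressed in bits


default_map = pack_sr_size([0, 1, 2, 3])  # 8/16/32/64 bits
wide_map = pack_sr_size([0, 5, 2, 3])     # 8/256/32/64 bits
```

Saving or restoring the whole mapping is then a single 64-bit move, and
changing one size only touches one 16-bit field.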

Another question is how we can support remapped sizes in the
assembler. I remember that some Intel assemblers had an `assume'
directive that told the assembler which segment register held the
address of which segment. We could use something similar, e.g.

	geti $SR_SIZE, r1
	loadcons.1 $5, r1		// now you know why I suggested four 16-bit chunks :)
	puti $SR_SIZE, r1
	.regsize 0, 5, 2, 3		// .01 means 256 bits
	...
	load.256 r2, r3
	...
	loadcons.1 $1, r1
	puti $SR_SIZE, r1
	.regsize 0, 1, 2, 3		// restore to default case

Since the assembler knows the mapping, it can reject instructions that
are unavailable at that point (like load.16 in this case).
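
The bookkeeping on the assembler side could look roughly like the following
Python sketch. It is only an illustration: the class SizeMap and its methods
regsize and suffix_for are hypothetical and do not exist in the current
assembler, and the size codes are again assumed to be log2 of the chunk size
in bytes.

```python
class SizeMap:
    """Tracks which chunk size each of the four size codes selects."""

    def __init__(self):
        # Default mapping: codes 0..3 select 8/16/32/64 bits.
        self.codes = [0, 1, 2, 3]

    def regsize(self, *codes):
        """Model a `.regsize a, b, c, d` directive (hypothetical)."""
        if len(codes) != 4:
            raise ValueError(".regsize needs exactly four size codes")
        self.codes = list(codes)

    def suffix_for(self, bits):
        """Return the 2-bit size code for an operand size, or reject it."""
        for index, code in enumerate(self.codes):
            if 8 << code == bits:
                return index
        raise ValueError(f"size .{bits} is not mapped at this point")
```

After regsize(0, 5, 2, 3), suffix_for(256) yields size code 1, while
suffix_for(16) raises an error, which is how load.16 would be rejected.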

</RFC>

-- 
 Michael "Tired" Riepe <Michael.Riepe@stud.uni-hannover.de>
 "All I wanna do is have a little fun before I die"
*************************************************************
To unsubscribe, send an e-mail to majordomo@seul.org with
unsubscribe f-cpu       in the body. http://f-cpu.seul.org/