
Re: [f-cpu] [Very] Late answer



I think our biggest problem is what whygee, Michael and I each think
about the SIMD register system. Personally, it wasn't clear to me.

To keep things easier to understand, it is better to think in terms of a
256-bit register set. 64 bits is a real handicap: the biggest format
cannot be used as SIMD. 256 bits also eases a lot of things for
inter-chunk operations.

For me, a 256-bit register should handle 8/16/32/64-bit chunks for
integers and 32/64/128-bit chunks for the floating-point unit. Pointers
are 64 bits. Full stop. The load/store unit handles only a *single* load.

So there are 4 x 64-bit chunks (it is mainly this size that makes me
choose 256-bit registers), 8 x 32-bit, 16 x 16-bit and 32 x 8-bit chunks
for integers, and 8 x 32-bit, 4 x 64-bit and 2 x 128-bit chunks for the
floating-point unit.
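
To make the chunk counts above concrete, here is a minimal Python sketch
that models a 256-bit register as a plain integer. It is not F-CPU code:
the names REG_BITS, split_chunks and join_chunks are hypothetical and
only illustrate the layout.

```python
REG_BITS = 256  # register width proposed in this message

def split_chunks(value, chunk_bits):
    """Split a 256-bit integer into chunks, least significant chunk first."""
    assert REG_BITS % chunk_bits == 0
    mask = (1 << chunk_bits) - 1
    return [(value >> (i * chunk_bits)) & mask
            for i in range(REG_BITS // chunk_bits)]

def join_chunks(chunks, chunk_bits):
    """Inverse of split_chunks: rebuild the 256-bit integer."""
    mask = (1 << chunk_bits) - 1
    value = 0
    for i, chunk in enumerate(chunks):
        value |= (chunk & mask) << (i * chunk_bits)
    return value

# Number of chunks per register for each chunk size named in the text.
INT_CHUNK_COUNTS = {bits: REG_BITS // bits for bits in (8, 16, 32, 64)}
FP_CHUNK_COUNTS = {bits: REG_BITS // bits for bits in (32, 64, 128)}
```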

You can absolutely *never* perform a "global" operation on the 256-bit
registers. Otherwise it will slow down the whole CPU (some wires would
certainly have to cross from bit 0 to bit 255).
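
As an illustration of an operation that stays local to each chunk, the
following Python sketch adds two 256-bit values chunk by chunk and drops
the carry at every chunk boundary, so nothing propagates from bit 0 to
bit 255. The function name simd_add is hypothetical, and wrap-around (as
opposed to saturation) is an assumption made for the example.

```python
REG_BITS = 256  # register width proposed in this message

def simd_add(a, b, chunk_bits):
    """Chunk-wise add: each chunk wraps around on its own, no carry crosses chunks."""
    assert REG_BITS % chunk_bits == 0
    mask = (1 << chunk_bits) - 1
    result = 0
    for shift in range(0, REG_BITS, chunk_bits):
        chunk_sum = ((a >> shift) & mask) + ((b >> shift) & mask)
        result |= (chunk_sum & mask) << shift  # carry out of the chunk is discarded
    return result
```

With 8-bit chunks, adding 0xFF and 0x01 gives 0 in the lowest chunk and
leaves the next chunk untouched; with 16-bit chunks the same inputs give
0x100, because the carry stays inside one chunk.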

In that case, we will need powerful chunk manipulation operations (they
will be heavily used by the compiler!).
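
One possible example of such a chunk manipulation operation is a chunk
permutation. The Python sketch below is only an assumption about what
such an operation could look like; permute_chunks and its order argument
are hypothetical and do not come from the F-CPU specification.

```python
REG_BITS = 256  # register width proposed in this message

def permute_chunks(value, chunk_bits, order):
    """Build a register whose chunk i is source chunk order[i]."""
    n = REG_BITS // chunk_bits
    assert REG_BITS % chunk_bits == 0 and len(order) == n
    mask = (1 << chunk_bits) - 1
    result = 0
    for dst, src in enumerate(order):
        assert 0 <= src < n
        chunk = (value >> (src * chunk_bits)) & mask
        result |= chunk << (dst * chunk_bits)
    return result
```

For instance, permute_chunks(v, 64, [3, 2, 1, 0]) reverses the four
64-bit chunks of v, and repeating one index in order broadcasts that
chunk.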

[partial write]
Partial writes are only useful for partial constant loads. It is
overkill to introduce RAW dependencies, but what about disabling the
bypass facilities for that particular instruction? It adds nothing to
the cdp. The instruction still uses 1 cycle, and 4 instructions of that
kind could be handled at the same time... hmm, I just realized that we
must keep some more bits in the scoreboard to handle those cases... Uh,
bad... 1 bit per 16-bit chunk in the scoreboard?
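
To show what "1 bit per 16-bit chunk" would cost and buy, here is a
minimal Python sketch of such a scoreboard. Everything in it is an
assumption for illustration: the Scoreboard class, its method names, the
default of 64 registers, and the choice of 16-bit granularity for a
256-bit register (16 pending bits per register).

```python
REG_BITS = 256
CHUNK_BITS = 16
CHUNKS_PER_REG = REG_BITS // CHUNK_BITS  # 16 scoreboard bits per register
FULL_MASK = (1 << CHUNKS_PER_REG) - 1

class Scoreboard:
    """Tracks, per register, which 16-bit chunks have a write in flight."""

    def __init__(self, num_regs=64):  # register count is an assumption
        self.pending = [0] * num_regs

    def issue_write(self, reg, chunk_mask):
        # Mark the chunks targeted by a (partial) write as not yet available.
        self.pending[reg] |= chunk_mask & FULL_MASK

    def complete_write(self, reg, chunk_mask):
        # The written chunks become readable again.
        self.pending[reg] &= ~chunk_mask & FULL_MASK

    def can_read(self, reg, chunk_mask=FULL_MASK):
        # A read stalls only if it needs a chunk that is still pending.
        return (self.pending[reg] & chunk_mask) == 0
```

With this granularity, four partial constant loads to different chunks
of the same register do not block one another, and a reader of an
untouched chunk is not stalled; a reader of the full register waits
until every pending chunk has completed.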

It reminds me of the partial-write false dependency problem of Intel
chips!


Michael Riepe wrote:
> 
> [...]
> > > Ok, that's a point. Then let's use the accumulator solution (where the
> > > shift is moved outside the Xbar/EU complex). It also has the advantage
> > > that it can't be abused as a `fast left shift' ;)
> > no ! loadcons is loadcons and is not a unit in itself.
> > otherwise, one has to use SHL instead.
> 
> Who says it isn't? As far as I am concerned, the current F-CPU
> specification is not cast in stone. If there are mistakes in it, we
> have to correct them - and believe me, partial writes (and partial data
> moves) *are* mistakes. There are others - like the `mac mistake': the
> mac and mul instructions use different result formats (mac results are
> widened "if the destination register is wide enough"). Since `mac.64'
> will behave differently on a 64- and a 128-bit F-CPU, this clearly is
> a Big Bad Bug in the specification. But there are always alternatives,
> and we should at least consider them. AND we should take care that we
> choose an alternative that is easy to implement. KISS principle.
> 

I really strongly agree !
nicO

> --
>  Michael "Tired" Riepe <Michael.Riepe@stud.uni-hannover.de>
>  "All I wanna do is have a little fun before I die"
> *************************************************************
> To unsubscribe, send an e-mail to majordomo@seul.org with
> unsubscribe f-cpu       in the body. http://f-cpu.seul.org/