
Re: [f-cpu] Manual 0.2.6



Michael Riepe wrote:
> 
> On Thu, Aug 01, 2002 at 05:11:25PM +0200, Yann Guidon wrote:
> > hi,
> >
> > just a little detail :
> >
> > Cedric BAIL wrote:
> > > The idea is to have this capability for every register size (8, 16, 32, 64).
> > > We need these instructions to load registers that are bigger than 64 bits (the
> > > shifter only works on 64 bits, and it isn't its job to provide inter-chunk
> > > shifting).
> >
> > beware: Michael did not implement inter-chunk shifts, but that is not a fixed rule.
> > I would like to implement a different shifter structure which could shift
> > 64 bits, even between two neighbouring chunks.
> 
> It's not a problem to extend the shifter to 128, 256 or even more
> bits. But it will affect the latency, of course. The same is true for
> the ASU and IMU units; the only EU that is mostly immune to word size
> changes is ROP2.
> 
> A 256-bit F-CPU core that supports full-word operations will have longer
> pipelines or more delay per pipeline stage (or both). Unless we use
> variable latency EUs, almost all operations will be slower than in a
> 64-bit version.
> 
> The question is whether e.g. a 256-bit add, multiply or shift instruction
> is really useful. Most of the time, applications use small integers that
> fit into 32 (or less) bits, and IEEE single or double floats. That is,
> wider registers will be used for SIMD operations a lot, but rarely for
> `wide' operations.
> 

From the beginning, the F-CPU has been a 64-bit CPU. It handles 64-bit
integer values, and 64 bits is the chunk size. The number of chunks is a
power of 2, so we will never have to build a 256-bit shifter!
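
A minimal Python sketch of this point, assuming a 256-bit register is
modelled as one Python integer made of four 64-bit chunks. The helper
names (split_chunks, join_chunks, simd_shl64) are illustrative only and
are not taken from the F-CPU manual. Each chunk goes through a 64-bit
shift on its own, so no bit ever crosses a chunk boundary:

```python
CHUNK_BITS = 64                      # chunk size stated in the thread
CHUNK_MASK = (1 << CHUNK_BITS) - 1


def split_chunks(reg, n_chunks=4):
    """Split a register value into 64-bit chunks, lowest chunk first."""
    return [(reg >> (CHUNK_BITS * i)) & CHUNK_MASK for i in range(n_chunks)]


def join_chunks(chunks):
    """Rebuild a register value from a list of 64-bit chunks."""
    reg = 0
    for i, chunk in enumerate(chunks):
        reg |= (chunk & CHUNK_MASK) << (CHUNK_BITS * i)
    return reg


def simd_shl64(reg, amount, n_chunks=4):
    """Shift every 64-bit chunk left independently (hypothetical helper).

    Assumption: the shift count is taken modulo 64; the real hardware
    behaviour for larger counts is not specified in this thread.
    """
    amount %= CHUNK_BITS
    shifted = [(c << amount) & CHUNK_MASK for c in split_chunks(reg, n_chunks)]
    return join_chunks(shifted)
```

In this model the top bit of chunk 0 is simply dropped by the shift
instead of moving into chunk 1, which is why only 64-bit shifters are
needed.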

These numbers are fixed. I already made this point a few weeks ago. As of
today, I propose a 256-bit register set, so we must think about inter-chunk
operations. Why 256? Because it is 4*64, and this number was found to be
optimal for a vectorising algorithm.


> How to move data to/from wide registers? Data that is stored in memory
> can be handled with load/store, immediate data will be handled by the
> loadcons* instructions, and the mix, expand, byterev and sdup instructions
> (which may handle wider chunks than 64-bit without significant performance
> loss) can be used to move data between different chunks.
> 
> If we're going to implement a `monster shift' anyway, I suggest a `chunk
> shift' instruction that operates on full-size registers (that is, there
> will be no SIMD mode) and always shifts 64-bit quantities:
> 
>         cshiftl r3, r2, r1      // r1 = r2 << (64 * r3)
>         cshiftr r3, r2, r1      // r1 = r2 >> (64 * r3)
> 
> This can probably be integrated with the SHL execution unit.

Why not use the SIMD flag? Then do:
	r1 = r2 << ([8|16|32|64]*r3)
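
To make the suggestion concrete, here is a rough Python model of it. It
assumes a 256-bit register held as a single unsigned integer, a logical
(zero-filling) shift, and an element size selected by the SIMD flag. The
function names reuse Michael's cshiftl/cshiftr mnemonics, but the
elem_bits and reg_bits parameters are only for illustration:

```python
REG_BITS = 256  # assumption: four 64-bit chunks per register


def _check(count, elem_bits):
    if elem_bits not in (8, 16, 32, 64):
        raise ValueError("element size must be 8, 16, 32 or 64 bits")
    if count < 0:
        raise ValueError("shift count must be non-negative")


def cshiftl(value, count, elem_bits=64, reg_bits=REG_BITS):
    """Whole-register left shift: value << (elem_bits * count)."""
    _check(count, elem_bits)
    mask = (1 << reg_bits) - 1
    # Bits shifted past the top of the register are discarded.
    return ((value & mask) << (elem_bits * count)) & mask


def cshiftr(value, count, elem_bits=64, reg_bits=REG_BITS):
    """Whole-register logical right shift: value >> (elem_bits * count)."""
    _check(count, elem_bits)
    mask = (1 << reg_bits) - 1
    return (value & mask) >> (elem_bits * count)
```

With elem_bits=64 this is the `chunk shift' proposed above; the other
sizes only change the multiplier applied to r3.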

nicO

> 
> --
>  Michael "Tired" Riepe <Michael.Riepe@stud.uni-hannover.de>
>  "All I wanna do is have a little fun before I die"
> *************************************************************
> To unsubscribe, send an e-mail to majordomo@seul.org with
> unsubscribe f-cpu       in the body. http://f-cpu.seul.org/