
Re: Rep:Re: [f-cpu] FC0's RTL scheduler



hello,

i redirect your answer to the right address ;-)

>From: nicolas.boulay@ifrance.com
>To: <whygee@f-cpu.org>
>
>From: Yann Guidon <whygee@f-cpu.org>
>To: f-cpu@seul.org
>Date: 19/12/01
>Subject: Re: [f-cpu] FC0's RTL scheduler
>
>hi !
>
>Michael Riepe wrote:
>> On Tue, Dec 18, 2001 at 03:31:02PM +0100, Yann Guidon wrote:
>> 
>> > > > I see that we have most EUs going on or done
>> > > > so this is the best place to work. However i still
>> > > > have to figure how many output ports the IMU has.
>> > > Eight -- two for each chunk size (holding the high and low parts of
>> > > the double-width product). We have to reorder the result bits for the
>> > > original macl/mach instructions, however.
>> > can you be more precise ?
>> >
>> > does that mean that :
>> >  8-bits   -> 2 cycles
>> >  16-bits  -> 4 cycles
>> >  32-bits  -> 6 cycles
>> >  64-bits  -> 8 cycles
>> > plus an additional cycle for mach/macl ?
>> 
>>          8-bit low  part -> 3 cycles
>>          8-bit high part -> 4 cycles
>>         16-bit low  part -> 4 cycles
>>         16-bit high part -> 5 cycles
>>         32-bit low  part -> 5 cycles
>>         32-bit high part -> 5 cycles
>>         64-bit low  part -> 6 cycles
>>         64-bit high part -> 6 cycles
>> 
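For anyone modelling the scheduler, the latencies quoted above fit in a small lookup table. This is only a sketch: the names IMU_LATENCY, imu_latency and result_ready_cycle are made up for illustration, and the assumption that the cycle counts are measured from the issue cycle is mine, not something stated in the thread.

```python
# IMU latencies as quoted above, keyed by (chunk size in bits, result part).
IMU_LATENCY = {
    (8, "low"): 3,
    (8, "high"): 4,
    (16, "low"): 4,
    (16, "high"): 5,
    (32, "low"): 5,
    (32, "high"): 5,
    (64, "low"): 6,
    (64, "high"): 6,
}


def imu_latency(chunk_bits, part):
    """Return the quoted latency in cycles for one IMU output port."""
    return IMU_LATENCY[(chunk_bits, part)]


def result_ready_cycle(issue_cycle, chunk_bits, part):
    """Cycle at which the result is available, ASSUMING the latency is
    counted from the issue cycle (not confirmed by the thread)."""
    return issue_cycle + imu_latency(chunk_bits, part)
```

The table has eight entries, one per output port (two for each chunk size), which matches the port count given earlier.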
>> mach/macl work at the same speed, but the results don't appear at
>> the bit positions that are documented in the manual. In the manual,
>> there is a difference between mul and mac modes -- mul(h) places the
>> high part of every result chunk in the secondary destination register,
>> while mac[lh] pastes high and low parts together, doubling the chunk
>> size of the result, and then selects one half of the result and puts it
>> into the primary destination register (the secondary reg is never used).
>> The final chunk-shuffling for mach/macl is currently not implemented
>> (it can be handled by adding extra output ports for mach/macl, or by
>> reordering the chunks in the Xbar, whichever is cheaper).
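To make the mul versus mac[lh] difference concrete, here is a small bit-level sketch of the two result layouts as described above. It rests on several assumptions of mine: 64-bit registers, unsigned chunk products, chunk 0 in the least significant bits, and "one half of the result" meaning the low or high 64 bits of the pasted 128-bit value. The accumulate step is left out, and all function names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def split_chunks(value, chunk_bits, total_bits=64):
    """Split a register value into chunks, least significant chunk first."""
    mask = (1 << chunk_bits) - 1
    return [(value >> pos) & mask for pos in range(0, total_bits, chunk_bits)]


def join_chunks(chunks, chunk_bits):
    """Pack chunks back into one integer, chunk 0 in the low bits."""
    out = 0
    for index, chunk in enumerate(chunks):
        out |= chunk << (index * chunk_bits)
    return out


def simd_products(a, b, chunk_bits):
    """Double-width unsigned product of every pair of chunks."""
    return [x * y for x, y in zip(split_chunks(a, chunk_bits),
                                  split_chunks(b, chunk_bits))]


def mul_layout(a, b, chunk_bits):
    """mul(h): low parts go to the primary register, high parts to the
    secondary register, each at the original chunk position."""
    mask = (1 << chunk_bits) - 1
    products = simd_products(a, b, chunk_bits)
    primary = join_chunks([p & mask for p in products], chunk_bits)
    secondary = join_chunks([p >> chunk_bits for p in products], chunk_bits)
    return primary, secondary


def mac_layout(a, b, chunk_bits, high_half):
    """mac[lh] without the accumulate: paste high and low parts into
    double-width chunks, then keep one 64-bit half (assumed reading)."""
    wide = join_chunks(simd_products(a, b, chunk_bits), 2 * chunk_bits)
    return (wide >> 64) & MASK64 if high_half else wide & MASK64
```

With 8-bit chunks and 0xFF in chunk 0 of both operands, mul_layout gives 0x01 in the primary and 0xFE in the secondary register, while mac_layout with high_half=False gives 0xFE01 in a single register. That difference in bit positions is the chunk shuffling the paragraph above says is not implemented yet.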
>maybe we can "cheat" with the scheduler : swap the register number,
>
>>>>>>>>>>>>>>>>> Oh yes ! So you could support my
>own idea : the EU carries the address of the register to be
>written (it's useful for complex units, e.g. if you pipeline
>the loop of the divider). But in that case, you only
>need to swap the addresses.

let me finish the scheduler RTL code please ;-D

>On the other hand, we talked about turning the 3R2W register
>set into a 2R1W one with the trick of virtually doubling the
>number of bits per register (for the user it changes nothing;
>for us, we would have 32 registers of 128 bits with muxes to
>separate them).

?????????????????

> So we can use a much simpler (and
>faster) memory IP.
???

> Dual-port memories are really
>common; more ports need to be handmade or are really slow.

if i understand we need 3x 2W1R banks.
This is going to use even more room ...

i'm thinking about small chunks which can do 3R2W,
some chunks are 8-bit wide and others are 16-bit wide,
so we can control the write bit independently.
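A rough model of the per-chunk write control, just to show the idea. The lane layout below (8, 8, 16, 16, 16 bits, least significant first) is purely hypothetical: the message only says that some chunks are 8 bits wide and others 16. The function names are made up too, and writes are assumed to be aligned to the low end of the register.

```python
# Hypothetical lane layout, least significant lane first; widths sum to 64.
LANE_WIDTHS = [8, 8, 16, 16, 16]


def lane_enables(write_bits):
    """One write-enable flag per lane for a low-aligned write of write_bits."""
    enables = []
    covered = 0
    for width in LANE_WIDTHS:
        enables.append(covered < write_bits)
        covered += width
    return enables


def masked_write(old, new, write_bits):
    """Register content after the write: enabled lanes take the new value,
    disabled lanes keep the old one."""
    result = 0
    offset = 0
    for width, enabled in zip(LANE_WIDTHS, lane_enables(write_bits)):
        mask = ((1 << width) - 1) << offset
        result |= (new if enabled else old) & mask
        offset += width
    return result
```

With this layout an 8-bit write enables only lane 0, a 16-bit write enables the two 8-bit lanes, a 32-bit write adds one 16-bit lane, and a 64-bit write enables all five.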

>If we plan to make it fully synthesizable (as an IP), our CPU
>will be much more popular.
if we wanted to be popular, we would be at Big Brother, Star Academy
or Pop Stars ;-P

>nicO
YG, hoping he's not completely messing everything up ;-)
