
SR [was:Re: [f-cpu] delayed issue]



On Wed, 05 Mar 2003 17:12:28 +0100
Yann Guidon <whygee@f-cpu.org> wrote:
<...>
> >Sure, but on one side you have a multiplier only 6 gates thin and on
> >the other side you try to put a 64-entry 3r2w register bank in the
> >same slot size...
> >  
> >
> don't worry ....
> the decoding logic (data ready, unit ready, etc.) will probably need 
> some more pipe stages
> in "high-speed" versions (where the pipelines are correctly sliced).
> It's just a matter of splitting the stages correctly.
> Remember, the first FC0-OOOC had no jump latency and now it is one.
> If the register set is really slow, we can't do much better, but it's
> not worth adding complex renaming buffers: the register access time
> will not be faster.
> Enhancing (through longer pipeline) the decoder is a better solution.
> 

Sure, but then you will have a jump penalty.

> >>>>Additionally, more complexity means more silicon area,
> >>>>more dissipation, longer wires => more heat/dissipation,
> >>>>more expensive and probably slower.
> >>>>And control logic is certainly the least easy thing
> >>>>to test in a chip. This is why i'm satisfied with
> >>>>the current FC0.
> >>>>
> >>>Not me :) Not when you see a 3r2w register file because of 1 or 2
> >>>instructions (like MAC). Not when you see the mess of "special
> >>>registers" that should be memory mapped (with conditional memory
> >>>movement, like unbuffered, if needed). Not when you see the
> >>>trap/exception mess.
> >>>
> >>1) 3R2W is also necessary for load and store instructions, otherwise
> >>it's not possible to perform the pointer update in the instruction.
> >>
> >Sure, but it's not really a true speedup, and it adds a RAW dependency.
> >  
> >
> ??????
> 
> oh, and what about your 'code density' argument ?
> if you don't allow post-increment, then you need more instructions to 
> compute the addresses.
> 

Hmm... I am thinking of a "true" 4r2w (on LIW) but with a split 1r1w
register bank. The first problem is that we use a 3r2w register bank
while 90% of the instructions are 2r1w.
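
To make the port-count argument concrete, here is a minimal sketch. The
operand counts are my assumptions about how post-increment load/store and
MAC would use the register file, not figures taken from the FC0 manual,
and the names (Instr, ports_needed, fits) are only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: str
    reads: int   # register read ports used in one issue
    writes: int  # register write ports used in one issue


# Assumed operand usage (hypothetical, for illustration only).
INSTRS = [
    Instr("add   r3 = r1 + r2", 2, 1),
    Instr("load  r1 = [r2]; r2 += r3", 2, 2),   # data + updated pointer
    Instr("store [r2] = r1; r2 += r3", 3, 1),   # data, pointer, increment
    Instr("mac   r3 += r1 * r2", 3, 1),         # accumulator is also read
]


def ports_needed(instrs):
    """Return the (read, write) port counts that cover every instruction."""
    return max(i.reads for i in instrs), max(i.writes for i in instrs)


def fits(instr, read_ports, write_ports):
    """True if the instruction can issue with the given port budget."""
    return instr.reads <= read_ports and instr.writes <= write_ports


if __name__ == "__main__":
    print("ports needed:", ports_needed(INSTRS))
    for i in INSTRS:
        print(f"{i.name:30s} fits 2r1w: {fits(i, 2, 1)}")
```

Under these assumptions the whole bank is sized for the few instructions
that do not fit in 2r1w, which is the point I am making above.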

> 
> >>2) If you map SRs to memory, you will face race conditions and 
> >>synchronisation problems,
> >> and protection will not be enforced on a register or register group
> >>granularity basis.
> >>    
> >>
> >It's exactly the same problem as for I/O registers, and it is easily solved.
> >
> SRs are not for I/O (because it would become a bottleneck).
> The problem, hence the solution, is not the same.
> 

The problem is buffering and ordering, just as for memory-mapped I/O
registers.
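
A small sketch of what I mean by buffering and ordering. The device, its
addresses and the write buffer below are all hypothetical; the only point
is that a load that bypasses a pending store sees stale state unless the
access is forced to be ordered.

```python
from collections import deque


class Device:
    """Hypothetical memory-mapped device: writing CTRL changes STATUS."""
    CTRL, STATUS = 0x10, 0x14

    def __init__(self):
        self.enabled = 0

    def write(self, addr, value):
        if addr == self.CTRL:
            self.enabled = value

    def read(self, addr):
        return self.enabled if addr == self.STATUS else 0


class BufferedBus:
    """Stores are queued; loads go straight to the device."""

    def __init__(self, device):
        self.device = device
        self.pending = deque()

    def store(self, addr, value):
        self.pending.append((addr, value))

    def drain(self):
        while self.pending:
            self.device.write(*self.pending.popleft())

    def load(self, addr, ordered):
        if ordered:          # ordered access: flush the write buffer first
            self.drain()
        return self.device.read(addr)


def enable_then_check(ordered):
    bus = BufferedBus(Device())
    bus.store(Device.CTRL, 1)
    return bus.load(Device.STATUS, ordered)


if __name__ == "__main__":
    print("unordered load sees:", enable_then_check(False))
    print("ordered load sees:  ", enable_then_check(True))
```

The same rule (do not buffer, keep program order) is what memory-mapped
I/O registers already need, so memory-mapped SRs would reuse it.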

> 
> > It's used by SPARC and, I'm pretty sure, by ATM.
> >
> ATM ?
> 

ARM. Sorry.

> >There is no need for a specific bus; only the use of direct
> >addressing is of interest.
> >  
> >
> ???

Using set/get is like a load/store using direct addressing.
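
As a sketch of that equivalence: the SR_BASE window, the 8-byte stride and
the Machine class below are hypothetical, chosen only to show that an SR
access by index and a load/store at a fixed address reach the same state.

```python
SR_BASE = 0xFFFF0000   # hypothetical address window for the SRs
SR_COUNT = 16          # hypothetical number of SRs
WORD = 8               # assumed 64-bit registers, one per 8 bytes


class Machine:
    def __init__(self):
        self.sr = [0] * SR_COUNT
        self.mem = {}

    # Dedicated instructions: the SR number is an immediate in the opcode.
    def get(self, index):
        return self.sr[index]

    def put(self, index, value):
        self.sr[index] = value

    def _sr_index(self, addr):
        """Return the SR index for addr, or None if outside the window."""
        offset = addr - SR_BASE
        if 0 <= offset < SR_COUNT * WORD and offset % WORD == 0:
            return offset // WORD
        return None

    # Memory-mapped variant: direct (absolute) addressing, no base register.
    def load(self, addr):
        index = self._sr_index(addr)
        return self.sr[index] if index is not None else self.mem.get(addr, 0)

    def store(self, addr, value):
        index = self._sr_index(addr)
        if index is not None:
            self.sr[index] = value
        else:
            self.mem[addr] = value


if __name__ == "__main__":
    m = Machine()
    m.put(3, 42)
    print(m.load(SR_BASE + 3 * WORD))
```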

> 
> >>3) what mess ?
> >>
> >The kind of linked list that can't handle nested interrupts, which you
> >talk about regularly. (Shadow registers could be far easier...)
> >  
> >
> ???
> 

SRs were meant just for reading some constants; then you started using
them for system control (like trap handling). Each time there is some
dust on the design, it gets swept into the SRs, like under a wide carpet.

SRs are slow because access is serialized. SRs can't be preserved across
a context switch because it would be a mess. So only reads are permitted.

We could also put the trap pointer registers, the TLB ... there. But
register mapping seems so much easier (don't forget that SRs are direct
mapped).

nicO

> >>>nicO
> >>>
> >>YG
> >>    
> >>
> >nicO
> >  
> >
> YG again
> 