
Re: Rep:Re: Re: [f-cpu] No latches, please !



hi,

Michael Riepe wrote:
> On Fri, Feb 15, 2002 at 02:48:54AM +0100, Yann Guidon wrote:
> [...]
> > > A latch is quicker in terms of what? Delay time? Setup time? Clock
> > > frequency?
> >
> > first, it's a bit smaller. Depending on the amount of control logic,
> > the gain varies. The memory cell (4 transistors) requires decoding
> > and clock buffering, IIRC the sxlib FF uses around 20 transistors.
> >
> > Setup time + hold are the same or not, depending on the kind
> > of gates you use. For example, in sxlib, they use a couple of 4T cells
> > which are overwritten by using a buffer with high driving capability.
> > One of the inverters of the 4T cell has a low drive and the buffer can
> > overwrite the previous value.
> 
> Master-slave-FF with the cheapest latches available?

i don't understand what you mean.

> [...]
> > > Latches have one big disadvantage: they can become transparent.
> > and sometimes, they do it :-)
> That's the problem ;)
if you are in a situation where it is, i can do nothing for you ;-)

> > This is both an advantage and a drawback so you have to choose
> > where to use them. For example, the register set uses latches,
> > both for the registers and their "shadow flags" because we want
> > to know the flag's value, EVEN during a cycle when it is being
> > written to...
> 
> Horror! As long as the latch is transparent, you can't be sure that
> its outputs are correct (and stable), so using them is absolutely
> pointless. If you need the value of the flag before the register is
> written, grab it at the input.

i'll have to be more precise.

there are "shadow flags" which indicate whether a register is zero or
not. This is one of the hottest/nastiest things in the register set "entity".
in the 64-bit implementation, it takes 5 bits per register, one bit per "slice"
(there are 2 8-bit slices and 3 16-bit slices). the bits are updated
depending on the write mask, and they are read on every cycle so we know
whether a condition is true or false : it is a 2W1R bank (2 write ports,
1 read port).
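
To make the flag scheme concrete, here is a rough behavioural sketch in
Python (a model only, not the actual F-CPU VHDL). The slice positions are an
assumption, since the thread only gives the slice widths: the two 8-bit
slices are placed in the low 16 bits, followed by the three 16-bit slices.
The names slice_flags, update_flags and is_zero are hypothetical.

```python
# Behavioural sketch of the per-register "shadow flags".
# ASSUMPTION: slice positions (low bit, width); only the widths come from the thread.
SLICES = [(0, 8), (8, 8), (16, 16), (32, 16), (48, 16)]
FLAG_MASK = 0x1F  # 5 flag bits per register


def slice_flags(value):
    """Return 5 bits; bit i is 1 when slice i of the 64-bit value is non-zero."""
    flags = 0
    for i, (low, width) in enumerate(SLICES):
        if (value >> low) & ((1 << width) - 1):  # OR of the bits in the slice
            flags |= 1 << i
    return flags


def update_flags(old_flags, value, write_mask):
    """Replace only the flag bits selected by the 5-bit write mask."""
    new_flags = slice_flags(value)
    return (old_flags & ~write_mask & FLAG_MASK) | (new_flags & write_mask)


def is_zero(flags):
    """The register is zero when the OR of its 5 flag bits is 0."""
    return flags == 0
```

For example, writing a zero byte into slice 0 of a register whose only
non-zero slice was slice 0 leaves all five flags cleared, so the register
reads as zero.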

The problem arises when there is a bubble in the pipeline, which stalls
waiting for a result which conditions the issue of the currently decoded
instruction. During the R7 writeback cycle, the bits in a slice are ORed
together and sent to the 2W1R bank of 5*63 bits (each of the 5 bits is
conditioned by the write mask). The 1R output of the 5-bit vector is ORed
and gives the needed bit. If we use a transparent latch, the flow-through
time of the cell costs only one gate delay.
Otherwise, using a FF, there is the need to create a bypass path :
it introduces a MUX for each of the 5 bits because we have to choose
which of the old or new flag is read, depending on the write mask
and other flags. Although you might not like it, using a transparent latch
is an elegant solution and compromise. Using synchronous logic increases
the complexity of the logic and my brain has limited computational power :-(
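
The difference between the two options can be sketched with a small
cycle-level model in Python. This is only an illustration under simplifying
assumptions: one 5-bit flag entry, a single write port, and no timing or
glitch behaviour. The class and function names (LatchEntry, FlipFlopEntry,
bypass_read) are hypothetical and do not come from the F-CPU sources.

```python
FLAG_MASK = 0x1F  # 5 flag bits per register


def merge(old, new, enable):
    """Per-bit 2:1 selection: take 'new' where enable is 1, else keep 'old'."""
    return (new & enable) | (old & ~enable & FLAG_MASK)


class LatchEntry:
    """Transparent latch: while enabled, the output follows the input."""

    def __init__(self, q=0):
        self.q = q

    def drive(self, d, enable):
        self.q = merge(self.q, d, enable)
        return self.q  # new flags are visible in the same cycle


class FlipFlopEntry:
    """Edge-triggered FF: the output changes only at the clock edge."""

    def __init__(self, q=0):
        self.q = q
        self._next = q

    def drive(self, d, enable):
        self._next = merge(self.q, d, enable)
        return self.q  # still the old flags during the write cycle

    def clock(self):
        self.q = self._next


def bypass_read(ff, d, enable):
    """The extra MUX per bit that the FF version needs to see the new flags."""
    return merge(ff.q, d, enable)
```

In this model the latch returns the updated flags directly from drive(),
while the FF version needs bypass_read() to give the same answer during the
write cycle. The real write-mask and hazard conditions ("other flags") are
left out.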


>  Michael "Tired" Riepe <Michael.Riepe@stud.uni-hannover.de>
WHYGEE
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~