
Re: [f-cpu] Re: FC0 XBAR



hi,

Michael Riepe wrote:
> On Thu, Aug 02, 2001 at 12:24:43PM +0200, Yann Guidon wrote:
> > Juergen Goeritz wrote:
> > > On Wed, 1 Aug 2001, Yann Guidon wrote:
> > > > would it be possible to move this discussion to the main
> > > > English f-cpu list ?
> > > Sure!
> > cool :-)
> > other people will be able to participate and help.
> Other people (like me) first have to catch the thread and see what the
> heck is going on :(
we had started talking about the FC0 and its features.
i thought that other people would find it useful and might step
into the discussion. after all, there's nothing secret.

> [...]
> > > > now the problem is when the written register is the same as the read register.
> > > > gut feeling tells me that the signal couldn't propagate fast enough.
> > > > some kind of bypass could become necessary.
> > >
> > > Yes, but write usually is at least one cycle delayed, isn't it?
> > it depends, but it doesn't solve the problem : the delay only "moves"
> > the problem from one cycle to another...
> 
> IMHO, reading and writing "in the same cycle" means that *by the end
> of the cycle* (when the clock signal rises) the new data is stored in
> the register, while the reader gets a copy of its *previous* contents
> at the same time.

that's what i feared :-(
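
To make the "same cycle" semantics described above concrete, here is a minimal sketch in C of a register file whose writes are committed only at the clock edge. All names (`regfile_t`, `rf_read`, `rf_write`, `rf_clock`) and the register count are assumptions made for illustration; this is not the FC0 source.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_REGS 64 /* assumed register count, for illustration only */

typedef struct {
    uint64_t reg[NUM_REGS]; /* committed contents, visible to readers */
    int      wr_pending;    /* nonzero if a write was requested this cycle */
    unsigned wr_index;      /* destination of the pending write */
    uint64_t wr_data;       /* value of the pending write */
} regfile_t;

/* Read during the current cycle: only committed contents are visible. */
uint64_t rf_read(const regfile_t *rf, unsigned r)
{
    assert(r < NUM_REGS);
    return rf->reg[r];
}

/* Request a write during the current cycle; nothing is stored yet. */
void rf_write(regfile_t *rf, unsigned r, uint64_t v)
{
    assert(r < NUM_REGS);
    rf->wr_pending = 1;
    rf->wr_index = r;
    rf->wr_data = v;
}

/* Rising clock edge: the pending write, if any, is committed. */
void rf_clock(regfile_t *rf)
{
    if (rf->wr_pending) {
        rf->reg[rf->wr_index] = rf->wr_data;
        rf->wr_pending = 0;
    }
}
```

With this model, calling `rf_write(&rf, 3, x)` and then `rf_read(&rf, 3)` before `rf_clock(&rf)` returns the previous contents of register 3, which is exactly the case where a bypass (or a delay) is needed.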

> > > So the compiler could take care of not using the same register
> > > as source in the next instruction(s) that must be written to
> > > first. But this implies that you know exactly how the pipeline
> > > is constructed and how it works inside the compiler.
> > this is not an option : the compiler and the binaries should
> > be independent of the microarchitecture (or at least keep a minimal
> > compatibility). putting such a constraint on the compiler is
> > neither desirable nor easy...
> >
> > we CAN detect when the bank is accessed both for read and write.
> > we can even delay the instruction that does that (but it's not desirable).
> > however, in some cases the hardware might not need
> > such a measure. it depends too much on the silicon characteristics...
> 
> If there is a read-after-write dependency, we have to a) bypass the
> result of the first instruction (if the result arrives in time) or b)
> delay the second instruction (if bypassing is not possible or the result
> is NOT ready).

direct bypassing is already implemented.

Then there is another problem ! since the result is really written
during the next cycle but available only the cycle after that, people
who want to optimise the instruction flow but who don't do it "perfectly"
are hit by a 1-cycle delay. Obviously, delaying is not a good thing :
people will try to schedule the instructions to match the pipeline's
characteristics, but there is a good chance that, under the pressure of
the program, the available resources and so on, they hit the delay anyway.
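
One way to read the problem described above is as a function of the distance between the producing instruction and the one that consumes its result. The sketch below is only an interpretation under loud assumptions: a single-cycle producer, one level of direct bypass, and a register file whose freshly written value is readable one cycle after the write. The names `operand_path_t`, `operand_path` and `stall_cycles` are hypothetical.

```c
#include <assert.h>

typedef enum {
    OPERAND_BYPASS,  /* taken from the direct bypass, no delay */
    OPERAND_STALL,   /* result is being written, not yet readable */
    OPERAND_REGFILE  /* read normally from the register file */
} operand_path_t;

/* distance = 1 means the consumer immediately follows the producer.
 * Assumption: the result is written during the cycle after it is
 * produced and becomes readable only the cycle after that. */
operand_path_t operand_path(unsigned distance)
{
    assert(distance >= 1);
    if (distance == 1)
        return OPERAND_BYPASS;
    if (distance == 2)
        return OPERAND_STALL;
    return OPERAND_REGFILE;
}

/* Number of cycles the consumer is delayed under the same assumptions. */
unsigned stall_cycles(unsigned distance)
{
    return operand_path(distance) == OPERAND_STALL ? 1u : 0u;
}
```

Under these assumptions, the only penalised schedule is the "almost perfect" one: a consumer placed two instructions after its producer falls into the gap between the bypass and the register file.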


> > > > i'll analyse this issue when the C code is translated into VHDL.
> > > Do you simulate the register accesses as well?
> > i start from the idea that the hardware CAN read and write the same data.
> > it is easier to handle because there is no special case.
> > i could then add a 1-cycle write latency feature, in the future,
> > but it makes the instruction decode/issue more complex...
> [...]
> 
> The same *data* or the same *register*?
ooops, register. more precisely, read AND write the same register.

>  The former needs a bypass (or 1-cycle delay), the latter doesn't.
i'll have to find a way to add a second level of bypass.
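
A second level of bypass could look roughly like the operand selection below. This is a hedged sketch, not the FC0 design: it assumes one slot for the result leaving the execution unit this cycle (`fresh`) and one for the result currently being written to the register file (`writing`). The type `bypass_slot_t` and the function `select_operand` are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool     valid; /* slot holds a result this cycle */
    unsigned dest;  /* destination register of that result */
    uint64_t data;  /* the result itself */
} bypass_slot_t;

/* Pick the value of source register src, newest producer first:
 * level 1 (fresh result), then level 2 (result being written),
 * and only then the committed register file contents. */
uint64_t select_operand(unsigned src,
                        const bypass_slot_t *fresh,
                        const bypass_slot_t *writing,
                        const uint64_t *regs)
{
    assert(fresh && writing && regs);
    if (fresh->valid && fresh->dest == src)
        return fresh->data;
    if (writing->valid && writing->dest == src)
        return writing->data;
    return regs[src];
}
```

The priority order matters: if both slots target the same register, the fresh result is the younger one and must win. The cost is a wider multiplexer in front of each operand, which is where the timing concern comes from.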

>  Michael "Tired" Riepe <Michael.Riepe@stud.uni-hannover.de>
WHYGEE
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~