
[f-cpu] delayed branches



hi !

Juergen Goeritz wrote:
> > > I think this kind of hack should NEVER
> > > be used for a core whose bit width is upward extensible!!!
> > > Just my opinion though ;)
> >
> > don't worry : as soon as FC0 is running, FC1 will be created.
> > then FC2, FC3... these new cores (yet to be done) will hopefully
> > take our errors into account :-)
> 
> :-] That's not what I want to achieve.
of course. because all the sources are under GPL, you can keep working
on FC0 while FC1 and the others are under way, unlike in the industry :-)

> If you look at
> today's designs they lack one thing - real reusability.
distribution under GPL ensures that.

> The designs of today like PCs are built to be dumped a
> short while later (e.g. 2-5 years) thus wasting a lot
> of resources. The PC concept could be improved a lot.
> But that would mean bye-bye Windows, bye-bye Linux as
> they are today. The next technology jump will free us
> from operating system installation and driver problems!
> :-)

let's start by making CPUs first ;-)

> > Maybe FC1 will "solve" this issue but this trick keeps the pipeline short,
> > which balances the loss. nicolas proposed to use delayed branches but this is
> > not a recommended practice for something that could go superscalar !
> I agree - if you can't solve it later you must solve it before! :-)
i am simply taking into account what has happened in the CPU field over the
last decades. some CPUs use delayed branches, but none of the recent cores do.
there are very good reasons for this choice.

> > Proposition for a HW-based instruction swapper to perform "delayed branch"
> > without software help :
> >   if there is no dependency between the jump instruction and the instruction
> >   that precedes it (if it is not a jump instruction too, and if it doesn't use
> >   the same registers), swap the two instructions in the instruction cache.
> >
> > I hope that it makes nicolas happy :*)
> 
> Oops. Why do it in hardware? SPARC does it in software.
SPARC is almost 20 years old ;-) and it was not designed to
execute several instructions per cycle.

> They say that the next instructions after the branch instruction
> will always be executed because of the pipeline structure.
> Thus every compiler for SPARC already supports the required
> feature.

however, CPUs like ALPHA (RIP) explicitly don't use that.
they rely on branch prediction to balance the loss, because
the pipeline gets so long that a delay slot would take several
cache lines ! If you can decode 4 instructions/cycle, and your
jump latency is 4 cycles, you need 512 bits of instructions
(16 instructions of 32 bits) in the delay slot.
-1) it is almost impossible to fill with useful instructions
-2) on average, jumps can occur every 4 instructions.
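
A quick way to check the arithmetic above, as a minimal sketch. The 32-bit
instruction width is an assumption (it is what the 512-bit figure implies
for 4 x 4 instructions), and the function and parameter names are made up
for illustration only.

```python
def delay_slot_size(decode_width, jump_latency_cycles, insn_bits=32):
    """Return (instructions, bits) that a delay slot would have to hold.

    decode_width        -- instructions decoded per cycle
    jump_latency_cycles -- cycles before the jump target is fetched
    insn_bits           -- instruction width in bits (assumed: 32)
    """
    # Every decode slot during the jump latency must be filled with
    # an instruction that executes whether or not the jump is taken.
    instructions = decode_width * jump_latency_cycles
    return instructions, instructions * insn_bits


if __name__ == "__main__":
    insns, bits = delay_slot_size(4, 4)
    print(insns, "instructions,", bits, "bits")  # 16 instructions, 512 bits
```

With a decode width of 1 and a latency of 1 cycle the same function gives a
single 32-bit slot, which is the classic one-instruction delay slot.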

There are good reasons not to use Delayed Branches for F-CPU,
but the FC0 could need one. So instead of forcing a bad choice
on ALL the F-CPU family members, i prefer to make a little HW hack
on one member.
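
To make the swap rule quoted earlier in this mail concrete, here is a small
software model of it. This is only a sketch under stated assumptions, not
the FC0 logic: the Insn record, its fields and the idea of treating one
cache line as a Python list are all hypothetical, and it ignores everything
else real hardware would have to check (for example a jump target landing
on the swapped pair).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insn:
    """Hypothetical decoded instruction, for illustration only."""
    text: str
    is_jump: bool = False
    regs: frozenset = frozenset()  # every register the instruction touches


def can_swap(prev, jump):
    # Rule from the proposal: the second instruction is a jump, the one
    # before it is not a jump, and the two share no registers.
    return jump.is_jump and not prev.is_jump and not (prev.regs & jump.regs)


def swap_pass(line):
    """Return a copy of `line` with each eligible (insn, jump) pair swapped."""
    out = list(line)
    i = 1
    while i < len(out):
        if can_swap(out[i - 1], out[i]):
            out[i - 1], out[i] = out[i], out[i - 1]
            i += 2  # skip the moved instruction so it is not moved twice
        else:
            i += 1
    return out


if __name__ == "__main__":
    line = [
        Insn("add r1, r2, r3", regs=frozenset({1, 2, 3})),
        Insn("jmp r4", is_jump=True, regs=frozenset({4})),
        Insn("sub r5, r6, r7", regs=frozenset({5, 6, 7})),
        Insn("jmp r5", is_jump=True, regs=frozenset({5})),
    ]
    for insn in swap_pass(line):
        print(insn.text)
```

In this example the first pair is swapped (the add ends up behind "jmp r4"),
while the second pair is left alone because both instructions use r5.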

> Read you
> JG
WHYGEE
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~