
Re: [f-cpu] Taking decision on the project



On Mon, 9 Dec 2002 22:08:03 +0100
Michael Riepe <michael@stud.uni-hannover.de> wrote:

> On Mon, Dec 09, 2002 at 02:22:27PM +0000, Nicolas Boulay wrote:
> 
> > I don't have time as Michael to go deep regularly inside the manual.
> > But i don't like so much to change the manual depending on what is
> > implemented ! That is not the goal of a manual.
> 
> What *is* the goal of a manual, then?
> It's not a wish list, is it?
> 

That's the main point. But I just want to point out here that the manual
comes first, before the implementation. I'll come back to this later.

> [...]
> > Our main document is the Manual, so Cedric find this way to speed up
> > decision : if nobody are against, it's adopted. During almost 2
> > year, the manual did not change !
> [...]
> 
> I guess `speed up' is an euphemism for `force'?
> 

:) At least it keeps some dead discussions alive.

> The manual must not change every day/week/month/year. It's THE
> reference for everybody who is going to implement something - VHDL, C
> or assembler, that doesn't matter. It should only change if it's
> wrong, or there is an implementation problem - and in either case,
> changes should be minimal. Everything else belongs into a wish list -
> but not into the manual.
> 
> If I can't rely on the manual, I can't do anything. I can't implement
> a feature (such as a new instruction) as long as its design is
> changing. Neither can I finish my assembler as long as the instruction
> set is modified again and again (therefore it's still at version 0.0),
> or start working on its emulator companion :(
> 
> The manual wasn't `not changing', it was *stable*, and that definitely
> was a Good Thing (tm). This point of view probably doesn't matter for
> you, but it does for me, and probably for everybody else who wants to
> contribute code to the F-CPU project - whether it's VHDL or supporting
> software like compilers, tools, operating system ports and so on.
> Frequent design changes will just annoy and discourage active
> developers as well as developers `in spe'.
> 

[fc0 problem]

A moving spec kills a project; I see that in my everyday work. But from
my point of view, the manual is only 90% done.

It lacks some important points, such as interrupt/trap/exception
management.

Some instructions are crazy (like the old cache management instructions,
since removed), but the MAC instructions are bad too: they introduce a
RAW dependency inside the instruction itself.

It lacks multi-CPU support, which is a big gap for a 2004 CPU! (ll/sc
are for single-CPU systems!)

 [LSU issue]
It's a nice trick to tie a cache line to the register address. So while
you decode your opcode + register bank, a cache line is already
available. For very quick access, it needs as many lines as registers
(that's not a big deal!). That's great for code.

But not for data. Why? Because it introduces incoherency in the memory
flow: what happens if 2 registers map the same address? 2 lines will be
filled, and each of the 2 lines could be modified on its own side!
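
The hazard can be shown with a toy model. This is only a sketch: the
LineBuffer class, LINE_SIZE and the dict used as memory are invented
here for illustration and are not taken from the F-CPU manual.

```python
LINE_SIZE = 32  # bytes per line; an assumed value for this sketch


class LineBuffer:
    """One cache line attached to one pointer register (hypothetical model)."""

    def __init__(self, memory, address):
        self.memory = memory
        self.base = address - address % LINE_SIZE
        # Private copy of the line, filled when the register is loaded.
        self.data = [memory.get(self.base + i, 0) for i in range(LINE_SIZE)]

    def load(self, address):
        return self.data[address - self.base]

    def store(self, address, value):
        # The store only updates this register's own copy of the line.
        self.data[address - self.base] = value


memory = {}                      # sparse byte-addressed memory, all zero
r1 = LineBuffer(memory, 0x1000)  # two pointer registers ...
r2 = LineBuffer(memory, 0x1000)  # ... holding the same address

r1.store(0x1000, 42)
fresh = r1.load(0x1000)          # r1 sees its own store
stale = r2.load(0x1000)          # r2 still sees the old content
```

Without alias detection between the two lines, r2 never learns about
the store made through r1.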

Whygee proposes to detect 2 aliases, maybe 3. Besides the fact that it
will be a huge piece of silicon, I believe that's not acceptable for
the compiler. Finding all memory aliases in C code will be such a
mess! Our recent GCC expert could say more about that.

My proposal was to use the stream hints. That's what they are supposed
to do: split memory streams, to avoid checking read-after-write memory
hazards. If the compiler doesn't use them cleanly: shame on it :) That
gives 7 cache lines instead of 64; that's fewer, but much easier to
handle.
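
A rough sketch of the idea, with one line per stream instead of one line
per register. The StreamLines class, LINE_SIZE and the write-back policy
are assumptions made for this example; only the figure of 7 lines comes
from the paragraph above.

```python
LINE_SIZE = 32   # bytes per line; an assumed value for this sketch
NUM_STREAMS = 7  # number of lines quoted above


class StreamLines:
    """One cache line per stream hint (hypothetical model)."""

    def __init__(self, memory):
        self.memory = memory
        self.lines = {}  # stream id -> (base address, list of bytes)

    def _line(self, stream, address):
        if not 0 <= stream < NUM_STREAMS:
            raise ValueError("unknown stream")
        base = address - address % LINE_SIZE
        current = self.lines.get(stream)
        if current is None or current[0] != base:
            if current is not None:
                # Write the old line back before reusing the slot.
                for i, byte in enumerate(current[1]):
                    self.memory[current[0] + i] = byte
            data = [self.memory.get(base + i, 0) for i in range(LINE_SIZE)]
            current = (base, data)
            self.lines[stream] = current
        return current

    def load(self, stream, address):
        base, data = self._line(stream, address)
        return data[address - base]

    def store(self, stream, address, value):
        base, data = self._line(stream, address)
        data[address - base] = value


lsu = StreamLines({})
lsu.store(0, 0x1000, 42)
same_stream = lsu.load(0, 0x1000)   # coherent, whichever register is used
other_stream = lsu.load(1, 0x1000)  # misuse: same address, another stream
```

Accesses that go through the same stream share one line, so no alias
check is needed between them. The last line shows what happens when the
compiler splits one address over two streams: the hazard comes back, and
in this scheme that is the compiler's responsibility.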

 [OOC]
Maybe the biggest problem is exception handling, done "in the decoder"
as Whygee said. That's okay for integer ops. It means: keep the pipeline
in order until every possible exception could have been raised, so that
execution can be cancelled and control goes to the trap handler.

That's a nice feature: we can hide the latency of instructions such as
"div" behind other instructions, and register numbers keep things
coherent. The div_by_zero condition is checked very early in the
pipeline; after that, the div can run asynchronously without problems.
That means later instructions will modify some registers before the
result of the div is written. In case of an external interrupt, we just
wait for those instructions to complete.

But it doesn't work any more for the floating point unit. NaN and
infinity must be implemented. Those exceptions come at the end of the
unit's pipeline. We cannot hide latency any more. Each instruction must
wait for the previous one to complete entirely. It will be so slow!

The reorder buffers of today's CPUs are not the stupid bad dream of
some engineer who smokes too much! They allow register bypass and
continuous execution, and avoid waiting like that. They relax the
requirement to put each unit's exception handling early in the
pipeline.

[fc0 lack]

 [2r1w->3r2w]
Besides that, there are "some choices" that I dislike in FC0. For
example, the trick to be 3r2w while using 2r1w instructions. We use
3r2w register ports, but 90% of our instruction set is 2r1w (register
access is at least 30% slower with 3r2w than with 2r1w, at least! That's
the difference between a 2 GHz and a 1.4 GHz CPU). And the compiler
risks scheduling only 32 registers.

(That's why I proposed my "little VLIW": 4r2w, but with 2 instructions
issued per cycle. Or we could also use 2 separate register banks (and
double the number of registers :) to get 2r1w speed per instruction
slot, so at least 1.3*2=2.6 times the original speed. I don't think
that introduces much change to the manual.)
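
The arithmetic behind those figures, spelled out. The 30% penalty and
the 1.3 factor are the estimates given above, not measurements; the
variable names are made up for this sketch, and the peak figure assumes
both issue slots are filled on every cycle.

```python
base_clock_ghz = 2.0  # example clock used above
port_penalty = 0.30   # estimated slowdown of 3r2w versus 2r1w access

# Clock reachable once the register file pays the 3r2w penalty.
slow_clock_ghz = base_clock_ghz * (1 - port_penalty)

per_slot_gain = 1.3   # estimated gain of a 2r1w bank over 3r2w
issue_width = 2       # two instructions issued per cycle

# Upper bound: assumes both slots are always busy.
peak_speedup = per_slot_gain * issue_width

print(f"3r2w clock: {slow_clock_ghz:.1f} GHz")
print(f"peak speedup: {peak_speedup:.1f}x")
```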

 [.call mess]
The last point I dislike is the hundred clock cycles needed to make a
"typical" function call. The example given by Cedric was scary! I
program SPARC-like CPUs (ERC32 == SPARC V7), where a "typical" .call
takes 3 clock cycles...

I believe more and more that register windows should be considered.
It's the compiler's job to do the right thing! (Maybe 24 rotating
registers + 40 fixed ones, otherwise the window would be too small?)
SPARC always reserves one window for interrupt handling, so 8 fresh
registers are always ready for it (no stack manipulation for a simple
handler, no complex SRB, no use of SR).
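
A minimal sketch of how such a window could map the 64 register numbers
onto a larger physical file. Only the 40 fixed + 24 rotating split comes
from the suggestion above; the SPARC-style 8-register overlap, the
number of windows and all names are assumptions, and window
overflow/underflow (spilling to the stack) is left out.

```python
FIXED = 40     # registers shared by every function
WINDOW = 24    # rotating registers visible at once
OVERLAP = 8    # assumption: caller "out" registers become callee "in"
NWINDOWS = 4   # assumption: number of windows kept on chip

STEP = WINDOW - OVERLAP          # registers a call actually shifts by
ROTATING_PHYS = NWINDOWS * STEP  # physical rotating registers


def physical_index(logical, cwp):
    """Map a register number to a physical slot for window pointer cwp."""
    if not 0 <= logical < FIXED + WINDOW:
        raise ValueError("no such register")
    if logical < FIXED:
        return logical  # fixed registers are never renamed
    offset = logical - FIXED
    return FIXED + (cwp * STEP + offset) % ROTATING_PHYS


def call(cwp):
    """A call only moves the window pointer; nothing is stored to memory."""
    return (cwp + 1) % NWINDOWS


caller = 0
callee = call(caller)
# The caller's last 8 rotating registers are the callee's first 8.
shared = physical_index(FIXED + STEP, caller) == physical_index(FIXED, callee)
```

With this layout, passing arguments costs nothing: the caller writes its
"out" registers, the call rotates the pointer, and the callee reads the
same physical registers as its "in" registers.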

> Look, I want this project to be successful. But if you don't stop
> changing the specs, FC0 will still not be up and running in 2022 :( So
> please let the manual stabilize again, and keep all your great ideas
> in a separate document, for FC1 or later versions of the F-CPU core.
> 

Thanks for the "great idea" :) I insist on it because it's a 10% change
to the manual for a roughly 250% speed improvement. I should write it
up :)

nicO

> -- 
>  Michael "Tired" Riepe <Michael.Riepe@stud.uni-hannover.de>
>  "All I wanna do is have a little fun before I die"
> *************************************************************
> To unsubscribe, send an e-mail to majordomo@seul.org with
> unsubscribe f-cpu       in the body. http://f-cpu.seul.org/
> _____________________________________________________________________