
Re: [f-cpu] Resume of Fcpu features



hi !

nicO wrote:
> I wrote a french article (good for prior art proof and money) about the fcpu.
yummmy ! i'm eager to read it :-) but i hope that it is not written in english.
if it is, then let me correct your grammar, at least :-)

regarding prior art, i have a few Soleau envelopes, just in case...

> This work will be used for the 18C3 conference, too. (if they respond to my email !!!)
> 
> So the fcpu features:
> - 64-bit SIMD (extensible by powers of 2)
> - 64 registers
> - 32-bit instruction words
> - superpipelined
> - not superscalar (4 ways max in the future)
are we speaking about the "FC0" or "F-CPU family" ?
do not forget to make a clear distinction between
architecture and implementation.

> - no OOO (no need)
> - so no register renaming (no need either)
> - no branch prediction (for the moment) <<-- no need.
that's for FC0.

> - associative memory between register name and memory content to speed up
> memory fetch and load (but there is a hard point: how do we handle
> memory aliases?)
  --> memory aliases are coherent but the current implementation
      of FC0 incurs a penalty of several cycles when 2 or more registers
      access the same 32-byte chunk. There is a way to solve that
      but it's heavier (at first, i had another idea than what is done
      now, but it is more "expensive").
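
To make the penalty condition concrete, here is a minimal sketch that only detects when two or more registers point into the same 32-byte chunk. The function names and the register/address table are hypothetical illustrations; nothing here models how FC0 actually detects or resolves the conflict.

```python
CHUNK_BYTES = 32  # chunk size quoted in the message above


def same_chunk(addr_a, addr_b, chunk_bytes=CHUNK_BYTES):
    """True when both byte addresses fall inside the same aligned chunk."""
    return addr_a // chunk_bytes == addr_b // chunk_bytes


def aliasing_groups(reg_addrs, chunk_bytes=CHUNK_BYTES):
    """Map chunk index -> registers, keeping only chunks shared by 2+ registers.

    reg_addrs is a hypothetical {register name: byte address} table.
    """
    groups = {}
    for reg, addr in reg_addrs.items():
        groups.setdefault(addr // chunk_bytes, []).append(reg)
    return {chunk: regs for chunk, regs in groups.items() if len(regs) > 1}


if __name__ == "__main__":
    # R1 and R2 share the chunk starting at 0x100; R3 is alone in its chunk.
    print(aliasing_groups({"R1": 0x100, "R2": 0x118, "R3": 0x200}))
```
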

> - intensive use of cmove and cjump required
we are modern or we are not :-)

> - RISCy instructions: inst R1, R2, R3
you can reuse the EPF slides for this part.

> - 2 addressing modes : immediat and register only.
??? immediate ?

> - no register windows (handling the trap is overkill, but maybe to
> increase the number of registers in the future, or we will use 64-bit
> instruction words to access 65535 registers ? why not ? There is enough
> room ! ;p)
can you imagine the size of the physical register bank ?
however you forget about the other alternative we discussed :
when memory is mapped to registers.

> - The SRB to speed up register saving for the trap handler; it could be used
> for loading or saving memory in packets of registers (packets of 8).
oh i remember why it is 8 : each cache line can contain 4*64 bit registers,
and it works with a "double-buffered" strategy, so there is a "sliding
window" of 8 registers that can be saved at a time.
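
The arithmetic behind the packet of 8 can be sketched as follows. The two-buffer alternation and the order in which lines are walked are assumptions made for illustration; the message only gives the sizes (4 x 64-bit registers per cache line, double-buffered), not the actual SRB sequencing.

```python
REG_BITS = 64        # register width
REGS_PER_LINE = 4    # 4 * 64 bits fit in one cache line
BUFFERS = 2          # "double-buffered" strategy
WINDOW = REGS_PER_LINE * BUFFERS  # sliding window of 8 registers


def line_bytes():
    """Cache line size implied by the figures above (32 bytes)."""
    return REGS_PER_LINE * REG_BITS // 8


def save_schedule(num_regs=64):
    """Yield (line index, buffer index, register numbers) for each cache line.

    Assumption: lines are filled in register order and alternate between
    the two buffers, so two consecutive lines form one window of 8.
    """
    for line, first in enumerate(range(0, num_regs, REGS_PER_LINE)):
        regs = list(range(first, min(first + REGS_PER_LINE, num_regs)))
        yield line, line % BUFFERS, regs


if __name__ == "__main__":
    for line, buf, regs in save_schedule():
        print(f"line {line:2d} buffer {buf}: R{regs[0]}..R{regs[-1]}")
```
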

> But it uses a mix of special registers, and i have a problem with where to put
> the return address (there is no stack in the fcpu !), do we put it in R0 ?

i should dust off the documents where we discuss the format of the CMB...
don't worry too much anyway : your "PC" is stored to memory, at a location
pointed to by the current CMB pointer, plus the offset that corresponds
to the PC (i don't remember which).
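
As a rough sketch of that addressing rule: the save location is the current CMB pointer plus a fixed offset. The offset is not given in this message, so it is left as a required parameter here, and the function name is hypothetical.

```python
def pc_save_address(cmb_pointer, pc_offset):
    """Address where the PC would be stored: CMB pointer plus the PC's offset.

    pc_offset is deliberately not defaulted: its real value comes from the
    CMB format documents, which are not quoted in this thread.
    """
    if cmb_pointer < 0 or pc_offset < 0:
        raise ValueError("pointer and offset must be non-negative")
    return cmb_pointer + pc_offset


if __name__ == "__main__":
    # Both values below are made up purely to show the call.
    print(hex(pc_save_address(0x8000, 0x10)))
```
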

> - TLB very close to the processor, but we use the L0 cache to precheck the
> TLB before effectively using the data -> so the L1 caches are physically cached
> 
> My question is : did i forget something ?

the problem is : when you start to scratch the surface, it never ends.
i'm trapped here for a long time already and i don't know when i'll be
released ;-)

thank you,
> nicO
WHYGEE
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~