
Re: Re: [f-cpu] New suggestion about call convention




hi,

>From: nico

>On Tue, 05 Nov 2002 00:50:27 +0100
>Yann Guidon wrote:

<snip intro> 
>> <snip examples>
>> 
>> >The problems of the first solution are :
>> >- complexity
>> >- popcount unit must not be optional
>> >- block the CPU for 3/4 cycles (before being sure that no TLB trap happens)
>> >  
>> not only that, but :
>>  - instruction lifelength is not static ==> more difficult to decode and 
>> schedule
>
>??? I need to see a proof of that.

proof of what ?
 * the instruction lifelength is not static because
 the number of operations is indicated by a register,
 not a field in the opcode.
 * if the instruction is not equivalent to a static
 dataflow graph, then it is not possible to schedule it
 in FC0.
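
To make the difference concrete, here is a minimal sketch
in Python (not F-CPU code). The opcode names, the latency
numbers and the helper names are assumptions made up for
this illustration; only the idea (latency from the opcode
vs. latency from a register value) comes from the text above.

```python
# Hypothetical latencies, for illustration only (not real FC0 figures).
STATIC_LATENCY = {"add": 1, "mul": 3}


def static_latency(opcode):
    """Latency is known at decode time, from the opcode alone."""
    return STATIC_LATENCY[opcode]


def masked_store_latency(mask_register_value):
    """Latency of a hypothetical masked store: one step per set bit.

    The result depends on the *contents* of a register, so it is
    only known after the register has been read, not at decode time.
    """
    return bin(mask_register_value).count("1")


print(static_latency("add"))
print(masked_store_latency(0b1011))
```

The scheduler can look up the first function from the
instruction word alone; the second one needs a value that
only exists at run time.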

Now you can admit that it is "a bit more complex" than
a simple ADD or even the division unit (which is an
exception to the static scheduling because it has a
static datapath).

>>  - instruction cannot be interrupted in the middle
>>      (IRQ/whatever) ==> IRQ response time is unpredictable :-(

>Like our /0 trick,

huh ?

> the pipeline should check IRQ first.
FC0 doesn't "check IRQ".
The new instruction flow is inserted in the pipeline
whenever it is available and can be issued.
It can be ok to delay IRQ while an instruction
waits for the operands to be ready before it is
issued, but allowing more delay (particularly
when it could have been avoided with the use of
discrete instructions) reduces the system's
responsiveness. It may be completely off-topic
for an average desktop PC, but F-CPU is not
meant to be used only there.


> And then the following stay asynchronous.

?

do you mean that IRQ is blocked when the instruction
runs, or do you allow asynchronous IRQ in the middle
of the instruction ?...

>>  - it can't be pipelined (issued and then another instruction can be 
>> decoded)
>It could.
then tell me how.

> Where is the problem ?
people have sexy ideas but no way to integrate
them in the existing framework.

Think about it : the existing FC0 pipeline
is designed in such a way that an instruction
implements a simple function : "add" is decoded,
operands are fetched, result is computed and
written back. THAT can be pipelined and it works well.

Now if an instruction must perform several steps,
it has to "stay" in the decoder, so that the steps
can reuse the existing pipeline. This means that
the instruction is "blocking" because no other
instruction can start decoding. This is why it is
not "pipelinable" because even if the rest of the
data pipeline is used, the instruction fetch and decode
pipeline is stalled and no IRQ can be acknowledged.
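
A toy model of that stall, as a sketch in Python. It assumes
(this is a simplification of mine, not a description of FC0)
that each instruction holds the decoder for a given number
of cycles and that an IRQ can only be taken when the decoder
is free. The function name and the cycle counts are
hypothetical.

```python
def irq_latency(decoder_occupancy, irq_cycle):
    """Cycles an IRQ raised at irq_cycle waits for a free decoder.

    decoder_occupancy: for each instruction in program order, the
    number of cycles it keeps the decoder busy (1 for a simple
    instruction, N for an N-step blocking instruction).
    """
    cycle = 0
    for steps in decoder_occupancy:
        if cycle >= irq_cycle:
            break          # decoder is free at or after the IRQ
        cycle += steps     # decoder is held for the whole instruction
    return max(cycle - irq_cycle, 0)


# One simple instruction, then one 8-step blocking instruction.
blocking = [1, 8, 1]
# The same work written as discrete single-cycle instructions.
discrete = [1] + [1] * 8 + [1]

print(irq_latency(blocking, 2), irq_latency(discrete, 2))
```

With these made-up numbers the IRQ raised at cycle 2 waits
7 cycles behind the blocking instruction and 0 cycles in the
discrete version, which is the responsiveness issue
mentioned earlier.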


> You have to play with a contention on the register bank.
i wouldn't call that "play"....

>>  - the read port is connected to the instruction buffer ==> it is not 
>> possible to generate the sequence of registers to be saved. And even a counter 
>> would not be ok (in order to generate the register numbers), because the 
>> mask can have holes !
>
>You could mask the holes. But then you lose cycles.
heh. that's what i meant.

> I'm pretty sure that a
>"sequencer generator" could be used.
a #what# ?...

and don't forget about the "tight pipeline stages" :
if the "solution" takes more than 1 cycle per register,
then it's worthless.
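
For reference, the two ways of turning a mask into a
sequence of register numbers can be sketched like this
(Python, software model only). The function names and the
default of 64 registers are assumptions for the example,
and nothing here shows that the second variant would fit
in one FC0 pipeline stage.

```python
def regs_by_counter(mask, nregs=64):
    """Step a counter over every register number, holes included.

    Returns (register numbers to save, steps spent).
    """
    regs = [r for r in range(nregs) if (mask >> r) & 1]
    return regs, nregs


def regs_by_find_first_set(mask):
    """Pick the lowest set bit at each step (priority-encoder style).

    Returns (register numbers to save, steps spent); holes cost nothing.
    """
    regs = []
    while mask:
        lowest = mask & -mask              # isolate the lowest set bit
        regs.append(lowest.bit_length() - 1)
        mask &= mask - 1                   # clear that bit
    return regs, len(regs)


mask = 0b10010010                          # registers 1, 4 and 7, with holes
print(regs_by_counter(mask))
print(regs_by_find_first_set(mask))
```

Both return the same register list; the counter spends one
step per register number, the find-first-set version one
step per register actually saved. In hardware the second
one needs a wide priority encoder in the critical path,
which is where the "1 cycle per register" constraint bites.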

>> >For the second solution :
>> >- complexity
>> >- popcount unit must not be optional
>> >- block the CPU for 3/4 cycles like the first solution, and you need to use 
>> >this instruction more frequently than the previous one, but this 
>> >solution gives you the possibility to skip a chunk if not needed.
>> same remarks as before.
>> it's a multicycle, CISC instruction with most of the problems.
>
>the biggest problem is the connection of the read/write port, which
>gets in the way of the instruction buffer, but that is the case for SRB, too.

remember that SRB is "optional" .....

>> >Sorry for the length, but I hope it could be interesting,
>> >	Cedric
>> >
>> well, at last it made it to this list.
>> 
>> YG 
>
>Maybe the idea of Michael is better (SW). It's okay if the linker could really do the job. Otherwise...

i would go for it. However, the loadm/storem have
some of the problems of the masked load/stores.

YG
