[f-cpu] the end of the Smooth Register backup issue ?
Christophe Avoinne wrote:
No no ! you people are mixing wrong things...
I don't think it's the same as for ARM with some special registers mapped.
The main trouble is that, not having a dedicated stack register, we
cannot design interrupt and trap gates for F-CPU (that is, saving the caller's
return address on the stack). So we are all stuck struggling to find the best
way to fit CMB into fast switching. But is it really worth it? I mean,
is there any real situation where a handler really needs to use all 64
registers? I'm quite doubtful...
I do not think that the absence of a stack is such a problem,
because a stack brings its own issues.
May I also refresh our collective memory?
Years ago, it was found that the best way to accommodate the
lack of SRB in the early days is to create a set of "scratch SRs".
They are located outside the datapath and are accessed with the normal get/put
instructions, so there is no architectural impact.
These scratch registers are used at the entry of "fast" handlers
to save the first few vital registers: the absence of a stack
does not let us "push" registers, so we save them in fixed
locations in the SR space instead.
So when a trap/IRQ/exception/whatever occurs AND no SRB
is available/possible/anti-orthodox/whatever, the handling code
simply starts with
put SR_SCRATCH_1, r1
put SR_SCRATCH_2, r2
put SR_SCRATCH_3, r3
put SR_SCRATCH_4, r4
loadcons pointer_constant, r1
There must also be an additional mechanism that stores the
interrupted PC in another SR.
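
To make the entry sequence above concrete, here is a small toy model in
Python. It is only a sketch under stated assumptions, not F-CPU code: the
ToyCore class, the SR_SAVED_PC name and the enter_handler/leave_handler
helpers are hypothetical and exist only for illustration. Only the
SR_SCRATCH_1..4 names and the get/put/loadcons operations come from the
sequence above.

```python
# Toy model of a "fast" handler entry that uses scratch SRs instead of a stack.
# ToyCore, SR_SAVED_PC, enter_handler and leave_handler are hypothetical names.

NUM_REGS = 64

SCRATCH = ["SR_SCRATCH_1", "SR_SCRATCH_2", "SR_SCRATCH_3", "SR_SCRATCH_4"]


class ToyCore:
    def __init__(self):
        self.r = [0] * NUM_REGS  # general-purpose registers
        self.sr = {}             # special-register space, keyed by name
        self.pc = 0

    def put(self, sr_name, reg):
        """Copy general register `reg` into special register `sr_name`."""
        self.sr[sr_name] = self.r[reg]

    def get(self, sr_name, reg):
        """Copy special register `sr_name` back into general register `reg`."""
        self.r[reg] = self.sr[sr_name]

    def loadcons(self, value, reg):
        """Load a constant into general register `reg`."""
        self.r[reg] = value


def enter_handler(core, handler_pc, pointer_constant):
    # Assumption: some mechanism saves the interrupted PC in an SR.
    core.sr["SR_SAVED_PC"] = core.pc
    core.pc = handler_pc
    # Save r1..r4 at fixed locations in the SR space (no "push" available).
    for reg, name in enumerate(SCRATCH, start=1):
        core.put(name, reg)
    # r1 is now free to hold a pointer to the handler's work area.
    core.loadcons(pointer_constant, 1)


def leave_handler(core):
    # Restore r1..r4 from the scratch SRs, then resume the interrupted code.
    for reg, name in enumerate(SCRATCH, start=1):
        core.get(name, reg)
    core.pc = core.sr["SR_SAVED_PC"]
```

The point of the model is that the save area is a set of fixed names, so
no stack pointer is needed to find it. The price is that such a handler
cannot be re-entered before it has moved the scratch values elsewhere.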
I repeat: there is *no* need for register banks and "fast" switches.
Overall, the first latency problem comes from the availability of the
handler's code and the OOOC-flushing. Hence, taking L1 cache misses
into account, and accounting for the fact that the handler is
running without address translation (hence avoiding any TLB miss),
the trap/exception/IRQ/... latency is roughly 10 cycles in a normal
system (and probably around 5 cycles in a dedicated system).
Latency is the well-known price of performance and security
(caching and protection mechanisms have obvious side effects).
A 1-cycle register bank switch is thus not justified,
because it does nothing to reduce this fixed latency.
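
The argument above can be put into a few lines of arithmetic. This is a
rough sketch only: the 10-cycle figure is the estimate given above, while
the cost of one cycle per put instruction is an assumption made for
illustration, and the function names are hypothetical.

```python
# Back-of-the-envelope comparison of handler entry latency.
FIXED_ENTRY_CYCLES = 10   # "roughly 10 cycles" estimate from the text above
SCRATCH_SAVE_CYCLES = 4   # assumption: four put instructions, one cycle each
BANK_SWITCH_CYCLES = 1    # the 1-cycle register bank switch under discussion


def entry_latency(save_cycles, fixed_cycles=FIXED_ENTRY_CYCLES):
    """Cycles until the handler can start useful work."""
    return fixed_cycles + save_cycles


def fixed_share(save_cycles, fixed_cycles=FIXED_ENTRY_CYCLES):
    """Fraction of the entry latency that no register-saving scheme removes."""
    return fixed_cycles / entry_latency(save_cycles, fixed_cycles)


if __name__ == "__main__":
    for label, cost in (("scratch SRs", SCRATCH_SAVE_CYCLES),
                        ("bank switch", BANK_SWITCH_CYCLES)):
        print(f"{label}: {entry_latency(cost)} cycles total, "
              f"{fixed_share(cost):.0%} of it fixed")
```

Under these assumptions the fixed part dominates either way: a bank
switch can only shave the few cycles spent on the put instructions, and
it leaves the 10 cycles of fetch and flush latency untouched.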
I hope this post will help calm the debate :-)