
Re: [f-cpu] New suggestion about call convention



On Mon, Nov 04, 2002 at 11:34:36PM +0000, cedric wrote:

> 	On the French mailing list, Antoine (<Antoine@rezo.net>) has suggested a new
> idea for the calling convention. At first he just said it was a funny idea,
> but it could turn out to be very interesting.
> 	He suggests specifying a new register MR (mask register). Each bit in this
> register specifies whether the corresponding register needs to be saved before
> it is used. In the prologue of a function you "and" the MR with a local
> constant that represents which registers are used, then you conditionally
> save registers to the stack if a collision occurs. Finally, in the epilogue
> you restore the registers in the same way.
> 	When you call a function you update mr with something like this:
> mr = mr | registers to preserve. Of course this mask can evolve during the
> function.
> 	If you "randomly" select which registers to use (when you don't know which
> function calls you), there is some chance that no collision occurs (and in
> most cases a better chance that the collision is only partial). A second
> possibility when you allocate your registers is to use feedback from run
> time, but each time you compile and run, you can get a different result...
[snip]
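
To make the quoted proposal concrete, here is a minimal sketch of the mask
arithmetic in Python. It is only an illustration: the helper names
(reg_mask, registers_to_save, mask_for_call) and the 64-bit mask width are
assumptions of mine, not part of Antoine's proposal or of any F-CPU spec.

```python
NUM_REGS = 64  # assumption: one mask bit per register, r0..r63


def reg_mask(*regs):
    """Build a bit mask with one bit set per register number given."""
    mask = 0
    for r in regs:
        mask |= 1 << r
    return mask


def registers_to_save(mr, used):
    """Prologue test: registers that are both marked in MR and used locally."""
    collisions = mr & used
    return [r for r in range(NUM_REGS) if (collisions >> r) & 1]


def mask_for_call(mr, preserve):
    """Before a call: add the caller's live registers to the mask."""
    return mr | preserve
```

With mr = reg_mask(16, 17, 20) and a function that uses r17, r18 and r20,
registers_to_save reports r17 and r20; r18 can be used without saving.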

First, a decent F-CPU compiler should not select registers randomly.
It should analyze every function and assign register numbers so that
chances for a collision are minimized.

That is, in a module containing several functions that call each other,
each function should use a different set of registers. Additionally,
functions can have both an "internal" entry point for intra-module calls
(which need not save any registers) and a "public" entry point that
saves all registers used inside it, prepares for restoring them, and
then dispatches to the internal entry point. Unless there are recursive
functions, you'll have to save registers only when you enter the module
(and restore them when you leave it).

Ideally, each function will use a contiguous set of registers, and there
will be an entry in the object file reporting which registers it uses.
That way it becomes possible to further minimize collisions at link time
by renumbering the registers inside a function - you may also call it
"deferred register allocation" if you like.

Consider three functions A, B and C. A uses r16-r19, B uses r16-r21,
and C uses r16 and r17. If the functions do not depend on each other
(that is, don't call each other), you can simply link them together, and
the resulting set of used registers will be the union of the functions'
individual sets, that is, r16-r21. If function C calls A but not B,
you can rename registers and assign r20 and r21 to C, and the resulting
common set still is r16-r21. If C calls both A and B, it is assigned r22
and r23, and the resulting common set is r16-r23. In either case, there
will be no register collisions between the functions. Only if caller
and callee together need more registers than are available will the
linker have to insert save/restore code (since the register sets are
contiguous, storem/loadm will do the trick) "around" the callee:

	new_entry_point:
		// allocate stack space here
		storem xyz
		move r63, saved_reg	// save return address
		loadaddri old_entry_point, temp_reg
		jump temp_reg, r63	// call original function
	restore_code:
		move saved_reg, r63	// restore return address
		loadm xyz
		// deallocate stack space here
		jump r63

(This is generic code that may still be improved)
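
The renaming in the A/B/C example above can be reproduced with a small
Python sketch. It assumes each function needs one contiguous block of
registers and places every caller just above the highest block used by
its callees. The names assign_registers, FIRST_REG and LAST_REG are
hypothetical, and the limit of r62 is an assumption (r63 holds the return
address in the code above). Where a real linker would insert the
save/restore wrapper, this sketch simply raises an error.

```python
FIRST_REG = 16  # first allocatable register, as in the A/B/C example
LAST_REG = 62   # assumption: r63 is reserved for the return address


def assign_registers(sizes, calls):
    """Assign a contiguous register block to every function.

    sizes: dict mapping function name -> number of registers it needs
    calls: dict mapping function name -> list of functions it calls
    Returns a dict mapping function name -> (first_reg, last_reg).
    """
    ranges = {}
    in_progress = set()

    def place(fn):
        if fn in ranges:
            return ranges[fn]
        if fn in in_progress:
            # Recursion cannot be resolved by renumbering alone.
            raise ValueError("recursive call chain through " + fn)
        in_progress.add(fn)
        base = FIRST_REG
        for callee in calls.get(fn, ()):
            # Start above everything the callee (and its callees) uses.
            base = max(base, place(callee)[1] + 1)
        in_progress.discard(fn)
        last = base + sizes[fn] - 1
        if last > LAST_REG:
            # A real linker would wrap the callee in save/restore code here.
            raise OverflowError("out of registers for " + fn)
        ranges[fn] = (base, last)
        return ranges[fn]

    for fn in sizes:
        place(fn)
    return ranges
```

For sizes {"A": 4, "B": 6, "C": 2}, no calls gives C r16-r17, C calling
only A gives C r20-r21, and C calling both A and B gives C r22-r23.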

The same algorithm works with both functions and modules, and it only
needs the set of used registers per function/module and a dependency
graph. Both can be easily constructed from intermediate code. Register
renumbering is done on the object code directly (thanks to the uniform
register model and instruction format of the F-CPU, which allow us to
`transpose' the object code -- note to guitar players: think `capo' ;-).
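
As a rough illustration of the `transpose' step, the following Python
sketch shifts a function's contiguous register block to a new base. The
instruction representation (a mnemonic plus a tuple of register numbers)
is purely hypothetical and is not the real F-CPU encoding; a real linker
would patch the register fields of the encoded instructions instead.

```python
def transpose(code, old_base, count, new_base):
    """Move registers old_base..old_base+count-1 so they start at new_base.

    code: list of (mnemonic, register_tuple) pairs (hypothetical format)
    Registers outside the function's own block (e.g. r63) are left alone.
    """
    delta = new_base - old_base

    def remap(reg):
        if old_base <= reg < old_base + count:
            return reg + delta
        return reg

    return [(op, tuple(remap(r) for r in regs)) for op, regs in code]
```

For example, moving function C from r16-r17 to r20-r21 is
transpose(code, 16, 2, 20); an instruction that touches r63 keeps r63.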

Conclusion: Using a smart linker, we can resolve register collision
issues statically. We can also take profiling (feedback) information
into account, like you suggested, to optimize the number and placement of
save/restore points. Therefore, I see no real need to do it at run time.

-- 
 Michael "Tired" Riepe <Michael.Riepe@stud.uni-hannover.de>
 "All I wanna do is have a little fun before I die"