
[f-cpu] More Instruction Set Trouble



Hi *,

today I found another dark corner...

When an F-CPU program wants to access a (global) variable, it has to load
its address and use load/store.  For loading the address, we have several
options: loadaddrd, loadaddrid and loadcons[x] (or load it from memory,
but then we have to load the address of the place where the address
is stored...).

The address of a variable is usually not known at compile/assemble time.
The difference between the variable's address and the address of the
instruction loading it will also be unknown in most cases (code and data
usually belong to different sections that can be moved independently).
That means that we have to use relocations: the compiler/assembler will
create code for the general case, leaving the (absolute or relative)
address open (usually set to 0), and let the link editor patch in the
correct value.

Unfortunately, the term `general case' means four(!) successive loadcons
instructions (or similar), since each one supplies only 16 bits of a
64-bit address.  There is no way to optimize one or two of them away
because the compiler/assembler doesn't know which value will be loaded.
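
To make the relocation step concrete, here is a rough sketch in Python.
It is a toy model, not real F-CPU tooling: the names `Insn', `Reloc',
`assemble_address_load' and `link' are made up for illustration, and it
assumes that one loadcons supplies one 16-bit slice of a 64-bit register.
The assembler emits four placeholders with zero immediates plus one
relocation record each, and the link editor fills them in later.

```python
from dataclasses import dataclass

CHUNK_BITS = 16   # assumption: one loadcons supplies 16 bits of the register
CHUNKS = 4        # 4 x 16 bits = one 64-bit address


@dataclass
class Insn:
    op: str       # mnemonic, here always "loadcons"
    chunk: int    # which 16-bit slice of the destination register is written
    imm16: int    # immediate field; stays 0 until the link editor patches it


@dataclass
class Reloc:
    index: int    # position of the instruction to patch
    symbol: str   # name of the variable whose address is needed


def assemble_address_load(symbol):
    """Emit the general-case sequence: every immediate is left at 0."""
    code = [Insn("loadcons", n, 0) for n in range(CHUNKS)]
    relocs = [Reloc(n, symbol) for n in range(CHUNKS)]
    return code, relocs


def link(code, relocs, symtab):
    """Patch each placeholder with its 16-bit slice of the final address."""
    mask = (1 << CHUNK_BITS) - 1
    for r in relocs:
        insn = code[r.index]
        insn.imm16 = (symtab[r.symbol] >> (CHUNK_BITS * insn.chunk)) & mask
    return code


# Hypothetical symbol and address, chosen only for the demonstration.
code, relocs = assemble_address_load("counter")
link(code, relocs, {"counter": 0x0000_1234_5678_9ABC})
print([hex(i.imm16) for i in code])
```

Even if the final address turns out to be small, all four instructions
are already in the object file by the time the link editor learns that.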

You might say: "use PC-relative addresses".  But that doesn't work
either -- in fact it's even *worse* than using the absolute address
because you need four loadcons instructions to put the relative address
into a register, and then you still have to execute loadaddrd (that is,
you need one instruction *more* for relative addressing).

The same is true if you use a `base' register that points to the
beginning of the data section and calculate the distance to that point.
You still need 4x loadcons + 1x add to get the variable's address (and
yet another instruction if you want the data to be prefetched, because
`add' doesn't do that).
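
The instruction counts from the last few paragraphs can be tallied in a
few lines of Python.  This only restates the numbers given above; the
scheme labels and the "(prefetch step)" entry are my own shorthand, not
instruction names.

```python
LOADCONS_PER_ADDRESS = 4   # from the text: the general case needs four

# Extra instructions that follow the loadcons sequence in each scheme.
schemes = {
    "absolute": [],
    "pc-relative": ["loadaddrd"],
    "base register": ["add"],
    "base register + prefetch": ["add", "(prefetch step)"],
}


def cost(extra):
    """Total instructions needed to get one variable's address."""
    return LOADCONS_PER_ADDRESS + len(extra)


costs = {name: cost(extra) for name, extra in schemes.items()}
for name, n in costs.items():
    print(f"{name:26s} {n} instructions")
```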

A module-local base register is also possible, but you'll have to save,
set and restore it in every function that is not module-local (e.g. has
`extern' linkage in C) and uses global variables -- and you need the full
`quadruple-loadcons' instruction sequence again each time you access a
global variable that belongs to *another* module (just in case it isn't
clear: `module' means `source file').

Another (inferior) way is to limit the program size to 2^32 bytes (or
maybe 2^48 bytes), which saves two loadcons (or one, respectively) per
address loaded.
Did anybody say `tiny/small/large/huge memory model'? *YUCK*
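
The savings from such a size limit are simple arithmetic, sketched here
in Python.  The sketch assumes that each loadcons contributes 16 bits and
ignores any sign-extension details; the function names are made up.

```python
CHUNK_BITS = 16      # assumption: bits supplied by one loadcons
REGISTER_BITS = 64   # full address width


def loadcons_needed(address_bits):
    """Number of 16-bit pieces needed to cover the given address width."""
    return -(-address_bits // CHUNK_BITS)   # ceiling division


def loadcons_saved(address_bits):
    """Instructions saved per address compared to the full 64-bit case."""
    return loadcons_needed(REGISTER_BITS) - loadcons_needed(address_bits)


for bits in (32, 48, 64):
    print(f"2^{bits} bytes: {loadcons_needed(bits)} loadcons, "
          f"{loadcons_saved(bits)} saved")
```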

From a compiler's point of view, the loadaddr instruction is pretty
useless except for module-local jumps and calls, and access to jump tables
or other data that resides in the same (code) section of the same module.

Any other ideas on how we can reduce/avoid the `address load penalty'?

-- 
 Michael "Tired" Riepe <Michael.Riepe@stud.uni-hannover.de>
 "All I wanna do is have a little fun before I die"