Re: [f-cpu] calling conventions

----- Original Message -----
From: Michael Riepe <michael@stud.uni-hannover.de>
To: <f-cpu@seul.org>
Sent: Thursday, June 06, 2002 7:20 PM
Subject: Re: [f-cpu] calling conventions

> When we don't put them on the stack, va_start/va_arg will have to do a
> little more work, that's right. Is that a problem? I want my `regular'
> functions to run fast!

I know there is a workaround, but it is still much more complicated than what
Intel uses, for example :). I'm not sure there is any real benefit in a
slightly faster but more complicated calling convention if your functions
spend more time in their own code than in pushing arguments. What takes long
in printf or scanf is not passing the parameters but formatting the string.

> That will be the task of va_start(). In fact it is the reason why
> va_start() and va_end() exist at all! If you could be sure that the
> parameters are always on the stack and the stack always grows down,
> you could simply say `ap = (void**)(&last_fixed_arg + 1);'. Or maybe
> `ap = &...;', as they do in some compilers that treat `...' like an
> identifier.

I know why va_start and va_end exist, but I'm not convinced it is necessary to
optimize them if the time to pass arguments is negligible compared with the
function's execution time. That is my only concern. And please don't forget
that F-CPU has 32-bit opcodes: they take a lot of space, and you will need a
very big cache to be sure the code stays in it... whereas accessing the stack
would be just a load with post-increment, with a good chance that the data is
already in the data cache.

> I expect va_start() to look like this on the F-CPU:
> // prologue of calling function
> subi $16*8, r63, r63 // allocate space on stack
> move r63, r17 // let r17 hold `ap' (arbitrary choice)
> ...
> // (inline) call to va_start()
> loadconsx $<number_of_fixed_args>, r16 // known at compile time
> storem r1, [r17], r16 // save r1-r16 (syntax still unclear)
> ...
> // epilogue of calling function
> addi $16*8, r63, r63 // restore stack pointer
> That's all!

But in the end you are passing the arguments on the stack anyway!? So I don't
see what you gain by doing it afterwards.

> va_arg() in turn will not change `ap' but the value stored in ap[15]. Let
> `WORD' be the type corresponding to a machine word, then va_arg(AP, TYPE)
> becomes (for sizeof(type) <= sizeof(WORD)):
> WORD *pointer = (WORD*)AP;
> WORD i = AP[15];
> WORD tmp;
> if (i >= 14) {
> /* this is the only tricky part */
> pointer = (WORD*)AP[14] - 14;
> }
> tmp = pointer[i];
> AP[15] = i + 1;
> return (type)tmp;
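
The quoted va_arg() outline can be simulated in portable C. This is only a
sketch under assumptions that are not stated in the thread: the 16-word save
area is taken to hold r1..r14 in slots 0..13, the address of the arguments
that did not fit in slot 14, and the index of the next argument in slot 15.
The names next_arg, sum_args and the enum constants are hypothetical. The
overflow area is indexed with i - 14 rather than rebasing the pointer by -14,
which gives the same result without forming an out-of-bounds pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t WORD;  /* assumption: one machine word can hold a pointer */

/* Assumed layout of the save area written by va_start(). */
enum { REG_ARGS = 14, OVERFLOW_SLOT = 14, INDEX_SLOT = 15, SAVE_WORDS = 16 };

/* Fetch the next word-sized argument and advance the index in ap[15]. */
static WORD next_arg(WORD *ap)
{
    WORD i = ap[INDEX_SLOT];
    WORD value;

    if (i < REG_ARGS) {
        value = ap[i];                       /* argument came in a register */
    } else {
        /* The "tricky part": continue in the overflow area on the stack. */
        const WORD *overflow = (const WORD *)ap[OVERFLOW_SLOT];
        assert(overflow != NULL);
        value = overflow[i - REG_ARGS];
    }
    ap[INDEX_SLOT] = i + 1;
    return value;
}

/* Example consumer: add up `count` variable arguments. */
static WORD sum_args(WORD *ap, unsigned count)
{
    WORD total = 0;
    while (count-- > 0)
        total += next_arg(ap);
    return total;
}
```

Note that ap itself never changes; only the index stored in the last slot
does, which matches the description in the quoted text.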

No, much simpler, using gcc extensions because I'm lazy:

va_list arg;
    void *arg;

    ==> arg = ap; // our register, which holds the pointer to the first variable argument

    ==>  *((type __attribute__ ((aligned(4))) *)arg)++;

va_end(arg); // nothing to do
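
The cast-as-lvalue form above relies on an old gcc extension. A minimal
portable sketch of the same idea, walking a pointer through arguments stored
on the stack, could look like the following. The names stack_args, fetch_arg
and SLOT_ALIGN are hypothetical, and it is an assumption that each argument
occupies a slot rounded up to 4 bytes (suggested by the aligned(4) attribute).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for va_list: a cursor into the stacked arguments. */
typedef struct {
    const unsigned char *next;
} stack_args;

enum { SLOT_ALIGN = 4 };  /* assumed slot granularity */

/* Copy the next argument of `size` bytes into `out` and advance the cursor. */
static void fetch_arg(stack_args *ap, void *out, size_t size)
{
    assert(ap != NULL && out != NULL && size > 0);
    memcpy(out, ap->next, size);  /* memcpy avoids misaligned accesses */
    /* Round the consumed size up to a whole number of slots. */
    ap->next += (size + SLOT_ALIGN - 1) / SLOT_ALIGN * SLOT_ALIGN;
}
```

Initialising the cursor corresponds to va_start(), each fetch_arg() call to
va_arg(), and va_end() has nothing to do.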

> You see, it's no problem at all. There is a little overhead when dealing
> with variable argument lists, but that's ok as long as functions with
> a fixed number of arguments run fast.

As I told you, functions without variable arguments receive their arguments in
registers.

> > Personally, I think the best we can do when calling printf is to pass all
> > fixed parameters in registers and the optional parameters on the stack, so:
> >
> > printf (r1:char *fmt,stack:r15 ...);
> >
> > sprintf (r1:char *str,r2:char *fmt,stack:r15 ...);
> >
> > etc.
> No. All functions MUST use the same calling conventions, whether they
> take a variable number of arguments or not. (The C99 standard does not
> require it - in fact, it contains rules that avoid this situation -
> but there is a *lot* of C code out there that won't work otherwise).

I'm sorry, but that isn't what happens in gcc.

For example, gcc for SH uses registers for the first four arguments.

gcc for IA32, when the regparm option is used, uses registers as far as
possible, except when the function has variable arguments (all parameters go
on the stack in that case).

And I don't see why the calling conventions could not be the same for all
functions if we adopt these rules:

- fixed arguments in registers whenever possible, the rest on the stack
- variable arguments always on the stack (because the callee doesn't know the
exact number of arguments or their types).
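
The two rules above can be written down as a small helper that tells how many
arguments of a call end up in registers and how many on the stack. This is
only an illustration: the number of argument registers (14 here) and the names
ARG_REGS, arg_split and place_args are assumptions, not part of any F-CPU
specification.

```c
#include <assert.h>

enum { ARG_REGS = 14 };  /* assumed number of argument registers */

typedef struct {
    int in_regs;   /* arguments passed in registers */
    int on_stack;  /* arguments passed on the stack */
} arg_split;

/* Apply the two rules: fixed arguments fill the registers first,
 * variable arguments always go on the stack. */
static arg_split place_args(int fixed, int variable)
{
    arg_split s;

    assert(fixed >= 0 && variable >= 0);
    s.in_regs = fixed < ARG_REGS ? fixed : ARG_REGS;
    s.on_stack = (fixed - s.in_regs) + variable;
    return s;
}
```

For a printf-like call with one fixed and three variable arguments this gives
one register argument and three stack arguments, whatever their types are.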

> [...]
> > Another advantage :
> >
> > struct { r1:int error; r2:struct open_file *file; } file_open (r1:char
> > *name,...);
> >
> > You can have not only one register as result but several registers this way
> That's another valid option. The canonical way to return structures is
> different however: the function
> struct blah func(...) {
> struct blah x;
> ...;
> return x;
> }
> is turned into
> struct blah *func(struct blah *result, ...) {
> struct blah x;
> ...;
> *result = x;
> return result;
> }
> The advantage is that `struct blah' can be arbitrarily large. We can
> (and probably should) define that small structures (up to 15 words)
> are returned in registers, the way you proposed it.

True, I was only considering the case where we have enough registers for the
result.
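
To make the quoted transformation concrete, here is a small sketch that writes
both forms by hand and checks that they produce the same structure. The
contents of struct blah, its size (BLAH_WORDS) and the names func_lowered and
same_result are hypothetical. A real compiler performs this lowering itself
and may do it differently.

```c
#include <assert.h>
#include <string.h>

enum { BLAH_WORDS = 32 };  /* hypothetical size, too large for registers */

struct blah {
    long words[BLAH_WORDS];
};

/* Source-level form: the structure is returned by value. */
static struct blah func(long seed)
{
    struct blah x;
    for (int i = 0; i < BLAH_WORDS; i++)
        x.words[i] = seed + i;
    return x;
}

/* Lowered form: the caller supplies the buffer for the result. */
static struct blah *func_lowered(struct blah *result, long seed)
{
    struct blah x;

    assert(result != NULL);
    for (int i = 0; i < BLAH_WORDS; i++)
        x.words[i] = seed + i;
    *result = x;
    return result;
}

/* Returns 1 when both forms yield the same structure for `seed`. */
static int same_result(long seed)
{
    struct blah by_value = func(seed);
    struct blah buffer;
    struct blah *p = func_lowered(&buffer, seed);

    return p == &buffer && memcmp(&by_value, &buffer, sizeof buffer) == 0;
}
```

A small structure such as the { error, file } pair would not need the hidden
pointer at all under the rule discussed above, since both members fit in
registers.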

> You can use registers for any global variable in any application.
> E.g. inside the Linux kernel, you would use registers for `current',
> `jiffies' and probably the addresses of some often-used memory locations
> (like the task table, for instance). In a Forth interpreter, you can
> use global registers for the stack pointers, the instruction pointer,
> HERE and probably some well-known addresses. In C, you'll have stack and
> frame pointers, a pointer to the top of the heap, pointers to often-used
> global data (stdin/stdout/stderr/errno for example) and so on.
> How many examples do I have to list in order to convince you? :)

OK, but since this is not a hardware issue but just a language issue (C,
Pascal, Lisp and Forth don't use the same conventions), fixing exactly r48-r63
as global registers is not justified. Taking just the number of registers the
language needs and leaving the others free as local registers seems more
judicious to me than reserving a whole range of which only a small part would
be used.
