
Re: [f-cpu] F-CPU vs. Itanium



On Sun, Apr 21, 2002 at 11:39:39AM +0200, Martin Devera wrote:
[...]
> Maybe I misused the term disambiguation. I understand that the code
> above will work just fine. But often you can do this:
> loada   r3, r4  ; start loading of r4, add r3 to disambig. mem (DM)
> loadimm  1, r1
> store   r2, r1  ; if r2 is in DM remove it
> verify  r3, r4  ; if r3 is not in DM, behave as a load (instead of a nop)
> add r2, r4, r2
> 
> That way loada has time to fetch the data while loadimm executes.
> IMHO this code should be faster (only one cycle in this particular
> case).

Speculative loads can probably be implemented in much the same way as
the load-linked/conditional-store instructions we've been talking about
recently. When `loada' is executed, mark the corresponding bytes in the
cache, and clear the markers whenever an instruction modifies the loaded
data. If the markers aren't all still set (or the cache line was flushed),
the `verify' instruction will trap, jump to a piece of fixup code, or
simply reload the modified bytes. The drawback of this approach is that
it doesn't work well if the same bytes are loaded more than once, or if
the loaded register is overwritten. I'm not sure how the Itanium handles
those cases, however.
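
A rough software model of the marker idea might look like the sketch
below. The names `loada', `store' and `verify' mirror the mnemonics
quoted above; the toy memory `mem', the `marked' array, the 32-bit
access width and the "reload on mismatch" policy are assumptions made
for illustration only, not anything the F-CPU defines.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MEM_SIZE 64	/* assumed: tiny memory standing in for the cache */

static uint8_t mem[MEM_SIZE];
static uint8_t marked[MEM_SIZE];	/* one speculation marker per byte */

/* loada: do the load early and mark every byte that was read. */
static uint32_t loada(size_t addr)
{
	uint32_t v;

	assert(addr + sizeof v <= MEM_SIZE);
	memcpy(&v, &mem[addr], sizeof v);
	memset(&marked[addr], 1, sizeof v);
	return v;
}

/* store: write the data and clear the markers of every byte touched. */
static void store(size_t addr, uint32_t v)
{
	assert(addr + sizeof v <= MEM_SIZE);
	memcpy(&mem[addr], &v, sizeof v);
	memset(&marked[addr], 0, sizeof v);
}

/*
 * verify: if all markers are still set, the early value is good and
 * this acts as a nop. Otherwise reload (assumed policy; a trap or a
 * jump to fixup code would be alternatives).
 * Returns true if a reload was necessary.
 */
static bool verify(size_t addr, uint32_t *reg)
{
	bool intact = true;

	assert(addr + sizeof *reg <= MEM_SIZE);
	for (size_t i = 0; i < sizeof *reg; i++)
		if (!marked[addr + i])
			intact = false;
	if (!intact)
		memcpy(reg, &mem[addr], sizeof *reg);
	return !intact;
}
```

A store to an unrelated address leaves the markers alone, so `verify'
costs nothing; a store that touches even one of the loaded bytes forces
the reload.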

The other (probably more expensive) solution is a table that maps register
numbers to (virtual or physical) addresses. When a register is loaded,
make a new entry in the table; when the loaded data is modified or the
register is clobbered, remove or invalidate the entry.
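
The table variant could be modelled roughly as follows. This is only a
sketch: the `dm_*' function names, the entry layout, the register count
of 64 and the linear search on every store are all hypothetical choices
for illustration; real hardware would use an associative lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_REGS 64	/* assumed register count for this sketch */

struct dm_entry {
	bool      valid;
	uintptr_t addr;	/* first byte covered by the load */
	size_t    len;	/* number of bytes loaded */
};

static struct dm_entry dm[NUM_REGS];	/* indexed by register number */

/* loada: make a new entry for the destination register. */
static void dm_load(unsigned reg, uintptr_t addr, size_t len)
{
	assert(reg < NUM_REGS);
	dm[reg] = (struct dm_entry){ .valid = true, .addr = addr, .len = len };
}

/* store: invalidate every entry that overlaps [addr, addr + len). */
static void dm_store(uintptr_t addr, size_t len)
{
	for (unsigned r = 0; r < NUM_REGS; r++)
		if (dm[r].valid &&
		    addr < dm[r].addr + dm[r].len &&
		    dm[r].addr < addr + len)
			dm[r].valid = false;
}

/* The register was overwritten by some other instruction. */
static void dm_clobber(unsigned reg)
{
	assert(reg < NUM_REGS);
	dm[reg].valid = false;
}

/* verify: true means the early load is still good (verify is a nop). */
static bool dm_verify(unsigned reg)
{
	assert(reg < NUM_REGS);
	return dm[reg].valid;
}
```

Because the entry is keyed by register number, loading the same bytes
into two different registers is not a problem here, which is what the
per-byte markers cannot express.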

> But you can do it only if you are sure [r3] is not later changed
> by store. And you never know (at compile time) whether two pointers
> might be the same (if they have the same type).

That's why ISO C99 adds the `restrict' pointer qualifier. E.g. you write

	void copy(char *restrict dest, const char *restrict src, size_t len) {
		...
	}

and the compiler will assume that source and destination do not overlap.
That is, it does not need to disambiguate those pointers. Of course
programmers have to be more careful -- calls like

	copy(array, array + 1, 10);

have undefined behavior, because the two regions overlap.
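
For illustration, one possible body for `copy' is sketched below, next
to an overlap-safe variant. The byte loop is an assumption (the body is
elided above), and `copy_overlapping' is a hypothetical helper name; it
simply forwards to the standard `memmove', which is defined for
overlapping regions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Assumed body: a plain byte loop. With `restrict' the compiler may
 * reorder or vectorize the loads and stores, because the caller has
 * promised that dest and src never overlap.
 */
static void copy(char *restrict dest, const char *restrict src, size_t len)
{
	for (size_t i = 0; i < len; i++)
		dest[i] = src[i];
}

/* Hypothetical helper for callers that cannot rule out overlap. */
static void copy_overlapping(char *dest, const char *src, size_t len)
{
	assert(dest != NULL && src != NULL);
	memmove(dest, src, len);
}
```

So `copy(b, a, n)' is fine for two distinct arrays, while the
`array, array + 1' case has to go through `copy_overlapping' (or
`memmove' directly).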

-- 
 Michael "Tired" Riepe <Michael.Riepe@stud.uni-hannover.de>
 "All I wanna do is have a little fun before I die"