Re: [f-cpu] second order prefetch in FC0
- To: <f-cpu@seul.org>
- Subject: Re: [f-cpu] second order prefetch in FC0
- From: <cyrano@nerim.net>
- Date: Tue, 4 Mar 2003 14:43:36 CET
I don't like prefetch. Can gcc really calculate the very narrow window in which a prefetch is useful? Prefetch timing is implementation dependent, and clock-speed dependent as well!
I much prefer multi-load/store (for example, a complete cache line that fills 4 or 8 registers).
So your proposal looks like a kind of double load (a = toto->titi) or a load then store (toto->titi = a). This could be a feature of the internal CPU buses, together with a new instruction. Since we control L1/L2 access and don't need to conform to the limited features of SDRAM, this kind of bus cycle could be added and optimised close to the cache controller.
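To make the access pattern concrete, here is a minimal C sketch of a list walk that issues an ordinary software prefetch for the next node. It is only an illustration, not FC0 code: the `struct toto` layout, the `payload` field and the `sum_list` name are assumptions made up for this example, and `__builtin_prefetch` is the GCC builtin, used here purely as a hint.

```c
#include <stddef.h>

/* Hypothetical node type: "titi" plays the role of the "next" link. */
struct toto {
    struct toto *titi;   /* link to the next node */
    int payload;         /* illustrative data field */
};

/* Sum the payloads of a list, hinting the next node into the cache. */
static long sum_list(const struct toto *p)
{
    long sum = 0;
    while (p != NULL) {
        /* First-order load: this is the "a = toto->titi" step. */
        const struct toto *next = p->titi;
        if (next != NULL) {
            /* GCC hint: read access (0), high temporal locality (3).
               It does not change the result of the function. */
            __builtin_prefetch(next, 0, 3);
        }
        sum += p->payload;
        p = next;
    }
    return sum;
}
```

Note that software can only prefetch `next` after `p->titi` has itself been loaded, so a miss on `p` still delays everything behind it. That dependency is what the chained prefetch, or a double load handled near the cache controller, would try to hide.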
nicO
Devik <devik@cdi.cz> wrote:
> Hi,
>
> just one idea. In FC0 a load is supposed to trap
> on a TLB miss (or access violation) before it enters
> the pipeline. While this is simple, it will hurt the
> performance of indirect addressing (foo->bar->baz),
> where we can't load the pointer early, or it will
> at least stall the whole CPU because the data is not
> in L0.
> On the other hand, I agree that an asynchronous load is
> not simple (we'd need some load buffers) and it
> is not as useful as prefetch (it can't be moved
> out of control structures).
>
> I got (yet another) crazy idea. We support
> prefetching the cacheline in which some address lives.
> What about a special prefetch which would
> prefetch a cacheline, then load a new address from
> it at a given offset and prefetch that address too?
>
> I know it involves another TLB and adder. On the other
> hand, it is completely off the critical path, and
> if the real load gets there first we can simply
> discard the prefetch.
>
> It would help linked structures, especially trees
> and lists. "next" links are typically in the first 32
> bytes, so an "item remove" subroutine could
> ask for a prefetch of an item along with
> its siblings.
> I have already found a way to force gcc to emit
> some prefetches, and this one would be possible too.
>
> This is food for thought (for YG: I don't blame FC0, I only
> want to share my ideas).
>
> devik
>
> *************************************************************
> To unsubscribe, send an e-mail to majordomo@seul.org
> with
> unsubscribe f-cpu in the body. http://f-cpu.seul.org/