
Re: [f-cpu] Status quo

To: f-cpu@xxxxxxxx
Subject: Re: [f-cpu] Status quo
From: whygee@xxxxxxxxx
Date: Wed, 01 Apr 2015 09:33:02 +0200

On 2015-04-01 09:19, Nicolas Boulay wrote:

GPUs use a specific memory layout "in Z" to favor locality of access and
caching. This is simple "tiling". When you copy this kind of image
between CPU and GPU memory, the pixels must be moved to the right
place, and that is slow.
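
To make the idea concrete, here is a minimal sketch in C, assuming that "in Z" refers to a Z-order (Morton) layout, which the message does not confirm. The function names are hypothetical. The index of a pixel is built by interleaving the bits of its x and y coordinates, so pixels that are close in 2D stay close in memory.

```c
#include <stdint.h>

/* Spread the low 16 bits of v so that bit i moves to bit 2*i. */
static uint32_t spread_bits16(uint32_t v)
{
    v &= 0x0000FFFFu;
    v = (v | (v << 8)) & 0x00FF00FFu;
    v = (v | (v << 4)) & 0x0F0F0F0Fu;
    v = (v | (v << 2)) & 0x33333333u;
    v = (v | (v << 1)) & 0x55555555u;
    return v;
}

/* Z-order index of pixel (x, y), both below 65536:
 * x bits land in even positions, y bits in odd positions.
 * Assumption: this is what "in Z" means; real GPUs may differ. */
static uint32_t morton_index(uint32_t x, uint32_t y)
{
    return spread_bits16(x) | (spread_bits16(y) << 1);
}
```

With this mapping the four pixels of any aligned 2x2 block get consecutive indices, which is the locality property mentioned above.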

Tiling is also good for CPU code that works on images. Doing tiling
"by hand" on the CPU is a pain in the ass: the asm code is too ugly.

So Cedric asks for an MMU flag that tells the CPU to use the same tiled
memory layout as the GPU: you don't need to move pixels around during
the copy.

It could be very interesting if memory is shared between GPU and CPU:
no copy at all would be needed.

One more question on my part: is it possible to always use a 2D tiled
memory layout compatible with the GPU, to avoid the flag? It's only a
different way to read the memory.
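
As an illustration of "a different way to read the memory", the sketch below compares the byte offset of a pixel in a plain raster layout with its offset in a simple tiled layout. Everything here is an assumption made for the example: 8x8 tiles, one byte per pixel, tiles stored in row-major order, and an image width that is a multiple of the tile width. The thread does not specify any of these, and real GPU layouts may differ.

```c
#include <stddef.h>

enum { TILE_W = 8, TILE_H = 8 };   /* hypothetical tile size */

/* Byte offset of pixel (x, y) in a plain raster (row-major) layout. */
static size_t linear_offset(size_t x, size_t y, size_t width)
{
    return y * width + x;
}

/* Byte offset of the same pixel when the image is stored tile by tile.
 * Assumes width is a multiple of TILE_W and one byte per pixel. */
static size_t tiled_offset(size_t x, size_t y, size_t width)
{
    size_t tiles_per_row = width / TILE_W;
    size_t tile_index    = (y / TILE_H) * tiles_per_row + (x / TILE_W);
    size_t in_tile       = (y % TILE_H) * TILE_W + (x % TILE_W);
    return tile_index * (size_t)(TILE_W * TILE_H) + in_tile;
}
```

Because the tile sizes are powers of two, the divisions and remainders reduce to rearranging address bits, which is why the mapping can look cheap on paper.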

Hi,

You both use terms that have different meanings depending on the context.

"in Z" could be about the Z-buffer, and neither of you has mentioned the
"size" of the "tiles".

Cedric used the term "MMU" which is traditionally used for protection
and remapping of 4KB blocks, which is a totally different beast
from the pixels.

Cedric has posted a link that says that bytes are interleaved, and that
the interleaving crosses raster line boundaries. It's not very easy to do.

The document also mentions different interleaving patterns, so
it's not possible to build a one-size-fits-all unit sitting on the bus
for this.

I believe that your solution is a SIMD instruction that performs
"byte shuffling". I've seen it appear in big CPUs years ago.
Why isn't it used for this purpose?
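
For readers unfamiliar with byte shuffling, here is a minimal portable C model of the operation, not any particular CPU's instruction. The names `shuffle16` and `RASTER_TO_2X2` are hypothetical, and the index table is only one assumed pattern: it takes 16 bytes holding two raster rows of 8 pixels and regroups them into four 2x2 tiles. A real SIMD byte-shuffle instruction would perform the same permutation on a whole register at once.

```c
#include <stdint.h>

/* Generic 16-byte shuffle: dst[i] = src[idx[i]].
 * dst and src must not overlap. Indices are masked to 0..15. */
static void shuffle16(uint8_t dst[16], const uint8_t src[16],
                      const uint8_t idx[16])
{
    for (int i = 0; i < 16; i++)
        dst[i] = src[idx[i] & 15];
}

/* Assumed example pattern: src holds row 0 in bytes 0..7 and
 * row 1 in bytes 8..15; dst receives four 2x2 tiles in sequence. */
static const uint8_t RASTER_TO_2X2[16] = {
    0, 1,  8,  9,    /* tile 0 */
    2, 3, 10, 11,    /* tile 1 */
    4, 5, 12, 13,    /* tile 2 */
    6, 7, 14, 15     /* tile 3 */
};
```

Supporting a different interleaving pattern only requires a different index table, which fits the observation that several patterns exist.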

yg