
Re: ZigZag (was Re: [f-cpu] Status quo)



Hello,

I think I wasn't really clear in my previous description of this
topic. The reason I brought up this idea is that the same zigzag
concept can be used in two cases. The first one is when you use a
GPU and upload a texture. That is a pretty self-contained piece of
code and will usually be manually optimized, so an instruction is
fine for it.

The second case is that having that kind of memory layout is also
going to help the software implementations that are typically used
to do 2D graphics. But in that case the burden on the toolkit doing
the 2D graphics of using a special instruction and changing its own
rendering pipeline is so high that it won't be worth it at all for
them to do it. That's where making the operation transparent to the
software, by flagging some entries in the TLB, would pay off much more.
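
To make the layout concrete, here is a minimal sketch in C. It assumes
that "zigzag" means a Z-order (Morton) interleaving of the x and y
coordinate bits, which is only one of several possible ways to mix the
low address bits. The function names are made up for illustration.

```c
#include <stdint.h>

/* Spread the low 16 bits of v so that bit i ends up at bit 2*i. */
static uint32_t spread_bits16(uint32_t v)
{
    v &= 0x0000FFFFu;
    v = (v | (v << 8)) & 0x00FF00FFu;
    v = (v | (v << 4)) & 0x0F0F0F0Fu;
    v = (v | (v << 2)) & 0x33333333u;
    v = (v | (v << 1)) & 0x55555555u;
    return v;
}

/* Z-order index of pixel (x, y): x bits land on even bit positions,
 * y bits on odd ones, so neighbours in both directions stay close
 * together in memory. Only the low 16 bits of x and y are used. */
static uint32_t zorder_index(uint32_t x, uint32_t y)
{
    return spread_bits16(x) | (spread_bits16(y) << 1);
}
```

With this mapping a 2x2 block of pixels occupies four consecutive
indices, whereas a row-major layout puts the two rows a full stride
apart.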

On Wed, Apr 1, 2015 at 8:26 PM,  <whygee@xxxxxxxxx> wrote:
> On 2015-04-01 16:19, Cedric BAIL wrote:
>> On Wed, Apr 1, 2015 at 11:25 AM,  <whygee@xxxxxxxxx> wrote:
>> The problem with the specific instruction is that it is unlikely to be
>> triggered by a compiler
>
> what if you ask for it ?

Then it is, but that means you are now writing specific assembly code
for this target. The likelihood of someone writing a custom rendering
pipeline (think gstreamer, qt, gtk, efl) to take advantage of this
specific instruction is very, very low.

>> and will require manual writing of the assembly code,
>
> not necessarily, but if your compiler doesn't support the CPU's
> features, why use it ?

There is no difference to me between writing assembly code and using a
C function stub that is just converted to an assembly function. It is
the same amount of work for the developers.

>> but it will also require toolkits to adopt this change in
>> their rendering pipeline to benefit from it. This is something very
>> tricky to do and most toolkits won't do it.
>
> so in the end, you're telling me that users will never use
> the feature, so it's useless to implement it.

If it is an instruction, nobody will use it for their rendering
pipeline. They may use it in some very specific places that are
self-contained, like in mesa for uploading a texture, or in gstreamer
to do some color conversion. But they won't use it everywhere it could
have been a benefit.

>> At the opposite changing the
>> way user space see the memory is much more likely to be done. It will
>> require a change in the allocator used for image (which is already
>> something clearly separated) and a change in the kernel to enable that
>> new mapping. Both of those change are much simpler in nature and less
>> tricky to do, so more likely to be done.
>
> Again, you're being too vague, too fast.
>
> The inter-block pattern is managed by the allocator, ok.
> Then how do you define that a pointer must have its LSB mangled ?

ioctl, mmap flags, ... whatever fits the needs of the kernel. The point
is that the memory allocator for images is already self-contained and
requires little software change.
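
As a rough sketch of how small that allocator change could be, the C
code below wraps mmap() for image buffers. MAP_ZIGZAG_HYPOTHETICAL is
an invented placeholder, not a real kernel flag. It is defined as 0
here so the code compiles and behaves like a plain anonymous mapping.
The struct and function names are also assumptions.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* HYPOTHETICAL: stands in for whatever mmap flag or ioctl a kernel
 * might offer to request the zigzag mapping. No such flag exists. */
#define MAP_ZIGZAG_HYPOTHETICAL 0

struct image_buffer {
    void    *pixels;
    size_t   size;
    uint32_t width;
    uint32_t height;
};

/* Allocate pixel storage; returns 0 on success, -1 on failure. */
static int image_buffer_alloc(struct image_buffer *img, uint32_t width,
                              uint32_t height, uint32_t bytes_per_pixel)
{
    size_t size = (size_t)width * height * bytes_per_pixel;
    if (size == 0)
        return -1;

    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_ZIGZAG_HYPOTHETICAL,
                   -1, 0);
    if (p == MAP_FAILED)
        return -1;

    img->pixels = p;
    img->size   = size;
    img->width  = width;
    img->height = height;
    return 0;
}

/* Release the mapping and reset the descriptor. */
static void image_buffer_free(struct image_buffer *img)
{
    if (img->pixels)
        munmap(img->pixels, img->size);
    img->pixels = NULL;
    img->size   = 0;
}
```

Only the flag passed to mmap() would differ from what an image
allocator does today; the rendering code that reads and writes
img->pixels would stay untouched.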

> Given that there are several ways to mix the LSB, it has no place
> inside the CPU or directly on its address bus. And since it's
> a problem that is specific to GPU, why isn't it possible
> to manage it on the GPU side's bus ?

When I read this, I understood that you hadn't understood why I was
talking about this subject :-) So now I hope it is clearer. Basically
my point here is that the more efficient we are with memory bandwidth,
the better most software can use the performance of the CPU. This is
just one trick I know of; maybe others have other ideas on how to
reduce the memory bandwidth needed for specific operations, and from
there we can infer what the best solution is: MMU flags, ISA, block.
That's also why I talked about the RLE compression of glyphs. I can't
really think of any other trick that would "compress" the memory, but
I am sure you get the idea.
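
For the glyph case, a minimal sketch in C of what such an RLE could
look like. It assumes 8-bit coverage values stored as (run length,
value) byte pairs. This format is only an illustration, not a scheme
taken from an existing toolkit, and the function names are made up.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Encode n coverage bytes as (run length, value) pairs, with runs
 * capped at 255. Returns the number of bytes written, or 0 if the
 * output buffer is too small (or if n is 0). */
static size_t glyph_rle_encode(const uint8_t *in, size_t n,
                               uint8_t *out, size_t out_cap)
{
    size_t w = 0, i = 0;
    while (i < n) {
        uint8_t value = in[i];
        size_t run = 1;
        while (i + run < n && in[i + run] == value && run < 255)
            run++;
        if (w + 2 > out_cap)
            return 0;
        out[w++] = (uint8_t)run;
        out[w++] = value;
        i += run;
    }
    return w;
}

/* Expand (run length, value) pairs back into coverage bytes.
 * Returns the number of bytes written, or 0 on overflow. */
static size_t glyph_rle_decode(const uint8_t *in, size_t n,
                               uint8_t *out, size_t out_cap)
{
    size_t w = 0;
    for (size_t i = 0; i + 1 < n; i += 2) {
        size_t run = in[i];
        if (w + run > out_cap)
            return 0;
        memset(out + w, in[i + 1], run);
        w += run;
    }
    return w;
}
```

Glyph rows are mostly fully transparent or fully opaque, so long runs
are common and fewer bytes have to cross the memory bus than with the
raw bitmap.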

>> I am not dismissing the LUT instruction here, it is useful in itself
>> for other task, but just a reminder that if this require massive
>> change in the existing software, it won't be used.
>
> Well, the F-CPU won't be used if it doesn't work, which is
> a higher priority :-D We won't even have a GPU for a while
> so there is no rush in shuffling address bits for specific
> purposes.

Yup, but it can be useful for the CPU side too :-)
-- 
Cedric BAIL