
Re: Inline asm?

Jeff Read wrote:

> > Can I rely on vga_drawpixel() to be as fast as I could make it with the
> > proper asm language? Or should I even care with modern processors?
> It makes much more sense to construct a bitmap inside main memory and
> then have the graphics library blit the whole shebang to the video card.
> This is what Sprite32 does and its newest release "Anakin" contains an
> optional x86 ASM blitter routine (it handles transfers of sprites to the
> backbuffer before on-screen blitting which is done by XShm or DGA). Here
> are the steps that I recommend you take (since this is what I did in
> Sprite32):
> 1. Prepare a rectangle of memory, either a page of video memory or a
>    block of main memory, that's big enough to hold a screenful. This is
>    to serve as your back buffer.
> 2. Draw a screenful inside this buffer. *THIS* is the code that you're
>    most likely going to want to optimize. An ASM routine to blit a
>    rectangle with transparency, for example, will speed things up
>    considerably if you're creating a sprite-based game.
> 3. Get this buffer onto the screen. This will happen instantaneously if
>    you have the option of swapping video buffers. Otherwise, you're
>    going to have to copy the back buffer into the frame buffer.
>    Generally there is a function such as XShmPutImage in X or putbox in
>    SVGAlib that does this automagically for you. These are generally
>    pretty well tweaked for performance. You can also write your own
>    blitter, lock the framebuffer if you can get access to it, and do
>    the transfers yourself. But you want to think about an ASM function
>    that transfers an entire *rectangle* at a time.
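
For reference, the three steps quoted above can be sketched in plain C.
This is only an illustration under stated assumptions: the Surface type
and the blit_transparent() and present() names are hypothetical (they
are not part of Sprite32, SVGAlib or X), pixels are assumed to be 8 bits
wide, and the "frame buffer" is just another block of memory standing in
for whatever the real library would give you.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 8-bit surface, used for both the back buffer and sprites.
 * Not a type from any real library. */
typedef struct {
    int width, height;
    uint8_t *pixels;            /* width * height bytes, row-major */
} Surface;

/* Step 2: copy src onto dst at (dx, dy), skipping every pixel equal to
 * the transparent colour `key`. The rectangle is clipped to dst. */
static void blit_transparent(Surface *dst, const Surface *src,
                             int dx, int dy, uint8_t key)
{
    int x0 = dx < 0 ? -dx : 0;  /* first visible source column */
    int y0 = dy < 0 ? -dy : 0;  /* first visible source row */
    int x1 = src->width;
    int y1 = src->height;

    if (dx + x1 > dst->width)  x1 = dst->width - dx;
    if (dy + y1 > dst->height) y1 = dst->height - dy;

    for (int y = y0; y < y1; y++) {
        const uint8_t *s = src->pixels + (size_t)y * src->width;
        uint8_t *d = dst->pixels + (size_t)(dy + y) * dst->width;
        for (int x = x0; x < x1; x++) {
            if (s[x] != key)
                d[dx + x] = s[x];
        }
    }
}

/* Step 3: copy the whole back buffer to the frame buffer in one go.
 * Assumes the frame buffer has the same size and layout as `back`. */
static void present(uint8_t *framebuffer, const Surface *back)
{
    memcpy(framebuffer, back->pixels,
           (size_t)back->width * (size_t)back->height);
}
```

The inner loop of blit_transparent() is the part a hand-written ASM
routine would replace. In a real program present() would be the
library's own call (XShmPutImage, putbox, or a buffer swap).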

The problem with this is that you lose most of what the library does for
you. Say the library has a function to copy with transparency. You code
one yourself that is a bit faster, so it pays off nicely and you're
happy. Later on, you do as I did and upgrade your video card to a nice
Matrox Millennium G200, which has so many acceleration features it makes
your head spin, including copying with transparency. Once the library
uses that acceleration, its function is 400% faster than your
painstakingly hand-coded assembler function.

Was it worth it?

This is just an example, but you never know what is in store for the
next release of the library or the next generation of video cards. With
AGP and the large amounts of video memory we have now, even for a 2D
game it makes sense to put most of the pixmaps in video memory and have
the video card's blitter (most modern cards have one, even the cheap
ones, and gamers *have* those cards) do the work for you, far faster
than you ever could in assembler.

> Calling a function such as vga_drawpixel() for each pixel in the
> rectangle is slow, since there's function call overhead for each and
> every pixel. Not good. A different approach is recommended.

Of course, nobody would actually do a function call for each pixel!
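
To make the difference concrete, here is a rough sketch of the two
approaches in C. Everything in it is an assumption made for
illustration: put_pixel() is a hypothetical stand-in for a library's
per-pixel call (it is not SVGAlib's vga_drawpixel()), the screen is a
plain 320x200 8-bit memory block, and neither function clips, so the
caller has to keep the rectangle on screen.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { SCREEN_W = 320, SCREEN_H = 200 };

/* Hypothetical per-pixel call, standing in for a library function. */
void put_pixel(uint8_t *screen, int x, int y, uint8_t color)
{
    screen[(size_t)y * SCREEN_W + x] = color;
}

/* One function call per pixel: w * h calls for a w-by-h rectangle. */
void copy_rect_per_pixel(uint8_t *screen, const uint8_t *src,
                         int x, int y, int w, int h)
{
    for (int row = 0; row < h; row++)
        for (int col = 0; col < w; col++)
            put_pixel(screen, x + col, y + row, src[(size_t)row * w + col]);
}

/* One memcpy per row: h calls for the same rectangle. */
void copy_rect_rows(uint8_t *screen, const uint8_t *src,
                    int x, int y, int w, int h)
{
    for (int row = 0; row < h; row++)
        memcpy(screen + (size_t)(y + row) * SCREEN_W + x,
               src + (size_t)row * w, (size_t)w);
}
```

Both functions produce the same pixels. A compiler may well inline
put_pixel() here, but it cannot do that for a call into a separately
compiled library, which is the case the quoted text is warning about.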

Pierre Phaneuf
Ludus Design, http://ludusdesign.com/
"First they ignore you. Then they laugh at you.
Then they fight you. Then you win." -- Gandhi