Re: [pygame] Python - Pygame - PyOpenGL performance

Would writing a replacement for PyOpenGL in C, instead of in Python
with ctypes, help? I think it really would. PyOpenGL is internally
pretty complex; sometimes when I get a traceback, the error is 5 or 6
levels into PyOpenGL. Even a C library that only implemented the
common functions and relied on PyOpenGL for the constants and for
functions that do complex things like handling strings would probably
help a lot.

On Fri, Feb 27, 2009 at 11:19 AM, Peter Gebauer
<peter.gebauer@xxxxxxxxxxxxxxxxxxxxx> wrote:
> Hi!
> I've done a few sprite thingies in OpenGL; here are some pointers:
> Afaik display lists and VBOs can't bind different textures (?)
> per list/array. You can't animate lists by changing texcoords
> independently per element, so no go. VBOs have texture coords,
> but only one texture. Again, I'm no expert, might be wrong.
> With the quad approach you should try
> to make as few calls as possible. If you get
> rid of the push and translate for each sprite you'll get some
> extra speed. Try positioning each quad directly. The downside
> of sharing one matrix across all sprites is the obvious loss of
> OpenGL transformations, but some vector math applied to
> the quads has been faster for me than having one transformed
> matrix per quad.
> Since I haven't been able to animate a list/VBO with independent
> textures and texture coords for each element/buffer object, I've only
> used them for backdrops. The speed increase is tremendous.
> I also partition the elements so only one list/VBO is displayed per
> visible section; if your screen display is smaller than the
> entire scene, this helps even more.
> If you put all your sprites and their animation frames into one
> big texture you could use VBOs, but I've never had the tenacity
> to try that approach.
> Another way to increase speed is to write an OpenGL rendering engine
> in C and make it available as a Python extension. This is
> a major speed boost, in particular for a large number of iterations.
> Iirc the PyOpenGL bindings are generated, and generated code is often
> suboptimal for what you're trying to do; writing the Python extension in C
> manually has been faster for me many times. This is especially true
> if you put your iterations inside a C loop instead of calling the
> C function from Python many times.
> In any case, still waiting for that OO 2D game engine with tons of
> OpenGL features and effects, including simple things like frame animation,
> LERP-like features and a simple 2D scenegraph. No luck yet, all attempts
> I've tried so far lack at least one "must have" feature. :)
> /Peter
> On 2009-02-26 (Thu) 11:29, Casey Duncan wrote:
>> Immediate mode calls (glVertex et al) are the very slowest way to use
>> OpenGL. In fact they are deprecated in OpenGL 3.0 and will eventually be
>> removed.
>> The display list is better as you discovered, but you still are making a
>> few OpenGL state changes per sprite, which is likely slowing you down.
>> Also there is some overhead for the display list call, which makes them
>> sub-optimal for just drawing a single quad.
>>>        glPushMatrix()
>>>        glTranslate(self.positionx,self.positiony,0)
>>>        glCallList(self.displist)
>>>        glPopMatrix()
>> You really need to batch the quads up into a few vertex arrays or VBOs
>> to stream them to the card in one go. pyglet has a high-level Python
>> sprite API that automates this for you, fwiw.
>> -Casey
>> On Feb 26, 2009, at 11:04 AM, Zack Schilling wrote:
>>> I know the PyOpenGL mailing list might be a better place to ask this
>>> question, but I've had a lot of luck talking to the experienced people
>>> here so I figured I'd try it first.
>>> I'm trying to migrate a game I created from using the Pygame / SDL
>>> software rendering to OpenGL. Before attempting the massive and
>>> complex conversion involved with moving the whole game, I decided to
>>> make a little test program while I learned OpenGL.
>>> In this test, I set up OpenGL to work in 2D and began loading images
>>> into texture objects and drawing textured quads as sprites. I created a
>>> little glSprite class to handle the drawing and translation. At first
>>> its draw routine looked like this:
>>>        glPushMatrix()
>>>        glTranslate(self.positionx,self.positiony,0)
>>>        glBindTexture(GL_TEXTURE_2D, self.texture)
>>>        glBegin(GL_QUADS)
>>>        glTexCoord2f(0, 1)
>>>        glVertex2f(0, 0)
>>>        glTexCoord2f(1, 1)
>>>        glVertex2f(w, 0)
>>>        glTexCoord2f(1, 0)
>>>        glVertex2f(w, h)
>>>        glTexCoord2f(0, 0)
>>>        glVertex2f(0, h)
>>>        glEnd()
>>>        glPopMatrix()
>>> Note: self.texture is a texture ID of a loaded OpenGL texture object.
>>> My sprite class keeps a dictionary cache and only loads the sprite's
>>> image into a texture if it needs to.
>>> I'd get maybe 200 identical sprites (same texture) onscreen and my CPU
>>> would hit 100% load from Python execution. I looked into what could be
>>> causing this and found out that it's probably function call overhead.
>>> That's 14 external library function calls per sprite draw.
>>> The next thing I tried was to create a display list at each sprite's
>>> initialization. Then my code looked like this:
>>>        glPushMatrix()
>>>        glTranslate(self.positionx,self.positiony,0)
>>>        glCallList(self.displist)
>>>        glPopMatrix()
>>> Well, that's nice, down to 4 calls per draw. I was able to push ~500
>>> sprites per frame using this method before the CPU tapped out. I need
>>> more speed than this. My game logic uses 30-40% of the CPU alone and
>>> I'd like to push at least 1000 sprites. What can I do? I've looked into
>>> passing sprites as a matrix with vertex arrays, but forming a proper
>>> vertex array with numpy can sometimes be more trouble than it's worth.
>>> Plus, I can't swap out textures easily mid-draw, so it makes things
>>> much more complex than the simple way I'm doing things now.
>>> Is there any design pattern I could follow that will get me more speed
>>> without sending me off the deep end with complexity?
>>> Thanks,
>>> Zack
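
For reference, here is a minimal sketch of the batching idea suggested in the thread: build one vertex array and one texcoord array for all sprites with numpy, so the per-sprite push/translate/call/pop sequence is replaced by array math. The function name build_quad_arrays and its parameters (positions, size, frames) are hypothetical, not taken from anyone's code. It assumes every sprite has the same width and height, and that all sprites in a batch share one texture (for example, a single large texture holding all animation frames). The corner and texcoord order follows the glVertex2f/glTexCoord2f sequence in the original draw routine.

```python
import numpy as np


def build_quad_arrays(positions, size, frames=None):
    """Build batched quad data for n sprites.

    positions: n (x, y) pairs, the lower-left corner of each sprite.
    size:      (w, h) shared by every sprite in the batch (an assumption).
    frames:    optional n (u0, v0, u1, v1) rectangles inside one shared
               texture; defaults to the whole texture for every sprite.

    Returns (vertices, texcoords), each a float32 array of shape (4*n, 2).
    """
    pos = np.asarray(positions, dtype=np.float32).reshape(-1, 2)
    n = len(pos)
    w, h = size

    # Corner offsets in the same order as the immediate-mode draw routine:
    # (0, 0), (w, 0), (w, h), (0, h).
    corners = np.array([[0, 0], [w, 0], [w, h], [0, h]], dtype=np.float32)

    # Broadcasting (n, 1, 2) + (1, 4, 2) -> (n, 4, 2): every sprite position
    # is added to every corner offset, with no Python-level loop.
    vertices = (pos[:, None, :] + corners[None, :, :]).reshape(-1, 2)

    if frames is None:
        frames = np.tile(np.array([0, 0, 1, 1], dtype=np.float32), (n, 1))
    else:
        frames = np.asarray(frames, dtype=np.float32).reshape(-1, 4)
    u0, v0, u1, v1 = frames[:, 0], frames[:, 1], frames[:, 2], frames[:, 3]

    # Texcoords matching the corners above; with the default frame this is
    # (0, 1), (1, 1), (1, 0), (0, 0), as in the original glTexCoord2f calls.
    texcoords = np.stack(
        [
            np.stack([u0, v1], axis=1),
            np.stack([u1, v1], axis=1),
            np.stack([u1, v0], axis=1),
            np.stack([u0, v0], axis=1),
        ],
        axis=1,
    ).reshape(-1, 2)

    return np.ascontiguousarray(vertices), np.ascontiguousarray(texcoords)


if __name__ == "__main__":
    verts, tex = build_quad_arrays([(10, 20), (100, 200)], (32, 16))
    print(verts.shape, tex.shape)
```

The two arrays would then be handed to OpenGL once per frame rather than once per sprite: roughly, enable GL_VERTEX_ARRAY and GL_TEXTURE_COORD_ARRAY with glEnableClientState, bind the shared texture, call glVertexPointer(2, GL_FLOAT, 0, vertices) and glTexCoordPointer(2, GL_FLOAT, 0, texcoords), and draw with glDrawArrays(GL_QUADS, 0, len(vertices)). That drawing step is not included in the block above and is untested here. Animating a sprite becomes a matter of changing its row in frames, which sidesteps swapping textures mid-draw as long as all frames live in one texture.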