
Re: [pygame] Python - Pygame - PyOpenGL performance



I'm not sure ctypes is sufficient for everything you can do
with OpenGL, but I don't know.

/Peter

On 2009-03-16 (Mon) 13:00, Forrest Voight wrote:
> Would writing a replacement for PyOpenGL in C instead of in Python
> with ctypes help? I think it really would ... PyOpenGL is internally
> pretty complex, sometimes when I get tracebacks the error is 5 or 6
> levels into PyOpenGL. Even a C library that only implemented the
> common functions and relied on PyOpenGL for the constants and
> functions that do complex things like handling strings would probably
> help a lot.
> 
> On Fri, Feb 27, 2009 at 11:19 AM, Peter Gebauer
> <peter.gebauer@xxxxxxxxxxxxxxxxxxxxx> wrote:
> > Hi!
> >
> > I've done a few sprite thingies in OpenGL; here are some pointers:
> >
> > AFAIK display lists and VBOs can't bind different textures (?)
> > per list/array. You can't animate lists by changing texcoords
> > independently per element, so no go. VBOs have texture coords,
> > but only one texture. Again, I'm no expert, might be wrong.
> >
> > With the quad approach you should try
> > to make the number of calls as few as possible. If you get
> > rid of the push and translate for each sprite you'll get some
> > extra speed. Try positioning each quad directly. The downside
> > of sharing one matrix across all sprites is the obvious loss of
> > OpenGL transformations, but some vector math applied to
> > the quads has been faster for me than having one transformed
> > matrix per quad.
> >
> > Since I haven't been able to animate a list/VBO with independent
> > textures and texture coords for each element/buffer object, I've only
> > used it for backdrops. The speed increase is tremendous.
> > I also partition the elements so only one list/VBO is displayed per
> > visible section; if your screen display is smaller than the
> > entire scene, this helps even more.
> >
> > If you put all your sprites and their animation frames into one
> > big texture you could use VBOs, but I've never had the tenacity
> > to try that approach.
> >
> > Another way to increase speed is to write an OpenGL rendering engine
> > in C and make it available as a Python extension. This is
> > a major speed boost, in particular for a large number of iterations.
> > IIRC the PyOpenGL bindings are generated, and generated code is often
> > suboptimal for what you're trying to do; writing the Python extension
> > in C manually has been faster for me many times. This is especially true
> > if you put your iterations inside a C loop instead of calling the
> > C function from Python many times.
> >
> > In any case, still waiting for that OO 2D game engine with tons of
> > OpenGL features and effects, including simple things like frame animation,
> > LERP-like features and a simple 2D scenegraph. No luck yet, all attempts
> > I've tried so far lack at least one "must have" feature. :)
> >
> > /Peter
> >
> > On 2009-02-26 (Thu) 11:29, Casey Duncan wrote:
> >> Immediate mode calls (glVertex et al) are the very slowest way to use
> >> OpenGL. In fact they are deprecated in OpenGL 3.0 and will eventually be
> >> removed.
> >>
> >> The display list is better as you discovered, but you still are making a
> >> few OpenGL state changes per sprite, which is likely slowing you down.
> >> Also there is some overhead for the display list call, which makes them
> >> sub-optimal for just drawing a single quad.
> >>
> >>>        glPushMatrix()
> >>>        glTranslate(self.positionx,self.positiony,0)
> >>>        glCallList(self.displist)
> >>>        glPopMatrix()
> >>
> >> You really need to batch the quads up into a few vertex arrays or vbos
> >> to stream them to the card in one go. pyglet has a high-level python
> >> sprite api that automates this for you fwiw.
> >>
> >> -Casey
> >>
> >> On Feb 26, 2009, at 11:04 AM, Zack Schilling wrote:
> >>
> >>> I know the PyOpenGL mailing list might be a better place to ask this
> >>> question, but I've had a lot of luck talking to the experienced people
> >>> here so I figured I'd try it first.
> >>>
> >>> I'm trying to migrate a game I created from using the Pygame / SDL
> >>> software rendering to OpenGL. Before attempting the massive and
> >>> complex conversion involved with moving the whole game, I decided to
> >>> make a little test program while I learned OpenGL.
> >>>
> >>> In this test, I set up OpenGL to work in 2D and began loading images
> >>> into texture objects and drawing textured quads as sprites. I created a
> >>> little glSprite class to handle the drawing and translation. At first
> >>> its draw routine looked like this:
> >>>
> >>>        glPushMatrix()
> >>>        glTranslate(self.positionx,self.positiony,0)
> >>>        glBindTexture(GL_TEXTURE_2D, self.texture)
> >>>        glBegin(GL_QUADS)
> >>>        glTexCoord2f(0, 1)
> >>>        glVertex2f(0, 0)
> >>>        glTexCoord2f(1, 1)
> >>>        glVertex2f(w, 0)
> >>>        glTexCoord2f(1, 0)
> >>>        glVertex2f(w, h)
> >>>        glTexCoord2f(0, 0)
> >>>        glVertex2f(0, h)
> >>>        glEnd()
> >>>        glPopMatrix()
> >>>
> >>> Note: self.texture is a texture ID of a loaded OpenGL texture object.
> >>> My sprite class keeps a dictionary cache and only loads the sprite's
> >>> image into a texture if it needs to.
> >>>
> >>> I'd get maybe 200 identical sprites (same texture) onscreen and my CPU
> >>> would hit 100% load from Python execution. I looked into what could be
> >>> causing this and found out that it's probably function call overhead.
> >>> That's 14 external library function calls per sprite draw.
> >>>
> >>> The next thing I tried was to create a display list at each sprite's
> >>> initialization. Then my code looked like this:
> >>>
> >>>        glPushMatrix()
> >>>        glTranslate(self.positionx,self.positiony,0)
> >>>        glCallList(self.displist)
> >>>        glPopMatrix()
> >>>
> >>> Well, that's nice, down to 4 calls per draw. I was able to push ~500
> >>> sprites per frame using this method before the CPU tapped out. I need
> >>> more speed than this. My game logic uses 30-40% of the CPU alone and
> >>> I'd like to push at least 1000 sprites. What can I do? I've looked into
> >>> passing sprites as a matrix with vertex arrays, but forming a proper
> >>> vertex array with numpy can sometimes be more trouble than it's worth.
> >>> Plus, I can't swap out textures easily mid-draw, so it makes things
> >>> much more complex than the simple way I'm doing things now.
> >>>
> >>> Is there any design pattern I could follow that will get me more speed
> >>> without sending me off the deep end with complexity?
> >>>
> >>> Thanks,
> >>>
> >>> Zack
> >
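
For reference, the batching advice in this thread (Casey's vertex arrays, plus Peter's suggestions to position each quad directly and to keep all frames in one big texture) could be sketched roughly as below. This is only an illustration, not code from any of the posters: the helper names (quad_vertices, atlas_texcoords, build_batch, draw_batch), the sprite tuple layout and the evenly spaced sprite-sheet grid are all assumptions. draw_batch needs numpy, PyOpenGL and a current legacy (fixed-function) GL context; the other helpers are plain Python.

```python
def quad_vertices(x, y, w, h):
    """Corners of one sprite quad, already moved to (x, y).

    Doing the translation here replaces the per-sprite
    glPushMatrix/glTranslate/glPopMatrix calls.
    """
    return [x, y, x + w, y, x + w, y + h, x, y + h]


def atlas_texcoords(col, row, cols, rows):
    """Texture coordinates of one frame in a cols x rows sprite sheet.

    Assumes (hypothetically) that frames sit on an even grid. The corner
    order and the flipped v axis match the glTexCoord2f calls in the
    original post.
    """
    u0 = float(col) / cols
    u1 = float(col + 1) / cols
    v0 = float(row) / rows
    v1 = float(row + 1) / rows
    return [u0, v1, u1, v1, u1, v0, u0, v0]


def build_batch(sprites, cols, rows):
    """Flatten many sprites into two coordinate lists.

    Each sprite is an (x, y, w, h, col, row) tuple; this layout is an
    assumption made for the sketch.
    """
    vertices = []
    texcoords = []
    for x, y, w, h, col, row in sprites:
        vertices.extend(quad_vertices(x, y, w, h))
        texcoords.extend(atlas_texcoords(col, row, cols, rows))
    return vertices, texcoords


def draw_batch(texture_id, vertices, texcoords):
    """Draw every quad in the batch with a single glDrawArrays call."""
    # Imported here so the helpers above work without numpy or PyOpenGL.
    import numpy
    from OpenGL.GL import (
        GL_FLOAT, GL_QUADS, GL_TEXTURE_2D,
        GL_TEXTURE_COORD_ARRAY, GL_VERTEX_ARRAY,
        glBindTexture, glDisableClientState, glDrawArrays,
        glEnableClientState, glTexCoordPointer, glVertexPointer,
    )

    v = numpy.array(vertices, dtype=numpy.float32)
    t = numpy.array(texcoords, dtype=numpy.float32)

    glBindTexture(GL_TEXTURE_2D, texture_id)  # one texture for the whole batch
    glEnableClientState(GL_VERTEX_ARRAY)
    glEnableClientState(GL_TEXTURE_COORD_ARRAY)
    glVertexPointer(2, GL_FLOAT, 0, v)
    glTexCoordPointer(2, GL_FLOAT, 0, t)
    glDrawArrays(GL_QUADS, 0, len(vertices) // 2)  # two floats per vertex
    glDisableClientState(GL_TEXTURE_COORD_ARRAY)
    glDisableClientState(GL_VERTEX_ARRAY)
```

With this layout the number of GL calls per frame no longer grows with the number of sprites; only the Python loop in build_batch does. Animating a sprite means picking a different (col, row) for it on the next rebuild. Sprites that live in different textures would still need one batch per texture.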