[Author Prev][Author Next][Thread Prev][Thread Next][Author Index][Thread Index]

Re: [pygame] Dirty rect overlapping optimizations

Just to clarify, this is an optimization to update the display surface?  TBH, if DR0ID says that it isn't a good optimization, then I'll have to follow him.  Thinking deeply about the issue, the would be better payoffs by finding way to not flip bits in the display surface in the first place.  To say it another way, if we can inform developers that they are drawing to specific areas in the display surface several times, then it would help profile the game better and reduce extraneous draws.  Android has options to detect extraneous overdraw, and I'm sure other platforms do as well.

Also, if we are going to promote pygame group rendering, a cheap 15-20% performance increase (based on my system) could be had by creating a C function that accepts a list of surfaces and their positions, then executing the blits in C.  I've brought it up before, but nothing came if it.  In fact, I was asked more than once if I had converted my surfaces, so maybe people just don't read threads?  ¯\_(ツ)_/¯

On Sun, Mar 26, 2017 at 2:50 PM, René Dudfield <renesd@xxxxxxxxx> wrote:
Because rectangles can overlap, it's possible to reduce the amount drawing done when using them for dirty rect updating. If there's two rectangles overlapping, then we don't need to over draw the overlapping area twice. Normally the dirty rect update algorithm used just makes a bigger rectangle which contains two of the smaller rectangles. But that's not optimal.

But, as with all over draw algorithms, can it be done fast enough to be worth it?

Here's an article on the topic:

jmm0 made some code here:

DR0ID, also did some with tests and faster code...

So far DR0ID says it's not really fast enough to be worth it. However, there are opportunities to improve it. Perhaps a more optimal algorithm, or one which uses C or Cython.

"worst case szenario with 2000 rects it takes ~0.31299996376 seconds"
If it's 20x faster in C, then that gets down to 0.016666. Still too slow perhaps.

It's not an embarrassingly parallel problem... I think. But I haven't thought on it much at all. Maybe there is a use for multi core here.

Anyone done any other work like this? Or know of some good algos for this?