Hellos,
I made an initial implementation of Surface.blits(), focusing just on correctness, with no optimizations.
In the micro benchmark below, blitting 255 surfaces with it takes roughly 89% to 93% of the time
taken by calling Surface.blit in a Python loop over the same list.
.. method:: blits

   | :sl:`draw many images onto another`
   | :sg:`blits(blit_sequence=((source, dest), ...), doreturn=1) -> (Rect, ...)`
   | :sg:`blits(((source, dest, area), ...)) -> (Rect, ...)`
   | :sg:`blits(((source, dest, area, special_flags), ...)) -> (Rect, ...)`

   Draws many surfaces onto this Surface. It takes a sequence as input,
   with each element corresponding to the arguments of ``Surface.blit()``.
   It needs at minimum a sequence of (source, dest) pairs.
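As a rough illustration of the call described above, here is a minimal sketch that assumes a pygame build which already includes Surface.blits(). The three-tile layout and the variable names are made up for the example; only the (source, dest) form and the doreturn argument from the signature are used.

```python
import pygame

# Destination wide enough for three 10x10 tiles; no display window is needed.
dst = pygame.Surface((30, 10), pygame.SRCALPHA, 32)
dst.fill((230, 230, 230))

# Build the sequence of (source, dest) pairs that blits() expects.
blit_list = []
for i in range(3):
    tile = pygame.Surface((10, 10), pygame.SRCALPHA, 32)
    tile.fill((i * 100, i * 100, i * 100))
    blit_list.append((tile, (i * 10, 0)))

# One call draws every pair and hands back one Rect per blit.
rects = dst.blits(blit_list)

# With doreturn=0 the Rects are not built and None comes back.
nothing = dst.blits(blit_list, doreturn=0)

print(len(rects), nothing)
```

If the returned Rects are not needed (for example, when the whole screen is flipped anyway), passing doreturn=0 skips building them.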
Considering blit is usually the slow part in most pygame apps, this is sort of nice.
Note, I haven't updated pygame.sprite to use it. Volunteers?
I feel without updating pygame.sprite, many people won't use it.
Some benchmarking and other notes below.
cheers,
1) Why not work on a faster implementation that saves the unwrapped objects?
This would allow you to save a list into a C object like:
struct blitinfo {
    SDL_Surface *dest;   /* pointer to the already unwrapped surface */
    GAME_Rect *src_rect;
    GAME_Rect *area;
    int flags;
};
Then, if you promise not to change the C list (i.e., you are updating rects in place, and all your Surfaces are still there),
it could avoid a lot of the unwrapping work.
However, I did a test where I commented out the blit call, so that only the unwrapping and looping over the list is done.
It seems that the Python bookkeeping for these 255 10x10 surfaces is only 2.1%-3.3% of the total time taken.
2) Another optimization would be to avoid subsurface checks, and avoid a few other preparations for surfaces.
I tried this, and didn't see any noticeable improvement.
3) Currently neither SDL1 nor SDL2 has a special batched blit, but there are proposals and implementations around,
such as SDL_GPU.
A batched blit could bring a bigger improvement on backends where changing state is slow (OpenGL etc).
import pygame
from pygame.locals import *
NUM_SURFS = 255
dst = pygame.Surface((NUM_SURFS * 10, 10), SRCALPHA, 32)
dst.fill((230, 230, 230))
blit_list = []
for i in range(NUM_SURFS):
    dest = (i * 10, 0)
    surf = pygame.Surface((10, 10), SRCALPHA, 32)
    color = (i * 1, i * 1, i * 1)
    surf.fill(color)
    blit_list.append((surf, dest))

def blits(blit_list):
    for surface, dest in blit_list:
        dst.blit(surface, dest)
In [17]: %timeit results = blits(blit_list)
774 µs ± 24.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [18]: %timeit results = dst.blits(blit_list)
717 µs ± 12.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [19]: %timeit results = dst.blits(blit_list, doreturn=0)
688 µs ± 14.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [20]: (100. / 774) * 717
Out[20]: 92.63565891472868
In [21]: (100. / 774) * 688
Out[21]: 88.88888888888889
If I comment out the actual blit call...
In [3]: %timeit results = dst.blits(blit_list)
26.2 µs ± 695 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [4]: %timeit results = dst.blits(blit_list, doreturn=0)
17.6 µs ± 314 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [5]: (100. / 774) * 26
Out[5]: 3.3591731266149867
In [6]: (100. / 774) * 17
Out[6]: 2.1963824289405682
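The percentages above can be recomputed in one place with a small sketch like the following. The helper name percent_of and the dictionary labels are made up for illustration; the timings are the %timeit means quoted above. Note that it uses the unrounded 26.2 µs and 17.6 µs figures, so the last two shares come out slightly higher (about 3.4% and 2.3%) than with 26 and 17.

```python
def percent_of(baseline_us, measured_us):
    """Return the measured time as a percentage of the baseline time."""
    return 100.0 * measured_us / baseline_us


# Baseline: Surface.blit called in a Python loop (mean, in microseconds).
LOOP_US = 774.0

# Means reported by %timeit for each variant, in microseconds.
timings_us = {
    "blits": 717.0,
    "blits, doreturn=0": 688.0,
    "blits, blit call commented out": 26.2,
    "blits, blit call commented out, doreturn=0": 17.6,
}

shares = {name: percent_of(LOOP_US, t) for name, t in timings_us.items()}

for name, share in shares.items():
    print(f"{name}: {share:.1f}% of the loop time")
```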