
Re: [pygame] Bump mapping





On Wed, 30 Jun 2004, Pete Shinners wrote:
> Jasper Phillips wrote:
> > Cool, thanks!  This helped quite a bit, as I was fumbling around trying to
> > find a way to multiply 2 arrays of different sizes, and also didn't know
> > about N.where().
> > 
> > For 380x360  I got  .35 seconds for a ~3.7x speed improvement.
> > For 1220x840 I got 1.30 seconds for a ~7.5x speed improvement.
> > For the next size up (~2000x2000) it appears to suck up memory/cpu and
> > hang!  Not sure what's going on here. :-/
> 
> Let's see, 2000x2000*4/1024 = 15.6MB.
> When dealing with large arrays you'll have to be a little more careful 
> with Numeric. It tends to want to make copies of the arrays frequently. 
> At 15MB each it won't take many copies to swamp a machine.

Worse than that with 64 bit floats I think!
2000*2000*4*64/8/1024/1024 = 122MB.  Yowch!
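A rough sketch of that arithmetic in plain Python, in case anyone wants to plug in other sizes. The helper name array_megabytes is made up for illustration, and it assumes 4 elements per pixel as in the figures above. It divides by 1024 twice, so the 8-bit case comes out a little under the 15.6 quoted above.

```python
def array_megabytes(width, height, channels, bits_per_element):
    """Size in MB (2**20 bytes) of a dense width x height x channels array."""
    total_bytes = width * height * channels * bits_per_element // 8
    return total_bytes / (1024.0 * 1024.0)

# One byte per element, 4 elements per pixel.
small = array_megabytes(2000, 2000, 4, 8)
# The same surface stored as 64-bit floats is 8 times larger.
large = array_megabytes(2000, 2000, 4, 64)

print("%.1f MB vs %.1f MB" % (small, large))
```

Each temporary copy made during a calculation costs that much again, which is why a handful of copies at the float size is enough to swamp a machine.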

[snip]
> Instead use either of these lines,
>    array *= shade
>    Numeric.multiply(array, shade, array)

I had to tweak this a bit since the inplace operators don't seem to work
when mixing floats and ints.  I ended up avoiding floats anyway, so no big
deal.
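For illustration, here is a small sketch of the float/int problem and the all-integer workaround. It uses NumPy as a stand-in for Numeric (an assumption, the two libraries may not behave identically), and the array values are made up.

```python
import numpy as np  # assumption: NumPy standing in for Numeric

pixels = np.array([0, 100, 255], dtype=np.int32)
shade_float = np.array([0.5, 0.5, 0.5])                 # shade as 0..1 floats
shade_int = np.array([128, 128, 128], dtype=np.int32)   # shade as 0..255 ints

try:
    # The float result cannot be stored back into an int32 array in place.
    pixels *= shade_float
    inplace_float_ok = True
except TypeError:
    inplace_float_ok = False

# All-integer version: scale up, then integer-divide back down, both in place.
pixels *= shade_int
pixels //= 255
```

Staying in integers keeps everything in place, so no float copy of the full surface is ever allocated.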

> Also, if you really are using the "compensate" mode. Precalculate that 
> elevation and compensation map. At 2k that's a heck of a lot of calls to 
> sin() and radians() for every call to shadeMap.

Can't do that, as when combined with the later clamping it gives a different
result, i.e. the end value depends upon the image being shade mapped.  sin()
and radians() should only be called once anyway, although the /= is
expensive.

> I'd say same goes for converting "shade" to an array of 0-1 floats.
> Although the shade maps aren't as large, doing it once when you load the
> shade image is a big win over doing the conversion every time you want to
> use it.

I won't be redoing this often, and Floats were the wrong way to go anyway.

> Numeric also has a "max" function, which should be better optimized than 
> the "where". Plus it can place the results directly in the original.
>     if compensate:
>         Numeric.maximum(array, 255, array)

Numeric.minimum(array, 255, array) is what I want.  It doesn't test any
faster than where(), but it is clearer.
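A minimal comparison of the two clamping styles, again with NumPy assumed as a stand-in for Numeric (NumPy spells the output argument out=). The sample values are invented.

```python
import numpy as np  # assumption: NumPy standing in for Numeric

values = np.array([10, 255, 300, 1000], dtype=np.int32)

# where() builds a new array holding the clamped values.
clamped_copy = np.where(values > 255, 255, values)

# minimum() with an output argument clamps in place, no extra array.
np.minimum(values, 255, out=values)
```

Both give the same numbers. The in-place form mainly saves the extra allocation, which matters more as the surface grows.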

> >   -This requires changing the rest of the code to store shades as
> >   (shade,shade,shade) instead of (shade,0,0)
> 
> Actually, you should be able to use just "shade", instead of backing it 
> into an array with 3 values. Numeric will automatically apply the value 
> in the "2D" array to each of the values in the "3D" array. If anything 
> this will help with the memory situation.
>      shade = pygame.surfarray.array3d(shadeSurf)[0].astype(N.Float64)

This helped too, although the syntax ends up being [:w,:h,:1], which keeps
a length-1 third axis so the shade lines up with the 3D array.
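A small sketch of how the single-channel shade lines up with the RGB array. This uses NumPy (an assumption, standing in for Numeric), where an index of 0 drops the last axis while a slice of :1 keeps it at length 1. The sizes and values are arbitrary.

```python
import numpy as np  # assumption: NumPy standing in for Numeric

w, h = 4, 3
rgb = np.full((w, h, 3), 200, dtype=np.int32)       # stand-in for array3d(surface)
shade3d = np.full((w, h, 3), 128, dtype=np.int32)   # stand-in for array3d(shadeSurf)

flat = shade3d[:w, :h, 0]     # shape (w, h): the channel axis is gone
shade = shade3d[:w, :h, :1]   # shape (w, h, 1): the channel axis is kept

# The length-1 axis is stretched across R, G and B without storing 3 copies.
rgb *= shade

# A flat (w, h) array gets the same shape back with an explicit new axis.
restored = flat[:, :, np.newaxis]
```

Only one shade value per pixel is stored, so the shade array is a third of the size of the (shade,shade,shade) layout.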

> Perhaps an entry for the new PCR, once it's all cleaned up (hint hint)

I thought it was awfully convenient for the new PCR to open for business
just now. ;-)


I managed to get about another 2.5x speed improvement out of my latest
tweak, through your tips plus switching to Int32s and rearranging the order
of computation to reduce roundoff errors.  This is despite now doing the /=
255 on the full surface instead of the smaller shadeSurface.  Plus it now
also works on 2000x2000 surfaces, and saves a lot of memory.
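To illustrate why the order of operations and the integer width matter, here is a tiny sketch with one made-up pixel and shade value. It assumes NumPy in place of Numeric, and it is only one reading of what "rearranging" means here: multiply by the 0..255 shade first, then divide by 255.

```python
import numpy as np  # assumption: NumPy standing in for Numeric

pixel = np.array([200], dtype=np.int32)
shade = np.array([128], dtype=np.int32)

# Multiply first, divide last: the intermediate 25600 keeps full precision.
multiply_first = (pixel * shade) // 255

# Divide first: 128 // 255 truncates to 0 and the pixel goes black.
divide_first = pixel * (shade // 255)

# With 8-bit arrays the product wraps modulo 256, hence the wider Int32.
narrow = pixel.astype(np.uint8) * shade.astype(np.uint8)
```

The largest intermediate is 255 * 255 = 65025, which fits in 32 bits but not in 8.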

-Jasper


# PS Here's my latest tweak:

import pygame
import pygame.surfarray
import Numeric as N
from math import sin, radians

def shadeMap( surface, shadeSurf, compensate=False, elevation=55.0 ):
    '''Bump map from a pre calculated shade map'''
    array  = pygame.surfarray.array3d( surface ).astype( N.Int32 )
    array *= createShadeArray( shadeSurf, surface.get_size() )

    if compensate:
        array /= int( 255 * sin(radians(elevation)) )
        N.minimum( array, 255, array )
    else:
        array /= 255

    return pygame.surfarray.make_surface( array.astype( N.Int8 ) )

def createShadeArray( shadeSurf, (wNew,hNew) ):
    '''Tile shadeSurf into a surfarray of the specified size'''
    shade  = pygame.surfarray.array3d( shadeSurf )
    w, h   = shadeSurf.get_size()
    xTiles = (wNew+w)/w
    yTiles = (hNew+h)/h
    shade  = N.concatenate((shade,)*xTiles,0)
    shade  = N.concatenate((shade,)*yTiles,1)
    return shade[:wNew,:hNew,:1]
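
For anyone without Numeric to hand, here is a hedged sketch of the same idea working on plain arrays. It is not the code above: it assumes NumPy instead of Numeric, leaves out the pygame.surfarray conversions, uses np.tile instead of repeated concatenate, rounds the tile count up with ceiling division, and converts back to unsigned 8-bit at the end. The names create_shade_array and shade_map_array are made up for this example.

```python
import math
import numpy as np  # assumption: NumPy standing in for Numeric

def create_shade_array(shade, new_size):
    """Tile a (w, h, channels) shade array up to new_size, keeping one channel."""
    w_new, h_new = new_size
    w, h = shade.shape[:2]
    x_tiles = -(-w_new // w)   # ceiling division
    y_tiles = -(-h_new // h)
    tiled = np.tile(shade, (x_tiles, y_tiles, 1))
    return tiled[:w_new, :h_new, :1]

def shade_map_array(pixels, shade, compensate=False, elevation=55.0):
    """Array-only shade map: pixels is (w, h, 3) with values 0..255."""
    out = pixels.astype(np.int32)
    out *= create_shade_array(shade, out.shape[:2]).astype(np.int32)
    if compensate:
        # Smaller divisor brightens the result, so clamp afterwards.
        out //= int(255 * math.sin(math.radians(elevation)))
        np.minimum(out, 255, out=out)
    else:
        out //= 255
    return out.astype(np.uint8)

pixels = np.full((5, 4, 3), 200, dtype=np.uint8)
shade = np.full((2, 2, 3), 128, dtype=np.uint8)
plain = shade_map_array(pixels, shade)
bright = shade_map_array(pixels, shade, compensate=True)
```

The result could then be handed to pygame.surfarray.make_surface, as in the original listing.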