Nice one!
This sounds like a very nice scaling function.
It'd be cool if we could include a runtime way of selecting MMX and other CPU-specific optimizations. Using the SDL methods would probably be the way to go.
I've added it to the todo list for this week's mini sprint (http://www.pygame.org/wiki/todo), so hopefully it'll get into pygame soon.
If you feel like figuring out how to use the SDL MMX detection routines to select the MMX routine at runtime, that'd be cool.
On 6/18/07, Richard Goedeken <SirRichard@xxxxxxxxxxxxxxxxxxxxxxx> wrote:
> Hello everyone. I just joined the list; my name is Richard Goedeken.
> I'm using Pygame in a project that I've been working on for a few weeks,
> and I wanted an image scaling function with higher visual quality than
> the nearest-neighbor algorithm which is included with the 'scale'
> function. So I wrote one; it's in the attached zip file. I hereby give
> the Pygame maintainers permission to include and distribute this code
> with the Pygame project under the license of their choice.
>
> The algorithm which I've implemented is interesting. Each axis is
> scaled independently, which gives it the property that scaling an image
> only in the X dimension or only in the Y dimension will be about twice
> as fast as scaling both. The reason that this design was chosen is
> because the axes are scaled differently depending upon whether they are
> being shrunk or expanded. For expansion, a bilinear filter is used
> which looks nice at magnifications under 3x or so and is quick. For
> shrinking the image, a novel area-averaging algorithm is used which
> suppresses Moire patterns and looks good even at very small sizes.
>
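A rough sketch of the per-axis idea described above, in pure Python on a grayscale image stored as a list of rows. This is not the code from transform.c: the function names are made up for illustration, the expansion is plain linear interpolation per axis, and the shrink is an ordinary box-filter area average, which may differ from the novel algorithm Richard describes.

```python
def expand_axis(row, new_len):
    """Stretch a 1-D row by linear interpolation (one axis of a bilinear filter)."""
    old_len = len(row)
    if new_len == 1 or old_len == 1:
        return [float(row[0])] * new_len
    out = []
    for i in range(new_len):
        # Map the output index to a fractional source position, endpoints aligned.
        pos = i * (old_len - 1) / (new_len - 1)
        left = int(pos)
        right = min(left + 1, old_len - 1)
        frac = pos - left
        out.append(row[left] * (1.0 - frac) + row[right] * frac)
    return out


def shrink_axis(row, new_len):
    """Shrink a 1-D row by averaging the source span each output pixel covers."""
    old_len = len(row)
    scale = old_len / new_len  # source pixels per output pixel
    out = []
    for i in range(new_len):
        start, end = i * scale, (i + 1) * scale
        total = 0.0
        j = int(start)
        while j < old_len and j < end:
            # Length of the overlap between source pixel [j, j+1) and [start, end).
            overlap = min(j + 1, end) - max(j, start)
            total += row[j] * overlap
            j += 1
        out.append(total / scale)
    return out


def scale_axis(row, new_len):
    """Pick the filter per axis: interpolate when growing, average when shrinking."""
    if new_len > len(row):
        return expand_axis(row, new_len)
    if new_len < len(row):
        return shrink_axis(row, new_len)
    return list(row)  # unchanged axis: no filtering work at all


def smooth_scale(image, new_w, new_h):
    """Scale X first, then Y, treating each axis independently."""
    rows = [scale_axis(r, new_w) for r in image]
    cols = [scale_axis(list(c), new_h) for c in zip(*rows)]  # transpose, scale Y
    return [list(r) for r in zip(*cols)]  # transpose back


print(smooth_scale([[0, 10], [20, 30]], 3, 1))
```

Because each axis is handled separately, an axis whose size does not change is just copied, which is where the "about twice as fast" property for single-axis scaling comes from.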
> The source code is in transform.c. It's pretty big because I've also
> included inline MMX routines for the i686 and x86_64 architectures under
> Unix. The AT&T-style asm syntax won't work with the Intel or MS
> compilers, but someone could translate it and add Intel-style code for
> Win32. It runs a lot faster with the MMX code. I have included a test
> program (scaletest.py) which can run a short benchmark series of scaling
> operations. When run with a 600k pixel image, I got the following results:
>
> Machine          Algorithm     Code level    Shrink time   Expand time
> Athlon64 3800+   smoothscale   C-only        36 ms          96 ms
> Athlon64 3800+   smoothscale   64-bit MMX     5 ms          16 ms
> Athlon64 3800+   scale         C-only         2 ms          13 ms
> Pentium 3-800    smoothscale   C-only        64 ms         180 ms
> Pentium 3-800    smoothscale   32-bit MMX    39 ms         119 ms
> Pentium 3-800    scale         C-only        17 ms          85 ms
>
> I was surprised that the MMX ran so much (6x) faster than the C code on
> my 64-bit machine. But I'm happy that it actually comes close to
> matching the nearest-neighbor 'scale' function. I think the P-3 may
> have been hindered by relatively low memory bandwidth. With newer
> 32-bit architectures such as the Core 2 or Athlon, I believe that the MMX
> will give a bigger speed gain over the C than it does on the P-3.
>
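To put numbers on the comparison, here is a small sketch that derives the MMX-over-C speedups from the smoothscale timings in the table above. The helper name is hypothetical, and the figures are only the ones quoted in this message.

```python
# (shrink ms, expand ms) for smoothscale, copied from the benchmark table.
athlon_c, athlon_mmx = (36, 96), (5, 16)
p3_c, p3_mmx = (64, 180), (39, 119)


def speedup(slow, fast):
    """Return (shrink, expand) speedup factors of `fast` over `slow`."""
    return tuple(s / f for s, f in zip(slow, fast))


for name, c_times, mmx_times in [
    ("Athlon64 3800+", athlon_c, athlon_mmx),
    ("Pentium 3-800", p3_c, p3_mmx),
]:
    shrink, expand = speedup(c_times, mmx_times)
    print(f"{name}: shrink {shrink:.1f}x, expand {expand:.1f}x")
```

On these figures the Athlon64 gains about 7x when shrinking and 6x when expanding, while the P-3 gains roughly 1.5x to 1.6x.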
> The 'config.py' file is also modified to set CFLAGS to activate the
> inline assembly code. I've integrated this new function into my project
> system, and it's quite a nice visual upgrade. I'm sure there are a lot
> of people who could use a relatively fast smooth scaling algorithm in
> the pygame software, so enjoy!
>
> Richard