Re: [PATCH] Re: [pygame] Smooth Scaling (pygame.transform.smoothscale)



Hi,

that's good :)

SDL does it by compiling the MMX code in if the compiler supports it.
Then it has runtime checks to see if the CPU supports it.  SDL also
has a configure flag, which you can use to tell it not to even try
compiling the MMX code.

So if the compiler doesn't support it, the C version is used.
If the compiler supports it, and the runtime CPU detection finds MMX,
then the MMX version is used; otherwise it falls back to the C version.
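
As a rough sketch of that decision logic (written in Python only to show
the flow; the real selection would happen in C, where SDL exposes its
runtime check as SDL_HasMMX() in SDL_cpuinfo.h). The function name and
both parameters here are hypothetical and are not part of SDL or pygame:

```python
def choose_scaler(mmx_compiled_in, cpu_has_mmx):
    """Return which routine to use: "mmx" or "c".

    mmx_compiled_in: hypothetical flag, True if the build included the
                     MMX code (compiler supported it and it was not
                     disabled at configure time).
    cpu_has_mmx:     hypothetical flag, result of a runtime CPU check.
    """
    if not mmx_compiled_in:
        # The MMX code is not in this build, so only C is available.
        return "c"
    if cpu_has_mmx:
        # Built with MMX and the CPU reports it: use the fast path.
        return "mmx"
    # Built with MMX, but this CPU lacks it: fall back to C.
    return "c"


if __name__ == "__main__":
    for compiled in (False, True):
        for detected in (False, True):
            print(compiled, detected, choose_scaler(compiled, detected))
```

The point is just that the compile-time check decides what is available,
and the runtime check decides what is actually used.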



On 6/20/07, Richard Goedeken <SirRichard@xxxxxxxxxxxxxxxxxxxxxxx> wrote:
Rene,

I would be willing to add the CPU detection functions but I can't think
of how it could be implemented in a useful way.  The compile-time checks
have to stay in because trying to compile the 64-bit code on a 32-bit
architecture, or the 32-bit MMX code on a PPC or similar, will cause
compile-time errors.  So it's a given that if someone is running an i386,
PPC, Sun, ARM, etc., they will get the C code.  If they're running i686,
they'll get the 32-bit MMX, and for x86_64 they'll get the 64-bit MMX.

So, the dilemma is that whatever build a user is running, the code is
pretty much guaranteed to work on their CPU.  If someone is running the
i686 build on a 486 or something silly like that, they'll probably have
bigger problems.  We could allow someone to 'downgrade' and run the C
code when their CPU supports MMX, but what's the point?

Regards,
Richard



René Dudfield wrote:
> Nice one!
>
> This sounds like a very nice scaling function.
>
> It'd be cool if we could include a runtime way of selecting MMX and
> other CPU-specific optimizations.  Probably using the SDL methods would
> be the way to go.
>
> I've added it to the todo list for this week's mini sprint:
> http://www.pygame.org/wiki/todo  So hopefully it'll get into pygame soon.
>
> If you feel like figuring out how to use the SDL MMX detection routines
> to select the MMX routine at runtime, that'd be cool.
>
>
> On 6/18/07, Richard Goedeken <SirRichard@xxxxxxxxxxxxxxxxxxxxxxx> wrote:
>
>     Hello everyone.  I just joined the list; my name is Richard Goedeken.
>     I'm using Pygame in a project that I've been working on for a few weeks,
>     and I wanted an image scaling function with higher visual quality than
>     the nearest-neighbor algorithm which is included with the 'scale'
>     function.  So I wrote one; it's in the attached zip file. I hereby give
>     the Pygame maintainers permission to include and distribute this code
>     with the Pygame project under the license of their choice.
>
>     The algorithm which I've implemented is interesting.  Each axis is
>     scaled independently, which gives it the property that scaling an image
>     only in the X dimension or only in the Y dimension will be about twice
>     as fast as scaling both.  This design was chosen because the axes are
>     scaled differently depending upon whether they are being shrunk or
>     expanded.  For expansion, a bilinear filter is used, which is quick and
>     looks nice at magnifications under 3x or so.  For shrinking the image,
>     a novel area-averaging algorithm is used which suppresses Moiré
>     patterns and looks good even at very small sizes.
>
>     The source code is in transform.c.  It's pretty big because I've also
>     included inline MMX routines for the i686 and x86_64 architectures under
>     Unix.  The AT&T-style asm syntax won't work with the Intel or MS
>     compilers, but someone could translate it and add Intel-style code for
>     Win32.  It runs a lot faster with the MMX code.  I have included a test
>     program (scaletest.py) which can run a short benchmark series of scaling
>     operations.  When run with a 600k pixel image, I got the following
>     results:
>
>     Machine         Algorithm    Code level   Shrink time   Expand time
>     Athlon64 3800+  smoothscale  C-only       36 ms         96 ms
>     Athlon64 3800+  smoothscale  64-bit MMX   5 ms          16 ms
>     Athlon64 3800+  scale        C-only       2 ms          13 ms
>     Pentium 3-800   smoothscale  C-only       64 ms         180 ms
>     Pentium 3-800   smoothscale  32-bit MMX   39 ms         119 ms
>     Pentium 3-800   scale        C-only       17 ms         85 ms
>
>     I was surprised that the MMX ran so much (6x) faster than the C-code on
>     my 64-bit machine.  But I'm happy that it actually comes close to
>     matching the nearest-neighbor 'scale' function.  I think the P-3 may
>     have been hindered by relatively low memory bandwidth.  With newer
>     32-bit architectures such as the Core 2 or Athlon I believe that the MMX
>     will give a bigger speed gain over the C than it does on the P-3.
>
>     The 'config.py' file is also modified to set CFLAGS to activate the
>     inline assembly code.  I've integrated this new function into my project
>     system, and it's quite a nice visual upgrade.  I'm sure there are a lot
>     of people who could use a relatively fast smooth scaling algorithm in
>     the pygame software, so enjoy!
>
>     Richard
>
>
>
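
For anyone wanting to see the per-axis idea from the quoted message in
miniature, here is a small pure-Python sketch. It is not the code from
transform.c: it works on grayscale values held in lists of floats, all
function names are made up for illustration, and the expansion step
assumes the first and last source samples map onto the first and last
output samples, which may differ from what the real routine does.

```python
def shrink_axis(src, dst_len):
    """Area-average a 1-D list of samples down to dst_len samples."""
    n = len(src)
    if not 0 < dst_len <= n:
        raise ValueError("dst_len must be between 1 and len(src)")
    ratio = n / dst_len  # source samples covered by each output sample
    out = []
    for j in range(dst_len):
        start, end = j * ratio, (j + 1) * ratio
        total = 0.0
        i = int(start)
        while i < n and i < end:
            # Weight each source sample by how much of it lies in [start, end).
            overlap = min(i + 1, end) - max(i, start)
            total += src[i] * overlap
            i += 1
        out.append(total / ratio)
    return out


def expand_axis(src, dst_len):
    """Linearly interpolate a 1-D list of samples up to dst_len samples."""
    n = len(src)
    if n == 0 or dst_len < n:
        raise ValueError("dst_len must be at least len(src)")
    if n == 1:
        return [float(src[0])] * dst_len
    step = (n - 1) / (dst_len - 1)  # assumption: endpoints map to endpoints
    out = []
    for j in range(dst_len):
        pos = j * step
        i = min(int(pos), n - 2)  # left neighbour, clamped at the far edge
        frac = pos - i
        out.append(src[i] * (1 - frac) + src[i + 1] * frac)
    return out


def scale_axis(src, dst_len):
    """Pick the filter for one axis based on shrink versus expand."""
    if dst_len < len(src):
        return shrink_axis(src, dst_len)
    return expand_axis(src, dst_len)


def smooth_scale(image, new_w, new_h):
    """Scale a list of rows: X pass on rows, then Y pass on columns."""
    rows = [scale_axis(row, new_w) for row in image]
    cols = [scale_axis(list(col), new_h) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


if __name__ == "__main__":
    print(smooth_scale([[0, 2, 4, 6], [0, 2, 4, 6]], 2, 3))
```

Because each axis is handled by its own pass, the same call can shrink
the width while expanding the height, as in the example at the bottom.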