
Re: [pygame] Proposed additions to Transform: connected components, upper and lower thresholding, and centroids



Hello,

A few notes below.


On Tue, Jun 17, 2008 at 10:49 AM, Nirav Patel <olpc@xxxxxxxxxxxxxx> wrote:
> As part of my project to add computer vision stuff to pygame, I'd like
> to write a function or functions that do the following.
>
> For vision purposes, it would be very useful to have thresholding with
> both upper and lower boundaries, returning both the number of pixels
> within the threshold and the centroid of those pixels.  This is a
> trivial addition to the existing transform.threshold() function, but
> is it acceptable to modify the input options and the output of an
> existing function?  Would it break compatibility with existing pygame
> games?  Would it make sense to have a second function so similar to an
> existing one?
>

You could modify the existing function as long as the old functionality
stays the same, probably by adding another argument with a default
value.  We try not to break existing functionality.

I think the current one takes a single distance from the color, so
that one value acts as both the lower and the upper threshold.  I'm
just wondering if it could already be used to do what you want?
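
To make the comparison concrete, here is a rough pure-Python sketch of
thresholding with separate lower and upper bounds that also returns the
pixel count and centroid.  It is not how transform.threshold() is
implemented; the function names (threshold_stats, bounds_from_distance)
and the nested-list image format are made up for illustration.  It also
shows how a single color plus a distance reduces to symmetric bounds.

```python
def threshold_stats(pixels, lower, upper):
    """Count pixels whose channels all lie in [lower, upper]; return their centroid.

    `pixels` is a list of rows, each row a list of (r, g, b) tuples
    (a hypothetical stand-in for a Surface).
    Returns (count, centroid), where centroid is (x, y) or None if nothing matched.
    """
    count = 0
    sum_x = 0
    sum_y = 0
    for y, row in enumerate(pixels):
        for x, color in enumerate(row):
            if all(lo <= c <= hi for c, lo, hi in zip(color, lower, upper)):
                count += 1
                sum_x += x
                sum_y += y
    if count == 0:
        return 0, None
    return count, (sum_x / count, sum_y / count)


def bounds_from_distance(color, distance):
    """Turn one color plus a per-channel distance into clamped lower/upper bounds."""
    lower = tuple(max(0, c - d) for c, d in zip(color, distance))
    upper = tuple(min(255, c + d) for c, d in zip(color, distance))
    return lower, upper


# A 3x2 test image: two reddish pixels on a black background.
image = [
    [(0, 0, 0), (250, 10, 10), (0, 0, 0)],
    [(0, 0, 0), (240, 20, 0), (0, 0, 0)],
]
lower, upper = bounds_from_distance((245, 10, 10), (10, 10, 10))
count, centroid = threshold_stats(image, lower, upper)
```

Asymmetric bounds (say, a tight lower limit and a loose upper one) are
the case a single distance cannot express, which is where a new default
argument would help.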

> The other function, which is also similar (and could even just be an
> option in thresholding), is thresholding with connected component
> detection.  This would involve supplying an upper and lower threshold,
> a Surface, and optionally a mask.  The function would find the largest
> blob of pixels in the Surface within the threshold, make a mask of
> those pixels if desired, and return the centroid and number of pixels
> in the blob.
>

Currently this can sort of be done by making a mask from the
thresholded image.  Mask has a get_bounding_rects() function, which
returns a bounding rect for each connected region.  Running it on a
mask turns out to be fast because you process far less data, as a
mask is 1 bit per pixel.  Then you can sort the bounding rects by
size to find the largest one.

I'm not sure if that will be suitable for your task though, but I
think maybe you could do things this way.
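
As an illustration of the idea, here is a minimal pure-Python sketch
that finds each connected blob in a 0/1 grid, records its pixel count
and bounding box, and picks the largest.  The grid of lists is only a
stand-in for a real Mask, the function name component_boxes is
hypothetical, and the sketch assumes 4-connectivity (up, down, left,
right); pygame's own code may well differ on all of these.

```python
from collections import deque


def component_boxes(grid):
    """Return (size, (left, top, width, height)) for each 4-connected blob of 1s.

    `grid` is a list of rows of 0/1 values, standing in for a 1-bit mask.
    """
    height = len(grid)
    width = len(grid[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    blobs = []
    for y0 in range(height):
        for x0 in range(width):
            if not grid[y0][x0] or seen[y0][x0]:
                continue
            # Breadth-first flood fill from this unvisited set pixel.
            seen[y0][x0] = True
            queue = deque([(x0, y0)])
            size = 0
            left = right = x0
            top = bottom = y0
            while queue:
                x, y = queue.popleft()
                size += 1
                left, right = min(left, x), max(right, x)
                top, bottom = min(top, y), max(bottom, y)
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and grid[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            blobs.append((size, (left, top, right - left + 1, bottom - top + 1)))
    return blobs


grid = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 1],
    [0, 1, 0, 0, 0],
]
blobs = component_boxes(grid)
# Pick the blob whose bounding box covers the largest area.
largest = max(blobs, key=lambda blob: blob[1][2] * blob[1][3])
```

Note that the largest bounding box is not always the blob with the most
pixels (a thin diagonal line has a big box), so keeping the pixel count
alongside each box lets you choose either criterion.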


> It could also be useful to have multiple connected component
> detection, for "multi-touch" without having to use different colored
> objects (or if you are using IR LEDs like the Wii does), but I'm not
> sure how to handle that in a single pass of the array.  Actually, I'm
> not really sure how I'm going to handle both detection and creating a
> mask in a single pass either.  It may be necessary to store the
> starting pixel, ending pixel, and size of each connected component on
> the first pass, keeping track of which was the largest yet, and then
> have a shorter second pass to create the mask that only starts at the
> starting pixel and ends at the ending pixel.
>

Multiple areas can be found like above with Mask.get_bounding_rects().

Doing everything in one pass is hard... but if you reduce the data
down -- by using a mask -- then the second pass can act on 32x less
data.

E.g., for a 1024x1024 image at 4 bytes per pixel:
>>> (1024 * 1024 * 4) / 32.
131072.0

So that is 4MB on the first pass, down to about 131KB (one bit per
pixel) of data to process on the second pass.



> Any comments, reality checks, questions, or suggestions would be
> greatly appreciated.
>
> Nirav
>