Re: [Numpy-discussion] How to median filter a masked array?

At 9:50 AM -0700 2004-07-14, Paul F. Dubois wrote:
> The median filter is prepared to take an argument of a numarray array but ignorant of and unprepared to deal with masked values. Using the __array__ trick, both Numeric.MA and numarray.ma would 'know' this and therefore replace the missing values in the filter's argument with the 'fill value' for that type -- a big number in the case of real arrays. You could explicitly choose that value (say using the overall median of the data m) by passing x.filled(m) rather than x to the filter.
> If there is no such value, you probably do have to do it in C. If you wrote it in C, how would you treat missing elements? BTW it wouldn't be that hard; just pass both the array and its mask as separate elements to a C routine and use SWIG to hook it up.
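As a rough illustration of the x.filled(m) suggestion, here is a minimal sketch using NumPy's numpy.ma module (the present-day descendant of Numeric.MA and numarray.ma) rather than numarray itself. The median3 helper is hypothetical: it merely stands in for any filter that knows nothing about masks.

```python
import numpy as np

def median3(a):
    """Hypothetical mask-unaware 3-point median filter (edge values replicated)."""
    padded = np.pad(np.asarray(a, dtype=float), 1, mode="edge")
    windows = np.stack([padded[:-2], padded[1:-1], padded[2:]])
    return np.median(windows, axis=0)

# One bad sample (99.0) is flagged as masked.
x = np.ma.masked_array([1.0, 2.0, 99.0, 4.0, 5.0],
                       mask=[False, False, True, False, False])

# Overall median of the unmasked data, used as the explicit fill value.
m = float(np.ma.median(x))

# x.filled() with no argument would substitute the type's default fill value
# (a very large number for floats); x.filled(m) substitutes m instead.
filtered = median3(x.filled(m))
```

Whether the overall median is a sensible stand-in depends on the data; for an image with strong gradients a local estimate may be more appropriate.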
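One possible answer to "how would you treat missing elements?" is sketched below in Python rather than C, with the array and its mask passed as separate arguments as suggested. The function name, the 1-D restriction, and the convention that True means "masked" are all assumptions made for illustration. Masked samples are simply left out of each window, windows are truncated at the ends of the array, and an output mask marks positions where no unmasked sample was available.

```python
import numpy as np

def masked_median_filter(data, mask, size=3):
    """1-D median filter that ignores masked samples (mask True = ignore).

    Returns (filtered, out_mask); out_mask is True where the whole
    window was masked and no median could be computed.
    """
    data = np.asarray(data, dtype=float)   # integer input is converted to float
    mask = np.asarray(mask, dtype=bool)
    half = size // 2
    filtered = np.zeros(data.shape, dtype=float)
    out_mask = np.zeros(data.shape, dtype=bool)
    for i in range(data.size):
        # Truncate the window at the array ends instead of padding.
        lo, hi = max(0, i - half), min(data.size, i + half + 1)
        window = data[lo:hi][~mask[lo:hi]]
        if window.size:
            filtered[i] = np.median(window)
        else:
            out_mask[i] = True
    return filtered, out_mask

filtered, out_mask = masked_median_filter(
    [1, 2, 99, 4, 5], [False, False, True, False, False])
```

The explicit loop is slow for large arrays; it is meant to show the logic a C routine would implement, not to be fast.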
I already have routines that handle masked data in C to create radial profiles from 2-d integer data (since I could not figure out how to do that in numarray). I chose to pass the mask as a separate array, since I could not find any C interface for numarray.ma and since NaN made no sense for integer data. That code was pretty straightforward.

I wish I could have found a simple way to support multiple array types. I thought using C++ with prototypes would be the ticket, but absent any examples and after looking through the numarray code, I gave up and took the easy way out. (I didn't use SWIG, though; I just hand-coded everything. Maybe that was a mistake.) I confess that makes me worry about the underpinnings of numarray. It seems an obvious candidate to be written in C++ with prototypes. I hate to think what the developers have to go through instead.

In any case, writing a median filter is a bigger deal than taking a radial profile, and since one already existed I thought I'd ask.
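For comparison, a masked radial profile with the mask passed as a separate array can be sketched in a few lines of present-day NumPy. This is not the C code described above: the function name, the one-pixel-wide rounded radius bins, and the True-means-masked convention are assumptions. It returns per-bin sums and counts so that integer data never needs a NaN placeholder.

```python
import numpy as np

def radial_profile(image, mask, center):
    """Per-radius sums and counts of unmasked pixels (mask True = ignore).

    center is (row, col); pixels are binned by radius rounded to the
    nearest integer. Returns (sums, counts), one entry per radius bin.
    """
    image = np.asarray(image)
    good = ~np.asarray(mask, dtype=bool)
    rows, cols = np.indices(image.shape)
    radius = np.hypot(rows - center[0], cols - center[1])
    bins = np.rint(radius).astype(int)
    nbins = int(bins.max()) + 1
    counts = np.bincount(bins[good], minlength=nbins)
    sums = np.bincount(bins[good], weights=image[good], minlength=nbins)
    return sums, counts

image = np.arange(9).reshape(3, 3)      # small integer test image
mask = np.zeros((3, 3), dtype=bool)
mask[2, 2] = True                       # ignore the bottom-right pixel
sums, counts = radial_profile(image, mask, center=(1, 1))
```

The mean profile is sums / counts wherever counts is nonzero; bins with a count of zero contain no unmasked pixels.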
> I doubt NaN would help you here; you'd still have to figure out what to do in those places. Numeric did not have support for NaN because there were portability problems. Probably still are. And you still are stuck in a lot of cases anyway.
Well, NaN isn't very general in any case, since it's meaningless for integer data. So maybe that's a red herring. (Though if NaN had worked to mask data I would cheerfully have converted my images to floats to take advantage of it!)

What's really wanted is a more unified approach to masked data. I suppose it's pie in the sky, but I sure wish most of the numarray functions took an optional mask array (or accepted a numarray.ma object -- nice for the user, but probably too painful for words under the hood).

I don't think there are major issues with what to do with masked data. Simply ignoring it works in most cases, e.g. mean, std dev, sum, max... In some cases one needs the new mask as output (e.g. matrix multiply). Filtering is a bit subtle: can masked data be treated the same as data off the edge? I hope so, but I'm not sure.

Anyway, I am grateful for what we do have. Without Numeric or numarray I would have to write all my image processing code in a different language.

-- Russell
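The "simply ignore it" behaviour for reductions can be illustrated with NumPy's numpy.ma module, used here as a stand-in for numarray.ma (an assumption; the numarray API of the time may have differed in detail). The sample values are made up.

```python
import numpy as np

# Integer data with one masked sample; no NaN is involved.
x = np.ma.masked_array([1, 2, 99, 4, 5],
                       mask=[False, False, True, False, False])

n_good = x.count()   # number of unmasked samples
total = x.sum()      # masked sample is skipped
mean = x.mean()
peak = x.max()
spread = x.std()     # population standard deviation of unmasked samples
```

Each reduction uses only the four unmasked values, so the masked 99 affects none of the results.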
Russell E Owen