
On May 9, 2009, at 8:17 PM, Eric Firing wrote:
Eric Firing wrote:
Part of the slowdown is what looks to me like unnecessary copying in _MaskedBinaryOperation.__call__. It is using getdata, which applies numpy.array to its input, forcing a copy. I think the copy is actually unintentional, in at least one sense, and possibly two: first, because the default argument of getattr is always evaluated, even if it is not needed; and second, because np.array is called where np.asarray or an equivalent would suffice.
Yep, good call. The try/except should be better, and yes, I forgot to force copy=False (I thought it was on by default...). I didn't know that getattr always evaluated the default; the docs are scarce on that subject...
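A minimal sketch of both problems and the fix, for illustration only (the function names are made up; this is not the actual numpy.ma source):

    import numpy as np

    def getdata_slow(a):
        # getattr evaluates its default argument unconditionally, so
        # np.array(a) runs (and copies, since copy=True is np.array's
        # default) even when a._data exists and the fallback is unused.
        return getattr(a, '_data', np.array(a))

    def getdata_fast(a):
        # The try/except only builds the fallback when it is actually
        # needed, and np.asarray does not copy an input that is
        # already an ndarray.
        try:
            return a._data
        except AttributeError:
            return np.asarray(a)

With a plain ndarray, getdata_slow pays for a full copy on every call; with a masked array, it still evaluates (and throws away) the copy in the default argument. getdata_fast avoids both.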
Pierre,
... I pressed "send" too soon. There are test failures with the patch I attached to my last message. I think the basic ideas are correct, but evidently there are wrinkles to be worked out. Maybe putmask() has to be used instead of where() (putmask is much faster) to maintain the ability to do *= and similar, and maybe there are other adjustments. Somehow, though, it should be possible to get decent speed for simple multiplication and division; a 10x penalty relative to ndarray operations is just too much.
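For reference, the difference that matters for in-place operations, as a quick sketch (assuming standard numpy semantics):

    import numpy as np

    a = np.arange(5.0)
    m = np.array([False, True, False, True, False])

    # where() builds and returns a fresh array; a itself is untouched,
    # so it cannot express an in-place update like a *= b directly.
    out = np.where(m, -1.0, a)

    # putmask() writes into its first argument in place, which is
    # exactly what augmented assignments (*=, /=, ...) need.
    np.putmask(a, m, -1.0)    # a is now [0., -1., 2., -1., 4.]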
Quite agreed. It was a shock to realize that we were that slow. I'm gonna have to start testing w/ large arrays...
I'm confident we can significantly speed up the _MaskedOperations without losing any of the features. Yes, putmask may be a better option. We could probably use the following MO:
* result = a.data/b.data
* putmask(result, m, a)
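Fleshed out, that recipe might look something like the sketch below (fast_masked_op is a hypothetical name, and a real implementation would also need to fold domain violations, e.g. division by zero, into the mask):

    import numpy as np

    def fast_masked_op(op, a, b):
        da, db = np.ma.getdata(a), np.ma.getdata(b)
        m = np.ma.mask_or(np.ma.getmask(a), np.ma.getmask(b))
        # Run the raw ndarray operation once over all the data.
        with np.errstate(divide='ignore', invalid='ignore'):
            result = op(da, db)
        # Overwrite the masked slots with a's original data so no
        # invalid values leak out from under the mask.
        if m is not np.ma.nomask:
            np.putmask(result, m, da)
        return np.ma.masked_array(result, mask=m)

    a = np.ma.masked_array([1., 2., 3.], mask=[False, True, False])
    b = np.ma.masked_array([4., 5., 6.])
    fast_masked_op(np.multiply, a, b)   # masked_array([4.0, --, 18.0])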
However, I'm gonna need a good couple of weeks before I can really look into it...