Re: [Numpy-discussion] Are masked arrays slower for processing than ndarrays?

10 May 2009

      Eli Bressert wrote:
...
Hi,
I'm using masked arrays to compute large-scale standard deviation,
multiplication, gaussian, and weighted averages. At first I thought
using the masked arrays would be a great way to sidestep looping
(which it is), but it's still slower than expected. Here's a snippet
of the code that I'm using it for.
[...]
# Like the spatial_weight section, this takes about 20 seconds
W = spatial_weight / Rho2
# Takes less than one second.
Ave = np.average(av_good,axis=1,weights=W)
Any ideas on why it would take such a long time for processing?
A part of the slowdown is what looks to me like unnecessary copying in 
_MaskedBinaryOperation.__call__.  It is using getdata, which applies 
numpy.array to its input, forcing a copy.  I think the copy is actually 
unintentional, in at least one sense, and possibly two: first, because 
the default argument of getattr is always evaluated, even if it is not 
needed; and second, because the call to np.array is used where 
np.asarray or equivalent would suffice.

The first file attached below shows the kernprof in the case of 
multiplying two masked arrays, shape (100000,50), with no masked 
elements; 2/3 of the time is taken copying the data.

Now, if there are actually masked elements in the arrays, it gets much 
worse: see the second attachment.  The total time has increased by more 
than a factor of 3, and the culprit is numpy.which(), a very slow 
function.  It looks to me like it is doing nothing useful at all; the 
numpy binary operation is still being executed for all elements, 
regardless of mask, contrary to the intention implied by the comment in 
the code.

The third attached file has a patch that fixes the getdata problem and 
eliminates the which().
With this patch applied we get the profile in the 4th file, to be 
compared to the second profile.  Much better.  I am pretty sure it could 
still be sped up quite a bit, though.  It looks like the masks are 
essentially being calculated twice for no good reason, but I don't 
completely understand all the mask considerations, so at this point I am 
not trying to fix that problem.

Eric
...
Especially the spatial_weight and W variables? Would there be a faster
way to do this? Or is there a way that numpy.std can process ignore
nan's when processing?
Thanks,
Eli Bressert
_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
efiring@manini:~/test$ kernprof.py -lv profile_ma_mult.py 
Wrote profile results to profile_ma_mult.py.lprof
Timer unit: 1e-06 s

File: /usr/local/lib/python2.6/dist-packages/numpy/ma/core.py
Function: __call__ at line 689
Total time: 0.189224 s

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
   689                                               @profile
   690                                               def __call__ (self, a, b, *args, **kwargs):
   691                                                   "Execute the call behavior."
   692         1           52     52.0      0.0          m = mask_or(getmask(a), getmask(b), shrink=False)
   693         1       122614 122614.0     64.8          (da, db) = (getdata(a), getdata(b))
   694                                                   # Easy case: there's no mask...
   695         1            7      7.0      0.0          if m is nomask:
   696         1        66282  66282.0     35.0              result = self.f(da, db, *args, **kwargs)
   697                                                   # There are some masked elements: run only on the unmasked
   698                                                   else:
   699                                                       result = np.where(m, da, self.f(da, db, *args, **kwargs))
   700                                                   # Transforms to a (subclass of) MaskedArray if we don't have a scalar
   701         1           10     10.0      0.0          if result.shape:
   702         1          124    124.0      0.1              result = result.view(get_masked_subclass(a, b))
   703                                                       # If we have a mask, make sure it's broadcasted properly
   704         1           57     57.0      0.0              if m.any():
   705                                                           result._mask = mask_or(getmaskarray(a), getmaskarray(b))
   706                                                       # If some initial masks where not shrunk, don't shrink the result
   707         1            4      4.0      0.0              elif m.shape:
   708                                                           result._mask = make_mask_none(result.shape, result.dtype)
   709         1            4      4.0      0.0              if isinstance(a, MaskedArray):
   710         1           33     33.0      0.0                  result._update_from(a)
   711         1            4      4.0      0.0              if isinstance(b, MaskedArray):
   712         1           30     30.0      0.0                  result._update_from(b)
   713                                                   # ... or return masked if we have a scalar and the common mask is True
   714                                                   elif m:
   715                                                       return masked
   716         1            3      3.0      0.0          return result

efiring@manini:~/test$ kernprof.py -lv profile_ma_mult.py 
Wrote profile results to profile_ma_mult.py.lprof
Timer unit: 1e-06 s

File: /usr/local/lib/python2.6/dist-packages/numpy/ma/core.py
Function: __call__ at line 689
Total time: 0.649928 s

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
   689                                               @profile
   690                                               def __call__ (self, a, b, *args, **kwargs):
   691                                                   "Execute the call behavior."
   692         1        30012  30012.0      4.6          m = mask_or(getmask(a), getmask(b), shrink=False)
   693         1       120784 120784.0     18.6          (da, db) = (getdata(a), getdata(b))
   694                                                   # Easy case: there's no mask...
   695         1            7      7.0      0.0          if m is nomask:
   696                                                       result = self.f(da, db, *args, **kwargs)
   697                                                   # There are some masked elements: run only on the unmasked
   698                                                   else:
   699         1       418399 418399.0     64.4              result = np.where(m, da, self.f(da, db, *args, **kwargs))
   700                                                   # Transforms to a (subclass of) MaskedArray if we don't have a scalar
   701         1           10     10.0      0.0          if result.shape:
   702         1          111    111.0      0.0              result = result.view(get_masked_subclass(a, b))
   703                                                       # If we have a mask, make sure it's broadcasted properly
   704         1        25419  25419.0      3.9              if m.any():
   705         1        55096  55096.0      8.5                  result._mask = mask_or(getmaskarray(a), getmaskarray(b))
   706                                                       # If some initial masks where not shrunk, don't shrink the result
   707                                                       elif m.shape:
   708                                                           result._mask = make_mask_none(result.shape, result.dtype)
   709         1            7      7.0      0.0              if isinstance(a, MaskedArray):
   710         1           46     46.0      0.0                  result._update_from(a)
   711         1            4      4.0      0.0              if isinstance(b, MaskedArray):
   712         1           30     30.0      0.0                  result._update_from(b)
   713                                                   # ... or return masked if we have a scalar and the common mask is True
   714                                                   elif m:
   715                                                       return masked
   716         1            3      3.0      0.0          return result

Wrote profile results to profile_ma_mult.py.lprof
Timer unit: 1e-06 s

File: /usr/local/lib/python2.6/dist-packages/numpy/ma/core.py
Function: __call__ at line 692
Total time: 0.176606 s

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
   692                                               @profile
   693                                               def __call__ (self, a, b, *args, **kwargs):
   694                                                   "Execute the call behavior."
   695         1        30007  30007.0     17.0          m = mask_or(getmask(a), getmask(b), shrink=False)
   696         1           34     34.0      0.0          (da, db) = (getdata(a), getdata(b))
   697         1        65855  65855.0     37.3          result = self.f(da, db, *args, **kwargs)
   698                                                   # Transforms to a (subclass of) MaskedArray if we don't have a scalar
   699         1           11     11.0      0.0          if result.shape:
   700         1          136    136.0      0.1              result = result.view(get_masked_subclass(a, b))
   701                                                       # If we have a mask, make sure it's broadcasted properly
   702         1        25425  25425.0     14.4              if m.any():
   703         1        55048  55048.0     31.2                  result._mask = mask_or(getmaskarray(a), getmaskarray(b))
   704                                                       # If some initial masks where not shrunk, don't shrink the result
   705                                                       elif m.shape:
   706                                                           result._mask = make_mask_none(result.shape, result.dtype)
   707         1            7      7.0      0.0              if isinstance(a, MaskedArray):
   708         1           45     45.0      0.0                  result._update_from(a)
   709         1            4      4.0      0.0              if isinstance(b, MaskedArray):
   710         1           31     31.0      0.0                  result._update_from(b)
   711                                                   # ... or return masked if we have a scalar and the common mask is True
   712                                                   elif m:
   713                                                       return masked
   714         1            3      3.0      0.0          return result

Re: [Numpy-discussion] Are masked arrays slower for processing than ndarrays?

Eric Firing