[Numpy-discussion] Question about numpy.ma masking

Gökhan Sever gokhansever at gmail.com
Mon May 10 12:17:58 EDT 2010


On Sun, May 9, 2010 at 2:42 PM, Eric Firing <efiring at hawaii.edu> wrote:

>
> The mask attribute can be a full array, or it can be a scalar to
> indicate that nothing is masked.  This is an optimization in masked
> arrays; it adds complexity, but it can save space and/or processing
> time. You can always access a full mask array by using
> np.ma.getmaskarray().  Or you can ensure the internal mask is an array,
> not a scalar, by using the shrink=False kwarg when making the masked
> array with np.ma.array().
>

shrink=False fits my use case perfectly. I was guessing that leaving the
mask as a scalar had something to do with optimization. Probably not many
people write loops and check conditions based on the mask content
like I do :) I hope someone at SciPy 2010 will present a numpy.ma talk or
tutorial describing all the nitty-gritty details of using the module.
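
For anyone following along, here is a minimal sketch of why the scalar mask
gets in the way of a loop, and how np.ma.getmaskarray() sidesteps it. The
array name and values are made up for illustration, not taken from my data.

```python
import numpy as np

# Hypothetical temperature samples; nothing is masked here.
temps = np.ma.array([21.5, 22.0, 22.4])

# With no masked entries, the stored mask is the scalar np.ma.nomask,
# so it cannot be indexed element by element inside a loop.
mask_is_scalar = np.ma.getmask(temps) is np.ma.nomask

# getmaskarray() always returns a boolean array with the data's shape.
full_mask = np.ma.getmaskarray(temps)

# A per-element check now works whether or not anything is masked.
valid = [float(temps[i]) for i in range(temps.size) if not full_mask[i]]
```

Going through getmaskarray() keeps the loop body the same for arrays with
and without masked entries.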


>
> Offhand, I suspect your loop can be eliminated by vectorization.
> Something like this:
>
> ns = len(shorter)
> slice0 = slice(ns)
> slice1 = slice(diff, diff+ns)
> cond1 = serialh.data['dccnTempSF'][slice0] != 0
> cond2 = np.ma.getmaskarray(basic.data['Air_Temp'][slice1]) == False
> cond = cond1 & cond2
> dccnConAmb[slice0][cond] = (serialc.data['dccnConc'][slice0][cond] *
>                            physical.data['STATIC_PR'][slice1][cond])
>

Bonus help :) My Gmail has over 400 Python-tagged e-mails collected over a
year. Most of the time I get responses here (on mailing lists in general)
faster than I get them locally around my department. This (especially the
no-appointments feature) doubles or triples my learning experience. Just a
personal thanks to you and all who make these great media possible.

Anyway, back to the topic. The snippet I shared is about a year old,
from the time when I didn't know much about vectorization. Your version
looks good to my eyes, but it is a little harder to read in general. Also, I
don't know how you would debug this code. Sometimes I need to pause the
execution of a script, step through it line by line, and see how the
values change in each iteration.
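
One possible way to keep a vectorized version inspectable is to give every
intermediate condition its own name, so each one can be printed or examined
at a breakpoint. The sketch below uses small made-up arrays as stand-ins for
the real instrument data (all names and values are hypothetical) and
cross-checks the result against a plain loop.

```python
import numpy as np

# Hypothetical stand-ins for the instrument arrays; values are made up.
temp_sf = np.array([0.0, 1.2, 1.3, 0.0, 1.1])
air_temp = np.ma.masked_values([20.0, -999.0, 21.0, 22.0, 23.0], -999.0)
conc = np.array([100.0, 110.0, 120.0, 130.0, 140.0])
pressure = np.array([0.9, 0.9, 0.8, 0.8, 0.7])

# Each condition is a boolean array that can be inspected on its own.
cond1 = temp_sf != 0
cond2 = ~np.ma.getmaskarray(air_temp)
cond = cond1 & cond2

# Indices that pass both conditions, handy when stepping in a debugger.
selected = np.flatnonzero(cond)

# Vectorized assignment on the selected elements only.
result = np.zeros_like(conc)
result[cond] = conc[cond] * pressure[cond]

# Equivalent explicit loop, kept only as a cross-check.
air_mask = np.ma.getmaskarray(air_temp)
expected = np.zeros_like(conc)
for i in range(conc.size):
    if temp_sf[i] != 0 and not air_mask[i]:
        expected[i] = conc[i] * pressure[i]
```

Printing cond1, cond2, and selected shows which elements were picked,
which is roughly the information stepping through a loop would give.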

Lastly, in case someone is curious what I am after: dccnConAmb is my CCN
concentration normalized at ambient pressure and temperature, which I use to
estimate the C and k parameters of a power-law relationship with SciPy's
curve_fit().
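
As a rough sketch of that last step, a power-law fit with curve_fit() could
look like the following. The form N = C * S**k with S as supersaturation is
an assumption for illustration, and the data are synthetic, noise-free
values rather than real measurements.

```python
import numpy as np
from scipy.optimize import curve_fit


def power_law(s, c, k):
    """Assumed model: concentration = c * s**k."""
    return c * s**k


# Synthetic inputs generated from known parameters (made-up values).
supersat = np.array([0.2, 0.4, 0.6, 0.8, 1.0])
true_c, true_k = 500.0, 0.7
conc = power_law(supersat, true_c, true_k)

# Fit c and k, starting from a rough initial guess.
(c_fit, k_fit), _ = curve_fit(power_law, supersat, conc, p0=(100.0, 1.0))
```

With real, noisy data the second return value (the covariance matrix) is
worth keeping, since it gives an idea of the uncertainty in C and k.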


>
> Eric
>
>
Gökhan