[Numpy-discussion] python array

Brett Olsen brett.olsen at gmail.com
Thu Mar 13 22:07:07 EDT 2014


The difference appears to arise because the boolean selection is evaluated
on the underlying data: it pulls out every value <= 0.5 whether or not it is
masked, and then carries the corresponding mask entries over to the new
array.  So the arrays selected from r2010 and bt contain identical unmasked
values but different numbers of masked values.  Because the fill value
behind your masked entries was a large negative number, those entries pass
the threshold test in r2010 and are carried over.  In bt, you've taken the
absolute value of the data array, so those fill values are now large and
positive, fail the test, and are no longer carried over into the indexed
array.
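
The effect described above can be reproduced without the netCDF file. The
following is only a sketch under stated assumptions: the fill value -9999.0
and the sample numbers are hypothetical stand-ins, not the contents of
r2010.nc. It counts threshold hits on the underlying data with and without
taking the mask into account.

```python
import numpy as np
import numpy.ma as ma

# Hypothetical stand-in for the netCDF variable: -9999.0 marks missing data.
FILL = -9999.0
raw = np.array([0.2, FILL, 0.4, 1.5, FILL, 3.0, FILL, 0.1])
r = ma.masked_array(raw, mask=(raw == FILL))

valid = ~ma.getmaskarray(r)   # True where the element is NOT masked

# Threshold test applied to the underlying data, ignoring the mask.
hits_raw = np.count_nonzero(r.data < 0.5)          # fill values pass: -9999 < 0.5
hits_abs = np.count_nonzero(np.abs(r.data) < 0.5)  # fill values fail: 9999 > 0.5

# Threshold test restricted to unmasked elements.
valid_raw = np.count_nonzero((r.data < 0.5) & valid)
valid_abs = np.count_nonzero((np.abs(r.data) < 0.5) & valid)

print(hits_raw, hits_abs)    # 6 3
print(valid_raw, valid_abs)  # 3 3
```

The two raw counts differ only by the three fill values; once the mask is
applied, both selections contain the same three valid elements.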

Because the selected arrays are still masked, their statistical properties
(min, max, and so on) are identical; only their sizes differ, since one
contains many more masked values than the other.  I don't think this should
be a problem for your computations.  If you're concerned, you could always
explicitly unmask them (drop the masked entries) before your computations.
See the example below.

~Brett

In [61]: import numpy as np

In [62]: import numpy.ma as ma

In [65]: a = np.arange(-8, 8).reshape((4, 4))

In [66]: a
Out[66]:
array([[-8, -7, -6, -5],
       [-4, -3, -2, -1],
       [ 0,  1,  2,  3],
       [ 4,  5,  6,  7]])

In [68]: b = ma.masked_array(a, mask=a < 0)

In [69]: b
Out[69]:
masked_array(data =
 [[-- -- -- --]
 [-- -- -- --]
 [0 1 2 3]
 [4 5 6 7]],
             mask =
 [[ True  True  True  True]
 [ True  True  True  True]
 [False False False False]
 [False False False False]],
       fill_value = 999999)

In [70]: b.data
Out[70]:
array([[-8, -7, -6, -5],
       [-4, -3, -2, -1],
       [ 0,  1,  2,  3],
       [ 4,  5,  6,  7]])

In [71]: c = abs(b)

In [72]: c[c <= 4].shape
Out[72]: (9L,)

In [73]: b[b <= 4].shape
Out[73]: (13L,)

In [74]: b[b <= 4]
Out[74]:
masked_array(data = [-- -- -- -- -- -- -- -- 0 1 2 3 4],
             mask = [ True  True  True  True  True  True  True  True False False False False False],
       fill_value = 999999)


In [75]: c[c <= 4]
Out[75]:
masked_array(data = [-- -- -- -- 0 1 2 3 4],
             mask = [ True  True  True  True False False False False False],
       fill_value = 999999)
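
The explicit unmasking step mentioned earlier is not shown in the session
above. A minimal sketch, reusing the same toy array: either drop the masked
entries with compressed() before selecting, or build a boolean index that is
False wherever the array is masked. The variable names here are
illustrative only.

```python
import numpy as np
import numpy.ma as ma

a = np.arange(-8, 8).reshape((4, 4))
b = ma.masked_array(a, mask=a < 0)

# Option 1: drop masked entries first, then select on a plain 1-D ndarray.
valid = b.compressed()            # unmasked values only: 0..7
small = valid[valid <= 4]

# The same selection after abs() now gives the same size.
valid_abs = abs(b).compressed()
small_abs = valid_abs[valid_abs <= 4]

# Option 2: a boolean index that is False wherever b is masked.
idx = (b.data <= 4) & ~ma.getmaskarray(b)
small_idx = b.data[idx]

print(small)        # [0 1 2 3 4]
print(small.shape)  # (5,)
```

With either option the selected arrays have the same length whether or not
abs() was applied first, because masked entries never enter the selection.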


On Thu, Mar 13, 2014 at 8:14 PM, Sudheer Joseph <sudheer.joseph at yahoo.com> wrote:

> Sorry,
>            The solution below, which I thought was working, was not working;
> it was just giving the array size.
>
> --------------------------------------------
> On Fri, 14/3/14, Sudheer Joseph <sudheer.joseph at yahoo.com> wrote:
>
>  Subject: Re: [Numpy-discussion] python array
>  To: "Discussion of Numerical Python" <numpy-discussion at scipy.org>
>  Date: Friday, 14 March, 2014, 1:09 AM
>
>  Thank you very much Nicolas and Chris,
>
>  The hint was helpful, and from it I tried the steps below (a crude
>  way, I would say) and am getting the same result now.
>
>  I have been using the abs available by default, and it is the
>  same as numpy.absolute (I checked).
>
>  nr= ((r2010>r2010.min()) & (r2010<r2010.max()))
>  nr[nr<.5].shape
>  Out[25]: (33868,)
>  anr=numpy.absolute(nr)
>  anr[anr<.5].shape
>  Out[27]: (33868,)
>
>  The approach I used may have a problem when the mask used has values
>  which can affect the min/max operation.
>
>  So I would like to know if there is a standard, formal
>  (python/numpy) way to handle masked arrays when they need to
>  be subjected to boolean operations.
>
>  with best regards,
>  Sudheer
>
>
>  ***************************************************************
>  Sudheer Joseph
>  Indian National Centre for Ocean Information Services
>  Ministry of Earth Sciences, Govt. of India
>  POST BOX NO: 21, IDA Jeedeemetla P.O.
>  Via Pragathi Nagar,Kukatpally, Hyderabad; Pin:5000 55
>  Tel:+91-40-23886047(O),Fax:+91-40-23895011(O),
>  Tel:+91-40-23044600(R),Tel:+91-40-9440832534(Mobile)
>  E-mail:sjo.India at gmail.com;sudheer.joseph at yahoo.com
>  Web- http://oppamthadathil.tripod.com
>  ***************************************************************
>
>  --------------------------------------------
>  On Thu, 13/3/14, Chris Barker - NOAA Federal <chris.barker at noaa.gov>
>  wrote:
>
>   Subject: Re: [Numpy-discussion] python array
>   To: "Discussion of Numerical Python" <numpy-discussion at scipy.org>
>   Date: Thursday, 13 March, 2014, 11:53 PM
>
>   On Mar 13, 2014, at 9:39 AM, Nicolas Rougier <Nicolas.Rougier at inria.fr> wrote:
>
>   >
>   > Seems to be related to the masked values:
>
>   Good hint -- a masked array keeps the "junk" values in the
>   main array.
>
>   What "abs" are you using -- it may not be mask-aware. (You
>   want a numpy abs anyway.)
>
>   Also -- I'm not sure I know what happens with Boolean operators on
>   masked arrays when you use them to index. I'd investigate that.
>   (Sorry, not at a machine I can play with now.)
>
>   Chris
>
>
>   > print r2010[:3,:3]
>   > [[-- -- --]
>   > [-- -- --]
>   > [-- -- --]]
>   >
>   > print abs(r2010)[:3,:3]
>   > [[-- -- --]
>   > [-- -- --]
>   > [-- -- --]]
>   >
>   >
>   > print r2010[ r2010[:3,:3] <0 ]
>   > [-- -- -- -- -- -- -- -- --]
>   >
>   > print r2010[ abs(r2010)[:3,:3] < 0]
>   > []
>   >
>   > Nicolas
>   >
>   >
>   >
>   > On 13 Mar 2014, at 16:52, Sudheer Joseph <sudheer.joseph at yahoo.com> wrote:
>   >
>   >> Dear experts,
>   >>
>   >> I am encountering a strange behaviour of a python data array, as shown
>   >> below. I have been trying to use the data from a netcdf file (attached
>   >> herewith) to do a certain calculation using the code below. If I take
>   >> the absolute value of the array and look for values <.5, I get a
>   >> different result than for the original array. But the fact is that this
>   >> particular case does not have any negative values in the array (there
>   >> are other files where it can have negative values, so the condition is
>   >> put in). I do not see any reason for getting different numbers for
>   >> values <.5 in the case of bt, and expected it to be the same as that of
>   >> r2010. If anyone has a guess at what is behind this behaviour, please
>   >> help.
>   >>
>   >>
>   >> In [14]: from netCDF4 import Dataset as nc
>   >>
>   >> In [15]: nf=nc('r2010.nc')
>   >> In [16]: r2010=nf.variables['R2010'][:]
>   >> In [17]: bt=abs(r2010)
>   >> In [18]: bt[bt<=.5].shape
>   >> Out[18]: (2872,)
>   >> In [19]: r2010[r2010<.5].shape
>   >> Out[19]: (36738,)
>   >>
>   >>
>   >> bt.min()
>   >> Out[20]: 0.0027588337040836768
>   >>
>   >> In [21]: bt.max()
>   >> Out[21]: 3.5078965479057089
>   >> In [22]: r2010.max()
>   >> Out[22]: 3.5078965479057089
>   >> In [23]: r2010.min()
>   >> Out[23]: 0.0027588337040836768
>   >>
>   >>
>   >>
>   >>
>   >> ***************************************************************
>   >> <r2010.nc>
>   >> _______________________________________________
>   >> NumPy-Discussion mailing list
>   >> NumPy-Discussion at scipy.org
>   >> http://mail.scipy.org/mailman/listinfo/numpy-discussion