Re: [Numpy-discussion] bug with fill_values in masked arrays?

March 21, 2008

      On Friday 21 March 2008 12:55:11 Chris Withers wrote:
...
Pierre GM wrote:
...
On Wednesday 19 March 2008 19:47:37 Matt Knox wrote:
...
...
1. why am I not getting my NaN's back?
Because they're gone when you create your masked array.
Really? At least one other post has disagreed with that.
Well, yeah, my bad, that depends on whether you use masked_invalid or 
fix_invalid or just build a basic masked array.

Example:
...
...
...
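>>> import numpy as np
>>> import numpy.ma as ma
>>> x = np.array([1, np.nan, 3])
>>> # Basic construction: the NaN is kept and nothing is masked
>>> y = ma.array(x)
>>> y
masked_array(data = [  1.  NaN   3.],
      mask = False,
      fill_value=1e+20)
>>> # masked_invalid: the NaN is masked but stays in the underlying data
>>> y = ma.masked_invalid(x)
>>> y
masked_array(data = [1.0 -- 3.0],
      mask = [False  True False],
      fill_value=1e+20)
>>> y._data
array([  1.,  NaN,   3.])
>>> # fix_invalid: the NaN is masked and replaced by the fill value in the data
>>> y = ma.fix_invalid(x)
>>> y
masked_array(data = [1.0 -- 3.0],
      mask = [False  True False],
      fill_value=1e+20)
>>> y._data
array([  1.00000000e+00,   1.00000000e+20,   3.00000000e+00])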
...
And it does seem odd that a value, even if it's a nan, would be
destroyed...
Having NaNs in an array usually reduces performance. The option we follow with 
fix_invalid is to clear the masked array of the NaNs and keep track of where 
they were by setting the mask to True at the appropriate locations. That way, 
you don't take the performance hit of having NaNs in your underlying array.
Oh, and NaNs will be transformed to 0 if you use ints...
...
...
The idea here is to
get rid of the nan in your data
No, it's to mask them, otherwise I would have used a normal array, not a
ma.
Nope, the idea really is to make things as efficient as possible. Now, you 
can still have your nans if you're ready to eat them.
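
For what it's worth, here is a minimal sketch of how the NaNs can be recovered either way. It assumes a small float array like the one in the example above; the names y_masked, y_fixed and restored_* are only illustrative. The idea is that filled(np.nan) puts a NaN back at every masked position, whether or not the NaN survived in the underlying data.

```python
import numpy as np
import numpy.ma as ma

x = np.array([1.0, np.nan, 3.0])

# masked_invalid masks the NaN but leaves it in the underlying data
y_masked = ma.masked_invalid(x)
# fix_invalid masks the NaN and overwrites it in the data with the fill value
y_fixed = ma.fix_invalid(x)

# In both cases, filled(np.nan) returns a plain ndarray with NaN
# written back at every masked position
restored_masked = y_masked.filled(np.nan)
restored_fixed = y_fixed.filled(np.nan)

print(np.isnan(y_masked.data[1]))  # the NaN is still in the data here
print(np.isnan(y_fixed.data[1]))   # here it was replaced by the fill value
print(restored_fixed)
```

So which constructor to use mostly depends on whether you want the NaNs kept in the underlying data or only tracked through the mask.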
...
...
to avoid potential problems while keeping
track of where the nans were in the first place.
...like plotting them on a graph, which the current behaviour makes so
unworkable that you end up doing a myarray.filled(0) to get around it,
with imperfect results.
Send an example. I don't seem to have this problem:

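from matplotlib.pyplot import plot  # assumed source of plot(), e.g. a pylab session
x = np.arange(10, dtype=float)  # np.float is gone from recent NumPy; plain float is equivalent
x[5] = np.nan
y = ma.masked_invalid(x)

plot(x, 'ok-')  # plain array containing a NaN
plot(y, 'sr-')  # masked array with the NaN masked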
...
Right, but why, when the masked array is cast back to a list of numbers,
is the fill_value of the ma not respected?
Because in your particular case, you're inspecting the elements one by one, 
and each masked element then comes back as the masked singleton, which is a 
special value. That has nothing to do with the filling.
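
A short sketch of the difference, assuming a hand-built masked array with a hypothetical fill value of -999.0: pulling out a single masked element gives back the masked singleton, while filled() is the step where the array's own fill_value gets applied.

```python
import numpy as np
import numpy.ma as ma

# Illustrative array: the middle element is masked, fill_value chosen arbitrarily
y = ma.array([1.0, np.nan, 3.0],
             mask=[False, True, False],
             fill_value=-999.0)

# Element-wise access to a masked entry returns the ma.masked singleton,
# not the array's fill_value
elem = y[1]
print(elem is ma.masked)

# filled() with no argument substitutes the array's own fill_value,
# so converting after filling gives a list of plain numbers
as_list = y.filled().tolist()
print(as_list)
```

If you want a list of numbers that honours the fill value, filling first and converting afterwards is the safer route.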
...
...
...
...
2. why is the wrong fill value being used here?
the second element in the array iteration here is actually the
numpy.ma.masked constant, which always has the same fill value...
...and that's a bug.
And once again, it's not. numpy.ma.masked is a special value, like numpy.nan 
or numpy.inf.