Hi,
I have some arrays of various shapes in which I need to set any NaNs to 0. I have been doing the following:
a[numpy.where(numpy.isnan(a))] = 0.
as you can see here:
In [20]: a=numpy.ones(2)
In [21]: a[1]=numpy.log(-1)
In [22]: a
Out[22]: array([ 1., NaN])
In [23]: a[numpy.where(numpy.isnan(a))]=0.
In [24]: a
Out[24]: array([ 1., 0.])
Unfortunately, I've just discovered that when a.shape == () this doesn't work at all. For example:
In [41]: a=numpy.array((1.))
In [42]: a.shape
Out[42]: ()
In [43]: a[numpy.where(numpy.isnan(a))]=0.
In [44]: a
Out[44]: array(0.0)
but if the shape is (1,), everything is ok:
In [47]: a=numpy.ones(1)
In [48]: a.shape
Out[48]: (1,)
In [49]: a[numpy.where(numpy.isnan(a))]=0.
In [50]: a
Out[50]: array([ 1.])
What's the difference between the 2 arrays with different shapes?
If I pass a scalar into numpy.asarray() why do I get an array of shape () back? In my case this has caused a subtle bug.
Is there a better way to set NaNs in an array to 0?
Thanks for any tips,
John.
On Tue, Jul 13, 2010 at 9:54 AM, John Reid <j.reid@mail.cryst.bbk.ac.uk> wrote:
I have some arrays of various shapes in which I need to set any NaNs to 0.
<snip>
Unfortunately, I've just discovered that when a.shape == () this doesn't work at all.
No need to use where. You can just do a[np.isnan(a)] = 0. But you do have to watch out for 0d arrays, can't index into those. How about:
def nan_replace(a, fill=0):
    a = a.copy()
    if a.ndim == 0:
        return a
    a[np.isnan(a)] = fill
    return a

>>> a = np.array(9)
>>> nan_replace(a, 0)
array(9)
>>> a = np.array([9, np.nan])
>>> nan_replace(a, 0)
array([ 9., 0.])
Oh, I guess a[np.isnan(a)] = fill makes a copy so the a.copy() can be moved inside the if statement.
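The guess about copying is easy to check. In the sketch below, assignment through a boolean mask writes into the array it is applied to, so it is the explicit copy that protects the caller's data. The helper name nan_replace_copy is made up for illustration; it uses the three-argument form of np.where, which builds a new result and needs no special case for 0-d input.

```python
import numpy as np

# Boolean-mask assignment modifies the array in place; it does not copy.
a = np.array([9.0, np.nan])
alias = a                      # second name for the same array object
a[np.isnan(a)] = 0.0
print(alias)                   # the NaN is gone when seen through the alias too


def nan_replace_copy(a, fill=0.0):
    """Return a new array with NaNs replaced by fill (hypothetical helper).

    np.where(cond, x, y) builds a fresh result, so the input is left
    untouched and 0-d arrays need no special handling.
    """
    a = np.asarray(a, dtype=float)
    return np.where(np.isnan(a), fill, a)
```

Because the result is always a new array, callers who want the in-place behaviour of the original snippet would still need mask assignment.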
On Tue, Jul 13, 2010 at 10:36 AM, Pauli Virtanen <pav@iki.fi> wrote:
Tue, 2010-07-13 at 10:06 -0700, Keith Goodman wrote:
No need to use where. You can just do a[np.isnan(a)] = 0. But you do have to watch out for 0d arrays, can't index into those.
You can, but the index must be appropriate:
>>> x = np.array(4)
>>> x[()] = 3
>>> x
array(3)
Then should this error message be changed?
>>> a = np.array(4)
>>> a[1]
<snip>
IndexError: 0-d arrays can't be indexed
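To spell out the point about an "appropriate" index: a 0-d array has zero dimensions, so it accepts an index with zero entries (the empty tuple) or an Ellipsis, while any integer index asks for a dimension that does not exist. The sketch below only shows that mechanism. If numpy.where returned an empty tuple for the 0-d case at the time (an assumption on my part), that would explain how the original assignment overwrote the whole value.

```python
import numpy as np

x = np.array(4)        # 0-d array, shape ()

x[()] = 3              # empty tuple: zero indices for zero dimensions
after_tuple = int(x)

x[...] = 5             # Ellipsis also addresses the whole array
after_ellipsis = int(x)

# Any integer index asks for a dimension that does not exist.
try:
    x[0]
    raised = False
except IndexError:
    raised = True
```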
On Tue, Jul 13, 2010 at 11:54 AM, John Reid <j.reid@mail.cryst.bbk.ac.uk> wrote:
I have some arrays of various shapes in which I need to set any NaNs to 0.
<snip>
Is there a better way to set NaNs in an array to 0?
You could make use of np.atleast_1d, and then everything would be canonicalized:
In [33]: a = np.array(np.nan)
In [34]: a
Out[34]: array(nan)
In [35]: a1d = np.atleast_1d(a)
In [36]: a1d
Out[36]: array([ NaN])
In [37]: a
Out[37]: array(nan)
In [38]: a1d.base is a
Out[38]: True
In [39]: a1d[np.isnan(a1d)] = 0.
In [40]: a1d
Out[40]: array([ 0.])
In [41]: a
Out[41]: array(0.0)
So Keith's nan_replace would be:
In [42]: def nan_replace(a, fill=0.0):
   ....:     a_ = np.atleast_1d(a)
   ....:     a_[np.isnan(a_)] = fill
   ....:
On Tue, Jul 13, 2010 at 10:45 AM, Kurt Smith <kwmsmith@gmail.com> wrote:
You could make use of np.atleast_1d, and then everything would be canonicalized:
<snip>
Neat. The docstring for atleast_1d says "Copies are made only if necessary". I don't know when a copy is necessary, but watch out for it.
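One case where the result is not a view can be shown with a short sketch: when the input is not already an ndarray (a list or a Python scalar, say), atleast_1d has to build a new array, so writing into the result cannot reach the original object. For a 0-d ndarray the result shares memory with the input, as the a1d.base check in the session suggests. The variable names here are illustrative only.

```python
import numpy as np

zero_d = np.array(np.nan)
view = np.atleast_1d(zero_d)           # ndarray input: reshaped, no copy
view[np.isnan(view)] = 0.0             # writes through to zero_d

as_list = [1.0, np.nan]
fresh = np.atleast_1d(as_list)         # list input: a new array is built
fresh[np.isnan(fresh)] = 0.0           # the list itself is unchanged

shares = np.shares_memory(view, zero_d)
```

So an in-place helper built on atleast_1d only updates the caller's data when the caller passes an ndarray.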
On Tue, Jul 13, 2010 at 12:45 PM, Kurt Smith <kwmsmith@gmail.com> wrote:
You could make use of np.atleast_1d, and then everything would be canonicalized:
<snip>
Maybe I am missing something subtle, but what about numpy's nan_to_num() function?
Ben Root
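For reference, here is a small sketch of what nan_to_num does with the shapes discussed in this thread. One subtlety worth knowing: besides mapping NaN to 0 it also replaces infinities with very large finite numbers, so it is not a pure NaN replacement. By default it returns a new object and leaves its input alone.

```python
import numpy as np

b = np.array([np.nan, 1.0, np.inf])
cleaned = np.nan_to_num(b)             # returns a new array by default

zero_d = np.array(np.nan)
cleaned_0d = np.nan_to_num(zero_d)     # 0-d input needs no special handling
```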
Benjamin Root wrote:
Maybe I am missing something subtle, but what about numpy's nan_to_num() function?
That sounds useful but I should have said: sometimes I need to replace other values that aren't NaNs. Anyway thanks for all the tips everyone. Here's what I ended up with:
def array_replace(a, loc, fill):
    """
    Replace the values in a at the locations, loc, with the value, fill.

    For example:

    In [32]: a=numpy.arange(10)+1
    In [33]: array_replace(a, a%3==1, 0)
    Out[33]: array([0, 2, 3, 0, 5, 6, 0, 8, 9, 0])

    or for 0-d arrays:

    In [15]: b=numpy.array(3)
    In [16]: array_replace(b, b==3, 2)
    Out[16]: array([2])
    In [17]: b
    Out[17]: array(2)
    """
    a_ = numpy.atleast_1d(a)
    a_[numpy.atleast_1d(loc)] = fill
    return a_
2010/7/14 John Reid <j.reid@mail.cryst.bbk.ac.uk>:
That sounds useful but I should have said: sometimes I need to replace other values that aren't NaNs.
Sorry for the double dumb recommendation of nan_to_num, but when replacing other normal values maybe you can use:
>>> a = numpy.asarray(1)
>>> b = numpy.asarray([10, 1])
>>> a *= (a != 1)
>>> b *= (b != 1)
>>> a
array(0)
>>> b
array([10, 0])
We have tested this for speed and it's even faster than the slicing-and-insert method. When having to replace, you can use a composite approach, e.g. to replace 1 by 10000:
b = (b * (b != 1) + 10000 * (b == 1))
I don't know how to replace nans by values other than zero.
Friedrich
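A possible reason the multiplication trick does not carry over to NaNs is that nan * 0 is still nan, so a 0/1 mask cannot clear it. One way around this, sketched below with made-up variable names, is the three-argument form of numpy.where, which selects element by element instead of doing arithmetic on the NaN. As far as I know, newer NumPy releases (1.17 on) also accept a nan= keyword on nan_to_num for the same purpose.

```python
import numpy as np

b = np.array([10.0, np.nan, 1.0])

# Multiplying by a 0/1 mask cannot clear a NaN, because nan * 0 is still nan.
masked = b * ~np.isnan(b)

# Selecting element by element avoids arithmetic on the NaN entirely.
filled = np.where(np.isnan(b), -1.0, b)

# The same call covers ordinary values, e.g. replacing 1 by 10000.
swapped = np.where(b == 1, 10000.0, b)
```

Unlike the in-place versions earlier in the thread, np.where returns a new array each time.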
2010/7/13 John Reid <j.reid@mail.cryst.bbk.ac.uk>:
Hi,
I have some arrays of various shapes in which I need to set any NaNs to 0.
I just ran across numpy.nan_to_num():
>>> a = numpy.log(-1)
>>> b = numpy.log([-1, 1])
>>> a
nan
>>> b
array([ NaN, 0.])
>>> numpy.nan_to_num(a)
0.0
>>> numpy.nan_to_num(b)
array([ 0., 0.])
Friedrich
participants (6)
- Benjamin Root
- Friedrich Romstedt
- John Reid
- Keith Goodman
- Kurt Smith
- Pauli Virtanen