Re: [Numpy-discussion] [Cdat-discussion] Arrays containing NaNs

25 Jul 2008

Hi All,

I'm sending a copy of this reply here because I think we could get some
good answers.

Basically, it was suggested to automatically mask NaN (and Inf?) when
creating an ma array.
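
For reference, a minimal sketch of what such masking looks like with the
helper that numpy.ma provides today, numpy.ma.masked_invalid. The sample
values are made up for illustration, and this shows an explicit call, not
automatic masking at array creation:

```python
import numpy as np

# Hypothetical sample data; NaN and +/-Inf stand in for missing values.
raw = np.array([np.nan, 0.377, np.inf, 0.219, -np.inf])

# masked_invalid masks every NaN and infinite entry in one call.
masked = np.ma.masked_invalid(raw)

print(masked.mask)    # which entries were masked
print(masked.mean())  # average over the finite values only
```

The masking stays an explicit step here, so the caller decides whether NaN
and Inf should be treated as missing.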

I'm sure this has already been considered on this list, and I'm curious
to know why you decided not to do it.

Just so I can relay it to our list (sending to both lists came back
flagged as spam...)

C.

Hi Stephane,

This is a good suggestion. I'm cc'ing the numpy list on this because I'm
wondering if it wouldn't be a better fit to do it directly at the
numpy.ma level.

I'm sure they have already thought about this (and 'inf' values as well),
and if they don't do it, there's probably some good reason we haven't
thought of yet.
So before I go ahead and do it in MV2, I'd like to know the reasons why
it's not in numpy.ma; they are probably valid for MVs too.

C.

Stephane Raynaud wrote:
...
Hi,
how about automatically (or at least optionally) masking all NaN 
values when creating a MV array?
On Thu, Jul 24, 2008 at 11:43 PM, Arthur M. Greene
<amg@iri.columbia.edu> wrote:
Yup, this works. Thanks!
I guess it's time for me to dig deeper into numpy syntax and
functions, now that CDAT is using the numpy core for array
management...
Best,
Arthur
Charles Doutriaux wrote:
Seems right to me,
Except that the syntax might scare new users a bit :)
C.
Andrew.Dawson@uea.ac.uk wrote:
Hi,
I'm not sure if what I am about to suggest is a good idea or not;
perhaps Charles will correct me if this is a bad idea for any reason.
Let's say you have a cdms variable called U with NaNs as the missing
value. First we can replace the NaNs with 1e20:
U.data[numpy.where(numpy.isnan(U.data))] = 1e20
And remember to set the missing value of the variable appropriately:
U.setMissing(1e20)
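
The same two steps can be sketched in plain NumPy, without cdms. The array
below is a hypothetical stand-in for U.data, and numpy.ma.masked_values is
used only as a rough analogue of declaring the missing value; it is not a
claim about what U.setMissing does internally:

```python
import numpy as np

FILL = 1e20  # fill value used in the message above

# Hypothetical stand-in for U.data.
data = np.array([[np.nan, 1.5], [2.5, np.nan]])

# Step 1: replace NaNs with the fill value. Boolean indexing is enough
# here; wrapping the test in numpy.where is not required.
data[np.isnan(data)] = FILL

# Step 2: treat the fill value as missing by masking entries equal to it.
u = np.ma.masked_values(data, FILL)

print(u.mask)
print(u.mean())  # average over the unmasked entries only
```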
I hope that helps, Andrew
Hi Arthur,
If I remember correctly, the way I used to do it was:
a = MV2.greater(data, 1.)
b = MV2.less_equal(data, 1)
c = MV2.logical_and(a, b)  # NaNs are the only ones left
data = MV2.masked_where(c, data)
BUT I believe numpy now has a way to deal with NaN. I believe it is
numpy.nan_to_num, but it replaces NaN with 0, so it may not be what you
want.
C.
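
A small check of both ideas in plain NumPy (not MV2), with made-up values.
NaN fails every ordered comparison, so it is the element for which neither
test holds. Note that combining the two tests with logical_and selects
nothing at all, so this sketch takes the complement of their union instead:

```python
import numpy as np

data = np.array([np.nan, 0.5, 2.0])  # hypothetical values

# errstate guards against invalid-value warnings on some NumPy versions.
with np.errstate(invalid="ignore"):
    gt = np.greater(data, 1.0)     # False for NaN
    le = np.less_equal(data, 1.0)  # False for NaN

both = np.logical_and(gt, le)      # no element is both > 1 and <= 1
neither = ~np.logical_or(gt, le)   # True only where data is NaN

masked = np.ma.masked_where(neither, data)
print(masked)

# nan_to_num replaces NaN with 0.0, which then counts as real data.
print(np.nan_to_num(data))
```

Replacing NaN with 0 changes any later average, which is why masking is
usually the safer choice for missing data.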
Arthur M. Greene wrote:
A typical netcdf file is opened, and the single variable extracted:
fpr = cdms.open('prTS2p1_SEA_allmos.cdf')
pr0 = fpr('prcp')
type(pr0)

Masked values (indicating ocean in this case) show up here as NaNs.
pr0[0,-15:-5,0]
prcp array([NaN NaN NaN NaN NaN NaN 0.37745094 0.3460784 0.21960783 0.19117641])
So far this is all consistent. A map of the first time step shows the
proper land-ocean boundaries, reasonable-looking values, and so on. But
there doesn't seem to be any way to mask this array, so, e.g., an 'xy'
average can be computed (it comes out all nans). NaN is not equal to
anything -- even itself -- so there does not seem to be any condition,
among the MV.masked_xxx options, that can be applied as a test. Also, it
does not seem possible to compute seasonal averages, anomalies, etc. --
they also produce just NaNs.
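
Both observations are easy to reproduce in plain NumPy. The sketch below
uses a few of the values printed above as a hypothetical 1-D sample; it
shows that the self-inequality of NaN can itself serve as a test, and that
an unmasked mean is poisoned by a single NaN:

```python
import numpy as np

# A few of the values shown above, as a plain 1-D array.
pr = np.array([np.nan, np.nan, 0.37745094, 0.3460784])

# NaN is the only value that is not equal to itself.
is_nan = pr != pr
print(is_nan)

# Any NaN in the input makes the ordinary mean NaN as well.
print(pr.mean())
```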
The workaround I've come up with -- for now -- is to first generate a
new array of identical shape, filled with 1.0E+20. One test I've found
that can detect NaNs is numpy.isnan:
isnan(pr0[0,0,0])
True
So it is _possible_ to tediously loop through every value in the old
array, testing with isnan, then copying to the new array if the test
fails. Then the axes have to be reset...
isnan does not accept array arguments, so one cannot do, e.g.,
prmasked=MV.masked_where(isnan(pr0),pr0)
The element-by-element conversion is quite slow. (I'm still waiting for
it to complete, in fact). Any suggestions for dealing with NaN-infested
data objects?
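
For comparison, a sketch of the vectorized route in plain NumPy, assuming
the test in question is numpy.isnan, which is a ufunc and works
element-wise on whole arrays. The small (time, lat, lon) array is
hypothetical, and numpy.ma stands in for MV, so the axis handling of a
cdms variable is not covered:

```python
import numpy as np

# Hypothetical (time, lat, lon) data with NaN over "ocean" points.
pr = np.array([[[np.nan, 0.2], [0.4, np.nan]],
               [[np.nan, 0.6], [0.8, np.nan]]])

# One vectorized call replaces the element-by-element loop.
nan_mask = np.isnan(pr)
pr_masked = np.ma.masked_where(nan_mask, pr)

# 'xy' average per time step: flatten lat/lon, then average over them.
# Masked points are left out of the average.
xy_mean = pr_masked.reshape(pr.shape[0], -1).mean(axis=1)
print(xy_mean)
```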
Thanks!
AMG
P.S. This is 5.0.0.beta, RHEL4.
*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*
    Arthur M. Greene, Ph.D.
    The International Research Institute for Climate and Society
    The Earth Institute, Columbia University, Lamont Campus
    Monell Building, 61 Route 9W, Palisades, NY  10964-8000 USA
    amg*at*iri-dot-columbia\dot\edu | http://iri.columbia.edu
    *^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*
-------------------------------------------------------------------------
    This SF.Net email is sponsored by the Moblin Your Move Developer's
    challenge
    Build the coolest Linux based applications with Moblin SDK & win
    great prizes
    Grand prize is a trip for two to an Open Source event anywhere in
    the world
    http://moblin-contest.org/redirect.php?banner_id=100&url=/
    _______________________________________________
    Cdat-discussion mailing list
    Cdat-discussion@lists.sourceforge.net
    https://lists.sourceforge.net/lists/listinfo/cdat-discussion
-- 
Stephane Raynaud


Charles Doutriaux
