[Numpy-discussion] masked arrays and NaNs

Eric Firing efiring at hawaii.edu
Wed Apr 6 15:53:05 EDT 2005


I am wholeheartedly in favor of your efforts to end the 
Numeric/numarray split by combining the best of both. I am encouraged by 
the progress you have made, and by the depth and clarity of the 
accompanying technical discussions.  Thank you!

I am a long-time Matlab user in Physical Oceanography, and I have been 
trying to find a practical way to phase out Matlab.  One key is 
matplotlib, which is coming along wonderfully.  A second is the 
availability of a Num* (or scipy.base) module that provides the 
functionality and ease-of-use I presently get from Matlab.  This leads 
to a request which I suspect and hope is consistent with your present 
plans: efficient handling of NaNs and/or masked arrays.

In Physical Oceanography, and I suspect in many other fields, data sets 
are almost always full of holes.  Matlab's ability to use NaN as a 
bad-value flag provides a wonderfully simple and efficient way of dealing 
with missing or bad data values.  A similar ease and transparency would 
be good in scipy.base.  In addition, or as a way of implementing 
NaN-handling internally, it might be best to have masked arrays 
incorporated at the C level--with the functionality available by 
default--rather than bolted on as a pure-python package.  I hope that 
inclusion of __array_mask__ in the protocol means that this is part of 
the plan.
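
To make the request concrete, here is a minimal sketch of the two 
approaches side by side.  It assumes a present-day NumPy, where 
np.nanmean and the numpy.ma module exist; neither is the scipy.base 
interface under discussion, and the sample values are made up purely 
for illustration.

```python
import numpy as np

# Hypothetical temperature samples; NaN flags the two missing values.
temps = np.array([12.1, np.nan, 11.8, 11.5, np.nan, 10.9])

# NaN as a bad-value flag: an ordinary reduction propagates the NaN,
# while a NaN-aware reduction skips the flagged samples.
plain_mean = temps.mean()
nan_mean = np.nanmean(temps)

# Masked-array equivalent: the mask records which entries are bad,
# and reductions ignore the masked entries by default.
masked = np.ma.masked_invalid(temps)
masked_mean = masked.mean()
n_valid = masked.count()  # number of unmasked samples

print(plain_mean, nan_mean, masked_mean, n_valid)
```

Both routes give the same mean over the valid samples.  The difference 
is that the mask is carried separately from the data, so it also works 
for integer arrays, which have no NaN.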

