On Tue, Dec 29, 2015, 11:41 AM Chris Barker - NOAA Federal < firstname.lastname@example.org> wrote:
but emit TypeError if
asked to cope with None as a value.
Well, sort of. NumPy arrays are homogeneous; you can't have a None in an array (other than in an object-dtype array). All the NumPy "ufuncs" create an array from the input first -- that's where you get your ValueError.
But the NumPy experience is informative -- there have been years of "we need a better masked array" discussions, but no consensus on what it should be.
For floats, NaN can be used for missing values, but there is no such value for integers, and each use case has a different "obvious" interpretation. That's why it's explicit what you want with the nan* functions.
I don't think Python should decide for users what None means in this context.
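To illustrate why the choice has to be explicit, here is a minimal standard-library sketch. The helper name ``nan_max`` is made up for this example and only mimics the "skip NaN" policy of the nan* functions; it is not NumPy's implementation.

```python
import math


def nan_max(values):
    """Hypothetical helper: largest value, skipping float NaN entries."""
    present = [v for v in values if not math.isnan(v)]
    if not present:
        raise ValueError("nan_max() got no non-NaN values")
    return max(present)


nan = float("nan")

# The builtin max() has no NaN policy, so its result depends on ordering:
# every comparison against NaN is False, so a leading NaN is never replaced.
nan_first = max([nan, 1.0, 2.0])   # NaN
nan_middle = max([1.0, nan, 2.0])  # 2.0

# Being explicit removes the order dependence.
skipped = nan_max([nan, 1.0, 2.0])  # 2.0
```

The builtin quietly gives two different answers for the same data, which is the kind of surprise an explicit "ignore" or "propagate" variant avoids.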
None is obviously the sound of one hand clapping. When you understand its proper use, you become Enlightened.
I think the fact both NumPy and pandas support R-style handling of min() and max() counts in favour of having variants of those with additional options for handling missing data values in the standard library statistics module.
NumPy and Pandas have a slightly different audience than Python core. The scientific community often veers more practical than pure, in some cases to the detriment of code clarity.
P.S. Another option might be to consider the question as part of a general "data cleaning" strategy for the statistics module, similar to the one discussed for pandas at http://pandas.pydata.org/pandas-docs/stable/missing_data.html
I prefer this option. Why solve the special case of max/min when we can solve (or help solve) the general case of missing data? There's already the internal ``_coerce`` function. Maybe clean that up for public consumption, or something like it, adding drop-missing functionality?
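A rough sketch of what drop-missing functionality could look like. ``drop_missing`` is a hypothetical name, not an existing part of the statistics module, and treating both None and float NaN as "missing" is an assumption made for the example.

```python
import math
import statistics


def drop_missing(data):
    """Hypothetical helper: yield values that are neither None nor float NaN."""
    for value in data:
        if value is None:
            continue
        if isinstance(value, float) and math.isnan(value):
            continue
        yield value


readings = [3, None, 5, float("nan"), 10]

# Clean once, then hand the result to any existing function unchanged.
cleaned = list(drop_missing(readings))
mean_value = statistics.mean(cleaned)
largest = max(cleaned)
```

Cleaning as a separate step means max(), min() and the statistics functions need no extra options, and the caller states what "missing" means exactly once.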
If that flies, then there might be room for an ``interpolate(sequence, method='linear')`` which would be awesome.
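One possible shape for such a function, as a sketch only. It assumes missing entries are marked with None, that items are evenly spaced, and that leading or trailing gaps are left as None; none of those choices comes from an existing API.

```python
def interpolate(sequence, method="linear"):
    """Hypothetical sketch: fill None gaps between known values linearly."""
    if method != "linear":
        raise ValueError("only method='linear' is sketched here")
    result = list(sequence)
    # Positions of the values that are actually present.
    known = [i for i, v in enumerate(result) if v is not None]
    # Walk each pair of neighbouring known positions and fill the gap.
    for left, right in zip(known, known[1:]):
        span = right - left
        for i in range(left + 1, right):
            fraction = (i - left) / span
            result[i] = result[left] + fraction * (result[right] - result[left])
    return result


filled = interpolate([0.0, None, 10.0])
edges = interpolate([None, 2.0, None, 4.0, None])
```

Gaps at either end stay as None because there is no second point to interpolate towards; a real design would have to decide whether to extrapolate, drop, or raise there.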
Even if the statistics module itself doesn't provide the tools to address those problems, it could provide some useful pointers on when someone may want to switch from the standard library module to a more comprehensive solution like pandas that better handles the messy complications of working with real world data (and data formats).
-- Nick Coghlan | email@example.com | Brisbane, Australia