Re: [Python-ideas] Have max and min functions ignore None

31 Dec 2015


On Tue, Dec 29, 2015, 11:41 AM Chris Barker - NOAA Federal <chris.barker@noaa.gov> wrote:
...
but emit TypeError if
...
asked to cope with None as a value.
Well, sort of. Numpy arrays are homogeneous: you can't have a None in
an array (other than an object-dtype array). All the Numpy "ufuncs" create
an array from the input first -- that's where you get your ValueError.
But the Numpy experience is informative -- there have been years of
"we need a better masked array" discussions, but no consensus on what
it should be.
For floats, NaN can be used for missing values, but there is no such
value for integers, and each use case has a different "obvious"
interpretation. That's why the nan* functions make you state explicitly
what you want.
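
To illustrate the point about being explicit, here is a minimal stdlib-only sketch. The helper name ``max_skipping_nan`` is hypothetical, not an existing API; it only mimics the idea behind NumPy's nan* functions (such as ``numpy.nanmax``), where the caller opts in to skipping NaN.

```python
import math


def max_skipping_nan(values):
    """Return the largest value, treating NaN as missing (hypothetical helper)."""
    kept = [v for v in values if not math.isnan(v)]
    if not kept:
        raise ValueError("no non-NaN values to compare")
    return max(kept)


data = [2.0, math.nan, 5.0, 1.0]
result = max_skipping_nan(data)

# The builtin max() makes no such decision: every comparison with NaN is
# False, so the answer depends on where the NaN sits in the input.
nan_first = max([math.nan, 1.0])  # the NaN is kept
nan_last = max([1.0, math.nan])   # the NaN is passed over
```

The order-dependent result from the builtin is one reason an explicit, separately named function may be easier to reason about than a silent default.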
I don't think python should decide for users what None means in this
context.
None is obviously the sound of one hand clapping. When you understand its
proper use, you become Enlightened.
...
-CHB
...
I think the fact both NumPy and pandas support R-style handling of
min() and max() counts in favour of having variants of those with
additional options for handling missing data values in the standard
library statistics module.
NumPy and Pandas have a slightly different audience than Python core. The
scientific community often veers more practical than pure, in some cases to
the detriment of code clarity.
...
Regards,
...
Nick.
P.S. Another option might be to consider the question as part of a
general "data cleaning" strategy for the statistics module, similar to
the one discussed for pandas at
http://pandas.pydata.org/pandas-docs/stable/missing_data.html
I prefer this option. Why solve the special case of max/min when we can
solve (or help solve) the general case of missing data? There's already the
internal ``_coerce`` function. Maybe clean that up for public consumption, or
something like it, adding drop-missing functionality?
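
As a rough sketch of what "drop-missing functionality" could look like: the names ``is_missing`` and ``drop_missing`` below are hypothetical and not part of the statistics module, and treating both None and NaN as missing is an assumption made for illustration.

```python
import math
import statistics


def is_missing(x):
    """Assumption for this sketch: None and float NaN both count as missing."""
    return x is None or (isinstance(x, float) and math.isnan(x))


def drop_missing(data):
    """Return a list of the values in data that are not missing."""
    return [x for x in data if not is_missing(x)]


readings = [3, None, 7, float("nan"), 5]
cleaned = drop_missing(readings)

# Once the data is cleaned, the existing functions work unchanged.
largest = max(cleaned)
average = statistics.mean(cleaned)
```

Cleaning as a separate step keeps max(), min() and the statistics functions free of any built-in opinion about what None means.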

If that flies, then there might be room for an ``interpolate(sequence,
method='linear')`` which would be awesome.
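
A minimal sketch of how such an ``interpolate`` might behave, assuming missing values are marked with None, positions are evenly spaced, and leading or trailing gaps are left untouched. None of this is an existing stdlib function; the behaviour shown is one possible design.

```python
def interpolate(sequence, method="linear"):
    """Fill interior None gaps by linear interpolation (illustrative sketch)."""
    if method != "linear":
        raise ValueError(f"unsupported method: {method!r}")
    values = list(sequence)
    # Indices of the values that are actually present.
    known = [i for i, v in enumerate(values) if v is not None]
    # Walk over each pair of neighbouring known points and fill between them.
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            values[i] = values[left] + step * (i - left)
    return values


filled = interpolate([1.0, None, None, 4.0, None])
# The trailing None stays, since there is no right-hand neighbour to
# interpolate towards.
```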
...
...
Even if the statistics module itself doesn't provide the tools to
address those problems, it could provide some useful pointers on when
someone may want to switch from the standard library module to a more
comprehensive solution like pandas that better handles the messy
complications of working with real world data (and data formats).
--
Nick Coghlan   |   ncoghlan@gmail.com   |   Brisbane, Australia

Michael Selik