Re: [Python-ideas] NAN handling in the statistics module

Jan. 7, 2019

On Sun, 6 Jan 2019 19:40:32 -0800
Stephan Hoyer <shoyer@gmail.com> wrote:
...
On Sun, Jan 6, 2019 at 4:27 PM Steven D'Aprano <steve@pearwood.info> wrote:
...
I propose adding a "nan_policy" keyword-only parameter to the relevant
statistics functions (mean, median, variance etc), and defining the
following policies:
    IGNORE:  quietly ignore all NANs
    FAIL:  raise an exception if any NAN is seen in the data
    PASS:  pass NANs through unchanged (the default)
    RETURN:  return a NAN if any NAN is seen in the data
    WARN:  ignore all NANs but raise a warning if one is seen

I don't think PASS should be the default behavior, and I'm not sure it
would be productive to actually implement all of these options.
For reference, NumPy and pandas (the two most popular packages for data
analytics in Python) support two of these modes:
- RETURN (numpy.mean() and skipna=False for pandas)
- IGNORE (numpy.nanmean() and skipna=True for pandas)
RETURN is the default behavior for NumPy; IGNORE is the default for pandas.

I agree with Stephan that RETURN and IGNORE are the only useful modes
of operation here.
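
To make the two modes concrete, here is a rough sketch of how they might
look on top of statistics.mean. The wrapper name, the "nan_policy" string
values and the choice of "return" as the default are assumptions made for
illustration only; nothing like this exists in the statistics module today,
and the sketch only recognises float NANs.

```python
import math
import statistics


def _is_nan(x):
    # Only float NANs are detected here; Decimal NANs are out of scope
    # for this sketch.
    return isinstance(x, float) and math.isnan(x)


def mean(data, *, nan_policy="return"):
    """Hypothetical wrapper around statistics.mean (not an existing API).

    nan_policy="return": return a NAN if any NAN is seen in the data.
    nan_policy="ignore": quietly drop all NANs before computing the mean.
    """
    values = list(data)
    if nan_policy == "return":
        if any(_is_nan(x) for x in values):
            return math.nan
    elif nan_policy == "ignore":
        values = [x for x in values if not _is_nan(x)]
    else:
        raise ValueError(f"unknown nan_policy: {nan_policy!r}")
    # statistics.mean raises StatisticsError if no data points remain.
    return statistics.mean(values)


data = [1.0, float("nan"), 3.0]
result_return = mean(data)                       # RETURN: a NAN comes back
result_ignore = mean(data, nan_policy="ignore")  # IGNORE: mean of 1.0 and 3.0
```

With "ignore", data consisting only of NANs ends up empty, so the
underlying statistics.mean raises StatisticsError; whether that is the
right outcome would still need deciding.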

Regards

Antoine.

Antoine Pitrou