[issue20481] Clarify type coercion rules in statistics module

Sun Feb 2 13:33:38 CET 2014

Oscar Benjamin added the comment:

Wolfgang have you tested this with any third party numeric types from
sympy, gmpy2, mpmath etc.?

Last I checked no third party types implement the numbers ABCs e.g.:

>>> import sympy, numbers
>>> r = sympy.Rational(1, 2)
>>> r
1/2
>>> isinstance(r, numbers.Rational)
False

AFAICT testing against the numbers ABCs is just a slow way of testing
against the stdlib types:

$ python -m timeit -s 'from numbers import Integral' 'isinstance(1, Integral)'
100000 loops, best of 3: 2.59 usec per loop
$ python -m timeit -s 'from numbers import Integral' 'isinstance(1, int)'
1000000 loops, best of 3: 0.31 usec per loop

You can at least make it faster using a tuple:

$ python -m timeit -s 'from numbers import Integral' 'isinstance(1,
(int, Integral))'
1000000 loops, best of 3: 0.423 usec per loop

I'm not saying that this is necessarily a worthwhile optimisation but
rather that the numbers ABCs are in practice not really very useful
(AFAICT).

I don't know how well the statistics module currently handles third
party numeric types but if the type coercion is to be changed then
this something to look at. The current implementation tries to
duck-type to some extent and yours uses ABCs but does either approach
actually have any practical gain for interoperability with non-stdlib
numeric types? If not then it would be simpler just to explicitly
hard-code exactly how it works for the powerset of stdlib types.

OTOH if it could be made to do sensible things with non-stdlib types
then that would be great. Falling back on float isn't a bad choice but
if it can be made to do exact things for exact types (e.g.
sympy.Rational) then that would be great. Similarly mpmath.mpf
provides multi-precision floats. It would be great to be able to take
advantage of that higher precision rather than downgrade everything to
float.

This is in general a hard problem though so I don't think it's
unreasonable to make restrictions about what types it can work with -
achieving optimal accuracy for all types without restrictions is
basically impossible.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue20481>
_______________________________________