
On 26.08.2021 17:36, Christopher Barker wrote:
> There have been a number of discussions on this list, and at least one PEP, about NaN (and other special values).
> Let’s keep this thread about handling them in the statistics lib.
> But briefly:
> NaNs are weird on purpose, and Python should absolutely not deviate from IEEE.

Agreed. I was just surprised that NANs are more Medusa-like than expected ;-)

> That’s (one reason) Python has None :-)
> If you are that worried about performance, you should probably use numpy anyway :-)

Sure, and pandas, which both have methods to replace NANs in arrays.

> -CHB
On Thu, Aug 26, 2021 at 3:47 AM Marc-Andre Lemburg <mal@egenix.com> wrote:
On 26.08.2021 12:15, Steven D'Aprano wrote:
> On Thu, Aug 26, 2021 at 11:05:01AM +0200, Marc-Andre Lemburg wrote:
>
>> Oh, good point. I was under the impression that NAN is handled
>> as a singleton.
>
> There are 4503599627370496 distinct quiet NANs (plus about the same
> signalling NANs). So it would need to be 4-quadrillion-ton :-)
>
> (If anyone is concerned about the large number of NANs, it's less than
> 0.05% of the total number of floats.)
>
> Back in the mid-80s, Apple's floating point library, SANE, distinguished
> different classes of error with distinct NANs. Few systems have followed
> that lead, but each NAN still has 51 bits available for a diagnostic
> code, plus the sign bit. While Python itself only generates a single NAN
> value, if you are receiving data from outside sources it could contain
> NANs with distinct payloads.
>
> The IEEE-754 standard doesn't mandate that NANs preserve the payload,
> but it does recommend it. We shouldn't gratuitously discard that
> information. It could be meaningful to whoever is generating the data.
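To make the payload point concrete, here is a minimal sketch of how NANs with distinct payloads can be built and inspected from Python using only the struct module. The helper names make_nan and nan_payload are made up for this illustration, and the sketch assumes the usual IEEE-754 binary64 layout in which the top mantissa bit marks a quiet NAN. Whether the payload survives a round trip is platform-dependent, as noted above, though it normally does on common hardware.

```python
import math
import struct

PAYLOAD_BITS = 51  # mantissa bits left over after the quiet bit


def make_nan(payload: int, negative: bool = False) -> float:
    """Hypothetical helper: build a quiet NaN carrying `payload`."""
    if not 0 <= payload < (1 << PAYLOAD_BITS):
        raise ValueError("payload must fit in 51 bits")
    # sign bit | all-ones exponent | quiet bit | payload
    bits = (int(negative) << 63) | (0x7FF << 52) | (1 << 51) | payload
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def nan_payload(x: float) -> int:
    """Hypothetical helper: extract the low 51 payload bits of a NaN."""
    if not math.isnan(x):
        raise ValueError("not a NaN")
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    return bits & ((1 << PAYLOAD_BITS) - 1)


a = make_nan(1)
b = make_nan(42)

print(math.isnan(a), math.isnan(b))    # both are NaNs
print(a == b, a == a)                  # IEEE comparison: False in both cases
print(nan_payload(a), nan_payload(b))  # payloads, if the platform preserved them
print(1 << 52)                         # quiet NaN bit patterns: 51 payload bits plus sign
```

The last line reproduces the 4503599627370496 figure quoted above: 51 payload bits plus the sign bit give 2**52 quiet NAN patterns.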
Fair enough. Would it then make sense to at least have all possible NAN objects compare equal, treating the extra error information as an attribute value rather than a distinct value, and perhaps exposing it as such?
I'm after "practicality beats purity" here. The math.isnan() test doesn't work well in practice, since you'd have to iterate over all sequence members and call that test function, which is expensive when done in Python.
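As an illustration of the scan being discussed, a short sketch using only the standard library. It shows why an equality-based membership test cannot stand in for math.isnan(), and what the explicit per-element check looks like. The sample list is invented for the example.

```python
import math

data = [1.0, float("nan"), 3.0]  # made-up sample data

# Equality-based membership fails: this fresh NaN is a different object
# and never compares equal to the one stored in the list.
print(float("nan") in data)     # False

# Containment checks identity before equality, so the *same* object is found.
# This is why NaN can look like a singleton in casual tests.
print(math.nan in [math.nan])   # True

# The reliable test is an explicit scan, one math.isnan() call per element.
has_nan = any(map(math.isnan, data))
print(has_nan)                  # True

# Dropping NaNs before handing the data to a statistics function.
cleaned = [x for x in data if not math.isnan(x)]
print(cleaned)                  # [1.0, 3.0]
```

The scan is O(n) with a Python-level call per element, which is the cost being objected to here.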
-- Marc-Andre Lemburg eGenix.com
-- Christopher Barker, PhD (Chris)
Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython