Is nan in (nan,) correct?

Steven D'Aprano steve+comp.lang.python at pearwood.info
Fri Mar 6 12:07:25 EST 2015


random832 at fastmail.us wrote:

> On Thu, Mar 5, 2015, at 22:49, Chris Angelico wrote:
>> I'm not sure it's just an optimization. Compare this post from
>> python-dev, where Nick Coghlan discusses the same topic:
>> 
>> https://mail.python.org/pipermail/python-dev/2014-July/135476.html
> 
> If it is a bug for NaN to "infect" containers' behavior, we need to take
> a serious look at sort().

Hmmm. No, I don't think so. I think that NANs are just weird and you either
accept that they do weird things or you don't use them.

Real numbers have a total ordering, so sorting a list of real numbers (or
their closest equivalent in computer implementations) makes sense, and sure
enough we can meaningfully sort a list of finite floats.

Complex numbers do not define an ordering at all: given a complex number w,
and another complex number z, asking whether w < z is meaningless. Ordering
comparisons simply aren't defined for complex numbers at all, and trying to
sort a list with more than one complex number will raise an exception.

But NANs are weird. NANs are unordered like complex, but unlike complex the
order comparisons are defined. They just always return False (except for
not-equal, which always returns True):

    NAN < x returns False, for every x
    NAN > x also returns False, for every x

If you think of NANs as members of a set with total ordering, like the real
number line, then this doesn't make sense. But floats (as opposed to the
Real Numbers) don't have total order. If you think of NANs as members of
unorderable values like the complex plane, then you should get an
exception, but they aren't members of an unorderable set, they're merely
not ordered. There's nothing wrong with doing:

    if x < 1.2345: ...

just because x happens to be a NAN. One shouldn't raise an exception (even
though this is an exceptional case), because < and > operators are
perfectly well-defined for NANs, unlike the case for complex values.

But the consequence of this is that sort misbehaves. Sort algorithms depend
on the values being sorted having a total order. If you sort a list
containing objects with a partial order (like sets, or NANs), then the sort
result is dependent on the initial order of the elements.

If we really wanted to solve this problem, we could re-define sorting to use
a separate method instead of __lt__, let's call it __sort_lt__. If
__sort_lt__ doesn't exist, sorting will fall back on __lt__. That will
allow sets to distinguish between the use of < for subset testing and the
use of < for the purposes of sorting, and raise an exception or a warning.
Likewise floats could define the method:

def __sort_lt__(self, other):
    if isnan(self) or isnan(other):
        raise SomeException
    return type(self).__lt__(other)


Another option would be to have a context manager which disables order
comparisons for NANs:

def sorted(it):
    L = list(it)
    with disabled_nan_comparisons():
        L.sort()  # raise if L contains a NAN
    return L

sort of thing.

But either solution is a pretty big change for a pretty rare issue.



-- 
Steven




More information about the Python-list mailing list