Proposal: === and !=== operators

Sat Jul 12 12:35:07 EDT 2014

On Sat, 12 Jul 2014 13:54:07 +0200, Johannes Bauer wrote:

> On 09.07.2014 11:17, Steven D'Aprano wrote:
> 
>> People are already having problems, just listen to Anders. He's
>> (apparently) not doing NAN-aware computations on his data, he just
>> wants to be able to do something like
>> 
>> this_list_of_floats == that_list_of_floats
> 
> This is a horrible example.
> 
> There's no pretty way of saying this: Comparing floats using equals
> operators has always and will always be an incredibly dumb idea. The
> same applies obviously to containers containing floats.

That's a myth. It simply is not true that you should never compare floats 
with the equals operator. That advice comes from the dark ages of numeric 
computing prior to IEEE-754.

If you said, "for many purposes, one should not compare floats for 
equality, but should use some sort of fuzzy comparison instead" then I 
would agree with you. But your insistence that equality "always" is wrong 
takes it out of good advice into the realm of superstition.

Quoting Professor William Kahan from the foreword to the "Apple Numerics 
Manual", second edition:

    [B]ecause so many computers in the 1960s and 1970's possessed
    so many different arithmetic anomalies, computational lore has 
    become encumbered with a vast body of superstition purporting 
    to cope with them. One such superstitious rule is "*Never* ask
    whether floating-point numbers are exactly equal."

That was written in 1987, just two years after the introduction of the 
IEEE-754 standard. It is heart-breaking that 27 years later this bogus 
"rule" is still being treated as gospel.

Bruce Dawson has written an awesome series of blog posts dealing with 
floating point issues. In this post:

https://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/

he discusses some of the issues with comparing two C floats or doubles 
for equality. If you read the entire post, he emphasises how hard it is 
to compare floats, and gives three methods:

- test whether they differ by an absolute error
- test whether they differ by a relative error
- test whether they differ by a certain number of Units In Last Place

(one method he misses is the one used by Python's unittest module, whose 
assertAlmostEqual rounds the difference between the values and compares 
it to zero)

and describes some of the weaknesses of each. In a reply to a comment, he 
warns about using == to compare a float (single precision) and a double. 
But if you keep reading all the way down to this comment:

https://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/#comment-9989

he says:

    [T]he default equality comparison absolutely should be 
    true equality. To do otherwise risks madness. I have a 
    post almost ready that uses exact floating-point 
    comparisons to validate math, thus proving that exact 
    comparisons are valid.

    [...] So, standard fuzzy comparison functions would be 
    nice, but the default should remain exact comparisons.
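
For concreteness, here is a rough sketch of the three fuzzy methods Dawson 
lists, next to plain ==. The function names and the default tolerance of 
1e-9 are assumptions made up for this illustration, they are not taken 
from his post, and the code assumes finite inputs (no NANs or INFs):

```python
import struct

def abs_close(a, b, tol=1e-9):
    """Absolute error test. tol is an arbitrary illustrative value."""
    return abs(a - b) <= tol

def rel_close(a, b, tol=1e-9):
    """Relative error test, scaled by the larger magnitude."""
    return abs(a - b) <= tol * max(abs(a), abs(b))

def _ordered(x):
    """Map a double to an int so adjacent doubles map to adjacent ints."""
    (bits,) = struct.unpack("<q", struct.pack("<d", x))
    # Negative floats have the sign bit set; mirror them below zero so
    # that +0.0 and -0.0 both map to 0.
    return bits if bits >= 0 else -(bits & 0x7FFFFFFFFFFFFFFF)

def ulps_apart(a, b):
    """Number of representable doubles between a and b (finite inputs)."""
    return abs(_ordered(a) - _ordered(b))

x, y = 0.1 + 0.2, 0.3
print(x == y)            # False: exact comparison sees the difference
print(abs_close(x, y))   # True
print(rel_close(x, y))   # True
print(ulps_apart(x, y))  # 1: the two values are adjacent doubles
```

Each fuzzy test has a blind spot: abs_close(1e-12, 2e-12) is True even 
though one value is double the other, and rel_close(0.0, 1e-300) is False 
because nothing is relatively close to zero.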

There's one obvious use-case for exact comparison:

"Has this value changed to some other value?"

old = x
x = function(x)
if x != old:
    print("changed!")

is fine. Changing the inequality to some fuzzy comparison is bogus, 
because that means that *some changes will not be detected*.
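
A minimal sketch of the difference, assuming a hypothetical tolerance of 
1e-9 for the fuzzy version (the helper names are invented for this 
example):

```python
def changed_exact(old, new):
    """Report any change at all, however small."""
    return new != old

def changed_fuzzy(old, new, tol=1e-9):
    """Report a change only if it exceeds tol (an arbitrary tolerance)."""
    return abs(new - old) > tol

old = 1.0
new = old + 1e-12   # small, but a genuinely different float

print(changed_exact(old, new))  # True
print(changed_fuzzy(old, new))  # False: the change goes unnoticed
```

Any fixed tolerance has the same problem: changes smaller than the 
tolerance are silently ignored.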

> I also agree with Chris that I think an additional operator will make
> things worse than better. It'll add confusion with no tangible benefit.
> The current operators might have the deficiency that they're not
> reflexive, but then again: Why should == be always reflexive while the
> other operators aren't? 

You're not going to hear me arguing that the non-reflexivity of NANs and 
SQL NULL is a bad idea, although some very smart people have:

http://bertrandmeyer.com/2010/02/06/reflexivity-and-other-pillars-of-civilization/

Mathematical equality is reflexive. It is fundamental to the nature of 
numbers and equality that a number is always equal to itself. To the 
degree that floats are supposed to model real numbers, they should obey 
the same laws as real numbers. However, they already fail to obey them, 
so the failure of reflexivity is just one more problem that makes 
floating point such a hard problem. Compared to floating point 
arithmetic, calculus is easy.
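
A few well-known examples of floats breaking the laws of real numbers, 
as a small sketch using ordinary Python floats (IEEE-754 doubles on 
essentially every current platform):

```python
# Addition is not associative.
a, b, c = 0.1, 0.2, 0.3
left = (a + b) + c
right = a + (b + c)
print(left == right)    # False
print(left, right)      # 0.6000000000000001 0.6

# Adding a nonzero number can leave a value unchanged.
big = 1e16
print(big + 1.0 == big)  # True

# And equality itself is not reflexive for NANs.
nan = float("nan")
print(nan == nan)        # False
print(nan != nan)        # True
```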

> Why should I be able to assume that
> 
> x == x -> True
> 
> but not
> 
> when x < x -> False

Because not all types are ordered:

py> x = 1+3j
py> x < x
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unorderable types: complex() < complex()

-- 
Steven