Proposal: === and !=== operators
Steven D'Aprano
steve+comp.lang.python at pearwood.info
Sat Jul 12 18:35:07 CEST 2014
On Sat, 12 Jul 2014 13:54:07 +0200, Johannes Bauer wrote:
> On 09.07.2014 11:17, Steven D'Aprano wrote:
>
>> People are already having problems, just listen to Anders. He's
>> (apparently) not doing NAN-aware computations on his data, he just
>> wants to be able to do something like
>>
>> this_list_of_floats == that_list_of_floats
>
> This is a horrible example.
>
> There's no pretty way of saying this: Comparing floats using equals
> operators has always and will always be an incredibly dumb idea. The
> same applies obviously to containers containing floats.
That's a myth. It simply is not true that you should never compare floats
with the equals operator. That advice comes from the dark ages of numeric
computing prior to IEEE-754.
If you said, "for many purposes, one should not compare floats for
equality, but should use some sort of fuzzy comparison instead" then I
would agree with you. But your insistence that equality "always" is wrong
takes it out of good advice into the realm of superstition.
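To make the distinction concrete, here is a minimal sketch of cases where exact equality is well defined and cases where it is not. It assumes IEEE-754 doubles, which is what Python floats are on all common platforms. The variable names are just for illustration.

```python
# Sums of small powers of two are exactly representable, so == is reliable.
exact_sum = 0.5 + 0.25

# Integers below 2**53 are stored exactly in a double.
big = float(2**53 - 1)

# Decimal fractions such as 0.1 are not exactly representable in binary.
inexact_sum = 0.1 + 0.2

print(exact_sum == 0.75)     # True
print(big == 2**53 - 1)      # True
print(inexact_sum == 0.3)    # False
```

The third case is where a fuzzy comparison may be what you want. The first two show that equality itself is not the problem.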
Quoting Professor William Kahan from the foreword to the "Apple Numerics
Manual", second edition:
    [B]ecause so many computers in the 1960s and 1970's possessed
    so many different arithmetic anomalies, computational lore has
    become encumbered with a vast body of superstition purporting
    to cope with them. One such superstitious rule is "*Never* ask
    whether floating-point numbers are exactly equal."
That was written in 1987, just two years after the introduction of the
IEEE-754 standard. It is heart-breaking that 27 years later this bogus
"rule" is still being treated as gospel.
Bruce Dawson has written an awesome series of blog posts dealing with
floating point issues. In this post:
https://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/
he discusses some of the issues with comparing two C floats or doubles
for equality. If you read the entire post, he emphasises how hard it is
to compare floats, and gives three methods:
- test whether they differ by an absolute error
- test whether they differ by a relative error
- test whether they differ by a certain number of Units In Last Place
(one method he misses is the one used by Python's unittest module, which
rounds the difference between the values and compares it to zero)
and describes some of the weaknesses of each. In a reply to a comment, he
warns about using == to compare a float (single precision) and a double.
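For anyone who wants to experiment, the methods above can be sketched in Python roughly as follows. These are my own simplified versions, not Bruce Dawson's code: the function names and default tolerances are made up for illustration, the ULP count assumes finite 64-bit doubles, and round_close only approximates what unittest's assertAlmostEqual does by default.

```python
import struct

def abs_close(a, b, tol=1e-9):
    """Absolute error: only sensible when the expected magnitude is known."""
    return abs(a - b) <= tol

def rel_close(a, b, rel=1e-9):
    """Relative error, scaled by the larger of the two magnitudes."""
    return abs(a - b) <= rel * max(abs(a), abs(b))

def _ordered(x):
    """Map a double to an integer that sorts in the same order as the floats."""
    (n,) = struct.unpack("<q", struct.pack("<d", x))
    # Negative floats have the sign bit set; flip them so ordering is monotonic.
    return n if n >= 0 else -(n & 0x7FFFFFFFFFFFFFFF)

def ulps_apart(a, b):
    """Distance between a and b counted in representable doubles."""
    return abs(_ordered(a) - _ordered(b))

def round_close(a, b, places=7):
    """Round the difference to a number of decimal places, compare to zero."""
    return round(a - b, places) == 0

print(ulps_apart(0.1 + 0.2, 0.3))   # 1
print(abs_close(1e-20, 2e-20))      # True, although one is double the other
print(rel_close(1e-20, 0.0))        # False, nothing is relatively close to zero
print(round_close(0.1 + 0.2, 0.3))  # True
```

The second and third results show the kind of weakness he describes: an absolute tolerance is meaningless for very small values, and a relative tolerance breaks down near zero.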
But if you keep reading all the way down to this comment:
https://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/#comment-9989
he says:
    [T]he default equality comparison absolutely should be
    true equality. To do otherwise risks madness. I have a
    post almost ready that uses exact floating-point
    comparisons to validate math, thus proving that exact
    comparisons are valid.
    [...] So, standard fuzzy comparison functions would be
    nice, but the default should remain exact comparisons.
There's one obvious use-case for exact comparison:
"Has this value changed to some other value?"
    old = x
    x = function(x)
    if x != old:
        print("changed!")
is fine. Changing the inequality to some fuzzy comparison is bogus,
because that means that *some changes will not be detected*.
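Here is a small sketch of a change that an exact comparison catches and a fuzzy one misses. The nudge function and the 1e-9 tolerance are hypothetical, picked only to make the point.

```python
def nudge(x):
    """Hypothetical update step that changes x by a very small amount."""
    return x + 1e-12

old = 1.0
new = nudge(old)

# Exact comparison: any change in the stored value is detected.
exact_changed = new != old

# Fuzzy comparison: changes smaller than the tolerance are silently ignored.
fuzzy_changed = abs(new - old) > 1e-9

print(exact_changed)   # True
print(fuzzy_changed)   # False
```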
> I also agree with Chris that I think an additional operator will make
> things worse than better. It'll add confusion with no tangible benefit.
> The current operators might have the deficiency that they're not
> reflexive, but then again: Why should == be always reflexive while the
> other operators aren't?
You're not going to hear me arguing that the non-reflexivity of NANs and
SQL NULL is a bad idea, although some very smart people have:
http://bertrandmeyer.com/2010/02/06/reflexivity-and-other-pillars-of-civilization/
Mathematical equality is reflexive. It is fundamental to the nature of
numbers and equality that a number is always equal to itself. To the
degree that floats are supposed to model real numbers, they should obey
the same laws of real numbers. However, they already fail to obey them,
so the failure of reflexivity is just one more problem that makes
floating point such a hard problem. Compared to floating point
arithmetic, calculus is easy.
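A few of the laws of real numbers that floats break, as a quick sketch assuming IEEE-754 doubles with the default round-to-nearest mode:

```python
a, b, c = 0.1, 0.2, 0.3

# Addition of reals is associative; addition of floats is not.
left = (a + b) + c
right = a + (b + c)
print(left == right)         # False

# Every real number equals itself; a NAN does not.
nan = float("nan")
print(nan == nan)            # False

# Adding a nonzero real always changes the sum; a float can absorb it.
absorbed = 1e16 + 1.0
print(absorbed == 1e16)      # True
```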
> Why should I be able to assume that
>
> x == x -> True
>
> but not
>
> when x < x -> False
Because not all types are ordered:
py> x = 1+3j
py> x < x
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unorderable types: complex() < complex()
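The same contrast as a script rather than an interactive session. This is only a sketch: the exact wording of the TypeError message varies between Python versions, so the code checks the exception type and not the text.

```python
x = 1 + 3j

# Equality is defined for complex numbers, and it is reflexive here.
print(x == x)          # True

# Ordering is not defined for complex numbers at all.
try:
    x < x
    complex_is_ordered = True
except TypeError:
    complex_is_ordered = False
print(complex_is_ordered)   # False

# For an ordered type such as float, x < x is simply False.
y = 1.5
print(y < y)           # False
```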
--
Steven