a.index(float('nan')) fails
Steven D'Aprano
steve+comp.lang.python at pearwood.info
Fri Oct 26 14:40:55 EDT 2012
On Sat, 27 Oct 2012 03:45:46 +1100, Chris Angelico wrote:
> On Sat, Oct 27, 2012 at 3:23 AM, Steven D'Aprano
> <steve+comp.lang.python at pearwood.info> wrote:
>> In real life, you are *much* more likely to run into these examples of
>> "insanity" of floats than to be troubled by NANs:
>>
>> - associativity of addition is lost
>> - distributivity of multiplication is lost
>> - commutativity of addition is lost
>> - not all floats have an inverse
>>
>> e.g.
>>
>> (0.1 + 0.2) + 0.3 != 0.1 + (0.2 + 0.3)
>>
>> 1e6*(1.1 + 2.2) != 1e6*1.1 + 1e6*2.2
>>
>> 1e10 + 0.1 + -1e10 != 1e10 + -1e10 + 0.1
>>
>> 1/(1/49.0) != 49.0
>>
>> Such violations of the rules of real arithmetic aren't even hard to
>> find. They're everywhere.
>
> Actually, as I see it, there's only one principle to take note of: the
> "HMS Pinafore Floating Point Rule"...
>
> ** Floating point expressions should never be tested for equality **
> ** What, never? **
> ** Well, hardly ever! **
>
> The problem isn't with the associativity, it's with the equality
> comparison. Replace "x == y" with "abs(x-y)<epsilon" for some epsilon
> and all your statements fulfill people's expectations.
O RYLY?
Would you care to tell us which epsilon they should use?
Hint: *whatever* epsilon you pick, there will be cases where that is
either stupidly too small, stupidly too large, or one that degenerates to
float equality. And you may not be able to tell if you have one of those
cases or not.
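A minimal sketch of two of those failure modes, assuming a fixed absolute epsilon of 1e-9. The helper name `close` is hypothetical and only exists for this illustration:

```python
def close(x, y, epsilon=1e-9):
    """Compare by absolute difference with a fixed epsilon (hypothetical helper)."""
    return abs(x - y) < epsilon

# Stupidly too large: these differ by a factor of two, yet compare "equal".
tiny_a, tiny_b = 1e-12, 2e-12
print(close(tiny_a, tiny_b))

# Degenerates to float equality: adjacent floats near 1e20 are 16384 apart,
# so two distinct values of this size can never pass the test.
big_a = 1e20
big_b = big_a + 16384.0  # the next representable float above 1e20
print(big_a != big_b, close(big_a, big_b))
```

A relative tolerance moves the problem rather than removing it: it still needs a chosen value, and it breaks down for comparisons against zero.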
Here's a concrete example for you:
What *single* value of epsilon should you pick such that the following
two expressions evaluate correctly?
sum([1e20, 0.1, -1e20, 0.1]*1000) == 200
sum([1e20, 99.9, -1e20, 0.1]*1000) != 200
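To see why no epsilon can work here, it helps to evaluate both sums with plain left-to-right float addition. The sketch below uses an explicit loop (the name `naive_sum` is made up for this example) rather than the built-in sum, on the assumption that recent Python versions (3.12 and later, as far as I know) use a more accurate algorithm inside sum for floats:

```python
import math

def naive_sum(values):
    """Plain left-to-right float addition, with no error compensation."""
    total = 0.0
    for v in values:
        total += v
    return total

first = [1e20, 0.1, -1e20, 0.1] * 1000    # mathematically 200
second = [1e20, 99.9, -1e20, 0.1] * 1000  # mathematically 100000

# Both naive results are identical, so no tolerance can accept one
# and reject the other.
print(naive_sum(first), naive_sum(second))

# math.fsum tracks the lost low-order bits and tells the two apart.
print(math.fsum(first), math.fsum(second))
```

In each pass, the small term is absorbed when added to 1e20, the 1e20 then cancels exactly, and only the final 0.1 survives.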
The advice "never test floats for equality" is:
(1) pointless without a good way to know what epsilon they should use;
(2) sheer superstition since there are cases where testing floats for
equality is the right thing to do (although I note you dodged that bullet
with "hardly ever" *wink*);
and most importantly
(3) missing the point, since the violations of the rules of real-valued
mathematics still occur regardless of whether you explicitly test for
equality or not.
For instance, if you write:
result = a + (b + c)
some compilers may assume associativity and calculate (a + b) + c
instead. But that is not guaranteed to give the same result! (K&R allowed
C compilers to do that; the subsequent ANSI C standard prohibited re-
ordering, but in practice most C compilers provide a switch to allow it.)
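A small Python sketch of what such a re-ordering can do. The values are chosen for illustration only; no comparison operator is needed to get two different answers:

```python
a, b, c = 1e20, -1e20, 0.1

as_written = a + (b + c)   # 0.1 is absorbed into -1e20 before a cancels it
reordered = (a + b) + c    # a and b cancel exactly first, then 0.1 is added

print(as_written, reordered)
```

The two groupings give 0.0 and 0.1 respectively, so a compiler that silently swaps one for the other changes the program's output.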
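A real-world example: compensated summation. Python's math.fsum is a
high-precision summation that keeps track of the rounding error of each
addition (it is based on Shewchuk's exact partial sums rather than on
Kahan's method), and the Kahan summation algorithm is the classic, simpler
technique built on the same idea. Here's a pseudo-code version of Kahan
summation: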
http://en.wikipedia.org/wiki/Kahan_summation_algorithm
which includes the steps:
t = sum + y;
c = (t - sum) - y;
A little bit of algebra should tell you that c must equal zero.
Unfortunately, in this case algebra is wrong, because floats are not real
numbers. c is not necessarily zero.
An optimizing compiler, or an optimizing programmer, might very well
eliminate those calculations and so inadvertently eliminate the error
compensation. And not an equals sign in sight.
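For illustration, here is a rough Python rendering of those two steps and of the loop around them. It is a sketch of textbook Kahan summation, not the code behind math.fsum, and the function names are my own:

```python
def naive_sum(values):
    """Plain left-to-right float addition."""
    total = 0.0
    for x in values:
        total += x
    return total

def kahan_sum(values):
    """Textbook Kahan (compensated) summation."""
    total = 0.0
    c = 0.0                   # running compensation for lost low-order bits
    for x in values:
        y = x - c             # apply the correction from the previous step
        t = total + y         # low-order bits of y may be lost here
        c = (t - total) - y   # algebraically zero; in floats, the lost part
        total = t
    return total

# One step in isolation: c is not zero.
total, y = 1.0, 1e-16
t = total + y
c = (t - total) - y
print(c)

# 1.0 followed by ten thousand terms, each too small to register alone.
values = [1.0] + [1e-16] * 10000
print(naive_sum(values), kahan_sum(values))
```

Delete the line that computes c, on the grounds that it is "obviously" zero, and kahan_sum turns back into naive_sum.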
--
Steven