[Python-ideas] Way to check for floating point "closeness"?

Tue Jan 13 10:28:39 CET 2015

On Tue, Jan 13, 2015 at 1:34 AM, Steven D'Aprano <steve at pearwood.info>
wrote:

> Unfortunately a naive ULP comparison has trouble with NANs, INFs, and
> numbers close to zero, especially if they have opposite signs. The
> smallest representable denormalised floats larger, and smaller, than
> zero are:
>
> 5e-324
> -5e-324
>
> These are the smallest magnitude floats apart from zero, so we might
> hope that they are considered "close together", but they actually differ
> by 9223372036854775808 ULP. Ouch.
>

Only with a naive (i.e., wrong :-) implementation. Those two floats differ
by precisely 2 units in the last place, and any correct implementation
should report that.  It's not hard to write code that deals correctly with
opposite signs.  Here's a simple difference_in_ulps function that correctly
reports the difference in ulps between any two finite floats.

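>>> import struct
>>> def to_ulps(x):
...     # Reinterpret the 8 bytes of the double as a signed 64-bit integer.
...     n = struct.unpack('<q', struct.pack('<d', x))[0]
...     # Negative floats have the sign bit set; map them below zero so the
...     # integers are ordered the same way as the floats (-0.0 maps to 0).
...     return -(n + 2**63) if n < 0 else n
...
>>> def difference_in_ulps(x, y):
...     return abs(to_ulps(x) - to_ulps(y))
...
>>> difference_in_ulps(-5e-324, 5e-324)
2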

This is almost exactly what's in Lib/test/test_math.py already, except that
the function there is better documented and uses "~(n + 2**63)" instead of
"-(n + 2**63)" in the negative n correction branch, which has the effect of
regarding 0.0 and -0.0 as 1 ulp apart.
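
To make the effect of that one-character difference concrete, here is a
minimal sketch of the "~" variant.  It is not the code from
Lib/test/test_math.py; the names to_ulps_tilde and difference_in_ulps_tilde
are made up for this illustration, and it assumes the same struct-based
reinterpretation as above.

```python
import struct

def to_ulps_tilde(x):
    # Hypothetical name: same bit reinterpretation as to_ulps above.
    n = struct.unpack('<q', struct.pack('<d', x))[0]
    # Using ~ instead of - shifts every negative float down by one,
    # so -0.0 maps to -1 while 0.0 still maps to 0.
    return ~(n + 2**63) if n < 0 else n

def difference_in_ulps_tilde(x, y):
    # Hypothetical name: distance under the "~" mapping.
    return abs(to_ulps_tilde(x) - to_ulps_tilde(y))

print(difference_in_ulps_tilde(0.0, -0.0))        # zeros count as distinct
print(difference_in_ulps_tilde(-5e-324, 5e-324))  # one more than before
```

With this mapping the two zeros are 1 apart, and any pair of floats with
opposite signs is reported as one ulp further apart than with the "-"
version.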

Comparing by ulps was what I needed for testing library-quality functions
for the math and cmath modules; I doubt that it's what's needed for most
comparison tasks.  I'd expect the suggested combination of relative error
and absolute error to be more appropriate most of the time.
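
For comparison, a rough sketch of what a combined relative/absolute test
might look like.  The function name is_close, the parameter names rel_tol
and abs_tol, and the default values are all assumptions made for this
example, not an agreed design.

```python
import math

def is_close(a, b, rel_tol=1e-9, abs_tol=0.0):
    # Hypothetical helper: names and defaults are illustrative only.
    if a == b:
        # Covers exact matches, including equal infinities.
        return True
    if math.isinf(a) or math.isinf(b):
        # An infinity is not close to anything but itself.
        return False
    diff = abs(a - b)
    # Relative tolerance is scaled by the larger magnitude; the absolute
    # tolerance is what allows comparisons against zero to succeed.
    return diff <= max(rel_tol * max(abs(a), abs(b)), abs_tol)

print(is_close(1.0, 1.0 + 1e-12))
print(is_close(1e-300, 0.0))
print(is_close(1e-300, 0.0, abs_tol=1e-12))
```

The relative part alone can never accept a nonzero value as close to zero,
which is why some absolute tolerance is needed near zero.  NaNs compare as
not close to anything here, since every comparison involving them is false.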

-- 
Mark

>
> I have some ideas for dealing with that, and if anyone is interested I'm
> happy to talk about it, but they're not ready for production yet.
>
> I think that Bruce Dawson is right. Floating point comparisons are
> hard, really hard. I know that I've still got a lot to learn about it. I
> can think of at least five different ways to compare floats for
> equality, and they all have their uses:
>
> - exact equality using ==
> - absolute error tolerances
> - relative error tolerances
> - ULP comparisons
> - the method unittest uses, using round()
>
>
> I'm explicitly including == because it is a floating point superstition
> that one should never under any circumstances compare floats for exact
> equality. As general advice, "don't use == unless you know what you are
> doing" is quite reasonable, but it's the "never use" that turns it into
> superstition. As Bruce Dawson says, "Floating-point numbers aren’t
> cursed", and throwing epsilons into a problem where no epsilon is needed
> is a bad idea.
>
>
> https://randomascii.wordpress.com/2012/06/26/doubles-are-not-floats-so-dont-compare-them/
>
>
> Aside: I'm reminded of APL, which mandates fuzzy equality (i.e. with a
> tolerance) of floating point numbers:
>
>     In an early talk Ken [Iverson] was explaining the advantages
>     of tolerant comparison. A member of the audience asked
>     incredulously, “Surely you don’t mean that when A=B and B=C,
>     A may not equal C?” Without skipping a beat, Ken replied,
>     “Any carpenter knows that!” and went on to the next question.
>     - Paul Berry
>
>
>
> --
> Steve