<div dir="ltr">On Tue, Jan 13, 2015 at 1:34 AM, Steven D'Aprano <span dir="ltr"><<a href="mailto:steve@pearwood.info" target="_blank">steve@pearwood.info</a>></span> wrote:<br><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">Unfortunately a naive ULP comparison has trouble with NANs, INFs, and<br>

numbers close to zero, especially if they have opposite signs. The<br>

smallest representable denormalised floats larger, and smaller, than<br>

zero are:<br>

<br>

5e-324<br>

-5e-324<br>

<br>

These are the smallest magnitude floats apart from zero, so we might<br>

hope that they are considered "close together", but they actually differ<br>

by 9223372036854775808 ULP. Ouch.<br></blockquote><div><br></div><div>Only with a naive (i.e., wrong :-) implementation. Those two floats differ by precisely 2 units in the last place, and any correct implementation should report that.  It's not hard to write code that deals correctly with opposite signs.  Here's a simple difference_in_ulps function that correctly reports the number of ulps difference between any two finite floats.</div><div><br></div><div><p style="margin:0px;font-size:12px;font-family:Menlo">>>> import struct</p>

<p style="margin:0px;font-size:12px;font-family:Menlo">>>> def to_ulps(x):</p>

<p style="margin:0px;font-size:12px;font-family:Menlo">...     n = struct.unpack('<q', struct.pack('<d', x))[0]</p>

<p style="margin:0px;font-size:12px;font-family:Menlo">...     return -(n + 2**63) if n < 0 else n</p>

<p style="margin:0px;font-size:12px;font-family:Menlo">... </p>

<p style="margin:0px;font-size:12px;font-family:Menlo">>>> def difference_in_ulps(x, y):</p>

<p style="margin:0px;font-size:12px;font-family:Menlo">...     return abs(to_ulps(x) - to_ulps(y))</p>

<p style="margin:0px;font-size:12px;font-family:Menlo">... </p>

<p style="margin:0px;font-size:12px;font-family:Menlo">>>> difference_in_ulps(-5e-324, 5e-324)</p>

<p style="margin:0px;font-size:12px;font-family:Menlo">2</p></div><div><br></div><div>This is almost exactly what's in Lib/test/test_math.py already, except that the function there is better documented and uses "~(n + 2**63)" instead of "-(n + 2**63)" in the negative n correction branch, which has the effect of regarding 0.0 and -0.0 as 1 ulp apart.</div><div><br></div><div>Comparing by ulps was what I needed for testing library-quality functions for the math and cmath modules; I doubt that it's what's needed for most comparison tasks.  I'd expect the suggested combination of relative error and absolute error to be more appropriate most of the time.</div><div><br></div><div>-- </div><div>Mark</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

<br>

I have some ideas for dealing with that, and if anyone is interested I'm<br>

happy to talk about it, but they're not ready for production yet.<br>

<br>

I think that the Bruce Dawson is right. Floating point comparisons are<br>

hard, really hard. I know that I've still got a lot to learn about it. I<br>

can think of at least five different ways to compare floats for<br>

equality, and they all have their uses:<br>

<br>

- exact equality using ==<br>

- absolute error tolerances<br>

- relative error tolerances<br>

- ULP comparisons<br>

- the method unittest uses, using round()<br>

<br>

<br>

I'm explicitly including == because it is a floating point superstition<br>

that one should never under any circumstances compare floats for exact<br>

equality. As general advice, "don't use == unless you know what you are<br>

doing" is quite reasonable, but it's the "never use" that turns it into<br>

superstition. As Bruce Dawson says, "Floating-point numbers aren’t<br>

cursed", and throwing epsilons into a problem where no epsilon is needed<br>

is a bad idea.<br>

<br>

<a href="https://randomascii.wordpress.com/2012/06/26/doubles-are-not-floats-so-dont-compare-them/" target="_blank">https://randomascii.wordpress.com/2012/06/26/doubles-are-not-floats-so-dont-compare-them/</a><br>

<br>

<br>

Aside: I'm reminded of APL, which mandates fuzzy equality (i.e. with a<br>

tolerance) of floating point numbers:<br>

<br>

    In an early talk Ken [Iverson] was explaining the advantages<br>

    of tolerant comparison. A member of the audience asked<br>

    incredulously, “Surely you don’t mean that when A=B and B=C,<br>

    A may not equal C?” Without skipping a beat, Ken replied,<br>

    “Any carpenter knows that!” and went on to the next question.<br>

    - Paul Berry<br>

<span class=""><font color="#888888"><br>

<br>

<br>

--<br>

Steve<br>

</font></span><div class=""><div class="h5">_______________________________________________<br>

Python-ideas mailing list<br>

<a href="mailto:Python-ideas@python.org">Python-ideas@python.org</a><br>

<a href="https://mail.python.org/mailman/listinfo/python-ideas" target="_blank">https://mail.python.org/mailman/listinfo/python-ideas</a><br>

Code of Conduct: <a href="http://python.org/psf/codeofconduct/" target="_blank">http://python.org/psf/codeofconduct/</a></div></div></blockquote></div><br></div></div>