Re: [Python-ideas] Way to check for floating point "closeness"?

14 Jan 2015


On 01/14/2015 04:39 AM, Steven D'Aprano wrote:
...
On Tue, Jan 13, 2015 at 08:57:38PM -0600, Ron Adam wrote:
...
On 01/13/2015 09:53 AM, Chris Barker - NOAA Federal wrote:
[...]
I haven't thought it out yet, but maybe we could specify an absolute
tolerance near zero, and a relative tolerance elsewhere, both at once.
Tricky to document, even if possible.
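
A minimal sketch of what "both at once" could look like, assuming the two
tolerances are combined by taking whichever bound is larger. The name
is_close and its defaults are hypothetical, chosen only for illustration,
and the sketch makes no attempt to handle infinities or NaNs.

```python
def is_close(a, b, rel_tol=1e-9, abs_tol=0.0):
    """Hypothetical helper: True if a and b agree within either tolerance."""
    diff = abs(a - b)
    # The relative bound scales with the larger magnitude, so it shrinks
    # to nothing near zero; the absolute bound is a fixed floor that
    # takes over there.
    return diff <= max(rel_tol * max(abs(a), abs(b)), abs_tol)


# With only a relative tolerance, nothing but zero is close to zero.
print(is_close(1e-12, 0.0))                  # False
# Adding an absolute tolerance makes the comparison near zero usable.
print(is_close(1e-12, 0.0, abs_tol=1e-9))    # True
# Away from zero the relative tolerance does the work.
print(is_close(1000.0, 1000.0000001))        # True
```

Python 3.5 and later provide math.isclose(a, b, rel_tol=..., abs_tol=...),
which combines the two tolerances in the same way for finite values.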
Doesn't this problem come up at any boundary comparison, and not just zero?
No.
A quick refresher on error tolerances...
Suppose you have a value which should be exactly 0.5, but you calculate
it as 0.51. Then the absolute error is 0.51-0.5 = 0.01, and the relative
error is 0.01/0.5 = 0.02 (or 2%).
But consider two values, 0.0 and 0.1. Then the absolute error is 0.1-0.0
= 0.1, and the relative error is 0.1/0.0 which is infinite.
So 0.0 and -0.0 are problematic when dealing with relative errors.
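
The arithmetic in that refresher can be sketched as follows. The function
names abs_error and rel_error are hypothetical, and the relative error is
assumed to be measured against the expected value.

```python
def abs_error(expected, actual):
    """Hypothetical helper: size of the difference, in the values' own units."""
    return abs(actual - expected)


def rel_error(expected, actual):
    """Hypothetical helper: difference as a fraction of the expected value."""
    # Dividing by the expected value is what breaks down at zero.
    return abs(actual - expected) / abs(expected)


print(abs_error(0.5, 0.51))   # approximately 0.01
print(rel_error(0.5, 0.51))   # approximately 0.02, i.e. 2%

try:
    rel_error(0.0, 0.1)
except ZeroDivisionError as exc:
    # Python raises instead of returning infinity for float division by zero.
    print("relative error undefined:", exc)
```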
I'm not sure why, but it seems like something here is out of place to me. 
Either it's not something that would come up, or it's something you would 
expect and handle differently.  Or it would be an error, signalling that you 
need to handle it differently.
...
So isn't the issue about any n distance from any floating point number that
is less than 1 ulp?  And in that regard, comparison to zero is no different
than any comparison to any other floating point value?
No. 1 ULP (Unit In Last Place) is the smallest possible difference
between two floats. A difference of 0 ULP means the two floats are
exactly equal.
A difference of 0 ULP means they *may* be exactly equal.  The 
calculation/representation just can't resolve to a finer amount, so two 
figures less than 1 ULP apart can get the same floating point 
representation.  Each of the two numbers still carries a +- error, possibly
+-0.5 ULP, i.e. they are potentially 1 ULP apart.
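
A small sketch of that point, using two decimal literals chosen purely for
illustration. Both parse to the same float, so they are 0 ULP apart, yet
that float still differs from the decimal value 0.1 by a rounding error of
up to half an ULP. It assumes Python 3.9 or later for math.ulp.

```python
import math
from decimal import Decimal

a = 0.1
b = 0.1000000000000000055511151231257827   # a different decimal string

# Both literals round to the same double, so they compare equal.
print(a == b)          # True

# Decimal(float) shows the exact value that is actually stored.
print(Decimal(a))

# The stored value is within half an ULP of the decimal 0.1.
rounding_error = abs(Decimal(a) - Decimal("0.1"))
half_ulp = Decimal(math.ulp(a)) / 2
print(rounding_error <= half_ulp)   # True
```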
...
How it works: in mathematics, real numbers are continuous, but floats
are not.  There are only 2**64 floats in Python (a C double), fewer if you
ignore the NANs and INFs, which means we can conveniently enumerate them
from -(2**63) to (2**63-1), based on the internal structure of a float.
So if you convert two floats into this enumerated integer value (which
is equivalent to reinterpreting the bits of a C double as a 64-bit C
integer, not converting its numeric value) and subtract the two ints,
this gives you a measure of how far apart they are. (As Mark mentioned
earlier, you have to make allowance for negative floats, and INFs and
NANs are problematic too.)
If two values are exactly equal, their "distance apart" in ULP will be
zero. A distance of 1 ULP means they are consecutive floats, they cannot
possibly be any closer without being equal. A distance of 2 ULP means
there is only a single float separating them, and so on.
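
A minimal sketch of that enumeration in Python, assuming IEEE 754 doubles.
The names float_to_ordinal and ulp_distance are hypothetical. NANs are
rejected outright, and INFs are simply treated as the next position past
the largest finite float, which may or may not be what a caller wants.

```python
import struct


def float_to_ordinal(x):
    """Hypothetical helper: map a float to an int that sorts like the float."""
    # Reinterpret the 8 bytes of the double as a signed 64-bit integer.
    (n,) = struct.unpack("<q", struct.pack("<d", x))
    # Negative floats are stored as sign plus magnitude, so their raw
    # integers run backwards. Negate the magnitude so that the ordering
    # is monotonic and -0.0 lands on the same ordinal as 0.0.
    if n < 0:
        n = -(n & 0x7FFFFFFFFFFFFFFF)
    return n


def ulp_distance(a, b):
    """Hypothetical helper: number of float steps between a and b."""
    if a != a or b != b:
        raise ValueError("NaN has no position in the ordering")
    return abs(float_to_ordinal(a) - float_to_ordinal(b))


print(ulp_distance(1.0, 1.0))                   # 0
print(ulp_distance(1.0, 1.0000000000000002))    # 1
print(ulp_distance(-5e-324, 5e-324))            # 2
```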
Note that ULP do not directly correspond to a numeric tolerance. For
example, these pairs of values are each 1 ULP apart:
0.0 and 5e-324
1.0 and 1.0000000000000002
1e300 and 1.0000000000000002e+300
So in these three cases, 1 ULP represents numeric differences of:
0.00000000000000000000...00005
0.0000000000000002
2000000000000000000000...000.0
respectively.
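
Those gaps can be checked with a short sketch, assuming Python 3.9 or
later for math.nextafter. The helper name gap_above is hypothetical.

```python
import math


def gap_above(x):
    """Hypothetical helper: distance from x to the next float above it."""
    return math.nextafter(x, math.inf) - x


for x in (0.0, 1.0, 1e300):
    # Same 1 ULP step each time, very different numeric size.
    print(x, math.nextafter(x, math.inf), gap_above(x))
```

For positive finite x, gap_above(x) should agree with math.ulp(x).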
...
Just trying to follow along,
A good resource is Bruce Dawson's blog RandomASCII, if you don't mind
the focus on C++. Start here:
https://randomascii.wordpress.com/2012/02/25/comparing-floating-point-number...
Thanks, :)

    Ron