[Python-ideas] Way to check for floating point "closeness"?

Andrew Barnert abarnert at yahoo.com
Wed Jan 14 07:10:39 CET 2015


On Jan 13, 2015, at 21:12, Ron Adam <ron3200 at gmail.com> wrote:

> On 01/13/2015 09:25 PM, Chris Barker - NOAA Federal wrote:
>> On Jan 13, 2015, at 6:58 PM, Ron Adam <ron3200 at gmail.com> wrote:
> 
>>>> maybe we could specify an absolute
>>>> tolerance near zero, and a relative tolerance elsewhere, both at once.
>>>> Tricky to document, even if possible.
> 
>>> Doesn't this problem come up at any boundary comparison, and not just zero?
> 
>> Zero is special because you lose the ability to use a relative
>> tolerance. Everything is huge compared to zero.
> 
> After I posted I realised that when you compare anything you subtract what you are comparing to, and if the result is equal to zero, then it's equal to what you are comparing to.  So testing against zero is fundamental to all comparisons, is this correct?
> 
> Wouldn't a relative tolerance be set relative to some value that is *not* zero?  And then used when comparing any close values, including zero.

I think you're missing the meanings of these terms.

Absolute tolerance means a fixed tolerance, like 1e-5; relative tolerance means you pick a tolerance that's relative to the values being compared--think of it as a percentage, say, 0.01%. Plenty of values are within +/- 1e-5 of 0. But the only value within +/- 0.01% of 0 is 0 itself.

Another way to look at it: x is close to y with absolute tolerance 1e-5 if abs(x-y) < 1e-5. x is close to y with relative tolerance 1e-5 if abs(x-y)/abs(y) < 1e-5. So, no value other than 0 itself is ever close to 0 within any relative tolerance.
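
To make that concrete, here is a rough sketch (illustration only, not a proposed API; it scales the relative check by the larger of the two magnitudes, which is just one reasonable choice):

    def is_close_abs(x, y, tol=1e-5):
        # absolute: the raw difference must be within a fixed tolerance
        return abs(x - y) <= tol

    def is_close_rel(x, y, tol=1e-5):
        # relative: the difference must be small compared to the magnitudes
        return abs(x - y) <= tol * max(abs(x), abs(y))

    is_close_abs(1e-7, 0.0)   # True: 1e-7 is within +/- 1e-5 of 0
    is_close_rel(1e-7, 0.0)   # False: only 0 itself is relatively close to 0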

>>> So isn't the issue about any n distance from any floating point number that is less than 1 ulp?
>> I'm still a bit fuzzy on Ulps, but it seems the goal here is to define
>> a tolerance larger than an ulp.  This is for the use case where we
>> expect multiple rounding errors -- many more than one ulp,

>> That's why I think the use case for ulp comparisons is more about
>> assessment of accuracy of algorithms than "did I introduce a big old
>> bug?" or, "is this computed value close enough to what I measured?"
> 
> I haven't looked into the finer aspects of ulps myself. It seems to me ulps
> only matter if the exponent parts of two floating point numbers are equal
> and the value parts are within 1 (or a few) ulps of each other. Then
> there may be problems determining if they are equal, or one is greater or
> less than the other.  And only if there is no greater tolerance value set.

No, this is wrong.

First, 1.1e1 and 1.0e10 (binary, with a 1-bit mantissa as in the example below) are only off by 1 ulp even though they have different exponents.

Second, there's never any problem determining if two finite numbers are equal or which one is greater. The issue is determining whether two numbers +/- their error bars are too close to call vs. unambiguously greater or lesser. For example, if I get 1.0e1 and 1.1e1 (assuming 1 bit mantissa for ease of discussion), the latter is clearly greater--but if I have 2 ulp of error, or an absolute error of 1e1, or a relative error of 50%, the fact that the latter is greater is irrelevant--each value is within the other value's error range. But if I get 1.0e1 and 1.0e10 with the same error, then I can say that the latter is unambiguously greater.
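
In code, that kind of three-way comparison might look something like this sketch (using a symmetric absolute error bar, just for illustration):

    def compare_with_error(x, y, err):
        # If each value lies within the other's error bar, the comparison
        # is too close to call; otherwise the usual ordering is meaningful.
        if abs(x - y) <= err:
            return "too close to call"
        return "x > y" if x > y else "x < y"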

I think Chris is right that ulp comparisons usually only come up in testing an algorithm. You have to actually do an error analysis, and you have to have input data with a precision specified in ulp, or you're not going to get a tolerance in ulp. When you want to verify that you've correctly implemented an algorithm that guarantees to multiply input ulp by no more than 4, you can feed in numbers with +/- 1 ulp error (just typing in decimal numbers does that) and verify that the results are within +/- 4 ulp. But when you have real data, or inherent rounding issues, or an error analysis that's partly made up of rules of thumb and winging it, you're almost always going to end up with absolute or relative error instead. (Or, occasionally, something more complicated that you have to code up manually, like logarithmic relative error.)
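
For what it's worth, an ulp distance between two doubles can be computed by reinterpreting their IEEE-754 bit patterns as integers. A rough, untested sketch of that idea:

    import struct

    def ulp_diff(a, b):
        # Reinterpret each double's bit pattern as an integer; for finite
        # doubles, adjacent values differ by exactly 1 in this representation.
        def ordered(x):
            u = struct.unpack('<Q', struct.pack('<d', x))[0]
            # Fold negative floats so integer order matches float order.
            return u if u < 1 << 63 else (1 << 63) - u
        return abs(ordered(a) - ordered(b))

    ulp_diff(1.0, 1.0 + 2**-52)   # 1: these are adjacent doubles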
> 
> A couple of thoughts come to mind.  I'm not sure if they are relevant though.
> 
> I think for math algorithms that are hard coded in a program, it isn't much of an issue, as the author would have a feel for the size of a delta if needed, can output an appropriate number of significant digits, and can calculate the error range if needed as well.  That probably fits most situations and is what is typically done.
> 
> It seems to me that automated tracking and/or use of these things may be wanted with equation solvers.  The solver would determine the delta and significant digits from its initial data.  That sounds like it could get very complex, but maybe making this easier to do is the point?
> 
> Cheers,
>  Ron
> 

