[Python-ideas] Way to check for floating point "closeness"?
Chris Barker
chris.barker at noaa.gov
Wed Jan 14 17:44:24 CET 2015
On Wed, Jan 14, 2015 at 1:08 AM, Ron Adam <ron3200 at gmail.com> wrote:
> Significant digits has more to do with error of measurements, (and
> estimates),
right -- and relative tolerance is kinda-sorta like significant digits, i.e.
a relative tolerance of 1e-4 is like saying the same to 4 (decimal)
significant digits.
which brings up the issue with 0.0 -- how many significant digits does 0.0
have?
does 0.000001234 have the same significant digits as 0.0? -- it's not
really defined.
whereas it's fairly straightforward to say that:
0.000001234 and 0.000001233
are the same to 3 significant digits, but differ in the fourth.
and note that:
In [46]: u, v = 0.000001234, 0.000001233
In [47]: err = abs(u-v)
In [48]: err
Out[48]: 9.999999999998634e-10
so the absolute error is less than 1e-9 == pretty "small" -- but is that
what we generally want? no.
In [49]: err <= 1e-3*abs(u)
Out[49]: True
but the error is less than a relative tolerance of 1e-3 (three sig figs)
In [50]: err <= 1e-4*abs(v)
Out[50]: False
and greater than a relative tolerance of 1e-4 (not four sig figs)
Again, this all works great when you are away from zero (and maybe need to
handle NaN and inf carefully too), but if we simply put this in the stdlib,
then folks may use it with a zero value, and not get what they expect.
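To make the zero problem concrete, here is a minimal sketch. The helper name `rel_close` and its default tolerance are made up for illustration, and scaling by the larger magnitude is just one possible choice:

```python
def rel_close(a, b, rel_tol=1e-9):
    """Purely relative test, scaled by the larger magnitude (illustrative only)."""
    return abs(a - b) <= rel_tol * max(abs(a), abs(b))


# Away from zero the test behaves as expected.
print(rel_close(1.0, 1.0 + 1e-12))   # True

# Against exactly 0.0 the allowed error is rel_tol * |a|, which is always
# smaller than |a| itself, so no nonzero value is ever "close" to zero.
print(rel_close(1e-300, 0.0))        # False
print(rel_close(0.0, 0.0))           # True

# inf - inf is nan, and any comparison with nan is False.
print(rel_close(float("inf"), float("inf")))  # False
```

However tiny the nonzero value is, a purely relative test rejects it against 0.0, which is probably not what most callers expect.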
My thinking now:
set a "zero_tolerance", which will default to the relative tolerance, but be
user-settable.
If one of the input values is zero, then use the zero_tolerance as an
absolute tolerance, if not, then use a relative tolerance. I think this
would make for fewer surprises, and make it easier to use the same function
call for a wide range of values, some of which may be zero.
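A rough sketch of that idea follows. It is not the code from the gist; the function name `is_close`, the parameter names, the default of 1e-9, and the use of the larger magnitude for scaling are all assumptions made for illustration:

```python
def is_close(a, b, rel_tol=1e-9, zero_tol=None):
    """Relative test, except that an exact zero switches to an absolute test.

    zero_tol defaults to rel_tol, as proposed above (hypothetical API).
    """
    if zero_tol is None:
        zero_tol = rel_tol
    diff = abs(a - b)
    if a == 0.0 or b == 0.0:
        # One input is exactly zero: treat zero_tol as an absolute tolerance.
        return diff <= zero_tol
    # Otherwise scale the tolerance by the larger magnitude (an assumption).
    return diff <= rel_tol * max(abs(a), abs(b))


u, v = 0.000001234, 0.000001233
print(is_close(u, v, rel_tol=1e-3))       # True  (same to three sig figs)
print(is_close(u, v, rel_tol=1e-4))       # False (differ in the fourth)
print(is_close(0.0, 1e-12))               # True  (absolute test, zero_tol=1e-9)
print(is_close(0.0, 1e-6))                # False
print(is_close(0.0, 1e-6, zero_tol=1e-5)) # True  (user-set zero_tol)
```

The same call then works for the small nonzero values from the session above and for inputs that happen to be exactly zero.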
What I haven't figured out yet is how (or if) to make sure that the
transition is continuous -- do we only use zero_tol if one of the values is
exactly zero? or if one or both of the values is less than zero_tol? It
seems that if you say, for instance that:
1e-12 is "close" to zero, then 1e-12 should also be "close" to any value
less than 1e-12. But if 1e-12 is "close" to 1e-14 (for example), then 1e-12
should probably be "close" to 1.00000000001 also, but it wouldn't be, if we
did an abrupt change to relative tolerance for any value >= the
zero_tolerance.
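The abrupt switch can be seen in a small sketch of the "less than zero_tol" variant. The name `near_zero_close`, the specific tolerances, and the test values are picked only to expose the jump; they are not from the gist:

```python
def near_zero_close(a, b, rel_tol=1e-9, zero_tol=1e-12):
    """Absolute test if either value is below zero_tol, else relative (illustrative)."""
    diff = abs(a - b)
    if min(abs(a), abs(b)) < zero_tol:
        return diff <= zero_tol
    return diff <= rel_tol * max(abs(a), abs(b))


# Both pairs fall under the absolute branch and count as "close".
print(near_zero_close(1e-14, 1e-12))    # True
print(near_zero_close(9e-13, 1.5e-12))  # True

# Here neither value is below zero_tol, so the relative branch is used and
# the answer flips, even though this pair is nearer together than the
# previous one.
print(near_zero_close(1e-12, 1.5e-12))  # False
```

Two values that differ by 6e-13 are "close" while two that differ by 5e-13 are not, which is the kind of discontinuity at the zero_tol boundary described above.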
So more to think out here -- feel free to chime in.
I've been playing with this gist, though it's only a few real lines of
code anyway:
https://gist.github.com/PythonCHB/6e9ef7732a9074d9337a
> while ULPs is about accuracy limits of the hardware/software ability to
> calculate.
>
>
>> Second, there's never any problem determining if two finite numbers are
>> equal or which one is greater. The issue is determining whether two
>> numbers ± their error bars are too close to call vs. unambiguously
>> greater or lesser. For example, if I get 1.0e1 and 1.1e1 (assuming 1 bit
>> mantissa for ease of discussion), the latter is clearly greater--but if
>> I have 2 ulp of error, or an absolute error of 1e1, or a relative error
>> of 50%, the fact that the latter is greater is irrelevant--each value is
>> within the other value's error range.
>>
>
> You would use the larger of the three. And possibly give a warning if the
> 2 ulp error is the largest. (if the application is set to do so.)
>
> I'm presuming the 2 ulp is twice the limit of the floating point precision
> here.
>
> 50%      accuracy of data
> 1e1      limit of significant digits/measurement
> 2 ulp    twice floating point unit of least precision
>
>> But if I get 1.0e1 and 1.0e10 with
>> the same error, then I can say that the latter is unambiguously
>> greater.
>>
>
> Yes, this was the point I was alluding to earlier.
>
>
>> I think Chris is right that ulp comparisons usually only come up in
>> testing an algorithm. You have to actually do an error analysis, and you
>> have to have input data with a precision specified in ulp, or you're not
>> going to get a tolerance in ulp. When you want to verify that you've
>> correctly implemented an algorithm that guarantees to multiply input ulp
>> by no more than 4, you can feed in numbers with ± 1 ulp error (just typing
>> in decimal numbers does that) and verify that the results are within ± 4
>> ulp. But when you have real data, or inherent rounding issues, or an error
>> analysis that's partly made up of rules of thumb and winging it, you're
>> almost always going to end up with absolute or relative error instead.
>> (Or, occasionally, something more complicated that you have to code up
>> manually, like logarithmic relative error.)
>>
>
> If the algorithm doesn't track error accumulation, then yes.
>
> This is interesting but I'm going to search for some examples of how to
> use some of this. I'm not sure I can add to the conversation much, but
> thanks for taking the time to explain some of it.
>
>
> Cheers,
> Ron
--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker at noaa.gov