Floating point equality [was Re: What exactly is "exact" (was Clean Singleton Docstrings)]
Steven D'Aprano
steve+comp.lang.python at pearwood.info
Wed Jul 20 01:42:50 EDT 2016
On Tuesday 19 July 2016 14:58, Rustom Mody wrote:
> So I again ask: You say «"Never compare floats for equality" is a pernicious
> myth»
It is the word *never* which makes it superstition. If people said "Take care
with using == for floats, it's often not what you want" I would have no argument
with the statement.
I'd even (reluctantly) accept "usually not what you want". But "never" is out-
and-out cargo-cult programming.
> Given that for Chris’ is_equal we get
> is_equal(.1+.1+.1, .3) is True
> whereas for python builtin == its False
>
> What (non)myth do you suggest for replacement?
Floating point maths is hard. Think carefully about what you are doing, and
whether it is appropriate to use == or a fuzzy almost-equal comparison, or
whether equality is the right approach at all.
"But thinking is hard, can't you just tell me the answer?"
No. But I can give some guidelines:
Floating point arithmetic is deterministic; it doesn't just randomly mix in
error out of spite or malice. So in principle, you can always estimate the
rounding error from any calculation -- and sometimes there is none.
Arithmetic on integer-valued floats (e.g. 1.0) is always exact, provided every
operand and result stays within 2**53 in magnitude (roughly 9e15). (That's why
most Javascript programmers fail to notice that they don't have an integer
type.) So long as you're using operations that only produce integer values from
integer arguments (such as + - * // but not / ) then all calculations are
exact. It is a waste of time to do:
x = 2.0
y = x*1002.0
is_equal(y, 2004.0, 1e-16)
when you can just do y == 2004.0.
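To make the exactness claim concrete, here is a small sketch. The names (LIMIT, exact_product and so on) are mine, chosen only for illustration, and it assumes the usual IEEE 754 double behind Python's float:

```python
# Integer-valued floats stay exact while every value fits in 53 bits.
LIMIT = 2.0 ** 53          # 9007199254740992.0

x = 2.0
y = x * 1002.0
exact_product = (y == 2004.0)            # True: no rounding occurred

# Floor division of integer-valued floats gives an integer-valued float,
# whereas true division can leave the integers.
floor_quotient = 7.0 // 2.0              # 3.0
true_quotient = 7.0 / 2.0                # 3.5

# Past the limit, odd integers are no longer representable.
below_limit_ok = (LIMIT - 1.0) + 1.0 == LIMIT    # True
above_limit_lost = (LIMIT + 1.0 == LIMIT)        # True: the +1 was rounded away
```

Below 2**53 a plain == is all you need for this kind of arithmetic; above it, even adding 1.0 can silently do nothing.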
If you do decide to use an absolute error, e.g.:
abs(x - y) < tolerance
keep in mind that your tolerance needs to be chosen relative to the magnitudes
of x and y.
For large values of x and y, the smallest possible difference may be very
large:
py> x = 1e80
py> delta = 2**-1000
py> assert delta
py> while x + delta == x:
...     delta *= 2
... else:
...     print(delta)
...
6.58201822928e+63
So if you're comparing two numbers around 1e80 or so, doing a "fuzzy
comparison" using an absolute tolerance of less than 6.5e63 or so is just a
slow and complicated way of performing an exact comparison using the ==
operator.
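If you have Python 3.9 or later, math.ulp and math.nextafter let you check the spacing directly instead of searching for it. The following is only a sketch; fuzzy_equal is a made-up helper standing in for whatever absolute-tolerance test you are using:

```python
import math

def fuzzy_equal(a, b, tolerance):
    """Hypothetical absolute-tolerance comparison, for illustration only."""
    return abs(a - b) < tolerance

x = 1e80
spacing = math.ulp(x)                    # gap between x and the next float up
neighbour = math.nextafter(x, math.inf)  # the closest float above x

# With a tolerance smaller than the spacing, the fuzzy test can only
# succeed when the two floats are identical, i.e. it behaves like ==.
too_small = fuzzy_equal(x, neighbour, 1e60)         # False, same as x == neighbour
large_enough = fuzzy_equal(x, neighbour, 1e65)      # True
```

The spacing at 1e80 is twice the delta printed above, because adding half a gap is already enough to round up to the next float.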
Absolute tolerance is faster and easier to understand, and works when the
numbers are on opposite sides of zero, or if one (or both) is zero. But
generally speaking, relative tolerance of one form or another:
abs(x - y) <= abs(x)*relative_tolerance
abs(x - y) <= abs(y)*relative_tolerance
abs(x - y) <= min(abs(x), abs(y))*relative_tolerance
abs(x - y) <= max(abs(x), abs(y))*relative_tolerance
is probably better, though slower to compute.
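As a rough sketch of the max() variant, and of why zero is the awkward case for relative tolerance, consider the following. rel_close is a name I have invented for this example; math.isclose is the standard library function (Python 3.5 and later), which combines a relative tolerance with an optional absolute one:

```python
import math

def rel_close(x, y, rel_tol=1e-9):
    """Illustrative relative comparison scaled by the larger magnitude."""
    return abs(x - y) <= max(abs(x), abs(y)) * rel_tol

a = 0.1 + 0.1 + 0.1
b = 0.3

exactly_equal = (a == b)                 # False: they differ in the last bit
relatively_close = rel_close(a, b)       # True

# Relative tolerance breaks down next to zero: nothing nonzero is
# "relatively close" to 0.0, however tiny it is.
near_zero_rel = rel_close(1e-12, 0.0)                    # False
near_zero_std = math.isclose(1e-12, 0.0)                 # False, same reason
near_zero_abs = math.isclose(1e-12, 0.0, abs_tol=1e-9)   # True
```

Which of the four scalings you pick matters little when x and y are of similar size; the choice of tolerance, and what you do near zero, matters far more.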
A nice, simple technique is just to round:
if round(x, 6) == round(y, 6):
but that's not quite the same as abs(x-y) < 1e-6.
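Here is one way the two tests can disagree, using values I picked to sit on either side of a rounding boundary:

```python
# Two values that differ by about 2e-10 but straddle the boundary at 1.5e-6.
x = 1.4999e-6
y = 1.5001e-6

within_tolerance = abs(x - y) < 1e-6         # True: they are very close
same_when_rounded = round(x, 6) == round(y, 6)   # False: 1e-06 versus 2e-06
```

Rounding sorts values into fixed bins, so two numbers can be arbitrarily close and still land in different bins.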
For library code that cares greatly about precision, using "Unit in the Last
Place" (ULP) calculations is probably best. But that's a whole different story.
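For the curious, one common way to count ULPs is to reinterpret the bit patterns as integers. This is a minimal sketch, not production code: it assumes finite IEEE 754 doubles, ignores NaN and infinity, and both function names are hypothetical:

```python
import struct

def float_to_ordered_int(x):
    """Map a finite float to an int so that int order matches float order."""
    # Reinterpret the 64-bit pattern of x as a signed integer.
    (n,) = struct.unpack("<q", struct.pack("<d", x))
    # Negative floats have the sign bit set; flip them so that more
    # negative floats map to more negative integers.
    return n if n >= 0 else -(n & 0x7FFFFFFFFFFFFFFF)

def ulp_distance(x, y):
    """Number of representable floats you step over going from x to y."""
    return abs(float_to_ordered_int(x) - float_to_ordered_int(y))

one_step = ulp_distance(0.1 + 0.1 + 0.1, 0.3)   # 1: adjacent floats
no_step = ulp_distance(-0.0, 0.0)               # 0: both zeros coincide
```

A comparison such as ulp_distance(x, y) <= 4 then expresses "within a few rounding errors of each other" without your having to pick a tolerance by hand.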
--
Steve