[Python-ideas] Way to check for floating point "closeness"?

Sun Jan 18 19:28:48 CET 2015

OK,

I FINALLY got a chance to look at Steven's code in the statistics module
tests.

Not much code there; this really isn't that big a deal.

It does check for NaN, and inf and all that, so that's good.

It is also symmetric with respect to x and y -- using the maximum of the
two to compute the relative error -- I think that's good. (This is
essentially the same as Boost's "strong" method -- though implemented a tiny
bit differently).

Here is the key definition:

def approx_equal(x, y, tol=1e-12, rel=1e-7):
...
    x is approximately equal to y if the difference between them is less than
    an absolute error tol or a relative error rel, whichever is bigger.
...
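
To make that rule concrete, here is a minimal sketch of the "max test" as I read the docstring. This is NOT Steven's actual code: the name approx_equal_sketch and the exact NaN/inf handling are my assumptions, only the signature and the "whichever is bigger" rule come from the excerpt above.

```python
import math


def approx_equal_sketch(x, y, tol=1e-12, rel=1e-7):
    """Hypothetical re-implementation of the rule in the docstring above."""
    # Assumption: NaN is never approximately equal to anything.
    if math.isnan(x) or math.isnan(y):
        return False
    # Exact equality also covers inf == inf and -inf == -inf.
    if x == y:
        return True
    # Assumption: an infinity is not close to anything but itself.
    if math.isinf(x) or math.isinf(y):
        return False
    # Symmetric relative error: scale by the larger magnitude of the two.
    allowed = max(tol, rel * max(abs(x), abs(y)))
    return abs(x - y) <= allowed


print(approx_equal_sketch(1.0, 1.0 + 1e-9))   # True: within rel of 1.0
print(approx_equal_sketch(1.0, 1.001))        # False
```

Using max(abs(x), abs(y)) is what makes the result independent of argument order.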

This is a lot like the numpy code, actually, except it does a max test,
rather than adding the absolute and relative tolerances together. I think
this is a better way to go than numpy's but there is little practical
difference.
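
A small side-by-side sketch of the two ways of combining the tolerances. The additive form follows the formula numpy documents for isclose/allclose, abs(a - b) <= atol + rtol * abs(b), written here in plain Python; the function names are made up for illustration.

```python
def additive_close(a, b, rtol=1e-5, atol=1e-8):
    # numpy-style: the tolerances are summed, and only b sets the scale,
    # so the test is not symmetric in a and b.
    return abs(a - b) <= atol + rtol * abs(b)


def max_close(a, b, rtol=1e-5, atol=1e-8):
    # max-style: whichever tolerance is bigger wins; symmetric in a and b.
    return abs(a - b) <= max(atol, rtol * max(abs(a), abs(b)))


a, b = 1.0, 1.0 + 1.5e-5
# The two only disagree when both tolerances are of similar size,
# where the additive form allows up to about twice the error.
print(additive_close(a, b, rtol=1e-5, atol=1e-5))  # True
print(max_close(a, b, rtol=1e-5, atol=1e-5))       # False
```

The allowed error in the additive form is never more than about twice that of the max form, which is why the practical difference is small.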

However, it suffers from the same issue -- "tol" is essentially a minimum
error that is considered acceptable. This is nice, as it will allow zero
to be passed in, and if the other input is within tol of zero, it will be
considered approximately equal. However, very small numbers (smaller than
the absolute tolerance) will always be considered approximately
equal:

In [18]: approx_equal(1.0e-14, 2.0e-14)
Out[18]: True

off by a factor of 2

In [19]: approx_equal(1.0e-20, 2.0e-25)
Out[19]: True

oops! way off!

This is with the defaults, of course, and all you need to do is set the tol
much lower:

In [20]: approx_equal(1.0e-20, 2.0e-25, tol=1e-25)
Out[20]: False

This is less fatal than with numpy, as with numpy you are processing a
whole array of numbers with the same tolerances, and they may not all be of
the same magnitude. But I think it's a trap for users.

My proposal:

Allow either an absolute or a relative tolerance, but don't try to do both in
one call.

or

If you really want the ability to do both at once (i.e. set a minimum for
the zero case), then:
  - make the default absolute tolerance zero -- fewer surprises that way
  - document the absolute tolerance as a minimum error (difference), and
specifically mention the zero case in the docs.

Otherwise, go with Steven's code, and put it in the math module.
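
For illustration, the second option above (both tolerances, but absolute defaulting to zero) might look roughly like this. The name is_close_sketch and the parameter name abs_tol are hypothetical, and the NaN/inf handling is my assumption rather than anything taken from Steven's code.

```python
import math


def is_close_sketch(x, y, rel=1e-7, abs_tol=0.0):
    """Hypothetical: relative test by default, absolute floor only on request."""
    if math.isnan(x) or math.isnan(y):
        return False
    if x == y:                      # also covers matching infinities
        return True
    if math.isinf(x) or math.isinf(y):
        return False
    # abs_tol is a MINIMUM acceptable difference; with the default of 0.0
    # only the relative test applies, so tiny values are not lumped together.
    return abs(x - y) <= max(abs_tol, rel * max(abs(x), abs(y)))


print(is_close_sketch(1.0e-20, 2.0e-25))            # False: no hidden floor
print(is_close_sketch(0.0, 1e-15))                  # False: nothing is relatively close to zero
print(is_close_sketch(0.0, 1e-15, abs_tol=1e-12))   # True: caller opted in for the zero case
```

The zero case is the one place where the caller has to supply abs_tol, which is why it needs a specific mention in the docs.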

Also -- there was some talk of what to do with complex -- I say two complex
numbers are approx_equal if approx_equal(z1.real, z2.real) and
approx_equal(z1.imag, z2.imag) -- that is a more rigorous test than using
the complex abs value of the difference.
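
A quick sketch of why the component-wise test is stricter. The helper names are made up for illustration, and _close is a stripped-down stand-in for approx_equal (no NaN/inf handling), not Steven's code.

```python
def _close(x, y, tol=1e-12, rel=1e-7):
    # Stand-in for approx_equal on real numbers (max test, symmetric).
    return abs(x - y) <= max(tol, rel * max(abs(x), abs(y)))


def complex_close_componentwise(z1, z2, tol=1e-12, rel=1e-7):
    # Real and imaginary parts must each pass on their own scale.
    return (_close(z1.real, z2.real, tol, rel)
            and _close(z1.imag, z2.imag, tol, rel))


def complex_close_by_abs(z1, z2, tol=1e-12, rel=1e-7):
    # Alternative: compare the magnitude of the difference to the
    # magnitude of the larger number.
    return abs(z1 - z2) <= max(tol, rel * max(abs(z1), abs(z2)))


z1 = complex(1e6, 1.0)
z2 = complex(1e6, 1.05)   # imaginary part is off by 5%
print(complex_close_by_abs(z1, z2))         # True: error is hidden by the large real part
print(complex_close_componentwise(z1, z2))  # False: the imaginary parts differ
```

When one component is much larger than the other, the abs-of-difference test lets a big relative error in the small component slip through; the component-wise test does not.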

-Chris

-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov