OK,

I FINALLY got a chance to look at Steven's code in the statistics module tests.

Not much code there; this really isn't that big a deal.

It does check for NaN, inf, and all that, so that's good.

It is also symmetric with respect to x and y -- using the maximum of the two to compute the relative error -- I think that's good. (This is essentially the same as Boost's "strong" method -- though implemented a tiny bit differently).

Here is the key definition:

def approx_equal(x, y, tol=1e-12, rel=1e-7):
...
    x is approximately equal to y if the difference between them is less than
    an absolute error tol or a relative error rel, whichever is bigger.
...

This is a lot like the numpy code, actually, except it does a max test rather than adding the absolute and relative tolerances together. I think this is a better way to go than numpy's, but there is little practical difference.
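
A rough sketch of the two styles side by side, to show where they can disagree. Both functions are illustrative stand-ins written from the descriptions above (the names max_style and sum_style are made up here); neither is Steven's actual code nor numpy's implementation, and the NaN/inf checks are left out:

```python
def max_style(x, y, tol=1e-12, rel=1e-7):
    """Hypothetical max test: allow the larger of the absolute and relative bounds."""
    # Symmetric in x and y: the relative bound scales with the larger magnitude.
    return abs(x - y) <= max(tol, rel * max(abs(x), abs(y)))


def sum_style(x, y, atol=1e-12, rtol=1e-7):
    """Hypothetical additive test: the two tolerances are summed (numpy-like)."""
    # Not symmetric: the relative part scales with y only.
    return abs(x - y) <= atol + rtol * abs(y)


# The two only disagree when the absolute and relative bounds are of similar
# size, so that their sum is noticeably larger than their maximum.
x = 1.0e-5
y = 1.0e-5 + 1.5e-12
print(max_style(x, y))  # False: allowed error is about 1e-12
print(sum_style(x, y))  # True: allowed error is about 2e-12
```

The additive form can never allow more than twice what the max form allows, which is why the practical difference is small.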

However, it suffers from the same issue -- "tol" is essentially a minimum error that is considered acceptable. This is nice, as it will allow zero to be passed in, and if the other input is within tol of zero, it will be considered approximately equal. However, very small numbers (less than the absolute tolerance) will always be considered approximately equal:

In [18]: approx_equal(1.0e-14, 2.0e-14)
Out[18]: True

off by a factor of 2

In [19]: approx_equal(1.0e-20, 2.0e-25)
Out[19]: True

oops! way off!

This is with the defaults of course, and all you need to do is set the tol much lower:

In [20]: approx_equal(1.0e-20, 2.0e-25, tol=1e-25)
Out[20]: False

This is less fatal than with numpy, as with numpy you are processing a whole array of numbers with the same tolerances, and they may not all be of the same magnitude. But I think it's a trap for users.
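
To make the mixed-magnitude trap concrete, here is a small sketch. close_enough is a hypothetical stand-in for the max-style test described above (not Steven's actual code), applied with one set of default tolerances to pairs that are all off by the same factor of 2:

```python
def close_enough(x, y, tol=1e-12, rel=1e-7):
    """Hypothetical stand-in: pass if within the larger of tol and the relative bound."""
    return abs(x - y) <= max(tol, rel * max(abs(x), abs(y)))


# Every pair differs by a factor of 2; only the magnitude changes.
pairs = [
    (1.0, 2.0),
    (1.0e-6, 2.0e-6),
    (1.0e-14, 2.0e-14),
]

# Same tolerances for all pairs, as you would get when checking a whole array.
results = [close_enough(a, b) for a, b in pairs]
print(results)  # [False, False, True]
```

Only the last pair passes, and it passes purely because both values sit below the default tol.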

My proposal:

Allow either an absolute or a relative tolerance, but don't try to do both in one call.

or

If you really want the ability to do both at once (i.e. set a minimum for the zero case), then:
  - make the default absolute tolerance zero -- fewer surprises that way
  - document the absolute tolerance as a minimum error (difference), and specifically mention the zero case in the docs.
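
A minimal sketch of what the second option could look like, assuming the same max-style test and the same argument order as above. The name approx_equal_proposed is made up for illustration, and the NaN/inf handling that Steven's code has is omitted here:

```python
def approx_equal_proposed(x, y, tol=0.0, rel=1e-7):
    """Hypothetical variant with the absolute tolerance defaulting to zero.

    tol is a minimum acceptable difference. It only matters when one of the
    inputs is zero (or very close to it), where a purely relative test can
    never pass unless the values are exactly equal.
    """
    return abs(x - y) <= max(tol, rel * max(abs(x), abs(y)))


# With tol=0.0 the test is purely relative, so tiny values get no free pass.
print(approx_equal_proposed(1.0e-20, 2.0e-25))        # False
print(approx_equal_proposed(1.0, 1.0 + 1e-9))         # True

# Comparing against zero needs an explicit, deliberately chosen tol.
print(approx_equal_proposed(0.0, 1e-15))              # False
print(approx_equal_proposed(0.0, 1e-15, tol=1e-12))   # True
```

The user then has to opt in to the absolute floor, rather than discover it after the fact.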

Otherwise, go with Steven's code, and put it in the math module.

Also -- there was some talk of what to do with complex -- I say two complex numbers are approx_equal if approx_equal(z1.real, z2.real) and approx_equal(z1.imag, z2.imag) -- that is a more rigorous test than using the complex abs value of the difference.
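
A sketch of the component-wise test next to an abs-of-the-difference test, to show a case where they disagree. All three functions are hypothetical illustrations; approx_equal here is a stand-in max-style test, not Steven's actual code:

```python
def approx_equal(x, y, tol=1e-12, rel=1e-7):
    """Hypothetical stand-in for the real-valued test."""
    return abs(x - y) <= max(tol, rel * max(abs(x), abs(y)))


def approx_equal_componentwise(z1, z2, tol=1e-12, rel=1e-7):
    """Real and imaginary parts must each pass on their own."""
    return (approx_equal(z1.real, z2.real, tol, rel)
            and approx_equal(z1.imag, z2.imag, tol, rel))


def approx_equal_by_abs(z1, z2, tol=1e-12, rel=1e-7):
    """Alternative: compare the magnitude of the complex difference."""
    return abs(z1 - z2) <= max(tol, rel * max(abs(z1), abs(z2)))


# The imaginary parts differ by a factor of 2, but they are tiny compared
# with the real part, so the abs-based test does not notice.
z1 = complex(1.0, 1.0e-9)
z2 = complex(1.0, 2.0e-9)
print(approx_equal_componentwise(z1, z2))  # False
print(approx_equal_by_abs(z1, z2))         # True
```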

-Chris


--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker@noaa.gov