[Python-ideas] PEP 485: A Function for testing approximate equality

Fri Jan 23 22:30:47 CET 2015

On Fri, Jan 23, 2015 at 12:40 AM, Chris Barker <chris.barker at noaa.gov> wrote:
> Existing Implementations
> ------------------------
>
> The standard library includes the
> ``unittest.TestCase.assertAlmostEqual`` method, but it:
>
> * Is buried in the unittest.TestCase class
>
> * Is an assertion, so you can't use it as a general test (easily)
>
> * Uses number of decimal digits or an absolute delta, which are
>   particular use cases that don't provide a general relative error.

I might phrase this a bit more strongly -- assertAlmostEqual is
confusing and broken-by-default for common cases like comparing two
small values, or comparing two large values.
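
A minimal sketch of what I mean, assuming Python 3 (where
unittest.TestCase() can be instantiated directly) and the default
places=7. The almost_equal wrapper is a hypothetical helper, only
there to turn the assertion into a boolean:

```python
import unittest

_tc = unittest.TestCase()


def almost_equal(a, b):
    """Hypothetical helper: True if assertAlmostEqual(a, b) passes with defaults."""
    try:
        _tc.assertAlmostEqual(a, b)  # default: round(abs(a - b), 7) == 0
        return True
    except AssertionError:
        return False


# Two small values that differ by a factor of two still "pass",
# because their difference rounds to zero at 7 decimal places.
small_ok = almost_equal(1e-10, 2e-10)

# Two large values that agree to roughly 13 significant digits "fail",
# because their absolute difference (about 1e-3) does not round to zero.
large_ok = almost_equal(1e10, 1e10 + 1e-3)

print(small_ok, large_ok)
```

So the default check is too lenient for small magnitudes and too
strict for large ones.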

> The numpy package has the ``allclose()`` and ``isclose()`` functions.
>
> The statistics package tests include an implementation, used for its
> unit tests.
>
> One can also find discussion and sample implementations on Stack
> Overflow, and other help sites.
>
> These existing implementations indicate that this is a common need,
> and not trivial to write oneself, making it a candidate for the
> standard library.
>
>
> Proposed Implementation
> =======================
>
> NOTE: this PEP is the result of an extended discussion on the
> python-ideas list [1]_.
>
> The new function will have the following signature::
>
>   is_close_to(actual, expected, tol=1e-8, abs_tol=0.0)
>
> ``actual``: is the value that has been computed, measured, etc.
>
> ``expected``: is the "known" value.
>
> ``tol``: is the relative tolerance -- it is the amount of error
> allowed, relative to the magnitude of the expected value.
>
> ``abs_tol``: is a minimum absolute tolerance level -- useful for
> comparisons near zero.
>
> Modulo error checking, etc, the function will return the result of::
>
>     abs(expected-actual) <= max(tol*expected, abs_tol)

So for reference, it looks like the differences from numpy are:

1) kwarg names: "tol" and "abs_tol" versus "rtol" and "atol". Numpy's
names seem fine to me, but if you want the longer ones then probably
"rel_tol", "abs_tol" would be better?

2) use of max() instead of + to combine the relative and absolute
tolerance. I understand that you find the + conceptually offensive,
but I'm not really sure why -- max() is maybe a bit better, but it
seems like much of a muchness to me in practice. (Sure, like you say
further down, the total error using + might end up being higher by a
factor of two or so -- but either people are specifying the tolerances
they want, in which case they can say what they mean either way, or
else they're just accepting the defaults, in which case they don't
care.) It might be worth switching to + just for compatibility.

3) The default tolerances. Numpy is inconsistent with itself on this
point though (allclose vs. assert_allclose), so I wouldn't worry about
it too much :-).
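
To make point 2 concrete, here is a rough sketch of the two ways of
combining the tolerances. The function names is_close_max and
is_close_sum are made up for illustration, and I've used
abs(expected) in both so that negative expected values behave (an
assumption about what "modulo error checking" would cover). The +
form follows the formula numpy documents for isclose(), with the
PEP's defaults swapped in:

```python
def is_close_max(actual, expected, tol=1e-8, abs_tol=0.0):
    """Sketch of the PEP-style test: the larger of the two tolerances applies."""
    return abs(expected - actual) <= max(tol * abs(expected), abs_tol)


def is_close_sum(actual, expected, tol=1e-8, abs_tol=0.0):
    """Sketch of the numpy-style test: the two tolerances are added."""
    return abs(expected - actual) <= abs_tol + tol * abs(expected)


# Values chosen to be exactly representable in binary floating point.
actual, expected = 1.375, 1.0  # absolute error is 0.375

# max(0.25 * 1.0, 0.25) = 0.25, so the error is too large.
with_max = is_close_max(actual, expected, tol=0.25, abs_tol=0.25)

# 0.25 + 0.25 * 1.0 = 0.5, so the error is acceptable.
with_sum = is_close_sum(actual, expected, tol=0.25, abs_tol=0.25)

print(with_max, with_sum)
```

The allowed error under + is never more than twice the allowed error
under max(), and the two agree exactly whenever abs_tol is 0.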

However, a lot of the benefit of numpy.allclose is that it will do
something mostly-reasonable out-of-the-box even if the users haven't
thought things through at all. 99% of the benefit of having something
like this available is that it makes it easy to write tests, and 99%
of the benefit of a test is that it exists and makes sure that your
values are not wildly incorrect. So that's nice. BUT if you want that
kind of out-of-the-box utility then you need to have some kind of
sensible default for comparisons to zero.

(I just took a quick look at uses of assertAlmostEqual in Python code
on GitHub, and in my unscientific survey of reading the first page of
results, 30.4% of the calls were comparisons against zero. IMO asking
all these people to specify tolerances by hand on every call is not
very nice.)

One option would be to add a zero_tol argument, which is an absolute
tolerance that is only applied if expected == 0.

[And a nice possible side-effect of this is that numpy could
conceivably then add such an argument as well "for compatibility with
the stdlib", and possibly use this as a lever to fix its weird
allclose/assert_allclose discrepancy. The main blocker to making them
consistent is that there is lots of code in the wild that assumes
allclose handles comparisons-to-zeros right, and also lots of code
that assumes that assert_allclose is strict with very-small non-zero
numbers, and with only rtol and atol you can't get both of these
behaviours simultaneously.]
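
A sketch of how the zero_tol idea could look. Everything here is an
assumption for illustration: the default of 1e-8 for zero_tol is
invented, and so is the choice to take the larger of abs_tol and
zero_tol when expected == 0:

```python
def is_close_to(actual, expected, tol=1e-8, abs_tol=0.0, zero_tol=1e-8):
    """Sketch only: relative test, plus a tolerance used solely when expected == 0."""
    if expected == 0:
        # zero_tol applies only to an exactly-zero expected value
        # (assumed here: abs_tol is still honoured if it is larger).
        return abs(actual) <= max(abs_tol, zero_tol)
    # Otherwise fall back to the PEP-style relative/absolute test.
    return abs(expected - actual) <= max(tol * abs(expected), abs_tol)


# Lenient when comparing against exactly zero...
near_zero = is_close_to(1e-12, 0.0)

# ...but still strict when comparing very small non-zero numbers.
tiny_nonzero = is_close_to(1e-12, 1e-20)

print(near_zero, tiny_nonzero)
```

With only a relative and an absolute tolerance, any abs_tol large
enough to accept the first comparison would also accept the second.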

> Inappropriate uses
> ------------------
>
> One use case for floating point comparison is testing the accuracy of
> a numerical algorithm. However, in this case, the numerical analyst
> ideally would be doing careful error propagation analysis, and should
> understand exactly what to test for. It is also likely that ULP (Unit
> in the Last Place) comparison may be called for. While this function
> may prove useful in such situations, it is not intended to be used in
> that way.

I'd strongly consider expanding the scope of this PEP a bit so that
it's proposing both a relative/absolute-error-based function *and* a
ULP-difference function. There was a plausible-looking one using
struct posted in the other thread, it would cover a wider variety of
cases, and having both functions next to each other in the docs would
provide a good opportunity to explain the differences and which
might be preferred in which situation.
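
For concreteness, a rough sketch of what a struct-based ULP-difference
function might look like. This is my own illustration, not the code
from the other thread. The names _ordered and ulp_difference are
hypothetical, it assumes IEEE 754 doubles, and it makes no attempt to
handle NaN or infinity sensibly:

```python
import struct


def _ordered(x):
    """Map a float to an int so that adjacent doubles map to adjacent ints."""
    # Reinterpret the 8 bytes of the double as a signed 64-bit integer.
    (n,) = struct.unpack("<q", struct.pack("<d", x))
    if n >= 0:
        return n
    # Negative floats are stored sign-magnitude: strip the sign bit and
    # negate, so that -0.0 maps to 0 and ordering matches float ordering.
    return -(n & 0x7FFFFFFFFFFFFFFF)


def ulp_difference(a, b):
    """Number of representable doubles you step over going from a to b."""
    return abs(_ordered(a) - _ordered(b))


one_ulp = ulp_difference(1.0, 1.0 + 2.0 ** -52)  # next double after 1.0
signed_zeros = ulp_difference(0.0, -0.0)         # same value, zero steps
across_zero = ulp_difference(-5e-324, 5e-324)    # smallest subnormals

print(one_ulp, signed_zeros, across_zero)
```

Unlike a relative tolerance, this measure behaves the same at every
magnitude, including across zero, which is part of what the docs
could explain.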

-n

-- 
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org