[Python-ideas] PEP 485: A Function for testing approximate equality
Juancarlo Añez
apalala at gmail.com
Fri Jan 23 02:24:34 CET 2015
+1
On Thu, Jan 22, 2015 at 8:10 PM, Chris Barker <chris.barker at noaa.gov> wrote:
> Hi folks,
>
> After much discussion on this list, I have written up a PEP, and it is
> ready for review (see below)
>
> It is also here: https://www.python.org/dev/peps/pep-0485/
>
> That version is not quite up to date just yet, so please refer to the one
> enclosed in this email for now.
>
> I am managing both the PEP and a sample implementation and tests in gitHub
> here:
>
> https://github.com/PythonCHB/close_pep
>
> Please go there if you want to try it out, add some tests, etc. Pull
> requests welcomed for code, tests, or PEP editing.
>
> A quick summary of the decisions I made, and what I think are the open
> discussion points:
>
> The focus is on relative tolerance, but with an optional absolute
> tolerance, primarily to be used near zero, but it also allows it to be used
> as a plain absolute difference check.
>
> It is using an asymmetric test -- that is, the tolerance is computed
> relative to one of the arguments. It is perhaps surprising and confusing
> that you may get a different result if you reverse the arguments, but in
> this discussion it became clear that there were some use-cases where it was
> helpful to know exactly what the tolerance is computed relative too, and
> that in most use cases, if just doesn't matter. I hope this is adequately
> explained in the PEP. We could add a flag to set a symmetric test (I'd go
> with what boost calls the "strong" test), but I'd rather not -- it just
> confuses things, and I expect users will tend to use defaults anyway.
>
> It is designed to work mostly with floats, but also supports Integer,
> Decimal, Fraction, and Complex. I'm not really thrilled with that, though,
> it turns out to be not quite as easy to duck-type it as I had hoped. To
> really do it right, there would have to be more switching on type in the
> code, which I think is ugly to write -- contributions, opinions welcome on
> this.
>
> I used 1e-8 as a default relative tolerance -- arbitrarily because that's
> about half of the decimal digits in a python float -- suggestions welcome.
>
> Other than that, of course, we can bike-shed the names of the function and
> the parameters. ;-)
>
> Fire away!
>
> -Chris
>
>
> PEP: 485
> Title: A Function for testing approximate equality
> Version: $Revision$
> Last-Modified: $Date$
> Author: Christopher Barker <Chris.Barker at noaa.gov>
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 20-Jan-2015
> Python-Version: 3.5
> Post-History:
>
>
> Abstract
> ========
>
> This PEP proposes the addition of a function to the standard library
> that determines whether one value is approximately equal or "close"
> to another value.
>
>
> Rationale
> =========
>
> Floating point values contain limited precision, which results in
> their being unable to exactly represent some values, and for error to
> accumulate with repeated computation. As a result, it is common
> advice to only use an equality comparison in very specific situations.
> Often a inequality comparison fits the bill, but there are times
> (often in testing) where the programmer wants to determine whether a
> computed value is "close" to an expected value, without requiring them
> to be exactly equal. This is common enough, particularly in testing,
> and not always obvious how to do it, so it would be useful addition to
> the standard library.
>
>
> Existing Implementations
> ------------------------
>
> The standard library includes the
> ``unittest.TestCase.assertAlmostEqual`` method, but it:
>
> * Is buried in the unittest.TestCase class
>
> * Is an assertion, so you can't use it as a general test (easily)
>
> * Uses number of decimal digits or an absolute delta, which are
> particular use cases that don't provide a general relative error.
>
> The numpy package has the ``allclose()`` and ``isclose()`` functions.
>
> The statistics package tests include an implementation, used for its
> unit tests.
>
> One can also find discussion and sample implementations on Stack
> Overflow, and other help sites.
>
> These existing implementations indicate that this is a common need,
> and not trivial to write oneself, making it a candidate for the
> standard library.
>
>
> Proposed Implementation
> =======================
>
> NOTE: this PEP is the result of an extended discussion on the
> python-ideas list [1]_.
>
> The new function will have the following signature::
>
> is_close_to(actual, expected, tol=1e-8, abs_tol=0.0)
>
> ``actual``: is the value that has been computed, measured, etc.
>
> ``expected``: is the "known" value.
>
> ``tol``: is the relative tolerance -- it is the amount of error
> allowed, relative to the magnitude of the expected value.
>
> ``abs_tol``: is an minimum absolute tolerance level -- useful for
> comparisons near zero.
>
> Modulo error checking, etc, the function will return the result of::
>
> abs(expected-actual) <= max(tol*expected, abs_tol)
>
>
> Handling of non-finite numbers
> ------------------------------
>
> The IEEE 754 special values of NaN, inf, and -inf will be handled
> according to IEEE rules. Specifically, NaN is not considered close to
> any other value, including NaN. inf and -inf are only considered close
> to themselves.
>
>
> Non-float types
> ---------------
>
> The primary use-case is expected to be floating point numbers.
> However, users may want to compare other numeric types similarly. In
> theory, it should work for any type that supports ``abs()``,
> comparisons, and subtraction. The code will be written and tested to
> accommodate these types:
>
> * ``Decimal``: for Decimal, the tolerance must be set to a Decimal type.
>
> * ``int``
>
> * ``Fraction``
>
> * ``complex``: for complex, ``abs(z)`` will be used for scaling and
> comparison.
>
>
> Behavior near zero
> ------------------
>
> Relative comparison is problematic if either value is zero. In this
> case, the difference is relative to zero, and thus will always be
> smaller than the prescribed tolerance. To handle this case, an
> optional parameter, ``abs_tol`` (default 0.0) can be used to set a
> minimum tolerance to be used in the case of very small relative
> tolerance. That is, the values will be considered close if::
>
> abs(a-b) <= abs(tol*expected) or abs(a-b) <= abs_tol
>
> If the user sets the rel_tol parameter to 0.0, then only the absolute
> tolerance will effect the result, so this function provides an
> absolute tolerance check as well.
>
> A sample implementation is available (as of Jan 22, 2015) on gitHub:
>
> https://github.com/PythonCHB/close_pep/blob/master/is_close_to.py
>
>
> Relative Difference
> ===================
>
> There are essentially two ways to think about how close two numbers
> are to each-other: absolute difference: simply ``abs(a-b)``, and
> relative difference: ``abs(a-b)/scale_factor`` [2]_. The absolute
> difference is trivial enough that this proposal focuses on the
> relative difference.
>
> Usually, the scale factor is some function of the values under
> consideration, for instance:
>
> 1) The absolute value of one of the input values
>
> 2) The maximum absolute value of the two
>
> 3) The minimum absolute value of the two.
>
> 4) The arithmetic mean of the two
>
>
> Symmetry
> --------
>
> A relative comparison can be either symmetric or non-symmetric. For a
> symmetric algorithm:
>
> ``is_close_to(a,b)`` is always equal to ``is_close_to(b,a)``
>
> This is an appealing consistency -- it mirrors the symmetry of
> equality, and is less likely to confuse people. However, often the
> question at hand is:
>
> "Is this computed or measured value within some tolerance of a known
> value?"
>
> In this case, the user wants the relative tolerance to be specifically
> scaled against the known value. It is also easier for the user to
> reason about.
>
> This proposal uses this asymmetric test to allow this specific
> definition of relative tolerance.
>
> Example:
>
> For the question: "Is the value of a within x% of b?", Using b to
> scale the percent error clearly defines the result.
>
> However, as this approach is not symmetric, a may be within 10% of b,
> but b is not within x% of a. Consider the case::
>
> a = 9.0
> b = 10.0
>
> The difference between a and b is 1.0. 10% of a is 0.9, so b is not
> within 10% of a. But 10% of b is 1.0, so a is within 10% of b.
>
> Casual users might reasonably expect that if a is close to b, then b
> would also be close to a. However, in the common cases, the tolerance
> is quite small and often poorly defined, i.e. 1e-8, defined to only
> one significant figure, so the result will be very similar regardless
> of the order of the values. And if the user does care about the
> precise result, s/he can take care to always pass in the two
> parameters in sorted order.
>
> This proposed implementation uses asymmetric criteria with the scaling
> value clearly identified.
>
>
> Expected Uses
> =============
>
> The primary expected use case is various forms of testing -- "are the
> results computed near what I expect as a result?" This sort of test
> may or may not be part of a formal unit testing suite.
>
> The function might be used also to determine if a measured value is
> within an expected value.
>
>
> Inappropriate uses
> ------------------
>
> One use case for floating point comparison is testing the accuracy of
> a numerical algorithm. However, in this case, the numerical analyst
> ideally would be doing careful error propagation analysis, and should
> understand exactly what to test for. It is also likely that ULP (Unit
> in the Last Place) comparison may be called for. While this function
> may prove useful in such situations, It is not intended to be used in
> that way.
>
>
> Other Approaches
> ================
>
> ``unittest.TestCase.assertAlmostEqual``
> ---------------------------------------
>
> (
> https://docs.python.org/3/library/unittest.html#unittest.TestCase.assertAlmostEqual
> )
>
> Tests that values are approximately (or not approximately) equal by
> computing the difference, rounding to the given number of decimal
> places (default 7), and comparing to zero.
>
> This method was not selected for this proposal, as the use of decimal
> digits is a specific, not generally useful or flexible test.
>
> numpy ``is_close()``
> --------------------
>
> http://docs.scipy.org/doc/numpy-dev/reference/generated/numpy.isclose.html
>
> The numpy package provides the vectorized functions is_close() and
> all_close, for similar use cases as this proposal:
>
> ``isclose(a, b, rtol=1e-05, atol=1e-08, equal_nan=False)``
>
> Returns a boolean array where two arrays are element-wise equal
> within a tolerance.
>
> The tolerance values are positive, typically very small numbers.
> The relative difference (rtol * abs(b)) and the absolute
> difference atol are added together to compare against the
> absolute difference between a and b
>
> In this approach, the absolute and relative tolerance are added
> together, rather than the ``or`` method used in this proposal. This is
> computationally more simple, and if relative tolerance is larger than
> the absolute tolerance, then the addition will have no effect. But if
> the absolute and relative tolerances are of similar magnitude, then
> the allowed difference will be about twice as large as expected.
>
> Also, if the value passed in are small compared to the absolute
> tolerance, then the relative tolerance will be completely swamped,
> perhaps unexpectedly.
>
> This is why, in this proposal, the absolute tolerance defaults to zero
> -- the user will be required to choose a value appropriate for the
> values at hand.
>
>
> Boost floating-point comparison
> -------------------------------
>
> The Boost project ( [3]_ ) provides a floating point comparison
> function. Is is a symetric approach, with both "weak" (larger of the
> two relative errors) and "strong" (smaller of the two relative errors)
> options.
>
> It was decided that a method that clearly defined which value was used
> to scale the relative error would be more appropriate for the standard
> library.
>
> References
> ==========
>
> .. [1] Python-ideas list discussion thread
> (
> https://mail.python.org/pipermail/python-ideas/2015-January/030947.html)
>
> .. [2] Wikipedaia page on relative difference
> (http://en.wikipedia.org/wiki/Relative_change_and_difference)
>
> .. [3] Boost project floating-point comparison algorithms
> (
> http://www.boost.org/doc/libs/1_35_0/libs/test/doc/components/test_tools/floating_point_comparison.html
> )
>
>
> Copyright
> =========
>
> This document has been placed in the public domain.
>
>
>
>
> --
>
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR&R (206) 526-6959 voice
> 7600 Sand Point Way NE (206) 526-6329 fax
> Seattle, WA 98115 (206) 526-6317 main reception
>
> Chris.Barker at noaa.gov
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
--
Juancarlo *Añez*
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20150122/c57e6a56/attachment-0001.html>
More information about the Python-ideas
mailing list