Hi folks,

After much discussion on this list, I have written up a PEP, and it is ready for review (see below).

It is also here: https://www.python.org/dev/peps/pep-0485/

That version is not quite up to date just yet, so please refer to the one enclosed in this email for now.

I am managing both the PEP and a sample implementation and tests in GitHub here:

Please go there if you want to try it out, add some tests, etc. Pull requests welcomed for code, tests, or PEP editing.

A quick summary of the decisions I made, and what I think are the open discussion points:

The focus is on relative tolerance, but with an optional absolute tolerance, primarily to be used near zero, but it also allows it to be used as a plain absolute difference check.

It is using an asymmetric test -- that is, the tolerance is computed relative to one of the arguments. It is perhaps surprising and confusing that you may get a different result if you reverse the arguments, but in this discussion it became clear that there were some use-cases where it was helpful to know exactly what the tolerance is computed relative to, and that in most use cases, it just doesn't matter. I hope this is adequately explained in the PEP. We could add a flag to set a symmetric test (I'd go with what boost calls the "strong" test), but I'd rather not -- it just confuses things, and I expect users will tend to use defaults anyway.

It is designed to work mostly with floats, but also supports Integer, Decimal, Fraction, and Complex. I'm not really thrilled with that, though; it turns out to be not quite as easy to duck-type it as I had hoped. To really do it right, there would have to be more switching on type in the code, which I think is ugly to write -- contributions, opinions welcome on this.

I used 1e-8 as a default relative tolerance -- arbitrarily, because that's about half of the decimal digits in a Python float -- suggestions welcome.

Other than that, of course, we can bike-shed the names of the function and the parameters.
;-)

Fire away!

-Chris

PEP: 485
Title: A Function for testing approximate equality
Version: $Revision$
Last-Modified: $Date$
Author: Christopher Barker <Chris.Barker@noaa.gov>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 20-Jan-2015
Python-Version: 3.5
Post-History:


Abstract
========

This PEP proposes the addition of a function to the standard library
that determines whether one value is approximately equal or "close"
to another value.


Rationale
=========

Floating point values have limited precision, so they cannot
represent some values exactly, and error accumulates with repeated
computation. As a result, it is common advice to only use an equality
comparison in very specific situations. Often an inequality comparison
fits the bill, but there are times (often in testing) where the
programmer wants to determine whether a computed value is "close" to
an expected value, without requiring them to be exactly equal. This is
common enough, particularly in testing, and not always obvious how to
do, so it would be a useful addition to the standard library.


Existing Implementations
------------------------

The standard library includes the
``unittest.TestCase.assertAlmostEqual`` method, but it:

* Is buried in the unittest.TestCase class

* Is an assertion, so you can't use it as a general test (easily)

* Uses number of decimal digits or an absolute delta, which are
  particular use cases that don't provide a general relative error.

The numpy package has the ``allclose()`` and ``isclose()`` functions.

The statistics package tests include an implementation, used for its
unit tests.

One can also find discussion and sample implementations on Stack
Overflow, and other help sites.

These existing implementations indicate that this is a common need,
and not trivial to write oneself, making it a candidate for the
standard library.


Proposed Implementation
=======================

NOTE: this PEP is the result of an extended discussion on the
python-ideas list [1]_.

The new function
will have the following signature::

    is_close_to(actual, expected, tol=1e-8, abs_tol=0.0)

``actual``: is the value that has been computed, measured, etc.

``expected``: is the "known" value.

``tol``: is the relative tolerance -- it is the amount of error
allowed, relative to the magnitude of the expected value.

``abs_tol``: is a minimum absolute tolerance level -- useful for
comparisons near zero.

Modulo error checking, etc, the function will return the result of::

    abs(expected-actual) <= max(abs(tol*expected), abs_tol)


Handling of non-finite numbers
------------------------------

The IEEE 754 special values of NaN, inf, and -inf will be handled
according to IEEE rules. Specifically, NaN is not considered close to
any other value, including NaN. inf and -inf are only considered close
to themselves.


Non-float types
---------------

The primary use-case is expected to be floating point numbers.
However, users may want to compare other numeric types similarly. In
theory, it should work for any type that supports ``abs()``,
comparisons, and subtraction. The code will be written and tested to
accommodate these types:

* ``Decimal``: for Decimal, the tolerance must be set to a Decimal type.

* ``int``

* ``Fraction``

* ``complex``: for complex, ``abs(z)`` will be used for scaling and
  comparison.


Behavior near zero
------------------

Relative comparison is problematic if either value is zero. In this
case, the tolerance is scaled by zero and is therefore itself zero, so
no nonzero difference will ever be smaller than it. To handle this
case, an optional parameter, ``abs_tol`` (default 0.0) can be used to
set a minimum tolerance to be used in the case of very small relative
tolerance.
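The behavior described so far can be sketched in a few lines. This is only an illustration for plain floats, not the sample implementation mentioned in this PEP; in particular, the explicit ``math.isinf`` branch is an assumption about one way to make inf close only to itself.

```python
import math


def is_close_to(actual, expected, tol=1e-8, abs_tol=0.0):
    """Illustrative float-only sketch of the proposed check."""
    if actual == expected:
        # Exact equality; this also makes inf close to inf and -inf to -inf.
        return True
    if math.isinf(actual) or math.isinf(expected):
        # Assumption: an infinity is close only to itself (handled above).
        return False
    # NaN falls through to here: every comparison involving NaN is False.
    return abs(expected - actual) <= max(abs(tol * expected), abs_tol)


# With expected == 0.0 the relative tolerance is zero, so only abs_tol helps.
print(is_close_to(1e-12, 0.0))                   # False
print(is_close_to(1e-12, 0.0, abs_tol=1e-9))     # True
print(is_close_to(1.0 + 1e-10, 1.0))             # True
print(is_close_to(float("nan"), float("nan")))   # False
```

Without the ``math.isinf`` branch, comparing inf against a finite value would reduce to ``inf <= inf`` and wrongly report the two as close.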
The values will then be considered close if::

    abs(expected-actual) <= abs(tol*expected) or abs(expected-actual) <= abs_tol

If the user sets the ``tol`` parameter to 0.0, then only the absolute
tolerance will affect the result, so this function provides an
absolute tolerance check as well.

A sample implementation is available (as of Jan 22, 2015) on GitHub:


Relative Difference
===================

There are essentially two ways to think about how close two numbers
are to each other: absolute difference: simply ``abs(a-b)``, and
relative difference: ``abs(a-b)/scale_factor`` [2]_. The absolute
difference is trivial enough that this proposal focuses on the
relative difference.

Usually, the scale factor is some function of the values under
consideration, for instance:

1) The absolute value of one of the input values

2) The maximum absolute value of the two

3) The minimum absolute value of the two.

4) The arithmetic mean of the two


Symmetry
--------

A relative comparison can be either symmetric or non-symmetric. For a
symmetric algorithm:

``is_close_to(a,b)`` is always equal to ``is_close_to(b,a)``

This is an appealing consistency -- it mirrors the symmetry of
equality, and is less likely to confuse people. However, often the
question at hand is:

"Is this computed or measured value within some tolerance of a known
value?"

In this case, the user wants the relative tolerance to be specifically
scaled against the known value. It is also easier for the user to
reason about.

This proposal uses this asymmetric test to allow this specific
definition of relative tolerance.

Example:

For the question: "Is the value of a within x% of b?", using b to
scale the percent error clearly defines the result.

However, as this approach is not symmetric, a may be within 10% of b,
but b is not within 10% of a. Consider the case::

    a = 9.0
    b = 10.0

The difference between a and b is 1.0. 10% of a is 0.9, so b is not
within 10% of a.
But 10% of b is 1.0, so a is within 10% of b.

Casual users might reasonably expect that if a is close to b, then b
would also be close to a. However, in the common cases, the tolerance
is quite small and often poorly defined, e.g. 1e-8, defined to only
one significant figure, so the result will be very similar regardless
of the order of the values. And if the user does care about the
precise result, s/he can take care to always pass in the two
parameters in sorted order.

This proposed implementation uses asymmetric criteria with the scaling
value clearly identified.


Expected Uses
=============

The primary expected use case is various forms of testing -- "are the
results computed near what I expect as a result?" This sort of test
may or may not be part of a formal unit testing suite.

The function might also be used to determine if a measured value is
within tolerance of an expected value.


Inappropriate uses
------------------

One use case for floating point comparison is testing the accuracy of
a numerical algorithm. However, in this case, the numerical analyst
ideally would be doing careful error propagation analysis, and should
understand exactly what to test for. It is also likely that ULP (Unit
in the Last Place) comparison may be called for.
While this function
may prove useful in such situations, it is not intended to be used in
that way.


Other Approaches
================

``unittest.TestCase.assertAlmostEqual``
---------------------------------------

Tests that values are approximately (or not approximately) equal by
computing the difference, rounding to the given number of decimal
places (default 7), and comparing to zero.

This method was not selected for this proposal, as the use of decimal
digits is a specific, not generally useful or flexible test.


numpy ``isclose()``
-------------------

The numpy package provides the vectorized functions ``isclose()`` and
``allclose()``, for similar use cases as this proposal:

``isclose(a, b, rtol=1e-05, atol=1e-08, equal_nan=False)``

      Returns a boolean array where two arrays are element-wise equal
      within a tolerance.

      The tolerance values are positive, typically very small numbers.
      The relative difference (rtol * abs(b)) and the absolute
      difference atol are added together to compare against the
      absolute difference between a and b

In this approach, the absolute and relative tolerance are added
together, rather than the ``or`` method used in this proposal. This is
computationally simpler, and if the relative tolerance is much larger
than the absolute tolerance, then the addition will have little
effect. But if the absolute and relative tolerances are of similar
magnitude, then the allowed difference will be about twice as large as
expected.

Also, if the values passed in are small compared to the absolute
tolerance, then the relative tolerance will be completely swamped,
perhaps unexpectedly.

This is why, in this proposal, the absolute tolerance defaults to zero
-- the user will be required to choose a value appropriate for the
values at hand.


Boost floating-point comparison
-------------------------------

The Boost project ( [3]_ ) provides a floating point comparison
function.
It is a symmetric approach, with both "weak" (larger of the
two relative errors) and "strong" (smaller of the two relative errors)
options.

It was decided that a method that clearly defined which value was used
to scale the relative error would be more appropriate for the standard
library.


References
==========

.. [1] Python-ideas list discussion thread

.. [2] Wikipedia page on relative difference

.. [3] Boost project floating-point comparison algorithms


Copyright
=========

This document has been placed in the public domain.

--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker@noaa.gov