<div dir="ltr"><div><div>OK folks,<br><br></div>There has been a lot of chatter about this, which I think has served to provide some clarity, at least to me. However, I'm concerned that the upshot, at least for folks not deep into the discussion, will be: clearly there are too many use-case-specific details to put any one thing in the std lib. But I still think we can provide something that is useful for most use-cases, and would like to propose what that is, and what the decision points are:<br><br></div>A function for the math module, called something like "is_close", "approx_equal", etc. It will do a relative-tolerance comparison, with a default tolerance maybe around 1e-12, and with the user able to specify the tolerance they want.<br><div><br></div><div>Optionally, the user can specify a "minimum absolute tolerance". It will default to zero, but can be set so that comparisons to zero can be handled gracefully.<br><br></div><div>The relative tolerance will be computed from the smaller of the two input values, so as to get symmetry: is_close(a,b) == is_close(b,a). (This is the Boost "strong" definition, and what is used by Steven D'Aprano's code in the statistics test module.)<br><br></div><div>Alternatively, the relative error could be computed against a particular one of the input values (the second one?). This would be asymmetric, but it would make clearer exactly how "relative" is defined, and be closer to what people may expect when using it as an "actual vs expected" test -- "expected" would be the scaling value. If the tolerance is small, it makes very little difference anyway, so I'm happy with whatever consensus moves us to. Note that if we go this way, then the parameter names should make it at least a little more clear -- maybe "actual" and "expected", rather than x and y or a and b or... 
and the function name should be something like is_close_to, rather than just is_close.<br><br></div><div>It will be designed for floating point numbers, and handle inf, -inf, and NaN "properly". But it will also work with other numeric types, to the extent that duck typing "just works" (i.e. division and comparisons all work).<br><br></div><div>Complex numbers will be handled by:<br></div><div>is_close(x.real, y.real) and is_close(x.imag, y.imag)<br></div><div>(but I haven't written any code for that yet)<br></div><div><br></div><div>It will not do a simple absolute comparison -- that is the job of a different function, or, better yet, folks just write it themselves:<br><br></div><div>abs(x - y) <= delta<br><br></div><div>really isn't much harder to write than a function call:<br><br></div><div>absolute_diff(x,y,delta)<br><br></div><div>Here is a gist with a sample implementation:<br><br><a href="https://gist.github.com/PythonCHB/6e9ef7732a9074d9337a">https://gist.github.com/PythonCHB/6e9ef7732a9074d9337a</a><br><br></div><div>I need to add more tests, and make the tests proper unit tests, but it's a start.<br><br></div><div>I also need to see how it does with data types other than float -- hopefully, it will "just work" with the core set.<br><br></div><div>I hope we can come to some consensus that something like this is the way to go.<br></div><div><br></div><div>-Chris<br><br></div><div><br><br><div><div><div class="gmail_extra"><br><div class="gmail_quote">On Sun, Jan 18, 2015 at 11:27 AM, Ron Adam <span dir="ltr"><<a href="mailto:ron3200@gmail.com" target="_blank">ron3200@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><span class=""><br>
<br>
On 01/17/2015 11:37 PM, Chris Barker wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
(Someone claimed that 'nothing is close to zero'. This is<br>
nonsensical both in applied math and everyday life.)<br>
<br>
<br>
I'm pretty sure someone (more than one of us) asserted that "nothing is<br>
*relatively* close to zero" -- very different.<br>
</blockquote>
<br></span>
Yes, that is the case.<span class=""><br>
<br>
<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
And I really wanted a way to have a default behavior that would do a<br>
reasonable transition to an absolute tolerance near zero, but I no longer<br>
think that's possible. (numpy's implementation kind of does that, but it is<br>
really wrong for small numbers, and if you made the default min_tolerance<br>
the smallest possible representable number, it really wouldn't be useful.)<br>
</blockquote>
<br></span>
I'm going to try to summarise what I got out of this discussion. Maybe it will help bring some focus to the topic.<br>
<br>
I think there are two cases to consider.<br>
<br>
# The most common case.<br>
rel_is_good(actual, expected, delta) # value +- %delta.<br>
<br>
# Testing for possible equivalence?<br>
rel_is_close(value1, value2, delta) # %delta close to each other.<br>
<br>
I don't think they are quite the same thing.<br>
<br>
rel_is_good(9, 10, .1) --> True<br>
rel_is_good(10, 9, .1) --> False<br>
<br>
rel_is_close(9, 10, .1) --> True<br>
rel_is_close(10, 9, .1) --> True<br>
<br>
<br>
In the "is close" case, it shouldn't matter what order the arguments are given in. The delta is how far the smaller number is from the larger one, as a fraction of the larger (for numbers of the same sign).<br>
<br>
So when calculating the relative error from two values, you want it to be consistent with the rel_is_close function.<br>
<br>
rel_is_close(a, b, delta) <---> rel_err(a, b) <= delta<br>
<br>
And you should not use the rel_err function in the rel_is_good function.<br>
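<br>
A minimal Python sketch of the two checks, plus a rel_err that is consistent with rel_is_close. The function bodies are assumptions, chosen only so that they reproduce the four results above: rel_is_good scales the tolerance by the expected value, while rel_is_close scales it by the larger magnitude.<br>
<br>
```python
def rel_err(a, b):
    """Relative difference, scaled by the larger magnitude (symmetric)."""
    scale = max(abs(a), abs(b))
    if scale == 0:
        return 0.0  # both values are zero, so there is no difference
    return abs(a - b) / scale


def rel_is_close(value1, value2, delta):
    """Symmetric check: the argument order does not matter."""
    return rel_err(value1, value2) <= delta


def rel_is_good(actual, expected, delta):
    """Asymmetric check: the tolerance is a fraction of the expected value."""
    # Deliberately does not use rel_err, which scales by the larger value.
    return abs(actual - expected) <= delta * abs(expected)
```
<br>
With these definitions, swapping the arguments changes the answer for rel_is_good but not for rel_is_close.<br>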
<br>
<br>
<br>
The next issue is where the numeric accuracy of the data, significant digits, and the language's accuracy (ULPs) come into the picture.<br>
<br>
My intuition (I need to test the idea to make a firmer claim) is that in the case of is_good, you want to exclude the uncertain parts, but with is_close, you want to include the uncertain parts.<br>
<br>
Two values "are close" if you can't tell one from the other with certainty. The is_close range includes any uncertainty.<br>
<br>
A value is good if it's within a range with certainty. And this excludes any uncertainty.<br>
<br>
This is where taking an absolute delta into consideration comes in. The minimum range for both is the uncertainty of the data. But is_close and is_good do different things with it.<br>
<br>
Of course all of this only applies if you agree with these definitions of is_close, and is_good. ;)<br>
<br>
Cheers,<br>
Ron<div class=""><div class="h5"><br>
</div></div></blockquote></div><br><br clear="all"><br>-- <br><div class="gmail_signature"><br>Christopher Barker, Ph.D.<br>Oceanographer<br><br>Emergency Response Division<br>NOAA/NOS/OR&R (206) 526-6959 voice<br>7600 Sand Point Way NE (206) 526-6329 fax<br>Seattle, WA 98115 (206) 526-6317 main reception<br><br><a href="mailto:Chris.Barker@noaa.gov" target="_blank">Chris.Barker@noaa.gov</a></div>
</div></div></div></div></div>