[Python-ideas] Floating point "closeness" Proposal Outline
Neil Girdhar
mistersheik at gmail.com
Tue Jan 20 05:29:37 CET 2015
Also for complex numbers, I think comparing the magnitude (distance from
the origin, or absolute value) of (x-y) to the size of x or y makes more
sense than calling is_close on the real and imaginary parts. What if the
real parts are much larger than the imaginary parts, e.g. x=1e5+1e-5j,
y=1e5-1e-5j? Do you think x and y are not close?
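
To make the difference concrete, here is a minimal sketch of the two
approaches. The function names and the tolerance of 1e-9 are placeholders
picked for illustration, not the proposed API. The relative difference
between x and y here is about 2e-10, so a tighter tolerance would reject
the pair under either approach.

```python
def is_close(a, b, rel_tol):
    # Hypothetical real-valued test: symmetric, scaled by the smaller magnitude.
    return abs(a - b) <= rel_tol * min(abs(a), abs(b))


def complex_close_componentwise(x, y, rel_tol):
    # Compare real and imaginary parts independently.
    return is_close(x.real, y.real, rel_tol) and is_close(x.imag, y.imag, rel_tol)


def complex_close_magnitude(x, y, rel_tol):
    # Compare |x - y| against the smaller of |x| and |y|.
    return abs(x - y) <= rel_tol * min(abs(x), abs(y))


x = 1e5 + 1e-5j
y = 1e5 - 1e-5j
componentwise = complex_close_componentwise(x, y, rel_tol=1e-9)
by_magnitude = complex_close_magnitude(x, y, rel_tol=1e-9)
print(componentwise, by_magnitude)
```

The componentwise test fails because the tiny imaginary parts differ by
200% of their own size, even though that difference is negligible next to
the magnitude of x and y.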
Best,
Neil
On Monday, January 19, 2015 at 1:33:44 AM UTC-5, Chris Barker wrote:
>
> OK folks,
>
> There has been a lot of chatter about this, which I think has served to
> provide some clarity, at least to me. However, I'm concerned that the
> upshot, at least for folks not deep into the discussion, will be: clearly
> there are too many use-case specific details to put any one thing in the
> std lib. But I still think we can provide something that is useful for most
> use-cases, and would like to propose what that is, and what the decision
> points are:
>
> A function for the math module, called something like "is_close",
> "approx_equal", etc. It will compare using a relative tolerance, with a
> default maybe around 1e-12, with the user able to specify the tolerance
> they want.
>
> Optionally, the user can specify a "minimum absolute tolerance". It will
> default to zero, but can be set so that comparisons to zero can be handled
> gracefully.
>
> The relative tolerance will be computed from the smaller of the two input
> values, so as to get symmetry: is_close(a,b) == is_close(b,a). (This is
> the Boost "strong" definition, and what is used by Steven D'Aprano's code
> in the statistics test module.)
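
A minimal sketch of the symmetric test described above, under stated
assumptions: this is not the code from the gist, the parameter names
rel_tol and abs_tol are placeholders, and the inf/NaN handling mentioned
further down is left out.

```python
def is_close(a, b, rel_tol=1e-12, abs_tol=0.0):
    # Exact equality short-circuits, so identical values are always close.
    if a == b:
        return True
    diff = abs(a - b)
    # Scale by the smaller magnitude so the result does not depend on
    # argument order.
    scale = min(abs(a), abs(b))
    # abs_tol acts as a floor on the allowed difference; with the default of
    # zero, nothing except zero itself is close to zero.
    return diff <= max(rel_tol * scale, abs_tol)


print(is_close(1.0, 1.0 + 1e-13))
print(is_close(1e-15, 0.0))
print(is_close(1e-15, 0.0, abs_tol=1e-12))
```

The last two calls show why the absolute floor is needed: a purely relative
test scaled by zero can never succeed for two distinct values.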
>
> Alternatively, the relative error could be computed against a particular
> one of the input values (the second one?). This would be asymmetric, but
> would make it clearer exactly how "relative" is defined, and be closer to
> what people may expect when using it as an "actual vs expected" test --
> "expected" would be the scaling value. If the tolerance is small, it makes
> very little
> difference anyway, so I'm happy with whatever consensus moves us to. Note
> that if we go this way, then the parameter names should make it at least a
> little more clear -- maybe "actual" and "expected", rather than x and y or
> a and b or... and the function name should be something like is_close_to,
> rather than just is_close.
>
> It will be designed for floating point numbers, and handle inf, -inf, and
> NaN "properly". But it will also work with other numeric types, to the
> extent that duck typing "just works" (i.e. division and comparisons all
> work).
>
> Complex numbers will be handled by:
> is_close(x.real, y.real) and is_close(x.imag, y.imag)
> (but I haven't written any code for that yet)
>
> It will not do a simple absolute comparison -- that is the job of a
> different function, or, better yet, folks just write it themselves:
>
> abs(x - y) <= delta
>
> really isn't much harder to write than a function call:
>
> absolute_diff(x,y,delta)
>
> Here is a gist with a sample implementation:
>
> https://gist.github.com/PythonCHB/6e9ef7732a9074d9337a
>
> I need to add more tests, and make the tests proper unit tests, but it's a
> start.
>
> I also need to see how it does with other data types than float --
> hopefully, it will "just work" with the core set.
>
> I hope we can come to some consensus that something like this is the way
> to go.
>
> -Chris
>
> On Sun, Jan 18, 2015 at 11:27 AM, Ron Adam <ron... at gmail.com> wrote:
>
>>
>>
>> On 01/17/2015 11:37 PM, Chris Barker wrote:
>>
>>> (Someone claimed that 'nothing is close to zero'. This is
>>> nonsensical both in applied math and everyday life.)
>>>
>>>
>>> I'm pretty sure someone (more than one of us) asserted that "nothing is
>>> *relatively* close to zero" -- very different.
>>>
>>
>> Yes, that is the case.
>>
>>
>>> And I really wanted a way to have a default behavior that would do a
>>> reasonable transition to an absolute tolerance near zero, but I no longer
>>> think that's possible. (numpy's implementation kind of does that, but it
>>> is really wrong for small numbers, and if you made the default
>>> min_tolerance the smallest possible representable number, it really
>>> wouldn't be useful.)
>>>
>>
>> I'm going to try to summarise what I got out of this discussion. Maybe
>> it will help bring some focus to the topic.
>>
>> I think there are two cases to consider.
>>
>> # The most common case.
>> rel_is_good(actual, expected, delta) # value +- %delta.
>>
>> # Testing for possible equivalence?
>> rel_is_close(value1, value2, delta) # %delta close to each other.
>>
>> I don't think they are quite the same thing.
>>
>> rel_is_good(9, 10, .1) --> True
>> rel_is_good(10, 9, .1) --> False
>>
>> rel_is_close(9, 10, .1) --> True
>> rel_is_close(10, 9, .1) --> True
>>
>>
>> In the "is close" case, it shouldn't matter what order the arguments are
>> given. The delta is the distance of the smaller number from the larger
>> one, as a fraction of the larger (for numbers of the same sign).
>>
>> So when calculating the relative error from two values, you want it to be
>> consistent with the rel_is_close function.
>>
>> rel_is_close(a, b, delta) <---> rel_err(a, b) <= delta
>>
>> And you should not use the rel_err function in the rel_is_good function.
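
One possible reading of the two cases above, sketched under assumptions:
rel_is_good scales delta by the expected value, while rel_is_close scales
by the larger magnitude, which reproduces the four results listed earlier.
The body of rel_err is a guess, since the thread only names it, and the
zero case is handled by an arbitrary choice.

```python
def rel_err(a, b):
    # Assumed definition: difference as a fraction of the larger magnitude.
    scale = max(abs(a), abs(b))
    # Both values are zero, so they are equal; report no error.
    if scale == 0:
        return 0.0
    return abs(a - b) / scale


def rel_is_close(value1, value2, delta):
    # Symmetric: argument order does not matter.
    return rel_err(value1, value2) <= delta


def rel_is_good(actual, expected, delta):
    # Asymmetric: the allowed range is expected +- delta * |expected|.
    return abs(actual - expected) <= delta * abs(expected)


print(rel_is_good(9, 10, .1), rel_is_good(10, 9, .1))
print(rel_is_close(9, 10, .1), rel_is_close(10, 9, .1))
```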
>>
>>
>>
>> The next issue is where the numeric accuracy of the data, significant
>> digits, and the language's accuracy (ULPs) come into the picture.
>>
>> My intuition (I need to test the idea to make a firmer claim) is that
>> in the case of is_good, you want to exclude the uncertain parts, but with
>> is_close, you want to include the uncertain parts.
>>
>> Two values "are close" if you can't tell one from the other with
>> certainty. The is_close range includes any uncertainty.
>>
>> A value is good if it's within a range with certainty. And this excludes
>> any uncertainty.
>>
>> This is where taking an absolute delta into consideration comes in. The
>> minimum range for both is the uncertainty of the data. But is_close and
>> is_good do different things with it.
>>
>> Of course all of this only applies if you agree with these definitions of
>> is_close, and is_good. ;)
>>
>> Cheers,
>> Ron
>>
>> _______________________________________________
>> Python-ideas mailing list
>> Python... at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
>>
>
>
>
> --
>
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR&R (206) 526-6959 voice
> 7600 Sand Point Way NE (206) 526-6329 fax
> Seattle, WA 98115 (206) 526-6317 main reception
>
> Chris.... at noaa.gov
>