[Python-ideas] Floating point "closeness" Proposal Outline

Neil Girdhar mistersheik at gmail.com
Tue Jan 20 05:10:35 CET 2015


If you decide to write a PEP, please list the other algorithms you found and 
their definitions.  Personally, I'm for defining math.isclose the same way as 
numpy.isclose, for consistency alone.
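
For reference, numpy.isclose tests abs(a - b) <= atol + rtol * abs(b), with 
defaults rtol=1e-05 and atol=1e-08.  Here is a rough pure-Python sketch of 
that test for scalars.  The name isclose_numpy_style is made up for 
illustration, and the NaN/inf handling is a simplification of what numpy 
does by default, not a copy of its code.

```python
import math

def isclose_numpy_style(a, b, rtol=1e-05, atol=1e-08):
    """Scalar sketch of the numpy.isclose test (hypothetical helper name)."""
    # Simplified special cases: NaN is never close to anything,
    # and an infinity is close only to an identical infinity.
    if math.isnan(a) or math.isnan(b):
        return False
    if math.isinf(a) or math.isinf(b):
        return a == b
    # The tolerance scales with abs(b) only, so the test is asymmetric.
    return abs(a - b) <= atol + rtol * abs(b)

# Swapping the arguments can change the answer:
print(isclose_numpy_style(1.0, 2.0, rtol=0.5, atol=0.0))  # tolerance is 0.5 * 2.0
print(isclose_numpy_style(2.0, 1.0, rtol=0.5, atol=0.0))  # tolerance is 0.5 * 1.0
```

Note that b plays the role of the reference value, which is worth spelling 
out in a PEP if this definition is adopted.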

If you decide to invent a relative error function, my suggestion is 
(a-b)/b + log(b/a), which is nonnegative, zero only at equality, and 
otherwise penalizes a positive guess a for differing from a positive 
target b.  To me, guessing b as 1.9b seems better than guessing it as 
0.1b, and this function agrees: it penalizes the underestimate more.  It 
is the KL divergence between exponential distributions with means a and b, 
which has a clear statistical meaning, but only applies to positive numbers.
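
A quick sketch of that function, to make the asymmetry concrete.  The name 
relative_penalty is just something I made up for this example, and raising 
ValueError for non-positive inputs is my own choice.

```python
import math

def relative_penalty(a, b):
    """(a - b)/b + log(b/a) for a positive guess a and positive target b."""
    if a <= 0 or b <= 0:
        raise ValueError("defined only for positive numbers")
    return (a - b) / b + math.log(b / a)

b = 1.0
over = relative_penalty(1.9 * b, b)   # overestimate by 0.9b: roughly 0.26
under = relative_penalty(0.1 * b, b)  # underestimate by 0.9b: roughly 1.40
print(over, under, relative_penalty(b, b))
```

Both guesses are off by 0.9b in absolute terms, but the underestimate gets 
a penalty more than five times larger.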

Best,

Neil

On Monday, January 19, 2015 at 7:59:34 PM UTC-5, Chris Barker - NOAA 
Federal wrote:
>
> > On Jan 19, 2015, at 3:17 PM, Ron Adam <ron... at gmail.com> wrote: 
> > The two different cases probably should be two different functions, and 
> > not use a flag.  I'm not suggesting we need both. 
>
> I agree there--"strong" it is for the initial proposal at least. 
>
> > Well, approximate-in, approximate-out.  Unfortunately that applies to 
> > all math 
>
> > But many computer programmers like things to be a bit more precise. 
>
> Well, yes and no. Floating point has its limitations, and many 
> programmers simply use double (Python float) and hope the results are 
> good enough. What I'm hoping is that this will at least make it more 
> likely folks will apply a "good enough" criterion. 
>
> Proper floating point error analysis is hard; most of us leave it for 
> the experts, and I'm not suggesting this is useful for them. 
>
>
>
> > 
> > So I suggest not using the word approximate or estimate in the docs. 
> > The calculation isn't an approximation even if the values you supply it 
> > are. It actually is a well defined range test. 
> > 
> >>        I hope we can come to some consensus that something like this is 
> >>        the way to go. 
> >> 
> >>    Good examples will help with this.  It may also help with choosing a 
> >>    good name. 
> >> 
> >> 
> >> you mean use-case examples? rather than specific value examples? 
> > 
> > Yes, specific values don't indicate how something should be used. 
> > 
> > 
> >>    To me, the strong version is an "is-good" test, and the weak version 
> >>    is an "is-close" test.  I think it could be important to some people. 
> >> 
> >>    I like the idea of being able to use these as a teaching tool to 
> >>    demonstrate how our ideas of closeness, equality, and inequality can 
> >>    be subjective. 
> >> 
> >> 
> >> Are you suggesting that we allow a flag for the user to set to choose 
> >> whether to use the weak or strong version? I'd rather not -- I see this 
> >> as a practical, works-most-of-the-time thing, not a teaching tool, or a 
> >> "provides every use case" tool. 
> > 
> > No flag, just that it needs to be well defined and not mix explanations 
> > of use of one with the other.  Pick one, and then document how to use it 
> > correctly.  At some point maybe someone will add the other if it's needed. 
> > 
> > It is possible to use one for the other if you take the differences into 
> > account in the arguments. 
> > 
> > 
> >>    There are two cases... 
> >> 
> >>    1: (The weak version is required for this to work.) 
> >> 
> >>    Two numbers are definitely not equivalent if they are further apart 
> >>    than the largest error amount.  (The larger number better indicates 
> >>    the largeness of the possible relative error.) 
> >> 
> >>    And two numbers are close if you can't determine if they are 
> >>    equivalent, or not-equivalent with certainty.* 
> >> 
> >>    (* "close numbers" may include equivalent numbers if you define it 
> >>    as a set of all definitely not-equivalent numbers.) 
> >> 
> >>    2: (The strong version is required for this to work.) 
> >> 
> >>    A value is good if it's within a valid range with certainty.  It is 
> >>    less than the smaller relative range of either number.  The smaller 
> >>    number better indicates the magnitude of smallness. 
> >> 
> >>    So case 1 should be used to test for errors, and case 2 should be 
> >>    used to test for valid ranges. 
> >> 
> >>    It seems you have the 2nd case in mind, and that's fine.  Some of us 
> >>    were thinking of the first case, and possibly switching from one to 
> >>    the other during the discussion, which is probably why it got 
> >>    confusing or repetitious at some points. 
> >> 
> >> 
> >> yes, I suppose I do -- and again, in the common use case, where the 
> >> tolerance is also approximate, it really doesn't matter. 
> > 
> > I'm curious to what degree it can matter, given different size values 
> > and tolerances? 
> > 
> > 
> >>    I think both of these are useful, but you definitely need to be 
> >>    clear which one you are implementing, and to document it clearly. 
> >> 
> >> 
> >> yup. 
> > 
> > Cheers, 
> >   Ron 
> > 
> > _______________________________________________ 
> > Python-ideas mailing list 
> > Python... at python.org 
> > https://mail.python.org/mailman/listinfo/python-ideas 
> > Code of Conduct: http://python.org/psf/codeofconduct/ 
>
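
On Ron's question above about how much the weak/strong choice can matter: 
here is a small sketch, assuming the usual definitions in which the weak 
test scales the tolerance by the larger magnitude and the strong test by 
the smaller one.  That matches the descriptions quoted above, but the exact 
formulas are my assumption, and the function names are made up.

```python
def is_close_weak(a, b, tol):
    """Difference is within tol relative to the larger magnitude."""
    return abs(a - b) <= tol * max(abs(a), abs(b))

def is_close_strong(a, b, tol):
    """Difference is within tol relative to the smaller magnitude."""
    return abs(a - b) <= tol * min(abs(a), abs(b))

# With a loose tolerance the two tests can disagree ...
print(is_close_weak(1.0, 1.5, 0.4), is_close_strong(1.0, 1.5, 0.4))
# ... but only in a narrow band; nearby cases get the same answer from both.
print(is_close_weak(1.0, 1.2, 0.4), is_close_strong(1.0, 1.2, 0.4))
print(is_close_weak(1.0, 2.0, 0.4), is_close_strong(1.0, 2.0, 0.4))
```

The two tests disagree only when the difference falls between tol times 
the smaller magnitude and tol times the larger one.  That window is at most 
tol * abs(a - b) wide, so for small tolerances it is tiny, which fits 
Chris's point that it rarely matters when the tolerance itself is 
approximate.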