[Python-ideas] Way to check for floating point "closeness"?

Neil Girdhar mistersheik at gmail.com
Fri Jan 16 16:13:53 CET 2015


Actually, I was wrong about the exponential distribution's KL divergence.
It's a relative-error term, (a-b)/b, plus another term, log(b/a), so I guess
I don't see what relative error means except as a heuristic.

Anyway, even if your symmetric error makes sense to you, does anyone
already use it?  If it were up to me, relative error would be (a-b)/b +
log(b/a), but since no one uses that, I think it's a bad idea.
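
To spell that out, here is a minimal sketch of my own (nothing from the
stdlib or numpy) comparing the closed-form KL divergence between two
exponential distributions, parameterized by their means, with the plain
relative error:

    import math

    def kl_exponential(a, b):
        """KL(Exp(mean=a) || Exp(mean=b)) = log(b/a) + a/b - 1."""
        return math.log(b / a) + a / b - 1.0

    def relative_error(a, b):
        """Plain relative error of a with respect to b."""
        return abs(a - b) / abs(b)

    for a, b in [(1.0, 1.001), (1.0, 1.1), (1.0, 2.0)]:
        print(a, b, relative_error(a, b), kl_exponential(a, b))

For nearby values the log term cancels most of the linear term, so the KL
behaves like half the squared relative error, and the two quantities drift
further apart as a and b separate, which is why relative error on its own
looks like a heuristic to me.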

On Thu, Jan 15, 2015 at 10:46 PM, Neil Girdhar <mistersheik at gmail.com>
wrote:

>
>
> On Thu, Jan 15, 2015 at 10:28 PM, Stephen J. Turnbull <stephen at xemacs.org>
> wrote:
>
>> Neil Girdhar writes:
>>
>>  > The symmetric error that people are proposing in this thread has no
>>  > intuitive meaning to me.
>>
>> There are many applications where the goal is to match two values,
>> neither of which is the obvious standard (eg, statistical tests
>> comparing populations,
>
>
> No, if you're trying to answer the question of whether two things belong to
> the same population as opposed to another, you should infer the population
> statistics based on a and b and your estimated overall population
> statistics and then calculate cross entropies.  Using some symmetric
> relative error has no meaning.
>
>
>> or even electrical circuits, where it may be
>> important that two components be matched to within 1%, although the
>> absolute value might be allowed to vary by up to 10%).  Symmetric
>> error is appropriate for those applications.  Symmetric error may be
>> less appropriate for applications where you want to hit an absolute
>> value, but it's (provably) not too bad.
>>
>> By "provably not too bad" I mean that if you take the word "close" as
>> a qualitative predicate, then although you can make the "distance"
>> explode by taking the "actual" to be an order of magnitude distant in
>> absolute units, you'll still judge it "not close" (just more so, but
>> "more so" is meaningless in this qualitative context).  On the other
>> hand, for values that *are* close (with reasonable tolerances) it
>> doesn't much matter which value you choose as the standard: "most" of
>> the time you will get the "right" answer (and as the tolerance gets
>> tighter, "most" tends to a limit of 100%).
>>
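
To put a number on that claim (my own check, not anything Stephen posted):
for two values within a tight tolerance, the relative error measured against
either operand differs only at second order in the tolerance.

    # Relative error of the same pair, measured against a and against b.
    for a, b in [(1.0, 1.0 + 1e-3), (1.0, 1.0 + 1e-6)]:
        err_vs_a = abs(a - b) / abs(a)
        err_vs_b = abs(a - b) / abs(b)
        print(err_vs_a, err_vs_b, abs(err_vs_a - err_vs_b))

The gap between the two conventions is about the square of the tolerance, so
for tight tolerances a closeness test returns the same verdict whichever
operand it treats as the standard.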
>
> In statistics and machine learning, at least, many people have argued that
> the cross entropy error is the most reasonable loss function.  When you
> have an observed value and an estimated value, the right way of comparing
> them is a cross entropy error, and that is what absolute error and relative
> error are doing.  They correspond to cross entropies of the minimum
> assumptive distributions over the reals and the positive reals.
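
To make that correspondence concrete, here is a small sketch of my own
(taking the fixed-variance normal as the minimum assumptive distribution
over the reals): the KL divergence between two equal-variance normals
depends on the two values only through their absolute difference.

    def kl_normal_fixed_sigma(a, b, sigma=1.0):
        """KL(N(a, sigma^2) || N(b, sigma^2)) = (a - b)**2 / (2 * sigma**2)."""
        return (a - b) ** 2 / (2.0 * sigma ** 2)

    # The divergence is a function of the absolute error |a - b| alone,
    # no matter where a and b sit on the real line.
    print(kl_normal_fixed_sigma(0.0, 0.1))      # 0.005
    print(kl_normal_fixed_sigma(100.0, 100.1))  # 0.005

The exponential case over the positive reals gives the relative-error-plus-log
expression from the top of this message.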
>
> I think the numpy.allclose function almost always gives you what you want
> when you have an actual and an estimated value, which is the more usual
> case.
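
For reference, numpy documents allclose(a, b) as testing
|a - b| <= atol + rtol * |b|, i.e. it adds an absolute and a relative
tolerance and uses the second argument as the reference, so the check is not
symmetric (the huge rtol below is only there to make the asymmetry visible):

    import numpy as np

    a, b = 1.0, 1.5
    # |a - b| = 0.5 is compared against rtol * |reference|.
    print(np.allclose(a, b, rtol=0.4, atol=0.0))  # 0.5 <= 0.4 * 1.5 -> True
    print(np.allclose(b, a, rtol=0.4, atol=0.0))  # 0.5 <= 0.4 * 1.0 -> False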
>
>
>>
>> The generic "are_close()" function should be symmetric.  I suppose it
>> might also be useful to have an "is_close_to()" function that is
>> asymmetric.
>>
>
> I disagree.  Since the usual case is to have an observed and an estimated
> value, the close function should not be symmetric.  Either you should
> have two functions, relative error and absolute error, or you should
> combine them like numpy did.
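
A minimal sketch of the kind of asymmetric check described above (the name,
signature, and the choice to combine the tolerances with max() are mine,
purely illustrative):

    def is_close_to(actual, expected, rel_tol=1e-9, abs_tol=0.0):
        """True if `actual` is within rel_tol of `expected` (relative to
        `expected`) or within abs_tol of it.  `expected` is the reference,
        so the test is deliberately asymmetric."""
        return abs(actual - expected) <= max(rel_tol * abs(expected), abs_tol)

    print(is_close_to(1.0 + 1e-10, 1.0))          # True
    print(is_close_to(0.0, 1e-12, abs_tol=1e-9))  # True near zero, thanks to abs_tol

Taking the max of the two tolerances is one choice; numpy.allclose adds them
instead.  Either way, the second argument fixes the direction of the
comparison.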
>
> Best,
>
> Neil
>
>