[Python-ideas] Way to check for floating point "closeness"?

Fri Jan 16 04:46:05 CET 2015

On Thu, Jan 15, 2015 at 10:28 PM, Stephen J. Turnbull <stephen at xemacs.org>
wrote:

> Neil Girdhar writes:
>
>  > The symmetric error that people are proposing in this thread has no
>  > intuitive meaning to me.
>
> There are many applications where the goal is to match two values,
> neither of which is the obvious standard (eg, statistical tests
> comparing populations,

No. If you're trying to answer the question of whether two things belong
to the same population as opposed to another, you should infer the
population statistics based on a and b and your estimated overall
population statistics, and then calculate cross entropies.  Using some
symmetric relative error has no meaning.

> or even electrical circuits, where it may be
> important that two components be matched to within 1%, although the
> absolute value might be allowed to vary by up to 10%).  Symmetric
> error is appropriate for those applications.  Symmetric error may be
> less appropriate for applications where you want to hit an absolute
> value, but it's (provably) not too bad.
>
> By "provably not too bad" I mean that if you take the word "close" as
> a qualitative predicate, then although you can make the "distance"
> explode by taking the "actual" to be an order of magnitude distant in
> absolute units, you'll still judge it "not close" (just more so, but
> "more so" is meaningless in this qualitative context).  On the other
> hand, for values that *are* close (with reasonable tolerances) it
> doesn't much matter which value you choose as the standard: "most" of
> the time you will get the "right" answer (and as the tolerance gets
> tighter, "most" tends to a limit of 100%).
>

In statistics and machine learning, at least, many people have argued that
the cross entropy error is the most reasonable loss function.  When you
have an observed value and an estimated value, the right way of comparing
them is a cross entropy error, and that's what absolute error and relative
error are doing.  They correspond to cross entropies of the
minimum-assumption distributions over the reals and the positive reals,
respectively.

I think the numpy.allclose function almost always gives you what you want
when you have an actual and an estimated value, which is the more usual
case.
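
To make the asymmetry concrete, here is a minimal pure-Python sketch of
the comparison that numpy.allclose documents, |a - b| <= atol + rtol * |b|.
The helper name allclose_like is hypothetical and only handles scalars; the
default tolerances are the ones numpy documents.  The second argument plays
the role of the reference ("actual") value, so swapping the arguments can
change the answer.

```python
def allclose_like(a, b, rtol=1e-05, atol=1e-08):
    """Scalar sketch of numpy.allclose's documented test.

    Hypothetical helper for illustration only: the tolerance is scaled
    by abs(b), so b acts as the reference value and a as the estimate.
    """
    return abs(a - b) <= atol + rtol * abs(b)


# The difference is 12 in both calls; only the reference value changes.
estimate_vs_112 = allclose_like(100.0, 112.0, rtol=0.11, atol=0.0)
estimate_vs_100 = allclose_like(112.0, 100.0, rtol=0.11, atol=0.0)

print(estimate_vs_112)  # tolerance is 0.11 * 112 = 12.32, so 12 fits
print(estimate_vs_100)  # tolerance is 0.11 * 100 = 11, so 12 does not
```

The two calls disagree because the allowed error is measured against the
second argument alone, which is the behaviour being argued for when one
value is the known standard.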

>
> The generic "are_close()" function should be symmetric.  I suppose it
> might also be useful to have an "is_close_to()" function that is
> asymmetric.
>

I disagree.  Since the usual case is to have an observed and an estimated
value, the close function should not be symmetric.  Either you should have
two functions, one for relative error and one for absolute error, or you
should combine them like numpy did.
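
A rough sketch of the two-function option, under stated assumptions: the
names abs_close and rel_close and their argument order are hypothetical,
not a proposed API.  Both treat the first argument as the observed
("actual") value and the second as the estimate, so neither is symmetric
in spirit, and the relative test is scaled by the actual value only.

```python
def abs_close(actual, estimate, tol):
    """Absolute-error test: the estimate is within tol of the actual value."""
    return abs(estimate - actual) <= tol


def rel_close(actual, estimate, tol):
    """Relative-error test, scaled by the actual value only.

    Written as a multiplication rather than a division so that
    actual == 0 does not raise; in that case only an exact match passes.
    """
    return abs(estimate - actual) <= tol * abs(actual)


# Far from zero, a 1% relative tolerance is the natural question to ask.
print(rel_close(1000.0, 1009.0, 0.01))  # error 9 against an allowance of 10
print(abs_close(1000.0, 1009.0, 1.0))   # error 9 against an allowance of 1

# At zero the relative test is useless, and the absolute test takes over.
print(rel_close(0.0, 1e-9, 0.01))
print(abs_close(0.0, 1e-9, 1e-6))
```

Keeping the two tests separate makes the caller choose which notion of
error applies; the combined numpy-style form trades that explicitness for
a single call that also behaves sensibly near zero.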

Best,

Neil