[Python-ideas] Floating point "closeness" Proposal Outline

Wed Jan 21 05:43:58 CET 2015

On 01/20/2015 04:40 AM, Steven D'Aprano wrote:
> On Mon, Jan 19, 2015 at 08:10:35PM -0800, Neil Girdhar wrote:
>
>> >If you decide to invent a relative error function,
> The error functions we have been talking about are hardly "invented".
> They're mathematically simple and obvious, and can be found in just
> about any undergraduate book on numerical computing:
>
> - The absolute error between two quantities a and b is the
>    absolute difference between them, abs(a-b).
>
> - The relative error is the difference relative to some
>    denominator d, abs(a-b)/abs(d), typically with d=a or d=b.
>
> If you happen to know that b is the correct value, then it is common to
> choose b as the denominator. If you have no a priori reason to think
> either a or b is correct, or if you prefer a symmetrical function, a
> common choice is to use d = min(abs(a), abs(b)).

The more we talk about this, the more I'm beginning to dislike the 
symmetric version.

We are trading an explicit (first, second) relationship with an implicit 
(smaller, larger) relationship.  For Python's general use, I don't like 
that.  Sorry. :/

Example...

Testing whether two resistors are within 5% of 10k ohms:

is_close(10500, 10000, 0.05)   9500.0<--->10500.0  True
is_close(10000, 9500, 0.05)   9025.0<--->9975.0  False

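The order doesn't matter, but the size has an effect: the tolerance is 
taken from the smaller value (5% of 9500 is 475), so 10000 falls outside 
the 9025 to 9975 range.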
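
Here is a minimal sketch of the kind of check behind those two lines, 
assuming the symmetric version scales the tolerance by the smaller 
magnitude, d = min(abs(a), abs(b)).  The name is_close and its signature 
are just placeholders for this discussion, not a settled API.

```python
def is_close(a, b, rel_tol):
    """Symmetric check: the tolerance is a fraction of the smaller magnitude.

    Assumption for illustration: d = min(abs(a), abs(b)).
    """
    d = min(abs(a), abs(b))
    return abs(a - b) <= rel_tol * d


# Swapping the arguments changes nothing...
print(is_close(10500, 10000, 0.05))
print(is_close(10000, 10500, 0.05))

# ...but the same 500 ohm gap fails once the smaller value is 9500,
# because 5% of 9500 is only 475.
print(is_close(10000, 9500, 0.05))
```

The 500 ohm difference is the same in both cases; only the implicit 
reference value moved.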

Using the larger value as the divisor can result in false positives on the 
lower end instead of false negatives on the higher end.

So I strongly think any such function (in Python) should have meaningful 
named arguments.  It's going to save a lot of explaining down the road.  :-)
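
Something along these lines is what I have in mind.  It is only a sketch: 
the function name and the argument names (actual, expected, rel_tol) are 
made up for illustration, and it assumes the tolerance is always taken 
relative to the expected value.

```python
def is_close_to(actual, *, expected, rel_tol):
    """Asymmetric check: the tolerance is a fraction of the expected value.

    All names here are hypothetical.  The keyword-only arguments force the
    caller to say which value is the reference.
    """
    return abs(actual - expected) <= rel_tol * abs(expected)


# Both ends of the 5% band around 10k ohms pass.
print(is_close_to(10500, expected=10000, rel_tol=0.05))
print(is_close_to(9500, expected=10000, rel_tol=0.05))

# Just outside the band fails.
print(is_close_to(9400, expected=10000, rel_tol=0.05))
```

With the reference spelled out at the call site, the reader doesn't have 
to work out which value the 5% applies to.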

> See, for example:
>
> http://mathworld.wolfram.com/AbsoluteError.html
> http://mathworld.wolfram.com/RelativeError.html
>
>
>> >my suggestion is:
>> >(a-b)/b + log(b/a), which is nonnegative, zero only at equality, and
>> >otherwise penalizes positive a for being different than some target
>> >positive b.  To me, it seems like guessing b using 1.9b is better than
>> >guessing it as 0.1b, and so on.  This corresponds to exponential KL
>> >divergence, which has a clear statistical meaning, but only applies to
>> >positive numbers.
> Do you have a reference or derivation for this? I'm happy to admit that
> I'm no Knuth or Kahan, but I've read a bit of numerical computing[1] and
> I've never seen anyone add a log term. I'm not even sure why you would
> do so.

I found this... There is a short paragraph there about measuring error in 
natural log units.

      http://people.duke.edu/~rnau/411log.htm
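
To get a feel for the log term, here is a rough sketch of the formula 
quoted above, (a-b)/b + log(b/a), next to a plain log-ratio error.  The 
function names are invented, and log(a/b) is only my reading of what 
"error in natural log units" means, so treat both as assumptions.

```python
import math


def log_penalty(a, b):
    """The quoted formula (a-b)/b + log(b/a), for positive a and b only."""
    if a <= 0 or b <= 0:
        raise ValueError("only defined for positive numbers")
    return (a - b) / b + math.log(b / a)


def log_error(a, b):
    """Error in natural log units, assumed here to mean log(a/b)."""
    if a <= 0 or b <= 0:
        raise ValueError("only defined for positive numbers")
    return math.log(a / b)


target = 1.0
# Overshooting to 1.9*target is penalized less than undershooting to
# 0.1*target, even though both are off by 0.9 in absolute terms.
print(log_penalty(1.9 * target, target))
print(log_penalty(0.1 * target, target))

# For small errors the log ratio is close to the relative error.
print(log_error(1.05, 1.0))
```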

This next one has a lot of good info I think is relevant to this topic.

      http://www2.phy.ilstu.edu/~wenning/slh/

It does have one symmetric formula... relative difference, which uses the 
average of the two values.  It's probably what most people want for 
comparing two generated data points where neither one can be picked as a 
reference.
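
A minimal sketch of that idea, assuming "average" means the mean of the 
two magnitudes (I haven't checked that this matches the page's formula 
exactly, and the function name is made up):

```python
def relative_difference(a, b):
    """Symmetric relative difference: abs(a-b) divided by the mean magnitude.

    Assumption: the denominator is (abs(a) + abs(b)) / 2.
    Returns 0.0 when both values are zero to avoid dividing by zero.
    """
    mean = (abs(a) + abs(b)) / 2
    if mean == 0:
        return 0.0
    return abs(a - b) / mean


# Neither value is treated as the reference, so order doesn't matter.
print(relative_difference(10500, 10000))
print(relative_difference(10000, 10500))
```

Here the denominator sits halfway between the two values, so the result 
lands between what the min and max versions would give.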

> [1] I know just enough to know how much I don't know.

I know even less.

I'm hoping to know what I don't know soon, but I think I have a lot to 
learn still.   ;-)

Cheers,
    Ron