[Python-ideas] Floating point "closeness" Proposal Outline
Ron Adam
ron3200 at gmail.com
Wed Jan 21 05:43:58 CET 2015
On 01/20/2015 04:40 AM, Steven D'Aprano wrote:
> On Mon, Jan 19, 2015 at 08:10:35PM -0800, Neil Girdhar wrote:
>
>> >If you decide to invent a relative error function,
> The error functions we have been talking about are hardly "invented".
> They're mathematically simple and obvious, and can be found in just
> about any undergraduate book on numerical computing:
>
> - The absolute error between two quantities a and b is the
> absolute difference between them, abs(a-b).
>
> - The relative error is the difference relative to some
> denominator d, abs(a-b)/abs(d), typically with d=a or d=b.
>
> If you happen to know that b is the correct value, then it is common to
> choose b as the denominator. If you have no a priori reason to think
> either a or b is correct, or if you prefer a symmetrical function, a
> common choice is to use d = min(abs(a), abs(b)).
The more we talk about this, the more I'm beginning to dislike the
symmetric version.
We are trading an explicit (first, second) relationship with an implicit
(smaller, larger) relationship. For Python's general use, I don't like
that. Sorry. :/
Example...
Testing whether two resistors are within 5% of 10k ohms.
is_close(10500, 10000, 0.05) 9500.0<--->10500.0 True
is_close(10000, 9500, 0.05) 9025.0<--->9975.0 False
The argument order doesn't matter, but which value is smaller does,
because the tolerance band is centered on the smaller value.
Using the larger value as the divider can result in false positives on the
lower end instead of false negatives on the higher end.
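Here is a minimal sketch of the two symmetric variants. It assumes the tolerance is scaled by the smaller or the larger magnitude; the names is_close_min and is_close_max are made up for illustration and are not a proposed API.

```python
def is_close_min(a, b, rel_tol):
    """Symmetric test: tolerance is scaled by the smaller magnitude."""
    return abs(a - b) <= rel_tol * min(abs(a), abs(b))


def is_close_max(a, b, rel_tol):
    """Symmetric test: tolerance is scaled by the larger magnitude."""
    return abs(a - b) <= rel_tol * max(abs(a), abs(b))


# Each pair is 500 ohms apart, but the two variants disagree on the second.
print(is_close_min(10500, 10000, 0.05))  # band is 5% of 10000
print(is_close_min(10000, 9500, 0.05))   # band is 5% of 9500
print(is_close_max(10500, 10000, 0.05))  # band is 5% of 10500
print(is_close_max(10000, 9500, 0.05))   # band is 5% of 10000
```

Swapping the arguments changes nothing in either variant; only the relative size of the two values decides which one sets the band.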
So I strongly think any such function (in Python) should have meaningful
named arguments. It's going to save a lot of explaining down the road. :-)
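Something like this is what I have in mind. It's only a sketch, and the names is_close_to, value, and target are placeholders I picked for the example, not a settled signature.

```python
def is_close_to(value, target, rel_tol):
    """Asymmetric test: the tolerance is always scaled by the named target."""
    return abs(value - target) <= rel_tol * abs(target)


# The 10k target sets the band (9500 to 10500) no matter which side
# the measured value falls on.
print(is_close_to(value=10500, target=10000, rel_tol=0.05))
print(is_close_to(value=9500, target=10000, rel_tol=0.05))
print(is_close_to(value=10501, target=10000, rel_tol=0.05))
```

With the reference spelled out at the call site, nobody has to guess which argument the percentage is taken from.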
> See, for example:
>
> http://mathworld.wolfram.com/AbsoluteError.html
> http://mathworld.wolfram.com/RelativeError.html
>
>
>> >my suggestion is:
>> >(a-b)/b + log(b/a), which is nonnegative, zero only at equality, and
>> >otherwise penalizes positive a for being different than some target
>> >positive b. To me, it seems like guessing b using 1.9b is better than
>> >guessing it as 0.1b, and so on. This corresponds to exponential KL
>> >divergence, which has a clear statistical meaning, but only applies to
>> >positive numbers.
> Do you have a reference or derivation for this? I'm happy to admit that
> I'm no Knuth or Kahan, but I've read a bit of numerical computing[1] and
> I've never seen anyone add a log term. I'm not even sure why you would
> do so.
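For anyone who wants to poke at the quoted formula, here is a rough sketch. The function name log_penalty is made up, and it assumes both a and b are positive, as the quoted text requires.

```python
import math


def log_penalty(a, b):
    """Return (a - b)/b + log(b/a) for a positive guess a and positive target b."""
    if a <= 0 or b <= 0:
        raise ValueError("both a and b must be positive")
    return (a - b) / b + math.log(b / a)


b = 1.0
print(log_penalty(b, b))        # equality gives zero
print(log_penalty(1.9 * b, b))  # overshooting by 0.9*b
print(log_penalty(0.1 * b, b))  # undershooting by 0.9*b is penalized more
```

Writing x = a/b, the expression is x - 1 - log(x), which is never negative for x > 0 and is zero only at x = 1.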
I found this... There is a short paragraph there about measuring error in
natural log units.
http://people.duke.edu/~rnau/411log.htm
This next one has a lot of good info I think is relevant to this topic.
http://www2.phy.ilstu.edu/~wenning/slh/
It does have one symmetric formula... relative difference, that uses the
average of the two values. It's probably what most people want for
comparing two generated data points where neither one can be picked as a
reference.
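A quick sketch of that idea, as I understand it. The function name is mine, and taking the mean of the absolute values (rather than of the signed values) is my assumption, so that mixed signs don't cancel in the divider.

```python
def relative_difference(a, b):
    """abs(a - b) divided by the mean of the two magnitudes."""
    mean_magnitude = (abs(a) + abs(b)) / 2
    if mean_magnitude == 0:
        return 0.0  # both values are zero, so there is no difference
    return abs(a - b) / mean_magnitude


# Neither value is treated as the reference, so the order is irrelevant.
print(relative_difference(10500, 10000))
print(relative_difference(10000, 10500))
```

The divider sits halfway between the two values, so the result falls between what the smaller-value and larger-value dividers would give.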
> [1] I know just enough to know how much I don't know.
I know even less.
I'm hoping to know what I don't know soon, but I think I have a lot to
learn still. ;-)
Cheers,
Ron