On 01/15/2015 01:29 AM, Steven D'Aprano wrote:
On Wed, Jan 14, 2015 at 08:13:42PM -0600, Ron Adam wrote:
The question of which to use as the denominator is more subtle. Like you, I used to think that you should choose ahead of time which value was expected and which was actual, and divide by the actual. Or should that be the expected? I could never decide which I wanted: error relative to the expected, or error relative to the actual? And then I could never remember which order the two arguments went in.
Finally I read Bruce Dawson (I've already linked to his blog three or four times) and realised that he is correct and I was wrong. Error calculations should be symmetrical, so that
error(a, b) == error(b, a)
regardless of whether you have absolute or relative error. Furthermore, for safety you normally want the larger estimate of error, not the smaller: given the choice between
(abs(a - b))/abs(a)
versus
(abs(a - b))/abs(b)
you want the *larger* error estimate, which means the *smaller* denominator. That's the conservative way of doing it.
A concrete example: given a=5 and b=7, we have:
absolute error = 2
relative error (calculated relative to a) = 0.4
relative error (calculated relative to b) = 0.286
That is, b is off by 40% relative to a; or a is off by 28.6% relative to b. Or another way to put it, given that a is the "true" value, b is 40% too big; or if you prefer, 28.6% of b is in error.
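The symmetric, conservative rule can be sketched in a few lines; this is my own illustration of the idea (the smaller denominator gives the larger, safer error estimate), not code from the statistics test suite:

```python
def rel_err(a, b):
    """Symmetric relative error: rel_err(a, b) == rel_err(b, a).
    Dividing by the smaller magnitude yields the larger of the two
    possible estimates, which is the conservative choice."""
    if a == b:
        return 0.0  # also avoids 0/0 when both values are zero
    return abs(a - b) / min(abs(a), abs(b))

print(rel_err(5, 7))  # 0.4, the larger of 0.4 and 2/7
print(rel_err(7, 5))  # 0.4 again: symmetric
```

Note that if exactly one of the values is zero the division fails, which corresponds to the caveat that relative error against zero doesn't exist.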
Whew! Percentages are hard! *wink*
Ewww the P word. :)
The conservative, "safe" way to handle this is to just treat the error function as symmetrical and always report the larger of the two relative errors (excluding the case where the denominator is 0, in which case the relative error is either 100% or it doesn't exist). Worst case, you may reject some values which you should accept, but you will never accept any values that you should reject.

What if we are not concerned with where the two points sit relative to zero? Or if the numbers straddle zero?
Consider two points that are a constant distance apart, but moving relative to zero. Their closeness doesn't change, but their relative error with respect to each other (and to zero) does. There is an implicit assumption that the number system used and the origin the numbers are measured from are chosen to relate to each other in some expected way. Whenever you supply all the numbers yourself, as in a test, it's not a problem: you just pick good numbers.
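Ron's point is easy to demonstrate: keep the distance fixed at 2.0 and slide both points away from zero, and the symmetric relative error (sketched here with the smaller magnitude as the denominator) keeps shrinking even though the points are no "closer":

```python
# Two points always exactly 2.0 apart, moving away from zero.
for a in (1.0, 10.0, 100.0, 1000.0):
    b = a + 2.0
    rel = abs(a - b) / min(abs(a), abs(b))
    print(f"a={a:6.1f}  b={b:6.1f}  abs diff=2.0  rel err={rel}")
# rel err falls from 2.0 to 0.002 while the distance never changes
```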
Note that you would never compare to an expected value of zero.
You *cannot* compare to an expected value of zero, but you certainly can be in a situation where you would like to: math.sin(math.pi) should return 0.0, but doesn't, it returns 1.2246063538223773e-16 instead. What is the relative error of the sin function at x = math.pi?
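A quick demonstration of that case (the 1e-12 tolerance is an arbitrary choice for illustration):

```python
import math

x = math.sin(math.pi)  # mathematically zero, but math.pi is only an
print(x)               # approximation of pi, so x is about 1.2e-16

# Relative to an expected value of 0.0 the error is abs(x - 0.0) / 0.0,
# which is undefined, so no relative tolerance can ever accept x.
# An absolute tolerance handles it directly:
print(abs(x - 0.0) <= 1e-12)  # True
```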
relerr(a - b, expected_feet) < tolerance   # relative feet from b
relerr(a - 0, expected_feet) < tolerance   # relative feet from zero
relerr(a - b, ulp)                         # percentage of ulps
I don't understand what you think these three examples are showing.
A percentage of an expected distance: the error of two points compared to a specific distance.

>>> relerr(5 - -5, 10)
0.0

I think unless you use decimal, the ulp example will either be zero or some large multiple of an ulp.
Take a look at the statistics test suite.
I definitely will. :-)
I'll be the first to admit that the error tolerances are plucked from thin air, based on what I think are "close enough", but they show how such a function might work:
* you provide two values, and at least one of an absolute error tolerance and a relative error tolerance;
* if the error is less than the tolerance(s) you provided, the test passes, otherwise it fails;
* NANs and INFs are handled appropriately.
is_close(218.345, 220, 1, .05)   # ohms
is_close(a, b, ULP, 2)           # ULPs
is_close(a, b, AU, .001)         # astronomical units
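A minimal sketch of such a function, assuming the behaviour in the bullet points above; the signature, the argument order, and the choice of the second value as the relative-error denominator are my guesses for illustration, not the statistics test suite's actual code:

```python
import math

def is_close(actual, expected, abs_tol, rel_tol):
    """Pass if the difference is within abs_tol OR within rel_tol
    of the expected value; NANs never compare close, and INFs only
    compare close to themselves."""
    if math.isnan(actual) or math.isnan(expected):
        return False
    if math.isinf(actual) or math.isinf(expected):
        return actual == expected
    diff = abs(actual - expected)
    if diff <= abs_tol:
        return True
    return expected != 0 and diff / abs(expected) <= rel_tol

print(is_close(218.345, 220, 1, .05))  # True: within 5% of 220
```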
I don't see any way to generalise those with just a function.
Generalise in what way?
I meant a function that would work in many places without giving some sort of size and tolerance hints. Given two floating point numbers and nothing else, I don't think you can tell whether they represent something that is close without assuming some sort of context. At best, you have to assume the distance from zero and the numbers used are chosen to give a meaningful return value. While that can sometimes work, I don't think you can depend on it.
By using objects we can do a bit more. I seem to recall coming across measurement objects some place. They keep a bit more context with them.
A full system of arithmetic is a *much* bigger problem than just calculating error estimates correctly, and should be a third-party library before even considering it for the std lib.
Yes, I agree. There are a few of them out there already.

Cheers,
Ron