
On 27 January 2015 at 13:08, Steven D'Aprano <steve@pearwood.info> wrote:
Symmetry and asymmetry of "close to" is a side-effect of the way you calculate the fuzzy comparison. In real life, "close to" is always symmetric because distance is the same whether you measure from A to B or from B to A. The distance between two numbers is their difference, which is another way of saying the error between them:
delta = abs(x - y)
(delta being the traditional name for this quantity in mathematics), and obviously delta doesn't depend on the order of x and y.
Asymmetry is bad, because it is rather surprising and counter-intuitive
that "x is close to y", but "y is not close to x". It may also be bad in a practical sense, because people will forget which order they need to give x and y and will give them in the wrong order. I started off with an approx_equal function in test_statistics that was asymmetric, and I could never remember which way the arguments went.
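A minimal sketch of that asymmetry, assuming two hypothetical helpers (close_asym and close_sym are illustrative names, not the approx_equal function mentioned above): one scales the tolerance by its first argument only, the other by the larger magnitude.

```python
def close_asym(x, y, rel_tol=0.1):
    """Hypothetical asymmetric check: tolerance is scaled by abs(x) only."""
    return abs(x - y) <= rel_tol * abs(x)


def close_sym(x, y, rel_tol=0.1):
    """Hypothetical symmetric check: tolerance is scaled by the larger magnitude."""
    return abs(x - y) <= rel_tol * max(abs(x), abs(y))


# Same pair of numbers, arguments swapped.
forward = close_asym(100.0, 111.0)    # tolerance is 10% of 100
backward = close_asym(111.0, 100.0)   # tolerance is 10% of 111
print("asymmetric:", forward, backward)

# The symmetric version gives the same answer either way round.
print("symmetric: ", close_sym(100.0, 111.0), close_sym(111.0, 100.0))
```

With the asymmetric helper the answer depends on which value is passed first, which is exactly the ordering trap described above; the symmetric helper cannot be called "the wrong way round".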
Time permitting, over the next day or so I'll draw up some diagrams to
show how each of these tactics changes what counts as close or not close.
If you consider the comparison to be:

    abs(x-y) <= rel_tol * ref

where "ref" is your "reference" value, then all of these are questions about what "ref" is. Possibilities include:

* ref = abs(x) (asymmetric version, useful for comparing against a known figure)
* ref = max(abs(x),abs(y)) (symmetric version)
* ref = abs(x)+abs(y) or (abs(x)+abs(y))/2 (alternate symmetric version)
* ref = zero_tol / rel_tol (for comparisons against zero)
* ref = abs_tol/rel_tol (for completeness)

If you're saying:
    >>> z = 1.0 - sum([0.1]*10)
    >>> z == 0
    False
    >>> is_close(0.0, z)
    True
your "reference" value is probably really "1.0" or "0.1" since those are the values you're working with, but neither of those values is derivable from the arguments provided to is_close().

Assuming x,y are non-negative and is_close(x,y,rel_tol=r):

    ref = x:          -rx <= y-x <= rx
    ref = max(x,y):   -rx <= y-x <= ry
    ref = (x+y)/2:    -r*(x+y)/2 <= y-x <= r*(x+y)/2

If you set r and x as constants, then the amounts y can be (below, above) x for the cases above are:

    rx, rx
    rx, rx/(1-r)
    rx/(1+r/2), rx/(1-r/2)

Since r>0, 1-r != 1, and 1+r/2 != 1-r/2, so these each give slightly different ranges for a valid y. They're pretty trivial differences though; eg r=1e-8 and x=10 gives:

    rx         = 1e-7
    rx/(1-r)   = 1.00000001e-07
    rx/(1-r/2) = 1.000000005e-07
    rx/(1+r/2) = 0.999999995e-07

If you're looking at 10% margins for a nominally 100 Ohm resistor (r=0.1, x=100), that'd translate to deltas of:

    rx         = 10.0
    rx/(1-r)   = 11.11
    rx/(1-r/2) = 10.526
    rx/(1+r/2) = 9.524

Having an implementation like:

    def is_close(a, b=None, tol=1e-8, ref=None):
        # without an explicit ref, neither value may be zero
        assert (a != 0 and b != 0) or ref is not None
        if b is None:
            # one-value form: compare a against the reference itself
            assert ref is not None
            b = ref
        if ref is None:
            ref = abs(a)+abs(b)
        return abs(a-b) <= tol*ref

might give you the best of all worlds -- it would let you say things like:
    >>> is_close(1.0, sum([0.1]*10))
    True
    >>> is_close(11, ref=10, tol=0.1)
    True
    >>> n = 26e10
    >>> a = n - sum([n/6]*6)
    >>> b = n - sum([n/7]*7)
    >>> a, b
    (-3.0517578125e-05, 0.0)
    >>> is_close(a, b, ref=n)
    True
    >>> is_close(a, b, ref=1)
    False
    >>> is_close(a, b)
    AssertionError
and get reasonable looking results, I think? (If you want to use an absolute tolerance, you just specify ref=1, tol=abs_tol).

An alternative thought: rather than a single "is_close" function, maybe it would make sense for is_close to always be relative, and just provide a separate function for absolute comparisons, ie:

    def is_close(a, b, tol=1e-8):
        assert a != 0 and b != 0   # or assert (a==0) == (b==0)
        # abs() on each term so that negative inputs work too
        return abs(a-b) <= tol*(abs(a)+abs(b))

    def is_close_abs(a, b, tol=1e-8):
        return abs(a-b) <= tol

    def is_near_zero(a, tol=1e-8):
        return abs(a) <= tol

Then you'd use is_close() when you wanted something symmetric and easy, and were more interested in rough accuracy than absolute precision, and if you wanted to do a 10% resistor check you'd either say:

    is_close_abs(r, 100, tol=10)

or

    is_near_zero(r-100, tol=10)

If you had a sequence of numbers and wanted to do both relative comparisons (first n significant digits match) and absolute comparisons you'd just have to say:

    for a in nums:
        assert is_close(a, b) or is_close_abs(a, b)

which doesn't seem that onerous.

Cheers,
aj

-- 
Anthony Towns <aj@erisian.com.au>