On 27 January 2015 at 13:08, Steven D'Aprano <steve@pearwood.info> wrote:
Symmetry and asymmetry of "close to" is a side-effect of the way you
calculate the fuzzy comparison. In real life, "close to" is always
symmetric because distance is the same whether you measure from A to B
or from B to A. The distance between two numbers is their difference,
which is another way of saying the error between them:

delta = abs(x - y)

(delta being the traditional name for this quantity in mathematics), and
obviously delta doesn't depend on the order of x and y.

Asymmetry is bad, because it is rather surprising and counter-intuitive
that "x is close to y", but "y is not close to x". It may also be bad in
a practical sense, because people will forget which order they need to
give x and y and will give them in the wrong order. I started off with
an approx_equal function in test_statistics that was asymmetric, and I
could never remember which way the arguments went.

Time permitting, over the next day or so I'll draw up some diagrams to
show how each of these tactics changes what counts as close or not close.

If you consider the comparison to be:

   abs(x-y) <= rel_tol * ref

where "ref" is your "reference" value, then all of these are questions about what "ref" is. Possibilities include:

 * ref = abs(x)
     (asymmetric version, useful for comparing against a known figure)
 * ref = max(abs(x),abs(y))
     (symmetric version)
 * ref = abs(x)+abs(y) or (abs(x)+abs(y))/2
     (alternate symmetric version)
 * ref = zero_tol / rel_tol
     (for comparisons against zero)
 * ref = abs_tol / rel_tol
     (for completeness)
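
To make the options concrete, here is a minimal sketch in which the reference is picked by name. The function name close_with_ref and the mode strings are made up for illustration; they aren't part of any proposal.

```python
def close_with_ref(x, y, rel_tol=1e-8, mode="max", abs_tol=0.0):
    """Return abs(x - y) <= rel_tol * ref for one of the references listed above.

    The mode names are hypothetical labels, not an existing API.
    """
    if mode == "first":      # asymmetric: x is the known figure
        ref = abs(x)
    elif mode == "max":      # symmetric
        ref = max(abs(x), abs(y))
    elif mode == "sum":      # alternate symmetric
        ref = abs(x) + abs(y)
    elif mode == "mean":     # alternate symmetric, half of "sum"
        ref = (abs(x) + abs(y)) / 2
    elif mode == "abs":      # ref = abs_tol / rel_tol (rel_tol must be non-zero)
        ref = abs_tol / rel_tol
    else:
        raise ValueError("unknown mode: %r" % (mode,))
    return abs(x - y) <= rel_tol * ref


# The asymmetric version depends on argument order; the symmetric one doesn't.
print(close_with_ref(111, 100, rel_tol=0.1, mode="first"))  # ref = 111
print(close_with_ref(100, 111, rel_tol=0.1, mode="first"))  # ref = 100
print(close_with_ref(100, 111, rel_tol=0.1, mode="max"))    # ref = 111 either way
```

With a 10% tolerance, the first call passes and the second fails, even though the same two numbers are being compared.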

If you're saying:

 >>> z = 1.0 - sum([0.1]*10)
 >>> z == 0
 False
 >>> is_close(0.0, z)
 True

your "reference" value is probably really "1.0" or "0.1", since those are the values you're working with, but neither of those values is derivable from the arguments provided to is_close().
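
A quick sketch of why that matters, assuming rel_tol=1e-8 purely for illustration: any reference computed only from the two arguments is on the order of z itself, so the comparison against zero fails, while the reference the calculation was really working with (1.0) passes.

```python
z = 1.0 - sum([0.1] * 10)   # about 1.1e-16 with IEEE 754 doubles
rel_tol = 1e-8              # assumed tolerance, chosen for illustration

# References derivable from the arguments (0.0 and z) all scale with abs(z):
ref_max = max(abs(0.0), abs(z))
ref_sum = abs(0.0) + abs(z)
close_by_max = abs(0.0 - z) <= rel_tol * ref_max
close_by_sum = abs(0.0 - z) <= rel_tol * ref_sum

# The magnitude the calculation was actually working at:
close_by_one = abs(0.0 - z) <= rel_tol * 1.0

print(close_by_max, close_by_sum, close_by_one)  # prints: False False True
```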

Assuming x,y are non-negative and is_close(x,y,rel_tol=r):

 ref = x:
   -rx <= y-x <= rx

 ref = max(x,y):
   -rx <= y-x <= ry

 ref = (x+y)/2:
   -r*(x+y)/2 <= y-x <= r*(x+y)/2

If you hold r and x constant, then the amounts y can be (below, above) x for the cases above are:

 rx, rx
 rx, rx/(1-r)
 rx/(1+r/2), rx/(1-r/2)

Since r>0, we have 1-r != 1 and 1+r/2 != 1-r/2, so these each give slightly different ranges for a valid y. The differences are pretty trivial though; e.g. r=1e-8 and x=10 gives:

 rx         = 1e-7
 rx/(1-r)   = 1.00000001e-07
 rx/(1-r/2) = 1.000000005e-07
 rx/(1+r/2) = 0.999999995e-07

If you're looking at 10% margins for a nominally 100 Ohm resistor (r=0.1, x=100), that'd translate to deltas of:

 rx         = 10.0
 rx/(1-r)   = 11.11
 rx/(1-r/2) = 10.526
 rx/(1+r/2) =  9.524
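
If you want to reproduce these figures, a small helper along the following lines should do it. The name margins is just something I've made up here, and it assumes x > 0 and 0 < r < 1 as in the derivation above.

```python
def margins(x, r):
    """Return {reference: (max amount below x, max amount above x)} for a close y.

    Assumes x > 0 and 0 < r < 1, matching the derivation above.
    """
    return {
        "ref = x":         (r * x, r * x),
        "ref = max(x, y)": (r * x, r * x / (1 - r)),
        "ref = (x + y)/2": (r * x / (1 + r / 2), r * x / (1 - r / 2)),
    }


# The 10% resistor case: r=0.1, x=100
for name, (below, above) in margins(100, 0.1).items():
    print("%-16s below: %7.3f  above: %7.3f" % (name, below, above))
```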

Having an implementation like:

 def is_close(a, b=None, tol=1e-8, ref=None):
   assert (a != 0 and b != 0) or ref is not None
   if b is None:
     assert ref is not None
     b = ref
   if ref is None:
     ref = abs(a)+abs(b)
   return abs(a-b) <= tol*ref

might give you the best of all worlds -- it would let you say things like:

  >>> is_close(1.0, sum([0.1]*10))
  True

  >>> is_close(11, ref=10, tol=0.1)
  True

  >>> n = 26e10
  >>> a = n - sum([n/6]*6)
  >>> b = n - sum([n/7]*7)
  >>> a, b
  (-3.0517578125e-05, 0.0)
  >>> is_close(a, b, ref=n)
  True
  >>> is_close(a, b, ref=1)
  False
  >>> is_close(a, b)
  AssertionError

and get reasonable looking results, I think? (If you want to use an absolute tolerance, you just specify ref=1, tol=abs_tol).
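
One possible variation, sketched on the assumption that bad calls should still fail when Python runs with -O (which strips assert statements): raise ValueError instead of asserting. The logic is otherwise meant to be the same as the is_close above.

```python
def is_close(a, b=None, tol=1e-8, ref=None):
    """Same comparison as the assert-based is_close, but raising ValueError."""
    if b is None:
        if ref is None:
            raise ValueError("give either b or ref")
        b = ref  # compare a against the reference itself
    if ref is None:
        if a == 0 or b == 0:
            raise ValueError("ref is required when comparing against zero")
        ref = abs(a) + abs(b)
    return abs(a - b) <= tol * ref


# Absolute tolerance: ref=1, tol=abs_tol
print(is_close(100.0, 100.4, ref=1, tol=0.5))
print(is_close(100.0, 100.6, ref=1, tol=0.5))
```

The first call is within 0.5 of 100.0 and passes; the second is not and fails.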

An alternative thought: rather than a single "is_close" function, maybe it would make sense for is_close to always be relative, and just provide a separate function for absolute comparisons, ie:

 def is_close(a, b, tol=1e-8):
    assert a != 0 and b != 0
    # or assert (a==0) == (b==0)
    # abs() on each term keeps the reference positive for negative inputs
    return abs(a-b) <= tol*(abs(a)+abs(b))

 def is_close_abs(a, b, tol=1e-8):
    return abs(a-b) <= tol

 def is_near_zero(a, tol=1e-8):
    return abs(a) <= tol

 
Then you'd use is_close() when you wanted something symmetric and easy, and were more interested in rough accuracy than absolute precision, and if you wanted to do a 10% resistor check you'd either say:

   is_close_abs(r, 100, tol=10)

or

   is_near_zero(r-100, tol=10)

If you had a sequence of numbers and wanted to do both relative comparisons (first n significant digits match) and absolute comparisons you'd just have to say:

  for a in nums:
     assert is_close(a, b) or is_close_abs(a, b)

which doesn't seem that onerous.
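
Here's a self-contained sketch of that combined check. It assumes is_close simply returns False, rather than asserting, when one argument is zero; otherwise the "or" would never get as far as is_close_abs. The (computed, expected) pairs are made up for illustration.

```python
def is_close(a, b, tol=1e-8):
    # Relative check; abs() on each term so negative values work too.
    # Assumption: no assert on zero, so a zero argument just fails the check.
    return abs(a - b) <= tol * (abs(a) + abs(b))


def is_close_abs(a, b, tol=1e-8):
    return abs(a - b) <= tol


# Hypothetical (computed, expected) pairs, made up for illustration.
pairs = [
    (sum([0.1] * 10), 1.0),        # passes the relative check
    (1.0 - sum([0.1] * 10), 0.0),  # expected value is zero: only the absolute check passes
    (26e10 + 1e-3, 26e10),         # off by about 1e-3: fails absolutely, passes relatively
]

for a, b in pairs:
    assert is_close(a, b) or is_close_abs(a, b)
```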

Cheers,
aj

--
Anthony Towns <aj@erisian.com.au>