[Python-ideas] PEP 485: A Function for testing approximate equality

Steven D'Aprano steve at pearwood.info
Mon Jan 26 06:54:38 CET 2015


On Mon, Jan 26, 2015 at 01:17:02AM +0000, Nathaniel Smith wrote:
> On Mon, Jan 26, 2015 at 12:02 AM, Chris Barker <chris.barker at noaa.gov> wrote:
[...]
> > Though I guess I'd rather a zero_tol that defaulted to non-zero than an
> > abs_tol that did. So we might be able to satisfy your observation that a lot
> > of use cases call for testing against zero.
> 
> Yes, that's the idea -- defaulting rel_tol and zero_tol to non-zero
> values, and abs_tol to zero, gives you a set of defaults that will
> just work for people who want to write useful tests without having to
> constantly be distracted by floating point arcana.

I really don't think that setting one or two error tolerances is 
"floating point arcana". I don't think that having to explicitly decide 
on what counts as "close" (as either an absolute difference or a 
relative difference) is especially onerous: surely anyone writing code 
will be able to cope with one or two decisions:

- close enough means they differ by no more than X

- close enough means they differ by no more than X%, expressed 
  as a fraction

This isn't ULPs :-)
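
Concretely, each of those two decisions is a one-line predicate. A 
minimal sketch (the names here are mine, not part of any proposal):

    def close_abs(actual, expected, tol):
        # "close enough" as an absolute difference: no more than tol apart
        return abs(actual - expected) <= tol

    def close_rel(actual, expected, rel):
        # "close enough" as a relative difference: no more than rel
        # (a fraction, e.g. 0.01 for 1%) of the expected value
        return abs(actual - expected) <= rel * abs(expected)

    close_abs(1.001, 1.0, tol=0.01)   # True: they differ by 0.001
    close_rel(98.5, 100.0, rel=0.02)  # True: 1.5 is within 2% of 100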

I'm almost inclined not to set any defaults, except perhaps zero for 
both (in which case "close to" cleanly degrades to "exactly equal", 
only slower), and to force the user to explicitly choose a value.

Arguments in favour of setting some defaults:

- People who want a "zero-thought" solution will get one, even 
  if it does the wrong thing for their specific application, but 
  at least they didn't have to think about it.

- The defaults might occasionally be appropriate.


Arguments against:

- There is no default error tolerance we can pick, whether relative
  or absolute, which will suit everyone all the time. Unless the 
  defaults are appropriate (say) 50% of the time or more, they will
  just be an attractive nuisance (see zero-thought above).


In the statistics tests, I had the opportunity to set my own global 
defaults, but I don't think I ever actually used them. Maybe I could 
have picked better defaults? I don't know.

I did use defaults per test suite, so that's an argument in favour of 
having is_close (approx_equal) not use defaults, but assertIsClose 
(assertApproxEqual) use per-instance defaults.

[Context: in the tests, I had an assertApproxEqual method that relied on 
approx_equal function. The function had defaults, but I never used them. 
The method defaulted to reading defaults from self.rel and self.tol, and 
I did use them.]
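
In case it helps, the arrangement looked roughly like this (a 
from-memory sketch, not the actual test_statistics code; the 
either-test-passes rule in the comparison is just one of the possible 
ways to combine the two tolerances):

    import unittest

    class NumericTestCase(unittest.TestCase):
        # Per-instance defaults, chosen once per test suite:
        rel = 1e-5
        tol = 0.0

        def assertApproxEqual(self, actual, expected, rel=None, tol=None):
            if rel is None:
                rel = self.rel
            if tol is None:
                tol = self.tol
            # Accept if EITHER the absolute or the relative test passes.
            if abs(actual - expected) > max(tol, rel * abs(expected)):
                self.fail('%r != %r within rel=%r, tol=%r'
                          % (actual, expected, rel, tol))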

 

> This does require that zero_tol is only applied for expected == 0.0,
> *not* for actual == 0.0, though. If you expected 1e-10 and got 0.0
> then this *might* be okay in your particular situation but it really
> requires the user to think things through; a generic tool should
> definitely flag this by default.

I really think that having three tolerances, one of which is nearly 
always ignored, is poor API design. The user usually knows when they are 
comparing against an expected value of zero and can set an absolute 
error tolerance.

How about this?

- Absolute tolerance defaults to zero (which is equivalent to 
  exact equality).

- Relative tolerance defaults to something (possibly zero) to be
  determined after sufficient bike-shedding.

- An argument for setting both values to zero by default is that
  it will make it easy to choose one of "absolute or relative". You
  just supply a value for the one that you want, and let the other 
  take the default of zero.

- At the moment, I'm punting on the behaviour when both abs and rel
  tolerances are provided. That can be bike-shedded later.


Setting both defaults to zero means that the zero-thought version:

    if is_close(x, y): ...

will silently degrade to x == y, which is no worse than what people 
do now (except slower). We can raise a warning in that case.
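
As a sketch, using the same tol/rel spelling as the examples below 
(and settling the both-tolerances-given question, for illustration 
only, by accepting either test):

    import warnings

    def is_close(actual, expected, tol=0.0, rel=0.0):
        # Both tolerances default to zero, so the zero-thought call
        # is_close(x, y) tests exact equality (only slower); warn then.
        if tol == 0.0 and rel == 0.0:
            warnings.warn('is_close() with no tolerances tests exact '
                          'equality')
        # Accept if EITHER test passes; how best to combine the two
        # tolerances is left for the bike-shedding mentioned above.
        return abs(actual - expected) <= max(tol, rel * abs(expected))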

The only tricky situation might be if you *may* be comparing against 
zero but don't know in advance. There are some solutions to that:

- The suggested "zero_tol" parameter, which I dislike. I think it is 
  an ugly and confusing API.

- Some versions of is_close may not require any special treatment 
  for zero, depending on how it treats the situation where both abs 
  and rel tolerances are given. Further discussion needed.

- Worst case, people write this:

    if ((expected == 0 and is_close(actual, expected, tol=1e-8))
            or is_close(actual, expected, rel=1e-5)):
        ...

but I don't think that will come up in practice very often. In the 
test_statistics module, I had tests that looked like this:

    for x in [bunch of non-zero values]:
        y = do_some_calculation(x)
        self.assertApproxEqual(x, y, rel=0.01)

    y = do_some_calculation(0)
    self.assertApproxEqual(0, y, tol=0.000001)

which isn't hard to do, so I don't think this is a real issue in 
practice. I think the case of "my expected value might be zero, but I'm 
not sure in advance" is rare and unusual enough that we don't need to 
worry about it.


[...]
> My claim wasn't that is_close_to(x, 0.0) provides a mathematically
> ill-defined result. I agree that that's a reasonable definition of
> "relatively close to" (though one could make an argument that zero is
> not relatively close to itself -- after all, abs(actual -
> expected)/expected is ill-defined).

Don't write it that way. Write it this way:

abs(actual - expected) <= relative_tolerance * abs(expected)

(The abs() on the right keeps the test sensible when expected is 
negative.) Now if expected is zero, the condition is true if and only 
if actual == expected.
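
Spelled out with concrete numbers:

    rel = 1e-5
    expected = 0.0
    # The right-hand side collapses to 0.0, so only an exact zero passes:
    abs(0.0 - expected) <= rel * abs(expected)     # True
    abs(1e-300 - expected) <= rel * abs(expected)  # False, however tiny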

It would be bizarre for is_close(a, a) to return False (or worse, 
raise an exception!) for any finite number. NANs, of course, are 
allowed to be bizarre. Zero is not :-)


> Instead, my point was that if the
> user is asking "is this close to 0?" instead of "is this exactly equal
> to zero?" then they probably are expecting that there exist some
> inputs for which those two questions give different answers. Saying
> "well TECHNICALLY this is a valid definition of 'close to'" is
> certainly true but somewhat unkind.

I agree, but I think this is a symptom of essential complexity in the 
problem domain. Ultimately, "is close" is ill-defined; *somebody* has 
to decide what it will mean, and that decision won't satisfy everyone 
all the time. We can reduce the complexity in one place:

    * provide sensible default values that work for expected != 0

but only by increasing the complexity elsewhere:

   * when expected == 0, the intuition that is_close differs from 
     exact equality breaks down

We can get rid of that complexity, but only by adding it back somewhere 
else:

   * is_close(x, y, zero_tol=0.1) and is_close(x, y, zero_tol=0.00001)
     give the same result for all the x,y I tested!

that is, zero_tol is nearly always ignored. Since people will often need 
to think about what they want "is close" to mean no matter what we do, I 
would prefer not to add the complexity of a third tolerance value. If 
that means that "zero thought" users end up inadvertently testing for 
exact equality without realising it, I think that's a price worth 
paying for a clean API.

(As I said earlier, we can raise a warning in that case.)
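
To make the redundancy concrete, here is a hypothetical 
three-tolerance variant (my own sketch, not anyone's proposal); it 
only ever consults zero_tol when expected is exactly zero:

    def is_close3(actual, expected, rel=1e-5, tol=0.0, zero_tol=1e-8):
        # Hypothetical three-tolerance variant, for illustration only.
        if expected == 0.0:
            return abs(actual) <= zero_tol
        return abs(actual - expected) <= max(tol, rel * abs(expected))

    # Unless expected is exactly zero, zero_tol never matters:
    is_close3(1.0000001, 1.0, zero_tol=0.1)      # True
    is_close3(1.0000001, 1.0, zero_tol=0.00001)  # True, same either way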


> > I think this points to having a separate function for absolute tolerance
> > compared to zero -- but that's just abs(val) > zero_tolerance, so why
> > bother?
> 
> Or it could just be the same function :-). Who wants to keep track of
> two functions that conceptually do the same thing?

Agreed.



-- 
Steven

