[Python-ideas] PEP 485: A Function for testing approximate equality
Steven D'Aprano
steve at pearwood.info
Mon Jan 26 06:54:38 CET 2015
On Mon, Jan 26, 2015 at 01:17:02AM +0000, Nathaniel Smith wrote:
> On Mon, Jan 26, 2015 at 12:02 AM, Chris Barker <chris.barker at noaa.gov> wrote:
[...]
> > Though I guess I'd rather a zero_tol that defaulted to non-zero than an
> > abs_tol that did. So we might be able to satisfy your observation that a lot
> > of use cases call for testing against zero.
>
> Yes, that's the idea -- defaulting rel_tol and zero_tol to non-zero
> values, and abs_tol to zero, gives you a set of defaults that will
> just work for people who want to write useful tests without having to
> constantly be distracted by floating point arcana.
I really don't think that setting one or two error tolerances is
"floating point arcana". I don't think that having to explicitly decide
on what counts as "close" (as either an absolute difference or a
relative difference) is especially onerous: surely anyone writing code
will be able to cope with one or two decisions:
- close enough means they differ by no more than X
- close enough means they differ by no more than X%, expressed
  as a fraction
This isn't ULPs :-)
I'm almost inclined to not set any defaults, except perhaps zero for
both (in which case "close to" cleanly degrades down to "exactly equal"
except slower) and force the user to explicitly choose a value.
Arguments in favour of setting some defaults:
- People who want a "zero-thought" solution will get one, even
  if it does the wrong thing for their specific application, but
  at least they didn't have to think about it.
- The defaults might occasionally be appropriate.
Arguments against:
- There is no default error tolerance we can pick, whether relative
  or absolute, which will suit everyone all the time. Unless the
  defaults are appropriate (say) 50% of the time or more, they will
  just be an attractive nuisance (see zero-thought above).
In the statistics tests, I had the opportunity to set my own global
defaults, but I don't think I ever actually used them. Maybe I could
have picked better defaults? I don't know.
I did use defaults per test suite, so that's an argument in favour of
having is_close (approx_equal) not use defaults, but assertIsClose
(assertApproxEqual) use per-instance defaults.
[Context: in the tests, I had an assertApproxEqual method that relied on
approx_equal function. The function had defaults, but I never used them.
The method defaulted to reading defaults from self.rel and self.tol, and
I did use them.]
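A rough sketch of that arrangement, for illustration only. This is not the actual test_statistics code: the class names ApproxTestCase and MeanTests, the failure message and the "either tolerance passes" rule are assumptions made here. The function defaults to exact comparison, while the assertion method falls back to per-instance self.tol and self.rel:

```python
import unittest


def approx_equal(x, y, tol=0.0, rel=0.0):
    """Return True if x and y differ by at most tol (absolute) or rel (relative)."""
    diff = abs(x - y)
    # Assumption: passing either test is enough to count as "close".
    return diff <= tol or diff <= rel * max(abs(x), abs(y))


class ApproxTestCase(unittest.TestCase):
    # Per-suite defaults; a subclass or an instance can override them.
    tol = 0.0
    rel = 0.0

    def assertApproxEqual(self, first, second, tol=None, rel=None, msg=None):
        # Fall back to the instance attributes when no tolerance is passed.
        if tol is None:
            tol = self.tol
        if rel is None:
            rel = self.rel
        if not approx_equal(first, second, tol=tol, rel=rel):
            standard = "%r != %r within tol=%r, rel=%r" % (first, second, tol, rel)
            self.fail(msg or standard)


class MeanTests(ApproxTestCase):
    rel = 1e-9  # default relative tolerance for every test in this suite

    def test_mean(self):
        # No tolerance given here, so self.rel is used.
        self.assertApproxEqual(sum([0.1] * 10) / 10, 0.1)
```

Each suite states its own idea of "close" once, and individual assertions only pass a tolerance when they need something different.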
> This does require that zero_tol is only applied for expected == 0.0,
> *not* for actual == 0.0, though. If you expected 1e-10 and got 0.0
> then this *might* be okay in your particular situation but it really
> requires the user to think things through; a generic tool should
> definitely flag this by default.
I really think that having three tolerances, one of which is nearly
always ignored, is poor API design. The user usually knows when they are
comparing against an expected value of zero and can set an absolute
error tolerance.
How about this?
- Absolute tolerance defaults to zero (which is equivalent to
  exact equality).
- Relative tolerance defaults to something (possibly zero) to be
  determined after sufficient bike-shedding.
- An argument for setting both values to zero by default is that
  it will make it easy to choose one of "absolute or relative". You
  just supply a value for the one that you want, and let the other
  take the default of zero.
- At the moment, I'm punting on the behaviour when both abs and rel
  tolerances are provided. That can be bike-shedded later.
Setting both defaults to zero means that the zero-thought version:
    if is_close(x, y): ...
will silently degrade to x == y, which is no worse than what people
do now (except slower). We can raise a warning in that case.
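As a minimal sketch of that proposal, assuming finite floats and the parameter names tol and rel used elsewhere in this message: both tolerances default to zero, and the zero-thought call warns that it is really testing exact equality. The "either tolerance passes" rule for the case where both are supplied is only a placeholder, since that behaviour is still undecided, and the second argument is assumed to be the expected value.

```python
import warnings


def is_close(x, y, tol=0.0, rel=0.0):
    """Sketch: absolute tolerance tol, relative tolerance rel, both default to 0."""
    if tol < 0 or rel < 0:
        raise ValueError("tolerances must be non-negative")
    if tol == 0 and rel == 0:
        # Zero-thought call: this is just x == y, so tell the caller.
        warnings.warn("is_close called with no tolerance; testing exact equality",
                      stacklevel=2)
    diff = abs(x - y)
    # Placeholder rule (an assumption): passing either tolerance is enough.
    # The relative tolerance is measured against y, taken as the expected value.
    return diff <= tol or diff <= rel * abs(y)


if __name__ == "__main__":
    print(is_close(0.1 + 0.2, 0.3, rel=1e-9))  # relative test only
    print(is_close(1e-9, 0.0, tol=1e-8))       # absolute test only
    print(is_close(0.1 + 0.2, 0.3))            # warns, then compares exactly
```

Supplying one tolerance and leaving the other at zero selects "absolute" or "relative" without any extra flag.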
The only tricky situation might be if you *may* be comparing against
zero, but don't know so in advance. There are some solutions to that:
- The suggested "zero_tol" parameter, which I dislike. I think it is
  an ugly and confusing API.
- Some versions of is_close may not require any special treatment
  for zero, depending on how it treats the situation where both abs
  and rel tolerances are given. Further discussion needed.
- Worst case, people write this:
      if (expected == 0 and is_close(actual, expected, tol=1e-8)
              or is_close(actual, expected, rel=1e-5)):
but I don't think that will come up in practice very often. In the
test_statistics module, I had tests that looked like this:
    for x in [bunch of non-zero values]:
        y = do_some_calculation(x)
        self.assertApproxEqual(x, y, rel=0.01)
    y = do_some_calculation(0)
    self.assertApproxEqual(0, y, tol=0.000001)
which isn't hard to do, so I don't think this is a real issue in
practice. I think the case of "my expected value might be zero, but I'm
not sure in advance" is rare and unusual enough that we don't need to
worry about it.
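For anyone who does hit the "might be zero" case, the worst-case expression can be wrapped in a small helper. This is only a sketch: close_or_near_zero is a hypothetical name, the is_close here is a stand-in with assumed semantics, and the default tolerances are simply the ones from the worst-case example.

```python
def is_close(actual, expected, tol=0.0, rel=0.0):
    """Stand-in is_close: absolute tolerance tol, relative tolerance rel."""
    diff = abs(actual - expected)
    return diff <= tol or diff <= rel * abs(expected)


def close_or_near_zero(actual, expected, tol=1e-8, rel=1e-5):
    """Absolute test when expected is exactly zero, relative test otherwise."""
    if expected == 0:
        return is_close(actual, expected, tol=tol)
    return is_close(actual, expected, rel=rel)


if __name__ == "__main__":
    print(close_or_near_zero(1e-9, 0.0))        # judged by tol
    print(close_or_near_zero(100.0005, 100.0))  # judged by rel
    print(close_or_near_zero(0.0, 1e-10))       # expected is non-zero: judged by rel
```

Only the expected value selects the branch, so an actual value of 0.0 against a tiny non-zero expected value is still judged by the relative tolerance.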
[...]
> My claim wasn't that is_close_to(x, 0.0) provides a mathematically
> ill-defined result. I agree that that's a reasonable definition of
> "relatively close to" (though one could make an argument that zero is
> not relatively close to itself -- after all, abs(actual -
> expected)/expected is ill-defined).
Don't write it that way. Write it this way:
    abs(actual - expected) <= relative_tolerance*abs(expected)
Now if expected is zero, the condition is true if and only if
actual==expected.
It would be bizarre for is_close(a, a) to return False (or worse,
raise an exception!) for any finite number. NANs, of course, are
allowed to be bizarre. Zero is not :-)
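A small sketch of the two spellings, assuming finite floats. The function names are made up for this example, and taking abs(expected) on the right-hand side is a detail added here so that negative expected values behave the same way as positive ones.

```python
def rel_close_division(actual, expected, rel):
    # Divided form: raises ZeroDivisionError when expected == 0.
    return abs(actual - expected) / abs(expected) <= rel


def rel_close(actual, expected, rel):
    # Multiplied form: no division, so expected == 0 is well-defined
    # and the test reduces to actual == expected in that case.
    return abs(actual - expected) <= rel * abs(expected)


if __name__ == "__main__":
    print(rel_close(0.0, 0.0, 1e-5))     # zero is close to itself
    print(rel_close(1e-300, 0.0, 1e-5))  # nothing else is close to zero
    try:
        rel_close_division(0.0, 0.0, 1e-5)
    except ZeroDivisionError as exc:
        print("division form fails:", exc)
```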
> Instead, my point was that if the
> user is asking "is this close to 0?" instead of "is this exactly equal
> to zero?" then they probably are expecting that there exist some
> inputs for which those two questions give different answers. Saying
> "well TECHNICALLY this is a valid definition of 'close to'" is
> certainly true but somewhat unkind.
I agree, but I think this is a symptom of essential complexity in the
problem domain. Ultimately, "is close" is ill-defined, and *somebody*
has to make the decision what that will be, and that decision won't
satisfy everyone always. We can reduce the complexity in one place:
    * provide sensible default values that work for expected != 0
but only by increasing the complexity elsewhere:
    * when expected == 0 the intuition that is_close is different
      from exact equality fails
We can get rid of that complexity, but only by adding it back somewhere
else:
    * is_close(x, y, zero_tol=0.1) and is_close(x, y, zero_tol=0.00001)
      give the same result for all the x,y I tested!
that is, zero_tol is nearly always ignored. Since people will often need
to think about what they want "is close" to mean no matter what we do, I
would prefer not to add the complexity of a third tolerance value. If
that means that "zero thought" users end up inadvertently testing for
exact equality without realising it, I think that's a price worth
paying for a clean API.
(As I said earlier, we can raise a warning in that case.)
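To illustrate the "nearly always ignored" point, here is a hypothetical three-tolerance variant. The name is_close3, the default values and the way the tolerances are combined are all assumptions, not anything settled in this thread. zero_tol is consulted only when expected is exactly 0.0, so changing it makes no difference for any other input.

```python
def is_close3(actual, expected, rel_tol=1e-5, abs_tol=0.0, zero_tol=1e-8):
    """Hypothetical three-tolerance variant; zero_tol applies only if expected == 0."""
    diff = abs(actual - expected)
    if expected == 0.0:
        # The only branch where zero_tol has any effect.
        return diff <= max(abs_tol, zero_tol)
    return diff <= max(abs_tol, rel_tol * abs(expected))


# (actual, expected) pairs, all with a non-zero expected value.
pairs = [(1.0, 1.0 + 1e-9), (1e-10, 2e-10), (0.0, 1e-10), (-3.5, -3.5001)]

# True when the two zero_tol settings agree on every pair.
same = all(
    is_close3(a, e, zero_tol=0.1) == is_close3(a, e, zero_tol=0.00001)
    for a, e in pairs
)

if __name__ == "__main__":
    print("zero_tol made no difference:", same)
    # Only an expected value of exactly zero brings zero_tol into play.
    print(is_close3(0.01, 0.0, zero_tol=0.1), is_close3(0.01, 0.0, zero_tol=0.00001))
```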
> > I think this points to having a separate function for absolute tolerance
> > compared to zero -- but that's just abs(val) < zero_tolerance, so why
> > bother?
>
> Or it could just be the same function :-). Who wants to keep track of
> two functions that conceptually do the same thing?
Agreed.
--
Steven