[Python-ideas] PEP 485: A Function for testing approximate equality

Chris Barker chris.barker at noaa.gov
Mon Jan 26 07:03:11 CET 2015


On Sun, Jan 25, 2015 at 5:17 PM, Nathaniel Smith <njs at pobox.com> wrote:

> > Though I guess I'd rather a zero_tol that defaulted to non-zero than an
> > abs_tol that did. So we might be able to satisfy your observation that a
> > lot of use cases call for testing against zero.
>
> Yes, that's the idea -- defaulting rel_tol and zero_tol to non-zero
> values, and abs_tol to zero, gives you a set of defaults that will
> just work for people who want to write useful tests without having to
> constantly be distracted by floating point arcana.
>

OK -- I get it now -- this is really about getting a default for a zero
tolerance test that does not mess up the relative test -- that may be a way
to go.
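To make that concrete, here's a minimal sketch of the semantics being discussed -- the names and default values are placeholders of mine, not whatever spelling the PEP ends up with:

```python
def is_close_to(actual, expected, rel_tol=1e-9, abs_tol=0.0, zero_tol=1e-12):
    # Sketch only; names and defaults are placeholders, not the PEP's
    # final spelling.
    if expected == 0.0:
        # A relative test is meaningless against zero (rel_tol * 0 == 0),
        # so fall back to the dedicated zero tolerance.
        return abs(actual) <= zero_tol
    # Otherwise test relative error, with abs_tol defaulting to 0.0 so
    # it can't silently loosen the relative test.
    return abs(actual - expected) <= max(rel_tol * abs(expected), abs_tol)
```

Note that zero_tol kicks in only when *expected* is exactly 0.0, per Nathaniel's point below.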


> This does require that zero_tol is only applied for expected == 0.0,
> *not* for actual == 0.0, though. If you expected 1e-10 and got 0.0
> then this *might* be okay in your particular situation but it really
> requires the user to think things through; a generic tool should
> definitely flag this by default.
>

Got it -- if they want that, they can set abs_tolerance to what they
need.

> >> The example that came up in the numpy
> >> discussion of these defaults is that statsmodels has lots of tests to
> >> make sure that their computations of tail value probabilities are
> >> correct. These are often tiny (e.g., P(being 6 standard deviations off
> >> from the mean of a normal) = 9.9e-10), but emphatically different from
> >> zero. So it's definitely safer all around to stick to relative error
> >> by default for non-zero expected values.
>

Exactly why I don't think abs_tolerance should be anything other than 0.0.

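A quick illustration of that hazard (the numbers are illustrative; the non-zero abs_tol is np.allclose's documented atol default of 1e-8):

```python
# A tiny-but-meaningful tail probability, statsmodels-style.
p_correct = 9.9e-10   # roughly P(Z > 6) for a standard normal
p_wrong = 0.0         # a broken computation that returns zero

rel_tol = 1e-9
abs_tol = 1e-8        # np.allclose's default atol, for comparison

# Pure relative test: correctly flags the wrong answer.
rel_close = abs(p_wrong - p_correct) <= rel_tol * abs(p_correct)

# Test with a non-zero abs_tol folded in: silently passes.
mixed_close = abs(p_wrong - p_correct) <= max(rel_tol * abs(p_correct), abs_tol)

print(rel_close, mixed_close)  # False True
```

With abs_tol defaulting to 0.0, the second test collapses into the first and the bug is caught.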

> > But would you even need to test for zero then in that case? And if so,
> > wouldn't setting abs_tol to what you wanted for "very small" be the right
> > thing to do? I note that Steven's testing code for the stdlib statistics
> > library used a rel_tolerance and abs_tolerance approach as well. I
> > haven't seen any example of special casing zero anywhere.
>
> Right, this example came up when it was discovered that np.allclose()
> has a non-zero abs_tol by default, and that
> np.testing.assert_allclose() has a zero abs_tol by default. It's a
> terrible and accidental API design, but it turns out that people
> really are intentionally using one or the other depending on whether
> they expect to be dealing with exact zeros or to be dealing with
> small-but-non-zero values.


Why didn't they just override the defaults? But whatever.

> The whole motivation for zero_tol is to
> allow a single set of defaults that satisfies both groups.
>

OK -- I'm buying it. However, what is a sensible default for
zero_tolerance? I agree it's less critical than for abs_tolerance, but what
should it be? Can we safely figure that order of magnitude one is most
common, and that something in the 1e-8 to 1e-14 range makes sense? I suppose
that wouldn't be surprising to most folks.

> Tests against zero won't necessarily fail -- sometimes rounding errors
> do cancel out, and you do get 0.0 instead of 1e-16 or whatever. At
> least for some sets of inputs, or until the code gets
> perturbed/refactored slightly, etc. That's why I said it might
> actually be better to unconditionally fail when expected==0.0 rather
> than knowingly perform a misleading operation.
>

I get it -- that seems rare, certainly more rare than the other case, where
is_close_to passes for small numbers when it really shouldn't. And sure,
you could get a pass the first time around because, indeed, you DID get
exactly zero -- that should pass. But when you do refactor and introduce a
slightly different answer, you'll get a failure then and can figure it out
then.
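A small demonstration of how fragile that is: some cancellations land on exactly 0.0, others leave a tiny residue, and a refactor can flip one into the other.

```python
# Powers of two are represented exactly, so this cancels to exactly 0.0.
exact = (0.5 + 0.25) - 0.75

# Decimal fractions are not exactly representable in binary, so this
# mathematically-zero expression leaves a residue of about 5.55e-17.
residue = (0.1 + 0.2) - 0.3

print(exact, residue)  # 0.0 5.551115123125783e-17
```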

Are you actually proposing that the function should raise an Exception if
expected == 0.0 and abs_tolerance is also 0.0? (And I guess zero_tolerance,
if there is one.)

> Or do you think there are common use cases where you would want purely
> > relative tolerance, down to very close to zero, but want a larger
> tolerance
> > for zero itself, all in the same comprehension?
>
> inf = float("inf")
> for (x, expected) in [
>     (inf, inf),
>     (100, 1e100),
>     (1, 10),
>     (0, 1),
>     (-1, 0.1),
>     (-100, 1e-100),
>     (-inf, 0),
>     ]:
>     assert is_close_to(10 ** x, expected)
>

I meant a case that wasn't contrived ;-)


> Though really what I'm arguing is that all in the same userbase people
> want relative tolerance down close to zero but a larger tolerance for
> zero itself.
>

Absolutely -- and adding a zero_tolerance may be a way to get everyone
useful defaults.

-Chris



-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov
