[Python-ideas] PEP 485: A Function for testing approximate equality

Nathaniel Smith njs at pobox.com
Mon Jan 26 02:17:02 CET 2015


On Mon, Jan 26, 2015 at 12:02 AM, Chris Barker <chris.barker at noaa.gov> wrote:
> On Sun, Jan 25, 2015 at 7:32 AM, Nathaniel Smith <njs at pobox.com> wrote:
>>
>> > If you have a tolerance that you use only when expected is zero (or when
>> > either is...) then you have the odd result that a small number will be
>> > "close" to zero, but NOT close to a smaller number.
>
>
>>
>> > And you get the odd result:
>> >
>> > In [9]: is_close_to(1e-9, 0.0)
>> > Out[9]: True
>> >
>> > fine -- the default zero_tol is 1e-8
>> >
>> > In [10]: is_close_to(1e-9, 1e-12)
>> > Out[10]: False
>> >
>> > but huh??? 1e-9 is close to zero, but not close to 1e-12????
>>
>> Yes that's.... the idea? :-)
>>
>> If someone says that their expected value is exactly zero, then using
>> relative tolerance just makes no sense at all. If they wanted an exact
>> test they'd have written ==. And this is reasonable, because even if
>> you know that the exact answer is zero, then you can't expect to get
>> that with floating point -- +/-1e-16 or so is often  the best you can
>> hope for.
>
>
> sure -- that's why I (and numpy and Steven's statistics test function) put
> in an absolute tolerance as well. If you know you're testing near zero, then
> you set an abs_tolerance that defines what "near zero" or "small" means in
> this case.
>
>> But if someone says their expected value is 1e-12, then... well, it's
>> possible that they'd be happy to instead get 0. But likely not. 0 is
>> extremely far from 1e-12 in relative terms,
>
> And 1e-12 from zero also, of course. Which is the trick here. Even with an
> asymmetric test, 0.0 is not relatively close to anything, and nothing is
> relatively close to zero (as long as the relative tolerance is less than 1
> -- which it really should be). So I think we should use the zero_tolerance
> option if either input is zero, but then we get these discontinuities.
>
> So it seems, if a user wants to use the same parameters to test a bunch of
> numbers, and some of them may be zero, that they should define what small is
> to them by setting an abs_tolerance.
>
> Though I guess I'd rather have a zero_tol that defaulted to non-zero than an
> abs_tol that did. So we might be able to satisfy your observation that a lot
> of use cases call for testing against zero.

Yes, that's the idea -- defaulting rel_tol and zero_tol to non-zero
values, and abs_tol to zero, gives you a set of defaults that will
just work for people who want to write useful tests without having to
constantly be distracted by floating point arcana.

This does require that zero_tol is only applied for expected == 0.0,
*not* for actual == 0.0, though. If you expected 1e-10 and got 0.0
then this *might* be okay in your particular situation but it really
requires the user to think things through; a generic tool should
definitely flag this by default.
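
In code, the behavior I have in mind is roughly this -- just a sketch
of the asymmetric variant under discussion, not anything from the PEP,
and the parameter names and default values here are purely
illustrative:

import math

def is_close_to(actual, expected, rel_tol=1e-8, abs_tol=0.0, zero_tol=1e-8):
    # An exact match (including inf == inf) always counts as close.
    if actual == expected:
        return True
    if math.isinf(actual) or math.isinf(expected):
        return False
    if expected == 0.0:
        # No relative test makes sense against an expected value of
        # exactly zero, so fall back to the absolute zero_tol.
        return abs(actual) <= zero_tol
    # Otherwise test relative to expected, with abs_tol as an optional
    # absolute floor (zero by default).
    return abs(actual - expected) <= max(rel_tol * abs(expected), abs_tol)

With those defaults is_close_to(1e-9, 0.0) passes via zero_tol, while
is_close_to(0.0, 1e-12) fails, which is exactly the asymmetry
described above.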

>>  The example that came up in the numpy
>> discussion of these defaults is that statsmodels has lots of tests to
>> make sure that their computations of tail value probabilities are
>> correct. These are often tiny (e.g., P(being 6 standard deviations off
>> from the mean of a normal) = 9.9e-10), but emphatically different from
>> zero. So it's definitely safer all around to stick to relative error
>> by default for non-zero expected values.
>
>
> But would you even need to test for zero then in that case? And if so,
> wouldn't setting abs_tol to what you wanted for "very small" be the right
> thing to do? I note that Steven's testing code for the stdlib statistics
> library used a rel_tolerance and abs_tolerance approach as well. I haven't
> seen any example of special casing zero anywhere.

Right, this example came up when it was discovered that np.allclose()
has a non-zero abs_tol by default, and that
np.testing.assert_allclose() has a zero abs_tol by default. It's a
terrible and accidental API design, but it turns out that people
really are intentionally using one or the other depending on whether
they expect to be dealing with exact zeros or to be dealing with
small-but-non-zero values. The whole motivation for zero_tol is to
allow a single set of defaults that satisfies both groups.
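
To make the split concrete: as of current numpy, np.allclose defaults
to rtol=1e-5 and atol=1e-8, while np.testing.assert_allclose defaults
to rtol=1e-7 and atol=0, so a tail probability like the one above gets
treated very differently by the two:

import numpy as np

expected = 9.9e-10  # tiny but emphatically non-zero tail probability
actual = 0.0        # completely wrong -- off by 100% in relative terms

# The non-zero default atol swallows the error entirely:
print(np.allclose(actual, expected))          # True

# With atol defaulting to 0, the purely relative test rejects it:
np.testing.assert_allclose(actual, expected)  # raises AssertionError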

>> Admittedly I am leaning pretty heavily on the "testing" use case here,
>> but that's because AFAICT that's the overwhelming use case for this
>> kind of functionality.
>
>
> I agree that it is as well -- sure you could use it for a simple recursive
> solution to an implicit equation, but how many people whip those up, compared
> to either testing code or writing a custom comparison designed specifically
> for the case at hand.
>
>> > I'd much rather require people to have to think about what makes sense
>> > for
>> > their use case than get trapped by a default that's totally
>> > inappropriate.
>>
>> But this seems a strange reason to advocate for a default that's
>> totally inappropriate. is_close_to(x, 0.0) simply doesn't mean
>> anything sensible in the current PEP -- even giving an error would be
>> better.
>
>
> Sure it does -- it means nothing is relatively close to zero -- haven't we
> all agreed that that's the mathematically correct result? And if you write a
> test against zero it will reliably fail the first time if you haven't set an
> abs_tolerance. So you will then be forced to decide what "near zero" means
> to you, and set an appropriate abs_tolerance.

Tests against zero won't necessarily fail -- sometimes rounding errors
do cancel out, and you do get 0.0 instead of 1e-16 or whatever. At
least for some sets of inputs, or until the code gets
perturbed/refactored slightly, etc. That's why I said it might
actually be better to unconditionally fail when expected==0.0 rather
than knowingly perform a misleading operation.
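
For example, with ordinary doubles (and which inputs happen to cancel
exactly is an accident of rounding, not something you can rely on
either way):

# The rounding error survives here -- "zero" comes out tiny but non-zero:
print((0.1 + 0.2) - 0.3)    # 5.551115123125783e-17

# ...but with these inputs the errors cancel and you get exactly 0.0:
print((0.1 + 0.4) - 0.5)    # 0.0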

My claim wasn't that is_close_to(x, 0.0) provides a mathematically
ill-defined result. I agree that that's a reasonable definition of
"relatively close to" (though one could make an argument that zero is
not relatively close to itself -- after all, abs(actual -
expected)/expected is ill-defined). Instead, my point was that if the
user is asking "is this close to 0?" instead of "is this exactly equal
to zero?" then they probably are expecting that there exist some
inputs for which those two questions give different answers. Saying
"well TECHNICALLY this is a valid definition of 'close to'" is
certainly true but somewhat unkind.

> I think this points to having a separate function for absolute tolerance
> compared to zero -- but that's just abs(val) > zero_tolerance, so why
> bother?

Or it could just be the same function :-). Who wants to keep track of
two functions that conceptually do the same thing?

> Or do you think there are common use cases where you would want purely
> relative tolerance, down to very close to zero, but want a larger tolerance
> for zero itself, all in the same comprehension?

inf = float("inf")
for (x, expected) in [
    (inf, inf),
    (100, 1e100),
    (1, 10),
    (0, 1),
    (-1, 0.1),
    (-100, 1e-100),
    (-inf, 0),
    ]:
    assert is_close_to(10 ** x, expected)

Though really what I'm arguing is that, all in the same userbase, people
want relative tolerance down close to zero but a larger tolerance for
zero itself.

-n

-- 
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org

