[Python-ideas] PEP 485: A Function for testing approximate equality
Chris Barker
chris.barker at noaa.gov
Mon Jan 26 01:02:57 CET 2015
On Sun, Jan 25, 2015 at 7:32 AM, Nathaniel Smith <njs at pobox.com> wrote:
> > If you have a tolerance that you use only when expected is zero (or when
> > either is...) then you have the odd result that a small number will be
> > "close" to zero, but NOT close to a smaller number.
>
> > And you get the odd result:
> >
> > In [9]: is_close_to(1e-9, 0.0)
> > Out[9]: True
> >
> > fine -- the default zero_tol is 1e-8
> >
> > In [10]: is_close_to(1e-9, 1e-12)
> > Out[10]: False
> >
> > but huh??? 1e-9 is close to zero, but not close to 1e-12????
>
> Yes that's.... the idea? :-)
>
> If someone says that their expected value is exactly zero, then using
> relative tolerance just makes no sense at all. If they wanted an exact
> test they'd have written ==. And this is reasonable, because even if
> you know that the exact answer is zero, then you can't expect to get
> that with floating point -- +/-1e-16 or so is often the best you can
> hope for.
>
sure -- that's why I (and numpy and Steven's statistics test function) put
in an absolute tolerance as well. If you know you are testing near zero, then
you set an abs_tolerance that defines what "near zero" or "small" means in
this case.
> But if someone says their expected value is 1e-12, then... well, it's
> possible that they'd be happy to instead get 0. But likely not. 0 is
> extremely far from 1e-12 in relative terms,
And 1e-12 from zero also, of course. Which is the trick here. Even with an
asymmetric test, 0.0 is not relatively close to anything, and nothing is
relatively close to zero (as long as the relative tolerance is less than 1
-- which it really should be). So I think we should use the zero_tolerance
option if either input is zero, but then we get these discontinuities.
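To make that discontinuity concrete, here is a minimal sketch. It assumes a
hypothetical is_close_to() that switches to an absolute zero_tol whenever
either input is exactly zero; the defaults of 1e-8 are assumptions chosen to
match the session quoted above, not anything the PEP specifies.

```python
def is_close_to(actual, expected, rel_tol=1e-8, zero_tol=1e-8):
    """Hypothetical sketch: relative test, unless either input is zero."""
    if actual == 0.0 or expected == 0.0:
        # Special case: fall back to an absolute tolerance around zero.
        return abs(actual - expected) <= zero_tol
    # Otherwise a purely relative test, scaled by the expected value.
    return abs(actual - expected) <= rel_tol * abs(expected)


print(is_close_to(1e-9, 0.0))    # True:  1e-9 is within zero_tol of zero
print(is_close_to(1e-9, 1e-12))  # False: the two differ by about 1e-9
```

So 1e-9 counts as close to 0.0 but not to 1e-12, even though 1e-12 is much
nearer to it than zero is relative to the scale involved.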
So it seems, if a user wants to use the same parameters to test a bunch of
numbers, and some of them may be zero, that they should define what small
is to them by setting an abs_tolerance.
Though I guess I'd rather a zero_tol that defaulted to non-zero than an
abs_tol that did. So we might be able to satisfy your observation that a
lot of use cases call for testing against zero.
> The example that came up in the numpy
> discussion of these defaults is that statsmodels has lots of tests to
> make sure that their computations of tail value probabilities are
> correct. These are often tiny (e.g., P(being 6 standard deviations off
> from the mean of a normal) = 9.9e-10), but emphatically different from
> zero. So it's definitely safer all around to stick to relative error
> by default for non-zero expected values.
>
But would you even need to test for zero then in that case? And if so,
wouldn't setting abs_tol to what you wanted for "very small" be the right
thing to do? I note that Steven's testing code in the stdlib statistics
library uses a rel_tolerance and abs_tolerance approach as well. I haven't
seen any example of special casing zero anywhere.
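A rough sketch of what I mean for the tail-probability case. The is_close()
helper, its parameter names and its defaults are all made up for illustration;
this is not the statistics module's test code or numpy's function.

```python
def is_close(actual, expected, rel_tol=1e-8, abs_tol=0.0):
    """Hypothetical helper: pass if within either tolerance."""
    diff = abs(actual - expected)
    return diff <= rel_tol * abs(expected) or diff <= abs_tol


expected = 9.9e-10  # tail probability from the quoted example
computed = 0.0      # a buggy computation that collapsed to zero

print(is_close(computed, expected))                 # False: relative test catches it
print(is_close(computed, expected, abs_tol=1e-8))   # True:  abs_tol too loose, bug hidden
print(is_close(computed, expected, abs_tol=1e-12))  # False: abs_tol below the data scale
```

The point being that abs_tol has to be picked relative to what "very small"
means for the data at hand; a blanket non-zero default could hide a real
error here.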
> Admittedly I am leaning pretty heavily on the "testing" use case here,
> but that's because AFAICT that's the overwhelming use case for this
> kind of functionality.
I agree that it is as well -- sure you could use it for a simple recursive
solution to an implicit equation, but how many people whip those up,
compared to either testing code or writing a custom comparison designed
specifically for the case at hand?
> > I'd much rather require people to have to think about what makes sense
> > for their use case than get trapped by a default that's totally
> > inappropriate.
>
> But this seems a strange reason to advocate for a default that's
> totally inappropriate. is_close_to(x, 0.0) simply doesn't mean
> anything sensible in the current PEP -- even giving an error would be
> better.
>
Sure it does -- it means nothing is relatively close to zero -- haven't we
all agreed that that's the mathematically correct result? And if you write
a test against zero it will reliably fail the first time if you haven't set an
abs_tolerance. So you will then be forced to decide what "near zero" means
to you, and set an appropriate abs_tolerance.
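Roughly, the workflow looks like this. The is_close() below is only an
illustrative stand-in with assumed names and defaults (abs_tol defaulting to
0.0), not the PEP's reference implementation.

```python
def is_close(actual, expected, rel_tol=1e-8, abs_tol=0.0):
    """Hypothetical sketch: relative test with an optional absolute floor."""
    diff = abs(actual - expected)
    return diff <= max(rel_tol * abs(expected), abs_tol)


# Mathematically zero, but floating point leaves a tiny residual.
residual = 0.1 + 0.2 - 0.3

print(is_close(residual, 0.0))                 # False: nothing is relatively close to zero
print(is_close(residual, 0.0, abs_tol=1e-12))  # True once "near zero" is defined
```

The first call fails every time, not intermittently, so the missing
abs_tolerance shows up the first time the test is run.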
I think this points to having a separate function for absolute tolerance
compared to zero -- but that's just abs(val) < zero_tolerance, so why
bother?
Or do you think there are common use cases where you would want purely
relative tolerance, down to very close to zero, but want a larger tolerance
for zero itself, all in the same comprehension?
-Chris
--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker at noaa.gov