On Fri, Jul 18, 2014 at 12:53 PM, Nathaniel Smith <njs@pobox.com> wrote:
On Fri, Jul 18, 2014 at 12:38 PM,  <josef.pktd@gmail.com> wrote:
>
> On Thu, Jul 17, 2014 at 4:07 PM, <josef.pktd@gmail.com> wrote:
>
>> If you mean by this to add atol=1e-8 as default, then I'm against it.
>>
>> At least it will change the meaning of many of our tests in statsmodels.
>>
>> I'm using rtol to check for correct 1e-15 or 1e-30, which would be
>> completely swamped if you change the default atol=0.
>> Adding atol=0 to all assert_allclose that currently use only rtol is a lot
>> of work.
>> I think I almost never use a default rtol, but I often leave atol at the
>> default = 0.
>>
>> If we have zeros, then I don't think it's too much work to decide whether
>> this should be atol=1e-20, or 1e-8.
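The swamping described in the quoted tests can be illustrated with NumPy's actual criterion, |actual - desired| <= atol + rtol * |desired| (the values below are illustrative, not from a real statsmodels test):

```python
import numpy as np

# Desired values around 1e-30, as in the p-value tests described above.
desired = np.array([1e-30, 3e-30])
wrong = np.zeros_like(desired)  # completely wrong in relative terms

# atol=0 (the current assert_allclose default): the relative check catches it.
print(np.allclose(wrong, desired, rtol=1e-7, atol=0))     # False
# atol=1e-8 (the np.allclose default): atol dominates and everything passes.
print(np.allclose(wrong, desired, rtol=1e-7, atol=1e-8))  # True
```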
>
>
> copied from
> http://mail.scipy.org/pipermail/numpy-discussion/2014-July/070639.html
> since I didn't get any messages here
>
> This is a compelling use-case, but there are also lots of compelling
> usecases that want some non-zero atol (i.e., comparing stuff to 0).
> Saying that allclose is for one of those use cases and assert_allclose
> is for the other is... not a very felicitous API design, I think. So
> we really should do *something*.
>
> Are there really any cases where you want non-zero atol= that don't
> involve comparing something against a 'desired' value of zero? It's a
> little wacky, but I'm wondering if we ought to change the rule (for
> all versions of allclose) to
>
> if desired == 0:
>     tol = atol
> else:
>     tol = rtol * desired
>
> In particular, this means that np.allclose(x, 1e-30) would reject x values
> of 0 or 2e-30, but np.allclose(x, 0) will accept x == 1e-30 or 2e-30.
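The proposed rule can be sketched as a small function (the name `proposed_close` is hypothetical; the rule itself is the one quoted above, not what NumPy actually implements):

```python
import numpy as np

def proposed_close(actual, desired, rtol=1e-7, atol=1e-8):
    """Sketch of the proposed rule: atol applies only where desired == 0,
    rtol everywhere else."""
    actual = np.asarray(actual, dtype=float)
    desired = np.asarray(desired, dtype=float)
    tol = np.where(desired == 0, atol, rtol * np.abs(desired))
    return bool(np.all(np.abs(actual - desired) <= tol))

print(proposed_close(0.0, 1e-30))    # False: tol is rtol * 1e-30, which is tiny
print(proposed_close(2e-30, 1e-30))  # False
print(proposed_close(1e-30, 0.0))    # True: compared against atol
```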
>
> -n
>
>
> That's much too confusing.
> I don't know what the usecases for np.allclose are since I don't have any.

I wrote allclose because it's shorter, but my point is that
assert_allclose and allclose should use the same criterion, and was
making a suggestion for what that shared criterion might be.

> assert_allclose is one of our (statsmodels) most frequently used numpy
> functions.
>
> this is not informative:
>
> `np.allclose(x, 1e-30)`
>
>
> since there are keywords
> either np.testing.assert_allclose(x, 0, atol=1e-30)

I think we might be talking past each other here -- 1e-30 here is my
"gold" p-value that I'm hoping x will match, not a tolerance argument.

my mistake 

 

> if I want to be "close" to zero
> or
>
> np.testing.assert_allclose(x, desired, rtol=1e-11, atol=1e-25)
>
> if we have a mix of large numbers and "zeros" in an array.
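The mixed-array case quoted above can be demonstrated like this (the array values are made up for illustration):

```python
import numpy as np
from numpy.testing import assert_allclose

# A mixed array: one "large" entry and one that should be numerically zero.
desired = np.array([1234.5, 0.0])
actual = np.array([1234.5 * (1 + 1e-12), 3e-26])

# rtol handles the large entry; a deliberately tiny atol absorbs the
# near-zero entry without loosening the relative check elsewhere.
assert_allclose(actual, desired, rtol=1e-11, atol=1e-25)
```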
>
> Making the behavior of assert_allclose depend on whether desired is
> exactly zero or 1e-20 looks too difficult to remember, and which desired I
> use would depend on what I get out of R or Stata.

I thought your whole point here was that 1e-20 and zero are
qualitatively different values that you would not want to accidentally
confuse? Surely R and Stata aren't returning exact zeros for small
non-zero values like probability tails?

> atol=1e-8 is not close to zero in most cases in my experience.

If I understand correctly (Tony?) the problem here is that another
common use case for assert_allclose is in cases like

assert_allclose(np.sin(some * complex ** calculation / (that - should - be * zero)), 0)

For cases like this, you need *some* non-zero atol or the thing just
doesn't work, and one could quibble over the exact value as long as
it's larger than "normal" floating point error. These calculations
usually involve "normal" sized numbers, so atol should be comparable
to eps * these values.  eps is 2e-16, so atol=1e-8 works for values up
to around 1e8, which is a plausible upper bound for where people might
expect assert_allclose to just work. I'm trying to figure out some way
to support your use cases while also supporting other use cases.
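A concrete instance of that use case, with an ordinary rounding error near eps:

```python
import numpy as np
from numpy.testing import assert_allclose

# A "should be zero" result carrying normal float64 rounding error.
result = np.sin(np.pi)  # about 1.2e-16, not exactly 0

# With atol=0, a relative check against a desired value of 0 can never
# pass for any nonzero result, since rtol * |0| == 0.
try:
    assert_allclose(result, 0, rtol=1e-7, atol=0)
except AssertionError:
    print("fails with atol=0")

assert_allclose(result, 0, rtol=1e-7, atol=1e-8)
print("passes with atol=1e-8")
```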

my problem is that there is no "normal" floating point error.
Whether I have units of 1000 or units of 0.0001 depends on the example and dataset that we use for testing.

this tests two different functions/methods that calculate the same thing:

(Pdb) pval
array([  3.01270184e-42,   5.90847367e-02,   3.00066946e-12])
(Pdb) res2.pvalues
array([  3.01270184e-42,   5.90847367e-02,   3.00066946e-12])
(Pdb) assert_allclose(pval, res2.pvalues, rtol=5 * rtol, atol=1e-25)

I don't care about errors that are smaller than 1e-25.

For example, testing p-values against Stata:

(Pdb) tt.pvalue
array([  5.70315140e-30,   6.24662551e-02,   5.86024090e-11])
(Pdb) res2.pvalues
array([  5.70315140e-30,   6.24662551e-02,   5.86024090e-11])
(Pdb) tt.pvalue - res2.pvalues
array([  2.16612016e-40,   2.51187959e-15,   4.30027936e-21])
(Pdb) tt.pvalue / res2.pvalues - 1
array([  3.79811738e-11,   4.01900735e-14,   7.33806349e-11])
(Pdb) rtol
1e-10
(Pdb) assert_allclose(tt.pvalue, res2.pvalues, rtol=5 * rtol)


I could find a lot more and maybe nicer examples, since I spend quite a bit of time fine-tuning unit tests.
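The pattern in those pdb sessions can be sketched with illustrative stand-in numbers (not the actual statsmodels/Stata output): a purely relative check with atol=0 treats entries spanning ~30 orders of magnitude with equal strictness.

```python
import numpy as np
from numpy.testing import assert_allclose

# Stand-ins for two implementations of the same p-values, agreeing to
# roughly 1e-11 in relative terms across 30 orders of magnitude.
desired = np.array([5.7e-30, 6.2e-2, 5.9e-11])
actual = desired * (1 + np.array([3.8e-11, 4.0e-14, 7.3e-11]))

rtol = 1e-10
# atol stays at its default of 0, so this remains a purely relative check:
# the 1e-30-sized entry is tested as strictly as the 1e-2-sized one.
assert_allclose(actual, desired, rtol=5 * rtol)
print("relative agreement within 5e-10 across all magnitudes")
```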

Of course you can change it.

But the testing functions are code and very popular code.

And if you break backwards compatibility, then I wouldn't mind reviewing a pull request for statsmodels that adds 300 to 400 `atol=0` to the unit tests. :)

Josef
 

-n

--
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion