Re: [Numpy-discussion] [pydata] Re: [SciPy-Dev] 1.8.0rc1
I bet the difference is:
In master, nansum ultimately calls arr.sum(...), which will be
intercepted by Series.sum.
In 1.8.x, nansum ultimately calls np.add.reduce(...), which can't be
intercepted and will return the wrong thing.
AFAICT the np.add.reduce(a, ...) call could just be replaced with
a.sum(...), but I might be missing something... surely there must have
been some reason it was written that way in the first place?
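The dispatch difference described above can be shown with a small sketch. The `SeriesLike` class and the `calls` list are hypothetical stand-ins for illustration, not pandas code; the class only mimics a subclass that overrides `sum()`.

```python
import numpy as np

calls = []  # records which overridden methods were invoked


class SeriesLike(np.ndarray):
    """Hypothetical stand-in for an ndarray subclass that overrides sum()."""

    def sum(self, *args, **kwargs):
        calls.append("sum")
        # Reduce over the plain ndarray so the result is not a 0-d subclass.
        return np.asarray(self).sum(*args, **kwargs)


a = np.array([1.0, 2.0]).view(SeriesLike)

via_method = a.sum()          # goes through SeriesLike.sum
via_ufunc = np.add.reduce(a)  # never looks at SeriesLike.sum

print(type(via_method).__name__, calls)
print(type(via_ufunc).__name__, calls)
```

Only the method call is recorded in `calls`; the ufunc reduction bypasses the override, which is why the subclass cannot correct the result in that path.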
On Tue, Oct 1, 2013 at 2:48 PM, Jeff wrote:
so...looks like a backport issue?
On Tuesday, October 1, 2013 6:41:26 AM UTC-4, Nathaniel Smith wrote:
On Tue, Oct 1, 2013 at 3:27 AM, Charles R Harris wrote:
> On Mon, Sep 30, 2013 at 5:12 PM, Christoph Gohlke <cgo...@uci.edu> wrote:
> > NumPy 1.8.0rc1 looks good. All tests pass on Windows and most 3rd party
> > packages test OK now. Thank you.
> >
> > A few tests still fail in the following packages when run with
> > numpy-MKL-1.8.0rc1-win-amd64-py3.3 compared to
> > numpy-MKL-1.7.1-win-amd64-py3.3:
> >
> > 1) Pandas 0.12.0
> >
> > ```
> > ======================================================================
> > FAIL: test_nansum_buglet (pandas.tests.test_series.TestNanops)
> > ----------------------------------------------------------------------
> > Traceback (most recent call last):
> >   File "X:\Python33\lib\site-packages\pandas\tests\test_series.py", line 254, in test_nansum_buglet
> >     assert_almost_equal(result, 1)
> >   File "X:\Python33\lib\site-packages\pandas\util\testing.py", line 134, in assert_almost_equal
> >     np.testing.assert_(isiterable(b))
> >   File "D:\Dev\Compile\Test\numpy-build\numpy\testing\utils.py", line 44, in assert_
> >     raise AssertionError(msg)
> > AssertionError
> > ```
> >
> > Possibly related:
> >
> > ```
> > >>> import numpy as np
> > >>> from pandas import Series
> > >>> s = Series([0.0])
> > >>> result = np.nansum(s)
> > >>> print(result)
> > Traceback (most recent call last):
> >   File "<stdin>", line 1, in <module>
> >   File "X:\Python33\lib\site-packages\pandas\core\base.py", line 19, in __str__
> >     return self.__unicode__()
> >   File "X:\Python33\lib\site-packages\pandas\core\series.py", line 1115, in __unicode__
> >     length=len(self) > 50,
> > TypeError: len() of unsized object
> > ```
[...]
The pandas test passes for current pandas dev, so it looks like a bug on their end that has been taken care of.
test_nansum_buglet (__main__.TestNanops) ... ok
I'm concerned about this. 0.12.0 is currently the latest pandas release, so even if it is a bug on their side, we're going to be converting it from a latent bug to a real bug when we release... CC'ing pydata, do you guys have any insight into what changed here?
The code is:

    s = pandas.Series([1.0, np.nan])
    result = np.nansum(s)

With numpy 1.7.1, 'result' comes out as a np.float64. With numpy maintenance/1.8.x, 'result' comes out as a 0-d Series object. Series is a subclass of ndarray, but it's supposed to always be 1-d, so all kinds of stuff blows up as soon as you have a 0-d Series object.

I'm not sure what changed in numpy's nansum; if I try this same test with a simple no-op ndarray subclass:

    class MyArray(np.ndarray):
        pass

    np.nansum(np.array([1.0, np.nan]).view(MyArray))

then 1.7.1 and maintenance/1.8.x both act the same, and both return a 0-d MyArray, so it's not just a question of whether we remembered to handle subclasses at all.
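For anyone wanting to repeat the no-op subclass check on their own NumPy build, a minimal runnable version might look like this. The printed types are deliberately not asserted here, since they may differ between NumPy versions.

```python
import numpy as np


class MyArray(np.ndarray):
    """No-op ndarray subclass, as in the check described above."""
    pass


plain = np.array([1.0, np.nan])
sub = plain.view(MyArray)

# Report what nansum hands back for a base array and for the subclass view.
for label, arr in [("ndarray", plain), ("MyArray", sub)]:
    result = np.nansum(arr)
    print(label, "->", type(result).__name__, "ndim =", np.ndim(result))
```

Running the same script under two NumPy installs and comparing the printed lines is enough to see whether subclass handling itself changed between them.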
-n
On Tue, Oct 1, 2013 at 7:58 AM, Nathaniel Smith wrote:
> I bet the difference is:
> In master, nansum ultimately calls arr.sum(...), which will be intercepted by Series.sum.
> In 1.8.x, nansum ultimately calls np.add.reduce(...), which can't be intercepted and will return the wrong thing.
> AFAICT the np.add.reduce(a, ...) call could just be replaced with a.sum(...), but I might be missing something... surely there must have been some reason it was written that way in the first place?

No good reason, just a bit more efficient. The change for current master was because of the change in nansum behavior for empty slices. Changing the call to a.sum is not a problem, although I confess that it seems a bit fragile...

<snip>

Chuck
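A rough sketch of what "changing the call to a.sum" amounts to is given below. `nansum_sketch` and `SumOverride` are hypothetical names for illustration; this is not NumPy's actual nansum source, and it ignores details such as empty slices, `dtype`, and `out`.

```python
import numpy as np


def nansum_sketch(a, axis=None):
    """Illustrative only; not NumPy's actual nansum implementation."""
    arr = np.array(a, subok=True, copy=True)    # copy, preserving any subclass
    if issubclass(arr.dtype.type, np.inexact):
        np.copyto(arr, 0, where=np.isnan(arr))  # treat NaN as zero
    # Method call rather than np.add.reduce(arr, axis), so that a subclass
    # overriding sum() gets a chance to handle the reduction.
    return arr.sum(axis=axis)


class SumOverride(np.ndarray):
    """Hypothetical subclass whose sum() returns plain NumPy results."""

    def sum(self, *args, **kwargs):
        return np.asarray(self).sum(*args, **kwargs)


s = np.array([1.0, np.nan]).view(SumOverride)
result = nansum_sketch(s)
print(type(result).__name__, result)
```

The fragility mentioned above is visible here: the correct result depends entirely on the subclass supplying a well-behaved `sum()`.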
On Tue, Oct 1, 2013 at 3:13 PM, Charles R Harris wrote:
> On Tue, Oct 1, 2013 at 7:58 AM, Nathaniel Smith wrote:
> > I bet the difference is:
> > In master, nansum ultimately calls arr.sum(...), which will be intercepted by Series.sum.
> > In 1.8.x, nansum ultimately calls np.add.reduce(...), which can't be intercepted and will return the wrong thing.
> > AFAICT the np.add.reduce(a, ...) call could just be replaced with a.sum(...), but I might be missing something... surely there must have been some reason it was written that way in the first place?
> No good reason, just a bit more efficient. The change for current master was because of the change in nansum behavior for empty slices. Changing the call to a.sum is not a problem, although I confess that it seems a bit fragile...
Yeah, ndarray subclassing is always fragile :-/. But hopefully __numpy_ufunc__ will solve the problem in 1.9 and going forward...? (I forget if it's implemented for .reduce yet.)

Filed a tracker bug here: https://github.com/numpy/numpy/issues/3849

-n
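For readers on later NumPy versions: the override hook discussed here was, as far as I know, eventually released under the name `__array_ufunc__` (NumPy 1.13) rather than `__numpy_ufunc__`, and it does cover `.reduce`. A minimal sketch of a subclass using it to avoid 0-d instances follows. `ScalarOnReduce` is a hypothetical class, and the sketch does not handle `out=` arguments or multi-output ufuncs.

```python
import numpy as np


class ScalarOnReduce(np.ndarray):
    """Hypothetical subclass that intercepts ufunc calls, including .reduce."""

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # Strip the subclass from the inputs so the default machinery runs
        # on plain ndarrays and does not recurse back into this method.
        plain = tuple(
            np.asarray(x) if isinstance(x, ScalarOnReduce) else x
            for x in inputs
        )
        result = getattr(ufunc, method)(*plain, **kwargs)
        # Sketch only: `out=` arguments and multi-output ufuncs are not handled.
        if np.ndim(result) == 0:
            return result  # hand back a plain scalar, never a 0-d subclass
        return np.asarray(result).view(ScalarOnReduce)


a = np.array([1.0, 2.0]).view(ScalarOnReduce)
total = np.add.reduce(a)  # routed through __array_ufunc__ with method="reduce"
shifted = a + 1           # routed with method="__call__"

print(type(total).__name__, type(shifted).__name__)
```

With this hook the subclass sees `np.add.reduce` directly, so the result no longer depends on whether the caller happened to use the `sum()` method.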
participants (2)
- Charles R Harris
- Nathaniel Smith