[Numpy-discussion] [pydata] Re: [SciPy-Dev] 1.8.0rc1

Nathaniel Smith njs at pobox.com
Tue Oct 1 09:58:24 EDT 2013


I bet the difference is:

In master, nansum ultimately calls arr.sum(...), which will be
intercepted by Series.sum.

In 1.8.x, nansum ultimately calls np.add.reduce(...), which Series can't
intercept, so the result comes back wrapped as a 0-d Series.

AFAICT the np.add.reduce(a, ...) call could just be replaced with
a.sum(...), but I might be missing something... surely there must have
been some reason it was written that way in the first place?
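
To make the distinction concrete, here is a minimal sketch. Tracked is a
hypothetical stand-in for a subclass like Series that overrides sum(); it
only counts calls and is not what pandas actually does.

```python
import numpy as np


class Tracked(np.ndarray):
    """Hypothetical ndarray subclass that overrides sum(), as Series does."""

    sum_calls = 0  # how many times the override has run

    def sum(self, *args, **kwargs):
        type(self).sum_calls += 1
        # Drop down to a plain ndarray so the result is an ordinary scalar.
        return np.asarray(self).sum(*args, **kwargs)


a = np.array([1.0, 2.0]).view(Tracked)

via_method = a.sum()                    # goes through Tracked.sum
calls_after_method = Tracked.sum_calls

via_ufunc = np.add.reduce(a)            # never looks at Tracked.sum
calls_after_ufunc = Tracked.sum_calls

print(type(via_method).__name__, calls_after_method)
print(type(via_ufunc).__name__, calls_after_ufunc)
```

The method call goes through the override, while the ufunc reduction
bypasses it and the call counter does not move. That is the difference I
suspect between master and 1.8.x.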

On Tue, Oct 1, 2013 at 2:48 PM, Jeff <jeffreback at gmail.com> wrote:
> so...looks like a backport issue?
>
>
> On Tuesday, October 1, 2013 6:41:26 AM UTC-4, Nathaniel Smith wrote:
>>
>> On Tue, Oct 1, 2013 at 3:27 AM, Charles R Harris
>> <charles... at gmail.com> wrote:
>> >> On Mon, Sep 30, 2013 at 5:12 PM, Christoph Gohlke <cgo... at uci.edu> wrote:
>> >> NumPy 1.8.0rc1 looks good. All tests pass on Windows and most 3rd party
>> >> packages test OK now. Thank you.
>> >>
>> >> A few tests still fail in the following packages when run with
>> >> numpy-MKL-1.8.0rc1-win-amd64-py3.3 compared to
>> >> numpy-MKL-1.7.1-win-amd64-py3.3:
>> >>
>> >> 1) Pandas 0.12.0
>> >>
>> >> ```
>> >> ======================================================================
>> >> FAIL: test_nansum_buglet (pandas.tests.test_series.TestNanops)
>> >> ----------------------------------------------------------------------
>> >> Traceback (most recent call last):
>> >>   File "X:\Python33\lib\site-packages\pandas\tests\test_series.py", line 254, in test_nansum_buglet
>> >>     assert_almost_equal(result, 1)
>> >>   File "X:\Python33\lib\site-packages\pandas\util\testing.py", line 134, in assert_almost_equal
>> >>     np.testing.assert_(isiterable(b))
>> >>   File "D:\Dev\Compile\Test\numpy-build\numpy\testing\utils.py", line 44, in assert_
>> >>     raise AssertionError(msg)
>> >> AssertionError
>> >> ```
>> >>
>> >> Possibly related:
>> >>
>> >> ```
>> >> >>> import numpy as np
>> >> >>> from pandas import Series
>> >> >>> s = Series([0.0])
>> >> >>> result = np.nansum(s)
>> >> >>> print(result)
>> >> Traceback (most recent call last):
>> >>   File "<stdin>", line 1, in <module>
>> >>   File "X:\Python33\lib\site-packages\pandas\core\base.py", line 19, in __str__
>> >>     return self.__unicode__()
>> >>   File "X:\Python33\lib\site-packages\pandas\core\series.py", line 1115, in __unicode__
>> >>     length=len(self) > 50,
>> >> TypeError: len() of unsized object
>> >> ```
>> [...]
>> >
>> > The pandas test passes for current pandas dev, so it looks like a bug on
>> > their end that has been taken care of.
>> >
>> > test_nansum_buglet (__main__.TestNanops) ... ok
>>
>> I'm concerned about this. 0.12.0 is currently the latest pandas
>> release, so even if it is a bug on their side, we're going to be
>> converting it from a latent bug to a real bug when we release...
>> CC'ing pydata, do you guys have any insight into what changed here?
>>
>> The code is:
>>   s = pandas.Series([1.0, np.nan])
>>   result = np.nansum(s)
>> With numpy 1.7.1, 'result' comes out as a np.float64. With numpy
>> maintenance/1.8.x, 'result' comes out as a 0-d Series object. Series
>> is a subclass of ndarray, but it's supposed to always be 1-d, so all
>> kinds of stuff blows up as soon as you have a 0-d Series object.
>>
>> I'm not sure what changed in numpy's nansum; if I try this same test
>> with a simple no-op ndarray subclass:
>>   class MyArray(np.ndarray):
>>       pass
>>   np.nansum(np.array([1.0, np.nan]).view(MyArray))
>> then 1.7.1 and maintenance/1.8.x both act the same, and both return a
>> 0-d MyArray, so it's not just a question of whether we remembered to
>> handle subclasses at all.
>>
>> -n
>


