[Numpy-discussion] Bug in numpy std, etc. with other data structures?

Bruce Southey bsouthey at gmail.com
Sat Sep 17 23:11:28 EDT 2011


On Sat, Sep 17, 2011 at 10:00 PM, Wes McKinney <wesmckinn at gmail.com> wrote:
> On Sat, Sep 17, 2011 at 10:50 PM, Bruce Southey <bsouthey at gmail.com> wrote:
>> On Sat, Sep 17, 2011 at 4:12 PM, Wes McKinney <wesmckinn at gmail.com> wrote:
>>> On Sat, Sep 17, 2011 at 4:48 PM, Skipper Seabold <jsseabold at gmail.com> wrote:
>>>> Just ran into this. Any objections to having numpy.std and other
>>>> functions in core/fromnumeric.py call asanyarray before trying to use
>>>> the array's method? Other data structures like pandas and larry define
>>>> their own std method, for instance, and this doesn't allow them to
>>>> pass through. I'm inclined to say that the issue is with numpy, though
>>>> maybe the data structures shouldn't shadow numpy array methods while
>>>> altering the signature. I dunno.
>>>>
>>>> df = pandas.DataFrame(np.random.random((10,5)))
>>>>
>>>> np.std(df,axis=0)
>>>> <snip>
>>>> TypeError: std() got an unexpected keyword argument 'dtype'
>>>>
>>>> np.std(np.asanyarray(df),axis=0)
>>>> array([ 0.30883352,  0.3133324 ,  0.26517361,  0.26389029,  0.20022444])
>>>>
>>>> Though I don't think this would work with larry yet.
>>>>
>>>> Pull request: https://github.com/numpy/numpy/pull/160
>>>>
>>>> Skipper
>>>> _______________________________________________
>>>> NumPy-Discussion mailing list
>>>> NumPy-Discussion at scipy.org
>>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
>> numpy.std() does accept array-like input, which obviously means that
>> np.std([1,2,3,5]) works, making an asanyarray call a total waste of CPU
>> time. Clearly a pandas object is not array-like input (as Wes points out
>> below), so an error is correct. Doing this type of 'fix' will have
>> unintended consequences when other non-numpy objects are incorrectly
>> passed to numpy functions. Rather, you should determine why 'array-like'
>> failed here IF you think a pandas object is either array-like or a numpy
>> object.
>
> No, it is failing because np.std takes the
> EAFP/duck-typing approach:
>
> try:
>    std = a.std
> except AttributeError:
>    return _wrapit(a, 'std', axis, dtype, out, ddof)
> return std(axis, dtype, out, ddof)
>
> Indeed DataFrame has an std method but it doesn't have the same
> function signature as ndarray.std.
>

Thanks for the clarification - see, Robert, I am not making things up!
Bruce
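
The dispatch Wes quotes can be reproduced without pandas. What follows is
only a sketch: LabeledFrame, std_duck_typed and std_convert_first are
hypothetical names made up for illustration, not numpy or pandas code. The
method is called with keywords here (an assumption, chosen so the error
matches the "unexpected keyword argument" message in Skipper's traceback).

```python
import numpy as np


class LabeledFrame:
    """Hypothetical stand-in for a container such as DataFrame."""

    def __init__(self, values):
        self.values = np.asarray(values, dtype=float)

    def std(self, axis=0):
        # Narrower signature than ndarray.std: no dtype, out, or ddof.
        return self.values.std(axis=axis)

    def __array__(self, dtype=None, copy=None):
        # Lets numpy convert the object to an ndarray on request.
        return self.values if dtype is None else self.values.astype(dtype)


def std_duck_typed(a, axis=None, dtype=None, out=None, ddof=0):
    """EAFP dispatch: prefer the object's own std method if it has one."""
    try:
        std = a.std
    except AttributeError:
        # Plain sequences such as lists have no std and are converted first.
        return np.asarray(a).std(axis=axis, dtype=dtype, out=out, ddof=ddof)
    # Assumption: keywords are forwarded, so a narrower std() raises TypeError.
    return std(axis=axis, dtype=dtype, out=out, ddof=ddof)


def std_convert_first(a, axis=None, dtype=None, out=None, ddof=0):
    """The proposed alternative: convert with asanyarray before dispatching."""
    return np.asanyarray(a).std(axis=axis, dtype=dtype, out=out, ddof=ddof)


frame = LabeledFrame([[1.0, 2.0], [3.0, 6.0]])
print(std_convert_first(frame, axis=0))   # goes through __array__
print(std_duck_typed([1, 2, 3, 5]))       # list falls back to conversion
try:
    std_duck_typed(frame, axis=0)
except TypeError as exc:
    print("TypeError:", exc)
```

A list never reaches the object's own method, which is why np.std([1,2,3,5])
works, while an object that shadows std with a different signature fails
before any conversion is attempted.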

>>
>>>
>>> Note I've no real intention of making DataFrame fully ndarray-like--
>>> but it's nice to be able to type:
>>>
>>> df.std(axis=0)
>>> df.std(axis=1)
>>> np.sqrt(df)
>>>
>>> etc. which works the same as ndarray. I suppose the
>>> __array__/__array_wrap__ interface is there largely as a convenience.
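
As a rough illustration of the __array__ half of that interface, the sketch
below uses a hypothetical Wrapped class (not pandas code). It assumes only
that numpy calls __array__ when it needs an ndarray; __array_wrap__ is left
out on purpose, so results come back as plain ndarrays rather than being
rewrapped.

```python
import numpy as np


class Wrapped:
    """Hypothetical container exposing its data through __array__."""

    def __init__(self, values):
        self._values = np.asarray(values, dtype=float)

    def __array__(self, dtype=None, copy=None):
        # numpy calls this hook whenever it needs an ndarray for the object.
        if dtype is None:
            return self._values
        return self._values.astype(dtype)


w = Wrapped([[1.0, 4.0], [9.0, 16.0]])
roots = np.sqrt(w)            # the ufunc converts w through __array__
as_array = np.asanyarray(w)   # explicit conversion uses the same hook
print(type(roots).__name__, as_array.shape)
```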
>>>
>>
>> I think the only way for pandas or any other numpy derivative to
>> overcome this is to get into numpy/scipy. After all, Travis opened the
>> discussion for Numpy 3, which you could still address.
>>
>> Bruce
>> PS Good luck on the ddof thing given the past discussions on it!
>>