[Numpy-discussion] numpy 1.10.1 reduce operation on recarrays
josef.pktd at gmail.com
Mon Oct 19 21:56:59 EDT 2015
On Fri, Oct 16, 2015 at 9:31 PM, Allan Haldane <allanhaldane at gmail.com>
wrote:
> On 10/16/2015 09:17 PM, josef.pktd at gmail.com wrote:
>
>>
>>
>> On Fri, Oct 16, 2015 at 8:56 PM, Allan Haldane <allanhaldane at gmail.com> wrote:
>>
>> On 10/16/2015 05:31 PM, josef.pktd at gmail.com wrote:
>> >
>> >
>> > On Fri, Oct 16, 2015 at 2:21 PM, Charles R Harris
>> > <charlesr.harris at gmail.com> wrote:
>> >
>> >
>> >
>> > On Fri, Oct 16, 2015 at 12:20 PM, Charles R Harris
>> > <charlesr.harris at gmail.com> wrote:
>> >
>> >
>> >
>> > On Fri, Oct 16, 2015 at 11:58 AM, <josef.pktd at gmail.com> wrote:
>> >
>> > Was there a change to reduce operations on recarrays in
>> > 1.10 or 1.10.1?
>> >
>> > Travis shows a new test failure in the statsmodels test suite
>> > with 1.10.1:
>> >
>> > ERROR: test suite for <class 'statsmodels.base.tests.test_data.TestRecarrays'>
>> >
>> >   File "/home/travis/miniconda/envs/statsmodels-test/lib/python2.7/site-packages/statsmodels-0.8.0-py2.7-linux-x86_64.egg/statsmodels/base/data.py", line 131, in _handle_constant
>> >     const_idx = np.where(self.exog.ptp(axis=0) == 0)[0].squeeze()
>> > TypeError: cannot perform reduce with flexible type
>> >
>> >
>> > Sorry for asking so late.
>> > (statsmodels is short on maintainers, and I'm distracted)
>> >
>> >
>> > statsmodels still has code to support recarrays and
>> > structured dtypes from the time before pandas became
>> > popular, but I don't think anyone is using them together
>> > with statsmodels anymore.
>> >
>> >
>> > There were several commits dealing with both recarrays and ufuncs, so
>> > this might well be a regression.
>> >
>> >
>> > A bisection would be helpful. Also, open an issue.
>> >
>> >
>> >
>> > The reason for the test failure might be somewhere else, hiding behind
>> > several layers of statsmodels, but it only started to show up with
>> > numpy 1.10.1.
>> >
>> > I already get the reduce exception with my currently installed numpy
>> > '1.9.2rc1':
>> >
>> > >>> x = np.random.random(9*3).view([('const', 'f8'),('x_1', 'f8'),
>> > ...     ('x_2', 'f8')]).view(np.recarray)
>> >
>> > >>> np.ptp(x, axis=0)
>> > Traceback (most recent call last):
>> >   File "<stdin>", line 1, in <module>
>> >   File "C:\programs\WinPython-64bit-3.4.3.1\python-3.4.3.amd64\lib\site-packages\numpy\core\fromnumeric.py", line 2047, in ptp
>> >     return ptp(axis, out)
>> > TypeError: cannot perform reduce with flexible type
>> >
>> >
>> > Sounds like fun, and I don't even know how to automatically bisect.
>> >
>> > Josef
>>
>> That example isn't the problem (ptp should definitely fail on structured
>> arrays), but I've tracked down what is: it has to do with views of
>> record arrays.
>>
>> The fix looks simple, I'll get it in for the next release.
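
The distinction drawn above can be sketched as follows. This is a minimal illustration, not code from statsmodels or NumPy: it assumes a plain structured array with three float64 fields (the same dtype as in the earlier example) and shows that ptp raises a TypeError on the structured array but works once the data is viewed as a regular 2-D float array.

```python
import numpy as np

# Same field layout as the example earlier in the thread.
dt = [('const', 'f8'), ('x_1', 'f8'), ('x_2', 'f8')]
x = np.random.random(9 * 3).view(dt)  # structured array, shape (9,)

# Reductions such as ptp have no meaning for a structured dtype.
try:
    np.ptp(x, axis=0)
    ptp_failed = False
except TypeError:
    ptp_failed = True

# Viewing the same memory as a (9, 3) float array makes ptp well defined.
# This view is only valid because every field is float64.
plain = x.view((float, len(x.dtype.names)))
ranges = np.ptp(plain, axis=0)  # one range per column
```

The failing test in statsmodels expected the second form, a regular ndarray, at the point where ptp is called.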
>>
>>
>> Thanks,
>>
>> I realized that at that point in the statsmodels code we should have
>> only regular ndarrays, so the array conversion fails somewhere.
>>
>> AFAICS, the main helper function to convert is
>>
>>     def struct_to_ndarray(arr):
>>         return arr.view((float, len(arr.dtype.names)))
>>
>> which doesn't look like it will handle other dtypes than float64. Nobody
>> ever complained, so maybe our test suite is the only user of this.
>>
>> What is now the recommended way of converting structured
>> dtypes/recarrays to ndarrays?
>>
>> Josef
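
One possible answer to the question above, as a sketch only: the helper name `struct_to_ndarray_general` is hypothetical and not part of statsmodels or NumPy. Instead of reinterpreting memory with a view, it copies each field into a float column, so it does not depend on all fields being float64.

```python
import numpy as np

def struct_to_ndarray_general(arr):
    """Hypothetical helper: copy a 1-D structured array or recarray
    into a 2-D float ndarray, one column per field."""
    names = arr.dtype.names
    # Cast each field separately, so int and float fields can be mixed.
    columns = [np.asarray(arr[name], dtype=float) for name in names]
    return np.column_stack(columns)

# Mixed field types, which the view-based helper would not handle.
mixed = np.array([(1, 2.5), (3, 4.5)], dtype=[('a', 'i4'), ('b', 'f8')])
plain = struct_to_ndarray_general(mixed)
```

The trade-off is a copy instead of a view. Much later NumPy releases (1.16 and newer) also provide `numpy.lib.recfunctions.structured_to_unstructured` for this purpose; it did not exist at the time of this thread.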
>>
>
> Yes, that's the code I narrowed it down to as well. I think the code in
> statsmodels is fine; the problem is a bug that, I must admit, I
> introduced when changing the way views of recarrays work.
>
> If you are curious, the bug is in this line:
>
> https://github.com/numpy/numpy/blob/master/numpy/core/records.py#L467
>
> This line was intended to fix the problem that accessing a nested record
> array field would lose the 'np.record' dtype. I only considered void
> structured arrays, and had forgotten about sub-arrays, which statsmodels
> uses.
>
> I think the fix is to replace `issubclass(val.type, nt.void)` with
> `val.names` or something similar. I'll take a closer look soon.
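
Why the two checks differ can be sketched with plain dtypes. This is an illustration under assumptions, not the code in records.py: the function names `is_void_type` and `has_fields` are made up here, and `np.void` stands in for `nt.void`. A sub-array dtype such as `(float, 3)` is also of void type, but unlike a structured dtype it has no field names.

```python
import numpy as np

# Structured dtype with named fields, the case the original check targeted.
struct_dt = np.dtype([('const', 'f8'), ('x_1', 'f8'), ('x_2', 'f8')])
# Sub-array dtype, as passed to .view() in struct_to_ndarray.
sub_dt = np.dtype((float, 3))

def is_void_type(dt):
    # Mirrors the shape of the existing check: true for both dtypes.
    return issubclass(dt.type, np.void)

def has_fields(dt):
    # Mirrors the proposed check: true only when named fields exist.
    return dt.names is not None

checks = {
    'struct': (is_void_type(struct_dt), has_fields(struct_dt)),
    'sub': (is_void_type(sub_dt), has_fields(sub_dt)),
}
```

Only the field-name check separates the two cases, which is consistent with the fix proposed above.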
>
>
Another example, fresh from Travis, that might have the same source.
I didn't even know statsmodels uses recarrays in the models.
AssertionError:
Arrays are not almost equal to 7 decimals
(shapes (6,), (6, 3) mismatch)
x: recarray([(unprintable raw bytes)...
y: array([[ 1. , 0. , 0. ],
[-0.2794347, -0.100468 , -1.9709737],
[-0.0469873, -0.1728197, 0.0436493],...
Josef
>
> Allan
>