numpy 1.10.1 reduce operation on recarrays

Was there a change to reduce operations on recarrays in 1.10 or 1.10.1?

Travis shows a new test failure in the statsmodels test suite with 1.10.1:

ERROR: test suite for <class 'statsmodels.base.tests.test_data.TestRecarrays'>
  File "/home/travis/miniconda/envs/statsmodels-test/lib/python2.7/site-packages/statsmodels-0.8.0-py2.7-linux-x86_64.egg/statsmodels/base/data.py", line 131, in _handle_constant
    const_idx = np.where(self.exog.ptp(axis=0) == 0)[0].squeeze()
TypeError: cannot perform reduce with flexible type

Sorry for asking so late. (statsmodels is short on maintainers, and I'm distracted.)

statsmodels still has code to support recarrays and structured dtypes from the time before pandas became popular, but I don't think anyone is using them together with statsmodels anymore.

Josef
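For context, the failing line in `_handle_constant` finds constant columns by checking where the peak-to-peak range along axis 0 is zero. A minimal sketch of that idiom on a plain 2-D float ndarray follows. The array `exog` here is made up for illustration, and `np.ptp` is called as a function because newer NumPy releases no longer provide the `ndarray.ptp` method.

```python
import numpy as np

# Hypothetical design matrix: column 0 is a constant, column 1 varies.
exog = np.column_stack([np.ones(5), np.arange(5.0)])

# A column is constant exactly when max - min along axis 0 is zero.
col_range = np.ptp(exog, axis=0)
const_idx = np.where(col_range == 0)[0].squeeze()

print(const_idx)
```

This only works when `exog` is a regular numeric ndarray, which is why a recarray reaching this point triggers the error in the traceback.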

On Fri, Oct 16, 2015 at 2:21 PM, Charles R Harris <charlesr.harris@gmail.com> wrote:
The reason for the test failure might be hidden somewhere else behind several layers of statsmodels, but it only started to show up with numpy 1.10.1. I already get the reduce exception with my currently installed numpy '1.9.2rc1':

x = np.random.random(9*3).view([('const', 'f8'),('x_1', 'f8'), ('x_2', 'f8')]).view(np.recarray)

Sounds like fun, and I don't even know how to automatically bisect.

Josef
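The exception can be reproduced without statsmodels. A small sketch, using the same three-field layout as the snippet in this message: a reduction over a structured (flexible) dtype raises `TypeError`, while the same memory viewed as plain float64 reduces without trouble. The exact error message may differ between NumPy versions, so it is printed rather than assumed.

```python
import numpy as np

dt = [('const', 'f8'), ('x_1', 'f8'), ('x_2', 'f8')]
x = np.random.random(9 * 3).view(dt).view(np.recarray)  # shape (9,), structured dtype

try:
    np.ptp(x, axis=0)          # max/min reductions are undefined for structured dtypes
    reduce_failed = False
except TypeError as exc:
    reduce_failed = True
    print("reduce failed:", exc)

# Viewing the same memory as float64 is valid here because every field is 'f8'.
plain = np.asarray(x).view('f8').reshape(len(x), -1)
print(np.ptp(plain, axis=0).shape)
```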

On Fri, Oct 16, 2015 at 8:56 PM, Allan Haldane <allanhaldane@gmail.com> wrote:
Thanks, I realized that at that point in the statsmodels code we should have only regular ndarrays, so the array conversion fails somewhere.

AFAICS, the main helper function to convert is

    def struct_to_ndarray(arr):
        return arr.view((float, len(arr.dtype.names)))

which doesn't look like it will handle dtypes other than float64. Nobody ever complained, so maybe our test suite is the only user of this.

What is now the recommended way of converting structured dtypes/recarrays to ndarrays?

Josef
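On the conversion question: NumPy releases newer than the ones discussed in this thread (1.16 and later) ship `numpy.lib.recfunctions.structured_to_unstructured`, which covers this case and casts mixed field types to a common dtype. A hedged sketch comparing it with the view-based helper quoted in this message follows; the sample arrays are invented for illustration.

```python
import numpy as np
from numpy.lib.recfunctions import structured_to_unstructured

def struct_to_ndarray(arr):
    # View-based helper quoted in the thread; only meaningful when
    # every field is float64, since it reinterprets the raw memory.
    return arr.view((float, len(arr.dtype.names)))

# Homogeneous float64 fields: both approaches give the same (2, 3) array.
homog = np.array([(1.0, 2.0, 3.0), (1.0, 5.0, 6.0)],
                 dtype=[('const', 'f8'), ('x_1', 'f8'), ('x_2', 'f8')])
a = struct_to_ndarray(homog)
b = structured_to_unstructured(homog)

# Mixed field types: structured_to_unstructured casts to a common dtype
# instead of reinterpreting bytes.
mixed = np.array([(1, 2.5), (3, 4.5)], dtype=[('a', 'i4'), ('b', 'f8')])
c = structured_to_unstructured(mixed)

print(a.shape, b.shape, c.dtype)
```

The view-based helper avoids a copy but silently depends on the memory layout, whereas the recfunctions helper is explicit about casting.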

On 10/16/2015 09:17 PM, josef.pktd@gmail.com wrote:
Yes, that's the code I narrowed it down to as well. I think the code in statsmodels is fine; the problem is actually a bug that, I must admit, I introduced in changes to the way views of recarrays work.

If you are curious, the bug is in this line:

https://github.com/numpy/numpy/blob/master/numpy/core/records.py#L467

This line was intended to fix the problem that accessing a nested record array field would lose the 'np.record' dtype. I only considered void structured arrays, and had forgotten about sub-arrays, which statsmodels uses.

I think the fix is to replace `issubclass(val.type, nt.void)` with `val.names` or something similar. I'll take a closer look soon.

Allan
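To illustrate the distinction Allan describes: a sub-array dtype such as `(float, 3)` also reports `np.void` as its scalar type, yet it has no field names, so a check on `.type` alone cannot tell it apart from a structured dtype. A minimal sketch (the variable names are illustrative, and this is not the actual patch to records.py):

```python
import numpy as np

# Structured dtype with named fields, as in the statsmodels test data.
struct_dt = np.dtype([('const', 'f8'), ('x_1', 'f8'), ('x_2', 'f8')])

# Sub-array dtype, as produced by arr.view((float, n_fields)).
sub_dt = np.dtype((float, 3))

for label, dt in [('structured', struct_dt), ('sub-array', sub_dt)]:
    # Both are void-typed; only the structured dtype has names.
    print(label, issubclass(dt.type, np.void), dt.names)
```

Checking `names` (or `fields`) separates the two cases, which is the direction of the fix proposed in this message.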

On Fri, Oct 16, 2015 at 9:31 PM, Allan Haldane <allanhaldane@gmail.com> wrote:
Another example, fresh from Travis, that might have the same source. I didn't even know statsmodels uses recarrays in the models.

AssertionError: Arrays are not almost equal to 7 decimals
(shapes (6,), (6, 3) mismatch)
 x: recarray([[unprintable record bytes]...
 y: array([[ 1. , 0. , 0. ], [-0.2794347, -0.100468 , -1.9709737], [-0.0469873, -0.1728197, 0.0436493],...

Josef

Participants (3): Allan Haldane, Charles R Harris, josef.pktd@gmail.com