[Numpy-discussion] 1.6.2 no more unique for rows
Nathaniel Smith
njs at pobox.com
Wed May 30 06:59:35 EDT 2012
On Tue, May 29, 2012 at 7:42 PM, Charles R Harris
<charlesr.harris at gmail.com> wrote:
>
>
> On Mon, May 28, 2012 at 9:18 PM, <josef.pktd at gmail.com> wrote:
>>
>>
>> https://github.com/numpy/numpy/commit/74b9f5eef8fac643bf9012dbb2ac6b4b19f46892
>> broke return_inverse for structured arrays, because of the use of
>> mergesort
>>
>> I'm using structured dtypes to get uniques and return_inverse by rows
>>
>> >>> groups = np.random.randint(0,4,size=(10,2))
>> >>> groups_ = groups.view([('',groups.dtype)]*groups.shape[1]).flatten()
>> >>> groups
>> array([[0, 2],
>>        [1, 2],
>>        [1, 1],
>>        [3, 1],
>>        [3, 1],
>>        [2, 1],
>>        [1, 0],
>>        [3, 3],
>>        [3, 2],
>>        [0, 0]])
>> >>> groups_
>> array([(0, 2), (1, 2), (1, 1), (3, 1), (3, 1), (2, 1), (1, 0), (3, 3),
>>        (3, 2), (0, 0)],
>>       dtype=[('f0', '<i4'), ('f1', '<i4')])
>>
>> >>> np.argsort(groups_)
>> array([9, 0, 6, 2, 1, 5, 4, 3, 8, 7])
>>
>> >>> np.argsort(groups_, kind='mergesort')
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in <module>
>>   File "C:\Python26\lib\site-packages\numpy\core\fromnumeric.py", line 679, in argsort
>>     return argsort(axis, kind, order)
>> TypeError: requested sort not available for type
>>
>> >>> uni, uni_idx, uni_inv = np.unique(groups_, return_index=True,
>> ...                                   return_inverse=True)
>> >>> uni_inv
>> array([1, 4, 3, 6, 6, 5, 2, 8, 7, 0])
>>
>> exception in numpy 1.6.2rc2 (as reported by Debian for statsmodels)
>>
>
> I've been putting off, um, planning to implement the different sort kinds for
> object/structured arrays for a while, sounds like it needs to get done.
So I guess this is a 1.6.1 -> 1.6.2 regression, and presumably we
won't be landing any new sort implementations in the 1.6 branch.
Should we be thinking about reverting this and releasing a 1.6.3? (I
don't know if it's worth it, but it seems like something we should
think about either way.)
Same question applies to 1.7 too -- obviously the change to unique()
is a good one, but maybe it has to wait until mergesort can handle
structured dtypes?
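In the meantime, code that needs unique rows with return_inverse could
sidestep the structured-dtype sort entirely. The following is only a rough
sketch, assuming a plain 2-D array of a sortable dtype; `unique_rows` is a
hypothetical helper name, not something in numpy. It relies on np.lexsort,
which is a stable indirect sort, instead of argsort(kind='mergesort') on a
structured view.

```python
import numpy as np


def unique_rows(a):
    """Sketch: unique rows of a 2-D array, with first-occurrence index and inverse.

    Hypothetical helper for illustration; not part of numpy.
    """
    a = np.asarray(a)
    # lexsort treats the LAST key as the primary one, so reverse the columns
    # to sort by column 0 first, then column 1, and so on. The sort is stable.
    order = np.lexsort(a.T[::-1])
    sorted_rows = a[order]
    # flag[i] is True where sorted row i differs from the row before it,
    # i.e. at the start of each run of identical rows.
    flag = np.ones(len(a), dtype=bool)
    flag[1:] = (sorted_rows[1:] != sorted_rows[:-1]).any(axis=1)
    uni = sorted_rows[flag]
    # Because the sort is stable, this is the first occurrence of each row.
    uni_idx = order[flag]
    # Map every original row to the position of its value in `uni`.
    uni_inv = np.empty(len(a), dtype=np.intp)
    uni_inv[order] = np.cumsum(flag) - 1
    return uni, uni_idx, uni_inv


# Same data as in Josef's example above.
groups = np.array([[0, 2], [1, 2], [1, 1], [3, 1], [3, 1],
                   [2, 1], [1, 0], [3, 3], [3, 2], [0, 0]])
uni, uni_idx, uni_inv = unique_rows(groups)
```

On Josef's `groups` this should give the same uni_inv as the 1.6.1 output
quoted above, and `uni[uni_inv]` reconstructs the original rows.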
-N