[Numpy-discussion] 1.6.2 no more unique for rows

Ralf Gommers ralf.gommers at googlemail.com
Wed May 30 17:55:12 EDT 2012


On Wed, May 30, 2012 at 5:39 PM, Charles R Harris <charlesr.harris at gmail.com> wrote:

>
>
> On Wed, May 30, 2012 at 4:59 AM, Nathaniel Smith <njs at pobox.com> wrote:
>
>> On Tue, May 29, 2012 at 7:42 PM, Charles R Harris
>> <charlesr.harris at gmail.com> wrote:
>> >
>> >
>> > On Mon, May 28, 2012 at 9:18 PM, <josef.pktd at gmail.com> wrote:
>> >>
>> >>
>> >>
>> >> https://github.com/numpy/numpy/commit/74b9f5eef8fac643bf9012dbb2ac6b4b19f46892
>> >> broke return_inverse for structured arrays, because of the use of
>> >> mergesort
>> >>
>> >> I'm using structured dtypes to get uniques and return_inverse by rows
>> >>
>> >> >>> groups = np.random.randint(0,4,size=(10,2))
>> >> >>> groups_ = groups.view([('',groups.dtype)]*groups.shape[1]).flatten()
>> >> >>> groups
>> >> array([[0, 2],
>> >>       [1, 2],
>> >>       [1, 1],
>> >>       [3, 1],
>> >>       [3, 1],
>> >>       [2, 1],
>> >>       [1, 0],
>> >>       [3, 3],
>> >>       [3, 2],
>> >>       [0, 0]])
>> >> >>> groups_
>> >> array([(0, 2), (1, 2), (1, 1), (3, 1), (3, 1), (2, 1), (1, 0), (3, 3),
>> >>       (3, 2), (0, 0)],
>> >>      dtype=[('f0', '<i4'), ('f1', '<i4')])
>> >>
>> >> >>> np.argsort(groups_)
>> >> array([9, 0, 6, 2, 1, 5, 4, 3, 8, 7])
>> >>
>> >> >>> np.argsort(groups_, kind='mergesort')
>> >> Traceback (most recent call last):
>> >>  File "<stdin>", line 1, in <module>
>> >>  File "C:\Python26\lib\site-packages\numpy\core\fromnumeric.py", line 679, in argsort
>> >>    return argsort(axis, kind, order)
>> >> TypeError: requested sort not available for type
>> >>
>> >> >>> uni, uni_idx, uni_inv = np.unique(groups_, return_index=True,
>> >> ...                                   return_inverse=True)
>> >> >>> uni_inv
>> >> array([1, 4, 3, 6, 6, 5, 2, 8, 7, 0])
>> >>
>> >> The np.unique call above raises an exception in numpy 1.6.2rc2 (as
>> >> reported by Debian for statsmodels)
>> >>
>> >
>> > I've been putting off, um, planning to implement the different sort
>> > kinds for object/structured arrays for a while, sounds like it needs
>> > to get done.
>>
>> So I guess this is a 1.6.1 -> 1.6.2 regression, and presumably we
>> won't be landing any new sort implementations in the 1.6 branch.
>> Should we be thinking about reverting this and releasing a 1.6.3? (I
>> don't know if it's worth it, but it seems like something we should
>> think about either way.)
>>
>> Same question applies to 1.7 too -- obviously the change to unique()
>> is a good one, but maybe it has to wait until mergesort can handle
>> structured dtypes?
>>
>>
> Should definitely be reverted if a 1.6.3 goes out.
>

But is a 1.6.3 required for this issue alone? It's a regression, but it
looks like a corner case and is already fixed in statsmodels. If there are
more users who are running into this problem though, I'm OK with doing a
1.6.3 release just for this.

Ralf
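
For readers hitting the regression discussed above, one possible workaround is
to avoid sorting the structured view altogether and sort the columns of the
original 2-D array with np.lexsort, which is a stable indirect sort. The sketch
below is only an illustration: the helper name unique_rows is hypothetical, it
is not the fix that went into statsmodels or numpy, and it has not been tested
against 1.6.2 itself.

```python
import numpy as np


def unique_rows(a):
    """Hypothetical helper: unique rows of a 2-D array, with the index of
    each row's first occurrence and the inverse mapping."""
    a = np.asarray(a)
    # np.lexsort treats the LAST key as the primary one, so the columns are
    # passed in reverse order to get row-wise lexicographic order.
    order = np.lexsort(a.T[::-1])
    sorted_rows = a[order]

    # Mark each sorted row that differs from the row before it.
    is_new = np.ones(len(a), dtype=bool)
    is_new[1:] = np.any(sorted_rows[1:] != sorted_rows[:-1], axis=1)

    uni = sorted_rows[is_new]
    # lexsort is stable, so the first row of each group is its earliest
    # occurrence in the input.
    uni_idx = order[is_new]

    # Number the groups in sorted order, then scatter back to input order.
    uni_inv = np.empty(len(a), dtype=np.intp)
    uni_inv[order] = np.cumsum(is_new) - 1
    return uni, uni_idx, uni_inv


# Same data as the `groups` array quoted in the thread.
groups = np.array([[0, 2], [1, 2], [1, 1], [3, 1], [3, 1],
                   [2, 1], [1, 0], [3, 3], [3, 2], [0, 0]])
uni, uni_idx, uni_inv = unique_rows(groups)
print(uni_inv)  # [1 4 3 6 6 5 2 8 7 0], matching uni_inv in the quoted session
```

Because the inverse is built from the sort order rather than from a second
sort of the structured view, uni[uni_inv] reproduces the original rows.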