unique return_index order?

The documentation of numpy.unique http://docs.scipy.org/doc/numpy/reference/generated/numpy.unique.html does not seem to promise that return_index=True will always index the *first* occurrence of each unique item, which I believe is the current behavior. A promise would be nice. Is it intended? Alan Isaac

On Fri, Mar 21, 2014 at 8:26 PM, Alan G Isaac <alan.isaac@gmail.com> wrote:
The documentation of numpy.unique http://docs.scipy.org/doc/numpy/reference/generated/numpy.unique.html does not seem to promise that return_index=True will always index the *first* occurrence of each unique item, which I believe is the current behavior.
A promise would be nice. Is it intended?
AFAIU it's not, or it was in version, but shouldn't be. ?? I think this broke return_inverse in some cases if both were set to True. I haven't kept track of the problems, and the code still seems to be the same one that was changed to stable sorting. Josef
Alan Isaac _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion

On Fri, Mar 21, 2014 at 6:26 PM, Alan G Isaac <alan.isaac@gmail.com> wrote:
The documentation of numpy.unique http://docs.scipy.org/doc/numpy/reference/generated/numpy.unique.html does not seem to promise that return_index=True will always index the *first* occurrence of each unique item, which I believe is the current behavior.
A promise would be nice. Is it intended?
Yes, it is intended, although the required mergesort wasn't available for all types before numpy 1.7. Chuck
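
A minimal sketch of the behavior being asked about, assuming a NumPy version where the stable sort is used (1.7 or later, per the message above). The array `a` and the variable names are illustrative only, not taken from the thread.

```python
import numpy as np

# Small array with repeated values (hypothetical example data).
a = np.array([3, 1, 3, 2, 1, 2])

# return_index=True also returns, for each unique value,
# an index into `a` where that value occurs.
uniq, first_idx = np.unique(a, return_index=True)

# Cross-check: position of the first occurrence of each unique value,
# computed by a direct (slow) search.
expected = np.array([np.flatnonzero(a == u)[0] for u in uniq])

print(uniq)       # [1 2 3]
print(first_idx)  # [1 3 0]
```

Here `first_idx` matches `expected`, i.e. each index points at the first occurrence.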

On Fri, Mar 21, 2014 at 8:46 PM, Charles R Harris <charlesr.harris@gmail.com> wrote:
On Fri, Mar 21, 2014 at 6:26 PM, Alan G Isaac <alan.isaac@gmail.com> wrote:
The documentation of numpy.unique http://docs.scipy.org/doc/numpy/reference/generated/numpy.unique.html does not seem to promise that return_index=True will always index the *first* occurrence of each unique item, which I believe is the current behavior.
A promise would be nice. Is it intended?
Yes, it is intended, although the required mergesort wasn't available for all types before numpy 1.7.
Does this mean return_inverse works again for all cases, even with return_index? I removed return_index from my code in statsmodels because I make frequent use of return_inverse, which was broken. We no longer have any unit tests in statsmodels that use both return_xxx options. Josef
Chuck

On Fri, Mar 21, 2014 at 6:49 PM, <josef.pktd@gmail.com> wrote:
On Fri, Mar 21, 2014 at 8:46 PM, Charles R Harris <charlesr.harris@gmail.com> wrote:
On Fri, Mar 21, 2014 at 6:26 PM, Alan G Isaac <alan.isaac@gmail.com> wrote:
The documentation of numpy.unique http://docs.scipy.org/doc/numpy/reference/generated/numpy.unique.html does not seem to promise that return_index=True will always index the *first* occurrence of each unique item, which I believe is the current behavior.
A promise would be nice. Is it intended?
Yes, it is intended, although the required mergesort wasn't available for all types before numpy 1.7.
Does this mean return_inverse works again for all cases, even with return_index?
I removed return_index from my code in statsmodels because I make frequent use of return_inverse, which was broken. We don't have any unittests in statsmodels anymore that use both return_xxx.
I don't know, needs checking. Seems to work now with a simple trial array of integers. Chuck
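
A quick check along these lines could look like the sketch below. It is not the trial Chuck ran; the seed, array size, and variable names are assumptions made for illustration. It tests that the outputs of return_index and return_inverse are consistent with each other on a 1-D integer array.

```python
import numpy as np

# Hypothetical trial data: 20 small integers with many repeats.
rng = np.random.RandomState(0)
a = rng.randint(0, 4, size=20)

uniq, idx, inv = np.unique(a, return_index=True, return_inverse=True)

# return_index: picking a[idx] must reproduce the unique values.
index_ok = np.array_equal(a[idx], uniq)

# return_inverse: uniq[inv] must reconstruct the original array.
inverse_ok = np.array_equal(uniq[inv], a)

print(index_ok, inverse_ok)
```

Both flags should be True when the two options work together; a failure of `inverse_ok` would correspond to the breakage described earlier in the thread.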

On Fri, Mar 21, 2014 at 9:01 PM, Charles R Harris <charlesr.harris@gmail.com> wrote:
On Fri, Mar 21, 2014 at 6:49 PM, <josef.pktd@gmail.com> wrote:
On Fri, Mar 21, 2014 at 8:46 PM, Charles R Harris <charlesr.harris@gmail.com> wrote:
On Fri, Mar 21, 2014 at 6:26 PM, Alan G Isaac <alan.isaac@gmail.com> wrote:
The documentation of numpy.unique http://docs.scipy.org/doc/numpy/reference/generated/numpy.unique.html does not seem to promise that return_index=True will always index the *first* occurrence of each unique item, which I believe is the current behavior.
A promise would be nice. Is it intended?
Yes, it is intended, although the required mergesort wasn't available for all types before numpy 1.7.
Does this mean return_inverse works again for all cases, even with return_index?
I removed return_index from my code in statsmodels because I make frequent use of return_inverse, which was broken. We don't have any unittests in statsmodels anymore that use both return_xxx.
I don't know, needs checking. Seems to work now with a simple trial array of integers.
My example from May 2012 (thread "1.6.2 no more unique for rows") works fine on Python 3.3 with numpy 1.7.1:
>>> import numpy as np
>>> groups = np.random.randint(0, 4, size=(10, 2))
>>> groups_ = groups.view([('', groups.dtype)] * groups.shape[1]).flatten()
>>> uni, uni_idx, uni_inv = np.unique(groups_, return_index=True, return_inverse=True)
>>> uni
array([(0, 2), (0, 3), (1, 0), (2, 1), (2, 2), (3, 2), (3, 3)],
      dtype=[('f0', '<i4'), ('f1', '<i4')])
>>> uni_inv
array([1, 6, 3, 4, 5, 3, 2, 5, 0, 2], dtype=int32)
>>> np.__version__
'1.7.1'
Thanks, Josef
Chuck

On Fri, Mar 21, 2014 at 9:27 PM, <josef.pktd@gmail.com> wrote:
On Fri, Mar 21, 2014 at 9:01 PM, Charles R Harris <charlesr.harris@gmail.com> wrote:
On Fri, Mar 21, 2014 at 6:49 PM, <josef.pktd@gmail.com> wrote:
On Fri, Mar 21, 2014 at 8:46 PM, Charles R Harris <charlesr.harris@gmail.com> wrote:
On Fri, Mar 21, 2014 at 6:26 PM, Alan G Isaac <alan.isaac@gmail.com> wrote:
The documentation of numpy.unique http://docs.scipy.org/doc/numpy/reference/generated/numpy.unique.html does not seem to promise that return_index=True will always index the *first* occurrence of each unique item, which I believe is the current behavior.
A promise would be nice. Is it intended?
Yes, it is intended, although the required mergesort wasn't available for all types before numpy 1.7.
Summary, AFAICS: since numpy 1.6.2, np.unique has used mergesort when return_index=True, which provides a stable sort. Josef
Does this mean return_inverse works again for all cases, even with return_index?
I removed return_index from my code in statsmodels because I make frequent use of return_inverse, which was broken. We don't have any unittests in statsmodels anymore that use both return_xxx.
I don't know, needs checking. Seems to work now with a simple trial array of integers.
My example from May 2012 (thread "1.6.2 no more unique for rows") works fine on Python 3.3 with numpy 1.7.1:
>>> import numpy as np
>>> groups = np.random.randint(0, 4, size=(10, 2))
>>> groups_ = groups.view([('', groups.dtype)] * groups.shape[1]).flatten()
>>> uni, uni_idx, uni_inv = np.unique(groups_, return_index=True, return_inverse=True)
>>> uni
array([(0, 2), (0, 3), (1, 0), (2, 1), (2, 2), (3, 2), (3, 3)],
      dtype=[('f0', '<i4'), ('f1', '<i4')])
>>> uni_inv
array([1, 6, 3, 4, 5, 3, 2, 5, 0, 2], dtype=int32)
>>> np.__version__
'1.7.1'
Thanks,
Josef
Chuck
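
Why a stable sort yields first-occurrence indices can be sketched as follows. This is an illustrative reimplementation written for this thread's summary, not NumPy's actual source; the helper name `unique_first_index` is hypothetical.

```python
import numpy as np

def unique_first_index(a):
    """Sketch: unique values plus the index of each first occurrence.

    Hypothetical helper, not NumPy's implementation.
    """
    a = np.asarray(a).ravel()
    # A stable sort keeps equal elements in their original order, so
    # within each run of equal values the smallest index comes first.
    order = np.argsort(a, kind="mergesort")
    s = a[order]
    # Mark the first element of every run of equal values.
    is_first = np.ones(s.shape, dtype=bool)
    is_first[1:] = s[1:] != s[:-1]
    return s[is_first], order[is_first]

a = np.array([2, 0, 2, 1, 0, 1, 2])
u, i = unique_first_index(a)
print(u)  # [0 1 2]
print(i)  # [1 3 0]
```

With an unstable sort, the order of equal elements in `order` is unspecified, so `order[is_first]` could point at any occurrence rather than the first.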
participants (3): Alan G Isaac, Charles R Harris, josef.pktd@gmail.com