bug in lexsort with two different dtypes?
In [1]: intArr1 = numpy.array([0, 1, 2, 2, 1, 5, 5, 5])
In [2]: intArr2 = numpy.array([1, 1, 1, 2, 2, 2, 3, 4])
In [3]: charArr = numpy.array(['a', 'a', 'a', 'b', 'b', 'b', 'c', 'd'])

Here I sort two int arrays. As expected, intArr2 dominates intArr1, but the items with the same intArr2 values are sorted forwards according to intArr1:

In [6]: numpy.lexsort((intArr1, intArr2))
Out[6]: array([0, 1, 2, 3, 4, 5, 6, 7])

This, however, looks like a bug to me. Here I sort an int array and a str array. As expected, charArr dominates intArr1, but the items with the same charArr values are sorted *backwards* according to intArr1:

In [5]: numpy.lexsort((intArr1, charArr))
Out[5]: array([2, 1, 0, 5, 4, 3, 6, 7])

Is this a bug or am I missing something?

Tom
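For anyone reproducing the report, here is a minimal sketch that compares numpy.lexsort against a pure-Python reference. The helper name reference_lexsort is hypothetical (not part of NumPy); it relies only on the documented rule that the last key passed to lexsort is the primary key, and on Python's sorted() being stable.

```python
import numpy as np

int_arr1 = np.array([0, 1, 2, 2, 1, 5, 5, 5])
char_arr = np.array(['a', 'a', 'a', 'b', 'b', 'b', 'c', 'd'])


def reference_lexsort(*keys):
    """Hypothetical pure-Python reference: the LAST key is the primary key."""
    n = len(keys[0])
    # Build (primary, ..., least significant) tuples; sorted() is stable.
    return sorted(range(n), key=lambda i: tuple(k[i] for k in reversed(keys)))


expected = reference_lexsort(int_arr1.tolist(), char_arr.tolist())
actual = np.lexsort((int_arr1, char_arr)).tolist()

print("reference:", expected)
print("lexsort:  ", actual)
```

With the arrays exactly as typed in the post, the reference order is [0, 1, 2, 4, 3, 5, 6, 7], because intArr1[4] = 1 sorts before intArr1[3] = 2 inside the 'b' group. A NumPy release that includes the fix should agree with it; the outputs quoted in the post may have come from slightly different input.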
On 6/26/07, Tom Denniston wrote:
> Is this a bug or am I missing something?
Looks like a bug.

In [12]: numpy.argsort([charArr], kind='m')
Out[12]: array([[2, 1, 0, 5, 4, 3, 6, 7]])

In [13]: numpy.argsort([intArr2], kind='m')
Out[13]: array([[0, 1, 2, 3, 4, 5, 6, 7]])

Both of these are stable sorts (kind='m' selects mergesort), and since the elements are already in order, both should return [[0, 1, 2, 3, 4, 5, 6, 7]]. Actually, I think they should return [0, 1, 2, 3, 4, 5, 6, 7]; I'm not sure why the returned array is 2D, and I suspect that is a bug also.

As to why the string array sorts incorrectly, I am not sure. It could be that the sort isn't stable, there could be a stride error, or the comparison could be returning wrong values. My bet is on the first.

Please file a ticket on this.

Chuck
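The stability argument above can be checked mechanically. This is a sketch, not NumPy's own test: order_is_stable is a hypothetical helper that takes the values and an argsort result and verifies that equal elements keep their original relative order. It uses kind='stable', which newer NumPy releases accept, rather than the abbreviated kind='m' from the session.

```python
import numpy as np


def order_is_stable(values, order):
    """Hypothetical check: do equal elements keep their original order?"""
    values = np.asarray(values)
    order = np.asarray(order)
    sorted_vals = values[order]
    # Positions where neighbouring sorted elements compare equal (ties).
    ties = sorted_vals[1:] == sorted_vals[:-1]
    # For a stable sort, tied neighbours must have increasing source indices.
    return bool(np.all(order[1:][ties] > order[:-1][ties]))


char_arr = np.array(['a', 'a', 'a', 'b', 'b', 'b', 'c', 'd'])

# The order reported in this thread for the string array.
reported = [2, 1, 0, 5, 4, 3, 6, 7]
print(order_is_stable(char_arr, reported))

# What a stable argsort gives on the NumPy you have installed.
current = np.argsort(char_arr, kind="stable")
print(order_is_stable(char_arr, current))
```

The reported order fails the check (ties come back reversed), which is consistent with the guess that the string sort was not stable.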
On 6/26/07, Charles R Harris wrote:
> I'm not sure why the returned array is 2D and I suspect that is a bug also.
Never mind the 2D thing; that was pilot error in changing lexsort to argsort. charArr should not be in a list:

In [25]: numpy.argsort(charArr, kind='m', axis=0)
Out[25]: array([2, 1, 0, 5, 4, 3, 6, 7])

Works just fine (the result is 1D, as it should be).

Chuck
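To illustrate the pilot error: lexsort takes a sequence of keys, but argsort takes a single array, so wrapping the array in a list hands argsort a 2-D input of shape (1, 8). A small sketch, using kind='stable' as an assumption in place of the abbreviated kind='m':

```python
import numpy as np

char_arr = np.array(['a', 'a', 'a', 'b', 'b', 'b', 'c', 'd'])

# A list containing one array is converted to a 2-D array of shape (1, 8),
# so argsort sorts along the last axis and returns a 2-D result.
wrapped = np.argsort([char_arr], kind="stable")

# Passing the array itself gives the expected 1-D result.
plain = np.argsort(char_arr, kind="stable")

print(wrapped.shape, plain.shape)
```

The single row of the 2-D result holds the same indices as the 1-D call, so only the shape differs.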
On 6/26/07, Tom Denniston wrote:
> Is this a bug or am I missing something?
It was a bug. It is fixed in svn.

Chuck
thanks
On 6/30/07, Charles R Harris wrote:
> It was a bug. It is fixed in svn.
_______________________________________________ Numpydiscussion mailing list Numpydiscussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpydiscussion
participants (2)

Charles R Harris

Tom Denniston