numpy sort is not working
Hi everyone,

I have a numpy array of dimensions

>>> allRics.shape
(583760, 1)

To sort the array, I set the dtype of the array as follows:

allRics.dtype = [('idx', np.float), ('opened', np.float), ('time', np.float), ('trdp1', np.float), ('trdp0', np.float), ('dt', np.float), ('value', np.float)]

>>> allRics.dtype
dtype([('idx', '<f8'), ('opened', '<f8'), ('time', '<f8'), ('trdp1', '<f8'), ('trdp0', '<f8'), ('dt', '<f8'), ('value', '<f8')])

I checked and the endianness in dtype is correct.

When I sort the array, the output array of sort is the same as the original array, without any change. I want to sort the allRics numpy array on 'time'. I do the following.

>>> x=np.sort(allRics,order='time')
>>> x[17330:17350]['time']
array([[ 61184.4  ],
       [ 61188.51 ],
       [ 61188.979],
       [ 61188.979],
       [ 61189.989],
       [ 61191.66 ],
       [ 61194.35 ],
       [ 61194.35 ],
       [ 61198.79 ],
       [ 61198.145],
       [ 36126.217],
       [ 36126.217],
       [ 36126.218],
       [ 36126.218],
       [ 36126.219],
       [ 36126.271],
       [ 36126.271],
       [ 36126.271],
       [ 36126.293],
       [ 36126.293]])

The time column doesn't change its order. Could someone please advise what is missing here? Is this related to the bug http://www.mail-archive.com/numpy-discussion@scipy.org/msg23060.html (from 2010)?

Regards,
Alok

Alok Jadhav
CREDIT SUISSE AG
GAT IT Hong Kong, KVAG 67
International Commerce Centre | Hong Kong | Hong Kong
Phone +852 2101 6274 | Mobile +852 9169 7172
alok.jadhav@credit-suisse.com | www.credit-suisse.com <http://www.credit-suisse.com/>

===============================================================================
Please access the attached hyperlink for an important electronic communications disclaimer:
http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html
===============================================================================
Hey Alok,

This is worth taking a look at. What version of NumPy are you using?

It is not directly related to the issue you referenced, as that was an endianness issue and your data is native-order.

Your example seems to work for me (with a simulated case on 1.6.1).

Best,
-Travis
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Hi Travis,

Very strange. I am on version 1.6.2 :(

What could I be missing? I started using numpy quite recently. Is there a way to share the data with you?

Regards,
Alok Jadhav
GAT IT Hong Kong
+852 2101 6274 (*852 6274)
On Tue, Sep 11, 2012 at 12:52 AM, Jadhav, Alok <alok.jadhav@credit-suisse.com> wrote:

> Hi Travis,
>
> Very strange. I am on version 1.6.2 :(
>
> What could I be missing? I started using numpy quite recently. Is there a way to share the data with you?

Hi Alok,

Typically, a self-contained example which reproduces the error and generates its own data is more useful / valuable than sharing datasets. If you could come up with this, that'd be great!

Be Well
Anthony
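A self-contained reproduction of the kind Anthony asks for could look like the sketch below. It assumes random numbers in place of the HDF5 trade data, 1000 rows instead of 583760, and np.float64 in place of the np.float alias used in the thread; plain, records and the other variable names are illustrative only. It reinterprets a 7-column float array as one record per row, which leaves a trailing axis of length 1, the same (N, 1) shape as in the original post. Since np.sort sorts along the last axis by default, this may be why the call returns the data unchanged, although nobody in the thread confirms it.

```python
import numpy as np

# Hypothetical stand-in for the trade data: 1000 rows, 7 float columns.
rng = np.random.default_rng(0)
plain = rng.random((1000, 7))

# Same field names as in the thread; np.float64 replaces the np.float alias,
# which newer NumPy releases no longer provide.
fields = ['idx', 'opened', 'time', 'trdp1', 'trdp0', 'dt', 'value']
rec_dtype = np.dtype([(name, np.float64) for name in fields])

# Reinterpret each row of 7 floats as one record.
# The result keeps a trailing axis of length 1.
records = plain.view(rec_dtype)
print(records.shape)

# np.sort defaults to axis=-1. That axis has length 1 here,
# so every "row" is already sorted and nothing moves.
same = np.sort(records, order='time')
print(np.array_equal(same['time'], records['time']))

# Sorting along axis 0, or flattening to 1-D first, orders the records by 'time'.
by_time = np.sort(records, axis=0, order='time')
flat = np.sort(records.ravel(), order='time')
print(bool(np.all(np.diff(flat['time']) >= 0)))
```

In this sketch, passing axis=0 or calling ravel() before the sort is enough to get the records ordered by 'time'.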
Anthony, Travis,

I understand how an example that generates the data would be useful, but it will be difficult to provide this in my case for the following reasons:

- Data is read from hdf5 files (>50 MB).
- I am attaching the code which is used to generate the data. It may shed some more light.
- The numpy.sort() example on the site works fine for a small array, so my guess is that only my data has the issue.
- FYI, I am on a Windows 7 machine with Python 2.6 and numpy 1.6.2.

    for j in range(0,len(rics)):
        ric=rics[j]
        trd=h5.getTrades(ric)
        idx=np.ones(len(trd))*j
        opened=np.zeros(len(trd))
        time=trd[0:,0]
        trdp1=trd[0:,1]
        trdp0=np.insert(trd[1:,1], 0, basePx[ric])
        dt=trd[0:,0]- np.insert(trd[0:-1,0], 0, trd[0,0])
        value=trd[0:,1]*trd[0:,2]
        ricData=np.array([idx,opened,time,trdp1,trdp0,dt,value])
        ricData=np.transpose(ricData)
        if allRics is None:
            allRics=ricData
        else:
            allRics=np.vstack((allRics, ricData))

    allRics.dtype=[('idx', np.float), ('opened', np.float), ('time', np.float),('trdp1',np.float),('trdp0',np.float),('dt',np.float),('value',np.float)]
    allRics=np.sort(allRics,order='time') # This doesn't work

Please notice that I am using vstack to generate the array.

I just found out that I am able to sort the numpy array before I set the dtype, in the following crude way:

    allRics=allRics[allRics[:,2].argsort()] # This works

I am able to continue with my code right now, but I am not sure why the structured array could not be sorted.

Alok Jadhav
GAT IT Hong Kong
+852 2101 6274 (*852 6274)
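The argsort workaround above can be sketched end to end as follows. This is only an illustration under stated assumptions: random numbers stand in for h5.getTrades(), there are three small chunks rather than the real instruments, np.float64 replaces np.float, and chunks, all_rows and records are hypothetical names. It orders whole rows by column 2 while the array is still a plain 2-D float array, then views the result as a 1-D structured array.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical per-instrument chunks, each already sorted on column 2 ('time'),
# standing in for the arrays built inside the loop in the thread.
chunks = []
for j in range(3):
    chunk = rng.random((5, 7))
    chunk[:, 0] = j                        # 'idx' column
    chunk = chunk[chunk[:, 2].argsort()]   # each chunk sorted on its own
    chunks.append(chunk)

# Stack once after the loop instead of calling vstack on every iteration.
all_rows = np.vstack(chunks)

# Workaround from the thread: reorder whole rows by column 2
# while the array is still plain 2-D floats.
order = all_rows[:, 2].argsort(kind='stable')
all_rows = all_rows[order]

# Only now attach field names; ravel() drops the trailing length-1 axis.
fields = ['idx', 'opened', 'time', 'trdp1', 'trdp0', 'dt', 'value']
rec_dtype = np.dtype([(name, np.float64) for name in fields])
records = all_rows.view(rec_dtype).ravel()

print(records.shape)
print(bool(np.all(np.diff(records['time']) >= 0)))
```

Collecting the chunks in a list and stacking once is a design choice of this sketch, not something the thread prescribes; it avoids copying the growing array on every iteration.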
I am wondering if this has to do with the size of the array. It looks like the array is sorted, but in chunks.

--
Travis Oliphant
(on a mobile)
512-826-7480
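One way to tell "sorted in chunks" apart from "sorted overall" is to count the places where a value drops below its predecessor. The sketch below does this on a small made-up array; count_descents is a hypothetical helper, not a NumPy function, and the three short runs only stand in for the real per-instrument data.

```python
import numpy as np

def count_descents(values):
    """Count positions where a value is smaller than the one before it."""
    values = np.asarray(values).ravel()
    return int(np.count_nonzero(np.diff(values) < 0))

# Three already-sorted runs placed end to end (made-up data).
stacked = np.concatenate([np.arange(5.0), np.arange(5.0), np.arange(5.0)])

# Sorted within each run, but not overall: one descent per run boundary.
print(count_descents(stacked))           # 2

# A real sort removes every descent.
print(count_descents(np.sort(stacked)))  # 0
```

A result of zero means the column is in non-decreasing order; a small non-zero count on a large array suggests sorted pieces that were joined without a final sort.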
The sorted array you see is the same as the original array. I have replied with the code that generates this data. It is done in a loop: each iteration produces a sorted numpy array as it reads from file, and all arrays are combined into a single numpy array. I need to sort the final array into a single sorted array.

It could be because of the array size. Maybe it is silently failing somewhere? I don't see any error, but the output is not sorted. Array sorting works fine if the array is not structured.

Alok Jadhav
GAT IT Hong Kong
+852 2101 6274 (*852 6274)
participants (3)
- Anthony Scopatz
- Jadhav, Alok
- Travis Oliphant