![](https://secure.gravatar.com/avatar/a9aa7fdd7a72ac870c603739a9c26b94.jpg?s=120&d=mm&r=g)
Is there a more elegant and/or faster way to read some records from a file and then sort them by different fields? What I have now is too specific and error-prone in general:

    import numpy as N
    records = N.fromfile(a_file, dtype=N.dtype('i2,i4'))
    records_by_f0 = records.take(records.getfield('i2').argsort())
    records_by_f1 = records.take(records.getfield('i4',2).argsort())

If there's a better way, I'd like to see it; bonus points for in-place sorting.

Thanks,
George

-------------------------------------------------------------------------
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
![](https://secure.gravatar.com/avatar/5c7407de6b47afcd3b3e2164ff5bcd45.jpg?s=120&d=mm&r=g)
On Tue, 31 Oct 2006 at 23:38 +0000, George Sakkis wrote:
Why is this too specific or error-prone? I think your solution is quite good. If what you want is a more compact way to write the above, you can try:

    In [56]: records = numpy.array([(1,1),(0,2)], dtype="i2,i4")
    In [57]: records[records['f0'].argsort()]
    Out[57]:
    array([(0, 2), (1, 1)],
          dtype=[('f0', '<i2'), ('f1', '<i4')])
    In [58]: records[records['f1'].argsort()]
    Out[58]:
    array([(1, 1), (0, 2)],
          dtype=[('f0', '<i2'), ('f1', '<i4')])

HTH,

--
Francesc Altet | Be careful about using the following code --
Carabos Coop. V. | I've only proven that it works,
www.carabos.com | I haven't tested it. -- Donald Knuth
![](https://secure.gravatar.com/avatar/a9aa7fdd7a72ac870c603739a9c26b94.jpg?s=120&d=mm&r=g)
Francesc Altet wrote:
Because it

1. repeats the field types, and
2. requires adding up the lengths of all previous fields to get the offset.

If you're not convinced yet, try writing this in less than 3 seconds ;-):

    records = N.fromfile(a_file, dtype=N.dtype('i2,i4,f4,S5,B,Q'))
    records_by_f5 = ??
Ah, much better; I didn't know you can index a normal array (not recarray) by label. Now, if there's a way to do the sorting in place (records.sort('f1') doesn't work, unfortunately), that would be perfect.

Thanks,
George
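As an illustration of the point above, here is a minimal sketch of how field-name indexing answers the "3 seconds" challenge without adding up offsets by hand. The sample values are made up for the example, and the array is built in memory rather than read with `fromfile`; the automatically generated field names `f0` ... `f5` and the `dtype.fields` mapping are standard NumPy behaviour.

```python
import numpy as np

# Same record layout as in the challenge; NumPy names the fields f0..f5.
dt = np.dtype('i2,i4,f4,S5,B,Q')

# Hypothetical sample data standing in for the contents of a_file.
records = np.array(
    [(3, 30, 0.5, b'ccc', 1, 900),
     (1, 10, 1.5, b'aaa', 2, 100),
     (2, 20, 2.5, b'bbb', 3, 500)],
    dtype=dt,
)

# Sort by the last field using only its name: no type, no offset.
records_by_f5 = records[records['f5'].argsort()]

# If the offset is ever needed, the dtype already knows it.
f5_dtype, f5_offset = dt.fields['f5']
print(records_by_f5['f5'], f5_dtype, f5_offset)
```

The indexing form returns a sorted copy, so it does not address the in-place part of the question.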
![](https://secure.gravatar.com/avatar/5c7407de6b47afcd3b3e2164ff5bcd45.jpg?s=120&d=mm&r=g)
On Wed, 1 Nov 2006 at 15:18 +0000, George Sakkis wrote:
Ah, I see your point :)
Yes, I agree that being able to do records.sort('f1') would be a great addition, in terms of both usability and efficiency.

Cheers,
Francesc Altet
![](https://secure.gravatar.com/avatar/fa3279b202e9a85f7d90a0422bca4489.jpg?s=120&d=mm&r=g)
Hey George

On Tue, 31 Oct 2006, George Sakkis wrote:
Check the thread "Strange results when sorting array with fields" from about a week back. Travis made some changes to sorting in the presence of fields that should solve your problem, assuming your fields appear in the order you want to sort (i.e. you want to sort f1, f2, f3 and not something like f1, f3, f2).

Cheers,
Albert
![](https://secure.gravatar.com/avatar/a9aa7fdd7a72ac870c603739a9c26b94.jpg?s=120&d=mm&r=g)
Albert Strasheim wrote:
I'm afraid this won't help in my case; I want to sort twice, once by f1 and once by f2. I guess I could make a second file with the fields swapped, but this seems messier and less efficient than Francesc's suggestion.

George
![](https://secure.gravatar.com/avatar/96dd777e397ab128fedab46af97a3a4a.jpg?s=120&d=mm&r=g)
On 11/1/06, George Sakkis <george.sakkis@gmail.com> wrote:
Do you actually want the two different orders, or do you want to sort on the first field, then sort all the items with the same first field on the second field?

Chuck

_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/numpy-discussion
![](https://secure.gravatar.com/avatar/a9aa7fdd7a72ac870c603739a9c26b94.jpg?s=120&d=mm&r=g)
Charles R Harris wrote:
The former.

George
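To make the distinction in this exchange concrete, here is a small sketch contrasting the two readings: two independent orderings (what George wants) versus a single ordering on the first field with ties broken by the second. The sample data is invented for the example; `np.lexsort` is used here only as one way to express the second reading, and it was not proposed in the thread.

```python
import numpy as np

# Hypothetical records with repeated values in the first field.
records = np.array([(1, 5), (0, 9), (1, 2), (0, 3)], dtype='i2,i4')

# Reading 1: two independent orderings, one per field.
# kind='stable' keeps equal keys in their original relative order.
by_f0 = records[records['f0'].argsort(kind='stable')]
by_f1 = records[records['f1'].argsort(kind='stable')]

# Reading 2: one ordering, primary key f0, ties broken by f1.
# np.lexsort treats the LAST key in the tuple as the primary key.
by_f0_then_f1 = records[np.lexsort((records['f1'], records['f0']))]

print(by_f0)
print(by_f1)
print(by_f0_then_f1)
```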
![](https://secure.gravatar.com/avatar/4d021a1d1319f36ad861ebef0eb5ba44.jpg?s=120&d=mm&r=g)
George Sakkis wrote:
Sorting on a particular field in place would be possible if there were some way to indicate to VOID_compare the field order you wanted to compare on. There are a few ways I could think of doing this:

1) Use a thread-specific global variable (this doesn't recurse very easily).
2) Use the meta-object in the field specifier to indicate the order (the interface could still be something like .sort(order='f1'), with a temporary data-type object created and used).
3) Use a special key in the fields dictionary, although this would require some fixes to all the code that cycles through the fields dictionary to recurse on structures.
4) Overload any of the other variables in the PyArray_Descr * structure.
5) Add a sort-order to the end of the PyArray_Descr * structure and a flag to the hasobject flag bits (that would be the last one available) that states that the data-type object has the sort-order defined (so binary compatibility is retained but the new feature can be used going forward).

Any other ideas?

-Travis
![](https://secure.gravatar.com/avatar/49df8cd4b1b6056c727778925f86147a.jpg?s=120&d=mm&r=g)
George Sakkis wrote:
As a final contribution before I become more quiet for a few weeks (and aside from hopefully releasing 1.0.1 soon), I've added a feature to allow specifying the sorting order for record arrays. It's added to the sort method as the order= keyword. You can pass in a string (specifying which field comes first; all other fields will stay in the same order), or you can pass in a list or tuple which indicates the field order (any unspecified fields will stay in their same relative order).

It works by creating a new data-type object with the .names attribute replaced with a newly ordered one and then calling sort on a view of the array with that data-type. VOID_compare uses the names tuple to determine the ordering. This was the unnamed option in my previous list. It's a better solution than all of the others, I think. The newly created data-type is discarded after the sorting is complete.

-Travis
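A short sketch of how the order= keyword described above can be used, assuming a NumPy version that includes it. The sample data is invented for the example, and the behaviour shown is that of current NumPy, which may differ in detail from the initial implementation discussed in this message.

```python
import numpy as np

# Hypothetical records; in the thread they would come from N.fromfile.
records = np.array([(1, 5), (0, 9), (1, 2), (0, 3)], dtype='i2,i4')

# In-place sort with f1 as the first key; the method returns None.
result = records.sort(order='f1')
by_f1 = records.copy()  # snapshot, since the next sort overwrites records

# In-place sort with f0 first; remaining fields (here f1) break ties
# in the order they appear in the dtype.
records.sort(order='f0')
by_f0 = records.copy()

# A list or tuple gives the full key order explicitly.
records.sort(order=['f1', 'f0'])

print(by_f1)
print(by_f0)
```

Because the sort happens in place, a copy is only needed when both orderings must be available at the same time.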
participants (6)

- Albert Strasheim
- Charles R Harris
- Francesc Altet
- George Sakkis
- Travis Oliphant
- Travis Oliphant