Sorting large files

Tim Peters tim_one at email.msn.com
Sun Jan 2 02:34:48 EST 2000


[Edward C. Jones]
> A Python program of mine generates a large number of tuples
> of ten integers each. I need to sort this collection of tuples.
>
> If there weren't so many tuples, I could make a Python list
> of these tuples and use sort(). Currently I write the
> tuples out to a file in ASCII and use the UNIX sort command.
> Is there a faster way to sort these tuples? ...

Can only echo previous requests for more info (how many is "large", is there
anything special about the distribution of ints, or the order in which they
appear, etc).

[Charles G Waldman]
> The list.sort method in Python uses quicksort (AFAIR) so
> it's just as efficient as using UNIX sort.

The 1.5.2 list.sort() is an optimized hybrid of binary insertion sort and
samplesort.  It does fewer compares than quicksort, and is much more robust
against pathological input (one data-dependent reason a Unix quicksort may
be unreasonably slow).  OTOH, the Unix sort knows it's comparing strings,
while list.sort() must redispatch to the type-approriate comparison routine
on each compare (higher overhead per comparison).  Also possible that the
Unix sort does some form of radix sort (which may-- or may not --be much
quicker than a quicksort, depending on a billion things we haven't been told
...).

no-info-in-no-info-out!-ly y'rs  - tim






More information about the Python-list mailing list