sort functions in python
Steven D'Aprano
steve at REMOVE-THIS-cybersource.com.au
Sat Feb 9 03:02:18 EST 2008
On Fri, 08 Feb 2008 19:09:06 -0800, Jeff Schwab wrote:
> Steven D'Aprano wrote:
>> On Fri, 08 Feb 2008 17:00:27 -0800, t3chn0n3rd wrote:
>>
>>> Do you think it is relatively easy to write sort algorithms such as
>>> the common Bubble sort in Python as compared to other high level
>>> programming langauges
>>
>>
>> You realise that bubble sort is one of the worst possible sort
>> algorithms possible, particularly on modern hardware? It's not the
>> worst: bogo sort and bozo sort are worse, but bubble sort is still
>> pretty atrocious.
>
> That's the conventional wisdom, mostly because CS professors need a "bad
> algorithm" to show undergrads, but it's not really accurate. The truth
> is that bubble-sort's performance depends strongly on the input data. In
> the worst case, yes, bubble-sort is O(n^2); however, for almost-sorted
> data (only a few elements out of place), bubble-sort is actually one of
> the fastest known algorithms.
It depends on what you mean by "bubble sort". There are many different
variations of bubble sort, that are sometimes described by names such as
comb sort, cocktail sort, exchange sort, and sometimes merely referred to
bubble sort. It's rather like any divide-and-conquer sorting algorithm
being called quicksort.
I'm as guilty of that as anyone else -- the example code I posted, raided
from Wikibooks is very different from this bubble sort algorithm from
PlanetMath:
http://planetmath.org/encyclopedia/Bubblesort.html
def bubblesort(L):
done = False
while not done:
done = True
for i in xrange(len(L)-1):
if L[i+1] <= L[i]:
L[i], L[i+1] = L[i+1], L[i]
done = False
return L
This one runs significantly faster than the earlier one I posted.
Another vital factor is, what language are you implementing the sort
routine in? The conventional wisdom for low-level languages like assembly
and C doesn't necessarily hold for high-level languages like Python.
Comparisons are slow in Python and fast in C; in C a good algorithm will
minimize the amount of moves, while in Python a good algorithm will
minimize the number of comparisons.
Finally, what hardware are you running on? There are interactions between
hardware caches and pipes that can ruin the performance of an otherwise
promising algorithm.
But all those factors aside, I fear that your attempt to rehabilitate the
reputation of bubble sort is misplaced. Here's another person who wants
to defend bubble sort:
"A fair number of algorithm purists (which means they've probably never
written software for a living) claim that the bubble sort should never be
used for any reason. Realistically, there isn't a noticeable performance
difference between the various sorts for 100 items or less, and the
simplicity of the bubble sort makes it attractive."
http://linux.wku.edu/~lamonml/algor/sort/bubble.html
But then on the same person's page on insertion sort:
"The insertion sort is a good middle-of-the-road choice for sorting lists
of a few thousand items or less. The algorithm is significantly simpler
than the shell sort, with only a small trade-off in efficiency. At the
same time, the insertion sort is over twice as fast as the bubble sort
and almost 40% faster than the selection sort."
http://linux.wku.edu/~lamonml/algor/sort/insertion.html
He gives sample C code for both, and the bubble sort has eight lines of
code, versus nine for insertion sort (excluding bare braces).
Perhaps what he should have said is:
"Bubble sort is a tiny bit simpler than insertion sort, and almost twice
as slow!"
--
Steven
More information about the Python-list
mailing list