sort functions in python
Steven D'Aprano
steve at REMOVE-THIS-cybersource.com.au
Sat Feb 9 17:10:05 EST 2008
On Sat, 09 Feb 2008 13:37:23 -0800, Jeff Schwab wrote:
> Carl Banks wrote:
>> On Feb 8, 10:09 pm, Jeff Schwab <j... at schwabcenter.com> wrote:
>>> If you expect your data to be pretty nearly sorted already, but you
>>> just want to make sure (e.g. because a small number of elements may
>>> have been inserted or removed since the last sort), bubble-sort is a
>>> good choice.
>>
>> But if you're at that stage you probably were doing something wrong in
>> the first place.
>
> How do you figure? You don't always have control over the
> insertions/replacements, or where they happen. As a silly example,
> assume you have a sorted list of children, by height. Maybe you check
> your list once per school semester. The new sort order for any given
> semester will likely be very close to the previous order; however, a few
> swaps may be in order, according to the different speeds at which
> children have grown.
You check their heights once per semester, but you're worried about an
extra ten or twenty microseconds to sort the data?
Frankly, if we're talking about sorting at the level of Python
application programming, nothing you write is going to beat the speed of
the built-in list.sort() and sorted() built-ins. Here's that bubble sort
from yesterday again:
def bubble(A):
# http://planetmath.org/encyclopedia/Bubblesort.html
N = len(A)-1
done = False
while not done:
done = True
for i in xrange(N):
if A[i+1] <= A[i]:
A[i], A[i+1] = A[i+1], A[i]
done = False
return A
and some speed tests:
>>> L = [1, 2, 3, 4, 6, 5, 7, 9, 8]
>>> min(timeit.Timer('bubble(L)',
... 'from __main__ import bubble, L').repeat(number=1000))
0.012052059173583984
>>> min(timeit.Timer('sorted(L)',
... 'from __main__ import L').repeat(number=1000))
0.0055558681488037109
>>> min(timeit.Timer('bubble(L)',
... 'from __main__ import bubble; L=range(5)').repeat(number=1000))
0.008541107177734375
>>> min(timeit.Timer('sorted(L)',
... 'L=range(5)').repeat(number=1000))
0.0046191215515136719
If you're writing code in a low-level language, or a high-level language
with no built-in sort (or a crappy one), your mileage may vary.
--
Steven
More information about the Python-list
mailing list