hash and __eq__

Steven D'Aprano steve at REMOVE-THIS-cybersource.com.au
Sun May 31 20:29:34 EDT 2009


On Sun, 31 May 2009 12:08:33 +0100, Arnaud Delobelle wrote:

> Anyway, it's good to know that quicksort is O(n^2) in the worst case -
> and that this worst case can crop up very easily in some situations,
> especially if not too much care has been taken implementing it.

The naive quicksort algorithm, where you recursively split the list into 
two halves, performs badly if the list is already sorted (or nearly so). 
It's easy to fix that: randomise the list before you sort! You can do 
that with a single pass of the list. That has the advantage of ensuring 
that no specific input can cause degraded performance, so an attacker 
can't DOS your application by passing sorted lists to be sorted.

Another strategy is to randomly exchange the pivot element when sorting. 
This avoids having to do a separate pass over the list, while still 
ensuring that no particular input can cause degraded performance.

A third strategy is to use a hybrid insertion/quicksort function. This is 
probably a good idea regardless, because for small lists, the overhead of 
quicksort generally makes insertion sort faster anyway. (CPython's sort 
used to be similar: it was a hybrid insertion/samplesort until, by 
memory, version 2.2, when it was changed for Timsort.)



-- 
Steven



More information about the Python-list mailing list