hash and __eq__

Albert van der Horst albert at spenarnc.xs4all.nl
Thu Jun 4 09:41:37 EDT 2009


In article <0233137f$0$8244$c3e8da3 at news.astraweb.com>,
Steven D'Aprano  <steve at REMOVE-THIS-cybersource.com.au> wrote:
>On Sun, 31 May 2009 12:08:33 +0100, Arnaud Delobelle wrote:
>
>> Anyway, it's good to know that quicksort is O(n^2) in the worst case -
>> and that this worst case can crop up very easily in some situations,
>> especially if not too much care has been taken implementing it.
>
>The naive quicksort algorithm, where you recursively split the list into
>two halves, performs badly if the list is already sorted (or nearly so).

No it isn't (in general).
It only does so if you use the first element as a pivot,
resulting in a set with one element and a set with n-1 elements.
It is an incredible long time ago that someone implemented qsort
this very naive way, especially since in the 80's Knuth warned
about it.

A current naive way is use element (m+n)/2 when sorting the range m..n.
This is nearly optimal for a sorted set. For a random set the chance
for a consistent bad choice is astronomically low. I have a QSORT
implemented in the library for my lina Forth and didn't bother to do
better than this. (Used in some of the demanding problems in
projecteuler.net).

More sophisticated qsorts select three elements at random and use
best of three. The worst behaviour is still O(n^2) but the multiplication
factor is lower and the chance is astronomically low to the third power.

>It's easy to fix that: randomise the list before you sort! You can do
>that with a single pass of the list. That has the advantage of ensuring
>that no specific input can cause degraded performance, so an attacker
>can't DOS your application by passing sorted lists to be sorted.

>Another strategy is to randomly exchange the pivot element when sorting.
                                 choose?

A good sort utility with facilities for records with fields
(using a modified heap sort, and using disk base merge sort for very
large sets) can be find on my site below (called ssort).
Because of the heap sort even the worst case is O(Nlog(N)) if
the O(N^2) behaviour of qsort is your worry.

>--
>Steven

Groetjes Albert

--
-- 
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- like all pyramid schemes -- ultimately falters.
albert at spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst




More information about the Python-list mailing list