[Numpy-discussion] String sort

Charles R Harris charlesr.harris at gmail.com
Tue Feb 12 13:34:14 EST 2008


On Feb 12, 2008 9:07 AM, Francesc Altet <faltet at carabos.com> wrote:

> A Monday 11 February 2008, Charles R Harris escrigué:
> > I'll check it out when I get home. As I say, it was running about 10%
> > slower on my machine, but if it does better on most platforms it is
> > probably the way to go. We can always change it in the future when
> > everyone is running on quantum computers.
>
> We've done some testing on newqsort in several computers in our company.
> Here are the results for ordering a list with 1 million of strings of
> length 15 filled with random information (using C rand()):
>
> 1) Ubuntu 7.1 (gcc 4.1.3, -O3, Intel Pentium 4 @ 2 GHz)
> C qsort with C style compare: 2.450000
> C qsort with Python style compare: 2.440000
> NumPy newqsort: 0.650000
>
> 2) Windows XP (SP2) (MSVC 7.1, /Ox, Intel Pentium 4 @ 2 GHz)
> C qsort with C style compare: 0.971000
> C qsort with Python style compare: 0.962000
> NumPy newqsort: 0.921000
>
> 3) SuSe LE 10.3 (gcc 4.2.1, -O3, AMD Opteron @ 2 GHz)
> C qsort with C style compare: 0.640000
> C qsort with Python style compare: 0.600000
> NumPy newqsort: 0.590000
>
> 4) Debian 4.2.2 (lenny) (gcc 4.2.3, -O3, Intel Pentium 4 @ 3.2 GHz)
> C qsort with C style compare: 1.770000
> C qsort with Python style compare: 1.750000
> NumPy newqsort: 0.440000
>
> 5) Mandriva 2008.0 (gcc 4.2.2, -O3, Intel Core2 Duo @ 1.5 GHz)
> C qsort with C style compare: 1.590000
> C qsort with Python style compare: 1.550000
> NumPy newqsort: 0.510000
>
> 6) Ubuntu 7.1 (gcc 4.1.3, -O3, Intel Pentium 4 @ 2.5 GHz)
> C qsort with C style compare: 1.890000
> C qsort with Python style compare: 1.900000
> NumPy newqsort: 0.500000
>
> 7) Ubuntu 7.1 (gcc 4.1.2, -O3, PowerPC 3 @ 1.3 GHz)
> C qsort with C style compare: 3.030000
> C qsort with Python style compare: 2.970000
> NumPy newqsort: 1.040000
>
> 8) MacOSX 10.4 (Tiger) (gcc 4.0.1, -O3, PowerPC 3 @ 1.3 GHz)
> C qsort with C style compare: 1.560000
> C qsort with Python style compare: 1.510000
> NumPy newqsort: 1.220000
>
> All benchmarks have been run using the attached benchmark (if anybody
> else wants to join the fiesta, please report back your feedback).
>
> Summarizing, one can say a couple of things:
>
> * Recent Debian distros and derivatives (Ubuntu) as well as Mandriva are
> suffering from a innefficient system qsort (at least the implementation
> for strings).  SuSe Linux Enterprise 10.3 seems to have solved this.
> And Windows XP (SP2) and MacOSX (Tiger) looks like they have a
> relatively efficient implementation of qsort.
>
> * The newqsort performs the best on all the platforms we have checked
> (ranging from a 5% of improvement on Opteron/SuSe, up to 3.8x with some
> Pentium4/Ubuntu systems).
>

The 3.8 is amazing, isn't it? I've found that the performance also depends
on whether I initialize the strings with random or empty. With the random
initialization the new sort is ~2x faster. That's fedora 8, core duo, 32 bit
OS, gcc 4.1.2.


> All in all, I'd also say that newqsort would be a good candidate to be
> put into NumPy.
>

I've merged some sorting tests preparatory to merging the new sorts. There
is a release coming up this weekend, I don't know if it is tagged yet, but
in any case I plan to merge the new sorts soon. Please help with the testing
when I do. Now it's off to paying work.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20080212/5f185630/attachment.html>


More information about the NumPy-Discussion mailing list