How to make this unpickling/sorting demo faster?
Thu Apr 17 23:10:14 CEST 2008
Steve Bergman <sbergman27 at gmail.com> writes:
> Anything written in a language that is > 20x slower (Perl, Python,
> PHP) than C/C++ should be instantly rejected by users on those grounds
Well, if you time it starting from when you sit down at the computer
and start programming, until the sorted array is output, Python
might be 20x faster than C/C++ and 100x faster than assembler.
> I've challenged someone to beat the snippet of code below in C, C++,
> or assembler, for reading in one million pairs of random floats and
> sorting them by the second member of the pair. I'm not a master
> Python programmer. Is there anything I could do to make this even
> faster than it is?
1. Turn off the cyclic garbage collector during the operation since
you will have no cyclic garbage.
2. See if there is a way to read the array directly (using something
like the struct module or ctypes) rather than a pickle.
3. Use psyco and partition the array into several smaller ones with a
quicksort-like partitioning step, then sort the smaller arrays in
parallel using multiple processes, if you have a multicore CPU.
4. Write your sorting routine in C or assembler and call it through
the C API. If the sorting step is a small part of a large program but
uses a lot of CPU, this is a good approach, since for the other parts
of the program you still get the safety and productivity gain of
Python.
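
As a rough sketch of suggestions 1 and 2 together (not the code from the
original post, which isn't shown here): the example below assumes the
pairs are stored as a flat run of native C doubles (x0, y0, x1, y1, ...)
and uses the standard array module in place of pickle. The function
names write_pairs, read_pairs and load_and_sort are made up for
illustration, and the code is written for Python 3.

```python
import gc
import os
from array import array
from operator import itemgetter


def write_pairs(path, pairs):
    """Store pairs as a flat run of native C doubles: x0, y0, x1, y1, ..."""
    flat = array("d")
    for x, y in pairs:
        flat.append(x)
        flat.append(y)
    with open(path, "wb") as f:
        flat.tofile(f)


def read_pairs(path):
    """Read the flat doubles back and regroup them into (x, y) tuples."""
    flat = array("d")
    # Assumes the file holds nothing but doubles written by write_pairs.
    count = os.path.getsize(path) // flat.itemsize
    with open(path, "rb") as f:
        flat.fromfile(f, count)
    return list(zip(flat[0::2], flat[1::2]))


def load_and_sort(path):
    """Load the pairs and sort by the second member with the cyclic GC paused."""
    was_enabled = gc.isenabled()
    gc.disable()  # tuples of floats cannot form reference cycles
    try:
        pairs = read_pairs(path)
        pairs.sort(key=itemgetter(1))
        return pairs
    finally:
        # Restore the collector only if it was on before we started.
        if was_enabled:
            gc.enable()
```

The native "d" format is not portable between machines with different
byte order, so this only suits data written and read on the same kind
of box. Whether it actually beats the pickle version has to be measured
on the real data.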
> Also, if I try to write the resulting list of tuples back out to a
> gdbm file,
I don't understand what you're doing with gdbm. Just use a binary