How to make this unpickling/sorting demo faster?
Paul Rubin
http
Thu Apr 17 17:10:14 EDT 2008
Steve Bergman <sbergman27 at gmail.com> writes:
> Anything written in a language that is > 20x slower (Perl, Python,
> PHP) than C/C++ should be instantly rejected by users on those grounds
> alone.
Well, if you time it starting from when you sit down at the computer
and start programming, til when the sorted array is output, Python
might be 20x faster than C/C++ and 100x faster than assembler.
> I've challenged someone to beat the snippet of code below in C, C++,
> or assembler, for reading in one million pairs of random floats and
> sorting them by the second member of the pair. I'm not a master
> Python programmer. Is there anything I could do to make this even
> faster than it is?
1. Turn off the cyclic garbage collector during the operation since
you will have no cyclic garbage.
2. See if there is a way to read the array directly (using something
like the struct module or ctypes) rather than a pickle.
3. Use psyco and partition the array into several smaller ones with a
quicksort-like partitioning step, then sort the smaller arrays in
parallel using multiple processes, if you have a multicore CPU.
4. Write your sorting routine in C or assembler and call it through
the C API. If the sorting step is a small part of a large program and
the sort is using a lot of cpu, this is a good approach since for the
other parts of the program you still get the safety and productivity
gain of Python.
> Also, if I try to write the resulting list of tuples back out to a
> gdbm file,
I don't understand what you're doing with gdbm. Just use a binary
file.
More information about the Python-list
mailing list