[Python-Dev] Optimization of the Year
raymond.hettinger at verizon.net
Tue Feb 10 02:15:39 EST 2004
Hey, hey! After looking at this on and off for months, I finally
found a safe way to dramatically improve the speed of building lists:
We've always known that every list.append() resulted in a realloc()
call after computing an adaptive, oversized block length.
The obvious solution was to save the block size and avoid calls to
realloc() when the block was already large enough to hold a new
element. Unfortunately, the list structure was wide open to
manipulation by every piece of code written since Python's infancy.
Almost no assumptions could be made about other code directly
hacking the list structure elements.
The safe solution turned out to be saving a copy of the ob_item pointer
whenever a block of known size was allocated. That copy is then
used a guard for the size check. That way, if the pointer is swapped
out by another routine (not a myth, it happens), we assume our saved
block size is invalid and just do a normal realloc.
The results are wonderful:
* list(it) speeds up by at least 40% when "it" is an iterable
not defining __len__ (there is no change for iterables that
do define __len__ because they pre-size in a single step).
. list comprehensions show a similar speed-up
. list.append(x) is about twice as fast
AFAICT, this is a straight-win with no trade-offs along the way.
The only offsetting cost is adding two fields to the list
structure (Tim said that was bad but I don't remember why).
Please take a look and see if there are any holes in the theory.
While the patch passes the test suite, it is possible that some
extension module reallocs() the list to a smaller size (leaving
the pointer unchanged) and then changes its mind and starts growing
the list again. Given the magnitude of the speed-up, I think that
risk is warranted. The offending code would have to be by-passing
the C API for lists and tinkering with the internals in atypical
way (swapping one pointer for another and/or altering ob_size is
typical; downward resizing the existing pointer is not).
P.S. SF is down for maintenance. Will post the patch there tomorrow.
While the change is simple, the patch is long because all the NRESIZE
macros had to be replaced by a function call.
More information about the Python-Dev