[Python-Dev] Do we need __length_hint__ at all? (Was PEP 0424: A method for exposing a length hint)

Victor Stinner victor.stinner at gmail.com
Mon Jul 16 13:05:20 CEST 2012


> *If* resizing list is so slow, then why not make it faster?

A simple way to speed up this kind of problem is to change the
overallocation factor, but that may waste memory. Python tries to be
fast without wasting too much memory.
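
To make the trade-off concrete, here is a rough sketch that counts how
often a growing buffer must be reallocated for a given factor, and how
many slots are left unused at the end. The function name and the growth
rule are illustrative assumptions, not the actual CPython list resizing
code.

```python
def simulate_appends(n_items, factor):
    """Append n_items one by one to a buffer that grows by `factor`.

    Returns (number of reallocations, unused slots at the end).
    Hypothetical model only; CPython's real growth rule differs.
    """
    capacity = 0
    reallocations = 0
    for size in range(1, n_items + 1):
        if size > capacity:
            # Grow geometrically, but always make room for the new item.
            capacity = max(size, int(capacity * factor))
            reallocations += 1
    return reallocations, capacity - n_items


if __name__ == "__main__":
    for factor in (1.125, 1.25, 1.5, 2.0):
        reallocs, unused = simulate_appends(1000, factor)
        print(f"factor={factor}: {reallocs} reallocations, {unused} unused slots")
```

In this model a larger factor means fewer reallocations, while the
upper bound on unused slots grows with the factor.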

> Why is it a significant optimisation?
> How much slower is it?
> Where is the data?

I recently worked on optimizing str%args and str.format(args). Handling
memory allocation correctly is critical for performance,
especially for str with PEP 393, because we have to shrink the
buffer to the exact string length when the formatting function is
done. I tried various overallocation factors and chose 1.25 (5/4)
because it was the fastest. See for example this issue for benchmark
numbers:
http://bugs.python.org/issue14687

The PyUnicodeWriter internal object uses various options to choose how
many bytes should be allocated:
 * an overallocation flag to disable overallocation when we know that
we are writing the last character/string into the buffer
 * a "minimal length" used for the first allocation
 * a hardcoded overallocation factor (1.25)
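
The three options above can be modelled with a small Python toy. This
is only a sketch of the idea: the class name SketchWriter, its
attributes and its methods are made up for illustration and are not the
C-level PyUnicodeWriter API.

```python
class SketchWriter:
    """Toy model of an overallocating text buffer (not the CPython API)."""

    OVERALLOCATE_FACTOR = 1.25  # hardcoded factor mentioned in the message

    def __init__(self, min_length=0):
        self.min_length = min_length  # lower bound for the first allocation
        self.overallocate = True      # clear this before the last write
        self.buffer = []              # allocated slots, one per character
        self.pos = 0                  # number of characters written so far
        self.reallocations = 0

    def _prepare(self, extra):
        """Make sure `extra` more characters fit into the buffer."""
        needed = self.pos + extra
        if needed <= len(self.buffer):
            return
        new_size = needed
        if self.overallocate:
            # Reserve 25% more than strictly needed.
            new_size = int(needed * self.OVERALLOCATE_FACTOR)
        if not self.buffer:
            # First allocation: honour the minimal length.
            new_size = max(new_size, self.min_length)
        self.buffer.extend([""] * (new_size - len(self.buffer)))
        self.reallocations += 1

    def write(self, text):
        self._prepare(len(text))
        self.buffer[self.pos:self.pos + len(text)] = text
        self.pos += len(text)

    def finish(self):
        """Shrink to the exact length and return the result."""
        return "".join(self.buffer[:self.pos])


if __name__ == "__main__":
    writer = SketchWriter(min_length=8)
    writer.write("abc")
    writer.write("defgh")
    writer.overallocate = False  # we know the next write is the last one
    writer.write("ij")
    print(writer.finish(), writer.reallocations)
```

Turning the flag off before the final write avoids reserving space that
finish() would have to give back right away.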

PyUnicodeWriter is a little bit different from the __length_hint__
issue because PyUnicodeWriter has to shrink the buffer when it is
done, but I can say that overallocation is very useful for speed.

Victor

