[Python-Dev] PEP 0424: A method for exposing a length hint

Mark Shannon mark at hotpy.org
Wed Aug 1 12:06:19 CEST 2012


Maciej Fijalkowski wrote:
> On Wed, Aug 1, 2012 at 10:46 AM, Mark Shannon <mark at hotpy.org> wrote:
>> While the idea behind PEP 424 is sound, the text of the PEP is rather vague
>> and missing a lot of details.
>> There was extended discussion on the details, but none of that has appeared
>> in the PEP yet.
>>
>> So Alex, how about adding those details?
>>
>> Also the rationale is rather poor.
>> Given that CPython is the reference implementation, PyPy should be compared
>> to CPython, not vice-versa.
>> Reversing PyPy and CPython in the rationale gives:
>>
>> '''
>> Being able to pre-allocate lists based on the expected size, as estimated by
>> __length_hint__, can be a significant optimization.
>> PyPy has been observed to run some code slower than CPython, purely because
>> this optimization is absent.
>> '''
>>
>> Which is a PyPy bug report, not a rationale for a PEP ;)
>>
>> Perhaps a better rationale would be something along the lines of:
>>
>> '''
>> Adding a __length_hint__ method to the iterator protocol allows sequences,
>> notably lists, to be initialised from iterators with only a single resize
>> operation.
>> This allows sequences to be initialised quickly, yet have a small growth
>> factor, reducing memory use.
>> '''
>>
> 
> Hi Mark.
> 
> It's not about saving memory. It really is about speed. No one bothered
> measuring cpython with length hint disabled to compare, however we did
> that for pypy hence the rationale contains it. It's merely to state
> "this seems like an important optimization". Since the C-level code
> involved is rather similar (it's mostly runtime anyway), it seems
> reasonable to draw a conclusion that removing length hint from cpython
> would cause a slowdown.

It is not about making it faster *or* saving memory, but *both*.
Without __length_hint__ there is a trade-off between speed and memory use.
You can have speed at the cost of memory by increasing the resize factor,
or save memory at the cost of speed by decreasing it.

With __length_hint__ you can get both speed and good memory use.
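
The trade-off can be illustrated with a rough sketch. It assumes Python 3.4
or later, where operator.length_hint() is available. The Countdown class and
the simulate_fill() helper are hypothetical names made up for this example,
and the toy growth model is not CPython's actual list resizing policy.

```python
import operator


class Countdown:
    """Hypothetical iterator that knows how many items remain."""

    def __init__(self, n):
        self.remaining = n

    def __iter__(self):
        return self

    def __next__(self):
        if self.remaining <= 0:
            raise StopIteration
        self.remaining -= 1
        return self.remaining

    def __length_hint__(self):
        # An estimate of the number of items left; it may be inexact.
        return self.remaining


def simulate_fill(n_items, growth_factor, hint=0):
    """Toy model of a growable array: return (resizes, final_capacity).

    This is an illustrative assumption, not how CPython sizes lists.
    """
    capacity = hint
    resizes = 1 if hint else 0  # count the single up-front allocation
    size = 0
    for _ in range(n_items):
        if size == capacity:
            # Grow geometrically, but always by at least one slot.
            capacity = max(capacity + 1, int(capacity * growth_factor))
            resizes += 1
        size += 1
    return resizes, capacity


n = 1025
hint = operator.length_hint(Countdown(n))             # asks __length_hint__
large_factor = simulate_fill(n, 2.0)                  # few resizes, much slack
small_factor = simulate_fill(n, 1.125)                # little slack, many resizes
with_hint = simulate_fill(n, 1.125, hint=hint)        # one allocation, no slack
```

In this model the large factor needs few resizes but over-allocates heavily,
the small factor wastes little but resizes many times, and the hinted fill
gets a single allocation of exactly the right size.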

Cheers,
Mark


