[Python-Dev] Do we need __length_hint__ at all? (Was PEP 0424: A method for exposing a length hint)

Stefan Behnel stefan_ml at behnel.de
Mon Jul 16 11:37:30 CEST 2012


Maciej Fijalkowski, 16.07.2012 11:14:
> On Mon, Jul 16, 2012 at 11:02 AM, Stefan Behnel wrote:
>> Mark Shannon, 16.07.2012 10:37:
>>> If resizing of lists is too slow, then we should reconsider the 9/8
>>> factor and/or look to tweak the resizing code.
>>
>> The thing is that the performance is platform specific. On systems with a
>> fast memory allocator, especially on Linux, the difference is negligible.
>> However, with a slow memory allocator, especially a slow realloc(), e.g. on
>> Windows or Solaris, this can substantially hurt the performance, up to a
>> quadratically increasing runtime in the worst case.
>>
>> The length hint was implemented specifically to work around this problem.
>
> It's not the actual allocation (typically), it's the copying that's the
> problem.

Note that a call to realloc() includes that part and can avoid copying if
possible. A good realloc() implementation can make this use case run in
amortised linear time, at least on a system with sufficient memory. A bad
one can result in quadratic runtime, which is way more than a change by a
constant factor. Thus my above comment that it's platform specific.


> there are cases where it *can* be significant (as in the copying is by far
> the dominating operation), most notable giant templates with iterators
> written in C.

Absolutely. This is particularly visible at the C level because C
implemented iterators have a very low overhead overall.


> Speaking of which - I find this bikeshed disgusting. The purpose of the PEP
> is to codify whatever is already written in code in CPython. If you guys
> don't want it, we'll just implement it anyway and try to follow the CPython
> current implementation from 2.7.

The idea behind this bikeshedding is that at the moment where we make this
an official protocol rather than an implementation detail, it should be
able to communicate the different states on the callee side of such a
protocol. I.e. it needs a better design than the current one.

Stefan



More information about the Python-Dev mailing list