Performance of lists vs. list comprehensions

Alf P. Steinbach alfps at start.no
Wed Jan 20 03:21:30 EST 2010


* Steven D'Aprano:
> On Wed, 20 Jan 2010 05:25:22 +0100, Alf P. Steinbach wrote:
> 
>> * Steven D'Aprano:
>>> On Tue, 19 Jan 2010 16:20:42 -0500, Gerald Britton wrote:
>>>
>>>> That's surprising. I wouldn't implement it that way at all.  I'd use a
>>>> dynamically-expanding buffer as I suggested.  That way you get a
>>>> single pass and don't have to calculate anything before you begin.  In
>>>> the best case, you'd use half the memory (the result just fits in the
>>>> buffer after its last expansion and no saved tuple).  In the worst
>>>> case, the memory use is about the same (you just expanded the buffer
>>>> using a 2x expansion rule then you hit the last item).
>>> In the worst case, you waste 50% of the memory allocated.
>> Yes. That is a good argument for not doing the expanding buffer thing.
>> But such buffers may be generally present anyway, resulting from
>> optimization of "+".
> 
> 
> As near as I can determine, the CPython optimization you are referring to 
> doesn't use the "double the buffer when needed" technique. It operates on 
> a completely different strategy. As near as I can tell (as a non-C 
> speaker), it re-sizes the string in place to the size actually needed, 
> thus reducing the amount of copying needed.
> 
> The optimization patch is here:
> 
> http://bugs.python.org/issue980695
> 
> and some history (including Guido's opposition to the patch) here:
> 
> http://mail.python.org/pipermail/python-dev/2004-August/046686.html
> 
> Nevertheless, the patch has been part of CPython since 2.4, but must be 
> considered an implementation-specific optimization. It doesn't apply to 
> Jython, and likely not to other implementations.

Provided that the CPython code is indeed the code in that patch, you're right: it 
only calls PyObject_REALLOC, relying on that operation having (amortized) constant 
time.

And PyObject_REALLOC can in principle achieve that by relying on paging, e.g. 
using the OS's virtual memory allocator to extend the block in place, instead of 
doubling and copying.

However, I'm baffled by the code I find via Google Code Search: there 
PyObject_REALLOC simply calls malloc and copies. If that were the code used in 
CPython 2.4 and later, and if the '+' code is the code that you refer to above, 
it would produce O(n^2) time for the first '+' example.
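
For reference, here's the kind of measurement involved, as a minimal sketch 
(the sizes, the clock() timing and the function names are just mine; and as I 
understand the patch, the extra reference in the second variant defeats the 
in-place resize and should yield the quadratic behavior):

  from time import clock

  def concat_time( n ):
      # n single-character '+' concatenations onto one local string.
      s = ''
      start = clock()
      for i in range( n ):
          s = s + 'x'            # eligible for in-place resize in CPython 2.4+
      return clock() - start

  def concat_time_defeated( n ):
      # Same, but a second reference means the string can't be resized in place.
      s = ''
      start = clock()
      for i in range( n ):
          t = s                  # extra reference => allocate-and-copy
          s = s + 'x'
      return clock() - start

  for n in (10000, 20000, 40000, 80000):
      print n, concat_time( n ), concat_time_defeated( n )

If the first column of times grows roughly linearly and the second roughly 
quadratically, that's consistent with the patch being in effect.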


>> Using CPython 2.6.4 in Windows XP:
> [snip time measurements]
>> Here the linear increase of the times indicates that the "+" is being
>> optimized using an expanding buffer for the string.
> 
> Be careful of jumping to the conclusion from timings that a certain 
> algorithm is used. All you can really tell is that, whatever the 
> algorithm is, it has such-and-such big-oh behaviour.

Well, as for the intended meaning you're right about that: it need not be a 
doubling buffer.
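
E.g., a pure Python doubling buffer gives the same linear-looking timings as 
an in-place reallocation would, so the measurements can't distinguish the two 
strategies. A sketch (the class and method names are just mine for 
illustration):

  class Buffer:
      # Character buffer using the usual doubling expansion rule.
      def __init__( self ):
          self._capacity = 16
          self._data = [ '\0' ] * self._capacity
          self._size = 0

      def append( self, ch ):
          if self._size == self._capacity:
              self._capacity *= 2                     # double on overflow
              new_data = [ '\0' ] * self._capacity
              new_data[ : self._size ] = self._data   # the one O(size) copy
              self._data = new_data
          self._data[ self._size ] = ch
          self._size += 1

      def value( self ):
          return ''.join( self._data[ : self._size ] )

Each character is copied only a bounded number of times on average, so n 
appends cost O(n) in total, just as with constant time reallocation.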


>> If only the minimal
>> space were allocated each time, then one would expect O(n^2) behavior
>> instead of the apparent O(n) behavior above.
> 
> I don't follow that reasoning.

The reasoning assumes an independent allocation for each intermediate string, 
rather than reallocation (extension of the existing allocation).

Then quadratic time for the example follows because the sizes of the data 
copied sum to 0 + 1 + ... + (n-1), i.e. sum(range(n)) = (n^2-n)/2.
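
That arithmetic is trivial to check (copy_cost is just my name for the count):

  def copy_cost( n ):
      # Characters copied when each of n one-char concatenations
      # allocates a fresh string: 0 + 1 + ... + (n-1).
      return sum( range( n ) )

  for n in (10, 100, 1000):
      assert copy_cost( n ) == (n*n - n)//2
      print n, copy_cost( n )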

But with PyObject_REALLOC essentially constant time, 'join' too could take 
advantage of this and build its result in a single pass. :-)
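
E.g., a single-pass join-alike is easy to sketch on top of a type whose 
extension is already amortized constant time, such as bytearray (the function 
name is mine; as far as I know the real ''.join instead makes one pass to sum 
the lengths and then allocates the result exactly once):

  def one_pass_join( strings ):
      # Build the result in one pass, relying on bytearray's
      # amortized constant time in-place extension (realloc-style growth).
      buf = bytearray()
      for s in strings:
          buf += s
      return str( buf )

  print one_pass_join( [ 'abc', 'def' ] )    # abcdef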



Cheers,

- Alf


