list.pop(0) vs. collections.dequeue

Sat Jan 23 07:12:29 EST 2010

On Fri, 22 Jan 2010 21:42:43 -0800, Steve Howell wrote:

> This innocent program here literally moves about a million bytes of
> memory around for no good reason:
> 
>     lst = []
>     for i in range(2000):
>         lst.append(i)
>     while lst:
>         print lst.pop(0)
> 
> Why?  Because list.pop(0) is implemented in O(N) instead of O(1).
> 
> Why wouldn't you get a competent C programmer simply make list_ass_slice
> smart enough to make list.pop(0) O(1)?

Because there are always trade-offs, and the competent C programmers who 
coded the implementation for lists choose different tradeoffs to the ones 
you would prefer.

Seems to me that the simple solution to your problem is for you to 
implement your own data structure that makes whatever trade-offs you 
like. If it is good enough and popular enough, it might even replace the 
existing list implementation.

>> That is, you give me the impression that you prefer this:
>>
>> while alist:
>>     x = alist.pop(0)
>>     process(x)
>>
>> over this:
>>
>> for x in alist:
>>     process(x)
>> # allow alist to be garbage collected when it goes out of scope
>>
>>
> No, to be more precise, I prefer this implementation of a recursive
> parser (using lists) to one that would have to use deque's because of
> the lameness of Python's list implementation:
> 
> https://bitbucket.org/showell/shpaml_website/src/tip/shpaml.py

That's a lot of code. Am I supposed to study the whole module, or can you 
give me a hint as to what you're referring to? The lack of docstrings and 
comments doesn't fill me with enthusiasm for reading it.

Nevertheless, on the basis of a quick scan, I suppose that you're 
probably talking about the nested function called recurse:

    def recurse(prefix_lines):
        while prefix_lines:
            prefix, line = prefix_lines[0]
            if line == '':
                prefix_lines.pop(0)
                append('')
            else:
                block_size = get_block(prefix_lines)
                if block_size == 1:
                    prefix_lines.pop(0)
                    if line == pass_syntax:
                        pass
                    elif line.startswith(flush_left_syntax):
                        append(line[len(flush_left_syntax):])
                    elif line.startswith(flush_left_empty_line):
                        append('')
                    else:
                        append(prefix + leaf_method(line))
                else:
                    block = prefix_lines[:block_size]
                    prefix_lines = prefix_lines[block_size:]
                    branch_method(output, block, recurse)
        return

Since you're not even looking at the results of the pop, why don't you 
just call del prefix_lines[0]? It probably won't perform any better, but 
it is more idiomatic.

An alternative would be to do exactly what you want lists to do: track 
the start of the list. Untested:

    def recurse(prefix_lines):
        start = 0
        end = len(prefix_lines)
        while start < end:
            prefix, line = prefix_lines[start]
            if line == '':
                start += 1
                append('')
            else:
                block_size = get_block(prefix_lines)
                if block_size == 1:
                    start += 1
                    if line == pass_syntax:
                        pass
                    elif line.startswith(flush_left_syntax):
                        append(line[len(flush_left_syntax):])
                    elif line.startswith(flush_left_empty_line):
                        append('')
                    else:
                        append(prefix + leaf_method(line))
                else:
                    block = prefix_lines[:block_size]
                    start = block_size
                    branch_method(output, block, recurse)
        return

No more O(N) deletions. Problem solved.

-- 
Steven