list.pop(0) vs. collections.dequeue

Steve Howell showell30 at yahoo.com
Sat Jan 23 05:37:56 EST 2010


On Jan 23, 1:24 am, Terry Reedy <tjre... at udel.edu> wrote:
> On 1/23/2010 3:23 AM, Steve Howell wrote:
>
> > On Jan 23, 12:13 am, Terry Reedy<tjre... at udel.edu>  wrote:
>
> >> Challenge yes, mock no.
>
> >> Part of writing good basic data structures is not adding needless
> >> complication from featuritis and not penalizing 99.99% of access to
> >> satify a .01% need better satisfied another way.
>
> > I would like to challenge your assertion that advancing ob_item
> > instead of doing memmove during list_ass_slice would impact the
> > performance of list accesses in any way.  It would only slow down
> > operations that add/insert items into the list by, and then only by a
> > single conditional statement, and those add/insert operations are
> > already O(N) to begin with.
>
> If you try writing a full patch, as I believe someone did, or at least a
> prototype thereof, when the idea was discussed, you might have a better
> idea of what the tradeoffs are and why it was rejected.
>
> For instance, when you append to a full list, it is resized. I believe
> it is now doubled, but the policy has varied over the years. If you turn
> list from essentially a stack to a deque, you complicate the resizing
> policy and have to consider the addition of a shift policy. Do you split
> the over-allocated fore and aft or increase the overallocation from
> double to, say, triple? If the former, then for normal usage that never
> uses the fore part, the over-allocation has been effectively reduced
> from 2x to 1.5x (which is about what it once was, I believe). This means
> more frequent resizings and copyings, which means slower operation for
> most use cases. Adding extra usually wasted space is also a nuisance.
>

It looks like most of the complication would be in list_resize.

I'm gonna oversimplify a bit, but tell me if this is the gist of it.

I would have ob_item continue to always refer to first element of the
list, and then I'd have to introduce another variable to refer to the
start of our allocated memory, ob_start_memory, whenever you do a
realloc/free/malloc.  I'd have a notion of fore_wastage, which would
either be a variable I maintain or something that I just calculate as
needed from the difference of ob_item and ob_start_memory.

In deciding whether I want to give memory back to the memory manager,
I just need to adjust my calculations to account for fore and aft
wastage to see if it's time to do a shrink, and before shrinking, I do
the memmove.

On growth, I would just always do a memmove right away if there is
fore_wastage, and then do the normal procedure for aft wastage.

For the most common scenario of append, append, append, the only
penalty is having to skip over fore_wastage logic by checking for
fore_wastage == 0 or ob_item == ob_start_memory.

For the scenario of several appends followed by several pops, I get
the big win of only doing log 2 N memmoves instead of N as I shrink
the list down to zero.

If I start alternating between pops and appends, it's basically a
wash...instead of doing the memmove on the pop, I do it on the next
append.

If I were to pop the top element and then prepend a new element, to be
truly efficient, I'd want to use reserved right away, but for
simplicity, I would probably not complicate list_ass_slice and just do
the normal resize() dance, which would give me memmove in one
direction followed immediately by a memmove in the other direction
when I come back to list_ass_slice.  (But it would all still be a
wash, since I would have to do the same number of memmoves in the
current implementation.)

A lot of the essential complexity here seems to come from the fact
that realloc() isn't a very robust abstraction.  It seems to be
expensive to tell it you want to shrink, and it also does not have an
interface to tell it to give you a little growing room.  On the other
hand, the code within list_resize() actually provides a nice framework
for amortizing memmoves exponentially.



More information about the Python-list mailing list