[Python-Dev] collections module
tim.one at comcast.net
Sat Jan 10 15:49:26 EST 2004
[Kurt B. Kaiser]
> Aren't array-based implementations of queue usually based on
> circular buffers?
Yes, but those are generally bounded queues.
> Head and tail pointers plus a count are sufficient. Then a push is:
> check count (resize if necessary), write, adjust head pointer, adjust
By analogy to a real-life queue (of, say, people), pushing is usually viewed
as adding to the tail (when you get in line for movie tickets, you're likely
to get punched if you add yourself to the head of the queue). I have to
insist on using that terminology, just because it ties my head in knots to
swap the usual meanings.
> A pull is similar, using the tail pointer. Constant time.
So is the speculative Python list gimmick we're talking about, until it runs
out of room. We use mildly exponential overallocation when growing a list
to ensure amortized O(1) behavior regardless.
> The resize is a little tricky because the gap between the tail and
> head pointers is what needs to grow.
Resizing is easier with the speculative Python list gimmick, and 0-based
indexing (where list is the current head) is identical to the
list-indexing implementation Python already has. The latter is important
because it's fast. Needing to "wrap around" an index to account for
circularity would introduce new test+branch expense into indexing.
> The current implementation of Queue (with what appears to me to be a
> listpop(v, 0) for get() ) involves an internal shift of the whole
> pointer vector, O(n), for every read of the queue. The Cookbook
> examples mentioned alleviate this but still with excess copying or
> unbounded memory usage. IMHO it's better done in listobject.
Queue.py and the speculative Python list gimmick are different issues. The
overhead of mucking with locks, and Python-level method calls, in Queue.py
surely overwhelms whatever savings you could get over the two-list queue
implementation Josiah put in his Cookbook comment. The only "extra" expense
there over a straight push-and-pop-both-at-the-right-end Python list is that
each element added to the queue is involved in (no more than) one C-level
pointer swap (due to the reverse() call, when the input list is turned into
the output list). If the speculative Python list gimmick were implemented,
Queue.py could, of course, change to use it.
> I'm not up to speed on the reasons why ob_item needs to float
So that indexing a Python list is as fast as it is today, and so that a
mountain of C code continues to work (all C code mucking with Python lists
now believes that a list contains ob_size elements, starting at ob_items and
ending at ob_items+ob_size-1).
> but perhaps there is a straightforward indirect relationship to the
> head pointer.
ob_item would *be* the head pointer (for my meaning for "head"), whenever
the list is non-empty.
> Why not have listobject grow deque functionality using circular buffer
As above, I don't see much to recommend that over the speculative Python
list gimmick: indexing is slower and more complicated, resizing is more
complicated, and all the listobject.c internals would get more complicated
by needing to deal with that the list contents are no longer necessarily in
a contiguous slice of a C vector (from the C POV, the list contents could be
in two disjoint contiguous slices). It would have the advantage that
resizing is never needed until you're wholly out of room, but that seems
minor compared to the drawbacks.
More information about the Python-Dev