[Python-Dev] collections module
Tim Peters
tim.one at comcast.net
Sun Jan 11 13:30:28 EST 2004
[Kurt B. Kaiser]
>> I am talking about a modification to listobject.c, also.
[Martin v. Loewis]
> So you suggest that the list type become circular???
That's my understanding, but already belabored at length <wink>.
> That will cause a major ABI incompatibility, as all modules
> that use PyList_GET_SIZE need to be recompiled.
Well, the macros aren't supposed to be used outside the core -- external
extensions are supposed to be using the representation-hiding PyList_Size,
PyList_GetItem, PyList_SetItem functions.
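For extension code, the difference in sketch form (process_item is a made-up
callback; Py_ssize_t is the modern index type -- in 2004 these functions took
plain int):

    #include <Python.h>

    /* Walk a list through the representation-hiding functions only:
       this keeps working across layout changes because it never
       touches ob_size or ob_item directly. */
    static int
    walk_list(PyObject *list, int (*process_item)(PyObject *))
    {
        Py_ssize_t i, n = PyList_Size(list);  /* function, not macro */
        if (n < 0)
            return -1;                /* not a list; TypeError is set */
        for (i = 0; i < n; i++) {
            PyObject *item = PyList_GetItem(list, i);  /* borrowed ref */
            if (item == NULL || process_item(item) < 0)
                return -1;
        }
        return 0;
    }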
If we keep it contiguous, users of the macros may need to be recompiled
anyway, since the layout of the list header would change.
> In addition, it causes a slow-down for the standard use of lists.
That's one of my two major objections. The other is the enormous number of
code changes that would be required throughout listobject.c -- all "the
usual" C idioms for mucking with vectors would no longer work correctly
as-is, whether it's the typical "for (p = list->ob_item; p < fence; ++p)"
loop, or use of library functions like memmove and memcpy. That's a ton of
additional complication, with obvious bad implications for transparency,
stability, maintainability, code bloat, and efficiency.
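To make the contrast concrete, a sketch (the struct and field names only
loosely mimic listobject.c; head and allocated are assumed fields a circular
list would need):

    #include <stddef.h>

    typedef struct {
        void  **ob_item;    /* element vector */
        size_t  ob_size;    /* live elements */
        size_t  head;       /* first live slot (circular only) */
        size_t  allocated;  /* total slots (circular only) */
    } ListSketch;

    /* Today: the usual flat pointer scan -- memmove/memcpy-friendly. */
    static void
    walk_contiguous(ListSketch *self, void (*visit)(void *))
    {
        void **p, **fence = self->ob_item + self->ob_size;
        for (p = self->ob_item; p < fence; ++p)
            visit(*p);
    }

    /* Circular: every access needs a wrapping index computation, so
       flat pointer scans and single memmove/memcpy calls over
       [ob_item, ob_item + ob_size) stop being valid. */
    static void
    walk_circular(ListSketch *self, void (*visit)(void *))
    {
        size_t i;
        for (i = 0; i < self->ob_size; ++i)
            visit(self->ob_item[(self->head + i) % self->allocated]);
    }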
> So you would make the general case slower, just to support a
> special case. Why are you proposing that?
Kurt's worried about the expense of memmove(), but not about the expense of
integer mod. To be fair(er <wink>), I'm not convinced that anyone has
dreamt up *detailed* suggestions sufficient to allow common queue access
patterns to be reasonably zippy. Think about this one:
    q = [incoming() for i in range(1000)]
    while True:
        q.append(incoming())
        consume(q.pop(0))
That's just a queue in steady-state, with a backlog of 1000 but thereafter
with balanced entry and exit rates.
We currently realloc() q on every append(), rounding up its size, where "its
size" is taken from ob_size. That would no longer be *correct*: ob_size
stays steady at 1000, but append needs ever-larger indices relative to the
start of the allocated memory. IOW, while ob_size bears a useful
relationship to the amount of memory we actually allocate today, that
relationship would no longer hold (ob_size would become an arbitrarily poor
lower bound).
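In sketch form (made-up names; the real listobject.c differs in detail),
today's append amounts to:

    #include <stdlib.h>

    typedef struct {
        void **ob_item;   /* element vector */
        int    ob_size;   /* used slots */
    } FakeList;           /* stand-in, not the real PyListObject */

    /* Stand-in for the real rounding: over-allocate in chunks so most
       appends don't actually move memory. */
    static int
    round_up(int n)
    {
        return (n + (n >> 3) + 8) & ~7;
    }

    static int
    append_today(FakeList *self, void *v)
    {
        /* Capacity is computed from ob_size alone: fine while the live
           data always starts at ob_item[0], but an arbitrarily poor
           lower bound once a queue's live window drifts rightward. */
        void **items = realloc(self->ob_item,
                               round_up(self->ob_size + 1) * sizeof(void *));
        if (items == NULL)
            return -1;
        self->ob_item = items;
        self->ob_item[self->ob_size++] = v;
        return 0;
    }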
So what are we going to do instead, exactly? We actually allocate enough
room for 1024 elements now (although Python doesn't currently remember
that). If we start to remember when we're out of room, and memmove() the
whole blob left 24 slots when append hits the end of available room, then
every 24 (+-1 -- doesn't matter at this level) iterations of the loop, we
end up memmove'ing 1000 pointers. If the steady-state backlog were, say,
1023 elements instead, we'd end up memmove'ing 1023 pointers on *every* loop
iteration, and each append becomes a linear-time operation.
OTOH, if we "round up the size" based on the number of allocated (not used)
slots, then we get back amortized O(1) time per append -- but the amount of
allocated memory needed by the example grows without bound, even though the
number of used slots remains constant.
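Concretely, the two pure strategies might look like this (all names are
illustrative, not CPython code):

    #include <stdlib.h>
    #include <string.h>

    typedef struct {
        void  **ob_item;     /* element vector */
        size_t  head;        /* index of first live element */
        size_t  ob_size;     /* number of live elements */
        size_t  allocated;   /* total slots */
    } QueueList;

    static void *
    pop_front(QueueList *self)
    {
        void *v = self->ob_item[self->head++];  /* O(1): advance head */
        self->ob_size--;
        return v;
    }

    /* Strategy 1: when append hits the right edge, memmove everything
       back to slot 0.  With allocated=1024 and a steady backlog of
       1023, that shifts 1023 pointers on essentially every append:
       linear time per operation. */
    static int
    append_shift(QueueList *self, void *v)
    {
        if (self->head + self->ob_size == self->allocated) {
            if (self->ob_size == self->allocated)
                return -1;                      /* genuinely full */
            memmove(self->ob_item, self->ob_item + self->head,
                    self->ob_size * sizeof(void *));
            self->head = 0;
        }
        self->ob_item[self->head + self->ob_size++] = v;
        return 0;
    }

    /* Strategy 2: when append hits the right edge, grow the block,
       sizing from 'allocated' rather than ob_size.  Appends become
       amortized O(1) again, but in the steady-state queue above
       'allocated' keeps climbing while ob_size stays put. */
    static int
    append_grow(QueueList *self, void *v)
    {
        if (self->head + self->ob_size == self->allocated) {
            size_t newalloc = self->allocated + (self->allocated >> 3) + 8;
            void **items = realloc(self->ob_item,
                                   newalloc * sizeof(void *));
            if (items == NULL)
                return -1;
            self->ob_item = items;
            self->allocated = newalloc;  /* grows without bound here */
        }
        self->ob_item[self->head + self->ob_size++] = v;
        return 0;
    }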
No "pure" strategy I've seen seems good enough. A circular implementation
has no problem with the example, or, indeed, with any queue access pattern,
although a circular implementation would be reluctant to reduce the amount
of allocated space when the number of used slots decreases (since the unused
slots are "a hole in the middle" of the allocated slots, and we can't give
back a hole to the system malloc).
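A sketch of the circular alternative (same illustrative layout as above):
indices wrap modulo the allocation size, so neither end ever has to move
elements:

    #include <stddef.h>

    typedef struct {
        void  **ob_item;
        size_t  head;        /* first live slot */
        size_t  ob_size;     /* live elements */
        size_t  allocated;   /* total slots */
    } RingList;

    static int
    ring_append(RingList *self, void *v)
    {
        if (self->ob_size == self->allocated)
            return -1;  /* full; a real version would grow, which means
                           copying to "un-wrap" the ring */
        self->ob_item[(self->head + self->ob_size++) % self->allocated] = v;
        return 0;
    }

    static void *
    ring_popleft(RingList *self)
    {
        /* Shrinking is the awkward part: once the live window wraps,
           the free slots sit in the middle of the block, and realloc()
           can only trim the tail -- which holds live data. */
        void *v = self->ob_item[self->head];
        self->head = (self->head + 1) % self->allocated;
        self->ob_size--;
        return v;
    }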