[Python-Dev] collections module

Michael Chermside mcherm at mcherm.com
Tue Jan 13 13:37:46 EST 2004


[Guido:]
> My own ideal solution [for queues] would be a subtle hack to the list
> implementation to make pop(0) and insert(0, x) go a lot faster -- this
> can be done by adding one or two extra fields to the list head and
> allowing empty space at the front of the list as well as at the end.

[Tim:]
> That Python's builtin list calls realloc() on every append is pretty gross,
> but it's also a tradeoff, saving the bytes in the list header that would be
> required to remember how many allocated-but-unused slots exist.  If you've
> got a million tiny lists (and some apps do), burning an additional 8MB for
> something your app doesn't really need isn't attractive (on a 32-bit box
> today, the list header is 16 bytes; pymalloc aligns to 8-byte boundaries, so
> the next possible size is 24 bytes).

[Guido:]
> Just in case nobody had thought of it before, couldn't the realloc()
> call be avoided when roundupsize(oldsize) == roundupsize(newsize)???

[Tim:]
> I had thought of it before, but never acted on it <wink>.  roundupsize() has
> gotten more expensive over the years too, and it may vary across boxes
> whether it's cheaper to call that a second time than it is to make the
> system realloc deduce that nothing needs to be done.  roundupsize() is
> certainly cheaper than Microsoft's realloc() in the do-nothing case.
> 
> Note that there's an endcase glitch here when oldsize==0 and newsize==1.
> roundupsize() returns 8 for both, but it doesn't follow that ob_item isn't
> NULL.  realloc(p, n) special-cases the snot out of p==NULL to avoid crashing
> then.
    [...]
> Note in passing that roundupsize() is expensive *because* lists don't
> remember the amount of overallocated space too:  roundupsize() has to
> arrange to deliver the same result for many consecutive inputs

I'm sure this is just pointing out the obvious, but *IF* the list header
were increased, we could add TWO fields... on for allocated_size, and
another for left_side_slots. According to Tim, this would still "only"
bump list objects up to 24 bytes, but would allow:
  [1] O(1) performance for pop(0)
  [2] No need to call realloc() on every append(), just on "true" 
      resizes.

I guess I'm proposing the following (all ideas previously mentioned):
  [1] pop(0) would increment ob_item and left_side_slots.
  [2] insert(0,x) would decrement ob_item and left_side_slots *IF*
      left_side_slots were > 0. Lists should still grow at the
      tail for efficiency, but we can re-use space at the head left
      by a pop(0).
  [3] append(0,x) (and other functions that insert) would compare 
      ob_size and allocated_size (do we need left_side_slots in 
      here too?) to see whether new space is needed at the end. 
      If so, it would EITHER call realloc() for more space, OR
      it would memmove so left_side_slots became 0. I'm still not
      quite sure what rule to use to determine which to do (although
      for MOST lists, left_side_slots would be 0, so there wouldn't
      be a question).

So, to recap advantages and disadvantages: lists would still grow 
just as they do now. Lists used as Queues by calling append() and
pop(0) would be time efficient so long as we avoid calling memmove
for each and every append() (that's why we don't want to ALWAYS
memmove when left_side_slots > 0). Lists on which pop(0) was never
called (that's most of 'em) would have NO degradation of space used,
and IMPROVED speed in cases (like Windows) where a realloc() noop
takes noticable time. And (this seems the most significant drawback)
the size of a list would increase by 8 bytes, even for very short
lists.

I wouldn't want to make the change without trying out the performance
in some "real applications", but it seems like there are some
potentially useful ideas floating around here.

-- Michael Chermside




More information about the Python-Dev mailing list