[Python-Dev] collections module

Josiah Carlson jcarlson at uci.edu
Sat Jan 10 14:06:41 EST 2004

```> If the whole problem is the performance of pop(0) operations when the
> queues grow large, why not implement a queue as a list of lists
> (sublists), where all the magic is selecting an appropriate maximum
> size for the sublists?  Wouldn't this effectively simulate the
> "extra space at the left side of the list" algorithm without touching
> the C list implementation?

*pulls out algorithm analysis hat*
If your maximum sublist size is k, then any pop is at least O(k).
If you have a total of n items in the queue, then every k items incurs
an O(n/k) penalty for the outer list shift, or O(n/k^2) per item, or
overall O(k + n/k^2) per pop for a total queue size of n and a maximum
sublist size of k.

for k > n^(1/3), this is O(k) per pop operation
for k < n^(1/3), this is O(n/k^2) per pop operation, which is always
greater than O(k); the total is minimized at k = n^(1/3), giving
O(n^(1/3)) per pop

As k approaches 1, pops are horrible, but as long as k is greater than
the cube root of n, the operations are reasonable.  If one knows the
maximal size of the queue beforehand, setting up a properly sized k
would result in reasonable behavior using lists of lists.

However, Knuth's method, a dictionary FIFO, and [val, [val, []]] cons
cells all give O(1) puts and gets on either end (though the
[val, [val, []]] form would require a previous pointer for that).

> Another idea, for a single-list solution where puts and gets mostly
> arrive in long runs, perhaps this would be fast:
>
> An attribute of the instance stores the last operation (put or get)
> If the current operation is not the same as the last, first reverse the
> underlying list, then do append() or pop().
>
> But that is not practical for very large queues.

It would be perfectly reasonable for long runs of puts followed by
long runs of gets, but I don't know how common such an access pattern
is.

- Josiah

```
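The list-of-lists queue analyzed above can be sketched as follows. This is a minimal illustration with invented names, not code from the thread; `chunk` plays the role of k in the analysis:

```python
class ChunkedQueue:
    """FIFO stored as an outer list of bounded sublists.

    A pop(0) only ever shifts one sublist (O(k)); the O(n/k) outer-list
    shift happens once per k pops, for O(k + n/k^2) amortized per pop.
    """

    def __init__(self, chunk=64):
        self.chunk = chunk   # maximum sublist size, the k above
        self.chunks = []     # outer list of sublists

    def put(self, item):
        # Append to the last sublist, starting a new one every k items.
        if not self.chunks or len(self.chunks[-1]) >= self.chunk:
            self.chunks.append([])
        self.chunks[-1].append(item)

    def get(self):
        # Pop from the front sublist: at most an O(k) shift.
        front = self.chunks[0]
        item = front.pop(0)
        if not front:
            # Every k gets, shift the O(n/k) outer list once.
            self.chunks.pop(0)
        return item
```

Picking `chunk` near the cube root of the expected queue size matches the O(n^(1/3))-per-pop sweet spot described above.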
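One reading of the "dictionary FIFO" mentioned above is a dict keyed by a running counter. This is my sketch of the idea, not necessarily the exact structure the post had in mind; every put and get is O(1):

```python
class DictFifo:
    """FIFO backed by a dict with head/tail counters; O(1) put and get."""

    def __init__(self):
        self.data = {}
        self.head = 0   # key of the next item to get
        self.tail = 0   # key for the next item put

    def put(self, item):
        self.data[self.tail] = item
        self.tail += 1

    def get(self):
        # KeyError on an empty queue, like dict.pop on a missing key.
        item = self.data.pop(self.head)
        self.head += 1
        return item

    def __len__(self):
        return self.tail - self.head
```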
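The reverse-on-direction-switch idea quoted in the message can be sketched like this (again an illustration with invented names). Each reversal is O(n), but it only happens when the operation direction changes, so long runs of puts followed by long runs of gets amortize to O(1) per operation:

```python
class FlipQueue:
    """Single-list FIFO that reverses only when put/get direction flips."""

    def __init__(self):
        self.items = []
        self.mode = 'put'   # which operation the list is ordered for

    def put(self, item):
        if self.mode != 'put':
            self.items.reverse()   # flip back to append order
            self.mode = 'put'
        self.items.append(item)

    def get(self):
        if self.mode != 'get':
            self.items.reverse()   # newest-first, so pop() takes the oldest
            self.mode = 'get'
        return self.items.pop()
```

As the thread notes, this degrades badly when puts and gets alternate on a large queue: every operation then pays the full O(n) reversal.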