[Python-Dev] Re: Re: collections module

Terry Reedy tjreedy at udel.edu
Sun Jan 11 23:21:30 EST 2004


"Tim Peters" <tim.one at comcast.net> wrote in message
news:LNBBLJKPBEHFEDALKOLCKEOHIEAB.tim.one at comcast.net...
> [Terry Reedy]
>>  So I do not think trying to make
> > prepends O(1) should be considered further.
>
> I expect that if queue-like behavior *can* be supported efficiently using
an
> always-contiguous vector, efficient dequeue-like behavior will fall out
for
> free.

Efficient inserts at 0 would seem to me to require space at the front which
is usually wasted.   But if you can avoid this, perhaps by only adding
front space when prepends are attempted, great.


> > My possibly naive thought is that before reallocating, look at the
> > ratio of left free to total.
>
> I don't know what that means (neither the "left free" part nor what
exactly
> "total" is counting).  The devil is in the details here.

Whoops, I mixed Python and C viewpoints.  By total, I meant the C size of
the pointer array.  By 'free', I meant the popped pointers 'below' the
current 0 element, which from a C view, are not free but merely
overwriteable.

> > If memmove would result in as much, or perhaps nearly as much,
> > over-allocation as realloc,
>
> memmove() leaves the total number of unused slots unchanged.  realloc can
do
> that too, or increase it by any amount, or decrease it by any amount.   I
> just don't know what you're trying to say ... explain how it works, step
by
> step, in the
>
>     q = [incoming() for i in range(1000)]

Lets assume that C array is 1024 at this point.

>     while True:
>         q.append(incoming())

At 25th append, C array is full and there are 24 passe pointers.  24/1024
is small, so ignore them and expand as currently.  Assume array is doubled
to 2048 (since I don't know current actual value) and no memmove.

>         consume(q.pop(0))

After additional 1024 appends, array appears full again.  However, there
are 1048 unneeded pointers to 'left' of 0 position and 1048//2048 > .5 >=
threshhold, so memmove top 1000 back to beginning of array which stays at
size 2048.  Every 1048 appends thereafter, repeat memmove in array that
never again gets resized.  In the long run, the average pointer moves per
push/pop is less that 1.

> > then just do that.  Certainly if the ratio is small (like say .1),
> > don't bother with memmove.  The exact cutoff is a tuning matter.

If expansion is currently by fraction f, then after expansion, f/(1+f) of
slots are free at end of array.  Let threshhold t be this fraction or
perhaps a bit less.  With proposed l[0] popping, suppose that at append
fault, p of s slots are below l[0] pointer and that p/s >= t.  Then do
memmove instead of expansion.  Rationale: after memmove, there will be as
large (or almost as large) a fraction of slots available at end of list for
append as there now are.

I hope this is somewhat clearer.

Terry J. Reedy






More information about the Python-Dev mailing list