[Tutor] python list, right! but concretely?

spir ☣ denis.spir at gmail.com
Sun May 2 19:17:18 CEST 2010


On Mon, 3 May 2010 00:50:40 +1000
Steven D'Aprano <steve at pearwood.info> wrote:

> On Sun, 2 May 2010 07:44:22 pm Lie Ryan wrote:
> 
> > Python's 'list' is an array of pointers to `PyObject` ('object' in
> > Python) and the resizing algorithm keeps the list size such that
> > "allocated / 2 <= actual <= allocated". When list need to resize, it
> > overallocates the list slightly over 1.125 times than the requested
> > size "(newsize >> 3) + (newsize < 9 ? 3 : 6) + newsize".
> >
> > If you're interested in more precise details (since I do omit some,
> > especially those that seems to be micro-optimization-related), you
> > need to read the source at
> > http://code.python.org/hg/trunk/file/e9d930f8b8ff/Objects/listobject.c
> 
> Thanks for the correction Lie. However, I don't understand what is going 
> on in the code. I'm not a C programmer, so I'm struggling to read it, 
> but as far as I can tell the comment in the code and the code itself 
> are not the same.
> 
> The code calculates an amount to over-allocate the list:
> 
> new_allocated = (newsize >> 3) + (newsize < 9 ? 3 : 6);
> 
> Some experiments in Python shows that the right hand expression varies 
> like this:
> 
> >>> for i in range(30):
> ...     print ((i >> 3) + (3 if i < 9 else 6)),
> ...
> 3 3 3 3 3 3 3 3 4 7 7 7 7 7 7 7 8 8 8 8 8 8 8 8 9 9 9 9 9 9

Right, I don't understand neither. Since i>>8 <=> i/8, the above code is finally a (fast, because of bit-level op) version of:

    if size < 8:
	return 3
    else:
	return size/8 + (3 if size < 9 else 6),

> But the comment in the code says:
> 
> /* This over-allocates proportional to the list size, making room
> * for additional growth.  The over-allocation is mild, but is
> * enough to give linear-time amortized behavior over a long
> * sequence of appends() in the presence of a poorly-performing
> * system realloc().
> * The growth pattern is:  0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ...
> */
> 
> which doesn't look anything like the numbers above. I don't understand 
> what this growth pattern refers to. What have I misunderstood?

I first thought "growth pattern" is to be interpreted as "how the array size grows when it is progressively fed with new items", like when feeding a list in a loop.
But I tried to simulate this, and the size sticks at 3. finally, I think what is called in source "new_allocated" is not the new array memory size, but the _difference_. When writing "size = size + new_allocated" instead of "size = new_allocated", It get a growth pattern of:
    0 3 6 9 16 24 33 43 54 66 80 96 114
Which is not exactly what is stated in code, but rather similar...

def grow():
    size = 0 ; count = 0
    for count in range(100):
        if count > size:
            size = size   +   (size >> 3) + (3 if size < 9 else 6)
            print size,

Denis
________________________________

vit esse estrany ☣

spir.wikidot.com


More information about the Tutor mailing list