[Patches] [ python-Patches-1022910 ] Conserve memory with list.pop()
SourceForge.net
noreply at sourceforge.net
Sun Sep 12 21:57:45 CEST 2004
Patches item #1022910, was opened at 2004-09-06 02:16
Message generated for change (Comment added) made by rhettinger
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=1022910&group_id=5470
Category: Core (C code)
Group: Python 2.4
>Status: Closed
Resolution: None
Priority: 6
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Raymond Hettinger (rhettinger)
Summary: Conserve memory with list.pop()
Initial Comment:
The current list resizing scheme only downsizes when
more than 16 elements are removed in a single step:
del a[100:120].
When popping elements off of a list one at a time, the
current resizing approach never shrinks.
This patch makes it shrink whenever more than half of
the space is unused.
There may be a better approach. This one is simple and
avoids thrashing in stack applications that hover up
and down around a nearly steady state.
----------------------------------------------------------------------
>Comment By: Raymond Hettinger (rhettinger)
Date: 2004-09-12 14:57
Message:
Logged In: YES
user_id=80475
Revised and applies as Objects/listobject.c 2.223
* Changed the var name as requested. Great idea.
* Changed the second "if" to only check for zero (so that
lists can shrink all the way down). As requested, this
change leaves non-zero length lists in an over-allocated state.
* Restated the overallocation formula so that it depends
only on the requested size instead of both requested and
previous size. Doesn't change the growth schedule but does
make the algorithm easier to analyze.
----------------------------------------------------------------------
Comment By: Tim Peters (tim_one)
Date: 2004-09-11 23:28
Message:
Logged In: YES
user_id=31435
+1 on the idea, and maybe <wink> on the code. Reading this
routine always gave me a headache, and I finally figured out
why: _new_size should be name new_allocated. It's the
new value of self->allocated, after all, and has much less to
do with newsize (which _new_size looks most like) or ob_size.
If I rename the code like that, I find it much easier to follow.
Then the tail end of the patched code looks like
new_allocated = (newsize >> 3) + (self->ob_size < 8 ? 3 : 6)
+ newsize;
if (newsize < (allocated >> 1))
new_allocated = newsize;
Then I ask "why?" about the last "if" clause. That is, why
change new_allocated away from the value it was given on
the preceding line? IOW, if that was a good value for
new_allocated when the list was growing, why not also when
the list is shrinking? If we reset new_allocated to *exactly*
the smaller requested new size, then if they add an element
next time we have to endure a realloc right away again. I
think it's good to leave some room for growth. If it turns out
the list isn't yo-yo'ing, but going straight down to 0, that's
fine, we'll just shrink it at a slightly slower pace then.
For example, suppose the list has room for 500 slots,
currently holds 250, and the last element is popped.
Unconditionally accepting the first new_allocated computed
would leave it with 249 + 3 + (249 >> 3) = 283 allocated
slots instead of with 249. The relative difference is small, but
the former is friendlier if the next operation is an append, and
is faster to compute (skips the "if").
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=1022910&group_id=5470
More information about the Patches
mailing list