[Python-Dev] Unicode objects more space efficient than plain strings? can that be?

Tim Peters tim.one@comcast.net
Thu, 02 May 2002 16:29:33 -0400


[Tim]
> ...
> I'm inclined to change the pymalloc realloc to copy a shrinking block
> if at least 25% of the input block would go away, else leave it alone.
> In this specific case, something like 90% of the input block could be
> reclaimed.

So I did that.  In the test program

"""
if 1:
    L = [u"abc%d"%i for i in xrange(1000000)]
else:
    L = ["abc%d"%i for i in xrange(1000000)]

raw_input('L is built')

del L
raw_input('L is deleted')
"""

the process size grew to 46MB using regular strings, and to 69MB using
Unicode strings.  The second time the test program pauses shows another
difference:  pymalloc currently never gives small-block memory back to the
system free(), so the process size only shrank by 4MB (for the list guts)
after deleting L when using regular strings.  The process size shrank by
42MB when using Unicode strings.
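
For illustration only, the 25% rule quoted at the top can be sketched as a
small decision function.  This is not pymalloc's actual C code; the name
should_copy_on_shrink and its arguments are hypothetical, and sizes are
assumed to be byte counts.

```python
def should_copy_on_shrink(old_size, new_size):
    """Model of the quoted rule: copy a shrinking block into a smaller one
    only if at least 25% of the input block would go away.

    Hypothetical helper for illustration; not taken from pymalloc.
    """
    if new_size >= old_size:
        # Not a shrinking request, so the rule does not apply.
        return False
    # (old_size - new_size) / old_size >= 0.25, rewritten with integer
    # arithmetic as 4 * new_size <= 3 * old_size.
    return 4 * new_size <= 3 * old_size


if __name__ == "__main__":
    print(should_copy_on_shrink(100, 10))   # about 90% reclaimed
    print(should_copy_on_shrink(100, 80))   # only 20% reclaimed
```

Under this sketch, a request that frees about 90% of the block (as in the
case above) copies into a smaller block, while one that frees only 20%
leaves the block alone.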