Suitability for long-running text processing?
tsuraan
tsuraan at gmail.com
Mon Jan 8 11:55:24 EST 2007
> $ python
> Python 2.4.4c1 (#2, Oct 11 2006, 21:51:02)
> [GCC 4.1.2 20060928 (prerelease) (Ubuntu 4.1.1-13ubuntu5)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> # Python is using 2.7 MiB
> ... a = ['1234' for i in xrange(10 << 20)]
> >>> # Python is using 42.9 MiB
> ... del a
> >>> # Python is using 2.9 MiB
>
> With 10,485,760 strings of 4 chars, it still works as expected.
Have you tried running the code I posted? Is there any explanation for why
it never gets cleaned up?
In your specific example, you have a huge list of references to a single
string: the literal '1234' is a single compile-time constant, so the
comprehension stores the same object ten million times. Try doing
"a[0] is a[10000]". You'll get True. Try "a[0] is '1'+'2'+'3'+'4'". You'll
get False, because that string is built at runtime. When you delete a,
you're getting rid of a huge array of pointers, but you're not actually
losing anything except the one four-byte (plus object overhead) string
'1234'.
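For what it's worth, here's roughly the session I'd expect on 2.4 (a sketch
from memory, not a pasted transcript):

>>> a = ['1234' for i in xrange(10 << 20)]
>>> a[0] is a[10000]             # one shared object, ~10M references
True
>>> b = '1' + '2' + '3' + '4'    # concatenated at runtime, a fresh object
>>> a[0] is b
False
>>> a[0] == b                    # equal by value, distinct identity
True
>>> del a                        # frees the list; the lone '1234' is tiny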
So, does anybody know how to get python to free up _all_ of its allocated
strings?
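The contrasting case, which is closer to my problem, is when the elements
really are distinct objects. A sketch of what I mean (hypothetical, and
based on my understanding of the 2.4 allocator, so take it with a grain of
salt):

>>> a = [str(i) for i in xrange(10 << 20)]   # ~10M distinct strings
>>> a[100] is str(100)   # multi-character strings are built fresh each time
False
>>> del a                # every string's refcount drops to zero here

As far as I know, all of those strings are deallocated at the del, but the
2.4 pymalloc never hands its arenas back to the OS, so the process's
resident size stays high even though Python will reuse that memory
internally. I believe the 2.5 allocator was changed to free empty arenas.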