Suitability for long-running text processing?

Felipe Almeida Lessa felipe.lessa at gmail.com
Mon Jan 8 11:35:21 EST 2007


On 1/8/07, tsuraan <tsuraan at gmail.com> wrote:
>
>
> > I just tried on my system
> >
> > (Python is using 2.9 MiB)
> > >>> a = ['a' * (1 << 20) for i in xrange(300)]
> > (Python is using 304.1 MiB)
> > >>> del a
> > (Python is using 2.9 MiB -- as before)
> >
> > And I didn't even need to tell the garbage collector to do its job. Some
> > info:
>
> It looks like the big difference between our two programs is that you have
> one huge string repeated 300 times, whereas I have thousands of
> four-character strings.  Are small strings ever collected by Python?

My test creates 300 separate strings of 1 MiB each, not one huge string
repeated. Still, here is the same kind of test with small strings:

$ python
Python 2.4.4c1 (#2, Oct 11 2006, 21:51:02)
[GCC 4.1.2 20060928 (prerelease) (Ubuntu 4.1.1-13ubuntu5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> # Python is using 2.7 MiB
... a = ['1234' for i in xrange(10 << 20)]
>>> # Python is using 42.9 MiB
... del a
>>> # Python is using 2.9 MiB

Even with a list of 10,485,760 four-character string entries, memory use
drops back to 2.9 MiB after the del, close to the 2.7 MiB it started at.
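
One caveat about the small-string test: in CPython the literal '1234' is a
single constant, so that list may well hold millions of references to one
string object rather than millions of separate strings. A minimal sketch to
check this, assuming CPython 3 (the session above is Python 2.4, where xrange
would replace range); count_distinct_objects is a hypothetical helper, not
something from the thread:

```python
# Sketch only: assumes CPython 3. Names here are illustrative.

def count_distinct_objects(items):
    """Return how many distinct objects (by identity) the list refers to."""
    return len({id(x) for x in items})

n = 100000  # far smaller than 10 << 20, to keep the check quick

# A string literal is one constant: the list gets n references to it.
same = ['1234' for i in range(n)]

# Strings built at run time are separate objects, even when equal in value.
distinct = ['%04d' % (i % 10000) for i in range(n)]

# The 1 MiB test from earlier in the thread, with 3 items instead of 300.
big = ['a' * (1 << 20) for i in range(3)]

shared_count = count_distinct_objects(same)        # expected on CPython: 1
distinct_count = count_distinct_objects(distinct)  # expected on CPython: n
big_count = count_distinct_objects(big)            # expected on CPython: 3

print(shared_count, distinct_count, big_count)
```

If the counts come out as the comments suggest, a fairer small-string test
would build the strings at run time, as the `distinct` list does, before
comparing memory use around the del.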

-- 
Felipe.


