[BangPypers] How to handle files efficiently in python

Vishal vsapre80 at gmail.com
Thu Mar 24 12:22:00 CET 2011


On Thu, Mar 24, 2011 at 12:07 PM, Senthil Kumaran <orsenthil at gmail.com> wrote:

> On Thu, Mar 24, 2011 at 11:06:14AM +0530, Vishal wrote:
> > setting l[0] to None, un-references the earlier string data associated with
> > that name, which is then (force) collected by the collect() call.
>
> Can you please elaborate this further? I don't see such behavior in
> this snippet.
>
> >>> l
> [10, 20, 30]
> >>> sys.getsizeof(l)
> 104
> >>> l[0] = None
> >>> import gc
> >>> gc.collect()
> 0
> >>> sys.getsizeof(l)
> 104
> >>> l[0] = range(1000)
> >>> sys.getsizeof(l)
> 104
> >>> l[0] = None
> >>> gc.collect()
> 0
> >>> sys.getsizeof(l)
> 104
>
> --
> Senthil

Using sys.getsizeof() here is misleading, especially for sequence objects.
For a container, it reports only the size of the container object itself
(its header plus its array of references), not the size of the objects it
refers to. So it is not a good way to find the size of the actual data.
You should use something like this recipe:
http://code.activestate.com/recipes/546530/ . I saved it as a module named
PyObjSize, which I have used below.

>>> import sys
>>> mb = 1024*1024
>>> s1 = 'a'*mb
>>> s2 = 'b'*mb
>>> s3 = 'c'*mb
>>> sys.getsizeof(s1)
1048600
>>> sys.getsizeof(s2)
1048600
>>> sys.getsizeof(s3)
1048600
>>> l = [s1, s2, s3]
>>> sys.getsizeof(l)
48
>>> import PyObjSize
>>> PyObjSize.asizeof(l)
3145848
>>> l[0] = None
>>> PyObjSize.asizeof(l)
2097256
>>> from gc import collect
>>> collect()
0
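
If you don't have the recipe at hand, the difference between the two
measurements can be sketched roughly as below. This is not the recipe's code;
deep_sizeof is a hypothetical helper I made up for illustration. It only
follows the built-in containers, so treat its numbers as approximate.

```python
import sys

def deep_sizeof(obj, seen=None):
    """Approximate size of obj plus everything it references (hypothetical helper)."""
    if seen is None:
        seen = set()
    if id(obj) in seen:          # count each shared object only once
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)    # shallow size of this object alone
    if isinstance(obj, dict):
        for key, value in obj.items():
            size += deep_sizeof(key, seen) + deep_sizeof(value, seen)
    elif isinstance(obj, (list, tuple, set, frozenset)):
        for item in obj:
            size += deep_sizeof(item, seen)
    return size

mb = 1024 * 1024
data = ['a' * mb, 'b' * mb, 'c' * mb]

shallow = sys.getsizeof(data)      # the list object only: a few dozen bytes
deep_before = deep_sizeof(data)    # the list plus the three 1 MB strings

data[0] = None                     # the list no longer refers to the first string
deep_after = deep_sizeof(data)     # roughly 1 MB less than deep_before
```

The shallow figure stays the same before and after the assignment, which is
why sys.getsizeof(l) never changes in the quoted session.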

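The return value of collect() is the number of "unreachable objects" found,
which, as I understand it, means objects in reference cycles that the
collector detected during this run. A return of 0 only means that no such
cycles were found; objects that are not part of a cycle, like these strings,
are freed by reference counting as soon as their last reference goes away.
The docs also say that some free lists, in particular those for int and
float, may not be fully released.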
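
A small sketch of what the return value of gc.collect() counts, assuming
CPython. The Node class is made up for the example, and the exact count for
the cycle case depends on the interpreter version, so I am not quoting a
number for it.

```python
import gc

class Node(object):
    """Hypothetical class, used only to build a reference cycle."""
    pass

gc.disable()     # keep automatic collection out of the way for the demo
gc.collect()     # start from a clean state

# Case 1: a plain string, no cycle. Reference counting frees it as soon
# as the name is rebound, so the cycle collector finds nothing new.
s = 'a' * (1024 * 1024)
s = None
no_cycle = gc.collect()

# Case 2: two objects that refer to each other. After del, their
# reference counts never reach zero, so only the collector can free them.
a = Node()
b = Node()
a.other = b
b.other = a
del a, b
with_cycle = gc.collect()

gc.enable()
```

Here no_cycle should be smaller than with_cycle, even though the string in
case 1 is far larger than the two Node objects in case 2.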

http://docs.python.org/library/gc.html

To see whether the memory is actually released, you'd have to look at the
process size, using the Windows Task Manager or maybe top on Unix. Using
these tools, I have found that string data gets freed fine.
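
If you would rather check from inside the interpreter, something like the
following might work. It assumes Python 3.4 or later, where tracemalloc is in
the standard library (it does not exist on 2.x). Note that it tracks
allocations made by Python, not the process size the OS reports, so it is
only a proxy for what Task Manager or top would show.

```python
import tracemalloc

mb = 1024 * 1024
tracemalloc.start()

# Build the strings directly inside the list so the list holds the
# only reference to each of them.
l = ['a' * mb, 'b' * mb, 'c' * mb]
before, _peak = tracemalloc.get_traced_memory()

l[0] = None      # last reference to the first string is dropped here
after, _peak = tracemalloc.get_traced_memory()

freed = before - after   # about 1 MB, with no gc.collect() call involved
tracemalloc.stop()
```

In the session further up, s1 still refers to the first string after
l[0] = None, so there the string stays alive until s1 is deleted or rebound
as well.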

-- 
Thanks and best regards,
Vishal Sapre

