memory use with regard to large pickle files

"Martin v. Löwis" martin at v.loewis.de
Wed Oct 15 18:44:04 EDT 2008


> The program works fine, but the memory load is huge.  The size of
> the pickle file on disk is about 900 Meg so I would theoretically
> expect my program to consume about twice that (the dictionary
> contained in the pickle file plus its repackaging into other formats),
> but instead my program needs almost 5 Gig of memory to run.
> Am I being unrealistic in my memory expectations?

I would say so, yes. Since you are using 5 GiB of memory, you are
presumably running a 64-bit system.

On such a system, each pointer takes 8 bytes. In addition,
each object takes at least 16 bytes (a reference count plus a
pointer to its type); if it's variable-sized, it takes at least
24 bytes, plus the actual data in the object.
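
To see these per-object costs on your own interpreter, a minimal sketch
along the following lines may help. The sample values are arbitrary, and
the exact numbers depend on the Python version and build, so treat the
output as indicative only.

```python
import struct
import sys

# Pointer width of the running interpreter: 8 bytes on a 64-bit build.
pointer_size = struct.calcsize("P")
print("pointer size:", pointer_size, "bytes")

# sys.getsizeof (Python 2.6+) reports the object's own footprint only,
# not the objects it refers to.
samples = [1, 1.5, "", "abc", (), [], {}]
sizes = {}
for obj in samples:
    sizes[repr(obj)] = sys.getsizeof(obj)
    print("%-6r %3d bytes" % (obj, sizes[repr(obj)]))
```

Even an empty string or a small integer costs a few dozen bytes, which is
why a dictionary holding many small objects grows so quickly.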

OTOH, in a pickle, a pointer takes no space, unless it's a
shared pointer (i.e. a backward reference), which takes
as many digits as you need to encode the "object number"
in the pickle. Each primitive object takes only a single byte
of overhead (as opposed to 24), which reduces the space
quite drastically. Of course, non-primitive objects take more, as
they need to encode the class they are instances of.
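
The difference can be measured roughly by pickling a structure and
comparing the result with an estimate of its in-memory size. The sketch
below uses made-up sample data (a dictionary of integers and floats) and
the default pickle protocol; the ratio you get will vary with the data,
the protocol, and the Python version.

```python
import pickle
import sys

# Hypothetical sample data: 10,000 distinct int keys mapped to distinct floats.
data = {i: float(i) for i in range(100000, 110000)}

pickled_bytes = len(pickle.dumps(data))

# Rough in-memory estimate: the dict itself plus every key and value object.
in_memory_bytes = sys.getsizeof(data) + sum(
    sys.getsizeof(k) + sys.getsizeof(v) for k, v in data.items()
)

print("pickled:  ", pickled_bytes, "bytes")
print("in memory:", in_memory_bytes, "bytes (approximate)")
print("ratio:     %.1fx" % (in_memory_bytes / float(pickled_bytes)))
```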

> Is there a way to see how much memory is being consumed
> by a single data structure or variable?  How can I go about
> debugging this problem?

In Python 2.6, there is the sys.getsizeof function. Note that it
reports the size of a single object only, not of the objects it
refers to. For earlier versions, the asizeof package gives similar
results.
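
Because sys.getsizeof counts only the object it is given, measuring a
whole data structure means walking it yourself. Here is a minimal sketch
of such a walk. The helper name total_size is hypothetical, it handles
only the built-in containers, and it is not how asizeof is implemented.

```python
import sys

def total_size(obj, seen=None):
    """Approximate the memory held by obj and everything it references.

    Only dict, list, tuple, set and frozenset are traversed; instances
    of other classes are counted without their attributes.
    """
    if seen is None:
        seen = set()
    if id(obj) in seen:
        # Shared objects are counted once, on first encounter.
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        for key, value in obj.items():
            size += total_size(key, seen) + total_size(value, seen)
    elif isinstance(obj, (list, tuple, set, frozenset)):
        for item in obj:
            size += total_size(item, seen)
    return size

# Usage: compare the shallow and the deep size of a small nested structure.
example = {"names": ["alpha", "beta"], "values": (1.5, 2.5, 3.5)}
print("shallow:", sys.getsizeof(example), "bytes")
print("deep:   ", total_size(example), "bytes")
```

The walk is recursive, so a very deeply nested structure could hit the
recursion limit; for a flat dictionary of the kind described it should
be fine.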

Regards,
Martin


