Memory error while saving dictionary of size 65000X50 using pickle

"Martin v. Löwis" martin at v.loewis.de
Mon Jul 7 18:55:16 EDT 2008


> I didn't have the problem with dumping as a string. When I tried to
> save this object to a file, memory error pops up.

That's not what the backtrace says. The backtrace says that the error
occurs inside pickle.dumps() (and it is consistent with the functions
being called, so it's plausible).

> I am sorry for the mention of size for a dictionary. What I meant by
> 65000X50 is that it has 65000 keys and each key has a list of 50
> tuples.
[...]
> 
> Your example works just fine on my side.

I can get the program

import pickle

d = {}

for i in xrange(65000):
    d[i]=[(x,) for x in range(50)]
print "Starting dump"
s = pickle.dumps(d)

to complete successfully as well; however, it consumes a lot
of memory. I can reduce memory usage slightly by
a) dumping directly to a file, and
b) using cPickle instead of pickle,
i.e.

import cPickle as pickle

d = {}

for i in xrange(65000):
    d[i]=[(x,) for x in range(50)]
print "Starting dump"
pickle.dump(d,open("/tmp/t.pickle","wb"))

Most of the memory goes to the memo that the pickler keeps of every
object it has visited, which it needs in order to detect shared
references. If you are certain that no object sharing occurs in your
graph, you can do

import cPickle as pickle

d = {}

for i in xrange(65000):
    d[i]=[(x,) for x in range(50)]
print "Starting dump"
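# Fast mode switches off the memo: shared objects are written once per
# reference, and self-referential objects cannot be pickled this way.
p = pickle.Pickler(open("/tmp/t.pickle","wb"))
p.fast = True
p.dump(d)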

With that, I see no additional memory usage, and pickling completes
really fast.
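
To illustrate what fast mode gives up, here is a minimal sketch. It is
not part of the program above: the names pickle_bytes, shared and graph
are made up for the illustration, and it uses the plain pickle module
with an in-memory buffer so that it should run unchanged on Python 2.6+
and Python 3. It pickles a small dictionary whose two values are the
same list object, once normally and once with fast set, and checks
whether the sharing survives a round trip.

```python
import io
import pickle


def pickle_bytes(obj, fast=False):
    """Pickle obj into an in-memory buffer, optionally in fast mode."""
    buf = io.BytesIO()
    pickler = pickle.Pickler(buf, 2)  # protocol 2 is available on 2.x and 3.x
    pickler.fast = fast               # a true value switches off the memo
    pickler.dump(obj)
    return buf.getvalue()


shared = [1, 2, 3]
graph = {"a": shared, "b": shared}    # both keys refer to the same list

normal_copy = pickle.loads(pickle_bytes(graph))
fast_copy = pickle.loads(pickle_bytes(graph, fast=True))

# With the memo, the list is written once and referenced the second time.
normal_shared = normal_copy["a"] is normal_copy["b"]
# Without the memo, the list is written twice and comes back as two lists.
fast_shared = fast_copy["a"] is fast_copy["b"]

print((normal_shared, fast_shared))   # prints (True, False)
```

Both copies still compare equal to the original dictionary; only object
identity is lost. That is why fast mode is only appropriate when the
graph contains no sharing that matters, as is the case for the
dictionary of lists of tuples above.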

Regards,
Martin


