splitting a large dictionary into smaller ones

John Machin sjmachin at lexicon.net
Mon Mar 23 08:56:25 CET 2009


On Mar 23, 1:32 pm, per <perfr... at gmail.com> wrote:
> hi all,
>
> i have a very large dictionary object that is built from a text file
> that is about 800 MB -- it contains several million keys.  ideally i
> would like to pickle this object so that i wouldnt have to parse this
> large file to compute the dictionary every time i run my program.
> however currently the pickled file is over 300 MB and takes a very
> long time to write to disk - even longer than recomputing the
> dictionary from scratch.

What version of Python are you using? If 2.X, are you using pickle or
cPickle? On what platform? And what *exactly* is the code that you are
using to call *ickle.dump() and *ickle.load()?
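
For comparison, here is a minimal sketch of the kind of dump/load calls
in question. It is not the original poster's code: the helper names
save_dict and load_dict, the file name, and the toy dictionary are all
made up for illustration, and it assumes Python 2.6 or later (falling
back to plain pickle where cPickle does not exist). The two details that
tend to matter are opening the file in binary mode and passing a binary
protocol explicitly, since 2.X defaults to the text protocol 0.

```python
import os
import tempfile

try:
    import cPickle as pickle  # Python 2.X: the C implementation
except ImportError:
    import pickle  # Python 3.X: no separate cPickle module


def save_dict(d, path):
    # Hypothetical helper: binary file mode plus an explicit binary protocol.
    with open(path, "wb") as f:
        pickle.dump(d, f, pickle.HIGHEST_PROTOCOL)


def load_dict(path):
    # Hypothetical helper: read the pickle back in binary mode.
    with open(path, "rb") as f:
        return pickle.load(f)


if __name__ == "__main__":
    # Toy data standing in for the real multi-million-key dictionary.
    sample = dict(("key%d" % i, i) for i in range(1000))
    path = os.path.join(tempfile.mkdtemp(), "sample.pkl")
    save_dict(sample, path)
    assert load_dict(path) == sample
```

Whether this resembles the code actually in use is exactly what the
questions above are trying to find out.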

What datatype(s) are the keys? The values? If str/unicode, what are
their average lengths?
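
One way to gather those numbers is sketched below. The function name
summarize and the sample data are hypothetical. It tallies, per type of
key and of value, how many objects there are and their total length;
objects without a length (ints, floats) contribute 0.

```python
def summarize(d):
    """Map (role, type name) -> (count, total length) for keys and values."""
    stats = {}
    # On Python 2.X with millions of keys, d.iteritems() would avoid
    # building a full list of pairs; d.items() is used here for portability.
    for k, v in d.items():
        for role, obj in (("key", k), ("value", v)):
            tag = (role, type(obj).__name__)
            count, total = stats.get(tag, (0, 0))
            size = len(obj) if hasattr(obj, "__len__") else 0
            stats[tag] = (count + 1, total + size)
    return stats


if __name__ == "__main__":
    # Toy data standing in for the real dictionary.
    sample = {"alpha": "x" * 10, "be": 3}
    for (role, name), (count, total) in sorted(summarize(sample).items()):
        print("%s %s: n=%d, avg len=%.1f"
              % (role, name, count, float(total) / count))
```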


