Pickling speed [was: Re: eval(repr(x)) == x]

David Bolen db3l at fitlinxx.com
Mon Jan 28 21:18:28 EST 2002


pinard at iro.umontreal.ca (François Pinard) writes:

> Wanting to save this initialisation time, I added some mechanics to
> `pybox.py' so it cPickles all its recognition/rebuilding data on disk after
> generation, and just reads back this data from the pickle if it exists.
> The goal was to skip the long initialisation.  To my surprise, I did not get
> any speed improvement this way; it took a bit longer to unpickle the data
> than to initialise it afresh, even though that means a lot of `re.compile'
> calls.  So, I merely removed the pickling mechanics and decided to live with
> the initialisation time.  I guess I learned (:-) that pickling is not worth
> doing unless one has rather lengthy initialisations, that is, much
> longer than in the `rebox.py' case.
> 
> Does this match the experience of others, with regard to cPickle speed?

I expect it's less a matter of cPickle speed than of how regular
expression objects are pickled.  While in some cases reading a pickle
file of objects may be faster, that's not really the goal.
Even when reading the pickle file, Python needs to instantiate each
object as it is read, which isn't far different from executing code to
create them in the first place, unless the computation needed to
decide what to create is expensive.  Pickling is primarily about state
persistence.

And it has to be portable.  The pickle format is designed to be
portable and independent of the Python version.  So for example, floats
are actually pickled (in the text pickle format) as string
representations of their value (from repr()), and then evaluated on the
way back in.  And I believe that the regular expression objects achieve
this by making the regex string itself the pickled form of the
expression, and then automatically compiling it again when you read it
back from the pickle file.
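
A quick way to check both points is to look at what actually lands in
the pickle.  This is only a sketch using the modern `pickle' module
(the successor to cPickle); the sample float and the sample pattern are
made up for illustration, and the exact bytes will vary by protocol.

```python
import pickle
import re

# Protocol 0 is the original text format; a float is stored as its repr().
value = 0.1
text_pickle = pickle.dumps(value, protocol=0)
print(text_pickle)

# A compiled pattern pickles as its source string plus flags, not as the
# compiled matcher itself.
pattern = re.compile("[0-9]+-[0-9]+", re.IGNORECASE)
data = pickle.dumps(pattern)
print(pattern.pattern.encode() in data)

# Loading the pickle hands the string back to the re module to compile.
restored = pickle.loads(data)
print(restored.pattern, restored.match("12-34") is not None)
```

If the pattern source shows up verbatim in the pickled bytes, the
unpickler has nothing to work from except that string, so it has to
compile it again.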

So your pickled format really wasn't much different from compiling on
the fly, since that's all that was happening when you loaded the
pickle.
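
If you want to measure this yourself, something along these lines
should do.  It is a rough sketch, not a careful benchmark: the 200
generated patterns and the names `build' and `load' are placeholders,
and `re.purge()' is called so that the re module's internal cache of
compiled patterns does not hide the compile cost on either side.

```python
import pickle
import re
import timeit

# Hypothetical stand-in for the real recognition data.
PATTERNS = [rf"item{i}:\s*(\d+)" for i in range(200)]

def build():
    """Compile every pattern from scratch."""
    re.purge()  # empty the re cache so each compile does real work
    return [re.compile(p) for p in PATTERNS]

blob = pickle.dumps(build())

def load():
    """Rebuild the same list from the pickle."""
    re.purge()  # unpickling recompiles, so clear the cache here too
    return pickle.loads(blob)

compile_time = timeit.timeit(build, number=5)
load_time = timeit.timeit(load, number=5)
print(f"compile: {compile_time:.4f}s  unpickle: {load_time:.4f}s")
```

I would expect the two numbers to come out in the same ballpark, with
the unpickle side carrying a little extra overhead for parsing the
pickle, but the actual figures depend on the machine and Python
version.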

--
-- David
-- 
/-----------------------------------------------------------------------\
 \               David Bolen            \   E-mail: db3l at fitlinxx.com  /
  |             FitLinxx, Inc.            \  Phone: (203) 708-5192    |
 /  860 Canal Street, Stamford, CT  06902   \  Fax: (203) 316-5150     \
\-----------------------------------------------------------------------/


