pickle module doesn't work

Dave Angel d at davea.name
Thu Dec 27 13:34:11 CET 2012

On 12/27/2012 07:05 AM, Omer Korat wrote:
> You're probably right in general, for me the 3.3 and 2.7 pickles definitely don't work the same:
> 3.3:
>>>> type(pickle.dumps(1))
> <class 'bytes'>
> 2.7:
>>>> type(pickle.dumps(1, pickle.HIGHEST_PROTOCOL))
> <type 'str'>

Those are the same thing. In 2.7, str is a string of bytes, while in 3.3,
str is Unicode text. So 'bytes' is the 3.3 equivalent of the 2.7 str.
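
A minimal sketch of that point, assuming it is run on Python 3: dumps()
hands back a bytes object, and protocol 2 is one that 2.7 can also read.

```python
import pickle

# On Python 3, dumps() returns bytes, the counterpart of the 2.7 str type.
data = pickle.dumps(1, protocol=2)
print(type(data))

# The payload round-trips back to the original value.
restored = pickle.loads(data)
print(restored)
```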

> As you can see, in 2.7 when I try to dump something, I get a useless string. Look what I get when I dump an NLTK object such as the sent_tokenize function:
> '\x80\x02cnltk.tokenize\nsent_tokenize\ng\x00'
> Now, this is useless. If I try to load it on a platform without NLTK installed on it, I get:
> ImportError: No module named 'nltk'
> So it means the actual sent_tokenize wasn't saved. Just its module.

As Peter Otten has already pointed out, that's how pickle works. It does
not somehow encode the whole module into the pickle, only enough
information to recreate the particular objects you're saving, *using*
the same modules. I don't know of any method of avoiding the destination
machine needing nltk, regardless of Python version.
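
Here is a small sketch of the same behaviour using only the standard
library, so it runs without nltk. json.dumps stands in for
sent_tokenize, and the module name no_such_module_xyz is made up purely
to mimic a machine where the module is missing.

```python
import json
import pickle

# A function is pickled as a reference: module name plus function name.
payload = pickle.dumps(json.dumps, protocol=2)
print(payload)

# Hand-written protocol 0 pickle that refers to a module which does not
# exist (hypothetical name), imitating a machine without the library.
fake = b"cno_such_module_xyz\nsome_function\n."

error = None
try:
    pickle.loads(fake)
except ImportError as exc:
    # Unpickling has to import the module first, so it fails right here.
    error = exc
    print("load failed:", exc)
```

The payload holds only the two names, which is why loading needs the
module to be importable on the receiving side.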

Perhaps you'd rather see it in the Python docs.


pickle <http://docs.python.org/2/library/pickle.html#module-pickle> can
save and restore class instances transparently, however the class
definition must be importable and live in the same module as when the
object was stored.
Similarly, when class instances are pickled, their class’s code and data
are not pickled along with them. Only the instance data are pickled.
This is done on purpose, so you can fix bugs in a class or add methods
to the class and still load objects that were created with an earlier
version of the class.
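
A rough illustration of that last paragraph, assuming Python 3. The
module demo_shapes and the class Point are hypothetical; the module is
built in memory only so that the class has an importable home and the
example stays self-contained.

```python
import pickle
import sys
import types

# Throwaway in-memory module (hypothetical name) to hold the class.
module = types.ModuleType("demo_shapes")
sys.modules["demo_shapes"] = module

# Version 1 of the class: data only, no extra methods.
exec(
    "class Point:\n"
    "    def __init__(self, x, y):\n"
    "        self.x, self.y = x, y\n",
    module.__dict__,
)
payload = pickle.dumps(module.Point(3, 4))

# Version 2: same module and class name, with a method added later.
exec(
    "class Point:\n"
    "    def __init__(self, x, y):\n"
    "        self.x, self.y = x, y\n"
    "    def norm(self):\n"
    "        return (self.x ** 2 + self.y ** 2) ** 0.5\n",
    module.__dict__,
)

# The old pickle holds only x and y; on load it is attached to whatever
# class is currently found under demo_shapes.Point.
p = pickle.loads(payload)
print(p.norm())
```

The object saved with the first version gets the norm method for free,
because the class code was never in the pickle to begin with.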


