[Python-ideas] An idea for a new pickling tool

Alexandre Vassalotti alexandre at peadrop.com
Wed Apr 22 23:14:58 CEST 2009

On Tue, Apr 21, 2009 at 6:02 PM, Raymond Hettinger <python at rcn.com> wrote:
> Motivation
> ----------
> Python's pickles use a custom format that has evolved over time
> but they have five significant disadvantages:
>   * it has lost its human readability and editability

This is not one of pickle's design goals. Also, I don't think the
pickle protocol has ever been a human-friendly format. Even though
protocol 0 is ASCII-based, that doesn't mean anyone would want to edit
it by hand.

>   * it doesn't compress well

Do you have numbers to support this? The last time I tested
compression on pickle data, it worked fairly well. In fact, I get a
2.70 compression ratio for some pickles using gzip.
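
A quick way to check this on your own data is sketched below. The
sample records are made up for illustration, and the ratio you get
depends heavily on what is being pickled, so no particular number
should be expected from this snippet.

```python
import gzip
import pickle

# Hypothetical sample data; real-world ratios vary with the payload.
records = [{"id": i, "name": "user%d" % i, "tags": ["a", "b", "c"]}
           for i in range(1000)]

raw = pickle.dumps(records, protocol=2)   # uncompressed pickle
packed = gzip.compress(raw)               # same bytes, gzip-compressed
ratio = len(raw) / len(packed)

print("pickle: %d bytes, gzip: %d bytes, ratio: %.2f"
      % (len(raw), len(packed), ratio))

# The round trip must give back the original objects.
restored = pickle.loads(gzip.decompress(packed))
```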

From my experience with pickle, I doubt you can significantly reduce
the size of pickled data without using static schemata (like Google
Protocol Buffers and Thrift). The only inefficiency in pickle that I
am aware of is the handling of PUT and GET opcodes.
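
To make the PUT/GET overhead visible, here is a small sketch using the
standard pickletools module. The pickler emits a PUT for nearly every
object it memoizes, whether or not a later GET refers to it;
pickletools.optimize() drops the PUTs that are never used. The sample
data and the count_puts helper are assumptions made up for this
example.

```python
import pickle
import pickletools

# Hypothetical sample data: many small objects, none of them shared.
data = [("key%d" % i, i) for i in range(100)]

raw = pickle.dumps(data, protocol=2)
slim = pickletools.optimize(raw)   # strips PUT opcodes with no matching GET


def count_puts(pickled):
    """Count PUT-family opcodes (PUT, BINPUT, LONG_BINPUT) in a pickle."""
    return sum(1 for op, arg, pos in pickletools.genops(pickled)
               if "PUT" in op.name)


print("before: %d bytes, %d PUTs" % (len(raw), count_puts(raw)))
print("after:  %d bytes, %d PUTs" % (len(slim), count_puts(slim)))
```

The optimized pickle loads to the same objects; only the unused memo
bookkeeping is gone.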

>   * it isn't interoperable with other languages
>   * it doesn't have the ability to enforce a schema

Again, these are not part of pickle's design goals.

>   * it is a major security risk for untrusted inputs

There are ways to fix this without replacing pickle. See the recipe in
the pickle documentation:
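
For illustration, a minimal sketch of this kind of restriction follows.
It is modeled on the "Restricting Globals" example in the Python 3
pickle documentation: override Unpickler.find_class() so that only
whitelisted globals can be loaded. The whitelist contents and the names
RestrictedUnpickler and restricted_loads are choices made for this
example, not a complete security solution.

```python
import builtins
import io
import pickle

# Assumed whitelist for this example; adjust to what your data needs.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Allow only a handful of harmless builtins.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        # Refuse every other global, including os.system and friends.
        raise pickle.UnpicklingError(
            "global '%s.%s' is forbidden" % (module, name))


def restricted_loads(data):
    """Like pickle.loads(), but with the restricted find_class()."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


print(restricted_loads(pickle.dumps([1, 2, range(15)])))
```

A pickle that tries to reach a global outside the whitelist, such as
os.system, raises pickle.UnpicklingError instead of being executed.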


> New idea
> --------
> Develop a solution using a mix of PyYAML, a python coded version of
> Kwalify, optional compression using bz2, gzip, or zlib, and pretty
> printing using pygments.
> YAML ( http://yaml.org/spec/1.2/ ) is a language independent standard
> for data serialization.
> PyYAML ( http://pyyaml.org/wiki/PyYAML ) is a full implementation of
> the YAML standard.  It uses YAML's application-specific tags and
> Python's own copy/reduce logic to provide the same power as pickle itself.

But how are you going to handle serialization of class instances in a
language-independent manner?

-- Alexandre
