[Python-Dev] Increasing the C-optimized pickle extensibility

Guido van Rossum guido at python.org
Fri Apr 26 11:15:02 EDT 2019

I think it's better not to introduce a new opcode, for the reason you
stated -- you don't want your pickles to be unreadable by older Python
versions, if you can help it.

On Fri, Apr 26, 2019 at 5:59 AM Pierre Glaser <pierre.glaser at inria.fr> wrote:

> Hi All,
> We (Antoine Pitrou, Olivier Grisel and myself) have recently put some
> effort into enabling pickle extensions to extend the C-optimized Pickler
> instead of the pure Python one.
> Pickle extensions have a crucial role in many distributed computing
> libraries: cloudpickle (https://github.com/cloudpipe/cloudpickle), for
> example, is vendored in dask, pyspark, ray, and joblib.
> Early benchmarks show that relying on the C-optimized pickle yields
> significant serialization speed improvements (up to 30x faster).
> (draft PR of the CPickler-backed version of cloudpickle:
> https://github.com/cloudpipe/cloudpickle/pull/253)
> To make extending the C Pickler possible, we are currently moving forward
> with a few enhancements to the public pickle API.
> * First, we are enabling Pickler subclasses to implement a reducer_override
>   method, which will have priority over the registered reducers in the
>   dispatch_table and over the default handling of classes and functions.
>   (PR link: https://github.com/python/cpython/pull/12499)
> * Then, we are adding a new keyword argument to save_reduce called
>   state_setter (consequently, we allow a reducer's return value to have a
>   new, 6th item). This state setter callable is useful to programmatically
>   override the state-updating behavior of an object, which would otherwise
>   be restricted to its static ``__setstate__`` method.
>   (PR link: https://github.com/python/cpython/pull/12588)
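
A minimal sketch of the first point, assuming a Python version in which the
reducer_override PR linked above has been merged (3.8 or later). The names
Settings and ByValuePickler are hypothetical and chosen only for
illustration. The subclass intercepts one class and pickles it by value
through the built-in type(), while returning NotImplemented lets every other
object fall back to the usual reduction.

```python
import io
import pickle


class Settings:
    # Hypothetical class. By default, pickle would store only a reference
    # to it (module name plus qualified name), not its contents.
    retries = 3


class ByValuePickler(pickle.Pickler):
    def reducer_override(self, obj):
        # Consulted before dispatch_table and before the default handling
        # of classes and functions.
        if obj is Settings:
            # On load, rebuild the class by calling
            # type(name, bases, namespace).
            return type, (obj.__name__, obj.__bases__,
                          {"retries": obj.retries})
        # Anything else goes through the usual reduction machinery.
        return NotImplemented


buffer = io.BytesIO()
ByValuePickler(buffer).dump([Settings, "plain string"])
payload = buffer.getvalue()

# The stock Unpickler is enough to read the result back.
restored_class, restored_text = pickle.loads(payload)
```

Since pickle.Pickler is the C-optimized implementation whenever the _pickle
accelerator module is available, the subclass above does not have to fall
back to the pure Python Pickler.
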
> The PR review process of these changes is in progress, and anyone is
> welcome to chime in and share some thoughts.
> The first addition is very non-invasive. We estimated that the second
> point did not require introducing a new opcode, as this change could be
> implemented as a simple sequence of standard pickle instructions. We
> therefore think that it is not necessary to make this change dependent on
> the new protocol 5 proposed in PEP 574.
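
One way to check the "standard pickle instructions" claim is sketched below,
assuming a Python version in which both linked PRs have been merged (3.8 or
later) and that the snippet is run as a script so that pickle can find the
names in __main__. Account, restore_account and AccountPickler are
hypothetical names. The reducer returns a six-item tuple whose last item is
the state setter, and pickletools is used to list the opcodes that end up in
the pickle.

```python
import io
import pickle
import pickletools


class Account:
    """Hypothetical class whose state is restored by a custom callable."""

    def __init__(self, owner):
        self.owner = owner
        self.balance = 0


def restore_account(account, state):
    # Hypothetical state setter: called as state_setter(obj, state) at load
    # time, in place of __setstate__ or the default __dict__ update.
    account.__dict__.update(state)
    account.loaded_from_pickle = True


class AccountPickler(pickle.Pickler):
    def reducer_override(self, obj):
        if isinstance(obj, Account):
            # Six items: callable, args, state, list items, dict items,
            # state setter.
            return (Account, (obj.owner,), dict(obj.__dict__),
                    None, None, restore_account)
        return NotImplemented


buffer = io.BytesIO()
AccountPickler(buffer, protocol=2).dump(Account("ada"))
payload = buffer.getvalue()

# Collect the opcode names and the newest protocol any opcode requires.
opcodes = [opcode for opcode, arg, pos in pickletools.genops(payload)]
opcode_names = [opcode.name for opcode in opcodes]
newest_protocol_needed = max(opcode.proto for opcode in opcodes)

restored = pickle.loads(payload)
```

In this sketch the state setter call shows up as an ordinary REDUCE followed
by a POP rather than as BUILD, which is why no opcode beyond those of the
requested protocol is needed.
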
> The key advantage of not creating a new opcode is that this makes our
> change backward-compatible, meaning that 3.8-written pickles will not
> break because of our change if read using earlier Python versions.
> OTOH, one might argue that a new opcode might
> * make the code a little bit cleaner
> * make it easier to interpret disassembled pickle strings.
> If you are interested, here is an example of a disassembled pickle string
> using our currently proposed solution:
> https://github.com/pierreglaser/cpython/pull/2#issuecomment-486243350
> Does anyone have an opinion on this?
> Thanks,
> Pierre

--Guido van Rossum (python.org/~guido)
*Pronouns: he/him/his **(why is my pronoun here?)*