[Python-Dev] Increasing the C-optimized pickle extensibility
Guido van Rossum
guido at python.org
Fri Apr 26 11:15:02 EDT 2019
I think it's better not to introduce a new opcode, for the reason you
stated -- you don't want your pickles to be unreadable by older Python
versions, if you can help it.
On Fri, Apr 26, 2019 at 5:59 AM Pierre Glaser <pierre.glaser at inria.fr>
wrote:
> Hi All,
>
> We (Antoine Pitrou, Olivier Grisel and myself) recently spent some effort on
> enabling pickle extensions to extend the C-optimized Pickler instead of the
> pure Python one.
>
> Pickle extensions play a crucial role in many distributed computing
> libraries: cloudpickle (https://github.com/cloudpipe/cloudpickle), for
> example, is vendored in dask, pyspark, ray, and joblib.
> Early benchmarks show that relying on the C-optimized pickle yields
> significant serialization speed improvements (up to 30x faster).
> (draft PR of the CPickler-backed version of cloudpickle:
> https://github.com/cloudpipe/cloudpickle/pull/253)
>
> To make extending the C Pickler possible, we are currently moving forward
> with a few enhancements to the public pickle API.
>
> * First, we are enabling Pickler subclasses to implement a reducer_override
> method that will have priority over the registered reducers in the
> dispatch_table and over the default handling of classes and functions.
> (PR link: https://github.com/python/cpython/pull/12499)
>
> * Then, we are adding a new keyword argument to save_reduce called
> state_setter (consequently, we allow a reducer's return value to have a
> new, 6th item). This state setter callable is useful for programmatically
> overriding the state-updating behavior of an object, which would otherwise
> be restricted to its static ``__setstate__`` method.
> (PR link: https://github.com/python/cpython/pull/12588)
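
For illustration only (not taken from either PR): a minimal sketch of how the
two hooks described above might be used together, assuming a Python version in
which they are available (3.8 or later). The class name OverridingPickler and
the dumps helper are hypothetical, and only standard-library callables are
used as reconstructors so the pickles do not depend on this script's module.

```python
import collections
import io
import pickle


class OverridingPickler(pickle.Pickler):
    """Hypothetical Pickler subclass; needs a Python with reducer_override."""

    def reducer_override(self, obj):
        if isinstance(obj, memoryview):
            # memoryview is not picklable by default: rebuild a (read-only)
            # view from a bytes copy of the buffer.
            return memoryview, (obj.tobytes(),)
        if isinstance(obj, collections.deque):
            # Six-item reduce value:
            # (callable, args, state, listitems, dictitems, state_setter).
            # On load, state_setter(obj, state) is called, here
            # deque.extend(new_deque, items), instead of __setstate__.
            return (collections.deque, ((), obj.maxlen), list(obj),
                    None, None, collections.deque.extend)
        # Anything else falls back to dispatch_table / default handling.
        return NotImplemented


def dumps(obj):
    """Serialize obj with the subclass above (helper name is illustrative)."""
    buf = io.BytesIO()
    OverridingPickler(buf).dump(obj)
    return buf.getvalue()


original = {"view": memoryview(b"abc"),
            "queue": collections.deque([1, 2, 3], maxlen=5)}
restored = pickle.loads(dumps(original))
print(restored["view"].tobytes(), restored["queue"])
```

Note that loading needs no custom Unpickler: the stream only refers to
memoryview, collections.deque and getattr, which a stock pickle.loads can
resolve.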
>
> The PR review process of these changes is in progress, and anyone is welcome
> to chime in and share some thoughts.
>
> The first addition is very non-invasive. We estimated that the second point
> did not require introducing a new opcode, as this change could be
> implemented as a simple sequence of standard pickle instructions. We
> therefore think that it is not necessary to make this change dependent on
> the new protocol 5 proposed in PEP 574.
>
> The key advantage of not creating a new opcode is that it keeps our change
> backward-compatible, meaning that 3.8-written pickles will not break because
> of our change if read using earlier Python versions.
>
> OTOH, one might argue that a new opcode might
> * make the code a little bit cleaner
> * make it easier to interpret disassembled pickle strings.
>
> If you are interested, here is an example of a disassembled pickle string
> using our currently proposed solution:
> https://github.com/pierreglaser/cpython/pull/2#issuecomment-486243350
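
For readers who want to reproduce such a listing locally, here is a small
sketch (an illustration, not the code from the linked comment). It assumes
Python 3.8 or later; the DequePickler class is hypothetical. It pickles a
deque through a six-item reduce value at protocol 2 and inspects the opcodes
with pickletools, which should show the state setter call encoded as an
ordinary REDUCE followed by POP rather than as BUILD or a new opcode.

```python
import collections
import io
import pickle
import pickletools


class DequePickler(pickle.Pickler):
    """Hypothetical subclass that attaches a state_setter to deques."""

    def reducer_override(self, obj):
        if isinstance(obj, collections.deque):
            # Sixth item: called on load as deque.extend(new_deque, state).
            return (collections.deque, ((), obj.maxlen), list(obj),
                    None, None, collections.deque.extend)
        return NotImplemented


buf = io.BytesIO()
DequePickler(buf, protocol=2).dump(collections.deque([1, 2]))
payload = buf.getvalue()

# Human-readable listing, similar in spirit to the linked example.
pickletools.dis(payload)

# Collect opcode names and the newest protocol any of them requires.
ops = [op for op, arg, pos in pickletools.genops(payload)]
opcode_names = [op.name for op in ops]
newest_protocol_needed = max(op.proto for op in ops)
print(newest_protocol_needed)
```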
>
> Does anyone have an opinion on this?
>
> Thanks,
>
> Pierre
--
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him/his **(why is my pronoun here?)*
<http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>