[Python-Dev] PEP 574 -- Pickle protocol 5 with out-of-band data

Antoine Pitrou solipsis at pitrou.net
Wed Mar 28 16:19:39 EDT 2018


On Wed, 28 Mar 2018 23:03:08 +0300
Serhiy Storchaka <storchaka at gmail.com> wrote:
> 28.03.18 21:39, Antoine Pitrou wrote:
>  > I'd like to submit this PEP for discussion.  It is quite specialized
>  > and the main target audience of the proposed changes is
>  > users and authors of applications/libraries transferring large amounts
>  > of data (read: the scientific computing & data science ecosystems).  
> 
> Currently I'm working on porting some features from cloudpickle to the
> stdlib. For those that can't or shouldn't be implemented in a
> general-purpose library (such as serializing local functions by
> serializing their code objects, which is not portable), I want to add
> hooks that would allow cloudpickle to implement them using an official
> API. This would let cloudpickle use the C implementation of the pickler
> and unpickler.

Yes, that's something that would benefit a lot of people.
For the record, here are my notes on the topic:
https://github.com/cloudpipe/cloudpickle/issues/58#issuecomment-339751408

> It is well known that pickle is unsafe: unpickling untrusted data can
> execute arbitrary code. It is less well known that unpickling can be
> made safe by controlling the resolution of global names in a custom
> Unpickler.find_class(). I want to provide helpers that make it possible
> to implement safe unpickling by specifying just whitelists of globals
> and attributes.
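
For readers unfamiliar with the technique, the find_class() approach
described above can be sketched roughly as follows.  This is only a minimal
illustration: the whitelist contents and the names SAFE_BUILTINS,
RestrictedUnpickler and restricted_loads are assumptions made for the
example, not a proposed stdlib API.

```python
import builtins
import io
import pickle

# Hypothetical whitelist, chosen for illustration only.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that only resolves globals named in the whitelist."""

    def find_class(self, module, name):
        # Only allow a handful of harmless builtins; reject everything else.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(
            f"global '{module}.{name}' is forbidden"
        )


def restricted_loads(data):
    """Analogue of pickle.loads() that goes through the whitelist."""
    return RestrictedUnpickler(io.BytesIO(data)).load()
```

With this sketch, restricted_loads(pickle.dumps([1, 2, range(15)])) round-trips
normally, while a pickle that references os.system is rejected with
pickle.UnpicklingError before anything is called.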

I'm not sure how safe that would be, because 1) there may be other
attack vectors, and 2) it's difficult to predict which functions are
entirely safe to call.  I think the best way to make pickles safe
is to cryptographically sign them so that they cannot be forged by an
attacker.
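
As a rough illustration of the signing idea, here is a minimal sketch that
authenticates a pickle with an HMAC before loading it.  It assumes that
sender and receiver share a secret key (so strictly speaking it is a MAC
rather than a public-key signature), and the names signed_dumps and
signed_loads are hypothetical helpers, not existing APIs.

```python
import hashlib
import hmac
import pickle

# Length in bytes of the SHA-256 tag prepended to every payload.
DIGEST_SIZE = hashlib.sha256().digest_size


def signed_dumps(obj, key):
    """Pickle obj and prepend an HMAC-SHA256 tag computed with key."""
    payload = pickle.dumps(obj)
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return tag + payload


def signed_loads(blob, key):
    """Verify the tag first; unpickle only if it matches."""
    tag, payload = blob[:DIGEST_SIZE], blob[DIGEST_SIZE:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    # compare_digest() avoids leaking information through timing.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("signature mismatch; refusing to unpickle")
    return pickle.loads(payload)
```

The important design point is that verification happens before
pickle.loads() ever sees the data, so a forged or modified payload is
rejected without being interpreted at all.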

> This work is not finished yet, but I think it is worth including
> in protocol 5 if some features will require bumping the protocol version.

Agreed.  Do you have a rough timeframe for deciding which opcodes you
want to add?

Regards

Antoine.



