
On Wed, 28 Mar 2018 23:03:08 +0300 Serhiy Storchaka <storchaka@gmail.com> wrote:
28.03.18 21:39, Antoine Pitrou wrote:
I'd like to submit this PEP for discussion. It is quite specialized and the main target audience of the proposed changes is users and authors of applications/libraries transferring large amounts of data (read: the scientific computing & data science ecosystems).
Currently I'm working on porting some features from cloudpickle to the stdlib. For those which can't or shouldn't be implemented in a general-purpose library (like serializing local functions by serializing their code objects, because that is not portable), I want to add hooks that would allow implementing them in cloudpickle using an official API. This would allow cloudpickle to use the C implementation of the pickler and unpickler.
Yes, that's something that would benefit a lot of people. For the record, here are my notes on the topic: https://github.com/cloudpipe/cloudpickle/issues/58#issuecomment-339751408
It is well known that pickle is unsafe. Unpickling untrusted data can execute arbitrary code. It is less well known that unpickling can be made safe by controlling the resolution of global names in a custom Unpickler.find_class(). I want to provide helpers which would help implement safe unpickling by specifying just whitelists of globals and attributes.
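For reference, a minimal sketch of the find_class() approach described above, using only the existing pickle API. The allow-list contents and the names SAFE_BUILTINS, RestrictedUnpickler and restricted_loads are assumptions made for illustration; they are not the helpers being proposed.

```python
import builtins
import io
import pickle

# Hypothetical allow-list, chosen only for illustration.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that resolves only explicitly allowed globals."""

    def find_class(self, module, name):
        # Called whenever the pickle stream references a global name.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        # Everything else is rejected before it can be looked up or called.
        raise pickle.UnpicklingError(
            "global '%s.%s' is forbidden" % (module, name)
        )


def restricted_loads(data):
    """Like pickle.loads(), but with global lookups restricted."""
    return RestrictedUnpickler(io.BytesIO(data)).load()
```

With this sketch, restricted_loads(pickle.dumps(range(3))) round-trips, while a stream referencing builtins.eval raises pickle.UnpicklingError. It only restricts global lookups; it does not address other possible attack vectors.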
I'm not sure how safe that would be, because 1) there may be other attack vectors, and 2) it's difficult to predict which functions are entirely safe for calling. I think the best way to make pickles safe is to cryptographically sign them so that they cannot be forged by an attacker.
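A rough sketch of the signing idea, assuming a secret key shared between producer and consumer and HMAC-SHA256 as the signature scheme. The names signed_dumps and signed_loads are hypothetical, and key distribution is out of scope here.

```python
import hashlib
import hmac
import pickle

# Length in bytes of a SHA-256 digest (32).
DIGEST_SIZE = hashlib.sha256().digest_size


def signed_dumps(obj, key):
    """Pickle obj and prepend an HMAC-SHA256 tag computed with key."""
    payload = pickle.dumps(obj)
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return tag + payload


def signed_loads(data, key):
    """Verify the tag first; unpickle only if it matches."""
    tag, payload = data[:DIGEST_SIZE], data[DIGEST_SIZE:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    # compare_digest avoids leaking information through timing.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("signature mismatch; refusing to unpickle")
    return pickle.loads(payload)
```

The payload is never passed to pickle.loads() unless the tag verifies, so an attacker without the key cannot get a forged pickle executed. This protects against forgery only; anyone holding the key can still produce a malicious pickle.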
This work is still not finished, but I think it is worth including in protocol 5 if some features need a protocol version bump.
Agreed. Do you know by which timeframe you'll know which opcodes you want to add?

Regards

Antoine.