[Python-ideas] Fwd: Re: Secure unpickle

Andrew Barnert abarnert at yahoo.com
Thu Jul 23 03:44:56 CEST 2015

> On Jul 22, 2015, at 17:27, Neil Girdhar <mistersheik at gmail.com> wrote:
> Thanks Andrew, totally agree with what you said.  For the record, I don't know exactly what the problem is.  I just noticed on some projects people talking about writing their own unpickling code because of insecurities in pickle, and it made me think: "why should you have to?"

The problem is inherent to the design of pickle: a pickle is a program for a small virtual machine, and that machine can make Python import arbitrary modules and call arbitrary globals (with arbitrary literals and/or already-constructed objects as arguments). You can't fix that without replacing the whole design. And that's what they're asking for in your second link: they want explicit imperative code in matplotlib, rather than the data, to drive the process.
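
To make the "virtual machine" point concrete, here is a minimal sketch. The class name Demo is made up for illustration, and the callable is deliberately harmless (the builtin sorted); the same mechanism would call any importable global the pickle names.

```python
import pickle
import pickletools


class Demo:
    """Hypothetical class, used only to illustrate the mechanism."""

    def __reduce__(self):
        # Tells pickle: "to rebuild this object, call sorted([3, 1, 2])".
        return (sorted, ([3, 1, 2],))


payload = pickle.dumps(Demo())

# List the VM instructions stored in the pickle. One of them pushes the
# global builtins.sorted, and a later REDUCE opcode calls it.
opcode_names = [op.name for op, arg, pos in pickletools.genops(payload)]
print(opcode_names)

# Loading runs those instructions. Demo is never consulted at load time;
# the result is whatever the call named in the data returned.
result = pickle.loads(payload)
print(type(result).__name__, result)
```

Nothing on the loading side chose to call sorted; the data did. That is the part that cannot be changed without changing the design.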

Also, pickle is so convenient because classes can opt in just by adding the right methods. But that's the same reason that failing to anticipate everything your code might do can leave you with an invisible security hole instead of a "can't pickle that type" error. So you can't fix that either without giving up the convenience.

Of course the other problem is FUD. Despite the fact that there are plenty of use cases for which pickle is safe, there are people who would rather teach you that it's never ever safe than teach you how to recognize and understand potential problems. And there are people who believe pickle is slow and space-wasteful and can't handle large data, either because they read a blog post from 15 years ago, or because they're still using 2.7 and haven't read far enough down the docs page to see that they don't have to use protocol 0. And there are people who dogmatically insist that all serialization formats should be interchange formats (a pickle can only be unpickled by the exact same program, or a carefully-updated newer version of the same program) even when interchange isn't relevant. And so on. Changing pickle wouldn't get rid of the FUD unless you completely replaced it.
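
As a rough illustration of the protocol point (the sample data here is arbitrary, and exact sizes depend on the data and the Python version), you can compare the text protocol 0 with the newest binary protocol yourself:

```python
import pickle

data = list(range(1000))  # arbitrary sample data, chosen only for illustration

# Protocol 0 is the original text format; it was the default in Python 2.
text_size = len(pickle.dumps(data, protocol=0))

# The highest protocol available in the running interpreter is a binary format.
binary_size = len(pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL))

print("protocol 0:", text_size, "bytes")
print("highest protocol:", binary_size, "bytes")

# Either way the data round-trips unchanged.
roundtrip = pickle.loads(pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL))
```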

So, it might be useful to build a little PyPI module offering a pickle loader that doesn't allow new modules to be imported and doesn't allow any globals to be called except the ones listed in an explicit tuple passed to the constructor. But you still have to understand the issues to know when that will and won't solve your problems. And it still wouldn't satisfy the people posting in those bug reports.
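
A minimal sketch of what such a loader could look like, built on the documented pickle.Unpickler.find_class hook. The names AllowListUnpickler and restricted_loads, and the choice of (module, name) pairs for the allow-list, are made up for illustration; this is not an existing PyPI package.

```python
import collections
import io
import pickle
import sys


class AllowListUnpickler(pickle.Unpickler):
    """Hypothetical loader: only explicitly listed globals may be looked up."""

    def __init__(self, file, allowed=()):
        super().__init__(file)
        # allowed is a tuple of (module, name) pairs, e.g.
        # (("collections", "OrderedDict"),)
        self._allowed = frozenset(allowed)

    def find_class(self, module, name):
        # Called whenever the pickle asks for a global.
        if (module, name) not in self._allowed:
            raise pickle.UnpicklingError(
                "global %s.%s is not allowed" % (module, name))
        # Refuse to trigger a fresh import, even for an allowed name.
        if module not in sys.modules:
            raise pickle.UnpicklingError(
                "module %s is not already imported" % module)
        return super().find_class(module, name)


def restricted_loads(data, allowed=()):
    """Unpickle bytes, permitting only the globals listed in allowed."""
    return AllowListUnpickler(io.BytesIO(data), allowed=allowed).load()


# Plain containers and literals need no globals at all.
plain = restricted_loads(pickle.dumps({"a": [1, 2, 3]}))

# An allowed class loads normally.
ordered = restricted_loads(
    pickle.dumps(collections.OrderedDict(a=1)),
    allowed=(("collections", "OrderedDict"),),
)

# Anything else is rejected before it can be called.
try:
    restricted_loads(pickle.dumps(len))
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

This only narrows what a pickle can reach. If one of the allowed globals can itself do something dangerous with attacker-chosen arguments, the allow-list doesn't save you, which is why you still have to understand the issues.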
