On 07/23/2015 09:54 AM, Neil Girdhar wrote:
On Wed, Jul 22, 2015 at 9:46 PM, Nathaniel Smith <njs@pobox.com> wrote:
On Wed, Jul 22, 2015 at 5:27 PM, Neil Girdhar <mistersheik@gmail.com> wrote:
That is so unfortunate. Pickle is such a good solution except for the security. Why can't we have security too? It doesn't seem to me to be right for a project like matplotlib to be writing their own serialization library. It would be awesome if Python had secure serialization built-in.
The reason you can pickle/unpickle arbitrary Python objects is that the pickle format is basically a structured, optimized way of generating and then evaluating arbitrary Python code. Which is great because it's totally general -- that's why we love pickle, you can pickle anything -- but that exact feature is what makes it insecure. If you want to make something secure, that means making some explicit decisions about which kinds of things can be put into your data format and which cannot, and writing some explicit code to handle each of these things instead of just handing the file format direct access to your interpreter. But by the time you've done that you've done the hard part of implementing a new format anyway...
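To make the point above concrete, here is a minimal sketch of how a pickle payload can name a callable for the unpickler to invoke. The class name `Payload` is made up for illustration, and the expression passed to `eval` is deliberately harmless; a hostile payload could name any importable callable instead.

```python
import pickle


class Payload:
    """Hypothetical class used only to build a demonstration pickle."""

    def __reduce__(self):
        # Tells pickle: "to rebuild this object, call eval('6 * 7')".
        # The unpickler performs that call when it loads the data.
        return (eval, ("6 * 7",))


data = pickle.dumps(Payload())

# Loading the bytes runs eval() and returns its result,
# not a Payload instance.
result = pickle.loads(data)
print(result)  # 42
```

Nothing in `pickle.loads` checks whether the named callable is a reasonable constructor, which is why loading untrusted pickle data is unsafe.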
Wouldn't it be easier to just tell unpickle which code it's allowed to run (by passing a list of modules and classes)?
unpickle can already do that, via Unpickler.find_class. There's an example in the docs.

Eric
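For reference, a sketch along the lines of the "Restricting Globals" example in the pickle documentation, assuming Python 3. The allowlist `SAFE_BUILTINS` and the helper `restricted_loads` are illustrative names; a real application would choose its own set of permitted modules and classes.

```python
import builtins
import io
import pickle

# Illustrative allowlist: only these builtins may be looked up by a pickle.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle stream asks for a global object.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        # Refuse everything else before it can be imported or called.
        raise pickle.UnpicklingError(
            "global '%s.%s' is forbidden" % (module, name)
        )


def restricted_loads(data):
    """Like pickle.loads(), but with globals limited to the allowlist."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


# Plain containers and allowlisted builtins round-trip normally.
print(restricted_loads(pickle.dumps([1, 2, range(15)])))
```

A stream that refers to anything outside the allowlist, such as `os.system`, fails with `pickle.UnpicklingError` at lookup time, before the callable is ever invoked. This narrows what a pickle can do, but the allowlist itself still has to be chosen with care.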