On Wed, Jul 15, 2020 at 9:46 AM Steven D'Aprano email@example.com wrote:
On Mon, Jul 13, 2020 at 09:56:45PM +1000, Chris Angelico wrote:
A pickle file (or equivalent blob in a database, or whatever) should be considered equally as trusted as your source code. If you're writing out a file that has the exact same access permissions as your own source code, and then reading it back, you shouldn't have to worry about pickle's safety any more than you worry about your code's safety - anyone who could maliciously craft something for you to unpickle could equally just edit the source code directly.
If I worry about the security of my source code, I can put a known good copy on read-only media, or lock it down with more restrictive permissions so that the user running the code cannot modify it. In either case, if my code needs to write data out to a pickle file and later read it back in, that file can't be written to the same location as my source code, since that location is read-only.
At that point, you are NOT running it with the "exact same access permissions", are you? :) But a large amount of code is indeed run with the same access permissions as its temporary files (which may be incredibly restrictive or incredibly generous, either way).
They've probably been thinking about ways to exploit pickle for months. I've spent three minutes reading the docs. Who is likely to win?
This is why an *inherently safe* serialization format is a necessary thing. I don't want to spend even three minutes thinking about exploits; I just want to write the data out and read it back in, no issues, no worries, and not have to think about it.
And that's why we have JSON and various others, which are not pickle and are not vulnerable the way that pickle is. I don't think we need a "safe pickle". What we need is to not use pickle when it's not the right tool.
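To make the difference concrete, here's a minimal sketch (the class name `Surprise` is made up for illustration, and `len` is a deliberately harmless stand-in). Unpickling can end up calling a callable named by the data itself, whereas JSON decoding only ever builds plain dicts, lists, strings, numbers, booleans and None.

```python
import json
import pickle


class Surprise:
    """Hypothetical class, used only to show what a pickle can ask for."""

    def __reduce__(self):
        # __reduce__ tells pickle how to rebuild the object: "call this
        # callable with these arguments". The unpickler makes that call.
        # len is a harmless stand-in; a hostile payload could name any
        # importable callable instead.
        return (len, ("abc",))


payload = pickle.dumps(Surprise())

# Loading the bytes calls len("abc"); no Surprise instance is created.
result = pickle.loads(payload)

# JSON has no such hook in its format: the decoder only produces basic
# data types, so a round trip cannot trigger a call chosen by the data.
data = {"name": "spam", "counts": [1, 2, 3]}
round_tripped = json.loads(json.dumps(data))
```

The point is not that `len` is dangerous, only that whoever writes the pickle bytes gets to choose what is called when they are loaded.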
I'm highly sympathetic to the requests for "JSON but able to encode more types", but not so sympathetic to "pickle but magically able to be safe".
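As a rough sketch of what "JSON but able to encode more types" can look like with today's standard library: the `default` and `object_hook` parameters of the `json` module let you add an explicit, closed set of extra types. The `__type__` tag and the helper names `encode_extra` and `decode_extra` are my own assumptions for illustration, not an established convention.

```python
import json
from datetime import datetime


def encode_extra(obj):
    # json.dumps calls this only for objects it cannot serialize natively.
    if isinstance(obj, datetime):
        return {"__type__": "datetime", "value": obj.isoformat()}
    if isinstance(obj, (set, frozenset)):
        return {"__type__": "set", "value": sorted(obj)}
    raise TypeError(f"cannot serialize {type(obj).__name__}")


def decode_extra(d):
    # json.loads calls this for every decoded JSON object (dict).
    # Only the tags listed here are recognised; anything else stays a dict.
    kind = d.get("__type__")
    if kind == "datetime":
        return datetime.fromisoformat(d["value"])
    if kind == "set":
        return set(d["value"])
    return d


record = {"when": datetime(2020, 7, 15, 9, 46), "tags": {"pickle", "json"}}
text = json.dumps(record, default=encode_extra)
restored = json.loads(text, object_hook=decode_extra)
```

The decoder can only ever construct the types it has been told about, which is the property a "safe pickle" would struggle to keep.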