[Python-ideas] Fwd: Re: Secure unpickle

Ryan Gonzalez rymg19 at gmail.com
Wed Jul 22 22:58:47 CEST 2015


Disclaimer: I know virtually *nothing* about cryptography, so this is probably worse than it seems.


On July 22, 2015 3:54:31 PM CDT, Andrew Barnert <abarnert at yahoo.com> wrote:
>On Jul 22, 2015, at 13:27, Ryan Gonzalez <rymg19 at gmail.com> wrote:
>> 
>> A further idea: hashes.
>> 
>> Each Pickle database (or whatever it's called) would contain a hash
>> made up of:
>> 
>> a) The types used to pickle the data.
>> b) The hash of the data itself, prefixed with 2 bytes that have some
>> sort of hard-to-get meaning (the length of the call stack?).
>> c) The seconds since epoch, or another 64-bit value.
>
>A type pickled and unpickled in a different interpreter instance isn't
>necessarily going to have the same hash value. And if you don't mean a
>Python hash, how do you hash an arbitrary class object? Or, if you mean
>just the name, how does that secure anything?
>
>For that matter, it's often important for an updated version of the
>code to be able to load pickles created with yesterday's version. This
>is easy to do with the pickle protocol, but hashing would presumably
>break that (unless it didn't protect anything at all).
>
>> The three values would likely be merged via bitwise or.
>
>Why would you merge three hash values with bitwise or instead of one of
>the usual hash combining mechanisms? This just throws away most of your
>entropy.

Uhhhh...I have no clue. It just came off the top of my head.
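
For what it's worth, here is a rough sketch of why the OR merge loses information. It assumes SHA-256 truncated to 64 bits as a stand-in for "a hash"; the names `h64`, `merge_or` and `merge_chained` are made up for illustration and are not part of any proposal. If the inputs are independent and uniform, each output bit of a three-way OR is 1 with probability 7/8, so the result is heavily biased toward ones.

```python
import hashlib


def h64(data: bytes) -> int:
    """First 64 bits of SHA-256 as an int (illustrative stand-in for 'a hash')."""
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


def merge_or(parts):
    """The merge floated above: bitwise OR of the component hashes."""
    result = 0
    for part in parts:
        result |= h64(part)
    return result


def merge_chained(parts):
    """A more conventional merge: hash the length-prefixed concatenation."""
    h = hashlib.sha256()
    for part in parts:
        h.update(len(part).to_bytes(8, "big"))  # length prefix avoids ambiguity
        h.update(part)
    return int.from_bytes(h.digest()[:8], "big")


parts = [b"types", b"data-hash", b"timestamp"]
ored = merge_or(parts)

# Every bit set in any component is also set in the OR, so the result can
# only gain ones; it never reflects where the components disagree.
print("bits set per component:", [bin(h64(p)).count("1") for p in parts])
print("bits set after OR:     ", bin(ored).count("1"))
print("chained merge:         ", hex(merge_chained(parts)))
```

The chained version keeps all 64 bits in play and, because of the length prefixes, does not confuse `[b"ab", b"c"]` with `[b"a", b"bc"]`.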

>
>> This has the advantage that there are three different elements making
>> up the hash, some of which are harder to locate. Unless two of the
>> values are known, the third can't be.
>> 
>> The types would be extracted from the hash via some kind of magic,
>
>That really _would_ be magic. The whole point of a hash is that it's
>one-way. If the hashed values can be recovered from it, it's not a
>hash.

Well, I again know nothing about cryptography, so I guess "key" is a better phrase. :O

>
>Also, "harder to locate" is useless, unless you plan to continually
>update your code as attackers locate the things you've hidden. (And,
>for something used in as many high-profile uses as Python's pickler,
>any security by obscurity would be attacked very frequently.)
>
>> and then
>> it would validate the data in the database based on the types, like
>> Neil said.
>> 
>> If someone wanted to change the types, they would need to regenerate
>> the whole hash.
>
>And... So what? Unless the checker has some secure way of knowing which
>timestamp, etc. to use in checking the hash, all you have to do is give
>it the timestamp, etc. that go along with your regenerated hash, and it
>will pass.
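
A minimal sketch of that objection, assuming a made-up file layout (8-byte timestamp, 32-byte SHA-256 digest, then the pickle payload). The functions `make_blob` and `check_blob` are hypothetical; nothing like them exists in the pickle module. Since the checker reads the timestamp from the same file it is checking, anyone who can write the file can produce a blob that verifies.

```python
import hashlib
import pickle
import time


def make_blob(obj, timestamp: int) -> bytes:
    """Hypothetical format: 8-byte timestamp + SHA-256 digest + pickle payload."""
    payload = pickle.dumps(obj)
    ts = timestamp.to_bytes(8, "big")
    digest = hashlib.sha256(ts + payload).digest()
    return ts + digest + payload


def check_blob(blob: bytes) -> bool:
    """Verify the digest using the timestamp stored in the blob itself."""
    ts, digest, payload = blob[:8], blob[8:40], blob[40:]
    return hashlib.sha256(ts + payload).digest() == digest


# The legitimate writer uses the current time.
honest = make_blob({"user": "alice"}, int(time.time()))

# An attacker needs no secret: pick any timestamp and regenerate the digest.
forged = make_blob({"user": "mallory"}, 0)

print(check_blob(honest), check_blob(forged))  # prints "True True"
```

The digest still catches accidental corruption, but it does nothing against someone who deliberately rewrites the whole blob.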
>
>> Further security could be obtained by prefixing the first value
>> with another special byte sequence that, although easier to find,
>> would be used for validation purposes.
>> 
>> Point 2's prefixing bytes and point 3's value would be especially
>> trickier to find, since a few seconds may pass before the data is
>> written to disk.
>> 
>> It's still a bit insecure, but much better than the current
>> situation. I think.
>
>I think it's much worse than the current situation, because it adds
>illusory security while still being effectively just as crackable.
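
For contrast, here is a sketch of a keyed check, which is a different technique from the hash scheme proposed above: an HMAC over the pickle payload, verified before anything is unpickled. It assumes a secret key that the attacker cannot read; `SECRET_KEY`, `dumps_signed` and `loads_signed` are illustrative names, and this only detects tampering by parties without the key. It does not make unpickling untrusted data safe.

```python
import hashlib
import hmac
import pickle

# Assumption: a secret shared out of band and never stored next to the data.
SECRET_KEY = b"replace-with-a-real-secret"

TAG_SIZE = hashlib.sha256().digest_size  # 32 bytes


def dumps_signed(obj, key: bytes = SECRET_KEY) -> bytes:
    """Pickle obj and prepend an HMAC-SHA256 tag over the payload."""
    payload = pickle.dumps(obj)
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return tag + payload


def loads_signed(blob: bytes, key: bytes = SECRET_KEY):
    """Check the tag first; only unpickle if it matches."""
    tag, payload = blob[:TAG_SIZE], blob[TAG_SIZE:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):  # constant-time comparison
        raise ValueError("signature mismatch; refusing to unpickle")
    return pickle.loads(payload)


blob = dumps_signed({"user": "alice"})
print(loads_signed(blob))
```

Unlike the timestamp idea, the value needed to regenerate the tag is not in the file, so there is nothing to "locate".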
>
>> 
>> 
>>> On Wed, Jul 22, 2015 at 3:03 AM, Neil Girdhar
>>> <mistersheik at gmail.com> wrote:
>>> 
>>> I've heard it said that pickle is a security hole, and so it's
>>> better to write your own serialization routine.  That's unfortunate
>>> because pickle has so many advantages such as automatically tying
>>> into copy/deepcopy. Would it be possible to make unpickle secure,
>>> e.g., by having the caller create a context in which all calls to
>>> unpickle are limited to unpickling a specific set of types?  (When
>>> these types unpickle their sub-objects, they could potentially limit
>>> the set of types further.)
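
Neil's original idea of limiting unpickling to a specific set of types can be roughly approximated today by overriding `pickle.Unpickler.find_class`. The sketch below is only that, a sketch: `RestrictedUnpickler` and `restricted_loads` are names invented here, the allow-list of `(module, name)` pairs is an assumed interface, and it does not implement the per-sub-object narrowing Neil mentions. Restricting globals reduces the attack surface but should not be read as making pickle safe for untrusted input.

```python
import io
import pickle
from collections import OrderedDict


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that only resolves globals on an explicit allow-list."""

    def __init__(self, file, allowed):
        super().__init__(file)
        # allowed: iterable of (module, name) pairs the caller trusts.
        self._allowed = set(allowed)

    def find_class(self, module, name):
        # Called whenever the pickle stream asks for a global (class/function).
        if (module, name) in self._allowed:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global {module}.{name} is forbidden")


def restricted_loads(data: bytes, allowed):
    """Like pickle.loads, but limited to the given (module, name) pairs."""
    return RestrictedUnpickler(io.BytesIO(data), allowed).load()


data = pickle.dumps(OrderedDict(a=1))

# Allowed: the caller explicitly trusts collections.OrderedDict.
print(restricted_loads(data, {("collections", "OrderedDict")}))

# Plain lists, dicts, ints and strings need no globals, so they load
# even with an empty allow-list.
print(restricted_loads(pickle.dumps([1, {"a": 2}]), set()))
```

Calling `restricted_loads(data, set())` on the `OrderedDict` pickle raises `pickle.UnpicklingError`, because `collections.OrderedDict` is not on the list.
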
>>> 
>>> _______________________________________________
>>> Python-ideas mailing list
>>> Python-ideas at python.org
>>> https://mail.python.org/mailman/listinfo/python-ideas
>>> Code of Conduct: http://python.org/psf/codeofconduct/
>> 
>> 
>> 
>> -- 
>> Ryan
>> [ERROR]: Your autotools build scripts are 200 lines longer than your
>> program. Something’s wrong.
>> http://kirbyfan64.github.io/
>> Currently listening to: Death Egg Boss theme (Sonic Generations)
>> -- 
>> Sent from my Android device with K-9 Mail. Please excuse my brevity.

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

