How much is set in stone?

Andrew Dalke dalke at dalkescientific.com
Sun Nov 11 04:36:54 EST 2001


Paul Rubin [on security issues with pickling]:
>Basically if you unpickle a string that came from an untrusted source
>(say, a browser cookie from the Cookie module), the string can make
>pickle load arbitrary modules and call arbitrary object constructors
>in your application.  The docs for the cookie module mention this and
>there's a bug open on SourceForge to fix the pickle docs.

I've been trying to figure out what Perl does to prevent this problem
in, I assume, Data::Dumper.  As usual, I'm confused.  It appears that
Perl has the same problem Python has, in that arbitrary thaw methods
can be called.  (See below for how merely creating an object, even
without calling a constructor or special thawing function, can cause
problems because the destructor might have side effects, as with
TemporaryFileWrapper.)

BTW, the line
> the string can make pickle load arbitrary modules and call arbitrary
> object constructors in your application.

should say "call arbitrary callables" instead, since the callable
need not be a class.  For example, here's a way to remove a file
using pickle.loads:

>>> t = "(S'filename.txt'\012p1\012ios\012unlink\012p2\012(dp3\012b."
>>> import pickle
>>> open("filename.txt", "w").write("Hello\n")
>>> ^Z
Suspended
[dalke at pw600a src]$ cat filename.txt
Hello
[dalke at pw600a src]$ fg
python

>>> pickle.loads(t)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/home/dalke/local/lib/python2.0/pickle.py", line 900, in loads
    return Unpickler(file).load()
  File "/home/dalke/local/lib/python2.0/pickle.py", line 516, in load
    dispatch[key](self)
  File "/home/dalke/local/lib/python2.0/pickle.py", line 856, in load_build
    inst.__dict__.update(value)
AttributeError: 'None' object has no attribute '__dict__'
>>> ^Z
Suspended
[dalke at pw600a src]$ cat filename.txt
cat: filename.txt: No such file or directory
[dalke at pw600a src]$
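
For anyone on a current interpreter, the same point can be sketched in
Python 3 without hand-writing the pickle string.  This is only an
illustration, and several things in it are my assumptions rather than
part of the session above: the class name RemoveOnLoad and the helper
demo() are made up, and the target is a scratch file from tempfile so
nothing of value gets deleted.  The __reduce__ hook tells pickle which
callable to invoke at load time, and that callable does not have to be
a class.

```python
import os
import pickle
import tempfile


class RemoveOnLoad:
    """Hypothetical payload: it pickles as "call os.unlink(path)"."""

    def __init__(self, path):
        self.path = path

    def __reduce__(self):
        # (callable, args): the unpickler calls callable(*args) on load.
        return (os.unlink, (self.path,))


def demo():
    # Create a scratch file so the demonstration only removes its own file.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    payload = pickle.dumps(RemoveOnLoad(path))
    existed_before = os.path.exists(path)
    result = pickle.loads(payload)  # this is where os.unlink(path) runs
    return existed_before, os.path.exists(path), result
```

demo() reports whether the file existed before the load, whether it
still exists afterwards, and what pickle.loads returned.  The return
value is whatever os.unlink returns, not an instance of anything.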


Is the following useful?

=============
# safe_pickle.py
import pickle

class SafeUnpickler(pickle.Unpickler):
  def __init__(self, file, legit_classes = [], legit_modules = []):
    pickle.Unpickler.__init__(self, file)
    self.legit_classes = legit_classes
    self.legit_modules = legit_modules
  def find_class(self, module, name):
    if module in self.legit_modules and \
         name in self.legit_classes:
      return pickle.Unpickler.find_class(self, module, name)
    else:
      raise SystemError("Tries to unpickle a soured item (%s, %s)" % \
                        (module, name))

def load(file, legit_classes = [], legit_modules = []):
  return SafeUnpickler(file, legit_classes, legit_modules).load()
===============

>>> import cStringIO, pickle, safe_pickle
>>>
>>> class Spam:
...   def __init__(self, x, y):
...     print "Called with", x, y
...   def __getinitargs__(self):
...     return (9, 8)
...
>>> spam = Spam(1, 2)
Called with 1 2
>>> s = pickle.dumps(spam)
>>> pickle.loads(s)
Called with 9 8
<__main__.Spam instance at 0x12034a988>
>>> safe_pickle.load(cStringIO.StringIO(s))
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "safe_pickle.py", line 18, in load
    return SafeUnpickler(file, legit_classes, legit_modules).load()
  File "/home/dalke/local/lib/python2.0/pickle.py", line 516, in load
    dispatch[key](self)
  File "/home/dalke/local/lib/python2.0/pickle.py", line 682, in load_inst
    klass = self.find_class(module, name)
  File "safe_pickle.py", line 14, in find_class
    raise SystemError("Tries to unpickle a soured item (%s, %s)" % \
SystemError: Tries to unpickle a soured item (__main__, Spam)
>>> safe_pickle.load(cStringIO.StringIO(s), ["Spam"], ["__main__"])
Called with 9 8
<__main__.Spam instance at 0x1202ee1c8>
>>>
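
A rough Python 3 rendering of the same idea, offered as a sketch and
not as a drop-in replacement for the module above.  The assumptions
are mine: the names AllowListUnpickler and safe_loads are hypothetical,
the allow-list holds (module, name) pairs instead of two separate
lists (so a listed name is only accepted from its listed module), and
the error raised is pickle.UnpicklingError where the original used
SystemError.  find_class is the documented override point on
pickle.Unpickler.

```python
import io
import pickle
from collections import OrderedDict


class AllowListUnpickler(pickle.Unpickler):
    """Resolve only globals whose (module, name) pair was listed."""

    def __init__(self, file, allowed=()):
        super().__init__(file)
        self.allowed = frozenset(allowed)

    def find_class(self, module, name):
        # Refuse before importing or looking anything up.
        if (module, name) not in self.allowed:
            raise pickle.UnpicklingError(
                "refusing to unpickle %s.%s" % (module, name))
        return super().find_class(module, name)


def safe_loads(data, allowed=()):
    """Hypothetical helper: unpickle bytes under the given allow-list."""
    return AllowListUnpickler(io.BytesIO(data), allowed).load()


if __name__ == "__main__":
    data = pickle.dumps(OrderedDict(a=1))
    print(safe_loads(data, {("collections", "OrderedDict")}))
    try:
        safe_loads(data)  # empty allow-list: the class lookup is refused
    except pickle.UnpicklingError as err:
        print("rejected:", err)
```

Plain containers, strings and numbers never go through find_class, so
they load even with an empty allow-list.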


It's also possible to check that find_class really returns a ClassType,
and override load_obj (or load_inst or both?) so the constructor is
never called.  Hmm, and what about returning the Bastionized form of
the found class?
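
The first idea, checking that find_class really hands back a class,
might look like this in Python 3, where ClassType is gone and every
class is an instance of type.  The names ClassOnlyUnpickler and
class_only_loads are hypothetical.  Taken alone this check is weak,
since plenty of classes have constructors with side effects, so I
would expect it to be combined with an explicit list of trusted names.

```python
import io
import pickle
from collections import OrderedDict


class ClassOnlyUnpickler(pickle.Unpickler):
    """Reject any global that is not a class (e.g. os.unlink, len)."""

    def find_class(self, module, name):
        # Note: the lookup itself imports `module`, which is already
        # a side effect an attacker can trigger.
        obj = super().find_class(module, name)
        if not isinstance(obj, type):
            raise pickle.UnpicklingError(
                "%s.%s is not a class" % (module, name))
        return obj


def class_only_loads(data):
    """Hypothetical helper: unpickle bytes, allowing classes only."""
    return ClassOnlyUnpickler(io.BytesIO(data)).load()


if __name__ == "__main__":
    print(class_only_loads(pickle.dumps(OrderedDict(a=1))))
    try:
        class_only_loads(pickle.dumps(len))  # a function, not a class
    except pickle.UnpicklingError as err:
        print("rejected:", err)
```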

It's still tricky, as nothing is automatically safe -- what if you
unpickle a tempfile.TemporaryFileWrapper?  Its destructor can unlink
the named file.  So skipping the constructor and any deserialization
function still doesn't guarantee that unpickling is safe.
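
To make the destructor point concrete, here is a small Python 3 sketch.
DeleteOnDrop is a hypothetical stand-in for a wrapper like
TemporaryFileWrapper; it only records what ran instead of touching any
file.  The sketch assumes CPython, where dropping the last reference
runs __del__ promptly (the gc.collect() call is there as a nudge for
other implementations), and it assumes the script is run as a normal
module so pickle can find the class by name.

```python
import gc
import pickle

events = []  # records which special methods actually ran


class DeleteOnDrop:
    """Hypothetical wrapper whose destructor has a side effect."""

    def __init__(self, name):
        events.append("init " + name)
        self.name = name

    def __del__(self):
        # A real wrapper might unlink self.name here.
        events.append("del " + self.name)


def demo():
    original = DeleteOnDrop("scratch.txt")
    payload = pickle.dumps(original)
    events.clear()

    copy = pickle.loads(payload)  # rebuilt without calling __init__
    after_load = list(events)

    del copy                      # last reference gone: __del__ runs
    gc.collect()
    return after_load, list(events)
```

The first list returned by demo() shows what ran during the load, and
the second shows what had run once the unpickled copy was discarded.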

But at least with this 'safe_pickle' you get to determine what you trust.

                    Andrew
                    dalke at dalkescientific.com





