[Web-SIG] Python pickle and web security.

Sat Sep 16 04:07:01 CEST 2006

Hi,

I think my main point was about using pickle for sessions, not just
using pickle by itself.

Unlike loading most other data formats, loading a pickle runs code.
It is effectively like running Python code.  So if you do not trust
the place where you store your pickles enough to run Python code from
it, then that is a problem.
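
To make that concrete, here is a minimal sketch.  It assumes Python 3,
and the class name Innocent is made up for illustration.  The pickle
names a callable and its arguments, and the unpickler calls it while
loading:

```python
import pickle

class Innocent:
    """Hypothetical class; __reduce__ tells pickle how to rebuild it."""
    def __reduce__(self):
        # "To load me, call eval('6 * 7')."  An attacker would pick
        # something far less friendly than eval on a constant.
        return (eval, ("6 * 7",))

data = pickle.dumps(Innocent())

# Loading calls eval(); no Innocent instance comes back.
result = pickle.loads(data)
print(result)
```

The loader never asked for eval to be called.  The bytes of the pickle
decided that.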

If the pickle or unpickle code is not bug free, then you cannot rule
out that someone will craft data which, once placed in a pickle,
tricks the escaping code when it is unpickled.

With the history of bugs with the unpickle code, I don't think relying
on it is a good idea.

For a list of pickle bugs you can search the Python bug tracker.
There are over 70 bugs listed, including open, closed, and deleted
bugs.  Of those, 13 are open.

One of the bugs was closed because: 'Closing due to lack of response.
cPickle is such a complex module, without a test case the leak cannot
be found.'

I think that line says a lot about how much you should trust the C
pickle module, which is 5753 lines long and has not been audited.

Will pickle *always* escape data you pass it correctly when it encodes
it into a pickle?  Will unpickle *always* unescape parts of the pickle
correctly?  If not then those pickles can run code.

The risk of using pickle does not seem to be worth the convenience
that it gives.  With alternatives to pickle available which do not
execute code, why not use them?
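
As one example of such an alternative, here is a rough sketch using
the json module from the Python 3 standard library.  That choice is my
assumption, other data-only formats would do, and the function names
dump_session and load_session are hypothetical.  Loading JSON can only
produce dicts, lists, strings, numbers, booleans and None:

```python
import json

def dump_session(session):
    # Only plain data types can be encoded; no callables are named.
    return json.dumps(session).encode("utf-8")

def load_session(raw):
    # Parsing builds plain data or raises ValueError; it calls nothing
    # chosen by the data.
    return json.loads(raw.decode("utf-8"))

session = {"user_id": 42, "q": "value taken from a GET variable"}
restored = load_session(dump_session(session))

# Bytes that would be a code-running pickle are just a parse error here.
try:
    load_session(b"cos\ngetpid\n(tR.")
    rejected = False
except ValueError:
    rejected = True
```

The cost is that only plain data types can be stored, which for most
session data is not much of a restriction.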

By using pickle for session data you allow people the opportunity to
put data into the pickle.  For example, say you store a given GET
variable in the session.

The problem I see is the combination of the two: pickle sessions let
people put data into the pickle, and there is a risk that pickle might
not encode or decode that data correctly.

However, if allowing untrusted data to be placed into a pickle is ok,
then this is not a problem.  That only leaves the problem that the
data store for your sessions is able to execute code wherever you
load sessions.

This means you allow execution of code from your data store in your
session loading code.  So if you use a separate database machine
(quite common), or a separate memcache server (not unheard of), you
allow these machines to execute code on the machine that uses the
sessions.
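
A small sketch of that trust boundary follows, assuming Python 3.  The
dict session_store is a stand-in for the separate database or memcache
machine, and the function names are hypothetical.  Whoever controls
the store only needs to write bytes; no Python classes are required on
their side:

```python
import os
import pickle

# Stand-in for a separate database or memcache machine (hypothetical).
session_store = {}

def save_session(session_id, data):
    session_store[session_id] = pickle.dumps(data)

def load_session(session_id):
    # Runs on the web machine; trusts whatever bytes the store returns.
    return pickle.loads(session_store[session_id])

save_session("abc123", {"user": "rene"})

# The store side replaces the value with a hand-written protocol 0
# pickle meaning "import os and call os.getpid()".
session_store["abc123"] = b"cos\ngetpid\n(tR."

# The call happens in the process that loads the session.
loaded = load_session("abc123")
```

Here the call is a harmless os.getpid(), but the store chose it, and
it ran on the machine that loaded the session.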

There's a reason why people use separate user accounts, and separate
machines for doing different tasks.  That reason is to limit what each
user or machine can do.  By using pickles for sessions those benefits
are removed in some cases.

Cheers,

On 9/15/06, Jim Fulton <jim at zope.com> wrote:
>
> On Sep 15, 2006, at 4:29 AM, René Dudfield wrote:
>
> > Hello,
> >
> > I posted this on my blog the other day about people using pickle for
> > sessions, but got no response.  Do you guys think using pickles for
> > sessions is an ok thing to do?
>
> You don't want to accept pickles from an untrusted source, which
> typically means you don't want to accept pickles over the network.
> Even then, there are ways to use pickles securely. For example, you
> can, if you know what you're doing, arrange to prevent pickle from
> calling global objects or control specifically what global objects
> are callable.
>
> There is nothing wrong with using pickles to store data internally.
> As long as the pickles are generated by the application, there is no
> risk to the application reading them again, assuming that they are
> stored where they can't be tampered with.
>
> Saying pickle is inherently insecure is like saying Python is
> inherently insecure.  You don't want to execute Python from an
> untrusted source.  If someone can tamper with your Python code, then
> you have a serious security problem as well.
>
> Jim
>