[Python-Dev] The pysandbox project is broken

Nick Coghlan ncoghlan at gmail.com
Sat Nov 16 11:12:18 CET 2013


On 16 Nov 2013 11:35, "Christian Tismer" <tismer at stackless.com> wrote:
> IOW: Do we really need a full abstraction, embedded in a virtual OS, or
> is there already a compromise that suits 98 percent of the common needs?
>
> I think as a starter, categorizing the expectations of some measure of 'secure python'
> would make sense. And I'm asking the people with better knowledge of these matters
> than I have. (and not asking those who don't... ;-) )

The litany of vulnerability reports against the Java sandbox has long
confirmed my impression that secure sandboxing is a hard, not
completely solved problem, best left to better resourced platform
developers (or at least tackled by taking the appropriate steps to
benefit from their work).

A self-hosted language runtime level sandbox is, at best, a first line
of defence that protects against basic, naive attacks. One of the
assumptions I see from the folks working on operating system, virtual
machine and container security is that the sandboxes *will* be
compromised at some point, so you have to make sure you understand what
the consequences of those breaches will be. The best answer is
"they run into the next line of defence, so the only thing they have
gained is the ability to attack that".

In terms of in-process sandboxing of CPython (*at all*, let alone
self-hosted), we're currently missing some key foundational
components:

- the ability for a host process to cleanly configure the capabilities
of an embedded CPython interpreter (that's what PEP 432 is all about)
- elimination of all of the mechanisms by which hostile untrusted code
can trigger a segfault in the runtime. Any segfault bug can reasonably
be assumed to be a security vulnerability waiting to be exploited; the
only questions are whether the CPython runtime is part of the exposed
attack surface, and what the consequences of compromising the runtime
would be. While Victor Stinner's recent work with failmalloc has been
a big step forward here, as have various other changes in the
CPython code base (like adding recursion depth constraints to the
compiler toolchain), we're still a long way from being able to say
"CPython cannot be segfaulted by legal Python code that doesn't use
ctypes or an equivalent FFI library".

This is why I share Guido's (and the PyPy team's) view that secure,
cross-platform sandboxing of (C)Python is currently not possible.
Secure in-process sandboxing is hard even for languages like Lua,
JavaScript and Java that were designed from the ground up with
sandboxing in mind - sure, you can lock things down to the point where
untrusted code assuredly can't do any damage, but it often can't do
anything *useful* in that state, either.

By contrast, the PyPy sandbox model, which uses a deliberately
constrained runtime to execute untrusted code in an OS level process
that is designed to only permit communication with the parent process,
is *exactly* the kind of paranoid defence-in-depth approach that
should be employed when running untrusted code. Ideally, all of the
platform level "this child process is not allowed to do anything
except talk to me over stdin and stdout" mechanisms would also be
brought to bear on the sandboxed runtime, so that as yet undiscovered
vulnerabilities in the PyPy sandbox don't result in a system compromise.

Anyone interested in sandboxing of Python code would be well-advised
to direct their efforts towards the parent process bindings for
http://doc.pypy.org/en/latest/sandbox.html, as well as identifying the
associated platform specific settings to lock out the child process
from all system access except communication with the parent process
over the standard streams.
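
To make the shape of the parent side concrete, here is a rough sketch
of a parent process that talks to a child interpreter only over stdin
and stdout. It is not the PyPy sandbox bindings and it is not a sandbox
by itself: the helper names (run_untrusted, _cap, _restrict_child) and
the specific limits are assumptions made up for illustration, it is
POSIX only, and the rlimits used here do nothing to block filesystem or
network access. That part is exactly the platform specific lockdown
that still has to be identified.

```python
import resource
import subprocess
import sys


def _cap(which, value):
    """Lower one rlimit without asking for more than the current hard limit."""
    _soft, hard = resource.getrlimit(which)
    if hard != resource.RLIM_INFINITY:
        value = min(value, hard)
    resource.setrlimit(which, (value, value))


def _restrict_child():
    # Runs in the child between fork() and exec(). The values are
    # illustrative assumptions, not recommendations.
    _cap(resource.RLIMIT_CPU, 2)        # seconds of CPU time
    _cap(resource.RLIMIT_AS, 1 << 30)   # 1 GiB of address space


def run_untrusted(source, timeout=5.0):
    """Hypothetical helper: send `source` to a child interpreter on stdin
    and return (returncode, stdout).

    This only shows the communication pattern. Real isolation needs the
    OS level restrictions discussed above on top of it.
    """
    proc = subprocess.run(
        [sys.executable, "-I", "-"],  # -I: isolated mode, "-": program on stdin
        input=source,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        close_fds=True,               # no inherited descriptors beyond 0/1/2
        preexec_fn=_restrict_child,   # POSIX only, not safe with threads
        timeout=timeout,              # wall-clock backstop in the parent
        text=True,
    )
    return proc.returncode, proc.stdout


if __name__ == "__main__":
    print(run_untrusted("print(1 + 1)"))
```

On POSIX a negative return code means the child was killed by a signal,
so a segfault in the child shows up in the parent as a value to inspect
rather than as a crash of the parent itself, which is the "run into the
next line of defence" behaviour described earlier.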

Cheers,
Nick.

