[Python-Dev] The pysandbox project is broken

Sat Nov 16 11:50:48 CET 2013

On Sat, Nov 16, 2013 at 12:12 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On 16 Nov 2013 11:35, "Christian Tismer" <tismer at stackless.com> wrote:
>> IOW: Do we really need a full abstraction, embedded in a virtual OS, or
>> is there already a compromise that suits 98 percent of the common needs?
>>
>> I think as a starter, categorizing the expectations of some measure of 'secure python'
>> would make sense. And I'm asking the people with better knowledge of these matters
>> than I have. (and not asking those who don't... ;-) )
>
> The litany of vulnerability reports against the Java sandbox has long
> confirmed my impression that secure sandboxing is a hard, not
> completely solved problem, best left to better resourced platform
> developers (or at least taking the appropriate steps to benefit from
> their work).
>
> A self-hosted language runtime level sandbox is, at best, a first line
> of defence that protects against basic, naive attacks. One of the
> assumptions I see from the folks working on operating systems, virtual
> machine and container security is that the sandboxes *will* be
> compromised at some point, so you have to make sure to understand what
> the consequences of those breaches will be, and the best answer is
> "they run into the next line of defence, so the only thing they have
> gained is the ability to attack that".
>
> In terms of in-process sandboxing of CPython (*at all*, let alone
> self-hosted), we're currently missing some key foundational
> components:
>
> - the ability for a host process to cleanly configure the capabilities
> of an embedded CPython interpreter (that's what PEP 432 is all about)
> - elimination of all of the mechanisms by which hostile untrusted code
> can trigger a segfault in the runtime (any segfault bug can reasonably
> be assumed to be a security vulnerability waiting to be exploited, the
> only question is whether the CPython runtime is part of the exposed
> attack surface, and what the consequences are of compromising the
> runtime). While Victor Stinner's recent work with failmalloc has been
> a big step forward here, as have been various other changes in the
> CPython code base (like adding recursion depth constraints to the
> compiler toolchain), we're still a long way from being able to say
> "CPython cannot be segfaulted by legal Python code that doesn't use
> ctypes or an equivalent FFI library".
>
> This is why I share Guido's (and the PyPy team's) view that secure,
> cross-platform sandboxing of (C)Python is currently not possible.
> Secure in-process sandboxing is hard even for languages like Lua,
> JavaScript and Java that were designed from the ground up with
> sandboxing in mind - sure, you can lock things down to the point where
> untrusted code assuredly can't do any damage, but it often can't do
> anything *useful* in that state, either.
>
> By contrast, the PyPy sandbox model which uses a deliberately
> constrained runtime to execute untrusted code in an OS level process
> that is designed to only permit communication with the parent process
> is *exactly* the kind of paranoid defence-in-depth approach that
> should be employed when running untrusted code. Ideally, all of the
> platform level "this child process is not allowed to do anything
> except talk to me over stdin and stdout" would also be brought to bear
> on the sandboxed runtime, so that as yet undiscovered vulnerabilities
> in the PyPy sandbox don't result in a system compromise.
>
> Anyone interested in sandboxing of Python code would be well-advised
> to direct their efforts towards the parent process bindings for
> http://doc.pypy.org/en/latest/sandbox.html, as well as identifying the
> associated platform specific settings to lock out the child process
> from all system access except communication with the parent process
> over the standard streams.

Note, Nick, that running the untrusted code in a child process (as opposed
to having two different Pythons running in the same process) is really
not a limitation of the approach. It's just that the current sandbox is a
proof of concept; various other options are also possible, but no one seems
to be interested in pursuing them. Additional OS-level blocking really
only guards against potential segfaults, since we know that
there is no IO possible from the inner process. A JIT-less PyPy
sandbox can be made very secure by locking the executable pages as
non-writable (we know the code does not do any IO).
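
For readers who want something concrete, here is a rough sketch of the
parent-process side of the "child may only talk to me over stdin and
stdout" pattern discussed above. It is NOT the PyPy sandbox and NOT a secure
sandbox: it uses plain CPython, and the names CHILD_SOURCE, _limit_child and
run_child are made up for illustration. The rlimit calls are POSIX-only and
only stand in for the "associated platform specific settings"; a real setup
would need much stronger OS-level confinement.

```python
import resource
import subprocess
import sys

# Hypothetical child program: read one line from stdin, answer on stdout.
CHILD_SOURCE = "import sys; sys.stdout.write(sys.stdin.readline().upper())"


def _limit_child():
    # Runs in the child between fork() and exec(); POSIX only.
    # Illustrative limits, not a complete lockdown.
    resource.setrlimit(resource.RLIMIT_CPU, (2, 2))    # at most 2 s of CPU time
    resource.setrlimit(resource.RLIMIT_FSIZE, (0, 0))  # cannot grow regular files


def run_child(request, timeout=10):
    """Send one request to the child and return (returncode, reply)."""
    proc = subprocess.run(
        # -I: isolated mode (ignore PYTHON* variables and user site-packages)
        # -B: do not try to write .pyc files
        [sys.executable, "-I", "-B", "-c", CHILD_SOURCE],
        input=request,               # delivered on the child's stdin
        stdout=subprocess.PIPE,      # the only channel back to the parent
        stderr=subprocess.DEVNULL,
        close_fds=True,              # do not leak the parent's other descriptors
        preexec_fn=_limit_child,
        timeout=timeout,             # wall-clock guard enforced by the parent
        text=True,
    )
    return proc.returncode, proc.stdout


if __name__ == "__main__":
    print(run_child("hello\n"))
```

The point is only the shape: the parent owns every resource, and the child
gets two pipes and a CPU budget. With the PyPy sandbox the child would be
the sandboxed interpreter instead of sys.executable, and the parent would
answer its marshalled requests rather than exchange plain text.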

Cheers,
fijal