Creating a reliable sandboxed Python environment

Thu May 28 12:34:25 EDT 2015

Thanks for the responses, folks. I will briefly summarize them:

> As you say, it is fundamentally not possible to make this work at the Python level.

This is pretty effectively demonstrated by "Tav's admirable but failed attempt to sandbox file IO":
* http://tav.espians.com/a-challenge-to-break-python-security.html

Wow, there are some impressive ways to confuse the system. I particularly like overriding str's equality function to defeat the mode-checking code when opening files.
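
To make the str-equality trick concrete, here is a simplified sketch of the idea. It is not the actual exploit from Tav's challenge; SneakyMode, naive_mode_allowed, and strict_mode_allowed are names I made up for illustration.

```python
class SneakyMode(str):
    """Hypothetical str subclass that claims to equal anything."""

    def __eq__(self, other):
        return True

    def __ne__(self, other):
        return False

    # Defining __eq__ clears __hash__, so restore str's version.
    __hash__ = str.__hash__


def naive_mode_allowed(mode):
    # Hypothetical guard: trusts the caller-supplied object's own __eq__.
    return mode == "r"


def strict_mode_allowed(mode):
    # Safer: insist on an exact str, so no overridden __eq__ can run.
    return type(mode) is str and mode == "r"


mode = SneakyMode("w")
print(naive_mode_allowed(mode))   # the naive guard is fooled
print(strict_mode_allowed(mode))  # the exact-type check is not
print(list(mode))                 # the underlying characters are still "w"
```

Even the strict check only closes this one hole, which is rather the point: every such guard has to anticipate every way an object can lie about itself.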

> When we needed this at edX, we wrote CodeJail (https://github.com/edx/codejail). It's a wrapper around AppArmor to provide OS-level protection of code execution in subprocesses. It has Python-specific features, but because it is based on AppArmor, can sandbox any process, so long as it's configured properly.

This looks promising. I will take a closer look.
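
As a rough illustration of OS-level limits on a subprocess, here is a minimal sketch using only the standard library's resource module on a Unix system. This is not CodeJail's API, and it does not involve AppArmor at all; run_limited and _cap are hypothetical helpers, and the limit values are arbitrary. Resource limits like these cap CPU time, memory, and written file size, but they do nothing to restrict file reads or sockets, which is the part a mechanism like AppArmor would have to cover.

```python
import resource
import subprocess
import sys


def _cap(limit, value):
    # Never ask for more than the hard limit we already run under.
    _, hard = resource.getrlimit(limit)
    if hard != resource.RLIM_INFINITY:
        value = min(value, hard)
    resource.setrlimit(limit, (value, value))


def run_limited(source, cpu_seconds=2, memory_bytes=1 << 30, timeout=10):
    """Run `source` in a child interpreter with capped resources."""

    def apply_limits():
        # Runs in the child between fork() and exec().
        _cap(resource.RLIMIT_CPU, cpu_seconds)   # CPU time, in seconds
        _cap(resource.RLIMIT_AS, memory_bytes)   # address space, in bytes
        _cap(resource.RLIMIT_FSIZE, 0)           # files may not grow past 0 bytes

    return subprocess.run(
        [sys.executable, "-I", "-c", source],    # -I: isolated mode
        preexec_fn=apply_limits,
        stdin=subprocess.DEVNULL,
        capture_output=True,
        text=True,
        timeout=timeout,                         # wall-clock backstop
    )


result = run_limited("print(6 * 7)")
print(result.returncode, result.stdout.strip())
```

Writing to the stdout pipe is unaffected by RLIMIT_FSIZE, since that limit only applies to regular files.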

> What about launching the Python process in a Docker container?

This may work in combination with other techniques. It is certainly faster than spinning up a new VM or repeatedly snapshot-restoring a fixed VM. I would need to see whether CPU, memory, and disk usage can be constrained at the container level.
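
For what it's worth, here is a sketch of what a constrained container launch might look like, assuming a reasonably recent Docker CLI. The helper docker_sandbox_cmd, the /sandbox and /data mount points, the python:3-slim image, and the specific limit values are all my assumptions for illustration. The function only builds the command; it does not run Docker.

```python
import os
import shlex


def docker_sandbox_cmd(script_path, readable_dirs=(), image="python:3-slim"):
    """Build (but do not run) a locked-down `docker run` command."""
    cmd = [
        "docker", "run", "--rm", "-i",
        "--network", "none",    # no network access
        "--memory", "256m",     # memory cap
        "--cpus", "0.5",        # at most half a CPU core
        "--pids-limit", "64",   # cap on processes/threads
        "--read-only",          # root filesystem is read-only
    ]
    # Mount the submitted script and each whitelisted directory read-only.
    mounts = [(script_path, "/sandbox/main.py")]
    for d in readable_dirs:
        name = os.path.basename(os.path.normpath(d))
        mounts.append((d, "/data/" + name))
    for host, target in mounts:
        cmd += ["-v", "{}:{}:ro".format(os.path.abspath(host), target)]
    return cmd + [image, "python", "/sandbox/main.py"]


print(shlex.join(docker_sandbox_cmd("submission.py", ["/srv/datasets"])))
```

Disk usage is handled here only indirectly, by making everything the program can see read-only; a real deployment would still need to decide how much to trust the container boundary itself.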

- David

On Monday, May 25, 2015 at 7:24:32 PM UTC-7, davi... at gmail.com wrote:
> I am writing a web service that accepts Python programs as input, runs the provided program with some profiling hooks, and returns various information about the program's runtime behavior. To do this in a safe manner, I need to be able to create a sandbox that restricts what the submitted Python program can do on the web server.
> 
> Almost all discussion about Python sandboxes I have seen on the internet involves selectively blacklisting functionality that gives access to system resources, such as trying to hide the "open" builtin to restrict access to file I/O. All such approaches are doomed to fail because you can always find a way around a blacklist.
> 
> For my particular sandbox, I wish to allow *only* the following kinds of actions (in a whitelist):
> * reading from stdin & writing to stdout;
> * reading from files, within a set of whitelisted directories;
> * pure Python computation.
> 
> In particular all other operations available through system calls are banned. This includes, but is not limited to:
> * writing to files;
> * manipulating network sockets;
> * communicating with other processes.
> 
> I believe it is not possible to limit such operations at the Python level. The best you could do is try replacing all the standard library modules, but that is again just a blacklist - it won't prevent a determined attacker from doing things like constructing their own 'code' object and executing it.
> 
> It might be necessary to isolate the Python process at the operating system level.
> * A chroot jail on Linux & OS X can limit access to the filesystem. Again this is just a blacklist.
> * No obvious way to block socket creation. Again this would be just a blacklist.
> * No obvious way to detect unapproved system calls and block them.
> 
> In the limit, I could dynamically spin up a virtual machine and execute the Python program inside it. However, that's extremely expensive in computational time.
> 
> Has anyone on this list attempted to sandbox Python programs in a serious fashion? I'd be interested to hear your approach.
> 
> - David