[Python-Dev] Challenge: Please break this! (a.k.a restricted mode revisited)

Steven D'Aprano steve at pearwood.info
Tue Apr 12 09:12:27 EDT 2016

I haven't been following this thread in detail, so perhaps I have 
missed something, but I have a question...

On Tue, Apr 12, 2016 at 02:05:06PM +0200, Victor Stinner wrote:

> You don't understand that even if the visible "Python scope", "Python
> namespace", or call it as you want (the code that is accessible from
> your sandbox) looks very tiny, the real effective code is HUGE. For
> example, you give a full access to the str type which is made of 20K
> lines of C code:
> haypo at smithers$ wc -l Objects/unicodeobject.c Objects/unicodectype.c
> Objects/stringlib/*h
>  15670 Objects/unicodeobject.c
>   1284 Objects/stringlib/unicode_format.h
>  20156 total
> Did you review carefully *all* these lines? If a single C line gives
> access to the real Python namespace, the game is over.

I don't follow this logic. Jon's sandbox doesn't provide an interface to 
calling arbitrary lines of C code from Python. It is limited to only a 
restricted set of Python operations.

So sticking to string methods for the sake of discussion, it doesn't 
matter if (let's say) str.upper has access to the real Python namespace. 
There is no API for str.upper to return that namespace. It only returns 
a new string. So where is the error in the following reasoning?

There are 44 string methods, excluding those that start with an 
underscore. So if Jon audits those 44 methods, and determines which ones 
return (let's say) strings and which give access to namespaces, then he 
can block the ones which give access to namespaces and allow the ones 
which return strings.
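
As a rough illustration of that audit, the sketch below lists the public str methods and routes every call through a whitelist. The names public_str_methods, ALLOWED and call_str_method are hypothetical and are not taken from Jon's sandbox; the number of methods found depends on the Python version, and the whitelist entries are only examples.

```python
# Hypothetical audit helper; none of these names come from Jon's sandbox.

def public_str_methods():
    """Return the names of str attributes that do not start with an underscore."""
    return sorted(name for name in dir(str) if not name.startswith("_"))

# Illustrative whitelist: methods assumed (for this sketch only) to return
# nothing but strings or booleans. A real audit would justify each entry.
ALLOWED = frozenset({"upper", "lower", "strip", "isdigit"})

def call_str_method(s, name, *args):
    """Call a whitelisted str method on s and refuse everything else."""
    if type(s) is not str:
        raise TypeError("only plain str instances are accepted")
    if name not in ALLOWED:
        raise PermissionError("str.%s is not whitelisted" % name)
    return getattr(str, name)(s, *args)

print(len(public_str_methods()))         # count varies by Python version
print(call_str_method("spam", "upper"))  # SPAM
```

With this arrangement a call such as call_str_method("{0.__class__}", "format", obj) is refused, because format is not in the example whitelist.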

To give a concrete example... suppose that the C locale library is 
unsafe. Further, let's suppose that the str.isdigit method calls code 
from the C locale library, to determine whether or not the string is 
made up of locale-specific digits. How does this make str.isdigit 
(potentially) unsafe? Regardless of what happens inside the method, it 
still returns either True or False and nothing else. There's no 
str.isdigit API to access the locale library.
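
A small sketch of that point: whatever str.isdigit consults internally, the caller only ever receives a bool. The sample strings are arbitrary choices for illustration. (In CPython 3 the method appears to rely on Unicode character data rather than the C locale, so the locale library above is a supposition for the sake of argument.)

```python
# Whatever str.isdigit does internally, the caller only ever sees a bool.
# The last sample is three Arabic-Indic digits, written as escapes.
samples = ["aardvark", "12345", "", "\u0661\u0662\u0663"]

results = [s.isdigit() for s in samples]

for s, r in zip(samples, results):
    # Each result is exactly True or False, never any other object.
    print(ascii(s), r, type(r).__name__)
```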

I can think of one possible threat. Suppose that the locale library has 
a bug, so that calling "aardvark".isdigit() segfaults, potentially 
executing arbitrary C code, but at the very least crashing the 
application. Is that the sort of attack you're concerned about?

> In a few minutes, I found "{0.__class__}".format(obj) which is not a
> full escape of the sandbox, but it's just to give one example. With
> more time, I'm sure that a line can be found in the str type to escape
> your sandbox.

Maybe so. And then Jon will fix that vulnerability. And somebody will 
find a new one. And he'll fix that too, or decide that it is too hard to 
fix and give up.
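
For reference, the format-string trick Victor quotes can be reproduced in a few lines. This is only a sketch of why str.format needs auditing; the Account class is a hypothetical stand-in for whatever object the sandbox exposes.

```python
class Account:
    """Hypothetical object handed to sandboxed code."""
    pass

obj = Account()

# The template is plain string data, yet str.format performs the attribute
# lookup itself, so a filter that only scans source code for "__class__"
# attribute access would not see it.
leaked = "{0.__class__}".format(obj)
print(leaked)

# The lookup can be chained further inside the same template.
name = "{0.__class__.__name__}".format(obj)
print(name)  # Account
```

As Victor says, this is not a full escape on its own, but it shows one string method reaching attributes on behalf of the caller.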

That's how security works. Even software designed for security can have 
exploitable bugs:


It seems unfair to me to hold Jon to a higher standard than we hold 
people like Apple, or the Linux kernel devs.

I fully accept and respect your personal opinion, based on your 
experience, that Jon's tactic is doomed to failure. But if he needs to 
learn this for himself, just as you had to learn it for yourself 
(otherwise you wouldn't have started your own sandbox project), I can 
respect that too. Progress depends on the unreasonable person who thinks 
they can overturn the conventional wisdom.

You're telling Jon not to bother trying to sandbox CPython, and that he 
should use PyPy's sandbox instead. But if the PyPy people had believed the 
conventional wisdom that you can't sandbox Python, they wouldn't have a 
sandbox either.

Even if the only thing we learn from Jon's experiment is a new set of 
tricks for breaking out of the sandbox, that's still interesting, if not 
useful. And maybe he'll find some combination of whitelist and OS-level 
jail that together makes a practical sandbox. And if not, well, it's his 
own time he is wasting.

> IMHO it's a waste of time to try to reduce the great Python with
> battery included to a simple calculator to compute 1+2.

Completely agree. But hopefully the whitelist won't be that restrictive, 
and will allow subtraction and multiplication as well :-)

