[Python-Dev] Challenge: Please break this! (a.k.a restricted mode revisited)
Steven D'Aprano
steve at pearwood.info
Tue Apr 12 09:12:27 EDT 2016
I haven't been following this thread in detail, so perhaps I have
missed something, but I have a question...
On Tue, Apr 12, 2016 at 02:05:06PM +0200, Victor Stinner wrote:
> You don't understand that even if the visible "Python scope", "Python
> namespace", or call it as you want (the code that is accessible from
> your sandbox) looks very tiny, the real effective code is HUGE. For
> example, you give a full access to the str type which is made of 20K
> lines of C code:
>
> haypo at smithers$ wc -l Objects/unicodeobject.c Objects/unicodectype.c
> Objects/stringlib/*h
> 15670 Objects/unicodeobject.c
[...]
> 1284 Objects/stringlib/unicode_format.h
> 20156 total
>
> Did you review carefully *all* these lines? If a single C line gives
> access to the real Python namespace, the game is over.
I don't follow this logic. Jon's sandbox doesn't provide an interface for
calling arbitrary lines of C code from Python. It is limited to only a
restricted set of Python operations.
So sticking to string methods for the sake of discussion, it doesn't
matter if (let's say) str.upper has access to the real Python namespace.
There is no API for str.upper to return that namespace. It only returns
a new string. So where is the error in the following reasoning?
There are 44 string methods, excluding those that start with an
underscore. So if Jon audits those 44 methods, and determines which ones
return (let's say) strings and which give access to namespaces, then he
can block the ones which give access to namespaces and allow the ones
which return strings.
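As a rough sketch of where such an audit might start, the snippet below
lists the public str methods and records the result type of each one
that can be called with no arguments. The helper names
(public_str_methods, probe_no_arg_methods) and the sample string are
made up for illustration, and the method count depends on the Python
version. Methods that need arguments are skipped here and would still
have to be reviewed by hand.

```python
def public_str_methods():
    """Return the names of str methods that do not start with an underscore."""
    return sorted(
        name
        for name in dir(str)
        if not name.startswith("_") and callable(getattr(str, name))
    )


def probe_no_arg_methods(sample="Hello 123"):
    """Call each public method that accepts zero arguments; record its result type."""
    results = {}
    for name in public_str_methods():
        try:
            value = getattr(sample, name)()
        except TypeError:
            # The method requires arguments, so this probe cannot classify it.
            continue
        results[name] = type(value).__name__
    return results


if __name__ == "__main__":
    print(len(public_str_methods()), "public str methods on this interpreter")
    for name, type_name in probe_no_arg_methods().items():
        print(name, "->", type_name)
```

A probe like this only shows what comes back for one input. It says
nothing about methods such as format, whose behaviour depends on the
arguments passed in.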
To give a concrete example... suppose that the C locale library is
unsafe. Further, let's suppose that the str.isdigit method calls code
from the C locale library, to determine whether or not the string is
made up of locale-specific digits. How does this make str.isdigit
(potentially) unsafe? Regardless of what happens inside the method, it
still returns either True or False and nothing else. There's no
str.isdigit API to access the locale library.
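A small check of that claim, assuming nothing beyond the built-in str
type: whatever str.isdigit consults internally, the caller only ever
sees a bool. The sample strings are arbitrary.

```python
# Arbitrary sample inputs, including an empty string and non-ASCII digits.
samples = ["123", "aardvark", "", "12.5", "\u0661\u0662\u0663"]

results = {text: text.isdigit() for text in samples}

for text, answer in results.items():
    # The only thing the method hands back is True or False.
    assert type(answer) is bool
    print(repr(text), answer)
```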
I can think of one possible threat. Suppose that the locale library has
a bug, so that calling "aardvark".isdigit() segfaults, potentially
executing arbitrary C code, but at the very least crashing the
application. Is that the sort of attack you're concerned about?
> In a few minutes, I found "{0.__class__}".format(obj) which is not a
> full escape of the sandbox, but it's just to give one example. With
> more time, I'm sure that a line can be found in the str type to escape
> your sandbox.
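For readers who haven't seen the trick quoted above: str.format
replacement fields may contain attribute lookups, so a format string can
reach attributes of its arguments without the calling code ever writing
obj.__class__ itself. A minimal sketch, using a made-up Secret class as
the argument:

```python
class Secret:
    """Hypothetical stand-in for any object handed to sandboxed code."""


obj = Secret()

# The attribute lookup happens inside str.format, not in the caller's code.
leaked = "{0.__class__}".format(obj)
print(leaked)

# Lookups can be chained, so the format string can walk further.
class_name = "{0.__class__.__name__}".format(obj)
print(class_name)
```

Note that the result is still a string. The format call exposes the
repr of whatever it reaches, not the object itself, which is why the
quoted message calls it less than a full escape.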
Maybe so. And then Jon will fix that vulnerability. And somebody will
find a new one. And he'll fix that too, or decide that it is too hard to
fix and give up.
That's how security works. Even software designed for security can have
exploitable bugs:
http://securityvulns.com/news/FreeBSD/jail/chdir.html
It seems unfair to me to hold Jon to a higher standard than we hold
people like Apple, or the Linux kernel devs.
I fully accept and respect your personal opinion, based on your
experience, that Jon's tactic is doomed to failure. But if he needs to
learn this for himself, just as you had to learn it for yourself
(otherwise you wouldn't have started your own sandbox project), I can
respect that too. Progress depends on the unreasonable person who thinks
they can overturn the conventional wisdom.
You're telling Jon not to bother trying to sandbox CPython, and that he should
use PyPy's sandbox instead. But if the PyPy people had believed the
conventional wisdom that you can't sandbox Python, they wouldn't have a
sandbox either.
Even if the only thing we learn from Jon's experiment is a new set of
tricks for breaking out of the sandbox, that's still interesting, if not
useful. And maybe he'll find some combination of whitelist and OS-level
jail that together makes a practical sandbox. And if not, well, it's his
own time he is wasting.
> IMHO it's a waste of time to try to reduce the great Python with
> battery included to a simple calculator to compute 1+2.
Completely agree. But hopefully the whitelist won't be that restrictive,
and will allow subtraction and multiplication as well :-)
--
Steve