[Python-3000] Will we have a true restricted exec environment for python-3000?

Sat Apr 8 19:44:55 CEST 2006

Vineet Jain wrote:
> Nick Coghlan wrote:
>> are somewhat staggering, and designing an in-process sandbox to cope 
>> with that is a big ask (and demonstrating that the sandbox actually 
>> *achieves* that goal is even tougher).
> I was thinking along the lines of:
> 
> 1. Start a "light" python interpreter, which by default will not allow 
> you to import anything including any of the standard python libraries.

But will it allow you to use numbers or strings?

If yes, then you can get to object(), and hence to pretty much whatever C 
builtins you want. So it's not enough to try to hide dangerous builtins like 
file(), you want to remove them from the light version entirely (routing all 
file system and network access requests through the main application). But if 
the file objects are gone, what happens to the Python machinery that relies on 
them (like import)?
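
As a rough illustration of the "numbers or strings" point, here is a minimal
sketch in present-day Python 3 syntax (an assumption on my part; the variable
names are made up for the example). It starts from an integer literal and
walks to object and its subclasses without importing anything or naming a
single builtin:

```python
# Start from an ordinary literal; no imports, no builtin names used.
root = (0).__class__.__mro__[-1]      # int -> ... -> object

# Every class that directly subclasses object is now reachable,
# whether or not its name was hidden from the namespace.
reachable = root.__subclasses__()

# Collect the names just to show what can be seen from here.
names = sorted(cls.__name__ for cls in reachable)
print(len(names), "classes reachable from a plain integer")
```

Exactly which classes show up depends on the interpreter version and on what
has been loaded, so treat the list as indicative only.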

Python's powerful introspection is a severe drawback from a security POV - it 
is *really* hard to make a user stay in a box you put them in without 
crippling some part of the language as a side effect.

So while I agree with your approach in principle, there's a big chunk of work 
hiding behind that word "light". What inconveniences can be tolerated in the 
restricted code when the payoff is that a user can trust that the code is 
unable to do anything "bad" to the system? What are the kinds of cases where 
rexec and Bastion broke down, and is it possible to avoid them? Can 
rexec/Bastion be fixed, or is it necessary to start from scratch? Would it 
make sense to modify the parser to disallow the use of 
getattr/setattr/delattr, and make any identifier using double-preceding and 
double-trailing underscores a SyntaxError? The latter disables Python-level 
access to most magic attributes, while the former would eliminate functions 
that otherwise provide easy workarounds for the latter restriction. Together 
these would disable most introspection, so that other techniques which are 
otherwise easy to work around might become somewhat effective.
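
A hedged sketch of what such a restriction might look like, done as a check
over the AST before compiling rather than as a real parser change (that
substitution is mine). The names BANNED_NAMES, is_dunder and check_restricted
are hypothetical, and the check only covers plain names and attribute access,
not every place an identifier can appear (function definitions, keyword
arguments, imports and so on):

```python
import ast

# Hypothetical list of builtins to refuse by name.
BANNED_NAMES = {"getattr", "setattr", "delattr"}


def is_dunder(name):
    """True for identifiers with double leading and trailing underscores."""
    return len(name) > 4 and name.startswith("__") and name.endswith("__")


def check_restricted(source):
    """Raise SyntaxError if source uses a banned name or a dunder identifier.

    Only ast.Name and ast.Attribute nodes are inspected; this is an
    illustration of the idea, not a complete or secure filter.
    """
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            ident = node.id
        elif isinstance(node, ast.Attribute):
            ident = node.attr
        else:
            continue
        if ident in BANNED_NAMES or is_dunder(ident):
            raise SyntaxError(
                f"identifier {ident!r} is not allowed in restricted code"
            )


# Ordinary code passes silently.
check_restricted("total = sum([1, 2, 3])")

# Introspection attempts are rejected before anything runs.
try:
    check_restricted("(0).__class__.__mro__")
except SyntaxError as exc:
    print("rejected:", exc)
```

A filter like this is not a sandbox by itself; it only removes the most
direct routes to the magic attributes.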

And more fundamental technology questions: If the restricted interpreter runs 
in a separate process, should it be a CPython compile time option, or a 
full-blown fork? Or would something based on PyPy be a better idea? Does 
Python's introspection make it a better idea to go with an Access Control List 
based system rather than a Capability based system?
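
On the capability question, a small sketch of why introspection is a worry
(the Account class and read_only function are invented for the example, and
the attribute names are those of present-day Python 3). A closure is used as
a "capability" that is supposed to grant read access only, but the wrapped
object can be recovered from the function itself:

```python
class Account:
    """Hypothetical object the host application wants to protect."""

    def __init__(self, balance):
        self.balance = balance


def read_only(account):
    """Hand out a capability that is meant to allow reading only."""
    def get_balance():
        return account.balance
    return get_balance


acct = Account(100)
cap = read_only(acct)          # the only thing given to untrusted code

# Introspection turns the read-only capability into full access:
# the closure cell still holds a reference to the original object.
leaked = cap.__closure__[0].cell_contents
leaked.balance = 0

print(cap())                   # the "protected" balance has been changed
```

Whether an ACL-style check at the C level fares any better is exactly the
open question; the sketch only shows that holding a reference is hard to
keep private in pure Python.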

IOW, I think there's a lot of work to be done just to figure out the real 
scope of the problem to be solved, as well as researching previous efforts 
(and why they failed) and comparable efforts (like the Java sandbox, the 
.NET security model, and Lua's lightweight core), before proceeding on to try 
to create something new. And that's even leaving out the part about trying to 
persuade people that what you've built actually *is* secure, and being 
prepared to back that up by promptly fixing any discovered holes.

Hmm, anyone looking for a thesis topic? ;)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org