[Python-Dev] Adding a builtins parameter to eval(), exec() and __import__().

Mark Shannon mark at hotpy.org
Thu Mar 8 14:40:46 CET 2012


Nick Coghlan wrote:
> On Thu, Mar 8, 2012 at 10:06 PM, Mark Shannon <mark at hotpy.org> wrote:
>> I don't think it cleans up import, but I'll defer to Brett on that.
>> I've included __import__() along with exec and eval as it is a place where
>> new namespaces can be introduced into an execution.
>> There may be others I haven't thought of.
> 
> runpy is another one.

Add that to the list.
> 
> However, the problem I see with "builtins" as a separate argument is
> that it would be a lie.
> 
> The element that's most interesting about locals vs globals vs
> builtins is the scope of visibility of their contents.
> 
> When I call out to another function in the same module, locals are not
> shared, but globals and builtins are.
> 
> When I call out to code in a *different* module, neither locals nor
> globals are shared, but builtins are still common.

Not necessarily. All functions in a module will inherit their globals 
*and* builtins from the module, which gets them from __import__().
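
To illustrate the inheritance, here is a minimal sketch. It assumes CPython's
current behaviour of looking up a "__builtins__" key in the globals dict;
fake_len, module_namespace and size are made-up names for illustration only.

```python
# Sketch: a function keeps the globals (and, through them, the builtins)
# of the namespace it was defined in. CPython-specific: it relies on the
# "__builtins__" key that CPython looks up in a globals dict.

def fake_len(obj):
    # Hypothetical replacement builtin, for illustration only.
    return -1

# Stand-in for a module namespace that was handed its own builtins.
module_namespace = {"__builtins__": {"len": fake_len}}
exec("def size(x):\n    return len(x)\n", module_namespace)

size = module_namespace["size"]

# The function carries its defining namespace around with it...
same_globals = size.__globals__ is module_namespace

# ...so calling it from here still resolves len() through that
# namespace's builtins, not through the caller's.
restricted_result = size("abc")
normal_result = len("abc")
```

So size() sees fake_len even though the caller's own len() is the real one.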

> 
> So there are two ways this purported extra "builtins" parameter could work:
> 
> 1. Sandboxing - you try to genuinely give the execution context a
> different set of builtins that's shared by all code executed, even
> imports from other modules.  

Victor's pysandbox seems pretty good to me. I had a go at breaking it
and failed. However, it is too restrictive.

Rather than make pysandbox more secure, I think my proposal could make
it more usable, as clearer guarantees about access and visibility can be
provided to the sandbox developer.
You shouldn't need to cripple introspection in order to limit access to 
the builtins.

> However, I assume this isn't what you
> meant, since it is the domain of sandboxing utilities like Victor's
> pysandbox and is known to be incredibly difficult to get right (hence
> the demise of both rexec and Bastion and recent comments about known
> segfault vulnerabilities that are tolerable in the normal case of
> merely processing untrusted data with trusted code but anathema to a
> robust CPython native sandboxing scheme that can still cope even when
> the code itself is untrusted).

Changing the implementation to be based around immutable "execution 
contexts" means that the compiler will enforce things for us.
Static typing has its advantages, occasionally :)

As I stated elsewhere, the crashers can be fixed. I think Victor has 
already fixed a couple.

> 
> 2. chained globals - just an extra namespace that's chained behind the
> globals dictionary for name lookup, not actually shared with code
> invoked from other modules.

That's exactly what builtins already are. They are a fallback for 
LOAD_GLOBAL and similar when something isn't found in the globals.
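
A small sketch of that lookup order, again assuming CPython's "__builtins__"
convention. The names answer, from_builtins and from_globals are invented for
the example.

```python
# Sketch of the lookup order: globals first, builtins as the fallback.
source = "found = answer()"

def from_builtins():
    return "builtins"

def from_globals():
    return "globals"

# Name present only in the builtins namespace: the fallback is used.
only_builtins = {"__builtins__": {"answer": from_builtins}}
exec(source, only_builtins)

# Name present in both: the globals entry shadows the builtins entry.
both = {"__builtins__": {"answer": from_builtins}, "answer": from_globals}
exec(source, both)

# Name present in neither: the lookup fails with NameError.
neither = {"__builtins__": {}}
try:
    exec(source, neither)
    lookup_failed = False
except NameError:
    lookup_failed = True
```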

> 
> The second approach is potentially useful, but:
> 
> 1. "builtins" is *not* the right name for it (because any other code
> invoked will still be using the original builtins)

Other code will use whatever builtins it was given at __import__.

The key point is that every piece of code already inherits locals, 
globals and builtins from somewhere else.
We can already control locals (by which parameters are passed in) and
globals (via exec, eval, __import__, and runpy; any others?),
but we can't control builtins.
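
For comparison, here is a rough emulation of what an explicit builtins
argument might look like from the caller's side. eval_with_builtins is a
hypothetical helper, not an existing or proposed API, and it leans on the
CPython "__builtins__" key rather than on any new parameter. It only
restricts name lookup; it is not a sandbox.

```python
def eval_with_builtins(expression, globals_ns, locals_ns, builtins_ns):
    """Hypothetical helper: evaluate with an explicitly chosen builtins mapping."""
    # Copy so the caller's globals dict is not modified.
    scoped_globals = dict(globals_ns)
    # CPython-specific: the builtins are taken from this key.
    scoped_globals["__builtins__"] = builtins_ns
    return eval(expression, scoped_globals, locals_ns)

# Globals, locals and builtins are each supplied separately.
value = eval_with_builtins("abs(x) + y", {"x": -2}, {"y": 5}, {"abs": abs})

# A builtin that was not passed in is simply not found.
try:
    eval_with_builtins("open", {}, {}, {"abs": abs})
    open_visible = True
except NameError:
    open_visible = False
```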


One last point is that this is a low-impact change. All code using eval, 
etc. will continue to work as before.
It also may speed things up a little.

Cheers,
Mark.


