[Python-Dev] In defense of Capabilities [was: doc for new restricted execution design for Python]

Thu Jul 6 07:13:07 CEST 2006

Brett Cannon wrote:
> On 7/5/06, Michael Chermside <mcherm at mcherm.com> wrote:
>> If you were using capabilities, you would need to ensure that
>> restricted interpreters could only get the file object that they
>> were given. But then _all_ of these fancy versions of the
>> restrictions would be immediately supported: it would be up to the
>> users to create secure wrappers implementing the specific
>> restrictions desired.
> 
> I agree.  I would prefer this way of doing it.  But as I have said, making
> sure that 'file' does not get out into the wild is tough.

I seem to recall someone mentioned earlier in this discussion the notion 
of somehow throwing an exception when sandboxed code attempts to push a 
file reference onto the interpreter stack.

I'm not an expert in these matters, so perhaps what I am going to say 
will make no sense, but here goes:

What if there were two copies of the evaluator function? One copy would 
be a slightly slower 'checked' function that would test every object for 
a 'check' bit. Any attempt to evaluate a reference to an object with its 
check bit set would throw an exception.

The other eval function would be the 'unchecked' version that would run 
at full speed, just like it does today.
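
As a toy model of the two evaluators (plain Python, not actual CPython 
code; the names Guarded, eval_checked, eval_unchecked and SecurityError 
are all made up for illustration), the difference might look roughly 
like this:

```python
class SecurityError(Exception):
    """Raised when checked code touches an object whose check bit is set."""


class Guarded:
    """Hypothetical stand-in for an object that carries a 'check' bit."""

    def __init__(self, value, check=False):
        self.value = value
        self.check = check


def eval_unchecked(obj):
    # Full-speed path: never looks at the check bit.
    return obj.value


def eval_checked(obj):
    # Slower path: test the bit before every use of the object.
    if obj.check:
        raise SecurityError("reference to a checked object")
    return obj.value
```

The only cost the checked path adds in this model is the single bit 
test per object reference.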

Transitioning from the checked to the unchecked state could only be done 
via C code. So the 'file' wrapper, for example, would switch over to the 
unchecked interpreter before calling the actual methods of 'file'. That 
C wrapper might also check the current permission state to see what 
operations were legal.

Note that whenever a C function sets the interpreter state to 
'unchecked', the previous state is saved on the stack, so that when the 
function returns, that state is restored. The function for doing this 
would be something like PyCall_Unchecked( ... ), which switches to the 
unchecked state for the duration of the call and then restores the 
interpreter state back to where it was.
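
The save-and-restore behaviour can be sketched in Python as follows. 
This is only a model of what something like PyCall_Unchecked might do; 
the Interp class and the call_unchecked name are assumptions made for 
the example, and a real version would live in C.

```python
class Interp:
    """Hypothetical interpreter state, reduced to a single flag."""

    def __init__(self, checked=True):
        self.checked = checked


def call_unchecked(interp, func, *args):
    # Remember the caller's state, run func unchecked, then put the
    # caller's state back even if func raises.
    saved = interp.checked
    interp.checked = False
    try:
        return func(*args)
    finally:
        interp.checked = saved
```

Because the saved value lives in the local frame of each call, nested 
calls unwind to the correct state in the right order.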

Transitioning from unchecked to checked is trickier. Specifically, you 
don't want to ever run sandboxed code in the unchecked state - this is a 
problem for generators, callbacks, and so on. I can think of two 
approaches to handling this:

The first approach is to mark all sandboxed code with a bit indicating the 
code is untrusted. Any attempt to call or otherwise invoke a function 
that has this bit set would throw the interpreter into the 'checked' 
state. (Note that transitioning the other way is *not* automatic - i.e. 
calling trusted code does not automatically transition to unchecked state.)

This approach is good because it means that if you have intermediary code 
between the wrapper and the sandboxed code, the interpreter still does 
the right thing - it puts the interpreter into the checked state.

One problem is how to restore the 'unchecked' state when a function call 
returns. Probably you would have to build this into the code that does 
the state transition.
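
A rough Python model of this first approach, with the restore step 
built into the transition itself, could look like the following. The 
'untrusted' attribute and the invoke function are hypothetical names 
used only for this sketch.

```python
class Interp:
    """Hypothetical interpreter state, reduced to a single flag."""

    def __init__(self, checked=False):
        self.checked = checked


def invoke(interp, func, *args):
    # Calling code marked untrusted forces the checked state.
    # Calling trusted code leaves the state alone: there is no
    # automatic transition back to unchecked.
    saved = interp.checked
    if getattr(func, "untrusted", False):
        interp.checked = True
    try:
        return func(*args)
    finally:
        # Restore whatever state the caller was running in.
        interp.checked = saved
```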

The second approach, if marking the sandboxed code isn't feasible, is to 
have the wrapper objects wrap all of the callbacks with code that goes to 
checked state before calling the callbacks. This means finding all the 
possible holes, although I suspect that there are far fewer such holes 
than there are 'file' methods to hide. One advantage of doing it this 
way is that restoring the 'unchecked' state after the call returns is 
fairly straightforward.
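
Wrapping a callback so that it always runs in the checked state might 
be modelled like this in Python. Again this is only a sketch: Interp and 
checked_callback are invented names, and the real wrappers would be 
written in C.

```python
import functools


class Interp:
    """Hypothetical interpreter state, reduced to a single flag."""

    def __init__(self, checked=False):
        self.checked = checked


def checked_callback(interp, callback):
    # Return a wrapper that enters the checked state before running
    # the callback and restores the previous state afterwards.
    @functools.wraps(callback)
    def wrapper(*args, **kwargs):
        saved = interp.checked
        interp.checked = True
        try:
            return callback(*args, **kwargs)
        finally:
            interp.checked = saved
    return wrapper
```

The wrapper owns both the transition and the restore, which is why the 
restore is simple here; the hard part is making sure every callback 
actually goes through such a wrapper.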

The advantage of this whole approach is that once you set the 
'check' bit on 'file', it doesn't matter whether 'file' gets loose or 
not - you can't do anything with it without throwing an exception. 
Moreover, it also doesn't matter what code path you go through to access 
it. Only C code that flips the interpreter into unchecked state can call 
methods on 'file' without blowing up.

So essentially, what I propose is to define a simple security primitive 
- which comes down to checking a single bit - and use that as a basis 
to create more complex and subtle security mechanisms.

-- Talin