[pypy-dev] Re: Mixed modules for both PyPy and CPython

holger krekel hpk at trillke.net
Sun Apr 16 22:44:43 CEST 2006


Hi VanL! 

On Sat, Apr 15, 2006 at 17:38 -0600, VanL wrote:
> holger krekel wrote:
> 
> >>Second, comments on py3k list indicated that secure python is difficult 
> >>because of a) introspection, b) type inference, and c) GIL acquisition. 
> >
> >Hum, this list looks a bit weird to me.  Could you state what
> >the actual attacks are for which security measures are discussed? 
> >Or which use cases are people on py3k having in mind? 
> 
> This is an amalgam of several different posts (and maybe different 
> threads) but here goes:

hey, thanks for the effort!

> In the thread "Will we have a true restricted exec environment for 
> python 3000," Vineet Jain asked for a restricted mode which would
> 
> "1. Limit the memory consumed by the script
> 2. Limit access to file system and other system resources
> 3. Limit cpu time that the script will take
> 4. Be able to specify which modules are available for import."

All of these more or less relate to the "sandbox" idea. 

> In responses to that request, various people commented on the 
> difficulties of implementing such a restricted mode.  On that thread, 
> several people had the same idea I had, to try to use PyPy for this 
> purpose - however, it didn't look like many people were up-to-date 
> reading both lists (and thus familiar-ish with PyPy's execution model).

Yes, using PyPy's metaprogramming facilities to implement sandbox 
models should make these tasks relatively easy.  "Metaprogramming" because 
PyPy is about writing a program that generates other programs, which in 
our case happen to be full Python interpreters.  But it's also true that 
we haven't had concise discussions (or rather never documented the 
results :) about how to implement the above.  Still, the fact that we can 
instrument our GC, our resource handling code, or the main interpreter 
loop, and then generate a full Python interpreter from the instrumented 
source, gives us various ways to go about the sandbox problem.  Funny 
little thesis topic, i'd say :) 
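
To make "instrumenting the main interpreter loop" a bit more concrete, 
here is a toy sketch of a CPU limit.  It is not PyPy code: the 
instruction set, `run` and `BudgetExceeded` are all made up for 
illustration.  The only point is that the step check lives inside the 
dispatch loop, below anything the interpreted program can reach.

```python
class BudgetExceeded(Exception):
    """Raised when a program uses more steps than it was granted."""


def run(program, max_steps):
    """Run a tiny stack program, counting every dispatched instruction.

    `program` is a list of (opcode, argument) pairs; this instruction
    set is hypothetical and only exists for this sketch.
    """
    stack = []
    pc = 0
    steps = 0
    while pc < len(program):
        steps += 1  # the instrumentation hook in the main loop
        if steps > max_steps:
            raise BudgetExceeded("step budget of %d exhausted" % max_steps)
        op, arg = program[pc]
        if op == "push":
            stack.append(arg)
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "jump":
            pc = arg  # unconditional jump to instruction index `arg`
            continue
        else:
            raise ValueError("unknown op: %r" % (op,))
        pc += 1
    return stack
```

A program that loops forever, such as `[("jump", 0)]`, gets stopped 
with `BudgetExceeded` once its budget runs out, while a well-behaved 
program just returns its stack.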

> A) Introspection
> ... introspection funnyness ... 
> Python's powerful introspection is a severe drawback from a security POV 
> - it is *really* hard to make a user stay in a box you put them in 
> without crippling some part of the language as a side effect."

All the possible ways to introspect your way around the Python
object model make one wonder whether protecting the resources
themselves isn't a more viable approach than protecting navigation.  
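
A rough sketch of what "protecting the resource" could mean, as 
opposed to hiding objects from the user.  The names 
`make_guarded_open` and `ResourceDenied` are invented here, and the 
policy (read-only, one directory) is just an assumption for the 
example.  Note that on CPython this is NOT a sandbox, since 
introspection can still reach the real `open`; the idea would only 
hold if such a check sat at the interpreter level, underneath the 
object model.

```python
import os


class ResourceDenied(Exception):
    """Raised when a file access falls outside the allowed policy."""


def make_guarded_open(allowed_root):
    """Return an open()-like function restricted to reads below allowed_root."""
    root = os.path.realpath(allowed_root)

    def guarded_open(path, mode="r"):
        # Refuse every mode that could create or modify a file.
        if any(flag in mode for flag in "wax+"):
            raise ResourceDenied("write access denied: %r" % (path,))
        # Resolve symlinks and ".." before comparing against the root.
        real = os.path.realpath(path)
        if os.path.commonpath([root, real]) != root:
            raise ResourceDenied("outside allowed root: %r" % (path,))
        return open(real, mode)

    return guarded_open
```

The check is made on the resolved path at the moment of access, so it 
does not matter by which route the caller got hold of the path.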

> Thus, in CPy, allowing someone to access a C type effectively opens up 
> all the C types.  In PyPy, however, each type is effectively in its own 
> box.  Further, PyPy already has a structure that can deal with these 
> sorts of accesses: the flowgraph.  Operations in PyPy come about because 
> of traversals of the graph - certain branches of the graph could be 
> restricted or proxied out to a trusted interpreter.

Applying systematic transformations to families of graphs (relating
to IO resources, say) is one possibility.  Lately we seem to want
to express almost everything as graph transformations, btw. 
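
As a very simplified illustration of such a transformation (this is 
not PyPy's actual flow graph model; `Op`, `IO_OPS` and `guard_io` are 
hypothetical names), one could rewrite every operation belonging to an 
"IO family" into a guarded call and leave all other operations alone:

```python
from collections import namedtuple

# A toy operation: a name plus a tuple of arguments.
Op = namedtuple("Op", "name args")

# The family of operations we pretend touches IO resources.
IO_OPS = frozenset(["open", "read", "write"])


def guard_io(graph):
    """Return a new graph in which every IO op becomes a guarded call.

    `graph` maps block names to lists of Op; the input is not modified.
    """
    new_graph = {}
    for block, ops in graph.items():
        new_graph[block] = [
            Op("guarded_call", (op.name,) + tuple(op.args))
            if op.name in IO_OPS
            else op
            for op in ops
        ]
    return new_graph
```

For example, a block containing `Op("open", ("/etc/passwd",))` comes 
out as `Op("guarded_call", ("open", "/etc/passwd"))`, which a trusted 
part of the system could then check or proxy.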
 
> B) GIL Acquisition
> 
> Another person suggested leveraging the multiple subinterpreter code 
> which already exists in CPython to create a restricted-exec interpreter. 
>  MvL noted that GIL acquisition made that difficult:
> 
> "Part of the problem is that it doesn't really work. Some objects *are* 
> shared across interpreters, such as global objects in extension modules 
> (extension modules are initialized only once). I believe that the GIL 
> management code (for acquiring the GIL out of nowhere) breaks if there 
> are multiple interpreters."

For PyPy, it need not be true that the GIL makes protecting resources
harder.  Then again, it doesn't matter so much because i don't think that
we'll require a sub-interpreter for implementing resource protection 
(but who knows :)

I guess we should discuss sometime which path to follow
(also see my other mail).  Any opinions on that, btw? 

best and thanks!

    holger


