ANNOUNCE: CapPython, an object-capability subset of Python
During the past couple of months I have been working on an object-capability subset of Python - in other words, a restricted execution scheme for sandboxing Python code. It has been influenced by other object-capability subset languages, such as Joe-E (a subset of Java [1]), Caja/Cajita (subsets of Javascript [2]) and Caperl (based on Perl [3]).

I'm calling it CapPython because the name doesn't seem to have been taken yet. :-)

I believe it is now secure, so it seems like a good time to announce it here!

The basic idea behind CapPython is to enforce encapsulation by restricting access to private attributes of objects. This is achieved through a combination of static checking and limiting access to unsafe builtins and modules.

Private attributes may only be accessed through "self" variables. "Self" variables are defined as being the first arguments of functions defined inside class definitions, with a few restrictions intended to prevent these functions from escaping without being safely wrapped.

Private attribute names are those starting with "_". Additionally, "im_self", "im_func" and some other special cases are treated as private attributes.

Assignments to attributes are only allowed via "self" variables.

For example, the following code is accepted by the static verifier:

    class Counter(object):

        def __init__(self):
            self._count = 0

        def get_next(self):
            self._count += 1
            return self._count

But the following code reads a private attribute and so it is rejected as violating encapsulation:

    counter._count -= 1

CapPython consists of three parts:

 - a static verifier;
 - a "safe exec" function, which will check code before executing it and can run code in a safe scope;
 - a module loader which implements a safe __import__ function. Eventually this will be runnable as untrusted CapPython code.
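The static rule described above can be approximated with Python 3's standard `ast` module. This is only a rough sketch and not CapPython's actual verifier: the names `EncapsulationChecker`, `is_private` and `verify` are invented for illustration, and the sketch ignores the extra restrictions on methods escaping, as well as decorators, default arguments and class bases.

```python
import ast

# Names treated as private: anything starting with "_", plus the
# special cases named in the announcement.
SPECIAL_PRIVATE = {"im_self", "im_func"}


def is_private(name):
    return name.startswith("_") or name in SPECIAL_PRIVATE


class EncapsulationChecker(ast.NodeVisitor):
    """Toy checker (hypothetical, not CapPython's real implementation)."""

    def __init__(self):
        self.errors = []       # (line number, attribute name) pairs
        self.self_name = None  # name of the current "self" variable, if any

    def visit_ClassDef(self, node):
        for stmt in node.body:
            saved = self.self_name
            if isinstance(stmt, ast.FunctionDef) and stmt.args.args:
                # First argument of a function defined directly in a class body.
                self.self_name = stmt.args.args[0].arg
                for child in stmt.body:
                    self.visit(child)
            else:
                self.self_name = None
                self.visit(stmt)
            self.self_name = saved

    def visit_Attribute(self, node):
        via_self = (isinstance(node.value, ast.Name)
                    and node.value.id == self.self_name)
        is_write = isinstance(node.ctx, (ast.Store, ast.Del))
        # Reject private access and any attribute assignment not made via "self".
        if not via_self and (is_private(node.attr) or is_write):
            self.errors.append((node.lineno, node.attr))
        self.generic_visit(node)


def verify(source):
    """Return a list of violations; an empty list means the code is accepted."""
    checker = EncapsulationChecker()
    checker.visit(ast.parse(source))
    return checker.errors


ACCEPTED = """
class Counter(object):
    def __init__(self):
        self._count = 0
    def get_next(self):
        self._count += 1
        return self._count
"""
REJECTED = "counter._count -= 1\n"
```

Calling `verify(ACCEPTED)` yields no violations, while `verify(REJECTED)` reports the `_count` access on line 1.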
I am documenting CapPython via my blog at the moment, with the following posts so far:

http://lackingrhoticity.blogspot.com/2008/08/introducing-cappython.html
http://lackingrhoticity.blogspot.com/2008/09/dealing-with-modules-and-builti...
http://lackingrhoticity.blogspot.com/2008/09/cappython-unbound-methods-and-p...

The code is available from a Bazaar repository on Launchpad:

https://code.launchpad.net/cappython

I am currently working on creating a simple example program, which will be a wsgiref-based web server with a form for executing CapPython code. This involves taming some of the standard libraries to pass the verifier.

There are some design notes here - http://plash.beasts.org/wiki/CapPython - although these notes are more a list of references and problems CapPython needs to address than an explanation of the current design.

There was also a thread about CapPython on the e-lang mailing list:

http://www.eros-os.org/pipermail/e-lang/2008-August/012828.html

Mark

[1] http://code.google.com/p/joe-e/
[2] http://code.google.com/p/google-caja/
[3] http://caperl.links.org/
Mark Seaborn wrote:
During the past couple of months I have been working on an object-capability subset of Python - in other words, a restricted execution scheme for sandboxing Python code. It has been influenced by other object-capability subset languages, such as Joe-E (a subset of Java [1]), Caja/Cajita (subsets of Javascript [2]) and Caperl (based on Perl [3]). I'm calling it CapPython because the name doesn't seem to have been taken yet. :-)
No wonder ;-). I like CapPy better, though there is a shareware screen capture program by that name. PyCap is taken. CapThon is not.
I believe it is now secure, so it seems like a good time to announce it here!
The basic idea behind CapPython is to enforce encapsulation by restricting access to private attributes of objects. This is achieved through a combination of static checking and limiting access to unsafe builtins and modules.
Private attributes may only be accessed through "self" variables. "Self" variables are defined as being the first arguments of functions defined inside class definitions, with a few restrictions intended to prevent these functions from escaping without being safely wrapped.
What about functions defined outside class definitions and then attached as an attribute? Prevented?
Private attribute names are those starting with "_". Additionally, "im_self", "im_func" and some other special cases are treated as private attributes.
In 3.0, unbound methods are gone and im_self and im_func are __self__ and __func__ attributes of method objects.
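The renaming mentioned here can be checked directly in Python 3. The snippet below is a small illustration only; the `Counter` class is a stand-in chosen for this example.

```python
class Counter(object):
    def get_next(self):
        return 1


counter = Counter()
bound = counter.get_next

# Python 3 names for what Python 2 called im_self and im_func.
assert bound.__self__ is counter
assert bound.__func__ is Counter.__dict__["get_next"]

# With unbound methods gone, looking the method up on the class
# returns the plain function object itself.
assert Counter.get_next is Counter.__dict__["get_next"]
```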
How about Capt'n Python? :-)
Anyway, this is way cool. Looking forward to kicking the tires!
-- --Guido van Rossum (home page: http://www.python.org/~guido/)
On Thu, Sep 18, 2008 at 04:33:23PM -0400, Terry Reedy wrote:
Mark Seaborn wrote:
I'm calling it CapPython
No wonder ;-). I like CapPy better, though there is a shareware screen capture program by that name. PyCap is taken. CapThon is not.
CaPy, and make a capybara its mascot. ;) Or maybe "captyve", because the goal of the project is to make some code captive. :)

Oleg.
--
Oleg Broytmann http://phd.pp.ru/ phd@phd.pp.ru
Programmers don't die, they just GOSUB without RETURN.
Terry Reedy wrote:
Mark Seaborn wrote:
Private attributes may only be accessed through "self" variables. "Self" variables are defined as being the first arguments of functions defined inside class definitions, with a few restrictions intended to prevent these functions from escaping without being safely wrapped.
What about functions defined outside class definitions and then attached as an attribute? Prevented?
Yes, that is prevented: attribute assignment is only allowed on "self" variables, so you can't assign to class attributes. Classes can't be extended that way. That should not be a big problem for expressiveness; defining __getattr__ will still be possible.

CapPython has to prevent attribute assignment by default because Python allows it on objects by default.

It would be possible to allow attribute assignment by having CapPython rewrite it to a normal method call whose behaviour classes have to opt into, rather than opt out of. Currently CapPython does not do any rewriting.
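As a rough illustration of the opt-in idea, a rewritten `obj.name = value` might become a call to a helper like the one below. Everything here is an assumption made for the sketch: the helper `safe_setattr`, the hook name `cap_setattr` and the two example classes do not come from CapPython, which (as stated above) currently does no rewriting.

```python
def safe_setattr(obj, name, value):
    """Hypothetical target of a rewritten `obj.name = value` statement."""
    if name.startswith("_"):
        raise AttributeError("cannot assign private attribute %r" % name)
    # Hypothetical opt-in hook: only classes defining it accept assignment.
    handler = getattr(type(obj), "cap_setattr", None)
    if handler is None:
        raise AttributeError(
            "%s does not allow attribute assignment" % type(obj).__name__)
    handler(obj, name, value)


class Closed(object):
    """Does not opt in, so outside code cannot assign attributes."""


class Open(object):
    def cap_setattr(self, name, value):
        # This class opts in to public attribute assignment.
        object.__setattr__(self, name, value)
```

With this shape, assignment is denied unless a class explicitly allows it, which is the reverse of Python's default.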
Private attribute names are those starting with "_". Additionally, "im_self", "im_func" and some other special cases are treated as private attributes.
In 3.0, unbound methods are gone and im_self and im_func are __self__ and __func__ attributes of method objects.
Yes. The renaming of "im_self" and "im_func" is good. The removal of unbound methods is a *big* problem [1].

Regards,
Mark

[1] http://lackingrhoticity.blogspot.com/2008/09/cappython-unbound-methods-and-p...
On Thu, Sep 18, 2008 at 2:15 PM, Mark Seaborn wrote:
Terry Reedy wrote:
Mark Seaborn wrote:
Private attributes may only be accessed through "self" variables. "Self" variables are defined as being the first arguments of functions defined inside class definitions, with a few restrictions intended to prevent these functions from escaping without being safely wrapped.
What about functions defined outside class definitions and then attached as an attribute? Prevented?
Yes, that is prevented: attribute assignment is only allowed on "self" variables, so you can't assign to class attributes. Classes can't be extended that way. That should not be a big problem for expressiveness; defining __getattr__ will still be possible.
CapPython has to prevent attribute assignment by default because Python allows it on objects by default.
It would be possible to allow attribute assignment by having CapPython rewrite it to a normal method call whose behaviour classes have to opt into, rather than opt out of. Currently CapPython does not do any rewriting.
Private attribute names are those starting with "_". Additionally, "im_self", "im_func" and some other special cases are treated as private attributes.
In 3.0, unbound methods are gone and im_self and im_func are __self__ and __func__ attributes of method objects.
Yes. The renaming of "im_self" and "im_func" is good. The removal of unbound methods is a *big* problem [1].
Regards, Mark
[1] http://lackingrhoticity.blogspot.com/2008/09/cappython-unbound-methods-and-p...
I don't know to what extent you want to modify Python fundamentals, but I think this could be solved simply by adding a metaclass that returns an unbound method object for C.f, couldn't it?

-- --Guido van Rossum (home page: http://www.python.org/~guido/)
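One possible reading of the metaclass suggestion, sketched for Python 3. The names `UnboundMethodMeta` and `_make_unbound` are invented here, and the wrapper is a plain function that type-checks its first argument rather than a real unbound method object. As a known limitation of this sketch, static methods are wrapped too, because they also come back from the class as plain functions.

```python
import types


def _make_unbound(cls, func):
    """Wrap func so it only accepts instances of cls as first argument."""
    def unbound(first, *args, **kwargs):
        if not isinstance(first, cls):
            raise TypeError(
                "%s() must be called with a %s instance as first argument"
                % (func.__name__, cls.__name__))
        return func(first, *args, **kwargs)
    return unbound


class UnboundMethodMeta(type):
    """Hypothetical metaclass: C.f returns a type-checking wrapper."""

    def __getattribute__(cls, name):
        value = super().__getattribute__(name)
        if isinstance(value, types.FunctionType):
            return _make_unbound(cls, value)
        return value


class C(metaclass=UnboundMethodMeta):
    def __init__(self):
        self._foo = "secret"

    def f(self):
        return self._foo


class Other(object):
    def __init__(self):
        self._foo = "other secret"
```

Here `C.f(C())` and `C().f()` both work, while `C.f(Other())` raises TypeError. Lookups on instances do not go through the metaclass, so ordinary bound-method calls are unaffected. The raw function is still reachable through `C.__dict__` or the wrapper's `__closure__`, so this only helps in combination with a verifier that treats names starting with "_" as private.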
PyPy offers a sandboxed interpreter without compromising the language's features. Here are the docs:

http://codespeak.net/pypy/dist/pypy/doc/sandbox.html

Also, are you aware of the directory Lib/test/crashers (in Python's svn), which contains some possible ways to segfault CPython? (These could lead to a compromise later.)

Cheers,
fijal
"Guido van Rossum"
On Thu, Sep 18, 2008 at 2:15 PM, Mark Seaborn wrote:
Yes. The renaming of "im_self" and "im_func" is good. The removal of unbound methods is a *big* problem [1].
Regards, Mark
[1] http://lackingrhoticity.blogspot.com/2008/09/cappython-unbound-methods-and-p...
I don't know to what extent you want to modify Python fundamentals, but I think this could be solved simply by adding a metaclass that returns an unbound method object for C.f, couldn't it?
I have considered that, and it does appear to be possible to use metaclasses for that. It looks like CapPython could set the new __build_class__ builtin (on a per-module basis), which means that the verifier would not need to require that every class has a "metaclass=safemetaclass" declaration.

However, there is a problem which occurs when CapPython code interacts with normal Python code.

In Python 2.x, CapPython has the very nice property that it is usually safe to pass normal objects and classes into CapPython code without allowing the CapPython code to break encapsulation:

 * CapPython code can only use instance objects via their public interfaces.
 * If CapPython code receives a class object C, it can create a derived class D, but it cannot access private attributes of instances of C unless they are also instances of D. Holding C gives you only limited authority: you can only look inside objects whose classes you have defined.

There are some builtin objects that are unsafe - e.g. open, getattr, type - but it is rare for these to be passed around as first class values. In contrast, class objects are often passed around to be used as constructors.

Without unbound methods, normal Python class objects become dangerous objects. It becomes much more likely that normal Python code could accidentally pass a class object in to CapPython code.

So if Python code defines

    class C(object):
        def f(self):
            return self._foo

- then if CapPython code gets hold of C, it can apply C.f(x) to get x._foo of any object x.

I don't really understand the rationale for removing unbound methods. OK, it simplifies the language slightly. Sometimes that is good, sometimes that is bad. OK, there might occasionally be use cases where you want to define a function in a class scope and get back the unwrapped function. But you can easily get it via the im_func attribute (now __func__).
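The problem described above can be reproduced in Python 3 with a few lines. The class `Unrelated` is a stand-in invented for this illustration.

```python
class C(object):
    def f(self):
        return self._foo


class Unrelated(object):
    """Not a subclass of C; its _foo is meant to be private."""

    def __init__(self):
        self._foo = "private data"


x = Unrelated()

# In Python 3, C.f is a plain function, so it accepts any object.
# In Python 2, the same call raises TypeError because C.f is an
# unbound method that checks the type of its first argument.
leaked = C.f(x)
```

Under Python 3, `leaked` ends up holding the value of `x._foo`, even though the calling code never names a private attribute.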
One of the posts in the original discussion [1] said that removing unbound methods brings class attributes into line with builtin methods such as list.append, on the grounds that 'list.append is list.__dict__["append"]' is True. I don't agree: list.append already applies a type check:

    >>> class C(object): pass
    >>> list.append(C(), 1)
    TypeError: descriptor 'append' requires a 'list' object but received a 'C'

It has to do so, otherwise the interpreter could crash. The check added by unbound methods makes class attributes consistent with these builtins. Removing unbound methods introduces an inconsistency.

Also, what about the "only one way to do it" principle? If you want to define a function that can be applied to any type, there's already a way to do that: define it outside of a class.

Regards,
Mark

[1] http://mail.python.org/pipermail/python-dev/2005-January/050685.html
participants (6)
- Christian Heimes
- Guido van Rossum
- Maciej Fijalkowski
- Mark Seaborn
- Oleg Broytmann
- Terry Reedy