[Python-ideas] CapPython's use of unbound methods

Guido van Rossum guido at python.org
Mon Mar 30 23:19:51 CEST 2009


On Sun, Mar 29, 2009 at 4:57 PM, Mark Seaborn <mrs at mythic-beasts.com> wrote:
> As a side note, it is interesting to compare CapPython to ECMAScript
> 3.1's strict mode, which, as I understand it, changes the semantics of
> ECMAScript's attribute access such that doing X.A when X does not have
> an attribute A raises an exception rather than returning undefined.
>
> Since existing Javascript implementations lack this feature, Cajita (a
> fail-stop subset of Javascript, part of the Caja project) has to go to
> some lengths to emulate it.  This seems to be the main reason that
> Cajita rewrites Javascript code, to add attribute existence checks.
>
> Fortunately CapPython does not have to make this kind of semantic
> change.

Well of course it makes a much more severe semantic change by
declaring illegal all use of attribute names starting with underscore.

> Interestingly, in Javascript it is easier to add this kind of change
> on a per-module basis than in Python, because dynamic attribute access
> in Javascript is done via a builtin syntax (x[a]) rather than via a
> function (getattr in Python).

I guess if you wanted to override getattr on a per-module basis you
could give each module a separate __builtins__.
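
A minimal sketch of that idea, assuming Python 3 (where the module is
called builtins) and using exec() to stand in for loading a module.
The names guarded_getattr, module_builtins and read are made up for
illustration. This only changes what the name getattr resolves to; it
does nothing about the x.y syntax, so it is not a security boundary by
itself.

```python
import builtins


def guarded_getattr(obj, name, *default):
    """Hypothetical wrapper: refuse names that start with an underscore."""
    if name.startswith("_"):
        raise AttributeError("access to %r is not allowed" % name)
    return getattr(obj, name, *default)


# Copy the default built-ins and swap in the wrapper.
module_builtins = dict(vars(builtins))
module_builtins["getattr"] = guarded_getattr

# Code executed with these globals resolves getattr to the wrapper.
module_globals = {"__builtins__": module_builtins}
exec(
    "def read(obj, name):\n"
    "    return getattr(obj, name)\n",
    module_globals,
)
read = module_globals["read"]


class Point(object):
    def __init__(self):
        self.x = 1
        self._secret = 2


p = Point()
print(read(p, "x"))
try:
    read(p, "_secret")
except AttributeError as exc:
    print("blocked:", exc)
```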

> However, CPython's restricted execution mode (which Tav is proposing
> to resurrect) does change the semantics of attribute access.

It does not change the general semantics of attribute access -- it
only takes away a small set of *specific* attributes (e.g. __code__
and func_code) from a small set of *specific* object types (e.g.
function objects). This is because every object has the ability to
override getting attributes (via __getattribute__ in Python, or
tp_getattro in C).

> It's not
> yet clear to me how this works, and how it applies to the getattr
> function.  I suspect it involves looking up the stack.

No, it does not look at the stack. It looks at the globals, which
contain a special magic entry __builtins__ (with an 's') which is the
dict where built-in functions are looked up. When this dict is the
same object as the *default* built-in dict (which is
__builtin__.__dict__ where __builtin__ -- without 's' -- is the module
defining the built-in functions), it gives you supervisor privileges;
if it is any other object, it disallows access to those specific
attributes I referred to above.

I really recommend that you study the CPython implementation. Without
understanding it you stand no chance of creating a secure subset.

The getattr() function and the x.y notation both invoke the same
implementation (PyObject_GetAttr()). This in turn defers to the
tp_getattro slot of the object x. And if the object is implemented in
Python, this in turn defers to the object's __getattribute__ method.
Then object.__getattribute__ defines the default lookup code, which
looks in the object's __dict__ if there is one, then in the class's
__dict__, walking the MRO, and finally (just before raising
AttributeError) calls the __getattr__ hook if it exists (don't confuse
the latter with __getattribute__).

> Guido van Rossum <guido at python.org> wrote:
>> More seriously, IIUC you are disallowing all use of attribute names
>> starting with underscores, which not only invalidates most Python
>> code in practical use (though you might not care about that) but
>> also disallows the use of many features that are considered part of
>> the language, such as access to __dict__ and many other
>> introspective attributes.
>
> This is true.  I'm not claiming that a lot of Python code will pass
> the verifier.  It might not accept all idiomatic code; I'm just
> claiming that code using encapsulated objects under CapPython can
> still be idiomatic.

For some definition of idiomatic. There are a lot of well-known Python
idioms involving attribute names starting with underscore.

(I hate to question your Python proficiency, but I do have to wonder
-- how much Python have you written in your life? Where did you learn
Python?)

> We could probably allow reading self.__dict__ safely in CapPython.

Though that's not enough -- peeking in other.__dict__ is also somewhat common.

> The term "introspection" covers a lot of language features.  Some are
> OK in an object-capability language and some are not.

Agreed. And many introspection features aren't that important or
commonly used. But some others are, and this includes using __dict__
and __class__.

> For example, some might consider dir() to be an introspective feature,

It is.

> and this function is fine if suitably wrapped.

You'd have to look at the C implementation to see what it might do though.

> x.__class__.__name__ is a common idiom.  Although we can't allow
> x.__class__ on its own, we could provide a get_class_name function and
> rewrite "x.__class__.__name__" to "get_class_name(x)".
>
> "type(x) is C" is another common idiom.

Though in most cases isinstance(x, C) is preferred.
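
The difference shows up as soon as a subclass is involved. A short
sketch (D is a hypothetical subclass added for illustration):

```python
class C(object):
    pass


class D(C):
    pass


d = D()
exact_match = type(d) is C          # exact type comparison ignores subclasses
instance_match = isinstance(d, C)   # isinstance accepts subclasses
print(exact_match, instance_match)
```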

> Again, CapPython doesn't
> provide type() but it can provide a type_is() function:
> def type_is(x, t):
>    return type(x) is t

And slowly we slide down the path of writing less and less idiomatic Python...

> The "locals" builtin is not something CapPython can allow in general.
> Any function that can look up the stack in this way is potentially
> dangerous.  But it might be OK to allow "locals()", i.e. the case
> where "locals" is called as a function and not used as a first class
> value.  I would prefer not to have to do that though.

Using locals() isn't that idiomatic anyway, so this is probably fine.
It's mostly used by beginners who are still exploring the extreme end
of the language's dynamism. :-)

>> > To some extent the verifier's check of only accessing private
>> > attributes through self is just checking a coding style that I already
>> > follow when writing Python code (except sometimes for writing test
>> > cases).
>>
>> You might wish this to be true, but for most Python programmers, it
>> isn't. Introspection is a commonly-used part of the language (probably
>> more so than in Java). So is the use of attribute names starting with
>> a single underscore outside the class tree, e.g. by "friend"
>> functions.
>
> The friend function pattern is an example of something that CapPython
> could support, with some extra notation in order to make it explicit.
> It is a case of what is known as rights amplification in capability
> systems.
>
> Here's an example of how I envisage it would work in CapPython:
>
> class C(object):
>    def _get_foo(self):
>        return self._foo
> _get_foo = C._get_foo
>
> Although C._get_foo would normally be rejected, the verifier would
> allow reading C._get_foo immediately after the class definition as a
> special case.  The resulting _get_foo function would only be able to
> operate on instances of C (assuming the presence of unbound methods in
> the language).

I'm not sure how useful this is -- friends aren't necessarily in the
same module as the class, otherwise they might as well be declared as
static methods.

>> > Of course some of the verifier's checks, such as only allowing
>> > attribute assignments through self, are a lot more draconian than
>> > coding style checks.
>>
>> That also sounds like a rather serious hindrance to writing Python as
>> most people think of it.
>
> Attribute assignment is something that we could handle by rewriting.
> For example,
>
>  x.y = z
>
> could be rewritten to
>
>  x.set_attribute("y", z)

Why not

x.set_y(z)

?

> x's class definition would have to declare that attribute y is
> assignable.  The problem with attribute assignment in Python as it
> stands is that it is opt-out.  Attributes can be made read-only (by
> using "property" or defining __setattr__), but this is not the
> default.

This will encourage people to write "Java in Python" which is an
unfortunately common anti-pattern.
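
For reference, the opt-out default described in the quoted text can be
seen in a few lines. This is only a sketch; the class Account and its
attribute names are made up for illustration.

```python
class Account(object):
    def __init__(self, balance):
        self._balance = balance

    @property
    def balance(self):
        # No setter is defined, so assigning to .balance is rejected.
        return self._balance


acct = Account(10)
acct.note = "anything"  # arbitrary attributes are assignable by default

try:
    acct.balance = 99
    assignment_blocked = False
except AttributeError:
    assignment_blocked = True

print(acct.balance, acct.note, assignment_blocked)
```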

>> > Whether these function definitions are accepted by the verifier
>> > depends on their context.
>>
>> But this isn't.
>>
>> Are you saying that the verifier accepts the use of self._foo in a
>> method?
>
> Yes.
>
>> That would make the scenario of potentially passing a class
>> defined by Alice into Bob's code much harder to verify -- now suddenly
>> Alice has to know about a lot of things before she can be sure that
>> she doesn't leave open a backdoor for Bob.
>
> In most cases Alice would not want Bob to extend classes that she has
> defined, so she would not give Bob access to the unwrapped class
> objects.  She would just give Bob the constructor.

Or perhaps, better, a factory function, right?

> If Alice wants to
> be sure that she does that, she can add a decorator to all her class
> definitions:
>
> def constructor_only(klass):
>    def wrapper(*args, **kwargs):
>        return klass(*args, **kwargs)
>    return wrapper
>
> @constructor_only
> class C(object):
>    ...

Clever. It does mean that even the class body of C cannot refer to
C-the-class, which prevents certain idioms (mostly involving updating
class variables -- perhaps not all that common).
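
A sketch of the kind of idiom that stops working, using the decorator
from the quoted text. The class Counter and its class variable created
are made up for illustration, and this uses the class decorator syntax
from 2.6 and later.

```python
def constructor_only(klass):
    def wrapper(*args, **kwargs):
        return klass(*args, **kwargs)
    return wrapper


@constructor_only
class Counter(object):
    created = 0

    def __init__(self):
        # The global name Counter is now bound to the wrapper function,
        # not the class, so this class-variable update fails when called.
        Counter.created += 1


try:
    Counter()
    failed = False
except AttributeError:
    failed = True

print(failed)
```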

> (However, this assumes that class decorators are available, and
> CapPython does not support Python 2.6 yet.)

Well you can always do this manually:

class C(object):
    ...
C = constructor_only(C)

>> > The default environment doesn't provide the real getattr() function.
>> > It provides a wrapped version that rejects private attribute names.
>>
>> Do you have a web page describing the precise list of limitations you
>> apply in your "subset" of Python?
>
> I started some wiki pages to explain the verifier rules and which
> builtins are allowed, blocked or wrapped:
> http://plash.beasts.org/wiki/CapPython/VerifierRules
> http://plash.beasts.org/wiki/CapPython/Builtins
> I hope that will make things clearer.

Ok, I'll try to remember to look there before responding next time.

>> Does it support import of some form?
>
> Yes, it supports import:
> http://lackingrhoticity.blogspot.com/2008/09/dealing-with-modules-and-builtins-in.html
>
> The safeeval module allows callers to provide their own __import__
> function when evalling code.

Ok. Have you done a security contest like Tav did yet? Implementing
import correctly *and* safely is fiendishly difficult.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


