[Python-ideas] CapPython's use of unbound methods

Fri Mar 20 00:12:49 CET 2009

Guido van Rossum <guido at python.org> wrote:

> On Thu, Mar 12, 2009 at 1:24 PM, Mark Seaborn <mrs at mythic-beasts.com> wrote:
> > Suppose we have an object x with a private attribute, "_field",
> > defined by a class Foo:
> >
> > class Foo(object):
> >
> >    def __init__(self):
> >        self._field = "secret"
> >
> > x = Foo()
> 
> Can you add some principals to this example? Who wrote the Foo class
> definition? Does CapPython have access to the source code for Foo? To
> the class object?

OK, suppose we have two principals, Alice and Bob.  Alice receives a
string of source code from Bob.  Alice evaluates the string using
CapPython's safe_eval() function, getting back a module object that
contains a function object.  Alice passes the function an object x.
Alice's intention is that the function should not be able to get hold
of the contents of x._field, no matter what string Bob supplies.

To make this more concrete, this is what Alice executes, with
source_from_bob defined in a string literal for the sake of example:

source_from_bob = """
class C:
    def f(self):
        return self._field
def entry_point(x):
    C.f(x) # potentially gets the secret object in Python 3.0
"""

import safeeval

secret = object()

class Foo(object):
    def __init__(self):
        self._field = secret

x = Foo()
module = safeeval.safe_eval(source_from_bob, safeeval.Environment())
module.entry_point(x)

In this example, Bob's code is not given access to the class object
Foo.  Furthermore, Bob should not be able to get access to the class
Foo from the instance x.  The type() builtin is not considered to be
safe in CapPython so it is not included in the default environment.

Bob's code is not given access to the source code for class Foo.  But
even if Bob knows Alice's source code, that should not affect whether
Bob can get hold of the secret object.
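
The Python 3 behaviour that the comment in Bob's source refers to can
be reproduced without CapPython.  The following is a minimal sketch in
plain Python 3, with no verifier and no safe_eval() involved; it only
shows that C.f is an ordinary function there, so calling it on an
object that is not an instance of C is not rejected:

```python
# Plain Python 3; nothing here goes through CapPython's verifier.
secret = object()

class Foo(object):
    def __init__(self):
        self._field = secret

class C(object):
    def f(self):
        return self._field

x = Foo()

# In Python 3, C.f is a plain function, so there is no check that the
# first argument is an instance of C.  In Python 2.x the same call
# raises TypeError because C.f is an unbound method.
leaked = C.f(x)
print(leaked is secret)
```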

By the way, you can try out the example by getting the code from the
Bazaar repository:
bzr branch http://bazaar.launchpad.net/%7Emrs/cappython/trunk cappython

> > However, in Python 3.0, the CapPython code can do this:
> >
> > class C(object):
> >
> >    def f(self):
> >        return self._field
> >
> > C.f(x) # returns "secret"
> >
> > Whereas in Python 2.x, C.f(x) would raise a TypeError, because C.f is
> > not being called on an instance of C.
> 
> In Python 2.x I could write
> 
> class C(Foo):
>   def f(self):
>     return self._field

In the example above, Bob's code is not given access to Foo, so Bob
cannot do this.  But you are right, if Bob's code were passed Foo as
well as x, Bob could do this.

Suppose Alice wanted to give Bob access to class Foo, perhaps so that
Bob could create derived classes.  It is still possible for Alice to
do that safely, if Alice defines Foo differently.  Alice can pass the
secret object to Foo's constructor instead of having the class
definition get its reference to the secret object from an enclosing
scope:

class Foo(object):

    def __init__(self, arg):
        self._field = arg

secret = object()
x = Foo(secret)
module = safeeval.safe_eval(source_from_bob, safeeval.Environment())
# Bob's entry_point must now be defined to take two arguments.
module.entry_point(x, Foo)

Bob can create his own classes derived from Foo, and instances of
them, but cannot use his access to Foo to break encapsulation of
instance x.  Foo is now authorityless, in the sense that it does not
capture "secret" from its enclosing environment, unlike the previous
definition.
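
The difference between the two definitions can be sketched in plain
Python.  The names ClosingFoo, PlainFoo, Leak and NoLeak are made up
for this illustration, and the code is not run through the verifier;
it only shows what a subclass written by Bob could reach in each case:

```python
secret = object()

class ClosingFoo(object):
    # First definition: the class picks up "secret" from the
    # enclosing scope, so the class itself carries authority.
    def __init__(self):
        self._field = secret

class PlainFoo(object):
    # Second definition: the class only stores what it is given.
    def __init__(self, arg):
        self._field = arg

# What Bob could write if he were handed each class.
class Leak(ClosingFoo):
    def read(self):
        return self._field

class NoLeak(PlainFoo):
    def read(self):
        return self._field

# Instantiating a subclass of ClosingFoo hands Bob the secret.
leaked = Leak().read()

# With PlainFoo, Bob's instances hold only what Bob passed in.
bobs_value = object()
harmless = NoLeak(bobs_value).read()
```

In Python 2.x, NoLeak.read(x) on Alice's instance would still raise
TypeError, since x is not an instance of NoLeak.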

> or alternatively
> 
> class C(x.__class__):
>   <same f as before>

The verifier would reject x.__class__, so this is not possible.

> > Guido said, "I don't understand where the function object f gets its
> > magic powers".
> >
> > The answer is that function definitions directly inside class
> > statements are treated specially by the verifier.
> 
> Hm, this sounds like a major change in language semantics, and if I
> were Sun I'd sue you for using the name "Python" in your product. :-)

Damn, the makers of Typed Lambda Calculus had better watch out for
legal action from the makers of Lambda Calculus(tm) too... :-)  Is it
really a major change in semantics if it's just a subset? ;-)

To some extent, the verifier's rule that private attributes may only
be accessed through self just checks a coding style that I already
follow when writing Python code (except sometimes when writing test
cases).

Of course some of the verifier's checks, such as only allowing
attribute assignments through self, are a lot more draconian than
coding style checks.

> > If you wrote the same function definition at the top level:
> >
> > def f(var):
> >    return var._field # rejected
> >
> > the attribute access would be rejected by the verifier, because "var"
> > is not a self variable, and private attributes may only be accessed
> > through self variables.
> >
> > I renamed the variable in the example,
> 
> What do you mean by this?

I just mean that I applied alpha conversion, i.e. I consistently
renamed the bound variable.

def f(self):
    return self._field

is equivalent to

def f(var):
    return var._field

Whether these function definitions are accepted by the verifier
depends on their context.
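
As a rough sketch of how such a context-dependent check might look,
here is a small checker built on the standard ast module.  It is not
CapPython's verifier: the function name private_access_violations is
hypothetical, and it assumes that "private" means any attribute name
starting with an underscore and that a self variable is the first
parameter of a function written directly inside a class statement.
It is also deliberately conservative about nested functions, lambdas
and nested classes, which get no self variable at all:

```python
import ast


def private_access_violations(source):
    """Return (line, attribute) pairs for private attribute accesses
    that do not go through a self variable."""
    violations = []
    scopes = (ast.FunctionDef, ast.AsyncFunctionDef, ast.Lambda,
              ast.ClassDef)

    def visit(node, self_name):
        if isinstance(node, ast.Attribute) and node.attr.startswith("_"):
            through_self = (isinstance(node.value, ast.Name)
                            and node.value.id == self_name)
            if not through_self:
                violations.append((node.lineno, node.attr))
        for child in ast.iter_child_nodes(node):
            if (isinstance(child, ast.FunctionDef)
                    and isinstance(node, ast.ClassDef)
                    and child.args.args):
                # Function directly inside a class statement: its
                # first parameter is the self variable, whatever it
                # happens to be called.
                visit(child, child.args.args[0].arg)
            elif isinstance(child, scopes):
                # Any other new scope starts without a self variable.
                visit(child, None)
            else:
                visit(child, self_name)

    visit(ast.parse(source), None)
    return violations


method_src = "class C:\n    def f(var):\n        return var._field\n"
toplevel_src = "def f(var):\n    return var._field\n"

print(private_access_violations(method_src))
print(private_access_violations(toplevel_src))
```

The two sources contain the same function text; only the enclosing
class statement makes the first one acceptable to this checker.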

> Do you also catch things like
> 
> g = getattr
> s = 'field'.replace('f', '_f')
> 
> print g(x, s)
> 
> ?

The default environment doesn't provide the real getattr() function.
It provides a wrapped version that rejects private attribute names.
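
A wrapper of that kind could look roughly like the following.  This
is a sketch only, not the code CapPython ships: the name safe_getattr
is hypothetical, and it assumes that a private name is any name
starting with an underscore.

```python
_real_getattr = getattr
_MISSING = object()


def safe_getattr(obj, name, default=_MISSING):
    # Insist on an exact str so that a str subclass cannot override
    # startswith() to lie about the name.
    if type(name) is not str or name.startswith("_"):
        raise AttributeError(
            "private attribute access refused: %r" % (name,))
    if default is _MISSING:
        return _real_getattr(obj, name)
    return _real_getattr(obj, name, default)


class Foo(object):
    def __init__(self):
        self._field = "secret"
        self.label = "public"


x = Foo()
g = safe_getattr
s = 'field'.replace('f', '_f')   # builds the string "_field"

try:
    g(x, s)
except AttributeError as exc:
    print("rejected:", exc)
```

Because the check is made on the string at call time, building the
name dynamically does not get around it.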

Mark