
Guido van Rossum <guido@python.org> wrote:
On Thu, Mar 12, 2009 at 1:24 PM, Mark Seaborn <mrs@mythic-beasts.com> wrote:
Suppose we have an object x with a private attribute, "_field", defined by a class Foo:
    class Foo(object):
        def __init__(self): self._field = "secret"
    x = Foo()
Can you add some principals to this example? Who wrote the Foo class definition? Does CapPython have access to the source code for Foo? To the class object?
OK, suppose we have two principals, Alice and Bob. Alice receives a string from Bob. Alice instantiates the string using CapPython's safe_eval() function, getting back a module object that contains a function object. Alice passes the function an object x. Alice's intention is that the function should not be able to get hold of the contents of x._field, no matter what string Bob supplies.

To make this more concrete, this is what Alice executes, with source_from_bob defined in a string literal for the sake of example:

    source_from_bob = """
    class C:
        def f(self):
            return self._field
    def entry_point(x):
        C.f(x) # potentially gets the secret object in Python 3.0
    """

    import safeeval

    secret = object()

    class Foo(object):
        def __init__(self):
            self._field = secret

    x = Foo()
    module = safeeval.safe_eval(source_from_bob, safeeval.Environment())
    module.entry_point(x)

In this example, Bob's code is not given access to the class object Foo. Furthermore, Bob should not be able to get access to the class Foo from the instance x. The type() builtin is not considered to be safe in CapPython so it is not included in the default environment.

Bob's code is not given access to the source code for class Foo. But even if Bob is aware of Alice's source code, it should not affect whether Bob can get hold of the secret object.

By the way, you can try out the example by getting the code from the Bazaar repository:

    bzr branch http://bazaar.launchpad.net/%7Emrs/cappython/trunk cappython
However, in Python 3.0, the CapPython code can do this:
    class C(object):
        def f(self): return self._field
    C.f(x) # returns "secret"
Whereas in Python 2.x, C.f(x) would raise a TypeError, because C.f is not being called on an instance of C.
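The Python 3 half of this can be checked in a stock Python 3 interpreter, with no CapPython involved. A minimal sketch reusing the names from the example above:

```python
class Foo(object):
    def __init__(self):
        self._field = "secret"

class C(object):
    def f(self):
        return self._field

x = Foo()

# In Python 3, C.f is a plain function object, so the type of its
# first argument is not checked against C.
leaked = C.f(x)
print(leaked)  # prints: secret
```

Under Python 2.x the same call goes through an unbound method, which performs the instance check and raises the TypeError described above.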
In Python 2.x I could write
    class C(Foo):
        def f(self): return self._field
In the example above, Bob's code is not given access to Foo, so Bob cannot do this. But you are right, if Bob's code were passed Foo as well as x, Bob could do this.

Suppose Alice wanted to give Bob access to class Foo, perhaps so that Bob could create derived classes. It is still possible for Alice to do that safely, if Alice defines Foo differently. Alice can pass the secret object to Foo's constructor instead of having the class definition get its reference to the secret object from an enclosing scope:

    class Foo(object):
        def __init__(self, arg):
            self._field = arg

    secret = object()
    x = Foo(secret)
    module = safeeval.safe_eval(source_from_bob, safeeval.Environment())
    module.entry_point(x, Foo)

Bob can create his own objects derived from Foo, but cannot use his access to Foo to break encapsulation of instance x. Foo is now authorityless, in the sense that it does not capture "secret" from its enclosing environment, unlike the previous definition.
or alternatively
    class C(x.__class__):
        <same f as before>
The verifier would reject x.__class__, so this is not possible.
Guido said, "I don't understand where the function object f gets its magic powers".
The answer is that function definitions directly inside class statements are treated specially by the verifier.
Hm, this sounds like a major change in language semantics, and if I were Sun I'd sue you for using the name "Python" in your product. :-)
Damn, the makers of Typed Lambda Calculus had better watch out for legal action from the makers of Lambda Calculus(tm) too... :-) Is it really a major change in semantics if it's just a subset? ;-) To some extent the verifier's check of only accessing private attributes through self is just checking a coding style that I already follow when writing Python code (except sometimes for writing test cases). Of course some of the verifier's checks, such as only allowing attribute assignments through self, are a lot more draconian than coding style checks.
If you wrote the same function definition at the top level:
    def f(var): return var._field # rejected
the attribute access would be rejected by the verifier, because "var" is not a self variable, and private attributes may only be accessed through self variables.
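To illustrate the rule, here is a toy checker built on the standard ast module. It is only a sketch of the idea, not CapPython's actual verifier: the name find_violations is hypothetical, "private" is assumed to mean any attribute name starting with an underscore, and details such as rebinding of the self variable are ignored. It assumes Python 3.8 or later.

```python
import ast

def find_violations(source):
    """Return (lineno, attr) pairs for private attribute accesses
    that do not go through a self variable (toy rule, see above)."""
    violations = []

    def visit(node, self_name, in_class_body):
        if isinstance(node, ast.ClassDef):
            # Base classes and decorators are evaluated in the outer scope.
            for expr in node.bases + node.decorator_list:
                visit(expr, self_name, False)
            # Statements directly in the class body may define methods.
            for stmt in node.body:
                visit(stmt, None, True)
            return
        if isinstance(node, (ast.FunctionDef, ast.Lambda)):
            # Defaults and decorators belong to the enclosing scope.
            visit(node.args, self_name, False)
            for expr in getattr(node, "decorator_list", []):
                visit(expr, self_name, False)
            # Only a def directly inside a class gets a self variable:
            # its first positional parameter.
            params = node.args.posonlyargs + node.args.args
            is_method = isinstance(node, ast.FunctionDef) and in_class_body
            inner_self = params[0].arg if (is_method and params) else None
            body = node.body if isinstance(node.body, list) else [node.body]
            for child in body:
                visit(child, inner_self, False)
            return
        if isinstance(node, ast.Attribute) and node.attr.startswith("_"):
            through_self = (isinstance(node.value, ast.Name)
                            and node.value.id == self_name)
            if not through_self:
                violations.append((node.lineno, node.attr))
        for child in ast.iter_child_nodes(node):
            visit(child, self_name, False)

    visit(ast.parse(source), None, False)
    return violations

method_src = "class C:\n    def f(self):\n        return self._field\n"
toplevel_src = "def f(var):\n    return var._field\n"
base_src = "class C(x.__class__):\n    pass\n"

print(find_violations(method_src))    # []
print(find_violations(toplevel_src))  # [(2, '_field')]
print(find_violations(base_src))      # [(1, '__class__')]
```

A purely static check like this accepts the class C with method f from the example, which is why the run-time check on unbound methods in Python 2.x matters for the scheme.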
I renamed the variable in the example,
What do you mean by this?
I just mean that I applied alpha conversion.

    def f(self): return self._field

is equivalent to

    def f(var): return var._field

Whether these function definitions are accepted by the verifier depends on their context.
Do you also catch things like
    g = getattr
    s = 'field'.replace('f', '_f')
    print g(x, s)
?
The default environment doesn't provide the real getattr() function. It provides a wrapped version that rejects private attribute names.

Mark
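A wrapper of that kind could look roughly like the following. This is a sketch only: safe_getattr is a hypothetical name, not necessarily what CapPython's environment calls its wrapper, and "private" is assumed to mean any name starting with an underscore.

```python
def safe_getattr(obj, name, *default):
    # Hypothetical wrapper around the builtin getattr().
    # Require an exact str so that a str subclass cannot override
    # startswith() to slip past the check below.
    if type(name) is not str:
        raise TypeError("attribute name must be a str")
    # Assumption: any name beginning with "_" counts as private.
    if name.startswith("_"):
        raise AttributeError("private attribute %r is not accessible" % name)
    return getattr(obj, name, *default)

class Foo(object):
    def __init__(self):
        self._field = "secret"
        self.label = "public"

x = Foo()
s = 'field'.replace('f', '_f')  # builds "_field" at run time

print(safe_getattr(x, "label"))  # prints: public
try:
    safe_getattr(x, s)
except AttributeError as exc:
    print("rejected:", exc)
```

Because the check runs on the value of the name at call time, building the string dynamically as in the example above makes no difference.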