Jeremy Hylton wrote:
"BL" == Ben Laurie email@example.com writes:
BL> I'll admit to being that person. A capability is, in essence, an
BL> opaque bound method. Of course, for them to be much use, you
BL> want the system to provide some other stuff, like not being able
BL> to recreate capabilities (i.e. you get hold of them from on
BL> high, and that's the _only_ way to get them).
That seems like a funny definition of a capability. That is, you seem to have chosen a particular way to represent capabilities and are using the representation to describe the abstract idea. A more general definition might be: A capability is an object that carries with it the authorization to perform some action. Informally, a capability can be thought of as a ticket. The ticket is good for some event and possession of the ticket is sufficient to attend the event.
I agree - I was trying to choose an example (just as you have) that would get the flavour of a capability across to any Python programmer.
A capability system must have some rules for creating and copying capabilities, but there is more than one way to realize those rules in a programming language. I assume you're suggesting that methods be thought of as capabilities, so that possession of a bound method object implies permission to invoke it. That seems like a reasonable design, but what about classes or functions or instances?
The idea I had was that these are all icing on the cake. If I can secure bound methods, I have what I want: something that enforces the properties a capability needs while imposing minimum overhead on the programmer. If we also get access to classes, functions or instances in a way that is secure, then that's great. But if we don't, it's not a huge loss.
The problem, which rexec solves after a fashion, is to prevent unauthorized copying of the capabilities, or more specifically of the access rights contained in the capability. That is, there's some object that checks tickets (the reference monitor). It needs to be able to inspect the ticket, but it shouldn't be possible for someone else to use that mechanism to forge a ticket.
I don't like this. There is no "reference monitor" that checks tickets, if you implement capabilities as I have suggested. You either have them or you don't. The ticket is the method. The method is the ticket. But, of course, you can have implementations that are less direct, I agree.
The problem for Python is that its introspection features make it impossible (?) for pure Python code to hide something. In Java, you could declare an instance variable private and know that the type system will prevent client code from accessing the variable directly. In Python, there is no private.
Rexec provides a way to turn off some of the introspection in order to allow some confinement. If you can't extract the im_class, im_func, and im_self attributes of a bound method, then you can use a bound method as a capability without risk that the holder will break into the object. On the other hand, if you want to use some other kind of object as a capability, you must be sure that there isn't some introspection mechanism that allows the holder to get into the representation. If there is, rexec needs to turn it off.
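To make the introspection risk concrete, here is a minimal sketch. It assumes Python 3, where the attributes named above as im_self and im_func are spelled __self__ and __func__; the Account class is hypothetical and only for illustration.

```python
class Account:
    """Hypothetical object we would like to expose only partially."""

    def __init__(self, balance):
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount
        return self.balance


account = Account(100)

# Intended as a "deposit only" capability handed to untrusted code.
deposit = account.deposit

# Without restrictions, the holder can climb back to the instance.
# Python 3 spells im_self as __self__ and im_func as __func__.
leaked = deposit.__self__
leaked.balance = 0  # full access to the object, not just deposit()
```

Unless access to these attributes is blocked, holding the bound method is equivalent to holding the whole instance.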
The problem with rexec is that the security code is smeared across the entire interpreter. Each object or introspection facility needs to have a little bit of code that participates in rexec. And every change to the language needs to take rexec into account. What's worse is that rexec turns off a set of introspection features, so you can't use any of those features in code that might get loaded in rexec.
I agree that this is a problem.
The Zope proxy approach seems a little more promising, because it centralizes all the security machinery in one object, a security proxy. A proxy for an object can appear virtually indistinguishable from the object itself, except that type(proxy) != type(object_being_proxied). The proxy guarantees that any object returned through the proxy is wrapped in its own proxy, except for simple immutable objects like ints or strings.
This approach seems promising because it is fairly self-contained. The proxy code can be largely independent of the rest of the interpreter, so that you can analyze it without scouring the entire source tree. It's also quite flexible. If you want to use instances as capabilities, you could, for example, use a proxy that only allows access to methods, not to instance variables. It's a simple mechanism that allows many policies, as opposed to rexec, which couples policy and mechanism.
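The methods-only policy described above might look roughly like the following. This is an illustrative sketch, not Zope's implementation: MethodOnlyProxy, wrap and Counter are hypothetical names, and a wrapper written in pure Python does not actually confine anything, since the holder can still reach _target or the closure through introspection. It only shows the shape of the policy.

```python
# Simple immutable values are returned unwrapped.
_SIMPLE = (int, float, str, bytes, bool, type(None))


def wrap(obj):
    """Wrap anything that is not a simple immutable value."""
    if isinstance(obj, _SIMPLE):
        return obj
    return MethodOnlyProxy(obj)


class MethodOnlyProxy:
    """Hypothetical proxy: public methods are reachable, data is not."""

    def __init__(self, target):
        self._target = target

    def __getattr__(self, name):
        # Only called when normal attribute lookup on the proxy fails.
        if name.startswith("_"):
            raise AttributeError(name)
        value = getattr(self._target, name)
        if not callable(value):
            raise AttributeError(name)  # hide instance variables

        def call(*args, **kwargs):
            # Results are wrapped too, so objects do not leak out bare.
            return wrap(value(*args, **kwargs))

        return call


class Counter:
    """Hypothetical target object."""

    def __init__(self):
        self.count = 0

    def increment(self):
        self.count += 1
        return self.count

    def clone(self):
        return Counter()


proxy = wrap(Counter())
first = proxy.increment()  # allowed: a public method
copy = proxy.clone()       # the returned Counter comes back wrapped
```

Here proxy.count raises AttributeError, while proxy.increment() works as it would on the real object, and type(proxy) is not Counter.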
I will admit to not being thoroughly familiar with Zope proxies, but I'd love to be persuaded. Right now, I have two instant issues with them:
a) They do not appear to be simple to use, in the slightest. One of the beauties of using opaque bound methods is that they are trivial and natural to use. No programmer should find it difficult, or even a noticeable overhead. In fact, I observed whilst doing limited experiments in this area that it led to what seemed to me to be an improvement in style (for example, if I wanted to have something read a configuration file, rather than passing the name of the file, a capability approach is to pass a file reader already bound to that file - this is, IMO, rather more elegant).
b) I don't understand how they avoid all the problems that are left lying around in the language itself without rexec.
I did look at the Zope proxy stuff. I found it very hard to understand, so I may well be totally missing the point.
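The configuration-file example in (a) can be sketched as follows. The names load_config and read_line are hypothetical, the key = value format is an assumption, and an in-memory io.StringIO stands in for an opened file so the example runs without touching the filesystem.

```python
import io


def load_config(read_line):
    """Parse 'key = value' lines using only the reader it was handed.

    The function never sees a filename, so it cannot open anything else.
    """
    settings = {}
    while True:
        line = read_line()
        if not line:  # an empty string means end of input
            break
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


# Stand-in for a real file; with a file it would be open(path).readline.
source = io.StringIO("# example\nhost = localhost\nport = 8080\n")

# Pass the bound method (the capability), not the name of the file.
config = load_config(source.readline)
```

In stock Python this is a matter of style rather than enforcement: the bound method still exposes the underlying file object unless the interpreter is restricted as discussed above.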