
My attention was drawn to this unanswered email, so here goes... Jeremy Hylton wrote:
"KPY" == Ka-Ping Yee <ping@zesty.ca> writes:
KPY> Wow, how did this topic end up crossing over to this list while i wasn't looking? :0
You sure react quick for someone who isn't looking <wink>.
A capability system must have some rules for creating and copying capabilities, but there is more than one way to realize those rules in a programming language.
KPY> I suppose there could be, but there is really only one obvious way: creating a capability is equivalent to creating an object -- which you can only do if you hold the constructor. A capability is copied into another object (for the security folks, "object" == "protection domain") when it is transmitted as an argument to a method call.
KPY> To build a capability system, all you need to do is to constrain the transfer of object references such that they can only be transmitted along other object references. That's all.
I don't follow you here. What does it mean to "transmit along other object references?" That is, everything in Python is an object and the only kind of references that exist are object references.
He's actually going slightly in circles here. The idea is that in order to acquire an object reference you either create the object, or are given the reference by another object you already have a reference to, or are given it by another object that has a reference to you. Where "you" is some object, of course. What is _not_ supposed to happen is finding objects by poking around in the symbol table, for example.
I think, based on the rest of your mail, that we're largely on the same page, but I'd like to make sure I understand where you're coming from.
I don't quite follow the definition of protection domain either, as most of the literature I'm familiar with (not much of it about capabilities specifically) talks about a protection domain as the set of objects a principal has access to. The natural way to extend that to capabilities seems to me to be that a protection domain is the set of capabilities possessed by a principal.
That sounds right. The transitive closure of the capabilities possessed by a principal is also interesting, though the code in the objects determines whether you have access to any particular member of that set in practice.
Are these questions off-topic for python-dev?
At any rate, it still seems like there are a variety of ways to realize capabilities in a programming language. For example, ZODB uses a special base class called Persistent to mark persistent objects. One could imagine using the same approach so that only some objects have capabilities associated with them.
This was the approach I took initially, but it's substantially messier than using bound methods.
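To make the bound-method idea concrete, here is a minimal sketch. It is not code from this thread; the `Account` class and `make_capabilities` helper are hypothetical names invented for illustration. The point is only that each bound method can be handed out as a separate capability instead of handing out the whole object.

```python
class Account:
    """Hypothetical protected object, invented for this illustration."""

    def __init__(self, balance):
        self._balance = balance

    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self._balance += amount

    def balance(self):
        return self._balance


def make_capabilities(account):
    # Hand out individual bound methods rather than the object itself.
    return account.deposit, account.balance


acct = Account(100)
deposit_cap, balance_cap = make_capabilities(acct)

# Code that is given only deposit_cap can add funds but cannot read the
# balance, because it was never handed balance_cap.
deposit_cap(25)
current = balance_cap()
```

In unrestricted Python this is a discipline rather than a guarantee: a bound method still exposes its object (as `__self__` in current Python), so the approach only holds if that kind of introspection is blocked.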
KPY> The problem for Python, as Jeremy explained, is that there are so many other ways of crawling into objects and pulling out bits of their internals.
KPY> Off the top of my head, i only see two things that would have to be fixed to turn Python into a capability-secure system:
KPY> 1. Access to each object is limited to its declared exposed interface; no introspection allowed.
KPY> 2. No global namespace of modules (sys.modules etc.).
KPY> If there is willingness to consider a "secure mode" for Python in which these two things are enforced, i would be interested in making it happen.
I think there is interest and I agree with your problem statement. I'd rephrase 2 to make it more general. Control access to other modules. The import statement is just as much of a problem as sys.modules, right? In a secure environment, you have to control what code can be loaded in the first place.
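As a rough sketch of what "control what code can be loaded" might look like, the following replaces `__import__` in the builtins seen by the executed code. The names `ALLOWED_MODULES`, `guarded_import` and `run_untrusted` are assumptions made up for this example, and this is emphatically not a complete sandbox; it only shows that the import statement can be routed through a policy hook.

```python
import builtins

ALLOWED_MODULES = {"math"}  # hypothetical policy, chosen for illustration


def guarded_import(name, globals=None, locals=None, fromlist=(), level=0):
    # Refuse anything whose top-level package is not on the allow-list.
    if name.split(".")[0] not in ALLOWED_MODULES:
        raise ImportError(f"import of {name!r} is not permitted")
    return builtins.__import__(name, globals, locals, fromlist, level)


def run_untrusted(source):
    # The import statement looks up __import__ in the builtins of the
    # executing code's globals, so replacing it controls what gets loaded.
    env = {"__builtins__": {"__import__": guarded_import}}
    exec(source, env)
    return env
```

With this in place, `run_untrusted("import math")` succeeds while `run_untrusted("import os")` raises `ImportError`. Anything reachable through an allowed module is still reachable, which is why point 1 (no introspection) is needed alongside it.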
Correct.
In Python, there is no private.
KPY> Side note (probably irrelevant): in some sense there is, but nobody uses it. Scopes are private. If you were to implement classes and objects using lambdas with message dispatch (i.e. the Scheme way, instead of having a separate "class" keyword), then the scoping would take care of all the private-ness for you.
I was aware of Rees's dissertation when I did the nested scopes and, partly as a result, did not provide any introspection mechanism for closures. That is, you can get at a function's func_closure slot but there's no way to look inside the cells from Python. I was thinking that closures could replace Bastions. It still seems possible, but on several occasions I've wished I could introspect about closures from Python code. I'm also unsure that the idea flies so well for Python, because you really want secure Python to be as much like regular Python as possible. If the mechanism is based on functions, it seems hard to make it work naturally for classes and instances.
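For readers who have not seen the closure-with-message-dispatch style, a small sketch follows. The `make_counter` function and its message names are hypothetical, written only to illustrate the idea; they do not come from the thread or from any library.

```python
def make_counter(start=0):
    count = start  # state lives only in this enclosing scope

    def dispatch(message, *args):
        nonlocal count
        if message == "increment":
            count += 1
            return count
        if message == "value":
            return count
        raise AttributeError(f"unknown message: {message!r}")

    # Callers receive only the dispatch function, never `count` itself.
    return dispatch


counter = make_counter()
counter("increment")
counter("increment")
result = counter("value")
```

Here `result` is 2, and there is no attribute on `counter` named `count` to reach for. It also shows the awkwardness mentioned above: the "object" no longer looks like a normal Python instance. Note that current CPython does let Python code read a cell through `counter.__closure__[0].cell_contents`, so a secure mode would have to close that door as well.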
The Zope proxy approach seems a little more promising, because it centralizes all the security machinery in one object, a security proxy. A proxy for an object can appear virtually indistinguishable from the object itself, except that type(proxy) != type(object_being_proxied). The proxy guarantees that any object returned through the proxy is wrapped in its own proxy, except for simple immutable objects like ints or strings.
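The wrapping behaviour described here can be sketched in pure Python as follows. This is not Zope's implementation or API; `Proxy`, `wrap`, `INTERFACES` and the `Folder` example class are all hypothetical names, and a pure-Python proxy like this one cannot actually hide its `_target` attribute from determined code.

```python
BASIC_TYPES = (int, float, str, bytes, bool, type(None))
INTERFACES = {}  # hypothetical registry: type -> frozenset of allowed names


def wrap(value):
    # Simple immutable values pass through; everything else gets a proxy.
    if isinstance(value, (Proxy,) + BASIC_TYPES):
        return value
    return Proxy(value)


class Proxy:
    def __init__(self, target):
        self._target = target

    def __getattr__(self, name):
        # Reached only for names not found on the proxy itself.
        allowed = INTERFACES.get(type(self._target), frozenset())
        if name not in allowed:
            raise AttributeError(f"access to {name!r} denied")
        return wrap(getattr(self._target, name))

    def __call__(self, *args, **kwargs):
        # Results of calls made through the proxy are wrapped as well.
        return wrap(self._target(*args, **kwargs))


class Folder:
    def __init__(self, name):
        self.name = name
        self.children = []

    def first(self):
        return self.children[0]

    def secret(self):
        return "hidden"


INTERFACES[Folder] = frozenset({"name", "first"})

root = Folder("root")
root.children.append(Folder("docs"))
proxied = wrap(root)
child = proxied.first()  # comes back wrapped in its own Proxy
```

Here `proxied.name` is the plain string "root", `child` is itself a `Proxy` whose `name` is "docs", and `proxied.secret` raises `AttributeError` because it is outside the declared interface.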
KPY> The proxy mechanism is interesting, but not for this purpose. A proxy is how you implement revocation of capabilities: if you insert a proxy in front of an object and grant access to that proxy, then you can revoke the access just by telling the proxy to stop responding.
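A minimal sketch of revocation through a proxy, under the assumption of a forwarding object with a separate revoke function; `RevocableProxy` and `make_revocable` are invented names, not part of any existing library.

```python
class RevocableProxy:
    """Forwards attribute access until the grantor revokes it."""

    def __init__(self, target):
        self._target = target

    def __getattr__(self, name):
        # Reached only for names not found on the proxy itself.
        if self._target is None:
            raise PermissionError("capability has been revoked")
        return getattr(self._target, name)


def make_revocable(target):
    proxy = RevocableProxy(target)

    def revoke():
        # Tell the proxy to stop responding by dropping its reference.
        proxy._target = None

    # The grantor keeps revoke and hands only proxy to the other party.
    return proxy, revoke


log = []
proxy, revoke = make_revocable(log)
proxy.append("first")  # forwarded to the underlying list
revoke()
```

After `revoke()`, `proxy.append("second")` raises `PermissionError` and `log` still holds only "first". One caveat of this simple version: a bound method fetched through the proxy before revocation keeps working, so real revocation would also need the results to be wrapped.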
Sure, you can use proxies for revocation, but that's not what I was trying to say.
I think the fundamental problem for rexec is that you don't have a security kernel. The code for security gets scattered throughout the interpreter. It's hard to have much assurance in the security when it's tangled up with everything else in the language.
You can use a proxy for an object to deal with goal #1 above -- enforce an interface for an object. I think about this much like a hardware capability architecture. The protected objects live in the capability segment and regular code can't access them directly. The only access is via a proxy object that is bound to the capability.
Regardless of proxy vs. rexec, I'd be interested to hear what you think about a sound way to engineer a secure Python.
I'm told that proxies actually rely on rexec, too. So, I guess whichever approach you take, you need rexec. The problem is that although you can think about proxies as being like a segmented architecture, you have to enforce that segmentation. And that means doing so throughout the interpreter, doesn't it? I suppose it might be possible to abstract things in some way to make that less widespread, but probably not without having an adverse impact on speed.
Cheers,
Ben.
--
http://www.apache-ssl.org/ben.html http://www.thebunker.net/
"There is no limit to what a man can do or how far he can go if he doesn't mind who gets the credit." - Robert Woodruff