[Python-Dev] Capabilities in Python

Mon, 03 Mar 2003 13:42:43 +0000

My attention was drawn to this unanswered email, so here goes...

Jeremy Hylton wrote:
>>>>>>"KPY" == Ka-Ping Yee <ping@zesty.ca> writes:
> 
> 
>   KPY> Wow, how did this topic end up crossing over to this list while
>   KPY> i wasn't looking?  :0
> 
> You sure react quick for someone who isn't looking <wink>.
> 
>   >> A capability system must have some rules for creating and copying
>   >> capabilities, but there is more than one way to realize those
>   >> rules in a programming language.
> 
>   KPY> I suppose there could be, but there is really only one obvious
>   KPY> way: creating a capability is equivalent to creating an object
>   KPY> -- which you can only do if you hold the constructor.  A
>   KPY> capability is copied into another object (for the security
>   KPY> folks, "object" == "protection domain") when it is transmitted
>   KPY> as an argument to a method call.
> 
>   KPY> To build a capability system, all you need to do is to
>   KPY> constrain the transfer of object references such that they can
>   KPY> only be transmitted along other object references.  That's all.
> 
> I don't follow you here.  What does it mean to "transmit along other
> object references?"  That is, everything in Python is an object and
> the only kind of references that exist are object references.

He's actually going slightly in circles here. The idea is that in order 
to acquire an object reference you either create the object, or are 
given the reference by another object you already have a reference to, 
or are given it by another object that has a reference to you (where 
"you" is some object, of course).

What is _not_ supposed to happen is finding objects by poking around in 
the symbol table, for example.
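
To make the rule concrete, here is a minimal sketch in present-day Python. 
The class names (Logger, Worker) are made up for illustration, and nothing 
in stock Python enforces the rule; the last few lines show the kind of 
"poking around" (here via the gc module rather than a symbol table) that 
a capability-secure mode would have to forbid.

```python
import gc

# Hypothetical classes, for illustration only.
class Logger:
    """Holding a reference to a Logger is the capability to append to it."""
    def __init__(self):
        self.lines = []

    def write(self, msg):
        self.lines.append(msg)


class Worker:
    """Has no authority except the references it is handed."""
    def __init__(self, log):
        # Legitimate: the reference is given to us by an object that holds it.
        self._log = log

    def run(self):
        # Legitimate: we created this object, so we hold a reference to it.
        scratch = ["done"]
        self._log.write(scratch[0])


log = Logger()         # the creator holds the only reference
worker = Worker(log)   # authority is passed explicitly as an argument
worker.run()
print(log.lines)       # ['done']

# NOT legitimate under capability rules: finding an object nobody gave us.
found = [o for o in gc.get_objects() if type(o) is Logger]
print(found[0] is log)  # True
```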

> I think, based on the rest of your mail, that we're largely on
> the same page, but I'd like to make sure I understand where you're
> coming from.
> 
> I don't quite follow the definition of protection domain either, as
> most of the literature I'm familiar with (not much of it about
> capabilities specifically) talks about a protection domain as the set
> of objects a principal has access to.  The natural way to extend that
> to capabilities seems to me to be that a protection domain is the set
> of capabilities possessed by a principal.

That sounds right. The transitive closure of the capabilities possessed 
by a principal is also interesting, though the code in the objects 
determines whether you have access to any particular member of that set 
in practice.

> Are these questions off-topic for python-dev?
> 
> At any rate, it still seems like there are a variety of ways to
> realize capabilities in a programming language.  For example, ZODB
> uses a special base class called Persistent to mark persistent
> objects.  One could imagine using the same approach so that only some
> objects have capabilities associated with them.

This was the approach I took initially, but it's substantially messier 
than using bound methods.
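
As a rough sketch of what "using bound methods" can look like (the 
Account class and audit function are invented for this example, and this 
is not necessarily the exact scheme referred to above): handing out a 
single bound method grants only that one operation, without any special 
base class. In present-day stock Python the bound method still exposes 
its object through __self__, so this is only a capability if that kind 
of introspection is blocked.

```python
# Hypothetical example class; not taken from any real library.
class Account:
    def __init__(self, balance):
        self._balance = balance

    def balance(self):
        return self._balance

    def withdraw(self, amount):
        if amount > self._balance:
            raise ValueError("insufficient funds")
        self._balance -= amount


def audit(get_balance):
    """Receives only the ability to read the balance, not to withdraw."""
    return get_balance()


acct = Account(100)
read_only = acct.balance   # a bound method acting as a narrow capability
print(audit(read_only))    # 100

# The leak a secure mode would need to close: the bound method
# still carries a reference to the whole object.
print(read_only.__self__ is acct)  # True
```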

>   KPY> The problem for Python, as Jeremy explained, is that there are
>   KPY> so many other ways of crawling into objects and pulling out
>   KPY> bits of their internals.
> 
>   KPY> Off the top of my head, i only see two things that would have
>   KPY> to be fixed to turn Python into a capability-secure system:
> 
>   KPY> 1. Access to each object is limited to its declared exposed
>   KPY>      interface; no introspection allowed.
> 
>   KPY> 2. No global namespace of modules (sys.modules etc.).
> 
>   KPY> If there is willingness to consider a "secure mode" for Python
>   KPY> in which these two things are enforced, i would be interested
>   KPY> in making it happen.
> 
> I think there is interest and I agree with your problem statement.
> I'd rephrase 2 to make it more general.  Control access to other
> modules.  The import statement is just as much of a problem as
> sys.modules, right?  In a secure environment, you have to control what
> code can be loaded in the first place.

Correct.
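
For illustration only, here is a sketch in present-day Python of 
controlling what the import statement can load, by running code with a 
replacement __import__. The names ALLOWED, guarded_import and 
run_restricted are assumptions made up for this example. This is not a 
sandbox: without point 1 above (no introspection), code can usually find 
its way around a guard like this.

```python
import builtins

# Hypothetical whitelist of top-level modules the guest code may import.
ALLOWED = {"math"}


def guarded_import(name, globals=None, locals=None, fromlist=(), level=0):
    """Stand-in for __import__ that refuses anything not whitelisted."""
    if name.split(".")[0] not in ALLOWED:
        raise ImportError(f"import of {name!r} is not permitted")
    return builtins.__import__(name, globals, locals, fromlist, level)


def run_restricted(source):
    """Execute source with only the guarded __import__ as a builtin."""
    env = {"__builtins__": {"__import__": guarded_import}}
    exec(source, env)
    return env


env = run_restricted("import math\nx = math.sqrt(16)")
print(env["x"])  # 4.0

try:
    run_restricted("import os")
except ImportError as exc:
    print("blocked:", exc)
```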

>   >> In Python, there is no private.
> 
>   KPY> Side note (probably irrelevant): in some sense there is, but
>   KPY> nobody uses it.  Scopes are private.  If you were to implement
>   KPY> classes and objects using lambdas with message dispatch
>   KPY> (i.e. the Scheme way, instead of having a separate "class"
>   KPY> keyword), then the scoping would take care of all the
>   KPY> private-ness for you.
> 
> I was aware of Rees's dissertation when I did the nested scopes and,
> partly as a result, did not provide any introspection mechanism for
> closures.  That is, you can get at a function's func_closure slot but
> there's no way to look inside the cells from Python.  I was thinking
> that closures could replace Bastions.  It still seems possible, but
> on several occasions I've wished I could introspect about closures
> from Python code.  I'm also unsure that the idea flies so well for
> Python, because you really want secure Python to be as much like
> regular Python as possible.  If the mechanism is based on functions,
> it seems hard to make it work naturally for classes and instances.
> 
>   >> The Zope proxy approach seems a little more promising, because it
>   >> centralizes all the security machinery in one object, a security
>   >> proxy.  A proxy for an object can appear virtually
>   >> indistinguishable from the object itself, except that type(proxy)
>   >> != type(object_being_proxied).  The proxy guarantees that any
>   >> object returned through the proxy is wrapped in its own proxy,
>   >> except for simple immutable objects like ints or strings.
> 
>   KPY> The proxy mechanism is interesting, but not for this purpose.
>   KPY> A proxy is how you implement revocation of capabilities: if you
>   KPY> insert a proxy in front of an object and grant access to that
>   KPY> proxy, then you can revoke the access just by telling the proxy
>   KPY> to stop responding.
> 
> Sure, you can use proxies for revocation, but that's not what I was
> trying to say.
> 
> I think the fundamental problem for rexec is that you don't have a
> security kernel.  The code for security gets scattered throughout the
> interpreter.  It's hard to have much assurance in the security when
> it's tangled up with everything else in the language.
> 
> You can use a proxy for an object to deal with goal #1 above --
> enforce an interface for an object.  I think about this much like a
> hardware capability architecture.  The protected objects live in the
> capability segment and regular code can't access them directly.  The
> only access is via a proxy object that is bound to the capability.
> 
> Regardless of proxy vs. rexec, I'd be interested to hear what you
> think about a sound way to engineer a secure Python.

I'm told that proxies actually rely on rexec, too. So, I guess whichever 
approach you take, you need rexec.

The problem is that although you can think about proxies as being like a 
segmented architecture, you have to enforce that segmentation. And that 
means doing so throughout the interpreter, doesn't it? I suppose it 
might be possible to abstract things in some way to make that less 
widespread, but probably not without having an adverse impact on speed.
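
A small sketch in present-day Python shows why the segmentation has to 
be enforced by the interpreter. The helper make_revocable and the File 
class are hypothetical names for this example, and this is not how Zope's 
security proxies are implemented. The proxy restricts the interface and 
supports revocation, yet a plain attribute lookup on what it returns 
hands back the protected object.

```python
class RevokedError(Exception):
    """Raised when a revoked proxy is used."""


def make_revocable(target, allowed):
    """Return (proxy, revoke) exposing only the names in `allowed`."""
    state = {"target": target}

    class Proxy:
        def __getattr__(self, name):
            current = state["target"]
            if current is None:
                raise RevokedError("capability revoked")
            if name not in allowed:
                raise AttributeError(name)
            return getattr(current, name)

    def revoke():
        state["target"] = None

    return Proxy(), revoke


# Hypothetical protected object.
class File:
    def __init__(self):
        self.data = "secret"

    def read(self):
        return self.data

    def delete(self):
        self.data = None


f = File()
proxy, revoke = make_revocable(f, {"read"})
print(proxy.read())              # secret

# The interface is enforced: 'delete' is not exposed.
print(hasattr(proxy, "delete"))  # False

# But the segmentation is not: the returned bound method leaks the target.
leaked = proxy.read.__self__
print(leaked is f)               # True

revoke()
try:
    proxy.read()
except RevokedError as exc:
    print("after revoke:", exc)
```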

Cheers,

Ben.

-- 
http://www.apache-ssl.org/ben.html       http://www.thebunker.net/

"There is no limit to what a man can do or how far he can go if he
doesn't mind who gets the credit." - Robert Woodruff