[Python-Dev] Capabilities
Zooko
zooko@zooko.com
Mon, 31 Mar 2003 17:22:41 -0500
It's apparent that I didn't explain capabilities clearly enough. Also
I misunderstood something about rexec in general and ZipFile in particular.
Once we succeed at understanding each other, I'll then inquire whether you agree
with my Big Word Proofs.
(I, Zooko, wrote lines prepended with "> > ".)
Guido wrote:
>
> > So in the "separate policy language" way of life, access to the
> > ZipFile class gives you the ability to open files anywhere in the
> > filesystem. The ZipFile class therefore has the "dangerous" flag
> > set, and when you run code that you think might misuse this feature,
> > you set the "can't use dangerous things" flag on that code.
>
> But that's not how rexec works. In the rexec world, the zipfile
> module has no special privileges; when it is imported by untrusted
> code, it is reloaded from disk as if it were untrusted itself. The
> zipfile.ZipFile class is a client of "open", an implementation of
> which is provided to the untrusted code by the trusted code.
<Zooko reads the zipfile module docs.>
How is the implementation of "open" provided by the trusted code to the
untrusted code? Is it possible to provide a different "open" implementation to
different "instances" of the zipfile module? (I think not, as there is no such
thing as "a different instance of a module", but perhaps you could have two
rexec "workspaces" each of which has a zipfile module with a different "open"?)
> > In this scheme, there are no flags, and when you run code
> > that you think might misuse this feature, you simply don't give that
> > code a reference to the ZipFile class. (Also, we have to arrange
> > that it can't acquire a reference by "import zipfile".)
>
> The rexec world solves this very nicely IMO. Can't the capability
> world do it the same way? The only difference might be that 'open'
> would have to be a capability.
I don't understand exactly how rexec works yet, but so far it sounds like
capabilities.
Here's a two sentence definition of capabilities:
Authority originates in C code (in the interpreter or C extension modules), and
is passed from thing to thing. A given thing "X" -- an instance of ZipFile, for
example -- has the authority to use a given authority -- to invoke the real
open(), for example -- if and only if some thing "Y" previously held both the
"open()" authority and the "authority to extend authorities to X" authority, and
chose to extend the "open()" authority to X.
That rule could be enforced with the rexec system, right?
Here is a graphical representation of this rule. (Taken from [1].)
http://www.erights.org/elib/capability/ode/images/fundamental.gif
In the diagram, the authority is "Carol", the thing that started with the
authority is "Alice", and Alice is in the process of extending to Bob the
authority to use Carol. This act -- the extending of authority from Alice to
Bob -- is the only way that Bob can gain authority, and it can only happen if
Alice has both the authority to use Carol and the authority to extend
authorities to Bob.
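
As an illustration only, the Alice/Bob/Carol rule can be sketched in plain Python. The class and method names below (Alice, Bob, Carol, introduce, receive) are hypothetical and follow the diagram, not any real library. Ordinary Python does not enforce the rule (introspection can leak references), so this shows the shape of the rule rather than a secure implementation.

```python
class Carol:
    """The resource: holding a reference to it is the authority to use it."""
    def greet(self):
        return "hello from Carol"


class Bob:
    """Starts with no reference to Carol, so no authority to use her."""
    def __init__(self):
        self.carol = None

    def receive(self, carol):
        # Bob gains authority only when someone hands him the reference.
        self.carol = carol

    def use_carol(self):
        if self.carol is None:
            raise PermissionError("Bob holds no reference to Carol")
        return self.carol.greet()


class Alice:
    """Holds references to both Carol and Bob, so she may introduce them."""
    def __init__(self, carol, bob):
        self.carol = carol
        self.bob = bob

    def introduce(self):
        # The only path by which Bob can come to hold Carol.
        self.bob.receive(self.carol)


carol = Carol()
bob = Bob()
alice = Alice(carol, bob)

try:
    bob.use_carol()
    before = "allowed"
except PermissionError:
    before = "denied"

alice.introduce()
after = bob.use_carol()
```

Before the introduction Bob's attempt fails; afterwards it succeeds, and nothing other than Alice's act of passing the reference changed.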
Those two sentences above (and equivalently the graph) completely define
capabilities, in the abstract. They don't say how they are implemented. A
particular implementation that I find deeply appealing is to make "has a
reference to 'open'" be the determiner of whether a thing has the authority to
use "open", and to make "has a reference to X" be the determiner of whether a
thing has the authority to extend authorities to X. That's "unifying
designation with authority", and that's what the E language does.
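
A minimal sketch of "a reference to 'open' is the authority to use 'open'", assuming hypothetical helper names (make_readonly_opener, count_lines) that appear in neither rexec nor E. The caller decides what the code may open by choosing which open-like callable to pass in. In unrestricted Python the callee could still reach the builtin open(), so this only illustrates the calling convention.

```python
import os
import tempfile


def make_readonly_opener(root):
    """Return an open-like callable limited to reading files under root."""
    root = os.path.realpath(root)

    def restricted_open(name):
        path = os.path.realpath(os.path.join(root, name))
        # Refuse any path that resolves outside the granted directory.
        if os.path.commonpath([root, path]) != root:
            raise PermissionError("outside the granted directory")
        return open(path, "r")

    return restricted_open


def count_lines(opener, name):
    """Client code: it can open only what the opener it was handed allows."""
    with opener(name) as f:
        return sum(1 for _ in f)


with tempfile.TemporaryDirectory() as d:
    with open(os.path.join(d, "notes.txt"), "w") as f:
        f.write("a\nb\n")

    opener = make_readonly_opener(d)
    n = count_lines(opener, "notes.txt")

    try:
        count_lines(opener, os.path.join("..", "elsewhere.txt"))
        escaped = True
    except PermissionError:
        escaped = False
```

There is no separate policy table here: passing the opener both names the files in question and grants the ability to read them.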
> But I think "this code can't use ZipFile" is the wrong thing to say.
> You should only have to say "this code can't write files" (or
> something more specific).
I agree. I incorrectly inferred from previous messages that the current problem
under discussion was allowing or denying access to the ZipFile class. But
whatever resource we wish to control access to, these same techniques will
apply.
> > In a system where designation is not unified with authority, you
> > tell this untrusted code "I want you to do this action X.", and then
> > you also have to go update the policy specification to say that the
> > code in question is allowed to do the action X.
>
> Sorry, you've lost me here. Which part is the "designation" (new word
> for me) and which part is the "authority"?
Sorry. First let me point out that the issue of unifying designation with
authority is separable from "the capability access control rule" described
above. The two have good synergy, but aren't identical.
By "designation" I meant "naming". For example... Let's see, I think I'll go
back to my toy tictactoe example from [2].
In the tictactoe example, you have to specify which wxWindow the tictactoe game
object should draw into. This is "designation" -- you pass a reference, which
designates which specific window you are talking about. If you use the
principle of unifying designation and authority, then this same act -- passing a
reference to this particular wxWindow object -- conveys both the identification
of which window to draw into and the authority to draw into it.
# access control system with unified designation and authority
game = TicTacToeGame()
game.display(wxPython.wxWindow())
If you have separate designation and authority, then the same code has to look
something like this:
# access control system with separate designation and authority
game = TicTacToeGame()
window = wxPython.wxWindow()
def policy(subject, resource, operation):
    if (subject is game) and (resource is window) and \
            (operation == "invoke methods of"):
        return True
    return False
rexec.register_policy_hook(policy)
game.display(window)
This is what I call "say it twice if you really mean it".
Hm. Reviewing the rexec docs, I begin to suspect that the "access control
system with unified designation and authority" *is* how Python does access
control in restricted mode, and that rexec itself is just to manage module
import and certain dangerous builtins.
> It really sounds to me like at least one of our fundamental (?)
> differences is the autonomicity of code units. I think of code (at
> least Python code) as a passive set of instructions that has no
> inherent authority but derives authority from the built-ins passed to
> it; you seem to describe code as having inherent authority.
I definitely don't intend for code to have inherent authority (other than the
Trusted Code Base -- the interpreter -- which can't help but have it). The
"things" in my two-sentence definition (the white circles in the diagram) are
"computational things that can have state and behavior". (This includes Python
objects, closures, stack frames, etc... In another context I would call them
"objects", but Python uses the word "object" for something more specific -- an
instance of a class.)
> > This would be effectively the "virtualization" of access control. I
> > regard it as a kind of holy Grail for internet computing.
>
> How practical is this dream? How useful?
Let's revisit the issue once we understand one another's access control schemes.
;-)
Regards,
Zooko
[1] http://www.erights.org/elib/capability/ode/overview.html
[2] http://mail.python.org/pipermail/python-dev/2003-March/033938.html
http://zooko.com/
^-- under re-construction: some new stuff, some broken links