(I, Zooko, wrote the lines prepended with "> > ".)
Guido wrote:
>
> > The [Principle-of-Least-Privilege approach to securing a standard library]
> > is to separate the tools so that dangerous ones don't come tied together
> > with common ones. The security policy, then, is expressed by code that
> > grants or withholds capabilities (== references) rather than by code that
> > toggles the "restricted" bit.
>
> This sounds interesting, but I'm not sure I follow it. Can you
> elaborate by giving a couple of examples?
First let me say that "capability access control" [1] is a theoretical
construct, comparable to "access control lists" [2] and "Trust Management" [3].
Each is a formal model for specifying access control rules -- who is allowed to
do what.
But in the context of Python we are interested not only in the theoretical
model but also in a specific way of implementing it -- by making object
references unforgeable and binding all authorities to object references.
So in this discussion it may not be clear whether a claimed advantage of
"capabilities" flows from the formal model or from the practice of unifying
security programming with object oriented programming. I don't think it is
important to differentiate in this discussion.
Now for examples...
Hm, well first of all, where are rexec and Zope proxies currently used?
I believe that a "cap-Python" would support those uses, implementing the same
security policies, but more cleanly since access control would be a first-class
part of the language.
I don't know Zope very well, and rather than guess, I'd like to ask someone who
does know Zope to give a typical example of how proxies are used in workaday
Zope. I suspect that capabilities are quite similar to Zope proxies.
Now for a quick made-up example to demonstrate what I meant about expressing
security policy above, consider a tic-tac-toe game that is supposed to draw to
the screen.
In "restricted Python v1", certain modules have been flagged as "safe" and others
"unsafe". Code can execute other code with a "restricted" flag set, something
like this:
# restricted Python v1
game = eval(TicTacToeGame, restricted=True)
game.display()
Unfortunately, in "restricted Python v1", all of the modules that allow drawing
to the screen are marked as "unsafe", so the tic-tac-toe-game immediately dies
with an exception.
In "restricted Python v2", an arbitrary security policy can be implemented:
# restricted Python v2
games=[]
def securitypolicy(subject, action, object):
    if (((subject in games) and (action == "import") and (object == "wxPython")) or
        ((subject in games) and (action == "execute") and (object == "wxPython.Window")) or
        ((subject in games) and (action == "execute") and (object == "wxPython.Window.paint"))):
        return True
    # ...
    return False
game = eval(TicTacToeGame, policy=securitypolicy)
games.append(game)
game.display()
game.display()
I think that the "rexec" design was along the lines of "restricted Python v2",
but I apologize if this simple analogy insults anyone.
I'm not sure whether "restricted Python v2" is expressive enough to implement
the capability security access control model or not, but I don't care, because
I don't like "restricted Python v2". I like restricted Python v3:
# restricted Python v3
game = TicTacToeGame()
game.display(wxPython.wxWindow())
Now the game object has a reference to the window object, and it can use that
reference to draw the pictures. If I later change this design and decide that
instead of drawing to a window, I want the game to write to a file, then I'll
change the implementation of the TicTacToeGame class, and then I'll come back
here to this code and change it from passing a wxWindow to:
# restricted Python v3
game = TicTacToeGame()
game.display(open("/tmp/tttgame.out","w"))
Now if I were writing in "restricted Python v2", then in addition to those two
changes I would also have to make a third change, which is to edit my
securitypolicy function in order to allow this particular game object to access
a file named "/tmp/tttgame.out", and to disallow it access to wxPython:
# restricted Python v2
def securitypolicy(subject, action, object):
    if (subject in games) and (action in ("read", "write",)) and (object == "file:/tmp/tttgame.out"):
        return True
    # ...
    return False
game = TicTacToeGame()
game.display("/tmp/tttgame.out")
This is what I meant by saying that the security policy is expressed in Python
instead of by twiddling access bits in an embedded policy language. In a
capability-secure language, the change (which the programmer has to make anyway),
from "wxPython.wxWindow()" to "open('/tmp/tttgame.out', 'w')" is necessary and
sufficient to enforce the programmer's intended security policy, so there is no
need for the redundant and brittle "policy" function.
I find this unification of access control and application logic to resonate deeply
with the Zen of Python.
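A minimal sketch of the "v3" style in today's Python, to make the point concrete. The body of TicTacToeGame is hypothetical (the message never shows it), and an in-memory buffer stands in for the window or file. Ordinary Python does not enforce anything here, since the class could still import whatever it likes; the sketch only shows that the output target is chosen entirely by the reference the caller passes in.

```python
import io

class TicTacToeGame:
    """Hypothetical game class; the original message does not show its body."""

    def __init__(self):
        # 3x3 board, every cell starts empty.
        self.board = [[" "] * 3 for _ in range(3)]

    def move(self, row, col, mark):
        self.board[row][col] = mark

    def display(self, out):
        # The only authority used is the reference handed in as `out`.
        for row in self.board:
            out.write("|".join(row) + "\n")

game = TicTacToeGame()
game.move(1, 1, "X")
buf = io.StringIO()  # stands in for open("/tmp/tttgame.out", "w")
game.display(buf)
```

Swapping the buffer for a file object or a window wrapper with a write() method changes what the game can touch, with no separate policy function to update.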
Regards,
Zooko
[1] http://www.eros-os.org/papers/shap-thesis.ps
[2] http://www.research.microsoft.com/~lampson/09-Protection/Acrobat.pdf
[3] http://citeseer.nj.nec.com/blaze96decentralized.html
I'm not sure whether to classify this as a bug or a feature request.
Recently, I got burned by the fact that, despite the name, dirname() does not
return the expected directory portion of a path if you pass it a directory;
instead it returns the parent directory, because it uses split.
That it uses split is clearly documented and also evident in the source,
though both fail to point out the case of passing in a directory path.
"dirname(path)
Return the directory name of pathname path. This is the first half of the
pair returned by split(path)."
# Return the head (dirname) part of a path.
def dirname(p):
    """Returns the directory component of a pathname"""
    return split(p)[0]
However, to get what I would consider correct behavior based on the function
name, the code would need to be:
def dirname(p):
    """Returns the directory component of a pathname"""
    if isdir(p):
        return p
    else:
        return split(p)[0]
Changing dirname() may in fact break existing code if people expect it to
just use split, so a dirname2() function seems called for, but that seems
silly, given that dirname should probably be doing an isdir() check.
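To illustrate the behavior being described, here is a small sketch using posixpath so the results do not depend on the platform. The helper name dirname_or_self and its isdir parameter are made up for this example and are not part of os.path. Note that a trailing slash already changes what dirname() returns, since split() cuts at the last slash.

```python
import posixpath

def dirname_or_self(path, isdir=posixpath.isdir):
    """Hypothetical helper: return path itself if it is a directory."""
    if isdir(path):
        return path
    return posixpath.dirname(path)

# dirname() is purely textual: it cuts at the last slash.
without_slash = posixpath.dirname("/usr/lib/python")   # parent directory
with_slash = posixpath.dirname("/usr/lib/python/")     # the directory itself
```

The isdir parameter is only there so the helper can be exercised without touching the filesystem.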
ka
Four members of PythonLabs will be at the pre-PyCon sprint (more info on
sprints at http://www.python.org/cgi-bin/moinmoin/SprintPlan ) running one
for the Python core. If you would like to attend, email me at
brett(a)python.org to say so. You must be registered for PyCon to be able
to attend. And please do this ASAP so we can get the ball rolling on this
and lock down who will be there.
And regardless of whether you care to attend or not, please look at
http://www.python.org/cgi-bin/moinmoin/PyCoreSprint and make suggestions
on what the group should sprint on.
-Brett
I just tried running regrtest with "-uall,-largefile" (after a "cvs up",
"./config.status --recheck", and "make") on my Mac OS X system. It chugged
for a while, then spit this out several times:
Exception in thread reader 4:
Traceback (most recent call last):
  File "/Users/skip/src/python/head/dist/src/Lib/threading.py", line 411, in __bootstrap
    self.run()
  File "/Users/skip/src/python/head/dist/src/Lib/threading.py", line 399, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/Users/skip/src/python/head/dist/src/Lib/bsddb/test/test_thread.py", line 270, in readerThread
    rec = c.first()
DBLockDeadlockError: (-30995, 'DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock')
once for each thread, then this:
/Users/skip/src/python/head/dist/src/Lib/bsddb/dbutils.py:67: RuntimeWarning: DB_INCOMPLETE: Cache flush was unable to complete
  return function(*_args, **_kwargs)
After chugging awhile longer, it segfaulted.
What (if anything) can I do to provide useful inputs to someone who can
possibly fix the problem?
Skip
Back at work on the ossaudiodev docs for a few minutes. Documenting an
API is always a great opportunity to clean it up, and the
ossaudiodev.open() function has a weird interface right now. From the
current docs:
"""
open([device, ] mode)
Open an audio device and return an OSS audio device object. This
object supports many file-like methods, such as read(), write(), and
fileno() (although there are subtle differences between conventional
Unix read/write semantics and those of OSS audio devices). It also
supports a number of audio-specific methods; see below for the
complete list of methods.
Note the unusual calling syntax: the first argument is optional, and
the second is required. This is a historical artifact for
compatibility with the older linuxaudiodev module which ossaudiodev
supersedes.
device is the audio device filename to use. If it is not specified,
this module first looks in the environment variable AUDIODEV for a
device to use. If not found, it falls back to /dev/dsp.
mode is one of 'r' for read-only (record) access, 'w' for write-only
(playback) access and 'rw' for both. Since many soundcards only
allow one process to have the recorder or player open at a time it
is a good idea to open the device only for the activity
needed. Further, some soundcards are half-duplex: they can be opened
for reading or writing, but not both at once.
"""
The historical background is that in linuxaudiodev prior to Python 2.3,
it was *impossible* to specify the device file to open -- you had to do
something like this:
os.environ['AUDIODEV'] = "/dev/dsp2"
dsp = linuxaudiodev.open("w")
Fixing that wart is what led me to create ossaudiodev in the first
place. Cleaning up the remaining ugliness in ossaudiodev.open() brings
things nicely full-circle. Anyways, since the module has been renamed,
who cares about backwards compatibility with linuxaudiodev? I'd like to
change the open() interface to:
open(device, mode)
where both are required. (Most use of the audio device is for playback,
not recording. But a default mode of "w" goes counter to expectations.
So I think 'mode' should be required.)
This would also mean getting rid of the $AUDIODEV check in
ossaudiodev.c. Less C code is a good thing, unless of course it leads
to lots of redundant Python code all over the world.
Finally, for consistency I should also change openmixer() to require a
'device' argument (currently, it does the same thing, but hardcodes
"/dev/mixer" and checks $MIXERDEVICE).
Of course, this will lead people to hardcode "/dev/dsp" (and/or
"/dev/mixer") into their Python audio scripts. That's bad if other
OSS-using operating systems have different names for the standard audio
devices. Do they?
But it's certainly no *worse* than the situation for C programmers, who
have to assume "/dev/dsp" as a default -- the open(2) system call
certainly doesn't let you get away with leaving the filename out. And
besides, "/dev/dsp" is already hard-coded into ossaudiodev.c, so if
that's inappropriate on certain operating systems, somebody's going to
lose already.
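For scale, the "redundant Python code" in question would be roughly the following. This is only a sketch: the helper names are made up, and the environment variables and fallback paths are the ones described above ($AUDIODEV with /dev/dsp, $MIXERDEVICE with /dev/mixer).

```python
import os

def default_dsp_device(environ=os.environ):
    """Hypothetical helper: mimic the lookup currently done in ossaudiodev.c."""
    return environ.get("AUDIODEV", "/dev/dsp")

def default_mixer_device(environ=os.environ):
    """Hypothetical helper: same idea for the mixer device."""
    return environ.get("MIXERDEVICE", "/dev/mixer")

# Under the proposed two-argument interface a script would then write:
#     dsp = ossaudiodev.open(default_dsp_device(), "w")
```

The environ parameter is there so the lookup can be checked without modifying the real environment.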
Thoughts?
Greg
--
Greg Ward <gward(a)python.net> http://www.gerg.ca/
Sure, I'm paranoid... but am I paranoid ENOUGH?
Someone changed test_popen to "quote" the path to python:
cmd = '"%s" -c "import sys;print sys.argv" %s' % (sys.executable, cmdline)
       ^  ^
The double-quote characters above the carets are new.
This causes test_popen to fail on Win2K, but not on Win98. The relevant
difference appears to be the default shell (cmd.exe on the former,
command.com on the latter).
Simplified example, on Win2K:
>>> p = os.popen('python -c "print 666"')
>>> p.read()
'666\n'
>>> p.close()
>>>
Worked fine, but doesn't if python is quoted:
>>> p = os.popen('"python" -c "print 666"')
>>> p.read()
''
>>> p.close()
1
>>>
The same kind of behavior can be observed directly from a DOS-box prompt:
C:\Code\python\PCbuild>cmd /c python -c "print 666"
666
C:\Code\python\PCbuild>
Worked fine, but quoting the program name flops:
C:\Code\python\PCbuild>cmd /c "python" -c "print 666"
'python" -c "print' is not recognized as an internal or external command,
operable program or batch file.
C:\Code\python\PCbuild>
So it looks like it stripped off the first and last double-quote characters,
leaving two senseless double-quote characters "in the middle".
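The guess above can be checked against the error message with a toy model. This is not cmd.exe's real parser, just two made-up helper functions: one drops the first and last double-quote characters, the other takes the first token while treating spaces inside quotes as part of the token.

```python
def strip_outer_quotes(cmdline):
    """Toy model: remove the first and the last double-quote character."""
    first = cmdline.find('"')
    last = cmdline.rfind('"')
    if first == -1 or first == last:
        return cmdline
    return cmdline[:first] + cmdline[first + 1:last] + cmdline[last + 1:]

def first_token(cmdline):
    """Toy model: split at the first space that is not inside quotes."""
    in_quotes = False
    for i, ch in enumerate(cmdline):
        if ch == '"':
            in_quotes = not in_quotes
        elif ch.isspace() and not in_quotes:
            return cmdline[:i]
    return cmdline

stripped = strip_outer_quotes('"python" -c "print 666"')
command = first_token(stripped)
```

Under this model the command name comes out as the same string the shell complained about, which is at least consistent with the stripping theory.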
(I, Zooko, wrote the lines prepended with "> > ".)
Jeremy Hylton <jeremy(a)zope.com> wrote:
>
> > Until you have a substantial Least-Privilege-respecting library you can't gain
> > the big benefit of capabilities -- code which is capable of doing something
> > useful without also being capable of doing harm. (You can gain the "sandbox"
> > style of security -- code which is incapable of doing anything useful or
> > harmful.)
>
> If you need to rewrite all the libraries to be capability-aware, then
> you need to trust everyone who writes library code to understand
> capabilities and be thorough enough to get them right.
With capabilities, as with any other security regime, you can execute code while
denying it access to any of the standard libraries. However if you want to
provide code access to some of the standard library's privileges without
providing access to all of them, then in any possible security regime you need
(a) some way to express which privileges it gets and which it doesn't, with
sufficiently fine granularity that you can grant the privileges you want while
excluding those you must, and (b) when actually executing the code you have to
choose which specific privileges to extend.
In a capability-secure language the first step, (a), is done by the language
designer. Then the library designer provides a library of bundles of
privileges, and then (b) a programmer executes the code, passing to that code
all and only those privileges which he wants that code to have.
The library designer's job is actually pretty easy -- just: 1. try to make
privileges which are likely to be wanted separately conveniently separable and
2. try to make privileges which are likely to be wanted together conveniently
bundled.
If the library designers err on either side, the application programmer can
patch it up. For example, suppose the library designer made it so that a single
object, the "os" object, contained both the "os.system()" method and the
"os.times()" method, and the programmer wants to extend the ability to get a
timestamp without extending the ability to invoke arbitrary commands. (Note:
I'm aware that os is a module and not an object, but for now I want to think of
it as an object to be passed by reference instead of as a module to be
"import"'ed. If we continue along the cap-Python path we'll have to come back
to this.)
So the programmer just defines a proxy:
class osproxy:
    def __init__(self, os):
        self.os = os
    def times(self):
        return self.os.times()
and gives an instance of osproxy instead of the os object itself. (In practice,
when it is only a single method, you would of course prefer to just pass the
method itself. The proxy pattern is more general.)
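A sketch of the single-method case, passing the function itself rather than a proxy. The function name report_cpu_time is made up for the example. In ordinary Python nothing stops the callee from importing os on its own, so this shows only the calling convention, not an enforced restriction.

```python
import os

def report_cpu_time(get_times):
    """Hypothetical callee: it is handed a timestamp source and nothing else."""
    user, system = get_times()[:2]
    return user + system

# The caller extends exactly one privilege: the ability to read process times.
total = report_cpu_time(os.times)
```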
If the library designer has erred on the other side, making separate objects for
each of a dozen different related and innocuous functions, the programmer will
very likely define one object which contains all of those functions and pass a
reference to that object where he would have had to pass a dozen references to a
dozen functions.
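The opposite repair, bundling several separate functions behind one reference, might look like this. The class name MathTools and the choice of functions are purely illustrative.

```python
import math

class MathTools:
    """Hypothetical bundle: one reference that carries several harmless functions."""

    def __init__(self, *functions):
        # Expose each function under its own name, e.g. tools.sqrt.
        for function in functions:
            setattr(self, function.__name__, function)

def hypotenuse(tools, a, b):
    # The callee needs one argument instead of one per function.
    return tools.sqrt(a * a + b * b)

tools = MathTools(abs, math.sqrt, math.floor)
result = hypotenuse(tools, 3, 4)
```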
I may have made too big a deal about this originally. I just spent a few
minutes browsing through modindex.html (parts of which I am already intimately
familiar with), and nothing jumped out at me as needing to be wrapped or
refactored before it could be used in a cap-Python. Perhaps the Python Standard
Library's natural modularity has already gotten us most of the way there.
> > http://www.erights.org/elib/capability/ode/ode-capabilities.html#patt-coop
>
> I don't see the part of this paper that talks about library design :-).
> I assume that it's the first section "Only Connectivity Begets
> Connectivity." But I don't know if I understand how that applies to
> library design in concrete terms.
No, "Only Connectivity Begets Connectivity" is just the "pointer-safety"
requirement -- that one can't get a reference to an object, except by either
(a) creating the object, or (b) getting the reference from some other object
which already had the reference.
Hm. Yes, that page doesn't really talk about library design. The authors of
E performed a project [1] for DARPA in which they implemented a web browser
which could host pluggable renderers, such that a malicious renderer was
constrained in the damage it could do. (I have no idea what DARPA wants with
such a thing. ;-))
The security review team at the conclusion of the project (which included great
cryptographer David Wagner) wrote [2] that E appeared to have advanced the state
of the art without breaking a sweat. The security flaws that they uncovered
were mostly due to insufficient wrapping of the Java standard libraries. For
example, the E folks had allowed an object to access a Java "File" object so
that it could access a single file, without realizing that the Java File object
has a "getParentFile()" method which returns the parent directory.
That was why I made such a big deal about the importance of a secure standard
library in my previous message. (As you know, Python's file objects don't have
a "getParentFile()" method, so we're already one step ahead of Java there...)
Regards,
Zooko
[1] http://www.combex.com/tech/darpaBrowser.html
[2] http://www.combex.com/papers/darpa-review/index.html
Jeremy Hylton <jeremy(a)alum.mit.edu> wrote:
>
> Exceptions do seem like a problem.
This reminds me of a similar problem. Object A is a powerful object. Object B
is a proxy for A which passes through only a subset of A's methods. So B is
suitable to give to Object C, which should be able to use the B subset but not
the full A set.
The problem is if the B subset of methods includes a callback idiom, in which
Object A calls a function provided by its client and passes a reference to
itself as an argument.
class A:
    def __init__(self):
        self.handlers = []
    def register_event_handler(self, handler):
        self.handlers.append(handler)
    def process_events(self):
        # ...
        for handler in self.handlers:
            handler(self)
This allows C full access to object A's methods if C has access to the
register_event_handler() method. (This holds even if A has private data and even
if the proxy and the capability enforcement that keep C from reaching A through B
are otherwise flawless.)
So the designer of the B proxy has to not only exclude dangerous methods of A,
but also has to either exclude methods that lead to this kind of callback, or
else make B a two-faced proxy that registers itself instead of C as the handler,
forwards the callback, and passes a reference to itself instead of to A in the
callback.
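A sketch of the "two-faced" variant. The dangerous() method and the attribute name _a are made up for illustration, and since normal Python has no mandatory private data, the _a attribute and the closure are still reachable by introspection. The sketch shows the forwarding pattern only.

```python
class A:
    def __init__(self):
        self.handlers = []

    def register_event_handler(self, handler):
        self.handlers.append(handler)

    def process_events(self):
        for handler in self.handlers:
            handler(self)

    def dangerous(self):
        # Hypothetical method that C must never reach.
        return "full power"

class B:
    """Two-faced proxy: registers itself with A and hands itself to C."""

    def __init__(self, a):
        self._a = a

    def register_event_handler(self, handler):
        def forward(_source):
            # Replace the reference to A with a reference to the proxy.
            handler(self)
        self._a.register_event_handler(forward)

a = A()
b = B(a)
seen = []
b.register_event_handler(seen.append)  # plays the role of object C
a.process_events()
```

After the events are processed, the handler has received b rather than a, so the callback no longer leaks the powerful object.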
Regards,
Zooko
To enforce capability access control, a language requires three things:
1. Pointer-safety. (There must not be a function available which performs the
inverse of id().) Python has pointer-safety (unless a 3rd party native
extension module has been executed).
2. Mandatory private data (accessible only by the object itself). Normal
Python doesn't have mandatory private data. If I understand correctly, both
rexec and proxies (attempt to) provide this. They also attempt to provide
another safety feature: a wrapper around the standard library and builtins that
turns off access to dangerous features according to an overridable security
policy.
3. A standard library that follows the Principle of Least Privilege. That is,
a library full of tools that you can extend to an object in order to empower it
to do specific things (e.g. __builtin__.abs(), os.times(), ...) without thereby
also empowering it to do other things (e.g. __builtin__.file(), os.system(),
...). Python doesn't have such a library.
Now the Principle of Least Privilege approach to making a library safe is very
different from the "sandbox" approach. The latter is to remove all "dangerous"
tools from the toolbox (or in our case, to have them dynamically disabled by the
"restricted" bit which is determined by an overridable policy). The former is
to separate the tools so that dangerous ones don't come tied together with
common ones. The security policy, then, is expressed by code that grants or
withholds capabilities (== references) rather than by code that toggles the
"restricted" bit.
Of course, you can start by denying the entire standard library to restricted
code, and then incrementally refactor the library or wrap it in Least-Privilege
wrappers.
Until you have a substantial Least-Privilege-respecting library you can't gain
the big benefit of capabilities -- code which is capable of doing something
useful without also being capable of doing harm. (You can gain the "sandbox"
style of security -- code which is incapable of doing anything useful or
harmful.)
This requirement also means that there can be no "ambient authority" --
authority that an object receives even if its creator has given it no
references.
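The difference between ambient authority and granted authority can be sketched with two logging functions. Both function names and the path /tmp/app.log are hypothetical. The first reaches for open() on its own; the second can only use what its caller hands it.

```python
import io

def log_ambient(message):
    # Ambient authority: the function picks a file without being given one.
    with open("/tmp/app.log", "a") as f:  # hypothetical path
        f.write(message + "\n")

def log_with_capability(write, message):
    # Granted authority: the only thing it can affect is what `write` reaches.
    write(message + "\n")

buf = io.StringIO()
log_with_capability(buf.write, "game started")
```

In a capability-secure Python the first form would simply not work unless someone had passed in a reference to open(); today both run, which is exactly the gap described in point 3.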
Regards,
Zooko
P.S. I learned this three-part paradigm from Mark Miller whose paper with Chip
Morningstar and Bill Frantz articulates it in more detail:
http://www.erights.org/elib/capability/ode/ode-capabilities.html#patt-coop
From: Samuele Pedroni [mailto:pedronis@bluewin.ch]
> the first candidate would be a generalization of 'class'
> (although that make it redundant with 'class' and meta-classes)
> so that
>
> KEYW-TO-BE kind name [ '(' expr,... ')' ] [ maybe [] extended syntax ]:
> suite
>
> would be equivalent to
>
> name = kind(name-as-string,(expr,...),dict-populated-executing-suite)
[fixed up to exclude the docstring, as per the followup message]
I like this - it's completely general, and easy to understand. Then again,
I always like constructs defined in terms of code equivalence, it seems to
be a good way to make the semantics completely explicit.
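The code equivalence can be tried out today without any new syntax. In this sketch the helper names run_suite and build are made up; they emulate by hand what the proposed statement would expand to, with the suite given as a source string. Using type as the "kind" reproduces an ordinary class statement.

```python
def run_suite(source):
    """Execute the suite and return the dict it populated."""
    namespace = {}
    exec(source, {}, namespace)
    return namespace

def build(kind, name, args, suite_source):
    # name = kind(name-as-string, (expr, ...), dict-populated-executing-suite)
    return kind(name, args, run_suite(suite_source))

suite = (
    "x = 1\n"
    "def norm(self):\n"
    "    return abs(self.x)\n"
)
Point = build(type, "Point", (object,), suite)

def prop(name, args, namespace):
    """Hypothetical 'kind' for properties: build one from fget/fset in the suite."""
    return property(namespace.get("fget"), namespace.get("fset"))

answer = build(prop, "answer", (), "def fget(self):\n    return 42\n")
```

Any callable taking (name, args, dict) can serve as a kind, which is where both the generality and the risk of overuse come from.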
The nice thing, to me, is that it solves the immediate problem (modulo a
suitable "kind" to work for properties), as well as being extensible to
allow it to be used in more general contexts.
The downside may be that it's *too* general - I've no feel for how it would
look if overused - it might feel like people end up defining their own
application language.
> the remaining problem would be to pick a suitable KEYW-TO-BE
"block"?
Someone, I believe, suggested reusing "def" - this might be nice, but IIRC
it won't work because of the grammar's strict lookahead limits. (If it does
work, then "def" looks good to me).
If def won't work, how about "define"? The construct is sort of an extended
form of def. Or is that too cute?
By the way, can I just say that I am +1 on Michael Hudson's original patch
for [...] on definitions. Even though it doesn't solve the issue of
properties, I think it's a nice solution for classmethod and staticmethod,
and again I like the generality.
Paul.