On 7/20/06, Lawrence Oluyede <l.oluyede@gmail.com> wrote:
That's great. I just read your draft and I have a few comments to make,
but first let me say that I liked the idea of borrowing concepts from E.
I came across E at the beginning of this year and found it to be
a pot of really nice ideas (for promises and capabilities). Here are
my comments about the draft:

- it's not really clear to me what the "powerbox" is. I think I got
the concept of "super process" but maybe it should be clarified?
It becomes clear in the "threat model" paragraph

The powerbox is the thing that gives your security domains their initial abilities.  The OS gives the process its abilities, but it does not directly work with the interpreter.  Since the process does, though, it is considered the powerbox and farms out abilities that it has been given by the OS.

I have tried to clarify the definition at the start of the doc.

- I hope no Rubyists will read the "Problem of No Private Namespace"
section because they have private/protected keywords to enforce this
stuff :-) Writing proxies in C will slow down the dev process (although
it will maybe speed up performance) but in the far future someone will
come up with an alternative closer to the Python level

Maybe.  As I said in the doc, any changes must be Pythonic, and adding private namespaces right now would not be, at least not without much more thought and work.

And if Ruby ends up with this security model but more thoroughly, more power to them.  Their language is different in the right ways to support it.

As for coding in C, them's the breaks.  I plan on adding stuff to the stdlib for the common case.  I might eventually think of a good, generic proxy object that could be used, but as of right now I am not worrying about that since it would be icing on the cake.

- Can you write down a simple example of what you mean by "changing
something of the built-in objects"? (in "Problem of mutable shared
state")

Done.
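
For the thread, here is a minimal sketch of the kind of thing that section is about. It is not the example from the doc, and it assumes a modern Python 3 where the module is spelled `builtins`; the names `trusted_code` and `lying_len` are made up for illustration. The point is that the built-in namespace is shared, so a change made by one piece of code is seen by everyone else in the interpreter.

```python
import builtins


def trusted_code(items):
    # Looks up ``len`` at call time: module globals first, then builtins.
    return len(items)


original_len = builtins.len


def lying_len(obj):
    # Hypothetical malicious replacement: always reports one item too many.
    return original_len(obj) + 1


# Untrusted code rebinding a name in the shared built-in namespace...
builtins.len = lying_len
try:
    # ...changes the behaviour of unrelated, "trusted" code.
    tampered = trusted_code([1, 2, 3])
finally:
    # Put the real function back so the rest of the program is unaffected.
    builtins.len = original_len

restored = trusted_code([1, 2, 3])
```

Here `tampered` comes out as 4 and `restored` as 3, even though `trusted_code` itself was never touched.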

- What about the performance issues of the capabilities model overall?

Should be faster than an IBAC (identity-based access control) model since certain calls will not need to check the identity of the caller every time.

But I am not worrying about performance, I am worrying about correctness, so I did not try to make any performance claims.
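
To make the difference concrete, here is a rough sketch under my own assumptions; the class names `IdentityCheckedFile` and `ReadCapability` are hypothetical and nothing like this is in the draft. In the identity-based version every call consults an access list. In the capability version the decision was made once, when the reference was handed out, so holding the object is the permission.

```python
class IdentityCheckedFile:
    """IBAC-style: every call must prove who is asking."""

    def __init__(self, data, allowed_callers):
        self._data = data
        self._allowed = frozenset(allowed_callers)

    def read(self, caller_id):
        # The identity check is repeated on each and every call.
        if caller_id not in self._allowed:
            raise PermissionError(f"{caller_id!r} may not read this file")
        return self._data


class ReadCapability:
    """Capability-style: possessing a reference is the authorization."""

    def __init__(self, data):
        self._data = data

    def read(self):
        # No per-call check; access was decided when the object was given out.
        return self._data


ibac_file = IdentityCheckedFile("secret", allowed_callers={"alice"})
cap = ReadCapability("secret")
```

Note that in pure Python the `_data` attribute is still reachable by anyone holding the object, which is exactly the private-namespace problem discussed above, so treat this as an illustration of the two models and not as something secure.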

- I know what you meant to say but the paragraph about pythonicness
and the security model seems a little "fuzzy" to me. What are the
boundaries of the allowed changes for the security stuff?

Being "pythonic" is a fuzzy term in itself and Guido is the only person who can make definitive claims over what is and is not Pythonic.  As I have said, this doc was mostly written with python-dev in mind since they are the ones I have to convince to let this into the core and they all know the term.

But I have tacked in a sentence on what the term means.

- You don't say anything about networking and networked resources in
the list of the standard sandboxed interpreter

Nope.  Have not started worrying about that yet.  Just trying to get the basic model laid out.

- Suppose we have a .py module. Based on your security model we can
import it, right? When imported it generates a .pyc file. What happens
the second time we import it? Is the .pyc ignored? Is import not
allowed at all? We can't rely on the name file.pyc: an attacker who
knows that file.py is secure and that the second import is done against
file.pyc can replace the "secure" file.pyc with an insecure
implementation and harm the sandbox

It will be ignored.  But I am hoping that through rewriting the import machinery more control over generating .pyc files can be had (see Skip Montanaro's PEP on this; forget the number).  This is why exact details were left out of the implementation details.  I just wanted people to understand the approach to everything, not the concrete details of how it will be coded up.

- About "Filesystem information". Does the sandboxed interpreter need
to know all that information about file paths, files and so on? Can't
we reset those attributes to something arbitrary?

That is the point.  It is not that the sandbox needs to know it, it's that it needs to be hidden from the sandbox.

- About the sys module: I think the best way is to have a purged fake sys
module with only the stuff you need. PyPy has the concept of faked
modules too (although for a different reason)

OK.
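
One way the suggestion could look, as a sketch only: build a fresh module object and copy over a short whitelist of attributes. The whitelist `ALLOWED` and the helper `make_fake_sys` are hypothetical names picked for illustration, and which attributes are actually safe to expose is not something decided here.

```python
import sys
import types

# Hypothetical whitelist; the real set of safe attributes is an open question.
ALLOWED = ("maxsize", "byteorder", "version_info")


def make_fake_sys(allowed=ALLOWED):
    """Return a stand-in ``sys`` module exposing only whitelisted attributes."""
    fake = types.ModuleType("sys")
    for name in allowed:
        setattr(fake, name, getattr(sys, name))
    return fake


fake_sys = make_fake_sys()
```

Anything not on the list, such as `sys.path` or `sys.modules`, simply does not exist on `fake_sys`. On its own this hides nothing from code that can still reach the real module, so it only makes sense together with control over import.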

- About networking: what do you think about E's model of really
safe networking, protected remotable objects and safe RPC? Is that
model applicable to Python in some way? We can't use E's model
as a whole (asking people to generate a safe key and send it by email
is infeasible)

I have not looked at it.  I am also not trying to build an RPC system *and* a security model for Python.  That is just too much work right now.

- Is the protected memory model some kind of memory monitoring system?

Basically.  It just keeps a size_t on the memory cap and another on memory usage, and when memory is requested it makes sure that it won't go over the cap.  And when memory is freed the usage goes down.  It's very rough (hard to account for padding bits, etc. in C structs), but it should be good enough to prevent a program from hitting 800 MB when you really just wanted it to have 5 MB.
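
The bookkeeping described above, sketched in Python for readability. The real thing would live in C next to the allocator and use `size_t` counters; the class name `MemoryCap` and its methods are made up for this illustration.

```python
class MemoryCap:
    """Track memory usage against a fixed cap (a rough accounting sketch)."""

    def __init__(self, cap_bytes):
        self.cap = cap_bytes   # maximum number of bytes allowed
        self.usage = 0         # bytes currently accounted for

    def request(self, nbytes):
        """Return True and record the allocation if it fits under the cap."""
        if nbytes < 0:
            raise ValueError("allocation size must be non-negative")
        # Compare against the remaining room instead of adding first; in C
        # this ordering avoids overflowing the unsigned counter.
        if nbytes > self.cap - self.usage:
            return False
        self.usage += nbytes
        return True

    def release(self, nbytes):
        """Lower the recorded usage when memory is freed."""
        self.usage = max(0, self.usage - nbytes)


MB = 1024 * 1024
cap = MemoryCap(5 * MB)
first = cap.request(4 * MB)    # fits under the 5 MB cap
second = cap.request(2 * MB)   # would exceed the cap, so it is refused
```

With a 5 MB cap the first request is granted and the second is refused, leaving the recorded usage at 4 MB.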

I think that's all for the draft. I wrote these comments while
reading the document.

Hope some of these help

Thanks, Lawrence.

-Brett