[replying to both Ping and Michael in the same email]<br><br><div><span class="gmail_quote">On 7/6/06, <b class="gmail_sendername">Michael Chermside</b> <<a href="mailto:mcherm@mcherm.com">mcherm@mcherm.com</a>> wrote:
</span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">Ka-Ping Yee writes:<br>> i'm starting to think<br>> that it would be good to clarify what kinds of threats we are
<br>> trying to defend against, and specify what invariants we are<br>> intending to preserve.<br><br>Yes!<br><br>> So here are a couple of questions for clarification (some with my<br>> guesses as to their answers):
<br><br>Okay, I'll throw in my thoughts also.<br><br>> 1. When we say "restricted/untrusted/<whatever> interpreter" we<br>> don't really mean that the *interpreter* is untrusted, right?<br>> We mean that the Python code that runs in that interpreter is
<br>> untrusted (i.e. to be prevented from doing harm), right?<br><br>Agreed. My interpretation of the proposal was that interpreters<br>were either "sandboxed" or "trusted". "Sandboxed" means that there
<br>are security restrictions imposed at some level (perhaps even NO<br>restrictions). "Trusted" means that the interpreter implements<br>no security restrictions (beyond what CPython already implements,<br>which isn't much) and thus runs faster.
</blockquote><div><br>Yep. <br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">> 2. I'm assuming that the implementation of the Python interpreter
<br>> is always trusted<br><br>Sure... it's got to be.</blockquote><div><br>Yep. <br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
> What do<br>> we take the Trusted Computing Base to include? The Python VM<br>> implementation -- plus all the builtin objects and C modules?<br>> Plus the whole standard library?<br><br>My interpretation of Brett's proposal is that the CPython developers
<br>would try to ensure that the Python VM had no "security holes" when<br>running in sandboxed mode. Of course, we also "try" to ensure no<br>crashes are possible, and while we're quite good, we're not
<br>perfect.<br><br>Beyond that, all pure-python modules with source available (whether<br>in the stdlib or not) can be "trusted" because they run in a<br>sandboxed VM. All C modules are *up to the user*. Brett proposes
<br>to provide a default list of useful-but-believed-to-be-safe modules<br>in the stdlib, but the user can configure the C-module whitelist<br>to whatever she desires.</blockquote><div><br>Michael has it on the money.<br>
</div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">> 3. Is it part of the plan that we want to protect Python code from<br>> other Python code? For example, should a Python program/function
<br>> X be able to say "i want to launch/call program/function Y with<br>> *these* parameters and have it run under *these* limitations?"<br>> This has a big impact on the model.<br><br>Now *that* is a good question. I would say the answer is a partial
<br>"no", because there are pieces of Brett's security model that are<br>tied to the interpreter instance. Python code cannot launch another<br>interpreter (but perhaps it *should* be able to?), so it cannot<br>
modify those restrictions for new Python code it launches.<br><br>However, I would rather like to allow Python code to execute other<br>code with greater restrictions, although I would accept all kinds<br>of limitations and performance penalties to do so. I would be
<br>satisfied if the caller could restrict certain things (like web<br>and file access) but not others (like memory limits or use of<br>stdout). I would be satisfied if the caller paid huge overhead costs<br>of launching a separate interpreter -- heck, even a separate
<br>process. And if it is willing to launch a separate process, then<br>Brett's model works just fine: allow the calling code to start<br>a new (restricted) Python VM.</blockquote><div><br>The plan is that there is no sandboxed eval() that runs unsafe code from a trusted interpreter within its namespace. I hope to provide Python code access to running a sandboxed interpreter where you can pass in a string to be executed, but the namespace for that sandboxed interpreter will be fresh and will not carry over in any way from the trusted interpreter.
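<br><br>To make the "fresh namespace" point concrete, here is a minimal sketch in today's Python. It is not the proposed sandbox API: it only illustrates that code passed in as a string and run in a separate interpreter sees none of the caller's names. The helper name run_in_fresh_interpreter is hypothetical, and a plain child process applies no security restrictions at all.

```python
import subprocess
import sys


def run_in_fresh_interpreter(source, timeout=5.0):
    """Run `source` in a new Python process (hypothetical helper, NOT a sandbox).

    Returns (returncode, stdout, stderr). The child starts with its own
    empty namespace; nothing from the calling interpreter carries over.
    """
    proc = subprocess.run(
        [sys.executable, "-I", "-c", source],  # -I: isolated mode
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stdout, proc.stderr


# A name that exists only in the calling ("trusted") interpreter.
secret = "only in the caller"

# The child checks whether that name is visible in its own globals.
code, out, err = run_in_fresh_interpreter("print('secret' in globals())")
```

The child reports that the name does not exist, which is the namespace property described above. The actual restrictions (imports, file access and so on) would have to come from the sandboxed interpreter itself.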
<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">> We want to be able to guarantee that...<br>><br>> A. The interpreter will not crash no matter what Python code
<br>> it is given to execute.<br><br>Agreed. We already want to guarantee that, with the caveat that the<br>guarantee doesn't apply to a few special modules (like ctypes).</blockquote><div><br>Right, which is why I have been trying to plug the various known crashers that do not rely upon a specific extension module being imported.
<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">> B. Python programs running in different interpreters embedded<br>> in the same process cannot communicate with each other.
<br><br>I don't want to guarantee this, does someone else? It's<br>astonishingly hard... there are all kinds of clever "knock on the<br>walls" tricks. For instance, communicate by varying your CPU<br>utilization up and down in regular patterns.
<br><br>I'd be satisfied if they could pass information (perhaps even<br>someday provide a library making it *easy* to do so), but could<br>not pass unforgeable items like Python object references, open file<br>descriptors, and so forth.
</blockquote><div><br>Or at least cannot communicate without explicit allowances to do so.<br><br>As for knocking on the walls, if you protect access to that kind of information well, it shouldn't be a problem.<br></div><br>
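A rough illustration of "pass information, but not object references", again using a separate process as a stand-in for a second interpreter (an assumption made for the example, not how the proposal would work). The data crosses the boundary as JSON text over a pipe, so the other side only ever gets a copy. The names CHILD and send_by_value are hypothetical.

```python
import json
import subprocess
import sys

# Code run by the other interpreter: read JSON, change it, write it back.
CHILD = (
    "import json, sys; "
    "data = json.load(sys.stdin); "
    "data['n'] += 1; "
    "json.dump(data, sys.stdout)"
)


def send_by_value(payload):
    """Send `payload` to a fresh interpreter as JSON and return its reply."""
    proc = subprocess.run(
        [sys.executable, "-I", "-c", CHILD],
        input=json.dumps(payload),  # serialized copy, not a reference
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(proc.stdout)


original = {"n": 1}
reply = send_by_value(original)
```

The reply carries the incremented value while the caller's dict is untouched, because the child never held a reference to it. The pipe here is the kind of explicit allowance I mean.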
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">> C. Python programs running in different interpreters embedded<br>> in the same process cannot access each other's Python objects.
<br><br>I strengthen that slightly to all "unforgeable" items, not just<br>object references.</blockquote><div><br>I would change that to add the caveat that what is exposed by a C extension module attribute will be shared. That is an implementation detail of multiple interpreters.
<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">> D. A given piece of Python code cannot access or communicate<br>> with certain Python objects in the same interpreter.
<br>><br>> E. A given piece of Python code can access only a limited set<br>> of Python objects in the same interpreter.<br><br>Hmmm. I'm not sure.</blockquote><div><br>Not quite sure what you are getting at here, Ping. Are you asking about running code within an interpreter (sandboxed or not) that is restricted even further than the security settings the interpreter was given?
<br><br>These emails have convinced me to add a "Threat Model" section for the next draft of the design doc.<br></div><br>-Brett<br><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
-- Michael Chermside<br></blockquote></div><br>