<br><br><div><span class="gmail_quote">On 9/6/06, <b class="gmail_sendername">Ka-Ping Yee</b> &lt;<a href="mailto:python-dev@zesty.ca" target="_blank" onclick="return top.js.OpenExtLink(window,event,this)">python-dev@zesty.ca

</a>&gt; wrote:</span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

Hi Brett,<br><br>Here are some comments on your proposal.&nbsp;&nbsp;Sorry this took so long.<br>I apologize if any of these comments are out of date (but also look<br>forward to your answers to some of the questions, as they'll help

<br>me understand some more of the details of your proposal).&nbsp;&nbsp;Thanks!</blockquote><div><br>I think they are slightly outdated.&nbsp; The latest version of the doc is in the bcannon-objcap branch and is named securing_python.txt (

<a href="http://svn.python.org/view/python/branches/bcannon-objcap/securing_python.txt" target="_blank" onclick="return top.js.OpenExtLink(window,event,this)">http://svn.python.org/view/python/branches/bcannon-objcap/securing_python.txt

</a>).<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

&gt; Introduction<br>&gt; ///////////////////////////////////////<br>[...]<br>&gt; Throughout this document several terms are going to be used.&nbsp;&nbsp;A<br>&gt; &quot;sandboxed interpreter&quot; is one where the built-in namespace is not the

<br>&gt; same as that of an interpreter whose built-ins were unaltered, which<br>&gt; is called an &quot;unprotected interpreter&quot;.<br><br>Is this a definition or an implementation choice?&nbsp;&nbsp;As in, are you<br>defining &quot;sandboxed&quot; to mean &quot;with altered built-ins&quot; or just

<br>&quot;restricted in some way&quot;, and does the above mean to imply that<br>altering the built-ins is what triggers other kinds of restrictions<br>(as it did in Python's old restricted execution mode)?</blockquote><div>

<br>There is no &quot;triggering&quot; of other restrictions.&nbsp; This is an implementation choice.&nbsp; &quot;Sandboxed&quot; means &quot;with altered built-ins&quot;.<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

&gt; A &quot;bare interpreter&quot; is one where the built-in namespace has been<br>&gt; stripped down to the bare minimum needed to run any form of basic Python<br>&gt; program.&nbsp;&nbsp;This means that all atomic types (i.e., syntactically

<br>&gt; supported types), ``object``, and the exceptions provided by the<br>&gt; ``exceptions`` module are considered in the built-in namespace.&nbsp;&nbsp;There<br>&gt; have also been no imports executed in the interpreter.<br><br>

Is a &quot;bare interpreter&quot; just one example of a sandboxed interpreter, or are all sandboxed interpreters in your design initially bare (i.e. &quot;sandboxed&quot; = &quot;bare&quot; + zero or more granted authorities)?

</blockquote><div><br>You build up from a bare interpreter by adding in authorities (e.g., providing a wrapped version of open()) to reach the level of security you want.<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

&gt; The &quot;security domain&quot; is the boundary at which security is cared<br>&gt; about.&nbsp;&nbsp;For this discussion, it is the interpreter.<br><br>It might be clearer to say (if i understand correctly) &quot;Each interpreter

<br>is a separate security domain.&quot;<br></blockquote><div>&nbsp;</div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">Many interpreters can run within a single operating system process,

<br>right?</blockquote><div><br>Yes. <br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">&nbsp;&nbsp;Could you say a bit about what sort of concurrency model you

<br>have in mind?</blockquote><div><br>None specifically.&nbsp; Each new interpreter automatically runs in its own Python thread, so they have essentially the same concurrency as using the 'thread' module.<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

&nbsp;&nbsp;How would this interact (if at all) with use of the<br>existing threading functionality?</blockquote><div><br>See above. <br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

&gt; The &quot;powerbox&quot; is the thing that possesses the ultimate power in the<br>&gt; system.&nbsp;&nbsp;In our case it is the Python process.<br><br>This could also be the application process, right?</blockquote><div><br>If Python is embedded, yes.

<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">&gt; Rationale<br>&gt; ///////////////////////////////////////<br>[...]<br>&gt; For instance, think of an application that supports a plug-in system

<br>&gt; with Python as the language used for writing plug-ins.&nbsp;&nbsp;You do not<br>&gt; want to have to examine every plug-in you download to make sure that<br>&gt; it does not alter your filesystem if you can help it.&nbsp;&nbsp;With a proper

<br>&gt; security model and implementation in place this hindrance of having<br>&gt; to examine all code you execute should be alleviated.<br><br>I'm glad to have this use case set out early in the document, so the reader can keep it in mind as an example while reading about the model.

<br><br>&gt; Approaches to Security<br>&gt; ///////////////////////////////////////<br>&gt;<br>&gt; There are essentially two types of security: who-I-am<br>&gt; (permissions-based) security and what-I-have (authority-based)

<br>&gt; security.<br><br>As Mark Miller mentioned in another message, your descriptions of<br>&quot;who-I-am&quot; security and &quot;what-I-have&quot; security make sense, but<br>they don't correspond to &quot;permission&quot; vs. &quot;authority&quot;.&nbsp;&nbsp;They

<br>correspond to &quot;identity-based&quot; vs. &quot;authority-based&quot; security.</blockquote><div><br>Right.&nbsp; This was fixed the day Mark and Alan Karp made the comment.<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

&gt; Difficulties in Python for Object-Capabilities<br>&gt; //////////////////////////////////////////////<br>[...]<br>&gt; Three key requirements for providing a proper perimeter defence are

<br>&gt; private namespaces, immutable shared state across domains, and<br>&gt; unforgeable references.<br><br>Nice summary.<br><br>&gt; Problem of No Private Namespace<br>&gt; ===============================<br>[...]<br>

&gt; The Python language has no such thing as a private namespace.<br><br>Don't local scopes count as private namespaces?&nbsp;&nbsp;It seems clear that they aren't designed with the intention of being exposed, unlike other namespaces in Python.

</blockquote><div><br>Sort of.&nbsp; But you can still get access to them if you have an execution frame, and they are not persistent.&nbsp; Generators are worse since they store their execution frame with the generator itself, completely exposing the local namespace.
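<br><br>A minimal sketch of the generator case, using only standard CPython attributes (the names <tt>counter</tt> and <tt>secret</tt> are made up for illustration):<br>
<pre>
```python
def counter():
    secret = "local to the generator"
    yield 1
    yield 2

gen = counter()
next(gen)  # advance to the first yield so the local is bound

# The suspended frame hangs off the generator object itself,
# so anyone holding the generator can read its locals.
leaked = gen.gi_frame.f_locals["secret"]
print(leaked)
```
</pre>
Anyone who is handed the generator gets the frame along with it, so the local scope is not private in any useful sense.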

</div> <blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">&gt; It also makes providing security at the object level using<br>&gt; object-capabilities non-existent in pure Python code.

</blockquote><div><br></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">I don't think this is necessarily the case.&nbsp;&nbsp;No Python code i've

<br>ever seen expects to be able to invade the local scopes of other

<br>functions, so you could use them as private namespaces.&nbsp;&nbsp;There<br>are two ways i've seen to invade local scopes:<br><br>&nbsp;&nbsp;&nbsp;&nbsp;(a) Use gc.get_referents to get back from a cell object<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;to its contents.<br><br>&nbsp;&nbsp;&nbsp;&nbsp;(b) Compare the cell object to another cell object, thereby

<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;causing __eq__ to be invoked to compare the contents of<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;the cells.</blockquote><div><br>Or the execution frame, which is exposed directly on generators.&nbsp; But regardless, the comment was meant to describe Python as it stands, not to say that it couldn't possibly be tweaked somehow.
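<br><br>For reference, a small sketch of route (a) as it looks in current CPython, where <tt>func_closure</tt> is spelled <tt>__closure__</tt>.&nbsp; The function and variable names here are invented for the example:<br>
<pre>
```python
import gc

def make_getter():
    hidden = ["only the closure should see this"]

    def getter():
        return len(hidden)

    return getter

getter = make_getter()
cell = getter.__closure__[0]  # spelled func_closure in Python 2

# (a) The garbage collector will hand back whatever the cell refers to.
via_gc = gc.get_referents(cell)[0]

# The cell also exposes its contents directly as an attribute.
via_attr = cell.cell_contents

print(via_gc is via_attr)
```
</pre>
Both routes yield the very list the closure was supposed to keep to itself, which is why turning off access to the closure attribute (and to <tt>gc</tt>) would be needed before local scopes could serve as private namespaces.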

<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">So you could protect local scopes by prohibiting these or by<br>simply turning off access to func_closure.&nbsp;&nbsp;It's clear that hardly

<br>any code depends on these introspection features, so it would be<br>reasonable to turn them off in a sandboxed interpreter.&nbsp;&nbsp;(It seems<br>you would have to turn off some introspection features anyway in<br>order to have reliable import guards.)

</blockquote><br>Maybe this can be changed in the future, but this is more than I need at the moment, so I am not going to go down that path right now.&nbsp; But I added a quick mention of this.<br><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

&gt; Problem of Mutable Shared State<br>&gt; ===============================<br>[...]<br>&gt; Regardless, sharing of state that can be influenced by another<br>&gt; interpreter is not safe for object-capabilities.<br>

<br>Yup.<br><br>&gt; Threat Model<br>&gt; ///////////////////////////////////////<br><br>Good to see this specified here.&nbsp;&nbsp;I like the way you've broken this down.</blockquote><div><br>The current version has more details per point than the one you read.

<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">&gt; * An interpreter cannot gain abilities the Python process possesses

<br>&gt;&nbsp;&nbsp; without explicitly being given those abilities.<br><br>It would be good to enumerate which abilities you're referring to in<br>this item.&nbsp;&nbsp;For example, a bare interpreter should be able to allocate<br>memory and call most of the built-in functions, but should not be able

<br>to open network connections.<br><br>&gt; * An interpreter cannot influence another interpreter directly at the<br>&gt;&nbsp;&nbsp; Python level without explicitly allowing it.<br><br>You mean, without some other entity explicitly allowing it, right?

</blockquote><div><br>Yep. <br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">What would that other entity be -- presumably the interpreter that

<br>spawned both of these sub-interpreters?</blockquote><div><br>Sure.&nbsp; You could stick something in the built-in namespace of the sub-interpreter to use for communicating.<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

&gt; * An interpreter cannot use operating system resources without being<br>&gt;&nbsp;&nbsp; explicitly given those resources.<br>

<br>Okay.<br><br>&gt; * A bare Python interpreter is always trusted.<br><br>What does &quot;trusted&quot; mean in the above?</blockquote><div><br>It means that if Python source code can execute within a bare interpreter, it is considered safe code.&nbsp; This is covered in the new version of the doc.

</div> <blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">&gt; * Python bytecode is always distrusted.<br>&gt; * Pure Python source code is always safe on its own.

<br><br>It would be helpful to clarify &quot;safe&quot; here.&nbsp;&nbsp;I assume by &quot;safe&quot; you<br>mean that the Python source code can express whatever it wants,<br>including potentially dangerous activities, but when run in a bare

<br>or sandboxed interpreter it cannot have harmful effects.&nbsp;&nbsp;But then<br>in what sense does the &quot;safety&quot; have to do with the Python source code<br>rather than the restrictions on the interpreter?<br><br>Would it be correct to say:

<br><br>&nbsp;&nbsp;+ We want to guarantee that Python source code cannot violate<br>&nbsp;&nbsp;&nbsp;&nbsp;the restrictions in a restricted or bare interpreter.<br>&nbsp;&nbsp;+ We do not prevent arbitrary Python bytecode from violating<br>&nbsp;&nbsp;&nbsp;&nbsp;these restrictions, and assume that it can.

</blockquote><div><br></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">&gt;&nbsp;&nbsp;&nbsp;&nbsp; + Malicious abilities are derived from C extension modules,

<br>&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; built-in modules, and unsafe types implemented in C, not from<br>&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; pure Python source.<br><br>By &quot;malicious&quot; do you just mean &quot;anything that isn't accessible to

<br>a bare interpreter&quot;?</blockquote><div><br>Anything that could harm the system or interpreter. <br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

&gt; * A sub-interpreter started by another interpreter does not inherit<br>&gt;&nbsp;&nbsp; any state.<br><br>Do you envision a tree of interpreters and sub-interpreters?&nbsp;&nbsp;Can the levels of spawning get arbitrarily deep?

</blockquote><div><br>Yes and yes. <br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">If i am visualizing your model correctly, maybe it would be useful to

<br>introduce the term &quot;parent&quot;, where each interpreter has as its parent<br>either the Python process or another interpreter.&nbsp;&nbsp;Then you could say

<br>that each interpreter acquires authority only by explicit granting from its parent.</blockquote><div><br>You could, although there is no hierarchy at the implementation level.&nbsp; But it works in terms of who has a reference to whom and who gives each interpreter its authority.

<br>&nbsp;</div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">Then i have another question: can an interpreter acquire<br>authorities only when it is started, or can it acquire them while it is

running, and how?</blockquote><div> &nbsp;Well, whatever you want to do through the built-in namespace.&nbsp; So if you pass in a mutable object like a dict and add stuff to it on the fly, I don't see why you couldn't give new authorities on the fly.
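<br><br>A rough sketch of granting on the fly, using a plain dict as a stand-in for the built-in namespace handed to a child.&nbsp; This only illustrates the idea with stock <tt>eval</tt>; it is not the proposed Interpreter API and it is not a secure sandbox on its own.&nbsp; The names <tt>granted</tt> and <tt>run</tt> are made up:<br>
<pre>
```python
# Hypothetical illustration only: a plain dict stands in for the
# built-in namespace given to a child. This is NOT a secure sandbox.
granted = {"len": len}
child_globals = {"__builtins__": granted}

def run(source):
    """Evaluate source with only the currently granted names as built-ins."""
    return eval(source, child_globals)

first = run("len('abc')")  # allowed from the start

try:
    run("abs(-5)")  # not granted yet, so the name cannot be resolved
    denied = False
except NameError:
    denied = True

granted["abs"] = abs  # the parent grants a new authority on the fly
second = run("abs(-5)")

print(first, denied, second)
```
</pre>
Because the child only ever sees the dict the parent holds, adding a key to that dict is all it takes to hand over a new ability while the child is running.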

<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">&gt; Implementation<br>&gt; ///////////////////////////////////////<br>&gt;<br>

&gt; Guiding Principles<br>&gt; ========================<br>&gt;<br>&gt; To begin, the Python process garners all power as the powerbox.&nbsp;&nbsp;It is

<br>&gt; up to the process to initially hand out access to resources and<br>&gt; abilities to interpreters.&nbsp;&nbsp;This might take the form of an interpreter<br>&gt; with all abilities granted (i.e., a standard interpreter as launched

<br>&gt; when you execute Python), which then creates sub-interpreters with<br>&gt; sandboxed abilities.&nbsp;&nbsp;Another alternative is only creating<br>&gt; interpreters with sandboxed abilities (i.e., Python being embedded in

<br>

&gt; an application that only uses sandboxed interpreters).<br><br>This sounds like part of your design to me.&nbsp;&nbsp;It might help to have this earlier in the document (maybe even with an example diagram of a tree of interpreters).

</blockquote><div><br>Made Guiding Principles its own section and split off the bottom part of the section and put it under Implementation.<br><br></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

&gt; All security measures should never have to ask who an interpreter is.<br>&gt; This means that what abilities an interpreter has should not be stored<br>&gt; at the interpreter level when the security can use a proxy to protect

<br>&gt; a resource.&nbsp;&nbsp;This means that while supporting a memory cap can<br>&gt; have a per-interpreter setting that is checked (because access to the<br>&gt; operating system's memory allocator is not supported at the program

<br>&gt; level), protecting files and imports should not have such a per-interpreter<br>&gt; protection at such a low level (because those can have extension<br>&gt; module proxies to provide the security).<br><br>It might be good to declare two categories of resources -- those

<br>protected by object hiding and those protected by a per-interpreter setting -- and make lists.</blockquote><div><br>That is still rather unknown, since I am constantly finding things that are global to the process rather than to the interpreter, so making the list seems premature.

<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">&gt; Backwards-compatibility will not be a hindrance upon the design or<br>&gt; implementation of the security model.&nbsp;&nbsp;Because the security model will

<br>&gt; inherently remove resources and abilities that existing code expects,<br>&gt; it is not reasonable to expect existing code to work in a sandboxed<br>&gt; interpreter.<br><br>You might qualify the last statement a bit.&nbsp;&nbsp;For example, a Python

<br>implementation of a pure algorithm (e.g. string processing, data<br>compression, etc.) would still work in a sandboxed interpreter.</blockquote><div><br>I tossed in &quot;all&quot; to clarify. <br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

&gt; Keeping Python &quot;pythonic&quot; is required for all design decisions.

<br><br>As Lawrence Oluyede also mentioned, it would be helpful to say a<br>little more about what &quot;pythonic&quot; means.</blockquote><div><br>Done in the current version. <br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

&gt; Restricting what is in the built-in namespace and the safe-guarding<br>&gt; the interpreter (which includes safe-guarding the built-in types) is

<br>&gt; where security will come from.<br><br>Sounds good.<br><br>&gt; Abilities of a Standard Sandboxed Interpreter<br>&gt; =============================================<br>&gt; [...]<br>&gt; * You cannot open any files directly.

<br>&gt; * Importation<br>&gt;&nbsp;&nbsp;&nbsp;&nbsp; + You can import any pure Python module.<br>&gt;&nbsp;&nbsp;&nbsp;&nbsp; + You cannot import any Python bytecode module.<br>&gt;&nbsp;&nbsp;&nbsp;&nbsp; + You cannot import any C extension module.<br>&gt;&nbsp;&nbsp;&nbsp;&nbsp; + You cannot import any built-in module.

<br>&gt; * You cannot find out any information about the operating system you<br>&gt;&nbsp;&nbsp; are running on.<br>&gt; * Only safe built-ins are provided.<br><br>This looks reasonable.&nbsp;&nbsp;This is probably a good place to itemize<br>

exactly which built-ins are considered safe.<br><br>&gt; Imports<br>&gt; -------<br>&gt;<br>&gt; A proxy for protecting imports will be provided.&nbsp;&nbsp;This is done by<br>&gt; setting the ``__import__()`` function in the built-in namespace of the

<br>&gt; sandboxed interpreter to a proxied version of the function.<br>&gt;<br>&gt; The planned proxy will take in a passed-in function to use for the<br>&gt; import and a whitelist of C extension modules and built-in modules to

<br>&gt; allow importation of.<br><br>Presumably these are passed in to the proxy's constructor.</blockquote><div><br>The current plan is to expose the built-in namespace, imported modules, and sys module dict when creating an Interpreter instance.

<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">&gt; If an import would lead to loading an extension<br>&gt; or built-in module, it is checked against the whitelist and allowed

<br>&gt; to be imported based on that list.&nbsp;&nbsp;All .pyc and .pyo files will not<br>&gt; be imported.&nbsp;&nbsp;All .py files will be imported.<br><br>I'm unclear about this.&nbsp;&nbsp;Is the whitelist a list of module names only, or of filenames with extensions?

</blockquote><div><br>Have not decided, but probably module name. <br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">&nbsp;&nbsp;Does the normal path-searching process

<br>take place or can it be restricted in some way?</blockquote><div><br>Have not decided. <br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

&nbsp;&nbsp;Would it simplify the<br>security analysis to have the whitelist be a dictionary that maps module<br>names to absolute pathnames?</blockquote><div><br>Don't know.&nbsp; Protecting imports is the last thing I am going to implement since it is the trickiest. 
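<br><br>To make the module-name option concrete, here is a rough sketch of a whitelisting wrapper around <tt>__import__</tt>.&nbsp; It is an assumption-laden illustration rather than the planned proxy: it checks every import by top-level name (the proposal only whitelists C extension and built-in modules), does no path or .pyc handling, and <tt>make_import_proxy</tt> and <tt>guarded_import</tt> are invented names:<br>
<pre>
```python
import builtins

def make_import_proxy(real_import, allowed_names):
    """Return an __import__ replacement that only allows whitelisted top-level names."""
    allowed = frozenset(allowed_names)

    def guarded_import(name, globals=None, locals=None, fromlist=(), level=0):
        # Check the top-level package name against the whitelist.
        top_level = name.partition(".")[0]
        if top_level not in allowed:
            raise ImportError("import of %r is not whitelisted" % name)
        return real_import(name, globals, locals, fromlist, level)

    return guarded_import

guarded = make_import_proxy(builtins.__import__, ["math"])

# A dict of built-ins containing only the guarded import function.
sandbox_globals = {"__builtins__": {"__import__": guarded}}

exec("import math\nroot = math.sqrt(16)", sandbox_globals)

try:
    exec("import os", sandbox_globals)
    blocked = False
except ImportError:
    blocked = True

print(sandbox_globals["root"], blocked)
```
</pre>
Passing the real import function and the whitelist in as arguments mirrors the "passed-in function plus whitelist" shape described in the quoted text; mapping names to absolute paths instead would only change what <tt>allowed</tt> holds.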

<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">If both the .py and .pyc are present, the normal import would find the

<br>.pyc file; would the import proxy reject such an import or ignore it<br>and recompile the .py instead?</blockquote><div><br>Something along those lines. <br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

&gt; It must be warned that importing any C extension module is dangerous.<br><br>Right.<br><br>&gt; Implementing Import in Python

<br>&gt; +++++++++++++++++++++++++++++<br>&gt;<br>&gt; To help facilitate in the exposure of more of what importation<br>&gt; requires (and thus make implementing a proxy easier), the import<br>&gt; machinery should be rewritten in Python.<br><br>

This seems like a good idea.&nbsp;&nbsp;Can you identify which minimum essential pieces of the import machinery have to be written in C?</blockquote><div><br>Loading of C extensions, stating files, reading files, etc.&nbsp; Pretty much anything that requires help from the OS.

<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">&gt; Sanitizing Built-In Types<br>&gt; -------------------------<br>[...]<br>

&gt; Constructors<br>&gt; ++++++++++++<br>&gt;<br>&gt; Almost all of Python's built-in types<br>&gt; contain a constructor that allows code to create a new instance of a<br>&gt; type as long as you have the type itself.&nbsp;&nbsp;Unfortunately this does not

<br>&gt; work in an object-capabilities system without either providing a proxy<br>&gt; to the constructor or just turning it off.<br><br>The existence of the constructor isn't (by itself) the problem.<br>The problem is that both of the following are true:

<br><br>&nbsp;&nbsp;&nbsp;&nbsp;(a) From any object you can get its type object.<br>&nbsp;&nbsp;&nbsp;&nbsp;(b) Using any type object you can construct a new instance.<br><br>So, you can control this either by hiding the type object, separating the constructor from the type, or disabling the constructor.

</blockquote><div><br>I separated the constructor or initializer (tp_new or tp_init) into a factory function.<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

&gt; Types whose constructors are considered dangerous are:<br>&gt;<br>&gt; * ``file``<br>&gt;&nbsp;&nbsp;&nbsp;&nbsp; + Will definitely use the ``open()`` built-in.<br>&gt; * code objects<br>&gt; * XXX sockets?<br>&gt; * XXX type?<br>

&gt; * XXX<br><br>Looks good so far.&nbsp;&nbsp;Not sure i see what's dangerous about 'type'.</blockquote><div><br>That's why it has the question mark.&nbsp; =) <br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

&gt; Filesystem Information<br>&gt; ++++++++++++++++++++++<br>&gt;<br>&gt; When running code in a sandboxed interpreter, POLA suggests that you

<br>&gt; do not want to expose information about your environment on top of<br>&gt; protecting its use.&nbsp;&nbsp;This means that filesystem paths typically should<br>&gt; not be exposed.&nbsp;&nbsp;Unfortunately, Python exposes file paths all over the

<br>&gt; place:<br>&gt;<br>&gt; * Modules<br>&gt;&nbsp;&nbsp;&nbsp;&nbsp; + ``__file__`` attribute<br>&gt; * Code objects<br>&gt;&nbsp;&nbsp;&nbsp;&nbsp; + ``co_filename`` attribute<br>&gt; * Packages<br>&gt;&nbsp;&nbsp;&nbsp;&nbsp; + ``__path__`` attribute<br>&gt; * XXX<br>&gt;<br>

&gt; XXX how to expose safely?<br><br>It seems that in most cases, a single Python object is associated with<br>a single pathname.&nbsp;&nbsp;If that's true in general, one solution would be<br>to provide an introspection function named 'getpath' or something

<br>similar that would get the path associated with any object.&nbsp;&nbsp;This<br>function might go in a module containing all the introspection functions,<br>so imports of that module could be easily restricted.</blockquote><div>

<br>That is the current thinking. <br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">&gt; Mutable Shared State

<br>&gt; ++++++++++++++++++++<br>&gt;<br>&gt; Because built-in types are shared between interpreters, they cannot<br>&gt; expose any mutable shared state.&nbsp;&nbsp;Unfortunately, as it stands, some<br>&gt; do.&nbsp;&nbsp;Below is a list of types that share some form of dangerous state,

<br>&gt; how they share it, and how to fix the problem:<br>&gt;<br>&gt; * ``object``<br>&gt;&nbsp;&nbsp;&nbsp;&nbsp; + ``__subclasses__()`` function<br>&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; - Remove the function; never seen used in real-world code.<br>&gt; * XXX<br>

<br>Okay, more to work out here. :)</blockquote><div><br>Possibly.&nbsp; I might have to wait until I am much closer to being done to discover more places where mutable shared state is exposed in a bare interpreter, because I have not been able to think of any more. 
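<br><br>For anyone who has not run into it, a tiny sketch of why <tt>__subclasses__()</tt> counts as shared state (the class name <tt>Plugin</tt> is made up):<br>
<pre>
```python
class Plugin(object):
    """Stands in for a class defined by some other piece of code."""

# Any code that can name ``object`` can enumerate every live subclass,
# including ones it was never handed a reference to.
found = [cls for cls in object.__subclasses__() if cls.__name__ == "Plugin"]

print(Plugin in found)
```
</pre>
Since <tt>object</tt> is shared between interpreters, this list would let one interpreter discover classes created in another.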

<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">&gt; Perimeter Defences Between a Created Interpreter and Its Creator<br>&gt; ----------------------------------------------------------------

<br>&gt;<br>&gt; The plan is to allow interpreters to instantiate sandboxed

<br>&gt; interpreters safely.&nbsp;&nbsp;By using the creating interpreter's abilities to<br>&gt; provide abilities to the created interpreter, you make sure there is<br>&gt; no escalation in abilities.<br><br>Good.<br><br>&gt; * ``__del__`` created in sandboxed interpreter but object is cleaned

<br>&gt;&nbsp;&nbsp; up in unprotected interpreter.<br><br>How do you envision the launching of a sandboxed interpreter to look?<br>Could you sketch out some rough code examples?</blockquote><div><br>&gt;&gt;&gt; interp = interpreter.Interpreter()<br>&gt;&gt;&gt; interp.builtins['open'] = wrapped_open()<br>&gt;&gt;&gt; interp.sys_dict['path'] = []<br>&gt;&gt;&gt; interp.exec(&quot;2 + 3&quot;)<br>&nbsp;</div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

Were you thinking of<br>something like:<br><br>&nbsp;&nbsp;&nbsp;&nbsp;

sys.spawn(code, dict)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;code: a string containing Python source code<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;dict: the global namespace in which to run the code<br><br>If you allow the parent interpreter to pass mutable objects into the<br>

child interpreter, then the parent and child can already communicate

via the object, so '__del__' is a moot issue.&nbsp;&nbsp;Do you want to prevent all communication between parent and child?&nbsp;&nbsp;It's not obvious to me why that would be necessary.</blockquote><div> No, I don't since there should be a secure way to allow that.&nbsp; The __del__ worry came up from Guido pointing out you might be able to screw with it.&nbsp; But if you pass in something implemented in C you should be okay.

</div> <blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">&gt; * Using frames to walk the frame stack back to another interpreter.

<br><br>Could you just disable introspection of the frame stack?</blockquote><div><br>If you don't allow importing of 'sys' then yes, and that is planned.&nbsp; I just wanted to make sure I didn't forget this needs to be protected.
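<br><br>A short sketch of the kind of frame walking in question, using CPython's <tt>sys._getframe</tt> (the function names and the <tt>password</tt> local are invented for the example):<br>
<pre>
```python
import sys

def caller():
    password = "s3cret"  # a local the callee was never given
    return callee()

def callee():
    # Walk one step up the frame stack and read the caller's locals.
    parent_frame = sys._getframe().f_back
    return parent_frame.f_locals["password"]

stolen = caller()
print(stolen)
```
</pre>
Without access to 'sys' this particular route is closed, which is the point of not allowing the import.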

<br><br>I do need to check what a generator's frame exposes, though.<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">&gt; Making the ``sys`` Module Safe

<br>&gt; ------------------------------<br>[...]<br>&gt; This means that the ``sys`` module needs to have its safe information

<br>&gt; separated out from the unsafe settings.<br><br>Yes.<br><br>&gt; XXX separate modules, ``sys.settings`` and ``sys.info``, or strip<br>&gt; ``sys`` to settings and put info somewhere else?&nbsp;&nbsp;Or provide a method<br>

&gt; that will create a faked sys module that has the safe values copied

<br>&gt; into it?<br><br>I think the last suggestion above would lead to confusion.&nbsp;&nbsp;The two<br>groups should have two distinct names and it should be clear which<br>attribute goes with which group.</blockquote><div><br>

This is also complicated by the fact that some things are for the entire process while others are per interpreter.&nbsp; Might have to separate things out even more. <br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

&gt; Protecting I/O

<br>&gt; ++++++++++++++<br>&gt;<br>&gt; The ``print`` keyword and the built-ins ``raw_input()`` and<br>&gt; ``input()`` use the values stored in ``sys.stdout`` and ``sys.stdin``.<br>&gt; By exposing these attributes to the creating interpreter, one can set

<br>&gt; them to safe objects, such as instances of ``StringIO``.<br><br>Sounds good.<br><br>&gt; Safe Networking<br>&gt; ---------------<br>&gt;<br>&gt; XXX proxy on socket module, modify open() to be the constructor, etc.

<br><br>Lots more to think about here. :)</blockquote><div><br>Oh yeah.&nbsp; =) <br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">&gt; Protecting Memory Usage

<br>&gt; -----------------------<br>&gt;<br>&gt; To protect memory, low-level hooks into the memory allocator for<br>&gt; Python are needed.&nbsp;&nbsp;By hooking into the C API for memory allocation and

<br>&gt; deallocation a very rough running count of used memory can be kept.&nbsp;&nbsp;This<br>&gt; can be used to prevent sandboxed interpreters from using so much<br>&gt; memory that it impacts the overall performance of the system.<br><br>

Preventing denial-of-service is in general quite difficult, but i applaud the attempt.&nbsp;&nbsp;I agree with your decision to separate this</blockquote><div><br>The memory tracking has a proof-of-concept done in the bcannon-sandboxing branch.&nbsp; It is not perfect, but it does show how one could go about accounting for every byte of data in terms of what it is basically used for.
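<br><br>As a very loose illustration of keeping a running count, here is a sketch that uses the standard <tt>tracemalloc</tt> module as a stand-in.&nbsp; This is not what the branch does (that hooks the C allocator and would enforce a cap per interpreter); it only measures peak usage after the fact, and <tt>run_with_budget</tt> is a made-up helper:<br>
<pre>
```python
import tracemalloc

def run_with_budget(func, budget_bytes):
    """Run func and report (peak_bytes, stayed_within_budget).

    Stand-in only: tracemalloc observes allocations; it cannot stop them.
    """
    tracemalloc.start()
    try:
        func()
        _current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak, budget_bytes >= peak

# A list of a million references needs far more than a one-megabyte budget.
big_peak, big_ok = run_with_budget(lambda: [0] * 1000000, 1000000)

# Doing nothing stays comfortably inside the same budget.
small_peak, small_ok = run_with_budget(lambda: None, 1000000)

print(big_ok, small_ok)
```
</pre>
A real cap has to refuse the allocation at the moment it is requested, which is why the check belongs down in the allocator rather than in Python code like this.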

<br></div><br></div>-Brett<br>