[Python-checkins] r47249 - python/branches/bcannon-sandboxing/sandboxing_design_doc.txt
brett.cannon
python-checkins at python.org
Thu Jul 6 00:09:00 CEST 2006
Author: brett.cannon
Date: Thu Jul 6 00:08:59 2006
New Revision: 47249
Added:
python/branches/bcannon-sandboxing/sandboxing_design_doc.txt (contents, props changed)
Log:
Add initial draft of design doc (same as one initially sent to python-dev).
Added: python/branches/bcannon-sandboxing/sandboxing_design_doc.txt
==============================================================================
--- (empty file)
+++ python/branches/bcannon-sandboxing/sandboxing_design_doc.txt Thu Jul 6 00:08:59 2006
@@ -0,0 +1,998 @@
+Restricted Execution for Python
+#######################################
+
+About This Document
+=============================
+
+This document is meant to lay out the general design for re-introducing a
+restricted execution model for Python. This document should provide one with
+enough information to understand the goals for restricted execution, what
+considerations were made for the design, and the actual design itself. Design
+decisions should be clear and explain not only why they were chosen but
+possible drawbacks from taking that approach.
+
+
+Goal
+=============================
+
+A good restricted execution model provides enough protection to prevent
+malicious harm from coming to the system, and no more. Barriers should be
+minimized so as to allow most code that does not do anything that would be
+regarded as harmful to run unmodified.
+
+An important point to keep in mind when reading this document is that it is
+part of my (Brett Cannon's) Ph.D. dissertation. This means it is heavily
+geared toward restricted execution when the interpreter is working with Python
+code embedded in a web page. While great strides have been taken to keep the
+design general enough to allow all previous uses of the 'rexec' module
+[#rexec]_ to use the new design, that is not the primary goal. This means that
+if a design decision must be made between the embedded use case and sandboxing
+Python code in a Python application, the former will win out.
+
+Throughout this document, the term "resource" represents anything that
+deserves possible protection. This ranges from things that have a physical
+representation (e.g., memory) to things that are more abstract and specific to
+the interpreter (e.g., sys.path).
+
+When referring to the state of an interpreter, it is either "trusted" or
+"untrusted". A trusted interpreter has no restrictions imposed upon any
+resource. An untrusted interpreter has a restriction placed upon at least one
+resource.
+
+
+.. contents::
+
+
+Use Cases
+/////////////////////////////
+
+All use cases are based on how many untrusted or trusted interpreters are
+running in a single process.
+
+
+When the Interpreter Is Embedded
+================================
+
+Single Untrusted Interpreter
+----------------------------
+
+This use case is when an application embeds the interpreter and never has more
+than one interpreter running.
+
+The main security issue to watch out for is accidentally providing default
+abilities to the interpreter. There must also be protection against resources
+that the interpreter needs for general use under the covers leaking into the
+untrusted interpreter.
+
+
+Multiple Untrusted Interpreters
+-------------------------------
+
+This use case is when multiple interpreters, all untrusted at varying levels,
+need to run within a single application. This is the key use case that the
+proposed design targets.
+
+On top of the security issues from a single untrusted interpreter, there is one
+additional worry. Resources must not leak into other interpreters where they
+would be given escalated rights.
+
+
+Stand-Alone Python
+==================
+
+This use case is when someone has written a Python program that wants to
+execute Python code in one or more untrusted interpreters. This is the use
+case that 'rexec' attempted to fulfill.
+
+The added security issue for this use case (on top of the ones for the other
+use cases) is preventing something from the trusted interpreter leaking into an
+untrusted interpreter and having elevated permissions. With multiple untrusted
+interpreters one does not have to worry about preventing actions that are
+disallowed for all untrusted interpreters. With this use case one does have to
+worry about the binary distinction between trusted and untrusted interpreters
+running in the same process.
+
+
+Resources to Protect
+/////////////////////////////
+
+XXX Threading?
+XXX CPU?
+
+Filesystem
+===================
+
+The most obvious facet of a filesystem to protect is reading from it. One does
+not want what is stored in ``/etc/passwd`` to get out. And one also does not
+want writing to the disk unless explicitly allowed for basically the same
+reason; if someone can write ``/etc/passwd`` then they can set the password for
+the root account.
+
+But one must also protect information about the filesystem. This includes both
+the filesystem layout and permissions on files. This means pathnames need to
+be properly hidden from an untrusted interpreter.
+
+
+Physical Resources
+===================
+
+Memory should be protected. It is a limited resource on the system that can
+have an impact on other running programs if it is exhausted. Being able to
+restrict the use of memory would help alleviate issues from denial-of-service
+(DoS) attacks.
+
+
+Networking
+===================
+
+Networking is somewhat like the filesystem in terms of wanting similar
+protections. You do not want to let untrusted code make tons of socket
+connections or accept them to do possibly nefarious things (e.g., acting as a
+zombie).
+
+You also want to prevent finding out information about the network you are
+connected to. This includes doing DNS resolution since that allows one to find
+out what addresses your intranet has or what subnets you use.
+
+
+Interpreter
+===================
+
+One must make sure that the interpreter is not harmed in any way. There are
+several ways to possibly do this. One is generating hostile bytecode. Another
+is some buffer overflow. In general any ability to crash the interpreter is
+unacceptable.
+
+There is also the issue of taking it over. If one is able to gain control of
+the overall process through the interpreter then heightened abilities could be
+gained.
+
+
+Types of Security
+///////////////////////////////////////
+
+As with most things, there are multiple approaches one can take to tackle a
+problem. Security is no exception. In general there seem to be two approaches
+to protecting resources.
+
+
+Resource Hiding
+=============================
+
+By never giving code a chance to access a resource, you prevent it from being
+(ab)used. This is the idea behind resource hiding. This can help minimize
+security checks by only checking if someone should be given a resource. By
+having possession of a resource be what determines if one should be allowed to
+use it you minimize the checks to only when a resource is handed out.
+
+This can be viewed as a passive system for security. Once a resource has been
+given to code there are no more checks to make sure the security model is not
+being violated.
+
+The most common implementation of resource hiding is capabilities. In this
+type of system a resource's reference acts as a ticket that represents the right
+to use the resource. Once code has a reference it is considered to have full
+use of the resource it represents and no further security checks are
+performed.
+
+To allow customizable restrictions one can pass references to wrappers of
+resources. This allows one to provide custom security to resources instead of
+requiring an all-or-nothing approach.
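+
+As a rough illustration of handing out a wrapper instead of the resource
+itself, here is a minimal Python sketch. The ``ReadOnlyFile`` class is
+hypothetical and not part of any proposed API; it is written for a modern
+Python 3 interpreter. It also shows why pure-Python hiding is fragile: the
+wrapped object can still be reached through reflection.
+
```python
import io


class ReadOnlyFile:
    """Hypothetical capability wrapper: exposes only read access."""

    def __init__(self, wrapped):
        # Name-mangled, but still reachable via reflection in pure Python.
        self.__wrapped = wrapped

    def read(self, size=-1):
        # No security check here: holding the reference is the permission.
        return self.__wrapped.read(size)


full = io.StringIO("secret data")      # full-power resource
limited = ReadOnlyFile(full)           # reduced-power reference to hand out

text = limited.read()
can_write = hasattr(limited, "write")
# Reflection still finds the wrapped object, so the hiding is not airtight.
leaked = limited._ReadOnlyFile__wrapped
```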
+
+The problem with capabilities is that it requires a way to control access to
+references. In languages such as Java that use a capability-based security
+system, namespaces provide the protection. By having private attributes and
+compartmentalized namespaces, references cannot be reached without explicit
+permission.
+
+For instance, Java has a ClassLoader class that one can call to obtain a
+desired reference. The class does a security check to make sure the code
+should be allowed to access the resource, and then returns a reference as
+appropriate. And with private attributes in objects and packages not providing
+global attributes you can effectively hide references to prevent security
+breaches.
+
+To use an analogy, imagine you are providing security for your home. With
+capabilities, security comes from no one having any way to know where your
+house is without being told where it is; a reference to its location. Someone
+might be able to ask a guard (e.g., Java's ClassLoader) for a map, but if the
+guard refuses there is no way to guess the location without being told. But
+once someone knows where it is, they have complete use of the house.
+
+And that complete access is an issue with a capability system. If someone
+plays a little loose with a reference for a resource then you run the risk of
+it getting out. Once a reference leaves your hands it becomes difficult to
+revoke the right to use that resource. A capability system can be designed to
+do a check every time a reference is handed to a new object, but that can be
+difficult to do properly when grafting a new way to handle resources on to an
+existing system such as Python, since the check is then needed not only at the
+point of requesting a reference but also at plain assignment time.
+
+
+Resource Crippling
+=============================
+
+Another approach to security is to provide constant, proactive security
+checking of rights to use a resource. One can have a resource perform a
+security check every time someone tries to use a method on that resource. This
+pushes the security check to a lower level; from a reference level to the
+method level.
+
+By performing the security check every time a resource's method is called, the
+worry of a resource's reference leaking out to insecure code is alleviated,
+since the resource cannot be used without authorization regardless of how the
+reference was obtained. This does add extra overhead, though, by having to do
+so many security checks.
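+
+A minimal Python sketch of the method-level check follows. The names
+``CrippledFile`` and ``AccessDenied`` are hypothetical (the latter stands in
+for the exception this document leaves as XXX). Because the check runs on
+every call, a right can be revoked even after the reference has leaked. In
+pure Python the policy object is itself reachable by the holder, so this only
+models the behaviour; it is not a secure implementation.
+
```python
class AccessDenied(Exception):
    """Hypothetical stand-in for the unnamed 'XXX' exception."""


class CrippledFile:
    """Hypothetical resource that re-checks policy on every method call."""

    def __init__(self, data, policy):
        self._data = data
        self._policy = policy      # mutable set of allowed operation names

    def _check(self, operation):
        if operation not in self._policy:
            raise AccessDenied(operation)

    def read(self):
        self._check("read")
        return self._data

    def write(self, text):
        self._check("write")
        self._data += text


policy = {"read"}
resource = CrippledFile("hello", policy)
first = resource.read()

policy.discard("read")         # revoke the right after the reference leaked
try:
    resource.read()
    revoked = False
except AccessDenied:
    revoked = True
```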
+
+FreeBSD's jail system provides a system similar to this. Various system calls
+allow for basic usage, but knowing of the system call is not enough to grant
+usage. Every invocation of a system call requires checking that the proper
+rights have been granted to the user in order to allow the system call to
+perform its action.
+
+An even better example in FreeBSD's jail system is its protection of sockets.
+One can only bind a single IP address to a jail. Any attempt to bind more, or
+to use an address other than the one that was granted, is prevented. The check
+is performed at every call involving the one granted IP address.
+
+Using our home analogy, everyone in the world can know where your home is. But
+to access any door in your home, you have to pass a security check. The
+overhead is higher and slows down your movement in your home, but not caring if
+perfect strangers know where your home is prevents the worry of your address
+leaking out to the world.
+
+
+The 'rexec' Module
+///////////////////////////////////////
+
+The 'rexec' module [#rexec]_ was based on the design used by Safe-Tcl
+[#safe-tcl]_. The design was essentially a capability system. Safe-Tcl
+allowed you to launch a separate interpreter where its global functions were
+specified at creation time. This prevented one from having any abilities that
+were not explicitly provided.
+
+For 'rexec', the Safe-Tcl model was tweaked to better match Python's situation.
+An RExec object represented a restricted environment. Imports were checked
+against a whitelist of modules. You could also restrict the type of modules to
+import based on whether they were Python source, bytecode, or C extensions.
+Built-ins were allowed except for a blacklist of built-ins to not provide.
+Several other protections were provided; see documentation for the complete
+list.
+
+With an RExec object created, one could pass in strings of code to be executed
+and have the result returned. One could execute code based on whether stdin,
+stdout, and stderr were provided or not.
+
+The ultimate undoing of the 'rexec' module was its handling of access to
+objects that in normal Python require no direct action to reach. Importing
+modules requires a direct action, and thus can be protected against directly in
+the import machinery. But built-ins are accessible by default and require no
+direct action to access in normal Python; you just use their name since they
+are provided in all namespaces.
+
+For instance, in a restricted interpreter, one only had to do
+``del __builtins__`` to gain access to the full set of built-ins. Another way
+is through using the gc module:
+``gc.get_referrers(''.__class__.__bases__[0])[6]['file']``. While both of
+these could be fixed (the former being a bug in 'rexec' and the latter fixed
+by not allowing gc to be imported), they are examples of how things that do
+not require proactive actions on the part of the programmer in normal Python
+to access tend to leak out. This is an unfortunate side-effect of having all
+of that wonderful reflection in Python.
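+
+The same kind of leak can be reproduced on a modern Python 3 interpreter. The
+sketch below is an analogue of the examples above, not the original 'rexec'
+exploit: it evaluates expressions with an empty set of built-ins, yet plain
+attribute access on a literal still reaches ``object`` and, from there, every
+class loaded in the interpreter.
+
```python
# Evaluate expressions with no built-in names available at all.
empty_builtins = {"__builtins__": {}}

# A literal's class chain still leads to 'object' ...
root = eval("().__class__.__bases__[0]", empty_builtins)

# ... and from there to every class currently loaded in the interpreter.
subclasses = eval("().__class__.__bases__[0].__subclasses__()", empty_builtins)
```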
+
+There is also the issue that 'rexec' was written in Python, which presents its
+own problems.
+
+Much has been learned since 'rexec' was written about how Python tends to be
+used and where security issues tend to appear. Essentially Python's dynamic
+nature does not lend itself very well to passive security measures since the
+reflection abilities in the language lend themselves to getting around
+non-proactive security checks.
+
+
+The Proposed Approach
+///////////////////////////////////////
+
+In light of where 'rexec' succeeded and failed along with what is known about
+the two main types of security and how Python tends to operate, the following
+is a proposal on how to secure Python for restricted execution.
+
+First, security will be provided at the C level. By taking advantage of the
+language barrier of accessing C code from Python without explicit allowance
+(i.e., ignoring ctypes [#ctypes]_), direct manipulation of the various security
+checks can be substantially reduced and controlled.
+
+Second, all proactive actions that code can do to gain access to resources will
+be protected through resource hiding. By having to go through Python to get to
+something (e.g., modules), a security check can be put in place to deny access
+as appropriate (this also ties into the separation between interpreters,
+discussed below).
+
+Third, any resource that is usually accessible by default will use resource
+crippling. Instead of worrying about hiding a resource that is available by
+default (e.g., 'file' type), security checks within the resource will prevent
+misuse. Crippling can also be used for resources where an object could be
+desired, but not at its full capacity (e.g., sockets).
+
+Performance should not be too much of an issue for resource crippling. Its
+main use is for I/O types; files and sockets. Since operations on these types
+are I/O bound and not CPU bound, the overhead for doing the security check
+should be a wash overall.
+
+Fourth, the restrictions separating multiple interpreters within a single
+process will be utilized. This helps prevent the leaking of objects into
+different interpreters with escalated privileges. Python source code
+modules are reloaded for each interpreter, preventing an object that does not
+have resource crippling from being leaked into another interpreter unless
+explicitly allowed. C extension modules are shared by not reloading them
+between interpreters, but this is considered in the security design.
+
+Fifth, Python source code is always trusted. Damage to a system is considered
+to be done from either hostile bytecode or at the C level. Thus protecting the
+interpreter and extension modules is the great worry, not Python source code.
+Python bytecode files, on the other hand, are considered inherently unsafe and
+will never be imported directly.
+
+Attempts to perform an action that is not allowed by the security policy will
+raise an XXX exception (or subclass thereof) as appropriate.
+
+
+Implementation Details
+===============================
+
+XXX prefix/module name; Restrict, Secure, Sandbox? Different tense?
+XXX C APIs use abstract names (e.g., string, integer) since it has not been
+decided if Python objects or C types (e.g., PyStringObject vs. char *) will be
+used
+
+Support for untrusted interpreters will be a compilation flag. This allows the
+more common case of people not caring about protections to not have a
+performance hindrance when not desired. And even when Python is compiled for
+untrusted interpreter restrictions, when the running interpreter *is* trusted,
+there will be no accidental triggers of protections. This means that
+developers should be liberal with the security protections without worrying
+about there being issues for interpreters that do not need/want the protection.
+
+At the Python level, the __restricted__ built-in will be set based on whether
+the interpreter is untrusted or not. This will be set for *all* interpreters,
+regardless of whether untrusted interpreter support was compiled in or not.
+
+For setting what is to be protected, the XXX<pointer to interpreter> for the
+untrusted interpreter must be passed in. This makes the protection very
+explicit and helps make sure you set protections for the exact interpreter you
+mean to.
+
+The functions for checking for permissions are actually macros that take
+in at least an error return value for the function calling the macro. This
+allows the macro to return for the caller if the check failed and cause the XXX
+exception to be propagated. This helps eliminate any coding errors from
+incorrectly checking a return value on a rights-checking function call. For
+the rare case where this functionality is disliked, just make the check in a
+utility function and check that function's return value (but this is strongly
+discouraged!).
+
+
+API
+--------------
+
+* interpreter PyXXX_NewInterpreter()
+ Return a new interpreter that is considered untrusted. There is no
+ corresponding PyXXX_EndInterpreter() as Py_EndInterpreter() will be taught
+ how to handle untrusted interpreters.
+
+* PyXXX_Trusted(error_return)
+ Macro that has the caller return with 'error_return' if the interpreter is
+ not a trusted one.
+
+
+Memory
+=============================
+
+Protection
+--------------
+
+A memory cap will be allowed.
+
+Modification to pymalloc will be needed to properly keep track of the
+allocation and freeing of memory. The same goes for the macros around the
+system malloc()/free() functions. This provides a platform-independent system
+for protection instead of relying on the operating system providing a service
+for capping memory usage of a process. It also allows the protection to be at
+the interpreter level instead of at the process level.
+
+
+Why
+--------------
+
+Guarding against excessive memory usage allows one to make sure that a DoS
+attack against the system's memory is prevented.
+
+
+Possible Security Flaws
+-----------------------
+
+If code makes direct calls to malloc/free instead of using the proper PyMem_*()
+macros then the security check will be circumvented. But C code is *supposed*
+to use the proper macros or pymalloc and thus this issue is not with the
+security model but with code not following Python coding standards.
+
+
+API
+--------------
+
+* int PyXXX_SetMemoryCap(interpreter, integer)
+ Set the memory cap for an untrusted interpreter. If the interpreter is not
+ untrusted, return NULL.
+
+* PyXXX_MemoryAlloc(integer, error_return)
+ Macro to increase the amount of memory that the running untrusted
+ interpreter is reported to be using. If the increase puts the total count
+ past the set limit, raise an XXX exception and cause the calling function
+ to return with the value of error_return. For trusted interpreters or
+ untrusted interpreters where a cap has not been set, the macro does
+ nothing.
+
+* int PyXXX_MemoryFree(integer)
+ Decrease the current running interpreter's allocated memory. If this puts
+ the amount of allocated memory below 0, raise an XXX exception and return
+ NULL. For trusted interpreters or untrusted interpreters where there is no
+ memory cap, this does nothing.
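+
+The bookkeeping described by these three entries can be modelled in a few
+lines of Python. This is only a sketch of the accounting rules, assuming a
+hypothetical ``MemoryAccount`` class and a ``MemoryCapExceeded`` exception in
+place of XXX; the real checks are proposed as C-level functions and macros
+around pymalloc.
+
```python
class MemoryCapExceeded(Exception):
    """Hypothetical stand-in for the unnamed 'XXX' exception."""


class MemoryAccount:
    """Hypothetical per-interpreter counter; cap=None means no cap is set."""

    def __init__(self, cap=None):
        self.cap = cap
        self.used = 0

    def alloc(self, size):
        if self.cap is None:
            return                      # uncapped: do nothing
        if self.used + size > self.cap:
            raise MemoryCapExceeded(f"{self.used + size} > {self.cap}")
        self.used += size

    def free(self, size):
        if self.cap is None:
            return
        if self.used - size < 0:
            raise MemoryCapExceeded("freed more than was allocated")
        self.used -= size


account = MemoryAccount(cap=1024)
account.alloc(1000)
account.free(200)
try:
    account.alloc(500)                  # 800 + 500 exceeds the 1024 cap
    over_cap = False
except MemoryCapExceeded:
    over_cap = True
```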
+
+
+CPU
+=============================
+XXX Needed? Difficult to get right for all platforms. Would have to be very
+platform-specific.
+
+
+Reading/Writing Files
+=============================
+
+Protection
+--------------
+
+The 'file' type will be resource crippled. The user may specify files or
+directories that are acceptable to be opened for reading, writing, or both.
+
+All operations that either read, write, or provide info on a file will require
+a security check to make sure that it is allowed for the file that the 'file'
+object represents. This includes the 'file' type's constructor raising XXX
+instead of an IOError stating that a file does not exist, so that information
+about the filesystem is not improperly provided.
+
+The security check will be done for all 'file' objects regardless of where the
+'file' object originated. This prevents issues if the 'file' type or an
+instance of it was accidentally made available to an untrusted interpreter.
+
+
+Why
+--------------
+
+Allowing anyone to be able to arbitrarily read, write, or learn about the
+layout of your filesystem is extremely dangerous. It can lead to loss of data
+or data being exposed to people who should not have access.
+
+
+Possible Security Flaws
+-----------------------
+
+Assuming that the method-level checks are correct and control over which
+files/directories are allowed is not exposed, 'file' object protection is
+secure, even when a 'file' object is leaked from a trusted interpreter to an
+untrusted one.
+
+
+API
+--------------
+
+* int PyXXX_AllowFile(interpreter, path, mode)
+ Add a file that is allowed to be opened in 'mode' by the 'file' object. If
+ the interpreter is not untrusted then return NULL.
+
+* int PyXXX_AllowDirectory(interpreter, path, mode)
+ Add a directory that is allowed to have files opened in 'mode' by the
+ 'file' object. This includes both pre-existing files and any new files
+ created by the 'file' object.
+ XXX allow for creating/reading subdirectories?
+
+* PyXXX_CheckPath(path, mode, error_return)
+ Macro that causes the caller to return with 'error_return' and XXX as the
+ exception if the specified path with 'mode' is not allowed. For trusted
+ interpreters, the macro does nothing.
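+
+One way the allow/check pairing could behave is sketched below in Python. The
+``PathPolicy`` class and its method names are hypothetical. The sketch assumes
+POSIX-style paths, treats 'mode' as a string of single-character flags, and
+only matches files directly inside an allowed directory, since the question of
+subdirectories is still open above. Normalization here is purely lexical and
+does not resolve symbolic links, which a real implementation would have to
+handle.
+
```python
import posixpath


class PathPolicy:
    """Hypothetical allow-list keyed on normalized POSIX paths."""

    def __init__(self):
        self.files = {}         # path -> set of allowed mode characters
        self.directories = {}   # directory path -> set of allowed modes

    def allow_file(self, path, mode):
        self.files.setdefault(posixpath.normpath(path), set()).update(mode)

    def allow_directory(self, path, mode):
        self.directories.setdefault(posixpath.normpath(path), set()).update(mode)

    def check_path(self, path, mode):
        requested = set(mode)
        if not requested:
            return False
        # Normalize first so '..' segments cannot escape an allowed directory.
        path = posixpath.normpath(path)
        if requested <= self.files.get(path, set()):
            return True
        parent = posixpath.dirname(path)
        return requested <= self.directories.get(parent, set())


policy = PathPolicy()
policy.allow_file("/srv/app/config.ini", "r")
policy.allow_directory("/srv/app/scratch", "rw")
```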
+
+
+Extension Module Importation
+============================
+
+Protection
+--------------
+
+A whitelist of extension modules that may be imported must be provided. A
+default set is given for stdlib modules known to be safe.
+
+A check in the import machinery will check that a specified module name is
+allowed based on the type of module (Python source, Python bytecode, or
+extension module). Python bytecode files are never directly imported because
+of the possibility of hostile bytecode being present. Python source is always
+trusted based on the assumption that all resource harm is eventually done at
+the C level, thus Python code directly cannot cause harm. Thus only C
+extension modules need to be checked against the whitelist.
+
+The requested extension module name is checked in order to make sure that it
+is on the whitelist if it is a C extension module. If the name is not on the
+whitelist an XXX exception is raised. Otherwise the import is allowed.
+
+Even if a Python source code module imports a C extension module in a trusted
+interpreter it is not a problem since the Python source code module is reloaded
+in the untrusted interpreter. When that Python source module is freshly
+imported the normal import check will be triggered to prevent the C extension
+module from becoming available to the untrusted interpreter.
+
+For the 'os' module, a special restricted version will be used if the proper
+C extension module providing the correct abilities is not allowed. This will
+default to '/' as the path separator and provide as many reasonable abilities
+as possible from a pure Python module.
+
+The 'sys' module is specially addressed in
+`Changing the Behaviour of the Interpreter`_.
+
+By default, the whitelisted modules are:
+
+* XXX work off of rexec whitelist?
+
+
+Why
+--------------
+
+Because C code is considered unsafe, its use should be regulated. By using a
+whitelist it allows one to explicitly decide that a C extension module should
+be considered safe.
+
+
+Possible Security Flaws
+-----------------------
+
+If a trusted C extension module imports an untrusted C extension module and
+makes it an attribute of the trusted module there will be a breach in security.
+Luckily this is a rarity in extension modules.
+
+There is also the issue of a C extension module calling the C API of an
+untrusted C extension module.
+
+Lastly, if a trusted C extension module is loaded in a trusted interpreter and
+then loaded into an untrusted interpreter, any security checks in the module's
+init*() function are not performed again, so resources opened during module
+initialization escape those checks.
+
+All of these issues can be handled by never blindly whitelisting a C extension
+module. Added support for dealing with C extension modules comes in the form
+of `Extension Module Crippling`_.
+
+API
+--------------
+
+* int PyXXX_AllowModule(interpreter, module_name)
+ Allow the untrusted interpreter to import 'module_name'. If the
+ interpreter is not untrusted, return NULL.
+ XXX sub-modules in packages allowed implicitly? Or have to list all
+ modules explicitly?
+
+* int PyXXX_BlockModule(interpreter, module_name)
+ Remove the specified module from the whitelist. Used to remove modules
+ that are allowed by default. If called on a trusted interpreter, returns
+ NULL.
+
+* PyXXX_CheckModule(module_name, error_return)
+ Macro that causes the caller to return with 'error_return' and sets the
+ exception XXX if the specified module cannot be imported. For trusted
+ interpreters the macro does nothing.
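+
+The allow/block/check trio can be modelled in Python as follows. The
+``ModuleWhitelist`` class and ``ModuleNotAllowed`` exception are hypothetical
+names. Unlike the design above, which only checks C extension modules, this
+sketch checks every requested name, and it uses ``importlib.import_module``
+rather than hooking the import machinery itself.
+
```python
import importlib


class ModuleNotAllowed(ImportError):
    """Hypothetical stand-in for the unnamed 'XXX' exception."""


class ModuleWhitelist:
    """Hypothetical per-interpreter whitelist of importable module names."""

    def __init__(self, defaults=()):
        self.allowed = set(defaults)

    def allow_module(self, name):
        self.allowed.add(name)

    def block_module(self, name):
        self.allowed.discard(name)

    def guarded_import(self, name):
        # The check happens before the import machinery is touched at all.
        if name not in self.allowed:
            raise ModuleNotAllowed(name)
        return importlib.import_module(name)


whitelist = ModuleWhitelist(defaults={"math", "time"})
whitelist.block_module("time")          # remove a module allowed by default
math_module = whitelist.guarded_import("math")
try:
    whitelist.guarded_import("time")
    blocked = False
except ModuleNotAllowed:
    blocked = True
```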
+
+
+Extension Module Crippling
+==========================
+
+Protection
+--------------
+
+By providing a C API for checking for allowed abilities, modules that have some
+useful functionality can do proper security checks for those functions that
+could provide insecure abilities while allowing safe code to be used (and thus
+not fully deny importation).
+
+
+Why
+--------------
+
+Consider a module that provides a string processing ability. If that module
+provides a single convenience function that reads its input string from a file
+(with a specified path), the whole module should not be blocked from being
+used, just that convenience function. By whitelisting the module but having a
+security check on the one problem function, the user can still gain access to
+the safe functions. Even better, the unsafe function can be allowed if the
+security checks pass.
+
+
+Possible Security Flaws
+-----------------------
+
+If a C extension module developer incorrectly implements the security checks
+for the unsafe functions it could lead to undesired abilities.
+
+
+API
+--------------
+
+Use PyXXX_Trusted() to protect unsafe code from being executed.
+
+
+Hostile Bytecode
+=============================
+
+Protection
+--------------
+
+The code object's constructor is not callable from Python. Importation of .pyc
+and .pyo files is also prohibited.
+
+
+Why
+--------------
+
+Without implementing a bytecode verification tool, there is no way of making
+sure that bytecode does not jump outside its bounds, thus possibly executing
+malicious code. It also presents the possibility of crashing the interpreter.
+
+
+Possible Security Flaws
+-----------------------
+
+None known.
+
+
+API
+--------------
+
+None.
+
+
+Changing the Behaviour of the Interpreter
+=========================================
+
+Protection
+--------------
+
+Only a subset of the 'sys' module will be made available to untrusted
+interpreters. Things to allow from the sys module:
+
+* byteorder
+* subversion
+* copyright
+* displayhook
+* excepthook
+* __displayhook__
+* __excepthook__
+* exc_info
+* exc_clear
+* exit
+* getdefaultencoding
+* _getframe
+* hexversion
+* last_type
+* last_value
+* last_traceback
+* maxint
+* maxunicode
+* modules
+* stdin # See `Stdin, Stdout, and Stderr`_.
+* stdout
+* stderr
+* __stdin__ # See `Stdin, Stdout, and Stderr`_ XXX Perhaps not needed?
+* __stdout__
+* __stderr__
+* version
+* api_version
+
+
+Why
+--------------
+
+Filesystem information must be removed. Any settings that could
+possibly lead to a DoS attack (e.g., sys.setrecursionlimit()) or risk crashing
+the interpreter must also be removed.
+
+
+Possible Security Flaws
+-----------------------
+
+Exposing something that could lead to future security problems (e.g., a way to
+crash the interpreter).
+
+
+API
+--------------
+
+None.
+
+
+Socket Usage
+=============================
+
+Protection
+--------------
+
+Allow sending and receiving data to/from specific IP addresses on specific
+ports.
+
+
+Why
+--------------
+
+Allowing arbitrary sending of data over sockets can lead to DoS attacks on the
+network and other machines. Limiting accepting data prevents your machine from
+being attacked by accepting malicious network connections. It also allows you
+to know exactly where communication is going to and coming from.
+
+
+Possible Security Flaws
+-----------------------
+
+Someone could manage to influence the DNS server being used, and thereby
+control which IP addresses result from a DNS lookup.
+
+
+API
+--------------
+
+* int PyXXX_AllowIPAddress(interpreter, IP, port)
+ Allow the untrusted interpreter to send/receive to the specified IP
+ address on the specified port. If the interpreter is not untrusted,
+ return NULL.
+
+* PyXXX_CheckIPAddress(IP, port, error_return)
+ Macro to verify that the specified IP address on the specified port is
+ allowed to be communicated with. If not, cause the caller to return with
+ 'error_return' and XXX exception set. If the interpreter is trusted then
+ do nothing.
+
+* int PyXXX_AllowHost(interpreter, host, port)
+ Allow the untrusted interpreter to send/receive to the specified host on
+ the specified port. If the interpreter is not untrusted, return NULL.
+ XXX resolve to IP at call time to prevent DNS man-in-the-middle attacks?
+
+* PyXXX_CheckHost(host, port, error_return)
+ Check that the specified host on the specified port is allowed to be
+ communicated with. If not, set an XXX exception and cause the caller to
+ return 'error_return'. If the interpreter is trusted then do nothing.
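+
+A small Python model of the IP address checks is given below, under the
+assumption that addresses are compared after parsing so that different
+spellings of the same address match. ``SocketPolicy`` and its methods are
+hypothetical names; host-based checks are left out because they depend on the
+unresolved question of when DNS resolution happens.
+
```python
import ipaddress


class SocketPolicy:
    """Hypothetical allow-list of (IP address, port) pairs."""

    def __init__(self):
        self.allowed = set()

    def allow_ip_address(self, ip, port):
        # Parsing normalizes spellings and rejects malformed input.
        self.allowed.add((ipaddress.ip_address(ip), port))

    def check_ip_address(self, ip, port):
        return (ipaddress.ip_address(ip), port) in self.allowed


sockets = SocketPolicy()
sockets.allow_ip_address("192.0.2.10", 443)
sockets.allow_ip_address("2001:db8::1", 443)
```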
+
+
+Network Information
+=============================
+
+Protection
+--------------
+
+Limit what information can be gleaned about the network the system is running
+on. This does not include restricting information on IP addresses and hosts
+that have been explicitly allowed for the untrusted interpreter to
+communicate with.
+
+
+Why
+--------------
+
+With enough information from the network several things could occur. One is
+that someone could possibly figure out where your machine is on the Internet.
+Another is that enough information about the network you are connected to could
+be used against it in an attack.
+
+
+Possible Security Flaws
+-----------------------
+
+As long as usage is restricted to only what is needed to work with allowed
+addresses, there are no security issues to speak of.
+
+
+API
+--------------
+
+* int PyXXX_AllowNetworkInfo(interpreter)
+ Allow the untrusted interpreter to get network information regardless of
+ whether the IP or host address is explicitly allowed. If the interpreter
+ is not untrusted, return NULL.
+
+* PyXXX_CheckNetworkInfo(error_return)
+ Macro that will return 'error_return' for the caller and set XXX exception
+ if the untrusted interpreter does not allow checking for arbitrary network
+ information. For a trusted interpreter this does nothing.
+
+
+Filesystem Information
+=============================
+
+Protection
+--------------
+
+Do not allow information about the filesystem layout from various parts of
+Python to be exposed. This means blocking exposure at the Python level to:
+
+* __file__ attribute on modules
+* __path__ attribute on packages
+* co_filename attribute on code objects
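+
+To make the three attributes concrete, the following sketch prints what an
+ordinary, unrestricted interpreter exposes through them. The choice of the
+'email' package and the helper function 'sample' are arbitrary and used only
+for illustration.
+
```python
import email  # any pure-Python standard-library package works here


def sample():
    pass


# On an unrestricted interpreter each of these typically reveals a
# filesystem location (co_filename may be a placeholder such as "<stdin>"
# when the code was not loaded from a file).
exposed = {
    "module __file__": email.__file__,
    "package __path__": list(email.__path__),
    "code co_filename": sample.__code__.co_filename,
}

for name, value in exposed.items():
    print(name, "->", value)
```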
+
+
+Why
+--------------
+
+Exposing information about the filesystem is not allowed. File paths can reveal
+which operating system is in use, which can lead to vulnerabilities specific to
+that operating system being exploited.
+
+
+Possible Security Flaws
+-----------------------
+
+Not finding every single place where a file path is exposed.
+
+
+API
+--------------
+
+* int PyXXX_AllowFilesystemInfo(interpreter)
+ Allow the untrusted interpreter to expose filesystem information. If the
+ passed-in interpreter is not untrusted, return NULL.
+
+* PyXXX_CheckFilesystemInfo(error_return)
+ Macro that checks if exposing filesystem information is allowed. If it is
+ not, cause the caller to return with the value of 'error_return' and raise
+ XXX.
+
+
+Threading
+=============================
+
+XXX Needed?
+
+
+Stdin, Stdout, and Stderr
+=============================
+
+Protection
+--------------
+
+By default, sys.__stdin__, sys.__stdout__, and sys.__stderr__ will be set to
+instances of cStringIO. Use of the normal stdin, stdout, and stderr can be
+explicitly allowed.
+XXX Or perhaps __stdin__ and friends should just be blocked and all you get is
+sys.stdin and friends set to cStringIO.
+
+
+Why
+--------------
+
+Interference with stdin, stdout, or stderr should not be allowed unless
+desired.
+
+
+Possible Security Flaws
+-----------------------
+
+Unless cStringIO instances can be used maliciously, none to speak of.
+XXX Use StringIO instances instead for even better security?
+
+
+API
+--------------
+
+* int PyXXX_UseTrueStdin(interpreter)
+ int PyXXX_UseTrueStdout(interpreter)
+ int PyXXX_UseTrueStderr(interpreter)
+ Set the specific stream for the interpreter to the true version of the
+ stream and not to the default instance of cStringIO. If the interpreter is
+ not untrusted, return NULL.
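+
+The default behaviour can be approximated from Python code as follows. This
+is a sketch under assumptions: it uses io.StringIO, the in-memory text stream
+of Python 3, in place of cStringIO, and it swaps sys.stdout by hand rather
+than through any of the proposed PyXXX functions.
+
```python
import io
import sys

real_stdout = sys.stdout        # the "true" stream
sys.stdout = io.StringIO()      # in-memory stand-in seen by untrusted code
try:
    print("hello from untrusted code")
    captured = sys.stdout.getvalue()
finally:
    sys.stdout = real_stdout    # always restore the true stream

# Nothing reached the real stdout; the text is only in the buffer.
print(repr(captured))
```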
+
+
+Adding New Protections
+=============================
+
+Protection
+--------------
+
+Allow for extensibility in the security model by being able to add new types of
+checks. This allows not only for Python to add new security protections in a
+backwards-compatible fashion, but to also have extension modules add their own
+as well.
+
+An extension module can introduce a group for its various values to check, with
+a type being a specific value within a group. The "Python" group is
+specifically reserved for use by the Python core itself.
+
+
+Why
+--------------
+
+We are all human. There is the possibility that a need for a new type of
+protection for the interpreter will present itself and thus need support. By
+providing an extensible way to add new protections it helps to future-proof the
+system.
+
+It also allows extension modules to present their own set of security
+protections. That way one extension module can use the protection scheme
+presented by another that it is dependent upon.
+
+
+Possible Security Flaws
+------------------------
+
+Poor definitions by extension module authors of how their protections should be
+used would allow for possible exploitation.
+
+
+API
+--------------
+
+XXX Could also have PyXXXExtended prefix instead for the following functions
+
++ Bool
+ * int PyXXX_ExtendedSetTrue(interpreter, group, type)
+ Set a group-type to be true. Expected use is for when a binary
+ possibility of something is needed and that the default is to not allow
+ use of the resource (e.g., network information). Returns NULL if the
+ interpreter is not untrusted.
+
+ * PyXXX_ExtendedCheckTrue(group, type, error_return)
+ Macro that, if the group-type is not set to true, causes the caller to
+ return 'error_return' with XXX exception raised. For trusted
+ interpreters the check does nothing.
+
++ Numeric Range
+ * int PyXXX_ExtendedValueCap(interpreter, group, type, cap)
+ Set a group-type to a capped value, with the initial value set to 0.
+ Expected use is when a resource has a capped amount of use (e.g.,
+ memory). Returns NULL if the interpreter is not untrusted.
+
+ * PyXXX_ExtendedValueAlloc(increase, error_return)
+ Macro to raise the amount a resource is used by 'increase'. If the
+ increase pushes the resource allocation past the set cap, then return
+ 'error_return' and set XXX as the exception.
+
+ * PyXXX_ExtendedValueFree(decrease, error_return)
+ Macro to lower the amount a resource is used by 'decrease'. If the
+ decrease pushes the allotment to below 0 then have the caller return
+ 'error_return' and set XXX as the exception.
+
+
++ Membership
+ * int PyXXX_ExtendedAddMembership(interpreter, group, type, string)
+ Add a string to be considered a member of a group-type (e.g., allowed
+ file paths). If the interpreter is not an untrusted interpreter,
+ return NULL.
+
+ * PyXXX_ExtendedCheckMembership(group, type, string, error_return)
+ Macro that checks 'string' is a member of the values set for the
+ group-type. If it is not, then have the caller return 'error_return'
+ and set an exception for XXX. For trusted interpreters the call does
+ nothing.
+
++ Specific Value
+ * int PyXXX_ExtendedSetValue(interpreter, group, type, string)
+ Set a group-type to a specific string. If the interpreter is not
+ untrusted, return NULL.
+
+ * PyXXX_ExtendedCheckValue(group, type, string, error_return)
+ Macro to check that the group-type is set to 'string'. If it is not,
+ then have the caller return 'error_return' and set an exception for
+ XXX. If the interpreter is trusted then nothing is done.
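+
+The four kinds of extended protection can be modeled together in Python as
+shown here. This is a sketch of the semantics only: the class ExtendedPolicy,
+its method names, the "example" group, and PermissionError are hypothetical,
+and the trusted/untrusted distinction is left out. The numeric-range methods
+also take a group and type explicitly, which the macro signatures above do
+not list.
+
```python
class ExtendedPolicy:
    """Hypothetical model of the four kinds of extended protection."""

    def __init__(self):
        self._flags = set()    # (group, type) pairs set to true
        self._caps = {}        # (group, type) -> [used, cap]
        self._members = {}     # (group, type) -> set of strings
        self._values = {}      # (group, type) -> string

    # Bool
    def set_true(self, group, type_):
        self._flags.add((group, type_))

    def check_true(self, group, type_):
        if (group, type_) not in self._flags:
            raise PermissionError(f"{group}/{type_} is not enabled")

    # Numeric range
    def value_cap(self, group, type_, cap):
        self._caps[(group, type_)] = [0, cap]  # usage starts at 0

    def value_alloc(self, group, type_, increase):
        entry = self._caps[(group, type_)]
        if entry[0] + increase > entry[1]:
            raise PermissionError(f"{group}/{type_} cap exceeded")
        entry[0] += increase

    def value_free(self, group, type_, decrease):
        entry = self._caps[(group, type_)]
        if entry[0] - decrease < 0:
            raise PermissionError(f"{group}/{type_} would drop below 0")
        entry[0] -= decrease

    # Membership
    def add_membership(self, group, type_, string):
        self._members.setdefault((group, type_), set()).add(string)

    def check_membership(self, group, type_, string):
        if string not in self._members.get((group, type_), set()):
            raise PermissionError(f"{string!r} not in {group}/{type_}")

    # Specific value
    def set_value(self, group, type_, string):
        self._values[(group, type_)] = string

    def check_value(self, group, type_, string):
        if self._values.get((group, type_)) != string:
            raise PermissionError(f"{group}/{type_} is not {string!r}")


policy = ExtendedPolicy()
policy.value_cap("example", "memory", 100)
policy.value_alloc("example", "memory", 60)
policy.value_free("example", "memory", 10)  # usage is now 50
```
+
+Keying every setting on a (group, type) pair is what would let one extension
+module reuse the protections defined by another that it depends on.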
+
+
+References
+///////////////////////////////////////
+
+.. [#rexec] The 'rexec' module
+ (http://docs.python.org/lib/module-rexec.html)
+
+.. [#safe-tcl] The Safe-Tcl Security Model
+ (http://research.sun.com/technical-reports/1997/abstract-60.html)
+
+.. [#ctypes] 'ctypes' module
+ (http://docs.python.org/dev/lib/module-ctypes.html)