Hello,

holger krekel wrote:
>> Second, comments on py3k list indicated that secure python is difficult because of a) introspection, b) type inference, and c) GIL acquisition.
>
> Hum, this list looks a bit weird to me. Could you state what the actual attacks are for which security measures are discussed? Or which use cases are people on py3k having in mind?

This is an amalgam of several different posts (and maybe different threads), but here goes:

In the thread "Will we have a true restricted exec environment for python 3000," Vineet Jain asked for a restricted mode which would "1. Limit the memory consumed by the script 2. Limit access to file system and other system resources 3. Limit cpu time that the script will take 4. Be able to specify which modules are available for import."

In response to that request, various people commented on the difficulties of implementing such a restricted mode. On that thread, several people had the same idea I had, to use PyPy for this purpose. However, it didn't look like many people were keeping up with both lists (and were thus familiar-ish with PyPy's execution model).

A) Introspection

Nick Coghlan stated that:

"I'm interested, but I'm also aware of how much work it would be. I'm disinclined to trust any mechanism which allows the untrusted code to run in the same process, as the implications of being able to do:

    self.__class__.__mro__[-1].__subclasses__()

are somewhat staggering, and designing an in-process sandbox to cope with that is a big ask (and demonstrating that the sandbox actually *achieves* that goal is even tougher)."

Vineet responded with a proposal to start a "light" python subinterpreter, which would be controlled by the main interpreter. Nick countered:

"But will it allow you to use numbers or strings? If yes, then you can get to object(), and hence to pretty much whatever C builtins you want. So its not enough to try to hide dangerous builtins like file(), you want to remove them from the light version entirely (routing all file system and network access requests through the main application). But if the file objects are gone, what happens to the Python machinery that relies on them (like import)?

Python's powerful introspection is a severe drawback from a security POV - it is *really* hard to make a user stay in a box you put them in without crippling some part of the language as a side effect."

Thus, in CPython, allowing someone to access one C type effectively opens up all the C types. In PyPy, however, each type is effectively in its own box. Further, PyPy already has a structure that can deal with these sorts of accesses: the flowgraph. Operations in PyPy come about because of traversals of the graph, so certain branches of the graph could be restricted or proxied out to a trusted interpreter.

B) GIL Acquisition

Another person suggested leveraging the multiple-subinterpreter code which already exists in CPython to create a restricted-exec interpreter. MvL noted that GIL acquisition made that difficult:

"Part of the problem is that it doesn't really work. Some objects *are* shared across interpreters, such as global objects in extension modules (extension modules are initialized only once). I believe that the GIL management code (for acquiring the GIL out of nowhere) breaks if there are multiple interpreters."

C) Type inference

I tried to find the thread for this one (it's not from the Py3K list), but I recall someone a couple of years ago attempting to make an rexec version of python. One of the comments that I recall from that discussion had to do with understanding what types were being manipulated. I believe there was an example somewhat like:

    # operator.add is trusted

    class A:
        def __add__(self, other):
            ...  # something evil here

    a, b = A(), 1
    a + b  # something evil happens

That is, even if operator.add itself is trusted, the addition dispatches to whatever __add__ the operand's type defines, so you can't decide whether the operation is safe without knowing the types involved. However, this is a foggy memory that I have so far been unable to substantiate.

Thanks,
VanL
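
P.S. To make the introspection point concrete, here is a minimal sketch of the walk Nick describes. It is my own illustration in Python 3 syntax, not code from the py3k thread, and the helper name reachable_classes is made up for the example. It starts from any ordinary value (a number is enough), climbs to the root of its type's MRO, and then follows __subclasses__() downward to every class currently loaded in the process.

```python
def reachable_classes(obj):
    """Return every class reachable from `obj` by introspection alone.

    Hypothetical helper for illustration: climbs to the last entry of the
    MRO (the root `object` type), then walks the subclass tree downward.
    """
    root = type(obj).__mro__[-1]
    seen = {}          # id(cls) -> cls, so we never need to hash a class
    stack = [root]
    while stack:
        cls = stack.pop()
        if id(cls) in seen:
            continue
        seen[id(cls)] = cls
        # Call through `type` so this also works when `cls` is itself a
        # metaclass, where `cls.__subclasses__()` would be unbound.
        stack.extend(type.__subclasses__(cls))
    return list(seen.values())


# Starting from a plain integer, nothing is imported and no builtin is named.
classes = reachable_classes(1)
names = {cls.__name__ for cls in classes}
```

Every class that appears in names was reached without calling any "dangerous" builtin by name, which is why hiding names like file() from the namespace is not enough on its own.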