[Python-Dev] GIL, Python 3, and MP vs. UP
Phillip J. Eby
pje at telecommunity.com
Wed Sep 21 22:12:19 CEST 2005
At 12:04 PM 9/21/2005 -0700, Guido van Rossum wrote:
>Actually Python itself has a hard time keeping multiple interpreters
>truly separate. Also of course there are some shared resources
>maintained by the operating system: current directory, open file
>descriptors, signal settings, child processes, that sort of thing.
>If we were to completely drop this feature, we could make built-in
I'd personally much rather we got back the ability to change the type of an
instance of a builtin to that of a Python subclass of that builtin type, or
to change it back. I have more use cases for that than for actually
modifying builtins. (E.g. "observable" lists/dicts, hooking module
> > A system like Java's classloader would be helpful, where the
> > classloader of a class is used to load the classes used by that
> > class. I have no idea if this can be adapted to Python at all. A
> > strict coding style seems to work for now.
>You can do something like this using the restricted execution support,
>which works by setting the __builtins__ name in a dict where you exec
>code, and overriding __import__ in that __builtins__ dict. (I can't
>explain it too well in one paragraph, just go look up the rexec.py
>It's not great for guaranteeing there's absolutely no escape possible
>from the sandbox, but it works well enough to make accidental resource
>sharing a non-issue (apart from the OS shared resources and the
>built-in types). A misfeature (for this purpose) is that certain kinds
>of introspection are disabled (this was of course to enable restricted
Another misfeature is that some C-level Python code expects to obtain
sys.modules, builtins, etc. via the interpreter struct. Thus, you tend to
have to reimplement those things in Python to get them to respect a
virtualization of sys.modules. I have to admit I've only dabbled in
attempting this, just long enough to hit a stumbling block or two and then
discover that they were because sys.modules is in the interpreter
struct. Of course, my next thought then was to just expose the
multi-interpreter API as an extension module, so that you could create
interpreters from Python code. The project I'd originally planned to do
this for never materialized though, so I never actually attempted it.
My thought, though, was that by swapping the current interpreter of the
thread state when crossing code boundaries, you could keep both the
Python-level and C-level code happy. However, it might also suffice to
have a way to switch in and out the interpreter configuration (sys.modules,
sys.__dict__, and __builtins__ at minimum; I don't have any clear use case
for changing the three codec_* vars at the moment).
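
To make the "switch in and out" idea concrete, here is a rough pure-Python
model of it. Everything in it is hypothetical: InterpConfig, switched_to and
current_config are names invented for illustration, and a threading.local
stands in for the C-level thread state. It only shows the save/swap/restore
discipline at a code boundary, not how CPython actually stores these values.

```python
import threading
from contextlib import contextmanager


class InterpConfig:
    """Hypothetical bundle of the per-interpreter state discussed above."""

    def __init__(self, name):
        self.name = name
        self.modules = {}    # stands in for sys.modules
        self.sysdict = {}    # stands in for sys.__dict__
        self.builtins = {}   # stands in for __builtins__


# Stands in for the per-thread state; each thread sees its own 'config'.
_state = threading.local()


def current_config():
    """Return the configuration active in this thread, or None."""
    return getattr(_state, "config", None)


@contextmanager
def switched_to(config):
    """Swap 'config' in for the duration of a block, then restore."""
    previous = current_config()
    _state.config = config
    try:
        yield config
    finally:
        # Restore even if the guarded code raised.
        _state.config = previous


main = InterpConfig("main")
sandbox = InterpConfig("sandbox")

with switched_to(main):
    with switched_to(sandbox):
        inner = current_config().name   # code here sees the sandbox config
    outer = current_config().name       # back to the main config
```

The nesting at the end mirrors crossing into virtualized code and back out
again; the try/finally is what keeps the outer configuration intact if the
inner code fails.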
>I'd be willing to entertain improvements that improve the insulation
Since there's already a way to change __builtins__ in the threadstate,
maybe the C API could be changed to obtain the six interpreter variables
via builtins rather than the other way around. This would allow us to drop
the multi-interpreter API from C (along with support for restricted mode)
but still allow complete virtualization from inside Python code.
The steps would be:
1. Remove restricted mode support
2. Change the tstate structure to have a 'builtins'
3. Change code that does tstate->interp lookups to instead look up special
names in the tstate's builtins
At that point, you can exec code with new builtins to bootstrap a virtual
Python, subject to limitations like being able to load a given extension
module only once.
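
As a minimal sketch of what "exec code with new builtins" can look like from
Python code today (Python 3 syntax, not the C API change proposed above): the
namespace passed to exec gets its own __builtins__ dict, and __import__ in
that dict is overridden. The helper names make_virtual_builtins and
run_virtual and the allowed_modules parameter are made up for this example.
As noted earlier in the thread, this limits accidental sharing; it is not an
escape-proof sandbox.

```python
import builtins


def make_virtual_builtins(allowed_modules):
    """Return a copy of the real builtins with __import__ overridden."""
    virtual = dict(vars(builtins))
    real_import = builtins.__import__

    def virtual_import(name, globals=None, locals=None, fromlist=(), level=0):
        # Only top-level packages named in allowed_modules may be imported.
        top = name.partition(".")[0]
        if top not in allowed_modules:
            raise ImportError(
                f"{name!r} is not available in this virtual environment"
            )
        return real_import(name, globals, locals, fromlist, level)

    virtual["__import__"] = virtual_import
    return virtual


def run_virtual(source, allowed_modules=("math",)):
    """Exec 'source' in a fresh namespace that uses the virtual builtins."""
    namespace = {"__builtins__": make_virtual_builtins(allowed_modules)}
    exec(source, namespace)
    return namespace


ns = run_virtual("import math\nx = math.sqrt(16)")
```

Import statements inside the exec'd code go through virtual_import because
the import machinery looks up __import__ in the builtins of the executing
namespace. The modules that are let through are still the real, shared
module objects, so this virtualizes name lookup rather than module state.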
Systems like mod_python that use the multi-interpreter API now would just
need to bootstrap a new __builtins__.
Sadly, this doesn't *really* cure the GIL-ensure problem, in that you still
don't have a specially-distinguished __builtins__ to use when you call into
Python from a C-started thread. On the other hand, I suspect that the use
cases for that and the use cases for virtualization don't overlap much, so
having a distinguished place to hold the "default" (i.e. initial) builtins
probably doesn't hurt virtualization much, since you can always *modify*
that set of builtins if you need to.