[Python-Dev] Re: My take on multiple interpreters (Was: Should we be making so many changes in pursuit of PEP 554?)

17 Jun 2020


On 6/17/2020 6:03 PM, Jeff Allen wrote:
...
On 17/06/2020 19:28, Eric V. Smith wrote:
...
...
On 6/17/2020 12:07 PM, Jeff Allen wrote:
If (1) interpreters manage the life-cycle of objects, and (2) a race
condition arises when the life-cycle or state of an object is
accessed by the interpreter that did not create it, and (3) an
object will sometimes be passed to an interpreter that did not
create it, and (4) an interpreter with a reference to an object will
sometimes access its life-cycle or state, then (5) a race condition
will sometimes arise. This seems to be true (as a deduction) if all
the premises hold.

I'm assuming that passing an object between interpreters would not be
supported. It would require that the object somehow be marshalled
between interpreters, so that no object would be operated on outside
the interpreter that created it. So 2-5 couldn't happen in valid code.

The Python level doesn't support it, prevents it I think, and perhaps
the implementation doesn't support it, but nothing can stop C actually 
doing it. I would agree that with sufficient discipline in the code it 
should be possible to prevent the worlds from colliding. But it is
difficult, so I think that is why Mark is arguing for a separate 
address space. Marshalling the value across is supported, but that's 
just the value, not a shared object.

Yes, it's difficult to have the discipline in C, just as multi-threaded
code is difficult in C. I agree separate address spaces make isolation
much easier, but I think there are use cases that don't align with
separate address spaces, and we should support those.
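
To make the "value, not a shared object" distinction above concrete, here is a minimal sketch. It assumes pickle as a stand-in for whatever marshalling mechanism is actually used between interpreters (it is not meant to describe how PEP 554 channels work), and the payload is made up for illustration.

```python
import pickle

# Hypothetical payload; stands in for any object owned by one interpreter.
original = {"job": "resize", "sizes": [64, 128]}

# "Marshal" the value: serialise to bytes, then rebuild on the receiving side.
# pickle is used purely to illustrate copying by value.
wire_bytes = pickle.dumps(original)
received = pickle.loads(wire_bytes)

# Same value, different object: the receiver never touches the sender's object.
same_value = received == original
same_object = received is original

# Mutating the copy leaves the original untouched.
received["sizes"].append(256)
```

Because only bytes cross the boundary, neither side ever holds a reference to an object whose life-cycle the other side manages.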
...
...
Sorry for being loose with terms. If I want to create an interpreter 
and execute it, then I'd allocate and initialize an interpreter state 
object, then call it, passing the interpreter state object in to 
whatever Python functions I want to call. They would in turn pass 
that pointer to whatever they call, or access the state through it 
directly. That pointer is the "current interpreter".

I think that can work if you have disciplined separation, which you
are assuming. I think you would pass the function to the interpreter, 
not the other way around. I'm assuming this is described from the 
perspective of some C code and your Python functions are PyFunction 
objects, not just text? What, however, prevents you creating that 
function in one interpreter and giving it to another? The function, 
and any closure or defaults are owned by the creating interpreter.

In the C API (which is what I think we're discussing), I think it would
be passing the interpreter state to the function. And nothing would 
prevent you from getting it wrong.
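
A toy model of that "pass the state in, and nothing stops you getting it wrong" point, written in Python rather than C so it runs on its own. Everything here (InterpState, Obj, make_object, release_object, the check flag) is hypothetical and only mimics the shape of the idea; it is not the CPython C API.

```python
class InterpState:
    """Hypothetical stand-in for a per-interpreter state object."""
    def __init__(self, name):
        self.name = name
        self.live_objects = 0   # toy "life-cycle" bookkeeping owned by this state


class Obj:
    """Toy object that remembers which state created it."""
    def __init__(self, owner, value):
        self.owner = owner
        self.value = value


def make_object(interp, value):
    # The state is passed in explicitly; the callee records ownership.
    interp.live_objects += 1
    return Obj(interp, value)


def release_object(interp, obj, check=False):
    # Nothing forces `interp` to be the state that created `obj`.
    if check and obj.owner is not interp:
        raise RuntimeError("object released through the wrong interpreter state")
    interp.live_objects -= 1


a = InterpState("A")
b = InterpState("B")
obj = make_object(a, 42)

# Getting it wrong: the object from A is released through B, and nothing objects.
release_object(b, obj)
counts_after_mistake = (a.live_objects, b.live_objects)

# With an opt-in ownership check (an assumption of this sketch) the mistake is caught.
try:
    release_object(b, obj, check=True)
    caught = False
except RuntimeError:
    caught = True
```

Without the check, both states end up with wrong bookkeeping; in this model the discipline has to come from the caller or from an explicit ownership test.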
...
...
There's a lot of state per interpreter, including the module state. 
See "struct _is" in Include/internal/pycore_interp.h.

So much more than when I last looked! Look back in time and
interpreter state mostly contains the module context (in a broad sense 
that includes shortcuts to sys, builtins, codec state, importlib). Ok, 
there's some stuff about exit handling and debugging too. The recent 
huge growth is to shelter previously singleton object allocation 
mechanisms, a consequence of the implementation choice that gives the 
interpreter object that responsibility too. I'm not saying this is 
wrong, just that it's not a concept in Python-the-language, while the 
module state is.

I think most of these changes are Victor's, and I think they're a step
in the right direction. Since Python globals are really module state, it 
makes sense that that's the part that's visible to Python.

Eric