[Python-ideas] PEP 554: Stdlib Module to Support Multiple Interpreters in Python Code

Tue Sep 12 16:46:25 EDT 2017

On Thu, Sep 7, 2017 at 11:19 PM, Nathaniel Smith <njs at pobox.com> wrote:
> On Thu, Sep 7, 2017 at 8:11 PM, Eric Snow <ericsnowcurrently at gmail.com> wrote:
>> My concern is that this is a chicken-and-egg problem.  The situation
>> won't improve until subinterpreters are more readily available.
>
> Okay, but you're assuming that "more libraries work well with
> subinterpreters" is in fact an improvement. I'm asking you to convince
> me of that :-). Are there people saying "oh, if only subinterpreters
> had a Python API and less weird interactions with C extensions, I
> could do <something awesome>"? So far they haven't exactly taken the
> world by storm...

The problem is that most people don't know about the feature.  And
even if they do, using it requires writing a C extension, which most
people aren't comfortable doing.

>> Other than C globals, is there some other issue?
>
> That's the main one I'm aware of, yeah, though I haven't looked into it closely.

Oh, good.  I haven't missed something. :)  Do you know how often
subinterpreter support is a problem for users?  I was under the
impression from your earlier statements that this is a recurring issue
but my understanding from mod_wsgi is that it isn't that common.

>> I'm fine with Nick's idea about making this a "provisional" module.
>> Would that be enough to ease your concern here?
>
> Potentially, yeah -- basically I'm fine with anything that doesn't end
> up looking like python-dev telling everyone "subinterpreters are the
> future! go forth and yell at any devs who don't support them!".

Great!  I'm also looking at the possibility of adding a mechanism for
extension modules to opt out of subinterpreter support (using PEP 489
ModuleDef slots).  However, I'd rather wait on that if making the PEP
provisional is sufficient.

> What do you think the criteria for graduating to non-provisional
> status should be, in this case?

Consensus among the (Dutch?) core devs that subinterpreters are worth
keeping in the stdlib and that we've smoothed out any rough parts in
the module.

> I guess I would be much more confident in the possibilities here if
> you could give:
>
> - some hand-wavy sketch for how subinterpreter A could call a function
> that was originally defined in subinterpreter B without the GIL, which
> seems like a precondition for sharing user-defined classes

(Before I respond, note that this is way outside the scope of the PEP.
The merit of subinterpreters extends beyond any benefits of running
sans-GIL, though that is my main goal.  I've been updating the PEP to
(hopefully) better communicate the utility of subinterpreters.)

Code objects are immutable so that part should be relatively
straightforward.  There's the question of closures and default
arguments that would have to be resolved.  However, those are things
that would need to be supported anyway in a world where we want to
pass functions and user-defined types between interpreters.  Doing so
will be a gradual process of starting with immutable non-container
builtin types and expanding out from there to other immutable types,
including user-defined ones.
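
To make the code-object point concrete, here is a minimal sketch in
plain single-interpreter Python.  It is not the PEP's API and shares
nothing across interpreters; it only shows that the code object is
read-only while closures and defaults live on the function object.
The helpers make_cell() and rebuild() are hypothetical names used just
for this illustration.

```python
import builtins
import types


def make_adder(n):
    # `add` closes over `n` and has a default argument.
    def add(x, bonus=0):
        return x + n + bonus
    return add


def make_cell(value):
    # Portable way to create a closure cell holding `value`.
    return (lambda: value).__closure__[0]


def rebuild(code, defaults, free_values):
    # Hypothetical helper: wrap the *same* code object in a new function
    # with fresh globals and freshly created closure cells.
    cells = tuple(make_cell(v) for v in free_values)
    fresh_globals = {"__builtins__": builtins}
    return types.FunctionType(code, fresh_globals, code.co_name,
                              defaults, cells)


add = make_adder(2)
code = add.__code__

# The code object itself rejects mutation.
try:
    code.co_name = "renamed"
    code_is_readonly = False
except (AttributeError, TypeError):
    code_is_readonly = True

# The per-function state (closure cells, defaults) lives outside the
# code object, so it is what would have to be copied or shared.
free_values = [cell.cell_contents for cell in add.__closure__]
clone = rebuild(code, add.__defaults__, free_values)

print(code_is_readonly, code.co_freevars, add.__defaults__)
print(clone is not add, clone.__code__ is code, clone(3), clone(3, bonus=1))
```

The clone reuses the code object untouched; only the closure values
and defaults had to be carried over, which is exactly the part that
still needs resolving for real cross-interpreter use.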

Note that sharing mutable objects between interpreters would be a
pretty advanced usage (i.e. opt-in shared state vs. threading's
share-everything).  If it proves desirable then we'd sort that out
then.  However, I don't see that as more than an esoteric feature
relative to subinterpreters.

In my mind, the key advantage of being able to share more (immutable)
objects, including user-defined types, between interpreters is in the
optimization opportunities.  It would allow us to avoid instantiating
the same object in each interpreter.  That said, the way I imagine it
I wouldn't consider such an optimization to be very user-facing so it
doesn't impact the PEP.  The user-facing part would be the expanded
set of immutable objects interpreters could pass back and forth, and
expanding that set won't require any changes to the API in the PEP.

> - some hand-wavy sketch for how refcounting will work for objects
> shared between multiple subinterpreters without the GIL, without
> majorly impacting single-thread performance (I actually forgot about
> this problem in my last email, because PyPy has already solved this
> part!)

(same caveat as above)

There are a number of approaches that may work.  One is to give each
interpreter its own allocator and GC.  Another is to mark shared
objects such that they never get GC'ed.  Another is to allow objects
to exist only in one interpreter at a time.  Similarly, object
ownership (per interpreter) could help.  Asynchronous refcounting
could be an option.  That's only some of the possible approaches.  I
expect that at least one of them will be suitable.  However, the first
step is to get the multi-interpreter support out there.  Then we can
tackle the problem of optimization and multi-core utilization.
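
As a toy model of the "only in one interpreter at a time" approach,
here is a sketch in ordinary Python.  Interpreters are represented by
plain integer IDs, and the Owned class and OwnershipError are
hypothetical names; nothing here reflects how CPython would actually
implement it.

```python
class OwnershipError(RuntimeError):
    """Raised when a non-owner touches an owned object."""


class Owned:
    """Toy wrapper: a value that belongs to exactly one interpreter."""

    def __init__(self, value, owner):
        self._value = value
        self._owner = owner

    @property
    def owner(self):
        return self._owner

    def get(self, interp_id):
        # Only the current owner may use the value, so its refcount is
        # only ever touched from one interpreter.
        if interp_id != self._owner:
            raise OwnershipError(
                f"interpreter {interp_id} does not own this object")
        return self._value

    def transfer(self, from_id, to_id):
        # Handing the object over is the single point of coordination.
        if from_id != self._owner:
            raise OwnershipError(
                f"interpreter {from_id} cannot give away this object")
        self._owner = to_id


obj = Owned(b"payload", owner=1)
first = obj.get(1)

obj.transfer(from_id=1, to_id=2)
second = obj.get(2)

try:
    obj.get(1)
    old_owner_blocked = False
except OwnershipError:
    old_owner_blocked = True

print(first, second, obj.owner, old_owner_blocked)
```

The appeal of this model is that the only cross-interpreter
coordination happens at transfer time, not on every incref/decref.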

FWIW, the biggest complexity is actually in synchronizing the sharing
strategy across the inter-interpreter boundary (e.g. FIFO).  We should
expect the relative time spent passing objects between interpreters to
be very small.  So not only does that provide us with a good target
for our refcount resolving strategy, we can afford some performance
wiggle room in that solution.  (again, we're looking way ahead here)
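
To illustrate the FIFO boundary idea, here is a rough sketch in which
a thread stands in for a second interpreter and queue.Queue stands in
for the FIFO.  The Channel class and the SHAREABLE allow-list are
assumptions made for this example, not the PEP's API.  The point is
that everything crossing the boundary goes through send()/recv(), so
that is the only place a sharing strategy would need to be enforced.

```python
import queue
import threading

# Assumed starting set: immutable, non-container builtin types.
SHAREABLE = (type(None), bool, int, float, str, bytes)


class Channel:
    """Toy one-way FIFO between two 'interpreters' (threads here)."""

    def __init__(self):
        self._fifo = queue.Queue()

    def send(self, obj):
        # The boundary check: only allow-listed types may cross.
        if type(obj) not in SHAREABLE:
            raise TypeError(f"cannot send {type(obj).__name__} objects")
        self._fifo.put(obj)

    def recv(self, timeout=5):
        return self._fifo.get(timeout=timeout)


def worker(inbox, outbox):
    # Stand-in for code running in another interpreter.
    while True:
        item = inbox.recv()
        if item is None:  # None is used as the shutdown signal.
            break
        outbox.send(item * 2)


to_worker, from_worker = Channel(), Channel()
thread = threading.Thread(target=worker, args=(to_worker, from_worker))
thread.start()

for item in (1, 2.5, "ab"):
    to_worker.send(item)
to_worker.send(None)
thread.join()

results = [from_worker.recv() for _ in range(3)]

# Mutable containers are refused at the boundary.
try:
    to_worker.send([1, 2])
    rejected = False
except TypeError:
    rejected = True

print(results, rejected)
```

Expanding what interpreters can pass back and forth would then mean
growing the allow-list, without changing the send/recv interface.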

> Thanks for attempting such an ambitious project :-).

Hey, I'm learning a lot and feel like every step along the way is
making Python better in some stand-alone way. :)

-eric