[Python-ideas] Exposing CPython's subinterpreter C-API in the stdlib.

Thu May 25 15:01:21 EDT 2017

On Thu, May 25, 2017 at 11:19 AM, Nathaniel Smith <njs at pobox.com> wrote:
> My impression is that the code to support them inside CPython is fine, but
> they're broken and not very useful in the sense that lots of C extensions
> don't really support them, so in practice you can't reliably use them to run
> arbitrary code. Numpy for example definitely has lots of
> subinterpreter-related bugs, and when they get reported we close them as
> WONTFIX.
>
> Based on conversations at last year's pycon, my impression is that numpy
> probably *could* support subinterpreters (i.e. the required apis exist), but
> none of us really understand the details, it's the kind of problem that
> requires a careful whole-codebase audit, and a naive approach might make
> numpy's code slower and more complicated for everyone. (For example, there
> are lots of places where numpy keeps a little global cache that I guess
> should instead be per-subinterpreter caches, which would mean adding an
> extra lookup operation to fast paths.)

Thanks for pointing this out.  You've clearly described probably the
biggest challenge for folks that try to use subinterpreters.  PEP 384
was meant to help with this, but seems to have fallen short.  PEP 489
can help identify modules that profess subinterpreter support, and it
makes room for future helpers that let extension modules deal with
global state.  However, I agree that *right now* getting extension modules to
reliably work with subinterpreters is not easy enough.  Furthermore,
that won't change unless there is sufficient benefit tied to
subinterpreters, as you point out below.

>
> Or maybe it'd be fine, but no one is motivated to figure it out, because the
> other side of the cost/benefit analysis is that almost nobody actually uses
> subinterpreters. I think the only two projects that do are mod_wsgi and jep
> [1].
>
> So yeah, the status quo is broken. But there are two possible ways to fix
> it: IMHO either subinterpreters should be removed *or* they should have some
> compelling features added to make them actually worth the effort of fixing c
> extensions to support them. If Eric can pull off this multi-core idea then
> that would be pretty compelling :-).

Agreed. :)

> (And my impression is that the things
> that break under subinterpreters are essentially the same as would break
> under any GIL-removal plan.)

More or less.  There's a lot of process-global state in CPython that
needs to get pulled into the interpreter state.  So in that regard the
effort and tooling will likely correspond fairly closely with what
extension modules have to do.
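
As a rough illustration of the difference between process-global
state and per-interpreter state, here is a toy model in pure Python.
It is only a sketch, not how CPython or numpy actually implement
anything: the names `Interpreter`, `lookup_global`, and
`lookup_per_interpreter` are made up for this example, and real
subinterpreter state lives in C.

```python
# Toy model only: real subinterpreters are managed in C.
# Every name below is hypothetical and exists just for illustration.

class Interpreter:
    """Stands in for the private state of one (sub)interpreter."""

    def __init__(self, name):
        self.name = name
        self.state = {}  # per-interpreter storage


# Style 1: a process-global cache, shared by every interpreter.
_GLOBAL_CACHE = {}


def lookup_global(key, compute):
    if key not in _GLOBAL_CACHE:
        _GLOBAL_CACHE[key] = compute(key)
    return _GLOBAL_CACHE[key]


# Style 2: one cache per interpreter, at the cost of an extra lookup
# to find that interpreter's cache before the real lookup happens.
def lookup_per_interpreter(interp, key, compute):
    cache = interp.state.setdefault("cache", {})  # the extra lookup
    if key not in cache:
        cache[key] = compute(key)
    return cache[key]


main = Interpreter("main")
sub = Interpreter("sub")

# The global cache hands the same object to every caller...
shared_1 = lookup_global("dtype", lambda k: [k])
shared_2 = lookup_global("dtype", lambda k: [k])

# ...while each "interpreter" gets its own object for the same key.
a = lookup_per_interpreter(main, "dtype", lambda k: [k])
b = lookup_per_interpreter(sub, "dtype", lambda k: [k])
```

The second style keeps objects from leaking between interpreters, but
every call pays for finding the right cache first, which is the
fast-path cost mentioned above.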

>
> The problem is that we don't actually know yet whether the multi-core idea
> will work, so it seems like a bad time to double down on committing to
> subinterpreter support and pressuring C extensions to keep up. Eric- do you
> have a plan written down somewhere? I'm wondering what the critical path
> from here to a multi-core proof of concept looks like.

Probably the best summary is here:

  http://ericsnowcurrently.blogspot.com/2016/09/solving-mutli-core-python.html

The caveat is that doing this myself is slow going due to a persistent
lack of time. :/  So any timely solution would require effort from
more people.  I've had enough positive responses from folks at PyCon
that I think enough of them would pitch in to get it done reasonably
soon.  More significantly, I genuinely believe that isolated
interpreters in the same process are a tool that many people will find
extremely useful and that will help the Python community.  Consequently,
exposing subinterpreters in the stdlib would give folks a stronger
incentive to fix the known bugs and to find a solution to the
challenges for extension modules.

-eric