On Thu, May 25, 2017 at 11:19 AM, Nathaniel Smith <njs@pobox.com> wrote:
> My impression is that the code to support them inside CPython is fine, but they're broken and not very useful in the sense that lots of C extensions don't really support them, so in practice you can't reliably use them to run arbitrary code. Numpy, for example, definitely has lots of subinterpreter-related bugs, and when they get reported we close them as WONTFIX.
> Based on conversations at last year's PyCon, my impression is that numpy probably *could* support subinterpreters (i.e., the required APIs exist), but none of us really understand the details, it's the kind of problem that requires a careful whole-codebase audit, and a naive approach might make numpy's code slower and more complicated for everyone. (For example, there are lots of places where numpy keeps a little global cache that I guess should instead be per-subinterpreter caches, which would mean adding an extra lookup operation to fast paths.)
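The cache tradeoff described above can be sketched in pure Python (the real fix would live in numpy's C layer; the names here are invented for illustration, not taken from numpy):

```python
# Illustrative sketch only: contrast a process-global cache with
# per-interpreter caches keyed by an interpreter id.

_global_cache = {}        # one cache shared by every interpreter: breaks isolation

_per_interp_caches = {}   # interpreter id -> that interpreter's private cache

def get_cache(interp_id):
    """Fetch the cache for one interpreter, creating it on first use.

    Note the extra dict lookup that now sits on every call -- the
    fast-path cost mentioned above.
    """
    cache = _per_interp_caches.get(interp_id)
    if cache is None:
        cache = _per_interp_caches[interp_id] = {}
    return cache
```

With this shape, each interpreter only ever sees its own entries, at the price of one more lookup per access.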
Thanks for pointing this out. You've clearly described probably the biggest challenge for folks who try to use subinterpreters. PEP 384 was meant to help with this but seems to have fallen short. PEP 489 can help identify modules that profess subinterpreter support, as well as facilitate future extension-module helpers for dealing with global state. However, I agree that *right now* getting extension modules to work reliably with subinterpreters is not easy enough. Furthermore, that won't change unless there is sufficient benefit tied to subinterpreters, as you point out below.
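For reference, a module opts in to PEP 489's multi-phase initialization by returning its PyModuleDef from the init function rather than a ready-made module object; a minimal skeleton (the "spam" module name is a placeholder) looks roughly like:

```c
#include <Python.h>

/* Runs once per module object -- i.e., once in each importing
 * subinterpreter -- rather than once per process. */
static int
spam_exec(PyObject *module)
{
    return PyModule_AddIntConstant(module, "answer", 42);
}

static PyModuleDef_Slot spam_slots[] = {
    {Py_mod_exec, spam_exec},
    {0, NULL}
};

static struct PyModuleDef spam_def = {
    PyModuleDef_HEAD_INIT,
    "spam",       /* m_name */
    NULL,         /* m_doc */
    0,            /* m_size: no per-module state in this sketch */
    NULL,         /* m_methods */
    spam_slots,   /* m_slots */
    NULL, NULL, NULL
};

PyMODINIT_FUNC
PyInit_spam(void)
{
    /* Multi-phase init (PEP 489): hand back the def, not a module. */
    return PyModuleDef_Init(&spam_def);
}
```

Using slots this way is what lets the import machinery create a fresh module object per interpreter instead of sharing one.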
> Or maybe it'd be fine, but no one is motivated to figure it out, because the other side of the cost/benefit analysis is that almost nobody actually uses subinterpreters. I think the only two projects that do are mod_wsgi and jep [1].
> So yeah, the status quo is broken. But there are two possible ways to fix it: IMHO either subinterpreters should be removed *or* they should have some compelling features added to make them actually worth the effort of fixing C extensions to support them. If Eric can pull off this multi-core idea then that would be pretty compelling :-).
Agreed. :)
> (And my impression is that the things that break under subinterpreters are essentially the same as would break under any GIL-removal plan.)
More or less. There's a lot of process-global state in CPython that needs to get pulled into the interpreter state. So in that regard the effort and tooling will likely correspond fairly closely with what extension modules have to do.
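At the extension-module level, the corresponding refactor is the one PEP 3121/489 enable: a C global that used to be `static` moves into per-module state, which each subinterpreter's copy of the module then owns separately. A sketch (names invented):

```c
#include <Python.h>

/* Before: process-global, silently shared by every interpreter.
 *
 *     static PyObject *cached_table = NULL;
 *
 * After: a per-module state struct, allocated with each module object.
 */
typedef struct {
    PyObject *cached_table;
} module_state;

static inline module_state *
get_state(PyObject *module)
{
    return (module_state *)PyModule_GetState(module);
}

/* In the PyModuleDef, setting m_size = sizeof(module_state) tells the
 * import machinery to allocate that state alongside each module object;
 * m_traverse, m_clear, and m_free should then visit/clear cached_table
 * so the GC sees it. */
```

Pulling CPython's own process-global state into the interpreter state is, broadly, the same exercise done inside the runtime itself.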
> The problem is that we don't actually know yet whether the multi-core idea will work, so it seems like a bad time to double down on committing to subinterpreter support and pressuring C extensions to keep up. Eric, do you have a plan written down somewhere? I'm wondering what the critical path from here to a multi-core proof of concept looks like.
Probably the best summary is here:

http://ericsnowcurrently.blogspot.com/2016/09/solving-mutli-core-python.html

The caveat is that doing this myself is slow going due to a persistent lack of time. :/ So any timely solution would require effort from more people. I've had enough positive responses from folks at PyCon that I think enough people would pitch in to get it done in a timely manner.

More significantly, I genuinely believe that isolated interpreters in the same process are a tool that many people will find extremely useful and that will help the Python community. Consequently, exposing subinterpreters in the stdlib would create a stronger incentive for folks to fix the known bugs and find a solution to the challenges for extension modules.

-eric