Petr, thanks for clearly stating your interests and goals for subinterpreters. This lays to rest some of my own fears. I am still skeptical that (even after the GIL is separated) they will enable multi-core in ways that multiple processes couldn't handle just as well or better, but your clear statement that *embedding* is the more important use case helps me feel supportive of the concept.

On Tue, Jun 9, 2020 at 6:26 AM Petr Viktorin <encukou@gmail.com> wrote:
On 2020-06-05 16:32, Mark Shannon wrote:
> Hi,
>
> There have been a lot of changes both to the C API and to internal
> implementations to allow multiple interpreters in a single O/S process.
>
> These changes cause backwards compatibility changes, have a negative
> performance impact, and cause a lot of churn.
>
> While I'm in favour of PEP 554, or some similar model for parallelism in
> Python, I am opposed to the changes we are currently making to support it.
>
>
> What are sub-interpreters?
> --------------------------
>
> A sub-interpreter is a logically independent Python process which
> supports inter-interpreter communication built on shared memory and
> channels. Passing of Python objects is supported, but only by copying,
> not by reference. Data can be shared via buffers.

Here's my biased take on the subject:

Interpreters are contexts in which Python runs. They contain
configuration (e.g. the import path) and runtime state (e.g. the set of
imported modules). An interpreter is created at Python startup
(Py_InitializeEx), and you can create/destroy additional ones with
Py_NewInterpreter/Py_EndInterpreter.
This is a long-standing API that is used most notably by mod_wsgi.

Many extension modules and some stdlib modules don't play well with the
existence of multiple interpreters in a process, mainly because they use
process-global state (C static variables) rather than some more granular
scope.
This tends to result in nasty bugs (C-level crashes) when multiple
interpreters are started in parallel (Py_NewInterpreter) or in sequence
(several Py_InitializeEx/Py_FinalizeEx cycles). The bugs are similar in
both cases.

Whether Python interpreters run sequentially or in parallel, having them
work will enable a use case I would like to see: allowing me to call
Python code from wherever I want, without thinking about global state.
Think of calling Python from a utility library that doesn't care about the
rest of the application it's used in. I personally call this "the Lua
use case", because light-weight, worry-free embedding is an area where
Python loses to Lua. (And JS as well—that's a relatively recent
development, but much more worrying.)

The part I have been involved in is moving away from process-global
state. Process-global state can be made to work, but it is much safer to
always default to module-local state (roughly what the Python language's
`global` means), and treat process-global state as exceptions one has to
think through. The API introduced in PEPs 384, 489, 573 (and future
planned ones) aims to make module-local state possible to use, then
later easy to use, and the natural default.
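
As a rough illustration of why module-local state is the safer default, here is a pure-Python analogy. It is only a sketch: the real problem lives in C extension code (static variables versus per-module state), and every name below (`Interpreter`, `GlobalCounter`, `LocalCounter`, `import_module`) is made up for this example and is not CPython API.

```python
# Analogy only: models interpreters as plain objects. None of these names
# are CPython API; they are hypothetical and exist just for illustration.

_GLOBAL_COUNT = 0  # stands in for a C `static` variable: one per process


class Interpreter:
    """Toy stand-in for an interpreter: it owns its own module table."""

    def __init__(self):
        self.modules = {}


class GlobalCounter:
    """'Extension module' that keeps its state in process-global storage."""

    def bump(self):
        global _GLOBAL_COUNT
        _GLOBAL_COUNT += 1
        return _GLOBAL_COUNT


class LocalCounter:
    """'Extension module' that keeps its state on the module object itself."""

    def __init__(self):
        self.count = 0  # module-local state: one per module instance

    def bump(self):
        self.count += 1
        return self.count


def import_module(interp, name, factory):
    """Create the module once per interpreter, like a per-interpreter import."""
    if name not in interp.modules:
        interp.modules[name] = factory()
    return interp.modules[name]


a, b = Interpreter(), Interpreter()

# Process-global state leaks between the two toy interpreters.
leaky_a = import_module(a, "leaky", GlobalCounter)
leaky_b = import_module(b, "leaky", GlobalCounter)
leaky_results = (leaky_a.bump(), leaky_b.bump())

# Module-local state stays isolated per interpreter.
clean_a = import_module(a, "clean", LocalCounter)
clean_b = import_module(b, "clean", LocalCounter)
clean_results = (clean_a.bump(), clean_b.bump())

print(leaky_results)  # (1, 2)
print(clean_results)  # (1, 1)
```

The "leaky" module sees the other interpreter's increment, while the "clean" one does not. In C the shared variable can also hold object pointers, which is where the crashes come from.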

Relatively recently, there has been an effort to expose interpreter creation &
finalization from Python code, and also to allow communication between
them (starting with something rudimentary, sharing buffers). There is
also a push to explore making the GIL per-interpreter, which ties in to
moving away from process-global state. Both are interesting ideas, but
(like banishing global state) they are not the whole motivation for the
changes/additions. It's probably possible to do similar things with
threads or subprocesses, sure, but if these efforts went away, the other
issues would remain.

I am not too fond of the term "sub-interpreters", because it implies
some kind of hierarchy. Of course, if interpreter creation is exposed to
Python, you need some kind of "parent" to start the "child" and get its
result when done. Also, due to some practical issues you might (sadly,
currently) need some notion of "the main interpreter". But ideally, we
can make interpreters entirely independent to allow the "Lua use case".
In the end-game of these efforts, I see Py_NewInterpreter transparently
calling Py_InitializeEx if global state isn't set up yet, and similarly,
Py_EndInterpreter turning the lights off if it's the last one out.
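
The end-game described above can be sketched as a toy model in Python. This is not how CPython behaves today and not its API: `Runtime`, `new_interpreter`, and `end_interpreter` are hypothetical names, standing in loosely for the idea of Py_NewInterpreter setting up global state on demand and Py_EndInterpreter tearing it down when the last interpreter goes away.

```python
# Toy model only: `Runtime`, `new_interpreter`, and `end_interpreter` are
# hypothetical names, not the CPython API.


class Runtime:
    """Process-wide state, set up lazily and torn down with the last user."""

    def __init__(self):
        self.initialized = False
        self.live = set()    # ids of interpreters that are currently alive
        self._next_id = 0
        self.log = []        # records global setup/teardown events

    def new_interpreter(self):
        if not self.initialized:
            # Stands in for Py_NewInterpreter calling Py_InitializeEx on demand.
            self.initialized = True
            self.log.append("init")
        interp_id = self._next_id
        self._next_id += 1
        self.live.add(interp_id)
        return interp_id

    def end_interpreter(self, interp_id):
        self.live.remove(interp_id)  # raises KeyError for an unknown id
        if not self.live:
            # Last one out turns the lights off.
            self.initialized = False
            self.log.append("finalize")


runtime = Runtime()
first = runtime.new_interpreter()
second = runtime.new_interpreter()
runtime.end_interpreter(first)
still_up = runtime.initialized   # one interpreter is still alive here
runtime.end_interpreter(second)

print(runtime.log)  # ['init', 'finalize']
```

Note that no interpreter in this model is "the main one": whichever is created first triggers setup, and whichever ends last triggers teardown.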
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/NLITVUIZQSUJ2F6XDTPMD7IP7FGTMNBA/


--
--Guido van Rossum (python.org/~guido)
Pronouns: he/him (why is my pronoun here?)