Thanks for the feedback, Petr! Responses inline below. -eric On Wed, Mar 9, 2022 at 10:58 AM Petr Viktorin <encukou@gmail.com> wrote:
This PEP definitely makes per-interpreter GIL sound possible :)
Oh good. :)
PEP: 684 Title: A Per-Interpreter GIL Author: Eric Snow <ericsnowcurrently@gmail.com> Discussions-To: python-dev@python.org Status: Draft Type: Standards Track Content-Type: text/x-rst
This iteration of the PEP should also have `Requires: 683` (Immortal Objects).
+1
Most of the effort needed for a per-interpreter GIL has benefits that make those tasks worth doing anyway:
* makes multiple-interpreter behavior more reliable * has led to fixes for long-standing runtime bugs that otherwise hadn't been prioritized > * has been exposing (and inspiring fixes for) previously unknown runtime bugs * has driven cleaner runtime initialization (:pep:`432`, :pep:`587`) * has driven cleaner and more complete runtime finalization * led to structural layering of the C-API (e.g. ``Include/internal``) * also see `Benefits to Consolidation`_ below
Do you want to dig up some bpo examples, to make these more convincing to the casual reader?
Heh, the casual reader isn't really my target audience. :) I actually have a stockpile of links but left them all out until they were needed. Would the decision-makers benefit from the links? I'm trying to avoid adding to the already sizeable clutter in this PEP. :) I'll add some links in if you think it matters.
Furthermore, much of that work benefits other CPython-related projects:
* performance improvements ("faster-cpython") * pre-fork application deployment (e.g. Instagram)
Maybe say “e.g. with Instagram's Cinder” – both the household name and the project you can link to?
+1 Note that Instagram isn't exactly using Cinder. I'll have to check if Cinder uses the pre-fork model.
* extension module isolation (see :pep:`630`, etc.) * embedding CPython
A lot of these points are duplicated in "Benefits to Consolidation" list below, maybe there'd be, ehm, benefits to consolidating them?
There shouldn't be any direct overlap. FWIW, the whole "Extra Context" section is essentially a separate PEP that I inlined (with the caveat that it really isn't worth its own PEP). I'm still considering yanking it, so the above list should stand on its own.
PEP 554 -------
Please spell out "PEP 554 (Multiple Interpreters in the Stdlib)", for people who don't remember the magic numbers but want to skim the table of contents.
+1
This list doesn't render correctly in ReST, you need blank lines everywhere. There are more cases like this below.
Hmm, I had blank lines and the PEP editor told me I needed to remove them.
[...]> Per-Interpreter State
---------------------
The following runtime state will be moved to ``PyInterpreterState``:
* all global objects that are not safely shareable (fully immutable) * the GIL * mutable, currently protected by the GIL
Spelling out “mutable state” in these lists would make this clearer, since “state” isn't elided from all the points.
+1
* mutable, currently protected by some other per-interpreter lock * mutable, may be used independently in different interpreters
This includes extension modules (with multi-phase init), right?
Yep.
The following state will not be moved:
* global objects that are safely shareable, if any * immutable, often ``const`` * treated as immutable
Do you have an example for this?
Strings (PyUnicodeObject) actually cache some info, making them not strictly immutable, but they are close enough to be treated as such. I'll add a note to the PEP.
* related to CPython's ``main()`` execution * related to the REPL
Would “only used by” work instead of “related to”?
Sure.
* set during runtime init, then treated as immutable
`main()`, REPL and runtime init look like special cases of functionality that only runs in one interpreter. If it's so, maybe generalize this?
+1
* ``_PyInterpreterConfig`` * ``_Py_NewInterpreter()`` (as ``Py_NewInterpreterEx()``)
Since the API is not documented (and _PyInterpreterConfig is not even in main yet!), it would be good to sketch out the docs (intended behavior) here.
+1
The following fields will be added to ``PyInterpreterConfig``:
* ``own_gil`` - (bool) create a new interpreter lock (instead of sharing with the main interpreter)
As a user of the API, what should I consider when setting this flag? Would the GIL be shared with the *parent* interpreter or the main one?
The GIL would be shared with the main interpreter. I state that there but it looks like I wasn' clear enough.
What are the restrictions/implications of this flag?
Good point. I'll add a brief explanation of why you would want to keep sharing the GIL (e.g. the status quo) and what is different if you don't.
* ``strict_extensions`` - fail import in this interpreter for incompatible extensions (see `Restricting Extension Modules`_)
I'm not sure about including a workaround flag in the structure. Since the Python API will get a context manager for this, maybe the C API should get a function to set/reset it instead of this flag?
The flag is necessary if we want to be able to preserve the current behavior of the existing API, Py_NewInterpreter(), which we do.
Restricting Extension Modules -----------------------------
Extension modules have many of the same problems as the runtime when state is stored in global variables. :pep:`630` covers all the details of what extensions must do to support isolation, and thus safely run in multiple interpreters at once. This includes dealing with their globals.
Extension modules that do not implement isolation will only run in the main interpreter. In all other interpreters, the import will raise ``ImportError``. This will be done through ``importlib._bootstrap_external.ExtensionFileLoader``.
“Main interpreter” should be defined.
+1
(Or maybe the term should be avoided instead -- always having to spell out “interpreter started by Py_Initialize rather than Py_NewInterpreter” might push us toward finding ways to avoid the special case...)
We (me, Nick, Victor, others) have considered this in the past and have concluded that having a distinct main interpreter is valuable. That topic is a bit out of scope for this PEP though.
We will work with popular extensions to help them support use in multiple interpreters. This may involve adding to CPython's public C-API, which we will address on a case-by-case basis.
Extension Module Compatibility ''''''''''''''''''''''''''''''
As noted in `Extension Modules`_, many extensions work fine in multiple interpreters without needing any changes. The import system will still fail if such a module doesn't explicitly indicate support. At first, not many extension modules will, so this is a potential source of frustration.
We will address this by adding a context manager to temporarily disable the check on multiple interpreter support: ``importlib.util.allow_all_extensions()``. >
I'd prefer a more dangerous-sounding name, to guide code readers (and autocomplete users) toward checking the warning in the docs.
+1 I had meant to explicitly ask for suggestions for a better name. :)
Now we get to the break in compatibility mentioned above. Some extensions are safe under multiple interpreters, even though they haven't indicated that. Unfortunately, there is no reliable way for the import system to infer that such an extension is safe, so importing them will still fail. This case is addressed in `Extension Module Compatibility`_ below.
Extensions that use multi-phase init should already be compatible with multiple interpreters. Multi-phase init itself is the flag that indicates this.
Correct and ExtensionFileLoader will use that if PyInterpreterConfig.strict_extensions is set, regardless of whether or not we need a second extension module indicator for I-said-I-was-isolated-but-now-I-really-mean-it (per-interpreter GIL).
But they might not be compatible with *per-interpreter GIL*. I don't like how that's conflated with multiple interpreters here.
Hmm, I suppose in my mind they *have* been the same thing. :)
For example, extension modules can currently support multiple interpreters, but rely on the GIL to protect calls to a non-threadsafe library, access shared memory, etc. As an example, the PEP 630 “opt-out” is not thread-safe.
So, you are saying that some mutli-phase init extensions may still be relying on the GIL as a lock for some shared state. In the case of your "opt-out" example, there is a possible (albeit super unlikely) race on "loaded". So such an extension needs to be able to separately opt in to being used without a GIL between interpreters. Is all that correct? Out of curiosity, do you have any examples of extensions that implement multi-phase init but need to opt out (like in your example)? Is it only the case where the maintainer is in the process of isolating the module, so the opt-out is temporary? Aside from the unsafe-flag-to-indicate-not-isolated case, do you know of any other examples where a module is safe for use between interpreters but still relies on the shared GIL? I'm struggling to imagine such a scenario, but where they don't also opt out of multi-interpreter support already. FWIW, my assumption is that, if an extension has been made isolated enough for use between multiple interpreters, then it is extremely likely that it is also isolated enough to use without a GIL shared between the interpreters.
It seems to me that there should be a separate flag (slot) to indicate support for per-interpreter GIL, and the `strict_extensions` bit should work with that.
I think I see what you are saying. My concern is that anything beyond the default settings is an obstacle for extension maintainers, so opt-in is especially painful if it ends up being the common case. Of course, it may be unavoidable in the end. While we may end up needing a flag to indicate yes-I'm-isolated-but-not-that-isolated, the following alternative came to mind: * multi-phase init extensions should be expected to be isolated, thus compatible with use in multiple interpreters (already true) * multi-phase init extensions should be expected to be fully isolated, thus compatible with per-interpreter GIL * there would be a new module def slot that a multi-phase init extension can use to explicitly opt out * the ExtensionFileLoader would enforce loading the module only once in the process (and use a dedicated granular lock to prevent races) Instead of using its own static "loaded" variable, the module in your opt-out example would use this new slot. To me, really-truly-fully-isolated is the sensible long-term default for multi-phase init. Most maintainers will implement multi-phase init, using PEP 630 to get isolated enough. We can avoid the extra step for the common case. So our future selves would be much happier if we go with an explicit opt-out now. :) This follows my earlier assumption that few extensions will be safe in multi-interpreter but not per-interpreter GIL. FWIW, I was going to say perhaps we could get away with treating the vast majority of extensions as already safe in multiple interpreters, to avoid requiring extensions to implement multi-phase init. However, I can already think of a number of relatively common cases where that isn't true (e.g. static types). :) Plus, multi-phase init is such a good thing and doesn't require that much effort (especially if we provide an opt-out slot for multi-interpreter support). Per-interpreter GIL would be a pretty good carrot. :)
[...]
How to Teach This =================
This is an advanced feature for users of the C-API. There is no expectation that this will be taught. Oh, I'm afraid this will need some docs related to making sure an extension is compatible with per-interpreter GIL. I'd rather not repeat my mistake of hand-wavingly noting "All modules created using multi-phase initialization are expected to support sub-interpreters" in the docs, and only writing PEP 630 much later.
good point Is PEP 630 the de facto documentation for this sort of thing or is there something on docs.python.org?
[...]
References ==========
Related:
* :pep:`384` * :pep:`432` * :pep:`489` * :pep:`554` * :pep:`573` * :pep:`587` * :pep:`630` * :pep:`683` * :pep:`3121`
Please write out the titles here.
will do