Before I dive in, I'll say that I'd really like to hear Nick's opinion
on all this. :)
On Thu, Aug 2, 2018 at 9:59 AM Victor Stinner wrote:
2018-08-02 1:18 GMT+02:00 Eric Snow:
The "core" config is basically the config for the runtime. In fact, PEP 432 renamed "core" to "runtime". Please keep the firm distinction between the runtime and the (main) interpreter.
There is already something called _PyRuntime but it's shared between all interpreters.
_PyRuntime is a static global of type PyRuntimeState. It is where I consolidated (nearly) all the global runtime state last September.
_PyCoreConfig is already *per* interpreter.
This was done as part of the PEP 432 implementation, which I landed during PyCon 2017. If PyRuntimeState had existed already I'm sure it would be there instead.
Would you mind elaborating on what you mean by the "main interpreter"? I don't see anything obvious in the current code about what a "main interpreter" is. Technically, I don't see anything like that.
The main interpreter is the first one created (during runtime initialization). It is special for a variety of reasons. Here are the ones I could think of:

1. the "main" thread will always belong to the main interpreter since it is the first PyThreadState created
2. runtime initialization uses the main interpreter exclusively
3. the first phase of runtime initialization (pre-initialization) ends with the main interpreter being *partially* configured
4. during the second phase (initializing), the partially-configured main interpreter facilitates the use of most of the C-API and may be used by embedders
   * this is the only time that an interpreter may be used in this way, and it only happens with the main interpreter
5. runtime finalization takes place using the main interpreter
6. the main interpreter is the last one destroyed during finalization
7. the REPL runs only in the main interpreter
8. the Python CLI is run in the main interpreter (i.e. in its __main__ module)
9. the main interpreter cannot be destroyed (except during finalization)
10. in Python code the main interpreter will always exist
11. it is the parent of all subinterpreters created in Python code (via PEP 554)
12. signals are handled only in the main interpreter
13. all single-threaded Python code is run in the main interpreter

Note that there isn't anything special to the interpreter itself, but rather in where and how it's used. However, that matters and the runtime needs to treat it specially. I expect all this isn't well-documented because it is relevant to very few people.
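As a rough illustration of points 6 and 9 (and of "the first one created"), here is a minimal, self-contained C model. It does not use the real CPython structs or API: `runtime_t`, `interp_t`, `runtime_add`, `is_main`, and `runtime_remove` are all hypothetical names invented for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins, NOT the real PyInterpreterState/PyRuntimeState. */
typedef struct interp {
    int id;
    struct interp *next;
} interp_t;

typedef struct {
    interp_t *main;    /* the first interpreter registered */
    interp_t *head;    /* linked list of all live interpreters */
    bool finalizing;   /* set once runtime finalization has started */
    int next_id;
} runtime_t;

/* Register a caller-owned interpreter; the first one becomes "main". */
void runtime_add(runtime_t *rt, interp_t *it)
{
    assert(rt != NULL && it != NULL);
    it->id = rt->next_id++;
    it->next = rt->head;
    rt->head = it;
    if (rt->main == NULL) {
        rt->main = it;
    }
}

bool is_main(const runtime_t *rt, const interp_t *it)
{
    return rt->main == it;
}

/* Unlink an interpreter from the runtime.  Returns false if refused:
 * the main interpreter may only be removed during finalization, and
 * only once every other interpreter is gone. */
bool runtime_remove(runtime_t *rt, interp_t *it)
{
    bool others_alive = (rt->head != it) || (it->next != NULL);
    if (is_main(rt, it) && (!rt->finalizing || others_alive)) {
        return false;
    }
    for (interp_t **p = &rt->head; *p != NULL; p = &(*p)->next) {
        if (*p == it) {
            *p = it->next;
            it->next = NULL;
            if (rt->main == it) {
                rt->main = NULL;
            }
            return true;
        }
    }
    return false;  /* not registered with this runtime */
}
```

In this model nothing on `interp_t` marks an interpreter as "main"; the runtime alone records which one it is, which matches the note that the special treatment lies in where and how the interpreter is used.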
I'm still not convinced that we need _PyMainInterpreterConfig:
Let's step back a moment and consider the course of events:

1. PEP 432 was created nearly 6 years ago to address the tangle that runtime initialization had become, with the intent of helping both the CPython maintainers and embedders
2. Nick did some re-organization around then (e.g. factoring out pylifecycle.c) to facilitate an implementation of the PEP
3. Nick implemented PEP 432, with a plan to merge it as a *private* API regardless of whether or not the PEP was accepted (with general consensus that doing so was a good idea)
   * see https://bitbucket.org/ncoghlan/cpython_sandbox/branch/pep432_modular_bootstr...
   * landing the private API would allow us to iron out the details of the PEP
   * work happened in spurts in 2013, 2015, and 2016; I kept poking Nick because the implementation was a big blocker for my multi-core/subinterpreters project
4. leading up to (and at) PyCon 2017, I forked Nick's branch, moved it to github, rebased it onto master, got it working again, created a PR, and finally landed it
5. since then the implementation has changed a bunch (due to Victor's much appreciated efforts) and has diverged from the PEP
   * notably it's unclear that the code (especially pymain) strictly conforms to the phases in the PEP

At this point the PEP is out of date. There have been several mailing list threads (all python-dev, IIRC) and some BPO issues where Victor solicited clarification or expressed a desire to change things and Nick gave feedback. None of that made it into the PEP. :( Consequently the PEP is inconsistent with the actual target. Furthermore, as was intended, we've learned of a few ways that the PEP could be improved. We *really* need to get the PEP updated so we can be sure everyone has all the info.
Regarding the justification for the "main interpreter" config, the implementation has diverged from the original intent of the PEP:

* the core/runtime config was meant to hold the minimal data needed to bootstrap/initialize the basic (limited) functionality of the C-API, including a restricted main interpreter
  + the struct members were strictly C plain-old-types since using PyObject would require the runtime to already be (partially) initialized
  + in the last year a lot of data has been added to this config; I don't know how much is strictly necessary to bootstrap the runtime (end of phase 1) and how much could be dealt with in phase 2
* the "main interpreter" config was meant to hold all the config needed to finish initializing the runtime (end of phase 2)
  + the struct members were mostly PyObject* (possible since most builtin types are available at this point)
  + the PEP proposes a bunch more fields than the implementation has; we planned on adding them a few at a time
_PyCoreConfig contains the same information. Is it really worth it to duplicate all of _PyCoreConfig (more than 36 fields) in _PyMainInterpreterConfig? _PyMainInterpreterConfig adds a third copy of many parameters: another opportunity to introduce an inconsistency.
TBH, the PEP *should* have a clear answer for your question here, Victor. It has some explanation, but clearly it is incomplete (hence this continuing email thread).

The duplication is partly a consequence of what has happened in the last year: a bunch of fields were added to the core config that were not in the PEP. However, note the key differences between the two structs:

* core/runtime config
  + minimal
  + simple C fields
  + meant for embedders/pymain to bootstrap a limited runtime
  + not really meant to be used after calling Py_InitializeRuntime (AKA Py_InitializeCore)
* main interpreter config
  + includes everything needed to finish full runtime initialization
  + has PyObject* fields
  + meant for embedders/pymain to finish initializing the runtime
  + not really meant to be used after calling Py_ConfigureMainInterpreter (except when initializing a subinterpreter)

Originally there wasn't much overlap. Furthermore, both of them are kept around so that, via the C-API (or directly in the CPython impl.), we could expose what data was used to initialize the runtime. This fills much the same role as the existing global Py_* variables.

The duplication is due to there being C and PyObject versions. It is for the sake of embedders (and a little bit of sanity). The big reason why it shouldn't be a problem is that PyMainInterpreterConfig is generated directly from PyRuntimeConfig (AKA PyCoreConfig) and only *after* we've used the runtime config to bootstrap the limited runtime (after which it shouldn't be modified ever). So there's no risk of inconsistency, right? Perhaps it would make sense to only keep a const copy of both, to avoid modification?
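The "generate the main config from the frozen runtime config" idea can be sketched in plain C. This is only a model under stated assumptions: every name below (`core_config_t`, `main_config_t`, `runtime_bootstrap`, `runtime_configure_main`, and the individual fields) is hypothetical, and an owned heap string stands in for what would be a PyObject* field in the real main interpreter config.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical phase-1 config: plain C fields only. */
typedef struct {
    int use_hash_seed;
    unsigned long hash_seed;
    const char *program_name;   /* borrowed; the caller keeps it alive */
} core_config_t;

/* Hypothetical phase-2 config.  The owned string stands in for a
 * PyObject*-based field, which needs a partly initialized runtime. */
typedef struct {
    char *program_name;
    bool install_signal_handlers;
} main_config_t;

typedef struct {
    core_config_t core;     /* private copy taken at bootstrap */
    bool core_frozen;
    main_config_t main_cfg;
    bool main_ready;
} runtime_t;

/* Phase 1: copy the caller's config.  Later changes to *cfg do not
 * reach the runtime, and a second bootstrap is rejected. */
int runtime_bootstrap(runtime_t *rt, const core_config_t *cfg)
{
    if (rt->core_frozen) {
        return -1;
    }
    rt->core = *cfg;
    rt->core_frozen = true;
    return 0;
}

/* Read-only view of the frozen config (NULL before bootstrap). */
const core_config_t *runtime_core_config(const runtime_t *rt)
{
    return rt->core_frozen ? &rt->core : NULL;
}

/* Phase 2: derive the main config from the frozen core config only,
 * so the two cannot disagree. */
int runtime_configure_main(runtime_t *rt)
{
    if (!rt->core_frozen || rt->main_ready || rt->core.program_name == NULL) {
        return -1;
    }
    size_t n = strlen(rt->core.program_name) + 1;
    char *copy = malloc(n);
    if (copy == NULL) {
        return -1;
    }
    memcpy(copy, rt->core.program_name, n);
    rt->main_cfg.program_name = copy;
    rt->main_cfg.install_signal_handlers = true;  /* illustrative value */
    rt->main_ready = true;
    return 0;
}

/* Release the derived config and reset the model. */
void runtime_clear(runtime_t *rt)
{
    free(rt->main_cfg.program_name);
    memset(rt, 0, sizeof(*rt));
}
```

Handing out only a `const core_config_t *` is one way to approximate the "keep a const copy" suggestion; it discourages modification but, being C, cannot fully prevent it.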
Right now, an interpreter contains both: core and main configurations...
As noted above, the core/runtime config should probably be on PyRuntimeState instead. Regarding the "main" config, PyMainInterpreterConfig probably makes more sense as one of the following:

1. on PyRuntimeState, like the core/runtime config (since it's a one-off)
2. on PyInterpreterState, like now, but set to NULL on all but the main interpreter (which would allow us to distinguish the main interpreter from the rest)

Both would require PyInterpreterConfig from PEP 432, but expanded to cover all config that might be unique to an interpreter.

Also, conceptually there's a difference between the-config-used-to-finish-runtime-init and the config-used-to-initialize-an-interpreter (including the main interpreter). In fact, PEP 432 does include a PyInterpreterConfig. However, in the current implementation, PyMainInterpreterConfig fills that role exclusively, which is confusing since we use the "main interpreter" config to initialize all interpreters (not just the main one).

So here's what might make sense to do:

1. rename "core" to "runtime" (to reduce confusion)
2. move PyInterpreterState.runtime_config to PyRuntimeState.config
   + prevent modification after Py_InitializeRuntime() is called (e.g. keep a const copy)?
3. move PyInterpreterState.config to PyRuntimeConfig.main_config
   + prevent modification after Py_ConfigureMainInterpreter() is called (e.g. keep a const copy)?
   + keep the PyMainInterpreterConfig and Py_ConfigureMainInterpreter names
4. add PyInterpreterConfig with only the parts of PyMainInterpreterConfig needed to initialize any interpreter
   + add Py_NewInterpreterEx(PyInterpreterConfig) to allow explicitly passing a config?
5. add PyInterpreterState.config (type PyInterpreterConfig) to record the config used to initialize that interpreter
   + prevent modification after the interpreter is initialized (e.g. keep a const copy)?
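For illustration, the layout suggested in steps 2 through 5 might look roughly like the following self-contained C model. None of this is real CPython code: the struct names, the fields (`isolated`, `allow_threads`, `install_signal_handlers`), and the rule that a NULL config falls back to values taken from the main config are all assumptions made for the sketch, and `new_interpreter_ex` merely mimics the shape of the suggested Py_NewInterpreterEx().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical configs; the fields are placeholders. */
typedef struct { int isolated; } runtime_config_t;
typedef struct { int install_signal_handlers; int allow_threads; } main_config_t;
typedef struct { int allow_threads; } interp_config_t;

typedef struct interp_state {
    interp_config_t config;          /* step 5: config this interpreter used */
    struct interp_state *next;
} interp_state_t;

typedef struct {
    runtime_config_t config;         /* step 2: runtime config lives here */
    main_config_t main_config;       /* step 3: one-off main config */
    interp_state_t *interpreters;    /* all live interpreters */
    interp_state_t *main_interp;     /* first interpreter created */
} runtime_state_t;

/* Copy both configs into the runtime state and start with no interpreters. */
void runtime_init(runtime_state_t *rt, const runtime_config_t *rc,
                  const main_config_t *mc)
{
    rt->config = *rc;
    rt->main_config = *mc;
    rt->interpreters = NULL;
    rt->main_interp = NULL;
}

/* Step 4: the subset of the main config that any interpreter needs. */
interp_config_t interp_config_from_main(const main_config_t *mc)
{
    interp_config_t ic = { mc->allow_threads };
    return ic;
}

/* Step 4: create an interpreter from an explicit config.  Passing NULL
 * falls back to the subset of the main config (an assumption here). */
interp_state_t *new_interpreter_ex(runtime_state_t *rt,
                                   const interp_config_t *cfg)
{
    interp_state_t *it = calloc(1, sizeof(*it));
    if (it == NULL) {
        return NULL;
    }
    it->config = cfg ? *cfg : interp_config_from_main(&rt->main_config);
    it->next = rt->interpreters;
    rt->interpreters = it;
    if (rt->main_interp == NULL) {
        rt->main_interp = it;        /* first one created is "main" */
    }
    return it;
}

/* Free every interpreter and clear the runtime's references to them. */
void runtime_free_interpreters(runtime_state_t *rt)
{
    while (rt->interpreters != NULL) {
        interp_state_t *next = rt->interpreters->next;
        free(rt->interpreters);
        rt->interpreters = next;
    }
    rt->main_interp = NULL;
}
```

With this split, the runtime-wide configs are stored once on the runtime state, while each interpreter records only the config it was actually initialized with.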
I propose to *remove* _PyMainInterpreterConfig and rename _PyCoreConfig to _PyInterpreterConfig. I would also propose merging Py_Initialize() back into a single step instead of the current two steps (core step + main step).
So you are not in favor of PEP 432 then. :)

-eric