[issue32124] Document functions safe to be called before Py_Initialize()

New submission from STINNER Victor victor.stinner@gmail.com:
Follow-up of bpo-32086, bpo-32096 and "[Python-Dev] Python initialization and embedded Python" thread: https://mail.python.org/pipermail/python-dev/2017-November/150605.html
I propose to explicitly list the functions that can safely be called before Py_Initialize(). This safety guarantee must be part of the C API.
Maybe we should even test all of these functions in test_capi using Programs/_testembed, as we did for Py_DecodeLocale() and Py_SetProgramName() in commit 9e87e7776f7ace66baaf7247233afdabd00c2b44 ("pre_initialization_api" test).
The attached PR adds the proposed documentation. It also documents "global configuration variables" like Py_DebugFlag.
----------
assignee: docs@python
components: Documentation
messages: 306894
nosy: docs@python, eric.snow, ncoghlan, vstinner
priority: normal
severity: normal
status: open
title: Document functions safe to be called before Py_Initialize()
versions: Python 3.7
_______________________________________ Python tracker report@bugs.python.org https://bugs.python.org/issue32124 _______________________________________

Change by STINNER Victor victor.stinner@gmail.com:
----------
keywords: +patch
pull_requests: +4474
stage:  -> patch review

Serhiy Storchaka storchaka+cpython@gmail.com added the comment:
Are you sure about PyMem_Malloc() and PyObject_Malloc()? What functions require them? I thought only PyMem_RawMalloc() can be called before Py_Initialize().
I think that for every function that *can* or *should* be called before Py_Initialize(), this should be stated explicitly in that function's own documentation, as is done for Py_SetProgramName() and PyImport_AppendInittab().
----------
nosy: +serhiy.storchaka

STINNER Victor victor.stinner@gmail.com added the comment:
> Are you sure about PyMem_Malloc() and PyObject_Malloc()?
Technically, the pymalloc memory allocator is initialized statically by the compiler, so it is usable from the first instruction of the process.
But maybe we should not suggest that users call them, especially because the allocator can be changed by the PYTHONMALLOC environment variable.
> What functions require them?
No function used to initialize Python requires the PyMem or PyObject allocators. Only the PyMem_Raw allocator is needed.
Py_EncodeLocale() uses it, but this function also uses Python objects (str, bytes), and so Py_EncodeLocale() must not be called before Py_Initialize().
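
To illustrate what "initialized statically" means in the reply above, here is a minimal sketch in plain C. It is not CPython code: the demo_* names are hypothetical and the table simply wraps the C library's malloc() and free(). The point it shows is that an allocator table filled in by a static initializer needs no runtime setup call, so it can be used before the (hypothetical) initialization function has run.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator table. None of these names come from CPython;
   they only illustrate the static-initialization idea. */
typedef struct {
    void *(*malloc_fn)(size_t size);
    void (*free_fn)(void *ptr);
} demo_allocator;

/* Static initializer: the table is filled in when the program is built
   and loaded, so no setup call is needed before the first use. */
static demo_allocator demo_raw = { malloc, free };

/* Tracks whether the hypothetical runtime has been initialized. */
static int demo_initialized = 0;

/* Usable before demo_initialize(): depends only on static data. */
void *demo_raw_malloc(size_t size)
{
    /* Request at least one byte so a NULL result always means failure. */
    return demo_raw.malloc_fn(size ? size : 1);
}

void demo_raw_free(void *ptr)
{
    demo_raw.free_fn(ptr);
}

/* Stand-in for a runtime initialization function. */
void demo_initialize(void)
{
    demo_initialized = 1;
}

int demo_is_initialized(void)
{
    return demo_initialized;
}
```

A caller could allocate with demo_raw_malloc(), release with demo_raw_free(), and only afterwards call demo_initialize(). Whether it is wise to document such use is a separate question, which is the concern raised about PYTHONMALLOC.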
----------

Serhiy Storchaka storchaka+cpython@gmail.com added the comment:
Should PyMem_SetAllocator() and PyObject_SetArenaAllocator() be called before Py_Initialize(), or can they be called after it?
If PyMem_Malloc() and PyObject_Malloc() are not needed for pre-initialization, should we support calling them before Py_Initialize()? There are other functions and macros that can be safely used before Py_Initialize().
----------

STINNER Victor victor.stinner@gmail.com added the comment:
> Should PyMem_SetAllocator() and PyObject_SetArenaAllocator() be called before Py_Initialize(), or can they be called after it?
I'm quite sure that calling PyMem_SetAllocator() or PyObject_SetArenaAllocator() after Py_Initialize() will quickly crash.
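
One plausible reason for such a crash is that memory handed out by the old allocator would later be released through the new one. The sketch below is a hypothetical plain C illustration of an "only before initialization" rule enforced at run time. The guard_* names are invented, and nothing here claims that CPython performs this check.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator table; the names are not CPython APIs. */
typedef struct {
    void *(*malloc_fn)(size_t size);
    void (*free_fn)(void *ptr);
} guard_allocator;

static guard_allocator guard_current = { malloc, free };
static int guard_initialized = 0;
static size_t guard_calls = 0;

/* Example replacement allocator that counts allocations. */
void *guard_counting_malloc(size_t size)
{
    guard_calls++;
    return malloc(size ? size : 1);
}

size_t guard_malloc_calls(void)
{
    return guard_calls;
}

/* Replace the allocator. Returns 0 on success, or -1 if the runtime is
   already initialized: blocks from the old allocator may still be live,
   and freeing them through a different one is undefined behaviour. */
int guard_set_allocator(const guard_allocator *alloc)
{
    if (guard_initialized || alloc == NULL) {
        return -1;
    }
    guard_current = *alloc;
    return 0;
}

void *guard_malloc(size_t size)
{
    return guard_current.malloc_fn(size);
}

void guard_free(void *ptr)
{
    guard_current.free_fn(ptr);
}

/* Stand-in for a runtime initialization function. */
void guard_initialize(void)
{
    guard_initialized = 1;
}
```

With this design, guard_set_allocator() succeeds before guard_initialize() and returns -1 afterwards, so a misuse is reported to the caller instead of corrupting the heap later.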
> If PyMem_Malloc() and PyObject_Malloc() are not needed for pre-initialization, should we support calling them before Py_Initialize()?
We don't have to support them.
OK, I will remove them from the pre-init documentation.
----------

Serhiy Storchaka storchaka+cpython@gmail.com added the comment:
> I'm quite sure that calling PyMem_SetAllocator() or PyObject_SetArenaAllocator() after Py_Initialize() will quickly crash.
Then document this explicitly like for other functions that *should* be called before Py_Initialize() if called at all.
----------

STINNER Victor victor.stinner@gmail.com added the comment:
New changeset 84c4b1938fade2b425ac906730beabd413de094d by Victor Stinner in branch 'master': bpo-32124: Document C functions safe before init (#4540) https://github.com/python/cpython/commit/84c4b1938fade2b425ac906730beabd413d...
----------

STINNER Victor victor.stinner@gmail.com added the comment:
> Then document this explicitly like for other functions that *should* be called before Py_Initialize() if called at all.
I agree that it would be even better to document if a function must not be called after Py_Initialize().
*But* I'm not sure of what I wrote, I have to check the code, and maybe even test manually to "see what happens" (ensure that it works) :-)
So I decided to push my first PR, and will work on a second PR later.
----------

Eric Snow ericsnowcurrently@gmail.com added the comment:
I've left a review (writing it as you merged the PR).
My main concern is that we not promise more than we must. Every pre-init function or variable we promise to embedders represents global state that is hard to get rid of. It also entrenches pre-init API and state that we're aiming to deprecate (via PEP 432).
----------

Nick Coghlan ncoghlan@gmail.com added the comment:
Key point to note regarding PEP 432: at least personally, I'm not actually aiming to deprecate the legacy embedding API.
Instead, I'm just aiming to eventually stop *adding* to it, with new config structs replacing the current ad hoc mix of pre-init function calls, C globals, environment variables, and filesystem state.
That means I'm quite willing to accept maintaining compatibility for applications using the current single phase initialisation approach as a design constraint for the PEP.
We have a similar constraint in place for extension modules: even though any *new* features we introduce are likely to be dependent on switching over to PEP 489's multi-phase initialisation APIs, we still ensure that single-phase initialisation continues working for existing modules.
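
As a rough illustration of the "config struct" style described above, here is a minimal sketch in plain C. Everything in it is hypothetical: the cfg_* names and fields are invented for this example and are not taken from PEP 432 or from CPython. It only shows the general shape of passing one explicit struct to an initialization function instead of setting process-wide state beforehand.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical configuration struct; the fields are invented for
   illustration and are not taken from PEP 432. */
typedef struct {
    const char *program_name;
    int debug;
} cfg_config;

/* Hypothetical runtime state that owns a copy of the settings. */
typedef struct {
    char program_name[64];
    int debug;
    int initialized;
} cfg_runtime;

/* Fill in defaults so callers only override what they need. */
void cfg_config_defaults(cfg_config *config)
{
    config->program_name = "python";
    config->debug = 0;
}

/* Initialize the runtime from an explicit config.
   Returns 0 on success, -1 on invalid input. */
int cfg_initialize(cfg_runtime *rt, const cfg_config *config)
{
    if (rt == NULL || config == NULL || config->program_name == NULL) {
        return -1;
    }
    if (strlen(config->program_name) >= sizeof(rt->program_name)) {
        return -1;
    }
    strcpy(rt->program_name, config->program_name);
    rt->debug = config->debug;
    rt->initialized = 1;
    return 0;
}
```

In this style all settings travel through one argument, so no function has to be promised as callable before initialization, and the state lives in the runtime object instead of in C globals.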
----------

Change by STINNER Victor victor.stinner@gmail.com:
----------
pull_requests: +4536

Change by STINNER Victor victor.stinner@gmail.com:
----------
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed
participants (4)
- Eric Snow
- Nick Coghlan
- Serhiy Storchaka
- STINNER Victor