Calling PyState_AddModule in module init?

Hello, currently it's not necessary to call PyState_AddModule in a module's PyInit_* function. For single-phase init modules, Python calls it automatically. However, the documentation of PyState_FindModule suggests that you need to call PyState_AddModule.
I wrote what I think is the expected behavior in PR 16101, but I'm not sure that's actually it. If anyone who knows the original intent is reading this, could you chime in? :)

On Sat, 14 Sep 2019 at 01:59, Petr Viktorin <encukou@gmail.com> wrote:
Hello, currently it's not necessary to call PyState_AddModule in a module's PyInit_* function. For single-phase init modules, Python calls it automatically. However, the documentation of PyState_FindModule suggests that you need to call PyState_AddModule.
I wrote what I think is the expected behavior in [PR 16101], but I'm not sure that's actually it. If anyone who knows the original intent is reading this, could you chime in? :)
Poking around with the git commit annotations brought me to https://bugs.python.org/issue15042
So the issue is that the import machinery calls PyState_AddModule *after* the init function returns. That means that if you want to be able to use PyState_FindModule *from* the init function (directly or indirectly), then you need to call PyState_AddModule explicitly before the calls that need it.
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

So the issue is that the import machinery calls PyState_AddModule after the init function returns.
There's a bit more to it. It seems that the fact that PyState_AddModule is called after the init function is an implementation detail of the default import mechanism of CPython.
For instance, builtins.__import__ can be replaced with an alternate import mechanism that does not use PyState_AddModule. Then, calling PyImport_Import twice on a C extension will execute the init function twice. If this is a C extension that holds module state, then you wouldn't want that init function to be called twice.
Therefore, a call to PyState_AddModule should always be added to a C extension that initializes module state in the init function. Then, PyState_FindModule should be used to avoid the re-initialization.
- Eddie

On 9/17/19 8:09 AM, Eddie Elizondo wrote:
So the issue is that the import machinery calls PyState_AddModule after the init function returns.
There's a bit more to it. It seems that the fact that PyState_AddModule is called after the init function is an implementation detail of the default import mechanism of CPython.
For instance, builtins.__import__ can be replaced with an alternate import mechanism that does not use PyState_AddModule. Then, calling PyImport_Import twice on a C extension will execute the init function twice. If this is a C extension that holds module state, then you wouldn't want that init function to be called twice.
Therefore, a call to PyState_AddModule should always be added to a C extension that initializes module state in the init function. Then, PyState_FindModule should be used to avoid the re-initialization.
My call would rather be to require alternate import mechanisms to call PyState_AddModule. That way, any module that works with CPython would work with the other importers. There are far fewer custom import mechanisms than custom modules.

On Wed, 18 Sep 2019 at 18:01, Petr Viktorin <encukou@gmail.com> wrote:
My call would rather be to require alternate import mechanisms to call PyState_AddModule. That way, any module that works with CPython would work with the other importers. There are far fewer custom import mechanisms than custom modules.
Right, I think the documentation you've already added is correct on that front (all extension module import implementations should be automatically calling PyState_AddModule), but we should also mention the reasons why a module might want to call it explicitly:
- in order to use PyState_FindModule (directly or indirectly) from within the extension module's own init function
- in order to work around an existing alternate extension module import function that doesn't call it automatically
At least some standard library extension modules do the latter (they call PyState_AddModule as the last thing they do before returning).
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 2019-09-20 12:29, Nick Coghlan wrote:
Right, I think the documentation you've already added is correct on that front (all extension module import implementations should be automatically calling PyState_AddModule), but we should also mention the reasons why a module might want to call it explicitly:
- in order to use PyState_FindModule (directly or indirectly) from within the extension module's own init function
+1
- in order to work around an existing alternate extension module import function that doesn't call it automatically
At least some standard library extension modules do the latter (they call PyState_AddModule as the last thing they do before returning).
I don't think it's useful to point out the latter in the docs. If you're in that situation, you know it already.
participants (3)
- Eddie Elizondo
- Nick Coghlan
- Petr Viktorin