[Python-Dev] Can we split PEP 489 (extension module init) ?

Petr Viktorin encukou at gmail.com
Fri Aug 10 07:48:56 EDT 2018


On 08/10/18 12:21, Stefan Behnel wrote:
> Petr Viktorin wrote on 10.08.2018 at 11:51:
>> On 08/10/18 11:21, Stefan Behnel wrote:
>>> coming back to PEP 489 [1], the multi-phase extension module
>>> initialization. We originally designed it as an "all or nothing" feature,
>>> but as it turns out, the "all" part is so difficult to achieve that most
>>> potential users end up with "nothing". So, my question is: could we split
>>> it up so that projects can get at least the main advantages: module spec
>>> and unicode module naming.
>>>
>>> PEP 489 is a great protocol in the sense that it allows extension modules
>>> to set themselves up in the same way that Python modules do: load, create
>>> module, execute module code. Without it, creating the module and executing
>>> its code are a single step that is outside of the control of CPython, which
>>> prevents the module from knowing its metadata and CPython from knowing
>>> up-front what the module will actually be.
>>>
>>> Now, the problem with PEP 489 is that it requires support for reloading and
>>> subinterpreters at the same time [2]. For this, extension modules must
>>> essentially be free of static global state, which comprises both the module
>>> code itself and any external native libraries that it uses. That is
>>> somewhere between difficult and impossible to achieve. PEP 573 [3] explains
>>> some of the reasons, and lists solutions for some of the issues, but cannot
>>> solve the general problem that some extension modules simply cannot get rid
>>> of their global state, and are therefore inherently incompatible with
>>> reloading and subinterpreters.
>>
>> Are there any issues that aren't explained in PEP 573?
>> I don't think Python modules should be *inherently* incompatible with
>> subinterpreters. Static global state is perhaps unavoidable in some cases,
>> but IMO it should be managed when it's exposed to Python.
>> If there are issues not in the PEPs, I'd like to collect the concrete cases
>> in some document.
> 
> There's always the case where an external native library simply isn't
> re-entrant and/or requires configuration to be global. I know, there's
> static linking and there are even ways to load an external shared library
> multiple times, but that's just adding to the difficulties. Let's just
> accept that some things are not easy enough to make for a good requirement.

For that case, I think the right thing to do is for the module to raise 
an exception when it's being initialized for the second time, or when 
the underlying library would be initialized for the second time.
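
A minimal sketch of that guard, written in Python as a stand-in for an
extension's create/exec steps. The names OnceOnlyLoader, load_instance and
demo_ext are hypothetical, and the class attribute only models what would be
a static C variable in a real extension module; this is an illustration of
the idea, not how any particular module does it.

```python
import importlib.abc
import importlib.util


class OnceOnlyLoader(importlib.abc.Loader):
    """Hypothetical loader standing in for an extension's create/exec steps."""

    # Models process-wide static state of a non-re-entrant library
    # (in a real extension this would be a static C variable).
    _library_initialized = False

    def create_module(self, spec):
        # Returning None lets the import machinery create a plain module.
        return None

    def exec_module(self, module):
        if OnceOnlyLoader._library_initialized:
            # Second initialization: fail loudly instead of corrupting state.
            raise ImportError(
                f"{module.__name__}: underlying library is already initialized "
                "and cannot be initialized a second time"
            )
        OnceOnlyLoader._library_initialized = True
        module.ready = True


def load_instance(name="demo_ext"):
    """Create and execute a fresh module object, as a second import would."""
    spec = importlib.util.spec_from_loader(name, OnceOnlyLoader())
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module
```

The first call to load_instance() succeeds; any later call raises
ImportError, so the failure is visible at import time rather than showing up
later as corrupted library state.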

"Avoid static global state" is a good rule of thumb for supporting 
subinterpreters nicely, but other strategies are possible.
If an underlying library just expects to be initialized once, and then 
be used from several modules, the Python wrapper should ensure that 
(using global state, most likely). Other ways of handling things should 
be possible, depending on the underlying library.

>>> I would like the requirement in [2] to be lifted in PEP 489, to make the
>>> main features of the PEP generally available to all extension modules.
>>>
>>> The question is then how to opt out of the subinterpreter support. The PEP
>>> explicitly does not allow backporting new init slot functions/features:
>>>
>>> "Unknown slot IDs will cause the import to fail with SystemError."
>>>
>>> But at least changing this in Py3.8 should be doable and would be really
>>> nice.
>>
>> I don't think we can just silently skip unknown slots -- that would mean
>> modules wouldn't be getting features they asked for.
>> Do you have some more sophisticated model for slots in mind, or is this
>> something to be designed?
> 
> Sorry for not being clear here. I was asking for changing the assumptions
> that PEP 489 makes about modules that claim to support the multi-step
> initialisation part of the PEP. Adding a new (flag?) slot was just one idea
> for opting out of multi-initialisation support.

Would this be better than a flag + raising an error on init?
One big disadvantage of a big opt-out-of-everything button is that it 
doesn't encourage people to think about what the actual non-reentrant 
piece of code is.

