[Import-SIG] [Python-Dev] PEP 489: Redesigning extension module loading

Mon Mar 16 18:42:44 CET 2015

On Mon, Mar 16, 2015 at 4:42 PM, Jim J. Jewett <jimjjewett at gmail.com> wrote:
>
> On 16 March 2015 Petr Viktorin wrote:
>
>> If PyModuleCreate is not defined, PyModuleExec is expected to operate
>> on any Python object for which attributes can be added by PyObject_GetAttr*
>> and retrieved by PyObject_SetAttr*.
>
> I assume it is the other way around (add with Set and retrieve with Get),
> rather than a description of the required form of magic.

Right you are, I mixed that up.

>>         PyObject *PyModule_AddCapsule(
>>             PyObject *module,
>>             const char *module_name,
>>             const char *attribute_name,
>>             void *pointer,
>>             PyCapsule_Destructor destructor)
>
> What happens if module_name doesn't match the module's __name__?
> Does it become a hidden attribute?  A dotted attribute?  Is the
> result undefined?

The module_name is used to name the capsule, following the convention
from PyCapsule_Import. The "module.__name__" is not used or checked.
The function would do this:
    capsule_name = module_name + '.' + attribute_name
    capsule = PyCapsule_New(pointer, capsule_name, destructor)
    PyModule_AddObject(module, attribute_name, capsule)
just with error handling, and suitable C code for the "+".
I will add the pseudocode to the PEP.
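
As a rough sketch of what "suitable C code for the +" might look like, here
is the name-joining step on its own, in plain C with no Python API calls.
The helper name join_capsule_name is hypothetical and not part of any
proposed API; it only shows the allocation and error handling the pseudocode
glosses over.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of the Python C API): returns a newly
 * malloc'ed "module_name.attribute_name" string, or NULL on bad input
 * or allocation failure. The caller owns the result. */
char *join_capsule_name(const char *module_name, const char *attribute_name)
{
    if (module_name == NULL || attribute_name == NULL) {
        return NULL;
    }

    /* Both lengths, plus one for the '.' and one for the trailing NUL. */
    size_t len = strlen(module_name) + strlen(attribute_name) + 2;
    char *name = malloc(len);
    if (name == NULL) {
        return NULL;
    }

    int written = snprintf(name, len, "%s.%s", module_name, attribute_name);
    /* The buffer was sized exactly, so nothing may be truncated. */
    assert(written >= 0 && (size_t)written == len - 1);
    (void)written;  /* avoid an unused-variable warning under NDEBUG */
    return name;
}
```

PyCapsule_New keeps the name pointer rather than copying the string, so a
name built this way has to outlive the capsule, for example by freeing it
from the capsule's destructor.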

> Later, there is
>
>>         void *PyModule_GetCapsulePointer(
>>             PyObject *module,
>>             const char *module_name,
>>             const char *attribute_name)
>
> with the same apparently redundant arguments,

Here the behavior would be:
    capsule_name = module_name + '.' + attribute_name
    capsule = PyObject_GetAttrString(module, attribute_name)
    return PyCapsule_GetPointer(capsule, capsule_name)

> but not a
> PyModule_SetCapsulePointer.  Are capsule pointers read-only, or can
> they be replaced with another call to PyModule_AddCapsule, or by a
> simple PyObject_SetAttr?

You can replace the capsule using either of those two, or set the pointer
using PyCapsule_SetPointer, or (most likely) change the data the
pointer points to.
The added functions are just simple helpers for common operations,
meant to encourage keeping per-module state.

>> Subinterpreters and Interpreter Reloading
> ...
>> No user-defined functions, methods, or instances may leak to different
>> interpreters.
>
> By "user-defined" do you mean "defined in python, as opposed to in
> the extension itself"?

Yes.

> If so, what is the recommendation for modules that do want to support,
> say, callbacks?  A dual-layer mapping that uses the interpreter as the
> first key?  Naming it _module and only using it indirectly through
> module.py, which is not shared across interpreters?  Not using this
> API at all?

There is a separate module object, with its own dict, for each
subinterpreter (as when creating the module with "PyModuleDef.m_size
== 0" today).
Callbacks should be stored on the appropriate module instance.
Does that answer your question? I'm not sure what you meant by "callbacks".

>> To achieve this, all module-level state should be kept in either the module
>> dict, or in the module object.
>
> I don't see how that is related to leakage.
>
>> A simple rule of thumb is: Do not define any static data, except
>> built-in types
>> with no mutable or user-settable class attributes.
>
> What about singleton instances?  Should they be per-interpreter?

Yes, definitely.

> What about constants, such as PI?

In PyModuleExec, create the constant using PyFloat_FromDouble, and add
it using PyModule_AddObject. That will do the right thing.
(Float constants can be shared, since they cannot refer to
user-defined code. But this PEP shields you from needing to know this
for every type.)

> Where should configuration variables (e.g., MAX_SEARCH_DEPTH) be
> kept?

On the module object.
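
To illustrate the difference, here is a plain C stand-in that does not use
the Python API at all. The names module_state, module_state_new and the
accessors are hypothetical; in a real extension the state would live in the
module object or its dict rather than in a free-standing struct.

```c
#include <assert.h>
#include <stdlib.h>

/* Anti-pattern: a single copy for the whole process, shared by every
 * (sub)interpreter that loads the extension. */
static int shared_max_search_depth = 10;

void set_shared_depth(int depth) { shared_max_search_depth = depth; }
int get_shared_depth(void) { return shared_max_search_depth; }

/* Hypothetical per-module state: one instance per module object. */
typedef struct {
    int max_search_depth;
} module_state;

/* Allocate state with the default configuration; NULL on failure. */
module_state *module_state_new(void)
{
    module_state *st = malloc(sizeof *st);
    if (st != NULL) {
        st->max_search_depth = 10;
    }
    return st;
}

/* Changing one module instance's setting leaves the others untouched. */
void set_depth(module_state *st, int depth)
{
    assert(st != NULL);
    st->max_search_depth = depth;
}

int get_depth(const module_state *st)
{
    assert(st != NULL);
    return st->max_search_depth;
}
```

With two module_state instances standing in for two subinterpreters, calling
set_depth on one does not affect the other, whereas set_shared_depth changes
the value that every interpreter sees.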

> What happens if this no-leakage rule is violated?  Does the module
> not load, or does it just maybe lead to a crash down the road?

It may, as today, lead to unexpected behavior down the road. This is
explained here:
https://docs.python.org/3/c-api/init.html#sub-interpreter-support
Unfortunately, there's no good way to detect such leakage. This PEP
adds the tools, documentation, and guidelines to make it easy to do
the right thing, but won't prevent you from shooting yourself in the
foot in C code.

Thank you for sharing your concerns! I will keep them in mind when
writing the docs for this.