[Import-SIG] Proto-PEP: Redesigning extension module loading

Nick Coghlan ncoghlan at gmail.com
Mon Feb 23 14:47:15 CET 2015

On 23 February 2015 at 23:18, Petr Viktorin <encukou at gmail.com> wrote:
> On Sat, Feb 21, 2015 at 1:19 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> On 21 February 2015 at 00:56, Petr Viktorin <encukou at gmail.com> wrote:
>>> The "module" argument receives the module object.
>>> If PyModuleCreate is defined, this will be the object returned by it.
>>> If PyModuleCreate is not defined, PyModuleExec is expected to operate
>>> on any Python object for which attributes can be added by PyObject_SetAttr*
>>> and retrieved by PyObject_GetAttr*.
>>> Specifically, as the module may not be a PyModule_Type subclass,
>>> PyModule_* functions should not be used on it, unless they explicitly support
>>> operating on all objects.
>> I think this is too permissive on the interpreter side of things, thus
>> making things more complicated than we'd like them to be for extension
>> module authors.
> What complications are you thinking about? I was worried about this
> too, but I don't see the complications. I don't think there is enough
> difference between PyModule_Type and any object with getattr/setattr,
> either on the C or Python level. After initialization, the differences
> are:
> - Modules have a __dict__. But, as the docs say, "It is recommended
> extensions use other PyModule_*() and PyObject_*() functions rather
> than directly manipulate a module’s __dict__." This would become a
> requirement.
> - The finalization is special. There have been efforts to remove this
> difference. Any problems here are for the custom-module-object
> provider (e.g. the lazy-load library) to sort out, the extension
> author shouldn't have to do anything extra.
> - There's a PyModuleDef usable for registration.
> - There's a custom __repr__.
> Currently there is a bunch of convenience functions/macros that only
> work on modules but do little more than get/setattr. They can easily be
> made to work on any object.

It occurs to me that we'd like folks to steer clear of relying on
struct layout details anyway (to help promote use of the stable ABI),
so yeah, I think you've persuaded me that the more general "expect an
object that supports setting & getting attributes, but still check
your error codes appropriately" directive for module authors using the
new initialisation API is a good way to go.
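
As a rough Python-level analogue of that directive (the real API under
discussion is C, where the calls would be PyObject_SetAttr* and
PyObject_GetAttr* with their return values checked), here is a minimal
sketch. The function name exec_module_body and the attribute names are
hypothetical, chosen only for illustration:

```python
import types


def exec_module_body(module):
    """Hypothetical exec step that relies only on getattr/setattr."""
    try:
        setattr(module, "VERSION", (1, 0))
        # Read the attribute back through getattr, never through __dict__.
        setattr(module, "describe",
                lambda: "version %d.%d" % getattr(module, "VERSION"))
    except (AttributeError, TypeError) as exc:
        # Python analogue of checking the C-level error return.
        raise ImportError("cannot initialise %r" % (module,)) from exc
    return module


# Works on a real module object...
real_module = exec_module_body(types.ModuleType("demo"))
# ...and on any other object that accepts attributes.
namespace = exec_module_body(types.SimpleNamespace())

# A bare object() rejects new attributes, so the failure is reported.
try:
    exec_module_body(object())
except ImportError:
    rejected = True
else:
    rejected = False
```

The point of the sketch is that the same initialisation code runs
unchanged on both objects, and the only extra burden on the author is
handling the case where attribute assignment fails.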

For the other areas, I'll mostly wait until I see the next draft
before commenting further.

However, I will note that the difference I see between create_module
becoming compulsory (but allowed to return None) and whether or not
PyModuleCreate_* should also be optional (in addition to letting it
return None) is that the latter would need to be added at a
*per-module* level for everyone writing extension modules using the
new API, while create_module only exists at a *per-loader* level. That
changes the equation for who pays the cost of making the method optional.

For create_module:

  * if it's mandatory, cost is borne by loader authors, but importlib
    provides a default impl that returns None
  * if it's optional, cost is borne by the already complex import
    system and anyone else manipulating loaders directly

So making create_module mandatory is likely to reduce the net
complexity of the overall system.
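
To illustrate the "mandatory, but with a default impl" option, here is
a minimal sketch assuming the importlib API as it exists in Python 3.5
and later (importlib.util.module_from_spec postdates this thread). The
loader class ConstantLoader and the module name are hypothetical:

```python
import importlib.abc
import importlib.util


class ConstantLoader(importlib.abc.Loader):
    """Hypothetical loader that fills a module from a dict of values."""

    def __init__(self, values):
        self._values = dict(values)

    # No create_module here: the one inherited from importlib.abc.Loader
    # returns None, which asks the import system for a default module.

    def exec_module(self, module):
        for name, value in self._values.items():
            setattr(module, name, value)


loader = ConstantLoader({"answer": 42})
spec = importlib.util.spec_from_loader("demo_constants", loader)
module = importlib.util.module_from_spec(spec)  # calls loader.create_module
loader.exec_module(module)
```

The loader author writes nothing extra, and code that drives loaders
directly can call create_module unconditionally instead of probing for
it first.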

For PyModuleCreate_*:

  * if it's mandatory, cost is borne by every extension module author
    as a bit of standard boilerplate they have to add
  * if it's optional, cost is borne in the create_module
    implementation for the updated extension module loader, and anyone
    writing their own custom extension module loader (which is even more
    unusual than interacting with loaders directly)

Here, I think the relative frequency of the two activities (writing
extension modules vs writing extension module loaders) favours making
the C level module creation function entirely optional in addition to
letting it return None.
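
A sketch of where that cost would land, written in Python rather than C
and not meant as the actual extension loader: the hypothetical
HookLoader below takes a dict standing in for the symbols a shared
library exports, with an optional "create" entry and a required "exec"
entry. The fallback logic lives once in create_module rather than in
every extension module. It assumes importlib as of Python 3.5 or later.

```python
import importlib.abc
import importlib.util
import types


class HookLoader(importlib.abc.Loader):
    """Hypothetical stand-in for an extension module loader."""

    def __init__(self, hooks):
        self._hooks = hooks

    def create_module(self, spec):
        create = self._hooks.get("create")
        if create is None:
            # No creation hook exported: request the default module.
            return None
        # The hook itself may also return None for the default module.
        return create(spec)

    def exec_module(self, module):
        self._hooks["exec"](module)


def exec_hook(module):
    module.greeting = "hello"


# An "extension" that exports only the exec hook.
plain = HookLoader({"exec": exec_hook})
plain_spec = importlib.util.spec_from_loader("plain_demo", plain)
plain_module = importlib.util.module_from_spec(plain_spec)
plain.exec_module(plain_module)

# An "extension" that also supplies its own module object.
custom = HookLoader({"create": lambda spec: types.SimpleNamespace(),
                     "exec": exec_hook})
custom_spec = importlib.util.spec_from_loader("custom_demo", custom)
custom_module = importlib.util.module_from_spec(custom_spec)
custom.exec_module(custom_module)
```

The common case (no custom module object) needs no boilerplate from the
extension author; only the loader carries the extra branch.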


Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
