[Import-SIG] PEP 489: Redesigning extension module loading
encukou at gmail.com
Thu Mar 19 14:37:36 CET 2015
On 03/19/2015 11:31 AM, Stefan Behnel wrote:
> Hi Petr,
> thanks for working on this. I added my comments inline.
Thanks for your comments, they're a nice reality check.
I have a feeling that Nick and I misunderstood Cython's requirements
somewhat, and concentrated on unimportant points (loading into
pre-created modules) while ignoring important ones (fast access to
module state). You also pointed out interesting things we hadn't
thought much about (non-ASCII names, multi-module extensions).
One of the PEP's stated goals is that the behavior of extension modules
should be closer to that of Python modules. But if the solution
(Exec-only modules) doesn't work for Cython, then the goal is pretty
much irrelevant. I believe PyCapsule is the cleanest way of putting C
state onto arbitrary objects, and at this point I can say it's not
working.
Perhaps it's time to say that extension modules *are* fundamentally
different from pure Python ones. (And rewrite the PEP. *sigh*)
I'll keep your comments in mind, but I have this idea that could make
them obsolete; I'll reply to them if it gets shot down.
>> Multiple modules in one library
>> To support multiple Python modules in one shared library, the library
>> must export appropriate PyModuleExec_<name> or PyModuleCreate_<name> hooks
>> for each exported module.
>> The modules are loaded using a ModuleSpec with origin set to the name of the
>> library file, and name set to the module name.
>> Note that this mechanism can currently only be used to *load* such modules,
>> not to *find* them.
>> XXX: This is an existing issue; either fix it/wait for a fix or provide
>> an example of how to load such modules.
> I really like that idea. It's essentially an extended inittab mechanism,
> also usable for executable single-file distributions (maybe even "python
> -m"), non-ASCII module names and "__init__.so" packages that import as an
> entire package structure of multiple modules.
> Needs some kind of "import module from library" C-API mechanism, though, or
> at least an explicitly exported list of modules to import from a shared
> library in the right order. I'd rather go for some kind of explicit import
> that creates these modules on request.
It seems that, with this PEP, the main reason for extension authors to
implement Create would be to get per-module state. PyCapsules in the
module dict are not a good idea speed-wise; static C-level data is not
an option if subinterpreters need to be supported.
The "inittab" idea made me think of this:
An extension could export an array of PyModuleDef, which has all the
needed data for module creation and initialization:
- m_name - for the "requested" name for the module (not necessarily what
it'll be loaded as), for identifying modules in multi-module extensions
- m_size - for requesting per-module C state
- m_reload (currently unused) would be the exec function (called for
initialization and reload)
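
As a rough illustration of the flow described above, here is a
pure-Python model (not the C API). Everything in it is hypothetical:
ModuleDef stands in for PyModuleDef, EXPORTED_DEFS for the exported
array, load_from_defs for whatever the loader would do, and a bytearray
for the per-module C state that m_size requests.

```python
from dataclasses import dataclass
from types import ModuleType
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class ModuleDef:
    """Hypothetical Python stand-in for the C-level PyModuleDef."""
    m_name: str    # "requested" name, used to pick a def from the array
    m_size: int    # number of bytes of per-module state to allocate
    m_reload: Callable[[ModuleType, bytearray], None]  # the exec function


def exec_spam(module: ModuleType, state: bytearray) -> None:
    # The exec function fills in the module and may use its own state.
    state[0] = 42
    module.answer = lambda: state[0]


def exec_eggs(module: ModuleType, state: bytearray) -> None:
    module.greeting = "hello"


# What a multi-module extension would export: one def per module.
EXPORTED_DEFS = (
    ModuleDef("spam", 1, exec_spam),
    ModuleDef("eggs", 0, exec_eggs),
)


def load_from_defs(defs: Sequence[ModuleDef], requested_name: str,
                   loaded_as: Optional[str] = None) -> ModuleType:
    """Find the def by m_name, create the module, then run its exec."""
    for module_def in defs:
        if module_def.m_name == requested_name:
            break
    else:
        raise ImportError(f"no ModuleDef named {requested_name!r}",
                          name=requested_name)
    # The module may be loaded under a different name than m_name.
    module = ModuleType(loaded_as or requested_name)
    state = bytearray(module_def.m_size)   # fresh state per module object
    module_def.m_reload(module, state)
    return module
```

In this model, load_from_defs(EXPORTED_DEFS, "spam", loaded_as="pkg.spam")
returns a module named "pkg.spam" whose state is private to that module
object, which is the property subinterpreter support would need.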
This would rule out completely custom module objects, but are those
needed anyway? A module can always replace itself in sys.modules if it
needs extra magic. Getting rid of Create entirely supports a lot of the
other goals (running user code in Create, pushing for subinterpreter
support). And things like module properties or callable modules are not
possible in source modules either; perhaps those should be solved at a
With this, you couldn't load extensions into arbitrary objects. But it
would be possible to load into pre-created modules, as long as they were
pre-created with the correct ModuleDef. It would probably be somewhat
more difficult to make runpy (or custom loading libraries) work with
these extension modules, but it should be possible.
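
The "pre-created with the correct ModuleDef" restriction can be modeled
in pure Python as follows. This is only a sketch under assumptions:
ExtModule, COUNTER_DEF and exec_into are hypothetical names, a dict
stands in for PyModuleDef, and an identity check stands in for whatever
test a real loader would use.

```python
from types import ModuleType


class ExtModule(ModuleType):
    """Hypothetical module type that remembers the def it was built from."""

    def __init__(self, name, module_def):
        super().__init__(name)
        self.module_def = module_def
        # Per-module state, sized as the def requests.
        self.state = bytearray(module_def["m_size"])


def exec_counter(module):
    # Exec function: called for initialization and again for reload.
    module.state[0] += 1
    module.runs = module.state[0]


# Dict stand-in for a PyModuleDef entry.
COUNTER_DEF = {"m_name": "counter", "m_size": 1, "m_reload": exec_counter}


def exec_into(module, module_def):
    """Run the exec function only in a module created from module_def."""
    if getattr(module, "module_def", None) is not module_def:
        raise ImportError("module was not pre-created with this ModuleDef")
    module_def["m_reload"](module)
    return module
```

Here exec_into(ExtModule("counter", COUNTER_DEF), COUNTER_DEF) succeeds,
while passing a plain types.ModuleType("counter") raises ImportError,
mirroring the loss of "load into arbitrary objects".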
Implementation-wise, having m_reload filled in from the start would
help: the PEP calls for looking up two entrypoints, and the lookup is
relatively expensive (judging by the amount of caching in current code).
It would also help with non-ASCII names, since the name is a string
rather than a C identifier. Entrypoint and file names would need some
design to make everything work. But before I go thinking about that:
Does this seem like a better direction than Create/Exec?