[Python-ideas] Module lifecycle: simple alternative to PEP 3121/PEP 489

Nikita Nemkin nikita at nemkin.ru
Thu Apr 14 04:23:05 EDT 2016


Reading PEP 3121 and PEP 489, I can't stop wondering: why do extension
modules require such specialized lifecycle APIs? Why not just let
them subclass ModuleType? (Or any type, really, but ModuleType might
be a good place to define some standard behavior.)

A module instance naturally encapsulates C-level module state.
GC/finalization happens just like for any other object, so PEP 3121
becomes redundant.

Two-step initialization (PEP 489) can be achieved by defining
a new kind of PyInit_XXX entry point, returning a module *type*,
instead of a module *instance*. No extra API needed beyond that!

Now the importer can simply instantiate this module type, passing
__name__, __file__ and the rest. ModuleType.tp_new will perform
attribute init, sys.modules registration, etc.
OR
the importer can call tp_new and tp_init and do the attribute setup
itself, taking over the role of type_call. (This is closer to the
current way of doing things.)
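
As a rough pure-Python analogy of the first variant (a sketch only; the
names CounterModule, init_counter and load are made up for illustration,
and the real proposal is about C-level PyInit_XXX entry points):

```python
import sys
import types


class CounterModule(types.ModuleType):
    """Hypothetical module type; module state lives on the instance."""

    def __init__(self, name, doc=None):
        super().__init__(name, doc)
        self.count = 0  # stands in for C-level module state

    def bump(self):
        self.count += 1
        return self.count


def init_counter():
    """Stand-in for the proposed entry point: returns a type, not an instance."""
    return CounterModule


def load(name, entry_point, file=None):
    """Toy importer: instantiate the module type and register the result."""
    module_type = entry_point()
    module = module_type(name)
    module.__file__ = file
    sys.modules[name] = module
    return module


mod = load("counter_demo", init_counter, file="<sketch>")
```

Here the toy importer does the sys.modules registration itself; in the
proposal that step could equally live in ModuleType.tp_new.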

Actual module initialization ("executing the module body")
happens in tp_init. reload() is equivalent to calling tp_init again.
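
A small Python-level sketch of that idea, assuming __init__ plays the role
of tp_init (ConfigModule and toy_reload are hypothetical names, not
existing APIs):

```python
import types


class ConfigModule(types.ModuleType):
    """Hypothetical module type whose __init__ acts as the module body."""

    def __init__(self, name, doc=None):
        super().__init__(name, doc)
        # "Module body": (re)create the module-level attributes.
        self.settings = {"debug": False}
        # Re-running __init__ keeps the existing __dict__, so this counts runs.
        self.init_count = getattr(self, "init_count", 0) + 1


def toy_reload(module):
    """Re-run the module body on the same instance, as reload() would."""
    module.__init__(module.__name__)
    return module


cfg = ConfigModule("config_demo")
cfg.settings["debug"] = True
same = toy_reload(cfg)
```

The module object keeps its identity across the reload; only the
attributes assigned by the body are reset.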

Subinterpreter interaction becomes transparent: every interpreter
instantiates its own module copy. "Singleton" modules with
external global state should fail on a second instantiation
(maybe by deriving from a special SingletonModuleType subclass
that will handle it for them).
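
One way such a base type might behave, sketched in Python. This is an
assumption about the semantics, not a worked-out design: the class-level
set stands in for process-wide state shared by all interpreters, and
raising ImportError is just one plausible choice.

```python
import types


class SingletonModuleType(types.ModuleType):
    """Hypothetical base: each concrete subclass may be instantiated once."""

    _instantiated = set()  # module types that have already been instantiated

    def __new__(cls, *args, **kwargs):
        if cls in SingletonModuleType._instantiated:
            raise ImportError(f"{cls.__name__} supports only one instance")
        SingletonModuleType._instantiated.add(cls)
        return super().__new__(cls, *args, **kwargs)


class LegacyExtModule(SingletonModuleType):
    """Stand-in for an extension module with external global state."""


first = LegacyExtModule("legacy_ext")

try:
    LegacyExtModule("legacy_ext")
    second_failed = False
except ImportError:
    second_failed = True
```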

Additionally, a custom module type allows fine-grained attribute
access control (aka metamodules), which is useful to many complex
modules. Module "variables" kept in sync with C become super easy to
define (tp_members). For lazy loading and importing there's
tp_getattro, tp_getset, etc.
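
The Python-level counterparts are property (roughly tp_getset) and
__getattr__ (roughly tp_getattro). A minimal sketch with a hypothetical
LazyModule type, where a validated "variable" and a lazily imported
submodule are both plain attributes of the module:

```python
import types


class LazyModule(types.ModuleType):
    """Hypothetical module type with a checked variable and a lazy import."""

    def __init__(self, name, doc=None):
        super().__init__(name, doc)
        self._level = 0  # stands in for a variable stored on the C side

    @property
    def level(self):
        return self._level

    @level.setter
    def level(self, value):
        if not isinstance(value, int):
            raise TypeError("level must be an int")
        self._level = value

    def __getattr__(self, name):
        # Only called when normal lookup fails: import on first access.
        if name == "json":
            import json
            self.__dict__["json"] = json  # cache for later lookups
            return json
        raise AttributeError(f"module {self.__name__!r} has no attribute {name!r}")


lazy = LazyModule("lazy_demo")
```

Assigning lazy.level = "high" raises TypeError, and lazy.json triggers the
import only when it is first touched.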


One problem not solved by this approach (nor by the current one)
is access to module state from the methods of extension types.
At least two solutions are possible:
1. Look up the module by name (sys.modules) or type (new per-interpreter
   cache).
2. Define a new METH_XXX calling convention (or flag) and pass both
   PyCFunctionObject.m_self and PyCFunctionObject.m_module
   to the C level method implementations.
Both can be implemented, #1 being simple and #2 being proper.

What do you think?

