[Import-SIG] PEP 489: Redesigning extension module loading

Sat Mar 21 11:30:21 CET 2015

On 03/21/2015 09:17 AM, Nick Coghlan wrote:
> On 21 March 2015 at 06:29, Stefan Behnel <stefan_ml at behnel.de> wrote:
>> Petr Viktorin schrieb am 19.03.2015 um 14:37:
>>> On 03/19/2015 11:31 AM, Stefan Behnel wrote:
>>>> thanks for working on this. I added my comments inline.
>>>
>>> Thanks for your comments, they're a nice reality check.
>>
>> Sorry if it was a bit too much. I didn't mean to shoot it down or so. I
>> think we're on the right track, and the PEP will allow us to get a better
>> idea of where we should be heading.

Oh, no need to apologize! The result will be much better with your input :)

>>> I'm feeling a bit like I and Nick misunderstood Cython requirements
>>> somewhat, and concentrated on unimportant points (loading into pre-created
>>> modules)
>>
>> You mean the split between Create and Exec? I think that's a very good and
>> simple design. It gives the extension module full control over the module
>> instance and its implementation (if it wants to), while leaving the core
>> runtime full control over the basic setup of the module object's common API
>> (__file__, __name__, etc.).
>>
>> Simple extension modules should be able to get away without implementing
>> Create, so separating the two steps sounds better than requiring the module
>> instantiation on user side and providing a callback into CPython to
>> initialise it.
>>
>>
>>> while ignoring important ones (fast access to module state).
>>
>> Yes, that *is* important. And I believe that a custom module (sub)type is a
>> good way to achieve that, at least for Cython. For manually written
>> modules, it might be easier to call PyModule_GetState().
>
> Right, while I never really articulated it (not even to myself, let
> alone to Petr), I think my underlying assumption was that Cython would
> typically use Create+Exec for speed, but might offer a slower
> Exec-only option to get a more "Python-like" module behaviour that
> allowed Cython acceleration of directory, zipfile and package __main__
> modules, along with other modules intended to be executed with the
> "-m" switch.

It would be nice to extend runpy to handle Create+Exec modules. If this 
can be pulled off, there'd be no need for Exec-only modules except for 
convenience:

* module reloading is useless for extension modules – a changed 
version can't be read from the disk, and correct reload behavior is 
another corner case for authors to think about
* loading into custom objects is cool, but if the only use case 
mentioned so far is lazy loading, I think it's safe to drop
* running as __main__ would be taken care of somehow

> One of the things I like about the PEP 489 design is that it should be
> general enough that Cython itself can decide what it wants to do on
> that front, without CPython needing to be aware of the details.
>
> On the capsule side of things, I think it's good to facilitate that as
> an alternative to having C extension modules link directly to each
> other, but I'm not sure it makes sense to encourage it as a way for a
> module to access its *own* state that can't readily be stored in a
> Python dictionary as a normal Python object. So perhaps the patterns
> to encourage here are:
>
> * prefer only defining Exec, with state stored as Python objects in
> the module globals
> * if you need C level global state, then you need to define Create as
> well and return a suitable object, such as a PyModule subclass, or the
> result of calling PyModule_Create with m_size > 0 in PyModuleDef
> * if you also need fast access to operations defined in other
> extension modules, prefer reading and saving references to the
> relevant capsule objects in Exec over direct C level linking at build
> time

One thing I'm not clear about: what are the advantages of a module 
subclass over a normal module with m_size>0?
It seems I'm missing something obvious here.

> (Regarding that last point, we may want to some day consider exposing
> suitable capsules for some C accelerated standard library modules,
> like _decimal, rather than expanding the C API itself to cover those
> types)
>
>>> You also pointed out interesting things we didn't think about too much
>>> (non-ASCII names, multi-module extensions).
>>
>> I just mentioned what came to my mind. We should still try to keep the PEP
>> focussed on the problem at hand, but having some idea of what else might
>> lie ahead can help with design decisions.
>
> I briefly looked into C level UTF-8 support when adding a Unicode
> literal to the ord() and chr() docs (I originally had it in the
> docstring as well, and it was pointed out in review that that might
> cause problems), and I'm not sure it's possible to sensibly support
> arbitrary Unicode module names for extension modules while our
> baseline assumption at the C level is C89 compatibility. We should
> definitely aim to cope with the fact that extension module names
> *might* contain arbitrary Unicode some day, even if we don't
> officially support that yet.

Do you mean using non-ASCII characters in the literal itself?
The proposal is not to make it easy and straightforward to use UTF-8 
module names, but to make it possible. Cython can escape a UTF-8 
string. The stdlib won't need it (outside tests, where it can be 
escaped). And extension authors are not all bound to cross-platform C89 
– if someone's writing a Chinese extension, they also need Chinese 
identifiers, so they probably already require a suitable compiler.
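
To illustrate what "escaping" could look like, here is a minimal sketch 
of how a code generator might turn a non-ASCII module name into a C 
string literal that contains only ASCII. The helper name 
c_string_literal is made up for this example, and it assumes the name is 
a valid identifier (so no quotes or backslashes need handling). It is 
not what Cython actually emits.

```python
def c_string_literal(module_name):
    """Return module_name as a C string literal containing only ASCII.

    Non-ASCII UTF-8 bytes become three-digit octal escapes. Octal is used
    rather than hex because a C hex escape has no length limit and would
    swallow any hex digits that happen to follow it.
    Assumption: module_name is an identifier, so it contains no quotes
    or backslashes that would need their own escaping.
    """
    parts = []
    for byte in module_name.encode("utf-8"):
        if byte < 0x80:
            parts.append(chr(byte))        # plain ASCII passes through
        else:
            parts.append("\\%03o" % byte)  # e.g. byte 0xC3 becomes \303
    return '"' + "".join(parts) + '"'


print(c_string_literal("spam"))
print(c_string_literal("caf\u00e9"))
```

The resulting literal compiles with a plain C89 compiler, since the 
source file itself stays ASCII-only.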

> I thought Brett actually implemented multi-module extension support a
> while back (which this PEP would then inherit), but I can't find any
> current evidence of that change, so either my recollection is wrong,
> or my search skills are failing me :)

It's there, grep issue16421.

> While looking for such evidence, I was also reminded of the fact that
> https://docs.python.org/3/c-api/ is missing a reference section on how
> extension module importing actually works - the only current
> explanation is in the more tutorial style
> https://docs.python.org/3/extending/extending.html#the-module-s-method-table-and-initialization-function
>
> That missing reference section is a docs gap that should likely be
> fixed as part of these changes.

Yes.

>>> The "inittab" idea made me think of this:
>>>
>>> An extension could export an array of PyModuleDef, which has all the needed
>>> data for module creation and initialization:
>>
>> I remember discussing this on python-dev, it was one of the ideas in the
>> original thread that led to the Create-Exec proto-pep:
>>
>> http://thread.gmane.org/gmane.comp.python.devel/135764/focus=140986
>>
>> I think the main counter argument at the time was that there should be a
>> way to control the module object instantiation. :)
>
> It's an interesting notion - you could export the arguments to a call
> to PyModule_Create (and/or PyModule_Create2, and/or a new different
> function that accepts a different declaration API) and have an
> entirely static module initialisation process in at least some cases.
>
> It likely makes sense as a separate follow-on PEP for 3.6 though, as
> it's a further simplification of a certain way of using Create+Exec,
> and it's not clear just how you'd handle certain combinations of
> values in the current PyModuleDef struct. PEP 489 currently deals with
> that neatly by breaking out separate helper functions for initialising
> the docstring and the module globals function table that can be called
> from either Exec or Create as appropriate.

The neatness might be more superficial than it seems. Separating Create 
and Exec has these effects:
- Allowing you to implement just one and leave the rest to default 
machinery. This is good.
- Allowing some time to pass between the calls to Create and Exec. This 
might be useful for lazy loading, I guess.
- Allowing the loader or third-party code to modify the object between 
the calls to Create and Exec. This is dangerous (for consenting adults 
who don't mind the occasional segfault).
- Allowing Exec to be called multiple times after Create, i.e. module 
reloading. I don't think there is a use case (and for module-specific 
cases it can be done in a separately exported function).
- Allowing Exec without the corresponding Create, i.e. loading into 
arbitrary objects. This is cool, and it mimics what source modules can 
do, but I'm less and less convinced that it's actually useful.

It's a lot to think about if you want to design a module that behaves 
correctly, and for some combinations it's not clear what "correctly" means.
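
The effects listed above can be tried out with the Python-level 
analogue of the split, the create_module/exec_module pair of PEP 451 
loaders. This is only a sketch: SketchLoader and the exec_calls counter 
are invented for the example, and a real extension module would do this 
in C.

```python
import importlib.abc
import importlib.util
import types


class SketchLoader(importlib.abc.Loader):
    """Hypothetical loader standing in for an extension's Create/Exec hooks."""

    def create_module(self, spec):
        # "Create" not implemented: returning None asks the import
        # machinery for a plain module object.
        return None

    def exec_module(self, module):
        # "Exec": counts its calls so repeated execution is visible.
        module.exec_calls = getattr(module, "exec_calls", 0) + 1


loader = SketchLoader()
spec = importlib.util.spec_from_loader("sketch_mod", loader)

module = importlib.util.module_from_spec(spec)   # Create step (default object)
module.injected = "set between Create and Exec"  # third-party modification
loader.exec_module(module)                       # Exec step
loader.exec_module(module)                       # Exec again, as a reload would

# Exec without a corresponding Create: loading into an arbitrary object.
arbitrary = types.SimpleNamespace()
loader.exec_module(arbitrary)

print(type(module).__name__, module.exec_calls, arbitrary.exec_calls)
```

Each of these calls succeeds here because the Exec step only sets 
attributes. An Exec that relies on C-level state set up by its own 
Create would have to detect and reject the last two cases itself.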

>>> - m_name - for the "requested" name for the module (not necessarily what
>>> it'll be loaded as), for identifying modules in multi-module extensions
>>> - m_size - for requesting per-module C state
>>> - m_reload (currently unused) would be the exec function (called for
>>> initialization and reload)
>>>
>>> This would rule out completely custom module objects, but are those needed
>>> anyway? A module can always replace itself in sys.modules if it needs extra
>>> magic. Getting rid of Create entirely supports a lot of the other goals
>>> (running user code in Create, pushing for subinterpreter support). And
>>> things like module properties or callable modules are not possible in
>>> source modules either; perhaps those should be solved at a higher level.
>
> I'd prefer not to guess at what might be useful in this space - the
> fact that the Create hook design leaves it open to third party
> experimentation is a feature, not a bug.
>
> If particularly useful patterns emerge that we want to recommend to
> new Python extension module authors, then we can standardise them at a
> later date (just as the current PEP 489 design is designed to
> standardise particular patterns in writing extension module Init
> methods).
>
>>> With this, you couldn't load extensions into arbitrary objects. But it
>>> would be possible to load into pre-created modules, as long as they were
>>> pre-created with the correct ModuleDef. It would probably be somewhat more
>>> difficult to make runpy (or custom loading libraries) work with these
>>> extension modules, but it should be possible.
>>>
>>> Implementation-wise, having m_reload filled in from the start would help:
>>> the PEP calls for looking up two entrypoints, and the lookup is relatively
>>> expensive (judging by the amount of caching in current code).
>>>
>>> It would also help with non-ASCII names, since the name is a string rather
>>> than a C identifier. Entrypoint and file names would need some design to
>>> make everything work. But before I go thinking about that: Does this seem
>>> like a better direction than Create/Exec?
>>
>> It's still an alternative, I think. Nick objected to extending PyModuleDef
>> because it's (obviously) part of the stable ABI. But we could instead
>> export a new struct that *contains* a PyModuleDef, with additional callback
>> functions like "new(spec)", as known from other extension types (tp_new).
>> That would give us the Create() functionality (if set to non-NULL), or
>> allow CPython to instantiate a regular module object (if set to NULL).
>>
>> With a magic version field at the top of the struct, this would also make
>> it easy to extend in the future if we ever need more metadata or callbacks
>> that we can't foresee now. Updating the version magic and appending to the
>> struct is so much easier than writing a new PEP and redesigning the entire
>> extension module init process again...
>>
>> So, yes, exporting a struct with module metadata and callbacks sounds like
>> a very generic and straightforward interface to me.
>
> Unless such an API is very carefully designed, it would be easy to
> fall into the trap of creating an API along the lines of the way C
> level extension class definitions were traditionally defined. That's a
> pretty horrible user experience if you're defining them by hand, which
> is why a lot of folks tend to cargo cult an existing class definition
> (and fill in the pieces they need), or else let something like Cython,
> SWIG or Boost deal with the problem for them. One of the biggest
> hassles with using a static struct is that you end up with a lot of
> cryptic padding to cover the slots you don't care about in order to
> get to the slots you actually do care about.

*sigh*
Yeah, I'm very much looking forward to the day Python moves to C99, and 
everyone can use designated initializers.

> The API design for defining types through the stable ABI
> (https://www.python.org/dev/peps/pep-0384/#type-objects), which was
> designed with the benefit of years of experience with the old
> approach, is much nicer, as the NULL-terminated list of named slots
> lets you only worry about the slots you care about, and the
> interpreter takes care of everything else.

Well, if we end up needing to extend PyModuleDef, let's use slots.

The idea of extending ModuleDef brings me back to the runpy problem. I 
don't think it's actually necessary for "-m" to mean "exec the module in 
an object named __main__". Let's provide a slot for a main function, 
and have runpy call that.
This would mean that in Cython modules the 'if __name__ == "__main__"' 
hack won't work, ever (as opposed to that being a bug this PEP can help 
fix). Is that an acceptable loss?
(Maybe my next PEP should be letting Python modules define a 
__main__ function, and slowly deprecating the things runpy needs to do.)
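
A rough Python-level sketch of the main-function idea, to make the 
trade-off concrete. Everything here is hypothetical: run_as_main is not 
a runpy API, and "__main__" as the hook's name is just a placeholder for 
whatever the slot or function would end up being called.

```python
import importlib
import sys
import types


def run_as_main(module_name, argv=()):
    """Hypothetical stand-in for what "-m" could do with a main hook."""
    # The module is imported under its real name, not as "__main__".
    module = importlib.import_module(module_name)
    main = getattr(module, "__main__", None)  # hypothetical hook name
    if not callable(main):
        raise ImportError("%r defines no __main__ function" % module_name)
    return main(list(argv))


# Demo module built in memory so the sketch is self-contained.
demo = types.ModuleType("demo_app")
exec(
    "def __main__(argv):\n"
    "    return 'demo_app got %d args, __name__=%s' % (len(argv), __name__)\n",
    demo.__dict__,
)
sys.modules["demo_app"] = demo

result = run_as_main("demo_app", ["--verbose"])
print(result)
```

Note that __name__ stays "demo_app" inside the hook, which is exactly 
why the 'if __name__ == "__main__"' check would never fire under this 
scheme.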

Another possible extension is hooks for resources. Imagine using Cython 
like zipapp, to pack an entire app including extensions into one file.

> With the current design of PEP 489, the idea is that if you don't
> really care about the module object, you just define Exec, and the
> interpreter gives you a standard Python level module object. All your
> global state still gets stored as Python objects, and you just get the
> "C execution model with the Python data model" development experience
> which is actually quite a nice environment to program in.
>
> However, if you want straightforward access to the C *data* model at
> runtime as well as its execution model, then you can define Create and
> use the existing PyModule_Create APIs, or (as a new feature) a custom
> module subclass or a completely custom type, to define how your module
> state is stored.

The problem is that to add C data, you'd either need to define a whole 
extra hook, or jump through inefficient PyCapsule hoops on every access. 
I worry that module authors will just take the path of least resistance, 
and use static data. I think it's substantially better to say "use 
sizeof(mydata) instead of 0, and use this fast function/macro to get at 
your data".

> That two level approach gives you all the same flexibility you have
> today by defining a custom Init hook (and more), but also lets you opt
> out of learning most of the details of the C data model if all you're
> really after is faster low level manipulation of data stored in Python
> objects.

A module def array additionally gives:
- support for non-ASCII module names
- a catalog of the modules the extension contains
but you can't use custom module subclasses -- unless a create slot is 
added to the module def. (Or you can replace the sys.modules entry -- I 
believe the overhead of a wasted empty module object is negligible.)
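
For reference, the sys.modules replacement trick looks roughly like this 
at the Python level. This is a sketch with made-up names 
(CallableModule, sketch_ext, exec_sketch_ext). An extension module would 
do the equivalent from C in its exec function.

```python
import sys
import types


class CallableModule(types.ModuleType):
    """Hypothetical module subclass adding behaviour a plain module lacks."""

    def __call__(self, value):
        return self.double(value)


# Stand-in for the plain module object the import machinery would create.
plain = types.ModuleType("sketch_ext")
sys.modules["sketch_ext"] = plain


def exec_sketch_ext(module):
    """Stand-in for the Exec step: build the real object, then swap it in."""
    replacement = CallableModule(module.__name__)
    replacement.double = lambda value: value * 2
    # The plain module created for us is simply dropped.
    sys.modules[module.__name__] = replacement


exec_sketch_ext(plain)

import sketch_ext  # the import statement returns whatever sys.modules holds

print(sketch_ext(21))
```

The only cost is the empty module object that gets created and thrown 
away.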