[Import-SIG] PEP 489: Multi-phase extension module initialization; version 5

Wed May 20 16:56:33 CEST 2015

On Wed, May 20, 2015 at 4:55 AM, Petr Viktorin <encukou at gmail.com> wrote:
> On 05/20/2015 02:22 AM, Eric Snow wrote:
>> On Tue, May 19, 2015 at 5:06 AM, Petr Viktorin <encukou at gmail.com> wrote:
  [snip]
>>> No, that won't work. It's possible (via direct calls to the import
>>> machinery) to load a module without adding it to sys.modules.
>>
>> What direct calls do you mean?  I would not expect any such mechanism
>> to work properly with extension modules.
>
> Reimplement
> <https://www.python.org/dev/peps/pep-0451/#how-loading-will-work>
> without the sys.modules parts.

You mean someone could do so?  Sure, they could.  Python has a
philosophy of not stopping you from doing what is usually the wrong
thing because sometimes it is the right thing for you.  As we say,
we're all consenting adults.

In this case, we expect that folks will use the import system (or
importlib) to import modules.  If they do it manually then they are
responsible for satisfying the semantics of the import system, or they
risk bugs.  One of the key goals of PEP 451 was to leave certain semantics
up to the import machinery rather than requiring all finder/loader
authors to implement the behavior.  This includes a number of tricky
parts like the sys.modules handling.
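
For illustration, here is a minimal sketch of the kind of manual load
being discussed, assuming a pure-Python source module rather than an
extension module.  The module name "manual_demo" and the temporary file
are hypothetical; the point is only that the sys.modules step is left
out.

```python
import importlib.util
import os
import sys
import tempfile

# Write a throwaway module to disk (hypothetical name "manual_demo").
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "manual_demo.py")
with open(path, "w") as f:
    f.write("VALUE = 42\n")

# Follow the PEP 451 loading steps by hand.
spec = importlib.util.spec_from_file_location("manual_demo", path)
module = importlib.util.module_from_spec(spec)

# Deliberately skipped: sys.modules[spec.name] = module
spec.loader.exec_module(module)

print(module.VALUE)                   # 42
print("manual_demo" in sys.modules)   # False
```

The module executes fine, but anything in it that relies on finding
itself in sys.modules (circular imports, for example) is on its own.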

> The point is that exec_module doesn't a priori depend on the module
> being in sys.modules, which I think is a good thing.

Well, there's an explicit specification about how sys.modules is used
during loading.  For the post-exec sys.modules lookup specifically, see
https://docs.python.org/3.5//reference/import.html#id2.  The note in
the language reference says that it is an implementation detail.
However, keep in mind that this PEP is a CPython-specific proposal.
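
A small sketch of that post-exec lookup in action, assuming a
pure-Python module (the name "swap_demo" and its contents are
hypothetical).  The module replaces its own sys.modules entry while it
executes, and the import machinery hands back the replacement.

```python
import importlib
import os
import sys
import tempfile

# A module that swaps itself out of sys.modules during execution.
tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "swap_demo.py"), "w") as f:
    f.write(
        "import sys, types\n"
        "replacement = types.SimpleNamespace(marker='replaced')\n"
        "sys.modules[__name__] = replacement\n"
    )

sys.path.insert(0, tmpdir)
importlib.invalidate_caches()

# The import machinery looks up sys.modules[name] after exec,
# so the caller gets the replacement object, not the original module.
result = importlib.import_module("swap_demo")
print(result.marker)                       # replaced
print(sys.modules["swap_demo"] is result)  # True
```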

That said, I'm only -0 on not matching the sys.modules lookup behavior
of module loading.  It could be okay if we were to document the
behavior clearly.  My concern is with having different semantics even
if it only relates to a remote corner case.  It may be a corner case
that someone will rely on.

  [snip]
>> Be that as it may, I think it would be a mistake to treat support for
>> multiple exec slots as a second-class citizen in the design.
>> Personally I find it an appealing feature.
>
> It's there, but I'll not advertise it too much in the docs.

I'm okay with that.  It's not like we're precluding promoting the
behavior later. :)

  [snip]
>>> Still, the steps are processed in a loop from a single function
>>> (PyModule_ExecDef), and that function operates on a module object -- it
>>> doesn't know about sys.modules and can't easily check if you replaced
>>> the module somewhere.
>>
>> I would consider this approach to be a mistake as well.  The approach
>> should stay consistent with the semantics of the whole import system,
>> where sys.modules is checked directly.  Unfortunately, that ship has
>> already sailed.
>
> It's the loader that checks sys.modules, *after* exec_module is called.

Not the loader.  It's the import machinery that does it.  See
importlib._bootstrap._exec.

> No other implementation of exec_module checks sys.modules in the middle
> of its operation. So I think the semantics are consistent.

I was thinking of each exec slot as a parallel to Loader.exec_module.
Thus I was expecting the same sys.modules lookup behavior that you get
during module loading.  That's why I would expect the module to get
updated to sys.modules[spec.name] after each exec slot runs.
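
To make the difference concrete, here is a rough pure-Python emulation
of the two behaviors.  It is not how PyModule_ExecDef is implemented;
run_exec_slots, slot_replace, slot_mark, and the "slots_demo" name are
all hypothetical stand-ins for the C-level exec slots.

```python
import sys
import types


def run_exec_slots(module, name, slots, relookup):
    """Run each exec slot in order on the module.

    If relookup is true, re-fetch the module from sys.modules after
    each slot, mirroring what the import machinery does after exec.
    """
    sys.modules[name] = module
    for slot in slots:
        slot(module)
        if relookup:
            module = sys.modules[name]
    return module


def slot_replace(module):
    # First slot: replace the sys.modules entry with a new object.
    replacement = types.ModuleType(module.__name__)
    replacement.replaced = True
    sys.modules[module.__name__] = replacement


def slot_mark(module):
    # Second slot: set an attribute on whatever module it is given.
    module.marked = True


slots = [slot_replace, slot_mark]

# With the lookup, the second slot sees the replacement.
original = types.ModuleType("slots_demo")
final = run_exec_slots(original, "slots_demo", slots, relookup=True)
print(hasattr(final, "marked"), hasattr(original, "marked"))  # True False

# Without it, the second slot keeps operating on the original object.
original2 = types.ModuleType("slots_demo")
final2 = run_exec_slots(original2, "slots_demo", slots, relookup=False)
print(final2 is original2, hasattr(sys.modules["slots_demo"], "marked"))  # True False
```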

At the moment I'm still -0 on not matching the sys.modules lookup
semantics.  However, like I said above, I can be convinced otherwise.

  [snip]
>> I'm simply thinking in terms of the options we have for a PEP I'm
>> working on that will facilitate passing objects between
>> subinterpreters and even possibly sharing some state between them.
>> Currently it will be practically necessary to exclude extension
>> modules from any such mechanism.  So I was wondering if there would be
>> a way to allow extension module authors to define how at least some of
>> the module's data could be shared between subinterpreters.
>
> You should be able to put that info in slots. It's hard to speculate
> without knowing specifics, though.

I'm sure you're right about slots so we should be fine.  We can cross
the bridge later. :)

  [snip]
>> As I just noted, I'm looking at making use of subinterpreters for a
>> different use case where it *does* make sense to effectively share
>> objects between them.
>
> OK. This PEP isn't designed for that, but it should offer enough
> extensibility.

Right.

  [snip]
>>> The internal _imp module will have backwards incompatible changes --
>>> functions will be added and removed as necessary. That's what the
>>> underscore means :)
>>
>> Be careful with that assumption.  We've had plenty of experiences
>> where the assumption became unreliable.
>
> That's why I provide backcompat shims for undocumented, deprecated
> functions in "imp". But _imp is just too low-level to do that easily.

I'm okay with that, particularly since the _imp module is relatively new.

-eric