[Import-SIG] PEP 451 (ModuleSpec) round 3

Nick Coghlan ncoghlan at gmail.com
Sun Sep 1 14:47:56 CEST 2013


On 1 September 2013 15:36, Eric Snow <ericsnowcurrently at gmail.com> wrote:
> On Sat, Aug 31, 2013 at 7:53 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>>
>> Yep, that's exactly what I had in mind, although reload would also have an
>> extra check to ensure the module returned was the same as the one passed in
>> (that way, in-place reloading support for custom loaders that define
>> create_module would always be opt-in rather than opt-out).
>>
>> Talking to Stefan about making this work on the extension module API side
>> has confirmed my belief that this is the way to go, since it also deals
>> nicely with placing custom objects in sys.modules.
>>
>> The one downside is that it means preconditions will be checked twice in
>> the reload case (once in prepare, once in exec), but I can live with that
>> for the likely reliability gains in the reloading API.
>>
>> If it works as well as I hope, I may finally be comfortable with proposing
>> "imp.reload_fresh" for 3.5 :)
>>
>> Cheers,
>> Nick.
>
> I haven't had a chance to respond to a few comments in detail yet, nor have
> I been able to read through the extension module API threads, but what I
> have seen has gotten me thinking about what exactly matters here regarding
> preparing and executing modules.
>
> * There are two kinds of module state: internal (not exposed in Python) and
> external (module.__dict__).
> * External state is associated with a module object, but internal state is
> not--it may be associated with a module name, a location (locatable
> resource), or something else.
>
> * Internal state (if it exists) is not necessarily created at execution
> (load/reload) time and may be shared between modules.
> * The internal module state may or may not be managed by the interpreter
> (regardless of Python implementation).
>
> * External module state is established at import execution time (load or
> reload).
> * Loading puts the external state into a new module and reloading into an
> existing one (likely overwriting at least some contents).
> * During module execution (during load/reload), external module state is
> copied from internal state, dynamically generated (e.g. .py files), or a mix
> of both.
> * Dynamic external state generation is only allowed once for some modules.
> * Dynamic generation is not necessarily (but sometimes is) an idempotent
> operation.
> * Dynamic generation may be associated with a locatable resource.
> * Non-locatable sources are not necessarily unchanging.
>
> * Loaders are in charge of managing module execution and may be involved
> with managing internal state.
>
> The life-cycle of module state, both internal and external, is pretty
> congruent with objects in Python:
>
> 1. create
> 2. init
> 3. modify
> 4. destroy

I'd tweak this slightly, and say that modules are more congruent with
*class namespaces* than they are with ordinary objects (which is why I
chose "prepare" as a suggested alternative to "create"). The main
difference is that we don't support reinitialising a class namespace
in place, while we do support doing so for modules.

That makes the lifecycle:

1. prepare
2. exec
3. modify
4. destroy
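
As a rough illustration of those four steps (a minimal sketch using only
types.ModuleType and exec(), not the PEP 451 API; the module name and the
SOURCE string are made up for the example):

```python
import sys
import types

# Hypothetical module source, standing in for the contents of a .py file.
SOURCE = (
    "counter = 0\n"
    "def bump():\n"
    "    global counter\n"
    "    counter += 1\n"
    "    return counter\n"
)

# 1. prepare: create the namespace the code will run in
module = types.ModuleType("demo_lifecycle")
sys.modules[module.__name__] = module

# 2. exec: run arbitrary user-provided code in that namespace
exec(SOURCE, module.__dict__)

# 3. modify: ordinary use changes the namespace after execution
module.bump()
module.extra = "added after exec"
namespace_id = id(module.__dict__)

# Reinitialising in place (what reload does): exec again in the *same*
# namespace. Names defined by the source are overwritten, other names stay.
exec(SOURCE, module.__dict__)
same_namespace = id(module.__dict__) == namespace_id

# 4. destroy: drop the import system's reference to the module
del sys.modules[module.__name__]
```

After the second exec, module.counter is back to 0 while module.extra is
still present, which is the "reinitialising in place" behaviour that class
namespaces don't offer.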

> Modules have some special cases that fit in there:
>
> 2a. dynamically generate
> 2b. populate from another namespace
> 3a/2b. reset
>
> * In the generation case, population may happen simultaneously (e.g. .py
> files).

I don't quite understand what you mean by "generate" here, unless you
mean the fact that "init" for a module usually involves running
arbitrary user provided code in that namespace. If so, well that's why
I think "exec" is a better name for it than "init" :)

> * Resetting a module's state may not be the same operation as init.
>
> Am I missing anything in all of this?
>
> ---
>
> Some questions that come to mind:
>
> * Should loaders cover all the permutations of module state and its
> lifecycle?  Our proposed APIs are moving in that direction.  Are they
> enough?
> * When does internal state get generated and how is it managed?  Should
> loaders be the official liaison for the import system?  Python
> implementation extension module APIs cover this somewhat (particularly for
> CPython).

Take a look at the discussion between Stefan Behnel and me on
python-dev. The loaders should really only care about the Python
visible state. PEP 3121 was a useful evolution of the extension module
design, but ultimately failed in its aims by adding an additional kind
of hidden state, rather than using the existing mechanisms for adding
hidden state to extension types and instances.

The result is that both Stefan and I now agree that references to
hidden state from extension modules should be maintained directly on
the objects exposed as the module's externally visible state. This
includes exposing already bound instance methods of a hidden state
object rather than ordinary functions for any top level callables, as
well as including a reference to the hidden state in custom type
definitions (which they may then optionally transfer to instances for
fewer indirections when accessing the hidden state, at the cost of an
extra pointer per instance).

The advantage of this approach is that it avoids needing a custom
mechanism to allow the module to get access to its hidden state -
instead, all modules, including extension modules, are expected to
ensure that module level APIs always have direct access to any
internal state they need, rather than relying on C static variables or
a hidden storage area like that provided by PEP 3121.

> * Should the language provide a non-implementation-specific API for
> associating internal APIs with modules?  (PEP 3121-ish)

No.

> * Does reset deserve its own explicit API?

How does reset differ from reload?

> * How do you keep init from happening more than once?  IOW, what happens
> when ModuleSpec.create() is called more than once?

create() should either be idempotent, use the PEP 3121 APIs to
implicitly return the same object, or else throw an error if it
detects it has already been initialised. This is up to the loader,
though, rather than being the responsibility of the import system
(although we should document the three options).
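
Two of those options can be sketched in pure Python (the PEP 3121 option is
C-level and omitted here). The loader classes below are hypothetical, and
create_module taking a bare name is a simplification for the example, not
the proposed signature:

```python
import types


class IdempotentLoader:
    """Option 1: repeated calls return the same module object."""

    def __init__(self):
        self._module = None

    def create_module(self, name):
        if self._module is None:
            self._module = types.ModuleType(name)
        return self._module


class SingleUseLoader:
    """Option 3: a second call is detected and rejected."""

    def __init__(self):
        self._created = False

    def create_module(self, name):
        if self._created:
            raise ImportError(f"{name!r} has already been initialised")
        self._created = True
        return types.ModuleType(name)


idempotent = IdempotentLoader()
mod_a = idempotent.create_module("demo")
mod_b = idempotent.create_module("demo")

single_use = SingleUseLoader()
single_use.create_module("demo")
try:
    single_use.create_module("demo")
    second_call_failed = False
except ImportError:
    second_call_failed = True
```

Either way the decision lives in the loader; the import system just calls
create_module and uses whatever comes back.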

> I have more questions but they mostly line up with the details above.
>
> Anyway, there are the things I am mulling over.  For PEP 451 I'm not going
> to try to accomplish module API perfection, but I do want to make sure we're
> on the right track with a more explicit perspective.  The confusion with
> create_module() and exec_module() made it clear to me that the picture
> should be more clear before we start

I definitely recommend the thread on python-dev :)

The messages from the last couple of days are probably enough (start
with http://mail.python.org/pipermail/python-dev/2013-September/128244.html),
but if you want more context that thread actually started in August.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

