[Import-SIG] Thoughts on cleaner reloading support

Nick Coghlan ncoghlan at gmail.com
Mon Sep 2 15:53:44 CEST 2013


The extension module discussion on python-dev got me thinking about
the different ways in which the "singleton" assumption for modules can
be broken, and how to ensure that extension modules play nicely in
that environment.

As I see it, there are 4 ways the "singleton that survives for the
lifetime of the process following initial import" assumption regarding
modules can turn out to be wrong:

1. In-place reload, overwriting the existing contents of a namespace.
imp.reload() does this. We sort of do it for __main__, except we
usually keep re-using that namespace to run *different* things, rather
than rerunning the same code.

2. Parallel loading. We remove the existing module from sys.modules
(keeping a reference to it alive), and load a second copy.
Alternatively, we call the loader APIs directly. Either way, we end up
with two independent copies of the "same" module, potentially
reflecting different system states at the time of execution.

3. Subinterpreter support. Quite similar to parallel loading, but
we're loading the second copy because we're in a subinterpreter and
can't see the original.

4. Unloading. We remove the existing module from sys.modules and drop
all other references to it. The module gets destroyed, and we later
import a completely fresh copy.
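
To make cases 1, 2 and 4 concrete, here is a rough sketch using a
throwaway pure Python module. The module name "_reload_demo_mod" and
its "state" attribute are invented purely for illustration, and
importlib.reload stands in for imp.reload. Case 3 is left out, since
it needs the C API.

```python
import importlib
import pathlib
import sys
import tempfile

# Hypothetical throwaway module, written to a temp dir for this demo only.
NAME = "_reload_demo_mod"
tmpdir = tempfile.mkdtemp()
pathlib.Path(tmpdir, NAME + ".py").write_text("state = []\n")
sys.path.insert(0, tmpdir)
importlib.invalidate_caches()

first = importlib.import_module(NAME)

# 1. In-place reload: the same module object, with its namespace re-executed.
first.state.append("mutated")
reloaded = importlib.reload(first)
same_object_after_reload = reloaded is first
state_reset_by_reload = first.state == []  # "state = []" ran again

# 2. Parallel loading: keep the old module alive, then import a second copy.
del sys.modules[NAME]
second = importlib.import_module(NAME)
distinct_copies = second is not first
independent_state = second.state is not first.state

# 4. Unloading: remove from sys.modules and drop every other reference,
#    then import again to get a completely fresh copy.
del sys.modules[NAME]
del first, reloaded, second
fresh = importlib.import_module(NAME)
```

A module this trivial copes with all three, but anything that
registers itself globally at import time would behave differently in
each case.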

Even pure Python modules may not support these, since they may have
side effects, or assume they're in the main interpreter, or other
things. Currently, there is no way to signal this to the import
system, so we're left with implicit misbehaviour when we attempt to
reload the modules with global side effects.

For a while, I was thinking we could design the import system to "just
figure it out", but now I'm thinking a selection of read/write
properties on spec objects may make more sense:

    allow_reload
    allow_unload
    allow_reimport
    allow_subinterpreter_import

These would all default to True, but loaders and modules could
selectively turn them off.
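
As a minimal sketch of what "default to True" might look like in
practice: the flag names below are just the ones proposed above (none
of them exist on real spec objects), and spec_allows is a hypothetical
helper, not an importlib API. It assumes spec objects accept extra
attributes, which importlib.machinery.ModuleSpec does.

```python
from importlib.machinery import ModuleSpec

# Proposed flag names from this message; purely hypothetical attributes.
ADVISORY_FLAGS = (
    "allow_reload",
    "allow_unload",
    "allow_reimport",
    "allow_subinterpreter_import",
)


def spec_allows(spec, flag):
    """Return the advisory flag on a spec, treating "not set" as True."""
    if flag not in ADVISORY_FLAGS:
        raise ValueError("unknown advisory flag: %r" % (flag,))
    return bool(getattr(spec, flag, True))


# A loader (or the module itself) selectively turns one flag off.
spec = ModuleSpec("example", loader=None)
spec.allow_reload = False

reload_ok = spec_allows(spec, "allow_reload")
unload_ok = spec_allows(spec, "allow_unload")
```

Reading the flags through getattr with a True default means existing
loaders that know nothing about them keep their current behaviour.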

They would also be advisory: nothing would enforce them across every
possible mechanism for manipulating import state. New functions in
importlib.util could provide easier alternatives to directly
manipulating sys.modules:

- importlib.util.reload (replacement for imp.reload that checks the
spec allows reloading)
- importlib.util.unload (replacement for "del
sys.modules[module.__name__]" that checks the spec allows unloading,
and also unloads all child modules)
- importlib.util.reimport (replacement for
test.support.import_fresh_module that checks the spec of any existing
sys.modules entry allows reimporting a parallel copy)
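
A rough sketch of how the first two might behave, assuming the
hypothetical allow_reload/allow_unload attributes on module.__spec__.
The names checked_reload and checked_unload are placeholders for the
proposed importlib.util functions, which do not exist, and raising
ImportError on refusal is my assumption rather than a settled design.

```python
import importlib
import sys
import types
from importlib.machinery import ModuleSpec


def _allowed(module, flag):
    # Hypothetical advisory flag on the spec; a missing flag means "allowed".
    spec = getattr(module, "__spec__", None)
    return bool(getattr(spec, flag, True))


def checked_reload(module):
    """Reload in place, unless the module's spec opts out."""
    if not _allowed(module, "allow_reload"):
        raise ImportError("%s does not allow reloading" % module.__name__,
                          name=module.__name__)
    return importlib.reload(module)


def checked_unload(module):
    """Remove a module and all of its child modules from sys.modules."""
    name = module.__name__
    prefix = name + "."
    targets = [n for n in list(sys.modules)
               if n == name or n.startswith(prefix)]
    # Check every target first, so a refusal leaves sys.modules untouched.
    for n in targets:
        if not _allowed(sys.modules[n], "allow_unload"):
            raise ImportError("%s does not allow unloading" % n, name=n)
    for n in targets:
        del sys.modules[n]
    return targets


# Usage with stand-in modules (names invented for the example).
parent = types.ModuleType("unload_demo")
parent.__spec__ = ModuleSpec("unload_demo", loader=None)
child = types.ModuleType("unload_demo.child")
child.__spec__ = ModuleSpec("unload_demo.child", loader=None)
sys.modules["unload_demo"] = parent
sys.modules["unload_demo.child"] = child

removed = checked_unload(parent)
```

Since the flags are only advisory, "del sys.modules[name]" would still
work as it does today; the helpers just give well-behaved callers
somewhere to ask first.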

One of these is not like the others... aside from the existing
extension module specific mechanism defined in PEP 3121, I'm not sure
we can devise a general *loader* level API to force imports for a
particular name to fail in a subinterpreter. So this concern probably
needs to be ignored in favour of a possible future C API level
solution.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

