[Python-Dev] PEP 451 update

Nick Coghlan ncoghlan at gmail.com
Sun Oct 27 04:44:11 CET 2013


On 27 October 2013 03:27, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On 26 October 2013 08:51, PJ Eby <pje at telecommunity.com> wrote:
>> Mostly, this just seems like an ugly wart -- Python should be dynamic
>> by default, and that includes reloading.  While the import machinery
>> has lots of ugly caching under the hood, a user-level function like
>> reload() should not require you to do the equivalent of saying, "no,
>> really...  I want you to *really* reload, not just pull in whatever
>> exists where you found it last time, while ignoring whether I switched
>> from module to package or vice versa, or just fixed my sys.path so I
>> can load the right version of the module."
>>
>> It is a really tiny thing in the overall scheme of things, because
>> reload() is not used all that often, but it's still a thing.  If this
>> isn't treated as a bug, then the docs for reload() at least need to
>> include a forward-supported workaround so you can say "no, really...
>> *really* reload" in an approved fashion.
>>
>> (ISTM that any production code out there that currently uses reload()
>> would want to perform the "really reload" incantation in order to
>> avoid the edge cases, even if they haven't actually run into any of
>> them yet.)
>
> Having imp.reload() kick off the full reload cycle makes sense to me.
> The only reason it doesn't currently do that in importlib is because
> there was no test for it in the regression test suite, and PEP 302 and
> the library reference are entirely vague on how module reloading works
> (aside from the requirement that it reuse the same module namespace).

I've been thinking about this, and I've come to the conclusion that:

1. The old __name__ based behaviour was broken (since __name__ may not
be the actual module name, as in "__main__" or when a pseudo-module is
lying about its identity)
2. The 3.3 behaviour is still broken for similar reasons *and* added
the dependency on __loader__
3. Even PEP 451 still underspecifies the semantics of reloading (e.g.
it's implied create_module isn't called for reload(), but not stated
explicitly)

Accordingly, I think we should add a "How Reloading Will Work"
section, akin to the existing "How Loading Will Work"
(http://www.python.org/dev/peps/pep-0451/#how-loading-will-work). We
may also want to spit out some import warnings for edge cases that are
likely to do the wrong thing.

Side note: in "How Loading Will Work", _init_module_attrs needs to be
passed both the module *and* the spec (current text only passes it the
module). It also needs to be updated to handle the namespace package
case (as shown below)

And a sketch of some possible fully specified reload semantics:

    def reload(module):
        # Get the name to reload from the spec if it is available,
otherwise __name__
        current_spec = getattr(module, "__spec__")
        current_name = module.__name__
        name_to_reload = current_name if current_spec is None else
current_spec.name
        # Check this module is properly imported before trying to reload it
        loaded_module = sys.modules.get(name_to_reload)
        if loaded_module is not module:
            msg = "module {} not in sys.modules"
            raise ImportError(msg.format(name_to_reload), name=name_to_reload)
        parent_name = name_to_reload.rpartition('.')[0]
        while parent_name:
            if parent_name not in sys.modules:
                msg = "parent {!r} not in sys.modules"
                raise ImportError(msg.format(parent_name), name=parent_name)
            parent_name = parent_name.rpartition('.')[0]
        # Check for a modified spec (e.g. if the import hooks have changed)
        updated_spec = importlib.find_spec(name_to_reload)
        # The import-related module attributes get updated here
        # We leave __name__ alone, so reloading __main__ has some hope
of working correctly
        _init_module_attrs(loaded_module, updated_spec, set_name=False)

        # Now re-execute the module
        loader = updated_spec.loader
        if loader is None:
            # Namespace package, already reinitialised above
            return loaded_module
        elif hasattr(loader, 'exec_module'):
            loader.exec_module(module)
        else:
            loader.load_module(name_to_reload)

        # Reloading modules that replace themselves in sys.modules doesn't work
        reloaded_module = sys.modules[name_to_reload]
        if reloaded_module is not loaded_module:
            # Perhaps this should initially be a DeprecationWarning?
            # We've never actively *prevented* this case before, even
though it is
            # almost guaranteed to break things if you do it.
Alternatively, we could
            # leave "imp.reload()" as the more permissive version, and
include this
            # check in the new importlib.reload() (with imp.reload
calling an internal
            # importlib._unchecked_reload() instead)
            msg = "Module {!r} was replaced in sys.modules during reload"
            raise ImportError(msg.format(name_to_reload), name=name_to_reload)
        return reloaded_module

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


More information about the Python-Dev mailing list