[Import-SIG] PEP 451 (ModuleSpec) round 3

Eric Snow ericsnowcurrently at gmail.com
Wed Aug 28 10:50:55 CEST 2013


I've incorporated the feedback into the PEP and gave up on trying to
re-purpose Finder.find_module() (which wasn't worth it).  Let me know what
you think.  I'll have the implementation up on
http://bugs.python.org/issue18864 in the next couple days.

-eric

----------------------------------------------------------------------------------------

PEP: 451
Title: A ModuleSpec Type for the Import System
Version: $Revision$
Last-Modified: $Date$
Author: Eric Snow <ericsnowcurrently at gmail.com>
Discussions-To: import-sig at python.org
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 8-Aug-2013
Python-Version: 3.4
Post-History: 8-Aug-2013
              28-Aug-2013
Resolution:


Abstract
========

This PEP proposes to add a new class to ``importlib.machinery`` called
``ModuleSpec``.  It will be authoritative for all the import-related
information about a module, and will be available without needing to
load the module first.  Finders will provide a module's spec instead of
a loader.  The import machinery will be adjusted to take advantage of
module specs, including using them to load modules.


Motivation
==========

The import system has evolved over the lifetime of Python.  In late 2002
PEP 302 introduced standardized import hooks via ``finders`` and
``loaders`` and ``sys.meta_path``.  The ``importlib`` module, introduced
with Python 3.1, now exposes a pure Python implementation of the APIs
described by PEP 302, as well as of the full import system.  It is now
much easier to understand and extend the import system.  While a benefit
to the Python community, this greater accessibility also presents a
challenge.

As more developers come to understand and customize the import system,
any weaknesses in the finder and loader APIs will be more impactful.  So
the sooner we can address any such weaknesses in the import system, the
better...and there are a couple we can take care of with this proposal.

Firstly, any time the import system needs to save information about a
module we end up with more attributes on module objects that are
generally only meaningful to the import system and occasionally to some
people.  It would be nice to have a per-module namespace to put future
import-related information.  Secondly, there's an API void between
finders and loaders that causes undue complexity when encountered.

Currently finders are strictly responsible for providing the loader
which the import system will use to load the module.  The loader is then
responsible for doing some checks, creating the module object, setting
import-related attributes, "installing" the module to ``sys.modules``,
and loading the module, along with some cleanup.  This all takes place
during the import system's call to ``Loader.load_module()``.  Loaders
also provide some APIs for accessing data associated with a module.

Loaders are not required to provide any of the functionality of
``load_module()`` through other methods.  Thus, though the import-
related information about a module is likely available without loading
the module, it is not otherwise exposed.

Furthermore, the requirements associated with ``load_module()`` are
common to all loaders and mostly are implemented in exactly the same
way.  This means every loader has to duplicate the same boilerplate
code.  ``importlib.util`` provides some tools that help with this, but
it would be more helpful if the import system simply took charge of
these responsibilities.  The trouble is that this would limit the degree
of customization that ``load_module()`` facilitates.  This is a gap
between finders and loaders which this proposal aims to fill.

Finally, when the import system calls a finder's ``find_module()``, the
finder makes use of a variety of information about the module that is
useful outside the context of the method.  Currently the options are
limited for persisting that per-module information past the method call,
since it only returns the loader.  Popular workarounds for this limitation
are to store the information in a module-to-info mapping somewhere on
the finder itself, or store it on the loader.

Unfortunately, loaders are not required to be module-specific.  On top
of that, some of the useful information finders could provide is
common to all finders, so ideally the import system could take care of
that.  This is the same gap as before between finders and loaders.

As an example of complexity attributable to this flaw, the
implementation of namespace packages in Python 3.3 (see PEP 420) added
``FileFinder.find_loader()`` because there was no good way for
``find_module()`` to provide the namespace search locations.

The answer to this gap is a ``ModuleSpec`` object that contains the
per-module information and takes care of the boilerplate functionality
of loading the module.

(The idea gained momentum during discussions related to another PEP.[1])


Specification
=============

The goal is to address the gap between finders and loaders while
changing as little of their semantics as possible.  Though some
functionality and information is moved to the new ``ModuleSpec`` type,
their behavior should remain the same.  However, for the sake of clarity
the finder and loader semantics will be explicitly identified.

This is a high-level summary of the changes described by this PEP.  More
detail is available in later sections.

importlib.machinery.ModuleSpec (new)
------------------------------------

Attributes:

* name - a string for the name of the module.
* loader - the loader to use for loading and for module data.
* origin - a string for the location from which the module is loaded.
* submodule_search_locations - strings for where to find submodules,
  if a package.
* loading_info - a container of data for use during loading (or None).
* cached (property) - a string for where the compiled module will be
  stored.
* has_location (RO-property) - the module's origin refers to a location.

.. XXX Find a better name than loading_info?
.. XXX Add ``submodules`` (RO-property) - returns possible submodules
   relative to spec (or None)?
.. XXX Add ``loaded`` (RO-property) - the module in sys.modules, if any?

Factory Methods:

* from_file_location() - factory for file-based module specs.
* from_module() - factory based on import-related module attributes.
* from_loader() - factory based on information provided by loaders.

.. XXX Move the factories to importlib.util or make class-only?

Instance Methods:

* init_module_attrs() - populate a module's import-related attributes.
* module_repr() - provide a repr string for a module.
* create() - provide a new module to use for loading.
* exec() - execute the spec into a module namespace.
* load() - prepare a module and execute it in a protected way.
* reload() - re-execute a module in a protected way.

.. XXX Make module_repr() match the spec (BC problem?)?

API Additions
-------------

* ``importlib.abc.Loader.exec_module()`` will execute a module in its
  own namespace, replacing ``importlib.abc.Loader.load_module()``.
* ``importlib.abc.Loader.create_module()`` (optional) will return a new
  module to use for loading.
* Module objects will have a new attribute: ``__spec__``.
* ``importlib.find_spec()`` will return the spec for a module.
* ``__subclasshook__()`` will be implemented on the importlib ABCs.

.. XXX Do __subclasshook__() separately from the PEP (issue18862).

API Changes
-----------

* Import-related module attributes will no longer be authoritative nor
  used by the import system.
* ``InspectLoader.is_package()`` will become optional.

.. XXX module __repr__() will prefer spec attributes?

Deprecations
------------

* ``importlib.abc.MetaPathFinder.find_module()``
* ``importlib.abc.PathEntryFinder.find_module()``
* ``importlib.abc.PathEntryFinder.find_loader()``
* ``importlib.abc.Loader.load_module()``
* ``importlib.abc.Loader.module_repr()``
* The parameters and attributes of the various loaders in
  ``importlib.machinery``
* ``importlib.util.set_package()``
* ``importlib.util.set_loader()``
* ``importlib.find_loader()``

Removals
--------

* ``importlib.abc.Loader.init_module_attrs()``
* ``importlib.util.module_to_load()``

Other Changes
-------------

* The spec for the ``__main__`` module will reflect the appropriate
  name and origin.
* The module type's ``__repr__`` will defer to ModuleSpec exclusively.

Backward-Compatibility
----------------------

* If a finder does not define ``find_spec()``, a spec is derived from
  the loader returned by ``find_module()``.
* ``PathEntryFinder.find_loader()`` will be used, if defined.
* ``Loader.load_module()`` is used if ``exec_module()`` is not defined.
* ``Loader.module_repr()`` is used by ``ModuleSpec.module_repr()`` if it
  exists.

What Will not Change?
---------------------

* The syntax and semantics of the import statement.
* Existing finders and loaders will continue to work normally.
* The import-related module attributes will still be initialized with
  the same information.
* Finders will still create loaders, storing them in the specs.
* ``Loader.load_module()``, if a loader defines it, will have all the
  same requirements and may still be called directly.
* Loaders will still be responsible for module data APIs.


ModuleSpec Users
================

``ModuleSpec`` objects have three distinct target audiences: Python itself,
import hooks, and normal Python users.

Python will use specs in the import machinery, in interpreter startup,
and in various standard library modules.  Some modules are
import-oriented, like pkgutil, and others are not, like pickle and
pydoc.  In all cases, the full ``ModuleSpec`` API will get used.

Import hooks (finders and loaders) will make use of the spec in specific
ways, mostly without using the ``ModuleSpec`` instance methods.  First
of all, finders will use the factory methods to create spec objects.
They may also directly adjust the spec attributes after the spec is
created.  Secondly, the finder may bind additional information to the
spec for the loader to consume during module creation/execution.
Finally, loaders will make use of the attributes on a spec when creating
and/or executing a module.

Python users will be able to inspect a module's ``__spec__`` to get
import-related information about the object.  Generally, they will not
be using the ``ModuleSpec`` factory methods nor the instance methods.
However, each spec has methods named ``create``, ``exec``, ``load``, and
``reload``.  Since they are so easy to access (and misunderstand/abuse),
their function and availability require explicit consideration in this
proposal.
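
As a small illustration of that kind of inspection, the snippet below
reads a few spec attributes from modules that are already imported.  It
relies on the API as it eventually shipped in Python 3.4 and later, which
may differ in detail from this draft; ``json`` and ``sys`` are used only
as convenient standard library examples.

```python
import json
import sys

# Every imported module carries its spec on ``__spec__``.
json_spec = json.__spec__
sys_spec = sys.__spec__

# ``json`` is a package, so it has search locations for its submodules.
print(json_spec.name, json_spec.submodule_search_locations is not None)

# ``sys`` is built in: it has an origin but no location to load from.
print(sys_spec.name, sys_spec.origin, sys_spec.has_location)
```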


What Will Existing Finders and Loaders Have to Do Differently?
==============================================================

Immediately?  Nothing.  The status quo will be deprecated, but will
continue working.  However, here are the things that the authors of
finders and loaders should change relative to this PEP:

* Implement ``find_spec()`` on finders.
* Implement ``exec_module()`` on loaders, if possible.

The factory methods of ``ModuleSpec`` are intended to be helpful for
converting existing finders.  ``from_loader()`` and
``from_file_location()`` are both straight-forward utilities in this
regard.  In the case where loaders already expose methods for creating
and preparing modules, a finder may use ``ModuleSpec.from_module()`` on
a throw-away module to create the appropriate spec.

As for loaders, ``exec_module()`` should be a relatively direct
conversion from a portion of the existing ``load_module()``.  However,
``Loader.create_module()`` will also be necessary in some uncommon
cases.  Furthermore, ``load_module()`` will still work as a final option
when ``exec_module()`` is not appropriate.
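
To make the conversion concrete, here is a minimal sketch of a finder
that implements ``find_spec()`` and a loader that implements
``exec_module()``.  It is written against the API as it later shipped
(``importlib.util.spec_from_loader()`` stands in for the
``from_loader()`` factory described in this draft, and ``find_spec()``
accepts an extra ``target`` argument).  The names ``InMemoryFinder``,
``InMemoryLoader``, ``SOURCES``, and ``hello_inmem`` are hypothetical.

```python
import importlib
import importlib.abc
import importlib.util
import sys

# Hypothetical in-memory "filesystem": module name -> source text.
SOURCES = {"hello_inmem": "GREETING = 'hello'\n"}


class InMemoryLoader(importlib.abc.Loader):
    def __init__(self, sources):
        self._sources = sources

    def exec_module(self, module):
        # Only execute the code; creation and cleanup are not our job.
        spec = module.__spec__
        code = compile(self._sources[spec.name], spec.origin, "exec")
        exec(code, module.__dict__)


class InMemoryFinder(importlib.abc.MetaPathFinder):
    def __init__(self, sources):
        self._sources = sources
        self._loader = InMemoryLoader(sources)

    def find_spec(self, fullname, path=None, target=None):
        if fullname not in self._sources:
            return None  # let the next finder try
        # The loader is stored on the spec rather than returned directly.
        return importlib.util.spec_from_loader(
            fullname, self._loader, origin="<in-memory>")


finder = InMemoryFinder(SOURCES)
sys.meta_path.insert(0, finder)
try:
    module = importlib.import_module("hello_inmem")
finally:
    sys.meta_path.remove(finder)
```

Note that one loader instance serves every module here, since the
per-module details travel on the spec instead of on the loader.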


How Loading Will Work
=====================

This is an outline of what happens in ``ModuleSpec.load()``.

1. A new module is created by calling ``spec.create()``.

   a. If the loader has a ``create_module()`` method, it gets called.
      Otherwise a new module gets created.
   b. The import-related module attributes are set.

2. The module is added to sys.modules.
3. ``spec.exec(module)`` gets called.

   a. If the loader has an ``exec_module()`` method, it gets called.
      Otherwise ``load_module()`` gets called for backward-compatibility
      and the resulting module is updated to match the spec.

4. If there were any errors the module is removed from sys.modules.
5. If the module was replaced in sys.modules during ``exec()``, the one
   in sys.modules is updated to match the spec.
6. The module in sys.modules is returned.

These steps are exactly what ``Loader.load_module()`` is already
expected to do.  Loaders will thus be simplified since they will only
need to implement the portion in step 3a.
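
The outline above can be sketched in plain Python.  This is an
illustration only, not the reference implementation: ``load_from_spec()``
is a hypothetical stand-in for ``ModuleSpec.load()``, it sets only a
subset of the import-related attributes, and it omits both the
``load_module()`` fallback and the import lock.  ``ConstantLoader`` is a
toy loader invented for the example.

```python
import sys
import types
from importlib.machinery import ModuleSpec


def load_from_spec(spec):
    """Hypothetical stand-in for the draft's ModuleSpec.load()."""
    loader = spec.loader
    if loader is None:
        raise ValueError("spec has no loader")
    # Step 1a: let the loader create the module if it wants to.
    create = getattr(loader, "create_module", None)
    module = create(spec) if create is not None else None
    if module is None:
        module = types.ModuleType(spec.name)
    # Step 1b: set (a subset of) the import-related attributes.
    module.__spec__ = spec
    module.__loader__ = loader
    # Step 2: install the module before executing it.
    sys.modules[spec.name] = module
    try:
        # Step 3: execute the module in its own namespace.
        loader.exec_module(module)
    except BaseException:
        # Step 4: undo the installation on any error.
        sys.modules.pop(spec.name, None)
        raise
    # Steps 5 and 6: return whatever is in sys.modules now, since the
    # module may have replaced itself while executing.
    return sys.modules[spec.name]


class ConstantLoader:
    """Toy loader that only implements the 'exec' portion (step 3a)."""

    def exec_module(self, module):
        module.answer = 42


spec = ModuleSpec("demo_constant", ConstantLoader())
module = load_from_spec(spec)
```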


ModuleSpec
==========

This is a new class which defines the import-related values to use when
loading the module.  It closely corresponds to the import-related
attributes of module objects.  ``ModuleSpec`` objects may also be used
by finders and loaders and other import-related APIs to hold extra
import-related state concerning the module.  This greatly reduces the
need to add any new import-related attributes to module objects, and
loader ``__init__`` methods will no longer need to accommodate such
per-module state.

General Notes
-------------

* The spec for each module instance will be unique to that instance even
  if the information is identical to that of another spec.
* A module's spec is not intended to be modified by anything but
  finders.

Creating a ModuleSpec
---------------------

**ModuleSpec(name, loader, *, origin=None, is_package=None)**

.. container::

   ``name``, ``loader``, and ``origin`` are set on the new instance
   without any modification.  If ``is_package`` is not passed in, the
   loader's ``is_package()`` gets called (if available), or it defaults
   to ``False``.  If ``is_package`` is true,
   ``submodule_search_locations`` is set to a new empty list.  Otherwise
   it is set to None.

   Other attributes not listed as parameters (such as ``package``) are
   either read-only dynamic properties or default to None.

**from_filename(name, loader, *, filename=None,
submodule_search_locations=None)**

.. container::

   This factory classmethod allows a suitable ModuleSpec instance to be
   easily created with extra file-related information.  This includes
   the values that would be set on a module as ``__file__`` or
   ``__cached__``.

   ``has_location`` is set to True for specs created using
   ``from_filename()``.

**from_module(module, loader=None)**

.. container::

   This factory is used to create a spec based on the import-related
   attributes of an existing module.  Since modules should already have
   ``__spec__`` set, this method has limited utility.

**from_loader(name, loader, *, origin=None, is_package=None)**

.. container::

   A factory classmethod that returns a new ``ModuleSpec`` derived from
   the arguments.  ``is_package`` is used inside the method to indicate
   that the module is a package.  If not explicitly passed in, it falls
   back to using the result of the loader's ``is_package()``, if
   available.  If not available, it defaults to False.

   In contrast to ``ModuleSpec.__init__()``, which takes the arguments
   as-is, ``from_loader()`` calculates missing values from the ones
   passed in, as much as possible.  This replaces the behavior that is
   currently provided by several ``importlib.util`` functions as well as
   the optional ``init_module_attrs()`` method of loaders.  Just to be
   clear, here is a more detailed description of those calculations::

      If not passed in, ``filename`` is set to the result of calling the
      loader's ``get_filename()``, if available.  Otherwise it stays
      unset (``None``).

      If not passed in, ``submodule_search_locations`` is set to an empty
      list if ``is_package`` is true.  Then the directory from ``filename``
      is appended to it, if possible.  If ``is_package`` is false,
      ``submodule_search_locations`` stays unset.

      If ``cached`` is not passed in and ``filename`` is passed in,
      ``cached`` is derived from it.  For filenames with a source suffix,
      it is set to the result of calling
      ``importlib.util.cache_from_source()``.  For bytecode suffixes (e.g.
      ``.pyc``), ``cached`` is set to the value of ``filename``.  If
      ``filename`` is not passed in or ``cache_from_source()`` raises
      ``NotImplementedError``, ``cached`` stays unset.

      If not passed in, ``origin`` is set to ``filename``.  Thus if
      ``filename`` is unset, ``origin`` stays unset.
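
The file-related part of those calculations might look roughly like the
sketch below.  ``derive_file_attrs()`` is a hypothetical helper, not part
of the proposal; it assumes the filename has already been obtained (from
the caller or from the loader's ``get_filename()``) and that neither
``cached`` nor ``submodule_search_locations`` was passed in.

```python
import importlib.machinery
import importlib.util
import os.path


def derive_file_attrs(filename, is_package):
    """Return (origin, cached, submodule_search_locations) for a file."""
    if filename is None:
        # Nothing to derive from: everything stays unset.
        return None, None, ([] if is_package else None)

    # Packages search for submodules next to their own file.
    search = [os.path.dirname(filename)] if is_package else None

    cached = None
    if filename.endswith(tuple(importlib.machinery.SOURCE_SUFFIXES)):
        try:
            cached = importlib.util.cache_from_source(filename)
        except NotImplementedError:
            cached = None  # this implementation has no bytecode cache
    elif filename.endswith(tuple(importlib.machinery.BYTECODE_SUFFIXES)):
        cached = filename  # a bytecode file is its own cache

    # origin falls back to the filename.
    return filename, cached, search


pkg_file = os.path.join("pkg", "__init__.py")
pkg_attrs = derive_file_attrs(pkg_file, is_package=True)
pyc_attrs = derive_file_attrs("mod.pyc", is_package=False)
```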


Attributes
----------

Each of the following names is an attribute on ``ModuleSpec`` objects.
A value of ``None`` indicates "not set".  This contrasts with module
objects where the attribute simply doesn't exist.

While ``package`` is a read-only property, the remaining attributes can
be replaced after the module spec is created and even after import is
complete.  This allows for unusual cases where directly modifying the
spec is the best option.  However, typical use should not involve
changing the state of a module's spec.

Most of the attributes correspond to the import-related attributes of
modules.  Here is the mapping, followed by a description of the
attributes.  The reverse of this mapping is used by
``ModuleSpec.init_module_attrs()``.

========================== ===========
On ModuleSpec              On Modules
========================== ===========
name                       __name__
loader                     __loader__
package                    __package__
origin                     __file__*
cached                     __cached__*
submodule_search_locations __path__**
loading_info                \-
has_location (RO-property)  \-
========================== ===========

\* Only if ``has_location`` is true.
\*\* Only if not None.

**name**

.. container::

   The module's fully resolved and absolute name.  It must be set.

**loader**

.. container::

   The loader to use during loading and for module data.  These specific
   functionalities do not change for loaders.  Finders are still
   responsible for creating the loader and this attribute is where it is
   stored.  The loader must be set.

**origin**

.. container::

   A string for the location from which the module originates.  Aside from
   the informational value, it is also used in ``module_repr()``.

   The module attribute ``__file__`` has a similar but more restricted
   meaning.  Not all modules have it set (e.g. built-in modules).  However,
   ``origin`` is applicable to essentially all modules.  For built-in
   modules it would be set to "built-in".

Secondary Attributes
--------------------

Some of the ``ModuleSpec`` attributes are not set via arguments when
creating a new spec.  Either they are strictly dynamically calculated
properties or they are simply set to None (aka "not set").  For the
latter case, those attributes may still be set directly.

**package**

.. container::

   A dynamic property that gives the name of the module's parent.  The
   value is derived from ``name`` and ``is_package``.  For packages it is
   the value of ``name``.  Otherwise it is equivalent to
   ``name.rpartition('.')[0]``.  Consequently, a top-level module will have
   the empty string for ``package``.
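
The rule for ``package`` is small enough to sketch directly.
``package_for()`` is a hypothetical helper written for this
illustration; the ``ModuleSpec`` class that later shipped exposes the
same calculation under the name ``parent``, which the last lines use for
comparison.

```python
from importlib.machinery import ModuleSpec


def package_for(name, is_package):
    """Name of the package a module belongs to, per the rule above."""
    # A package is its own package; anything else belongs to its parent.
    return name if is_package else name.rpartition(".")[0]


submodule = package_for("a.b.c", is_package=False)
package = package_for("a.b", is_package=True)
top_level = package_for("top", is_package=False)

# The shipped ModuleSpec computes the same values as ``parent``.
shipped_submodule = ModuleSpec("a.b.c", None).parent
shipped_package = ModuleSpec("a.b", None, is_package=True).parent
```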

**has_location**

.. container::

   Some modules can be loaded by reference to a location, e.g. a filesystem
   path or a URL or something of the sort.  Having the location lets you
   load the module, but in theory you could load that module under various
   names.

   In contrast, non-located modules can't be loaded in this fashion, e.g.
   builtin modules and modules dynamically created in code.  For these, the
   name is the only way to access them, so they have an "origin" but not a
   "location".

   This attribute reflects whether or not the module is locatable.  If it
   is, ``origin`` must be set to the module's location and ``__file__``
   will be set on the module.  Furthermore, a locatable module is also
   cacheable and so ``__cached__`` is tied to ``has_location``.

   The corresponding module attribute name, ``__file__``, is somewhat
   inaccurate and potentially confusing, so we will use a more explicit
   combination of ``origin`` and ``has_location`` to represent the same
   information.  Having a separate ``filename`` is unnecessary since we have
   ``origin``.

**cached**

.. container::

   A string for the location where the compiled code for a module should be
   stored.  PEP 3147 details the caching mechanism of the import system.

   If ``has_location`` is true, this location string is set on the module
   as ``__cached__``.  When ``from_filename()`` is used to create a spec,
   ``cached`` is set to the result of calling
   ``importlib.util.cache_from_source()``.

   ``cached`` is not necessarily a file location.  A finder or loader may
   store an alternate location string in ``cached``.  However, in practice
   this will be the file location dictated by PEP 3147.

**submodule_search_locations**

.. container::

   The list of location strings, typically directory paths, in which to
   search for submodules.  If the module is a package this will be set to
   a list (even an empty one).  Otherwise it is ``None``.

   The corresponding module attribute's name, ``__path__``, is relatively
   ambiguous.  Instead of mirroring it, we use a more explicit name that
   makes the purpose clear.

**loading_info**

.. container::

   A finder may set ``loading_info`` to any value to provide additional
   data for the loader to use during loading.  A value of ``None`` is the
   default and indicates that there is no additional data.  Otherwise it is
   likely set to some container, such as a ``dict``, ``list``, or
   ``types.SimpleNamespace`` containing the relevant extra information.

   For example, ``zipimporter`` could use it to pass the zip archive name
   to the loader directly, rather than needing to derive it from ``origin``
   or create a custom loader for each find operation.
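
A sketch of that hand-off from finder to loader follows.  It uses the
API as it later shipped, where this attribute is named ``loader_state``
rather than ``loading_info``.  ``ArchiveLoader``, the ``archive`` key,
and the file names are hypothetical, and the spec is built by hand in
place of a real finder.

```python
import importlib.abc
import importlib.util
from importlib.machinery import ModuleSpec


class ArchiveLoader(importlib.abc.Loader):
    """One shared loader; per-module data arrives on the spec."""

    def exec_module(self, module):
        state = module.__spec__.loader_state
        # A real loader would read code from the archive; this sketch
        # only records which archive it was told to use.
        module.ARCHIVE = state["archive"]


shared_loader = ArchiveLoader()

# What a finder might return from find_spec() for one module.
spec = ModuleSpec(
    "demo_archived",
    shared_loader,
    origin="example.zip/demo_archived.py",
    loader_state={"archive": "example.zip"},
)

module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
```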

Methods
-------

**module_repr()**

.. container::

   Returns a repr string for the module, based on the module's import-
   related attributes and falling back to the spec's attributes.  The
   string will reflect the current output of the module type's
   ``__repr__()``.

   The module type's ``__repr__()`` will use the module's ``__spec__``
   exclusively.  If the module does not have ``__spec__`` set, a spec is
   generated using ``ModuleSpec.from_module()``.

   Since the module attributes may be out of sync with the spec and to
   preserve backward-compatibility in that case, we defer to the module
   attributes and only when they are missing do we fall back to the spec
   attributes.

**init_module_attrs(module)**

.. container::

   Sets the module's import-related attributes to the corresponding values
   in the module spec.  If ``has_location`` is false on the spec,
   ``__file__`` and ``__cached__`` are not set on the module.  ``__path__``
   is only set on the module if ``submodule_search_locations`` is not None.
   For the rest of the import-related module attributes, a ``None`` value
   on the spec (aka "not set") means ``None`` will be set on the module.
   If any of the attributes are already set on the module, the existing
   values are replaced.  The module's own ``__spec__`` is not consulted but
   does get replaced with the spec on which ``init_module_attrs()`` was
   called.  The earlier mapping of ``ModuleSpec`` attributes to module
   attributes indicates which attributes are involved on both sides.
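
The rules above could be sketched as a free function.  This is an
assumption-laden illustration, not the proposed implementation: it works
on the ``ModuleSpec`` class that later shipped, where the draft's
``package`` attribute is spelled ``parent``.

```python
import types
from importlib.machinery import ModuleSpec


def init_module_attrs(spec, module):
    """Copy import-related values from a spec onto a module (sketch)."""
    # These are always set, replacing whatever was there before.
    module.__name__ = spec.name
    module.__loader__ = spec.loader
    module.__package__ = spec.parent
    module.__spec__ = spec
    # File-related attributes only apply to modules with a location.
    if spec.has_location:
        module.__file__ = spec.origin
        module.__cached__ = spec.cached
    # ``__path__`` marks a package, so it is set only when search
    # locations exist (an empty list still counts).
    if spec.submodule_search_locations is not None:
        module.__path__ = spec.submodule_search_locations
    return module


spec = ModuleSpec("demo_pkg", None, origin="built-in", is_package=True)
module = init_module_attrs(spec, types.ModuleType("placeholder"))
```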

**create()**

.. container::

   A new module is created relative to the spec and its import-related
   attributes are set accordingly.  If the spec's loader has a
   ``create_module()`` method, that gets called to create the module.  This
   gives the loader a chance to do any pre-loading initialization that can't
   otherwise be accomplished elsewhere.  Otherwise a bare module object is
   created.  In both cases ``init_module_attrs()`` is called on the module
   before it gets returned.

**exec(module)**

.. container::

   The spec's loader is used to execute the module.  If the loader has
   ``exec_module()`` defined, the namespace of ``module`` is the target of
   execution.  Otherwise the loader's ``load_module()`` is called, which
   ignores ``module`` and returns the module that was the actual
   execution target.  In that case the import-related attributes of that
   module are updated to reflect the spec.  In both cases the targeted
   module is the one that gets returned.

**load()**

.. container::

   This method captures the current functionality of and requirements on
   ``Loader.load_module()`` without any semantic changes.  It is
   essentially a wrapper around ``create()`` and ``exec()`` with some
   extra functionality regarding ``sys.modules``.

   Right before ``exec()`` is called, the module is added to
   ``sys.modules``.  If an error occurs during loading, the module is
   removed from ``sys.modules``.  A module may replace itself in
   ``sys.modules`` while it is executing, so the module found in
   ``sys.modules`` when ``load()`` finishes is the one that gets
   returned.

   As already noted, this is what already happens in the import system.
   ``load()`` is not meant to change any of this behavior.

   If ``loader`` is not set (``None``), ``load()`` raises a ValueError.

**reload(module)**

.. container::

   As with ``load()`` this method faithfully fulfills the semantics of
   ``Loader.load_module()`` in the reload case, with one exception:
   reloading a module when ``exec_module()`` is available actually uses
   ``module`` rather than ignoring it in favor of the one in
   ``sys.modules``, as ``Loader.load_module()`` does.  The functionality
   here mirrors that of ``load()``, minus the ``create()`` call and the
   ``sys.modules`` handling.

.. XXX add more of importlib.reload()'s boilerplate to reload()?

Omitted Attributes and Methods
------------------------------

There is no ``PathModuleSpec`` subclass of ``ModuleSpec`` that provides
the ``has_location``, ``cached``, and ``submodule_search_locations``
functionality.  While that might make the separation cleaner, module
objects don't have that distinction.  ``ModuleSpec`` will support both
cases equally well.

While ``is_package`` would be a simple additional attribute (aliasing
``self.submodule_search_locations is not None``), it perpetuates the
artificial (and mostly erroneous) distinction between modules and
packages.

Conceivably, ``ModuleSpec.load()`` could optionally take a list of
modules with which to interact instead of ``sys.modules``.  That
capability is left out of this PEP, but may be pursued separately at
some other time, including relative to PEP 406 (import engine).

Likewise ``load()`` could be leveraged to implement multi-version
imports.  While interesting, doing so is outside the scope of this
proposal.

Backward Compatibility
----------------------

``ModuleSpec`` doesn't have any.  This would be a different story if
``Finder.find_module()`` were to return a module spec instead of a loader.
In that case, specs would have to act like the loader that would have
been returned instead.  Doing so would be relatively simple, but is an
unnecessary complication.

Subclassing
-----------

Subclasses of ModuleSpec are allowed, but should not be necessary.
Simply setting ``loading_info`` or adding functionality to a custom
finder or loader will likely be a better fit and should be tried first.
However, as long as a subclass still fulfills the requirements of the
import system, objects of that type are completely fine as the return
value of ``Finder.find_spec()``.


Existing Types
==============

Module Objects
--------------

**__spec__**

.. container::

   Module objects will now have a ``__spec__`` attribute to which the
   module's spec will be bound.

None of the other import-related module attributes will be changed or
deprecated, though some of them could be; any such deprecation can wait
until Python 4.

``ModuleSpec`` objects will not be kept in sync with the corresponding
module object's import-related attributes.  Though they may differ, in
practice they will typically be the same.

One notable exception is the case where a module is run as a script by
using the ``-m`` flag.  In that case ``module.__spec__.name`` will
reflect the actual module name while ``module.__name__`` will be
``__main__``.

Finders
-------

**MetaPathFinder.find_spec(name, path=None)**

**PathEntryFinder.find_spec(name)**

.. container::

   Finders will return ModuleSpec objects when ``find_spec()`` is
   called.  This new method replaces ``find_module()`` and
   ``find_loader()`` (in the ``PathEntryFinder`` case).  If a finder does
   not have ``find_spec()``, ``find_module()`` and ``find_loader()`` are
   used instead, for backward-compatibility.

   Adding yet another similar method to finders is a case of practicality.
   ``find_module()`` could be changed to return specs instead of loaders.
   This is tempting because the import APIs have suffered enough,
   especially considering ``PathEntryFinder.find_loader()`` was just
   added in Python 3.3.  However, the extra complexity and a less-than-
   explicit method name aren't worth it.

Finders are still responsible for creating the loader.  That loader will
now be stored in the module spec returned by ``find_spec()`` rather
than returned directly.  As is currently the case without the PEP, if a
loader would be costly to create, that loader can be designed to defer
the cost until later.

Loaders
-------

**Loader.exec_module(module)**

.. container::

   Loaders will have a new method, ``exec_module()``.  Its only job
   is to "exec" the module and consequently populate the module's
   namespace.  It is not responsible for creating or preparing the module
   object, nor for any cleanup afterward.  It has no return value.

**Loader.load_module(fullname)**

.. container::

   The ``load_module()`` of loaders will still work and be an active part
   of the loader API.  It is still useful for cases where the default
   module creation/preparation/cleanup is not appropriate for the loader.
   If implemented, ``load_module()`` will still be responsible for its
   current requirements (prep/exec/etc.) since the method may be called
   directly.

   For example, the C API for extension modules only supports the full
   control of ``load_module()``.  As such, ``ExtensionFileLoader`` will not
   implement ``exec_module()``.  In the future it may be appropriate to
   produce a second C API that would support an ``exec_module()``
   implementation for ``ExtensionFileLoader``.  Such a change is outside
   the scope of this PEP.

A loader must define either ``exec_module()`` or ``load_module()``.  If
both exist on the loader, ``ModuleSpec.load()`` uses ``exec_module()``
and ignores ``load_module()``.

**Loader.create_module(spec)**

.. container::

   Loaders may also implement ``create_module()`` that will return a
   new module to exec.  However, most loaders will not need to implement
   the method.

PEP 420 introduced the optional ``module_repr()`` loader method to limit
the amount of special-casing in the module type's ``__repr__()``.  Since
this method is part of ``ModuleSpec``, it will be deprecated on loaders.
However, if it exists on a loader it will be used exclusively.

The ``Loader.init_module_attrs()`` method, added prior to Python 3.4's
release, will be removed in favor of the same method on ``ModuleSpec``.

However, ``InspectLoader.is_package()`` will not be deprecated even
though the same information is found on ``ModuleSpec``.  ``ModuleSpec``
can use it to populate its own ``is_package`` if that information is
not otherwise available.  Still, it will be made optional.

The path-based loaders in ``importlib`` take arguments in their
``__init__()`` and have corresponding attributes.  However, the need for
those values is eliminated by module specs.  The only exception is
``FileLoader.get_filename()``, which uses ``self.path``.  The signatures
for these loaders and the accompanying attributes will be deprecated.

In addition to executing a module during loading, loaders will still be
directly responsible for providing APIs concerning module-related data.


Other Changes
=============

* The various finders and loaders provided by ``importlib`` will be
  updated to comply with this proposal.
* The spec for the ``__main__`` module will reflect how the interpreter
  was started.  For instance, with ``-m`` the spec's name will be that
  of the run module, while ``__main__.__name__`` will still be
  "__main__".
* We add ``importlib.find_spec()`` to mirror
  ``importlib.find_loader()`` (which becomes deprecated).
* Deprecations in ``importlib.util``: ``set_package()``,
  ``set_loader()``, and ``module_for_loader()``.  ``module_to_load()``
  (introduced prior to Python 3.4's release) can be removed.
* ``importlib.reload()`` is changed to use ``ModuleSpec.load()``.
* ``ModuleSpec.load()`` and ``importlib.reload()`` will now make use of
  the per-module import lock, whereas ``Loader.load_module()`` did not.


Reference Implementation
========================

A reference implementation will be available at
http://bugs.python.org/issue18864.


Open Issues
==============

\* The impact of this change on pkgutil (and setuptools) needs looking
into.  It has some generic function-based extensions to PEP 302.  These
may break if importlib starts wrapping loaders without the tools'
knowledge.

\* Other modules to look at: runpy (and pythonrun.c), pickle, pydoc,
inspect.

\* Add ``ModuleSpec.data`` as a descriptor that wraps the data API of the
spec's loader?

\* How to limit possible end-user confusion/abuses relative to spec
attributes (since __spec__ will make them really accessible)?


References
==========

[1] http://mail.python.org/pipermail/import-sig/2013-August/000658.html


Copyright
=========

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: