[Python-ideas] PEP for executing a module in a package containing relative imports

Brett Cannon brett at python.org
Fri Apr 20 05:38:42 CEST 2007


Some of you might remember a discussion that took place on this list
about not being able to execute a script contained in a package that
used relative imports (read the PEP if you don't quite get what I am
talking about).  The PEP below proposes a solution (along with a
counter-solution).

Let me know what you think.  I especially want to hear which proposal
people prefer; the one in the PEP or the one in the Open Issues
section.  Plus I wouldn't mind suggestions on a title for this PEP.
=)

-------------------------------------------
PEP: XXX
Title: XXX
Version: $Revision: 52916 $
Last-Modified: $Date: 2006-12-04 11:59:42 -0800 (Mon, 04 Dec 2006) $
Author: Brett Cannon
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: XXX-Apr-2007

Abstract
========

Because of how name resolution works for relative imports in a world
where PEP 328 is implemented, it is no longer possible to execute
modules within a package.  This failing stems from the fact that
the module being executed as the "main" module replaces its
``__name__`` attribute with ``"__main__"`` instead of leaving it as
the actual, absolute name of the module.  This breaks import's ability
to resolve relative imports from the main module into absolute names.

In order to resolve this issue, this PEP proposes to change how a
module is delineated as the module that is being executed as the main
module.  By leaving the ``__name__`` attribute in a module alone and
setting a module attribute named ``__main__`` to a true value for the
main module (and thus false in all others), proper relative name
resolution can occur while still having a clear way for a module to
know if it is being executed as the main module.


The Problem
===========

With the introduction of PEP 328, relative imports became dependent on
the ``__name__`` attribute of the module performing the import.  This
is because the dots in a relative import are used to strip away
parts of the calling module's name to calculate where in the package
hierarchy a relative import should fall (prior to PEP 328 relative
imports could fail and would fall back on absolute imports which had a
chance of succeeding).

For instance, consider the import ``from .. import spam`` made from the
``bacon.ham.beans`` module (``bacon.ham.beans`` is not a package
itself, i.e., does not define ``__path__``).  Name resolution of the
relative import takes the caller's name (``bacon.ham.beans``), splits
on dots, and then slices off the last n parts based on the level
(which is 2).  In this example both ``ham`` and ``beans`` are dropped
and ``spam`` is joined with what is left (``bacon``).  This leads to
the proper import of the module ``bacon.spam``.
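
As a rough illustration of the steps just described, the resolution
can be sketched in Python as below.  This is not the code of
``importlib.Import._resolve_name()``; the function name
``resolve_name`` and its parameters are hypothetical, and the
``is_package`` flag encodes the assumption that a package (which
defines ``__path__``) strips one fewer component than a plain module:

```python
def resolve_name(name, caller_name, level, is_package=False):
    """Turn a relative import into an absolute module name (sketch)."""
    # Assumption: a package's own name already names the current
    # package, so it strips one fewer component than a plain module.
    strip = level - 1 if is_package else level
    parts = caller_name.split('.')
    if strip >= len(parts):
        # Nothing would be left to join with: no enclosing package.
        raise ImportError('relative import in non-package')
    base = parts[:len(parts) - strip]
    return '.'.join(base + [name])


# ``from .. import spam`` made from the module ``bacon.ham.beans``.
print(resolve_name('spam', 'bacon.ham.beans', 2))
```

With a caller name of ``'__main__'`` there are no dots to split on,
so any level of at least 1 ends in the ``ImportError`` branch.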

This reliance on the ``__name__`` attribute of a module when handling
relative imports becomes an issue when executing a script within a
package.  Because the ``__name__`` of the executing script is set to
``'__main__'``, import cannot resolve any relative imports.  This
leads to an ``ImportError`` if you try to execute a script in a
package that uses any relative import.

For example, assume we have a package named ``bacon`` with an
``__init__.py`` file containing::

  from . import spam

Also create a module named ``spam`` within the ``bacon`` package (it
can be an empty file).  Now if you try to execute the ``bacon``
package (either through ``python bacon/__init__.py`` or
``python -m bacon``) you will get an ``ImportError`` about trying to
do a relative import from within a non-package.  Obviously the import
is valid, but because of the setting of ``__name__`` to ``'__main__'``
import thinks that ``bacon/__init__.py`` is not in a package since no
dots exist in ``__name__``.  To see how the algorithm works, see
``importlib.Import._resolve_name()`` in the sandbox [#importlib]_.
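
The failure can be reproduced with a small script along the following
lines.  It is only a sketch: the helper name
``run_package_init_as_script`` is hypothetical, it assumes a current
CPython 3 interpreter (3.7 or later for ``capture_output``), and the
exact wording of the error message differs between interpreter
versions:

```python
import os
import subprocess
import sys
import tempfile


def run_package_init_as_script():
    """Build a throwaway ``bacon`` package and run its __init__.py."""
    with tempfile.TemporaryDirectory() as root:
        package_dir = os.path.join(root, 'bacon')
        os.mkdir(package_dir)
        init_path = os.path.join(package_dir, '__init__.py')
        with open(init_path, 'w') as init_file:
            init_file.write('from . import spam\n')
        # An empty spam module is enough for the import to be valid.
        open(os.path.join(package_dir, 'spam.py'), 'w').close()
        # Equivalent to running ``python bacon/__init__.py``.
        result = subprocess.run(
            [sys.executable, init_path],
            capture_output=True, text=True, cwd=root)
    return result.returncode, result.stderr
```

Calling ``run_package_init_as_script()`` should return a non-zero exit
status together with a traceback that ends in an ``ImportError``.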

Currently a work-around is to remove all relative imports in the
module being executed and make them absolute.  This is unfortunate,
though, as one should not be required to use a specific style of
import just to make a module in a package executable.


The Solution
============

The solution to the problem is to not change the value of ``__name__``
in modules.  But there still needs to be a way to let executing code
know it is being executed as a script.  This is handled with a new
module attribute named ``__main__``.

When a module is being executed as a script, ``__main__`` will be set
to a true value.  For all other modules, ``__main__`` will be set to a
false value.  This changes the current idiom of::

  if __name__ == '__main__':
      ...

to::

  if __main__:
      ...

The current idiom is not as obvious and could cause confusion for new
programmers.  The proposed idiom, though, does not require explaining
why ``__name__`` is set as it is.

With the proposed solution the convenience of finding out what module
is being executed by examining ``sys.modules['__main__']`` is lost.
To make up for this, the ``sys`` module will gain the ``main``
attribute.  It will contain a string of the name of the module that is
considered the executing module.
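
Since no interpreter implements these semantics, they can only be
simulated.  The sketch below is one way to do so; ``execute_module``
is a hypothetical helper, and the ``state`` dictionary merely stands
in for the proposed ``sys.main`` attribute:

```python
import types


def execute_module(name, source, state, as_main=False):
    """Execute ``source`` as module ``name`` under the proposed rules."""
    module = types.ModuleType(name)   # __name__ stays the real name
    module.__main__ = as_main         # proposed per-module flag
    if as_main:
        state['main'] = name          # stands in for proposed sys.main
    exec(compile(source, name, 'exec'), module.__dict__)
    return module


# Each module records what it sees for __name__ and __main__.
source = "seen = (__name__, __main__)"
state = {}
script = execute_module('bacon.ham.beans', source, state, as_main=True)
other = execute_module('bacon.spam', source, state)
print(script.seen, other.seen, state['main'])
```

Because ``__name__`` is never overwritten, the executed module still
carries the dotted name that relative import resolution needs.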

A competing solution is discussed in `Open Issues`_.


Transition Plan
===============

Using this solution will not work directly in Python 2.6.  Code is
dependent upon the semantics of having ``__name__`` set to
``'__main__'``.  There is also the issue of pre-existing global
variables in a module named ``__main__``.  To deal with these issues,
a two-step solution is needed.

First, a Py3K deprecation warning will be raised during AST generation
when a global variable named ``__main__`` is defined.  This will help
with the detection of code that would reset the value of ``__main__``
for a module.  Without adding a warning when a global variable is
injected into a module, though, it is not fool-proof.  But this
solution should cover the vast majority of variable rebinding
problems.
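
To give an idea of what such a check involves, here is a sketch built
on the standard ``ast`` module rather than on the compiler itself.
The names ``MainBindingFinder`` and ``find_main_bindings`` are
hypothetical, and, as noted above, names injected at run time are
not detected:

```python
import ast


class MainBindingFinder(ast.NodeVisitor):
    """Collect lines where module-level code binds ``__main__``."""

    def __init__(self):
        self.lines = []

    def visit_Name(self, node):
        # Only bindings matter; reading ``__main__`` is the new idiom.
        if node.id == '__main__' and isinstance(node.ctx, ast.Store):
            self.lines.append(node.lineno)

    def visit_Import(self, node):
        for alias in node.names:
            bound = alias.asname or alias.name.split('.')[0]
            if bound == '__main__':
                self.lines.append(node.lineno)

    visit_ImportFrom = visit_Import

    def _skip_body(self, node):
        # Bodies of functions and classes do not bind module globals,
        # but the def/class statement itself binds its own name.
        if getattr(node, 'name', None) == '__main__':
            self.lines.append(node.lineno)

    visit_FunctionDef = visit_AsyncFunctionDef = _skip_body
    visit_ClassDef = visit_Lambda = _skip_body


def find_main_bindings(source):
    finder = MainBindingFinder()
    finder.visit(ast.parse(source))
    return sorted(finder.lines)
```

A real implementation would emit a warning for each reported line
instead of returning a list.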

Second, 2to3 [#2to3]_ will gain a rule to transform the current ``if
__name__ == '__main__': ...`` idiom to the new one.  While it will not
help with code that checks ``__name__`` outside of the idiom, that
specific line of code makes up a large proportion of the code that
ever looks for ``__name__`` set to ``'__main__'``.
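
The effect of such a rule can be approximated with a regular
expression, as in the sketch below.  This is not a 2to3 fixer (a real
one would work on the parse tree and would not be fooled by string
literals or comments); ``rewrite_main_idiom`` is a hypothetical name
used only for illustration:

```python
import re

# Matches ``if __name__ == '__main__':`` at the start of a line, with
# either quote style, keeping the leading indentation in group 1.
_IDIOM = re.compile(
    r"""^([ \t]*)if[ \t]+__name__[ \t]*==[ \t]*(['"])__main__\2[ \t]*:""",
    re.MULTILINE)


def rewrite_main_idiom(source):
    """Replace the current idiom with the proposed ``if __main__:``."""
    return _IDIOM.sub(r"\1if __main__:", source)


old = "def main():\n    pass\n\nif __name__ == '__main__':\n    main()\n"
print(rewrite_main_idiom(old))
```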


Open Issues
===========

A counter-proposal to introducing the ``__main__`` attribute on
modules was to introduce a built-in with the same name.  The value of
the built-in would be the name of the module being executed (just like
the proposed ``sys.main``).  This would lead to a new idiom of::

  if __name__ == __main__:
      ...

The perk of this idiom over the one proposed earlier is that the
general semantics do not differ greatly from the current idiom.

The drawback is that the syntactic difference is subtle: only the
quotes around ``__main__`` are dropped.  Some believe that existing
Python programmers will introduce bugs by putting the quotation marks
back on by accident.  But one could argue that the bug would be
discovered quickly through testing as it is a very shallow bug.

The other pro of this proposal over the earlier one is that import
code no longer has to set the value of ``__main__``.  By making it a
built-in variable, import does not have to care about ``__main__``, as
the executing code will pick up the built-in ``__main__`` on its own.
This simplifies the implementation of the proposal as it only requires
setting a built-in instead of changing import to set an attribute on
every module, with exactly one module having a different value (much
like the current implementation has to do to set ``__name__`` in one
module to ``'__main__'``).
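
The counter-proposal can likewise only be simulated today.  The sketch
below assumes that handing each module a copy of the built-in
namespace with an extra ``__main__`` entry is a fair stand-in for a
real built-in; ``run_with_builtin_main`` is a hypothetical helper:

```python
import builtins
import types


def run_with_builtin_main(main_name, sources):
    """Execute each module with a shared built-in ``__main__``."""
    shared = dict(vars(builtins))
    shared['__main__'] = main_name    # set once, not per module
    modules = {}
    for name, source in sources.items():
        module = types.ModuleType(name)
        module.__dict__['__builtins__'] = shared
        exec(compile(source, name, 'exec'), module.__dict__)
        modules[name] = module
    return modules


# The proposed idiom: note the missing quotes around __main__.
idiom = "is_main = (__name__ == __main__)"
modules = run_with_builtin_main(
    'bacon.ham.beans',
    {'bacon.ham.beans': idiom, 'bacon.spam': idiom})
print(modules['bacon.ham.beans'].is_main, modules['bacon.spam'].is_main)
```

Nothing is set on the individual modules here; the single shared
value is what makes this variant simpler to implement.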


References
==========

.. [#2to3]  2to3 tool
    (http://svn.python.org/view/sandbox/trunk/2to3/) [ViewVC]

.. [#importlib] importlib
    (http://svn.python.org/view/sandbox/trunk/import_in_py/importlib.py?view=markup)
    [ViewVC]



Copyright
=========

This document has been placed in the public domain.



..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:

