New subject: PEP-499: "python -m foo" should bind to both "__main__" and "foo" in sys.modules

Aug. 8, 2015

I was recently bitten by the fact that the command:

  python -m foo

pulls in the module and attaches it as sys.modules['__main__'], but not as
sys.modules['foo'].  If the program then also does:

  import foo

it pulls in the same module code, but binds a completely independent
instance of it to sys.modules['foo']. This is counterintuitive; it is a
natural expectation that "python -m foo" imports "foo" in the normal fashion.

If the program modifies items in "foo", those modifications are not reflected in
"__main__", since these are two distinct modules.

I propose that "python -m foo" imports foo as normal, binding it to 
sys.modules["__main__"] as at present, but that it also binds the module to 
sys.modules["foo"]. This will remove the disconnect between "python -m foo" and 
a program's internal "import foo".

For people who are concerned that the module's .__name__ is "__main__", note
that the module's resolved "official" name is present in .__spec__.name as
described in PEP 451.
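
As a rough illustration of that point, here is a minimal sketch of a helper that prefers the spec name over ``__name__``. The helper name ``official_name`` is invented for this example and is not part of any library; the fallback to ``__name__`` is an assumption made to cover modules without a spec, such as a plain script.

```python
import json
import types


def official_name(module):
    """Return the module's resolved name, falling back to __name__.

    Hypothetical helper for illustration only.
    """
    spec = getattr(module, "__spec__", None)
    # A module run as a plain script has no spec, so fall back to __name__.
    return spec.name if spec is not None else module.__name__


if __name__ == "__main__":
    print(official_name(json))                          # an imported module
    print(official_name(types.ModuleType("scratch")))   # no spec: uses __name__
```

Inside a module started with "python -m foo", calling this on sys.modules['__main__'] should give "foo" rather than "__main__".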

There are two recent discussion threads on this in python-list at:

  https://mail.python.org/pipermail/python-list/2015-August/694905.html

and in python-ideas at:

  https://mail.python.org/pipermail/python-ideas/2015-August/034947.html

Please give them a read and give this PEP your thoughts.

The raw text of the PEP is below. It feels uncontroversial to me, but then it 
would:-)

It is visible on the web here:

  https://www.python.org/dev/peps/pep-0499/

and I've made a public repository to track the text as it evolves here:

  https://bitbucket.org/cameron_simpson/pep-0499/

Cheers,
Cameron Simpson <cs@zip.com.au>

PEP: 499
Title: ``python -m foo`` should bind ``sys.modules['foo']`` in addition to ``sys.modules['__main__']``
Version: $Revision$
Last-Modified: $Date$
Author: Cameron Simpson <cs@zip.com.au>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 07-Aug-2015
Python-Version: 3.6

Abstract
========

When a module is used as a main program on the Python command line,
such as by::

    python -m module.name ...

it is easy to accidentally end up with two independent instances
of the module if that module is again imported within the program.
This PEP proposes a way to fix this problem.

When a module is invoked via Python's ``-m`` option, the module is bound
to ``sys.modules['__main__']`` and its ``.__name__`` attribute is set to
``'__main__'``.
This enables the standard "main program" boilerplate code at the
bottom of many modules, such as::

    if __name__ == '__main__':
        sys.exit(main(sys.argv))

However, when the above command line invocation is used it is
natural to presume that the module is actually imported
under its official name ``module.name``,
and therefore that if the program again imports that name
then it will obtain the same module instance.

The reality is that the module was imported only as ``'__main__'``.
Another import will obtain a distinct module instance, which can
lead to confusing bugs.
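
The situation described above can be reproduced with a small script. The sketch below is illustrative only: the module name ``foo``, the scratch directory, and the ``run_demo`` helper are all assumptions made for the example. It writes a throwaway ``foo.py``, runs it with ``-m``, and has the module report whether ``sys.modules['__main__']`` and ``sys.modules['foo']`` are the same object.

```python
import os
import subprocess
import sys
import tempfile
import textwrap

# Hypothetical module used only for this demonstration.
MODULE_SOURCE = textwrap.dedent("""
    import sys

    if __name__ == '__main__':
        import foo  # import this very module again under its official name
        print(sys.modules['__main__'] is sys.modules['foo'])
""")


def run_demo():
    """Run ``python -m foo`` in a scratch directory and return what it prints."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "foo.py"), "w") as f:
            f.write(MODULE_SOURCE)
        # PYTHONPATH makes sure the scratch directory is importable.
        env = dict(os.environ, PYTHONPATH=tmp)
        result = subprocess.run(
            [sys.executable, "-m", "foo"],
            cwd=tmp, env=env, capture_output=True, text=True, check=True,
        )
    return result.stdout.strip()


if __name__ == "__main__":
    print("same module object:", run_demo())
```

On an interpreter without the change proposed here, this is expected to report ``False``, reflecting the two independent instances.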

Proposal
========

Fixing this situation needs only a simple change to the way the
``-m`` option is implemented: in addition
to binding the module object to ``sys.modules['__main__']``, it is also
bound to ``sys.modules['module.name']``.

Nick Coghlan has suggested that this is as simple as changing the
following line in the ``runpy`` module's ``_run_module_as_main`` function::

    main_globals = sys.modules["__main__"].__dict__

to instead be::

    main_module = sys.modules["__main__"]
    sys.modules[mod_spec.name] = main_module
    main_globals = main_module.__dict__
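
The effect of the extra binding can be sketched outside ``runpy``. The following is not the ``runpy`` implementation; it uses a stand-in module object and a made-up name (``pep499_demo_module``) to show that once a module object is registered under a second key in ``sys.modules``, a later import of that name returns the very same object instead of loading a fresh copy.

```python
import importlib
import sys
import types

# Stand-in for the module that runpy has already bound as ``__main__``.
# Both names here are hypothetical and used only for this illustration.
main_module = types.ModuleType("__main__")
official_name = "pep499_demo_module"

# The extra binding that the proposal adds.
sys.modules[official_name] = main_module

try:
    # The import system consults sys.modules first, so no second copy is made.
    imported = importlib.import_module(official_name)
    same_object = imported is main_module
finally:
    # Leave sys.modules as we found it.
    del sys.modules[official_name]

print(same_object)
```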

Considerations and Prerequisites
================================

Pickling Modules
----------------

Nick has mentioned `issue 19702`_ which proposes (quoted from the issue):

- runpy will ensure that when __main__ is executed via the import
  system, it will also be aliased in sys.modules as __spec__.name
- if __main__.__spec__ is set, pickle will use __spec__.name rather
  than __name__ to pickle classes, functions and methods defined in
  __main__
- multiprocessing is updated appropriately to skip creating __mp_main__
  in child processes when __main__.__spec__ is set in the parent
  process

The first point above covers this PEP's specific proposal.
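
As background for the pickle point in that list: pickle stores classes and functions by reference, recording the name of the module that defines them. The sketch below uses a standard library class purely as an example; a class defined in a module run via ``-m`` would have ``__module__`` set to ``'__main__'`` in the same way.

```python
import pickle
from fractions import Fraction

# Pickling a class stores a reference (module name plus qualified name),
# not the class body itself.
payload = pickle.dumps(Fraction)

# The defining module's name appears in the pickled bytes.
records_module_name = Fraction.__module__.encode("ascii") in payload

print(Fraction.__module__, records_module_name)
```

This is why the name under which the main module can be found in ``sys.modules`` matters when such pickles are loaded in another process.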

Background
==========

`I tripped over this issue`_ while debugging a main program with the
help of a module which tried to monkey patch a named module, that
named module being the main program module itself.  Naturally, the
monkey patching was ineffective: the debugging module imported the
main module by name and thus patched the second module instance,
not the running module instance.

However, the problem has been around as long as the ``-m`` command
line option and is encountered regularly, if infrequently, by others.

In addition to `issue 19702`_, the discrepancy around ``__main__``
is alluded to in PEP 451 and a similar proposal (predating PEP 451)
is described in PEP 395 under `Fixing dual imports of the main module`_.

References
==========

.. _issue 19702: http://bugs.python.org/issue19702

.. _I tripped over this issue: https://mail.python.org/pipermail/python-list/2015-August/694905.html

.. _Fixing dual imports of the main module: https://www.python.org/dev/peps/pep-0395/#fixing-dual-imports-of-the-main-mo...

Copyright
=========

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:
