[Python-Dev] PEP 395: Module Aliasing

Fri Mar 4 17:59:40 CET 2011

On Fri, Mar 4, 2011 at 07:30, Nick Coghlan <ncoghlan at gmail.com> wrote:

> Fixing dual imports of the main module
> --------------------------------------
>
> Two simple changes are proposed to fix this problem:
>
> 1. In ``runpy``, modify the implementation of the ``-m`` switch handling to
>   install the specified module in ``sys.modules`` under both its real name
>   and the name ``__main__``. (Currently it is only installed as the latter)
> 2. When directly executing a module, install it in ``sys.modules`` under
>   ``os.path.splitext(os.path.basename(__file__))[0]`` as well as under
>   ``__main__``.
>
> With the main module also stored under its "real" name, imports will pick it
> up from the ``sys.modules`` cache rather than reimporting it under a new
> name.
>

This does nothing to fix another common error: *unwittingly* importing the
main module under its real name -- for example when you intended to import,
say, a standard library module of the same name. ('socket.py' is a
surprisingly common name for a script to experiment with socket
functionality. Likewise, nowadays, 'twitter.py'.) While the proposed change
would make it less *broken* to import the main module again, does it make it
any more *sensible*? Is there really a need to support this? A clear warning
-- or even an error -- would seem much more appropriate. Doing so is not
particularly hard: keep a mapping of modules by canonical filename alongside
the one by module name, and refuse to add the same file twice. (I'm not talking
about executing a module inside a package, mind, since that can't shadow a
stdlib module by accident anymore.)
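
As a rough sketch of the filename-keyed mapping suggested above (not
anything in the PEP or the interpreter; ``ModuleFileRegistry`` and
``DuplicateModuleFileError`` are hypothetical names made up for
illustration), the check could look something like this:

```python
import os


class DuplicateModuleFileError(ImportError):
    """Hypothetical error: one source file loaded under two module names."""


class ModuleFileRegistry:
    """Hypothetical side table mapping canonical filename -> module name."""

    def __init__(self):
        self._by_file = {}

    def register(self, name, filename):
        # realpath() resolves symlinks and relative segments; normcase()
        # folds case on case-insensitive platforms such as Windows.
        canonical = os.path.normcase(os.path.realpath(filename))
        existing = self._by_file.get(canonical)
        if existing is not None and existing != name:
            raise DuplicateModuleFileError(
                f"{filename!r} is already loaded as {existing!r}; "
                f"refusing to load it again as {name!r}"
            )
        self._by_file[canonical] = name
```

Registering 'socket.py' as ``__main__`` and then trying to register the
same file as ``socket`` would raise, which is the point at which a warning
or error could be shown to the user.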

> Fixing direct execution inside packages
> ---------------------------------------
>
> To fix this problem, it is proposed that an additional filesystem check be
> performed before proceeding with direct execution of a ``PY_SOURCE`` or
> ``PY_COMPILED`` file that has been named on the command line.
>

This should only happen if the file is a valid import target.

> This additional check would look for an ``__init__`` file that is a peer to
> the specified file with a matching extension (either ``.py``, ``.pyc`` or
> ``.pyo``, depending on what was passed on the command line).
>

I assume you mean for this to match the normal import rules for packages;
why not just say that? Also, should this consider situations other than the
vanilla run/import-from-filesystem? Should meta-importers and such get a
crack at solving this?

> If this check fails to find anything, direct execution proceeds as usual.
>
> If, however, it finds something, execution is handed over to a
> helper function in the ``runpy`` module that ``runpy.run_path`` also invokes
> in the same circumstances. That function will walk back up the
> directory hierarchy from the supplied path, looking for the first directory
> that doesn't contain an ``__init__`` file. Once that directory is found, it
> will be set to ``sys.path[0]``, ``sys.argv[0]`` will be set to ``-m`` and
> ``runpy._run_module_as_main`` will be invoked with the appropriate module
> name (as calculated based on the original filename and the directories
> traversed while looking for a directory without an ``__init__`` file).
>
>
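
The directory walk described in the quoted text can be sketched roughly as
follows. This is only an illustration under simplifying assumptions: it
looks for ``__init__.py`` alone (the proposal matches the extension of the
file named on the command line), it ignores meta-importers entirely, and
``find_package_root`` is a hypothetical name, not the proposed ``runpy``
helper.

```python
import os


def find_package_root(script_path, init_names=("__init__.py",)):
    """Return (root_dir, dotted_module_name) for a script path.

    root_dir is the first directory, walking upwards, that does not
    contain an __init__ file; it is what would become sys.path[0].
    """
    script_path = os.path.abspath(script_path)
    pkg_dir, filename = os.path.split(script_path)
    parts = [os.path.splitext(filename)[0]]
    # Keep climbing while the current directory looks like a package.
    while any(os.path.isfile(os.path.join(pkg_dir, n)) for n in init_names):
        parent, pkg_name = os.path.split(pkg_dir)
        if not pkg_name:  # reached the filesystem root
            break
        parts.append(pkg_name)
        pkg_dir = parent
    return pkg_dir, ".".join(reversed(parts))
```

For a file ``pkg/sub/tool.py`` where both ``pkg`` and ``sub`` contain
``__init__.py``, this returns the parent of ``pkg`` and the name
``pkg.sub.tool``. A file with no ``__init__.py`` beside it comes back with
its own directory and bare name, matching the "direct execution proceeds as
usual" case.
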
> Fixing pickling without breaking introspection
> ----------------------------------------------
>
> To fix this problem, it is proposed to add two optional module level
> attributes: ``__source_name__`` and ``__pickle_name__``.
>
> When setting the ``__module__`` attribute on a function or class, the
> interpreter will be updated to use ``__source_name__`` if defined, falling
> back to ``__name__`` otherwise.
>
> ``__source_name__`` will automatically be set to the main module's "real" name
> (as described above under the fix to prevent duplicate imports of the main
> module) by the interpreter. This will fix both pickling and introspection for
> the main module.
>
> It is also proposed that the pickling mechanism for classes and functions be
> updated to use an optional ``__pickle_module__`` attribute when deciding how
> to pickle these objects (falling back to the existing ``__module__``
> attribute if the optional attribute is not defined). When a class or function
> is defined, this optional attribute will be defined if ``__pickle_name__`` is
> defined at the module level, and left out otherwise. This will allow
> pseudo-modules to fix pickling without breaking introspection.
>
> Other serialisation schemes could add support for this new attribute
> relatively easily by replacing ``x.__module__`` with ``getattr(x,
> "__pickle_module__", x.__module__)``.
>
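
To make the quoted ``getattr`` fallback concrete, here is a minimal sketch
of how a third-party serialiser might look up the module name. Note that
``__pickle_module__`` is only the attribute proposed in the PEP text above,
not something current interpreters set, and the two helper functions are
hypothetical names used for illustration.

```python
def pickle_module_name(obj):
    """Module name a serialiser would record for a class or function.

    Prefers the proposed (not yet existing) __pickle_module__ attribute
    and falls back to the ordinary __module__ attribute.
    """
    return getattr(obj, "__pickle_module__", obj.__module__)


def qualified_reference(obj):
    """Build a 'module.name' reference string using the lookup above."""
    return f"{pickle_module_name(obj)}.{obj.__name__}"
```

An object without the new attribute behaves exactly as it does today, so a
serialiser adopting this lookup would not change its output for existing
code.
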
> ``pydoc`` and ``inspect`` would also be updated to make appropriate use of
> the new attributes for any cases not already covered by the above rules for
> setting ``__module__``.
>

Is this corner case really worth polluting the module namespace with more
confusing __*__ names? It seems more sensible to me to simply make pickle
refuse to operate on classes and functions defined in __main__. It wouldn't
even be the least understandable restriction in pickle. The
'__source_name__' attribute would read better as '__modulename__' (although
I'm not convinced it is needed for the other reasons, either.)

-- 
Thomas Wouters <thomas at python.org>

Hi! I'm a .signature virus! copy me into your .signature file to help me
spread!