PEP-499: "python -m foo" should bind to both "__main__" and "foo" in sys.modules
I was recently bitten by the fact that the command:
    python -m foo
pulls in the module and attaches it as sys.modules['__main__'], but not as
sys.modules['foo']. Should the program also do:
    import foo
it pulls in the same module code, but binds a completely separate
instance of it to sys.modules['foo']. This is counterintuitive; it is a
natural expectation that "python -m foo" imports "foo" in the normal fashion.
If the program modifies items in "foo", those modifications are not reflected in
"__main__", since these are two distinct modules.
I propose that "python -m foo" imports foo as normal, binding it to
sys.modules["__main__"] as at present, but that it also binds the module to
sys.modules["foo"]. This will remove the disconnect between "python -m foo" and
a program's internal "import foo".
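For anyone who wants to reproduce the current behaviour, here is a minimal sketch. It assumes a throwaway foo.py written to a temporary directory; the "counter" attribute and the run_foo_as_main() helper are made up for illustration and are not part of the PEP.

```python
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical module "foo" that imports itself while running as __main__.
FOO_SOURCE = textwrap.dedent("""
    import sys
    counter = 0
    if __name__ == "__main__":
        import foo                      # loads a second copy of this file
        foo.counter += 1                # modifies only the copy bound to "foo"
        main = sys.modules["__main__"]
        print(main is foo, main.counter, foo.counter)
""")


def run_foo_as_main():
    """Run "python -m foo" in a scratch directory and return its output."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "foo.py").write_text(FOO_SOURCE)
        proc = subprocess.run(
            [sys.executable, "-m", "foo"],
            cwd=tmp, capture_output=True, text=True, check=True,
        )
        return proc.stdout.strip()


output = run_foo_as_main()
print(output)
```

On an interpreter without the proposed change this should print "False 0 1": the two entries in sys.modules are different objects, and the change made through "foo" is not visible through "__main__".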
For people who are concerned that the module's .__name__ is "__main__", note
that the module's resolved "official" name is present in .__spec__.name as
described in PEP 451.
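As a rough illustration of that point, the sketch below uses runpy.run_module() to execute a scratch module under the name "__main__" and then inspects the resulting namespace. The module name pep499_spec_demo and the run_as_main() helper are hypothetical, chosen only for this example.

```python
import importlib
import runpy
import sys
import tempfile
from pathlib import Path

MODULE_NAME = "pep499_spec_demo"  # hypothetical module name, for illustration


def run_as_main():
    """Execute a scratch module as "__main__" and return its globals."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, MODULE_NAME + ".py").write_text("value = 1\n")
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            return runpy.run_module(MODULE_NAME, run_name="__main__")
        finally:
            sys.path.remove(tmp)


namespace = run_as_main()
# __name__ is the run name; __spec__.name keeps the resolved module name.
print(namespace["__name__"], namespace["__spec__"].name)
```

The namespace reports "__main__" as its __name__, while __spec__.name still carries the name the module was found under.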
There are two recent discussion threads on this in python-list at:
https://mail.python.org/pipermail/python-list/2015-August/694905.html
and in python-ideas at:
https://mail.python.org/pipermail/python-ideas/2015-August/034947.html
Please give them a read and give this PEP your thoughts.
The raw text of the PEP is below. It feels uncontroversial to me, but then it
would:-)
It is visible on the web here:
https://www.python.org/dev/peps/pep-0499/
and I've made a public repository to track the text as it evolves here:
https://bitbucket.org/cameron_simpson/pep-0499/
Cheers,
Cameron Simpson
On Sat, Aug 8, 2015 at 7:49 PM, Cameron Simpson wrote:
> The raw text of the PEP is below. It feels uncontroversial to me, but then it would:-)
I'm not sure that it'll be uncontroversial, but I agree with it :)
The risk that I see (as I mentioned in the previous thread, but reiterating for those who just came in) is that it becomes possible to import something whose __name__ is not what you imported. Currently, you can "import math" and see that math.__name__ is "math", or "import urllib.parse" and, as you'd expect, urllib.parse.__name__ is "urllib.parse". In the few cases where it isn't exactly what you imported, it's the canonical name for it - for instance, os.path.__name__ is posixpath on my system. The change proposed here means that the canonical name for the module you're running as the main file is now "__main__", and not whatever else it would have been.
Consequences for pickle/multiprocessing/Windows are mentioned in the PEP. Are there any other places where a module's name is checked?
ChrisA
On 08Aug2015 20:30, Chris Angelico wrote:
> On Sat, Aug 8, 2015 at 7:49 PM, Cameron Simpson wrote:
>> The raw text of the PEP is below. It feels uncontroversial to me, but then it would:-)
> I'm not sure that it'll be uncontroversial, but I agree with it :)
> The risk that I see (as I mentioned in the previous thread, but reiterating for those who just came in) is that it becomes possible to import something whose __name__ is not what you imported. Currently, you can "import math" and see that math.__name__ is "math", or "import urllib.parse" and, as you'd expect, urllib.parse.__name__ is "urllib.parse". In the few cases where it isn't exactly what you imported, it's the canonical name for it - for instance, os.path.__name__ is posixpath on my system. The change proposed here means that the canonical name for the module you're running as the main file is now "__main__", and not whatever else it would have been.
I think I take the line that as of PEP 451 the canonical name for a module is
.__spec__.name. The module's .__name__ normally matches that, but obviously in
the case of "python -m" it does not.
As you point out, suddenly a module can appear somewhere other than
sys.modules['__main__'] where that difference shows.
Let's ask the associated question: who introspects module.__name__ and expects
it to be the canonical name? For what purpose?
I'm of the opinion that those cases are few, and that they should in any case
be updated to consult .__spec__.name these days (with, I suppose, fallback for
older Python versions). I think that is the case even without the change
suggested by PEP 499.
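A minimal sketch of what such an update might look like, assuming a small helper (here called canonical_name(), a made-up name) that prefers .__spec__.name and falls back to .__name__ when no spec is available:

```python
import math
import types


def canonical_name(module):
    """Prefer __spec__.name (PEP 451); fall back to __name__ otherwise."""
    spec = getattr(module, "__spec__", None)
    name = getattr(spec, "name", None)
    return name if name is not None else module.__name__


print(canonical_name(math))

# A module object built by hand has __spec__ set to None, which stands in
# here for modules (or Python versions) that carry no spec at all.
legacy = types.ModuleType("legacy_mod")
print(canonical_name(legacy))
```

The fallback keeps the helper usable on objects that predate or bypass the import system's spec machinery.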
Cheers,
Cameron Simpson
On Aug 8, 2015, at 16:18, Cameron Simpson wrote:
> I think I take the line that as of PEP 451 the canonical name for a module is .__spec__.name. The module's .__name__ normally matches that, but obviously in the case of "python -m" it does not.
> As you point out, suddenly a module can appear somewhere other than sys.modules['__main__'] where that difference shows.
> Let's ask the associated question: who introspects module.__name__ and expects it to be the canonical name? For what purpose?
I'd think the first place to look is code that deals directly with module objects and/or sys.modules--graphical debuggers, plugin frameworks, bridges (a la AppScript or PyObjC), etc. Especially since many of them want to retain compatibility with 3.3, if not 3.2, and to share as much code as possible with a 2.x version.
Of course you're probably right that there aren't too many such things, and they're also presumably written by people who know what they're doing and wouldn't have too much trouble adapting them for 3.6+ if needed.
If I have a package that defines both a __main__ and a __init__, then your change would bind the __main__ to the name instead of the __init__. That seems incorrect.
On Sun, Aug 9, 2015 at 1:12 AM, Andrew Barnert via Python-ideas <python-ideas@python.org> wrote:
> On Aug 8, 2015, at 16:18, Cameron Simpson wrote:
>> I think I take the line that as of PEP 451 the canonical name for a module is .__spec__.name. The module's .__name__ normally matches that, but obviously in the case of "python -m" it does not.
>> As you point out, suddenly a module can appear somewhere other than sys.modules['__main__'] where that difference shows.
>> Let's ask the associated question: who introspects module.__name__ and expects it to be the canonical name? For what purpose?
> I'd think the first place to look is code that deals directly with module objects and/or sys.modules--graphical debuggers, plugin frameworks, bridges (a la AppScript or PyObjC), etc. Especially since many of them want to retain compatibility with 3.3, if not 3.2, and to share as much code as possible with a 2.x version.
> Of course you're probably right that there aren't too many such things, and they're also presumably written by people who know what they're doing and wouldn't have too much trouble adapting them for 3.6+ if needed.
On 09Aug2015 03:05, Joseph Jevnik wrote:
> If I have a package that defines both a __main__ and a __init__, then your change would bind the __main__ to the name instead of the __init__. That seems incorrect.
Yes. Yes it does.
I just ran a quick test package named "testmod" via "python -m testmod" and:
- __init__.py has the __name__ "testmod"
- __main__.py has the __name__ "__main__"
in both Python 2.7 and Python 3.4.
Since my test script reports:
% python3.4 -m testmod
__init__.py: /Users/cameron/rc/python/testmod/__init__.py testmod
__main__.py: /Users/cameron/rc/python/testmod/__main__.py __main__
% python2.7 -m testmod
('__init__.py:', '/Users/cameron/rc/python/testmod/__init__.pyc', 'testmod')
('__main__.py:', '/Users/cameron/rc/python/testmod/__main__.py', '__main__')
would it be enough to say that this change should only apply if the module is
not a package?
I'll do some more fiddling to see exactly what happens in packages when I
import pieces of them, too.
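A self-contained way to repeat this kind of experiment is sketched below. It builds a scratch "testmod" package in a temporary directory and runs it with -m; the run_package() helper and the "same object" check are additions for illustration only.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# One-line report used by both files of the scratch package.
REPORT = "print({label!r}, __name__, __spec__.name)\n"


def run_package():
    """Create testmod/__init__.py and testmod/__main__.py, then run -m."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp, "testmod")
        pkg.mkdir()
        (pkg / "__init__.py").write_text(REPORT.format(label="__init__.py:"))
        (pkg / "__main__.py").write_text(
            REPORT.format(label="__main__.py:")
            + "import sys\n"
            + "print('same object:',"
            + " sys.modules['testmod'] is sys.modules['__main__'])\n"
        )
        proc = subprocess.run(
            [sys.executable, "-m", "testmod"],
            cwd=tmp, capture_output=True, text=True, check=True,
        )
        return proc.stdout.splitlines()


lines = run_package()
for line in lines:
    print(line)
```

In the package case the name "testmod" stays bound to the module built from __init__.py, and the code in __main__.py runs as a separate module object.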
Cheers,
Cameron Simpson
On 09Aug2015 20:34, Cameron Simpson wrote:
> On 09Aug2015 03:05, Joseph Jevnik wrote:
>> If I have a package that defines both a __main__ and a __init__, then your change would bind the __main__ to the name instead of the __init__. That seems incorrect.
> Yes. Yes it does.
> [...]
> would it be enough to say that this change should only apply if the module is not a package?
I append the code for my testmod below, being an __init__.py and a __main__.py.
A run shows:
% python3.4 -m testmod
__init__.py: /Users/cameron/rc/python/testmod/__init__.py testmod testmod
__main__.py: /Users/cameron/rc/python/testmod/__main__.py __main__ testmod.__main__
__main__
I would be okay if this change did not affect execution of a package with
the python -m flag. I was only concerned because a __main__ in a package is
common and wanted to make sure you had addressed it.
On Sun, Aug 9, 2015 at 6:48 PM, Cameron Simpson wrote:
> On 09Aug2015 20:34, Cameron Simpson wrote:
>> On 09Aug2015 03:05, Joseph Jevnik wrote:
>>> If I have a package that defines both a __main__ and a __init__, then your change would bind the __main__ to the name instead of the __init__. That seems incorrect.
>> Yes. Yes it does.
>> [...]
>> would it be enough to say that this change should only apply if the module is not a package?
> I append the code for my testmod below, being an __init__.py and a __main__.py. A run shows:
>     % python3.4 -m testmod
>     __init__.py: /Users/cameron/rc/python/testmod/__init__.py testmod testmod
>     __main__.py: /Users/cameron/rc/python/testmod/__main__.py __main__ testmod.__main__
>     __main__
>     testmod
> (4 lines, should your mailer fold the output.)
> It seems to me that Python already does the "right thing" for packages, and it is only non-package modules which need the change proposed by the PEP.
> Comments please?
> Code below.
> Cheers, Cameron Simpson
> testmod/__init__.py:
>     #!/usr/bin/python
>     print('__init__.py:', __file__, __name__, __spec__.name)
> testmod/__main__.py:
>     #!/usr/bin/python
>     import pprint
>     import sys
>     print('__main__.py:', __file__, __name__, __spec__.name)
>     for modname, mod in sorted(sys.modules.items()):
>         rmod = repr(mod)
>         if 'testmod' in modname or 'testmod' in rmod:
>             print(modname, rmod)
On 09Aug2015 19:33, Joseph Jevnik wrote:
> On 09Aug2015 20:34, Cameron Simpson wrote:
>> On 09Aug2015 03:05, Joseph Jevnik wrote:
>>> If I have a package that defines both a __main__ and a __init__, then your change would bind the __main__ to the name instead of the __init__. That seems incorrect.
>> Yes. Yes it does. [...] would it be enough to say that this change should only apply if the module is not a package?
> I would be okay if this change did not affect execution of a package with the python -m flag. I was only concerned because a __main__ in a package is common and wanted to make sure you had addressed it.
Good point. Please see if this update states your issue fairly and addresses
it:
https://bitbucket.org/cameron_simpson/pep-0499/commits/3efcd9b54e238a1ff7f5c...
Cheers,
Cameron Simpson
On 08Aug2015 22:12, Andrew Barnert wrote:
> On Aug 8, 2015, at 16:18, Cameron Simpson wrote:
>> I think I take the line that as of PEP 451 the canonical name for a module is .__spec__.name. The module's .__name__ normally matches that, but obviously in the case of "python -m" it does not.
>> As you point out, suddenly a module can appear somewhere other than sys.modules['__main__'] where that difference shows.
>> Let's ask the associated question: who introspects module.__name__ and expects it to be the canonical name? For what purpose?
> I'd think the first place to look is code that deals directly with module objects and/or sys.modules--graphical debuggers, plugin frameworks, bridges (a la AppScript or PyObjC), etc. Especially since many of them want to retain compatibility with 3.3, if not 3.2, and to share as much code as possible with a 2.x version.
> Of course you're probably right that there aren't too many such things, and they're also presumably written by people who know what they're doing and wouldn't have too much trouble adapting them for 3.6+ if needed.
One might hope. So I've started with the stdlib in two passes: looking for
.__name__ associated with "mod", and looking for __main__ not in the standard
boilerplate (__name__ == '__main__').
Obviously all this code is unfamiliar to me so anyone with deeper understanding
who wants to look is most welcome.
Pass 1 with this command:
find . -type f -name \*.py | xargs fgrep .__name__ /dev/null | grep mod
to look for module related code using .__name__. Of course a lot of it is
reporting, but there are some interesting relevant bits.
doctest:
This refers to module.__name__ quite a lot. The _normalize_module() function
uses __name__ instead of __spec__.name. _from_module() tests whether an object is
defined in a particular module based on __name__; I'm (naively) surprised that
this can't use "is", but it looks like an object's __module__ attribute is a
string, which I imagine avoids circular references. _get_test() uses __name__
instead of __spec__.name, though only as a fallback if there is no __file__.
SkipDocTestCase.shortDescription() uses __name__.
importlib: mostly seems fine according to my shallow understanding?
inspect: getmodule() seems correct (uses __name__ but seems correctish) - this
does seem to be a "grope around in the available places looking for a match"
function, and feels unreliable anyway.
modulefinder: this does look like it could use __spec__.name more widely, or as
an adjunct to __name__. scan_code() looks like another "grope around" function
trying to infer structure from the pieces sitting about:-)
pdb: Pdb.do_whatis definitely reports using .__name__. Not necessarily
incorrect.
pkgutil: get_loader() uses .__name__, probably ought to be __spec__.name
pydoc: also probably should upgrade to .__spec__.name
unittest: TestLoader.discover seems to rely on __name__ instead of
__spec__.name while constructing a pathname; definitely seems like it needs
updating for PEP 451. It also looks up __name__ in sys.builtin_module_names to
reject constructing a pathname.
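To make the doctest observation above concrete, here is a small sketch of a name-based "defined in this module" test. The defined_in() helper is made up for this example and only mimics the kind of comparison described; it is not doctest's actual code.

```python
import json
import json.encoder


def defined_in(obj, module):
    """Compare the object's __module__ string with the module's __name__."""
    return getattr(obj, "__module__", None) == module.__name__


# __module__ is a plain string, so the check is a name comparison, not "is".
print(type(json.JSONEncoder.__module__).__name__)
print(defined_in(json.JSONEncoder, json.encoder))
print(defined_in(json.JSONEncoder, json))
```

A check of this shape depends on __name__ agreeing with the name recorded on the object, which is exactly the assumption in question when the same file is loaded both as "__main__" and under its real name.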
Pass 2 with this command:
find . -type f -name \*.py | xargs fgrep __main__ | grep -v 'if *__name__ *== *["'\'']__main__'
looking for __main__ but discarding the boilerplate.
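For anyone without those shell tools to hand, the same filter can be sketched in Python. The interesting_main_lines() helper and the sample lines are made up for illustration; the regular expression is a direct transcription of the grep pattern above.

```python
import re

# Same pattern as the "grep -v" above: the usual __main__ boilerplate test.
BOILERPLATE = re.compile(r"""if *__name__ *== *["']__main__""")


def interesting_main_lines(lines):
    """Keep lines that mention __main__ but are not the boilerplate test."""
    return [
        line for line in lines
        if "__main__" in line and not BOILERPLATE.search(line)
    ]


# Made-up sample input, standing in for lines read from source files.
sample = [
    'if __name__ == "__main__":',
    "if __name__=='__main__':",
    "run_globals['__name__'] = '__main__'",
    "print('no mention here')",
]
interesting = interesting_main_lines(sample)
print(interesting)
```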
I'm actually striking out here. Since this PEP doesn't change __name__ ==
'__main__' I've not found anything here that looks like it would stop working.
Even runpy, cursory though my look at it is, is going forward: setting __name__
to '__main__' instead of working backwards.
Further thoughts?
Cheers,
Cameron Simpson
participants (4): Andrew Barnert, Cameron Simpson, Chris Angelico, Joseph Jevnik