[issue39452] Improve the __main__ module documentation
New submission from Géry <gery.ogam@gmail.com>:

This PR will apply the following changes to the [`__main__` module documentation](https://docs.python.org/3.7/library/__main__.html):

- replace the phrase "run as script" with "run from the file system" (as used in the [`runpy`](https://docs.python.org/3/library/runpy.html) documentation), since "run as script" does not mean the intended `python foo.py` but `python -m foo` (cf. [PEP 338](https://www.python.org/dev/peps/pep-0338/));
- replace the phrase "run with `-m`" with "run from the module namespace" (as used in the [`runpy`](https://docs.python.org/3/library/runpy.html) documentation), since the module can equivalently be run with `runpy.run_module('foo')` instead of `python -m foo`;
- make the block comment [PEP 8](https://www.python.org/dev/peps/pep-0008/#comments)-compliant (placed before the `if` block, starting with a capital letter, ending with a period);
- add a missing case in which a package's `__main__.py` is executed (when the package is run from the file system: `python foo/`).

----------
assignee: docs@python
components: Documentation
messages: 360682
nosy: docs@python, maggyero
priority: normal
pull_requests: 17565
severity: normal
status: open
title: Improve the __main__ module documentation
type: enhancement
versions: Python 3.8

_______________________________________
Python tracker <report@bugs.python.org>
<https://bugs.python.org/issue39452>
_______________________________________
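The last item in the submission, a package's `__main__.py` being executed for `python foo/` as well as for `python -m foo`, can be checked with a small experiment. The following is only a sketch: the package name `foo` is taken from the message, while the helper `run_package_both_ways` and the one-line `__main__.py` are made up for illustration. It builds the package in a temporary directory and launches the current interpreter both ways.

```python
import os
import subprocess
import sys
import tempfile


def run_package_both_ways():
    """Run a throwaway package `foo` by directory path and by module name."""
    with tempfile.TemporaryDirectory() as root:
        package_dir = os.path.join(root, "foo")
        os.mkdir(package_dir)
        # An empty __init__.py makes `foo` a regular package.
        open(os.path.join(package_dir, "__init__.py"), "w").close()
        # The package's __main__.py only reports the name it runs under.
        with open(os.path.join(package_dir, "__main__.py"), "w") as f:
            f.write("print(__name__)\n")

        # Put the parent directory on the import path so `-m foo` can find it.
        env = dict(os.environ, PYTHONPATH=root)

        def run(args):
            completed = subprocess.run(
                [sys.executable, *args],
                cwd=root,
                env=env,
                capture_output=True,
                text=True,
                check=True,
            )
            return completed.stdout.strip()

        return {
            "python foo/": run([package_dir]),
            "python -m foo": run(["-m", "foo"]),
        }


names = run_package_both_ways()
print(names)
```

In both invocations the code in `foo/__main__.py` should report `__main__` as its module name.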
Steven D'Aprano <steve+python@pearwood.info> added the comment:

There are some serious problems with the PR.

You state that these two phrases are from the runpy documentation:

* "run from the module namespace"
* "run from the file system"

but neither of those phrases appear in the runpy documentation here:

https://docs.python.org/3/library/runpy.html

You also say:
> "run as script" does not mean the intended `python foo.py` but `python -m foo`
but this is incorrect, and I think based on a misunderstanding of PEP 338. The title of PEP 338, "Executing modules as scripts", is not exclusive: the PEP is about the -m mechanism for *locating the module* in order to run it as a script. It doesn't imply that `python spam.py` should no longer be considered to be running a script.

In common parlance, "run as a script" certainly does include the case where you specify the module by filename, `python spam.py`, as well as the -m case where you specify it as a module name and let the interpreter locate the file. In other words, both

    python pathname/spam.py
    python -m spam

are correctly described as "running spam.py as a script" (and other variations). They differ in how the script is specified, but both mechanisms treat the spam.py file as a script and run it. See for example

https://duckduckgo.com/?q=how+to+run+a+python+script

for examples of common usage. Consequently, it is simply wrong to say that the intended usage of "run a script" is the -m mechanism.

The PR changes the term "scope" to "environment", but I think that is wrong. An environment is potentially greater than a scope. `__main__` is a module namespace, hence a scope. The environment includes things outside of that scope, such as the builtins, environment variables, the current working directory, the python path, etc. We don't talk about modules being an environment, but as making up a scope.

The PR introduces the phrase "when the module is run from the file system" to mean the case where a script is run using `python spam.py`, but it equally applies to the case of `python -m spam`. In both cases, spam is located somewhere in the file system. (It is conceivable that -m could locate and run a built-in module, but I don't know any cases where that actually works. Even if it does, we surely don't need to complicate the docs for this corner case. It's enough to know that -m will locate the module and run it.)

The PR describes three cases: running from the file system, running from stdin, and running "from the module namespace", but that last one is a clumsy phrase which, it seems to me, is not correct. How do you run a module from its own namespace? Modules *are* a namespace, and we say code runs *in* a namespace, not "from" it.

In any case, it doesn't matter whether the script is specified on the command line as a file name, or as a module name with -m, or double-clicked in a GUI: in all three cases the module's code is executed in the module's namespace. So it is wrong to distinguish "from the file system" and "from (in) the module namespace" as two distinct cases. They are the same case.

The PR replaces the comment inside the `if` block:

    # execute only if run as a script

with a comment above the `if` statement:

    # Execute only if the module is not imported.

but the new comment is factually incorrect on two counts.

Firstly, it is not correct that the `if` statement executes only if the module is not imported. There is no magic to the `if` statement. It always executes, regardless of whether the module is being run as a script or not. We can write code like this:

    if print("Hello, this always runs!") or __name__ == '__main__':
        # execute only if run as a script
        print('running as a script')
    else:
        # execute only if *not* run as a script
        print('not run as a script')

Placing the comment above the `if`, where it will apply to the entire `if` statement, is incorrect.

The second problem is that when running a module with -m it *is* imported. PEP 338 is clear about this: "if -m is used to execute a module the PEP 302 import mechanisms are used to locate the module and retrieve its compiled code, before executing the module" (in other words: import the module).

We can test this. For example, if you create a package:

    spam/
    +-- __init__.py
    +-- eggs.py

and then run `python -m spam.eggs`, not only `__main__` (the eggs.py module) but also `spam` will be found in sys.modules.

So the new comment is simply wrong.

There may be other issues with the PR.

----------
nosy: +steven.daprano
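The `python -m spam.eggs` check described in this comment can be scripted. The following is a minimal sketch, not part of the original message: the `spam`/`eggs` layout comes from the comment, while the helper name `run_eggs_with_dash_m` and the body of `eggs.py` are invented for illustration. It creates the package in a temporary directory, runs it with `-m`, and reports what the module sees.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical body for spam/eggs.py: report the module name and
# whether the parent package and __main__ are registered in sys.modules.
EGGS_SOURCE = (
    "import sys\n"
    "print(__name__)\n"
    "print('spam' in sys.modules)\n"
    "print('__main__' in sys.modules)\n"
)


def run_eggs_with_dash_m():
    """Create spam/__init__.py and spam/eggs.py, then run `python -m spam.eggs`."""
    with tempfile.TemporaryDirectory() as root:
        package_dir = os.path.join(root, "spam")
        os.mkdir(package_dir)
        open(os.path.join(package_dir, "__init__.py"), "w").close()
        with open(os.path.join(package_dir, "eggs.py"), "w") as f:
            f.write(EGGS_SOURCE)

        # Make the temporary directory importable for the child interpreter.
        env = dict(os.environ, PYTHONPATH=root)
        completed = subprocess.run(
            [sys.executable, "-m", "spam.eggs"],
            cwd=root,
            env=env,
            capture_output=True,
            text=True,
            check=True,
        )
        return completed.stdout.split()


output = run_eggs_with_dash_m()
print(output)
```

The three reported values should be `__main__`, `True` and `True`: eggs.py runs under the name `__main__`, and the parent package `spam` has been imported along the way.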
Géry <gery.ogam@gmail.com> added the comment: Thanks for your extended review Steven.
> You state that these two phrases are from the runpy documentation:
>
> * "run from the module namespace"
> * "run from the file system"
>
> but neither of those phrases appear in the runpy documentation here:
I agree. Actually the first paragraph of the page uses the phrases:

- "located using the module namespace";
- "located using the file system",

so instead of saying:

- "run a module located using the module namespace" to mean "python -m <module>";
- "run a module located using the file system" to mean "python <file>",

I simplified to:

- "run from the module namespace";
- "run from the file system".

But since the terminology is misleading I have used these phrases instead:

- `python`: "module initialized from an interactive prompt";
- `python < <file>`: "module initialized from standard input";
- `python <file>`: "module initialized from a file argument";
- `python -c <code>`: "module initialized from a `-c` argument";
- `python -m <module>`: "module initialized from a `-m` argument";
- `import <module>`: "module initialized from an import statement".

What the documentation tries to explain is that in all of these cases except the last one, code is executed in the __main__ module. I have updated the PR.

----
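The claim that every case except `import <module>` executes code in the `__main__` module can be observed directly. The sketch below is an illustration only: the module name `demo_mod` and the helper `observed_names` are hypothetical, the interactive prompt is left out because it cannot be driven non-interactively in a simple way, and `python < <file>` is approximated by piping source text to the interpreter's standard input.

```python
import os
import subprocess
import sys
import tempfile

SOURCE = "print(__name__)\n"


def observed_names():
    """Report the value of __name__ for several ways of starting the same code."""
    with tempfile.TemporaryDirectory() as root:
        path = os.path.join(root, "demo_mod.py")  # hypothetical module name
        with open(path, "w") as f:
            f.write(SOURCE)

        # Make demo_mod importable for `-m` and for the import statement.
        env = dict(os.environ, PYTHONPATH=root)

        def run(args, stdin_text=None):
            completed = subprocess.run(
                [sys.executable, *args],
                input=stdin_text,
                cwd=root,
                env=env,
                capture_output=True,
                text=True,
                check=True,
            )
            return completed.stdout.strip()

        return {
            "python < <file>": run([], stdin_text=SOURCE),
            "python <file>": run([path]),
            "python -c <code>": run(["-c", "print(__name__)"]),
            "python -m <module>": run(["-m", "demo_mod"]),
            "import <module>": run(["-c", "import demo_mod"]),
        }


names = observed_names()
for how, name in names.items():
    print(f"{how:20} -> {name}")
```

Only the import case should report `demo_mod`; the other four should report `__main__`.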
> The PR changes the term "scope" to "environment", but I think that is wrong. An environment is potentially greater than a scope. `__main__` is a module namespace, hence a scope. The environment includes things outside of that scope, such as the builtins, environment variables, the current working directory, the python path, etc. We don't talk about modules being an environment, but as making up a scope.
I disagree. According to Wikipedia (https://en.wikipedia.org/wiki/Scope_(computer_science)), the term "scope" is the part of a program where a name binding is valid, while the term "environment" (synonym of "context") is the set of name bindings that are valid within a part of a program. Therefore "scope" is a property of a name binding (a name binding has a scope), and "environment" is a property of a part of a program (a part of a program has an environment). And the term "environment" is actually already used in the original title and synopsis of the document (and it is correct):
    :mod:`__main__` --- Top-level script environment

    .. module:: __main__
       :synopsis: The environment where the top-level script is run.
So my change to the body fixes the inconsistent and incorrect usage of "scope":

    - ``'__main__'`` is the name of the scope in which top-level code executes.
    + ``'__main__'`` is the name of the environment where top-level code is run.

    - A module can discover whether or not it is running in the main scope
    + A module can discover whether or not it is running in the main environment

----
> Placing the comment above the `if`, where it will apply to the entire `if` statement, is incorrect.
I agree. Sometimes you see comments before if statements but they usually don't start with "execute". I have updated the PR. ----
> The second problem is that when running a module with -m it *is* imported. PEP 338 is clear about this:
I agree. I should have said "when the module is not initialized from an import statement". But note that even before my change the original document already used the phrase "not imported":

    - executing code in a module when it is run as a script or with ``python
    - -m`` but not when it is imported::
    + executing code in a module when it is not imported::

    - # execute only if run as a script
    + # Execute only if the module is not imported.

I have updated the PR.
Terry J. Reedy <tjreedy@udel.edu> added the comment:

The main issue I have with the existing doc is its use of 'top-level' to mean the main, initial, startup module that first executes the user code for a python 'program'. We routinely use 'top-level' instead for the global scope of a module. Example: https://docs.python.org/3/glossary.html, 'qualified name' entry, line 2: "For top-level functions and classes, ..."

Within '__main__', some code is top-level, but class and function bodies are not. But this does not have to be part of this PR.

----------
nosy: +terry.reedy
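The glossary sense of 'top-level' mentioned here (the global scope of a module, as opposed to class and function bodies) shows up in qualified names. A minimal sketch, with all function and class names invented for illustration:

```python
def top_level_function():
    # Defined at the global scope of the module: "top-level" in the glossary sense.
    def nested_function():
        # Defined inside a function body, so not top-level.
        pass

    return nested_function


class TopLevelClass:
    def method(self):
        # Defined inside a class body, so not top-level either.
        pass


qualnames = [
    top_level_function.__qualname__,
    top_level_function().__qualname__,
    TopLevelClass.__qualname__,
    TopLevelClass.method.__qualname__,
]
for name in qualnames:
    print(name)
```

For the two top-level objects the qualified name equals the plain name, whereas the nested function and the method carry their enclosing scope in the qualified name. This holds in any module, whether or not it is running as `__main__`.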
Géry <gery.ogam@gmail.com> added the comment:

I agree with you Terry. Another thing that bothers me: in the current document, the __main__ module is reduced to its environment (aka context or dictionary), whereas a module object has other important attributes such as its code. So how about adding the following changes?

    - :mod:`__main__` --- Top-level code environment
    - ==============================================
    + :mod:`__main__` --- Startup module
    + ==================================

    - :synopsis: The environment where top-level code is run.
    + :synopsis: The first module from which the code is executed at startup.

    - ``'__main__'`` is the name of the environment where top-level code is run.
    + ``'__main__'`` is the name of the startup module.

    - A module can discover whether or not it is running in the main environment
    + A module can discover whether or not it is initialized as the :mod:`__main__` module