__main__ vs official module name: distinct module instances
Steven D'Aprano
steve at pearwood.info
Sun Aug 2 03:41:10 EDT 2015
On Sun, 2 Aug 2015 01:53 pm, Cameron Simpson wrote:
> Hi All,
>
> Maybe this should be over in python-ideas, since there is a proposal down
> the bottom of this message. But first the background...
>
> I've just wasted a silly amount of time debugging an issue that really I
> know about, but had forgotten.
:-)
> I have a number of modules which include a main() function, and down the
> bottom this code:
>
>     if __name__ == '__main__':
>         sys.exit(main(sys.argv))
>
> so that I have a convenient command line tool if I invoke the module
> directly. I typically have tiny shell wrappers like this:
>
>     #!/bin/sh
>     exec python -m cs.app.maildb -- ${1+"$@"}
I know this isn't really relevant to your problem, but why use "exec python"
instead of just "python"?
And can you explain the -- ${1+"$@"} bit for somebody who knows just enough
sh to know that it looks useful but not enough to know exactly what it
does?
> In short, invoke this module as a main program, passing in the command
> line arguments. Very useful.
>
> My problem?
>
> When invoked this way, the module cs.app.maildb that is being executed is
> actually the module named "__main__".
Yep. Now, what you could do in cs.app.maildb is this:
    # untested, but should work
    if __name__ == '__main__':
        import sys
        sys.modules['cs.app.maildb'] = sys.modules[__name__]
        sys.exit(main(sys.argv))
*** but that's the wrong solution ***
The problem here is that by the time cs.app.maildb runs, some other part of
cs or cs.app may have already imported it. The trick of setting the module
object under both names can only work if you can guarantee to run this
before importing anything that does a circular import of cs.app.maildb.
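To see the double load for yourself, here is a minimal sketch. It assumes two
throwaway modules, spam.py and eggs.py (hypothetical stand-ins for
cs.app.maildb and whatever imports it back), written to a temporary directory
and run with "python -m spam". The helper name run_demo is mine, not anything
from your code.

```python
import os
import subprocess
import sys
import tempfile
import textwrap

# Hypothetical stand-ins: spam is the runnable module, eggs imports it back.
SPAM = textwrap.dedent("""\
    import sys
    import eggs

    def main():
        same = sys.modules['__main__'] is sys.modules['spam']
        print('same module object:', same)

    if __name__ == '__main__':
        main()
""")
EGGS = "import spam  # circular import: loads spam.py a second time\n"


def run_demo():
    """Run 'python -m spam' in a scratch directory and return its output."""
    with tempfile.TemporaryDirectory() as tmp:
        for name, source in (("spam.py", SPAM), ("eggs.py", EGGS)):
            with open(os.path.join(tmp, name), "w") as f:
                f.write(source)
        # Make sure the scratch directory is importable in the child process.
        env = dict(os.environ, PYTHONPATH=tmp)
        result = subprocess.run(
            [sys.executable, "-m", "spam"],
            cwd=tmp, env=env, capture_output=True, text=True, check=True,
        )
        return result.stdout.strip()


if __name__ == "__main__":
    print(run_demo())
```

The file is executed once as "__main__" and once more as "spam" when eggs
imports it, so sys.modules ends up holding two separate module objects built
from the same source.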
The right existing solution is to avoid having the same module do
double-duty as both runnable script and importable module. In a package,
that's easy. Here's your package structure:
    cs
    +-- __init__.py
    +-- app
        +-- __init__.py
        +-- maildb.py
and possibly others. Every module that you want to be a runnable script
becomes a subpackage with a __main__.py file:
    cs
    +-- __init__.py
    +-- __main__.py
    +-- app
        +-- __init__.py
        +-- __main__.py
        +-- maildb
            +-- __init__.py
            +-- __main__.py
and now you can call:
    python -m cs
    python -m cs.app
    python -m cs.app.maildb
as needed. The __main__.py files look like this:
    import sys

    if __name__ == '__main__':
        import cs.app.maildb
        sys.exit(cs.app.maildb.main(sys.argv))
or as appropriate.
Yes, it's a bit more work. If your package has 30 modules, and every one is
runnable, that's a lot more work. But if your package is that, um,
intricate, then perhaps it needs a redesign?
The major use-case for this feature is where you have a package, and you
want it to have a single entry point when running it as a script. (That
would be "python -m cs" in the example above.) But it can be used when you
have multiple entry points too.
For a single .py file, you can usually assume that when you are running it
as a stand alone script, there are no circular imports of itself:
    # spam.py
    import eggs
    if __name__ == '__main__':
        main()

    # eggs.py
    import spam  # circular import
If that expectation is violated, then you can run into the trouble you
already did.
So...
* you can safely combine importable module and runnable script in
  the one file, provided the runnable script functionality doesn't
  depend on importing itself under the original name (either
  directly or indirectly);

* if you must violate that expectation, the safest solution is to
  make the module a package with a __main__.py file that contains
  the runnable script portion;

* if you don't wish to do that, you're screwed, and I think that the
  best you can do is program defensively by detecting the problem
  after the event and bailing out:
    # untested
    import os
    import __main__
    import myactualfilename
    if os.path.samefile(__main__.__file__, myactualfilename.__file__):
        raise RuntimeError
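
A variation on that check, as a rough sketch only: instead of importing the
module by its real name (which itself triggers the second load), scan
sys.modules for anything else that was loaded from the same file as __main__.
The function name main_aliases and its optional "modules" argument are my own
invention for illustration; nothing like it exists in the standard library.

```python
import os
import sys


def main_aliases(modules=None):
    """Return sorted names of modules loaded from the same file as __main__.

    'modules' defaults to sys.modules; passing a dict is only for testing.
    """
    modules = sys.modules if modules is None else modules
    main_mod = modules.get("__main__")
    main_file = getattr(main_mod, "__file__", None)
    # Interactive sessions and some embedded setups have no __main__.__file__.
    if not main_file or not os.path.exists(main_file):
        return []
    aliases = []
    for name, mod in list(modules.items()):
        if mod is main_mod:
            continue
        mod_file = getattr(mod, "__file__", None)
        if (mod_file and os.path.exists(mod_file)
                and os.path.samefile(main_file, mod_file)):
            aliases.append(name)
    return sorted(aliases)


if __name__ == "__main__":
    dupes = main_aliases()
    if dupes:
        raise RuntimeError("also imported as: %s" % ", ".join(dupes))
```

The check only helps if it runs after the offending import has happened, so
it belongs at the top of main() or just before sys.exit(main(...)), not at
the top of the file.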
--
Steven