PEP for executing a module in a package containing relative imports

Some of you might remember a discussion that took place on this list about not being able to execute a script contained in a package that used relative imports (read the PEP if you don't quite get what I am talking about). The PEP below proposes a solution (along with a counter-solution).

Let me know what you think. I especially want to hear which proposal people prefer; the one in the PEP or the one in the Open Issues section. Plus I wouldn't mind suggestions on a title for this PEP. =)

-------------------------------------------

PEP: XXX
Title: XXX
Version: $Revision: 52916 $
Last-Modified: $Date: 2006-12-04 11:59:42 -0800 (Mon, 04 Dec 2006) $
Author: Brett Cannon
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: XXX-Apr-2007

Abstract
========

Because of how name resolution works for relative imports in a world where PEP 328 is implemented, the ability to execute modules within a package ceases being possible. This failing stems from the fact that the module being executed as the "main" module replaces its ``__name__`` attribute with ``"__main__"`` instead of leaving it as the actual, absolute name of the module. This breaks import's ability to resolve relative imports from the main module into absolute names.

In order to resolve this issue, this PEP proposes to change how a module is delineated as the module that is being executed as the main module. By leaving the ``__name__`` attribute in a module alone and setting a module attribute named ``__main__`` to a true value for the main module (and thus false in all others), proper relative name resolution can occur while still having a clear way for a module to know if it is being executed as the main module.

The Problem
===========

With the introduction of PEP 328, relative imports became dependent on the ``__name__`` attribute of the module performing the import. This is because the dots in a relative import are used to strip away parts of the calling module's name to calculate where in the package hierarchy a relative import should fall (prior to PEP 328 relative imports could fail and would fall back on absolute imports which had a chance of succeeding).

For instance, consider the import ``from .. import spam`` made from the ``bacon.ham.beans`` module (``bacon.ham.beans`` is not a package itself, i.e., does not define ``__path__``). Name resolution of the relative import takes the caller's name (``bacon.ham.beans``), splits on dots, and then slices off the last n parts based on the level (which is 2). In this example both ``ham`` and ``beans`` are dropped and ``spam`` is joined with what is left (``bacon``). This leads to the proper import of the module ``bacon.spam``.

This reliance on the ``__name__`` attribute of a module when handling relative imports becomes an issue with executing a script within a package. Because the ``__name__`` of the executing script is set to ``'__main__'``, import cannot resolve any relative imports. This leads to an ``ImportError`` if you try to execute a script in a package that uses any relative import.

For example, assume we have a package named ``bacon`` with an ``__init__.py`` file containing::

  from . import spam

Also create a module named ``spam`` within the ``bacon`` package (it can be an empty file). Now if you try to execute the ``bacon`` package (either through ``python bacon/__init__.py`` or ``python -m bacon``) you will get an ``ImportError`` about trying to do a relative import from within a non-package. Obviously the import is valid, but because of the setting of ``__name__`` to ``'__main__'`` import thinks that ``bacon/__init__.py`` is not in a package since no dots exist in ``__name__``. To see how the algorithm works, see ``importlib.Import._resolve_name()`` in the sandbox [#importlib]_.

Currently a work-around is to remove all relative imports in the module being executed and make them absolute. This is unfortunate, though, as one should not be required to use a specific type of resource in order to make a module in a package be able to be executed.

The Solution
============

The solution to the problem is to not change the value of ``__name__`` in modules. But there still needs to be a way to let executing code know it is being executed as a script. This is handled with a new module attribute named ``__main__``.

When a module is being executed as a script, ``__main__`` will be set to a true value. For all other modules, ``__main__`` will be set to a false value. This changes the current idiom of::

  if __name__ == '__main__':
      ...

to::

  if __main__:
      ...

The current idiom is not as obvious and could cause confusion for new programmers. The proposed idiom, though, does not require explaining why ``__name__`` is set as it is.

With the proposed solution the convenience of finding out what module is being executed by examining ``sys.modules['__main__']`` is lost. To make up for this, the ``sys`` module will gain the ``main`` attribute. It will contain a string of the name of the module that is considered the executing module.

A competing solution is discussed in `Open Issues`_.

Transition Plan
===============

Using this solution will not work directly in Python 2.6. Code is dependent upon the semantics of having ``__name__`` set to ``'__main__'``. There is also the issue of pre-existing global variables in a module named ``__main__``. To deal with these issues, a two-step solution is needed.

First, a Py3K deprecation warning will be raised during AST generation when a global variable named ``__main__`` is defined. This will help with the detection of code that would reset the value of ``__main__`` for a module. Without adding a warning when a global variable is injected into a module, though, it is not fool-proof. But this solution should cover the vast majority of variable rebinding problems.

Second, 2to3 [#2to3]_ will gain a rule to transform the current ``if __name__ == '__main__': ...`` idiom to the new one. While it will not help with code that checks ``__name__`` outside of the idiom, that specific line of code makes up a large proportion of code that ever looks for ``__name__`` set to ``'__main__'``.

Open Issues
===========

A counter-proposal to introducing the ``__main__`` attribute on modules was to introduce a built-in with the same name. The value of the built-in would be the name of the module being executed (just like the proposed ``sys.main``). This would lead to a new idiom of::

  if __name__ == __main__:
      ...

The perk of this idiom over the one proposed earlier is that the general semantics does not differ greatly from the current idiom.

The drawback is that the syntactic difference is subtle; the dropping of quotes around "__main__". Some believe that for existing Python programmers bugs will be introduced where the quotation marks will be put on by accident. But one could argue that the bug would be discovered quickly through testing as it is a very shallow bug.

The other pro of this proposal over the earlier one is the alleviation of requiring import code to have to set the value of ``__main__``. By making it a built-in variable import does not have to care about ``__main__`` as executing the code itself will pick up the built-in ``__main__`` itself. This simplifies the implementation of the proposal as it only requires setting a built-in instead of changing import to set an attribute on every module that has exactly one module have a different value (much like the current implementation has to do to set ``__name__`` in one module to ``'__main__'``).

References
==========

.. [#2to3] 2to3 tool
   (http://svn.python.org/view/sandbox/trunk/2to3/)

.. [#importlib] importlib
   (http://svn.python.org/view/sandbox/trunk/import_in_py/importlib.py?view=mark...)

Copyright
=========

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:
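
The name resolution described in "The Problem" can be sketched in a few lines of Python. This is only an illustration of the split-and-slice idea, not the sandbox ``importlib`` code the PEP cites; the function name ``resolve_relative_name`` and its ``caller_is_package`` flag are made up for this example.

```python
def resolve_relative_name(name, caller_name, level, caller_is_package=False):
    """Turn a relative import into an absolute dotted name.

    Hypothetical helper: `level` is the number of leading dots in the
    import statement, `caller_name` is the importing module's __name__.
    """
    if level < 1:
        raise ValueError("level must be >= 1 for a relative import")
    parts = caller_name.split(".")
    # A plain module drops its own name first; a package (one that
    # defines __path__) counts as the first level itself.
    drop = level - 1 if caller_is_package else level
    if drop >= len(parts):
        # Nothing would be left to anchor the import to. This is the
        # situation when the caller's __name__ is '__main__'.
        raise ImportError("relative import from a non-package: %r" % caller_name)
    base = parts[:len(parts) - drop]
    return ".".join(base + [name])


print(resolve_relative_name("spam", "bacon.ham.beans", 2))
```

With a caller name of ``'__main__'`` there are no dots to split on, so any level of relative import falls into the ``ImportError`` branch, which matches the failure the PEP describes.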

On 4/19/07, Brett Cannon <brett@python.org> wrote:
As you've probably already guessed, I prefer the:: if __main__: version. I don't think I've ever used sys.modules['__main__'].
Could you explain a bit why __main__ couldn't be inserted into modules before the module is actually executed? E.g. something like::

    >>> module_text = '''\
    ... __main__ = 'foo'
    ... print __main__
    ... '''
    >>> import new
    >>> mod = new.module('mod')
    >>> mod.__main__ = True
    >>> exec module_text in mod.__dict__
    foo
    >>> mod.__main__
    'foo'

I would have thought that if Python inserted __main__ before any of the module contents got exec'd, it would be backwards compatible because any use of __main__ would just overwrite the default one.

Steve
--
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a tiny blip on the distant coast of sanity. --- Bucky Katt, Get Fuzzy
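
For readers on current Python, where the ``new`` module and the ``exec ... in`` statement no longer exist, roughly the same experiment can be written with ``types.ModuleType``. This is only a sketch of the pre-seeding idea; the helper name ``run_with_main_flag`` is made up here.

```python
import types


def run_with_main_flag(source, is_main):
    """Execute `source` in a fresh module whose __main__ was pre-set."""
    mod = types.ModuleType("mod")
    mod.__main__ = is_main      # seeded before any module code runs
    exec(source, mod.__dict__)  # module code may rebind __main__ freely
    return mod


mod = run_with_main_flag("__main__ = 'foo'\n", is_main=True)
print(mod.__main__)
```

The module's own assignment wins over the seeded value, which is the backwards-compatibility argument being made: old code that binds ``__main__`` itself keeps seeing its own value.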

On 4/19/07, Steven Bethard <steven.bethard@gmail.com> wrote:
Yeah, I figured you would. =)
That's right, and that is the problem. That would mean if __main__ was false but then overwritten by a function or something, it suddenly became true. It isn't a problem in terms of whether the code will run, but whether the expected semantics will occur. -Brett
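
The concern can be made concrete with a small simulation. It is a sketch under assumptions: ``simulate_import`` is a made-up stand-in for what import would do under the proposal, and the "legacy" module is imagined code that already defines its own ``__main__`` function.

```python
import types

# Imagined pre-existing code that uses __main__ as a function name and
# (after an automated conversion, say) also tests it as a flag.
LEGACY_SOURCE = '''
def __main__():
    return "legacy entry point"

ran_as_script = bool(__main__)
'''


def simulate_import(source):
    """Exec `source` as a non-main module under the proposed semantics."""
    mod = types.ModuleType("legacy")
    mod.__main__ = False        # imported, so the flag starts out false
    exec(source, mod.__dict__)
    return mod


legacy = simulate_import(LEGACY_SOURCE)
print(legacy.ran_as_script)
```

The code runs without error, but the function object is truthy, so the module now believes it is the main module even though it was only imported.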

On 4/20/07, Brett Cannon <brett@python.org> wrote:
Sure, but I don't see how it's much different from anyone who writes::

    list = [foo, bar, baz]

and then later wonders why::

    list(obj)

gives a ``TypeError: 'list' object is not callable``. If someone doesn't understand that the __main__ they defined at the beginning of a module is going to be the same __main__ they use at the end of the module, they're going to need to go do some reading about how name binding works in Python anyway.

Of course, I definitely think it would be valuable to have a Py3K deprecation warning to help users identify when they've made a silly mistake like this. (Note that the counter-proposal has the same problem, so this needs to be resolved regardless of which approach gets taken.)

I'd really like there to be a way to write Python 3.0 compatible code in Python 2.6 without having to run through 2to3. I think it's clear that __main__ can be defined (at module-level or in the builtins) without introducing any backwards compatibility problems, right? Anyone that doesn't want to use the Python 3.0 idiom can still write ``if __name__ == '__main__'`` and it will continue to work in Python 2.X. And anyone who does want to use the Python 3.0 idiom is probably using the Py3K flag anyway, so if they make a stupid mistake, it'll get caught pretty quickly.

Steve
--
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a tiny blip on the distant coast of sanity. --- Bucky Katt, Get Fuzzy

On 4/20/07, Steven Bethard <steven.bethard@gmail.com> wrote:
Exactly. It's just that 'list' was known about when the code was written while __main__ was not.
Yep.
Exactly. Python 2.6 will still have __name__ set to '__main__', but also have __main__ set. Python 3.0 will not change __name__ at all. This is why the PEP is a Py3K PEP and not a 2.6 PEP. -Brett

On 4/20/07, Brett Cannon <brett@python.org> wrote:
Exactly. Python 2.6 will still have __name__ set to '__main__', but also have __main__ set. Python 3.0 will not change __name__ at all.
That should be Python 3.0 will not change __main__ at all, right? Because __name__ is going to change from being "__main__" in the main module to being the actual module name in Python 3.0, right?

Assuming that's right, I think it was unclear to me that you wanted to add __main__ to Python 2.x. Probably changing:

    First, a Py3K deprecation warning will be raised...

to:

    First, each module will gain a __main__ attribute and a Py3K deprecation warning will be raised...

would make the intent clearer.

Steve
--
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a tiny blip on the distant coast of sanity. --- Bucky Katt, Get Fuzzy

On 4/20/07, Brett Cannon <brett@python.org> wrote:
If the code is still using a __main__ variable of its own, then presumably it isn't using the new meaning of __main__, and isn't affected by the unexpected semantics. Or are you concerned that some code *outside* a module could check to see whether that module is __main__?
Sure, but I don't see how it's much different from anyone who writes::
list = [foo, bar, baz]
and then later wonders why::
list(obj)
gives a ``TypeError: 'list' object is not callable``.
Exactly. It's just that 'list' was known about when the code was written while __main__ was not.
In that case, the module itself isn't using (and doesn't care) about the new __main__ semantics. Code external to the module can't rely on either (list or __main__) being unchanged, even today.
I'd really like there to be a way to write Python 3.0 compatible code in Python 2.6 without having to run through 2to3.
To me, this is a fairly important requirement that I fear is sometimes being forgotten. 2to3 isn't really a one-time translation unless you stop supporting 2.x after running it. -jJ

"Brett Cannon" <brett@python.org> wrote:
About all I can come up with is "Fixing relative imports".
According to your PEP, the point of the above is so that __name__ can become something descriptive, so that relative imports can do their thing as per PEP 328 semantics. However, both of your proposals seek to offer a value for __main__ (either as a builtin or module global). While others will probably disagree with me, I'm going to go with your 'open issues' proposal of ...
if __name__ == __main__: ...
As you say, errors arising from the 'subtle' removal of quotes will be quickly discovered (without a 2to3 conversion), and with a 2to3 conversion can be automatically converted. In 2.6, it could result in a warning or exception, depending on how Python 2.6 is run and/or what __future__ statements are used.

It also doesn't rely on sticking yet another value in a module's globals (which makes it easier for 3rd parties to handle module loading by hand), while still making __main__ accessible. For people who had previously been using sys.modules['__main__'], they can instead use sys.modules[__main__] to get the same effect, which your initial proposal does not allow.

- Josiah

After reading other posts in the thread, I'm going to put my support into the sys.main variant. It has all of the benefits of the builtin __name__ == __main__, with none of the drawbacks (no builtin!), and only a slight annoyance of 'import sys', which is more or less free. - Josiah

On 4/21/07, Brett Cannon <brett@python.org> wrote:
Note that the one benefit the sys.main-only variant doesn't have is the lower cognitive load of just having to know about __main__, instead of having to know about __name__, import and sys.main. That said, since the PEP as it stands introduces a sys.main anyway, we might as well start with that. People can then play around with it and see if we need to introduce a __main__ module attribute or builtin as well. Steve -- I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a tiny blip on the distant coast of sanity. --- Bucky Katt, Get Fuzzy

On Sat, Apr 21, 2007, Steven Bethard wrote:
From my POV that is indeed a lower cognitive load because all I need to remember is to look in the docs for the sys module -- everything else is
there. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "...string iteration isn't about treating strings as sequences of strings, it's about treating strings as sequences of characters. The fact that characters are also strings is the reason we have problems, but characters are strings for other good reasons." --Aahz

On 4/22/07, Aahz <aahz@pythoncraft.com> wrote:
As a newbie, you need to remember to look up at least two things: __name__ and sys.main. As compared to having to look up just __main__.

STeVe
--
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a tiny blip on the distant coast of sanity. --- Bucky Katt, Get Fuzzy

On 4/21/07, Brett Cannon <brett@python.org> wrote:
On 4/21/07, Josiah Carlson <jcarlson@uci.edu> wrote:
Yeah, I am starting to like it as well. Steven and Jim, what do you think?
Better than adding a builtin. I'm not sure I like the idea of another semi-random object in sys either, though.

(1) One of the motivations was importing. It looks like __file__ already has sufficient information. I understand that relying on it (or on __package__?) seems a bit hacky, but is it really worse than adding something?

(2) Is there a reason the main module can't appear in sys.modules twice, once under the alias "__main__"?

    # Equivalent to today
    if __name__ == sys.modules["__main__"].__name__:

    # Better than today
    if __name__ is sys.modules["__main__"].__name__:

    # What I would like (pending PEP I hope to write tonight)
    if __this_module__ is sys.modules["__main__"]:

-jJ
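
There is no ``__this_module__`` in Python, but the identity comparison in the last variant can be approximated today by looking the module up in ``sys.modules``. The following is a small sketch; the helper name ``is_main_module`` is invented for this illustration.

```python
import sys


def is_main_module(module_name):
    """Return True if `module_name` refers to the running main module.

    Compares module objects by identity instead of comparing names.
    """
    module = sys.modules.get(module_name)
    return module is not None and module is sys.modules.get("__main__")


print(is_main_module("__main__"))
print(is_main_module("sys"))
```

Inside a module the call would be ``is_main_module(__name__)``. Under the aliasing idea in (2), that check would keep working even if ``__name__`` held the real dotted name, because both keys would map to the same module object.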

"Jim Jewett" <jimjjewett@gmail.com> wrote:
While it is unlikely, there may be cleanup issues when the process is ending.
The above two should be equivalent unless the importer has a bad habit.
# What I would like (pending PEP I hope to write tonight)
if __this_module__ is sys.modules["__main__"]:
While I would also very much like the ability to access *this module*, I don't believe that this necessarily precludes the use of a proper package.module naming scheme for all __name__ values. - Josiah

On 4/22/07, Jim Jewett <jimjjewett@gmail.com> wrote:
Is it just me, or are the proposals starting to look more and more like:: public static void main(String args[]) I think this PEP now needs to explicitly state that keeping the "am I the main module?" idiom as simple as possible is *not* a goal. Because everything I've seen (except for the original proposals in the PEP) are substantially more complicated than the current:: if __name__ == '__main__': I guess I don't understand why we wouldn't be willing to put up with a new module attribute or builtin to minimize the boilerplate in pretty much every Python application out there. STeVe -- I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a tiny blip on the distant coast of sanity. --- Bucky Katt, Get Fuzzy

Steven Bethard wrote:
I'm proposing the following changes:

* sys.main is added which contains the dotted name of the main script. This allows code like:

      if __name__ == sys.main:
          ...

      main_module = sys.modules[sys.main]

* __name__ is never mangled and contains always the dotted name of the current module. It's not set to '__main__' any more. You can get the current module object with

      this_module = sys.modules[__name__]

* I'm against sys.modules['__main__] = main_module because it may cause ugly side effects with reload. The same functionality is available with sys.modules[sys.main]. The Zen Of Python says that there should be one and only one obvious way.
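
No ``sys.main`` attribute exists, so the idea cannot be run as written. As a rough stand-in, later CPython versions record the real dotted name of a module started with ``-m`` on the ``__spec__`` attribute of ``__main__``. The sketch below assumes such an interpreter; the function name ``main_module_name`` is made up.

```python
import sys


def main_module_name():
    """Best-effort dotted name of the module running as the main module.

    Returns None when the interpreter recorded no spec for __main__,
    for example when a plain script path was executed.
    """
    main = sys.modules.get("__main__")
    spec = getattr(main, "__spec__", None)
    return spec.name if spec is not None else None


print(main_module_name())
```

This only approximates what a ``sys.main`` might hold; for a file run by path there is no dotted name to report, which is the same gap raised elsewhere in the thread.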
Why bother with the second prize when you can win the first prize? In my opinion a __main__() function makes life easier than a __main__ module level variable. It's also my opinion that the main code should be in a function and not in the body of the module. I consider it good style because the code is unit testable (is this a word? *g*) and callable from another module, while code in the body is not accessible from unit tests and other scripts.

I know that some people are against __main__(argv) but I've good reasons to propose the argv syntax. Although argv is available via sys.argv, I like to see it as an argument for __main__() for the same reasons I like to see __main__. It makes unit testing and calls from another module possible. W/o the argv argument it is harder to change the arguments in unit tests.

Now for some syntactic sugar and a dream of mine:

    @argumentdecorator(MyOptionParserClass)
    def __main__(egg, spam=5):
        pass

The argumentdecorator function takes some kind of option parser class that is used to parse argv. This would allow nice code like

    __main__(('mainscript.py', '--eggs 5', '--no-spam'))

Christian

On 4/22/07, Christian Heimes <lists@cheimes.de> wrote:
Note that this really requires the code::

    import sys
    if __name__ == sys.main:

The import statement matters to me because 77% of my modules that use the __main__ idiom *don't* import sys. Hence, for those modules, this new idiom introduces more boilerplate.

STeVe
--
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a tiny blip on the distant coast of sanity. --- Bucky Katt, Get Fuzzy

On 4/22/07, Steven Bethard <steven.bethard@gmail.com> wrote:
On 4/22/07, Christian Heimes <lists@cheimes.de> wrote:
I'm proposing the following changes:
* sys.main is added which contains the dotted name of the main script. This allows code like:
if __name__ == sys.main:
Note that this really requires the code::
import sys
if __name__ == sys.main:
As long as we're in python-ideas, I'll throw out the radical suggestion of auto-importing sys into builtins, the way os autoimports path. -jJ

On 4/22/07, Jim Jewett <jimjjewett@gmail.com> wrote:
While that would address my concern, I wonder if adding sys to the builtins is really any better than adding __main__ to the builtins. STeVe -- I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a tiny blip on the distant coast of sanity. --- Bucky Katt, Get Fuzzy

Steven Bethard wrote:
While that would address my concern, I wonder if adding sys to the builtins is really any better than adding __main__ to the builtins.
If I understand the proposal right then __main__ won't be a builtin. Each module would get a new global variable __main__ which is set either to True or False. Also I consider sys kinda reserved for the sys module while the __main__ global var approach would reserve a new name that I like to see used for something else. +0.25 for sys in builtins Christian

Jim Jewett wrote:
+1

I thought that this was discussed before and had gotten general approval. Also it makes sense to me to have additional entries in sys to identify the starting main, and the root package modules.

I also like the idea of having a way to say this_module.

    if __module__ is sys.__main__:
        ...

Notice names aren't used this way, which is generally how you would compare any object in python. You wouldn't try to get its name and then compare that to the name of another object.

Ron

On 22 Apr 2007, at 19.50, Ron Adam wrote:
Agreed... I suggested something like that a couple of days ago (except assuming __main__ would be a builtin global instead of in sys). I proposed __this__ as the name for accessing the current module. Mainly because I like the Englishlike way it reads: "if __this__ is __main__". 'If this is main' -- couldn't be simpler. Though I'd also be fine with sys.__main__ or sys.main (I'd prefer the latter). I would support having sys be an automatic global.

On Sun, Apr 22, 2007, Steven Bethard wrote:
Does this follow the axiom that 83% of all statistics are made up on the spot? ;-) Seriously, if I'm writing a script that requires __main__, chances are excellent that it already includes sys (because it's probably a command-line script that's graduating to module status). -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "...string iteration isn't about treating strings as sequences of strings, it's about treating strings as sequences of characters. The fact that characters are also strings is the reason we have problems, but characters are strings for other good reasons." --Aahz

On 4/22/07, Aahz <aahz@pythoncraft.com> wrote:
No, I actually went and counted in my local repository. There are two main reasons why that's true: (1) Most unittest modules just run unittest.main(), so no import of sys. (2) Most other modules use optparse or argparse, so no import of sys. Steve -- I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a tiny blip on the distant coast of sanity. --- Bucky Katt, Get Fuzzy

On 4/22/07, Christian Heimes <lists@cheimes.de> wrote:
That can't be true. If I am in the directory /spam but I execute the file /bacon/code.py, what is the name of /bacon/code.py supposed to be? It makes absolutely no sense unless sys.path happens to have either / or /bacon. This is why I wondered out loud if setting whatever attribute that is chosen not to __main__ should only be done with '-m' as that keeps it simple and clear instead of having to try to reverse-engineer a file's __name__ attribute.
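
The reverse-engineering problem can be sketched as a lookup against the import search path. This is an illustration only: ``module_name_from_path`` is a made-up helper, it uses POSIX-style paths so the example behaves the same everywhere, and it does not check that intermediate directories are really packages.

```python
from pathlib import PurePosixPath


def module_name_from_path(file_path, search_paths):
    """Guess a dotted module name for `file_path`, or None if impossible.

    `search_paths` plays the role of sys.path. The first entry that
    contains the file decides the name.
    """
    target = PurePosixPath(file_path).with_suffix("")
    for entry in search_paths:
        try:
            relative = target.relative_to(entry)
        except ValueError:
            continue  # the file does not live under this entry
        if relative.parts:
            return ".".join(relative.parts)
    return None


print(module_name_from_path("/bacon/code.py", ["/spam"]))
print(module_name_from_path("/bacon/code.py", ["/"]))
print(module_name_from_path("/bacon/code.py", ["/bacon"]))
```

The same file gets no name, ``bacon.code``, or ``code`` depending on what the search path happens to contain, which is why deriving ``__name__`` from a file path is ambiguous.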
I assume that key is a string? There is a single quote that is not closed off.
People can stop wishing for this. I am not going to be writing a PEP supporting this. I don't like it; never have. I like how Python handles things currently in terms of relying on how modules are executed linearly. I am totally fine if people propose a competing PEP or try to resurrect PEP 299, but I am not going to be the person who does that leg work. -Brett

Steven Bethard wrote:
Agreed - it's getting horrid. As Pythonic as they think this is, they're completely forgetting the newb. So let's look at it from his point of view.

Say I'm a Python newb. I've written some modules and some executable Python scripts and I'm somewhat comfy with the language. (Of course, it only took me about two hours to get comfy - this is Python, after all.) I now want to write either:

1) A module that runs unit tests when it is run as a script, but not when it's just imported; or

2) A script that can be imported as a module when I need a few of its functions. (I should really split them into another module, but this is a use case.)

Now I have to import sys? Never seen that one... okay. Imported. Now, what's this Greek I have to write to test whether the script is the main script? How am I supposed to remember this? This is worse than fork()!

On the other hand, IMNSHO, either of the following two are just about perfect in terms of understandability, and parsimony:

    def __main__():
        # we really don't need args here
        # stuff

    if __main__:
        # stuff

Chances are, the first will be very familiar, but refreshing that it's just a plain old, gibberish-free function. Both are easier than what we've got currently. (IMO, the first is better, because 1) the code can be put anywhere in the module; 2) it automatically doesn't pollute the global namespace; and 3) it's less boilerplate for complex modules and no more boilerplate for simple ones.)

FWIW, I don't see a problem with a sys.modules['__main__'] - it would even occasionally be useful - but nobody should be *required* to use an abomination like that for what's clearly a newbie task: determining whether a module is run as a script.

Neil

Neil Toronto wrote:
I think __main__(*argv) has some benefits over __main__(). It allows you to call the function with different arguments from another script or a unit test.

    def __main__(argv=None):
        if argv is None:
            argv = sys.argv
    # has the same effect but it is ugly
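
The testability argument can be sketched with an ordinary ``main`` function and ``argparse`` from the standard library. Nothing here is the proposed ``__main__`` protocol; the option names ``--eggs`` and ``--no-spam`` are borrowed from the earlier example purely for illustration.

```python
import argparse


def main(argv=None):
    """Parse `argv` (a list of strings) and return the parsed settings.

    Passing argv=None lets argparse fall back to sys.argv[1:], so the
    same function serves the command line and callers in tests.
    """
    parser = argparse.ArgumentParser(prog="mainscript")
    parser.add_argument("--eggs", type=int, default=0)
    parser.add_argument("--no-spam", dest="spam", action="store_false")
    args = parser.parse_args(argv)
    return args.eggs, args.spam


# A unit test or another module can drive it without touching sys.argv.
print(main(["--eggs", "5", "--no-spam"]))
```

Because the arguments arrive as a parameter, a test never has to patch ``sys.argv``, which is the benefit claimed for the ``argv`` signature.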
I see the problem in having the same module under two names in sys.modules. It may lead to issues (reload?). Also it is not necessary to get the main module if we store the dotted name in sys.main. So sys.modules[sys.main] would return the main module. Christian

On 4/22/07, Jim Jewett <jimjjewett@gmail.com> wrote:
Yes, because you have no guarantee __file__ will in any way be unique or even defined (look at 'sys'). It's up to the loader to set __file__ and it can do whatever it wants. This doesn't happen with __name__ since it is rather clear what that should be no matter where the module was loaded from (unless it was a Python file specified at the command line in some random directory). -Brett

Brett Cannon schrieb:
What about

    import sys
    if __name__ == sys.main:
        ...

You won't have to introduce a new global module var __name__ and it's easy to understand for newbies and experienced developers. The code is only executed when the name of the current module is equal to the executed main module (sys.main). IMO it's much less PIT...B than introducing __main__.

Christian

On 4/20/07, Christian Heimes <lists@cheimes.de> wrote:
True, but it does introduce an import for a module that may never be used if the module is not being executed. That kind of sucks for minor performance reasons. But what do other people think? -Brett

Brett Cannon schrieb:
Yeah but sys is used by a lot of modules. Probably 95%+ of executable modules are either using sys directly to access sys.argv or os which imports sys. Also sys is a builtin module which is imported ridiculously fast. I assume that the speed penalty for scripts that don't use sys is minor. In my humble opinion it sucks less to force the import of a core module that is already used by most modules than to bind valuable developer time in the __main__ approach. I think it's a Pythonic solution as well. :) Christian

On Fri, Apr 20, 2007, Brett Cannon wrote:
Looks good to me! sys is essentially guaranteed to be imported, so you're only wasting a few cycles to bring it into the module namespace. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Help a hearing-impaired person: http://rule6.info/hearing.html

On 4/20/07, Christian Heimes <lists@cheimes.de> wrote:
But you have to understand a few things to understand why this works. You have to know that __name__ is the name of the module, and that if you want to find out the name of the main module, you need to look at sys.main. With the idiom:: if __main__: all you need to know is that the main module has __main__ set to true.
IMO it's much less PIT...B than introducing __main__.
Could you elaborate? Do you think it would be hard to introduce another module-level attribute (like we already do for __name__)? Or do you think that the code would be hard to maintain? Or something else...? Steve -- I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a tiny blip on the distant coast of sanity. --- Bucky Katt, Get Fuzzy

Steven Bethard wrote:
This is just my humble opinion. I'm new to Python core development. Well, in my opinion a new module level var like __main__ isn't worth adding when it is just a boolean flag. With the proposed addition of sys.main the same information is available with just a few more characters to type. If I recall correctly Python is trying to get rid of global variables in Python 3000.

I don't think it's hard to add - even for me, although I know less about the Python core. I'm more worried about the side effects when people have already used __main__ as a function. The problem is in 2to3.

If you like to introduce __main__ why not implement http://www.python.org/dev/peps/pep-0299 ? It proposes a __main__(*argv*) module level function that replaces the "if __name__ == '__main__'" idiom. The __main__ function follows the example of other programming languages like C, C# and Java. I'm aware of the fact that the PEP was rejected but I think it's worth discussing again.

Christian

On 4/21/07, Christian Heimes <lists@cheimes.de> wrote:
I don't like the __main__ function signature. There are lots of options, like optparse and argparse_ that are much better than manually parsing sys.argv as the PEP 299 signature would suggest. And if there's nothing to be passed to the function, why make it a function at all? Personally, I thought one of the pluses of the current status quo (as well as what Brett is proposing here) is that it *didn't* follow in the (misplaced IMHO) footsteps of languages like C and Java. I think we're probably best letting dead PEPs lie. .. _argparse: http://argparse.python-hosting.com/ STeVe -- I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a tiny blip on the distant coast of sanity. --- Bucky Katt, Get Fuzzy

Steven Bethard wrote:
I agree that optparse and argparse are better ways to parse a command line than using sys.argv directly, but nothing in PEP299 would prevent you from using them. In fact, I am pretty sure that with a suitable decorator on __main__ you could make their use even simpler.
I find it very sad that PEP299 did in fact die, because I think it is a much cleaner solution than the proposal that started this thread.

That said, I would like to see a way to remove the __name__=='__main__' weirdness. I am +1 on resurrecting PEP299, but also +1 on adding a "sys.main" that could be used in a new "if __name__==sys.main". I am -1 on adding a builtin/global __main__ as proposed, because that would clash with my own PEP299-like use of that name.

Jacob
--
Jacob Holm CTO Improva ApS

Jacob Holm wrote:
I had at one time (about 4 years ago) thought it was a bit strange. But that was only for a very short while.

Python differs from other languages in a very important way. Python *always* starts at the top of the file and works its way down until it falls off the bottom. What it does in between the top and the bottom is entirely up to you. It's very dynamic.

Other languages *compile* all the code first without executing any of it. Then you are required to tell the compiler where the program will start, which is why you need to define a main() function.

In Python, letting control fall off the bottom in order to start again at some place in the middle doesn't make much sense. It's already started, so you don't need to do that.

Cheers,
Ron

Ron Adam wrote:
To clarify: By "weirdness" here I meant the fact that the name of a module changes when it is used as the main module.
I know all that.
There are a number of reasons to want to use a function for the main part of the code, instead of putting it in an "if" at the end of the module. Two simple ones are:

* Keeping the module namespace clean.
* The ability to call the function from other code, most likely with different args.

Since I am usually writing such a function anyway, I would prefer not to have to write the "if" boilerplate at the bottom in order to get it called.

Oh, and automatically calling a __main__ function if it exists does not prevent people who like the current "if" approach from using that. It would just make *my* life that tiny bit easier. Therefore I would like to keep that door open by *not* adding the proposed __main__ variable at this point. Fortunately, the people that matter here seem to think avoiding the extra variable is a good idea (although for different reasons).

Jacob
--
Jacob Holm
CTO
Improva ApS

Steven Bethard wrote:
if there's nothing to be passed to the function, why make it a function at all?
I don't usually like to put big lumps of init code at the module level, because it pollutes the module namespace with local variables. So I typically end up with

    def main():
        ...
        ...
        ...

    if __name__ == "__main__":
        main()

So I'd be quite happy if I could just define a function called __main__() and be done with. I don't understand why there's so much opposition to that idea.

--
Greg

On 4/22/07, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
+1. Although I may start out at the module level, that's typically the idiom I use eventually for any non-trivial (e.g. more than 1-2 lines) main*.

George

* Only exception is if the module consists essentially of main(), i.e. a small standalone script without classes, functions, etc.

On 4/22/07, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
I guess I'm just the odd one out here in that I parse my arguments before passing them to module-level functions. So my code normally looks like::

    if __name__ == '__main__':
        ... a few lines of argument parsing code ...
        some_function_name(args.foo, args.bar, args.baz)

That is, I do the argument parsing at the module level, and then call the module functions with more meaningful arguments than sys.argv.

STeVe

On 4/19/07, Brett Cannon <brett@python.org> wrote:
Part of me says that you are already proposing the right answer, as these alternatives are just a little too hackish. Still, they are good enough that they should be listed in the PEP, even if only as rejected alternatives.

(1) You could add a builtin __main__ that is false. The real main module would mask it, but no other code would need to change.

    Con: Another builtin, and this one wouldn't even make sense as an independent object.

(2) You could special-case the import to use __file__ instead of __name__ when __name__ == "__main__".

    Con: may be more fragile.

(3) You could set __name__ to (an instance of) a funky string subclass that overrides __eq__.

    Con: may be hard to find exactly the *right* behavior. Examples: What should str(name) do? Maybe __main__ should be the primary value, and split should be overridden?

-jJ
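Alternative (3) is easier to judge with something concrete in hand. What follows is only a minimal sketch under stated assumptions, not anything proposed in the thread: the class name ``MainAwareName`` is hypothetical, and the choice to keep the dotted name as the primary value is one of several possible answers to the open questions listed under "Con".

```python
class MainAwareName(str):
    """Hypothetical str subclass: holds the real dotted module name,
    but also compares equal to the legacy '__main__' string."""

    def __eq__(self, other):
        # Equal to "__main__" (legacy idiom) or to the real dotted name.
        if isinstance(other, str) and str.__eq__(other, "__main__"):
            return True
        return str.__eq__(self, other)

    def __ne__(self, other):
        # str supplies its own __ne__, so it must be overridden as well,
        # otherwise == and != would both be true for "__main__".
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    # Defining __eq__ clears __hash__; restore it. Note the hash still
    # differs from hash("__main__"), one example of the fragility.
    __hash__ = str.__hash__


name = MainAwareName("bacon.ham.beans")
is_main = (name == "__main__")        # legacy check still passes
parent = name.rpartition(".")[0]      # relative-import arithmetic still works
as_text = str(name)                   # plain str holding the dotted name
```

Even this small version has to override ``__ne__`` and ``__hash__`` by hand, and equal objects end up with different hashes, which gives some substance to the "hard to find exactly the right behavior" objection.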

I realized two things that I didn't mention in the PEP.

One is that Python will have to infer the proper package name for a module being executed. Currently Python only knows the name of a module because you asked for something and it tries to find a module that fits that request. But what is being proposed here has to figure out what you would have asked for in order for the import to happen. So I need to spell out the algorithm that will need to be used to figure out that ``python bacon/__init__.py`` is the bacon package. Using the '-m' option solves this as the name is given as an argument. Maybe this should only be expected to work with the -m option? That would simplify things, but it does restrict the usefulness overall (but not entirely, as you would still gain a new feature).

The other issue is what to do if the module being executed is above the current directory where Python is executing from (e.g., ``python ../spam.py``). You can't infer the name for that module if the parent directory is not on sys.path. Setting the name to "__main__" might need to stay for instances where the module being executed cannot have its name inferred. This is another argument to only support '-m' with this.

-Brett

On 4/19/07, Brett Cannon <brett@python.org> wrote:

"Brett Cannon" <brett@python.org> wrote:
There's also the rub that if you 'run' the module in /a/b/c/d/e/f.py, but all a-e are packages, the "proper" semantics may state that you need to import a/__init__.py, a/b/__init__.py, etc., prior to the execution of f.py . Of course the only way that you would know that is if you checked the paths .../e/, .../d/, etc. The PEP should probably be changed to state the order of imports in a case similar to this, and whether or not it bothers to check ancestor paths for package information. - Josiah
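The import order in question can be written down in a few lines. This is an illustration only: ``ancestor_packages`` and ``import_ancestors`` are hypothetical helper names, and the sketch assumes the dotted name of the executed file (``a.b.c.d.e.f`` in the example above) has already been determined somehow.

```python
import importlib


def ancestor_packages(dotted_name):
    """List the parent packages of a dotted name, outermost first."""
    parts = dotted_name.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts))]


def import_ancestors(dotted_name):
    """Import each parent package in order, before the module itself runs.

    importlib.import_module already imports parents implicitly; the
    explicit loop is here only to make the ordering visible.
    """
    for package in ancestor_packages(dotted_name):
        importlib.import_module(package)


order = ancestor_packages("a.b.c.d.e.f")
```

Here ``order`` lists ``a`` first and ``a.b.c.d.e`` last, so each ``__init__.py`` would run before the packages nested inside it.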

On 4/20/07, Josiah Carlson <jcarlson@uci.edu> wrote:
Good point. It's one of the ways my import implementation differs from the current one as I just import the parent up to the requested module while the current implementation throws an exception. -Brett

On 19 Apr 2007, at 23.38, Brett Cannon wrote:
I like that one. But one thing I've always thought would be handy is a builtin (maybe __this__?) pointing to the current module object itself (instead of its name). Any chance of that happening? In that case, __main__ could globally point to the main module instead of its name. The idiom would then be "if __this__ is __main__: ...". I think that reads pretty well: "If this is [the] main [module, then ...]."

"Brett Cannon" <brett@python.org> wrote in message news:bbaeab100704192038v110b053eqfdcf49f613302f8@mail.gmail.com...

| Let me know what you think. I especially want to hear which proposal
| people prefer; the one in the PEP or the one in the Open Issues section.

This PEP has two proposals, which I think should be better separated.

1. Leave __name__ alone (without the '__main__' hack) so that relative imports work when executing scripts within packages. My comment here is that I am fuzzy on the difference between __name__ and __file__ and why we would then need both.

2. Fix the 'main' self-knowledge problem introduced by fix 1. The 'counter-proposal' is only an alternative to this second proposal, as it agrees with the first. I had the same idea as Christian as a third alternative, but as a user would prefer the simplest invocation possible.

I agree with Jim that multiple alternatives should be listed. I think the '__main__' hack was both elegant and a wart, and agree that we should seriously consider a pair of coupled fixes.

| Plus I wouldn't mind suggestions on a title for this PEP. =)

Package scripts, relative imports, and main identification.

Terry Jan Reedy

I revised the PEP to use the sys.main idea and sent it off to python-3000. If you care to participate in the discussion please move it over there. Thanks to everyone who contributed to the discussion. I really appreciate the help! -Brett

On 4/22/07, Brett Cannon <brett@python.org> wrote:
I revised the PEP to use the sys.main idea and sent it off to python-3000.
Just wanted to say thanks Brett for putting the time into this!

Steve

On 4/22/07, Steven Bethard <steven.bethard@gmail.com> wrote:
Welcome. I am just glad I got the email off literally 15 minutes or so before my laptop died. So if the hard drive is gone at least I have the latest version still. =)

what about:

    if __module__ is __main__:
        ...

__module__ is a reference to "this module" and __main__ is a reference to the module which started execution.

This would have the side effect that:

    def f():
        __module__.x = 42

equals:

    def f():
        global x
        x = 42

etc.

Off topic: a __func__ would also be nice:

    def foo():
        print __func__.__name__

prints "foo"

    def foo():
        x = 1
        def bar():
            __func__.parent.x += 1
            return __func__.parent.x
        return bar
    print bar()

prints "2"

Well, on second thought this looks a bit cluttered.

On 4/19/07, Brett Cannon <brett@python.org> wrote:
As you've probably already guessed, I prefer the::

    if __main__:

version. I don't think I've ever used sys.modules['__main__'].
Could you explain a bit why __main__ couldn't be inserted into modules before the module is actually executed? E.g. something like::

    >>> module_text = '''\
    ... __main__ = 'foo'
    ... print __main__
    ... '''
    >>> import new
    >>> mod = new.module('mod')
    >>> mod.__main__ = True
    >>> exec module_text in mod.__dict__
    foo
    >>> mod.__main__
    'foo'

I would have thought that if Python inserted __main__ before any of the module contents got exec'd, it would be backwards compatible because any use of __main__ would just overwrite the default one.

Steve
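The same experiment can be repeated on current Python, where the ``new`` module no longer exists. The following is a minimal sketch, assuming a hypothetical helper ``run_with_default_main`` that pre-populates ``__main__`` before executing the module source; it illustrates the overwrite behaviour and is not how any interpreter actually sets up modules.

```python
import types


def run_with_default_main(source, is_main=False):
    """Exec `source` in a fresh module whose __main__ attribute
    was set *before* any of the module body ran."""
    mod = types.ModuleType("mod")
    mod.__main__ = is_main
    exec(source, mod.__dict__)
    return mod


# A module that never mentions __main__ keeps the injected default.
untouched = run_with_default_main("x = 1\n")

# A module that defines its own __main__ silently replaces the flag,
# and a function object is truthy.
rebound = run_with_default_main("def __main__():\n    pass\n")
```

In the second case the code runs without error, but a later ``if __main__:`` in that module would take the "main" branch even though the module was not the one being executed.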

On 4/19/07, Steven Bethard <steven.bethard@gmail.com> wrote:
Yeah, I figured you would. =)
That's right, and that is the problem. That would mean if __main__ was false but then overwritten by a function or something, it suddenly became true. It isn't a problem in terms of whether the code will run, but whether the expected semantics will occur. -Brett

On 4/20/07, Brett Cannon <brett@python.org> wrote:
Sure, but I don't see how it's much different from anyone who writes::

    list = [foo, bar, baz]

and then later wonders why::

    list(obj)

gives a ``TypeError: 'list' object is not callable``. If someone doesn't understand that the __main__ they defined at the beginning of a module is going to be the same __main__ they use at the end of the module, they're going to need to go do some reading about how name binding works in Python anyway.

Of course, I definitely think it would be valuable to have a Py3K deprecation warning to help users identify when they've made a silly mistake like this. (Note that the counter-proposal has the same problem, so this needs to be resolved regardless of which approach gets taken.)

I'd really like there to be a way to write Python 3.0 compatible code in Python 2.6 without having to run through 2to3. I think it's clear that __main__ can be defined (at module-level or in the builtins) without introducing any backwards compatibility problems, right? Anyone that doesn't want to use the Python 3.0 idiom can still write ``if __name__ == '__main__'`` and it will continue to work in Python 2.X. And anyone who does want to use the Python 3.0 idiom is probably using the Py3K flag anyway, so if they make a stupid mistake, it'll get caught pretty quickly.

Steve

On 4/20/07, Steven Bethard <steven.bethard@gmail.com> wrote:
Exactly. It's just that 'list' was known about when the code was written while __main__ was not.
Yep.
Exactly. Python 2.6 will still have __name__ set to '__main__', but also have __main__ set. Python 3.0 will not change __name__ at all. This is why the PEP is a Py3K PEP and not a 2.6 PEP. -Brett

On 4/20/07, Brett Cannon <brett@python.org> wrote:
Exactly. Python 2.6 will still have __name__ set to '__main__', but also have __main__ set. Python 3.0 will not change __name__ at all.
That should be "Python 3.0 will not change __main__ at all", right? Because __name__ is going to change from being "__main__" in the main module to being the actual module name in Python 3.0, right?

Assuming that's right, I think it was unclear to me that you wanted to add __main__ to Python 2.x. Probably changing:

    First, a Py3K deprecation warning will be raised...

to:

    First, each module will gain a __main__ attribute and a Py3K deprecation warning will be raised...

would make the intent clearer.

Steve

On 4/20/07, Brett Cannon <brett@python.org> wrote:
If the code is still using a __main__ variable of its own, then presumably it isn't using the new meaning of __main__, and isn't affected by the unexpected semantics. Or are you concerned that some code *outside* a module could check to see whether that module is __main__?
Sure, but I don't see how it's much different from anyone who writes::
list = [foo, bar, baz]
and then later wonders why::
list(obj)
gives a ``TypeError: 'list' object is not callable``.
Exactly. It's just that 'list' was known about when the code was written while __main__ was not.
In that case, the module itself isn't using (and doesn't care) about the new __main__ semantics. Code external to the module can't rely on either (list or __main__) being unchanged, even today.
I'd really like there to be a way to write Python 3.0 compatible code in Python 2.6 without having to run through 2to3.
To me, this is a fairly important requirement that I fear is sometimes being forgotten. 2to3 isn't really a one-time translation unless you stop supporting 2.x after running it. -jJ

"Brett Cannon" <brett@python.org> wrote:
About all I can come up with is "Fixing relative imports".
According to your PEP, the point of the above is so that __name__ can become something descriptive, so that relative imports can do their thing as per PEP 328 semantics. However, both of your proposals seek to offer a value for __main__ (either as a builtin or module global). While others will probably disagree with me, I'm going to go with your 'open issues' proposal of ...
if __name__ == __main__: ...
As you say, errors arising from the 'subtle' removal of quotes will be quickly discovered (without a 2to3 conversion), and with a 2to3 conversion can be automatically converted. In 2.6, it could result in a warning or exception, depending on how Python 2.6 is run and/or what __future__ statements are used.

It also doesn't rely on sticking yet another value in a module's globals (which makes it easier for 3rd parties to handle module loading by hand), while still making __main__ accessible. For people who had previously been using sys.modules['__main__'], they can instead use sys.modules[__main__] to get the same effect, which your initial proposal does not allow.

- Josiah

After reading other posts in the thread, I'm going to put my support into the sys.main variant. It has all of the benefits of the builtin __name__ == __main__, with none of the drawbacks (no builtin!), and only a slight annoyance of 'import sys', which is more or less free. - Josiah

On 4/21/07, Brett Cannon <brett@python.org> wrote:
Note that the one benefit the sys.main-only variant doesn't have is the lower cognitive load of just having to know about __main__, instead of having to know about __name__, import and sys.main. That said, since the PEP as it stands introduces a sys.main anyway, we might as well start with that. People can then play around with it and see if we need to introduce a __main__ module attribute or builtin as well.

Steve

On Sat, Apr 21, 2007, Steven Bethard wrote:
From my POV that is indeed a lower cognitive load because all I need to remember is to look in the docs for the sys module -- everything else is there.
--
Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/

"...string iteration isn't about treating strings as sequences of strings, it's about treating strings as sequences of characters. The fact that characters are also strings is the reason we have problems, but characters are strings for other good reasons." --Aahz

On 4/22/07, Aahz <aahz@pythoncraft.com> wrote:
As a newbie, you need to remember to look up at least two things: __name__ and sys.main. As compared to having to look up just __main__.

STeVe

On 4/21/07, Brett Cannon <brett@python.org> wrote:
On 4/21/07, Josiah Carlson <jcarlson@uci.edu> wrote:
Yeah, I am starting to like it as well. Steven and Jim, what do you think?
Better than adding a builtin. I'm not sure I like the idea of another semi-random object in sys either, though.

(1) One of the motivations was importing. It looks like __file__ already has sufficient information. I understand that relying on it (or on __package__?) seems a bit hacky, but is it really worse than adding something?

(2) Is there a reason the main module can't appear in sys.modules twice, once under the alias "__main__"?

    # Equivalent to today
    if __name__ == sys.modules["__main__"].__name__:

    # Better than today
    if __name__ is sys.modules["__main__"].__name__:

    # What I would like (pending PEP I hope to write tonight)
    if __this_module__ is sys.modules["__main__"]:

-jJ
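The identity test in the last variant can be tried today with ``sys.modules`` alone. This is a small sketch, not the proposed feature: ``__this_module__`` does not exist, so the helper below (the name ``is_main_module`` is made up for illustration) takes the module object as an argument instead.

```python
import sys


def is_main_module(module):
    """Return True if `module` is the object registered as '__main__'."""
    return module is sys.modules.get("__main__")


# The sys module is never the script being executed.
sys_is_main = is_main_module(sys)

# Whatever is registered under "__main__" is, by definition, the main module.
main_is_main = is_main_module(sys.modules["__main__"])
```

Inside a module, ``sys.modules[__name__]`` is the closest current spelling of "this module", so the check would read ``is_main_module(sys.modules[__name__])``.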

"Jim Jewett" <jimjjewett@gmail.com> wrote:
While it is unlikely, there may be cleanup issues when the process is ending.
The above two should be equivalent unless the importer has a bad habit.
# What I would like (pending PEP I hope to write tonight) if __this_module__ is sys.modules["__main__"]:
While I would also very much like the ability to access *this module*, I don't believe that this necessarily precludes the use of a proper package.module naming scheme for all __name__ values. - Josiah

On 4/22/07, Jim Jewett <jimjjewett@gmail.com> wrote:
Is it just me, or are the proposals starting to look more and more like::

    public static void main(String args[])

I think this PEP now needs to explicitly state that keeping the "am I the main module?" idiom as simple as possible is *not* a goal. Because everything I've seen (except for the original proposals in the PEP) is substantially more complicated than the current::

    if __name__ == '__main__':

I guess I don't understand why we wouldn't be willing to put up with a new module attribute or builtin to minimize the boilerplate in pretty much every Python application out there.

STeVe

Steven Bethard wrote:
I'm proposing the following changes:

* sys.main is added which contains the dotted name of the main script. This allows code like:

      if __name__ == sys.main:
          ...

      main_module = sys.modules[sys.main]

* __name__ is never mangled and always contains the dotted name of the current module. It's not set to '__main__' any more. You can get the current module object with

      this_module = sys.modules[__name__]

* I'm against sys.modules['__main__] = main_module because it may cause ugly side effects with reload. The same functionality is available with sys.modules[sys.main]. The Zen of Python says that there should be one and only one obvious way.

Why bother with the second prize when you can win the first prize? In my opinion a __main__() function makes life easier than a __main__ module-level variable. It's also my opinion that the main code should be in a function and not in the body of the module. I consider it good style because the code is unit testable (is this a word? *g*) and callable from another module, while code in the body is not accessible from unit tests and other scripts.

I know that some people are against __main__(argv) but I have good reasons to propose the argv syntax. Although argv is available via sys.argv, I like to see it as an argument for __main__() for the same reasons I like to see __main__. It makes unit testing and calls from another module possible. Without the argv argument it is harder to change the arguments in unit tests.

Now for some syntactic sugar and a dream of mine:

    @argumentdecorator(MyOptionParserClass)
    def __main__(egg, spam=5):
        pass

The argumentdecorator function takes some kind of option parser class that is used to parse argv. This would allow nice code like

    __main__(('mainscript.py', '--eggs 5', '--no-spam'))

Christian
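The decorator idea can be approximated with the standard ``argparse`` module. The sketch below rests on several assumptions: ``argumentdecorator`` and ``build_parser`` are illustrative names rather than an existing API, the decorated function is called ``main`` rather than ``__main__``, and ``--eggs 5`` is passed as two separate argv items because that is what ``argparse`` expects.

```python
import argparse
import functools


def argumentdecorator(parser_factory):
    """Wrap a function so it can be called with a raw argv sequence.

    `parser_factory` is any callable returning an argparse.ArgumentParser.
    """
    def decorate(func):
        @functools.wraps(func)
        def wrapper(argv):
            # argv[0] is the script name, as in sys.argv.
            namespace = parser_factory().parse_args(list(argv[1:]))
            return func(**vars(namespace))
        return wrapper
    return decorate


def build_parser():
    parser = argparse.ArgumentParser(prog="mainscript.py")
    parser.add_argument("--eggs", type=int, default=0)
    parser.add_argument("--no-spam", dest="spam", action="store_false")
    return parser


@argumentdecorator(build_parser)
def main(eggs, spam):
    return (eggs, spam)


result = main(("mainscript.py", "--eggs", "5", "--no-spam"))
defaults = main(("mainscript.py",))
```

Because the wrapper takes argv explicitly, a unit test or another module can call ``main`` with whatever arguments it likes without touching ``sys.argv``.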

On 4/22/07, Christian Heimes <lists@cheimes.de> wrote:
Note that this really requires the code::

    import sys
    if __name__ == sys.main:

The import statement matters to me because 77% of my modules that use the __main__ idiom *don't* import sys. Hence, for those modules, this new idiom introduces more boilerplate.

STeVe

On 4/22/07, Steven Bethard <steven.bethard@gmail.com> wrote:
On 4/22/07, Christian Heimes <lists@cheimes.de> wrote:
I'm proposing the following changes:
* sys.main is added which contains the dotted name of the main script. This allows code like:
if __name__ == sys.main:
Note that this really requires the code::
import sys if __name__ == sys.main:
As long as we're in python-ideas, I'll throw out the radical suggestion of auto-importing sys into builtins, the way os autoimports path. -jJ

On 4/22/07, Jim Jewett <jimjjewett@gmail.com> wrote:
While that would address my concern, I wonder if adding sys to the builtins is really any better than adding __main__ to the builtins.

STeVe

Steven Bethard wrote:
While that would address my concern, I wonder if adding sys to the builtins is really any better than adding __main__ to the builtins.
If I understand the proposal right then __main__ won't be a builtin. Each module would get a new global variable __main__ which is set either to True or False. Also I consider sys kinda reserved for the sys module while the __main__ global var approach would reserve a new name that I like to see used for something else. +0.25 for sys in builtins Christian

Jim Jewett wrote:
+1

I thought that this was discussed before and had gotten general approval.

Also it makes sense to me to have additional entries in sys to identify the starting main, and the root package modules. I also like the idea of having a way to say this_module.

    if __module__ is sys.__main__:
        ...

Notice names aren't used this way, which is generally how you would compare any object in Python. You wouldn't try to get its name and then compare that to the name of another object.

Ron

On 22 Apr 2007, at 19.50, Ron Adam wrote:
Agreed... I suggested something like that a couple of days ago (except assuming __main__ would be a builtin global instead of in sys). I proposed __this__ as the name for accessing the current module. Mainly because I like the Englishlike way it reads: "if __this__ is __main__". 'If this is main' -- couldn't be simpler. Though I'd also be fine with sys.__main__ or sys.main (I'd prefer the latter). I would support having sys be an automatic global.

On Sun, Apr 22, 2007, Steven Bethard wrote:
Does this follow the axiom that 83% of all statistics are made up on the spot? ;-) Seriously, if I'm writing a script that requires __main__, chances are excellent that it already includes sys (because it's probably a command-line script that's graduating to module status).
--
Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/

On 4/22/07, Aahz <aahz@pythoncraft.com> wrote:
No, I actually went and counted in my local repository. There are two main reasons why that's true:

(1) Most unittest modules just run unittest.main(), so no import of sys.

(2) Most other modules use optparse or argparse, so no import of sys.

Steve

On 4/22/07, Christian Heimes <lists@cheimes.de> wrote:
That can't be true. If I am in the directory /spam but I execute the file /bacon/code.py, what is the name of /bacon/code.py supposed to be? It makes absolutely no sense unless sys.path happens to have either / or /bacon. This is why I wondered out loud if setting whatever attribute that is chosen not to __main__ should only be done with '-m' as that keeps it simple and clear instead of having to try to reverse-engineer a file's __name__ attribute.
I assume that key is a string? There is a single quote that is not closed off.
People can stop wishing for this. I am not going to be writing a PEP supporting this. I don't like it; never have. I like how Python handles things currently in terms of relying on how modules are executed linearly.

I am totally fine if people propose a competing PEP or try to resurrect PEP 299, but I am not going to be the person who does that leg work.

-Brett
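The name-inference problem raised at the top of this message (what should ``/bacon/code.py`` be called when run from ``/spam``?) can be made concrete with a small sketch. It is only an illustration under stated assumptions: ``infer_module_name`` is a hypothetical helper, it works on POSIX-style path strings alone, and it does not check that intermediate directories are real packages.

```python
from pathlib import PurePosixPath


def infer_module_name(file_path, search_paths):
    """Guess a dotted module name for `file_path` from sys.path-like entries.

    Returns None when no entry contains the file, which is the case
    where falling back to '__main__' would be needed.
    """
    path = PurePosixPath(file_path)
    if path.suffix != ".py":
        return None
    for entry in search_paths:
        try:
            relative = path.relative_to(PurePosixPath(entry))
        except ValueError:
            continue  # the file does not live under this entry
        if not relative.parts:
            continue
        parts = list(relative.with_suffix("").parts)
        if parts[-1] == "__init__":
            parts.pop()  # a package is named after its directory
        if parts:
            return ".".join(parts)
    return None


package = infer_module_name("/srv/lib/bacon/__init__.py", ["/srv/lib"])
unknown = infer_module_name("/bacon/code.py", ["/spam"])
from_root = infer_module_name("/bacon/code.py", ["/"])
from_dir = infer_module_name("/bacon/code.py", ["/spam", "/bacon"])
```

The same file comes out as ``bacon.code``, ``code``, or nothing at all depending on the search path, which is the ambiguity that makes restricting the feature to '-m' attractive.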

Steven Bethard wrote:
Agreed - it's getting horrid. As Pythonic as they think this is, they're completely forgetting the newb. So let's look at it from his point of view.

Say I'm a Python newb. I've written some modules and some executable Python scripts and I'm somewhat comfy with the language. (Of course, it only took me about two hours to get comfy - this is Python, after all.) I now want to write either:

1) A module that runs unit tests when it is run as a script, but not when it's just imported; or

2) A script that can be imported as a module when I need a few of its functions. (I should really split them into another module, but this is a use case.)

Now I have to import sys? Never seen that one... okay. Imported. Now, what's this Greek I have to write to test whether the script is the main script? How am I supposed to remember this? This is worse than fork()!

On the other hand, IMNSHO, either of the following two are just about perfect in terms of understandability and parsimony:

    def __main__():  # we really don't need args here
        # stuff

    if __main__:
        # stuff

Chances are, the first will be very familiar, but refreshing that it's just a plain old, gibberish-free function. Both are easier than what we've got currently. (IMO, the first is better, because 1) the code can be put anywhere in the module; 2) it automatically doesn't pollute the global namespace; and 3) it's less boilerplate for complex modules and no more boilerplate for simple ones.)

FWIW, I don't see a problem with a sys.modules['__main__'] - it would even occasionally be useful - but nobody should be *required* to use an abomination like that for what's clearly a newbie task: determining whether a module is run as a script.

Neil

Neil Toronto wrote:
I think __main__(*argv) has some benefits over __main__(). It allows you to call the function with different arguments from another script or a unit test.

    def __main__(argv=None):
        if argv is None:
            argv = sys.argv

has the same effect but it is ugly.
I see the problem in having the same module under two names in sys.modules. It may lead to issues (reload?). Also it is not necessary to get the main module if we store the dotted name in sys.main. So sys.modules[sys.main] would return the main module. Christian

On 4/22/07, Jim Jewett <jimjjewett@gmail.com> wrote:
Yes, because you have no guarantee __file__ will in any way be unique or even defined (look at 'sys'). It's up to the loader to set __file__ and it can do whatever it wants. This doesn't happen with __name__ since it is rather clear what that should be no matter where the module was loaded from (unless it was a Python file specified at the command line in some random directory). -Brett

Brett Cannon schrieb:
What about

    import sys
    if __name__ == sys.main:
        ...

You won't have to introduce a new global module var __main__ and it's easy to understand for newbies and experienced developers. The code is only executed when the name of the current module is equal to the executed main module (sys.main).

IMO it's much less PIT...B than introducing __main__.

Christian

On 4/20/07, Christian Heimes <lists@cheimes.de> wrote:
True, but it does introduce an import for a module that may never be used if the module is not being executed. That kind of sucks for minor performance reasons. But what do other people think? -Brett

Brett Cannon schrieb:
Yeah but sys is used by a lot of modules. Probably 95%+ of executable modules are either using sys directly to access sys.argv or os which imports sys. Also sys is a builtin module which is imported ridiculously fast. I assume that the speed penalty for scripts that don't use sys is minor. In my humble opinion it sucks less to force the import of a core module that is already used by most modules than to bind valuable developer time in the __main__ approach. I think it's a Pythonic solution as well. :) Christian

On Fri, Apr 20, 2007, Brett Cannon wrote:
Looks good to me! sys is essentially guaranteed to be imported, so you're only wasting a few cycles to bring it into the module namespace. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Help a hearing-impaired person: http://rule6.info/hearing.html

On 4/20/07, Christian Heimes <lists@cheimes.de> wrote:
But you have to understand a few things to understand why this works. You have to know that __name__ is the name of the module, and that if you want to find out the name of the main module, you need to look at sys.main. With the idiom::

    if __main__:

all you need to know is that the main module has __main__ set to true.
IMO it's much less PIT...B than introducing __main__.
Could you elaborate? Do you think it would be hard to introduce another module-level attribute (like we already do for __name__)? Or do you think that the code would be hard to maintain? Or something else...?

Steve

Steven Bethard wrote:
This is just my humble opinion. I'm new to Python core development. Well, in my opinion a new module level var like __main__ isn't worth to add when it is just boolean flag. With the proposed addition of sys.main the same information is available with just few more characters to type. If I recall correctly Python is trying to get rid of global variables in Python 3000. I don't think it's hard to add - even for me although I know less about the Python core. I'm more worried about the side effect when people have already used __main__ as a function. The problem is in 2to3. If you like to introduce __main__ why not implement http://www.python.org/dev/peps/pep-0299 ? It proposes a __main__(*argv*) module level function that replaced the "if __name__ == '__main__'" idiom. The __main__ function follows the example of other programming languages like C, C# and Java. I'm aware of the fact that the PEP was rejected but I think it's worth to discuss it again. Christian

On 4/21/07, Christian Heimes <lists@cheimes.de> wrote:
I don't like the __main__ function signature. There are lots of options, like optparse and argparse_ that are much better than manually parsing sys.argv as the PEP 299 signature would suggest. And if there's nothing to be passed to the function, why make it a function at all? Personally, I thought one of the pluses of the current status quo (as well as what Brett is proposing here) is that it *didn't* follow in the (misplaced IMHO) footsteps of languages like C and Java. I think we're probably best letting dead PEPs lie. .. _argparse: http://argparse.python-hosting.com/ STeVe -- I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a tiny blip on the distant coast of sanity. --- Bucky Katt, Get Fuzzy

Steven Bethard wrote:
I agree that optparse and argparse are better ways to parse a command line than using sys.argv directly, but nothing in PEP299 would prevent you from using them. In fact, I am pretty sure that with a suitable decorator on __main__ you could make their use even simpler.
I find it very sad that PEP299 did in fact die, because I think it is a much cleaner solution than the proposal that started this thread. That said, I would like to see a way to remove the __name__=='__main__' weirdness. I am +1 on resurrecting PEP299, but also +1 on adding a "sys.main" that could be used in a new "if __name__ == sys.main". I am -1 on adding a builtin/global __main__ as proposed, because that would clash with my own PEP299-like use of that name.
Jacob -- Jacob Holm CTO Improva ApS

Jacob Holm wrote:
I had at one time (about 4 years ago) thought it was a bit strange. But that was only for a very short while.
Python differs from other languages in a very important way. Python *always* starts at the top of the file and works its way down until it falls off the bottom. What it does in between the top and the bottom is entirely up to you. It's very dynamic.
Other languages *compile* all the code first without executing any of it. Then you are required to tell the compiler where the program will start, which is why you need to define a main() function.
In Python, letting control fall off the bottom in order to start again at some place in the middle doesn't make much sense. It's already started, so you don't need to do that.
Cheers, Ron

Ron Adam wrote:
To clarify: By "weirdness" here I meant the fact that the name of a module changes when it is used as the main module.
I know all that.
There are a number of reasons to want to use a function for the main part of the code, instead of putting it in an "if" at the end of the module. Two simple ones are:
- Keeping the module namespace clean.
- The ability to call the function from other code, most likely with different args.
Since I am usually writing such a function anyway, I would prefer not to have to write the "if" boilerplate at the bottom in order to get it called.
Oh, and automatically calling a __main__ function if it exists does not prevent people who like the current "if" approach from using that. It would just make *my* life that tiny bit easier. Therefore I would like to keep that door open by *not* adding the proposed __main__ variable at this point. Fortunately, the people that matter here seem to think avoiding the extra variable is a good idea (although for different reasons).
Jacob -- Jacob Holm CTO Improva ApS
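
The "call ``__main__`` automatically if it exists" behaviour described above can be approximated from outside the interpreter. The following is only a sketch under stated assumptions: ``run_with_main_hook`` and the run name ``"__pep299__"`` are hypothetical, and Python itself does nothing like this. It uses the standard ``runpy.run_path`` function to execute a file and then looks for a callable named ``__main__`` in the resulting namespace.

```python
import runpy
import sys


def run_with_main_hook(path, argv=None):
    """Run the file at `path`, then call its __main__() if it defines one.

    Hypothetical helper that emulates the PEP 299 idea; not a real
    interpreter feature.
    """
    argv = list(sys.argv if argv is None else argv)
    # The run name is deliberately not "__main__", so an existing
    # `if __name__ == "__main__":` block in the file stays dormant.
    namespace = runpy.run_path(path, run_name="__pep299__")
    hook = namespace.get("__main__")
    if callable(hook):
        # Pass the argument list positionally, as a *argv signature expects.
        return hook(*argv)
    return None
```

A file that defines no ``__main__`` function is simply executed from top to bottom, so code written in the current "if" style would need a different run name to keep working under such a scheme.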

Steven Bethard wrote:
if there's nothing to be passed to the function, why make it a function at all?
I don't usually like to put big lumps of init code at the module level, because it pollutes the module namespace with local variables. So I typically end up with

    def main():
        ...
        ...
        ...

    if __name__ == "__main__":
        main()

So I'd be quite happy if I could just define a function called __main__() and be done with it. I don't understand why there's so much opposition to that idea.
-- Greg

On 4/22/07, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
+1. Although I may start out at the module level, that's typically the idiom I use eventually for any non-trivial (e.g. more than 1-2 lines) main*.
George
* Only exception is if the module consists essentially of main(), i.e. a small standalone script without classes, functions, etc.

On 4/22/07, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
I guess I'm just the odd one out here in that I parse my arguments before passing them to module-level functions. So my code normally looks like::

    if __name__ == '__main__':
        ... a few lines of argument parsing code ...
        some_function_name(args.foo, args.bar, args.baz)

That is, I do the argument parsing at the module level, and then call the module functions with more meaningful arguments than sys.argv.
STeVe

On 4/19/07, Brett Cannon <brett@python.org> wrote:
Part of me says that you are already proposing the right answer, as these alternatives are just a little too hackish. Still, they are good enough that they should be listed in the PEP, even if only as rejected alternatives.

(1) You could add a builtin __main__ that is false. The real main module would mask it, but no other code would need to change.
    Con: Another builtin, and this one wouldn't even make sense as an independent object.
(2) You could special-case the import to use __file__ instead of __name__ when __name__ == "__main__".
    Con: may be more fragile.
(3) You could set __name__ to (an instance of) a funky string subclass that overrides __eq__.
    Con: may be hard to find exactly the *right* behavior. Examples: What should str(name) do? Maybe __main__ should be the primary value, and split should be overridden?

-jJ
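
To make alternative (3) concrete, here is a minimal sketch of such a string subclass. The class name ``MainModuleName`` is hypothetical and nothing like it exists in Python; the design choice shown (the real dotted name is the primary value, and equality is widened to accept ``"__main__"``) is only one of the possible behaviours Jim alludes to.

```python
class MainModuleName(str):
    """Hypothetical str subclass for alternative (3).

    The value is the module's real dotted name, but instances also
    compare equal to "__main__" so the old idiom keeps working.
    """

    def __eq__(self, other):
        # Accept the legacy spelling in addition to the real name.
        if isinstance(other, str) and str.__eq__(other, "__main__"):
            return True
        return str.__eq__(self, other)

    def __ne__(self, other):
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    # Defining __eq__ disables the inherited hash, so restore it.
    # The hash is that of the real name, not of "__main__".
    __hash__ = str.__hash__


name = MainModuleName("bacon.ham.beans")
is_main = (name == "__main__")           # the old idiom still matches
package = name.rsplit(".", 2)[0]         # resolving `from .. import spam`
```

The sketch also shows why the right behaviour is hard to pin down: the object compares equal to ``"__main__"`` but hashes like ``"bacon.ham.beans"``, so it breaks the usual rule that equal objects have equal hashes, and a dictionary lookup keyed on ``"__main__"`` would not find it.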