[Python-3000] PEP to change how the main module is delineated

Brett Cannon brett at python.org
Mon Apr 23 21:05:24 CEST 2007


On 4/22/07, Steven Bethard <steven.bethard at gmail.com> wrote:
> On 4/22/07, Brett Cannon <brett at python.org> wrote:
> > Implementation
> > ==============
> >
> > When the ``-m`` option is used, ``sys.main`` will be set to the
> > argument passed in.  ``sys.argv`` will be adjusted as it is currently.
> > Then the equivalent of ``__import__(self.main)`` will occur.  This
> > differs from current semantics as the ``runpy`` module fetches the
> > code object for the file specified by the module name in order to
> > explicitly set ``__name__`` and other attributes.  This is no longer
> > needed as import can perform its normal operation in this situation.
> >
> > If a file name is specified, then ``sys.main`` will be set to
> > ``"__main__"``.  The specified file will then be read and have a code
> > object created and then be executed with ``__name__`` set to
> > ``"__main__"``.  This mirrors current semantics.
>
> So __name__ will still sometimes be "__main__". That's disappointing.
>

Yeah, I know.  But I can't force people to only run Python scripts
that happen to be on sys.path.

> To clarify, assuming that the foo.bar module contains something like
> "from . import baz", this PEP only addresses the following situation::
>
>     python -mfoo.bar
>
> and all of the following will still raise ImportErrors::
>
>     ~> python foo/bar
>     ~/grok> python ../foo/bar
>     ~/foo> python bar
>
> Right?

It could handle the last one with no problem.  It could handle foo/bar
too, if inferring the module name were deemed worth it.  The middle
one *might* be solvable with some checking of sys.path and walking up
the directory tree.

In other words, they are all possible if we want to pay the startup
cost of trying to infer the module's name.

>
> If that's right, I'm -1 on the proposal. It's complicating the
> standard "am I the main module?" idiom to solve a tiny little problem
> with "-m".

It's not a '-m' problem, it's a relative import problem.  This issue
occurs no matter what solution you go with, even with the specially
named main() function idea some people are tossing around.
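
A minimal sketch of the underlying issue, assuming a current Python 3
interpreter (whose ``runpy`` behaviour differs from the interpreters
under discussion in this thread).  The package ``demo_pkg`` and the
function ``demo_relative_import`` are hypothetical names used only for
illustration.  The same file is executed once by path and once by
dotted name; only the second run knows its parent package.

```python
import importlib
import os
import runpy
import sys
import tempfile


def demo_relative_import():
    """Run one module by file path and by dotted name; report each outcome."""
    results = {}
    with tempfile.TemporaryDirectory() as root:
        # Hypothetical layout: demo_pkg/__init__.py, baz.py, bar.py
        pkg = os.path.join(root, "demo_pkg")
        os.mkdir(pkg)
        files = [
            ("__init__.py", ""),
            ("baz.py", "VALUE = 42\n"),
            ("bar.py", "from . import baz\nresult = baz.VALUE\n"),
        ]
        for name, body in files:
            with open(os.path.join(pkg, name), "w") as f:
                f.write(body)

        # By file path: the code runs with no known parent package.
        try:
            runpy.run_path(os.path.join(pkg, "bar.py"), run_name="__main__")
            results["by_path"] = "ok"
        except ImportError:
            results["by_path"] = "ImportError"

        # By dotted name: the parent package is known to the import system.
        sys.path.insert(0, root)
        importlib.invalidate_caches()
        try:
            ns = runpy.run_module("demo_pkg.bar", run_name="__main__")
            results["by_name"] = ns["result"]
        finally:
            sys.path.remove(root)
            for mod in [m for m in sys.modules
                        if m.split(".")[0] == "demo_pkg"]:
                del sys.modules[mod]
    return results
```

The relative import fails whenever the interpreter cannot tell which
package the executed file belongs to, regardless of how the "am I the
main module?" check is spelled.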

> The "am I the main module?" idiom has been around much
> longer than the "-m" flag, and I'd prefer to see a more compelling
> reason to change it.
>

When you start using relative imports exclusively and find you have to
tweak your imports to execute certain modules you might change your
mind.  =)

> Two reasons that would be compelling for me:
>
> * Simplifying the "am I the main module?" idiom, e.g. with the
> rejected ``if __main__`` proposal.
>
> * Getting rid of the "__main__" value for __name__ entirely. This
> would require code like the following to determine the name of a
> module given at the command line::
>
>     import os
>     import sys
>
>     def get_name(path):
>         # Map a file path to a dotted name relative to a sys.path entry.
>         sys_path_set = set(sys.path)
>         path, _ = os.path.splitext(path)
>         path = os.path.abspath(path)
>         parts = path.split(os.path.sep)
>         for i in range(1, len(parts)):
>             sub_path = os.path.sep.join(parts[:i])
>             if sub_path in sys_path_set:
>                 return '.'.join(parts[i:])
>         return parts[-1]
>
>   Even if we can't do the name resolution perfectly, if the flaws are
> documented (e.g. when '..' and symbolic links are combined) I think
> that would be enough better than the current situation to merit
> changing the standard "am I the main module?" idiom.
>

As I said in the PEP, I rejected this because of startup cost.  If
people are willing to pay for it, then we can go with it.  And if we
just assume that, when the current directory is not a prefix of the
module's path, we use the filename to set the module's name and
sys.main, that's fine as well.  It just is not as accurate as it might
be.
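
A rough sketch of that cheap rule, under the assumption that "prefix"
means the file lives somewhere below the current directory.  The
function name ``infer_name_from_cwd`` is hypothetical; nothing like it
is part of the PEP.  It only manipulates path strings and never
touches the filesystem, which is what keeps it cheap.

```python
import os


def infer_name_from_cwd(path, cwd):
    """Return a dotted name if `path` lies under `cwd`, else the file name."""
    cwd = os.path.abspath(cwd)
    path = os.path.abspath(os.path.join(cwd, path))
    stem, _ = os.path.splitext(path)
    parts = os.path.relpath(stem, cwd).split(os.sep)
    if os.pardir in parts:
        # cwd is not a prefix of the module's path: use the bare file name.
        return os.path.basename(stem)
    return ".".join(parts)
```

Note that this does not check for ``__init__.py``, so it can produce a
dotted name for a directory that is not actually a package.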

So the questions for people to weigh in on are:

* For spam/bacon.py with spam/__init__.py existing:

  - Should we infer it as spam.bacon?
    Startup cost, but accurate name allows for relative imports.
  - Should we infer as bacon?
    Cheap but breaks relative imports.
  - Should we infer the name as __main__?
    Status quo and breaks relative imports.

* For spam/bacon.py with *no* spam/__init__.py:

  - Should we infer the name as bacon?
    No relative imports possible since not in a package.
  - Should we infer the name as __main__?
    Status quo and no relative import issue as not in a package.

* For ../bacon.py:

  - With ../__init__.py defined and ../../ on sys.path:

    + Should we infer the name as bacon?
      Simple, but breaks relative imports.
    + Should we walk up the directories until __init__.py is not
      found and check for that directory on sys.path?
      Costly startup, but allows for relative imports to work.
    + Should we infer the name as __main__?
      Status quo and relative imports don't work.

  - Without ../__init__.py:

    + Infer as bacon?
      No relative imports possible so fine.
    + Infer as __main__?
      Status quo, and no relative imports even possible.
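
The "walk up the directories" option above could look roughly like the
following.  This is only a sketch: ``infer_package_name`` and its
``search_path`` parameter are hypothetical, and returning ``None`` when
no sys.path entry matches is an assumption that leaves the fallback
(bare file name or ``__main__``) to the caller.

```python
import os
import sys


def infer_package_name(path, search_path=None):
    """Walk up from `path` while __init__.py exists; return a dotted name.

    Returns None if the directory above the outermost package is not on
    the search path.
    """
    if search_path is None:
        search_path = sys.path
    # An empty entry on sys.path stands for the current directory.
    roots = {os.path.realpath(p or os.curdir) for p in search_path}
    stem, _ = os.path.splitext(os.path.realpath(path))
    directory, name = os.path.split(stem)
    parts = [name]
    # Each stat() call here is the startup cost being debated.
    while os.path.isfile(os.path.join(directory, "__init__.py")):
        directory, package = os.path.split(directory)
        if not package:  # reached the filesystem root
            break
        parts.append(package)
    if directory in roots:
        return ".".join(reversed(parts))
    return None
```

For a file outside any package the loop body never runs, so the result
is the bare file name when its directory is on the search path.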


If people want name inference, we can ditch __main__ entirely and go
with a simple rule that costs more at startup.  Otherwise we can stick
with __main__ as the PEP has proposed and have a partial solution.  Or
we can ditch the PEP.

-Brett

