A c.l.p discussion referenced from Python-URL just brought this topic back to my attention, and with the relatively low traffic on the development lists in the last few days, it seemed like a good time to repost this PEP (it vanished beneath the Unicode identifier discussion last time).
Cheers, Nick.
PEP: 366
Title: Main module explicit relative imports
Version: $Revision: 56172 $
Last-Modified: $Date: 2007-07-04 22:47:13 +1000 (Wed, 04 Jul 2007) $
Author: Nick Coghlan <ncoghlan@gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 1-May-2007
Python-Version: 2.6
Post-History: 1-May-2007
This PEP proposes a backwards compatible mechanism that permits the use of explicit relative imports from executable modules within packages. Such imports currently fail due to an awkward interaction between PEP 328 and PEP 338 - this behaviour is the subject of at least one open SF bug report (#1510172)[1], and has most likely been a factor in at least a few queries on comp.lang.python (such as Alan Isaac's question in [3]).
With the proposed mechanism, relative imports will work automatically if the module is executed using the -m switch. A small amount of boilerplate will be needed in the module itself to allow the relative imports to work when the file is executed by name.
(This section is taken from the final revision of PEP 338)
The release of 2.5b1 showed a surprising (although obvious in retrospect) interaction between PEP 338 and PEP 328 - explicit relative imports don't work from a main module. This is due to the fact that relative imports rely on __name__ to determine the current module's position in the package hierarchy. In a main module, the value of __name__ is always '__main__', so explicit relative imports will always fail (as they only work for a module inside a package).
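The failure can be illustrated with a small sketch of name-based resolution. This is a simplified model for illustration only, not the interpreter's actual import code; the function name resolve_relative and its parameters are hypothetical.

```python
def resolve_relative(importer_name, level, target, is_package=False):
    """Resolve an explicit relative import using only the importer's __name__.

    Hypothetical helper: a simplified model, not the real import machinery.
    level is the number of leading dots in the import statement.
    """
    # A package is its own base; a plain module drops its last name component.
    package = importer_name if is_package else importer_name.rpartition(".")[0]
    if not package:
        # '__main__' (or any top level module) ends up here.
        raise ValueError("attempted relative import in non-package")
    parts = package.split(".")
    if level - 1 >= len(parts):
        raise ValueError("attempted relative import beyond top level package")
    base = ".".join(parts[: len(parts) - (level - 1)])
    return base + "." + target if target else base
```

Called with the importer name "pkg.test.test_A", a level of 2 and the target "moduleA", the sketch returns "pkg.moduleA". Called with the importer name "__main__", it raises ValueError, because the name carries no package information at all.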
Investigation into why implicit relative imports appear to work when
a main module is executed directly but fail when executed using -m
showed that such imports are actually always treated as absolute
imports. Because of the way direct execution works, the package
containing the executed module is added to sys.path, so its sibling
modules are actually imported as top level modules. This can easily
lead to multiple copies of the sibling modules in the application if
implicit relative imports are used in modules that may be directly
executed (e.g. test modules or utility scripts).
For the 2.5 release, the recommendation is to always use absolute imports in any module that is intended to be used as a main module. The -m switch already provides a benefit here, as it inserts the current directory into sys.path, instead of the directory containing the main module. This means that it is possible to run a module from inside a package using -m so long as the current directory contains the top level directory for the package. Absolute imports will work correctly even if the package isn't installed anywhere else on sys.path. If the module is executed directly and uses absolute imports to retrieve its sibling modules, then the top level package directory needs to be installed somewhere on sys.path (since the current directory won't be added automatically).
Here's an example file layout::
    devel/
        pkg/
            __init__.py
            moduleA.py
            moduleB.py
            test/
                __init__.py
                test_A.py
                test_B.py
So long as the current directory is devel, or devel is already on sys.path, and the test modules use absolute imports (such as import pkg.moduleA) to retrieve the module under test, PEP 338 allows the tests to be run as::
    python -m pkg.test.test_A
    python -m pkg.test.test_B
In rejecting PEP 3122 (which proposed a higher impact solution to this problem), Guido has indicated that he still isn't particularly keen on the idea of executing modules inside packages as scripts [2]. Despite these misgivings he has previously approved the addition of the -m switch in Python 2.4, and the runpy module based enhancements described in PEP 338 for Python 2.5.
The philosophy that motivated those previous additions (i.e. access to utility or testing scripts without needing to worry about name clashes in either the OS executable namespace or the top level Python namespace) is also the motivation behind fixing what I see as a bug in the current implementation.
This PEP is intended to provide a solution which permits explicit relative imports from main modules, without incurring any significant costs during interpreter startup or normal module import.
The heart of the proposed solution is a new module attribute __package_name__. This attribute will be defined only in the main module (i.e. modules where __name__ == "__main__"). For a directly executed main module, this attribute will be set to the empty string. For a module executed using runpy.run_module() with the run_name parameter set to "__main__", the attribute will be set to mod_name.rpartition('.')[0] (i.e., everything up to but not including the last period).
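As a quick illustration of the rpartition expression, here is what it yields for a module inside a package and for a top level module. The helper name package_name_for is hypothetical and exists only for this example.

```python
def package_name_for(mod_name):
    # Everything up to, but not including, the last period.
    # str.rpartition returns ('', '', mod_name) when there is no period,
    # so a top level module yields the empty string.
    return mod_name.rpartition(".")[0]


print(package_name_for("pkg.test.test_A"))        # pkg.test
print(repr(package_name_for("toplevel_script")))  # ''
```

The empty string for a top level module matches the value proposed for a directly executed main module, so both cases read as "no package known".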
In the import machinery there is an error handling path which deals with the case where an explicit relative reference attempts to go higher than the top level in the package hierarchy. This error path would be changed to fall back on the __package_name__ attribute for explicit relative imports when the importing module is called "__main__".
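The fallback described above might be sketched roughly as follows. This is a pure Python model under simplifying assumptions (the real change would live in the interpreter's import code, and the case where the importer is itself a package is ignored); the function name find_package is hypothetical.

```python
def find_package(module_globals):
    """Return the package to use as the base for explicit relative imports.

    Hypothetical model of the proposed behaviour, not the actual implementation.
    """
    name = module_globals["__name__"]
    package = name.rpartition(".")[0]
    if package:
        # Normal path: unchanged, so ordinary imports pay no extra cost.
        return package
    # Existing error path: the module name gives no package information.
    if name == "__main__":
        fallback = module_globals.get("__package_name__", "")
        if fallback:
            return fallback
    raise ValueError("attempted relative import in non-package")
```

Only the branch that previously raised an error consults __package_name__, which is why the proposal leaves normal imports unaffected.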
With this change, explicit relative imports will work automatically from a script executed with the -m switch. To allow direct execution of the module, the following boilerplate would be needed at the top of the script::
    if __name__ == "__main__" and not __package_name__:
        __package_name__ = "<expected_pkg_name>"
Note that this boilerplate is sufficient only if the top level package is already accessible via sys.path. Additional code that manipulates sys.path would be needed in order for direct execution to work without the top level package already being on sys.path.
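One possible shape for that additional sys.path code is sketched below. It is an assumption about how a script author might do it, not part of the proposal; the helper name ensure_top_level_on_path is hypothetical, and it assumes the script sits directly inside the package named by package_name.

```python
import os
import sys


def ensure_top_level_on_path(script_path, package_name):
    """Put the directory containing the top level package on sys.path.

    Hypothetical helper: walks up one directory per package component,
    starting from the directory that holds the script.
    """
    directory = os.path.dirname(os.path.abspath(script_path))
    for _ in package_name.split("."):
        directory = os.path.dirname(directory)
    if directory not in sys.path:
        sys.path.insert(0, directory)
    return directory
```

For the example layout, calling this from devel/pkg/test/test_A.py with the script's __file__ and the package name "pkg.test" would add the devel directory to sys.path. Like the boilerplate itself, the call has to be updated by hand if the script moves to a different package.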
This approach also has the same disadvantage as the use of absolute imports of sibling modules - if the script is moved to a different package or subpackage, the boilerplate will need to be updated manually.
With this feature in place, the test scripts in the package above would be able to change their import lines to something along the lines of from .. import moduleA (PEP 328 only allows the from form for explicit relative imports). The scripts could then be executed unmodified even if the name of the package was changed.
(Rev 47142 in SVN implemented an early variant of this proposal which stored the main module's real module name in the '__module_name__' attribute. It was reverted due to the fact that 2.5 was already in beta by that time.)
PEP 3122 proposed addressing this problem by changing the way the main module is identified. That's a huge compatibility cost to incur to fix something that is a pretty minor bug in the overall scheme of things.
The advantage of the proposal in this PEP is that its only impact on normal code is the tiny amount of time needed at startup to set the extra attribute in the main module. The changes to the import machinery are all in an existing error handling path, so normal imports don't incur any performance penalty at all.
.. [1] Absolute/relative import not working? (http://www.python.org/sf/1510172)
.. [2] Guido's rejection of PEP 3122 (http://mail.python.org/pipermail/python-3000/2007-April/006793.html)
.. [3] c.l.p. question about modules and relative imports (http://groups.google.com/group/comp.lang.python/browse_thread/thread/c44c769...)
This document has been placed in the public domain.
--
http://www.boredomandlaziness.org
I see no big problems with this, except I wonder if in the end it wouldn't be better to always define __package_name__ instead of only when it's in main? And then perhaps rename it to __package__? Done properly it could always be used for relative imports, rather than parsing __module__ to find the package. Then you won't even need the error handler.
FWIW, I find the PEP is rather wordy for such a simple proposal (it took me more time to find the proposal than to understand it :-).
--Guido
-- --Guido van Rossum (home page: http://www.python.org/~guido/)
Oh, one more thing. Perhaps we should rename it, like the other PEPs still active slated for inclusion in Py3k (and backporting to 2.6)?
--Guido
Guido van Rossum wrote:
Oh, one more thing. Perhaps we should rename it, like the other PEPs still active slated for inclusion in Py3k (and backporting to 2.6)?
Might as well be consistent - I'll take care of that when I update the PEP based on your suggestions.
On 7/5/07, Guido van Rossum guido@python.org wrote:
I see no big problems with this, except I wonder if in the end it wouldn't be better to always define __package_name__ instead of only when it's in main? And then perhaps rename it to __package__? Done properly it could always be used for relative imports, rather than parsing __module__ to find the package. Then you won't even need the error handler.
I'll have a look at what would be involved in always defining __package__ and using it for relative imports. I suspect it will be a slightly bigger change than the current PEP (i.e. more lines/files touched), but not significantly so.
FWIW, I find the PEP is rather wordy for such a simple proposal (it took me more time to find the proposal than to understand it :-).
Yeah, I still haven't come up with a particularly concise way of explaining why relative imports don't currently work in main modules.
I'll rearrange the PEP to put the proposed fix before the detailed explanation of the problem (in fact, given that the latter is mainly of historical interest now, I may just include a pointer to the relevant section of PEP 338).
Cheers, Nick.
--
http://www.boredomandlaziness.org