At 11:53 AM 7/5/2007 +0200, Guido van Rossum wrote:
I see no big problems with this, except I wonder if in the end it wouldn't be better to always define __package_name__ instead of only when it's in main? And then perhaps rename it to __package__? Done properly it could always be used for relative imports, rather than parsing __module__ to find the package. Then you won't even need the error handler.
+1 for __package__, and putting it everywhere. Relative import should use it first if present, falling back to use of __name__.
+1 from me as well.
-Brett
Josiah Carlson wrote:
This would solve some issues I'm currently having with relative imports. +1
I've updated the PEP to incorporate the feedback from this thread. The new version is below, and should show up on the website shortly.
Cheers, Nick.
PEP: 366
Title: Main module explicit relative imports
Version: $Revision: 56190 $
Last-Modified: $Date: 2007-07-08 17:45:46 +1000 (Sun, 08 Jul 2007) $
Author: Nick Coghlan ncoghlan@gmail.com
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 1-May-2007
Python-Version: 2.6, 3.0
Post-History: 1-May-2007, 4-Jul-2007, 7-Jul-2007
This PEP proposes a backwards compatible mechanism that permits the use of explicit relative imports from executable modules within packages. Such imports currently fail due to an awkward interaction between PEP 328 and PEP 338.
By adding a new module level attribute, this PEP allows relative imports to work automatically if the module is executed using the -m switch. A small amount of boilerplate in the module itself will allow the relative imports to work when the file is executed by name.
The major proposed change is the introduction of a new module level attribute, __package__. When it is present, relative imports will be based on this attribute rather than the module __name__ attribute.
As with the current __name__ attribute, setting __package__ will be the responsibility of the PEP 302 loader used to import a module. Loaders which use imp.new_module() to create the module object will have the new attribute set automatically to __name__.rpartition('.')[0].
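As a rough illustration of the default described above, the expression can be wrapped in a small helper. The function name default_package is made up for this sketch and is not part of the proposal or of the imp module.

```python
def default_package(module_name):
    """Return the text before the last dot in a dotted module name."""
    # str.rpartition splits on the last '.', giving (head, separator, tail).
    # A name without any dot produces an empty head.
    return module_name.rpartition('.')[0]


print(default_package("A.B.C"))
print(repr(default_package("toplevel")))
```

Note that for a name such as "A.B" the helper returns "A" whether or not A.B is itself a package, because the name alone does not say which it is.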
runpy.run_module will also set the new attribute, basing it off the mod_name argument, rather than the run_name argument. This will allow relative imports to work correctly from main modules executed with the -m switch.
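A small sketch of the runpy behaviour described above, assuming an interpreter whose runpy.run_module sets __package__ this way (current Python releases do). The package name demo_pkg and the files written into it are invented for the example.

```python
import os
import runpy
import sys
import tempfile


def run_demo():
    """Build a throwaway package on disk and run one of its modules as main."""
    with tempfile.TemporaryDirectory() as root:
        pkg_dir = os.path.join(root, "demo_pkg")
        os.mkdir(pkg_dir)
        files = {
            "__init__.py": "",
            "helper.py": "VALUE = 42\n",
            # The explicit relative import is the case the proposal targets.
            "script.py": "from .helper import VALUE\nRESULT = VALUE + 1\n",
        }
        for filename, source in files.items():
            with open(os.path.join(pkg_dir, filename), "w") as f:
                f.write(source)

        sys.path.insert(0, root)
        try:
            # Roughly what "python -m demo_pkg.script" does: mod_name is
            # "demo_pkg.script" while run_name is "__main__".
            return runpy.run_module("demo_pkg.script", run_name="__main__")
        finally:
            # Undo the side effects so the demo leaves no trace behind.
            sys.path.remove(root)
            for name in [n for n in sys.modules if n.split(".")[0] == "demo_pkg"]:
                del sys.modules[name]


namespace = run_demo()
print(namespace["__name__"], namespace["__package__"], namespace["RESULT"])
```

The returned namespace has __name__ taken from run_name and __package__ derived from mod_name, which is what lets the relative import inside script.py resolve.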
When the main module is specified by its filename, then the __package__ attribute will be set to the empty string. To allow relative imports when the module is executed directly, boilerplate similar to the following would be needed before the first relative import statement:
    if __name__ == "__main__" and not __package__:
        __package__ = "<expected_package_name>"
Note that this boilerplate is sufficient only if the top level package is already accessible via sys.path. Additional code that manipulates sys.path would be needed in order for direct execution to work without the top level package already being importable.
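One possible shape for the "additional code that manipulates sys.path" is sketched here. The helper names top_level_dir and make_importable are hypothetical, and the sketch assumes the script lives directly inside the directory of the package named by package_name.

```python
import os
import sys


def top_level_dir(script_path, package_name):
    """Return the directory that contains the top level package."""
    directory = os.path.dirname(os.path.abspath(script_path))
    # Climb one directory for each component of the dotted package name.
    for _ in package_name.split("."):
        directory = os.path.dirname(directory)
    return directory


def make_importable(script_path, package_name):
    """Put the top level package's parent directory on sys.path if absent."""
    root = top_level_dir(script_path, package_name)
    if root not in sys.path:
        sys.path.insert(0, root)
    return root
```

A directly executed script would call something like make_importable(__file__, "<expected_package_name>") just before the boilerplate. Like the boilerplate itself, the package name passed in has to be updated by hand if the script moves.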
This approach also has the same disadvantage as the use of absolute imports of sibling modules - if the script is moved to a different package or subpackage, the boilerplate will need to be updated manually.
The current inability to use explicit relative imports from the main module is the subject of at least one open SF bug report (#1510172)[1], and has most likely been a factor in at least a few queries on comp.lang.python (such as Alan Isaac's question in [2]).
This PEP is intended to provide a solution which permits explicit relative imports from main modules, without incurring any significant costs during interpreter startup or normal module import.
The section in PEP 338 on relative imports and the main module provides further details and background on this problem.
Rev 47142 in SVN implemented an early variant of this proposal which stored the main module's real module name in the '__module_name__' attribute. It was reverted due to the fact that 2.5 was already in beta by that time.
A new patch will be developed for 2.6, and forward ported to Py3k via svnmerge.
PEP 3122 proposed addressing this problem by changing the way the main module is identified. That's a significant compatibility cost to incur to fix something that is a pretty minor bug in the overall scheme of things, and the PEP was rejected [3].
The advantage of the proposal in this PEP is that its only impact on normal code is the small amount of time needed to set the extra attribute when importing a module. Relative imports themselves should be sped up fractionally, as the package name is stored in the module globals, rather than having to be worked out again for each relative import.
.. [1] Absolute/relative import not working? (http://www.python.org/sf/1510172)
.. [2] c.l.p. question about modules and relative imports (http://groups.google.com/group/comp.lang.python/browse_thread/thread/c44c769...)
.. [3] Guido's rejection of PEP 3122 (http://mail.python.org/pipermail/python-3000/2007-April/006793.html)
This document has been placed in the public domain.
--
http://www.boredomandlaziness.org
On 7/8/07, Nick Coghlan ncoghlan@gmail.com wrote:
As with the current __name__ attribute, setting __package__ will be the responsibility of the PEP 302 loader used to import a module. Loaders which use imp.new_module() to create the module object will have the new attribute set automatically to __name__.rpartition('.')[0].
Is this really the best semantics for this? Let's say I have A/B/__init__.py and A/B/C.py. With these semantics I would have A.B having __package__ be 'A' and A.B.C having 'A.B'.
While I agree that the A.B.C setting is correct, is the A.B value what is truly desired? Is an __init__ module really to be considered part of the above package? I always viewed the __init__ module as part of its own package. Thus I expected A.B to have __package__ set to 'A.B'.
Beyond just what I expected, the reason I bring this up is that if __package__ had the semantics I am suggesting, it is trivial to discover which modules are package __init__ modules (as __package__ == __name__) as opposed to submodules (__package__ and __package__ != __name__). As of right now you can only do that if you examine __file__ for __init__.py(c), but that is highly dependent on how the module was loaded. It might be nice if what kind of module something is (top-level, package, or submodule) could be determined from its metadata.
-Brett
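The two checks Brett describes can be sketched as a small classifier. This assumes the semantics he is suggesting (a package's __package__ equals its own __name__); the function name classify and its return strings are invented for the illustration.

```python
def classify(name, package):
    """Classify a module from its __name__ and __package__ values."""
    if package == name:
        # A package's __init__ module: __package__ == __name__.
        return "package"
    if package:
        # Lives inside a package: __package__ and __package__ != __name__.
        return "submodule"
    # Empty or missing __package__: not inside any package.
    return "top-level"


for name, package in [("A.B", "A.B"), ("A.B.C", "A.B"), ("standalone", "")]:
    print(name, "->", classify(name, package))
```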
Brett Cannon wrote:
Is this really the best semantics for this? Let's say I have A/B/__init__.py and A/B/C.py. With these semantics I would have A.B having __package__ be 'A' and A.B.C having 'A.B'.
While I agree that the A.B.C setting is correct, is the A.B value what is truly desired? Is an __init__ module really to be considered part of the above package? I always viewed the __init__ module as part of its own package. Thus I expected A.B to have __package__ set to 'A.B'.
Good point - PEP 328 makes it explicit that __init__.py is treated like any other module in the package for purposes of relative imports, so the semantics you suggest are the ones required. I hadn't actually thought about this case, as it wasn't relevant when the new attribute applied only to the main module.
However, those semantics mean that we won't be able to automatically add the new attribute inside imp.new_module(), as that function doesn't know whether or not the new module is a package.
Beyond just what I expected, the reason I bring this up is that if __package__ had the semantics I am suggesting, it is trivial to discover which modules are package __init__ modules (as __package__ == __name__) as opposed to submodules (__package__ and __package__ != __name__). As of right now you can only do that if you examine __file__ for __init__.py(c), but that is highly dependent on how the module was loaded. It might be nice if what kind of module something is (top-level, package, or submodule) could be determined from its metadata.
This part of the argument isn't relevant though, as it's already trivial to determine whether or not a module is a package by checking for a __path__ attribute. That's what PEP 302 specifies, and it is how the relative import machinery currently determines whether to use __name__ or __name__.rpartition('.')[0] as the base for relative imports.
Given the above limitations, I propose that we document the new attribute as follows:
"If the module global __package__ exists when executing an import statement, it is used to determine the base for relative imports, instead of the __name__ and __path__ attributes. This attribute may be set by the interpreter before a module is executed - whether or not it is set automatically in a given module is implementation dependent."
And for the CPython implementation, I propose that we set the new attribute:
This will allow any module which uses relative imports to benefit from the micro-optimisation of caching the package name in normal modules (regardless of how the module gets loaded), as well as allowing relative imports from the main module (which is the main goal of the PEP).
With the way PEP 302 hands off creation of the module and execution of its code to the loader objects, I don't see any way to guarantee that __package__ will always be set - this seems like a reasonable compromise.
Cheers, Nick.
On 7/9/07, Nick Coghlan ncoghlan@gmail.com wrote:
Good point - PEP 328 makes it explicit that __init__.py is treated like any other module in the package for purposes of relative imports, so the semantics you suggest are the ones required. I hadn't actually thought about this case, as it wasn't relevant when the new attribute applied only to the main module.
However, those semantics mean that we won't be able to automatically add the new attribute inside imp.new_module(), as that function doesn't know whether or not the new module is a package.
Good point.
This part of the argument isn't relevant though, as it's already trivial to determine whether or not a module is a package by checking for a __path__ attribute. That's what PEP 302 specifies, and it is how the relative import machinery currently determines whether to use __name__ or __name__.rpartition('.')[0] as the base for relative imports.
=) The lesson here is be careful when emailing on vacation as you might not think everything through. =)
Given the above limitations, I propose that we document the new attribute as follows:
"If the module global __package__ exists when executing an import statement, it is used to determine the base for relative imports, instead of the __name__ and __path__ attributes.
That's fine. __path__ actually isn't used to resolve relative imports into absolute ones anyway; it's used only as a substitute to sys.path when importing within a package.
This attribute may be set by the interpreter before a module is executed - whether or not it is set automatically in a given module is implementation dependent."
And for the CPython implementation, I propose that we set the new attribute:
This will allow any module which uses relative imports to benefit from the micro-optimisation of caching the package name in normal modules (regardless of how the module gets loaded), as well as allowing relative imports from the main module (which is the main goal of the PEP).
With the way PEP 302 hands off creation of the module and execution of its code to the loader objects, I don't see any way to guarantee that __package__ will always be set - this seems like a reasonable compromise.
It could be set after the import, but you are right that the loaders will not necessarily set it before executing code. You might be able to use the reload requirement of using an existing dict if the module is in sys.modules somehow, but that just sounds like asking for trouble.
-Brett
Brett Cannon wrote:
That's fine. __path__ actually isn't used to resolve relative imports into absolute ones anyway; it's used only as a substitute to sys.path when importing within a package.
I was referring to the fact that if __path__ is present (indicating a package), then the relative import is based directly on __name__, otherwise it is based on __name__.rpartition('.')[0].
Cheers, Nick.
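The lookup order discussed in this exchange can be sketched in plain Python. This is only a model of the behaviour described in the thread, not the interpreter's actual code; the function names and the error messages are assumptions made for the example.

```python
def relative_import_base(module_globals):
    """Pick the package that relative imports are resolved against."""
    if "__package__" in module_globals:
        # Proposed rule: an existing __package__ takes precedence.
        return module_globals["__package__"]
    name = module_globals["__name__"]
    if "__path__" in module_globals:
        # A package (it has __path__): base is __name__ itself.
        return name
    # An ordinary module: base is the parent package's name.
    return name.rpartition(".")[0]


def resolve_relative(module_globals, target, level):
    """Turn a relative import of target with level leading dots into a full name."""
    base = relative_import_base(module_globals)
    if not base:
        raise ValueError("relative import attempted outside a package")
    parts = base.split(".")
    if level - 1 >= len(parts):
        raise ValueError("relative import goes beyond the top level package")
    if level > 1:
        # Each dot after the first climbs one package higher.
        parts = parts[:-(level - 1)]
    return ".".join(parts + [target])


main_with_package = {"__name__": "__main__", "__package__": "A.B"}
print(resolve_relative(main_with_package, "C", 1))
```

With only __name__ == "__main__" and no __package__, the base is empty and the sketch raises an error, which mirrors the failure the PEP sets out to fix.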
I'm in general in favor of this. I will accept it once there is a working implementation that is satisfactory.
Are we planning on supporting this in 2.6? It might break some 2.5 code that messes with modules and packages?
--Guido