Re: [Distutils] Setuptools: omit namespace package directories?
At 05:05 PM 2/8/2007 +0100, Thomas Lotze wrote:
Hi,
when using namespace packages, the corresponding package directories and __init__.py files must physically exist in the source tree, even though they can't, by definition of a namespace package, contain anything other than subordinate directories and a fixed stanza of Python code, resp.
I'm not sure that's entirely true, although they *do* have to exist when using "setup.py develop", or they won't be importable. Have you tried using the distutils package_dir mapping to remap your source tree?
This gets annoying, especially when using nested namespaces. For example, when writing a buildout recipe, the cleanest thing is to use several namespaces, like zc.recipe.egg does.
As Jim points out, nested namespace packages are usually a bad idea and shouldn't be created for new systems, as entry points are a better way of supporting third-party extensions to a package. Thus, having a single top-level namespace to denote a publisher or product line is usually sufficient.
However, you end up jumping through essentially empty directories all the time while working on the source. There's evidence that this problem is a real one: buildout recipes registered with PyPI tend to omit the "recipe" part at the expense of namespace hygiene and PEP 8 compliance, for example gocept.download or buildout_script.
I don't understand.
Since there's basically no information in those boilerplate directories and __init__.py files that couldn't be inferred from a keyword parameter to setup(), would it be a sensible feature request that setuptools do without the physical namespace directories in the future?
As I said, I'm not sure it needs them *now*, except to support setup.py develop. I'm not really fond of the package_dirs feature, preferring to use an importable layout when I do development. But the feature does exist and please feel free to let me know whether it solves your problem, or report *specifically* what it does or does not do correctly and what output is produced. Thanks.
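For illustration, a remapping of this kind might look roughly like the following. This is only a sketch: the project name example.recipe.download, the version and the flat src directory are hypothetical, `package_dir` is the distutils keyword and `namespace_packages` the setuptools one, and whether setuptools accepts such a layout without the physical namespace directories is exactly what this thread leaves open.

```python
def example_setup_kwargs():
    """Keyword arguments one might pass to setuptools.setup().

    Everything here describes a hypothetical project; nothing is taken
    from a real distribution.
    """
    return {
        "name": "example.recipe.download",
        "version": "0.1",
        # Every level of the dotted name is listed as a package.
        "packages": ["example", "example.recipe", "example.recipe.download"],
        # setuptools keyword: the levels that are shared namespaces.
        "namespace_packages": ["example", "example.recipe"],
        # distutils keyword: take the leaf package's code from ./src
        # instead of ./example/recipe/download.
        "package_dir": {"example.recipe.download": "src"},
    }


if __name__ == "__main__":
    from pprint import pprint

    # In a real setup.py these would be passed as setup(**kwargs).
    pprint(example_setup_kwargs())
```

The dictionary is returned rather than passed to setup() so the sketch can be read and run without building anything.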
Phillip J. Eby wrote:
At 05:05 PM 2/8/2007 +0100, Thomas Lotze wrote:
I'm not sure that's entirely true, although they *do* have to exist when using "setup.py develop", or they won't be importable. Have you tried using the distutils package_dir mapping to remap your source tree?
To achieve what? One does want to support setup.py develop, after all.
As Jim points out, nested namespace packages are usually a bad idea and shouldn't be created for new systems, as entry points are a better way of supporting third-party extensions to a package. Thus, having a single top-level namespace to denote a publisher or product line is usually sufficient.
But if you want to distribute a package as separate eggs, you have to split it into subpackages, which you have to name somehow. If I'm not mistaken about the concept of entry points, they aren't relevant for this.
However, you end up jumping through essentially empty directories all the time while working on the source. There's evidence that this problem is a real one: buildout recipes registered with PyPI tend to omit the "recipe" part at the expense of namespace hygiene and PEP 8 compliance, for example gocept.download or buildout_script.
I don't understand.
What don't you understand?
As I said, I'm not sure it needs them *now*, except to support setup.py develop. I'm not really fond of the package_dirs feature, preferring to use an importable layout when I do development. But the feature does exist and please feel free to let me know whether it solves your problem, or report *specifically* what it does or does not do correctly and what output is produced. Thanks.
For me, package_dirs does correctly what it's supposed to do, which does not include setup.py develop. Not having that is a show stopper for development. -- Thomas
On Feb 8, 2007, at 5:04 PM, Phillip J. Eby wrote:
At 05:05 PM 2/8/2007 +0100, Thomas Lotze wrote:
Hi,
when using namespace packages, the corresponding package directories and __init__.py files must physically exist in the source tree, even though they can't, by definition of a namespace package, contain anything other than subordinate directories and a fixed stanza of Python code, resp.
I'm not sure that's entirely true, although they *do* have to exist when using "setup.py develop",
Yes. I think that's the main issue, as this is a very important use case. We rely on using develop eggs.
As Jim points out, nested namespace packages are usually a bad idea and shouldn't be created for new systems, as entry points are a better way of supporting third-party extensions to a package.
I don't think that the nested namespace was motivated specifically by this. It was simply a result of a natural human tendency to organize. :)
Since there's basically no information in those boilerplate directories and __init__.py files that couldn't be inferred from a keyword parameter to setup(), would it be a sensible feature request that setuptools do without the physical namespace directories in the future?
As I said, I'm not sure it needs them *now*, except to support setup.py develop.
Yes, as we need this.
I'm not really fond of the package_dirs feature, preferring to use an importable layout when I do development.
The question is whether a less annoying layout can be made importable.
But the feature does exist and please feel free to let me know whether it solves your problem,
It won't because of the need to support the develop use case. I haven't really had a chance to think about this, so I don't know what a solution, if there is one, would look like. For me, this isn't critical, as it is merely annoying to have to create namespace package directories.
Jim
--
Jim Fulton mailto:jim@zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org
At 09:15 AM 2/9/2007 -0500, Jim Fulton wrote:
The question is whether a less annoying layout can be made importable.
Well, what's "annoying" is relative. I personally find the flat layout "annoying" because it's not WYSIWYG. :)
But the answer to your question is "yes". Anything can be made importable, with sufficient effort.
When a project that includes a namespace package is installed using --single-version-externally-managed, a special .pth file for that project is created that sets up empty modules in sys.modules with appropriate __path__ values.
A similar technique could be applied in setup.py develop, if a project uses package_dirs. I'm just a bit reluctant to try to shoehorn something like that into 0.6.
But, if you'd like to experiment with creating a patch (or a subclass of "develop") that would support creating and uninstalling this .pth file, see the 'install_namespaces()' method of the install_egg_info command in setuptools. The code you'd need for this would actually be *simpler* in some ways, because that code is trying to work relative to whatever directory it's installed in, but for what "develop" needs you could just bake the absolute paths right in.
The uninstall mode of "develop" could just remove the .pth file. One possible complication, however, is that if someone didn't uninstall but left the .pth file around, it would produce strange results (if e.g., they installed an egg for the project without uninstalling the develop mode). So, the code in the .pth should probably check that the original project is still on sys.path before creating dummy modules.
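The technique described in this message could be sketched as follows. This is not the code that install_namespaces() generates; it is a minimal illustration, assuming a hypothetical helper name, a checkout directory whose absolute path is baked in, and namespace names that are not already taken by ordinary modules.

```python
import sys
import types


def install_dummy_namespaces(project_dir, namespace):
    """Create empty namespace modules so that project_dir serves `namespace`.

    project_dir -- absolute path of the development checkout (hypothetical)
    namespace   -- innermost namespace, e.g. "foo.bar"; packages found
                   directly in project_dir become importable below it
    Returns False without touching sys.modules if the checkout is not
    on sys.path, which is the safety check suggested in the message.
    """
    if project_dir not in sys.path:
        return False
    parts = namespace.split(".")
    for depth in range(1, len(parts) + 1):
        name = ".".join(parts[:depth])
        if name not in sys.modules:
            module = types.ModuleType(name)
            module.__path__ = []  # a __path__ is what marks a package
            sys.modules[name] = module
            if depth > 1:
                # Bind the child on its parent so attribute access works.
                parent = sys.modules[".".join(parts[:depth - 1])]
                setattr(parent, parts[depth - 1], module)
    leaf_path = sys.modules[namespace].__path__
    if project_dir not in leaf_path:
        leaf_path.append(project_dir)
    return True
```

With the checkout on sys.path, calling install_dummy_namespaces(checkout, "foo.bar") would let "import foo.bar.baz" find a baz package sitting directly in the checkout. Calling it twice is harmless because existing modules and path entries are left alone.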
On Feb 9, 2007, at 12:08 PM, Phillip J. Eby wrote:
At 09:15 AM 2/9/2007 -0500, Jim Fulton wrote:
The question is whether a less annoying layout can be made importable.
Well, what's "annoying" is relative. I personally find the flat layout "annoying" because it's not WYSIWYG. :)
But the answer to your question is "yes". Anything can be made importable, with sufficient effort.
Yup.
When a project that includes a namespace package is installed using --single-version-externally-managed, a special .pth file for that project is created that sets up empty modules in sys.modules with appropriate __path__ values.
I really don't fathom --single-version-externally-managed. :) Where is this .pth file created?
A similar technique could be applied in setup.py develop, if a project uses package_dirs. I'm just a bit reluctant to try to shoehorn something like that into 0.6.
Of course.
But, if you'd like to experiment with creating a patch (or a subclass of "develop") that would support creating and uninstalling this .pth file, see the 'install_namespaces()' method of the install_egg_info command in setuptools. The code you'd need for this would actually be *simpler* in some ways, because that code is trying to work relative to whatever directory it's installed in, but for what "develop" needs you could just bake the absolute paths right in.
What would read this .pth file?
The uninstall mode of "develop" could just remove the .pth file. One possible complication, however, is that if someone didn't uninstall but left the .pth file around, it would produce strange results (if e.g., they installed an egg for the project without uninstalling the develop mode). So, the code in the .pth should probably check that the original project is still on sys.path before creating dummy modules.
I kinda doubt I understand this enough to pursue it. In any case, I won't have time until after PyCon. I may ask you more about this there (assuming that you'll be there.)
Jim
At 01:21 PM 2/9/2007 -0500, Jim Fulton wrote:
I really don't fathom --single-version-externally-managed. :)
It's the same as good old distutils "install" -- with a couple of additions. The additions are that an .egg-info directory is installed alongside the package(s), and if there are namespace packages involved, a .pth file is also added.
Where is this .pth file created?
In site-packages (or whatever the effective --install-lib target directory is).
But, if you'd like to experiment with creating a patch (or a subclass of "develop") that would support creating and uninstalling this .pth file, see the 'install_namespaces()' method of the install_egg_info command in setuptools. The code you'd need for this would actually be *simpler* in some ways, because that code is trying to work relative to whatever directory it's installed in, but for what "develop" needs you could just bake the absolute paths right in.
What would read this .pth file?
Python, at startup, causing the empty namespace packages to be created in sys.modules with usable __path__ settings.
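The mechanism behind this is the standard site module: in a .pth file, any line beginning with "import" is executed rather than treated as a path. A small sketch of that behaviour, using a temporary directory in place of site-packages and a made-up namespace name (pth_demo_ns); the line written here is illustrative and is not what setuptools itself emits:

```python
import os
import site
import sys
import tempfile


def run_demo_pth():
    """Write a .pth file that registers an empty package, then process it."""
    site_dir = tempfile.mkdtemp()  # stands in for site-packages
    # One physical line; site executes it because it starts with "import ".
    pth_line = (
        "import sys, types; "
        "m = sys.modules.setdefault('pth_demo_ns', types.ModuleType('pth_demo_ns')); "
        "m.__path__ = []\n"
    )
    with open(os.path.join(site_dir, "pth_demo_ns.pth"), "w") as handle:
        handle.write(pth_line)
    # Same processing Python applies to site-packages at startup.
    site.addsitedir(site_dir)
    return sys.modules.get("pth_demo_ns")


if __name__ == "__main__":
    print(run_demo_pth())
```

Because this only happens for directories that site processes, a directory added to sys.path later by other means does not get its .pth files read.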
I kinda doubt I understand this enough to pursue it. In any case, I won't have time until after PyCon. I may ask you more about this there (assuming that you'll be there.)
No, I'm not going this year.
On Feb 9, 2007, at 3:33 PM, Phillip J. Eby wrote:
At 01:21 PM 2/9/2007 -0500, Jim Fulton wrote:
I really don't fathom --single-version-externally-managed. :)
It's the same as good old distutils "install" -- with a couple of additions.
Yeah, but I don't fathom distutils either. ;)
The additions are that an .egg-info directory is installed alongside the package(s), and if there are namespace packages involved, a .pth file is also added.
Where is this .pth file created?
In site-packages (or whatever the effective --install-lib target directory is).
This makes this approach uninteresting for buildout, which doesn't write to site-packages or have any site-packages equivalent. It sounds like it also violates the egg promise that you just have to put the egg in sys.path for it to be usable. buildout relies on this promise. ...
I kinda doubt I understand this enough to pursue it. In any case, I won't have time until after PyCon. I may ask you more about this there (assuming that you'll be there.)
No, I'm not going this year.
Darn. We'll miss you.
Jim
At 03:42 PM 2/9/2007 -0500, Jim Fulton wrote:
This makes this approach uninteresting for buildout, which doesn't write to site-packages or have any site-packages equivalent.
It sounds like it also violates the egg promise that you just have to put the egg in sys.path for it to be usable. buildout relies on this promise.
Well, the backward-compatibility mode is for making system packages like .rpm files, and the namespace package support for that is a hack to deal with the fact that such packaging tools don't like to have multiple .rpm's or whatever containing the same file (i.e. the namespace package's __init__.py). This would be an even more extreme hack, if we tried to support 'develop' mode for procrustean package_dirs setups.
You do, however, make a good point regarding the egg promise. As far as I can see, then, there is no way to support crazy package_dirs in combination with namespace packages, without automatically creating a bunch of directories and __init__.py files, along with some other crazy hacks. So the idea is probably a dead duck.
On Fri, 09 Feb 2007 16:16:45 -0500, Phillip J. Eby wrote:
As far as I can see, then, there is no way to support crazy package_dirs in combination with namespace packages, without automatically creating a bunch of directories and __init__.py files, along with some other crazy hacks. So the idea is probably a dead duck.
Not necessarily. If I get all of this right, then namespace packages work by creating a bunch of module objects in sys.modules which then take care of finding the right place to import any given subpackage from. So what is needed is to get some code executed that does the Right Thing when first trying to import a package or module that resides inside a namespace.
If the egg promise is that putting an egg on sys.path suffices for making it available to Python, this precludes relying on .pth files in any way for doing something other than modifying sys.path itself, right? After all, the egg could be added to sys.path after .pth files have been evaluated - which I understand is the approach buildout takes.
The next chance to get code executed is when the top-level namespace's __init__.py is imported. Thus there absolutely must exist a directory for the top-level namespace inside the source tree.
But - what prevents the __init__.py file in this directory from setting up any intermediate namespaces in sys.modules and causing the real package to be looked for in some convenient place that avoids directory jumping in the source tree?
-- Best regards, Thomas
On Fri, 09 Feb 2007 23:01:20 +0100, Thomas Lotze wrote:
But - what prevents the __init__.py file in this directory from setting up any intermediate namespaces in sys.modules and causing the real package to be looked for in some convenient place that avoids directory jumping in the source tree?
Nothing does. A proof-of-concept example can be found at <http://www.thomas-lotze.de/Misc/foo.bar.baz-dev.tar.gz>. The source tree contains foo/__init__.py and baz/__init__.py, the former setting up some stuff in sys.modules using pkg_resources._handle_ns, the latter just printing out __file__ for proof that it has been imported. You can use this project both in develop mode and as an egg; I installed it using easy_install.
Right now, a lot of os.path manipulation on __file__ is being done by foo/__init__.py; if this was replaced by a call to some appropriate pkg_resources function, the package_dir information could be used instead in a clean way.
Also, the modules stuff gets set up all at once instead of recursively as declare_namespace does. Thus the intermediate namespace package foo.bar doesn't need an __init__.py, and in fact it doesn't even need a directory in the source tree. I can't see anything bad about that right now, but maybe a missing __init__.py does cause trouble somewhere.
-- Best regards, Thomas
At 12:15 AM 2/12/2007 +0100, Thomas Lotze wrote:
On Fri, 09 Feb 2007 23:01:20 +0100, Thomas Lotze wrote:
But - what prevents the __init__.py file in this directory from setting up any intermediate namespaces in sys.modules and causing the real package to be looked for in some convenient place that avoids directory jumping in the source tree?
Nothing does. A proof-of-concept example can be found at <http://www.thomas-lotze.de/Misc/foo.bar.baz-dev.tar.gz>. The source tree contains foo/__init__.py and baz/__init__.py, the former setting up some stuff in sys.modules using pkg_resources._handle_ns, the latter just printing out __file__ for proof that it has been imported. You can use this project both in develop mode and as an egg; I installed it using easy_install.
Right now, a lot of os.path manipulation on __file__ is being done by foo/__init__.py; if this was replaced by a call to some appropriate pkg_resources function, the package_dir information could be used instead in a clean way.
Also, the modules stuff gets set up all at once instead of recursively as declare_namespace does. Thus the intermediate namespace package foo.bar doesn't need an __init__.py, and in fact it doesn't even need a directory in the source tree. I can't see anything bad about that right now, but maybe a missing __init__.py does cause trouble somewhere.
It appears your goals are somewhat... confused. Namespace packages, as I've already said, do not always even have __init__.py files existing, so there is no place to put your example __init__.py that guarantees it will be executed.
Remember that by definition a namespace package has no single "owner" project. It is potentially shared across multiple projects. When installed as an egg, each __init__.py is executed, true. But when a project is NOT installed via egg, but rather by a system packager, the __init__.py doesn't exist, and so cannot be executed.
Thus, the normal case for a namespace package is to have a number of __init__.py files that is either 0 or >1. Having exactly one __init__.py present for some namespace package is an abnormality.
Therefore, your prototype code is horribly broken, as it will corrupt pkg_resources' internal data structures when it is run more than once -- as it would have to be in order to support *each* project containing code under that namespace package.
(And I won't even get into the pointlessness of wrapping module-level code in an import lock.)
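To make the "no single owner" point concrete, here is a small sketch of two independent projects contributing to one namespace. It deliberately uses the standard library's pkgutil.extend_path rather than pkg_resources, so it shows the general idea and not setuptools' own bookkeeping; with this stanza only the first __init__.py found is executed, which differs from the egg behaviour described in this message. The namespace and module names (shared_ns_demo, alpha, beta) are made up.

```python
import importlib
import os
import sys
import tempfile

# Stanza placed in every project's copy of the namespace __init__.py.
STANZA = (
    "from pkgutil import extend_path\n"
    "__path__ = extend_path(__path__, __name__)\n"
)


def make_project(namespace, module_name, value):
    """Create a throwaway project root containing namespace/module_name.py."""
    root = tempfile.mkdtemp()
    pkg_dir = os.path.join(root, namespace)
    os.makedirs(pkg_dir)
    with open(os.path.join(pkg_dir, "__init__.py"), "w") as handle:
        handle.write(STANZA)
    with open(os.path.join(pkg_dir, module_name + ".py"), "w") as handle:
        handle.write("VALUE = %r\n" % (value,))
    return root


def demo():
    """Import one module from each project through the shared namespace."""
    roots = [
        make_project("shared_ns_demo", "alpha", 1),
        make_project("shared_ns_demo", "beta", 2),
    ]
    sys.path.extend(roots)
    importlib.invalidate_caches()
    alpha = importlib.import_module("shared_ns_demo.alpha")
    beta = importlib.import_module("shared_ns_demo.beta")
    return alpha.VALUE, beta.VALUE, list(sys.modules["shared_ns_demo"].__path__)


if __name__ == "__main__":
    print(demo())
```

Neither project knows about the other, yet both submodules import under the same top-level name, which is why code that assumes exactly one __init__.py per namespace is fragile.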
Phillip J. Eby wrote:
It appears your goals are somewhat... confused.
I don't think so. I'm sure I still have to learn about setuptools but I think I know quite well what I want to achieve.
Remember that by definition a namespace package has no single "owner" project. It is potentially shared across multiple projects. When installed as an egg, each __init__.py is executed, true. But when a project is NOT installed via egg, but rather by a system packager, the __init__.py doesn't exist, and so cannot be executed.
If we have to take into account distributions of a project as source code bereft of its top-level namespace's __init__.py being expected to just work, I give up. I assumed to always work with either intact source or built eggs, the latter with or without their namespaces' __init__.py files being used and with or without being installed by our egg tools.
The built egg has all the subdirectories needed and doesn't look different from other eggs in that respect. It doesn't rely on any namespace cleverness being performed by __init__.py, other than what setuptools already do for namespace packages.
If the namespace mangling performed in that file was integrated with the setuptools' declare_namespace, there wouldn't need to be anything special about the __init__.py of a project with remapped package directories, and it wouldn't hurt if it wasn't executed anyway. In fact, the namespace package and all namespace subpackages could be given a standard __init__.py by setup bdist_egg. The package remapping is only needed when running from source, and I don't see a reason why the __init__.py shouldn't exist and be executed in that case.
Therefore, your prototype code is horribly broken, as it will corrupt pkg_resources' internal data structures when it is run more than once --
I don't see why it couldn't be made to behave properly. As I said, I'm still learning about setuptools, and I don't claim to be the most apt person to integrate any new functionality at this point.
(And I won't even get into the pointlessness of wrapping module-level code in an import lock.)
I'd appreciate it if you did, as I haven't yet found a difference between using the lock in modules and using it in pkg_resources.declare_namespace which is being called by modules. -- Thomas
participants (3)
- Jim Fulton
- Phillip J. Eby
- Thomas Lotze