
At 05:05 PM 4/7/2008 -0400, Alexander Michael wrote:
a. I believe that having side-car files that sit alongside packages because they have the same base name makes the database more transparent to the uninitiated.
I'm not aware that this was ever a stated design goal, nor why it should have any priority. OTOH, files named by distribution would be at least as transparent as files named by package, if not *more* so, so I don't see any particular benefit to this.
Just browsing a directory of python packages will allow you to see what's going on. Moving like-named files around manually maintains the integrity and availability of the data.
Moving anything manually, other than the *entire* directory, will be unlikely to retain any form of integrity, so it's best not to give the false impression that it would.
I think that having magic entries in an essentially "hidden" directory somewhere will cause all sorts of trouble that could be avoided at the cost of a small bit of duplication.
b. I assume, perhaps incorrectly, that most distributions contain only a single package.
Very incorrectly, unless you mean a single top-level package. Odds are fairly good that if there's a package, there's probably at least a subpackage, too, like perhaps a tests subpackage.
That said, I do agree that if you are primarily interested in a database of *distributions* (as opposed to *packages*) then something like what is proposed in PEP 262 makes more sense (but it would have to be per directory and not site-wide due to the dynamic nature of the python path).
That's exactly what I want. The only reason I didn't just implement easy_install using a per-directory form of PEP 262 is that I wanted something done rather more immediately. That was years ago, so I can afford to be more patient now. :)
This is a trade-off between putting the metadata up front in an obvious and easy to understand way so that it will hopefully have a better chance of being noticed and maintained, versus tucking it away hidden someplace so that even though it is broken, it doesn't bother anyone until they care enough to fix it. *It is this trade-off that I am exploring with this strawman "counter" proposal to PEP 262.*
Someone would have to be crazy to maintain this information by hand. So I'd actually consider it an advantage if the file format made this fact plain, by using something that's difficult for a human being to maintain, like say a pickle. ;-) OTOH, it's possible that some system packagers will not wish to use Python to generate the files, so using something a bit less complex would be a good idea. The format proposed by PEP 262 isn't really that bad of a trade-off in those terms.
- The strawman proposal did not explicitly address how optional
add-on tools (like setuptools) might manage namespace packages.
I think there's some misunderstanding here about the proposal's goals. If the proposal doesn't work for setuptools, it doesn't work, period.
The entire point is to allow setuptools to do its work without annoying the people who don't want to use it.
I agree with Floris that the best way to avoid magic is to actually have the sub-packages in a namespace share the same parent directory on disk.
I agree with this also. The issue is that an __init__.py must exist for this to happen, but most system packaging tools (e.g. RPM) require that a given file be owned by at most one system package (i.e., distribution), whereas the contents of a namespace package are assembled from multiple distributions.
That's the problem that needs solving, not runtime support for the namespace itself.
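To make the ownership problem concrete, here is a minimal sketch, assuming each distribution can be described by a list of the files it installs. The distribution names (`acme.alpha`, `acme.beta`) and the helper `find_ownership_conflicts` are hypothetical and not part of setuptools or any packaging tool; the point is only that a shared namespace `__init__.py` shows up as a file claimed by more than one distribution.

```python
from collections import defaultdict


def find_ownership_conflicts(manifests):
    """Return {path: [distributions]} for every path claimed more than once.

    `manifests` maps a distribution name to the list of relative paths it
    installs. This is an illustrative structure, not a real file format.
    """
    owners = defaultdict(set)
    for dist, paths in manifests.items():
        for path in paths:
            owners[path].add(dist)
    # Keep only the paths that more than one distribution wants to own.
    return {path: sorted(dists) for path, dists in owners.items() if len(dists) > 1}


# Two hypothetical distributions contributing to the same namespace package.
manifests = {
    "acme.alpha": ["acme/__init__.py", "acme/alpha/__init__.py"],
    "acme.beta": ["acme/__init__.py", "acme/beta/__init__.py"],
}

conflicts = find_ownership_conflicts(manifests)
print(conflicts)
```

A system packager that insists on one owner per file would have to reject one of these two distributions, even though nothing is wrong with the namespace at import time.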
- Concerns were raised about the performance penalty for using the
side-car style files without version numbers possibly not all of which were located at the top-most level of the directory listed in the python path.
Any add-on tool that actually used the data would very likely need to build a cache of the data using a more efficient representation,
This is a misunderstanding of the point I raised. Floris merely asked why there were version numbers in .egg-info files, and I answered him. That doesn't actually have much, if anything, to do with the package database proposal. It's merely how installed distributions' versions can be recognized quickly at runtime, not anything to do with how potential installation conflicts are handled at installation time.
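As a rough illustration of why a version in the filename is cheap to use at runtime, the sketch below reads name and version straight out of a directory listing without opening any file. It assumes names shaped like `<name>-<version>[-py<X.Y>].egg-info`; the regular expression and the helper names are my own simplification, not the parsing code setuptools actually uses.

```python
import re

# Assumed filename layout: <name>-<version>[-py<X.Y>].egg-info
# (a simplification; real project names and versions may need more care).
_EGG_INFO = re.compile(
    r"^(?P<name>[^-]+)-(?P<version>[^-]+)(?:-py(?P<pyver>[^-]+))?\.egg-info$"
)


def parse_egg_info_name(filename):
    """Return (name, version, pyver) or None if the name does not match."""
    match = _EGG_INFO.match(filename)
    if match is None:
        return None
    return match.group("name"), match.group("version"), match.group("pyver")


def installed_versions(filenames):
    """Map distribution name to version using only the filenames given."""
    found = {}
    for filename in filenames:
        parsed = parse_egg_info_name(filename)
        if parsed is not None:
            name, version, _pyver = parsed
            found[name] = version
    return found


listing = ["Foo-1.2-py2.5.egg-info", "Bar-0.3.egg-info", "README.txt"]
print(installed_versions(listing))
```

None of this touches the question of which distribution owns which file; it only identifies what is installed.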
easy_install uses eggs for installation simply because it need never worry about file ownership conflicts. There is a direct mapping from a distribution to its files: the contents of a zipfile or subdirectory. This also allows for (relatively) straightforward uninstallation.
The goal of the proposal, then, is to have a way for easy_install to have another way to map from a distribution to its owned files (and vice versa), so that eggs are not necessary for normal, single-version installations.
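The two-way mapping described above could look something like the following in-memory sketch. The class `InstallDatabase` and its methods are hypothetical and say nothing about the on-disk format PEP 262 proposes; they only show the lookups an installer would need: distribution to files, file to distribution, and uninstallation.

```python
class InstallDatabase:
    """Toy two-way index between distributions and the files they own."""

    def __init__(self):
        self._files = {}  # distribution name -> list of owned paths
        self._owner = {}  # path -> owning distribution name

    def record(self, dist, paths):
        """Register `paths` as owned by `dist`; refuse to steal a file."""
        for path in paths:
            current = self._owner.get(path)
            if current is not None and current != dist:
                raise ValueError(
                    "%s is already owned by %s" % (path, current)
                )
        # Drop any earlier record for this distribution before re-recording.
        self.uninstall(dist)
        self._files[dist] = list(paths)
        for path in paths:
            self._owner[path] = dist

    def files_of(self, dist):
        """Return a copy of the paths owned by `dist` (empty if unknown)."""
        return list(self._files.get(dist, []))

    def owner_of(self, path):
        """Return the owning distribution name, or None if unowned."""
        return self._owner.get(path)

    def uninstall(self, dist):
        """Forget `dist` and return the paths that should be removed."""
        paths = self._files.pop(dist, [])
        for path in paths:
            self._owner.pop(path, None)
        return paths


db = InstallDatabase()
db.record("Foo", ["foo/__init__.py", "foo/core.py"])
print(db.owner_of("foo/core.py"))
print(db.uninstall("Foo"))
```

With an index like this, a normal single-version installation can answer "who owns this file?" and "what do I delete?" without keeping each distribution in its own egg directory or zipfile.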