Phillip J. Eby wrote:
At 07:14 PM 10/1/2008 -0700, Toshio Kuratomi wrote:
In terms of implementation I'd much rather see something less centered on the egg being the right way and the filesystem being a secondary concern.
Eggs don't have anything to do with it; in Python, it's simply common sense to put static resources next to the code that uses them, if you want to "write once, run anywhere". And given Python's strength as an interactive development language with no "build" step, having to *install* your data files somewhere else on the system to use them isn't a *feature* -- not for a developer, anyway.
You're arguing about the developer's point of view on something that's hidden behind an API. You've already made it so that the developer cannot just reference the file on the filesystem because the egg may be zipped. So for the developer there's no change here.
I'm saying that there's no need to have a hardcoded path at which to look up the information and then make the install tool place "forwarding information" there to send the package somewhere else. We have metadata. We should use it.
And our hypothetical de-jure standard won't replace the de-facto standard unless it's adopted by developers... and it won't be adopted if it makes their lives harder without a compensating benefit. For the developer, FHS support is a cost, not a benefit, and only relevant to a subset of platforms, so the spec should make it as transparent for them as possible, if they don't have an interest in explicit support for it. By the STASCTAP principle (Simple Things Are Simple, Complex Things Are Possible), it should be possible for distros to relocate, and simple for developers not to care about it.
It's both a cost and a benefit. The cost is having to use an API which they have to use anyway due to eggs possibly being zip files. The benefit is getting their code packaged by Linux distributors quicker and getting more contributors as a result of the exposure.
We should have metadata that tells us where the types of resources come from. When a package is installed on Linux, the metadata could point locales at file:///usr/share/locale. On Windows, it could point them at egg:locale. (Perhaps the uninstalled case would use this too; that depends on how the egg structure and metadata evolve.)
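To make the idea concrete, here is a rough sketch of what looking up a resource through such metadata might look like. Everything in it is an assumption for illustration: the RESOURCE_LOCATIONS table, the resolve_resource() helper, and the way the egg: scheme is interpreted are hypothetical and not part of setuptools or any existing tool.

```python
import posixpath
from urllib.parse import urlparse

# Hypothetical metadata an install tool might write; not an existing format.
RESOURCE_LOCATIONS = {
    "linux": {"locale": "file:///usr/share/locale"},
    "windows": {"locale": "egg:locale"},
}


def resolve_resource(platform, resource_type, name, egg_root):
    """Return the path of a resource according to the installed metadata."""
    location = RESOURCE_LOCATIONS[platform][resource_type]
    parsed = urlparse(location)
    if parsed.scheme == "file":
        # Absolute location chosen by the installer (e.g. an FHS path).
        return posixpath.join(parsed.path, name)
    if parsed.scheme == "egg":
        # Location relative to wherever the egg itself lives.
        return posixpath.join(egg_root, parsed.path, name)
    raise ValueError("unsupported location scheme: %r" % location)
```

The developer would only ever ask for ("locale", name); whether the answer ends up under /usr/share/locale or inside the egg is decided by whatever wrote the metadata.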
A question we'd have to decide is whether this particular metadata is something that should be defined globally or per package. Or globally with a chance for packages to override it.
I think install tools should handle it and keep it out of developers' hair. We should of course distinguish configuration and other writable data from static data, not to mention documentation. Any other file-related info is going to have to be optional, if that. I don't really think it's a good idea to ask developers to fill in information they don't understand. A developer who works entirely on Windows, for example, is not going to have a clue what to specify for FHS stuff, and they absolutely shouldn't have to if all they're doing is including some static data.
Needing to have some information about the files you ship is inevitable. Documentation is a good example. Man pages, License.txt, gnome help files, windows help files, API docs, sphinx docs, etc. each have to be installed in different places, some with requirements to register the files so the system knows they exist. All the knowledge about what to do with these files should be placed in the tool. But the knowledge of what type to mark a given file with will have to lie with the developer.
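A minimal sketch of that split of responsibilities, under assumed names: the developer supplies only a file-to-type mapping, and the tool owns a type-to-destination table. FILE_TYPES, FHS_DESTINATIONS, plan_install() and the destination paths are all hypothetical examples, not a proposed standard.

```python
# Hypothetical: the developer only tags each shipped file with a type...
FILE_TYPES = {
    "docs/foo.1": "manpage",
    "LICENSE.txt": "license",
    "foo/data/icons.png": "static",
}

# ...and the install tool owns the knowledge of where each type goes.
# These destinations are illustrative only.
FHS_DESTINATIONS = {
    "manpage": "/usr/share/man/man1",
    "license": "/usr/share/doc/{package}",
}


def plan_install(package, file_types, destinations):
    """Map each file to a destination directory, or None to leave it in place."""
    plan = {}
    for path, file_type in file_types.items():
        template = destinations.get(file_type)
        # Types the tool has no rule for stay next to the code.
        plan[path] = template.format(package=package) if template else None
    return plan
```

A tool on a platform with no relocation rules would simply pass an empty destinations table, and every file would stay where the developer put it.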
Even today, there exist Python developers who don't use the distutils to distribute their packages, so anything that makes it even more difficult than it is today isn't going to be a viable standard. The closer we can get in ease of use to just tarring up a directory, the more viable it'll be. (That's one reason, btw, why setuptools offers revision control support and find_packages() for automating discovery of what to include.)
Actually, as a person who distributes upstream packages which don't use distutils and is exposed to others, I'd say that the shortcomings in terms of where to install files and how to reference the files after install are one of the reasons that distutils is not used. Are there other reasons? Sure. But this is definitely one of the reasons.
I'd have preferred to avoid that complexity, but if the two of us can't agree then there's no way on earth to get a community consensus.
Btw, pkg_resources' concept of "metadata" would also need to be relocatable, since e.g. the "EggTranslations" package uses that to store localizations of image resources and message catalogs. (Other uses of the metadata files also include scripts, dependencies, version info, etc.)
Actually, we should decide whether we want to support that kind of thing within the egg metadata at all. The other things we've been talking about belonging in the metadata are simple key value pairs. EggTranslations uses the metadata area as a data store. (Or in your definition, a resource store). This breaks with the definition of what metadata is. Translations don't store information about a package, they store alternate views of data within the package.
I was actually somewhat incorrect in my statement about the distinction between pkg_resources "metadata" and "resources"; "metadata" is really "data that goes with the distribution, not with a specific package within the distribution". Only some of this data is "about" the distribution; the rest is data "with" or "of" the distribution. (This is a slight API wart, but the use case exists nonetheless.)
Meanwhile, regarding the proposed key-value pairs system, I don't see how that works; "extras" dependency information and entry points are a bit more structured than just key-value pairs; both are currently represented as .ini-like files with arbitrary section names. I suppose you could squash those entire files into values in some sort of key-value system, but that seems a bit hairy to me. In particular, the reason for setuptools' design choice of separate metadata files is that many of these things don't need to be loaded at the same time. Also, PKG-INFO-style metadata can contain rather large blobs of text that aren't needed or useful at runtime. Entry points and extras are mostly runtime metadata, with the occasional bit of build or install usage.
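To illustrate why flat key-value pairs are a poor fit, here is a small sketch that reads an entry-points-style file into a two-level mapping. It uses the standard library's configparser as a stand-in, not setuptools' own parser, and the sample section and entry names are made up.

```python
import configparser

# Made-up sample in the .ini-like style described above.
ENTRY_POINTS_TXT = """\
[console_scripts]
foo = foo.cli:main

[foo.plugins]
json = foo.plugins.json_plugin:JSONPlugin
"""


def parse_sections(text):
    """Parse .ini-like text into {section name: {key: value}}."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep keys case-sensitive
    parser.read_string(text)
    return {name: dict(parser.items(name)) for name in parser.sections()}
```

The section names are arbitrary and chosen by whoever defines the entry point group, so the structure is really section, then key, then value, rather than a single level of keys.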
Structured, yes. Structure and optimizations to how you look up the data are good. But there is a difference between using metadata to save and look up configuration and using metadata to save and look up data (like locale files). You wouldn't save data into gconf or the Windows Registry for instance (at least, not if you don't expect people to make fun of you *cough*evolution*cough*).
OTOH if it's not really a metadata store vs a resource store but instead a package store vs a distribution store we need to decide if we really want to have both. Someone pointed out earlier that
Side note: the fact that someone wrote EggTranslations speaks of a need for people to be able to access the per-package data store across packages. Let's fix that and work with EggTranslations to rewrite its backend to use proper storage. (Looking at the EggTranslations documentation, it might even be a proper place for getting ideas and help with designing the API for a public data store.)