Phillip J. Eby wrote:
At 09:40 PM 10/1/2008 +0200, Josselin Mouette wrote:
On Wednesday, 1 October 2008 at 14:39 -0400, Phillip J. Eby wrote:
We need to be able to mark locale, config, and data files in the metadata.
Sure... and having a standard for specifying that kind of application/system-level install stuff is great; it's just entirely outside the scope of what eggs are for.
I don't follow you. If the library needs these files to work, you definitely want to ship them, whether at their FHS locations in a package, or in the egg.
Egg files aren't an all-purpose distribution format; they were designed for application plugins, and for libraries needed to support application plugins. As such, they're self-contained and weren't designed for application-level installation support, such as documentation, configuration or data files, icons, etc.
As has been pointed out, these are deficiencies of .egg files wrt the full spectrum of library and application installation needs, which is why I'm pushing for us to work on an installation metadata standard that can accommodate these other needs that the .egg layout isn't really suited for.
We need to get the list of problems up somewhere on the wiki so that people can check that the evolving standard doesn't fall into the same pitfalls. After all, people are using the egg and pkg_resources API for just this purpose today with some happy about it and others not so much.
My main point about the resources is simply that it's a needless complication to physically separate static data needed by a library at runtime, based solely on its file extension, in cases where only that library will be reading that file, and the file's contents are constant for that version of the library.
To put it another way, if some interpretation of the FHS makes a distinction between two files encoding the same data, one named foo.bar and the other foo.py, where the only difference between the two is the internal encoding of the data, then that interpretation of the FHS is not based on any real requirement, AFAICT.
Actually, file encoding is one major criterion in the FHS. However, it's probably not in the manner you're thinking of :-) Files which are architecture dependent generally need to be separated from files which are architecture independent. Since data these days is generally stored as text files or as binary data in a standard byte-oriented format, this is the major reason that data files usually go in /usr/share while libraries and binaries go in /usr/lib and /usr/bin. This is due to the range of computers with which architecture-dependent versus architecture-independent data can be shared. Of course, part of Python's site-packages on Linux systems violates this rule, as Python can split architecture-dependent and architecture-independent packages from one another. I know that some distributions have debated moving the architecture-independent portion of site-packages to /usr/share, although I don't know if any have (Josselin, has Debian done this?). The idea of moving is not straightforward because of 1) compatibility with unpackaged software and 2) /usr/share is seen in two lights: the place for architecture-independent files and the place for data; /usr/lib is likewise seen in two lights: the place for architecture-dependent non-executables and the place for code whose instructions are run by executables.
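To make the architecture-dependent versus architecture-independent split concrete, here is a minimal sketch that asks Python where it would install each kind of package. It assumes a modern interpreter with the standard `sysconfig` module, which is newer than this thread; the helper name `site_packages_dirs` is just made up for illustration. On many Linux systems the two paths are identical, while on others (for example, some multilib layouts) they differ.

```python
import sysconfig


def site_packages_dirs():
    """Return the install directories Python reports for pure-Python
    (architecture-independent) and platform-specific packages."""
    return {
        # "purelib": site-packages for architecture-independent modules
        "purelib": sysconfig.get_path("purelib"),
        # "platlib": site-packages for architecture-dependent modules
        "platlib": sysconfig.get_path("platlib"),
    }


dirs = site_packages_dirs()
for name, path in dirs.items():
    print(name, "->", path)
```

Whether the two directories differ is a decision made by the distribution or the build configuration, not by the package itself.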
Of course, for documentation, application icons, and suchlike, the data *will* be read by things other than the library itself, and so a standardized location is appropriate. The .egg format was designed primarily to support resources read only by the package in question, and secondarily to support metadata needed by applications or libraries that the package "plugs in" to. It was not originally intended to be a general-purpose system package installation format.
<nod>. Despite this design, it's presently being used for that. So we need to figure out what to do about it.
To be clear, I mean here that a "file" (as opposed to a resource) is something that the user is expected to be able to read or copy, or modify. (Whereas a resource is something that is entirely internal to a library, and metadata is information *about* the library itself.)
It's not as simple as that. Python is not the only thing out there, and there are many times where your resources need to be shipped in existing formats, in files that land at specific places. For example icons go in /usr/share/icons, locale files in .mo format in /usr/share/locale, etc.
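As an illustration of a file that has to land at a specific shared place, here is a rough sketch of how a program would look up a .mo catalog under /usr/share/locale using the standard `gettext` module. The domain name `myapp_hypothetical` is invented for the example and is assumed not to be installed, so `fallback=True` makes the lookup return the untranslated string instead of raising an error.

```python
import gettext

# Look for <localedir>/<lang>/LC_MESSAGES/<domain>.mo, the conventional layout.
# "myapp_hypothetical" is a made-up domain; with fallback=True a missing
# catalog yields a NullTranslations object rather than an exception.
translator = gettext.translation(
    "myapp_hypothetical",
    localedir="/usr/share/locale",
    languages=["fr"],
    fallback=True,
)

message = translator.gettext("Hello")
print(message)
```

The point is that the catalog's location is dictated by a convention shared with non-Python tools, not by the Python package layout.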
And docs need to go in /usr/share/doc, I presume.
Docs are special in the packaging world on several accounts. Generally the packager has to collect at least some of the docs themselves (as things like LICENSE.txt aren't normally included in a doc install but are important for distributions to package). rpm, at least, provides a macro to make it easy for the packager to mark files and directories from the source tree as documentation, which rpm will put in the appropriate directory itself. So packagers often use an upstream's build scripts to build the docs, but usually install the docs using the package tool's facilities.
Additionally, there's a difference between docs which the program uses (for instance for online help) and docs which the end user would have to navigate the filesystem and invoke a viewer themselves to read. The former is data, the latter is docs.
But these aren't necessarily "resources" in the way I'm defining the term. Some of them *could* be, perhaps. Others aren't.
To be clear, what I'm trying to say is that it is a perfectly valid use case for a Python package author to have static data contained within their Python package directory layout for purposes of accessing that data as if it were code, but without having to go to the trouble of converting it to a .py file (and possibly having to extract it back out at runtime). This usage of "data" files isn't in conflict with the FHS, as I understand it.
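A minimal sketch of that use case follows: a data file sitting inside the package directory and read relative to the package, not via a filesystem path. The thread is about `pkg_resources` (where the equivalent call would be `pkg_resources.resource_string`), but to keep the example dependent only on the standard library it uses `pkgutil.get_data` as a stand-in. The package name `demo_pkg` and the file `defaults.cfg` are hypothetical and are created in a temporary directory purely for the demonstration.

```python
import os
import pkgutil
import sys
import tempfile

# Build a throwaway package (hypothetical name "demo_pkg") with a static
# data file stored next to its code.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_pkg")
os.mkdir(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "wb") as f:
    f.write(b"")
with open(os.path.join(pkg_dir, "defaults.cfg"), "wb") as f:
    f.write(b"colour = blue\n")

# Make the package importable.
sys.path.insert(0, root)

# Read the resource by package name and relative name; the caller never
# needs to know where (or in what container) the package is installed.
resource_bytes = pkgutil.get_data("demo_pkg", "defaults.cfg")
resource_text = resource_bytes.decode("utf-8")
print(resource_text)
```

Because the lookup goes through the package's loader, the same call is meant to keep working when the package is imported from a zip archive, which is the kind of self-contained layout eggs were designed around.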
But I also understand that there are other kinds of "data" files which *don't* fall under that use case, and which it is desirable to install to shared locations. We need to support both.
Possibly. We could definitely throw out the first case (resources) and just have a data category, and the FHS would be fine. Whether there's a case for resources depends on their definition. The test of "could be put in a Python file and extracted" doesn't fly. I could convert all my images to .xpm and put them in Python files. But that's a lot of work. And the moment I take them back out to separate .xpm files, they would definitely belong in /usr/share.