
Hi --

[cc'd to the distutils-sig, since this turns out to deserve a public airing rather than being the private email it started out as]

after *waaay* too long, I've finally gotten around to looking at the "pkginfo" patch to the Distutils you posted back in January. A few comments:

  * I'm not convinced a separate PackageInfo class is needed -- the Distribution stuff is the home for package meta-data, and if it gets a bit more complex (eg. dependencies list), I think that's OK. I definitely don't like having two classes (Distribution and PackageInfo) with largely the same info, though.

  * I'm leery of doing the fancy stuff, namely required packages and compatible versions. While your data model might well be the Right Thing, it might not, and I don't think this stuff has been sufficiently discussed on the SIG. And I'm also not sure that adding slots for the data without having code to back them up is right, either. On the one hand, it's good to get people in the habit of listing requirements/dependencies, but I don't want to raise false expectations that the Distutils will actually *do* anything with that information. (It will someday, but post-Distutils 1.0/Python 1.6.)

  * I find your type-checking machinery in pkginfo.py intriguing, but again I'm not sure if it's appropriate. It's a neat approach to a common problem, but strikes me as over-engineered for this one module. If I'm going to do really thorough type-checking on the attributes of one class, I'd rather do it everywhere.

Anyways, my inclination right now is to take the important stuff from pkginfo.py -- writing the "package info" file -- and graft it into install_info.py. I suppose reading the "package info" file will also be necessary to support an uninstall script. (*Not* an uninstall command, because you don't want to require having the source distribution around in order to uninstall something!)

What say you? [And what say other members of the SIG?]
        Greg
--
Greg Ward - geek    gward@ase.com
http://starship.python.net/~gward/
All of life is a blur of Republicans and meat!

Hello again, one moment whilst I swap in some pages...

Greg Ward wrote:
> * I'm not convinced a separate PackageInfo class is needed -- the Distribution stuff is the home for package meta-data, and if it gets a bit more complex (eg. dependencies list), I think that's OK. I definitely don't like having two classes (Distribution and PackageInfo) with largely the same info, though.
I disagree. Distribution contains package meta-info, but it also contains a lot of information that is relevant to a source distribution: packages, modules, source files. PackageInfo includes a subset of that information (package name, version, author...) and it also includes the final set of installed files, which appears to me to be the product of the install commands, not of the Distribution. [correct me if I'm missing something here, obviously you know your code better than I do, particularly since I haven't looked at it in several months].

Furthermore, package information deserves to be separated out for purposes of modularity: if people want to create alternate forms of the module (based, perhaps, on RPM or DBM files), they should be able to plug their replacement right into the system as long as they conform to a very simple, specific interface. Likewise, programmers using alternate build/distribution technologies should be able to define package information without having to use distutils.
> * I'm leery of doing the fancy stuff, namely required packages and compatible versions. While your data model might well be the Right Thing, it might not, and I don't think this stuff has been sufficiently discussed on the SIG. And I'm also not sure that adding slots for the data without having code to back them up is right, either. On the one hand, it's good to get people in the habit of listing requirements/dependencies, but I don't want to raise false expectations that the Distutils will actually *do* anything with that information. (It will someday, but post-Distutils 1.0/Python 1.6.)
Yeah, I thought about that when I wrote it: I added it anyway in hopes that the issue of what "The Right Thing" is would be thrashed out on the SIG. Under the circumstances, I agree that it should be removed (for now, at least).
> * I find your type-checking machinery in pkginfo.py intriguing, but again I'm not sure if it's appropriate. It's a neat approach to a common problem, but strikes me as over-engineered for this one module. If I'm going to do really thorough type-checking on the attributes of one class, I'd rather do it everywhere.
It's funny: as I looked at the code again now I was having one of those "what the &^$@ was I thinking???" moments; but then it all came flooding back to me like some twisted repressed memory...

The problem is not so much the type checking: it's the persistence. In order to read and write the package information, we need to either have code to write and read each field individually, or have some sort of generalized way of writing and reading different kinds of content. The former approach is difficult to extend and maintain, particularly when you start dealing with complex nested structures.

The persistence problem is complicated further by the fact that ConfigParser files favor readability over unambiguous expression. For example, a string with no trailing or leading whitespace and no embedded newlines can (and should) easily be expressed as "header: value of the string". Strings with special needs require special escapes so that these characters are preserved correctly.

In order to simplify things and try to keep the package info file syntax as clear as possible, I decided to do the following:

  1) Create classes that know how to read and write certain kinds of data.

  2) Map the attributes of the PackageInfo object to instances of these classes so that reading and writing is just a matter of iterating over the attributes and calling the associated 'write()' method.

  3) Verify that the attribute types are correct in the PackageInfo constructor to keep an error from occurring at the point where the information is written.

This approach also allows us to do context sensitive parsing, which does a great deal to clean up the syntax of the package info file.

In the absence of dependency and compatibility information, all of this isn't as important: however, at some point I'm sure it will be desirable to add more complicated information to this object. If it wasn't for the fact that I'd like it to be readable, I'd say we should just pickle the object.
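A minimal sketch of steps 1 to 3, for illustration only. The class names `StringField` and `VersionField`, the `FIELDS` table, and the `check()` and `read()` methods are hypothetical and are not taken from the actual pkginfo.py patch; only the idea of per-type handler objects with a 'write()' method comes from the description above. It assumes a modern Python 3 and a plain "header: value" line format rather than real ConfigParser files.

```python
import ast
import io


class StringField:
    """Hypothetical handler: knows how to check, write and read a string."""

    def check(self, value):
        if not isinstance(value, str):
            raise TypeError("expected str, got %s" % type(value).__name__)

    def write(self, out, name, value):
        needs_escape = (
            value != value.strip()
            or "\n" in value
            or value[:1] in ("'", '"')
        )
        # Simple strings stay readable; awkward ones become Python literals.
        out.write("%s: %s\n" % (name, repr(value) if needs_escape else value))

    def read(self, text):
        # A leading quote marks an escaped value written by write().
        return ast.literal_eval(text) if text[:1] in ("'", '"') else text


class VersionField:
    """Hypothetical handler: a version is a tuple of ints, stored as 1.0.2."""

    def check(self, value):
        if not (isinstance(value, tuple)
                and all(isinstance(part, int) for part in value)):
            raise TypeError("expected a tuple of ints")

    def write(self, out, name, value):
        out.write("%s: %s\n" % (name, ".".join(str(p) for p in value)))

    def read(self, text):
        # Context-sensitive parsing: the field type says how to read the text.
        return tuple(int(part) for part in text.split("."))


class PackageInfo:
    # Step 2: map each attribute name to a handler instance.
    FIELDS = {
        "name": StringField(),
        "version": VersionField(),
        "description": StringField(),
    }

    def __init__(self, **attrs):
        for name, handler in self.FIELDS.items():
            handler.check(attrs[name])  # step 3: fail early, not at write time
            setattr(self, name, attrs[name])

    def write(self, out):
        for name, handler in self.FIELDS.items():
            handler.write(out, name, getattr(self, name))

    @classmethod
    def read(cls, source):
        attrs = {}
        for line in source.read().split("\n"):
            if line:
                name, _, text = line.partition(":")
                attrs[name] = cls.FIELDS[name].read(text.strip())
        return cls(**attrs)


if __name__ == "__main__":
    buf = io.StringIO()
    PackageInfo(name="spam", version=(1, 0, 2),
                description="line one\nline two").write(buf)
    print(buf.getvalue())
```

Adding a new kind of field in this sketch means writing one more handler class and one more entry in `FIELDS`; the read and write loops do not change.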
I liked the original pprint/execfile approach because it seemed to be the best of both worlds.

=============================================================================
michaelMuller = mmuller@enduden.com | http://www.cloud9.net/~proteus
-----------------------------------------------------------------------------
Those who do not understand Unix are condemned to reinvent it, poorly.
  -- Henry Spencer
=============================================================================

On 21 April 2000, Michael Muller said:
> I disagree. Distribution contains package meta-info, but it also contains a lot of information that is relevant to a source distribution: packages, modules, source files. PackageInfo includes a subset of that information (package name, version, author...) and it also includes the final set of installed files, which appears to me to be the product of the install commands, not of the Distribution.
Well, if you were watching python-checkins@python.org late last night, you'll notice you (partly) won that argument without even trying: I've separated the meta-data out into a DistributionMetadata class, because that was the best way to make Bastian's "meta-data display options" patch work. However, DistributionMetadata is *just* a place for things like the package name, version, author, etc. to live, and for methods to dole those out. Someday, that will include fancy meta-data like dependencies/requirements/compatible versions, but for now it's simple and basic; the only logic is stuff like `return self.name or "UNKNOWN"' in the 'get_name()' method.

But I still don't think this is the place for lists-of-files-built or lists-of-files-installed. As it happens, I have made some renovations to the code to accommodate this sort of thing; now, you get those lists by calling the 'get_outputs()' method on the build or install command objects. I think the list-of-files-installed really is the property of the class that does the installation, and I don't see a big need to keep a copy of that list with the "package meta-data" -- yes, this is information that should be installed with the meta-data, but it isn't really meta-data per se (IMHO).
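A rough sketch of the kind of class described here, assuming nothing beyond what the message says. Only 'get_name()' and its `return self.name or "UNKNOWN"' body come from the text above; the constructor signature and the other accessors are hypothetical stand-ins, and the real Distutils class may differ.

```python
class DistributionMetadata:
    """Simplified stand-in: a home for basic meta-data, nothing more."""

    def __init__(self, name=None, version=None, author=None):
        # Hypothetical constructor; each field may be left unset.
        self.name = name
        self.version = version
        self.author = author

    def get_name(self):
        # The one piece of logic quoted in the message.
        return self.name or "UNKNOWN"

    def get_version(self):
        # Assumed to follow the same fallback pattern as get_name().
        return self.version or "UNKNOWN"

    def get_author(self):
        # Assumed to follow the same fallback pattern as get_name().
        return self.author or "UNKNOWN"


if __name__ == "__main__":
    meta = DistributionMetadata(name="spam", version="1.0")
    print(meta.get_name(), meta.get_version(), meta.get_author())
```

Note that nothing about built or installed files lives on this object, which matches the split argued for above.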
> Furthermore, package information deserves to be separated out for purposes of modularity: if people want to create alternate forms of the module (based, perhaps, on RPM or DBM files), they should be able to plug their replacement right into the system as long as they conform to a very simple, specific interface. Likewise, programmers using alternate build/distribution technologies should be able to define package information without having to use distutils.
If people want to make RPMs, they will use the "bdist_rpm" command -- when it exists. ;-) A prototype is "bdist_dumb", which generates a zip or tarball built distribution; I'm blithely optimistic that extending that to generate an RPM won't be too hard. Anyways, the "bdist_*" commands will all depend heavily on the 'get_outputs()' method of the "install" command, as well as the meta-data info furnished by the Distribution object (on behalf of its DistributionMetadata object).
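To make the dependency on 'get_outputs()' concrete, here is a toy sketch of a "dumb" built distribution assembled purely from the list of installed files. `ToyInstall` and `make_dumb_archive` are hypothetical names invented for this example; they are not the Distutils "install" or "bdist_dumb" commands, and the only thing borrowed from the message is the idea that the install step reports its outputs through a 'get_outputs()' method.

```python
import os
import tarfile
import tempfile


class ToyInstall:
    """Hypothetical stand-in for an install command."""

    def __init__(self, install_dir):
        self.install_dir = install_dir
        self._outputs = []

    def run(self, files):
        # "Install" each (relative path -> text) entry and record the result.
        for rel_path, text in files.items():
            target = os.path.join(self.install_dir, rel_path)
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with open(target, "w") as f:
                f.write(text)
            self._outputs.append(target)

    def get_outputs(self):
        # The installer, not the meta-data, owns the list of installed files.
        return list(self._outputs)


def make_dumb_archive(install_cmd, archive_path):
    """Pack exactly the files the install step says it wrote."""
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in install_cmd.get_outputs():
            arcname = os.path.relpath(path, install_cmd.install_dir)
            tar.add(path, arcname=arcname)
    return archive_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = os.path.join(tmp, "root")
        os.makedirs(root)
        cmd = ToyInstall(root)
        cmd.run({"spam/__init__.py": "", "spam/eggs.py": "x = 1\n"})
        archive = make_dumb_archive(cmd, os.path.join(tmp, "spam.tar.gz"))
        with tarfile.open(archive) as tar:
            print(sorted(tar.getnames()))
```

An RPM-flavoured variant would presumably consume the same list and differ only in how it packages the files.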
> The problem is not so much the type checking: it's the persistence. In order to read and write the package information, we need to either have code to write and read each field individually, or have some sort of generalized way of writing and reading different kinds of content. The former approach is difficult to extend and maintain, particularly when you start dealing with complex nested structures.
Ahh, I thought it was something like that. I'm also starting to think your original pprint-and-execfile approach was better. Arghh! Maybe a syntax with fragments of Python code to define the complicated structures would be a good compromise. Hmmm... in any case, I think it's moving away from something simple enough for ConfigParser to be appropriate.

        Greg
--
Greg Ward - geek    gward@ase.com
http://starship.python.net/~gward/
I just forgot my whole philosophy of life!!!
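For readers unfamiliar with the pprint-and-execfile idea mentioned in this thread, a hedged sketch follows. The original patch is not shown here, so the function names and the dict layout are assumptions. `execfile` no longer exists in Python 3, so this sketch swaps in `ast.literal_eval`, which accepts only Python literals and is therefore not equivalent to executing arbitrary code.

```python
import ast
import os
import pprint
import tempfile


def write_package_info(info, path):
    # pprint produces readable Python literal syntax, nested structures included.
    with open(path, "w") as f:
        f.write(pprint.pformat(info) + "\n")


def read_package_info(path):
    # Stand-in for execfile: evaluate the file as a single Python literal.
    with open(path) as f:
        return ast.literal_eval(f.read())


if __name__ == "__main__":
    # Hypothetical package info; the field names are made up for the example.
    info = {
        "name": "spam",
        "version": (1, 0, 2),
        "description": "line one\nline two",
        "files": ["spam/__init__.py", "spam/eggs.py"],
    }
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "pkginfo.py")
        write_package_info(info, path)
        with open(path) as f:
            print(f.read())
        print(read_package_info(path) == info)
```

The appeal is that one generic writer and one generic reader cover strings with odd whitespace, tuples, and nested lists alike, at the cost of a file format that is Python syntax rather than ConfigParser syntax.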