
On 28 December 1999, Michael Muller said:
I've enclosed a patch which addresses a particular concern of mine: package meta-information. At the end of the install command, it creates a package information file in <install_py.install_dir>/_pkginfo named after the package (it also creates the "_pkginfo" directory, if necessary). The file contains python variable definitions for the package name, version number, list of files installed, dependencies, and compatible versions (although the latter two are always empty at this time).
Hmmm... interesting idea. I mean, we've known all along that some sort of "package metainfo database" (metadatabase? ugh) is going to be needed for exactly the reasons you listed (uninstall, dependency analysis, and system cataloging). I have not spent a lot of time thinking about it, but I think I was stuck in a "database must be one big file" rut, with all the attendant problems of performance, concurrent access, etc. About as far as I got was thinking "text files suck for size and performance, and DB or dbm files might not be portable enough".

But I think I like your approach -- at least part of it. Specifically, I think I like the notion of spreading the "metainfo database" across many files in many directories. To find information about all module distributions installed, you troll sys.path, looking for a "_pkginfo" subdirectory in each entry, and then look at the files installed there. At least, that's the understanding I get from reading your message and a cursory scan of the patch -- am I right?

This pretty much solves the practical side of "what to do about concurrent access" -- in practice, it's not going to happen much, so don't get too worried about it. It doesn't sound very good for performance, unless all you want is a list of packages installed -- that should be pretty fast (you can get everything you need from a succession of os.listdir() calls).

What I'm a little leery about is using Python code as a data format. It's attractive because we all know the syntax and don't have to write a parser. But using a general-purpose language for *such* a specific, tightly-targeted task seems ... I dunno ... overkill-ish. And I wonder if there are security holes lurking in the concept of using code for system catalog data. Does anyone else share my reservations (which are vague, ill-defined, and more superstitious than anything else)? Conversely, does anyone think that Python code is absolutely the right way to store module distribution metadata?
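To make the sys.path scan and the security worry a bit more concrete, here is a rough sketch, not the patch's actual code. It assumes a present-day Python with the standard ast module, assumes each info file is named exactly after its distribution, and assumes the file holds only simple "name = literal" assignments; the function names list_installed and read_pkginfo are hypothetical. The reader parses the file and evaluates only literals, so nothing in the info file is ever executed.

```python
import ast
import os
import sys

PKGINFO_DIR = "_pkginfo"  # directory name taken from the patch description


def list_installed(search_path=None):
    """Map each distribution name to its info file (hypothetical helper).

    Only os.path.isdir() and os.listdir() are needed, so this stays cheap
    when all you want is the list of installed distributions.
    """
    if search_path is None:
        search_path = sys.path
    found = {}
    for entry in search_path:
        # An empty sys.path entry means the current directory.
        info_dir = os.path.join(entry or os.curdir, PKGINFO_DIR)
        if not os.path.isdir(info_dir):
            continue
        for name in sorted(os.listdir(info_dir)):
            # Assumption: the first match along the path wins.
            found.setdefault(name, os.path.join(info_dir, name))
    return found


def read_pkginfo(path):
    """Read 'name = literal' assignments without executing the file.

    Anything other than a simple assignment of a literal value
    (string, number, list, tuple, dict, ...) raises ValueError.
    """
    with open(path) as f:
        tree = ast.parse(f.read(), filename=path)
    info = {}
    for node in tree.body:
        if (isinstance(node, ast.Assign)
                and len(node.targets) == 1
                and isinstance(node.targets[0], ast.Name)):
            # literal_eval rejects function calls, attribute access, etc.
            info[node.targets[0].id] = ast.literal_eval(node.value)
        else:
            raise ValueError("unexpected statement in %s" % path)
    return info
```

Calling list_installed() with no arguments scans sys.path; passing an explicit list of directories makes it easy to try out against a scratch directory. Whether a restricted reader like this is enough to settle the "code as data" question is exactly the sort of thing the SIG would need to decide.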
Thanks again for the patch -- I think it should find its way into Distutils 0.2 after the SIG has thrashed through some of the issues it raises.

        Greg
--
Greg Ward - software developer                    gward@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive                           voice: +1-703-620-8990
Reston, Virginia, USA  20191-5434                    fax: +1-703-620-0913