Where to install non-code files

Another one for the combined distutils/python-dev braintrust; apologies to those of you on both lists, but this is yet another distutils issue that treads on python-dev territory.
The problem is this: some module distributions need to install files other than code (modules, extensions, and scripts). One example close to home is the Distutils; it has a "system config file" and will soon have a stub executable for creating Windows installers.
On Windows and Mac OS, clearly these should go somewhere under sys.prefix: this is the directory for all things Python, including third-party module distributions. If Brian Hooper distributes a module "foo" that requires a data file containing character encoding data (yes, this is based on a true story), then the module belongs in (eg.) C:\Python and the data file in (?) C:\Python\Data. (Maybe C:\Python\Data\foo, but that's a minor wrinkle.)
Any disagreement so far?
Anyways, what's bugging me is where to put these files on Unix. <prefix>/lib/python1.x is *almost* the home for all things Python, but not quite. (Let's ignore platform-specific files for now: they don't count as "miscellaneous data files", which is what I'm mainly concerned with.)
Currently, misc. data files are put in <prefix>/share, and the Distutils' config file is searched for in the directory of the distutils package -- ie. site-packages/distutils under 1.5.2 (or ~/lib/python/distutils if that's where you installed it, or ./distutils if you're running from the source directory, etc.). I'm not thrilled with either of these.
My inclination is to nominate a directory under <prefix>/lib/python1.x for these sorts of files: not sure if I want to call it "etc" or "share" or "data" or what, but it would be treading in Python-space. It would break the ability to have a standard library package called "etc" or "share" or "data" or whatever, but dammit it's convenient.
Better ideas?
Greg -- Greg Ward - "always the quiet one" gward@python.net http://starship.python.net/~gward/ I have many CHARTS and DIAGRAMS..

On Windows and Mac OS, clearly these should go somewhere under sys.prefix: this is the directory for all things Python, including third-party module distributions. If Brian Hooper distributes a module "foo" that requires a data file containing character encoding data (yes, this is based on a true story), then the module belongs in (eg.) C:\Python and the data file in (?) C:\Python\Data. (Maybe C:\Python\Data\foo, but that's a minor wrinkle.)
Any disagreement so far?
A little. I don't think we need a new dump for arbitrary files that no one can associate with their application. Why not put the data with the code? It is quite trivial for a Python package or module to find its own location, and this way we are not dependent on anything.
Why assume packages are installed _under_ Python? Why not just assume the package is _reachable_ by Python? Once our package/module is being executed by Python, we know exactly where we are.
On my machine, there is no "data" equivalent; the closest would be "python-cvs\pcbuild\data", and that certainly doesn't make sense. Why can't I just place it where I put all my other Python extensions, ensure it is on the PythonPath, and have it "just work"?
It sounds a little complicated - do we provide an API for this magic location, or does everybody cut-and-paste a reference implementation for locating it? Either way sounds pretty bad - the API shouldn't be distutils dependent (I may not have installed this package via distutils), and really Python itself shouldn't care about this...
So all in all, I don't think it is a problem we need to push up to this level - let each package author do whatever makes sense, and point out how trivial it would be if you assumed code and data in the same place/tree.
[If the data is considered read/write, then you need a better answer anyway, as you can't assume "c:\python\data" is writable (when actually running the code) any more than "c:\python\my_package" is]
Mark.
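A minimal sketch of the "find its own location" idea Mark describes, assuming the data file is installed in the same directory as the module that needs it. The helper name data_file_path and the file name "encodings.dat" are made up for illustration; nothing like this is part of the Distutils.

```python
import os

def data_file_path(module_file, filename):
    """Return the path of `filename` in the directory that holds `module_file`."""
    here = os.path.dirname(os.path.abspath(module_file))
    return os.path.join(here, filename)

# Inside a module this would typically be called with the module's own
# __file__ attribute (hypothetical data file name):
#     path = data_file_path(__file__, "encodings.dat")
```

Because the lookup depends only on where the module was loaded from, it works the same whether the package lives under sys.prefix or somewhere else on the path.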

Greg Ward wrote: [installing data files]
On Windows and Mac OS, clearly these should go somewhere under sys.prefix: this is the directory for all things Python, including third-party module distributions. If Brian Hooper distributes a module "foo" that requires a data file containing character encoding data (yes, this is based on a true story), then the module belongs in (eg.) C:\Python and the data file in (?) C:\Python\Data. (Maybe C:\Python\Data\foo, but that's a minor wrinkle.)
Any disagreement so far?
Yeah. I tend to install stuff outside the sys.prefix tree and then use .pth files. I realize I'm, um, unique in this regard but I lost everything in some upgrade gone bad. (When a Windows de-install goes wrong, your only option is to do some manual directory and registry pruning.) I often do much the same on my Linux box, but I don't worry about it as much - upgrading is not "click and pray" there. (Hmm, I guess it is if you use rpms.)
So for Windows, I agree with Mark - put the data with the module. On a real OS, I guess I'd be inclined to put global data with the module, but user data in ~/.<something>.
Greg Ward - "always the quiet one" <snort>
- Gordon
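For readers unfamiliar with the .pth mechanism Gordon mentions, here is a small sketch using the standard site module. It assumes a throwaway directory stands in for a real site directory; the names register_with_pth, "myapp.pth" and "myapp_lib" are invented for the example.

```python
import os
import site
import sys
import tempfile

def register_with_pth(site_dir, pth_name, lib_dir):
    """Write a .pth file naming `lib_dir`, then have `site` process `site_dir`."""
    with open(os.path.join(site_dir, pth_name), "w") as f:
        f.write(lib_dir + "\n")
    # addsitedir() adds site_dir to sys.path and reads every .pth file in it;
    # each line naming an existing directory is appended to sys.path.
    site.addsitedir(site_dir)

# Demonstration in a temporary directory (all names here are made up).
root = os.path.realpath(tempfile.mkdtemp())
lib_dir = os.path.join(root, "myapp_lib")
os.mkdir(lib_dir)
register_with_pth(root, "myapp.pth", lib_dir)
print(lib_dir in sys.path)
```

In a real installation the .pth file would sit in a directory Python already scans at startup, so no explicit addsitedir() call would be needed.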

So for Windows, I agree with Mark - put the data with the module. On a real OS, I guess I'd be inclined to put global data with the module, but user data in ~/.<something>.
Aha! Good distinction. Modifiable data needs to go in a per-user directory, even on Windows, outside the Python tree. But static data needs to go in the same directory as the module that uses it. (We use this in the standard test package, for example.)
--Guido van Rossum (home page: http://www.python.org/~guido/)

[Guido writes]
Modifiable data needs to go in a per-user directory, even on Windows, outside the Python tree.
This seems to be the value of the key "AppData" stored under HKCU\Software\Microsoft\Windows\CurrentVersion\Explorer\Shell Folders. Right?
Thomas
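A hedged sketch of the per-user split discussed above. It assumes the APPDATA environment variable carries the same location as the "AppData" registry value Thomas mentions, and it falls back to the home directory when that variable is absent. The function name user_data_dir and its parameters are hypothetical.

```python
import os
import sys

def user_data_dir(app_name, platform=sys.platform, environ=os.environ):
    """Return a per-user directory for an application's modifiable data."""
    if platform.startswith("win"):
        # Assumption: APPDATA mirrors the "AppData" shell folder value.
        base = environ.get("APPDATA") or os.path.expanduser("~")
        return os.path.join(base, app_name)
    # Unix convention from the thread: ~/.<something>
    return os.path.join(os.path.expanduser("~"), "." + app_name)

print(user_data_dir("foo"))
```

The platform and environ parameters exist only so the behaviour can be exercised on any machine; real code would normally rely on the defaults.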

On 26 May 2000, Guido van Rossum said:
Modifiable data needs to go in a per-user directory, even on Windows, outside the Python tree.
But static data needs to go in the same directory as the module that uses it. (We use this in the standard test package, for example.)
What about the Distutils system config file (pydistutils.cfg)? This is something that should only be modified by the sysadmin, and sets the site-wide policy for building and installing Python modules. Does this belong in the code directory? (I hope so, because that's where it goes now...)
(Under Unix, users can have a personal Distutils config file that overrides the system config (~/.pydistutils.cfg), and every module distribution can have a setup.cfg that overrides both of them. On Windows and Mac OS, there are only two config files: system and per-distribution.)
Greg -- Greg Ward - software developer gward@mems-exchange.org MEMS Exchange / CNRI voice: +1-703-262-5376 Reston, Virginia, USA fax: +1-703-262-5367
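The override order Greg describes (system file, then per-user file, then setup.cfg) can be illustrated with the standard configparser module, which lets later files replace values from earlier ones. This is only a sketch of the layering idea, not the Distutils' own parsing code; the section and option names are illustrative, and temporary files stand in for the real locations.

```python
import configparser
import os
import tempfile

def load_layered_config(paths):
    """Read config files in order; later files override earlier ones."""
    parser = configparser.ConfigParser()
    parser.read(paths)  # files that do not exist are silently skipped
    return parser

# Demonstration with throwaway files standing in for the real locations.
tmp = tempfile.mkdtemp()
system_cfg = os.path.join(tmp, "pydistutils.cfg")   # site-wide policy
user_cfg = os.path.join(tmp, ".pydistutils.cfg")    # per-user, not created here
setup_cfg = os.path.join(tmp, "setup.cfg")          # per-distribution

with open(system_cfg, "w") as f:
    f.write("[install]\nprefix = /usr/local\nverbose = 1\n")
with open(setup_cfg, "w") as f:
    f.write("[install]\nprefix = /opt/foo\n")

config = load_layered_config([system_cfg, user_cfg, setup_cfg])
print(config.get("install", "prefix"))   # value from setup.cfg
print(config.get("install", "verbose"))  # inherited from the system file
```

On a platform without a per-user file, the same call simply skips the missing path, which matches the two-file situation described for Windows and Mac OS.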

On 26 May 2000, Gordon McMillan said:
Yeah. I tend to install stuff outside the sys.prefix tree and then use .pth files. I realize I'm, um, unique in this regard but I lost everything in some upgrade gone bad. (When a Windows de-install goes wrong, your only option is to do some manual directory and registry pruning.)
I think that's appropriate for Python "applications" -- in fact, now that Distutils can install scripts and miscellaneous data, about the only thing needed to properly support "applications" is an easy way for developers to say, "Please give me my own directory and create a .pth file".
(Actually, the .pth file should only be one way to install an application: you might not want your app's Python library to muck up everybody else's Python path. An idea AMK and I cooked up yesterday would be an addition to the Distutils "build_scripts" command: along with frobbing the #! line to point to the right Python interpreter, add a second line:
import sys ; sys.append(path-to-this-app's-python-lib)
Or maybe "sys.insert(0, ...)".
Anyways, that's neither here nor there. Except that applications that get their own directory should be free to put their (static) data files wherever they please, rather than having to put them in the app's Python library.
I'm more concerned with what the Distutils works best with now, though: module distributions. I think you guys have convinced me; static data should normally sit with the code. I think I'll make that the default (instead of prefix + "share"), but give developers a way to override it. So eg.:
data_files = ["this.dat", "that.cfg"]
will put the files in the same place as the code (which could be a bit tricky to figure out, what with the vagaries of package-ization and "extra" install dirs);
data_files = [("share", ["this.dat"]), ("etc", ["that.cfg"])]
would put the data file in (eg.) /usr/local/share and the config file in /usr/local/etc. This obviously makes the module writer's job harder: he has to grovel from sys.prefix looking for the files that he expects to have been installed with his modules. But if someone really wants to do this, they should be allowed to.
Finally, you could also put absolute directories in 'data_files', although this would not be recommended.
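To make the proposed 'data_files' semantics concrete, here is a small sketch that maps each entry to the directory it would land in under the rules stated above: bare file names go with the code, (directory, files) pairs go under the prefix, and absolute directories are used as given. The function resolve_data_files and its code_dir and prefix parameters are hypothetical; this is an illustration of the proposal, not how the Distutils actually implements it.

```python
import os

def resolve_data_files(data_files, code_dir, prefix):
    """Map each data file to its install directory under the proposed scheme."""
    targets = []
    for entry in data_files:
        if isinstance(entry, str):
            # Bare file name: install next to the code.
            targets.append((entry, code_dir))
        else:
            subdir, files = entry
            # os.path.join discards `prefix` when `subdir` is absolute.
            dest = os.path.join(prefix, subdir)
            for name in files:
                targets.append((name, dest))
    return targets

spec = ["this.dat", ("etc", ["that.cfg"])]
for name, dest in resolve_data_files(spec, "/usr/local/lib/foo", "/usr/local"):
    print(name, "->", dest)
```

The code directory is passed in rather than computed, since working it out is exactly the tricky part Greg mentions.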
(Hmm, I guess it is if you use rpms.)
All the smart Unix installers (RPM, Debian, FreeBSD, ...?) I know of have some sort of dependency mechanism, which works to varying degrees of "work". I'm only familiar with RPM, and my usual response to a dependency warning is "dammit, I know what I'm doing", and then I rerun "rpm --nodeps" to ignore the dependency checking. (This usually arises because I build my own Perl and Python, and don't use Red Hat's -- I just make /usr/bin/{perl,python} symlinks to /usr/local/bin, which RPM tends to whine about.) But it's nice to know that someone is watching. ;-)
Greg -- Greg Ward - software developer gward@mems-exchange.org MEMS Exchange / CNRI voice: +1-703-262-5376 Reston, Virginia, USA fax: +1-703-262-5367

Greg Ward wrote:
On 26 May 2000, Gordon McMillan said:
Yeah. I tend to install stuff outside the sys.prefix tree and then use .pth files. I realize I'm, um, unique in this regard but I lost everything in some upgrade gone bad. (When a Windows de-install goes wrong, your only option is to do some manual directory and registry pruning.)
I think that's appropriate for Python "applications" -- in fact, now that Distutils can install scripts and miscellaneous data, about the only thing needed to properly support "applications" is an easy way for developers to say, "Please give me my own directory and create a .pth file".
Hmm. I see an application as a module distribution that happens to have a script. (Or maybe I see a module distribution as a scriptless app ;-)). At any rate, I don't see the need to dignify <prefix>/share and friends with an official position.
(Actually, the .pth file should only be one way to install an application: you might not want your app's Python library to muck up everybody else's Python path. An idea AMK and I cooked up yesterday would be an addition to the Distutils "build_scripts" command: along with frobbing the #! line to point to the right Python interpreter, add a second line: import sys ; sys.append(path-to-this-app's-python-lib)
Or maybe "sys.insert(0, ...)".
$PYTHONSTARTUP ?? Never really had to deal with this. On my RH box, /usr/bin/python is my build. At a client site which had 1.4 installed, I built 1.5 into $HOME/bin with a hacked getpath.c.
I'm more concerned with what the Distutils works best with now, though: module distributions. I think you guys have convinced me; static data should normally sit with the code. I think I'll make that the default (instead of prefix + "share"), but give developers a way to override it. So eg.:
data_files = ["this.dat", "that.cfg"]
will put the files in the same place as the code (which could be a bit tricky to figure out, what with the vagaries of package-ization and "extra" install dirs);
That's an artifact of your code ;-). If you figured it out once, you stand at least a 50% chance of getting the same answer a second time <.5 wink>.
- Gordon

On 26 May 2000, Gordon McMillan said:
Hmm. I see an application as a module distribution that happens to have a script. (Or maybe I see a module distribution as a scriptless app ;-)).
But end-users who just want to run a Python application see it as an application; the fact that it's (largely) written in Python and includes a Python library of its own is immaterial. However, all files from that application should most likely be installed in the same place -- /usr/local/myapp or "C:\Program Files\MyApp", pick your poison.
And the module developer should have the option of just dumping his stuff in /usr/local/lib/python1.x/site-packages and /usr/local/bin, or C:\Python, or whatever is the locally appropriate place to dump Python modules. Currently, of course, that's the *only* option that the Distutils easily supports. (Although I suspect that with the current code, one could craft a setup.cfg that forces the "install" command to put files wherever you damn well please. That's icky, though -- it won't deal with setting the right sys.path for your application.)
[me]
An idea AMK and I cooked up yesterday would be an addition to the Distutils "build_scripts" command: along with frobbing the #! line to point to the right Python interpreter, add a second line: import sys ; sys.append(path-to-this-app's-python-lib)
Or maybe "sys.insert(0, ...)".
Oops, meant "sys.path..." there of course.
[Gordon]
$PYTHONSTARTUP ??
The idea is to set sys.path for *just this application* rather than for all Python code installed on the system. This is kind of important if you have an application that defines a module "DateTime" and another, independent module called "DateTime". /usr/local/myapp/lib/python should come first in sys.path when you run myapp, and it should never be seen when you run anything else.
The possibilities are:
* create shell script wrappers for every Python script: eg. script1 might be
  PYTHONPATH=/usr/local/myapp/lib/python \
  exec /usr/local/bin/python1.6 /usr/local/myapp/_script1.py
* adjust the scripts at build time to set the right sys.path
Nasty as it is, I incline towards the latter: if nothing else, it's more portable!
But this is all speculative. The immediate aim of the Distutils is *not* full-blown Python applications, but I am starting to see how we could support such applications. A little idle speculation never hurt anyone.
That's an artifact of your code ;-). If you figured it out once, you stand at least a 50% chance of getting the same answer a second time <.5 wink>.
What!?! Are you accusing me of writing complex code? Well... ok... maybe a little bit... but only in install.py, really... the rest of it's quite straightforward!
Greg -- Greg Ward - nerd gward@python.net http://starship.python.net/~gward/ I have many CHARTS and DIAGRAMS..

On Fri, 26 May 2000, Greg Ward wrote:
[Gordon]
$PYTHONSTARTUP ??
The idea is to set sys.path for *just this application* rather than for all Python code installed on the system. This is kind of important if
It's useful to remember that $PYTHONSTARTUP only affects interactive interpreters, not scripts/applications. Using this for anything related to the installed base is pretty bogus. It would also be unreliable since users won't cooperate. ;)
-Fred -- Fred L. Drake, Jr. <fdrake at acm.org>

On 26 May 2000, Fred L. Drake said:
It's useful to remember that $PYTHONSTARTUP only affects interactive interpreters, not scripts/applications. Using this for anything related to the installed base is pretty bogus. It would also be unreliable since users won't cooperate. ;)
Thanks for the reminder. The latter also applies to PYTHONPATH: if you expect an environment variable to be set, you had bloody well better supply a shell script that sets it appropriately. (And forget about portability to Mac OS or Windows -- or at least, forget about being used on Windows by anybody except Unix refugees.)
I have yet to hear howls of revulsion at my loopy idea of sticking this:
import sys; sys.path.insert(0, application-python-path)
into scripts that are installed as part of a "Python application" (ie. a module distribution whose main interface is a script or scripts, and that gets its own directory rather than dangling off Python's library directory). Could it be that people actually think this is a *good* idea? ;-)
Greg -- Greg Ward - maladjusted nerd gward@python.net http://starship.python.net/~gward/ Hold the MAYO & pass the COSMIC AWARENESS ...
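The script-adjusting idea floated in this thread could look roughly like the sketch below: rewrite the #! line and add a sys.path line directly after it. The function adjust_script is hypothetical and is not the actual "build_scripts" code; the interpreter and library paths are the example values used earlier in the discussion.

```python
def adjust_script(source, interpreter, app_lib):
    """Return `source` with its #! line rewritten and a sys.path line added."""
    lines = source.splitlines(keepends=True)
    path_line = "import sys; sys.path.insert(0, %r)\n" % app_lib
    if lines and lines[0].startswith("#!"):
        # Point the script at the chosen interpreter, then extend sys.path.
        lines[0] = "#!" + interpreter + "\n"
        lines.insert(1, path_line)
    else:
        # No #! line to rewrite: only add the sys.path line at the top.
        lines.insert(0, path_line)
    return "".join(lines)

original = "#!/usr/bin/env python\nprint('hi')\n"
print(adjust_script(original, "/usr/local/bin/python1.6",
                    "/usr/local/myapp/lib/python"))
```

A real implementation would need more care than this: a line inserted at the top pushes down anything that must come first in the file, such as a module docstring or an encoding declaration.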
participants (7): Fred L. Drake, Gordon McMillan, Greg Ward, Greg Ward, Guido van Rossum, Mark Hammond, Thomas Heller