installing data files and headers

Another problem we've been struggling with for Zope projects is that distutils really only installs Python modules and extensions. Its support for data files and C header files is pretty limited. We've got problems with each that could probably be solved in distutils.

We often store data files in a package directory. Zope components sometimes have configuration files, presentation files like HTML and images, and other data. One common case is unit test packages, which often have test data in them. In all these cases, we find it useful to access the data files by loading them from the package directory: you get the package's __path__ attribute and look for data files in that directory.

The problem is that distutils won't install these files for us, and it ends up being a lot of work to get them installed. You need to provide a custom distclass to copy the files at build and install time. It would be a lot more convenient if distutils would just install the files by default.

I think there are some potential problems with installing non-.py files. You need some control over what exactly gets built and installed, so that you don't install .py~ files. One possibility is to explicitly list the file extensions that constitute installable data. We did that for Zope3, but the list of extensions ended up being fairly long.

The other problem we have is with header files. We'd like to have .h files that are installed inside a directory in /usr/local/include. For example, we'd like source code that uses the persistence API to read like this:

    #include "persistence/persistence.h"
    #include "persistence/persistenceAPI.h"

I can't figure out any way to instruct distutils to create a persistence directory and put the headers in it. I think we'd need to extend the specification for header files to make this possible.

Jeremy
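For readers unfamiliar with the access pattern described above, here is a minimal sketch of loading a data file from a package directory through the package's __path__ attribute. The helper names package_data_path and read_package_data are hypothetical and not part of distutils or Zope; the sketch assumes the data file sits in the first directory listed on __path__.

```python
import os


def package_data_path(package, filename):
    """Return the path of `filename` inside the package's directory.

    Hypothetical helper: assumes the data file lives in the first
    directory listed on the package's __path__.
    """
    package_dir = list(package.__path__)[0]
    return os.path.join(package_dir, filename)


def read_package_data(package, filename):
    """Read a data file stored next to the package's modules."""
    with open(package_data_path(package, filename), "rb") as f:
        return f.read()
```

This only works if the installer actually copied the data file into the installed package directory, which is the gap the message describes.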

On Fri, Feb 28, 2003 at 10:31:00AM -0500, Jeremy Hylton wrote:
Attached to this mail is a Distutils extension I wrote to install Twisted tml files, but it works fine for installing any other file in a Python package. Feel free to use, modify and distribute it.

--
Alexandre Fayolle
LOGILAB, Paris (France).
http://www.logilab.com http://www.logilab.fr http://www.logilab.org
Advanced software development - Artificial Intelligence - Training

On Fri, Feb 28, 2003 at 10:31:00AM -0500, Jeremy Hylton wrote:
Good idea. We have a similar subclass for Quixote that installs *.ptl files, and it's a common need.
Well, what are the options?

1) List extensions.
2) Explicitly list pathnames for additional files.
3) A MANIFEST.in-like mini-language for specifying which files should be installed.
4) Automatically add things in package directories that aren't obviously irrelevant (*~, *.pyc, CVS, .svn).

Any other ideas?

4) probably offers too little control, and 3) might be too much, and adds yet another file to write. What if both 1) and 2) were supported, say, like this:

    setup(...
          package_files=['zope/app/config.xml', 'zope/app/dtd.xml'],
          package_patterns=['*.cfg'],
    )

So this adds all *.cfg files in any package directory, and the two XML files.

We could also allow arbitrary filenames in the 'py_modules' list, but then the very name 'py_modules' is misleading, so IMHO that's a bad idea.

One nit is that packages are identified as 'zope.app' in the 'packages' and 'package_dir' keywords. build_py will have to mess around with the package_files strings, but hopefully that won't be too difficult to get right.
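As a rough illustration of how the two proposed keywords could be resolved into a concrete file list, here is a sketch. package_files and package_patterns are the names proposed in this message, not existing distutils options, and collect_data_files is a hypothetical helper. It assumes each dotted package name maps directly to a directory under a source root, with no package_dir remapping.

```python
import fnmatch
import os


def collect_data_files(root, packages, package_files=(), package_patterns=()):
    """Return sorted '/'-separated paths (relative to root) of selected files.

    Hypothetical sketch of the proposal: explicit pathnames plus
    filename patterns applied to every package directory.
    """
    selected = set()

    # Option 2: explicit pathnames, taken relative to the source root.
    for path in package_files:
        if os.path.isfile(os.path.join(root, path)):
            selected.add(path)

    # Option 1: patterns matched against file names in each package dir.
    for package in packages:
        pkg_dir = package.replace(".", "/")  # 'zope.app' -> 'zope/app'
        full_dir = os.path.join(root, pkg_dir)
        for name in os.listdir(full_dir):
            if not os.path.isfile(os.path.join(full_dir, name)):
                continue
            if any(fnmatch.fnmatch(name, pat) for pat in package_patterns):
                selected.add(pkg_dir + "/" + name)

    return sorted(selected)
```

Because the pattern is matched against the whole file name, a backup file such as site.cfg~ is not picked up by '*.cfg'.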
The other problem we have is with header files. We'd like to have .h files that are installed inside a directory in /usr/local/include. For
Why isn't installing the headers in /usr/local/python2.2/persistence OK? Are these headers completely independent of Python (e.g. for a standalone C library)?

--amk (www.amk.ca)
I never think twice about anything; takes too much time.
-- Sanders, in "Kinda"

A.M. Kuchling wrote:
Ditto here for various documentation files, licenses etc. The problem seems to be that all of us have slightly different requirements here, so perhaps there simply isn't a one-size-fits-all implementation.
Why not merge package_files and package_patterns into one list? Then use glob.glob() to work this into a list of single filenames. The downside of this proposal is that you'll have to add MANIFEST.in rules for these files as well...
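The merged-list idea can be sketched as follows. expand_package_files is a hypothetical helper, and the sketch assumes every entry is either a plain path or a glob pattern relative to the source root. glob.glob() returns a plain path unchanged when it exists, so both kinds of entry go through the same call.

```python
import glob
import os


def expand_package_files(root, entries):
    """Expand a mixed list of paths and glob patterns into file paths.

    Hypothetical sketch: entries are relative to `root`; the result is
    a sorted list of '/'-separated relative paths of regular files.
    """
    found = set()
    for entry in entries:
        for match in glob.glob(os.path.join(root, entry)):
            if os.path.isfile(match):
                rel = os.path.relpath(match, root)
                found.add(rel.replace(os.sep, "/"))
    return sorted(found)
```

A pattern such as 'package/*.cfg' matches only files directly inside package/, because '*' in glob.glob() does not cross directory separators.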
FWIW, I usually install the header files right alongside the package's .so files. This makes it very easy for others to find the locations of the installed headers and avoids any name clashes.

--
Marc-Andre Lemburg
eGenix.com Professional Python Software directly from the Source (#1, Mar 01 2003)
Python/Zope Products & Consulting ... http://www.egenix.com/ mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
Python UK 2003, Oxford: 31 days left EuroPython 2003, Charleroi, Belgium: 115 days left

On Sat, Mar 01, 2003 at 11:13:19AM +0100, M.-A. Lemburg wrote:
There's a potential problem here if I want to include the *.cfg files in the package/ directory but not in package/example/. Maybe that doesn't matter.
The downside with this proposal is that you'll have to add MANIFEST.in rules for these files as well...
The sdist.add_defaults() method automatically includes README{.txt}, setup.py, and any referenced *.py files in the manifest; clearly if we add additional patterns, add_defaults() should automatically include matching files in the manifest. --amk (www.amk.ca) ENOBARBUS: Age cannot wither her, nor custom stale her infinite variety. -- _Antony and Cleopatra_, II, ii

A.M. Kuchling wrote:
Hmm, globbing 'package/*.cfg' should only include files in the package dir.
I was referring to any files that a package author adds to the package via package_files (beyond the default ones like README); he would have to add the same globbing patterns to MANIFEST.in in order to have them included in the sdist. -- Marc-Andre Lemburg eGenix.com Professional Python Software directly from the Source (#1, Mar 03 2003)
Python/Zope Products & Consulting ... http://www.egenix.com/ mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
Python UK 2003, Oxford: 29 days left EuroPython 2003, Charleroi, Belgium: 113 days left

"A.M. Kuchling" wrote:
In my opinion the third option is the best because it can be done without much effort. Look at PyXML or PyOpenGL for an example that uses a script I wrote some time ago.

http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/pyxml/xml/setupext/install_data.py?rev=1.3&content-type=text/vnd.viewcvs-markup
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/pyopengl/PyOpenGL2/setup.py?rev=1.54&content-type=text/vnd.viewcvs-markup

And its use in PyOpenGL setup.py:

    ...
    # Overridden command classes
    cmdclass = {
        ...
        # the next line is very important
        # because we use another format for data_files
        'install_data': my_install_data},
    data_files = [Data_Files(
        base_dir='install_lib',
        copy_to = 'OpenGL',
        strip_dirs = 1,
        template=[
            # take the whole tree
            'graft OpenGL',
            'global-exclude *.py',
            'global-exclude Cvs/*',
            'global-exclude CVS/*',
        ],
        preserve_path=1
    )],
    ...

It replaces the install_data command, still accepts old parameter lists, supports MANIFEST-like specification of files and directories, and allows paths relative to 'install_lib', ...

So this might be a good starting point to replace the distutils install_data command.

Kind regards
Rene Liebscher
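To make the template idea concrete, here is a small sketch that interprets just the two commands used in the example above, 'graft' and 'global-exclude'. It is a simplified approximation written for illustration, not the Data_Files implementation from the linked script and not the distutils manifest-template code. apply_template and _matches_tail are hypothetical names, and the exclusion rule (match the pattern against the trailing path components) is an assumption about the intended semantics.

```python
import fnmatch
import os


def _matches_tail(path, pattern):
    """True if the last components of `path` match `pattern`."""
    depth = pattern.count("/") + 1
    tail = "/".join(path.split("/")[-depth:])
    return fnmatch.fnmatchcase(tail, pattern)


def apply_template(root, template):
    """Apply 'graft DIR' and 'global-exclude PATTERN' lines in order.

    Returns a sorted list of '/'-separated paths relative to `root`.
    """
    files = set()
    for line in template:
        action, arg = line.split(None, 1)
        if action == "graft":
            # Take every file under the named directory tree.
            for dirpath, _dirnames, filenames in os.walk(os.path.join(root, arg)):
                for name in filenames:
                    rel = os.path.relpath(os.path.join(dirpath, name), root)
                    files.add(rel.replace(os.sep, "/"))
        elif action == "global-exclude":
            # Drop files anywhere in the tree whose tail matches.
            files = {f for f in files if not _matches_tail(f, arg)}
        else:
            raise ValueError("unsupported template command: %r" % action)
    return sorted(files)
```

With a template like the one in the message, everything under OpenGL is taken first, then Python sources and CVS bookkeeping files are filtered out, leaving only the data files.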

On Fri, 2003-02-28 at 18:14, A.M. Kuchling wrote:
That it is. I've also had need for this (Twisted applications may want to distribute plug-in or other data/template files alongside modules), and it's a source of great frustration to me that while the topic has come up time and time again on this list, no common solution has taken hold in distutils. Does this lack of consensus indicate that packagers' needs are really so unique that they each need to implement their own tailored solution?

--
The moon is waning gibbous, 95.1% illuminated, 16.9 days old.

participants (6)
- A.M. Kuchling
- Alexandre
- Jeremy Hylton
- Kevin Turner
- M.-A. Lemburg
- René Liebscher