Pondering multi-package packages
I apologize if this is documented somewhere. If it is, I haven't found it. I've done lots of Python modules, but not any packages, so if I'm just package ignorant feel free to slap me.....

It all started in the early Zope days when I loaded Zope only to discover that all my scripts using Marc Lemburg's mxDateTime bombed because Zope's DateTime produced a namespace collision. The solution then was to move Zope, since I didn't 'need' it at the time. Now I do. So, I contacted Marc who, of course, knew about the collision and has already started looking at packaging all his stuff under an umbrella "mx" package.

When I asked about Distutils, he hadn't really looked at it so I volunteered to see what it would take. Since Distutils includes a sample mxDateTime setup.py, it seemed fairly easy. I hacked it up to push it down a level (see attempt below). No problem except, of course, that it doesn't import because there's no mx.__init__.py. Ok, so I add an mx.__init__.py with __all__ defined and appropriate imports. Everything's happy and Distutils does fine.

Now, I add in mx.TextTools and things get murkier. How does mx.__init__.py determine what __all__ should be and what to import? How do I tell Distutils about the other packages? I tried a single all-purpose Distutils setup.py that searches the mx package directory and adds in the sub-packages that it finds. So far so good.

Now, what about mx.__init__.py? Should it:

1) Search its package directory at runtime to determine what __all__ should be and what subpackages to import? If it does that, then mx.__init__.py will exist in all possible mx packages. Distutils probably won't care and will happily copy the latest __init__.py into place (since they "should" all be the same, that's not a problem). BUT -- what if I make packages with --bdist-WHATEVER? Package manager WHATEVER is going to not be pleased with multiple packages that all provide mx.__init__.py.
Some type of package forcing is going to have to occur for all except the first subpackage installed.

- OR -

2) Be created dynamically at "install" time. The all-purpose setup.py can scan when Distutils runs and programmatically create an __init__.py at installation time that meets the needs of the currently installed subpackages. BUT -- that doesn't help when I use --bdist-WHATEVER again, because the resulting package is going to be installed on a system that probably does not have the same set of subpackages installed. I'll have to provide a postinstall script to WHATEVER to create __init__.py on the install host. Do-able, but now I'm manually tacking stuff on to the package that --bdist-WHATEVER can't (unless I missed it) handle. And, mx.__init__.py is not registered to the package, somewhat defeating the purpose.

It looks like --bdist-WHATEVER needs information about pre- and post-installation processing. This would have to be included in setup.py somewhere. But what should it be? Package managers (that support pre/post install) expect shell scripts. It would be really nice if --bdist took a Python function and wrapped it for us in a shell wrapper, e.g.

    #!/bin/sh
    python <<EOF
    def postinstall():
        ...
    postinstall()
    EOF

So, am I missing something or are these issues real?

FWIW, I've included the directory scanning setup.py below. If you try it, it "almost" works for mxDateTime, mxTextTools, and mxNewBuiltins. There are still some issues with mx sub-modules that I'm ignorant of, but they're not related to this discussion.

Mark
mwa@gate.net

############## mx.DateTime setup.py ################
#!/usr/bin/env python
"""
setup.py for Marc-André Lemburg's mx Extension modules

Will scan the 'mx' directory for Marc's packages (just DateTime,
TextTools, and NewBuiltins so far....) and provide the Distutils
info for all found packages.
"""

# created 1999/09/19, Greg Ward
# pushed to mx package by Mark Alexander

__revision__ = "$Id: mxdatetime_setup.py,v 1.4 2000/03/02 01:49:46 gward Exp $"

import os
from distutils.core import setup

DateTimePackages=['mx.DateTime', 'mx.DateTime.Examples', 'mx.DateTime.mxDateTime']
DateTimeExtensions=[('mx.DateTime.mxDateTime.mxDateTime',
                     { 'sources': ['mx/DateTime/mxDateTime/mxDateTime.c'],
                       'include_dirs': ['mxDateTime'],
                       'macros': [('HAVE_STRFTIME', None),
                                  ('HAVE_STRPTIME', None),
                                  ('HAVE_TIMEGM', None)], }
                    )]

TextToolsPackages=['mx.TextTools','mx.TextTools','mx.TextTools.Constants','mx.TextTools.Examples']
TextToolsExtensions=[('mx.TextTools.mxTextTools.mxTextTools',
                      { 'sources': ['mx/TextTools/mxTextTools/mxTextTools.c'],
                        'include_dirs': ['mxTextTools'],
                        'macros': [('HAVE_STRFTIME', None),
                                   ('HAVE_STRPTIME', None),
                                   ('HAVE_TIMEGM', None)], }
                     )]

NewBuiltinsPackages=['mx.NewBuiltins','mx.NewBuiltins.Examples','mx.NewBuiltins.mxTools']
NewBuiltinsExtensions=[('mx.NewBuiltins.mxTools.mxTools',
                        { 'sources': ['mx/NewBuiltins/mxTools/mxTools.c'],
                          'include_dirs': ['NewBuiltins'],
                          'macros': [('HAVE_STRFTIME', None),
                                     ('HAVE_STRPTIME', None),
                                     ('HAVE_TIMEGM', None)], }
                       ),
                       ('mx.NewBuiltins.mxTools.xmap',
                        { 'sources': ['mx/NewBuiltins/mxTools/xmap.c'],
                          'include_dirs': ['NewBuiltins'],
                          'macros': [('HAVE_STRFTIME', None),
                                     ('HAVE_STRPTIME', None),
                                     ('HAVE_TIMEGM', None)], }
                       )]

# Start with the umbrella package, then add whatever subpackages are present.
mxPackages=['mx']
mxExtensions=[]
if os.path.isdir('mx/DateTime'):
    mxPackages=mxPackages+DateTimePackages
    mxExtensions=mxExtensions+DateTimeExtensions
if os.path.isdir('mx/TextTools'):
    mxPackages=mxPackages+TextToolsPackages
    mxExtensions=mxExtensions+TextToolsExtensions
if os.path.isdir('mx/NewBuiltins'):
    mxPackages=mxPackages+NewBuiltinsPackages
    mxExtensions=mxExtensions+NewBuiltinsExtensions

setup (name = "mxDateTime",
       version = "1.3.0",
       description = "",
       author = "Marc-André Lemburg",
       author_email = "mal@lemburg.com",
       url = "http://starship.python.net/~lemburg/mxDateTime.html",
       packages = mxPackages,

       # XXX user might have to edit the macro definitions here: yuck!
       # Probably do need to support 'Setup' file or something similar.
       ext_modules = mxExtensions,
      )
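The directory scan described in this message can also be written generically, so that the package list is derived from the tree instead of from one hard-coded block per subpackage. The following is only a sketch under that assumption: `find_packages_under` is a hypothetical helper, not part of the Distutils, and it treats a directory as a package only when it contains an `__init__.py`.

```python
import os


def find_packages_under(root, top_package):
    """Return dotted names of every package directory found below `root`.

    A directory counts as a package only if it contains an __init__.py.
    (Hypothetical helper for illustration; not a Distutils function.)
    """
    packages = []
    for dirpath, dirnames, filenames in os.walk(root):
        if "__init__.py" not in filenames:
            # Not a package: do not descend any further into this tree.
            dirnames[:] = []
            continue
        dirnames.sort()  # deterministic traversal order
        rel = os.path.relpath(dirpath, root)
        if rel == os.curdir:
            packages.append(top_package)
        else:
            packages.append(top_package + "." + rel.replace(os.sep, "."))
    return packages
```

A setup script could then pass `packages = find_packages_under("mx", "mx")` to `setup()`. The extension module descriptions would still have to be listed by hand.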
On 24 May 2000, Mark W. Alexander said:
I apologize if this is documented somewhere. If it is, I haven't found it. I've done lots of Python modules, but not any packages, so if I'm just package ignorant feel free to slap me.....
It's not, and this problem is a fundamental design flaw in Python's packaging system. I have a few ideas on how we can live with it, and they generally involve "put nothing (important) in __init__.py".
Since Distutils includes a sample mxDateTime setup.py, it seemed fairly easy. I hacked it up to push it down a level (see attempt below). No problem except, of course, that it doesn't import because there's no mx.__init__.py. Ok, so I add an mx.__init__.py with __all__ defined and appropriate imports. Everything's happy and Distutils does fine.
You may be labouring under the illusion that you have to define __all__. You don't. It's just there so that you can "from package import *", where * will presumably resolve to a list of modules. (It doesn't have to, though: __init__.py can define things on its own, which then become exported by the package.)

The easiest solution is zero-byte __init__.py files. Then auto-generation is a snap, and there's no need to worry about mxDateTime clobbering the mx/__init__.py from mxTextTools (or vice-versa).

It's nice to put a brief comment or docstring in __init__.py, though -- then, it would be up to the developer (Marc-Andre in this case) to make sure the mx/__init__.py included with all of his packages is the same (or at least that all are "content-free", i.e. only comments and docstrings).

Dealing with conflicting mx/__init__.py's from multiple RPMs is another issue; I don't have a ready answer off the top o' my head. Hmmm.
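To make the "content-free __init__.py" point concrete, here is a small self-contained sketch. It builds a throwaway package tree in a temporary directory and imports a subpackage through an umbrella package whose `__init__.py` holds only a comment. The names `mxdemo` and `now` are invented for the demonstration and have nothing to do with the real mx packages.

```python
import importlib
import os
import sys
import tempfile


def build_demo_tree(base):
    """Create base/mxdemo/{__init__.py, DateTime/__init__.py} on disk."""
    top = os.path.join(base, "mxdemo")
    sub = os.path.join(top, "DateTime")
    os.makedirs(sub)
    # Top-level __init__.py holds only a comment: no __all__, no imports.
    with open(os.path.join(top, "__init__.py"), "w") as f:
        f.write("# mxdemo umbrella package (intentionally content-free)\n")
    with open(os.path.join(sub, "__init__.py"), "w") as f:
        f.write("def now():\n    return 'placeholder timestamp'\n")


base = tempfile.mkdtemp()
build_demo_tree(base)
sys.path.insert(0, base)
importlib.invalidate_caches()

# The subpackage imports fine even though the umbrella defines nothing.
DateTime = importlib.import_module("mxdemo.DateTime")
result = DateTime.now()
top_has_all = hasattr(sys.modules["mxdemo"], "__all__")
```

Because the umbrella file carries no logic, every distribution that ships it can ship the same bytes, which is the property the advice above relies on.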
It looks like --bdist-WHATEVER needs information about pre- and post-installation processing. This would have to be included in setup.py somewhere.
Gee, I hope we can avoid it. But if we can't, I agree that such code should be provided as Python; it's up to the Distutils to cook up a way to run that code at installation-time. Yuck! Hairy, hairy, hairy...

Greg
--
Greg Ward - Unix geek gward@python.net
http://starship.python.net/~gward/
Vote anarchist.
Greg Ward writes:
The easiest solution is zero-byte __init__.py files. Then auto-generation is a snap, and there's no need to worry about mxDateTime clobbering the mx/__init__.py from mxTextTools (or vice-versa).
Note that WinZip, which is commonly used, doesn't archive zero-length files under some circumstances; occasionally people run into this when they unzip a directory and can't import from the package. Hence the convention of just putting a comment in __init__.py, to avoid having a zero-length file. --amk
On Wed, 24 May 2000, Andrew Kuchling wrote:
Note that WinZip, which is commonly used, doesn't archive zero-length files under some circumstances; occasionally people run into this when they unzip a directory and can't import from the package. Hence the convention of just putting a comment in __init__.py, to avoid having a zero-length file.
Docstrings are also nice candidates, but I don't know if that's the right thing for the "mx" package. It's a little different from the simpler packages that were initially envisioned.

-Fred
--
Fred L. Drake, Jr. <fdrake at acm.org>
Greg Ward wrote:
On 24 May 2000, Mark W. Alexander said:
I apologize if this is documented somewhere. If it is, I haven't found it. I've done lots of Python modules, but not any packages, so if I'm just package ignorant feel free to slap me.....
It's not, and this problem is a fundamental design flaw in Python's packaging system. I have a few ideas on how we can live with it, and they generally involve "put nothing (important) in __init__.py".
Since Distutils includes a sample mxDateTime setup.py, it seemed fairly easy. I hacked it up to push it down a level (see attempt below). No problem except, of course, that it doesn't import because there's no mx.__init__.py. Ok, so I add an mx.__init__.py with __all__ defined and appropriate imports. Everything's happy and Distutils does fine.
You may be labouring under the illusion that you have to define __all__. You don't. It's just there so that you can "from package import *", where * will presumably resolve to a list of modules. (It doesn't have to, though: __init__.py can define things on its own, which then become exported by the package.)
The easiest solution is zero-byte __init__.py files. Then auto-generation is a snap, and there's no need to worry about mxDateTime clobbering the mx/__init__.py from mxTextTools (or vice-versa).
It's nice to put a brief comment or docstring in __init__.py, though -- then, it would be up to the developer (Marc-Andre in this case) to make sure the mx/__init__.py included with all of his packages is the same (or at least that all are "content-free", i.e. only comments and docstrings).
Dealing with conflicting mx/__init__.py's from multiple RPMs is another issue; I don't have a ready answer off the top o' my head. Hmmm.
FYI, I'm going to package the mx stuff in 3-4 ZIP archives:

1. base (this one contains the __init__.py file and always has to be installed)
2. crypto (optional add-in with mx.Crypto)
3. mx-ug (subpackages only available to mx User Group members)
4. commercial (things like mx.ODBC and some other DB related subpackages)

There will no longer be separate mxTools, mxTextTools, mxDateTime, etc. packages.
It looks like --bdist-WHATEVER needs information about pre- and post-installation processing. This would have to be included in setup.py somewhere.
Gee, I hope we can avoid it. But if we can't, I agree that such code should be provided as Python; it's up to the Distutils to cook up a way to run that code at installation-time. Yuck! Hairy, hairy, hairy...
Umm, what's hairy about pre- and post-install code ? Pretty much all RPM-like archives provide this feature in some way or another.

I'd suggest to have setup.py include a reference to two functions somewhere (probably in the setup constructor): one to run for pre-install and one for post-install.

--
Marc-Andre Lemburg
______________________________________________________________________
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/
On Thu, May 25, 2000 at 12:08:35PM +0200, M.-A. Lemburg wrote:
Umm, what's hairy about pre- and post-install code ? Pretty much all RPM-like archives provide this feature in some way or another.
I'd suggest to have setup.py include a reference to two functions somewhere (probably in the setup constructor): one to run for pre-install and one for post-install.
You can do this now by defining your own install command in setup; I am providing my own build command in PyNcurses to perform some pre-build actions.

--
Harry Henry Gebel, Senior Developer, Landon House SBS
West Dover Hundred, Delaware
Harry Henry Gebel wrote:
On Thu, May 25, 2000 at 12:08:35PM +0200, M.-A. Lemburg wrote:
Umm, what's hairy about pre- and post-install code ? Pretty much all RPM-like archives provide this feature in some way or another.
I'd suggest to have setup.py include a reference to two functions somewhere (probably in the setup constructor): one to run for pre-install and one for post-install.
You can do this now by defining your own install command in setup; I am providing my own build command in PyNcurses to perform some pre-build actions.
I'd rather like to see predefined hooks for this than having to define my own install command.

BTW, how can I add some Q&A style setup dialog to build and install commands ?

I will need this for mxODBC since there are plenty subpackages which can all be installed separately and each of them will need to know where to find the header files and libs to link against.

mxDateTime has a different need, which I'm not really sure how to handle: it needs some sort of "try to compile this and return the exit code from the compiler" + "if this compiles, run the result and return the exit code" to be able to define symbols such as HAVE_TIMEGM (much like autoconf does on Unix).

Is this possible with the current distutils ?

And a final question: do I have to redistribute distutils together with the mx packages in order to make sure that the build and install process works ? What about 1.5.2 compatibility ?

Thanks,
--
Marc-Andre Lemburg
______________________________________________________________________
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/
On Thu, 25 May 2000, M.-A. Lemburg wrote:
FYI, I'm going to package the mx stuff in 3-4 ZIP archives:
1. base (this one contains the __init__.py file and always has to be installed)
2. crypto (optional add-in with mx.Crypto)
3. mx-ug (subpackages only available to mx User Group members)
4. commercial (things like mx.ODBC and some other DB related subpackages)
There will no longer be separate mxTools, mxTextTools, mxDateTime, etc. packages.
I think that pretty much resolves the issue with mx.Stuff. It still remains a generic problem for other packages though. Do you have an ETA on your new packages? If not, shoot me an email and I'll be happy to do the setup.py scripts.
should be provided as Python; it's up to the Distutils to cook up a way to run that code at installation-time. Yuck! Hairy, hairy, hairy...
Umm, what's hairy about pre- and post-install code ? Pretty much all RPM-like archives provide this feature in some way or another.
I'd suggest to have setup.py include a reference to two functions somewhere (probably in the setup constructor): one to run for pre-install and one for post-install.
I also think this is a definite long-term need. For "relocatable" Python packages on some architectures (e.g. Solaris) it may be necessary to re-link at install time (ld -rpath) to avoid requiring the user to set LD_LIBRARY_PATH. (IANALG -- I Am Not A Linking Guru, but I've had problems when libraries are in different places on "install" machines than they are on "build" machines. That said, it's probably only an issue when you're stuck with bunches of boxes with different file system setups where you're forced to put things where you can find the room. That probably means me and some other guy who's still lost writing perl ;-)

If no one else is interested, I'll do it when hell freezes^W^W I have time.

mwa
"Mark W. Alexander" wrote:
On Thu, 25 May 2000, M.-A. Lemburg wrote:
FYI, I'm going to package the mx stuff in 3-4 ZIP archives:
1. base (this one contains the __init__.py file and always has to be installed)
2. crypto (optional add-in with mx.Crypto)
3. mx-ug (subpackages only available to mx User Group members)
4. commercial (things like mx.ODBC and some other DB related subpackages)
There will no longer be separate mxTools, mxTextTools, mxDateTime, etc. packages.
I think that pretty much resolves the issue with mx.Stuff. It still remains a generic problem for other packages though. Do you have an ETA on your new packages? If not, shoot me an email and I'll be happy to do the setup.py scripts.
This depends on whether I'll find time within the next month to get the docs right and set up the web site for the new tools.
should be provided as Python; it's up to the Distutils to cook up a way to run that code at installation-time. Yuck! Hairy, hairy, hairy...
Umm, what's hairy about pre- and post-install code ? Pretty much all RPM-like archives provide this feature in some way or another.
I'd suggest to have setup.py include a reference to two functions somewhere (probably in the setup constructor): one to run for pre-install and one for post-install.
I also think this is a definite long-term need. For "relocatable" Python packages on some architectures (e.g. Solaris) it may be necessary to re-link at install time (ld -rpath) to avoid requiring the user to set LD_LIBRARY_PATH. (IANALG -- I Am Not A Linking Guru, but I've had problems when libraries are in different places on "install" machines than they are on "build" machines.
Right, but there are other more serious needs as well, e.g. a package might want to prebuild some database or scan the system to figure out configuration information, initialize third party tools, register APIs, COM or CORBA objects, check package dependencies etc. (just look at what CPAN does when you let it do its thing in automatic mode).

There are many possibilities which can't possibly all be covered by special install or build functions.

--
Marc-Andre Lemburg
______________________________________________________________________
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/
[Marc-Andre wants pre- and post-install hooks]
I'd suggest to have setup.py include a reference to two functions somewhere (probably in the setup constructor): one to run for pre-install and one for post-install.
[Harry Henry Gebel points out Distutils' extensibility mechanism]
You can do this now by defining your own install command in setup; I am providing my own build command in PyNcurses to perform some pre-build actions.
[Marc-Andre thinks otherwise]
I'd rather like to see predefined hooks for this than having to define my own install command.
There are two meanings to "install", and I think each of you is talking about a different one. I believe MAL wants to supply pre-install and post-install hooks that would run when someone installs from (eg.) an RPM or executable installer -- ie. from a smart built distribution that has support for {pre,post}-install hooks.

Harry was talking about overriding the Distutils "install" command, which is fine as long as people only build and install from source. They don't. If you define {pre,post}-install by defining your own "install" command, then they will run on the packager's system, might take effect in the packager's build/bdist.<plat>/<format> directory, and might be propagated to the end-user's machine. But if you tweak the Windows registry, or add a user to /etc/passwd, or what-have-you, then it only affects the packager's machine. Not good.

We need a way for module developers to supply a snippet (or a pile) of Python code that is carried along for the ride when a packager creates a smart built distribution (RPM, wininst, Wise, .deb, whatever) and executed when the installer installs from the smart built distribution. It's OK to write these hooks in Python, because of course there's a Python interpreter on the other end. It should be possible to write completely portable hooks, but I imagine most such real-world hooks would look like this:

    import os

    if os.name == "posix":
        pass    # poke around /lib, frob something in /etc, ...
    elif os.name == "nt":
        pass    # twiddle the registry, check for some DLL, ...
    else:
        raise RuntimeError(
            "don't know how to install on platform '%s'" % os.name)

which in many cases is the Distutils approach to portability. Some of those if/elif/.../else constructs have grown "mac" branches, but not all yet.

These install hooks should be run by the "install" command if this is a "real" installation, but not if it's a "fake" installation done on behalf of one of the "bdist" commands to a temp directory.

Anyone got ideas for a good interface? Function, class, module, chunk of source code as text, or what?
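One of the interface options listed here, a chunk of source code as text, lines up with the shell-wrapper idea from the first message of the thread. As a rough sketch only: `make_hook_script` below is a hypothetical helper, not a Distutils facility. It assumes the hook arrives as Python source text defining a zero-argument function, that a `python` executable is on the installing machine's PATH, and that the hook source never contains the heredoc delimiter on a line by itself.

```python
def make_hook_script(hook_source, func_name, python="python"):
    """Wrap Python hook source text in a /bin/sh script.

    `hook_source` must define a zero-argument function named `func_name`.
    The heredoc delimiter is quoted so the shell does not expand `$` or
    backticks inside the Python code.
    (Hypothetical helper for illustration; not a Distutils function.)
    """
    lines = [
        "#!/bin/sh",
        "%s - <<'PYHOOK_EOF'" % python,
        hook_source.rstrip("\n"),
        "%s()" % func_name,
        "PYHOOK_EOF",
        "",
    ]
    return "\n".join(lines)


POSTINSTALL = '''\
def postinstall():
    print("post-install hook ran")
'''

script = make_hook_script(POSTINSTALL, "postinstall")
```

The resulting text is what a package manager expecting shell scripts would be handed. The developer still writes only Python.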
BTW, how can I add some Q&A style setup dialog to build and install commands ?
I will need this for mxODBC since there are plenty subpackages which can all be installed separately and each of them will need to know where to find the header files and libs to link against.
Oops, this reminds me of the part of config files that I forgot to implement: custom configuration information for the current distribution. The idea is that users would edit setup.cfg to provide things like library search paths here. IMHO that's preferable to an interactive build process, but only slightly.
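For illustration, this is roughly what reading such user-edited search paths out of a setup.cfg-style file could look like. It is a sketch with assumptions: the `[build_ext]` section and the `include_dirs` / `library_dirs` option names are taken from the build_ext command's options, the paths are invented, and `read_search_paths` is a hypothetical helper that uses the standard `configparser` module rather than any Distutils code.

```python
import configparser
import os

# Example file contents; the directories are made up for the demonstration.
SAMPLE_CFG = """\
[build_ext]
include_dirs = /opt/odbc/include{sep}/usr/local/include
library_dirs = /opt/odbc/lib
""".format(sep=os.pathsep)


def read_search_paths(cfg_text):
    """Return (include_dirs, library_dirs) lists from setup.cfg-style text."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)

    def split(option):
        # Multiple directories are separated by the platform path separator.
        raw = parser.get("build_ext", option, fallback="")
        return [p for p in raw.split(os.pathsep) if p]

    return split("include_dirs"), split("library_dirs")


include_dirs, library_dirs = read_search_paths(SAMPLE_CFG)
```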
mxDateTime has a different need, which I'm not really sure how to handle: it needs some sort of "try to compile this and return the exit code from the compiler" + "if this compiles, run the result and return the exit code" to be able to define symbols such as HAVE_TIMEGM (much like autoconf does on Unix).
Is this possible with the current distutils ?
No. I've been putting that off as long as possible, because I fear it's a huge job. It basically means rewriting Autoconf in Python. The good news is, no M4 or shell scripts; the bad news is, it has to be insanely portable. (The CCompiler framework should help enormously here, but even so...)
And a final question: do I have to redistribute distutils together with the mx packages in order to make sure that the build and install process works ? What about 1.5.2 compatibility ?
All versions of the Distutils will work with Python 1.5.2, at least until Python 1.5.2 is as dead as Python 1.4 is today. The basic idea is this: if someone wants to build from source, they either have to be running Python 1.6, or they have to download and install Distutils to their Python 1.5.2 installation.

I will write up an exemplary blurb and put it in the "Distributing Python Modules" documentation -- most programmers seem to have a hard time expressing themselves clearly in README files, and this particular concept must be made loud and clear in every README for every Python module distribution.

Incidentally, I have not had a single complaint of Python 1.5.1 incompatibility since March, when I released Distutils 0.1.4 and 0.1.5 expressly for Python 1.5.1 compatibility. I have not ported those changes forward to the current Distutils, have very little desire to do so, and have seen no reason to do so -- ie. no complaints from users. So does anyone care if I drop my goal of Python 1.5.1 compatibility? (Hey, there's always Distutils 0.1.5 for the Python 1.5.1 crowd...)

Greg
--
Greg Ward - programmer-at-large gward@python.net
http://starship.python.net/~gward/
Hold the MAYO & pass the COSMIC AWARENESS ...
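The "either your Python ships Distutils or you install it first" rule could also be checked by the setup script itself, in addition to the README. The sketch below is an assumption about how that might look, not something proposed in the thread. `distutils_available` and `check_build_requirements` are hypothetical names, the message text is invented, and the code uses the modern `importlib` module.

```python
import importlib.util
import sys


def distutils_available():
    """Return True if a `distutils` package can be found on this interpreter."""
    return importlib.util.find_spec("distutils") is not None


def check_build_requirements(out=sys.stderr):
    """Print a README-style hint and return False when Distutils is missing."""
    if distutils_available():
        return True
    out.write(
        "This distribution is built with the Distutils.\n"
        "Your Python does not include them; please install the\n"
        "Distutils first, then re-run: python setup.py install\n"
    )
    return False
```

A setup script would call `check_build_requirements()` before importing `distutils.core` and exit early if it returns False.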
On 25 May 2000, M.-A. Lemburg said:
FYI, I'm going to package the mx stuff in 3-4 ZIP archives:
1. base (this one contains the __init__.py file and always has to be installed)
2. crypto (optional add-in with mx.Crypto)
3. mx-ug (subpackages only available to mx User Group members)
4. commercial (things like mx.ODBC and some other DB related subpackages)
There will no longer be separate mxTools, mxTextTools, mxDateTime, etc. packages.
Do you mean "package" as in a directory with an __init__.py file, or as in something others download, build, and install?
Umm, what's hairy about pre- and post-install code ? Pretty much all RPM-like archives provide this feature in some way or another.
See my previous post: it's not the code that's hairy, it's getting it from the developer to the installer correctly.

Greg
--
Greg Ward - "always the quiet one" gward@python.net
http://starship.python.net/~gward/
I have a VISION! It's a RANCID double-FISHWICH on an ENRICHED BUN!!
Greg Ward wrote:
[Marc-Andre wants pre- and post-install hooks]
I'd suggest to have setup.py include a reference to two functions somewhere (probably in the setup constructor): one to run for pre-install and one for post-install.
[Harry Henry Gebel points out Distutils' extensibility mechanism]
You can do this now by defining your own install command in setup; I am providing my own build command in PyNcurses to perform some pre-build actions.
[Marc-Andre thinks otherwise]
I'd rather like to see predefined hooks for this than having to define my own install command.
There are two meanings to "install", and I think each of you is talking about a different one. I believe MAL wants to supply pre-install and post-install hooks that would run when someone installs from (eg.) an RPM or executable installer -- ie. from a smart built distribution that has support for {pre,post}-install hooks.
Hmm, the pre/post-install hooks are definitely an install thing. We'd also need a pre-build, though, for the things I mentioned below, e.g. finding header files and libs, checking the compiler, etc.
Harry was talking about overriding the Distutils "install" command, which is fine as long as people only build and install from source. They don't. If you define {pre,post}-install by defining your own "install" command, then they will run on the packager's system, might take effect in the packager's build/bdist.<plat>/<format> directory, and might be propagated to the end-user's machine. But if you tweak the Windows registry, or add a user to /etc/passwd, or what-have-you, then it only affects the packager's machine. Not good.
We need a way for module developers to supply a snippet (or a pile) of Python code that is carried along for the ride when a packager creates a smart built distribution (RPM, wininst, Wise, .deb, whatever) and executed when the installer installs from the smart built distribution. It's OK to write these hooks in Python, because of course there's a Python interpreter on the other end. It should be possible to write completely portable hooks, but I imagine most such real-world hooks would look like this:
    import os

    if os.name == "posix":
        pass    # poke around /lib, frob something in /etc, ...
    elif os.name == "nt":
        pass    # twiddle the registry, check for some DLL, ...
    else:
        raise RuntimeError(
            "don't know how to install on platform '%s'" % os.name)
which in many cases is the Distutils approach to portability. Some of those if/elif/.../else constructs have grown "mac" branches, but not all yet.
These install hooks should be run by the "install" command if this is a "real" installation, but not if it's a "fake" installation done on behalf of one of the "bdist" commands to a temp directory.
Anyone got ideas for a good interface? Function, class, module, chunk of source code as text, or what?
Why not add some keywords to the constructor ?!

    import mx.ODBC.Misc.DistSupport

    setup(
        preinstall = mx.ODBC.Misc.DistSupport.preinstall,
        postinstall = ...postinstall,
        prebuild = ...prebuild,
        postbuild = ...postbuild
    )
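To show how such hook keywords might interact with the "real" versus "fake" installation distinction raised earlier, here is a toy dispatcher. Everything in it is hypothetical: `run_install`, its `real_install` flag, and the three demo callables are invented for the sketch and do not correspond to any existing Distutils interface.

```python
def run_install(copy_files, preinstall=None, postinstall=None, real_install=True):
    """Run `copy_files`, wrapping it in hooks only for a real installation.

    `real_install=False` models a "fake" install into a temporary tree on
    behalf of a bdist-style command, where the hooks must not fire.
    Returns the list of results in the order the steps ran.
    """
    log = []
    if real_install and preinstall is not None:
        log.append(preinstall())
    log.append(copy_files())
    if real_install and postinstall is not None:
        log.append(postinstall())
    return log


def demo_pre():
    return "pre"


def demo_copy():
    return "copy"


def demo_post():
    return "post"


real_run = run_install(demo_copy, preinstall=demo_pre, postinstall=demo_post)
fake_run = run_install(demo_copy, preinstall=demo_pre, postinstall=demo_post,
                       real_install=False)
```

In the fake case the hooks are skipped on the packager's machine. They would instead have to travel inside the built distribution and run on the end user's machine.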
BTW, how can I add some Q&A style setup dialog to build and install commands ?
I will need this for mxODBC since there are plenty subpackages which can all be installed separately and each of them will need to know where to find the header files and libs to link against.
Oops, this reminds me of the part of config files that I forgot to implement: custom configuration information for the current distribution. The idea is that users would edit setup.cfg to provide things like library search paths here. IMHO that's preferable to an interactive build process, but only slightly.
A config file is fine too, but given that someone may want to write an installer, I think we'd also need an API hook for this.
mxDateTime has a different need, which I'm not really sure how to handle: it needs some sort of "try to compile this and return the exit code from the compiler" + "if this compiles, run the result and return the exit code" to be able to define symbols such as HAVE_TIMEGM (much like autoconf does on Unix).
Is this possible with the current distutils ?
No. I've been putting that off as long as possible, because I fear it's a huge job. It basically means rewriting Autoconf in Python. The good news is, no M4 or shell scripts; the bad news is, it has to be insanely portable. (The CCompiler framework should help enormously here, but even so...)
Naa... no need to rewrite Autoconf in Python: the simple tests can easily be done using a few lines of Python provided that the compiler classes allow these trial-and-error approaches.
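A "few lines of Python" version of the compile test might look like the sketch below. It is not based on the Distutils compiler classes. It assumes a Unix-style compiler driver named `cc` that accepts `-c` and `-o`, it covers only the "does this compile" half of the check (not the "run the result" half), and `try_compile` and the test program are written for this illustration.

```python
import os
import shutil
import subprocess
import tempfile


def try_compile(source, compiler="cc"):
    """Return True if `compiler -c` accepts `source`, else False.

    A missing compiler is reported as False as well.
    """
    workdir = tempfile.mkdtemp()
    try:
        src = os.path.join(workdir, "conftest.c")
        obj = os.path.join(workdir, "conftest.o")
        with open(src, "w") as f:
            f.write(source)
        try:
            proc = subprocess.run(
                [compiler, "-c", src, "-o", obj],
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
        except OSError:
            return False  # compiler not installed or not executable
        return proc.returncode == 0
    finally:
        shutil.rmtree(workdir, ignore_errors=True)


TIMEGM_TEST = """
#include <time.h>
int main(void) { struct tm t = {0}; return timegm(&t) == (time_t)-1; }
"""

# Define the macro only when the probe compiles on this machine.
macros = []
if try_compile(TIMEGM_TEST):
    macros.append(("HAVE_TIMEGM", None))
```

A compile-only probe can be fooled by compilers that merely warn about undeclared functions, so a link or run step would be needed for a trustworthy HAVE_TIMEGM result.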
And a final question: do I have to redistribute distutils together with the mx packages in order to make sure that the build and install process works ? What about 1.5.2 compatibility ?
All versions of the Distutils will work with Python 1.5.2, at least until Python 1.5.2 is as dead as Python 1.4 is today. The basic idea is this: if someone wants to build from source, they either have to be running Python 1.6, or they have to download and install Distutils to their Python 1.5.2 installation.
I will write up an exemplary blurb and put it in the "Distributing Python Modules" documentation -- most programmers seem to have a hard time expressing themselves clearly in README files, and this particular concept must be made loud and clear in every README for every Python module distribution.
Incidentally, I have not had a single complaint of Python 1.5.1 incompatibility since March, when I released Distutils 0.1.4 and 0.1.5 expressly for Python 1.5.1 compatibility. I have not ported those changes forward to the current Distutils, have very little desire to do so, and have seen no reason to do so -- ie. no complaints from users. So does anyone care if I drop my goal of Python 1.5.1 compatibility? (Hey, there's always Distutils 0.1.5 for the Python 1.5.1 crowd...)
1.5.2 is fine with me. By the time I'll push out my new stuff, 1.6 will be out anyway... so 1.5.1 is not much of a problem anymore (I will keep the current Makefile.pre.in +Setup approach for a while too).

--
Marc-Andre Lemburg
______________________________________________________________________
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/
Greg Ward wrote:
On 25 May 2000, M.-A. Lemburg said:
FYI, I'm going to package the mx stuff in 3-4 ZIP archives:
1. base (this one contains the __init__.py file and always has to be installed)
2. crypto (optional add-in with mx.Crypto)
3. mx-ug (subpackages only available to mx User Group members)
4. commercial (things like mx.ODBC and some other DB related subpackages)
There will no longer be separate mxTools, mxTextTools, mxDateTime, etc. packages.
Do you mean "package" as in a directory with an __init__.py file, or as in something others download, build, and install?
Well, e.g. mxDateTime was a package in the sense that you can unzip it directly in a directory on your PYTHONPATH. Only the "base" archive will have that property. The others are add-in archives which get unzipped on top of the "base" installation.

Hmm, thinking of it: would distutils allow recursive builds/installs ? Or must I add logic to figure out which parts of the cake are available to build/install and then setup a single setup.py file ?

--
Marc-Andre Lemburg
______________________________________________________________________
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/
On Fri, 26 May 2000, M.-A. Lemburg wrote:
Well, e.g. mxDateTime was a package in the sense that you can unzip it directly in a directory on your PYTHONPATH. Only the "base" archive will have that property. The others are add-in archives which get unzipped on top of the "base" installation.
Hmm, thinking of it: would distutils allow recursive builds/installs ? Or must I add logic to figure out which parts of the cake are available to build/install and then setup a single setup.py file ?
It allows for nested packages, if that's what you mean. I didn't have much trouble hacking a setup.py that did mxDateTime, mxTextTools, and mxTools. I don't see any problem with each having their own setup.py, IF you have (as you mentioned) a "core" package that provides the foundation for all the others. Earlier, someone else said:
BTW, how can I add some Q&A style setup dialog to build and install commands ?
This is another good question. The SysV 'pkgtool' utilities allow not only preinstall and postinstall, but preremove, postremove and request scripts. The request script is invoked before anything is done, and allows Q&A with the installer to support customized installation.

Having to support multiple packages on multiple machines of multiple architectures, I'm a strong believer in the idea that the installer should have complete control over package installation, IF they want it (appropriate defaults assumed). In order to provide that level of flexibility, the requirements for the bdist-* modules are pushed to the point of having to provide the maximum requirements of the most complex package manager supported. This is likely to get very ugly.

Pardon me while I talk circles around myself...... I'm starting to go back to what someone (Henry?) said early in the bdist-rpm progress: provide the package specification file and let it go at that. Where possible, take it to the max like bdist-rpm is, but as a minimal bdist requirement the only function required (and possibly default?) could be that of --spec-only.

I really like where distutils is headed (actually, where it's at is pretty darn good!), and I hate to see much effort go into the 20% of functions that hardly anyone would use. "Most" packages don't need any scripts. Those that have major scripting needs are probably best addressed by a packager who understands packaging fairly thoroughly. Getting a spec/prototype/pkginfo/psf file can be tedious and is the one thing distutils can easily provide. The packager can plug scripts in as required.

Mark Alexander mwa@gate.net
On 26 May 2000, M.-A. Lemburg said:
Hmm, the pre/post-install hooks are definitely an install thing.
Again, there are two meanings to install: install from source and install from a built distribution. Hooks at install-from-source time should be doable with the Distutils' existing extension mechanism (if a bit cumbersome -- you have to know how to write a Distutils command class, which is a tad idiosyncratic).
We'd also need a pre-build, though, for the things I mentioned below, e.g. finding header files and libs, checking the compiler, etc.
In principle, that should all be doable with the Distutils' existing facilities. I'd be willing to grease the wheels by adding a standard "prebuild" or "configure" command that runs before "build", but I'd leave it empty and open to subclassing -- there's just too many things that you might want to do there! So eg. you might do this in your setup script:

    class configure (Command):

        user_options = [('foo-inc', None,
                         "where to search for foo headers"),
                        ('foo-lib', None,
                         "where to search for foo library"),
                       ]

        def initialize_options (self):
            self.foo_inc = None
            self.foo_lib = None

        def finalize_options (self):
            # if user doesn't define foo_inc and foo_lib, leave them
            # alone (we will search for foo.h and libfoo)
            pass

        def run (self):
            if self.foo_inc is None:
                for dir in (...):   # list would vary by platform
                    # try to compile "#include <foo.h>" with -Idir
                    # break if success
                else:
                    # die, couldn't find foo.h

            # similar loop, trying to link with -lfoo and -Ldir

and then a little later:

    setup (...,
           cmdclass = {'configure': configure},
           ...,
          )

Now, what the hell do we do with 'foo_inc' and 'foo_lib' -- as written, the 'configure' command finds the foo header and library paths, and then quits without doing anything with that information. If you happen to know the guts of the Distutils, you could be evil and sneaky and do something like this:

    build_ext_opts = self.distribution.get_option_dict('build_ext')
    include_dirs = build_ext_opts.get('include_dirs')
    if include_dirs is None:
        include_dirs = build_ext_opts['include_dirs'] = []
    include_dirs.append(self.foo_inc)

...and then similar code to modify build_ext's 'library_dirs' option from 'self.foo_lib'. This is nasty, though. Come to think of it, it's not entirely reliable: if the "build_ext" command object has already been created by the time we run "configure", then it's too late to go frobbing the option dict owned by the Distribution object -- you'd want to frob the "build_ext" object directly.
Well, it's a common idiom to *fetch* options from another command object. And, oh yeah, I decided many months ago to stick with this "pull" model -- if command X needs option Y from command Z, then it's X's responsibility to dig up a Z object and get attribute Y from it. Just search the code for 'find_peer' to see how often this happens. Eg. in bdist.py:

    build_base = self.find_peer('build').build_base

to find the build base directory.

But there's no way the general-purpose "build" command can know what's defined in your particular "configure" command -- so this is one place where we seem to need to support "pushing" options. The problem with pushing options from one command to another is that option initialization is *tricky*, because we need to be able to derive default values in an intelligent way. See build_ext.py for a rich, meaty, but comprehensible example; or install.py for an insanely complex example.

I think the difficulty of pushing options boils down to the fact that 'finalize_options()' only expects to be called once, and most commands are written in such a way that they die horribly if it is called more than once. (I have accidentally ventured into option-pushing territory once or twice in the past, and quickly retreated, licking my wounds.) This is a design/implementation flaw that I have lived with up to now, but I might not be able to any longer.

Now do you see why I have avoided a "configure" command? ;-)

Other little things...
Why not add some keywords to the constructor ?!
    import mx.ODBC.Misc.DistSupport
    setup(
        preinstall = mx.ODBC.Misc.DistSupport.preinstall,
        postinstall = ...postinstall,
        prebuild = ...prebuild,
        postbuild = ...postbuild
        )
I realize that the OO write-your-own-class alternative is a little more clunky, but I don't think it's clunky enough to mandate a function-passing interface. Can you buy that?
Naa... no need to rewrite Autoconf in Python: the simple tests can easily be done using a few lines of Python provided that the compiler classes allow these trial-and-error approaches.
You may be right, based on the above hypothetical configure command. Abstracting some of these common functions away shouldn't be too hard. Greg -- Greg Ward - geek-on-the-loose gward@python.net http://starship.python.net/~gward/ Hold the MAYO & pass the COSMIC AWARENESS ...
On 26 May 2000, M.-A. Lemburg said:
Hmm, thinking of it: would distutils allow recursive builds/installs ? Or must I add logic to figure out which parts of the cake are available to build/install and then setup a single setup.py file ?
This is one of those things I've pondered about in the past. Haven't ever tried it though. At a minimum, there would have to be something in the setup script that says, "these directories are sub-distributions, please cd into them in turn and run the setup script there". Presumably it would run those setup scripts with the same arguments as the current script. All speculation, of course... Greg -- Greg Ward - maladjusted nerd gward@python.net http://starship.python.net/~gward/ I'm a lumberjack and I'm OK / I sleep all night and I work all day
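The "cd into each sub-distribution and run its setup script with the same arguments" idea can be sketched without touching the Distutils at all. What follows is a minimal sketch under stated assumptions: 'run_sub_distributions' is a hypothetical helper name, not a Distutils function, and it assumes every sub-distribution keeps its script in a file called setup.py directly inside its directory.

```python
import os
import subprocess
import sys


def run_sub_distributions(base_dir, sub_dirs, args):
    """Run each sub-distribution's setup.py with the same arguments.

    Hypothetical helper, not part of the Distutils.  Returns a dict that
    maps each sub-directory name to the exit status of its setup script,
    or to None if that directory has no setup.py.
    """
    results = {}
    for sub in sub_dirs:
        sub_path = os.path.join(base_dir, sub)
        if not os.path.isfile(os.path.join(sub_path, "setup.py")):
            results[sub] = None          # nothing to run in this directory
            continue
        # Run the script from inside its own directory, the way a user
        # would after "cd"-ing into it by hand.
        proc = subprocess.run([sys.executable, "setup.py"] + list(args),
                              cwd=sub_path)
        results[sub] = proc.returncode
    return results
```

A top-level script could call this with its own 'sys.argv[1:]', so that "python setup.py build" at the top builds every sub-distribution that is actually present.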
Greg Ward wrote:
On 26 May 2000, M.-A. Lemburg said:
Hmm, the pre/post-install hooks are definitely an install thing.
Again, there are two meanings to install: install from source and install from a built distribution. Hooks at install-from-source time should be doable with the Distutils' existing extension mechanism (if a bit cumbersome -- you have to know how to write a Distutils command class, which is a tad idiosyncratic).
I was referring to installing a (pre)built binary -- just before copying the compiled files to their final install location and right after that step is done. "install-from-source" would execute these hooks too: right after having built the binaries.
We'd also need a pre-build, though, for the things I mentioned below, e.g. finding header files and libs, checking the compiler, etc.
In principle, that should all be doable with the Distutils' existing facilities. I'd be willing to grease the wheels by adding a standard "prebuild" or "configure" command that runs before "build", but I'd leave it empty and open to subclassing -- there's just too many things that you might want to do there!
Ok... if you beat me to it, I'll do some subclassing then ;-)
So eg. you might do this in your setup script:
    class configure (Command):

        user_options = [('foo-inc', None,
                         "where to search for foo headers"),
                        ('foo-lib', None,
                         "where to search for foo library"),
                       ]

        def initialize_options (self):
            self.foo_inc = None
            self.foo_lib = None

        def finalize_options (self):
            # if user doesn't define foo_inc and foo_lib, leave them
            # alone (we will search for foo.h and libfoo)
            pass

        def run (self):
            if self.foo_inc is None:
                for dir in (...):   # list would vary by platform
                    # try to compile "#include <foo.h>" with -Idir
                    # break if success
                else:
                    # die, couldn't find foo.h

            # similar loop, trying to link with -lfoo and -Ldir

and then a little later:

    setup (...,
           cmdclass = {'configure': configure},
           ...,
          )
Looks feasible :-)
Now, what the hell do we do with 'foo_inc' and 'foo_lib' -- as written, the 'configure' command finds the foo header and library paths, and then quits without doing anything with that information.
If you happen to know the guts of the Distutils, you could be evil and sneaky and do something like this:
    build_ext_opts = self.distribution.get_option_dict('build_ext')
    include_dirs = build_ext_opts.get('include_dirs')
    if include_dirs is None:
        include_dirs = build_ext_opts['include_dirs'] = []
    include_dirs.append(self.foo_inc)
...and then similar code to modify build_ext's 'library_dirs' option from 'self.foo_lib'.
This is nasty, though. Come to think of it, it's not entirely reliable: if the "build_ext" command object has already been created by the time we run "configure", then it's too late to go frobbing the option dict owned by the Distribution object -- you'd want to frob the "build_ext" object directly.
Well, it's a common idiom to *fetch* options from another command object. And, oh yeah, I decided many months ago to stick with this "pull" model -- if command X needs option Y from command Z, then it's X's responsibility to dig up a Z object and get attribute Y from it. Just search the code for 'find_peer' to see how often this happens. Eg. in bdist.py:

    build_base = self.find_peer('build').build_base

to find the build base directory.
Wouldn't a method interface be more reliable and provide better means of extension using subclassing ? I usually wrap these attributes in .get_foobar(), .set_foobar() methods -- this also makes it clear which attributes are read-only, read-write or "better don't touch" :-)
But there's no way the general-purpose "build" command can know what's defined in your particular "configure" command -- so this is one place where we seem to need to support "pushing" options. The problem with pushing options from one command to another is that option initialization is *tricky*, because we need to be able to derive default values in an intelligent way. See build_ext.py for a rich, meaty, but comprehensible example; or install.py for an insanely complex example.
I think the difficulty of pushing options boils down to the fact that 'finalize_options()' only expects to be called once, and most commands are written in such a way that they die horribly if it is called more than once. (I have accidentally ventured into option-pushing territory once or twice in the past, and quickly retreated, licking my wounds.) This is a design/implementation flaw that I have lived with up to now, but I might not be able to any longer.
Now do you see why I have avoided a "configure" command? ;-)
Ehm, yes... but I can't really follow here: ok, I don't know much about the internals of distutils, but wouldn't passing a (more-or-less) intelligent context object around solve the problem ? The context object would know which parts are readable, changeable or write-once, etc. (I've been doing this in a 55k LOC application server and it works great.)
Other little things...
Why not add some keywords to the constructor ?!
    import mx.ODBC.Misc.DistSupport
    setup(
        preinstall = mx.ODBC.Misc.DistSupport.preinstall,
        postinstall = ...postinstall,
        prebuild = ...prebuild,
        postbuild = ...postbuild
        )
I realize that the OO write-your-own-class alternative is a little more clunky, but I don't think it's clunky enough to mandate a function-passing interface. Can you buy that?
Ok.
Naa... no need to rewrite Autoconf in Python: the simple tests can easily be done using a few lines of Python provided that the compiler classes allow these trial-and-error approaches.
You may be right, based on the above hypothetical configure command. Abstracting some of these common functions away shouldn't be too hard.
Basically, I need:

    rc = compile("""#include "foobar.h"\nmain(){}""",
                 output="delete.me")
    if rc != 0:
        HAVE_FOOBAR_H = 0
    else:
        HAVE_FOOBAR_H = 1
        os.unlink("delete.me")

and sometimes:

    rc = compile("""#include "foobar.h"
                    main() {
                      int x = frobnicate();
                      exit(x);
                    }""",
                 output="run-and-then-delete.me")
    if rc != 0:
        HAVE_FROBNICATE = 0
    else:
        HAVE_FROBNICATE = 1
        # Run and get
        rc = os.system("run-and-then-delete.me")
        FROBNICATE_VALUE = rc
        os.unlink("run-and-then-delete.me")

-- Marc-Andre Lemburg ______________________________________________________________________ Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/
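The first probe above (does a header exist?) needs nothing more than a temporary directory and a call to the C compiler. The following is a minimal sketch, not an existing Distutils facility: the names 'try_compile' and 'have_header' are made up for illustration, and it assumes a Unix-style compiler reachable as "cc" that accepts "-c" and "-o". If no such compiler is installed, the probe simply reports failure.

```python
import os
import subprocess
import tempfile


def try_compile(source, compiler="cc"):
    """Return True if the C `source` compiles to an object file.

    Hypothetical helper.  Assumes `compiler` understands "-c" and "-o";
    a missing compiler counts as a failed probe.
    """
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "probe.c")
        obj = os.path.join(tmp, "probe.o")
        with open(src, "w") as f:
            f.write(source)
        try:
            proc = subprocess.run([compiler, "-c", src, "-o", obj],
                                  stdout=subprocess.DEVNULL,
                                  stderr=subprocess.DEVNULL)
        except OSError:
            return False                 # compiler not found or not runnable
        return proc.returncode == 0
    # the temporary directory, and any object file in it, is removed here


def have_header(header, compiler="cc"):
    """Probe for a header by compiling a program that includes it."""
    source = '#include "%s"\nint main(void) { return 0; }\n' % header
    return try_compile(source, compiler)


# Same convention as the message above: 1 if found, 0 otherwise.
HAVE_FOOBAR_H = 1 if have_header("foobar.h") else 0
```

The second probe (compile, run, and capture the exit status) would additionally need a link step and a way to execute the result, which is not possible when cross-compiling, so it is left out of this sketch.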
"Mark W. Alexander" wrote:
On Fri, 26 May 2000, M.-A. Lemburg wrote:
Well, e.g. mxDateTime was a package in the sense that you can unzip it directly in a directory on your PYTHONPATH. Only the "base" archive will have that property. The others are add-in archives which get unzipped on top of the "base" installation.
Hmm, thinking of it: would distutils allow recursive builds/installs ? Or must I add logic to figure out which parts of the cake are available to build/install and then setup a single setup.py file ?
It allows for nested packages, if that's what you mean. I didn't have much trouble hacking a setup.py that did mxDateTime, mxTextTools, and mxTools. I don't see any problem with each having their own setup.py, IF you have (as you mentioned) a "core" package that provides the foundation for all the others.
It would probably be wise to have one setup.py file per ZIP archive. The ZIP archives have predefined content and the setup.py files could have this information hard-coded somewhere. I would have to move away from my current installation logic though (simply unzipping and then compiling in place). With distutils help this shouldn't be much of a problem though (I hope ;-). -- Marc-Andre Lemburg ______________________________________________________________________ Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/
On 27 May 2000, M.-A. Lemburg said:
I was referring to installing a (pre)built binary -- just before copying the compiled files to their final install location and right after that step is done.
Yes: this is the one place where the Distutils' extension mechanism *won't* work, because the Distutils aren't present (or at least, not in control) when installing a pre-built binary. Here, some other mechanism will be needed: pass a function, or a module, or a chunk of code to be eval'd, or something. Still not sure what's best; we have to balance the needs of the developer writing the setup script with the facilities available at the time the hook is run, and how the hook will be run ("python hookscript.py"?).
"install-from-source" would execute these hooks too: right after having built the binaries.
Yes, *except* in the case where the installation is being done solely for the purpose of creating a built distribution. I'm pretty sure this can be handled by adding a "fake install" flag to the "install" command: if true, don't run the {pre,post}-install hooks.
Wouldn't a method interface be more reliable and provide better means of extension using subclassing ?
I usually wrap these attributes in .get_foobar(), .set_foobar() methods -- this also makes it clear which attributes are read-only, read-write or "better don't touch" :-)
"Yes, but..." I have spent the weekend thinking hard about this problem, and I think I can explain the situation a little better now. Distutils commands are rather odd beasts, and the usual rules and conventions of OO programming don't work very well with them. Not only are they singletons (enforced by the Distribution method 'get_command_obj()'), but they have a prescribed life-cycle which is also enforced by the Distribution class. Until today, this life-cycle was strictly linear: non-existent ---> preinitialized ---> initialized ---> finalized ---> running ---> run "Preinitialized" and "initialized" are on the same line because, to outsiders, they are indistinguishable: the transition happens entirely inside the Command constructor. It works like this: * before we create any command objects, we find and parse all config files, and parse the command line; the results are stored in a dictionary 'command_options' belonging to the Distribution instance * somebody somewhere calls Distribution.get_command_obj("foo"), which notices that it hasn't yet instantiated the "foo" command (typically implemented by the class 'foo' in the module distutils.command.foo) * 'get_command_obj()' instantiates a 'foo' object; command classes do not define constructors, so we go straight into Command.__init__ * Command.__init__ calls self.initialize_options(), which must be provided by each individual command class * 'initialize_options()' is typically a series of self.this = None self.that = None assignments: ie. it "declares" the available "options" for this command. (The 'user_options' class attribute also "declares" the command's options. The two are redundant; every "foo-bar" option in 'user_options' must be matched by a "self.foo_bar = None" in 'initialize_options()', or it will all end in tears.) * some time later (usually immediately), the command's 'finalize_options()' method is called. 
The job of 'finalize_options()' is to make up the command's mind about everything that will happen when the command runs. Typical code in 'finalize_options()' is:

    if self.foo is None:
        self.foo = default value
    if self.bar is None:
        self.bar = f(self.foo)

Thus, we respect the user's value for 'foo', and have a sensible default if the user didn't provide one. And we respect the user's value for 'bar', and have a sensible -- possibly complicated -- default to fall back on.

The idea is to reduce the responsibilities of the 'run()' method, and to ensure that "full disclosure" about the command's intentions can be made before it is ever run.

To play along with this complicated dance, Distutils command classes have to provide 1) the 'user_options' class attribute, 2) the 'initialize_options()' method, and 3) the 'finalize_options()' method. (They also have to provide a 'run()' method, of course, but that has nothing to do with setting/getting option values.)

The payoff is that new command classes get all the Distutils user interface -- command-line parsing and config files, for now -- for free. The example "configure" command that I showed in a previous post, simply by virtue of having "foo-inc" and "foo-lib" in 'user_options' (and corresponding "self.xxx = None" statements in 'initialize_options()'), will automatically use the Distutils' config file and command-line parsing mechanism to set values for those options. Only if the user doesn't supply the information do we have to poke around the target system to figure out where "foo" is installed.

Anyways, the point of this long-winded discussion is this: certain attributes of command objects are public and fair game for anyone to set or modify. However, there are well-defined points in the object's life-cycle *before* which it is meaningless to *get* option values, and *after* which it is pointless to *set* option values.
In particular, there's no point in getting an option value *before* finalization, because -- duh -- the options aren't finalized yet. More subtly, attempting to set some option *after* finalization time might have no effect at all (if eg. that option is only used to derive other options from, like the 'build_base' option in the "build" command); or it might have complicated, undesirable effects. I can see this happening in particular with the "install" command, which (necessarily) has a frighteningly complex finalization routine.

If we go by the simple, linear state-transition diagram above, it turns out that setting option values for a particular command object is a dicey proposition: you simply don't know what state the command object is in, so you don't know what effect setting values on that command will have. If you try to force them to have the right effect, by calling 'finalize_options()', it won't work: the way that method is typically written ("if self.foo is None: self.foo = default value", for as many values of "foo" as are needed), calling it a second time just won't work.

So today, I added a couple of new transitions to that state-transition diagram. Now, you can go from any state to the "initialized" state using the 'reinitialize_command()' method provided by Distribution. So it's now safe to do something like this, eg. in a "configure" command

    build = self.reinitialize_command("build")
    build.include_dirs.append(foo_inc)
    build.library_dirs.append(foo_lib)
    build.ensure_finalized()

...and you know that any user-specified options to the "build" command will be preserved, and that all dependent-but-unspecified options will be recomputed. (You don't need to call 'ensure_finalized()' here unless you will subsequently be getting some option values from the "build" object.)

Thus, it should now be possible to write a "configure" command that respects the bureaucracy of the Distutils *and* forces the "build" command to do The Right Thing.
This is a small change to the code, but a major change to the philosophy of option-passing in the Distutils, which until now was (theoretically) "pull only": it was not considered proper or safe to assign another command's option attributes; now it is, as long as you play by the above rules. Cool! BTW, I'm not opposed to the idea of 'get_foo()' and 'set_foo()' methods: they could add some value, but only if they are provided by the Command class, rather than each command having to implement a long list of near-identical accessor and modifier methods. Probably 'get_foo()' should die if the object hasn't been finalized, and 'set_foo()' should die if it has been finalized (or hasn't been initialized). Hope this makes some sense... Greg -- Greg Ward - Unix nerd gward@python.net http://starship.python.net/~gward/ I haven't lost my mind; I know exactly where I left it.
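The life-cycle described in this message can be modelled in a few lines. What follows is a toy sketch and not the Distutils source: the classes only borrow the names 'Command', 'Distribution', 'get_command_obj()', 'reinitialize_command()' and 'ensure_finalized()' from the message, and the 'build_lib' option with its derivation from 'build_base' is an assumption made purely for illustration. It shows why a second call to 'finalize_options()' leaves derived options stale, and how going back to the "initialized" state fixes that while still re-applying the user's own settings.

```python
class Command:
    """Toy stand-in for a Distutils command (not the real class)."""

    def __init__(self, dist):
        self.distribution = dist
        self.finalized = False
        self.initialize_options()        # "declare" the options as None

    def ensure_finalized(self):
        if not self.finalized:
            self.finalize_options()
            self.finalized = True


class build(Command):
    def initialize_options(self):
        self.build_base = None
        self.build_lib = None            # illustrative derived option

    def finalize_options(self):
        if self.build_base is None:
            self.build_base = "build"
        if self.build_lib is None:
            self.build_lib = self.build_base + "/lib"


class Distribution:
    """Toy owner of the singleton command objects."""

    def __init__(self, command_options=None):
        # what config files and the command line would have supplied
        self.command_options = command_options or {}
        self._classes = {"build": build}
        self._objects = {}

    def _apply_user_options(self, name, cmd):
        for option, value in self.command_options.get(name, {}).items():
            setattr(cmd, option, value)

    def get_command_obj(self, name):
        if name not in self._objects:    # commands are singletons
            cmd = self._classes[name](self)
            self._apply_user_options(name, cmd)
            self._objects[name] = cmd
        return self._objects[name]

    def reinitialize_command(self, name):
        cmd = self.get_command_obj(name)
        cmd.initialize_options()         # back to the "initialized" state
        cmd.finalized = False
        self._apply_user_options(name, cmd)
        return cmd


dist = Distribution()
cmd = dist.get_command_obj("build")
cmd.ensure_finalized()

# Pushing an option after finalization: the derived value is not recomputed.
cmd.build_base = "out"
cmd.finalize_options()
stale_lib = cmd.build_lib

# Reinitialize first, then push: the derived value follows the new base.
cmd = dist.reinitialize_command("build")
cmd.build_base = "out"
cmd.ensure_finalized()
fresh_lib = cmd.build_lib
```

In this model 'stale_lib' still reflects the old base directory, while 'fresh_lib' is derived from the pushed value, which is the behaviour the new transition is meant to guarantee.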
On 27 May 2000, M.-A. Lemburg said:
It would probably be wise to have one setup.py file per ZIP archive. The ZIP archives have predefined content and the setup.py files could have this information hard-coded somewhere.
Good point. Does this mean I don't have to worry about recursive setup scripts after all? ;-)
I would have to move away from my current installation logic though (simply unzipping and then compiling in place). With distutils help this shouldn't be much of a problem though (I hope ;-).
Yes, please do! The idea of having to unpack a distribution archive in a particular place has always deeply offended me; the fact that more than one Python module distributor thought this was a good idea is one of the things that motivated me to do the Distutils in the first place. IOW, forcing you to build in one place and install to another place is a feature. Greg -- Greg Ward - just another /P(erl|ython)/ hacker gward@python.net http://starship.python.net/~gward/ Always look on the bright side of life.
Greg Ward wrote:
On 27 May 2000, M.-A. Lemburg said:
I was referring to installing a (pre)built binary -- just before copying the compiled files to their final install location and right after that step is done.
Yes: this is the one place where the Distutils' extension mechanism *won't* work, because the Distutils aren't present (or at least, not in control) when installing a pre-built binary.
Why not ? The RPMs could use the existing Python installation which comes with a version of distutils (at least for 1.6) or use a copy which gets installed together with the package. The post-install script could then pass control to distutils and let it apply its magic.
Here, some other mechanism will be needed: pass a function, or a module, or a chunk of code to be eval'd, or something. Still not sure what's best; we have to balance the needs of the developer writing the setup script with the facilities available at the time the hook is run, and how the hook will be run ("python hookscript.py"?).
python .../distutils/setup.py --post-install ?!
"install-from-source" would execute these hooks too: right after having built the binaries.
Yes, *except* in the case where the installation is being done solely for the purpose of creating a built distribution. I'm pretty sure this can be handled by adding a "fake install" flag to the "install" command: if true, don't run the {pre,post}-install hooks.
Ok.
Wouldn't a method interface be more reliable and provide better means of extension using subclassing ?
I usually wrap these attributes in .get_foobar(), .set_foobar() methods -- this also makes it clear which attributes are read-only, read-write or "better don't touch" :-)
"Yes, but..."
I have spent the weekend thinking hard about this problem, and I think I can explain the situation a little better now. Distutils commands are rather odd beasts, and the usual rules and conventions of OO programming don't work very well with them. Not only are they singletons (enforced by the Distribution method 'get_command_obj()'), but they have a prescribed life-cycle which is also enforced by the Distribution class. Until today, this life-cycle was strictly linear:
non-existent ---> preinitialized ---> initialized ---> finalized ---> running ---> run
"Preinitialized" and "initialized" are on the same line because, to outsiders, they are indistinguishable: the transition happens entirely inside the Command constructor. It works like this:
  * before we create any command objects, we find and parse all config files, and parse the command line; the results are stored in a dictionary 'command_options' belonging to the Distribution instance

  * somebody somewhere calls Distribution.get_command_obj("foo"), which notices that it hasn't yet instantiated the "foo" command (typically implemented by the class 'foo' in the module distutils.command.foo)

  * 'get_command_obj()' instantiates a 'foo' object; command classes do not define constructors, so we go straight into Command.__init__

  * Command.__init__ calls self.initialize_options(), which must be provided by each individual command class

  * 'initialize_options()' is typically a series of

        self.this = None
        self.that = None

    assignments: ie. it "declares" the available "options" for this command. (The 'user_options' class attribute also "declares" the command's options. The two are redundant; every "foo-bar" option in 'user_options' must be matched by a "self.foo_bar = None" in 'initialize_options()', or it will all end in tears.)

  * some time later (usually immediately), the command's 'finalize_options()' method is called. The job of 'finalize_options()' is to make up the command's mind about everything that will happen when the command runs. Typical code in 'finalize_options()' is:

        if self.foo is None:
            self.foo = default value
        if self.bar is None:
            self.bar = f(self.foo)
Thus, we respect the user's value for 'foo', and have a sensible default if the user didn't provide one. And we respect the user's value for 'bar', and have a sensible -- possibly complicated -- default to fall back on.
The idea is to reduce the responsibilities of the 'run()' method, and to ensure that "full disclosure" about the command's intentions can be made before it is ever run.
To play along with this complicated dance, Distutils command classes have to provide 1) the 'user_options' class attribute, 2) the 'initialize_options()' method, and 3) the 'finalize_options()' method. (They also have to provide a 'run()' method, of course, but that has nothing to do with setting/getting option values.)
The payoff is that new command classes get all the Distutils user interface -- command-line parsing and config files, for now -- for free. The example "configure" command that I showed in a previous post, simply by virtue of having "foo-inc" and "foo-lib" in 'user_options' (and corresponding "self.xxx = None" statements in 'initialize_options()'), will automatically use the Distutils' config file and command-line parsing mechanism to set values for those options. Only if the user doesn't supply the information do we have to poke around the target system to figure out where "foo" is installed.
Nice :-)
Anyways, the point of this long-winded discussion is this: certain attributes of command objects are public and fair game for anyone to set or modify. However, there are well-defined points in the object's life-cycle *before* which it is meaningless to *get* option values, and *after* which it is pointless to *set* option values. In particular, there's no point in getting an option value *before* finalization, because -- duh -- the options aren't finalized yet. More subtly, attempting to set some option *after* finalization time might have no effect at all (if eg. that option is only used to derive other options from, like the 'build_base' option in the "build" command); or it might have complicated, undesirable effects. I can see this happening in particular with the "install" command, which (necessarily) has a frighteningly complex finalization routine.
Hmm, I still don't see why you can't add attribute access methods which check and possibly control the aforementioned problems. A few .set_this() and .get_that() methods would make the interface more transparent, add documentation (by virtue of __doc__ strings ;-) and could add check assertions.
If we go by the simple, linear state-transition diagram above, it turns out that setting option values for a particular command object is a dicey proposition: you simply don't know what state the command object is in, so you don't know what effect setting values on that command will have. If you try to force them to have the right effect, by calling 'finalize_options()', it won't work: the way that method is typically written ("if self.foo is None: self.foo = default value", for as many values of "foo" as are needed), calling it a second time just won't work.
Why not let the .set_this() method take care of getting the state right ? (or raise an exception if that's impossible)
So today, I added a couple of new transitions to that state-transition diagram. Now, you can go from any state to the "initialized" state using the 'reinitialize_command()' method provided by Distribution. So it's now safe to do something like this, eg. in a "configure" command
    build = self.reinitialize_command("build")
    build.include_dirs.append(foo_inc)
    build.library_dirs.append(foo_lib)
    build.ensure_finalized()
...and you know that any user-specified options to the "build" command will be preserved, and that all dependent-but-unspecified options will be recomputed. (You don't need to call 'ensure_finalized()' here unless you will subsequently be getting some option values from the "build" object.)
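A toy model of that behaviour, for readers who want to see the two guarantees side by side. Everything here is an assumption made for illustration: `ToyBuild`, `ToyDistribution`, the `user_options` dict and the `/opt/foo/include` path are hypothetical, and the real Distribution class is considerably more involved.

```python
class ToyBuild:
    """Hypothetical command object with one derived option (build_lib)."""

    def __init__(self):
        self.initialize_options()

    def initialize_options(self):
        self.build_base = None
        self.build_lib = None
        self.include_dirs = []
        self.finalized = False

    def finalize_options(self):
        if self.build_base is None:
            self.build_base = "build"
        if self.build_lib is None:
            self.build_lib = self.build_base + "/lib"

    def ensure_finalized(self):
        if not self.finalized:
            self.finalize_options()
            self.finalized = True


class ToyDistribution:
    """Hypothetical owner of the command; remembers what the user asked for."""

    def __init__(self, user_options):
        self.user_options = user_options      # e.g. parsed from the command line
        self.build = ToyBuild()
        self._apply_user_options(self.build)

    def _apply_user_options(self, cmd):
        for name, value in self.user_options.items():
            setattr(cmd, name, value)

    def reinitialize_command(self, cmd):
        # Back to the "initialized" state, then re-apply the user's choices
        cmd.initialize_options()
        self._apply_user_options(cmd)
        return cmd


dist = ToyDistribution({"build_base": "out"})
dist.build.ensure_finalized()                 # already finalized once

build = dist.reinitialize_command(dist.build)
build.include_dirs.append("/opt/foo/include")
build.ensure_finalized()                      # derived options are recomputed
print(build.build_base, build.build_lib, build.include_dirs)
```

In this sketch the user-supplied `build_base` survives the reset, while `build_lib` is derived again during the second finalization.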
Thus, it should now be possible to write a "configure" command that respects the bureaucracy of the Distutils *and* forces the "build" command to do The Right Thing. This is a small change to the code, but a major change to the philosophy of option-passing in the Distutils, which until now was (theoretically) "pull only": it was not considered proper or safe to assign another command's option attributes; now it is, as long as you play by the above rules. Cool!
Indeed :-)
BTW, I'm not opposed to the idea of 'get_foo()' and 'set_foo()' methods: they could add some value, but only if they are provided by the Command class, rather than each command having to implement a long list of near-identical accessor and modifier methods. Probably 'get_foo()' should die if the object hasn't been finalized, and 'set_foo()' should die if it has been finalized (or hasn't been initialized).
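One way such accessors could be supplied once by a base class is sketched below. This is a hypothetical design, not anything in the Distutils: `ToyCommand`, its `_state` attribute and the use of `__getattr__` to synthesize 'get_foo()' and 'set_foo()' are all assumptions made for the example.

```python
class ToyCommand:
    """Hypothetical base class; not the real Distutils Command."""

    def __init__(self):
        self._state = "new"
        self.initialize_options()
        self._state = "initialized"

    def initialize_options(self):
        raise NotImplementedError

    def finalize_options(self):
        raise NotImplementedError

    def ensure_finalized(self):
        if self._state != "finalized":
            self.finalize_options()
            self._state = "finalized"

    def __getattr__(self, name):
        # Only reached when normal lookup fails, so real attributes are unaffected.
        if name.startswith("get_"):
            option = name[4:]

            def getter():
                if self._state != "finalized":
                    raise RuntimeError("cannot get %r before finalization" % option)
                return getattr(self, option)

            return getter
        if name.startswith("set_"):
            option = name[4:]

            def setter(value):
                if self._state != "initialized":
                    raise RuntimeError("cannot set %r in state %r" % (option, self._state))
                setattr(self, option, value)

            return setter
        raise AttributeError(name)


class ToyBuild(ToyCommand):
    def initialize_options(self):
        self.build_base = None
        self.build_lib = None

    def finalize_options(self):
        if self.build_base is None:
            self.build_base = "build"
        if self.build_lib is None:
            self.build_lib = self.build_base + "/lib"


cmd = ToyBuild()
cmd.set_build_base("out")       # allowed: initialized but not yet finalized
cmd.ensure_finalized()
print(cmd.get_build_lib())      # allowed: the command is finalized
```

Individual commands then define only their options; calling `cmd.set_build_base(...)` after `ensure_finalized()`, or `cmd.get_build_lib()` before it, raises RuntimeError in this sketch.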
Right. I'd say: go for it ;-) In my experience, it's always better to define object access via methods rather than attributes. This is especially true when the project evolves with time: you simply forget about the details, side-effects, assertions you made months ago (and possibly forgot to document) about the specific attributes. Performance is an argument here, but in the end you pay for the few percent in performance gain with a much larger percentage in support costs...
--
Marc-Andre Lemburg
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/
Greg Ward wrote:
On 27 May 2000, M.-A. Lemburg said:
It would probably be wise to have one setup.py file per ZIP archive. The ZIP archives have predefined content and the setup.py files could have this information hard-coded somewhere.
Good point. Does this mean I don't have to worry about recursive setup scripts after all? ;-)
Not for me anymore ;-)
I would have to move away from my current installation logic though (simply unzipping and then compiling in place). With distutils help this shouldn't be much of a problem though (I hope ;-).
Yes, please do! The idea of having to unpack a distribution archive in a particular place has always deeply offended me; the fact that more than one Python module distributor thought this was a good idea is one of the things that motivated me to do the Distutils in the first place. IOW, forcing you to build in one place and install to another place is a feature.
I've always found that kind of setup convenient (and I've only gotten about 2-3 complaints about this in all the years the mx stuff has been around). With distutils I can finally get those 3 guys happy too ;-)
--
Marc-Andre Lemburg
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/
On 31 May 2000, M.-A. Lemburg said:
Why not ? The RPMs could use the existing Python installation which comes with a version of distutils (at least for 1.6) or use a copy which gets installed together with the package. The post-install script could then pass control to distutils and let it apply its magic.
True when installing an RPM into a Python 1.6 installation, but it'll be a good while before we can assume that. I want Distutils-generated RPMs to work with any version of Python that the modules-being-installed work with. (Granted, you may have to build extensions multiple times for e.g. 1.5 and 1.6. But you shouldn't need to have anything more than Python installed to install Python modules from an RPM; the same goes for other built distribution formats.) Also, even when we *can* assume that, we won't be including the setup script in the (binary) RPM (or whatever). So while we could still use the many useful utility functions provided by distutils.*_util, we can't make use of whatever goodies the developer has put in his setup script, like the name of the distribution or the modules it installs.
Hmm, I still don't see why you can't add attribute access methods which check and possibly control the aforementioned problems. A few .set_this() and .get_that() methods would make the interface more transparent, add documentation (by virtue of __doc__ strings ;-) and could add check assertions.
[...]
Why not let the .set_this() method take care of getting the state right ? (or raise an exception if that's impossible)
[...]
In my experience, it's always better to define object access via methods rather than attributes. This is especially true when the project evolves with time: you simply forget about the details, side-effects, assertions you made months ago (and possibly forgot to document) about the specific attributes.
I'm moving in the direction of more bureaucracy for command options. We're a long way from "out-of-control", but the system has grown considerably in the last couple of months. There are dependencies and interactions between commands that are only documented in code, default values, type/syntax expectations -- all sorts of things that might be better off under a more bondage 'n discipline regime. One aspect of that regime would be automatic accessor and modifier methods provided by the Command class.
However, this is not a high priority for Distutils 1.0. For now, I'm more interested in solving the problem of building, installing, and distributing Python module distributions; I'm inclined to worry about imposing more bureaucracy on the Distutils inner workings in the future.
Greg
--
Greg Ward - Unix nerd gward@python.net
http://starship.python.net/~gward/
Hand me a pair of leather pants and a CASIO keyboard -- I'm living for today!
participants (6)
- Andrew Kuchling
- Fred L. Drake
- Greg Ward
- Harry Henry Gebel
- M.-A. Lemburg
- Mark W. Alexander