Splitting large packages into multiple eggs
I'm currently refactoring PyObjC and py2app to be setuptools-friendly, but one issue I'm coming across is that PyObjC is a large package, and normally people wouldn't want to include all of it. The way to do that would be to split it up into lots of eggs, so py2app can include the subset of eggs that are necessary.

The issue at hand is how to structure the setup.py to support creation of multiple eggs, with an egg for installation purposes that depends on everything. PyObjC can be broken up into about 30 eggs, one for each package, one for the Xcode support (which depends on py2app and altgraph), one for all of the tests (or maybe separate eggs for each test suite). Obviously I'm not looking to create 30+ setup.py files, so what do I do?

-bob
At 01:37 PM 12/11/2005 -0800, Bob Ippolito wrote:
> I'm currently refactoring PyObjC and py2app to be setuptools-friendly, but one issue I'm coming across is that PyObjC is a large package, and normally people wouldn't want to include all of it. The way to do that would be to split it up into lots of eggs, so py2app can include the subset of eggs that are necessary.
>
> The issue at hand is how to structure the setup.py to support creation of multiple eggs, with an egg for installation purposes that depends on everything. PyObjC can be broken up into about 30 eggs, one for each package, one for the Xcode support (which depends on py2app and altgraph), one for all of the tests (or maybe separate eggs for each test suite). Obviously I'm not looking to create 30+ setup.py files, so what do I do?
Um, create a script to generate the setup.py files for you? ;)

If you don't mind building everything together, you could just create a data structure and then call setup() in a "for" loop that loops through all the projects... no, scratch that, it won't work unless you clean out the build directory every single time.

Seriously, distutils isn't made for this. I originally intended to make setuptools do something like it for PEAK, and then gave up because it really just doesn't work. Too many of the "install" commands depend on just copying everything that's in the build directories, which means you'll get inter-project crosstalk.

The One Obvious Way (30+ setup.py files) is in fact the only practical way without some pretty major work on the distutils or scaffolding to work around them. I'd suggest, however, that maybe one egg per package is too fine-grained, and you break it into just a handful of eggs instead.
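The "script to generate the setup.py files" suggestion could look roughly like the sketch below. It is only an illustration: the project names, package lists, version string, and the SUBPROJECTS table are all hypothetical and not taken from the PyObjC tree. Each generated file is an ordinary, independent setup.py, so every egg gets its own build directory and the crosstalk described above does not arise.

```python
import os
import tempfile

# Hypothetical subproject table; names and contents are illustrative only.
SUBPROJECTS = {
    "pyobjc-core": {"packages": ["objc"], "install_requires": []},
    "pyobjc-foundation": {
        "packages": ["Foundation"],
        "install_requires": ["pyobjc-core"],
    },
}

# Text of each generated setup.py; the fields are filled in with repr().
TEMPLATE = '''\
from setuptools import setup

setup(
    name={name!r},
    version={version!r},
    packages={packages!r},
    install_requires={install_requires!r},
)
'''


def render_setup_py(name, spec, version="0.0"):
    """Return the source text of a setup.py for one subproject."""
    return TEMPLATE.format(
        name=name,
        version=version,
        packages=spec["packages"],
        install_requires=spec["install_requires"],
    )


def write_setup_files(root, subprojects, version="0.0"):
    """Write <root>/<name>/setup.py for every subproject; return the paths."""
    written = []
    for name, spec in sorted(subprojects.items()):
        project_dir = os.path.join(root, name)
        os.makedirs(project_dir, exist_ok=True)
        path = os.path.join(project_dir, "setup.py")
        with open(path, "w") as handle:
            handle.write(render_setup_py(name, spec, version))
        written.append(path)
    return written


if __name__ == "__main__":
    # Write into a throwaway directory and list what was generated.
    with tempfile.TemporaryDirectory() as root:
        for path in write_setup_files(root, SUBPROJECTS):
            print(path)
```

Keeping the table as the single source of truth means that regrouping packages into "just a handful of eggs" is a matter of editing the table and regenerating.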
On Dec 11, 2005, at 2:11 PM, Phillip J. Eby wrote:
> Um, create a script to generate the setup.py files for you? ;)
>
> If you don't mind building everything together, you could just create a data structure and then call setup() in a "for" loop that loops through all the projects... no, scratch that, it won't work unless you clean out the build directory every single time.
>
> Seriously, distutils isn't made for this. I originally intended to make setuptools do something like it for PEAK, and then gave up because it really just doesn't work. Too many of the "install" commands depend on just copying everything that's in the build directories, which means you'll get inter-project crosstalk.
>
> The One Obvious Way (30+ setup.py files) is in fact the only practical way without some pretty major work on the distutils or scaffolding to work around them. I'd suggest, however, that maybe one egg per package is too fine-grained, and you break it into just a handful of eggs instead.
Ok, so I'll start off with a handful of setup.py files to see how it works out.

For install_requires and setup_requires, how can I let ez_setup know that the subprojects are in the same tarball relative to the main setup.py?

-bob
On Dec 11, 2005, at 2:49 PM, Bob Ippolito wrote:
> Ok, so I'll start off with a handful of setup.py files to see how it works out.
>
> For install_requires and setup_requires, how can I let ez_setup know that the subprojects are in the same tarball relative to the main setup.py?
It looks like easy_install can't do this yet. Here's the layout of what I have right now:

    http://svn.red-bean.com/pyobjc/branches/pyobjc-setuptools/

The main setup.py is at:

    http://svn.red-bean.com/pyobjc/branches/pyobjc-setuptools/setup.py

and each subproject lives in here:

    http://svn.red-bean.com/pyobjc/branches/pyobjc-setuptools/subprojects/

Each subproject contains only a setup.py... The setup.py performs its job by changing back to the main source directory and then building some stuff as if it were just the main setup.py. It's done this way because there are a lot of include files that these things share, and it makes the refactoring less painful.

It seems that the PackageIndex only wants to find eggs and source packages, but not source dirs... I can't make source packages out of them because they refer to source that exists elsewhere (in the parent tree). I'm not very familiar with the sources yet, so it would take a while for me to write such a patch.

Also, I'm going to want a way to have "setup.py develop" ensure that all of the subprojects are up to date.

-bob
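For readers unfamiliar with the "change back to the main source directory" trick, here is a minimal sketch of how a subproject setup.py might do it. This is not the code from the branch linked above; the helper names working_directory and main_source_dir are hypothetical, and the sketch assumes the subprojects/<name>/setup.py layout described in the message.

```python
import contextlib
import os


@contextlib.contextmanager
def working_directory(path):
    """Temporarily switch the current directory, restoring it afterwards."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(previous)


def main_source_dir(subproject_setup_file):
    """Map <root>/subprojects/<name>/setup.py to <root> (assumed layout)."""
    here = os.path.dirname(os.path.abspath(subproject_setup_file))
    return os.path.dirname(os.path.dirname(here))


# In a subproject's setup.py one could then write something like:
#
#     with working_directory(main_source_dir(__file__)):
#         setup(name=..., ...)   # shared include files resolve from the root
```

The drawback noted in the message still applies: such a subproject cannot be turned into a standalone source package, because the sources it builds live in the parent tree.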
At 07:02 PM 12/11/2005 -0800, Bob Ippolito wrote:
> It looks like easy_install can't do this yet.
And it's not going to be able to. The closest thing I can envision is a master project with a setup.py that runs each child project's setup.py (or just a setup() call), such that each gets passed a "bdist_egg" command by easy_install. EasyInstall already detects when multiple eggs are built by a single setup.py, and processes all of them.

That way, you could have one PyObjC project that contains all the others, and builds multiple eggs, including one that just specifies dependencies on the others.

The only possible issue that might arise is if there are inter-egg dependencies and the eggs are built out-of-order. In that case, easy_install might incorrectly conclude that it needs one of the built eggs, before it has processed it. I could probably add some code to make this more robust.

Anyway, the only limitation of this approach is that you won't be able to build any individual packages from source, only the overall package as a whole. However, it will be possible to build and even upload all the eggs using "setup.py bdist_egg upload", so easy_install will be able to find the binary packages. Or you can just have your PyPI download URL point to a directory where you dump the latest eggs; if you're running your bdist commands on the web server (or rsync it), you can use the 'rotate' command to delete outdated snapshots. But anybody who wants to build from source will have to use the master package, rather than any individual subpackages.
> Also, I'm going to want a way to have "setup.py develop" ensure that all of the subprojects are up to date.
This should work normally if your setup.py just calls setup() for each subproject and the master; again this should be in dependencies-first order.
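The "dependencies-first order" requirement can be computed rather than maintained by hand. The sketch below assumes a hypothetical PROJECTS table (the project names are invented for illustration) and uses the standard-library graphlib module, which requires Python 3.9 or later. It also shows the shape of the egg "that just specifies dependencies on the others".

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical projects mapped to the projects they require.
# py2app and altgraph are external, so they are not keys here.
PROJECTS = {
    "pyobjc-core": [],
    "pyobjc-foundation": ["pyobjc-core"],
    "pyobjc-appkit": ["pyobjc-foundation"],
    "pyobjc-xcode": ["pyobjc-appkit", "py2app", "altgraph"],
}


def build_order(projects):
    """Return the local project names with dependencies first."""
    sorter = TopologicalSorter()
    for name, requirements in projects.items():
        # Only order against projects built here; external requirements
        # are left for easy_install to resolve.
        local = [req for req in requirements if req in projects]
        sorter.add(name, *local)
    return list(sorter.static_order())


def meta_project(name, projects):
    """Keyword arguments for an egg that only depends on the others."""
    return {"name": name, "install_requires": sorted(projects)}
```

A master setup.py could then loop over build_order(PROJECTS), calling setup() once per project and finally once for meta_project("pyobjc", PROJECTS). The earlier caveat still holds: the build directory has to be cleaned between calls.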
Bob Ippolito wrote:
> The issue at hand is how to structure the setup.py to support creation of multiple eggs, with an egg for installation purposes that depends on everything. PyObjC can be broken up into about 30 eggs, one for each package, one for the Xcode support (which depends on py2app and altgraph), one for all of the tests (or maybe separate eggs for each test suite). Obviously I'm not looking to create 30+ setup.py files, so what do I do?
You may want to look at how the new scipy.distutils scheme works. We have a utility class Configuration which encapsulates everything about the setup. Eventually, it creates the **kwds for setup() from that information. The important bits of our main setup.py look something like this:

    import os
    from scipy.distutils.core import setup
    from scipy.distutils.misc_util import Configuration

    config = Configuration(
        maintainer = "SciPy Developers",
        # ...
    )

    # Force scipy to be a package (its __init__.py file comes from scipy_core)
    config.packages.append('scipy')
    config.package_dir['scipy'] = os.path.join(config.local_path,'Lib')
    config.add_subpackage('Lib')

    setup(**config.todict())

In this case, the Lib/ subdirectory in the source tree will become the main scipy package. It has its own setup.py:

    def configuration(parent_package='',top_path=None):
        from scipy.distutils.misc_util import Configuration
        config = Configuration('scipy',parent_package,top_path)
        config.add_subpackage('sandbox')
        config.add_subpackage('utils')
        config.add_subpackage('io')
        config.add_subpackage('fftpack')
        config.add_subpackage('signal')
        config.add_subpackage('integrate')
        config.add_subpackage('linalg')
        config.add_subpackage('special')
        config.add_subpackage('optimize')
        config.add_subpackage('stats')
        config.add_subpackage('interpolate')
        config.add_subpackage('sparse')
        config.add_subpackage('cluster')
        config.add_subpackage('lib')
        config.make_svn_version_py()  # installs __svn_version__.py
        config.make_config_py('__scipy_config__')
        return config

Each of these subdirectories has its own setup.py file, too. E.g.:

    def configuration(parent_package='',top_path=None):
        from scipy.distutils.misc_util import Configuration
        config = Configuration('io', parent_package, top_path)
        config.add_extension('numpyio',
                             sources = ['numpyiomodule.c'])
        config.add_data_dir('tests')
        config.add_data_dir('examples')
        config.add_data_dir('docs')
        return config

All of these file names and directories are local. I could move subpackages from scipy.sandbox.*, say, into scipy.*, and all I'd have to change are the config.add_subpackage() calls. The Configuration class does all of the bookkeeping.

One could selectively build certain subpackages. One of these days I am going to write a script that will read data from a configuration file to determine which subpackages to build.

The Configuration class seems to be fairly decoupled from the rest of scipy.distutils. I think you only have to change a few

    from scipy.distutils.core import Extension

to

    from setuptools import Extension

http://svn.scipy.org/svn/scipy_core/trunk/scipy/distutils/misc_util.py

--
Robert Kern
robert.kern@gmail.com

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
  -- Richard Harter
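The "read a configuration file to determine which subpackages to build" idea mentioned above was not written at the time of this thread, so the following is only a guess at its shape. The [build] section, the subpackages option, and the selected_subpackages helper are all hypothetical; the subpackage names are the ones listed in the Lib/ setup.py above.

```python
import configparser

# Subpackage names as listed in the Lib/ setup.py quoted above.
ALL_SUBPACKAGES = [
    "sandbox", "utils", "io", "fftpack", "signal", "integrate", "linalg",
    "special", "optimize", "stats", "interpolate", "sparse", "cluster", "lib",
]


def selected_subpackages(config_text, available=ALL_SUBPACKAGES):
    """Return the subpackages to build, in their declared order.

    Reads a hypothetical "[build] subpackages = ..." option holding a
    whitespace-separated list; with no such option, everything is built.
    """
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    if not parser.has_option("build", "subpackages"):
        return list(available)
    wanted = parser.get("build", "subpackages").split()
    unknown = [name for name in wanted if name not in available]
    if unknown:
        raise ValueError("unknown subpackages: " + ", ".join(unknown))
    return [name for name in available if name in wanted]
```

In a configuration() function, the returned names would simply replace the hard-coded sequence of config.add_subpackage() calls with a loop.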
On Dec 11, 2005, at 7:55 PM, Robert Kern wrote:
> You may want to look at how the new scipy.distutils scheme works. We have a utility class Configuration which encapsulates everything about the setup. Eventually, it creates the **kwds for setup() from that information. The important bits of our main setup.py look something like this:
That's interesting, but it's not relevant... The point is that there needs to be separate setup.py files because we're making separate eggs. I'm making separate eggs for packaging reasons. Basically:

- there's 3.4M of tests for the core
- a handful of the packages aren't compatible with some supported platforms
- the Xcode support depends on a whole slew of other packages you don't otherwise want.

This was traditionally not a problem because there was no dependency management, plus py2app simply included whatever it detected was used on a per-module level. For the next release of py2app I'm going to make it wholly include any egg that is used, which would be very bad for PyObjC if it were a single egg.

-bob
Bob Ippolito wrote:
> That's interesting, but it's not relevant... The point is that there needs to be separate setup.py files because we're making separate eggs. I'm making separate eggs for packaging reasons.
Ah, that's right. PyObjC isn't organized into a namespace package. Never mind.

--
Robert Kern