I remember that this was discussed some time ago, but at that time I
wasn't interested, and now I can't find it back in the archives. Can
someone help me out?
I want to create binary packages (i.e. packages that are usable if the
end-user doesn't have a development environment). For now I'm happy
with something that works on Unix (MacOSX to be specific).
I had always thought that bdist did this, but it turns out that it does
something completely different: it creates a tar file with all
pathnames already hard-coded. This isn't good enough, as it doesn't
allow the end user to select install location, etc.
What I want as output is a tarfile with inside it basically a tree that
has seen "setup.py build", but with some extra glue so it doesn't try
to build again. If I simply tar the tree after the build this doesn't
work: on the destination machine it still tries to build things,
probably due to differences in modification times or something. This
won't work if the end user doesn't have a c compiler, of course.
Also, I wouldn't mind losing all the source code to trim down the
archive size.
--
- Jack Jansen
Jack Jansen writes:
I remember that this was discussed some time ago, but at that time I wasn't interested, and now I can't find it back in the archives. Can someone help me out?
I want to create binary packages (i.e. packages that are usable if the end-user doesn't have a development environment). For now I'm happy with something that works on Unix (MacOSX to be specific).
I had always thought that bdist did this, but it turns out that it does something completely different: it creates a tar file with all pathnames already hard-coded. This isn't good enough, as it doesn't allow the end user to select install location, etc.
What I want as output is a tarfile with inside it basically a tree that has seen "setup.py build", but with some extra glue so it doesn't try to build again. If I simply tar the tree after the build this doesn't work: on the destination machine it still tries to build things, probably due to differences in modification times or something. This won't work if the end user doesn't have a c compiler, of course.
IIRC, I've first seen Pete Shinners doing this, so maybe you want to ask him as well. I've also thought about this for some time, and experimented a little bit. Here is a technique which seems to work (on Windows, with Python 2.2.2).

Override the sdist command in your setup script with a class which doesn't remove the build tree from the file list (I think this may be a buglet: if the build tree is specified in the MANIFEST.in file, it should NOT be removed later):

    from distutils.command import sdist

    class my_sdist(sdist.sdist):

        def prune_file_list(self):
            """Prune off branches that might slip into the file list as created
            by 'read_template()', but really don't belong there:
              * the build tree (typically "build")
              * the release tree itself (only an issue if we ran "sdist"
                previously with --keep-temp, or it aborted)
              * any RCS or CVS directories
            """
            build = self.get_finalized_command('build')
            base_dir = self.distribution.get_fullname()
            ## Commented out, because we want the build tree to be included
            ## self.filelist.exclude_pattern(None, prefix=build.build_base)
            self.filelist.exclude_pattern(None, prefix=base_dir)
            self.filelist.exclude_pattern(r'/(RCS|CVS)/.*', is_regex=1)

and later:

    setup(...,
          cmdclass = {'sdist': my_sdist},
          ...)

Then, in your MANIFEST.in file, insert something like this:

    recursive-include build *.obj *.pyd

Finally, first run 'python setup.py build', and then 'python setup.py sdist'. This creates a zip file (the default on Windows), which can be unpacked somewhere and, since the timestamps are ok, be 'built' without a C compiler (as long as you don't change the sources). I haven't tried it, but it should also be possible to include prebuilt binaries for several Python versions.
Also, I wouldn't mind losing all the source code to trim down the archive size.
Also a task for either the MANIFEST.in file, or your custom sdist class.

Thomas
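Concretely, the "lose the sources" step Thomas mentions just means filtering source files out of the manifest before the archive is written. Here is a minimal standalone sketch of that filtering logic (the patterns and file names are invented for the example; in MANIFEST.in itself the equivalent would be lines like "global-exclude *.c *.h"):

```python
import fnmatch

def prune_sources(files, source_patterns=('*.c', '*.h', '*.cpp')):
    """Return the manifest with C/C++ sources dropped, keeping built files."""
    return [f for f in files
            if not any(fnmatch.fnmatch(f, pat) for pat in source_patterns)]

manifest = ['foo/_foomodule.c', 'foo/_foomodule.pyd', 'foo/__init__.py']
print(prune_sources(manifest))   # ['foo/_foomodule.pyd', 'foo/__init__.py']
```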
Thomas Heller wrote:
Jack Jansen writes:
I want to create binary packages (i.e. packages that are usable if the end-user doesn't have a development environment). For now I'm happy with something that works on Unix (MacOSX to be specific).
This is basically what ActiveState does with their .ppm format: they ship a tarred build directory and then run something like "python setup.py install" on the target machine. At some point way back in time they wanted to make this code available to Python for general usage, but it seems they have lost interest (in so many things :-().

So at least you now know that it does work :-) distutils is your friend and can be tweaked to do many new things. Thomas already hinted at a solution. I'd just create my own bdist_prebuilt and include the needed distutils extensions right along with it.

--
Marc-Andre Lemburg
eGenix.com -- Professional Python Software directly from the Source (#1, Feb 14 2003)
Python/Zope Products & Consulting ... http://www.egenix.com/ mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
Python UK 2003, Oxford: 46 days left EuroPython 2003, Charleroi, Belgium: 130 days left
M.-A. Lemburg wrote:
This is basically what ActiveState does with their .ppm format: they ship a tarred build directory and then run something like "python setup.py install" on the target machine.
Right. There were quite a few problems with that -- many of the setup.py's for example could only work with the full source distribution, etc. etc.
At some point way back in time they wanted to make this code available to Python for general usage, but it seems they have lost interest (in so many things :-().
Aw, don't be so mean. The PyPPM stuff has two parts -- the server, which is written in Perl (we use the PPM server that was developed for Perl a long time ago), and the client. The client isn't pretty code, it's very brittle, etc. I don't think it's a favor to anyone to use it.

I am very interested in doing this still, and I'm hoping that we can find some resources to do it "right" at some point. It's true that we have had to focus on some things beyond others. MAL -- if there are other things that you feel we've dropped the ball on, please let me know (preferably off-list -- unless it has to do with distutils). We're still going to keep ActivePython up to date. We've kept adding Python features to Komodo, we're working on the next generation of Visual Python, we still maintain the ASPN Cookbook, the mailing list archives, etc. PyPPM is what I consider our biggest 'dropped ball', and while I can explain how that ball got dropped, that doesn't really get us here or there.

Cheers,
--da
David Ascher wrote:
M.-A. Lemburg wrote:
This is basically what ActiveState does with their .ppm format: they ship a tarred build directory and then run something like "python setup.py install" on the target machine.
Right. There were quite a few problems with that -- many of the setup.py's for example could only work with the full source distribution, etc. etc.
At some point way back in time they wanted to make this code available to Python for general usage, but it seems they have lost interest (in so many things :-().
Aw, don't be so mean.
Ah, not being mean... it's just that I have asked so many times to make the code public and nothing ever happened.
The PyPPM stuff has two parts -- the server, which is written in Perl (we use the PPM server that was developed for Perl a long time ago), and the client. The client isn't pretty code, it's very brittle, etc. I don't think it's a favor to anyone to use it.
Even if it's ugly code, it could still a) provide information on how this can be done, and b) open up the world to ActivePython.

There has been some work put into a package directory recently (see http://www.python.org/peps/pep-0301.html) and registering .ppm-like files in such a directory would sure make the Python experience an even better one :-)

Here's a demo: http://www.amk.ca/cgi-bin/pypi.cgi
I am very interested in doing this still, and I'm hoping that we can find some resources to do it "right" at some point. It's true that we have had to focus on some things beyond others.
That would be great :-)

The system is missing the "get the stuff and install it" part (which is good, since it's better to do this in small steps rather than coming up with big ideas and then losing interest).
"M.-A. Lemburg"
There has been some work put into a package directory recently (see http://www.python.org/peps/pep-0301.html) and registering .ppm like files in such a directory would sure make the Python experience an even better one :-)
Here's a demo: http://www.amk.ca/cgi-bin/pypi.cgi
I am very interested in doing this still, and I'm hoping that we can find some resources to do it "right" at some point. It's true that we have had to focus on some things beyond others.
That would be great :-)
The system is missing the "get the stuff and install it" part (which is good, since its better to do this in small steps rather than coming up with big ideas and then losing interest).
The system is missing more, IMO, and that's the critical part: there should be a way to programmatically find out which file I have to download. I suggested this when the PEP was posted, but it wasn't included in the PEP afaik. Now they have added a download-url field, but there must be a convention for how to interpret its value IMO (depending on platform, version, and so on).

Again: determining from a Python script which dist-file I have to download is the most critical part; all the other stuff has been demonstrated in several implementations (ciphon, pypan, maybe more).

Thomas
Thomas Heller wrote:
"M.-A. Lemburg"
writes: There has been some work put into a package directory recently (see http://www.python.org/peps/pep-0301.html) and registering .ppm like files in such a directory would sure make the Python experience an even better one :-)
Here's a demo: http://www.amk.ca/cgi-bin/pypi.cgi
I am very interested in doing this still, and I'm hoping that we can find some resources to do it "right" at some point. It's true that we have had to focus on some things beyond others.
That would be great :-)
The system is missing the "get the stuff and install it" part (which is good, since its better to do this in small steps rather than coming up with big ideas and then losing interest).
The system is missing more, IMO, and that's the critical part: there should be a way to programmatically find out which file I have to download. I suggested this when the PEP was posted, but it wasn't included in the PEP afaik.
Now they have added a download-url field, but there must be a convention for how to interpret its value IMO (depending on platform, version, and so on).
Again: determining from a Python script which dist-file I have to download is the most critical part, all the other stuff has been demonstrated in several implementations (ciphon, pypan, maybe more).
Agreed. There should be a list of entries (distutils platform string, pyversion, disttype, download URL), with enough information to let distutils decide which version to download. I suppose that the register command could be made to generate most of this information from the data used to build the package.
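The selection logic being discussed could be sketched as a best-match lookup over such (platform, pyversion, disttype, URL) tuples. A hedged sketch only: the 'any' platform wildcard and the entry values and URLs below are my own invention for illustration, not anything distutils or PEP 301 defines:

```python
def pick_download(entries, platform, pyversion, prefer=('bdist', 'sdist')):
    """Pick the best matching distribution URL from a list of entries.

    entries: (platform, pyversion, disttype, url) tuples; a platform of
    'any' is assumed here to match every platform.
    """
    candidates = [e for e in entries
                  if e[0] in (platform, 'any') and e[1] == pyversion]
    for disttype in prefer:          # prefer binary over source dists
        for e in candidates:
            if e[2] == disttype:
                return e[3]
    return None

entries = [
    ('win32', '2.2', 'bdist', 'http://example.org/foo-1.0.win32-py2.2.zip'),
    ('any',   '2.2', 'sdist', 'http://example.org/foo-1.0.tar.gz'),
]
print(pick_download(entries, 'win32', '2.2'))   # the win32 bdist URL
```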
M.-A. Lemburg wrote:
Ah, not being mean... it's just that I have asked so many times to make the code public and nothing ever happened.
The client code ships with ActivePython. Feel free to look at it, and then tell me if you really want it =). On the PPD-generation front, we basically define a new distutils target, and then do "python setup.py bdist_ppm" and then package the setup.py and the distributions subdirectories. The details of bdist_ppm change periodically, and I know I've given that code to some people; I don't remember if I sent it to you. Regardless, it's not a solid foundation for the future.

The fundamental problem is that people write setup.py's which aren't tested in the case where the "install" phase is done on a different machine than the build. So the build phase may do lots of machine-specific checking, and then the install phase breaks if e.g. Python is installed in a different directory, or it's built on win2k and installed on win9x, etc. The way people manage 'supplementary' files also varies a great deal between packages. It's also very hard to do packages which install different things depending on what is available on the target machine.

I think that the "right" answer involves a _lot_ of work specifying and validating the definitions for the targets, _if that approach is to be used_. I'm not 100% sure that that approach is the right one. The bdist_wininst approach seems to work much better in practice, although I'm not sure that it covers all the cases that I wanted to cover (things getting installed in the Tools directory with proper shebang line tweaking, etc.).
There has been some work put into a package directory recently (see http://www.python.org/peps/pep-0301.html) and registering .ppm like files in such a directory would sure make the Python experience an even better one :-)
Yes, I know about PyPI, and I'm definitely hoping that we'll be able to help provide PPM-style functionality for everything registered in PyPI.
The system is missing the "get the stuff and install it" part (which is good, since its better to do this in small steps rather than coming up with big ideas and then losing interest).
One problem with how PyPPM has "happened" in the past has been that the people doing it at ActiveState either weren't knowledgeable enough about Python or distutils, or didn't have the bandwidth needed to effect the required changes in distutils or in people's setup.py's. (I still think that shipping setup.py's as opposed to using a declarative syntax was a mistake, but that's water under the bridge, although maybe we could build a dam). Distutils is a pretty scary beast, and build and installation is a nasty domain in general because there are so many different variations.

If I estimated how much time it would take to do PyPPM right from scratch, it would probably amount to six man-months of work of my best build engineer. That's something that I need to justify with a business case, something that's harder than some people expect. =) In the Perl world, PPM is a no-brainer for us to fund, because PPM is only practically available through ActivePerl, and we get a lot of business benefits by providing ActivePerl for free. In the Python world, every time we talk about doing PPM-style things with ActivePython, key people say "I don't care about ActivePython, make it open source so that we can add it to the core" (note that ActivePython users don't say that =). While I completely understand the reaction, I'm sure you can understand that it makes the business case quite tricky.

The good news is that we're reviewing our PPM strategy, and that we may be able to approach the problem in more effective ways than we have in the past. I think that PyPI will help as well because it provides a centralized point by which we can e.g. contact all of the maintainers. We could also presumably add functionality to the PyPI server that validates whether one can make a PPD out of the package, etc. I'm willing to spend some time talking about this, just not able to commit to spending significant development time or promise deliverables.

--david
On Friday, Feb 14, 2003, at 21:19 Europe/Amsterdam, David Ascher wrote:
The good news is that we're reviewing our PPM strategy, and that we may be able to approach the problem in more effective ways than we have in the past. I think that PyPI will help as well because it provides a centralized point by which we can e.g. contact all of the maintainers. We could also presumably add functionality to the PyPI server that validates whether one can make a PPD out of the package, etc.
I'm willing to spend some time talking about this, just not able to commit to spending significant development time or promise deliverables.
I'm interested. I'm working on a thing called the Package Install
Manager for Python (pimp) right now. I was planning not to go into
design discussions until 2.3 is out and only at that time write up a
PEP or something, but as pimp is 90% done I guess I can spare the time.
I wanted to wait because pimp is serving a very real need I have with
MacPython: on OS9 the MacPython distributions have always contained a
lot of extra goodies (Numeric, PIL), and people have come to expect
that. This was eating into my time for every distribution, though, and
I want to get rid of it. But Mac users often don't have a development
environment, and even if they do having them type in distutils commands
isn't exactly considered good style. (Pet peeve: to me "open source"
means that you may download the source, you can build it yourself and
you have the option to modify it. Unfortunately the actual state of
affairs is that in reality it means you must download the source, you
will build it yourself and you shall modify it before it works.)
So the idea behind pimp is that there is an online database of packages
that is specific to an (OS-version, python-version) combination and
that has a maintainer who is responsible for adding only packages that
have been tested and tried. Preferably the database should have source
and binary versions of all packages, so the end user can state a
preference for either. The database also has dependency information,
and bits of code so pimp can actively test whether a specific package
is installed in your python, etc. etc. The idea of having the database
be specific to an os/python version happened to make the problem
manageable, but the more I think about it the more I like it, and I'm
now considering making it (together with having a
scapeg^h^h^h^h^hmaintainer) a central part of the design :-)
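As a rough illustration of the kind of record such a database might hold, each entry could pair metadata and dependency information with a snippet of test code the installer runs to see whether the package is already present. The field names and check logic below are my own guesses for illustration, not pimp's actual schema:

```python
# Hypothetical database record; pimp's real format may differ.
package = {
    'name': 'Numeric',
    'version': '22.0',
    'depends': [],
    'flavor': 'binary',                 # or 'source', per user preference
    'installtest': 'import Numeric',    # code run to detect the package
}

def is_installed(record):
    """Run the record's install-test snippet; failure means not installed."""
    try:
        exec(record['installtest'])
        return True
    except ImportError:
        return False

print(is_installed({'installtest': 'import sys'}))   # stdlib module: True
```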
I hope that people will appear who want to maintain the database, and
even if I have to do it myself it's still less of a bottleneck if I
don't have to do it at the same time of making sure everything in a new
Python release works.
If people want to have a look at pimp: it's in Lib/plat-mac/pimp.py.
It's probably MacOSX-specific right now, but that shouldn't be too
difficult to fix.
--
- Jack Jansen
Jack Jansen wrote:
I'm interested. I'm working on a thing called the Package Install Manager for Python (pimp)
I'm going to say this just once. Please change the name. It's funny, but it's not 'serious'. I told the Piddle folks the same thing, and they regretted not changing it after it was too late. There, it's off my conscience =).
right now. I was planning not to go into design discussions until 2.3 is out and only at that time write up a PEP or something, but as pimp is 90% done I guess I can spare the time.
Funny -- not going into design discussions until after you're done =).
But Mac users often don't have a development environment, and even if they do having them type in distutils commands isn't exactly considered good style.
Absolutely. Same problem w/ Windows
So the idea behind pimp is that there is an online database of packages that is specific to a (OS-version, python-version) combination and that has a maintainer who is responsible for not adding only packages that have been tested and tried. Preferably the database should have source and binary version of all packages, so the end user can state a preference for either. The database also has dependency information, and bits of code so pimp can actively test whether a specific package is installed into your python, etc etc.
This is exactly what PPM was designed for, and it works (95% of the time) for Perl. See e.g. http://aspn.activestate.com/ASPN/Downloads/ActivePerl/PPM/Packages

The way it works as a user is that if you have ActivePerl installed you can type 'ppm install Yada-Yada-Yada' and it will work (on Solaris only, according to the table above =). We update that from CPAN regularly (now), using an automated build system. We have two people dedicated to that at this time. This is a non-trivial resource commitment, both in hardware and in personnel. The idea behind PyPPM is that we would do the same for Python.

The problems came about because, as everyone here knows:
- there is no CPAN (with the actual sources, not just a catalog)
- for a long time, there was no distutils
- even with distutils, the process is much harder than in Perl. I'm not 100% sure I understand why. One hunch I have is that 95% of CPAN is "strictly" modules. No front-end tools, no Start Menu configurations, very few dependencies on third-party packages, etc. (Also, 90% of CPAN is junk IMO, but that's another problem ;-).

Regardless of what else you do, I suggest you consider the Open Software Description Format: http://www.w3.org/TR/NOTE-OSD

PPD (the format used in PPM) is a slight derivative of that.
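To make the OSD reference concrete, here is a minimal descriptor in roughly that style, parsed with the standard library. The element names (SOFTPKG, IMPLEMENTATION, CODEBASE) follow the OSD note, but the package details are invented for the example and this is not the exact PPD dialect:

```python
import xml.etree.ElementTree as ET

# A minimal OSD-style package descriptor (hypothetical contents).
osd = """\
<SOFTPKG NAME="Foo" VERSION="1.0">
  <TITLE>Foo</TITLE>
  <ABSTRACT>An example package</ABSTRACT>
  <IMPLEMENTATION>
    <OS VALUE="MacOS" />
    <CODEBASE HREF="http://example.org/Foo-1.0.tar.gz" />
  </IMPLEMENTATION>
</SOFTPKG>
"""

root = ET.fromstring(osd)
impl = root.find('IMPLEMENTATION')
print(root.get('NAME'), root.get('VERSION'))      # Foo 1.0
print(impl.find('CODEBASE').get('HREF'))          # the download URL
```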
I hope that people will appear who want to maintain the database, and even if I have to do it myself it's still less of a bottleneck if I don't have to do it at the same time of making sure everything in a new Python release works.
That job is what ActiveState does for Perl for three platforms right now. Note that there's more to do than maintaining the database - there's building the packages and testing them (and in our case, serving them).
If people want to have a look at pimp: it's in Lib/plat-mac/pimp.py. It's probably MacOSX-specific right now, but that shouldn't be too difficult to fix.
Famous last words =). Seriously, though -- cross-platform setup.py's are hard to come by as soon as you deviate from pure Python modules. They tend to fail if e.g. the shared libraries installed are of different versions. Some setup.py's create scripts -- are those installed in Tools, in the base Python directory, where? Are their shebang lines tweaked on install? Do they even get mentioned in the manifest and put in the distribution directories? Etc.

I'm not trying to criticize the authors of setup.py's, btw! I just know from painful experience that building cross-platform build systems takes a _huge_ amount of time. Unfortunately, the highest value Python packages tend to rely on extension modules (one exception to that general statement is platform.py, one of my favorite pure-python modules), and so it's the edge cases that cause the greatest grief on the user side. For example, PIL is notoriously hard to build -- it's not because Fredrik is incompetent or mean -- it's because the number of combinations that have to be dealt with is large, and it's very hard to know when you're "done" except through slow, distributed, painful testing in the field.

Don't get me wrong -- I think it's great that you're doing this, but I want to suggest that you're not 90% done =).

--david
On Friday, Feb 14, 2003, at 18:00 Europe/Amsterdam, David Ascher wrote:
M.-A. Lemburg wrote:
This is basically what ActiveState does with their .ppm format: they ship a tarred build directory and then run something like "python setup.py install" on the target machine.
Right. There were quite a few problems with that -- many of the setup.py's for example could only work with the full source distribution, etc. etc.
I've already given up on the idea, mainly for this reason. Plus, I want
a solution that doesn't need modification of the setup.py script. For
the immediate future I'm sticking with bdist_dumb distributions, and on
the receiving system I can do one of three things:
1. Just unpack it and hope for the best:-)
2. Check the common prefix of the files in the distro (unfortunately
filenames seem to have both ./usr/local/... and usr/local form) and
check that that matches sys.prefix.
3. In the distribution database record the prefix used to create the
archive, and when unpacking the files substitute sys.prefix for
recorded prefix.
At the moment I do (1), but I plan to switch to 2 or 3 shortly
(hopefully before 2.3a2).
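Option 3 above could be sketched as a small helper that, given the prefix recorded at archive-creation time, rewrites each member path to live under the local sys.prefix. This is only an illustration of the idea (archive handling omitted, paths invented); a real implementation should also check for a path-component boundary after the prefix:

```python
import posixpath
import sys

def relocate(path, recorded_prefix, new_prefix=None):
    """Map an archive member path recorded under one prefix onto a new one.

    Handles both the 'usr/local/...' and './usr/local/...' forms seen
    in bdist_dumb archives.
    """
    if new_prefix is None:
        new_prefix = sys.prefix
    path = path.lstrip('./')                # drop leading './' or '/' chars
    recorded = recorded_prefix.lstrip('./')
    if not path.startswith(recorded):       # simplified: no boundary check
        raise ValueError('%s is not under %s' % (path, recorded_prefix))
    rel = path[len(recorded):].lstrip('/')
    return posixpath.join(new_prefix, rel)

print(relocate('./usr/local/lib/python2.3/site-packages/foo.py',
               'usr/local', '/opt/python'))
# -> /opt/python/lib/python2.3/site-packages/foo.py
```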
It would be nice if at some point in the future the two phases of
distutils (build and install) could be cleanly separated. The files are
mostly separated (although I think that things like scripts and
datafiles don't go via the build subtree, but are installed straight
from the source, right?), but there should be a way in which distutils
could skip the build steps. Although some stuff from the build may need
to be executed, I've seen setup scripts where install actually looks
inside the build object to obtain data...
--
- Jack Jansen