Packaging and binary distributions
I'd like to reopen the discussions on how the new packaging module will handle/support binary distributions in Python 3.3. The previous thread (see http://mail.python.org/pipermail/python-dev/2011-October/113956.html) included a lot of good information and discussion, but ultimately didn't reach any firm conclusions.
First question - is this a Windows-only problem, or do Unix/MacOS users want binary support? My feeling is that it's not an issue for them, at least not enough that anyone has done anything about it in the past, so I'll focus on Windows here.
Second question - is there a problem at all? For the majority of Windows users, I suspect not. The existing bdist_wininst and bdist_msi formats have worked fine for a long time, offer Windows integration and a GUI installer, and in the case of MSI offer options for integrating with corporate distribution policies that some users consider significant, if not essential. (Binary eggs are a third, and somewhat odd, case - a number of projects have started distributing binary eggs, but I don't know what benefits they have over bdist_wininst in particular, as easy_install will read bdist_wininst installers. Perhaps a setuptools/distribute user could comment. For now I'll assume that binary eggs will slowly go away as packaging gets more widely adopted.)
So that leaves a minority who (1) prefer integration with packaging, (2) need to work with virtual environments or custom local builds, (3) need binary extensions in some or all of their environments and (4) don't want to have to build all the binaries they need from scratch.
Given the scale of the issue, it seems likely that putting significant effort into addressing it is unwise. In particular, it seems unlikely that developers are going to move en masse to a new distribution format just to cater for this minority. On the other hand, for people who care, the fact that packaging (currently) offers no direct support for consuming binary distributions is a fairly obvious hole. And having to build from source just to install into a virtual environment could be a showstopper.
The bdist_wininst format is relatively amenable to manipulation - it's little more than a zip file, after all. So writing 3rd party code to install the contents via packaging shouldn't be hard (I've done some proof of concept work, and it isn't :-)). Vinay's proposal to use the resource mechanism and some custom hooks would work, but I'd like to see a small amount of extra direct support added to packaging to make things cleaner. Also, if packaging supported plugins to recognise new distribution formats, this would make it possible to integrate the extra code seamlessly.
The MSI format is a little more tricky, mainly because it is a more complex format and (as far as I can tell from a brief check) files are stored in the opaque CAB format, so the only way of getting data out is to do a temporary install somewhere. But I see no reason why that isn't achievable.
So, my proposal is as follows:
1. I will write a 3rd party module to take bdist_wininst and bdist_msi modules and install them using packaging
2. Where packaging changes are useful to make installing binaries easier, I'll request them (by supplying patches)
3. I'll look at creating a format-handling plugin mechanism for packaging. If it's viable, I'll post patches
4. If it seems useful, my module could be integrated into the core packaging module
I don't intend to do anything about a GUI, or modify the existing formats at all. These don't interest me, particularly, so I'll leave them to someone who has a clear picture of what they want in those areas, and the time to develop it.
For 3.3 at least, I'd expect developers to continue distributing bdist_wininst or bdist_msi format files. We'll see what happens with binary eggs.
Unix/MacOS users who care will need to propose something themselves.
Does anyone have any comments?
Paul.
Paul Moore wrote:
I'd like to reopen the discussions on how the new packaging module will handle/support binary distributions in Python 3.3. The previous thread (see http://mail.python.org/pipermail/python-dev/2011-October/113956.html) included a lot of good information and discussion, but ultimately didn't reach any firm conclusions.
First question - is this a Windows only problem, or do Unix/MacOS users want binary support? My feeling is that it's not an issue for them, at least not enough that anyone has done anything about it in the past, so I'll focus on Windows here.
I haven't been following this discussion that closely, but I'm rather surprised that the need for binary distributions for Python packages on non-Windows platforms would be in question. Just as on Windows, it's not a given that all Unix or Mac OS X end-user systems will have the necessary development tools installed (C compiler, etc.) to build C extension modules. Today, the most platform-independent way of distributing these is with binary eggs: the individual binary eggs are, of course, not platform-independent, but the distribution and installation mechanism is or should be. Sure, there are other ways, like pushing the problem back to the OS distributor (e.g. Debian, Red Hat, et al.) or, as in the case of Mac OS X where there isn't a system package manager in the same sense, to a third-party package distributor (like MacPorts, Homebrew, or Fink). Or you can produce platform-specific installers for each platform, which also seems heavyweight.
Has anyone analyzed the current packages on PyPI to see how many provide binary distributions and in what format?
-- Ned Deily, nad@acm.org
On 10/30/2011 02:04 PM, Ned Deily wrote:
Has anyone analyzed the current packages on PyPI to see how many provide binary distributions and in what format?
Practically speaking, nobody but Windows consumers *needs* binary packages on PyPI: even if the target ("production") box is crippled^Wstripped of its compiler, such environments always have "staging" hosts which can be used to build binary packages for internal distribution. Windows users are the only ones who routinely don't have access to a compiler at all. Even trying to push binary distributions to PyPI for Linux is a nightmare (e.g., due to UCS2 / UCS4 incompatibility).
Tres.
-- Tres Seaver, tseaver@palladion.com, Palladion Software
On 30 October 2011 18:04, Ned Deily wrote:
Has anyone analyzed the current packages on PyPI to see how many provide binary distributions and in what format?
A very quick and dirty check:
dmg: 5
rpm: 12
msi: 23
dumb: 132
wininst: 364
egg: 2570
That's the number of packages with binary distributions in each format. It's hard to be sure about egg distributions, as many of these could be pure-python (there's no way I know, from the PyPI metadata, to check this).
This is 2913 packages with some form of binary distribution out of 16615 that have a single release on PyPI. I skipped 398 with multiple releases as I wasn't sure how to capture the data for those... I suspect they include some important cases, though (I know lxml is in there, for example).
So: 17% of packages have any binary release. Of those, 88% have eggs, 12% have wininst and the rest are under 5%. Put another way, 2% of all packages have wininst installers. And 15% have eggs. That's not a lot.
Paul.
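The percentages in that summary can be re-derived from the quoted counts. The sketch below is only a check of the arithmetic; the names (FORMAT_COUNTS, share, summarize) are made up for illustration, and it assumes the figures were truncated to whole percentages rather than rounded.

```python
# Counts quoted in the message above: packages offering each binary format.
FORMAT_COUNTS = {"dmg": 5, "rpm": 12, "msi": 23, "dumb": 132, "wininst": 364, "egg": 2570}
WITH_BINARY = 2913      # packages with at least one binary distribution
SINGLE_RELEASE = 16615  # packages with a single release on PyPI


def share(part, whole):
    """Return part/whole as a percentage, truncated to a whole number."""
    return int(100 * part / whole)


def summarize(counts=FORMAT_COUNTS, with_binary=WITH_BINARY, total=SINGLE_RELEASE):
    """Map each format to (percent of binary packages, percent of all packages)."""
    return {fmt: (share(n, with_binary), share(n, total)) for fmt, n in counts.items()}


if __name__ == "__main__":
    print("any binary release: %d%%" % share(WITH_BINARY, SINGLE_RELEASE))
    for fmt, (of_binary, of_all) in summarize().items():
        print("%-8s %3d%% of binary packages, %3d%% of all" % (fmt, of_binary, of_all))
```

Note that the per-format counts add up to more than 2913, which is consistent with some packages shipping more than one binary format.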
Paul Moore writes:
The MSI format is a little more tricky, mainly because it is a more complex format and (as far as I can tell from a brief check) files are stored in the opaque CAB format, so the only way of getting data out is to do a temporary install somewhere. But I see no reason why that isn't achievable.
It's not just about getting the data out of the CAB, though - it's also about integration with Add/Remove Programs and the rest of the Windows Installer ecosystem.
1. I will write a 3rd party module to take bdist_wininst and bdist_msi modules and install them using packaging
It would be important to retain the flexibility currently offered by setup.cfg hooks, as I don't believe any out-of-the-box approach will work for the wide range of use cases on Windows (think Powershell scripts, Visio templates and other Microsoft Office integration components). I'm also not sure if these formats provide all the flexibility required - e.g. they may be fine for extension modules, but how do they handle packaging include files?
For 3.3 at least, I'd expect developers to continue distributing bdist_wininst or bdist_msi format files. We'll see what happens with binary eggs.
Unix/MacOS users who care will need to propose something themselves.
I'm not sure there's anything especially Windows-specific about the bdist_wininst format, apart from the prepended GUI executable. One drawback of any current scheme is that if you're packaging an extension module that runs on say Windows, Linux and Mac OS X, there's no easy way to build or distribute a single archive (for a given version of Python, say) which has all the binary variants you want to include, such that at installation time, only the bits relevant to the target platform are installed. The current packaging functionality does sort of support this, but it entails potentially tedious manual editing of the setup.cfg file to add information about what resources apply to which platform - the kind of tedious editing which would be obviated by the right kind of additional support code. Regards, Vinay Sajip
I like binary distribution even under Linux. I have access to some Linux machines that run the same Linux distribution, and some of them don't have the "python-dev" package or even "build-essential" (because they are netbooting, so they have a restricted rootfs size). So I want to build binary packages myself and distribute them to a virtualenv on such machines. In this case, the absolute path of the virtualenv is not fixed, so "bdist_dumb --relative" or egg is good for me.
On Sun, Oct 30, 2011 at 11:09 PM, Paul Moore wrote:
I'd like to reopen the discussions on how the new packaging module will handle/support binary distributions in Python 3.3. The previous thread (see http://mail.python.org/pipermail/python-dev/2011-October/113956.html) included a lot of good information and discussion, but ultimately didn't reach any firm conclusions.
--
INADA Naoki
On 30 October 2011 23:17, Vinay Sajip wrote:
Paul Moore writes:
The MSI format is a little more tricky, mainly because it is a more complex format and (as far as I can tell from a brief check) files are stored in the opaque CAB format, so the only way of getting data out is to do a temporary install somewhere. But I see no reason why that isn't achievable.
It's not just about getting the data out of the CAB, though - it's also about integration with Add/Remove Programs and the rest of the Windows Installer ecosystem.
Hang on. I'm talking here about repackaging the binary files in the MSI file for use in a pysetup install invocation. As pysetup has no GUI, and doesn't integrate with Add/Remove, there's no issue here. If you want a GUI and Add/Remove integration, just run the MSI. Or am I missing something? We seem to be at cross purposes here, I suspect I'm missing your point.
1. I will write a 3rd party module to take bdist_wininst and bdist_msi modules and install them using packaging
It would be important to retain the flexibility currently offered by setup.cfg hooks, as I don't believe any out-of-the-box approach will work for the wide range of use cases on Windows (think Powershell scripts, Visio templates and other Microsoft Office integration components).
Why? Again, if this is purely as a means to consume bdist_xxx files, then the only flexibility needed is enough to cater for any variations in data stored in the bdist_xxx format. The wininst format is easy here - it has directories PLATLIB, PURELIB, DATA, SCRIPTS and HEADERS (corresponding to the installation --install-xxx parameters) and that's all. As long as the module is flexible enough to deal with that, it can read anything bdist_wininst can produce.
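As a rough illustration of how little is needed to read such an installer, here is a minimal sketch that lists the members of a wininst file grouped by the five directories named above. The function name group_wininst_members is hypothetical, and the sketch assumes the zip data sits at the end of the .exe so that the standard zipfile module can open the installer directly.

```python
import io
import zipfile

# Top-level directories named in the message above.
WININST_DIRS = ("PLATLIB", "PURELIB", "DATA", "SCRIPTS", "HEADERS")


def group_wininst_members(installer):
    """Group archive members by their top-level wininst directory.

    `installer` is a path or a seekable binary file object. zipfile locates
    the archive from the end of the file, so the prepended executable is skipped.
    """
    groups = {name: [] for name in WININST_DIRS}
    with zipfile.ZipFile(installer) as archive:
        for member in archive.namelist():
            top, _, rest = member.partition("/")
            # Keep files only; directory entries and unknown prefixes are ignored here.
            if top in groups and rest and not member.endswith("/"):
                groups[top].append(rest)
    return groups
```

A consumer would then copy each group to the matching install location; that step is deliberately left out of this sketch.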
I'm also not sure if these formats provide all the flexibility required - e.g. they may be fine for extension modules, but how do they handle packaging include files?
Ah, I think I see what you are getting at. If someone uses the new features and flexibility of packaging to create a fancy custom install scheme, how do they bundle up a binary distribution from that? My (current) answer is that I don't know. The packaging module as it stands only offers the legacy bdist_xxx formats, so the answer is "run pysetup run bdist_wininst on it". If that breaks (as it is likely to - wininst format isn't very flexible) then tough, you're out of luck. I 100% agree that having a "native" packaging means of building binary distributions from source ones, which captures all of the necessary information to cover any flexibility available to setup.cfg, would be good. But that's potentially a much bigger project than I can manage. My bdist_simple format was based off bdist_dumb/bdist_wininst and had the same limitations as that. You might be able to get somewhere by running build, then zipping up the whole directory, source, build subdirectory and all. Then on the target machine, unzip and do a --skip-build install. That's a bit of a hack, but should in theory work. Whether it's the basis of a sensible distribution format I don't know.
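The "zip up the whole directory after building" hack could look roughly like the sketch below. It covers only the archive handling with the standard shutil module; the build step and the --skip-build install are not shown, and the function names are hypothetical.

```python
import os
import shutil


def pack_build_tree(project_dir, archive_base):
    """Zip the whole project directory (sources plus build/) and return the archive path.

    archive_base is the archive path without the .zip suffix; it should live
    outside project_dir so the archive does not end up containing itself.
    """
    if not os.path.isdir(os.path.join(project_dir, "build")):
        raise ValueError("no build/ subdirectory; run the build step first")
    return shutil.make_archive(archive_base, "zip", root_dir=project_dir)


def unpack_build_tree(archive_path, target_dir):
    """Unpack the archive on the target machine; the install would then run from target_dir."""
    shutil.unpack_archive(archive_path, target_dir, "zip")
    return target_dir
```

This only works if the build and target machines are binary-compatible (same platform and Python version), which is the same constraint any binary format would have.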
For 3.3 at least, I'd expect developers to continue distributing bdist_wininst or bdist_msi format files. We'll see what happens with binary eggs.
Unix/MacOS users who care will need to propose something themselves.
I'm not sure there's anything especially Windows-specific about the bdist_wininst format, apart from the prepended GUI executable. One drawback of any current scheme is that if you're packaging an extension module that runs on say Windows, Linux and Mac OS X, there's no easy way to build or distribute a single archive (for a given version of Python, say) which has all the binary variants you want to include, such that at installation time, only the bits relevant to the target platform are installed. The current packaging functionality does sort of support this, but it entails potentially tedious manual editing of the setup.cfg file to add information about what resources apply to which platform - the kind of tedious editing which would be obviated by the right kind of additional support code.
Again, I agree that this would be useful. Not something I have the time to look at though (although if someone else picks it up, I'd be interested in doing some testing and maybe contributing to the work). I think I now see why we're not understanding each other. I'm coming from the position that the projects I care about (as an end user) use bdist_wininst or bdist_msi at the moment, so all I want is a way of using, as a consumer, those existing distributions (or something equivalent in power) to install the packages via pysetup (which gets me the ability to install in development builds and venvs). I see why a more powerful binary format would be nice for developers, but as an end user I have no direct need for it. Thanks for your patience. Paul.
Paul Moore writes:
Hang on. I'm talking here about repackaging the binary files in the MSI file for use in a pysetup install invocation. As pysetup has no GUI, and doesn't integrate with Add/Remove, there's no issue here. If you want a GUI and Add/Remove integration, just run the MSI. Or am I missing something? We seem to be at cross purposes here, I suspect I'm missing your point.
As you say later in your post, we're probably just coming at this from two different perspectives. I think you mentioned the possible need to install to a temporary location just to extract files from the CAB; then you would presumably need to uninstall again to remove the Add/Remove Programs entry created when you installed to the temporary location (or else I misunderstood your meaning here).
It would be important to retain the flexibility offered by setup.cfg hooks, as I don't believe any out-of-the-box approach will work for the range of use cases on Windows (think Powershell scripts, Visio templates and other Microsoft Office integration components).
Why? Again, if this is purely as a means to consume bdist_xxx files, then the only flexibility needed is enough to cater for any variations in data stored in the bdist_xxx format. The wininst format is easy here - it has directories PLATLIB, PURELIB, DATA, SCRIPTS and HEADERS (corresponding to the installation --install-xxx parameters) and that's all. As long as the module is flexible enough to deal with that, it can read anything bdist_wininst can produce.
My point is really that a one-size-fits-all DATA location is unlikely to cater to all use cases. The flexibility offered by setup.cfg together with hooks gets around the limitation of a single location for data.
Ah, I think I see what you are getting at. If someone uses the new features and flexibility of packaging to create a fancy custom install scheme, how do they bundle up a binary distribution from that? My (current) answer is that I don't know. The packaging module as it stands only offers the legacy bdist_xxx formats, so the answer is "run pysetup run bdist_wininst on it". If that breaks (as it is likely to - wininst format isn't very flexible) then tough, you're out of luck.
Yes, that's what I was getting at. Regards, Vinay Sajip
On 10/30/2011 5:14 PM, Tres Seaver wrote:
On 10/30/2011 02:04 PM, Ned Deily wrote:
I haven't been following this discussion that closely but I'm rather surprised that the need for binary distributions for Python packages on non-Windows platforms would be in question. Just as on Windows, it's not a given that all Unix or Mac OS X end-user systems will have the necessary development tools installed (C compiler, etc) to build C extension modules. Today, the most platform-independent way of distributing these are with binary eggs: the individual binary eggs are, of course, not platform-independent but the distribution and installation mechanism is or should be. Sure, there are other ways, like pushing the problem back to the OS distributor (e.g. Debian, Red Hat, et al) or, as in the case of Mac OS X where there isn't a system package manager in the same sense, to a third-party package distributor (like MacPorts, Homebrew, or Fink). Or you can produce platform-specific installers for each platform which also seems heavy-weight.
I don't think pushing it back to the OS vendor solves the problem. Say I want to install these binary packages with buildout: how would it go about consuming an RPM to install in an isolated buildout directory?
Has anyone analyzed the current packages on PyPI to see how many provide binary distributions and in what format?
Practically speaking, nobody but Windows consumers *needs* binary packages on PyPI: even if the target ("production") box is crippled^Wstripped of its compiler, such environments always have "staging" hosts which can be used to build binary packages for internal distribution.
It might be true that such systems don't need binary packages on PyPI, but the original question is about binary package support for the packaging module on non-Windows systems. I think the answer is clearly "yes": I have such systems without compilers. If I build packages on a staging server, I would want to put them on an internal PyPI-like server, for consumption by packaging. So packaging would need to consume these binary packages. Eric.
On 31.10.2011 09:07, Vinay Sajip wrote:
Paul Moore writes:
Hang on. I'm talking here about repackaging the binary files in the MSI file for use in a pysetup install invocation. As pysetup has no GUI, and doesn't integrate with Add/Remove, there's no issue here. If you want a GUI and Add/Remove integration, just run the MSI. Or am I missing something? We seem to be at cross purposes here, I suspect I'm missing your point.
As you say later in your post, we're probably just coming at this from two different perspectives. I think you mentioned the possible need to install to a temporary location just to extract files from the CAB; then you would presumably need to uninstall again to remove the Add/Remove Programs entry created when you installed to the temporary location (or else I misunderstood your meaning here).
This presumption is false (as is the claim that you need to install the MSI to get at the files). It's quite possible to extract the files from the MSI without performing the installation. There are actually two ways to do that: a) perform an "administrative" installation, which unpacks the files to disk but doesn't actually perform any installation procedure, or b) use the MSI API to extract first the CAB file, and then the files in the CAB file. This would be a bit of work to do if you want to find out the full path names of the individual files, but it could work in theory.
Why? Again, if this is purely as a means to consume bdist_xxx files, then the only flexibility needed is enough to cater for any variations in data stored in the bdist_xxx format. The wininst format is easy here - it has directories PLATLIB, PURELIB, DATA, SCRIPTS and HEADERS (corresponding to the installation --install-xxx parameters) and that's all. As long as the module is flexible enough to deal with that, it can read anything bdist_wininst can produce.
My point is really that a one-size-fits-all DATA location is unlikely to cater to all use cases. The flexibility offered by setup.cfg together with hooks gets around the limitation of a single location for data.
I'm sure bdist_wininst can be augmented to support arbitrary "base prefixes" (assuming that is the flexibility you talk about). It would just need a list of what directory names are prefixes.
The MSI format is designed to provide exactly that flexibility of arbitrarily mapping source folders to destination folders during installation. bdist_msi would just need to be taught to interpret setup.cfg files.
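To make the "list of prefixes" idea concrete, here is a minimal sketch of resolving an archive member to its destination. The mapping from the five wininst prefixes to sysconfig path names is an assumption made for illustration (in particular HEADERS to "include"), and extra_prefixes stands in for whatever additional base prefixes an augmented format might declare; none of this is an existing packaging API.

```python
import os
import sysconfig

# Assumed mapping from wininst archive prefixes to sysconfig path names.
PREFIX_TO_SCHEME_KEY = {
    "PURELIB": "purelib",
    "PLATLIB": "platlib",
    "SCRIPTS": "scripts",
    "DATA": "data",
    "HEADERS": "include",
}


def destination_for(member, paths=None, extra_prefixes=None):
    """Return the install path for an archive member such as 'PURELIB/pkg/mod.py'.

    `paths` defaults to sysconfig.get_paths() for the running interpreter;
    `extra_prefixes` maps additional (hypothetical) prefix names to directories.
    """
    if paths is None:
        paths = sysconfig.get_paths()
    prefix, _, rest = member.partition("/")
    if extra_prefixes and prefix in extra_prefixes:
        base = extra_prefixes[prefix]
    elif prefix in PREFIX_TO_SCHEME_KEY:
        base = paths[PREFIX_TO_SCHEME_KEY[prefix]]
    else:
        raise ValueError("unknown prefix: %r" % prefix)
    return os.path.join(base, *rest.split("/"))
```

Passing an explicit paths dictionary is what would let the same archive be installed into a venv or a development build rather than the system Python.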
Ah, I think I see what you are getting at. If someone uses the new features and flexibility of packaging to create a fancy custom install scheme, how do they bundle up a binary distribution from that? My (current) answer is that I don't know. The packaging module as it stands only offers the legacy bdist_xxx formats, so the answer is "run pysetup run bdist_wininst on it". If that breaks (as it is likely to - wininst format isn't very flexible) then tough, you're out of luck.
Yes, that's what I was getting at.
Hmm. You are just describing a bug, not an inherent limitation. Regards, Martin
On 31 October 2011 10:42, "Martin v. Löwis" wrote:
This presumption is false (as is the claim that you need to install the MSI to get at the files). It's quite possible to extract the files from the MSI without performing the installation. There are actually two ways to do that: a) perform an "administrative" installation, which unpacks the files to disk but doesn't actually perform any installation procedure, or b) use the MSI API to extract first the CAB file, and then the files in the CAB file. This would be a bit of work to do if you want to find out the full path names of the individual files, but it could work in theory.
Yes, I'm currently doing an administrative install via msiexec to get the files out. It's simple enough to do.
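For reference, an administrative install driven from Python might look like the sketch below. It is not the code referred to above; the function names are hypothetical, and it assumes the usual msiexec switches (/a for an administrative install, /qn for no UI, and the TARGETDIR property for the output directory).

```python
import subprocess
import sys


def admin_install_command(msi_path, target_dir):
    """Build the msiexec command line for an administrative install.

    /a requests the administrative install, /qn suppresses the UI, and
    TARGETDIR names the directory that receives the unpacked files.
    """
    return ["msiexec", "/a", msi_path, "/qn", "TARGETDIR=" + target_dir]


def extract_msi(msi_path, target_dir):
    """Unpack an MSI into target_dir without registering it (Windows only)."""
    if sys.platform != "win32":
        raise RuntimeError("msiexec is only available on Windows")
    subprocess.check_call(admin_install_command(msi_path, target_dir))
```

Absolute paths are the safer choice for both arguments, since msiexec resolves TARGETDIR on its own terms rather than relative to the caller's working directory.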
My point is really that a one-size-fits-all DATA location is unlikely to cater to all use cases. The flexibility offered by setup.cfg together with hooks gets around the limitation of a single location for data.
I'm sure bdist_wininst can be augmented to support arbitrary "base prefixes" (assuming that is the flexibility you talk about). It would just need a list of what directory names are prefixes.
The MSI format is designed to provide exactly that flexibility of arbitrarily mapping source folders to destination folders during installation. bdist_msi would just need to be taught to interpret setup.cfg files.
Agreed - the "one size fits all" data location is a limitation. I'm not sure that in practical terms it is a big issue, though - it's been like that since the wininst format was designed, and nobody has ever complained. There are certainly cases where packages have needed to implement more or less clumsy workarounds (for example, not including documentation in binary distributions) but it's obviously never been enough of an issue to prompt people to fix it. The egg format has the same limitation, as far as I'm aware, so clearly even the "eggs solve everything" crowd don't feel it's a real issue :-)
Ah, I think I see what you are getting at. If someone uses the new features and flexibility of packaging to create a fancy custom install scheme, how do they bundle up a binary distribution from that? My (current) answer is that I don't know. The packaging module as it stands only offers the legacy bdist_xxx formats, so the answer is "run pysetup run bdist_wininst on it". If that breaks (as it is likely to - wininst format isn't very flexible) then tough, you're out of luck.
Yes, that's what I was getting at.
Hmm. You are just describing a bug, not an inherent limitation.
Precisely. And it's a bug that no-one has felt the need to fix in many years. The flexibility is not new - distutils had at least as much flexibility if not more. I'd love to see a binary format that was as flexible and powerful as building from source, which allowed OS integration where the user wanted it while still supporting venvs and non-system installations, and which was widely adopted by distribution authors. Oh, and can I have a pony? :-) Sadly, I don't have the time or understanding of the various requirements to deliver something like that. Realistically, I'd just like to be able to benefit from the generosity of existing distribution authors who make compiled versions of their code available, however they choose to do so. Hence my current focus on consuming existing formats (and even the bdist_simple proposal/patch was little more than a tidied up bdist_wininst made OS-neutral). Paul.
Paul Moore writes:
Agreed - the "one size fits all" data location is a limitation. I'm not sure that in practical terms it is a big issue, though - it's been like that since the wininst format was designed, and nobody has ever complained. There are certainly cases where packages have needed to implement more or less clumsy workarounds (for example, not including documentation in binary distributions) but it's obviously never been enough of an issue to prompt people to fix it. The egg format has the same limitation, as far as I'm aware, so clearly even the "eggs solve everything" crowd don't feel it's a real issue
Yes, but with setup.py you had the option of running any Python code to move things around using a post-install script, so people could get around those limitations, albeit in a completely ad hoc way. So there was nothing to fix, but no standard way of achieving what you wanted in out-of-the-ordinary scenarios.
I'd love to see a binary format that was as flexible and powerful as building from source, which allowed OS integration where the user wanted it while still supporting venvs and non-system installations, and which was widely adopted by distribution authors. Oh, and can I have a pony? Sadly, I don't have the time or understanding of the various requirements to deliver something like that.
Well, from the point of view of venvs and PEP 404, it's certainly topical and worth trying to get some traction behind this particular pony. If bdist_pony is easy enough to use and doesn't close any existing doors, then there's no obvious reason why distribution authors wouldn't use it for future releases of their distributions. Regards, Vinay Sajip
Martin v. Löwis writes:
This presumption is false (as is the claim that you need to install the MSI to get at the files). It's quite possible to extract the files from the MSI without performing the installation. There are actually two ways to do that: a) perform an "administrative" installation, which unpacks the files to disk but doesn't actually perform any installation procedure, or b) use the MSI API to extract first the CAB file, and then the files in the CAB file. This would be a bit of work to do if you want to find out the full path names of the individual files, but it could work in theory.
I'd completely forgotten about the administrative installation - thanks for reminding me.
The MSI format is designed to provide exactly that flexibility of arbitrarily mapping source folders to destination folders during installation. bdist_msi would just need to be taught to interpret setup.cfg files.
I agree in principle, but one thing you get with setup.cfg which seems harder to achieve with MSI is the use of Python to do things at installation time. For example, with setup.cfg hooks, you can use ctypes to make Windows API calls at installation time to decide where to put things. While this same flexibility exists in the MSI format (with custom actions and so forth) it's not as readily accessible to someone who wants to use Python to code this type of installation logic.
Hmm. You are just describing a bug, not an inherent limitation.
You're right that it's not an inherent limitation, but I'm not sure which bug you're referring to. Do you mean just a current limitation? Regards, Vinay Sajip
On Mon, 31 Oct 2011 05:59:09 -0400, "Eric V. Smith" wrote:
It might be true that such systems don't need binary packages on PyPI, but the original question is about binary package support for the packaging module on non-Windows systems. I think the answer is clearly "yes": I have such systems without compilers. If I build packages on a staging server, I would want to put them on an internal PyPI-like server, for consumption by packaging. So packaging would need to consume these binary packages.
And it's not only compilers, it's also external libraries (which are generally not installed by default). For example, to compile pyOpenSSL, you first need to fetch the OpenSSL development headers. Regards Antoine.
On 31 October 2011 14:22, Antoine Pitrou wrote:
On Mon, 31 Oct 2011 05:59:09 -0400, "Eric V. Smith" wrote:
It might be true that such systems don't need binary packages on PyPI, but the original question is about binary package support for the packaging module on non-Windows systems. I think the answer is clearly "yes": I have such systems without compilers. If I build packages on a staging server, I would want to put them on an internal PyPI-like server, for consumption by packaging. So packaging would need to consume these binary packages.
And it's not only compilers, it's also external libraries (which are generally not installed by default). For example, to compile pyOpenSSL, you first need to fetch the OpenSSL development headers.
It sounds to me like there's a clear interest in some level of binary distribution support from packaging. Could anyone comment on whether the current level of support is sufficient? (My instinct says it isn't, but I don't want to put words in people's mouths). If not, a PEP may be the best way to move this forward, but as things stand I'm not entirely clear what that PEP should be proposing. My inclination (to make packaging and pysetup install capable of reading existing binary formats) doesn't seem to be sufficient for most people. Does anyone want to work with me on coming up with a PEP? Paul. PS Should this discussion move somewhere else? Maybe python-ideas or distutils-sig? I'm not sure it's well-formed enough for python-dev at the moment...
Hi,
I'd like to reopen the discussions on how the new packaging module will handle/support binary distributions in Python 3.3. The previous thread (see http://mail.python.org/pipermail/python-dev/2011-October/113956.html) included a lot of good information and discussion, but ultimately didn't reach any firm conclusions.
I’m sorry there was no reply from the core group of packaging contributors. I read the messages as they flew by and wanted to reply on a lot of points, but didn’t get the time to do it. I hope the list subscribers won’t mind if I go through the threads in the coming days and make many replies. Cheers
Paul Moore wrote:
On 30 October 2011 18:04, Ned Deily wrote: Has anyone analyzed the current packages on PyPI to see how many provide binary distributions and in what format?
A very quick and dirty check:
dmg: 5 rpm: 12 msi: 23 dumb: 132 wininst: 364 egg: 2570
That's number of packages with binary distributions in that format. It's hard to be sure about egg distributions, as many of these could be pure-python (there's no way I know, from the PyPI metadata, to check this).
Thanks. If you have access to the egg file name, you should be able to tell. AFAIK, eggs with extension modules include the Distutils platform name in the file name preceded by a '-', so '-linux', '-win32', '-macosx' for the main ones. Pure python eggs do not contain a platform name. http://pypi.python.org/pypi/pyinterval/ is a random example of the former. -- Ned Deily, nad@acm.org
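The file name rule described above can be sketched in Python as follows. This assumes the conventional egg layout name-version-pyX.Y[-platform].egg; the regular expression, the function name and the sample file names are illustrative assumptions, not code taken from setuptools or from the analysis in this thread.

```python
import re

# Assumed layout: name-version-pyX.Y[-platform].egg
# Name and version are taken to contain no '-' characters.
EGG_NAME = re.compile(
    r"^(?P<name>[^-]+)"
    r"-(?P<version>[^-]+)"
    r"-py(?P<pyver>[^-]+)"
    r"(?:-(?P<platform>.+))?"
    r"\.egg$"
)


def egg_platform(filename):
    """Return the platform tag of an egg file name, or None if pure Python.

    Raises ValueError if the name does not follow the assumed layout.
    """
    match = EGG_NAME.match(filename)
    if match is None:
        raise ValueError("unrecognised egg file name: %r" % filename)
    return match.group("platform")


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    for name in ("example-1.0-py2.7.egg", "example-1.0-py2.7-win32.egg"):
        print(name, "->", egg_platform(name))
```

Names that do not fit the layout raise an error rather than being guessed at, which makes it easier to spot bad data in a bulk scan.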
On 31 October 2011 18:36, Ned Deily wrote:
Paul Moore wrote:
On 30 October 2011 18:04, Ned Deily wrote: Has anyone analyzed the current packages on PyPI to see how many provide binary distributions and in what format?
A very quick and dirty check:
dmg: 5 rpm: 12 msi: 23 dumb: 132 wininst: 364 egg: 2570
That's number of packages with binary distributions in that format. It's hard to be sure about egg distributions, as many of these could be pure-python (there's no way I know, from the PyPI metadata, to check this).
Thanks. If you have access to the egg file name, you should be able to tell. AFAIK, eggs with extension modules include the Distutils platform name in the file name preceded by a '-', so '-linux', '-win32', '-macosx' for the main ones. Pure python eggs do not contain a platform name. http://pypi.python.org/pypi/pyinterval/ is a random example of the former.
136 architecture-specific
2502 architecture independent
About 5%. The numbers don't quite add up, so there's some funnies in there (possibly bad data that I'm not handling well) but it gives an idea.
Counts by architecture:
    win32                   70
    linux-i686              43
    win-amd64               33
    linux-x86_64            26
    macosx-10.3-fat         12
    macosx-10.5-i386        11
    macosx-10.6-universal    9
    macosx-10.6-fat          8
    macosx-10.3-i386         7
    macosx-10.6-i386         6
    macosx-10.7-intel        4
    macosx-10.6-intel        3
    macosx-10.6-x86_64       2
    macosx-10.3-ppc          2
    macosx-10.4-i386         2
    macosx-10.4-ppc          2
    py2.3-linux-i686         1
    py2.4-linux-i686         1
    gnu-0.3-i686-AT386       1
    linux-ppc                1
    cygwin-1.5.25-i686       1
    py2.3                    1
    py2.4                    1
    py2.5                    1
    macosx-10.7-x86_64       1
    macosx-10.4-universal    1
    py2.5-linux-i686         1
Most of the 1-counts are bad data in some form.
I'm not sure what this proves, to be honest, but what I take from it is:
- Nearly all binary distributions are for Windows
- Architecture-neutral eggs are common (but not relevant here as packaging can install from source with these)
- Ignoring architecture-neutral eggs, most popular formats are wininst, egg, dumb(!!!) and msi
- Even the most popular binary format (wininst) only accounts for 2% of all packages.
Having said all of this, there are two major caveats I'd include:
- Not everything is on PyPI.
- This analysis ignores relative importance. It's hard to claim that numpy is no more significant than, say, "Products.CMFDynamicViewFTI" (whatever that might be - I picked it at random, so apologies to the author :-))
Paul.
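For anyone wanting to check the arithmetic, the sketch below recomputes the "about 5%" share and totals the egg counts by platform family. The counts are copied from the message above; the grouping rule (prefix matching on the tag) is an assumption made for illustration, and the totals cover egg platform tags only, so a single package with eggs for several platforms is counted once per tag.

```python
# Egg counts by platform tag, copied from the message above.
EGG_COUNTS = {
    "win32": 70, "linux-i686": 43, "win-amd64": 33, "linux-x86_64": 26,
    "macosx-10.3-fat": 12, "macosx-10.5-i386": 11,
    "macosx-10.6-universal": 9, "macosx-10.6-fat": 8,
    "macosx-10.3-i386": 7, "macosx-10.6-i386": 6, "macosx-10.7-intel": 4,
    "macosx-10.6-intel": 3, "macosx-10.6-x86_64": 2, "macosx-10.3-ppc": 2,
    "macosx-10.4-i386": 2, "macosx-10.4-ppc": 2, "py2.3-linux-i686": 1,
    "py2.4-linux-i686": 1, "gnu-0.3-i686-AT386": 1, "linux-ppc": 1,
    "cygwin-1.5.25-i686": 1, "py2.3": 1, "py2.4": 1, "py2.5": 1,
    "macosx-10.7-x86_64": 1, "macosx-10.4-universal": 1,
    "py2.5-linux-i686": 1,
}

ARCH_SPECIFIC = 136
ARCH_INDEPENDENT = 2502


def platform_family(tag):
    """Map a platform tag to a coarse family (illustrative rule only)."""
    if tag.startswith("win"):
        return "windows"
    if tag.startswith("macosx"):
        return "macosx"
    if "linux" in tag:
        return "linux"
    return "other"


def totals_by_family(counts):
    """Sum the per-tag counts into per-family totals."""
    totals = {}
    for tag, count in counts.items():
        family = platform_family(tag)
        totals[family] = totals.get(family, 0) + count
    return totals


# Share of eggs that are architecture-specific, as a percentage.
share = 100.0 * ARCH_SPECIFIC / (ARCH_SPECIFIC + ARCH_INDEPENDENT)

if __name__ == "__main__":
    print("architecture-specific share: %.1f%%" % share)
    print(totals_by_family(EGG_COUNTS))
```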
On Sun, Oct 30, 2011 at 6:52 PM, Paul Moore wrote:
On 30 October 2011 18:04, Ned Deily wrote: Has anyone analyzed the current packages on PyPI to see how many provide binary distributions and in what format?
A very quick and dirty check:
dmg: 5 rpm: 12 msi: 23 dumb: 132 wininst: 364 egg: 2570
That's number of packages with binary distributions in that format. It's hard to be sure about egg distributions, as many of these could be pure-python (there's no way I know, from the PyPI metadata, to check this).
FYI, the egg filename will contain a distutils platform identifier (e.g. 'win32', 'macosx', 'linux', etc.) after the 'py2.x' tag if the egg is platform-specific. Otherwise, it's pure Python.
Urgh. I guess that was already answered. Guess this'll teach me not to reply to a thread before waiting for ALL the messages to download over a low-bandwidth connection... (am on the road at the moment and catching up on stuff in spare cycles - sorry for the noise)
On Fri, Nov 4, 2011 at 10:24 PM, PJ Eby wrote:
On Sun, Oct 30, 2011 at 6:52 PM, Paul Moore wrote:
On 30 October 2011 18:04, Ned Deily wrote: Has anyone analyzed the current packages on PyPI to see how many provide binary distributions and in what format?
A very quick and dirty check:
dmg: 5 rpm: 12 msi: 23 dumb: 132 wininst: 364 egg: 2570
That's number of packages with binary distributions in that format. It's hard to be sure about egg distributions, as many of these could be pure-python (there's no way I know, from the PyPI metadata, to check this).
FYI, the egg filename will contain a distutils platform identifier (e.g. 'win32', 'macosx', 'linux', etc.) after the 'py2.x' tag if the egg is platform-specific. Otherwise, it's pure Python.
I agree in principle, but one thing you get with setup.cfg which seems harder to achieve with MSI is the use of Python to do things at installation time. For example, with setup.cfg hooks, you can use ctypes to make Windows API calls at installation time to decide where to put things. While this same flexibility exists in the MSI format (with custom actions and so forth) it's not as readily accessible to someone who wants to use Python to code this type of installation logic.
Again, that's a bdist_msi implementation issue. It could generate custom actions that run the "proper" setup.cfg hooks (I presume - I have no idea what a setup.cfg hook actually is). Regards, Martin
Martin v. Löwis writes:
Again, that's a bdist_msi implementation issue. It could generate custom actions that run the "proper" setup.cfg hooks (I presume - I have no idea what a setup.cfg hook actually is).
I know that custom hooks are quite powerful, but my comment was about having the functionality in Python. Here's an example of a working hooks.py:

    import os
    import sys

    if os.name == 'nt':
        def get_personal_path():
            from ctypes import (wintypes, windll, create_unicode_buffer,
                                WinError, c_int, HRESULT)
            from ctypes.wintypes import HWND, HANDLE, DWORD, LPWSTR, MAX_PATH

            CSIDL_PERSONAL = 5

            # We use an older API to remain XP-compatible.
            SHGetFolderPath = windll.shell32.SHGetFolderPathW
            SHGetFolderPath.argtypes = [HWND, c_int, HANDLE, DWORD, LPWSTR]
            SHGetFolderPath.restype = DWORD
            path = create_unicode_buffer(MAX_PATH)
            hr = SHGetFolderPath(0, CSIDL_PERSONAL, 0, 0, path)
            if hr != 0:
                raise WinError()
            return path.value

        path = get_personal_path()
        del get_personal_path

        # Assume ~\Documents\WindowsPowerShell\Modules is in $PSModulePath,
        # which should be true in a default installation of PowerShell 2.0.
        psroot = os.path.join(path, 'WindowsPowerShell')
        psmodules = os.path.join(psroot, 'Modules')
        psscripts = os.path.join(psroot, 'Scripts')

    def setup(config):
        files = config['files']
        if os.name != 'nt':
            files_to_add = 'virtualenvwrapper.sh = {scripts}'
        else:
            files_to_add = ('winfiles/ *.ps* = '
                            '{psmodules}/virtualenvwrapper\n'
                            'winfiles/ vew_profile.ps1 = {psscripts}')
        if 'resources' not in files:
            files['resources'] = files_to_add
        else:
            files['resources'] += '\n%s' % files_to_add

    def pre_install_data(cmd):
        if os.name == 'nt':
            cmd.categories['psmodules'] = psmodules
            cmd.categories['psscripts'] = psscripts
            cmd.categories['psroot'] = psroot

which works with the following setup.cfg:

    [global]
    setup_hooks = hooks.setup

    [install_data]
    pre-hook.win32 = hooks.pre_install_data
    categories =
        cat1 = /path/one
        # comment
        cat2 = /path/two

    #[install_dist]
    #post-hook.win32 = hooks.post_install_dist

    [metadata]
    name = nemo
    version = 0.1
    summary = New Environments Made, Obviously
    description = A tool to manage virtual environments
    download_url = UNKNOWN
    home_page = https://bitbucket.org/vinay.sajip/nemo
    author = Vinay Sajip
    author_email = vinay_sajip@yahoo.co.uk
    license = BSD
    classifier =
        Development Status :: 3 - Alpha
        Programming Language :: Python :: 3
        Operating System :: OS Independent
        Intended Audience :: System Administrators
        Intended Audience :: Developers
        License :: OSI Approved :: BSD License
    requires_python = >= 3.3

    [files]
    packages =
        nemo
        virtualenvwrapper
    scripts =
        nemo = nemo.main
    extra_files =
        hooks.py
        winfiles/*
    # Additional resources are added in hooks based on platform
    resources =
        nemo/scripts/** = {purelib}

I'm curious to know how this level of flexibility can be achieved with the MSI format: I know one can code the equivalent logic in C (for example) in a custom action, but don't know how you can keep the logic in Python. Regards, Vinay Sajip
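To make the role of the categories concrete, here is a small sketch of how a resources line such as 'winfiles/ vew_profile.ps1 = {psscripts}' could be split and its destination expanded once the install-time hook has filled in the category paths. Both helper functions and the sample paths are hypothetical; this is not how the packaging module itself implements resource handling.

```python
def parse_resource_line(line):
    """Split 'source = destination' into its two stripped halves."""
    source, sep, destination = line.partition("=")
    if not sep:
        raise ValueError("missing '=' in resource line: %r" % line)
    return source.strip(), destination.strip()


def expand_destination(template, categories):
    """Substitute {category} placeholders using the categories mapping."""
    try:
        return template.format(**categories)
    except KeyError as exc:
        raise ValueError("unknown category: %s" % exc.args[0]) from None


if __name__ == "__main__":
    # Hypothetical category values, as a pre-install hook might set them.
    categories = {
        "psmodules": "C:/Users/me/Documents/WindowsPowerShell/Modules",
        "psscripts": "C:/Users/me/Documents/WindowsPowerShell/Scripts",
    }
    source, dest = parse_resource_line(
        "winfiles/ *.ps* = {psmodules}/virtualenvwrapper")
    print(source, "->", expand_destination(dest, categories))
```

The point of the sketch is that the category values are only known on the target machine, so the expansion step has to happen at installation time even if the resource lines themselves are fixed earlier.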
On 7 November 2011 09:26, Vinay Sajip wrote:
Martin v. Löwis writes: Again, that's a bdist_msi implementation issue. It could generate custom actions that run the "proper" setup.cfg hooks (I presume - I have no idea what a setup.cfg hook actually is).
I know that custom hooks are quite powerful, but my comment was about having the functionality in Python. Here's an example of a working hooks.py:
It seems to me that there are two separate things going on in this sample. It's not 100% clear that they are separate, at first glance, as packaging currently doesn't make a strong distinction between things going on at "build" time, and things going on at "install" time. This is essentially because the idea of binary installs is not fundamental to the design. (Thanks for sharing this example, btw, I hadn't really spotted this issue until I saw the code here).
Suppose you have two people involved - the "packager" who uses the source code to create a binary distribution (MSI, wininst, zip, doesn't matter - conceptually, it's a set of "final" files that need no further processing and can just be put in the correct locations on the target PC) and the "end user" who takes that binary distribution and installs it on his PC.
Some of the hook code is designed to run at "build" time (the stuff that adds the right resource files). This can be run on the packager's machine quite happily, as long as the packager is using the same OS as the end user. However, other parts of the hook code (the stuff that defines the custom categories) must run on the end user's PC, as it detects specific aspects of the target PC configuration.
I think Martin is only really interested in the second type of hook here. I know that I am, insofar as they are the only type I would expect to need to support if I were building a new binary distribution format. But without the two types being more clearly separated, it's not obvious that it's possible to "just support one type" in quite that sense...
Paul.
PS There are subtleties here, of course - byte-compiling .py files is probably an install-time action rather than a build-time one, so my "no further processing required" comment isn't 100% true. But the basic principle certainly applies.
It seems to me that there are two separate things going on in this sample. It's not 100% clear that they are separate, at first glance, as packaging currently doesn't make a strong distinction between things going on at "build" time, and things going on at "install" time. This is essentially because the idea of binary installs is not fundamental to the design. (Thanks for sharing this example, btw, I hadn't really spotted this issue until I saw the code here).
Suppose you have two people involved - the "packager" who uses the source code to create a binary distribution (MSI, wininst, zip, doesn't matter - conceptually, it's a set of "final" files that need no further processing and can just be put in the correct locations on the target PC) and the "end user" who takes that binary distribution and installs it on his PC.
Some of the hook code is designed to run at "build" time (the stuff that adds the right resource files). This can be run on the packager's machine quite happily, as long as the packager is using the same OS as the end user. However, other parts of the hook code (the stuff that defines the custom categories) must run on the end user's PC, as it detects specific aspects of the target PC configuration.
In this case at least, the code *all* runs at installation time: the distributed package contains all files for all platforms, and at installation time the choice is made as to which files to actually install from the installation directory to the target directories. While this might not be ideal for all packagers, the only downside of having all files for all platforms available in a single distribution is disk space - an increasingly cheap commodity. OTOH there is some advantage in having a single package which would be usable on all platforms (supported by the package being distributed), albeit perhaps for a particular version of Python.
I think Martin is only really interested in the second type of hook here. I know that I am, insofar as they are the only type I would expect to need to support if I were building a new binary distribution format. But without the two types being more clearly separated, it's not obvious that it's possible to "just support one type" in quite that sense...
In terms of the flexibility required, the code to determine the "personal path" is being run at installation time, to determine the target folder for the PowerShell scripts. It's this kind of flexibility (by which I mean Python coded logic) that I don't see how to easily provide in the MSI format, short of recoding in e.g. C, in a custom action DLL or EXE. (This latter approach is what I've used in the PEP 397 launcher MSI.) Regards, Vinay Sajip
I'm curious to know how this level of flexibility can be achieved with the MSI format: I know one can code the equivalent logic in C (for example) in a custom action, but don't know how you can keep the logic in Python.
I'd provide a fixed custom action which gets hold of the installer session, and then runs a Python script. IIUC, it should be possible to map categories to entries in the Directory table, so that the Python script would actually configure the installer process before the installer actually starts installing the files. The DLL could be part of packaging, similar to how the bdist_wininst executable is part of distutils. Regards, Martin
I'd provide a fixed custom action which gets hold of the installer session, and then runs a Python script. IIUC, it should be possible to map categories to entries in the Directory table, so that the Python script would actually configure the installer process before the installer actually starts installing the files. The DLL could be part of packaging, similar to how the bdist_wininst executable is part of distutils.
Presumably the code in the DLL would need to be independent of Python, and find the correct Python version to run? Perhaps a variable in the .MSI could serve to indicate the version dependency. It's certainly feasible, but needs specifying in more detail ... Regards, Vinay Sajip
participants (10)
- Martin v. Löwis
- Antoine Pitrou
- Eric V. Smith
- INADA Naoki
- Ned Deily
- Paul Moore
- PJ Eby
- Tres Seaver
- Vinay Sajip
- Éric Araujo