Packaging multiple wheels in the same archive
Hey all,

As part of working on Cloudify (http://github.com/cloudify-cosmo), we've had to provide a way for customers to install our plugins in environments where PyPI isn't accessible. These plugins are sets of Python packages which necessarily depend on one another (i.e. a regular Python package with dependencies).

We decided that we want to take sets of wheels created or downloaded by `pip wheel`, add relevant metadata, package them together into a single archive (tar.gz or zip), and use the same tool that packs them up to install them later on the destination hosts. We came up with a tool (http://github.com/cloudify-cosmo/wagon) to do just that and that's what we currently use to create and install our plugins. While wheel solves the problem of generating wheels, there is no single, standard method for taking an entire set of dependencies packaged in a single location and installing them in a different location.

We thought it would be a good idea to propose a PEP for that and wanted to get your feedback before we start writing the proposal. Our proposed implementation is not the issue here, of course; the question is whether you think there should be a PEP describing the way multiple wheels should be packaged together to create a standalone installable package.

We would greatly appreciate your feedback on this. Thanks!
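For illustration, here is a minimal sketch of that create step; the archive layout, the metadata.json filename and its fields are assumptions made up for this example, not part of wagon or any proposal:

    import json
    import subprocess
    import tarfile
    from pathlib import Path

    def create_bundle(package, output="bundle.tgz", wheelhouse="wheelhouse"):
        """Build/download wheels for a package and all of its dependencies,
        add a small metadata file, and pack everything into one tar.gz."""
        Path(wheelhouse).mkdir(exist_ok=True)
        # Collect the package and its entire dependency tree as wheels.
        subprocess.check_call(
            ["pip", "wheel", package, "--wheel-dir", wheelhouse])
        # Hypothetical metadata describing the bundle contents.
        metadata = {
            "package": package,
            "wheels": sorted(p.name for p in Path(wheelhouse).glob("*.whl")),
        }
        Path(wheelhouse, "metadata.json").write_text(json.dumps(metadata, indent=2))
        with tarfile.open(output, "w:gz") as tar:
            tar.add(wheelhouse, arcname="wheelhouse")
        return output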
[Some folks are going to get this twice - unfortunately, Google's mailing list mirrors are fundamentally broken, so replies to them don't actually go to the original mailing list properly]

(Note for context: I stumbled across Wagon recently, and commented that we don't currently have a good target-environment-independent way of bundling up a set of wheels as a single transferable unit)

On 23 November 2016 at 03:44, Nir Cohen <nir36g@gmail.com> wrote:
We came up with a tool (http://github.com/cloudify-cosmo/wagon) to do just that and that's what we currently use to create and install our plugins. While wheel solves the problem of generating wheels, there is no single, standard method for taking an entire set of dependencies packaged in a single location and installing them in a different location.
Where I see this being potentially valuable is in terms of having a common "multiwheel" transfer format that can be used for cases where the goal is essentially wheelhouse caching and transfer. The two main cases I'm aware of where this comes up:

- offline installation support (i.e. the Cloudify plugins use case, where the installation environment doesn't have networked access to an index server)
- saving and restoring the wheelhouse cache (e.g. this comes up in container build pipelines)

The latter problem arises from an issue with the way some container build environments (most notably Docker's) currently work: they always run in a clean environment, which means they can't see the host's wheel cache. One of the solutions to this is to let container builds specify a "cache state" which is archived by the build management service at the end of the build process, and then restored when starting the next incremental image build.

This kind of cache transfer is already *possible* today, but having a standardised way of doing it makes it easier for people to write general purpose tooling around the concept, without requiring that the tool used to create the archive be the same tool used to unpack it at install time.

Cheers, Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
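A rough sketch of that save/restore-the-cache idea, assuming pip's default cache location on Linux (~/.cache/pip); the archive name and helper functions are invented for this example:

    import tarfile
    from pathlib import Path

    # Assumed default pip cache location on Linux.
    PIP_CACHE = Path.home() / ".cache" / "pip"

    def save_cache(archive="pip-cache.tar.gz"):
        """Archive the local pip cache at the end of a build."""
        with tarfile.open(archive, "w:gz") as tar:
            tar.add(PIP_CACHE, arcname="pip")
        return archive

    def restore_cache(archive="pip-cache.tar.gz"):
        """Unpack a previously saved cache before the next incremental build."""
        PIP_CACHE.parent.mkdir(parents=True, exist_ok=True)
        with tarfile.open(archive, "r:gz") as tar:
            # Recreates ~/.cache/pip from the "pip" directory in the archive.
            tar.extractall(PIP_CACHE.parent)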
This then ties into Kenneth's pipfile idea he's working on, as it then makes sense to make a wagon/wheelhouse for a lock file.

To also tie into the container aspect: if you dev on Windows but deploy to Linux, this can allow for gathering your dependencies for Linux locally on your Windows box and then deploying the set as a unit to your server (something Steve Dower and I have thought about and why we support a lock file concept).

And if we use zip files with no nesting then, as long as it's only Python code, you could use zipimporter on the bundle directly.
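For example, a flat zip archive of pure-Python code can be imported with the standard zipimport machinery simply by putting it on sys.path (the bundle and package names here are hypothetical):

    import sys

    # A flat zip of pure-Python packages can go straight on sys.path;
    # the interpreter's built-in zipimport support handles the rest.
    sys.path.insert(0, "bundle.zip")   # hypothetical bundle name

    import somepackage                 # hypothetical package inside the bundle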
Well, creating on Windows and deploying on Linux will only be possible if the entire set of dependencies either has no C extensions or consists of manylinux1 wheels... but yeah, that's pretty much what we're doing right now with our reference implementation.

Regarding zipimporter, as far as I understand (correct me if I'm wrong) there's no such solution for wheels (i.e. you can't use zipimporter on a zip of wheels), so does that mean we'll have to package the Python files for all dependencies directly in the archive?

Our current implementation simply runs `pip wheel --wheel-dir /my/wheelhouse/path --find-links /my/wheelhouse/path`, packages the wheelhouse, adds metadata and applies a name to the file. On the destination machine, wagon simply extracts the wheels and runs `pip install --no-index --find-links /extracted/wheelhouse/path`.
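A minimal sketch of that install side, reusing the hypothetical archive layout and metadata.json from the earlier create-step sketch:

    import json
    import subprocess
    import tarfile
    from pathlib import Path

    def install_bundle(archive, target_dir="extracted"):
        """Extract a bundled wheelhouse and install from it without
        contacting an index server."""
        Path(target_dir).mkdir(exist_ok=True)
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(target_dir)
        wheelhouse = Path(target_dir) / "wheelhouse"          # assumed layout
        metadata = json.loads((wheelhouse / "metadata.json").read_text())
        # Resolve everything from the extracted wheels, never from an index.
        subprocess.check_call(
            ["pip", "install", metadata["package"],
             "--no-index", "--find-links", str(wheelhouse)])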
On 24 November 2016 at 16:45, Nir Cohen <nir36g@gmail.com> wrote:
Well, creating on Windows and deploying on Linux will only be possible if the entire set of dependencies either has no C extensions or consists of manylinux1 wheels... but yeah, that's pretty much what we're doing right now with our reference implementation.

Regarding zipimporter, as far as I understand (correct me if I'm wrong) there's no such solution for wheels (i.e. you can't use zipimporter on a zip of wheels), so does that mean we'll have to package the Python files for all dependencies directly in the archive?
Right, there would be a couple of significant barriers to doing this in the general case:

- firstly, wheels themselves are officially only a transport format, with direct imports being a matter of "we're not going to do anything to deliberately break the cases that work, but you're also on your own if anything goes wrong for any given use case": https://www.python.org/dev/peps/pep-0427/#is-it-possible-to-import-python-co...
- secondly, I don't think zipimporter handles archives-within-archives - it handles directories within archives, so it would require that the individual wheels be unpacked and the whole structure archived as one big directory tree

Overall, it sounds to me more like the "archive an entire installed virtual environment" use case than it does the "transfer a collection of pre-built artifacts from point A to point B" use case (which, to be fair, is an interesting use case in its own right, it's just a slightly different problem).

Cheers, Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
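As an illustration of the second point above, a rough sketch of unpacking a set of wheels into one big directory tree and re-zipping that tree so the result can go on sys.path; this only makes sense for pure-Python wheels, and the paths and names are hypothetical:

    import shutil
    import zipfile
    from pathlib import Path

    def flatten_wheels(wheel_dir, bundle_name="bundle"):
        """Unpack every wheel in wheel_dir into a single directory tree,
        then zip that tree so it can be put on sys.path directly.
        Only reasonable for pure-Python wheels."""
        tree = Path("flattened")
        tree.mkdir(exist_ok=True)
        for wheel in Path(wheel_dir).glob("*.whl"):
            # A wheel is itself a zip file; extract its contents into the tree.
            with zipfile.ZipFile(wheel) as whl:
                whl.extractall(tree)
        # shutil.make_archive appends ".zip" to bundle_name and returns the path.
        return shutil.make_archive(bundle_name, "zip", root_dir=tree)

    # Usage (hypothetical paths):
    #   bundle = flatten_wheels("wheelhouse")
    #   import sys; sys.path.insert(0, bundle)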
Should we maybe provide that as an abstraction in wagon, so that it allows for easy importing?
On Nov 23, 2016 11:33 AM, "Brett Cannon" <brett@python.org> wrote:
And if we use zip files with no nesting then, as long as it's only Python code, you could use zipimporter on the bundle directly.

The "only Python code" restriction pretty much rules this out as anything like a general solution, though... If people are investigating this, pex should also be considered as a source of prior art / inspiration: https://pex.readthedocs.io/en/stable/whatispex.html

-n
participants (4)

- Brett Cannon
- Nathaniel Smith
- Nick Coghlan
- Nir Cohen