Beyond wheels 1.0: helping downstream, FHS and more
Hi there,
During pycon, Nick mentioned there was interest in updating the wheel format to support downstream distributions. Nick mentioned Linux distributions, but I would like to express interest for other kinds of downstream distributors, like Anaconda from Continuum or Canopy from Enthought (disclaimer: I work for Enthought).
Right now, wheels have the following limitations for us:
1. lack of post/pre install/remove scripts
2. lack of a more fine-grained installation scheme
3. lack of clarity on which tags vendors should use for custom wheels: some packages we provide would not be installable on a "normal" python, and it would be nice to have a scheme to avoid confusion there as well.
At least 1. and 2. are of interest not just for us.
Regarding 2., it looks like anything in the .data/data directory will be placed as-is in sys.prefix by pip. This is how the distutils scheme is defined at the moment, but I am not sure whether that's by design or by accident. I would suggest using something close to autotools, with some tweaks to work well on windows. I implemented something like this in my project bento (https://github.com/cournape/Bento/blob/master/bento/core/platforms/sysconfig...), but we could of course tweak that.

For 1., I believe it was a conscious decision not to include them in wheel 1.0? Would it make sense to start a discussion to add it to wheel?

I will be at the pycon sprints until wednesday evening, so that we can flesh out a concrete proposal first, if there is enough interest.

As background: at Enthought, we have been using eggs to distribute binaries of python packages and other packages (e.g. C libraries, compiled binaries, etc.) for a very long time. We had our own extensions to the egg format to support this, but I want to get out of eggs so as to make our own software more compatible with where the community is going. I would also like to avoid making ad-hoc extensions to wheels for our own purposes.
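(For a concrete picture of the finer-grained scheme being proposed here, the following is a rough sketch in the spirit of bento's platform sysconfig, not its actual implementation; the directory keys follow autotools naming, and the Windows values are illustrative guesses.)

    import sys

    # Autotools-style install scheme, resolved relative to sys.prefix.
    unix_scheme = {
        "prefix":     sys.prefix,
        "bindir":     "$prefix/bin",
        "libdir":     "$prefix/lib",
        "includedir": "$prefix/include",
        "datadir":    "$prefix/share",
        "mandir":     "$prefix/share/man",
        "sysconfdir": "$prefix/etc",
    }

    # A hypothetical "sensible" Windows mapping of the same keys.
    windows_scheme = {
        "prefix":     sys.prefix,
        "bindir":     "$prefix/Scripts",
        "libdir":     "$prefix/Lib",
        "includedir": "$prefix/Include",
        "datadir":    "$prefix/Share",
        "mandir":     "$prefix/Share/man",
        "sysconfdir": "$prefix/etc",
    }

    def expand(scheme, name):
        # Resolve the $prefix reference for one directory in the scheme.
        return scheme[name].replace("$prefix", scheme["prefix"])

    print(expand(unix_scheme, "includedir"))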
On Apr 13, 2015, at 10:39 AM, David Cournapeau wrote: [...]
To my knowledge, (1) was purposely punted until a later revision of Wheel just to make it easier to land the “basic” wheel. I think (2) is a reasonable thing as long as we can map it sanely on all platforms. I’m not sure what (3) means exactly. What is a “normal” Python, do you modify Python in a way that breaks the ABI but which isn’t reflected in the standard ABI tag?

--- Donald Stufft
On Mon, Apr 13, 2015 at 10:44 AM, Donald Stufft wrote:
To my knowledge, (1) was purposely punted until a later revision of Wheel just to make it easier to land the “basic” wheel.
Great. Was there any proposal made to support it at all? Or should I just work from scratch there?
I think (2) is a reasonable thing as long as we can map it sanely on all platforms.
Yes. We support all platforms at Enthought, and Windows is important for us!
I’m not sure what (3) means exactly. What is a “normal” Python, do you modify Python in a way that breaks the ABI but which isn’t reflected in the standard ABI tag?
It could be multiple things. The most obvious one is that, generally, cross-platform python distributions will try to be "relocatable" (i.e. the whole installation can be moved and still work). This means they require python itself to be built in a special way. Strictly speaking, it is not an ABI issue, but the result is the same: you can't use libraries from anaconda or canopy on top of a normal python.

More generally, we could be modifying python in a way that is not forward compatible with upstream python: a binary that works on our python may not work on the python from python.org (though the opposite is true). It would be nice if one could make sure pip will not try to install those packages when installed on top of a python that does not advertise itself as "compatible".

David
On Mon, Apr 13, 2015 at 10:54 AM, David Cournapeau wrote: [...]
We need a hook to alter pip's list of compatible tags (pip.pep425tags), and to alter the default tags used by bdist_wheel when creating wheels. One sensible proposal for "special" wheels is to just use a truncated hash of the platform description (an opaque hex string) in place of the wheel platform tag.
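(A minimal sketch of the tag-hash idea above; the platform description string and the wheel filename are made up for illustration.)

    import hashlib

    # Describe the vendor platform: distribution name/version, OS, arch,
    # plus anything (like "relocatable") that affects binary compatibility.
    description = "canopy-1.5 osx-10.9 x86_64 relocatable"  # illustrative

    # Truncate a strong hash to get a short, opaque platform tag.
    platform_tag = hashlib.sha256(description.encode("utf-8")).hexdigest()[:8]

    # A wheel built for this distribution then carries the custom tag, so a
    # stock python (with stock pip tags) will never consider it compatible:
    print("numpy-1.9.2-cp34-cp34m-{}.whl".format(platform_tag))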
NOTE: I don't work for any of the companies involved -- just a somewhat frustrated user... And someone that has been trying for years to make things easier for OS-X users.

I’m not sure what (3) means exactly. What is a “normal” Python, do you modify Python in a way that breaks the ABI but which isn’t reflected in the standard ABI tag?
It could be multiple things. The most obvious one is that, generally, cross-platform python distributions will try to be "relocatable" (i.e. the whole installation can be moved and still work). This means they require python itself to be built in a special way. Strictly speaking, it is not an ABI issue, but the result is the same: you can't use libraries from anaconda or canopy on top of a normal python.
But why not? -- at least for Anaconda, it's because those libraries likely have non-python dependencies, which are expected to be installed in a particular way. And really, this is not particular to Anaconda/Canopy at all. Python itself has no answer for this issue, and eggs and wheels don't help. Well, maybe kinda sorta they do, but in a clunky/ugly way: in order to build a binary wheel with non-python dependencies (let's say something like libjpeg, for instance), you need to either:

- assume that libjpeg is installed in a "standard" place -- really no solution at all (at least outside of linux)
- statically link it
- ship the dynamic lib with the package

For the most part, the accepted solution for OS-X has been to statically link, but:

- it's a pain to do. The gnu toolchain really likes to use dynamic linking, and building a static lib that will run on a maybe-older-than-the-build-system machine is pretty tricky.
- now we end up with multiple copies of the same lib in the python install. There are a handful of libs that are used a LOT. Maybe there is no real downside -- disk space and memory are cheap these days, but it sure feels ugly. And I have yet to feel comfortable with having multiple versions of the same lib linked into one python instance -- I can't say I've seen a problem, but it makes me nervous.

On Windows, the choices are the same, except that it is so much harder to build many of the "standard" open source libs that package authors are more likely to do it for folks, and you do get the occasional "dll hell" issues.

I had a plan to make some binary wheels for OS-X that were not really python packages, but actually just bundled-up libs, so that other wheels could depend on them. OS-X does allow linking to relative paths, so this should have been doable, but I never got anyone else to agree this was a good idea, and I never found the round tuits anyway. And it doesn't really fit into the PyPI, pip, wheel, etc. philosophy to have dependencies that are platform-dependent and, even worse, build-dependent.

Meanwhile, conda was chugging along and getting a lot of momentum in the scientific community. And the core thing here is that conda was designed from the ground up to support essentially anything. This means it supports python packages that depend on non-python packages, but also supports packages that have nothing to do with python (Perl, command line tools, what have you...). So I have been focusing on conda lately.

Which brings me back to the question: should the python tools (i.e. wheel) be extended to support more use-cases, specifically non-python dependencies? Or do we just figure that that's a problem better solved by projects with a larger scope (i.e. rpm, deb, conda, canopy)?

I'm on the fence here. I mostly care about Python, and I think we're pretty darn close with allowing wheel to support the non-python dependencies, which would allow us all to "simply pip install" pretty much anything -- that would be cool. But maybe it's a bit of a slippery slope, and if we go there, we'll end up re-writing conda.

BTW, while you can't generally install a conda package in/for another python, you can generally install a wheel in a conda python... There are a few issues with pip/setuptools trying to resolve dependencies while not knowing about conda packages, but it does mostly work.

Not sure that helped the discussion -- but I've been wrestling with this for a while, so thought I'd get my thoughts out there.

-Chris
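(Chris's relative-path idea is workable on OS-X via @loader_path references. Here is a minimal sketch, assuming the Xcode command line tools (otool, install_name_tool) are available, of rewriting an extension module's libjpeg reference to point at a copy bundled next to the module. The file names are hypothetical.)

    import subprocess

    ext_module = "myext.so"                            # hypothetical extension module
    old_ref = "/usr/local/lib/libjpeg.9.dylib"         # absolute path baked in at build time
    new_ref = "@loader_path/.dylibs/libjpeg.9.dylib"   # relative to the module itself

    # Point the module at the bundled copy instead of the system location.
    subprocess.check_call(["install_name_tool", "-change", old_ref, new_ref, ext_module])

    # Verify: otool -L lists the (now relative) shared-library references.
    print(subprocess.check_output(["otool", "-L", ext_module]).decode())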
On Mon, Apr 13, 2015 at 12:56 PM, Chris Barker wrote: [...]
I've always thought of wheel as solving only the Python-specific problem: providing relocatable Python-specific packaging without trying to solve the intractable problem of non-Python dependencies. The strategy works best the more you are targeting "Python" as your platform and not a specific OS or distribution - sometimes it works well, other times not at all. Obviously if you need a specific build of PostgreSQL, wheel isn't going to help you. With enough hacks you could make it work, but are we ready to "pip install kde"? I don't think so.

Personally I'm happy to let other tools solve the problem of C-level virtualenv. It's been suggested you could have a whole section in your Python package that said "by the way, RedHat package x, or Debian package y, or Gentoo package z", or use a separate package-equivalency mapping as a level of indirection. I don't think this would be very good either. Instead, if you are doing system-level stuff, you should just use a system-level or user-level packaging tool that can easily re-package Python packages, such as conda, rpm, deb, etc.
On Mon, Apr 13, 2015 at 12:56 PM, Chris Barker wrote: [...]
So I have been focusing on conda lately.
The whole reason I started this discussion is to make sure wheel has a standard way to do what is needed for those use cases. conda, rpm, deb, or eggs as used in enthought are all essentially the same: an archive with a bunch of metadata. The real issue is standardising on the exact formats. As you noticed, there is not much missing in the wheel *spec* to get most of what's needed. We've used eggs for that purpose for almost 10 years at Enthought, and we did not need that many extensions on top of the egg format after all.
Which brings me back to the question: should the python tools (i.e. wheel) be extended to support more use-cases, specifically non-python dependencies? Or do we just figure that that's a problem better solved by projects with a larger scope (i.e. rpm, deb, conda, canopy).
IMO, given that wheels do most of what's needed, it is worth supporting the most common simple use cases (compiled libraries required by well-known extensions). Right now, such packages (pyzmq, numpy, cryptography, lxml) resort to quite horrible custom hacks to support those cases.

Hope that clarifies the intent,

David
On Mon, Apr 13, 2015 at 3:46 PM, David Cournapeau wrote: [...]
Then it sounds like I should read about the Enthought egg extensions. Is it something other than just defining a separate pypi name for "just the libxml.so without the python bits"?
I would advise against using or even reading about our egg extensions, as the implementation is full of legacy (we've been doing this for many years :)).

This is what we use on top of the setuptools egg:

- ability to add dependencies which are not python packages (I think most of it is already handled in metadata 2.0/PEP 426, but I would have to re-read the PEP carefully).
- ability to run post/pre install/remove scripts
- support for all of the autotools directories, with "sensible" mapping on windows
- a few extensions to the actual binary format (adding support for symlinks is the only one I can think of ATM).

Everything else is legacy you really don't want to know (see here if you still want to: http://enstaller.readthedocs.org/en/master/reference/egg_format.html)

David
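(On the symlink extension: zip archives, the container for both eggs and wheels, have no first-class symlink entry. A common trick, sketched below, is to store the link target as the member's contents and mark the entry as a symlink in the Unix mode bits. This is an illustration of the technique, not Enthought's actual implementation.)

    import os
    import stat
    import zipfile

    def write_symlink(zf, arcname, target):
        # Create an entry whose contents are the link target, flagged as a
        # symlink via the Unix mode bits in the upper 16 bits of external_attr.
        info = zipfile.ZipInfo(arcname)
        info.create_system = 3                              # Unix-made entry
        info.external_attr = (stat.S_IFLNK | 0o777) << 16   # symlink mode bits
        zf.writestr(info, target)

    def extract_symlink(zf, info, dest_dir):
        # Recreate the symlink on disk from the stored target path.
        target = zf.read(info.filename).decode()
        os.symlink(target, os.path.join(dest_dir, info.filename))

    with zipfile.ZipFile("demo.zip", "w") as zf:
        write_symlink(zf, "lib/libfoo.so", "libfoo.so.1.2.3")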
Seems like you could extend wheel to do that easily.
On Apr 13, 2015 4:19 PM, "David Cournapeau" wrote: [...]
On Mon, Apr 13, 2015 at 1:19 PM, David Cournapeau wrote:

This is what we use on top of the setuptools egg:

- ability to add dependencies which are not python packages
- ability to run post/pre install/remove scripts
- support for all of the autotools directories, with "sensible" mapping on windows
Are these inside or outside the python installation? I'm more than a bit wary of a wheel that would install stuff outside of the "sandbox" of the python install.

The whole reason I started this discussion is to make sure wheel has a standard way to do what is needed for those use cases. conda, rpm, deb, or eggs as used in enthought are all essentially the same: an archive with a bunch of metadata. The real issue is standardising on the exact formats. As you noticed, there is not much missing in the wheel *spec* to get most of what's needed.

hmm -- true. I guess where it seems to get more complicated is beyond the wheel (or conda, or...) package itself, to the dependency management, installation tools, etc. But perhaps you are suggesting that we can extend wheel to support a bit more stuff, and leave the rest of the system as a separate problem? i.e. Canopy can have its own find, install, manage-dependency tool, but that it can use the wheel format for the packages themselves? I don't see why not...

-Chris
On Mon, Apr 13, 2015 at 5:25 PM, Chris Barker wrote:
Are these inside or outside the python installation? I'm more than a bit wary of a wheel that would install stuff outside of the "sandbox" of the python install.
I would always install things relative to sys.prefix, for exactly the reasons you mention.
But perhaps you are suggesting that we can extend wheel to support a bit more stuff, and leave the rest of the system as a separate problem? i.e. Canopy can have its own find, install, manage-dependency tool, but that it can use the wheel format for the packages themselves?
Exactly!

David
On Apr 13, 2015, at 10:17 PM, Robert Collins wrote:

One of the earlier things mentioned here - {pre,post}{install,remove} scripts - raises a red flag for me.
In Debian at least, the underlying system has the ability to run such turing complete scripts, and they are a rich source of bugs - both correctness and performance related.
Nowadays nearly all such scripts are machine generated from higher level representations such as 'this should be the default command' or 'configure X if Y is installed', but because the plumbing is turing complete, they all need to be executed, which slows down install/upgrade paths, and any improvement to the tooling requires a version bump on *all* the packages using it - because effectively the package is itself a compiled artifact.
I'd really prefer it if we keep wheels 100% declarative, and instead focus on defining appropriate metadata for the things you need to accomplish {pre,post}{install,remove} of a package.
A possible way to implement {pre,post}{install,remove} scripts is to instead turn them into extensions. One example is that Twisted uses a setup.py hack to regenerate a cache file of all of the registered plugins. This needs to happen at install time due to permission issues, and currently you can’t get this speedup when installing something that uses twisted plugins via Wheel.

So a possible way for this to work is, in a PEP 426 world, to simply define a twisted.plugins extension that says, in a declarative way, “hey, when you install this Wheel, if there’s a plugin that understands this extension installed, let it do something before you actually move the files into place”. This lets Wheels themselves still be declarative and moves the responsibility of implementing these bits into their own PyPI projects that can be versioned and independently upgraded and such.

We’d probably need some method of marking an extension as “critical” (e.g. bail out and don’t install this Wheel if you don’t have something that knows how to handle it); non-critical extensions just get ignored if we don’t know how to handle them. Popular extensions could possibly be added directly to pip at some point if a lot of people are using them (or even moved from third party extension to officially supported extension).

--- Donald Stufft
On 13 April 2015 at 22:29, Donald Stufft wrote: [...]
Right, this is the intent of the "Required extension handling" feature: https://www.python.org/dev/peps/pep-0426/#required-extension-handling

If a package flags an extension as "installer_must_handle", then attempts to install that package are supposed to fail if the installer doesn't recognise the extension. Otherwise, installers are free to ignore extensions they don't understand.

So meta-installers like canopy could add their own extensions to their generated wheel files, flag those extensions as required, and other installers would correctly reject those wheels as unsupported.

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
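(To make this concrete: below is a sketch of what such metadata could look like, rendered as a Python dict since PEP 426 serializes metadata as JSON. Only the "extensions" mapping and the "installer_must_handle" flag come from the PEP; the extension names and their payload keys are hypothetical.)

    # Hypothetical PEP 426-style metadata fragment for a package using two
    # extensions: one required, one optional. Payload keys are illustrative.
    metadata_fragment = {
        "extensions": {
            "twisted.plugins": {
                "installer_must_handle": True,   # abort install if no handler
                "plugin_packages": ["mypackage.plugins"],  # hypothetical key
            },
            "vendor.start_menu": {
                "installer_must_handle": False,  # installers may skip this one
                "shortcuts": [{"name": "MyApp", "target": "myapp"}],  # hypothetical
            },
        },
    }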
Could an "extension" be -- "run this arbitrary Python script"? We've got a full featured scripting language (with batteries included!) -- isn't that all the extension you need? Or is this about security? We don't want to let a package do virtually anything on install?

-CHB
On 14 April 2015 at 17:10, Chris Barker wrote:
Could an "extension" be -- "run this arbitrary Python script"?
The main point (as I see it) of an "extension" is that it's distributed independently of the packages that use it. So you get to decide to use an extension (and by inference audit it if you want) *before* it gets run as part of an installation. Extensions get peer review by the community, and bad ones get weeded out, in a way that just having a chunk of code in your setup.py or the postinstall section of your wheel doesn't.
We've got a full featured scripting language (with batteries included!) -- isn't that all the extension you need?
Up to a point yes. It's the independent review and quality control aspects that matter to me.
Or is this about security? We don't want to let a package do virtually anything on install?
Security is one aspect, and one that a lot of people will pick up on immediately. But there's also portability. And code quality. And consistency. I'd be much happier installing a project that used a well-known "start menu manager extension" than one that just used custom code. I'd be willing to assume that the author of the extension had thought about Unix/Windows compatibility, how to handle use in a virtualenv, handling user preferences (such as the end user *not wanting* shortcuts), etc. etc. And I could look at the extension project's issue tracker to see how happy I was with the state of the project.

Of course, if the project I want to install makes using the extension mandatory for the install to work, I still don't have a real choice - I accept the extension or I can't use the code I want - but there's an extra level of transparency involved. And hopefully most extensions will be optional, in practice.

Paul
On Tue, Apr 14, 2015 at 9:46 AM, Paul Moore wrote:

The main point (as I see it) of an "extension" is that it's distributed independently of the packages that use it. So you get to decide to use an extension (and by inference audit it if you want) *before* it gets run as part of an installation.

OK, I think this is getting clearer to me now -- an extension is an (I suppose arbitrary) block of python code, but what goes into the wheel is not the code, but rather a declarative configuration for the extension. Then at install time, the actual code that runs is separate from the wheel, which gives the end user greater control, plus these nifty features: extensions get peer review by the community, and bad ones get weeded out; the independent review and quality control; portability, code quality, and consistency. And I'll add that this would promote code re-use and DRY.

I'd be much happier installing a project that used a well-known "start menu manager extension"

So where would that code live? And how would it be managed? I'm thinking:

- in a package on PyPI like anything else
- a specification in install_requires
- pip auto-installs it (if not already there) when the user goes to install the wheel.

Is that the idea?

Of course, if the project I want to install makes using the extension mandatory for the install to work, I still don't have a real choice - I accept the extension or I can't use the code I want -

well, you can't easily auto-install it anyway -- you could still do a source install, presumably.

but there's an extra level of transparency involved. And hopefully most extensions will be optional, in practice.

There's a bit to think about in the API/UI here. If an installation extension is used by a package, and it's specified in install_requires, then it's going to get auto-magically installed and used with a regular old "pip install". If we are worried about code review and users being in control of what extensions they use, then how do we make it obvious that a given extension is in use, but optional, and how do we turn it off if we want?

-CHB
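(Chris's "where would that code live" question suggests the familiar setuptools entry-point pattern. Here is a minimal sketch, with a hypothetical group name, of how an installer could look up an installed handler for a named extension before deciding whether to proceed.)

    import pkg_resources

    def find_handler(extension_name):
        # Scan installed projects for a handler registered under a
        # (hypothetical) "wheel.install_extensions" entry-point group.
        for ep in pkg_resources.iter_entry_points("wheel.install_extensions"):
            if ep.name == extension_name:
                return ep.load()   # the handler callable provided by the plugin
        return None

    handler = find_handler("twisted.plugins")
    if handler is None:
        print("no handler installed for this extension; skip or fail per policy")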
On 14 April 2015 at 22:02, Chris Barker wrote:
- pip auto-installs it (if not already there) when the user goes to install the wheel.
Personally, I'm not a fan of auto-installing, so I'd hope for something more like: pip would fail to install if a required extension were missing. The user would then install the extension and redo the install. But that may be a minority opinion - it's a bit like setup_requires in principle, and people seem to prefer that to be auto-installed.

Paul
On Tue, Apr 14, 2015 at 4:19 PM, Paul Moore wrote: [...]
(lurker surfaces)

I'm with Paul on this one. It seems to me that auto-installing the extension would destroy most of the advantages of distributing the extensions separately. I _might_ not hate it if pip prompted the user and _then_ installed, but then again, I might.

(lurker sinks back into the depths)

-- Kevin Horn
On Tue, Apr 14, 2015 at 8:57 PM, Kevin Horn wrote:
I'm with Paul on this one. It seems to me that auto-installing the extension would destroy most of the advantages of distributing the extensions separately.
Exactly -- I actually tossed that one out there because I wanted to know what folks were thinking, but also as a bit of bait ;-) -- we've got a conflict here:

1) These possible extensions are potentially dangerous, etc., and should be well reviewed and not just tossed in there.

2) People (and I'm one of them) really, really want "pip install" to "just work" (or conda install, or enpkg, or...). If it's going to "just work", then it needs to find and install the extensions auto-magically, and then we're really not very far from running arbitrary code...

Would that be that different from the fact that installing a given package automatically installs all sorts of other packages -- and most of us don't give those a good review before running install? (I was just showing off Sphinx to a class last night -- quite amazing what gets brought in with a pip install of sphinx, including pytz -- I have no idea why.) But at the end of the day, I don't care much either. I'm trusting that the Sphinx folks aren't doing something ridiculous or dangerous.

Which brings us back to the "review of extensions" thing -- I think it's less about the end user checking it out and making a decision about it, and more about the package builder doing that. I have a package I want to be easy to install on Windows -- so I go look for an extension that does the Start Menu, etc. Indeed, that kind of thing "should" be part of pip and/or wheel, but it would probably be more successful if it were done as third party extensions -- perhaps over the years, the ones that rise to the top of usefulness can become standards.

-Chris
On 15 April 2015 at 21:40, Chris Barker
Which brings us back to the "review of extensions" thing -- I think it's less about the end user checking it out and making a decision about it, but about the package builder doing that. I have a package I want to be easy to install on Windows -- so I go look for an extension that does the Start Menu, etc. Indeed, that kind of thing "should" be part of pip and/or wheel, but it would probably be more successful if it were done as third party extensions -- perhaps over the years, the ones that rise to the top of usefulness can become standards.
In the PEP, there's a concept of "optional" vs "required" extensions. See https://www.python.org/dev/peps/pep-0426/#required-extension-handling. This is crucial - I've no problem if a particular extension is used by a project, as long as it's optional. I won't install it, so it's fine. It seems to me that pip *has* to ignore missing optional extensions, for this reason. Of course, that introduces the converse problem, which is how would people who might want that extension to be activated know that a project used it?

Critical extensions, on the other hand, are precisely that - the install won't run without them. I'd hope that critical extensions will only be used for things where the installation will be useless without them. But I worry that some people may have a more liberal definition of "required" than I do.

To be honest, I can't think of *anything* that I'd consider a "required" extension. Console script wrappers aren't essential, for example (you can use "python -m pip" even if pip.exe isn't present). More generally, none of the extensions in PEP 459 seem essential, in this sense. Start menu entry writers wouldn't be essential, nor would COM registration extensions necessarily be (most of pywin32's functionality works fine if the COM components aren't registered). Beyond that I'm struggling to think of things that might be extensions.

So, as long as the "optional" vs "required" distinction is respected, people are conservative about deeming something as "essential", and a missing optional extension doesn't stop an install, then I don't see extensions as being a big issue. Based on the above, it's possibly valid to allow "required" extensions to be auto-installed. It *is* a vector for unexpected code execution, but maybe that's OK.

Paul

PS The idea of "Start Menu entries" has come up a lot here. To be clear, I *don't* actually think such a thing is a good idea (far from it - I think it's a pretty lousy idea) but it is a good example of something that people think they ought to do, but in practice users have widely differing views on what they prefer or will use, and a developer with limited experience could easily create a dreadful user experience without meaning to ("developer" here could either mean the extension developer, or the package developer using the extension - both have opportunities to make a horrible mess...) So it's a good straw man for "an extension that some people will love and others will hate" :-)

PPS I'm inclined to think that the PEP should say "Installation tools MUST NOT fail if installer_must_handle is set to false for an extension that the tool cannot process. Installation tools SHOULD NOT attempt to install plugins or similar optional functionality to handle an extension with installer_must_handle set to false, except with explicit approval from the end user."
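For concreteness, here is a minimal sketch of how that optional/required split looks in PEP 426-style metadata; "exampleco.startmenu" is a made-up extension name, and the only field shown is the installer_must_handle flag discussed above:

    "extensions": {
        "exampleco.startmenu": {
            "installer_must_handle": false
        },
        "python.commands": {
            "installer_must_handle": true
        }
    }

An installer that doesn't understand "exampleco.startmenu" can simply skip it; one that can't handle "python.commands" would have to refuse the install.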
On the Start Menu suggestion, I think that's a horrible idea. Pip is not the system package manager and it shouldn't be changing the system. Unversioned script launchers are in the same category, but aren't quite as offensive.
I know it's only a hypothetical, but I'd much rather it didn't get repeated so often that it actually happens. There are better tools for making app installers, as opposed to package installers.
Cheers,
Steve
Top-posted from my Windows Phone
On 16 April 2015 at 00:48, Steve Dower
On the Start Menu suggestion, I think that's a horrible idea. Pip is not the system package manager and it shouldn't be changing the system. Unversioned script launchers are in the same category, but aren't quite as offensive.
I know it's only a hypothetical, but I'd much rather it didn't get repeated so often that it actually happens. There are better tools for making app installers, as opposed to package installers.
Sorry - I agree it's an awful idea. Older wininst installers such as the pywin32 one (and I think the PyQt one) do this; I wanted to use it as an example of abuse of postinstall scripts that should *not* be perpetuated in any new scheme. Just to expand on another point in my mail - I'd like *anyone* to provide an example of a genuine use case for something they think should be a "required" installer extension. I'm not sure such a thing actually exists... Paul
On 16/04/2015 08:08, Paul Moore wrote:
On 16 April 2015 at 00:48, Steve Dower
wrote: On the Start Menu suggestion, I think that's a horrible idea. Pip is not the system package manager and it shouldn't be changing the system. Unversioned script launchers are in the same category, but aren't quite as offensive.
I know it's only a hypothetical, but I'd much rather it didn't get repeated so often that it actually happens. There are better tools for making app installers, as opposed to package installers.
Sorry - I agree it's an awful idea. Older wininst installers such as the pywin32 one (and I think the PyQt one) do this; I wanted to use it as an example of abuse of postinstall scripts that should *not* be perpetuated in any new scheme.
FWIW I've just had a to-and-fro by email with Mark Hammond. I gather that he's now given Glyph access to the PyPI & hg setup for pywin32. He's also happy to consider changes to the setup process to support wheel/virtualenv/postinstall improvements. There's been a side discussion on the pywin32 list about which versions of Python pywin32 should continue to support going forward, which obviously interacts with the idea of making it wheel/virtualenv-friendly. I'm not sure what Glyph's plan is at this point -- doubtless he can speak for himself. I gather from Paul's comments earlier that he's not a particular fan of pywin32. If the thing seems to have legs, I'm happy to coordinate changes to the setup. (I am, technically, a pywin32 committer although I've never made use of that fact). The particular issue I'm not sure about is: how does Paul see pywin32's postinstall steps working when they *are* needed, ie when someone wants to install pywin32 as a wheel and wants the COM registration to happen? Or is that a question of: run these steps manually once pip's completed? TJG
On 16 April 2015 at 08:30, Tim Golden
Sorry - I agree it's an awful idea. Older wininst installers such as the pywin32 one (and I think the PyQt one) do this; I wanted to use it as an example of abuse of postinstall scripts that should *not* be perpetuated in any new scheme.
FWIW I've just had a to-and-fro by email with Mark Hammond. I gather that he's now given Glyph access to the PyPI & hg setup for pywin32.
He's also happy to consider changes to the setup process to support wheel/virtualenv/postinstall improvements. There's been a side discussion on the pywin32 list about which versions of Python pywin32 should continue to support going forward, which obviously interacts with the idea of making it wheel/virtualenv-friendly.
Thanks for involving Mark in this. While pywin32 isn't the only project with a postinstall script, it's one of the most complex that I know of, and a good example to work from when looking at what projects need.
I'm not sure what Glyph's plan is at this point -- doubtless he can speak for himself. I gather from Paul's comments earlier that he's not a particular fan of pywin32. If the thing seems to have legs, I'm happy to coordinate changes to the setup. (I am, technically, a pywin32 committer although I've never made use of that fact).
To be clear, I don't have that much of a problem with pywin32. I don't use it myself, these days, but that's because (a) it's a very big, monolithic dependency, and (b) it's not usable directly with pip. The problem I have with it is that a lot of projects use it for simple access to the Win32 API (uses which can easily be handled by ctypes, possibly with slightly more messy code) and that means that they inherit the pywin32 problems. So I advise against pywin32 because of that, *not* because I think it's a problem itself, when used for situations where there isn't a better alternative.
The particular issue I'm not sure about is: how does Paul see pywin32's postinstall steps working when they *are* needed, ie when someone wants to install pywin32 as a wheel and wants the COM registration to happen? Or is that a question of: run these steps manually once pip's completed?
To be honest, for the cases I encounter frequently, these requirements don't come up. So my experience here goes back to the days when I used pywin32 to write COM servers and services, which was quite a while ago.
From what I recall, pywin32 has the following steps in its postinstall:
1. Create start menu entries. My view is that this should simply be dropped. Python packages should never be adding start menu entries. Steve Dower has confirmed he agrees with this view earlier on this thread.

2. Move the pywin32 DLLs to the system directory. I don't see any way this is compatible with per-user or virtualenv installs, so I don't know how to address this, other than again dropping the step. I've no idea why this is necessary, or precisely which parts of pywin32 require it (I've a recollection from a long time ago that "services written in Python" was the explanation, but that's all I know). But presumably such use cases already break with a per-user Python install?

3. Register the ActiveX COM DLLs. I believe this is mostly obsolete technology these days (who still uses ActiveX Scripting in anything other than VBScript or maybe a bit of JScript?) I'd drop this and make it a step that the user has to do manually if they want it. In place of it, pywin32 could provide an entry point to register the DLLs ("python -m pywin32 --register-dlls" or something). Presumably users who need it would understand the implications, and how to avoid registering multiple environments or forgetting to unregister before dropping an environment, etc. That sort of pitfall isn't something Python should try to solve automatically via pre- and post-install scripts.

4. Register help files. I never understood how that worked or why it was needed. So again, I'd say just drop it.

Have I missed anything else? Paul
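As a rough sketch of what the entry point floated in (3) might look like -- entirely hypothetical, pywin32 ships no such module today, and the registration body is a placeholder:

    import argparse

    def register_dlls():
        # Placeholder: a real implementation would perform the COM/DLL
        # registration that the wininst postinstall script does today.
        raise NotImplementedError("hypothetical registration step")

    def main(argv=None):
        parser = argparse.ArgumentParser(prog="python -m pywin32")
        parser.add_argument("--register-dlls", action="store_true",
                            help="register the pywin32 COM DLLs for this "
                                 "environment (hypothetical)")
        args = parser.parse_args(argv)
        if args.register_dlls:
            register_dlls()

    if __name__ == "__main__":
        main()

The point of making it an explicit command is that the user, not the installer, decides when (and into which environment) the registration happens.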
[cc-ing Mark H as he indicated he was open to be kept in the loop; also changed the title to reflect the shift of conversation]
Really, pywin32 is several things: a set of libraries (win32api, win32file, etc.); some system-level support for various things (COM registration, Service support etc.); and a development/editing environment (pythonwin). I see this ending up as (respectively): a venv-friendly wheel; a py -m script of the kind Paul suggests; and an installable app with the usual start menu icons etc. In my copious spare time I'll at least try to visit the pywin32 codebase to see how viable all this is. Feel free to challenge my thoughts on the matter. TJG
Tim,
As a long time user, I think you're right on the money.
My only concern is how to manage the transition in user experience, as
moving to what you've described (which I totally approve of, if it's
feasible) will be a significant change, and may break user expectations.
I think maybe the best thing to do is to change the existing binary
installer package to:
- install the included wheel in the system python environment
- run the various post-install scripts (py -m)
- install pythonwin, along with start menu icons, etc.
Those that want to use pywin32 in a virtualenv (or just without all the
system changes) could simply install the wheel (or even an sdist, perhaps)
from the command line using pip, and then perform whatever other steps they
want manually.
This would allow those who are installing using the installer package
(which I assume is almost everybody, right?) to get a similar experience to
the current one, while letting those wanting more control (use in
virtualenvs, etc.) have that as well.
I think the changes described have the potential to be a big win.
-- Kevin Horn
This seems like a good time to remind everyone that "wheel convert"
can turn bdist_wininst .exe installers into wheels. Both formats are
designed to preserve all the distutils file categories. In the future
it would be nice if the bdist_wininst .exe wrapper used wheel instead
of its own format. Then a single file would both be the Windows
installer and a valid wheel (except for the extension).
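For anyone who hasn't tried it, the workflow is roughly as follows (the installer filename here is illustrative, and the exact name of the generated wheel may differ):

    pip install wheel
    wheel convert pywin32-219.win32-py2.7.exe
    pip install pywin32-219-cp27-none-win32.whl

The conversion just repacks the installer's payload into wheel layout; none of the postinstall steps are carried over.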
On 16 April 2015 at 14:43, Kevin Horn
Those that want to use pywin32 in a virtualenv (or just without all the system changes) could simply install the wheel (or even an sdist, perhaps) from the command line using pip, and then perform whatever other steps they want manually.
Just as a data point, converting the existing wininst installer for pywin32 to a wheel (using wheel convert), and installing that via pip, is entirely workable (for the win32api/win32file type basic functionality). The pypiwin32 project (started by Glyph as a way of providing pywin32 wheels from PyPI) includes wheels for Python 3.x which I built that way, so it's certainly seen some level of use. The wheels are probably unnecessarily big, as they'll include all of pythonwin, and the ActiveX Scripting and service creation support, which I guess won't work in that configuration, but they are a good starting point for checking precisely what will work unmodified from a wheel. Paul
On Thu, Apr 16, 2015 at 9:58 AM, Paul Moore
wrote: On 16 April 2015 at 14:43, Kevin Horn wrote: Those that want to use pywin32 in a virtualenv (or just without all the system changes) could simply install the wheel (or even an sdist, perhaps) from the command line using pip, and then perform whatever other steps they want manually.
Just as a data point, converting the existing wininst installer for pywin32 to a wheel (using wheel convert), and installing that via pip, is entirely workable (for the win32api/win32file type basic functionality). The pypiwin32 project (started by Glyph as a way of providing pywin32 wheels from PyPI) includes wheels for Python 3.x which I built that way, so it's certainly seen some level of use.
The wheels are probably unnecessarily big, as they'll include all of pythonwin, and the ActiveX Scripting and service creation support, which I guess won't work in that configuration, but they are a good starting point for checking precisely what will work unmodified from a wheel.
For people interested in a lightweight alternative to pywin32, we have the pywin32ctypes project, which started as a way to get access to win32 credentials without depending on a DLL (to avoid file locking issues with in-place updates). The project is on github (https://github.com/enthought/pywin32-ctypes), and is already used by a few projects. We support both cffi and ctypes backends (the latter to work out of the box on cpython, the former to work on pypy). David
On 16 April 2015 at 11:11, Tim Golden
Really, pywin32 is several things: a set of libraries (win32api, win32file, etc.); some system-level support for various things (COM registration, Service support etc.); and a development/editing environment (pythonwin).
That sounds about right.
I see this ending up as (respectively): as venv-friendly wheel; a py -m script of the kind Paul suggests; and an installable app with the usual start menu icons etc.
Again, yes, that seems reasonable. Personally, for the uses I see of pywin32, it would make sense to split it into a number of separate wheels (win32api, win32file, ...) to reduce the dependency footprint for projects that only use one or two functions out of the whole thing, but honestly ctypes is probably still a better approach for that scenario, so the benefit of such a split is likely minimal.
In my copious spare time I'll at least try to visit the pywin32 codebase to see how viable all this is. Feel free to challenge my thoughts on the matter.
I think you're going in the right direction. The hardest parts are likely to be where the Windows architecture interferes (COM registration and services). Paul.
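As an illustration of the direct-ctypes route being discussed -- a minimal, Windows-only sketch of a single API call, not taken from any of the projects mentioned:

    import ctypes
    from ctypes import wintypes

    advapi32 = ctypes.WinDLL("advapi32", use_last_error=True)
    advapi32.GetUserNameW.argtypes = [wintypes.LPWSTR,
                                      ctypes.POINTER(wintypes.DWORD)]
    advapi32.GetUserNameW.restype = wintypes.BOOL

    def get_user_name():
        # Rough equivalent of pywin32's win32api.GetUserName().
        size = wintypes.DWORD(0)
        advapi32.GetUserNameW(None, ctypes.byref(size))  # query buffer size
        buf = ctypes.create_unicode_buffer(size.value)
        if not advapi32.GetUserNameW(buf, ctypes.byref(size)):
            raise ctypes.WinError(ctypes.get_last_error())
        return buf.value

The code is messier than the one-line pywin32 call, but it carries no install-time baggage at all, which is the trade-off under discussion.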
As already mentioned in this thread, most of the postinstall stuff is needed only for a subset of users - mainly those who want to write COM objects or Windows Services (and also people who want the shortcuts etc). pywin32 itself should be close to "portable" - eg, "setup.py install" doesn't run the postinstall script but leaves a largely functioning pywin32 install. So I think it should be relatively easy to get pywin32 working in a virtualenv without running any of the post-install scripts, and I'd support any consolidation of the setup process to support this effort. Cheers, Mark
On 16 Apr 2015 03:08, "Paul Moore"
Just to expand on another point in my mail - I'd like *anyone* to provide an example of a genuine use case for something they think should be a "required" installer extension. I'm not sure such a thing actually exists...
The constraints extension in PEP 459 recommends flagging extension processing as required, otherwise it's possible for unaware installers to silently skip the compatibility checks: https://www.python.org/dev/peps/pep-0459/#the-python-constraints-extension Installers offering the ability to opt in to ignoring environmental constraints is one thing, ignoring them through lack of understanding the extension is something else entirely. Cheers, Nick.
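A sketch of what that could look like in metadata terms; the "environments" field follows a reading of the draft PEP 459, so treat it as illustrative rather than authoritative:

    "extensions": {
        "python.constraints": {
            "installer_must_handle": true,
            "environments": ["python_version >= '2.7'"]
        }
    }

With installer_must_handle set, an installer that doesn't know the extension must refuse the install rather than silently skip the check.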
On Wed, Apr 15, 2015 at 2:23 PM, Paul Moore
In the PEP, there's a concept of "optional" vs "required" extensions. See https://www.python.org/dev/peps/pep-0426/#required-extension-handling. This is crucial - I've no problem if a particular extension is used by a project, as long as it's optional. I won't install it, so it's fine. It seems to me that pip *has* to ignore missing optional extensions, for this reason. Of course, that introduces the converse problem, which is how would people who might want that extension to be activated, know that a project used it?
Exactly -- we do want "pip install" to just work...
But I worry that some people may have a more liberal definition of "required" than I do.
They probably do -- if they want things to "just work". We have the same problem with optional dependencies. For instance, for iPython to work, you don't need much, but if you want the ipython notebook to work, you need tornado, zeromq, who knows what else. But people want it to just work -- and just work by default, so you want all that optional stuff to go in by default. I expect this is the same with wheel installer extensions. To use your example, for instance: people want to do

pip install sphinx

and then have the sphinx-quickstart utility ready to go, by default. So scripts need to be installed by default. The trade-off between convenience and control/security is tough.
Based on the above, it's possibly valid to allow "required" extensions to be auto-installed. It *is* a vector for unexpected code execution, but maybe that's OK.
If even required extensions aren't auto-installed, then we can just toss out the whole idea of automatic dependency management (which I personally wouldn't mind, actually, but I'm weird that way). But maybe we need some "real" use cases to talk about -- I agree with others in this thread that the Start menu isn't a good example. -Chris
On 16 April 2015 at 17:58, Chris Barker
We have the same problem with optional dependencies.
For instance, for iPython to work, you don't need much. but if you want the ipython notebook to work, you need tornado, zeromq, who knows what else. But people want it to just work -- and just work be default, so you want all that optional stuff to go in by default.
But none of those are installed by default with ipython - they are covered by extras. If you want them, you ask for them. Thanks to extras, ipython offers a nice shortcut for you - pip install ipython[all] - but you still have to ask for them.
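For reference, the packaging side of extras is just a setuptools declaration -- a minimal sketch with invented project and dependency names:

    from setuptools import setup

    setup(
        name="mypkg",
        version="1.0",
        packages=["mypkg"],
        extras_require={
            # "pip install mypkg[notebook]" pulls in these extra dependencies
            "notebook": ["tornado", "pyzmq"],
        },
    )

Nothing in the extra is installed unless the user names it explicitly, which is exactly the opt-in behaviour being contrasted with auto-installed extensions.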
I expect this is the same with wheel installer extensions. To use your example, for instance: people want to do
pip install sphinx
and then have the sphinx-quickstart utility ready to go, by default. So scripts need to be installed by default.
Yes, the script wrapper extension is a much better example of a "nobody would ever want this off" extension. But is it really "required"? The distribution would work fine if scripts weren't installed.

My understanding of the required/optional distinction in the PEP is that an extension is required if the installation would be *broken* if that extension wasn't supported. And having to type "python -m something" rather than just "something" isn't broken, it's just an inconvenience. In practice, I'd assume script wrappers would be an extension built into pip, so it would always be available. But that's different from "required". Installing via "wheel install" (which doesn't support generating script wrappers) still works.

Actually, I just checked PEP 459, which says of the "python.commands" extension: "Build tools SHOULD mark this extension as requiring handling by installers". So I stand corrected - script wrappers should be considered mandatory. In practice, though, what that means is that pip will be fine (as it'll have support built in) and wheel will be a non-compliant installer (as it neither supports generating wrappers, nor does it give an error when asked to create them - maybe an error will get added, but I doubt it as wheel install isn't intended to be a full installer). I've no idea at this point what distil or easy_install will do, much less any other installers out there.
The trade-off between convenience and control/security is tough.
Certainly. But that's about what is available by default, and how the installer (pip) handles the user interface for "this package says that it gives you some extra functionality if you have extensions X, Y, and Z". There's no convenience or UI implication with required extensions - if they aren't available, the installer refuses to install the distribution. Simple as that.

Maybe pip could try to locate and download mandatory extensions before deciding whether to fail the install, but the package metadata doesn't say how to find such installer plugins (and it *can't* - because the plugin would be different for pip, easy_install, distil or wheel, so all it can say is "I need plugin foo" and the installer has to know the rest). That's an installation program quality of implementation issue though.

Given that a random project could add metadata

    extensions: { "foocorp.randomstuff": { "installer_must_handle": true, "foocorp.localdata": "Some random stuff" } }

there is no way that pip has any means of discovering where to get code to handle the "foocorp.randomstuff" extension from. So in the general case, auto-installing required extension support just isn't possible. At best, pip could have a registry of plugins that support standard extensions (i.e. those covered by a PEP) but I'd expect that we'd just build them into pip (as we don't have a plugin interface at the moment).
Based on the above, it's possibly valid to allow "required" extensions to be auto-installed. It *is* a vector for unexpected code execution, but maybe that's OK.
If even required extensions aren't auto installed, then we can just toss out the whole idea of automatic dependency management. (which I personally wouldn't mind, actually, but I'm weird that way)
I disagree. It's no different conceptually than the fact that if you don't have a C compiler, you can't install a package that contains C code and only comes in sdist format today. The UI in pip for noticing and saying "you need a C compiler" is terrible (you get a build error which might mention that you don't have the compiler, if you're lucky :-)). And yet people survive. So a clear error saying "package X needs a handler for extension Y to install" is a major improvement over the current situation. (I know C compilers are build-step and extensions are install-step, but right now the user experience doesn't clearly distinguish these, so the analogy holds.)

Whether users want pip to go one step further and auto-install the plugin is unknown at this point. So far, it seems that the only people who have expressed an opinion (you and I) aren't particularly pushing for auto-installing (unless I misinterpreted your "which I personally wouldn't mind" comment). For a proper view, we'd need a concrete example of a package with a specific required extension, that pip was unlikely to include out of the box. Or we could just not worry for now, and wait to see what feedback we got from a non-automatic initial implementation in real use.
But maybe we need some "real" use cases to talk about -- I agree with others in this thread that the Start menu isn't a good example.
+10000

To focus discussion, I think we need:
- A credible "required" extension (python.constraints or python.commands from PEP 459)
- A credible "required" extension that pip wouldn't provide builtin support for
- A credible "optional" extension (most of the other extensions in PEP 459, for example exports)
- A credible "optional" extension that pip wouldn't provide builtin support for

I've separated out things that pip wouldn't provide builtin support for, because those are the only ones where there's a real question about "what do we do if support isn't available", at least from a pip point of view. In practice, that probably means "not defined in an accepted PEP" (I'd expect pip to build in support for standardised extensions).

By the way, I just did a check through PEPs 426 and 459. Neither one currently defines a "postinstall script" metadata item or extension, which is interesting given that this discussion started from the question of how postinstall actions would be supported. There *have* been discussions in the past, and I could have sworn they ended up in a PEP somewhere, but maybe I was wrong.

Paul
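For concreteness on the python.commands case discussed above, a sketch based on a reading of draft PEP 459 (the project name and entry point are invented):

    "extensions": {
        "python.commands": {
            "installer_must_handle": true,
            "wrap_console": {"mytool": "mytool.cli:main"}
        }
    }

The installer is expected to generate a "mytool" wrapper that invokes the named callable; an installer without script-wrapper support would have to refuse the install.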
On 16 Apr 2015 14:34, "Paul Moore"
By the way. I just did a check through PEPs 426 and 459. Neither one currently defines a "postinstall script" metadata item or extension, which is interesting given that this discussion started from the question of how postinstall actions would be supported. There *have* been discussions in the past, and I could have sworn they ended up in a PEP somewhere, but maybe I was wrong.
Arbitrary postinstall operations haven't ended up in a PEP yet because *I* don't like them. Anyone who has experienced "Windows rot", where uninstallers fail to clean up properly after themselves, has seen first hand the consequences of delegating trust to development teams without the ability to set any minimum expectations for their quality assurance processes. (One way of looking at Linux distro packaging policies is to view them as a code review process applied to Turing-complete software build and installation programs, while container tech like Docker is a way of isolating apps from the host system.)

Trigger-based programming is hard at the best of times, and it doesn't get easier when integrating arbitrary pieces of software written by different people at different times in different contexts. On the other hand, I *am* prepared to build in an escape hatch that lets folks disagree with me, and I'll just not install their software.

As far as *pip* goes, whether or not to add a plugin system to handle additional metadata extensions would be up to the pip devs. As a user, my main request if the pip devs decided to add such a plugin system would be that extension handlers couldn't be implicitly installed as a dependency of another package. If folks want their installs to "just work", they shouldn't be marking non-standard metadata extensions as mandatory :) Cheers, Nick.
For the most part, I think it's all been said. What should and shouldn't be installed by default is really extension-specific, so there's not much point in speculating. But a comment or two: having to type
"python -m something" rather than just "something" isn't broken, it's just an inconvenience.
Tell that to a newbie. This is EXACTLY the kind of thing that should "just work". Maybe a three-tier system:

1) mandatory -- can't install without it
2) default -- try to install it by default if possible
3) optional -- only install if specifically asked for

And this isn't just about extensions -- for instance, the "all" stuff in iPython would be well served by level 2.
It's no different conceptually than the fact that if you don't have a C compiler, you can't install a package that contains C code
Sure it is -- a C compiler is a system tool, and the whole point of binary wheels is that the end user doesn't need one. -CHB
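Chris's three-tier idea might be expressed as a single per-extension field -- purely hypothetical, this appears in no PEP -- where "handling" is one of "mandatory", "default" or "optional":

    "extensions": {
        "exampleco.startmenu": {
            "handling": "default"
        }
    }

"default" would replace the blunt installer_must_handle boolean for the "install it if you can, skip it if you must" middle ground.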
On 18 April 2015 at 18:19, Chris Barker - NOAA Federal
"python -m something" rather than just "something" isn't broken, it's just an inconvenience.
Tell that to a newbie. This is EXACTLY the kind of thing that should "just work".
It's a huge "quality of implementation" issue, certainly - any installer that doesn't include script generation built in is going to be as annoying as hell to a user. But they do exist (wheel install, for instance) and the resulting installation "works", even if a newcomer would hate it. So it's not "mandatory" in the sense that no functionality is lost. But this is a moot point, as PEP 459 says the python.commands extension SHOULD be marked as required. And wheel install would technically be in violation of PEP 426, as it doesn't handle script wrappers and it doesn't fail when a package needs them (only "technically", because PEP 426 isn't finalised yet, and "wheel install" could be updated to support it). But I'd already said most of that - you just pulled that one point out of context. Paul
The wheel installer does call setuptools to generate console script wrappers.
On 18 April 2015 at 13:36, Paul Moore
It's a huge "quality of implementation" issue, certainly - any installer that doesn't include script generation built in is going to be as annoying as hell to a user. But they do exist (wheel install, for instance) and the resulting installation "works", even if a newcomer would hate it. So it's not "mandatory" in the sense that no functionality is lost. But this is a moot point, as PEP 459 says the python.commands extension SHOULD be marked as required. And wheel install would technically be in violation of PEP 426, as it doesn't handle script wrappers and it doesn't fail when a package needs them (only "technically", because PEP 426 isn't finalised yet, and "wheel install" could be updated to support it).
It's not in violation, that's the whole point of saying SHOULD, rather than MUST. Please don't lose that distinction - if users start demanding that developers always implement SHOULDs, they're misreading the spec, and are going to make life miserable for a lot of people by making unreasonable demands on their time. As a specification author, "SHOULD" is a way for us to say "most users are likely to want this, so you should probably do it if you don't have a strong preference, but not all users will want it, so certain tools may choose not to do it for reasons that are too context specific for us to go into in a general purpose specification". The MUSTs are the "things you may not personally care about, but other people do care about, will break if you get this wrong" part of the specs, the SHOULDs are "this is probably a good thing to do, but you may disagree" :) Regards, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 18 April 2015 at 20:02, Nick Coghlan wrote:
It's not in violation, that's the whole point of saying SHOULD, rather than MUST. Please don't lose that distinction - if users start demanding that developers always implement SHOULDs, they're misreading the spec, and are going to make life miserable for a lot of people by making unreasonable demands on their time.
Sorry - I missed quoting one relevant bit of PEP 426. What I was trying to say was that wheel install violates the "MUST" in PEP 426 (that installers MUST support a required extension or report an error when installing). But Daniel pointed out that I'm wrong anyway - wheel install *does* support generating script wrappers, using setuptools to do so. My apologies for not checking my facts more carefully, and for my confusing wording. Paul
On Mon, Apr 13, 2015 at 10:17 PM, Robert Collins wrote:
... One of the earlier things mentioned here - {pre,post}{install,remove} scripts - raises a red flag for me.
That's indeed a good a priori concern. I myself removed a lot of those scripts because of the fragility. Anything that needs to run on an end-user machine can fail, and writing idempotent scripts is hard. Unfortunately, pure declarative does not really cut it if you want cross platform support. Sure, you may be able to deal with menu entries, environment variables, etc... in a cross-platform manner with a significant effort, but what about COM registration? pywin32 is one of the most used packages in the python ecosystem, and its post install script is not trivial. Another difficult case is when a package needs some specific configuration to run at all, and that configuration requires values known at install time only (e.g. sys.prefix, as in the iris package).
I'd really prefer it if we keep wheels 100% declarative, and instead focus on defining appropriate metadata for the things you need to accomplish {pre,post}{install,remove} of a package.
What about a way for wheels to specify whether their pre/post install/remove actions are declarative or not, with support for the most common tasks, and with an escape mechanism that is explicitly opt-in (sketched below)? This way it could be a matter of policy to refuse packages that require non-declarative scripts.

David
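To make the opt-in idea concrete, here is a purely hypothetical sketch of what such metadata might look like -- none of these keys are defined by any PEP; the point is only that common actions become data the installer interprets, while anything imperative is an explicit, policy-flaggable escape hatch:

    # purely hypothetical install-actions metadata; no PEP defines these
    # keys. Declarative actions are plain data; the "scripts" key is the
    # explicit opt-in escape hatch a repository policy could reject.
    install_actions = {
        "declarative": {
            "menu_entries": [{"name": "MyApp", "target": "myapp.gui:main"}],
            "environment": {"MYAPP_HOME": "{sys.prefix}/share/myapp"},
        },
        "scripts": {
            "post_install": "myapp._postinstall:main",  # opt-in, non-declarative
        },
    }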
On 14 April 2015 at 06:37, David Cournapeau
pywin32 is one of the most used package in the python ecosystem, and its post install script is not trivial.
And yet pywin32's postinstall script is completely virtualenv-hostile. It registers start menu entries (not applicable when installing in a virtualenv), registers itself as a COM server (once again, one per machine), adds registry entries (again, virtualenv-hostile), moves installed files into the Windows system directory (ditto) etc. And yet for many actual uses of pywin32, installing as a wheel without running the postinstall is sufficient. With the exception of writing COM servers in Python (and maybe writing services, but I think cx_Freeze lets you do that without pywin32), pretty much every use *I* have seen of pywin32 can be replaced with ctypes or cffi with no loss of functionality.

I'd argue that pywin32 is a perfect example of a project where *not* supporting postinstall scripts would be a good idea, as it would encourage the project to find a way to implement the same functionality in a way that's compatible with current practices (virtualenv, tox, etc). Or it would encourage other projects to stop depending on pywin32 (which is actually what is happening, many projects now use ctypes and similar in place of pywin32-using code, to avoid the problems pywin32 causes for them).

Paul
On 13 April 2015 at 12:56, Chris Barker
Which brings me back to the question: should the python tools (i.e. wheel) be extended to support more use-cases, specifically non-python dependencies? Or do we just figure that that's a problem better solved by projects with a larger scope (i.e. rpm, deb, conda, canopy).
I'm on the fence here. I mostly care about Python, and I think we're pretty darn close with allowing wheel to support the non-python dependencies, which would allow us all to "simply pip install" pretty much anything -- that would be cool. But maybe it's a bit of a slippery slope, and if we go there, we'll end up re-writing conda.
The main two language independent solutions I've identified for this general "user level package management" problem in the Fedora Environments & Stacks context (https://fedoraproject.org/wiki/Env_and_Stacks/Projects/UserLevelPackageManag...) are conda (http://conda.pydata.org/) and Nix (https://nixos.org/nix/about.html), backed up by Pulp for plugin-based format independent repository management (http://www.pulpproject.org/).

From a Python upstream perspective, Nix falls a long way behind conda due to the fact that Nix currently relies on Cygwin for Windows support - it's interesting to me for Fedora because Nix ticks a lot of boxes from a system administrator perspective that conda doesn't (in particular, system administrators can more easily track what users have installed, and ensure that packages are updated appropriately in the face of security updates in dependencies).

I definitely see value in Python upstream formats being able to bundle additional files like config files, desktop integration files, service definition files, statically linked extension modules, etc, in a way that not only supports direct installation onto end user machines, but also conversion into platform specific formats (whether that platform is an operating system, or a cross-platform platform like nix, canopy or conda). The point where I draw the line is supporting *dynamic* linking between modules - that's the capability I view as defining the boundary between "enabling an addon ecosystem for a programming language runtime" and "providing a comprehensive software development platform" :)

Cheers, Nick.
On Tue, Apr 14, 2015 at 8:41 AM, Nick Coghlan
The main two language independent solutions I've identified for this general "user level package management" problem in the Fedora Environments & Stacks context ( https://fedoraproject.org/wiki/Env_and_Stacks/Projects/UserLevelPackageManag... ) are conda (http://conda.pydata.org/) and Nix (https://nixos.org/nix/about.html),
cool -- I hadn't seen nix before.
From a Python upstream perspective, Nix falls a long way behind conda due to the fact that Nix currently relies on Cygwin for Windows support -
The other thing that's nice about conda is that while it was designed for the general case, it has a lot of python-specific features. Being a Python guy -- I like that ;-) -- it may not work nearly as well for Ruby or what have you -- I wouldn't know.
The point where I draw the line is supporting *dynamic* linking between modules -
I'm confused -- you don't want a system to be able to install ONE version of a lib that various python packages can all link to? That's really the key use-case for me....
that's the capability I view as defining the boundary between "enabling an add-on ecosystem for a programming language runtime" and "providing a comprehensive software development platform" :)
Well, with its target audience being scientific programmers, conda IS trying to give you a "comprehensive software development platform". We're talking about Python here -- it's a development tool. It turns out that for scientific development, pure python is simply not enough -- hence the need for conda and friends.

I guess this is what it comes down to -- I'm all for adding a few features to wheel -- it would be nice to be able to pip install most of what I, and people like me, need. But maybe it's not possible -- you can solve the shared lib problem, and the scripts problem, and maybe the menu entries problem, but eventually, you end up with "I want to use numba" -- and then you need LLVM, etc. -- and pretty soon you are building a tool that provides a "comprehensive software development platform". ;-)

-CHB
Chris Barker writes:
On Tue, Apr 14, 2015 at 8:41 AM, Nick Coghlan wrote:
The point where I draw the line is supporting *dynamic* linking between modules -
I'm confused -- you don't want a system to be able to install ONE version of a lib that various python packages can all link to? That's really the key use-case for me....
Agreed. A key pain point for Python distributions is the lack of support for installing *one* instance of a Python library, and other Python modules able to discover such installed libraries which meet their declared dependency.

For example:

* Python distribution ‘foo’, for Python implementation “D” on architecture “X“, declares a dependency on “bar >= 1.7”.
* Installing Python distribution ‘bar’ version 1.8, on a host running Python “D” for architecture “X”, goes to a single instance of the ‘bar’ library for that architecture and Python implementation.
* Invoking the ‘foo’ code on the same host will go looking (dynamically?) for the dependency ‘bar’ and find version 1.8 already installed in the one instance on that host. It uses that and all is well.

I'm in agreement with Chris that, while the above example may not currently play out as described, that is a fault to be fixed by improving Python's packaging and distribution tools so that it *is* a first-class use case.

Nick, you seem to be arguing against that. Can you clarify?

Ben Finney
On 13 May 2015 at 16:19, Ben Finney
Chris Barker writes:
On Tue, Apr 14, 2015 at 8:41 AM, Nick Coghlan wrote:
The point where I draw the line is supporting *dynamic* linking between modules -
I'm confused -- you don't want a system to be able to install ONE version of a lib that various python packages can all link to? That's really the key use-case for me....
Agreed. A key pain point for Python distributions is the lack of support for installing *one* instance of a Python library, and other Python modules able to discover such installed libraries which meet their declared dependency.
Are we talking about Python libraries accessed via Python APIs, or linking to external dependencies not written in Python (including linking directly to C libraries shipped with a Python library)? It's the latter I consider to be out of scope for a language specific packaging system - Python packaging dependencies are designed to describe inter-component dependencies based on the Python import system, not dependencies based on the operating system provided C/C++ dynamic linking system. If folks are after the latter, then they want a language independent package system, like conda, nix, or the system package manager in a Linux distribution.
I'm in agreement with Chris that, while the above example may not currently play out as described, that is a fault to be fixed by improving Python's packaging and distribution tools so that it *is* a first-class use case.
Nick, you seem to be arguing against that. Can you clarify?
I'm arguing against supporting direct C level dependencies between packages that rely on dynamic linking to find each other rather than going through the Python import system, as I consider that the point where you cross the line into defining a new platform of your own, rather than providing components that can plug into a range of platforms. (Another way of looking at this: if a tool can manage the Python runtime in addition to Python modules, it's a full-blown arbitrary software distribution platform, not just a Python package manager.)

Defining cross-platform ABIs (cf. http://bugs.python.org/issue23966) is an unholy mess that will be quite willing to consume vast amounts of time without a great deal to show for it beyond what can already be achieved more easily by telling people to just use one of the many existing systems designed specifically to solve that problem (with conda being my default recommendation if you care about Windows, and nix being my recommendation if you only care about *nix systems).

Integrator oriented packaging tools and developer oriented packaging tools solve different problems for different groups of people, so I'm firmly of the opinion that trying to solve both sets of problems with a single tool will produce a result that doesn't work as well for *either* use case as separate tools can.

Cheers, Nick.

P.S. The ABI definition problem is at least somewhat manageable for Windows and Mac OS X desktop/laptop environments (since you can mostly pretend that architectures other than x86_64 don't exist, with perhaps some grudging concessions to the existence of 32-bit mode), but beyond those two, things get very messy, very fast - identifying CPU architectures, CPU operating modes and kernel syscall interfaces correctly is still a hard problem in the Linux distribution space, and they've been working at it a lot longer than we have (and that's *without* getting into things like determining which vectorisation instructions are available). Folks often try to "deal" with this complexity by wishing it away, but the rise of aarch64 and IBM's creation of the OpenPOWER Foundation is making the data centre space interesting again, while in the mobile and embedded spaces it's ARM that is the default, with x86_64 attempting to make inroads.

As a result of all that, distributing software that uses dynamically linked dependencies is genuinely difficult, to the point where even operating system vendors struggle to get it right. This is why "statically link all the things" keeps coming back in various guises (whether that's Go's lack of dynamic linking support, or the surge in adoption of Linux containers as a deployment technique), despite the fact that these techniques inevitably bring back the *other* problems that led to the invention of dynamic linking in the first place. The only solution that is known to work reliably for dynamic linking is to have a curated set of packages all built by the same build system, so you know they're using consistent build settings. Linux distributions provide this, as do multi-OS platforms like nix and conda. We *might* be able to provide it for Python someday if PyPI ever gets an integrated build farm, but that's still a big "if" at this point.
I'm confused -- you don't want a system to be able to install ONE version of a lib that various python packages can all link to? That's really the key use-case for me....
Are we talking about Python libraries accessed via Python APIs, or linking to external dependencies not written in Python (including linking directly to C libraries shipped with a Python library)?
I, at least, am talking about the latter. For a concrete example: libpng, for instance, might be needed by PIL, wxPython, Matplotlib, and who knows what else. At this point, if you want to build a package of any of these, you need to statically link it into each of them, or distribute shared libs with each package -- if you are using them all together (which I do, anyway) you now have three copies of the same lib (but maybe different versions) all linked into your executable. Maybe there is no downside to that (I haven't had a problem yet), but it seems like a bad way to do it!

It's the latter I consider to be out of scope for a language specific packaging system

Maybe, but it's a problem to be solved, and the Linux distros more or less solve it for us, but OS-X and Windows have no such system built in (OS-X does have Brew and macports....)
- Python packaging dependencies are designed to describe inter-component dependencies based on the Python import system, not dependencies based on the operating system provided C/C++ dynamic linking system.
I think there is a bit of fuzz here -- cPython, at least, uses the operating system provided C/C++ dynamic linking system -- it's not a totally independent thing.

If folks are after the latter, then they want a language independent package system, like conda, nix, or the system package manager in a Linux distribution.
And I am, indeed, focusing on conda lately for this reason -- but not all my users want to use a whole new system, they just want to "pip install" and have it work. And if you are using something like conda you don't need pip or wheels anyway!
I'm arguing against supporting direct C level dependencies between packages that rely on dynamic linking to find each other rather than going through the Python import system,
Maybe there is a middle ground. For instance, I have a complex wrapper system around a bunch of C++ code. There are maybe 6 or 7 modules that all need to link against that C++ code. On OS-X (and I think Linux, I haven't been doing those builds), we can statically link all the C++ into one python module -- then, as long as that python module is imported before the others, they will all work, and all use that same already loaded version of that library. (This doesn't work so nicely on Windows, unfortunately, so there we build a dll, have all the extensions link to it, then put the dll somewhere it gets found -- I'm a little fuzzy on those details.)

So option (1) for something like libpng is to have a compiled python module that is little more than something that can be linked to libpng, so that it can be found and loaded by cPython on import, and any other modules can then expect it to be there -- see the sketch below. This is a big old kludge, but I think it could be done with little change to anything in Python or wheel -- but it would require changes to how each package that uses that lib sets itself up and checks for and installs dependencies -- maybe not really possible. And it would be better if dependencies could be platform independent, which I'm not sure is supported now.

option (2) would be to extend python's import mechanism a bit to allow it to do a raw "link in this arbitrary lib" action, so the lib would not have to be wrapped in a python module -- I don't know how possible that is, or if it would be worth it.
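A minimal sketch of option (1), assuming a hypothetical py_libpng package whose only job is to carry and preload a bundled shared library. Loading with RTLD_GLOBAL publishes the symbols to the process, so extension modules imported afterwards can resolve against them (this is a POSIX mechanism; it has no effect on Windows, which is why the thread treats Windows separately):

    # hypothetical py_libpng/__init__.py
    import ctypes
    import os

    _here = os.path.dirname(os.path.abspath(__file__))
    _libname = "libpng16.dylib"  # hypothetical bundled copy (OS-X naming)

    # RTLD_GLOBAL makes the library's symbols visible process-wide, so
    # extensions imported after "import py_libpng" need not bundle their own.
    _libpng = ctypes.CDLL(os.path.join(_here, _libname), mode=ctypes.RTLD_GLOBAL)

A consuming package would then just "import py_libpng" before importing its own extension module, and declare a dependency on py_libpng.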
(Another way of looking at this: if a tool can manage the Python runtime in addition to Python modules, it's a full-blown arbitrary software distribution platform, not just a Python package manager).
sure, but if it's ALSO a Python package manager, then why not? i.e. conda -- if we all used conda, we wouldn't need pip+wheel.
Defining cross-platform ABIs (cf. http://bugs.python.org/issue23966)
This is a mess that you need to deal with for ANY binary package -- that's why we don't distribute binary wheels on pypi for Linux, yes?
I'm firmly of the opinion that trying to solve both sets of problems with a single tool will produce a result that doesn't work as well for *either* use case as separate tools can.
I'm going to point to conda again -- it solves both problems, and it's better to use it for all your packages than mingling it with pip (though you CAN mingle it with pip...). So if we say "pip and friends are not going to do that", then we are saying: we don't support a substantial class of packages, and then I wonder what the point is of supporting binary packages at all?

P.S. The ABI definition problem is at least somewhat manageable for Windows and Mac OS X desktop/laptop environments
Ah -- here is a key point -- because of that, we DO support binary packages on PyPi -- but only for Windows and OS-X. I'm just suggesting we find a way to extend that to packages that require a non-system, non-python dependency.

but beyond those two, things get very messy, very fast - identifying CPU architectures, CPU operating modes and kernel syscall interfaces correctly is still a hard problem in the Linux distribution space

right -- but where I am confused is where the line is drawn -- it seems to me the line is REALLY drawn at "you need to compile some C (or Fortran, or ???) code", rather than at "you depend on another lib" -- the C code, whether it is a third party lib or part of your extension, still needs to be compiled to match the host platform.

and that's *without* getting into things like determining which vectorisation instructions are available).
yup -- that's why we don't have binary wheels for numpy up on PyPi at this point.....

but the rise of aarch64 and IBM's creation of the OpenPOWER Foundation is making the data centre space interesting again, while in the mobile and embedded spaces it's ARM that is the default, with x86_64 attempting to make inroads.
Are those the targets for binary wheels? I don't think so.
This is why "statically link all the things" keeps coming back in various guises
but if you statically link, you need to build the static package right anyway -- so it doesn't actually solve the problem at hand.

The only solution that is known to work reliably for dynamic linking is to have a curated set of packages all built by the same build system, so you know they're using consistent build settings. Linux distributions provide this, as do multi-OS platforms like nix and conda. We *might* be able to provide it for Python someday if PyPI ever gets an integrated build farm, but that's still a big "if" at this point.
Ah -- here is the issue -- but I think we HAVE pretty much got what we need here -- at least for Windows and OS-X. It depends what you mean by "curated", but it seems we have a (de facto?) policy for PyPi: binary wheels should be compatible with the python.org builds. So while each package wheel is supplied by the package maintainer one way or another, rather than by a central entity, it is more or less curated -- or at least standardized. And if you are going to put a binary wheel up, you need to make sure it matches -- and that is less than trivial for packages that require a third party dependency -- but building the lib statically and then linking it in is not inherently easier than doing a dynamic link.

OK -- I just remembered the missing link for doing what I proposed above for third party dynamic libs: at this point dependencies are tied to a particular package -- whereas my plan above would require a dependency tied to a particular wheel, not the package as a whole. i.e.: my mythical matplotlib wheel on OS-X would depend on a py_libpng module -- which could be provided as a separate binary wheel. But matplotlib in general would not have that dependency -- for instance, on Linux, folks would want it to build against the system lib, and not have another dependency. Even on OS-X, homebrew users would want it to build against the homebrew lib, etc...

So what would be good is a way to specify a "this build" dependency. That can be hacked in, of course, but nicer not to have to.

NOTE: we ran into this with readline and iPython wheels -- I can't remember how that was resolved.

-Chris
On 15 May 2015 at 06:01, Chris Barker
I'm confused -- you don't want a system to be able to install ONE version of a lib that various python packages can all link to? That's really the key use-case for me....
Are we talking about Python libraries accessed via Python APIs, or linking to external dependencies not written in Python (including linking directly to C libraries shipped with a Python library)?
I, at least, am talking about the latter. For a concrete example: libpng, for instance, might be needed by PIL, wxPython, Matplotlib, and who knows what else. At this point, if you want to build a package of any of these, you need to statically link it into each of them, or distribute shared libs with each package -- if you are using them all together (which I do, anyway) you now have three copies of the same lib (but maybe different versions) all linked into your executable. Maybe there is no downside to that (I haven't had a problem yet), but it seems like a bad way to do it!
If they are exchanging data structures, it will break at some point. Consider libpng; say that you have a handle to the native C struct for it in PIL, and you pass the wrapping Python object for it to Matplotlib, but the struct changed between the version embedded in PIL and that in Matplotlib. Boom.

If you communicate purely via Python objects that get remarshalled within each lib it's safe (perhaps heavy on the footprint, but safe).
-Rob
On Thu, May 14, 2015 at 4:41 PM, Robert Collins
If they are exchanging data structures, it will break at some point. Consider libpng; say that you have a handle to the native C struct for it in PIL, and you pass the wrapping Python object for it to Matplotlib but the struct changed between the version embedded in PIL and that in Matplotlib. Boom.
If you communicate purely via Python objects that get remarshalled within each lib it's safe (perhaps heavy on the footprint, but safe).
As far as I know -- no one tries to pass, say, libpng structure pointers around between different python packages. You are right, that would be pretty insane! The best you can do is pass python buffer objects around so you are not copying data where you don't need to. But maybe there is a use-case for passing a native lib data structure around, in which case, yes, you'd really want the lib versions to match! I suppose if I were to do this, I'd do a run-time check on the lib version number... not sure how else you could be safe in Python-land.

So maybe the only real downside is some wasted disk space and memory, which are pretty cheap these days -- but I still don't like it ;-) But the linker/run time/whatever can keep track of which version of a given function is called where?

-CHB
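For what it's worth, the run-time version check mentioned above is easy where the C library exposes one; libpng does, via png_access_version_number(). A minimal sketch, assuming a system libpng can be found at all:

    # run-time library version check; png_access_version_number() is a
    # real libpng entry point (returns e.g. 10616 for libpng 1.6.16)
    import ctypes
    import ctypes.util

    path = ctypes.util.find_library("png")
    if path is None:
        print("no libpng found on this system")
    else:
        png = ctypes.CDLL(path)
        png.png_access_version_number.restype = ctypes.c_uint32
        print("libpng runtime version:", png.png_access_version_number())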
On 14 May 2015 at 19:01, Chris Barker
Ah -- here is the issue -- but I think we HAVE pretty much got what we need here -- at least for Windows and OS-X. It depends what you mean by "curated", but it seems we have a (defacto?) policy for PyPi: binary wheels should be compatible with the python.org builds. So while each package wheel is supplied by the package maintainer one way or another, rather than by a central entity, it is more or less curated -- or at least standardized. And if you are going to put a binary wheel up, you need to make sure it matches -- and that is less than trivial for packages that require a third party dependency -- but building the lib statically and then linking it in is not inherently easier than doing a dynamic link.
I think the issue is that, if we have 5 different packages that depend on (say) libpng, and we're using dynamic builds, then how do those packages declare that they need access to libpng.dll? And on Windows, where does the user put libpng.dll so that it gets picked up? And how does a non-expert user do this ("put it in $DIRECTORY, update your PATH, blah blah blah" doesn't work for the average user)?

In particular, on Windows, note that the shared DLL must either be in the directory where the executable is located (which is fun when you have virtualenvs, embedded interpreters, etc), or on PATH (which has other implications - suppose I have an incompatible version of libpng.dll, from mingw, say, somewhere earlier on PATH).

The problem isn't so much defining a standard ABI that shared DLLs need - as you say, that's a more or less solved problem on Windows - it's managing how those shared DLLs are made available to Python extensions. And *that* is what Unix package managers do for you, and Windows doesn't have a good solution for (other than "bundle all the dependent DLLs with the app, or suffer DLL hell").

Paul

PS For a fun exercise, it might be interesting to try breaking conda - find a Python extension which uses a shared DLL, and check that it works. Then grab an incompatible copy of that DLL (say a 32-bit version on a 64-bit system) and try hacking around with PATH, putting the incompatible DLL in a directory earlier on PATH than the correct one, in the Windows directory, use an embedded interpreter like mod_wsgi, tricks like that. If conda survives that, then the solution that they use might be something worth documenting and might offer an approach to solving the issue I described above. If it *doesn't* survive, then that probably implies that the general environment pip has to work in is less forgiving than the curated environment conda manages (which is, of course, the whole point of using conda - to get that curated environment :-))
On Fri, May 15, 2015 at 1:49 AM, Paul Moore
I think the issue is that, if we have 5 different packages that depend on (say) libpng, and we're using dynamic builds, then how do those packages declare that they need access to libpng.dll?
this is the missing link -- it is a binary build dependency, not a package dependency -- so not so much that matplotlib-1.4.3 depends on libpng.x.y, but that:

matplotlib-1.4.3-cp27-none-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl

depends on:

libpng-x.y

(all those binary parts will come from the platform) That's what's missing now.

And on Windows, where does the user put libpng.dll so that it gets picked up?
Well, here is the rub -- Windows dll hell really is hell -- but I think if it goes into the python dll search path (sorry, not on a Windows box where I can really check this out right now), it can work -- I now have an in-house product that has multiple python modules sharing a single dll somehow....
And how does a non-expert user do this ("put it in $DIRECTORY, update your PATH, blah blah blah" doesn't work for the average user)?
That's why we may need to update the tooling to handle this -- I'm not totally sure if the current wheel format can support this on Windows -- though it can on OS-X.

In particular, on Windows, note that the shared DLL must either be in the directory where the executable is located (which is fun when you have virtualenvs, embedded interpreters, etc), or on PATH (which has other implications - suppose I have an incompatible version of libpng.dll, from mingw, say, somewhere earlier on PATH).
that would be dll hell, yes.....
The problem isn't so much defining a standard ABI that shared DLLs need - as you say, that's a more or less solved problem on Windows - it's managing how those shared DLLs are made available to Python extensions. And *that* is what Unix package managers do for you, and Windows doesn't have a good solution for (other than "bundle all the dependent DLLs with the app, or suffer DLL hell").
exactly -- but if we consider the python install to be the "app", rather than an individual python bundle, then we _may_ be OK.

PS For a fun exercise, it might be interesting to try breaking conda -

Windows really is simply broken [1] in this regard -- so I'm quite sure you could break conda -- but it does seem to do a pretty good job of not being broken easily by common uses -- I can't say I know enough about Windows dll finding or conda to know how...

Oh, and conda is actually broken in this regard on OS-X at this point -- if you compile your own extension in an anaconda environment, it will find a shared lib at compile time that it won't find at run time. The conda install process fixes these, but that's a pain during development -- i.e. you don't want to have to actually install the package with conda to run a test each time you re-build the dll (or even change a bit of python code...)

But in short -- I'm pretty sure there is a way, on all systems, to have a standard way to build extension modules, combined with a standard way to install shared libs, so that a lib can be shared among multiple packages. So the question remains: Is there any point? or is the current approach of statically linking all third party libs the way to go? If so, then is there any chance of getting folks to conform to this standard for PyPi hosted binary packages anyway? i.e. the curation problem.

Personally, I'm on the fence here -- I really want newbies to be able to simply "pip install" as many packages as possible and get a good result when they do it. On the other hand, I've found that conda better supports this right now, so it's easier for me to simply use that for my tools.

-Chris

[1] My take on dll hell: a) it's inherently difficult -- which is why Linux provides a system package manager. b) however, Windows really does make it MORE difficult than it has to be: i) it looks first next to the executable ii) it also looks on the PATH (rather than a separate DLL_PATH). Combine these two, and you have some folks dropping dlls next to their executable, which means they have inadvertently dropped them onto the DLL search path for other apps to find. Add to this the (very odd to me) long standing tradition of not putting extensive version numbers in dll file names, and presto: dll hell!
On 15 May 2015 at 20:56, Chris Barker
But in short -- I'm pretty sure there is a way, on all systems, to have a standard way to build extension modules, combined with a standard way to install shared libs, so that a lib can be shared among multiple packages. So the question remains:
Is there any point? or is the current approach of statically linking all third party libs the way to go?
If someone can make it work, that would be good. But (a) nobody is actually offering to develop and maintain such a solution, and (b) it's not particularly clear how *much* of a benefit there would be (space savings aren't that important, ease of upgrade is fine as long as everything can be upgraded at once, etc...)
If so, then is there any chance of getting folks to conform to this standard for PyPi hosted binary packages anyway? i.e. the curation problem.
If it exists, and if there's a benefit, people will use it.
Personally, I'm on the fence here -- I really want newbies to be able to simply "pip install" as many packages as possible and get a good result when they do it.
Static linking gives that on Windows FWIW. (And maybe also on OSX?) This is a key point, though - the goal shouldn't be "use dynamic linking" but rather "make the user experience as easy as possible". It may even be that the best approach (dynamic or static) differs depending on platform.
On the other hand, I've found that conda better supports this right now, so it's easier for me to simply use that for my tools.
And that's an entirely reasonable position. The only "problem" (if indeed it is a problem) is that having two different solutions (pip/wheel and conda) splits the developer resource, which means that neither approach moves forward as fast as a combined approach would. But that's OK if the two solutions are addressing different needs (which seems to be the case for the moment). Paul
On Fri, May 15, 2015 at 1:44 PM, Paul Moore
Is there any point? or is the current approach of statically linking all third party libs the way to go?
If someone can make it work, that would be good. But (a) nobody is actually offering to develop and maintain such a solution,
well, it's on my list -- but it has been for a while, so I'm trying to gauge whether it's worth putting at the top of my "things to do for python" list. It's not at the top now ;-)
and (b) it's not particularly clear how *much* of a benefit there would be (space savings aren't that important, ease of upgrade is fine as long as everything can be upgraded at once, etc...)
hmm -- that may be a trick, though not an uncommon one in python package dependencies -- it may be hard to have more than one version of a given lib installed....
If so, then is there any chance of getting folks to conform to this standard for PyPi hosted binary packages anyway? i.e. the curation problem.
If it exists, and if there's a benefit, people will use it.
OK -- that's encouraging...
Personally, I'm on the fence here -- I really want newbies to be able to simply "pip install" as many packages as possible and get a good result when they do it.
Static linking gives that on Windows FWIW. (And maybe also on OSX?) This is a key point, though - the goal shouldn't be "use dynamic linking" but rather "make the user experience as easy as possible". It may even be that the best approach (dynamic or static) differs depending on platform.
true -- though we also have another problem -- that static linking solution is actually a big pain for package maintainers -- building and linking the dependencies the right way is a pain -- and now everyone that uses a given lib has to figure out how to do it. Giving folks a dynamic lib they can use would make it easier for them to build their packages -- a nice benefit there. Though it's a lot harder to provide a build environment than just the lib to link to... I'm going to have to think more about that...
On the other hand, I've found that conda better supports this right now, so it's easier for me to simply use that for my tools.
And that's an entirely reasonable position. The only "problem" (if indeed it is a problem) is that by having two different solutions (pip/wheel and conda) splits the developer resource, which means that neither approach moves forward as fast as a combined approach does.
That's not the only problem -- the current split between the (more than one) scientific python distributions, and the community of folks using python.org and pypi, creates a bit of a mess for newbies. I'm reviving this conversation because I just spent a class lecture in a python class on numpy/scipy -- these students have been using a python install for months, using virtualenv, pip installing whatever they need, etc. And now, to use another lib, they have to go through machinations, maybe even installing an entire additional python. This is not good. And I've had to help more than one student untangle a mess of Apple Python, python.org python, homebrew, and/or Anaconda -- for someone that doesn't really get python packaging, never mind PATHs, and .bashrc vs .bash_profile, etc, it's an unholy mess. "There should be one-- and preferably only one --obvious way to do it." -- HA!

But that's OK if the two solutions are addressing different needs
The needs aren't really that different, however. Oh well.

Anyway, it seems like if I can find some time to prototype what I have in mind, there may be some room to make it official if it works out. If anyone else wants to help -- let me know!

-Chris
On 16 May 2015 at 06:45, Chris Barker
Personally, I'm on the fence here -- I really want newbies to be able to simply "pip install" as many packages as possible and get a good result when they do it.
Static linking gives that on Windows FWIW. (And maybe also on OSX?) This is a key point, though - the goal shouldn't be "use dynamic linking" but rather "make the user experience as easy as possible". It may even be that the best approach (dynamic or static) differs depending on platform.
true -- though we also have another problem -- that static linking solution is actually a big pain for package maintainers -- building and linking the dependencies the right way is a pain -- and now everyone that uses a given lib has to figure out how to do it. Giving folks a dynamic lib they can use would make it easier for them to build their packages -- a nice benefit there.

Though it's a lot harder to provide a build environment than just the lib to link to... I'm going to have to think more about that...
It seems to me that the end user doesn't really have a problem here ("pip install matplotlib" works fine for me using the existing wheel). It's the package maintainers (who have to build the binaries) that have the issue, because everyone ends up doing the same work over and over, building dependencies. So rather than trying to address the hard problem of dynamic linking, maybe a simpler solution is to set up a PyPI-like hosting solution for static libraries of C dependencies?

It could be as simple as a github project that contained a directory for each dependency, with scripts to build Python-compatible static libraries, and probably built .lib files for the supported architectures. With a setuptools build plugin you could even just specify your libraries in setup.py, and have the plugin download the lib files automatically at build time. People add libraries to the archive simply by posting pull requests. Maybe the project maintainer maintains the actual binaries by running the builds separately and publishing them separately, or maybe PRs include binaries - either way would work (although having the maintainer do it ensures a certain level of QA that the build process is reproducible).

It could even include libraries that people need for embedding, rather than extensions (I recently needed a version of libxpm compatible with Python 3.5, for building a Python-enabled vim, for example). The msys2 project provides something very similar to this at https://github.com/Alexpux/MINGW-packages which is a repository of build scripts for various packages.

Paul

PS The above is described as if it's single-platform, mostly because I only tend to think about these issues from a Windows POV, but it shouldn't be hard to extend it to multi-platform.
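As a sketch of the interface described above -- no such plugin exists, so the staticlibs requirement, the static_libs keyword and the hosting layout are all invented purely to illustrate the idea:

    # hypothetical setup.py using an imagined "staticlibs" setuptools
    # plugin that fetches prebuilt static libraries at build time
    from setuptools import setup, Extension

    setup(
        name="mypkg",
        version="1.0",
        ext_modules=[Extension("mypkg._png", sources=["src/_png.c"])],
        setup_requires=["staticlibs"],      # imagined plugin
        static_libs=[("libpng", "1.6")],    # fetched from the shared archive
    )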
On Sat, May 16, 2015 at 4:13 AM, Paul Moore
Though it's a lot harder to provide a build environment than just the lib to link too .. I"m going to have to think more about that...
It seems to me that the end user doesn't really have a problem here ("pip install matplotlib" works fine for me using the existing wheel).
Sure -- but that's because Matthew Brett has done a lot of work to make that happen.
It's the package maintainers (who have to build the binaries) that have the issue because everyone ends up doing the same work over and over, building dependencies.
Exactly -- It would be nice if the ecosystem made that easier.
So rather than trying to address the hard problem of dynamic linking, maybe a simpler solution is to set up a PyPI-like hosting solution for static libraries of C dependencies?
It could be as simple as a github project that contained a directory for each dependency,
I started that here: https://github.com/PythonCHB/mac-builds but haven't kept it up. And Matthew Brett has done most of the work here: https://github.com/MacPython -- not sure how he's sharing the static libs, but it could be done.

With a setuptools build plugin you could even just specify your libraries in setup.py, and have the plugin download the lib files automatically at build time.
actually, that's a pretty cool idea! you'd need a place to host them -- GitHub is no longer hosting "downloads", are they? though you could probably use github-pages... (or something else)
People add libraries to the archive simply by posting pull requests. Maybe the project maintainer maintains the actual binaries by running the builds separately and publishing them separately, or maybe PRs include binaries
or you use a CI system to build them. Something like this is being done by a bunch of folks for conda/binstar: https://github.com/ioos/conda-recipes is just one example.

PS The above is described as if it's single-platform, mostly because I only tend to think about these issues from a Windows POV, but it shouldn't be hard to extend it to multi-platform.
Indeed -- the MacWheels projects are, of course, single platform, but could be extended. Though at the end of the day, there isn't much to share between building libs on different platforms (unless you are using a cross-platform build tool -- why I was trying out gattai for my stuff). The conda stuff is multi-platform, though, in fact, you have to write a separate build script for each platform -- it doesn't really provide anything to help with that part.

But while these efforts are moving towards removing the need for every package maintainer to build the deps -- we are now duplicating the effort of trying to remove duplication of effort :-) -- but maybe just waiting for something to gain momentum and rise to the top is the answer.

-Chris
On Sat, May 16, 2015 at 4:56 AM, Chris Barker wrote:
But in short -- I'm pretty sure there is a way, on all systems, to have a standard way to build extension modules, combined with a standard way to install shared libs, so that a lib can be shared among multiple packages. So the question remains:
There is actually no way to do that on windows without modifying the interpreter somehow. This was somehow discussed a bit at PyCon when talking about windows packaging:

1. the simple way to share DLLs across extensions is to put them in the %PATH%, but that's horrible.
2. there are ways to put DLLs in a shared directory *not* in the %PATH% since at least windows XP SP2 and above, through the SetDllDirectory API.

With 2., you still have the issue of DLL hell, which may be resolved through naming and activation contexts. I had a brief chat with Steve where he mentioned that this may be a solution, but he was not 100 % sure IIRC. The main drawback of this solution is that it won't work when inheriting virtual environments (as you can only set a single directory).

FWIW, we are about to deploy 2. @ Enthought (where we control the python interpreter, so it is much easier for us).

David
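For concreteness, the SetDllDirectory call in option 2. can be made from Python via ctypes; this only sketches the mechanism, with a hypothetical SharedDLLs directory under sys.prefix. Note, per the drawback above, that it *sets* a single directory rather than appending to a search path:

    # minimal sketch of option 2. -- SetDllDirectoryW is a real Win32 API,
    # but the SharedDLLs layout is hypothetical; it replaces any previously
    # set directory rather than adding one
    import ctypes
    import os
    import sys

    shared = os.path.join(sys.prefix, "SharedDLLs")  # hypothetical location
    if os.name == "nt":
        ctypes.windll.kernel32.SetDllDirectoryW(shared)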
On 16 May 2015 at 07:35, David Cournapeau
There is actually no way to do that on windows without modifying the interpreter somehow. This was somehow discussed a bit at PyCon when talking about windows packaging:
1. the simple way to share DLLs across extensions is to put them in the %PATH%, but that's horrible. 2. there are ways to put DLLs in a shared directory *not* in the %PATH% since at least windows XP SP2 and above, through the SetDllDirectory API.
With 2., you still have the issue of DLL hell, which may be resolved through naming and activation contexts. I had a brief chat with Steve where he mentioned that this may be a solution, but he was not 100 % sure IIRC. The main drawback of this solution is that it won't work when inheriting virtual environments (as you can only set a single directory).
FWIW, we are about to deploy 2. @ Enthought (where we control the python interpreter, so it is much easier for us).
This is indeed precisely the issue. In general, Python code can run with "the executable" being in many different places - there are the standard installs, virtualenvs, and embedding scenarios to consider. So "put DLLs alongside the executable", which is often how Windows applications deal with this issue, is not a valid option (that's an option David missed out above, but that's fine as it doesn't work :-))

Putting DLLs on %PATH% *does* cause problems, and pretty severe ones. People who use ports of Unix tools, such as myself, hit this a lot - at one point I got so frustrated with various incompatible versions of libintl showing up on my PATH, all with the same name, that I went on a spree of rebuilding all of the GNU tools without libintl support, just to avoid the issue (and older versions of openssl were just as bad with libeay, etc).

So, as David says, you pretty much have to use SetDllDirectory and similar features to get a viable location for shared DLLs. I guess it *may* be possible to call those APIs from a Python extension that you load *before* using any shared DLLs, but that seems like a very fragile solution. It's also possible for Python 3.6+ to add a new "shared DLLs" location for such things, which the core interpreter includes (either via SetDllDirectory or by the same mechanism that adds C:\PythonXY\DLLs to the search path at the moment). But that wouldn't help older versions.

So while I encourage Chris' enthusiasm in looking for a solution to this issue, I'm not sure it's as easy as he's hoping.

Paul
On Fri, May 15, 2015 at 11:35 PM, David Cournapeau
On Sat, May 16, 2015 at 4:56 AM, Chris Barker
wrote: But in short -- I'm pretty sure there is a way, on all systems, to have a standard way to build extension modules, combined with a standard way to install shared libs, so that a lib can be shared among multiple packages. So the question remains:
There is actually no way to do that on windows without modifying the interpreter somehow.
Darn.
This was discussed a bit at PyCon when talking about Windows packaging:
1. the simple way to share DLLs across extensions is to put them in the %PATH%, but that's horrible.
yes -- that has to be off the table, period.
2. there are ways to put DLLs in a shared directory *not* in the %PATH% since at least windows XP SP2 and above, through the SetDllDirectory API.
With 2., you still have the issue of DLL hell,
could you clarify a bit -- I thought that this could, at least, put a dir on the search path that was specific to that python context. So it would require cooperation among all the packages being used at once, but not get tangled up with the rest of the system. But maybe I'm wrong here -- I have no idea what the heck I'm doing with this!

which may be resolved through naming and activation contexts.
I guess that's what I mean by the above..
I had a brief chat with Steve where he mentioned that this may be a solution, but he was not 100 % sure IIRC. The main drawback of this solution is that it won't work when inheriting virtual environments (as you can only set a single directory).
no relative paths here? or a path that can be set at run time? or maybe I'm missing what "inheriting virtual environments" means...
FWIW, we are about to deploy 2. @ Enthought (where we control the python interpreter, so it is much easier for us).
It'll be great to see how that works out, then. I take it that this means that for Canopy, you've decided that statically linking everything is NOT the way to go. Which is a good data point to have. Thanks for the update.
-Chris
On 16 May 2015 at 19:40, Chris Barker
With 2., you still have the issue of DLL hell,
could you clarify a bit -- I thought that this could, at least, put a dir on the search path that was specific to that python context. So it would require cooperation among all the packages being used at once, but not get tangled up with the rest of the system. but maybe I'm wrong here -- I have no idea what the heck I'm doing with this!
Suppose Python adds C:\PythonXY\SharedDLLs to %PATH%. Suppose there's a libpng.dll in there, for matplotlib. Everything works fine. Then I install another non-Python application that uses libpng.dll, and does so by putting libpng.dll alongside the executable (a common way of making DLLs available with Windows applications). Also assume that the application installer adds the application directory to the *start* of PATH. Now, Python extensions will use this 3rd party application's DLL rather than the correct one. If it's ABI-incompatible, the Python extension will crash. If it's ABI compatible, but behaves differently (it could be a different version) there could be inconsistencies or failures. The problem is that while Python can add a DLL directory to PATH, it cannot control what *else* is on PATH, or what has priority. Paul
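To make the shadowing mechanics concrete, here is a toy sketch (the directory names are hypothetical): for a given DLL base name, whichever directory comes earliest in the search order wins, so an unrelated application directory prepended to %PATH% silently substitutes its own copy.

    import os

    def first_match(dll_name, search_dirs):
        # Windows resolves a DLL base name against the FIRST directory
        # that contains a file by that name; later copies never load.
        for d in search_dirs:
            candidate = os.path.join(d, dll_name)
            if os.path.exists(candidate):
                return candidate
        return None

    # Hypothetical %PATH% ordering: the other app's directory sorts first,
    # so its libpng.dll shadows the copy Python meant to provide.
    path_dirs = [r"C:\OtherApp", r"C:\Python34\SharedDLLs"]
    print(first_match("libpng.dll", path_dirs))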
On Sat, May 16, 2015 at 11:54 AM, Paul Moore
could you clarify a bit -- I thought that this could, at least, put a dir on the search path that was specific to that python context. So it would require cooperation among all the packages being used at once, but not get tangled up with the rest of the system. but maybe I'm wrong here -- I have no idea what the heck I'm doing with this!
Suppose Python adds C:\PythonXY\SharedDLLs to %PATH%. Suppose there's a libpng.dll in there, for matplotlib.
I think we all agree that %PATH% is NOT the option! That is the key source of dll hell on Windows. I was referring to the SetDllDirectory API. I don't think that gets picked up by other processes. from: https://msdn.microsoft.com/en-us/library/windows/desktop/ms686203%28v=vs.85%... It looks like you can add a path, at run time, that gets searched for dlls before the rest of the system locations. And this does not affect any other applications. But you'd need to make sure this got run before any of the affected packages were loaded -- which is probably what David meant by needing to "control the python binary".
-Chris
On 16 May 2015 at 20:04, Chris Barker
I was referring to the SetDllDirectory API. I don't think that gets picked up by other processes.
from:
https://msdn.microsoft.com/en-us/library/windows/desktop/ms686203%28v=vs.85%...
It looks like you can add a path, at run time, that gets searched for dlls before the rest of the system locations. And this does not affect any other applications. But you'd need to make sure this got run before any of the affected packages were loaded -- which is probably what David meant by needing to "control the python binary".
Ah, sorry - I misunderstood you. This might work, but as you say, the DLL Path change would need to run before any imports needed it. Which basically means it needs to be part of the Python interpreter startup. It *could* be run as normal user code - you just have to ensure you run it before any imports that need shared libraries. But that seems very fragile to me. I'm not sure it's viable as a generic solution. Paul
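For concreteness, a minimal sketch of the approach under discussion, calling SetDllDirectoryW through ctypes before any imports that need the shared DLLs (the SharedDLLs layout is a hypothetical convention, not something pip or CPython defines):

    import ctypes
    import os
    import sys

    shared = os.path.join(sys.prefix, "SharedDLLs")  # hypothetical location
    if os.name == "nt" and os.path.isdir(shared):
        # Adds one directory to this process's DLL search order without
        # touching %PATH%, so other applications are unaffected. A call
        # REPLACES the directory set by any previous call -- only one can
        # be active, hence the single-directory limitation David mentions.
        ctypes.windll.kernel32.SetDllDirectoryW(shared)

    # ...only now import extension modules that rely on those DLLs.

As Paul says, running this as ordinary user code only works if it truly executes before the first affected import, which is what makes it fragile.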
On 15 May 2015 at 04:01, Chris Barker
I'm confused -- you don't want a system to be able to install ONE version of a lib that various python packages can all link to? That's really the key use-case for me....
Are we talking about Python libraries accessed via Python APIs, or linking to external dependencies not written in Python (including linking directly to C libraries shipped with a Python library)?
I, at least, am talking about the latter. For a concrete example: libpng, for instance, might be needed by PIL, wxPython, Matplotlib, and who knows what else. At this point, if you want to build a package of any of these, you need to statically link it into each of them, or distribute shared libs with each package -- if you are using them all together (which I do, anyway) you now have three copies of the same lib (but maybe different versions) all linked into your executable. Maybe there is no downside to that (I haven't had a problem yet), but it seems like a bad way to do it!
It's the latter I consider to be out of scope for a language specific packaging system
Maybe, but it's a problem to be solved, and the Linux distros more or less solve it for us, but OS-X and Windows have no such system built in (OS-X does have Brew and macports....)
Windows 10 has Chocolatey and OneGet: * https://chocolatey.org/ * http://blogs.msdn.com/b/garretts/archive/2015/01/27/oneget-and-the-windows-1... conda and nix then fill the niche for language independent packaging at the user level rather than the system level.
- Python packaging dependencies are designed to describe inter-component dependencies based on the Python import system, not dependencies based on the operating system provided C/C++ dynamic linking system.
I think there is a bit of fuzz here -- cPython, at least, uses the "the operating system provided C/C++ dynamic linking system" -- it's not a totally independent thing.
I'm specifically referring to the *declaration* of dependencies here. While CPython itself will use the dynamic linker to load extension modules found via the import system, the loading of further dynamically linked modules beyond that point is entirely opaque not only to the interpreter runtime at module import time, but also to pip at installation time.
If folks are after the latter, than they want a language independent package system, like conda, nix, or the system package manager in a Linux distribution.
And I am, indeed, focusing on conda lately for this reason -- but not all my users want to use a whole new system, they just want to "pip install" and have it work. And if you are using something like conda you don't need pip or wheels anyway!
Correct, just as if you're relying solely on Linux system packages, you don't need pip or wheels. Aside from the fact that conda is cross-platform, the main difference between the conda community and a Linux distro is in the *kind* of software we're likely to have already done the integration work for.

The key to understanding the difference in the respective roles of pip and conda is realising that there are *two* basic distribution scenarios that we want to be able to cover (I go into this in more detail in https://www.python.org/dev/peps/pep-0426/#development-distribution-and-deplo...):

* software developer/publisher -> software integrator/service operator (or data analyst)
* software developer/publisher -> software integrator -> service operator (or data analyst)

Note the second line has 3 groups and 2 distribution arrows, while the first line only has the 2 groups and a single distribution step.

pip and the other Python specific tools cover that initial developer/publisher -> integrator link for Python projects. This means that Python developers only need to learn a single publishing toolchain (the PyPA tooling) to get started, and they'll be able to publish their software in a format that any integrator that supports Python can consume (whether that's for direct consumption in a DIY integration scenario, or to put through a redistributor's integration processes).

On the consumption side, though, the nature of the PyPA tooling as a platform-independent software publication toolchain means that if you want to consume the PyPA formats directly, you need to be prepared to do your own integration work. Many public web service developers are entirely happy with that deal, but most system administrators and data analysts trying to deal with components written in multiple programming languages aren't.

That latter link, where the person or organisation handling the software integration task is distinct from the person or organisation running an operational service, or carrying out some data analysis, is where the language independent redistributor tools like Chocolatey, Nix, deb, rpm, conda, Docker, etc all come in - they let a redistributor handle the integration task (or at least some of it) on behalf of their users, leaving those users free to spend more of their time on problems that are unique to them, rather than having to duplicate the redistributor's integration work on their own time.

If you look at those pipelines from the service operator/data analyst end, then the *first* question to ask is "Is there a software integrator that targets the audience I am a member of?". If there is, then you're likely to have a better experience reusing their work, rather than spending time going on a DIY integration adventure. In those cases, the fact that the tooling you're using to consume software differs from what the original developers used to publish it *should* be a hidden implementation detail. When it isn't, it's either a sign that those of us in the "software integrator" role aren't meeting the needs of our audience adequately, or else it's a sign that that particular user made the wrong call in opting out of tackling the "DIY integration" task.
I'm arguing against supporting direct C level dependencies between packages that rely on dynamic linking to find each other rather than going through the Python import system,
Maybe there is a mid ground. For instance, I have a complex wrapper system around a bunch of C++ code. There are maybe 6 or 7 modules that all need to link against that C++ code. On OS-X (and I think Linux, I haven't been doing those builds), we can statically link all the C++ into one python module -- then, as long as that python module is imported before the others, they will all work, and all use that same already loaded version of that library.
(this doesn't work so nicely on Windows, unfortunately, so there, we build a dll, and have all the extensions link to it, then put the dll somewhere it gets found -- a little fuzzy on those details)
So option (1) for something like libpng is to have a compiled python module that is little more than something that links to libpng, so that it can be found and loaded by cPython on import, and any other modules can then expect it to be there (see the sketch after option (2) below). This is a big old kludge, but I think it could be done with little change to anything in Python or wheel, or... but it would require changes to how each package that uses that lib sets itself up and checks for and installs dependencies -- maybe not really possible. And it would be better if dependencies could be platform independent, which I'm not sure is supported now.
option (2) would be to extend python's import mechanism a bit to allow it to do a raw "link in this arbitrary lib" action, so the lib would not have to be wrapped in a python module -- I don't know how possible that is, or if it would be worth it.
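As a rough illustration of option (1), a minimal sketch of such a "carrier" module -- the names (py_libpng, libpng16.*) are hypothetical, and this is only the load-time half of the idea, not the build tooling:

    # py_libpng/__init__.py -- hypothetical package that ships libpng and
    # pre-loads it so extension modules imported later find it in-process.
    import ctypes
    import os

    _here = os.path.dirname(os.path.abspath(__file__))

    if os.name == "nt":
        # Windows consults already-loaded modules (by base name) before
        # searching disk, so loading the bundled DLL by absolute path is
        # enough for extensions that link against the same DLL name.
        _lib = ctypes.CDLL(os.path.join(_here, "libpng16.dll"))
    else:
        # On POSIX, RTLD_GLOBAL makes the library's symbols visible to
        # extension modules loaded later in the same process.
        _lib = ctypes.CDLL(os.path.join(_here, "libpng16.so"),
                           mode=ctypes.RTLD_GLOBAL)

A dependent extension then only needs "import py_libpng" to run before its own import -- the "imported before the others" condition described above.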
Your option 2 is specifically the kind of thing I don't want to support, as it's incredibly hard to do right (to the tune of "people will pay you millions of dollars a year to reduce-or-eliminate their ABI compatibility concerns"), and has the potential to replace the current you-need-to-be-able-build-this-from-source-yourself issue with "oh, look, now you have a runtime ABI incompatibility, have fun debugging that one, buddy".

Your option 1 seems somewhat more plausible, as I believe it should theoretically be possible to use the PyCObject/PyCapsule API (or even just normal Python objects) to pass the relevant shared library details from a "master" module that determines which versions of external libraries to link against, to other modules that always want to load them, in a way that ensures everything is linking against a version that it is ABI compatible with.

That would require someone to actually work on the necessary tooling to help with that though, as you wouldn't be able to rely on the implicit dynamic linking provided by C/C++ toolchains any more. Probably the best positioned to tackle that idea would be the Cython community, since they could generate all the required cross-platform boilerplate code automatically.
(Another way of looking at this: if a tool can manage the Python runtime in addition to Python modules, it's a full-blown arbitrary software distribution platform, not just a Python package manager).
sure, but if it's ALSO a Python package manager, then why not? i.e. conda -- if we all used conda, we wouldn't need pip+wheel.
conda's not a Python package manager, it's a language independent package manager that was born out of the Scientific Python community and includes Python as one of its supported languages, just like nix, deb, rpm, etc. That makes it an interesting alternative to pip on the package *consumption* side for data analysts, but it isn't currently a good fit for any of pip's other use cases (e.g. one of the scenarios I'm personally most interested in is that pip is now part of the Fedora/RHEL/CentOS build pipeline for Python based RPM packages - we universally recommend using "pip install" in the %install phase over using "setup.py install" directly)
Defining cross-platform ABIs (cf. http://bugs.python.org/issue23966)
This is a mess that you need to deal with for ANY binary package -- that's why we don't distribute binary wheels on pypi for Linux, yes?
Yes, the reason we don't do *nix packages on any platform other than Mac OS X is because the platform defines the CPython ABI along with everything else. It's a fair bit more manageable when we're just dealing with extension modules on Windows and Mac OS X, as we can anchor the ABI on the CPython interpreter ABI.
I'm firmly of the opinion that trying to solve both sets of problems with a single tool will produce a result that doesn't work as well for *either* use case as separate tools can.
I'm going to point to conda again -- it solves both problems, and it's better to use it for all your packages than mingling it with pip (though you CAN mingle it with pip...). So if we say "pip and friends are not going to do that", then we are saying: we don't support a substantial class of packages, and then I wonder what the point is to supporting binary packages at all?
Binary wheels already work for Python packages that have been developed with cross-platform maintainability and deployability taken into account as key design considerations (including pure Python wheels, where the binary format just serves as an installation accelerator). That category just happens to exclude almost all research and data analysis software, because it excludes the libraries at the bottom of that stack (not worrying too much about deployability concerns bought the Scientific Python stack a lot of functionality, but it *did* come at a price).

It's also the case that when you *are* doing your own system integration, wheels are a powerful tool for caching builds, since you can deal with ABI compatibility concerns through out of band mechanisms, such as standardising your build platform and your deployment platform on a single OS. If you both build and deploy on CentOS 6, then it doesn't matter that your wheel files may not work on CentOS 7, or Ubuntu, or Debian, or Cygwin, because you're not deploying them there, and if you switched platforms, you'd just redo your builds.
P.S. The ABI definition problem is at least somewhat manageable for Windows and Mac OS X desktop/laptop environments
Ah -- here is a key point -- because of that, we DO support binary packages on PyPI -- but only for Windows and OS-X. I'm just suggesting we find a way to extend that to packages that require a non-system non-python dependency.
At the point you're managing arbitrary external binary dependencies, you've lost all the constraints that let us get away with doing this for extension modules without adequate metadata, and are back to trying to solve the same arbitrary ABI problem that exists on Linux. This is multi-billion-dollar-operating-system-companies-struggle-to-get-this-right levels of difficulty that we're talking about here :)
but beyond those two, things get very messy, very fast - identifying CPU architectures, CPU operating modes and kernel syscall interfaces correctly is still a hard problem in the Linux distribution space
right -- but where I am confused is where the line is drawn -- it seems to me the line is REALLY drawn at "you need to compile some C (or Fortran, or ???) code", rather than at "you depend on another lib" -- the C code, whether it is a third party lib or part of your extension, still needs to be compiled to match the host platform.
The line is drawn at ABI compatibility management. We're able to fuzz that line a little bit in the case of Windows and Mac OS X extension modules because we have the python.org CPython releases to act as an anchor for the ABI definition. We don't have that at all on other *nix platforms, and we don't have it on Windows and Mac OS X either once we move beyond the CPython C ABI (which encompasses the underlying platform ABI). We *might* be able to get to the point of being able to describe platform ABIs well enough to allow public wheels for arbitrary platforms, but we haven't had any plausible sounding designs put forward for that as yet, and it still wouldn't allow depending on arbitrary external binaries (only the versions integrated with a given platform).
but the rise of aarch64 and IBM's creation of the OpenPOWER Foundation is making the data centre space interesting again, while in the mobile and embedded spaces it's ARM that is the default, with x86_64 attempting to make inroads.
Are those the targets for binary wheels? I don't think so.
Yes, they'll likely end up being one of Fedora's targets for prebuilt wheel files: https://fedoraproject.org/wiki/Env_and_Stacks/Projects/UserLevelPackageManag...
This is why
"statically link all the things" keeps coming back in various guises
but if you statically link, you need to build the static package right anyway -- so it doesn't actually solve the problem at hand anyway.
Yes it does - you just need to make sure your build environment suitably matches your target deployment environment. "Publishing on PyPI" is only one of the use cases for wheel files, and it isn't relevant to any of my own personal use cases (which all involve a PyPI independent build system, with PyPI used solely as a source of sdist archives).
The only solution that is known to work reliably for dynamic linking is to have a curated set of packages all built by the same build system, so you know they're using consistent build settings. Linux distributions provide this, as do multi-OS platforms like nix and conda. We *might* be able to provide it for Python someday if PyPI ever gets an integrated build farm, but that's still a big "if" at this point.
Ah -- here is the issue -- but I think we HAVE pretty much got what we need here -- at least for Windows and OS-X. It depends what you mean by "curated", but it seems we have a (de facto?) policy for PyPI: binary wheels should be compatible with the python.org builds. So while each package wheel is supplied by the package maintainer one way or another, rather than by a central entity, it is more or less curated -- or at least standardized. And if you are going to put a binary wheel up, you need to make sure it matches -- and that is less than trivial for packages that require a third party dependency -- but building the lib statically and then linking it in is not inherently easier than doing a dynamic link.
OK -- I just remembered the missing link for doing what I proposed above for third party dynamic libs: at this point dependencies are tied to a particular package -- whereas my plan above would require a dependency tied to a particular wheel, not the package as a whole. i.e.:
my mythical matplotlib wheel on OS-X would depend on a py_libpng module -- which could be provided as separate binary wheel. but matplotlib in general would not have that dependency -- for instance, on Linux, folks would want it to build against the system lib, and not have another dependency. Even on OS-X, homebrew users would want it to build against the homebrew lib, etc...
So what would be good is a way to specify a "this build" dependency. That can be hacked in, of course, but nicer not to have to.
By the time you've solved all these problems I believe you'll find you have reinvented conda ;) Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Sat, May 16, 2015 at 10:12 AM, Nick Coghlan
Maybe, but it's a problem to be solved, and the Linux distros more or less solve it for us, but OS-X and Windows have no such system built in (OS-X does have Brew and macports....)
Windows 10 has Chocalatey and OneGet:
* https://chocolatey.org/ * http://blogs.msdn.com/b/garretts/archive/2015/01/27/oneget-and-the-windows-1...
cool -- though I don't think we want the "official" python to depend on a third party system, and OneGet won't be available for most users for a LONG time... The fact that OS-X users have to choose between fink, macports, homebrew or roll-your-own is a MAJOR source of pain for supporting the OS-X community. "More than one way to do it" is not the goal.

conda and nix then fill the niche for language independent packaging at the user level rather than the system level.
yup -- conda is, indeed, pretty cool.
I think there is a bit of fuzz here -- cPython, at least, uses the "the operating system provided C/C++ dynamic linking system" -- it's not a totally independent thing.
I'm specifically referring to the *declaration* of dependencies here.
sure -- that's my point about the current "missing link" -- setuptools, pip, etc. can only declare python-package-level dependencies, not binary-level dependencies. My idea is to bundle up a shared lib in a python package -- then, if you declare a dependency on that package, you've handled the dep issue. The trick is that a particular binary wheel depends on that other binary wheel -- rather than the whole package depending on it. (that is, on linux, it would have no dependency; on OS-X it would -- but then only the wheel built for a non-macports build, etc....). I think we could hack around this by monkey-patching the wheel after it is built, so it may be worth playing with to see how it works before proposing any changes to the ecosystem.
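A very rough sketch of that monkey-patching experiment -- the package names and wheel filename are made up, and a well-formed wheel would also need the RECORD entry for METADATA regenerated, which is skipped here:

    import os
    import shutil
    import tempfile
    import zipfile

    def add_requires(wheel_path, requirement):
        # Unpack the wheel (it is just a zip archive)...
        tmpdir = tempfile.mkdtemp()
        with zipfile.ZipFile(wheel_path) as zf:
            zf.extractall(tmpdir)
        # ...append an extra Requires-Dist line to .dist-info/METADATA...
        for name in os.listdir(tmpdir):
            if name.endswith(".dist-info"):
                meta = os.path.join(tmpdir, name, "METADATA")
                with open(meta, "a") as f:
                    f.write("Requires-Dist: %s\n" % requirement)
        # ...and zip it back up under the original wheel name.
        rebuilt = shutil.make_archive(wheel_path[:-4], "zip", tmpdir)
        os.remove(wheel_path)
        os.rename(rebuilt, wheel_path)
        shutil.rmtree(tmpdir)

    # hypothetical: make this particular OS-X build depend on a lib-carrier wheel
    add_requires("matplotlib-1.4.3-cp34-none-macosx_10_6_intel.whl", "py-libpng")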
And if you are using something like conda you don't need pip
or wheels anyway!
Correct, just as if you're relying solely on Linux system packages, you don't need pip or wheels. Aside from the fact that conda is cross-platform, the main difference between the conda community and a Linux distro is in the *kind* of software we're likely to have already done the integration work for.
sure. but the cross-platform thing is BIG -- we NEED pip and wheel because rpm, or deb, or ... are all platform and distro dependent -- we want a way for package maintainers to support a broad audience without having to deal with 12 different package systems.

The key to understanding the difference in the respective roles of pip and conda is realising that there are *two* basic distribution scenarios that we want to be able to cover (I go into this in more detail in https://www.python.org/dev/peps/pep-0426/#development-distribution-and-deplo...):
hmm -- sure, they are different, but is it impossible to support both with one system?
* software developer/publisher -> software integrator/service operator (or data analyst) * software developer/publisher -> software integrator -> service operator (or data analyst)
...
On the consumption side, though, the nature of the PyPA tooling as a platform-independent software publication toolchain means that if you want to consume the PyPA formats directly, you need to be prepared to do your own integration work.
Exactly! and while Linux system admins can do their own system integration work, everyday users (and many Windows sys admins) can't, and we shouldn't expect them to. And, in fact, the PyPA tooling does support the more casual user much of the time -- for example, I'm in the third quarter of a Python certification class -- Intro, Web development, Advanced topics -- and only half way through the third class have I run into any problems with sticking with the PyPA tools (except for pychecker -- not being on PyPI :-( ).

Many public web service developers are entirely happy with that deal, but most system administrators and data analysts trying to deal with components written in multiple programming languages aren't.
exactly -- but it's not because the audience is different in their role -- it's because different users need different python packages. The PyPA tools support pure-python great -- and compiled extensions without deps pretty well -- but there is a bit of a gap with extensions that require other deps. It's a 90% (95%) solution... It'd be nice to get it to a 99% solution.

Where it really gets ugly is where you need stuff that has nothing to do with python -- say a Julia run-time, or ... Anaconda is there to support that: their philosophy is that if you are trying to do full-on data analysis with python, you are likely to need stuff strictly beyond the python ecosystem -- your own Fortran code, numpy (which requires LLVM), etc. Maybe they are right -- but there is still a heck of a lot of stuff that you can do and stay within python, and it would be good if it was easier for web developers to use a bit of numpy, or matplotlib, or pandas in their web apps -- without having to jump to the "scipy stack" ecosystem (which does not support the web dev stuff that well yet...)

If you look at those pipelines from the service operator/data analyst end, then the *first* question to ask is "Is there a software integrator that targets the audience I am a member of?".
I think that's part of my point here -- I bridge two communities -- the scientific community says: just use Anaconda or Canopy or ...., but the web developer community says "use python.org, pip, and pypi". If you need to both, there is a gap.
When it isn't, it's either a sign that those of us in the "software integrator" role aren't meeting the needs of our audience adequately,
sure -- but where does PyPA fit in here -- having binary wheels and PyPI puts us in the role of integrator -- and we aren't meeting the needs of as broad an audience as we could -- that's my point there. If we didn't want to be an "integrator", we could have not built PyPI, or pip, or wheel.... conda, rpm, macports, etc. don't need those. I think PyPA tools could meet a broader need with not much fudging. In some sense, the only question I have at this point is whether there is a compelling reason to better support dynamic libs -- if not, then, as Paul pointed out, all we need is a more coordinated community effort (not easy, but not a tooling question)
option (2) would be to extend python's import mechanism a bit to allow it to do a raw "link in this arbitrary lib" action, so the lib would not have to be wrapped in a python module -- I don't know how possible that is, or if it would be worth it.
Your option 2 is specifically the kind of thing I don't want to support, as it's incredibly hard to do right (to the tune of "people will pay you millions of dollars a year to reduce-or-eliminate their ABI compatibility concerns"), and has the potential to replace the current you-need-to-be-able-build-this-from-source-yourself issue with "oh, look, now you have a runtime ABI incompatibility, have fun debugging that one, buddy".
fair enough -- that could be a pretty ugly nightmare.

Your option 1 seems somewhat more plausible, as I believe it should theoretically be possible to use the PyCObject/PyCapsule API (or even just normal Python objects) to pass the relevant shared library details from a "master" module that determines which versions of external libraries to link against, to other modules that always want to load them, in a way that ensures everything is linking against a version that it is ABI compatible with. That would require someone to actually work on the necessary tooling to help with that though, as you wouldn't be able to rely on the implicit dynamic linking provided by C/C++ toolchains any more. Probably the best positioned to tackle that idea would be the Cython community, since they could generate all the required cross-platform boilerplate code automatically.
good idea -- I'm tied in with those folks -- if I have to do any C stuff I turn to Cython already...
sure, but if it's ALSO a Python package manager, then why not? i.e. conda -- if we all used conda, we wouldn't need pip+wheel.
conda's not a Python package manager, it's a language independent package manager that was born out of the Scientific Python community and includes Python as one of its supported languages, just like nix, deb, rpm, etc.
indeed -- but it does have a bunch of python-specific features....it was built around the need to combine python with other systems.

That makes it an interesting alternative to pip on the package *consumption* side for data analysts, but it isn't currently a good fit for any of pip's other use cases (e.g. one of the scenarios I'm personally most interested in is that pip is now part of the Fedora/RHEL/CentOS build pipeline for Python based RPM packages - we universally recommend using "pip install" in the %install phase over using "setup.py install" directly)
hmm -- conda generally uses "setup.py install" in its build scripts. And it doesn't use pip install because it wants to handle the downloading and dependencies itself (in fact, turning OFF setuptools dependency handling is an annoyance..) So I'm not sure why pip is needed here -- would it be THAT much harder to build rpms of python packages if it didn't exist? (I do see why you wouldn't want to use conda to build rpms..) But while _maybe_ if conda had been around 5 years earlier we could have not bothered with wheel, I'm not proposing that we drop it -- just that we push pip and wheel a bit farther to broaden the supported user-base.
Binary wheels already work for Python packages that have been developed with cross-platform maintainability and deployability taken into account as key design considerations (including pure Python wheels, where the binary format just serves as an installation accelerator). That category just happens to exclude almost all research and data analysis software, because it excludes the libraries at the bottom of that stack
It doesn't quite exclude those -- just makes it harder. And while depending on Fortran, etc, is pretty unique to the data analysis stack, stuff like libpng, libcurl, etc, etc, isn't -- non-system libs are not a rare thing.
It's also the case that when you *are* doing your own system integration, wheels are a powerful tool for caching builds,
conda does this nicely as well :-) I'm not trying to argue, at all, that binary wheels are useless, just that they could be a bit more useful.
Ah -- here is a key point -- because of that, we DO support binary packages on PyPI -- but only for Windows and OS-X. I'm just suggesting we find a way to extend that to packages that require a non-system non-python dependency.
At the point you're managing arbitrary external binary dependencies, you've lost all the constraints that let us get away with doing this for extension modules without adequate metadata, and are back to trying to solve the same arbitrary ABI problem that exists on Linux.
I still don't get that -- any binary extension needs to match the ABI of the python it is used with -- a shared lib is the same problem.

The line is drawn at ABI compatibility management. We're able to fuzz that line a little bit in the case of Windows and Mac OS X extension modules because we have the python.org CPython releases to act as an anchor for the ABI definition. We don't have that at all on other *nix platforms, and we don't have it on Windows and Mac OS X either once we move beyond the CPython C ABI (which encompasses the underlying platform ABI)
Showing my ignorance here -- what else is there we want to support (fortran ABI maybe?)

We *might* be able to get to the point of being able to describe platform ABIs well enough to allow public wheels for arbitrary platforms,
That would be cool -- but not what I'm talking about here. I'm only talking about the ABIs we already describe.
Are those the targets for binary wheels? I don't think so.
Yes, they'll likely end up being one of Fedora's targets for prebuilt wheel files: https://fedoraproject.org/wiki/Env_and_Stacks/Projects/UserLevelPackageManag...
cool -- but it is Fedora that will be building those wheels -- so a systems integrator.
but if you statically link, you need to build the static package right anyway -- so it doesn't actually solve the problem at hand anyway.
Yes it does - you just need to make sure your build environment suitably matches your target deployment environment.
"just?" -- that's can actually be a major pain -- at least on OS-X.
So what would be good is a way to specify a "this build" dependency. That can be hacked in, of course, but nicer not to have to.
By the time you've solved all these problems I believe you'll find you have reinvented conda ;)
I really do have less lofty goals than that, but yes -- no point in going down that route!

Anyway -- I've taken a lot of my time (and a bunch of others' on this list). And where I think we are at is:

* No one else seems to think it's worth trying to extend the PyPA ecosystem a bit more to better support dynamic libs. (except _maybe_ Enthought?)
* I still think it can be done with minimal changes, and hacked in to do the proof of concept
* But I'm not sure it's something that's going to get to the top of my ToDo list anyway -- I can get my needs met with conda anyway. My real production work is deep in the SciPy stack.
* So I may or may not move my ideas forward -- if I do, I'll be back with questions and maybe a more concrete proposal some day....

But I learned a lot from this conversation -- thanks!
-Chris
On 17 May 2015 06:19, "Chris Barker"
indeed -- but it does have a bunch of python-specific features....it was built around the need to combine python with other systems.
That makes it an interesting alternative to pip on the package *consumption* side for data analysts, but it isn't currently a good fit for any of pip's other use cases (e.g. one of the scenarios I'm personally most interested in is that pip is now part of the Fedora/RHEL/CentOS build pipeline for Python based RPM packages - we universally recommend using "pip install" in the %install phase over using "setup.py install" directly)
hmm -- conda generally uses "setup.py install" in its build scripts. And it doesn't use pip install because it wants to handle the downloading and dependencies itself (in fact, turning OFF setuptools dependency handling is an annoyance..)
So I'm not sure why pip is needed here -- would it be THAT much harder to build rpms of python packages if it didn't exist? (I do see why you wouldn't want to use conda to build rpms..)
We switched to recommending pip to ensure that the Fedora (et al) build toolchain can be updated to emit & handle newer Python metadata standards just by upgrading pip. For example, it means that system installed packages on modern Fedora installations should (at least in theory) provide full PEP 376 installation metadata with the installer reported as the system package manager. The conda folks (wastefully, in my view) are still attempting to compete directly with pip upstream, instead of delegating to it from their build scripts as an abstraction layer that helps hide the complexity of the Python packaging ecosystem.
But while _maybe_ if conda had been around 5 years earlier we could have not bothered with wheel,
No, we couldn't, as conda doesn't work as well for system integrators.
I'm not proposing that we drop it -- just that we push pip and wheel a bit farther to broaden the supported user-base.
I can't stop you working on something I consider a deep rabbithole, but why not just recommend the use of conda, and only publish sdists on PyPI? conda needs more users and contributors seeking better integration with the PyPA tooling, and minimising the non-productive competition. The web development folks targeting Linux will generally be in a position to build from source (caching the resulting wheel file, or perhaps an entire container image). Also, assuming Fedora's experiment with language specific repos goes well ( https://fedoraproject.org/wiki/Env_and_Stacks/Projects/LanguageSpecificRepos...), we may see other distros replicating that model of handling the wheel creation task on behalf of their users. It's also worth noting that one of my key intended use cases for metadata extensions is to publish platform specific external dependencies in the upstream project metadata, which would get us one step closer to fully automated repackaging into policy compliant redistributor packages.
Binary wheels already work for Python packages that have been developed with cross-platform maintainability and deployability taken into account as key design considerations (including pure Python wheels, where the binary format just serves as an installation accelerator). That category just happens to exclude almost all research and data analysis software, because it excludes the libraries at the bottom of that stack
It doesn't quite exclude those -- just makes it harder. And while depending on Fortran, etc, is pretty unique to the data analysis stack, stuff like libpng, libcurl, etc, etc, isn't -- non-system libs are not a rare thing.
The rare thing is having two packages which are tightly coupled to the ABI of a given external dependency. That's a generally bad idea because it causes exactly these kinds of problems with independent distribution of prebuilt components. The existence of tight ABI coupling between components both gives the scientific Python stack a lot of its power, *and* makes it almost as hard to distribute in binary form as native GUI applications.
It's also the case that when you *are* doing your own system integration, wheels are a powerful tool for caching builds,
conda does this nicely as well :-) I'm not trying to argue, at all, that binary wheels are useless, just that they could be a bit more useful.
Ah -- here is a key point -- because of that, we DO support binary packages on PyPI -- but only for Windows and OS-X. I'm just suggesting we find a way to extend that to packages that require a non-system non-python dependency.
At the point you're managing arbitrary external binary dependencies, you've lost all the constraints that let us get away with doing this for extension modules without adequate metadata, and are back to trying to solve the same arbitrary ABI problem that exists on Linux.
I still don't get that -- any binary extension needs to match the ABI of the python it is used with -- a shared lib is the same problem.

A PEP 426 metadata extension proposal for describing external binary dependencies would certainly be a welcome addition. That's going to be a common need for automated repackaging tools, even if we never find a practical way to take advantage of it upstream.

No, it's not, because we don't have python.org defining the binary ABI for anything except CPython itself. That's what constrains the target ABIs for extension modules on Windows & Mac OS X to a feasible number evolving at a feasible rate. A large part of what *defines* a platform is making decisions about the ABI to publish & target. Linux distros, nix, conda do that for everything they redistribute. I assume Chocolatey does as well (I'm not sure if the Mac systems do prebuilt binaries)
The line is drawn at ABI compatibility management. We're able to fuzz that line a little bit in the case of Windows and Mac OS X extension modules because we have the python.org CPython releases to act as an anchor for the ABI definition.
We don't have that at all on other *nix platforms, and we don't have it on Windows and Mac OS X either once we move beyond the CPython C ABI (which encompasses the underlying platform ABI)
Showing my ignorance here -- what else is there we want to support (fortran ABI maybe?)
Every single external binary dependency where the memory layout may be exposed to other extension modules when loaded into the same process becomes an ABI compatibility concern, with structs changing sizes so they don't fit in the allocated space any more being one of the most common issues. The stable ABI PEP has some additional background on that: https://www.python.org/dev/peps/pep-0384/
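A toy illustration of that failure mode, with ctypes layouts standing in for two builds of a C library (the Point structs are invented for the example):

    import ctypes

    class PointV1(ctypes.Structure):
        # layout an extension module was compiled against
        _fields_ = [("x", ctypes.c_double), ("y", ctypes.c_double)]

    class PointV2(ctypes.Structure):
        # layout a newer build of the library actually uses
        _fields_ = [("x", ctypes.c_double), ("y", ctypes.c_double),
                    ("z", ctypes.c_double)]

    print(ctypes.sizeof(PointV1), ctypes.sizeof(PointV2))  # 16 24
    # A library writing a V2 struct into V1-sized storage scribbles past
    # the end of the allocation -- the "structs changing sizes so they
    # don't fit in the allocated space any more" problem.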
We *might* be able to get to the point of being able to describe platform ABIs well enough to allow public wheels for arbitrary platforms,
That would be cool -- but not what I'm talking about here. I'm only talking about the ABIs we already describe.
Except you're not, as we don't currently describe the ABIs for arbitrary external dependencies at all.

Anyway -- I've taken a lot of my time (and a bunch of others' on this list). And where I think we are at is:

* No one else seems to think it's worth trying to extend the PyPA ecosystem a bit more to better support dynamic libs. (except _maybe_ Enthought?)

I know Donald is keen to see this, and a lot of ideas become more feasible if(/when?) PyPI gets an integrated wheel build farm. At that point, we can use the "centre of gravity" approach by letting the build farm implicitly determine the "standard" version of particular external dependencies, even if we can't communicate those versions effectively in the metadata. I'm also interested in metadata extensions to describe external binary dependencies, but I'm not sure that can be done sensibly in a platform independent way (in which case, it would be better to let various communities define their own extensions).

* I still think it can be done with minimal changes, and hacked in to do the proof of concept

I'm still not clear on what "it" is. I've been pointing out how hard it is to do this right in the general case, but I get the impression you're actually more interested in the narrower case of defining a "SciPy ABI" that encompasses selected third party binary dependencies. That's a more attainable goal, but NumFocus or the SciPy community would likely be a better audience for that discussion than distutils-sig.

* But I'm not sure it's something that's going to get to the top of my ToDo list anyway -- I can get my needs met with conda anyway. My real production work is deep in the SciPy stack.
* So I may or may not move my ideas forward -- if I do, I'll be back with questions and maybe a more concrete proposal some day....
If I'm correct that your underlying notion is "It would be nice if there was an agreed SciPy ABI we could all build wheels against, and a way of publishing the external dependency manifest in the wheel metadata so it could be checked for consistency at installation time", that's a far more tractable problem than the "arbitrary binary dependencies" one. Attempting to solve the latter is what I believe leads to reinventing something functionally equivalent to conda, while I expect attempting to solve the former is likely to be more of a political battle than a technical one :) Cheers, Nick.
On 17 May 2015 at 04:48, Nick Coghlan
A large part of what *defines* a platform is making decisions about the ABI to publish & target. Linux distros, nix, conda do that for everything they redistribute. I assume chocolatey does as well
I'm picking on this because it seems to be a common misconception about what Chocolatey provides on Windows. As far as I understand, Chocolatey does *not* provide a "platform" in this sense at all. The installers hosted by Chocolatey are typically nothing more than repackaged upstream installers (or maybe just scripting around downloading and running upstream installers directly), with a nice command line means of discovering and installing them.
From the Chocolatey FAQ:
""" What does Chocolatey do? Are you redistributing software? Chocolatey does the same thing that you would do based on the package instructions. This usually means going out and downloading an installer from the official distribution point and then silently installing it on your machine. With most packages this means Chocolatey is not redistributing software because they are going to the same distribution point that you yourself would go get the software if you were performing this process manually. """ So AIUI, for example, if you install Python with Chocolatey, it just downloads and runs the python.org installer behind the scenes. Also, Chocolatey explicitly doesn't handle libraries - it is an application installer only. So there's no dependency management, or sharing of libraries beyond that which application installers do natively. Disclaimer: I haven't used Chocolatey much, except for some experimenting. This is precisely *because* it doesn't add much beyond application installs, which I'm pretty much happy handling myself. But it does mean that I could have missed some aspects of what it provides. Paul
Trying to keep this brief, because the odds of my finding time to do much with this are slim..
I'm not proposing that we drop it -- just that we push pip and wheel a bit farther to broaden the supported user-base.
I can't stop you working on something I consider a deep rabbithole,
no -- but I do appreciate your assessment of how deep that hole is -- you certainly have a whole lot more background with all this than I do -- I could well be being very naive here.
but why not just recommend the use of conda, and only pubish sdists on PyPI? conda needs more users and contributors seeking better integration with the PyPA tooling, and minimising the non-productive competition.
I essentially wear two hats here:

1) I produce software built on top of the scientific python stack, and I want my users to have an easy experience with installing and running my code. For that -- I am going the conda route. I'm not there yet, but close to being able to say:
a) install Anaconda
b) add my binstar channel to your conda environment
c) conda install my_package

The complication here is that we also have a web front end for our computational code, and it makes heavy use of all sorts of web-oriented packages that are not supported by Anaconda or, for the most part, the conda community (binstar). My solution is to make conda packages myself of those and put them in my binstar channel. The other option is to pip install those packages, but then you get pretty tangled up in dependencies and conda environments vs. virtual environments, etc...

2) Hat two is an instructor for the University of Washington Continuing Education Program's Python Certification. In that program, we do very little with the scipy stack, but have an entire course on web development. And the instructor of that class, quite rightly, pushes the standard of practice for web developers: heavy use of virtualenv and pip.

Oh, and hat (3) is a long time pythonista, who, among other things, has been working for years to make python easier to use on the Mac for folks that don't know or care what the unix command line is....

I guess the key thing here for me is that I don't see pushing conda to budding web developers -- but what if web developers have the need for a bit of the scipy stack? or??? We really don't have a good solution for those folks.
The web development folks targeting Linux will generally be in a position to build from source (caching the resulting wheel file, or perhaps an entire container image).
again, I'm not concerned about linux -- it's an ABI nightmare, so we really don't want to go there, and its users are generally more "sophisticated" -- a little building is not a big deal.
It's also worth noting that one of my key intended use cases for metadata extensions is to publish platform specific external dependencies in the upstream project metadata, which would get us one step closer to fully automated repackaging into policy compliant redistributor packages.
Honestly, I don't follow this! -- but I'll keep an eye out for it - sounds useful.
The existence of tight ABI coupling between components both gives the scientific Python stack a lot of its power, *and* makes it almost as hard to distribute in binary form as native GUI applications.
I think harder, actually :-)
* No one else seems to think it's worth trying to extend the PyPa ecosystem a bit more to better support dynamic libs. (except _maybe_ Enthought?)
I know Donald is keen to see this, and a lot of ideas become more feasible if(/when?) PyPI gets an integrated wheel build farm. At that point, we can use the "centre of gravity" approach by letting the build farm implicitly determine the "standard" version of particular external dependencies, even if we can't communicate those versions effectively in the metadata.
That's more what I'm thinking, yes.

* I still think it can be done with minimal changes, and hacked in to do the proof of concept

I'm still not clear on what "it" is. I've been pointing out how hard it is to do this right in the general case, but I get the impression you're actually more interested in the narrower case of defining a "SciPy ABI" that encompasses selected third party binary dependencies.

I wouldn't say SciPy ABI -- that, in a way, is already being done -- folks are coordinating the "official" binaries of at least the core "scipy stack" -- it's a pain -- no Windows wheels for numpy, for instance (though I think they are close). My interest is actually taking it beyond that -- honestly, in my case there are only a handful of libs that I'm aware of that get common use, for instance libfreetype and libpng in wxPython, PIL, matplotlib, etc. If I were only SciPy focused -- conda would be the way to go. That's part of the problem I see -- there are split communities, but they DO overlap; I think it's a disservice to punt these issues off to individual sub-communities to address on their own.

* But I'm not sure it's something that's going to get to the top of my ToDo list anyway -- I can get my needs met with conda anyway. My real production work is deep in the SciPy stack.
* So I may or may not move my ideas forward -- if I do, I'll be back with questions and maybe a more concrete proposal some day....
If I'm correct that your underlying notion is "It would be nice if there was an agreed SciPy ABI we could all build wheels against, and a way of publishing the external dependency manifest in the wheel metadata so it could be checked for consistency at installation time", that's a far more tractable problem than the "arbitrary binary dependencies" one.
What I'm talking about is in-between -- not just SciPy, but not "arbitrary binary dependencies" either. But I think the trick is that the dependency really is binary: i.e. wheelA depends on this particular wheel -- not this version of a lib, but this BUILD of a lib. And what I envision is that "this build of a lib" would be, for instance, libnnnx.ym "built to be compatible with the python.org 64 bit Windows build of python 3.4".

Which is what we need to do now anyway -- if you are going to deliver a build of PIL (pillow), for instance, you need to deliver the libs it needs, built to match the python it's built for, whether you statically link or dump the dll in with the extension. All I'm suggesting is that we have a way of letting others use that same lib build -- this would be more a social thing than anything else.

while I expect attempting to solve the former is likely to be more of a political battle than a technical one :)

yes -- I think this is a social / political issue as much as anything else.
-Chris
On 17 May 2015 at 23:50, Chris Barker
I guess the key thing here for me is that I don't see pushing conda to budding web developers -- but what if web developers have the need for a bit of the scipy stack? or???
We really don't have a good solution for those folks.
Agreed. My personal use case is as a general programmer (mostly sysadmin and automation type of work) with some strong interest in business data analysis and a side interest in stats. For that sort of scenario, some of the scipy stack (specifically matplotlib and pandas and their dependencies) is really useful. But conda is *not* what I'd use for day to day work, so being able to install via pip is important to me. It should be noted that installing via pip *is* possible - via some of the relevant projects having published wheels, and the rest being available via Christoph Gohlke's site either as wheels or as wininsts that I can convert. But that's not a seamless process, so it's not something I'd be too happy explaining to a colleague should I want to share the workload for that type of thing. Paul
This pertains more to the other thread I started, but I'm sort of becoming convinced--especially by Paul Moore's suggestion there--that the better approach is to grow conda (the tool) rather than shoehorn conda packages into pip. Getting pip to recognize the archive format of conda would be easy enough alone, but that really doesn't cover the fact that 'conda ~= pip+virtualenv', and pip alone simply should not try to grow that latter aspect itself. Plus pip is not going to be fully language agnostic, for various reasons, including the fact that apt-get and yum and homebrew and ports already exist.

So it might make sense to actually allow folks to push conda to budding web developers, if conda allowed installation (and environment management) of sdist packages on PyPI. So perhaps it would be good if *this* worked:

% pip install conda
% conda install scientific_stuff
% conda install --sdist django_widget # we know to look on PyPI

Maybe that flag is mis-named, or could be omitted altogether. But there's no conceptual reason that conda couldn't build an sdist fetched from PyPI into a platform specific binary matching the current user machine (and do all the metadata dependency and environment stuff the conda tool does).

On Mon, May 18, 2015 at 3:17 AM, Paul Moore
-- The dead increasingly dominate and strangle both the living and the not-yet born. Vampiric capital and undead corporate persons abuse the lives and control the thoughts of homo faber. Ideas, once born, become abortifacients against new conceptions.
On 18 May 2015 at 18:50, David Mertz
% pip install conda
% conda install scientific_stuff
% conda install --sdist django_widget  # we know to look on PyPI
But that doesn't give (Windows, mainly) users a solution for things that need a C compiler, but aren't provided as conda packages. My honest view is that unless conda is intending to replace pip and wheel totally, you cannot assume that people will be happy to use conda alongside pip (or indeed, use any pair of independent packaging tools together - people typically want one unified solution). And if the scientific community stops working towards providing wheels for people without compilers "because you can use conda", there is going to be a proportion of the Python community that will lose out on some great tools as a result. Paul
I don't really see any reason conda couldn't support bdist wheels also. But yes, basically the idea is that we'd like users to be able to rely entirely on conda as their packaging (and environment configuration) system if they choose to. It may be impolitic to say so, but I think conda can and should replace pip for a large class of users. That is, it should be possible for users to use pip exactly once (as in the line I show above), and use conda forever thereafter. Since conda does a lot more (programming language independence, environments), perhaps it really does make a lot more sense for conda to be "one package manager to rule them all" much more than trying to make a pip that does so. But y'know, the truth is I'm trying to figure out the best path here. I want to get better interoperability between conda packages and the rest of the Python ecosystem, but there are stakeholders involved both in the distutils community and within Continuum (where I now work).
-- The dead increasingly dominate and strangle both the living and the not-yet born. Vampiric capital and undead corporate persons abuse the lives and control the thoughts of homo faber. Ideas, once born, become abortifacients against new conceptions.
On Mon, May 18, 2015 at 11:21 AM, Paul Moore
On 18 May 2015 at 18:50, David Mertz
wrote:
% pip install conda
% conda install scientific_stuff
% conda install --sdist django_widget  # we know to look on PyPI
But that doesn't give (Windows, mainly) users a solution for things that need a C compiler, but aren't provided as conda packages.
Conda provides (or can provide) a C compiler (some versions of gcc). It was buggy last time I checked, but it's doable.
My honest view is that unless conda is intending to replace pip and wheel totally, you cannot assume that people will be happy to use conda alongside pip (or indeed, use any pair of independent packaging tools together - people typically want one unified solution). And if the scientific community stops working towards providing wheels for people without compilers "because you can use conda", there is going to be a proportion of the Python community that will lose out on some great tools as a result.
Exactly -- this idea that there are two (or more) non-overlapping communities is pretty destructive. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov
On 19 May 2015 at 09:43, Chris Barker
On Mon, May 18, 2015 at 11:21 AM, Paul Moore
wrote: My honest view is that unless conda is intending to replace pip and wheel totally, you cannot assume that people will be happy to use conda alongside pip (or indeed, use any pair of independent packaging tools together - people typically want one unified solution). And if the scientific community stops working towards providing wheels for people without compilers "because you can use conda", there is going to be a proportion of the Python community that will lose out on some great tools as a result.
Exactly -- this idea that there are two (or more) non-overlapping communities is pretty destructive.
There's a cornucopia of *overlapping* communities. We only rarely hear from system administrators upstream, for example, as they tend to be mainly invested in particular operating system or configuration management communities, leaving upstream mostly to developers and data analysts. For these admins, a package management system is only going to be potentially interesting if it is supported by their operating system or configuration management tool of choice (e.g. http://docs.ansible.com/list_of_packaging_modules.html for Ansible, or some of the options linked from Salt's package management abstraction layer: http://docs.saltstack.com/en/latest/ref/states/all/salt.states.pkg.html)

This is why I'm such a big fan of richer upstream metadata with automated conversion to downstream formats as my preferred long term solution - this isn't a "pip vs conda" story, it's "pip vs conda vs yum vs apt vs MSI vs nix vs zypper vs zc.buildout vs enstaller vs PyPM vs ....". (in addition to the modules listed for Ansible and Salt, I discovered yet another one today: https://labix.org/smart)

The main differences I see with conda relative to the other downstream package management systems is that it happened to be made by folks that are also heavily involved in development of Python based data analysis tools, and that some of its proponents want it to be the "one package management tool to rule them all". I consider the latter proposal to be as outlandish an idea as believing the world only needs one programming language - just as with programming languages, packaging system design involves making trade-offs between different priorities, so you can't optimise for everything at once. conda's an excellent cross-platform end user focused dependency management system. This is a good thing, but it does mean conda isn't a suitable candidate for use as an input format for other tools that compete with it.

As far as the "we could use a better dynamic linking story for Windows and Mac OS X" story goes, now that I understand the general *nix case is considered out of scope for the situations Chris is interested in, I think there's a reasonable case to be made for being able to *bundle* redistributable dynamically linked libraries with a wheel file, and for the build process of *other* wheel files to be able to rely on those bundled external libraries. I originally thought the request was about being able to *describe* the external dependencies in sufficient detail that the general case on *nix could be handled, or that an appropriate Windows or Mac OS X binary could be obtained out of band, rather than by being bundled with the relevant wheel file.

Getting a bundling based model to work reliably is still going to be difficult (and definitely more complicated than static linking in cases where data sharing isn't needed), but it's not intractable the way the general case is.

Regards, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
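[Nick's "richer upstream metadata with automated conversion" point can be made concrete with a minimal sketch, assuming only the stdlib and a locally available wheel; the filename and the "python-<Name>" field mapping are illustrative, not any real distro's convention:

    import zipfile
    from email.parser import Parser

    def wheel_metadata(path):
        # wheel METADATA is an email-style header block inside the zip
        with zipfile.ZipFile(path) as whl:
            meta_name = next(n for n in whl.namelist()
                             if n.endswith('.dist-info/METADATA'))
            return Parser().parsestr(whl.read(meta_name).decode('utf-8'))

    # emit a trivial downstream (RPM-ish) stub from the upstream metadata
    meta = wheel_metadata('example-1.0-py2.py3-none-any.whl')
    print("Name: python-%s" % meta['Name'])
    print("Version: %s" % meta['Version'])
    print("Summary: %s" % meta['Summary'])

The richer the upstream metadata gets (external dependencies included), the more of a real downstream spec file a converter along these lines could generate.]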
On Wed, May 20, 2015 at 12:57 AM, Nick Coghlan
This is why I'm such a big fan of richer upstream metadata with automated conversion to downstream formats as my preferred long term solution - this isn't a "pip vs conda" story, it's "pip vs conda vs yum vs apt vs MSI vs nix vs zypper vs zc.buildout vs enstaller vs PyPM vs ....".
hopefully not "versus", but "working with" ;-) -- but very good point. If python can do things to make it easier for all these broader systems, that's a "good thing"
The main differences I see with conda relative to the other downstream package management systems is that it happened to be made by folks that are also heavily involved in development of Python based data analysis tools,
Which is to say Python itself.
and that some of its proponents want it to be the "one package management tool to rule them all".
I don't know about that -- though another key point is that it is cross platform (platform independent) -- it may be the only one that does that part well.
I consider the latter proposal to be as outlandish an idea as believing the world only needs one programming language - just as with programming languages, packaging system design involves making trade-offs between different priorities, so you can't optimise for everything at once. conda's an excellent cross-platform end user focused dependency management system. This is a good thing, but it does mean conda isn't a suitable candidate for use as an input format for other tools that compete with it.
Hmm -- that's true. But it is, as you said, a "cross-platform end user focused dependency management system" that handles python well, in addition to other things, including libs python may depend on. As such, it _could_ play the role that pip+wheel (secondarily pypi) play in the python ecosystem. You'd still need something like distutils and/or setuptools to actually handle the building, etc. And IF we wanted the "official" package manager for python to fully support dynamic libs, etc., as well as non-python associated software, then it would make sense to use conda, rather than keep growing pip+wheel until it duplicated conda's functionality. But I don't get the impression that that is an end-goal for PyPA, and I'm not sure it should be.
As far as the "we could use a better dynamic linking story for Windows and Mac OS X" story goes, now that I understand the general *nix case is considered out of scope for the situations Chris is interested in,
exactly -- just like linux is out of scope for compiled wheels.
I think there's a reasonable case to be made for being able to *bundle* redistributable dynamically linked libraries with a wheel file, and for the build process of *other* wheel files to be able to rely on those bundled external libraries.
yup -- that's what I have in mind.
I originally thought the request was about being able to *describe* the external dependencies in sufficient detail that the general case on *nix could be handled, or that an appropriate Windows or Mac OS X binary could be obtained out of band, rather than by being bundled with the relevant wheel file.
Sure would be nice, but no -- I have no fantasies about that.
Getting a bundling based model to work reliably is still going to be difficult (and definitely more complicated than static linking in cases where data sharing isn't needed), but it's not intractable the way the general case is.
Glad you agree -- so the rabbit hole may not be that deep? There isn't much that should change in pip+wheel+metadata to enable this. So the way to proceed, if someone wants to do it, could be to simply hack together some binary wheels of a common dependency or two, build wheels for a package or two that depend on those, and see how it works. I don't know if/when I'll find the round tuits to do that -- but I have some more detailed ideas if anyone wants to talk about it. Then it becomes a social issue -- package maintainers would have to actually use these new sharedlib wheels to build against. But that isn't really that different than the current case of deciding whether to include a copy of a dependent python package in your distribution -- and once we made it easy for users to get dependencies, folks have been happy to shift that burden elsewhere. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov
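[One way Chris's experiment might look on disk -- every name here is invented; "libpng-sharedlib" is a hypothetical wheel that exists only to carry a DLL built against the python.org Python:

    libpng_sharedlib/__init__.py    # knows where its DLL lives
    libpng_sharedlib/libpng16.dll   # the shared build everyone links against

A dependent package would declare "Requires-Dist: libpng-sharedlib" in its own wheel, and pre-load the library before importing its extension module, along these lines:

    # hypothetical shim in the dependent package's __init__.py
    import ctypes
    import os.path
    import libpng_sharedlib

    _libdir = os.path.dirname(libpng_sharedlib.__file__)
    # loading the DLL here means the extension module that needs it
    # will resolve against the already-loaded copy
    ctypes.CDLL(os.path.join(_libdir, 'libpng16.dll'))

The social part Chris mentions is exactly that: extension authors would have to agree to link against the shared build rather than bundling their own.]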
On 21 May 2015 at 03:37, Chris Barker
As such, it _could_ play the role that pip+wheel (secondarily pypi) play in the python ecosystem.
In practice, it can't, as conda is entirely inappropriate as an input format for yum/apt/enstaller/zc.buildout/pypm/MSI/etc. In many ways, the barriers that keep conda from being a viable competitor to pip from an upstream perspective are akin to those that felled the distutils2 project, while the compatible-with-the-existing-ecosystem d2to1 has seen far more success.

Rather than being strictly technical, the reasons for this are mostly political (and partially user experience related) so it's not worth the futile effort of attempting to change them. When folks try anyway, it mainly serves to alienate people using (or working on) other integration platforms rather than achieving anything productive (hence my comment about the "one package manager to rule them all" attitude of some conda proponents, although I'll grant they haven't yet gone as far as the NixOS folks by creating an entirely conda based Linux distro).

The core requirement for the upstream tooling is to be able to bridge the gap from publishers of software components implemented in Python to integrators of software applications and development environments (regardless of whether those integrators are themselves end users, redistributors or both). That way, Python developers can focus on learning one publication toolchain (anchored by pip & PyPI), while users of integrated platforms can use the appropriate tools for their platform.

conda doesn't bridge that gap for Python in the general case, as it is itself an integrator tool managed independently of the PSF and designed to consume components from *multiple* language ecosystems and make them available to end users in a common format.

Someone designing a *new* language ecosystem today could quite reasonably decide not to invent their own distribution infrastructure, and instead adopt conda as their *upstream* tooling, and have it be the publication toolchain that new contributors to that ecosystem are taught, and that downstream integrators are expected to interoperate with. But that's not the case for Python - Python's far too far down the distutils->setuptools->pip path to be readily amenable to alternatives (especially alternatives that are currently still fairly tightly coupled to the offerings of one particular commercial redistributor).

Regards, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 21 May 2015 at 08:46, Nick Coghlan
On 21 May 2015 at 03:37, Chris Barker
wrote: As such, it _could_ play the role that pip+wheel (secondarily pypi) play in the python ecosystem.
In practice, it can't, as conda is entirely inappropriate as an input format for yum/apt/enstaller/zc.buildout/pypm/MSI/etc. In many ways, the barriers that keep conda from being a viable competitor to pip from an upstream perspective are akin to those that felled the distutils2 project, while the compatible-with-the-existing-ecosystem d2to1 has seen far more success.
I think I've finally figured out a short way of describing these "packaging ideas that simply won't work": if an ecosystem-wide packaging proposal doesn't work for entirely unmaintained PyPI packages, it's likely a bad proposal.

This was not only the fatal flaw in the previous distribute/distutils2 approach, it's the reason we introduced so much additional complexity into PEP 440 in order to preserve compatibility with the vast majority of existing package versions on PyPI (over 98% of existing version numbers were still accepted), it's one of the key benefits of separating the PyPI-to-end-user TUF PEP from the dev-to-end-user one, and it's the reason why the "Impact assessment" section is one of the most important parts of the proposal in PEP 470 to migrate away from offering the current link spidering functionality (https://www.python.org/dev/peps/pep-0470/#id13).

Coping with this problem is also why injecting setuptools when running vanilla distutils projects is one of the secrets of pip's success: by upgrading setuptools, and by tweaking the way pip invokes setup.py with it injected, we can change the way packages are built and installed *without* needing to change the packages themselves.

Regards, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
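[For reference, the injection Nick mentions is roughly this shim (paraphrased from pip's source of around this era; exact details vary by pip version). Instead of running "python setup.py install" directly, pip runs the equivalent of:

    # python -c "<the code below>" install ...
    import setuptools, tokenize
    __file__ = 'setup.py'
    code = getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n')
    exec(compile(code, __file__, 'exec'))

Because setuptools is imported first, it replaces the plain distutils commands, so even a vanilla distutils setup.py picks up setuptools behaviours without the package itself changing at all.]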
On Wed, May 20, 2015 at 5:20 PM, Nick Coghlan
Coping with this problem is also why injecting setuptools when running vanilla distutils projects is one of the secrets of pip's success:
Ahh! THAT is the role pip plays in building. It's the way that you get setuptools features in a plain distutils-based package. So conda _could_ play the same trick, and inject setuptools into packages that don't use it already, but why bother -- pip does that for us. OK -- I'm going to try to find some time to play with this -- I do think it will solve some of the issues I've had, and if it works well, maybe we can move it toward a new standard of practice for conda-python-packages. Thanks -- clarity at last! -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov
On May 21, 2015, at 11:33 AM, Chris Barker
Also, one of the goals a few of us has in the PyPA is that we move to a future where the build systems are pluggable. So one package could be building using setuptools, another building using some SciPy specific build tool, another using a whole other one. They will all ideally have some sort of generic interface that they need to work with, but using pip means you get the details of abstracting out to the different build tools handled for you, for “free”. At least, in theory that’s how it’ll work :) --- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
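[What such a generic interface might look like -- purely a sketch of the idea Donald describes, not an existing standard; the class and method names are invented:

    class BuildTool:
        """Hypothetical pluggable build backend."""

        def build_wheel(self, source_dir, wheel_dir):
            """Build the project in source_dir into a wheel, write it
            into wheel_dir, and return the new wheel's filename."""
            raise NotImplementedError

pip would pick the right BuildTool for a given project and call it, so the user-facing command stays the same no matter which backend a project chooses.]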
On 21 May 2015 at 16:37, Donald Stufft
Note that this is a key to why wheel is important to this discussion. The "build interface" in pip is "pip wheel foo", which will (in the "pluggable build" future) run whatever build tool the project specifies, and produce as output a wheel. That wheel is then the input for any packaging systems that want to build their own formats. So, we have:

1. sdist: The source format for packages.
2. distutils/setuptools/bdist_wheel: The only major "build tool" currently in existence. Our goal is to seamlessly allow others to fit here.
3. pip wheel: The build tool interface designed to convert sdist->wheel using the appropriate build tool.
4. wheel: The built format for Python packages. Acts as a common target for build tools and as a common source for distribution package builders. Also directly installable via pip.
5. pip install <wheelfile>: The canonical Python installer, taking wheels as input.

Pip can also combine this whole sequence, and install direct from sdist (via wheel in the next version, currently by direct install from sdist), but that's not the important point for this discussion.

In terms of tool interoperability, therefore, there are a couple of places things can hook in:

1. Anything that can take an sdist and build a wheel can be treated as a "build tool". You could run the appropriate build tool for your package manually, but it's a goal for pip to provide a unified interface to that process.
2. Any installer can use wheels as the source for building its install packages. This frees package build processes from needing to deal with compiling Python extensions, packaging up Python sources, etc.

We (the PyPA) haven't really done a particularly good job of articulating this design, not least because a lot of it is still ideas in progress, rather than concrete plans. And as a result, it's hard for tools like conda to clearly understand how they could fit into this stack. And of course, the backward compatibility pressures on any change in Python packaging cause things to go pretty slowly, meaning that projects like conda have an additional pressure to come up with a solution *right now* rather than waiting for a standard solution that frankly is currently vapourware.

Ideally, the scientific community's experiences with building complex Python packages can help us to improve the wheel spec to ensure that it can better act as that universal binary format (for repackaging or direct installation). But that does require ongoing effort to make sure we understand where the wheel format falls short, and how we can fix those issues. Doing this without getting sucked into trying to solve problems that the wheel format is *not* intended to cover (packaging and distribution of non-Python code) is hard - particularly where we need to express dependencies on such things.

Paul.
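[In concrete terms, the pipeline Paul lays out above is just two commands; the project name and wheel filename here are illustrative:

% pip wheel -w wheelhouse sampleproject
% pip install wheelhouse/sampleproject-1.0-py2.py3-none-any.whl

The first line is the build-tool interface (steps 2-4), the second the installer (step 5); a downstream packager could pick up the intermediate wheel instead of installing it.]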
On 05/21/2015 12:33 PM, Paul Moore wrote:
1. Anything that can take an sdist and build a wheel can be treated as a "build tool". You could run the appropriate build tool for your package manually, but it's a goal for pip to provide a unified interface to that process.
Paul,
Thanks for the wonderfully clear email, it helped solidify a number of questions I had on pip, wheels and build tools. From the section above it sounds like a tool that is capable of converting between conda packages and wheel files (and possibly other binary Python formats) could bridge the two systems and help root out where the wheel format is lacking. Combined with conda's build system, such a converter could serve as a pip pluggable "build tool" to take an sdist and produce a wheel using a workflow more familiar to the segment of the scientific python community who use conda. Obviously there are a number of technical challenges that would need to be overcome (shared libraries, Python entry points, etc) and I'm guessing some political issues. Nonetheless, I'd be willing to try to prototype such a tool if others think it would be of use. Cheers, - Jonathan Helmus
On Thu, May 21, 2015 at 11:12 AM, Jonathan Helmus
it sounds like a tool that is capable of converting between conda packages and wheel files
converting from a wheel to a conda package should be very doable (and may already be done). But the other way around is not -- conda packages can hold a superset of what wheels can hold -- i.e. stuff outside of python. Now that I think about it, it may seem kludgy, but conda packages can be built from wheels right now, with no special tools. A conda recipe that simply installs a wheel in its build script would do just that. I'm still a bit confused about the role of wheel here. Why build a wheel, just so you can go install it, rather than simply install the package directly? I'm not sure it matters in this context, but I don't get the point. Though now that I think about it, that's exactly what conda does -- if you want to install a package from a conda build recipe -- you build a conda package, and then install that. That may be required to support some stuff conda supports -- for instance, conda install can re-write paths to shared libs. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov
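[A minimal sketch of such a recipe -- the package name is invented, and it assumes the wheel has already been fetched next to the recipe; a real recipe would normally pull it in via its source section:

    # meta.yaml
    package:
      name: example
      version: "1.0"
    requirements:
      build:
        - python
        - pip
      run:
        - python

    # build.sh
    pip install --no-deps example-1.0-py2.py3-none-any.whl

--no-deps matters here: dependency resolution is conda's job in this setup, so pip should install only the one wheel.]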
On 21 May 2015 at 20:26, Chris Barker
Now that I think about it, it may seem kludgy, but conda packages can be built from wheel right now, with no special tools. A conda recipe that simply installs a wheel in it's build script would do just that.
That sounds about right, from what I've seen of conda builds. You could probably do better (for example, by just repacking the wheel rather than going through the whole wheel install process) but you don't have to. Some possible problem areas - when you install a wheel, it will install executable wrappers for the entry points (like pip.exe) which are tied to the install location. You'd need to deal with that. But presumably conda already has to deal with that because setuptools does precisely the same.
I'm still a bit confused about the role of wheel here. Why build a wheel, just so you can go install it, rather than simply install the package directly?
Basically, because you can't "simply install". You may not have a compiler, or you may not have the required libraries, etc etc. Don't forget, you can build a wheel once, then install it anywhere that has a compatible Python installation. I'm surprised this isn't obvious to you, as isn't that precisely what conda does as well? Also, installing from a wheel is a *lot* faster than installing from sdist - people who deploy lots of packages to multiple servers or virtualenvs greatly appreciate the extra speed of a wheel install. That's why pure-python wheels are worth having, even though you could install from source. Also, so you don't need any install-time only requirements (e.g. setuptools) on the target production system. Generally, pretty much all of the reasons people don't compile all their software on their production machines :-) Paul
On Thu, May 21, 2015 at 1:12 PM, Paul Moore
Now that I think about it, it may seem kludgy, but conda packages can be built from wheels right now, with no special tools. A conda recipe that simply installs a wheel in its build script would do just that.
That sounds about right, from what I've seen of conda builds. You could probably do better (for example, by just repacking the wheel rather than going through the whole wheel install process)
but then conda would need to understand wheel -- now it doesn't have to. Only pip or whatever has to understand wheel.
Some possible problem areas - when you install a wheel, it will install executable wrappers for the entry points (like pip.exe) which are tied to the install location. You'd need to deal with that. But presumably conda already has to deal with that because setuptools does precisely the same.
indeed conda build (or is it install -- not sure!) does do path re-writing, etc.
I'm still a bit confused about the role of wheel here. Why build a wheel, just so you can go install it, rather than simply install the package directly?
Basically, because you can't "simply install". You may not have a compiler, or you may not have the required libraries, etc etc.
I'm thinking of the context of building a conda package -- or rpm, etc. -- not the general "you are using pip as your package manager" case. And in that case, you want to build it to match your environment -- which may not be what a wheel on PyPI matches, for instance.
Generally, pretty much all of the reasons people don't compile all their software on their production machines :-)
right, but you're not running conda build on a production machine, either. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov
On 21 May 2015 at 23:49, Chris Barker
On Thu, May 21, 2015 at 1:12 PM, Paul Moore
wrote: Some possible problem areas - when you install a wheel, it will install executable wrappers for the entry points (like pip.exe) which are tied to the install location. You'd need to deal with that. But presumably conda already has to deal with that because setuptools does precisely the same.
indeed conda build (or is it install -- not sure!) does do path re-writing, etc.
Note that script wrappers are executables, not scripts (some are exes with scripts alongside, but not all - distlib and pip embed the script in the exe). So simply editing scripts in .py files isn't enough.
I'm still a bit confused about the role of wheel here. Why build a wheel, just so you can go install it, rather than simply install the package directly?
Basically, because you can't "simply install". You may not have a compiler, or you may not have the required libraries, etc etc.
I'm thinking of the context of building a conda package -- or rpm, etc. -- not the general "you are using pip as your package manager" case. And in that case, you want to build it to match your environment -- which may not be what a wheel on PyPI matches, for instance.
So you have to build the package -- not sure what that wheel step buys you. But it doesn't cost much either, so why not?
It avoids the need to embed the knowledge of how to build the package in conda. Remember that "setup.py install" is not necessarily the only way a package might be built once we get to the pluggable build state.
Generally, pretty much all of the reasons people don't compile all their software on their production machines :-)
right, but you're not running conda build on a production machine, either.
I was answering your question "why install from wheels rather than just installing directly". Now that I understand that you're saying "... in the context of building conda packages" the answer is slightly different:
- To prepare for a "pluggable build" situation.
- To integrate better with the rest of the Python packaging world.
- To let people without compilers, but with access to binary wheels, build their own conda packages should the official ones not exist.
(possibly others, that's just off the top of my head). Paul
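[For reference, the script wrappers discussed above come from setuptools entry points. In a minimal setup.py (all names illustrative), each console_scripts entry becomes a wrapper executable -- mytool.exe on Windows -- that locates Python and calls the named function:

    from setuptools import setup

    setup(
        name='mytool',
        version='0.1',
        packages=['mytool'],
        entry_points={
            # "mytool" on the command line runs mytool.cli:main()
            'console_scripts': ['mytool = mytool.cli:main'],
        },
    )
]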
On Thu, May 21, 2015 at 12:33 PM, Paul Moore
Doing this without getting sucked into trying to solve problems that the wheel format is *not* intended to cover (packaging and distribution of non-Python code) is hard - particularly where we need to express dependencies on such things.
It's not there yet (especially on non-Windows platforms), but the plan is for OneGet to provide a consistent way to install things from different package managers/formats.
On Wed, May 20, 2015 at 3:46 PM, Nick Coghlan
On 21 May 2015 at 03:37, Chris Barker
wrote: As such, it _could_ play the role that pip+wheel (secondarily pypi) play in the python ecosystem.
In practice, it can't, as conda is entirely inappropriate as an input format for yum/apt/enstaller/zc.buildout/pypm/MSI/etc.
well, I'm making a strong distinction between a build system and a dependency management / install system. conda is not any kind of replacement for distutils / setuptools (kind of how rpm doesn't replace configure and make. at all). I'm still confused as to why pip plays as big a role in building as it seems to, but I guess I trust that there are reasons. Maybe it's only wheel that conda duplicates. But this is all irrelevant, because:
Rather than being strictly technical, the reasons for this are mostly political (and partially user experience related)
Exactly. setuptools+pip+wheel is, and should be, the "official" python distribution system.
When folks try anyway, it mainly serves to alienate people using (or working on) other integration platforms rather than achieving anything productive (hence my comment about the "one package manager to rule them all" attitude of some conda proponents,
well, sorry if I've contributed to that -- but I guess for my part, there is a core frustration -- I have only so much time (not much), and I want to support multiple communities -- and I simply can't do that without doing twice as much work. Duplication of effort may be inevitable, but it is still frustrating.
That way, Python developers can focus on learning one publication toolchain (anchored by pip & PyPI), while users of integrated platforms can use the appropriate tools for their platform.
That's all very nice, and it works great for packages that don't have any external dependencies, but if I'm trying to publish my python package, pip+wheel simply doesn't support what I need -- I can't use only one publication toolchain. And indeed, even if it did (with my vaporware better support for shared libs), that would be incompatible with conda, which does, in fact, support everything I need.
conda doesn't bridge that gap for Python in the general case, as it is itself an integrator tool managed independently of the PSF
well that is a political/social issue.
and designed to consume components from *multiple* language ecosystems and make them available to end users in a common format.
not sure why that precludes it being used for python -- Python somehow HAS to use a system that is only designed for Python? why?
Python's far too far down the distutils->setuptools->pip path to be readily amenable to alternatives
agreed. this is the key point - it's gotten a bit blended in with the technical issues, but that really is the key point. -- I'll shut up now :-) (at least about this) -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov
On Mon, May 18, 2015 at 10:50 AM, David Mertz
This pertains more to the other thread I started, but I'm sort of becoming convinced--especially by Paul Moore's suggestion there--that the better approach is to grow conda (the tool) rather than shoehorn conda packages into pip.
I agree -- in some sense conda is pip+more, you couldn't do it without growing pip (see the other thread...)
So it might make sense to actually allow folks to push conda to budding web developers, if conda allowed installation (and environment management) of sdist packages on PyPI. So perhaps it would be good if *this* worked:
% pip install conda
% conda install scientific_stuff
% conda install --sdist django_widget  # we know to look on PyPI
so a point / question here: you can, right now, run pip from inside a conda environment python, and for the most part, it works -- certainly for sdists. I'm actually doing that a lot, and so are others. But it gets messy when you have two systems trying to handle dependencies -- pip may not realize that conda has already installed something, and vice versa. So it's really nicer to have one package manager. But maybe all you really need to do is teach conda to understand pip meta-data, and/or make sure that conda writes pip-compatible meta-data. Then a user could do:
conda install some_package
and conda would look in all its normal places for some_package, and if it didn't find it, it would try running "pip install" under the hood (see the sketch at the end of this message). The user wouldn't know, or have to know, where the package came from (though conda might want to add that to the meta-data for use come upgrade time, etc.). In short -- make it easy for conda users to use pip / pypi packages. Note: there have been various threads about this on the Anaconda list lately. The current "plan" is to have a community binstar channel that mirrors as much of pypi as possible. Until we have an automated way to grab pypi packages for conda, this isn't a bad stop gap. Also note that conda can often (but not always) build a conda package from pypi automagically -- someone could potentially run a service that does that. On Mon, May 18, 2015 at 3:17 AM, Paul Moore
Agreed. My personal use case is as a general programmer (mostly sysadmin and automation type of work) with some strong interest in business data analysis and a side interest in stats.
For that sort of scenario, some of the scipy stack (specifically matplotlib and pandas and their dependencies) is really useful. But conda is *not* what I'd use for day to day work, so being able to install via pip is important to me.
What if "conda install" did work for virtually all pypi packages? (one way or the other) -- would you use and recommend Anaconda (or miniconda) then?
It should be noted that installing
via pip *is* possible - via some of the relevant projects having published wheels, and the rest being available via Christoph Gohlke's site either as wheels or as wininsts that I can convert. But that's not a seamless process, so it's not something I'd be too happy explaining to a colleague should I want to share the workload for that type of thing.
right -- that could be made better right now -- or soon. Gohlke's packages can't be simply put up on PyPI for licensing reasons (he's using the Intel math libs). But some folks are working really hard on getting a numpy wheel that will work virtually everywhere, and still give good performance for numerics. From there, the core SciPy stack should follow (it's already on PyPI for OS-X). Which is a GREAT move in the right direction, but doesn't get us quite to where PyPI can support the more complex packages. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov
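[A minimal sketch of the "conda tries its channels, then falls back to pip" idea from this message -- hypothetical, since conda has no such mode; the wrapper function is invented:

    import subprocess

    def install(package):
        try:
            # look in the normal conda channels first
            subprocess.check_call(['conda', 'install', '--yes', package])
        except subprocess.CalledProcessError:
            # not packaged for conda: hand off to pip, which searches PyPI
            subprocess.check_call(['pip', 'install', package])
]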
But it gets messy when you have two systems trying to handle dependencies -- pip may not realize that conda has already installed something, and vice versa. So it's really nicer to have one package manager.
But maybe all you really need to do is teach conda to understand pip meta-data, and/or make sure that conda write pip-compatible meta data.
Forgive me, I'm trying to follow as someone who is working with PyPI but hasn't really used conda or pip. Does a conda environment contain its own site-packages directory, and does pip correctly install packages to that directory? If so, I expect supporting PEP 376 would help with this. It doesn't help either package manager install dependencies from outside their repos, it just means that pip will work if the user installs dependencies from conda first. To be able to install dependencies, either conda needs to know enough about PyPI to find a package's dependencies itself (and at that point, I wonder how much value pip adds compared to 'wheel'), or pip needs to know that it can delegate to conda when run in this way.
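[PEP 376 records each installed distribution in a *.dist-info directory, so any compliant tool can see what another installer has already put into site-packages. A naive sketch of the discovery side; the name/version split assumes no hyphen in the project name:

    import os

    def installed_distributions(site_packages):
        # each PEP 376 install leaves a "<name>-<version>.dist-info" dir
        for entry in os.listdir(site_packages):
            if entry.endswith('.dist-info'):
                name, _, version = entry[:-len('.dist-info')].partition('-')
                yield name, version
]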
On Mon, May 18, 2015 at 8:24 PM, Vincent Povirk
But maybe all you really need to do is teach conda to understand pip meta-data, and/or make sure that conda write pip-compatible meta data.
Forgive me, I'm trying to follow as someone who is working with PyPI but hasn't really used conda or pip. Does a conda environment contain its own site-packages directory,
If python was installed by conda, yes. I get a bit confused here. For one, I have only used conda with Anaconda, and Anaconda starts you off with a python environment (for one thing, conda itself is written in Python, so you need that...). So if one were to start from scratch with conda, I'm not entirely sure what you would get, but I _think_ you could run conda with some_random_python, then use it to set up a conda environment with its own python... But for the current discussion, yes, a conda environment has its own site-packages, etc. -- its own complete python install.
and does pip correctly install packages to that directory?
yes. it does.
If so, I expect supporting PEP 376 would help with this.
yes, I think that would help. Though conda is about more-than-python, so it would still need to manage dependency and meta-data its own way. But I would think it could duplicate that effort for python packages. But I can't speak for the conda developers.
It doesn't help either package manager install dependencies from outside their repos, it just means that pip will work if the user installs dependencies from conda first.
and if we can get vice-versa to work, also -- things would be easier.
To be able to install dependencies, either conda needs to know enough about PyPI to find a package's dependencies itself (and at that point, I wonder how much value pip adds compared to 'wheel'),
good point -- and conda does know a fair bit about PyPI already -- there is a "conda skeleton pypi" command that goes and looks on PyPI for a package, and builds a conda build script for it automagically -- and it works without modification much of the time, including dependencies. So much of the logic is there. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov
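[The command Chris refers to is spelled like this; the package name is illustrative:

% conda skeleton pypi some_package    # writes a recipe into ./some_package/
% conda build some_package            # builds a conda package from that recipe
]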
On 19 May 2015 at 00:41, Chris Barker
On Mon, May 18, 2015 at 3:17 AM, Paul Moore
wrote: Agreed. My personal use case is as a general programmer (mostly sysadmin and automation type of work) with some strong interest in business data analysis and a side interest in stats.
For that sort of scenario, some of the scipy stack (specifically matplotlib and pandas and their dependencies) is really useful. But conda is *not* what I'd use for day to day work, so being able to install via pip is important to me.
What if "conda install" did work for virtually all pypi packages? (one way or the other) -- would you use and recommend Anaconda (or miniconda) then?
If conda did everything pip did (and that includes consuming wheels from PyPI, not just sdists, and it includes caching of downloads, autobuilding of wheels etc, etc.) then I'd certainly consider how to switch to conda (*not* Anaconda - I'd use a different package manager, but not a different Python distribution) rather than pip. But "considering switching" would include getting PyPI supporting conda packages, getting ensurepip replaced with ensureconda, etc. A total replacement for pip, in other words.

As a pip maintainer I'm obviously biased, but if conda is intending to replace pip as the official packaging solution for Python, then it needs to do so completely. If it doesn't do that, then we (PyPA and the Python core developers) need to be able to credibly say that pip is the official solution, and that means that we need to make sure that pip/wheel provides the best user experience possible. That includes persuading parts of the Python community (e.g. Scientific users) not to abandon the standard solution in favour of a custom one.

My fear here is a split in the Python community, with some packages only being available via one ecosystem, and some via another. Most people won't mind, but people with cross-discipline interests will end up disadvantaged in such a situation.

Paul
On Tue, May 19, 2015 at 5:21 AM, Paul Moore
If conda did everything pip did (and that includes consuming wheels from PyPI, not just sdists, and it includes caching of downloads, autobuilding of wheels etc, etc.)
hmm... what about half-way -- conda does everything pip does, but not necessarily the same way -- i.e. you do a "conda install this_package", and it works for every package (OK -- almost every ;-) ) that pip install works for. But maybe that's not going to cut it -- in a way, we are headed there now, with a contingent of people porting pypi packages to conda. So far it's various subsets of the scientific community, but if we could get a few web developers to join in...
then I'd certainly consider how to switch to conda (*not* Anaconda - I'd use a different package manager, but not a different Python distribution) rather than pip.
hmm -- that's the interesting technical question -- conda works at a higher level than pip -- it CAN manage python itself -- I'm not sure it HAS to, but that's how it is usually used, and the idea is to provide a complete environment, which does include python itself.
But "considering switching" would include getting PyPI supporting conda packages,
uhm, why? if there is a community supported repo of packages -- why does it have to be PyPi?
getting ensurepip replaced with ensureconda, etc. A total replacement for pip, in other words.
As a pip maintainer I'm obviously biased, but if conda is intending to replace pip as the official packaging solution for Python, then it needs to do so completely. If it doesn't do that, then we (PyPA and the Python core developers) need to be able to credibly say that pip is the official solution, and that means that we need to make sure that pip/wheel provides the best user experience possible. That includes persuading parts of the Python community (e.g. Scientific users) not to abandon the standard solution in favour of a custom one.
I agree here. Though we do have a problem -- as Nick has pointed out, the "full" scientific development process -- even if python-centered -- requires non-python parts, even beyond shared libs: a fortran compiler is a big one, but also maybe other languages, like R, or Julia, etc. Or LLVM (for numba), or... This is why Continuum built conda -- they wanted to provide a way to manage all that.
So is it possible for PyPA to grow the features to manage all the python bits, and then have things like conda use pip inside of Anaconda, maybe? Or SOME transition where you can add conda if and only if you need its unique features, as an add-it-later-to-what-you-have solution, rather than, when you need R or some such, you need to:
* Toss out your current setup
* Install Anaconda (or miniconda)
* Switch from virtualenv to conda environments
* re-install all your dependencies
And for even that to work, we need a way for everything installable by pip to be installable within that conda environment -- which we could probably achieve.
My fear here is a split in the Python community, with some packages only being available via one ecosystem, and some via another.
Exactly. While I'm not at all sure that we could get to a "one way to do it" that would meet every community's needs, I do think that we could push pip+pypi+wheel a little further to better support at least the python-centric stuff -- i.e. third party libs, which would get us a lot farther. And again, it's not just the scipy stack -- there is stuff like image manipulation packages, etc, that could be better handled. And the geospatial packages are a mess, too - is that "scientific"? -- I don't know, but it's the "new hotness" in web development. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov
On 19 May 2015 at 13:28, Chris Barker
[...] So is it possible for PyPA to grow the features to manage all the python bits, and then have things like conda use pip inside of Anaconda, maybe? or SOME transition where you can add conda if and only if you need its unique features, as an add-it-later-to-what-you-have solution, rather than, when you need R or some such, you need to
* Toss out your current setup
* Install Anaconda (or miniconda)
* Switch from virtualenv to conda environments
* re-install all your dependencies
And for even that to work, we need a way for everything installable by pip to be installable within that conda environment -- which we could probably achieve.
What if instead of focusing on pip being able to install more than just python packages, we made sure that a virtualenv was a strict subset of, say, a conda environment? This way, going from virtualenv to, say, conda would not be a toss-out, but an upgrade.

With all that was discussed here, ISTM it should be easy enough to make sure that a virtualenv contains *the place* where you install the DLLs/.so needed to make a certain pip package work, in a way that wouldn't pollute other virtualenvs or the whole system (that'd be <venv>/lib on *nix and <venv>\Scripts or <venv>\Lib on Windows), even if pip itself is not responsible for installing the libraries in that place. Then, one could run something like:
pip install conda-enable
which would add a `conda` script to the virtualenv, which could then be used to install python packages and their non-python dependencies in a way that is pip compatible.

But what if we don't want to teach people to use anything other than pip? Then perhaps instead of teaching pip to install non-python stuff, we could just add some hooks to pip. For instance, the `conda-enable` package above could install hooks that would be called by pip whenever pip is called to install some packages. The hook would receive the package name and its known metadata, allowing the `conda-enable` package to go and install whatever pre-requisites it knows about for the presented package, before pip tries to install the package itself (a sketch of such a hook follows this message). The `conda-enable` package could also add configuration to the virtualenv telling pip which alternative indexes to use to install wheels known to be compatible with the non-python dependencies the hook above installs.

This would give us a route that doesn't force the PyPA stack to consider how to solve the non-python dependency issue, or even what metadata would be required to solve it. The installed hook would need to keep a mapping of distribution-name to non-python dependency. In time, we could think of extending the metadata that PyPI carries to contemplate non-python dependencies, but we could have a solution that works even without it.

I used `conda` in the examples above, but with the hooks in place, anyone could write their own non-python dependency resolution system. For instance, this could be a good way for corporations to provide for the use of standardized software for their python platforms that is not limited to python packages, without forcing a lot of feature creep onto the PyPA stack itself.

Cheers, Leo
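[What the hook side of this proposal might look like -- entirely hypothetical, since pip exposes no such hook; every name here is invented for illustration:

    import subprocess

    # the distribution-name -> non-python dependency mapping a
    # hypothetical conda-enable package might ship
    EXTERNAL_DEPS = {'lxml': ['libxml2', 'libxslt']}

    def pre_install_hook(dist_name, metadata):
        """Called by pip (hypothetically) before installing dist_name:
        install any known non-python prerequisites via conda first."""
        for dep in EXTERNAL_DEPS.get(dist_name, []):
            subprocess.check_call(['conda', 'install', '--yes', dep])
]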
On Tue, May 19, 2015 at 10:18 AM, Leonardo Rochael Almeida < leorochael@gmail.com> wrote:
What if instead of focusing on pip being able to install more than just python packages, we made sure that a virtualenv was as strict subset of, say, a conda environment? This way, going from virtualenv to, say, conda would not be a toss-out, but an upgrade.
cool idea -- though it's kind of backwards - i.e. conda installs stuff outside of the python environment. So I'm not sure if you could shoehorn this all together in that way. At least not without a significant re-engineering of conda -- in which case, you've kind of built a new plug-in or add-on to pip that does more. But it'd be great to be proven wrong here! -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov
On Tue, May 19, 2015 at 2:34 PM, Chris Barker
On Tue, May 19, 2015 at 10:18 AM, Leonardo Rochael Almeida < leorochael@gmail.com> wrote:
What if instead of focusing on pip being able to install more than just python packages, we made sure that a virtualenv was as strict subset of, say, a conda environment? This way, going from virtualenv to, say, conda would not be a toss-out, but an upgrade.
cool idea -- though it's kind of backwards - i.e. conda installs stuff outside of the python environment. So I'm not sure if you could shoehorn this all together in that way. At least with a significant re-engineering of conda, in which case, you've kind of built a new plug-in or add-on to pip that does more.
I've tried to do something like this with my (admittedly opinionated) dotfiles: https://westurner.org/dotfiles/venv
# WORKON_HOME ~= CONDA_ENVS_PATH
workon <virtualenvname>
workon_conda <condaenvname>  # wec <tab>
lscondaenvs
# virtualenv: deactivate() / condaenv: source deactivate
...
https://github.com/westurner/dotfiles/blob/develop/etc/bash/08-bashrc.conda....
It's still pretty verbose and rough around the edges.
On 19 May 2015 at 17:28, Chris Barker
On Tue, May 19, 2015 at 5:21 AM, Paul Moore
wrote: If conda did everything pip did (and that includes consuming wheels from PyPI, not just sdists, and it includes caching of downloads, autobuilding of wheels etc, etc.)
hmm...what about half-way -- conda does everything pip does, but not necessarily the same way -- i.e. you do a "conda install this_package", and it works for every package ( OK -- almost every ;-) ) that pip install works for.
Sure. Doesn't have to be the same way, but the user experience has to be the same.
But maybe that's not going to cut it -- in a way, we are headed there now, with a contingent of people porting pypi packages to conda. So far it's various subsets of the scientific community, but if we could get a few web developers to join in...
Unless project owners switch to providing conda packages, isn't there always going to be a lag? If a new version of lxml comes out, how long must I wait for "the conda folks" to release a package for it?
then I'd certainly consider how to switch to conda (*not* Anaconda - I'd use a different package manager, but not a different Python distribution) rather than pip.
hmm -- that's the interesting technical question -- conda works at a higher level than pip -- it CAN manage python itself -- I'm not sure it is HAS to, but that's how it is usually used, and the idea is to provide a complete environment, which does include python itself.
Yes. But I don't want to use Anaconda Python. Same reason - how long do I wait for the new release of Python to be available in Anaconda? There's currently no Python 3.5 alpha for example...
But "considering switching" would include getting PyPI supporting conda packages,
uhm, why? if there is a community supported repo of packages -- why does it have to be PyPi?
If conda/binstar is good enough to replace pip/PyPI, there's no reason for pip/PyPI to still exist. So in effect binstar *becomes* PyPI. There's an element of evangelisation going on here - you're (effectively) asking what it'd take to persuade me to use conda in place of pip. I'm playing hard to get, a little, because I see no specific benefits to me in using conda, so I don't see why I should accept any loss at all, in the absence of a benefit to justify it. My biggest worry is that at some point, "if you want numpy/scipy, you should use conda" becomes an explicit benefit of conda, and pip/PyPI users get abandoned by the scientific community. If that happens, I'd rather see the community rally behind conda than see a split. But I hope that's not the way things end up going.
getting ensurepip replaced with ensureconda, etc. A total replacement for pip, in other words.
This is the key point. The decision was made to "bless" pip as the official Python package manager. Should we revisit that decision? If not, then how do we ensure that pip (and the surrounding infrastructure) handles the needs of the *whole* Python community? If the authors of scientific extensions for Python abandon pip for conda, then pip isn't supporting that part of the community properly. But conversely, if the scientific community doesn't look to address their issues within the pip/wheel infrastructure, how can we do anything to avoid a rift? (end of doom and gloom section ;-))
As a pip maintainer I'm obviously biased, but if conda is intending to replace pip as the official packaging solution for Python, then it needs to do so completely. If it doesn't do that, then we (PyPA and the Python core developers) need to be able to credibly say that pip is the official solution, and that means that we need to make sure that pip/wheel provides the best user experience possible. That includes persuading parts of the Python community (e.g. Scientific users) not to abandon the standard solution in favour of a custom one.
I agree here. Though we do have a problem -- as Nick has pointed out, the "full" scientific development process -- even if python-centered, requires non-python parts, even beyond shared libs -- a fortran compiler is a big one, but also maybe other languages, like R, or Julia, etc. Or LLVM (for numba), or...
This is why Continuum built conda -- they wanted to provide a way to manage all that.
So is it possible for PyPA to grow the features to manage all the python bits, and then have things like conda use pip inside of Anaconda, maybe?
I'd like to think so. The goal of pip is to be the baseline Python package manager. We're expecting Linux distributions to build their system packages via wheel, why can't Anaconda? Part of the problem here, to my mind, is that it's *very* hard for the outsider to separate out (Ana)conda-as-a-platform versus conda-as-a-tool, versus conda-as-a-distribution-format.
or SOME transition where you can add conda if and only if you need its unique features, and as a add-it-later-to-what-you-have solution, rather than, when you need R or some such, you need to
* Toss out your current setup * Install Anaconda (or miniconda) * Switch from virtualenv to conda environments * re-install all your dependencies
Yeah, that's hopeless. And worse still is the possibility that a pure Python user might have to do that just to gain access to a particular package. I keep saying this, and I ought to ask - is there *any* likelihood that a package would formally abandon any attempt to provide binary distributions for Windows (or OSX, or whatever) except in conda format? So PyPI users will be told "install from source yourself or switch to conda". If there's no intention for that to ever happen, a lot of the "conda vs pip" discussion is less relevant. At the moment, there are projects like numpy that don't distribute Windows wheels on PyPI, but Christoph Gohlke has most of them available, and in general such projects seem to be aiming to move to wheels. So there aren't any practical cases of "conda-only" packages.
And for even that to work, we need a way for everything installable by pip to be installable within that conda environment -- which we could probably achieve.
See above - without some serious resource on the conda side, that "everything" is unlikely.
My fear here is a split in the Python community, with some packages only being available via one ecosystem, and some via another.
Exactly. While I'm not at all sure that we could get to a "one way to do it" that would meet every community's needs, I do think that we could push pip+pypi+wheel a little further to better support at least the python-centric stuff -- i.e. third party libs, which would get us a lot farther.
Agreed. Understanding the *actual* problem here is important though (see my other post about clarifying why dynamic linking is so important).
And again, it's not just the scipy stack -- there is stuff like image manipulation packages, etc, that could be better handled.
Well, (for example) Pillow provides wheels with no issue, so I'm not sure what you're thinking of here?
And the geospatial packages are a mess, too - is that "scientific"? -- I don't know, but it's the "new hotness" in web development.
I can't really comment on that one, as I've never used them. Does that make me untrendy? :-) Paul
On Tue, May 19, 2015 at 10:58 AM, Paul Moore
Sure. Doesn't have to be the same way, but the user experience has to be the same.
absolutely.
But maybe that's not going to cut it -- in a way, we are headed there now,
with a contingent of people porting pypi packages to conda. So far it's various subsets of the scientific community, but if we could get a few web developers to join in...
Unless project owners switch to providing conda packages, isn't there always going to be a lag? If a new version of lxml comes out, how long must I wait for "the conda folks" to release a package for it?
who knows? -- but it is currently a light lift to update a conda package to a new version, once the original is built -- and we've got handy scripts and CI systems that will push an updated version of the binaries as soon as an updated version of the build script is pushed. It's a short step to automate looking for new versions on PyPI and automatically updating the conda packages -- though there would need to be hand-intervention whenever an update broke the build script... Of course, the ideal is for package maintainers to push conda packages themselves -- which is why the more-than-one-system to support is unfortunate. On the other hand, there is one plus side -- if the package maintainer doesn't push to PyPI, it's easier for a third party to take on that role -- see pychecker, or, for that matter, numpy and scipy -- on PyPI, but no binaries for Windows. But you can get them on binstar (or Anaconda, or...)
hmm -- that's the interesting technical question -- conda works at a higher
level than pip -- it CAN manage python itself -- I'm not sure it is HAS to, but that's how it is usually used, and the idea is to provide a complete environment, which does include python itself.
Yes. But I don't want to use Anaconda Python, Same reason - how long do I wait for the new release of Python to be available in Anaconda? There's currently no Python 3.5 alpha for example...
you can grab and build the latest Python 3.5 inside a conda environment just as well. Or are you using python.org builds for alpha versions, too? Oh, and as a conda environment sits at a higher level than python, it's actually easier to set up an environment specifically for a particular version of python. And anyone could put up a conda package of Python 3.5 alpha as well --- once the build script is written, it's pretty easy. But again -- the more than one way to do it problem. If conda/binstar is good enough to replace pip/PyPI, there's no reason
for pip/PyPI to still exist. So in effect binstar *becomes* PyPI.
yup.
There's an element of evangelisation going on here - you're (effectively) asking what it'd take to persuade me to use conda in place of pip. I'm playing hard to get, a little, because I see no specific benefits to me in using conda, so I don't see why I should accept any loss at all, in the absence of a benefit to justify it.
I take no position here -- I'm playing around with ideas as to how we can move the community toward a better future -- I'm not trying to advocate any particular solution, but trying to figure out what solution we may want to pursue -- quite specifically, which solution I'm going to put my personal energy toward. We may want to look back at a thread on this list where Travis Oliphant talks about why he built conda, etc. (I can't find it now -- maybe someone with better google-fu than me can. I think it was a thread on this list, probably about a year ago) or read his Blog Post: http://technicaldiscovery.blogspot.com/2013/12/why-i-promote-conda.html One of the key points is that when they started building conda -- pip+wheel were not mature, and the core folks behind them didn't want to support what was needed (dynamic libs, etc) -- and still don't. My biggest worry is that at some point, "if you want numpy/scipy, you
should use conda" becomes an explicit benefit of conda,
That is EXACTLY what the explicit benefit of conda is. I think we'll get binary wheels for numpy and scipy up on PyPi before too long, but the rest of the more complex stuff is not going to be there.
and pip/PyPI users get abandoned by the scientific community.
They kind of already have -- it's been a long time, and a lot of work by only a couple folks to try to get binary wheels up on PyPi for Windows and OS-X
If that happens, I'd rather see the community rally behind conda than see a split. But I hope that's not the way things end up going.
we'll see. But look at Travis' post -- pip+wheel simply does not support the full needs of the full scientific user. If we want a "one ring to rule them all", then it'll have to be conda -- or something a lot like it. On the other hand, I think pip+wheel+PyPI (or maybe just the community around it) can be extended a bit to at least support all the truly python focused stuff -- which I think would be pretty worthwhile in itself. This is the key point. The decision was made to "bless" pip as the
official Python package manager. Should we revisit that decision?
I'm not sure I want to be the one to bring that up ;-)
If not, then how do we ensure that pip (and the surrounding infrastructure) handles the needs of the *whole* Python community? If the authors of scientific extensions for Python abandon pip for conda, then pip isn't supporting that part of the community properly. But conversely, if the scientific community doesn't look to address their issues within the pip/wheel infrastructure, how can we do anything to avoid a rift?
well -- I think the problem is that while SOME of the needs of the scientific community can be addressed within the pip-wheel infrastructure, they can't all be addressed there. And (I wish I could find that thread), I'm pretty sure Travis said that before he started conda, he talked to the PyPA folks (before it was called that), and was told that he'd be best off going off and building something new -- pip+wheel were just getting started, and were not going to support what he needed -- certainly not anytime soon.
I'd like to think so. The goal of pip is to be the baseline Python package manager. We're expecting Linux distributions to build their system packages via wheel, why can't Anaconda? Part of the problem here, to my mind, is that it's *very* hard for the outsider to separate out (Ana)conda-as-a-platform versus conda-as-a-tool, versus conda-as-a-distribution-format.
absolutely. When you say "build their system packages via wheel" -- what does that mean? and why wheel, rather than, say, pip + setuptools? you can put whatever you want in a conda build script -- the current standard practice baseline for python packages is: $PYTHON setup.py install it's that simple. And I don't know if this is what it actually does, but essentially the conda package is all the stuff that that script added to the environment. So if changing that invocation to use pip would get us some better meta data or what have you, then by all means, we should change that standard of practice. (but it seems like an odd thing to have to use the package manager to build the package correctly -- shouldn't' that be distutils or setuptools' job?
* Toss out your current setup
* Install Anaconda (or miniconda) * Switch from virtualenv to conda environments * re-install all your dependencies
Yeah, that's hopeless. And worse still is the possibility that a pure Python user might have to do that just to gain access to a particular package.
exactly. I keep saying this, and I ought to ask - is there *any* likelihood
that a package would formally abandon any attempt to provide binary distributions for Windows (or OSX, or whatever) except in conda format?
absolutely -- probably not the core major packages -- but I think that's already the case with a number of more domain specific packages (including my stuff -- OK, I have maybe three users outside my organization now, but still..) And numpy and scipy don't yet have binary wheels for Windows up -- though that is being worked on. So PyPI users will be told "install from source yourself or
switch to conda". If there's no intention for that to ever happen,
Sadly, we are already there for minor packages, at least. Oh wait, not so minor -- the geospatial stack is not well supported on PyPI. I don't think there are pynetcdf or pyhdf binaries, etc... On the other hand, some domain specific stuff is being supported, like scikit-learn, for instance: http://scikit-learn.org/stable/install.html#install-official-release lot of the "conda vs pip" discussion is less relevant. At the moment,
there are projects like numpy that don't distribute Windows wheels on PyPI, but Christoph Gohlke has most of them available,
yes, but in a form that is not redistributable on PyPi...
and in general such projects seem to be aiming to move to wheels. So there aren't any practical cases of "conda-only" packages.
I'm not sure about "conda-only", but not pip-installable is all too common.
And for even that to work, we need a way for everything installable by pip to be installable within that conda environment -- which we could probably achieve.
See above - without some serious resource on the conda side, that "everything" is unlikely.
conda can provide a full, pretty standard, python environment -- why do you think "everything" is unlikely?
I do think that we could push
pip+pypi+wheel a little further to better support at least the python-centric stuff -- i.e. third party libs, which would get us a lot farther.
Agreed. Understanding the *actual* problem here is important though (see my other post about clarifying why dynamic linking is so important).
yes -- I don't know that that's answered yet -- but the third party dependency problem is real -- whether it is addressed by supporting dynamic linking, or by making it easier to find, build, distribute compatible static libs for package maintainers to use is still up in the air.
And again, it's not just the scipy stack -- there is stuff like image
manipulation packages, etc, that could be better handled.
Well, (for example) Pillow provides wheels with no issue, so I'm not sure what you're thinking of here?
the Pillow folks have figured it out -- and are doing the work -- but it took years, and we had a lot of pain building OS-X binaries of PIL during that time. I think dynamic libs would be a good thing for packages like PIL, but maybe static is fine (I presume they are doing static now...)
And the geospatial packages are a mess, too - is that "scientific"? -- I
don't know, but it's the "new hotness" in web development.
I can't really comment on that one, as I've never used them. Does that make me untrendy? :-)
Absolutely!

-Chris

--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R
(206) 526-6959 voice
7600 Sand Point Way NE
(206) 526-6329 fax
Seattle, WA 98115
(206) 526-6317 main reception
Chris.Barker@noaa.gov
Sigh. I really am going to try to stop monopolising this thread - but
you keep making good points I feel I "have" to respond to :-) I'll try
to keep to essentials.
On 19 May 2015 at 22:11, Chris Barker
On Tue, May 19, 2015 at 10:58 AM, Paul Moore
wrote:
you can grab and build the latest Python3.5 inside a conda environment just as well. Or are you using python.org builds for alpha versions, too?
Yep, I basically never build Python myself, except when developing it.
Oh, and as a conda environment sits at a higher level than python, it's actually easier to set up an environment specifically for a particular version of python.
And anyone could put up a conda package of Python 3.5 alpha as well --- once the build script is written, it's pretty easy. But again -- the more than one way to do it problem.
Yeah, I'm trying to never build anything for myself, just consume binaries. Having all binaries built by "the conda people" is a bottleneck. Having pip auto-build wheels once and reuse them (coming in the next version, yay!) is good enough. Having projects upload wheels to PyPI is ideal. Building wheels myself from wininsts provided by others, or by doing the awkward work once and hosting them in a personal index, is acceptable.

I've spent too much of my life trying to build other people's C code. I'd like to stop now :-)

My needs don't extend to highly specialised *and* hard to build stuff. The specialised stuff I care about is usually not hard to build, and the hard to build stuff is usually relatively mainstream (basic numpy and scipy as opposed to the specialised scientific stuff)
We may want to look back at a thread on this list where Travis Oliphant talks about why he built conda, etc. (I can't find it now -- maybe someone with better google-fu than me can. It think it was a thread on this list, probably about a year ago)
or read his Blog Post:
http://technicaldiscovery.blogspot.com/2013/12/why-i-promote-conda.html
Thanks for the link - I'll definitely read that.
One of the key points is that when they started building conda -- pip+wheel were not mature, and the core folks behind them didn't want to support what was needed (dynamic libs, etc) -- and still don't.
Well, I'm not sure "don't want to" is accurate these days. "Think the problem is harder than you're making it sound" may be accurate, as might "have no resource to spare to do the work, but would like others to".
My biggest worry is that at some point, "if you want numpy/scipy, you should use conda" becomes an explicit benefit of conda,
That is EXACTLY what the explicit benefit of conda is. I think we'll get binary wheels for numpy and scipy up on PyPi before too long, but the rest of the more complex stuff is not going to be there.
I did specifically mean numpy and scipy. People are using those, and pandas and matplotlib, for a lot of non-scientific things (business data analysis as a replacement for Excel, for example). Forcing such people to use Anaconda seems like a mistake - the scientific community and the business community have very different perspectives. Conceded, numpy/scipy are *currently* hard to get - we're in the middle of a transition from wininst (which numpy/scipy did supply) to wheel. But the *intention* is that wheels will be an acceptable replacement for wininst installers, so giving up before we reach that goal would be a shame.
This is the key point. The decision was made to "bless" pip as the official Python package manager. Should we revisit that decision?
I'm not sure I want to be the one to bring that up ;-)
Well, Nick Coghlan is probably better placed to comment on that. I don't really understand his vision for pip and wheel as something that other tools build on - and whether or not the fact that the scientific community feels the need to build a completely independent infrastructure is a problem in that context. But if it is a problem, maybe that decision *should* be reviewed.
When you say "build their system packages via wheel" -- what does that mean? and why wheel, rather than, say, pip + setuptools?
Again, that's something for Nick to comment on - I don't know how wheel (it's wheel more than pip in this context, I believe) fits into RPM/deb building processes. But I do know that's how he's been talking about things working. I can't see why conda would be any different.
you can put whatever you want in a conda build script -- the current standard practice baseline for python packages is:
$PYTHON setup.py install
That sounds like conda doesn't separate build from install - is there a conda equivalent of a "binary distribution" like a wheel? There are a lot of reasons why pip/wheel is working hard to move to a separation of build and install steps, and if conda isn't following that, I'd like to know its response to those issues. (I don't know what all the issues are - it'd take some searching of the list archives to find previous discussions. One obvious one is "we don't want end users to have to have a compiler").
it's that simple. And I don't know if this is what it actually does, but essentially the conda package is all the stuff that that script added to the environment. So if changing that invocation to use pip would get us some better meta data or what have you, then by all means, we should change that standard of practice.
Well, it sounds like using "setup.py bdist_wheel" and then repackaging up the contents of the wheel might be easier than playing "spot what changed" with a setup.py install invocation. But I'm no expert. (If the response to that is "wheel doesn't handle X" then I'd rather see someone offering to help *improve* wheel to handle that situation!)
(but it seems like an odd thing to have to use the package manager to build the package correctly -- shouldn't that be distutils or setuptools' job?)
Wheels can be built using setuptools (bdist_wheel) or just packaging up some files manually. All pip does in this context is orchestrate the running of setup.py bdist_wheel. You lot use "conda" to mean the format, the package manager, and the distribution channel - give me some slack if I occasionally use "pip" when I mean "wheel" or "setuptools" :-) :-)
Sadly, we are already there for minor packages, at least. Oh wait, not so minor -- the geospatial stack is not well supported on PyPi. I don't think there are pynetcdf or pyhdf binaries, etc...
There is a point where "specialised enough to only matter to Acaconda's target audience" is OK. And Christoph Gohlke's distributions fill a huge gap (losing that because "you should use conda" would be a huge issue, IMO).
lot of the "conda vs pip" discussion is less relevant. At the moment, there are projects like numpy that don't distribute Windows wheels on PyPI, but Christoph Gohlke has most of them available,
yes, but in a form that is not redistributable on PyPi...
Well, it's not clear to me how insurmountable that problem is - his page doesn't specifically mention redistribution limitations (I presume it's license issues?) Has anyone discussed the possibility of addressing the issues? If it is licensing (of MKL?) then could the ones without license implications be redistributed? Could the PSF assist with getting a redistribution license for the remainder? I've no idea what the issues are - I'm just asking if there are approaches no-one has considered.
and in general such projects seem to be aiming to move to wheels. So there aren't any practical cases of "conda-only" packages.
I'm not sure about "conda-only", but not pip-installable is all too common.
My benchmark is everything that used to distribute wininst installers or binary eggs distributes wheels. I'd *like* to go beyond that, but we haven't got the resources to help projects that never offered binaries. Asking (and assisting) people to move from the old binary standards to the new one should be a reasonable goal, though. (I appreciate this is a less strict criterion than my previous comments about "losing access to the scientific stack" could be taken as implying. My apologies - some of my earlier comments probably included a bit too much rhetoric :-))
And for even that to work, we need a way for everything installable by pip to be installable within that conda environment -- which we could probably achieve.
See above - without some serious resource on the conda side, that "everything" is unlikely.
conda can provide a full, pretty standard, python environment -- why do you think "everything" is unlikely?
Because the pip/wheel ecosystem relies on projects providing their own wheels. Conda relies on the "conda guys" building packages. It's the same reason that a PyPI build farm is still only a theory :-) Or were you assuming that package authors would start providing conda packages? Paul
lost track of where in the thread this was, but here's a conda recipe I found on GitHub: https://github.com/menpo/conda-recipes/tree/master/libxml2

don't know anything about it.....

-Chris
On 19 May 2015 at 23:32, Chris Barker
lost track of where in the thred this was, but here's a conda recipe I found on gitHub:
https://github.com/menpo/conda-recipes/tree/master/libxml2
don't know anything about it.....
OK, I'm still misunderstanding something, I think. As far as I can see, all that does is copy a published binary and repack it. There's no "build" instructions in there. Paul
On Wed, May 20, 2015 at 1:04 AM, Paul Moore
https://github.com/menpo/conda-recipes/tree/master/libxml2
don't know anything about it.....
OK, I'm still misunderstanding something, I think. As far as I can see, all that does is copy a published binary and repack it. There's no "build" instructions in there.
indeed -- that is one way to build a conda package, as you well know! maybe no one has done a "proper" build from scratch recipe for that one -- or maybe Continuum has, and we'll find out about it from David....

-Chris
On Tue, May 19, 2015 at 3:04 PM, Paul Moore
Yeah, I'm trying to never build anything for myself, just consume binaries. Having all binaries built by "the conda people" is a bottleneck.
it is -- though the group of "conda people" is growing...
Having pip auto-build wheels once and reuse them (coming in the next version, yay!) is good enough. Having projects upload wheels to PyPI is ideal. Building wheels myself from wininsts provided by others or by doing the awkward work once and hosting them in a personal index is acceptable.
you can build conda packages from wininsts as well, actually. Haven't tried it myself yet. (I'm hoping from bdist_mpkgs on the Mac, too, though those are getting rare) I've spent too much of my life trying to build other people's C code.
I'd like to stop now :-)
One of the key points is that when they started building conda --
no kidding -- and this whole thread is about how to help more and more people stop... pip+wheel
were not mature, and the core folks behind them didn't want to support what was needed (dynamic libs, etc) -- and still don't.
Well, I'm not sure "don't want to" is accurate these days. "Think the problem is harder than you're making it sound" may be accurate, as might "have no resource to spare to do the work, but would like others to".
well, yeah, though I'm still not sure how much support there is. And I do think that no one wants to extend pip to be able to install Perl, for instance ;-)
My biggest worry is that at some point, "if you want numpy/scipy, you
should use conda" becomes an explicit benefit of conda,
That is EXACTLY what the explicit benefit of conda is. I think we'll get binary wheels for numpy and scipy up on PyPi before too long, but the rest of the more complex stuff is not going to be there.
I did specifically mean numpy and scipy. People are using those, and pandas and matplotlib,
That would be the core "scipy stack", and you can't, as of today, pip install it on Windows (you can on the Mac), and you can get wheel from the Gohlke repo, or wininst from the scipy site. That is going to change, hopefully soon, and to be fair the technical hurdles have to do with building a good LAPACK without licensing issues, and figuring out if we need to support pre-SSE2 hardware, etc ... not a pip or pypi limitation. But the *intention* is that wheels will be an acceptable
replacement for wininst installers, so giving up before we reach that goal would be a shame.
I don't think we will -- some folks are doing some great work on that -- and it looks to be close.
Again, that's something for Nick to comment on - I don't know how wheel (it's wheel more than pip in this context, I believe) fits into RPM/deb building processes. But I do know that's how he's been talking about things working. I can't see why conda would be any different.
yup -- it probably does make sense to do with conda what is done with rpm. Except that conda already has a bunch of python-aware stuff in it...
$PYTHON setup.py install
That sounds like conda doesn't separate build from install - is there a conda equivalent of a "binary distribution" like a wheel?
yes, a conda package is totally binary. But when you build one, it does this:

1) create a new, empty conda environment
2) install the build dependencies of the package at hand
3) download or unpack the source code
4) build and install the package (into this special, temporary, conda environment)
5) package up all the stuff that got installed into a conda package

I don't know how it does it, but it essentially finds all the files that were added by the install stage, and puts those in the package. Remarkably simple, and I think, kind of elegant.

I think it installs, rather than simply building, so that conda itself doesn't need to know anything about what kind of package it is -- what it wants the final package to be is an archive of everything that you want installed, where you want it installed. So actually installing it is the easiest way to do that.

for instance, a C library build script might be:

    ./configure
    make
    make install

There are a lot of reasons why pip/wheel is working hard to move to a separation of build and install steps, and if conda isn't following that,
I think wheel and conda are quite similar in that regard, actually.
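For concreteness, a recipe along those lines is just a meta.yaml plus a build script. A minimal sketch for a hypothetical C library -- the name, version, and URL are invented, but conda-build really does export $PREFIX to the build script as the install target:

    # meta.yaml (name/version/URL are hypothetical)
    package:
      name: somelib
      version: "1.0"
    source:
      url: http://example.com/somelib-1.0.tar.gz

    # build.sh -- run by conda-build inside the temporary environment
    ./configure --prefix=$PREFIX
    make
    make install

Everything that lands under $PREFIX during "make install" is what gets archived into the conda package.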
it's that simple. And I don't know if this is what it actually does, but essentially the conda package is all the stuff that that script added to the environment. So if changing that invocation to use pip would get us some better meta data or what have you, then by all means, we should change that standard of practice.
Well, it sounds like using "setup.py bdist_wheel" and then repackaging up the contents of the wheel might be easier than playing "spot what changed" with a setup.py install invocation. But I'm no expert. (If the response to that is "wheel doesn't handle X" then I'd rather see someone offering to help *improve* wheel to handle that situation!)
for python packages, maybe. But the way they do it now is more generic -- why have a bunch of python package specific code you don't need?
(but it seems like an odd thing to have to use the package manager to build the package correctly -- shouldn't that be distutils or setuptools' job?)
Wheels can be built using setuptools (bdist_wheel) or just packaging up some files manually. All pip does in this context is orchestrate the running of setup.py bdist_wheel. You lot use "conda" to mean the format, the package manager, and the distribution channel - give me some slack if I occasionally use "pip" when I mean "wheel" or "setuptools" :-) :-)
fair enough. but does making a wheel create some meta-data, etc. that doing a raw install does not create? i.e. is there something in a wheel that conda maybe should use that it's not getting?

There is a point where "specialised enough to only matter to Anaconda's target audience" is OK. And Christoph Gohlke's distributions fill a huge gap (losing that because "you should use conda" would be a huge issue, IMO).
His repo is fantastic, yes, but it's an awful lot of reliance on one really, really, productive and smart guy! As a note -- he is often the first to find and resolve a lot of Windows build issues on a variety of packages. AND I think some of the wheels on pypi are actually built by him, and re-distributed by the package maintainers.
lot of the "conda vs pip" discussion is less relevant. At the moment,
there are projects like numpy that don't distribute Windows wheels on PyPI, but Christoph Gohlke has most of them available,
yes, but in a form that is not redistributable on PyPi...
Well, it's not clear to me how insurmountable that problem is - his page doesn't specifically mention redistribution limitations (I presume it's license issues?)
yes.
Has anyone discussed the possibility of addressing the issues? If it is licensing (of MKL?)
yes.
then could the ones without license implications be redistributed?
probably -- and, as above, some are (at least one I know of was, anyway :-) )
Could the PSF assist with getting a redistribution license for the remainder? I've no idea what the issues are - I'm just asking if there are approaches no-one has considered.
It's been talked about, recently, too -- the trick, as I understand it, is that while the Intel license allows re-distribution, you are supposed to provide some sort of info for the user about the license restrictions. When you "pip install" something, there is no way for the user to get that license info. Or something like that. Also -- the scipy devs in general really prefer a fully open source solution. But one is coming soon, so we're good.
I'm not sure about "conda-only", but not pip-installable is all too common.
My benchmark is everything that used to distribute wininst installers or binary eggs distributes wheels.
we're probably getting pretty close there. wxPython is an exception, but Robin is distributing the new and improved beta version as wheels. I think he's got enough on his plate that he doesn't want to wrestle with the old and crufty build system. I don't know what else there is of the major packages.
I'd *like* to go beyond that, but we haven't got the resources to help projects that never offered binaries. Asking (and assisting) people to move from the old binary standards to the new one should be a reasonable goal, though.
yup. which makes me think -- maybe not that hard to do a wininst to wheel converter for wxPython -- that would be nice. We also need it for the Mac, and that would be harder -- he's got some trickery in placing the libs in that one...
And for even that to work, we need a way for everything installable by pip to be installable within that conda environment -- which we could probably achieve.
See above - without some serious resource on the conda side, that "everything" is unlikely.
conda can provide a full, pretty standard, python environment -- why do you think "everything" is unlikely?
Because the pip/wheel ecosystem relies on projects providing their own wheels. Conda relies on the "conda guys" building packages. It's the same reason that a PyPI build farm is still only a theory :-) Or were you assuming that package authors would start providing conda packages?
no -- THAT is unlikely. I meant that if we can get the automatic build-a-conda-package-from-a-pypi-sdist working, then we're good to go. Or maybe build-a-conda-package-from-a-binary-wheel. After all, if a package can be pip-installed, then it should be able to be pip-installed inside a conda environment, which means it should be able to be done automatically. I've done a bunch of these -- it's mostly boilerplate.

-Chris
On Tue, May 19, 2015 at 6:04 PM, Chris Barker wrote:

no -- THAT is unlikely. I meant that if we can get the automatic build-a-conda-package-from-a-pypi-sdist working, then we're good to go. Or maybe build-a-conda-package-from-a-binary-wheel.
    conda skeleton pypi pyyaml

https://github.com/conda/conda-build/issues/397 (adds a --version-compare to compare the conda version w/ the pypi version)
On 20 May 2015 at 00:04, Chris Barker
yup. which makes me think -- maybe not that hard to do a wininst to wheel converter for wxPython -- that would be nice. We also need it for the Mac, and that would be harder -- he's got some trickery in placing the libs in that one...
"wheel convert <wininst file>" already does that. I wrote it, and use it a lot. It doesn't handle postinstall scripts (because wheels don't yet) but otherwise should be complete. Paul
On Mon, May 18, 2015 at 12:50 PM, David Mertz
This pertains more to the other thread I started, but I'm sort of becoming convinced--especially by Paul Moore's suggestion there--that the better approach is to grow conda (the tool) rather than shoehorn conda packages into pip. Getting pip to recognize the archive format of conda would be easy enough alone, but that really doesn't cover the fact that 'conda ~= pip+virtualenv', and pip alone simply should not try to grow that latter aspect itself. Plus pip is not going to be fully language agnostic, for various reasons, but including the fact that apt-get and yum and homebrew and ports already exist.
So it might make sense to actually allow folks to push conda to budding web developers, if conda allowed installation (and environment management) of sdist packages on PyPI. So perhaps it would be good if *this* worked:
% pip install conda
% conda install scientific_stuff
% conda install --sdist django_widget   # we know to look on PyPI
Maybe that flag is mis-named, or could be omitted altogether. But there's no conceptual reason that conda couldn't build an sdist fetched from PyPI into a platform specific binary matching the current user machine (and do all the metadata dependency and environment stuff the conda tool does).
Would this be different than:

    # miniconda
    conda install pip
    conda install scientific_stuff
    pip install django_widget

With gh:conda/conda-env, pip packages are in a pip: section of the environment.yml file. For example:

    conda env export -n root

Then, to install pip: packages with pip:

    conda create -n example -f ./environment.yml
On Mon, May 18, 2015 at 3:17 AM, Paul Moore
wrote: On 17 May 2015 at 23:50, Chris Barker
wrote: I guess the key thing here for me is that I don't see pushing conda to budding web developers -- but what if web developers have the need for a bit of the scipy stack? or???
We really don't have a good solution for those folks.
Agreed. My personal use case is as a general programmer (mostly sysadmin and automation type of work) with some strong interest in business data analysis and a side interest in stats.
For that sort of scenario, some of the scipy stack (specifically matplotlib and pandas and their dependencies) is really useful. But conda is *not* what I'd use for day to day work, so being able to install via pip is important to me. It should be noted that installing via pip *is* possible - via some of the relevant projects having published wheels, and the rest being available via Christoph Gohlke's site either as wheels or as wininsts that I can convert. But that's not a seamless process, so it's not something I'd be too happy explaining to a colleague should I want to share the workload for that type of thing.
Paul
-- The dead increasingly dominate and strangle both the living and the not-yet born. Vampiric capital and undead corporate persons abuse the lives and control the thoughts of homo faber. Ideas, once born, become abortifacients against new conceptions.
On 19 May 2015 at 21:19, Wes Turner
Would this be different than:
# miniconda
conda install pip
conda install scientific_stuff
pip install django_widget
Having tried that in the past, I can say that I *very* rapidly got completely lost as to which packages I'd installed with pip, and which with conda. Uninstalling and/or upgrading with the "wrong" package manager caused all sorts of fun. Paul
On Tue, May 19, 2015 at 1:19 PM, Wes Turner
So it might make sense to actually allow folks to push conda to budding
web developers, if conda allowed installation (and environment management) of sdist packages on PyPI. So perhaps it would be good if *this* worked:
% pip install conda
% conda install scientific_stuff
% conda install --sdist django_widget   # we know to look on PyPI
Would this be different than:
# miniconda
conda install pip
conda install scientific_stuff
pip install django_widget
yes -- in the latter, you have to START with the conda environment. But yes -- that should be doable. If conda understands pip metadata for dependencies and conda provides pip-understandable metadata, then that could work fine. With gh:conda/conda-env, pip packages are in a pip: section of the
environment.yml file For example:
conda env export -n root
Then, to install pip: packages with pip:
conda create -n example -f ./environment.yml
good point --- you'd also want a way for conda to easily re-create the environment, with the pip-installed stuff included. But again, do-able.

-Chris
On Tue, May 19, 2015 at 4:17 PM, Chris Barker wrote:

good point --- you'd also want a way for conda to easily re-create the environment, with the pip-installed stuff included. But again, do-able.
$ deactivate; conda install conda-env; conda env export -n root | tee environment.yml
name: root
dependencies:
- binstar=0.10.3=py27_0
- clyent=0.3.4=py27_0
- conda=3.12.0=py27_0
- conda-build=1.12.1=py27_0
- conda-env=2.1.4=py27_0
- jinja2=2.7.3=py27_1
- markupsafe=0.23=py27_0
- openssl=1.0.1k=1
- pip=6.1.1=py27_0
- pycosat=0.6.1=py27_0
- python=2.7.9=1
- python-dateutil=2.4.2=py27_0
- pytz=2015.2=py27_0
- pyyaml=3.11=py27_0
- readline=6.2=2
- requests=2.7.0=py27_0
- setuptools=16.0=py27_0
- six=1.9.0=py27_0
- sqlite=3.8.4.1=1
- tk=8.5.18=0
- yaml=0.1.4=1
- zlib=1.2.8=0
- pip:
  - conda-build (/Users/W/-wrk/-ce27/pypfi/src/conda-build)==1.12.1+46.g2d17c7f
  - dulwich==0.9.7
  - hg-git==0.6.1
#1 is pretty straightforward. An entry-point format Python
pre/post/etc. script may do.
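To make that concrete, here's a minimal sketch of what an entry-point style install hook could look like -- none of this metadata exists yet, so the key name and the hook signature below are purely illustrative:

    # in the project's metadata (hypothetical key):
    #   postinstall = mypackage.hooks:postinstall

    # mypackage/hooks.py -- written in Python rather than shell,
    # so the same wheel stays cross-platform
    def postinstall(paths):
        # 'paths' is an assumed argument: the mapping of install
        # categories (bindir, mandir, ...) to concrete directories
        print("man pages installed into", paths["mandir"])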
I have some ideas for the FHS, though I fear it's full of bikesheds:
1. Allow all GNU directory variables as .data/* subdirectories
(https://www.gnu.org/prep/standards/html_node/Directory-Variables.html).
The distutils names will continue to be allowed.
packagename-1.0.data/mandir/...
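So a wheel using those categories might lay out its .data directory like this (a sketch -- the file names are invented, the category names are the GNU ones):

    packagename-1.0.data/
        bindir/packagename
        mandir/man1/packagename.1
        sysconfdir/packagename.conf
        docdir/README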
2. Make data_files useful again. Interpolate path variables into
distutils data_files using $template syntax. (Only allow at
beginning?)
data_files=[('$mandir/xyz', ['manfile', 'other_man_file'])]
In addition to $bindir, $mandir, etc. it will be important to allow
the package name and version to be interpolated into the install
directories.
Inside the wheel archive, you will get
packagename-1.0.data/mandir/manfile and
packagename-1.0.data/mandir/other_man_file
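Put together, a setup() call under this proposal might look like the following sketch (the $mandir/$datadir variables and the $name-$version interpolation are the proposed, not-yet-implemented syntax):

    from setuptools import setup

    setup(
        name='packagename',
        version='1.0',
        data_files=[
            # proposed: interpolate the GNU directory variables...
            ('$mandir/man1', ['docs/packagename.1']),
            # ...and the package name/version
            ('$datadir/$name-$version', ['data/defaults.cfg']),
        ],
    )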
3. Write the install paths (the mapping from $bindir, $mandir, $prefix
etc. to the actual paths used) to one or more of a .py, .json, or
.dist-info/* based on new metadata in WHEEL:
install-paths-to: wheel/_paths.py
It is critical that this be allowed to work without requiring the end
user to look for it with pkg_resources or its pals. It's also good to
only write it if the installed package actually needs to locate its
file categories after it has been installed.
This will also be written inside the wheel itself with relative paths
to the .data/ directory.
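Concretely, the file named by install-paths-to might end up looking something like this after installation (a sketch -- the actual paths are whatever the installer chose):

    # wheel/_paths.py, generated at install time; the copy inside the
    # wheel archive holds paths relative to the .data/ directory instead
    prefix = '/usr/local'
    bindir = '/usr/local/bin'
    mandir = '/usr/local/share/man'

so the package can just do "from wheel import _paths", with no pkg_resources lookup needed.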
4. Allow configurable & custom paths. The GNU paths could be
configured relative to the distutils paths as a default. We might let
the user add additional paths with a configuration dict.
paths = {
"foo" : "$bar/${quux}",
"bar" : "${baz}/more/stuff",
"baz" : "${quux}/again",
"quux": "larry"
}
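Resolving such a dict is only a few lines; here's a minimal sketch (assuming the references form no cycles):

    import re

    paths = {
        "foo": "$bar/${quux}",
        "bar": "${baz}/more/stuff",
        "baz": "${quux}/again",
        "quux": "larry",
    }

    def resolve(name, paths):
        # expand $var / ${var} references recursively (no cycle detection)
        value = paths[name]
        pattern = re.compile(r"\$\{?(\w+)\}?")
        match = pattern.search(value)
        while match:
            value = value[:match.start()] + resolve(match.group(1), paths) + value[match.end():]
            match = pattern.search(value)
        return value

    print(resolve("foo", paths))   # larry/again/more/stuff/larry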
5. On Windows, no one will really care where most of these files go,
but they probably won't mind if they are installed into separate
directories. Come up with sensible locations for the most important
categories.
On 13 April 2015 at 16:02, Daniel Holth
#1 is pretty straightforward. An entry-point format Python pre/post/etc. script may do.
There's metadata 2.0 information for this. It would be sensible to follow that definition where it applies, but otherwise yes, this shouldn't be hard. Some thoughts, though:

1. Some thought should be put into how we ensure that pre/post install/remove scripts are cross-platform. It would be a shame if a wheel was unusable on Windows for no reason other than that the postinstall script was written as a bash script. Or on Unix because the postinstall script tried to write Windows start menu items.

2. It's worth considering "appropriate use" of such scripts. The Windows start menu example is relevant here - I can easily imagine users requesting something like that for a wheel they want to install into the system Python, but it's completely inappropriate for installing into a virtualenv. To an extent, there's nothing we can (or maybe even should) do about this - projects that include inappropriate install scripts will get issues raised or will lose users, so the problem is self-correcting to an extent. But it's probably worth including, in the implementation, some work to add appropriate documentation to the packaging user guide about "best practices" for pre/post-install/remove scripts (hmm, a glossary entry with a good name for these beasts would also be helpful :-))

Paul
participants (18): Ben Finney, Chris Barker, Chris Barker - NOAA Federal, Daniel Holth, David Cournapeau, David Mertz, Donald Stufft, Jonathan Helmus, Kevin Horn, Leonardo Rochael Almeida, Mark Hammond, Nick Coghlan, Paul Moore, Robert Collins, Steve Dower, Tim Golden, Vincent Povirk, Wes Turner