[Distutils] Current Python packaging status (from my point of view)
chris.barker at noaa.gov
Wed Nov 2 19:08:20 EDT 2016
> > There seems to be some animosity among pip supporters and conda
> > supporters, or at least a perception that there is.
> I don't know whether there is animosity, but there is certainly
> tension. Speaking personally, I care a lot about having the option to
> prefer pip.
indeed -- and you have made herculean efforts to make that possible :-)
> There are others in the scientific community who feel we
> should standardize on conda. I think this is the cause of Chris'
> frustration. If we could all use conda instead of pip, then this
> would make it easier for him in terms of packaging, because he would
> not have to support pip (Chris please correct me if I'm wrong).
yup -- nor would you :-)
My personal frustration comes with my history -- I've been in this
community for over 15 years -- and I spent a lot of effort back in the day
to make packages available for the Mac (before pypi, pip, before wheels,
...). And I found I was constantly re-figuring out how to build the
dependent libraries needed, and statically linking everything, etc... It
sucked. A lot of work, and not at all fun (for me, anyway -- maybe some
folks enjoy that sort of thing).
So I started a project to share that effort, and to build a bit of
infrastructure. I also started looking into how to handle dependent libs
with pip+wheel -- I got no support whatsoever for that. I got frustrated
'cause it was too hard, and I also felt like I was fighting the tools. I
did not get far.
Matthew ended up picking up that effort and really making it work, and got
all the core SciPy stuff out there as binary wheels -- really great!
But then I discovered conda -- and while I was resistant at first, I found
that it was a much nicer environment to do what I needed to do. It started
not because of anything specific about conda, but because someone had
already built a bunch of the stuff I needed -- nice!
But the breaking point was when I needed to build a package of my own:
py_gd -- it required libgd, which required libpng, libjpeg, libtiff ---
some of those come out of the box on a Mac, it's all available from
homebrew, and all pretty common on Linux -- but I have users that are on
Windows, or on a Mac and can't grok homebrew or macports. And, frankly,
not all of my development team can either.
But guess what? libgd wasn't in conda, but everything else I needed was --
this is all pretty common stuff -- other people have solved the problem and
the system supports installing libs, so I can just use them. My work was SO
MUCH easier. And especially my users have it so much easier, 'cause I can
just give them a conda package.
And while that particular example would have been solvable with binary
wheels, as things get more complicated, it gets hard or impossible to do.
So if I'm happy with conda -- why the frustration? Some is the history, but
also there are two things:
1) pip is considered "the way" to handle dependencies -- my users want to
use it, and they ask me for help using it, and I don't want to abandon my
users -- so more work for me.
2) I see people over and over again starting out with pip -- 'cause that's
what you do. Then hitting a wall, then trying Enthought Canopy, then
trying Anaconda, then ending up with a tangled mess of multiple systems
where who knows what python "pip" is associated with. This is why "there is
only one way to do it" would be nice.
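For anyone stuck in that tangled state, here is a minimal sketch (not part of
any tool, just an illustration) of how to ask a given Python which interpreter
it is and which pip belongs to it. The function names are made up for this
example, and the "conda-meta" directory check is only a heuristic, based on the
fact that conda environments keep their package metadata there.

```python
import os
import subprocess
import sys


def describe_python_environment():
    """Report which interpreter is running and whether it looks like a conda env."""
    prefix = sys.prefix
    return {
        "executable": sys.executable,
        "prefix": prefix,
        # Heuristic: conda environments keep package metadata in <prefix>/conda-meta.
        "looks_like_conda": os.path.isdir(os.path.join(prefix, "conda-meta")),
    }


def pip_command(*args):
    """Build a pip command line tied to the interpreter running this script."""
    return [sys.executable, "-m", "pip", *args]


if __name__ == "__main__":
    for key, value in describe_python_environment().items():
        print(f"{key}: {value}")
    # "pip --version" reports where pip lives and which Python it belongs to.
    result = subprocess.run(pip_command("--version"), capture_output=True, text=True)
    print(result.stdout.strip() or result.stderr.strip())
```

Running pip as "python -m pip" rather than as a bare "pip" ties it to the
interpreter you actually invoked, which sidesteps most of the "which pip is
this?" confusion.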
And I'm pretty sure that "wall" will always be there -- things have gotten
better with wheels -- between Matthew's efforts and manylinux, most
everybody can get the core SciPy stack with pip -- very nice!
But not pyHDF, netCDF5, gdal, shapely, ... (to name a few that I need to
work with). And these are ugly: which means very hard for end-users to
build, and very hard for people to package up into wheels (is it even
And of course, all is not rosy with conda either -- the conda-forge effort
has made phenomenal progress, but it's really hard to manage that huge
stack of stuff (I'm using the time spent writing this as a break
from conda-forge dependency hell ...). But in a way, I think we'd be better
off if there were more focus on conda-forge rather than on the effort to
shoehorn pip into solving more of the dependency problem.
And the final frustration -- I think conda is still pretty misunderstood and
misrepresented as a solution only (or primarily) for "data scientists" or
people doing interactive data exploration, or "scientific programmers",
whereas it's actually a pretty good solution to a lot of people's problems.
> Although there are clear differences in the audience for pip and
> conda, there is also a very large overlap. In practice the majority
> of users could reasonably choose one or the other as their starting
> point.
> Of course, one may come to dominate this choice over the
> other. At the point where enough users become frustrated with the
> lack of pip wheels, conda will become the default. If pip wheels are
> widely available, that makes the pressure to use conda less. If we
> reach this tipping point it will become wasteful of developer effort
> to make pip wheels / conda packages, the number and quality of binary
> packages will drop, and one of these package managers will go into
perhaps so -- but it will be a good while! The endorsement of the
"official" community really does keep pip going. And, of course, it works
great for a lot of use-cases.
If it were all up to me (which of course it's not) -- I'd say keep
pip / PyPI fully supported for all the stuff it's good at -- pure Python
and small/no-dependency extension modules -- and folks can go to conda
when/if they need more.
After all, you can use pip from within a conda environment just fine :-)
Christopher Barker, Ph.D.
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker at noaa.gov