[Distutils] Handling the binary dependency management problem

Chris Barker chris.barker at noaa.gov
Tue Dec 3 23:18:15 CET 2013


On Tue, Dec 3, 2013 at 12:48 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:

> Because it already works for the scientific stack, and if we don't provide
> any explicit messaging around where conda fits into the distribution
> picture, users are going to remain confused about it for a long time.
>
Do we have to have explicit messaging for every useful third-party package
out there?

> I'm still confused as to why packages need to share external dependencies
> (though I can see why it's nice...) .
>
> Because they reference shared external data, communicate through shared
> memory, or otherwise need compatible memory layouts. It's exactly the same
> reason all C extensions need to be using the same C runtime as CPython on
> Windows: because things like file descriptors break if they don't.
>

OK -- maybe we need a better term than shared external dependencies -- that
makes me think shared library. Also, even the scipy stack is not as
dependent on the build env as we seem to think it is -- I don't think there is
any reason you can't use the "standard" MPL with Gohlke's MKL-built numpy,
for instance. And I'm pretty sure that even scipy and numpy don't need to
share their build environment more than any other extension (i.e. they
could use different BLAS implementations, etc.). The numpy version matters, but
that's handled by the usual dependency handling.

The reason Gohlke's repo, Anaconda, and Canopy all exist is that it's
a pain to build some of this stuff, period, not complex compatibility issues
-- and the real pain goes beyond the standard scipy stack (VTK is a killer!)

> Conda solves a specific problem for the scientific community,
>
Well, we are getting Anaconda, the distribution, and conda, the package
manager, conflated here:

Having a nice full distribution of all the packages you are likely to need
is great, but you could do that with wheels, and Gohlke is already doing it
with MSIs (which don't handle dependencies at all -- which is a problem).


> but in their enthusiasm, the developers are pitching it as a general
> purpose packaging solution. It isn't,
>

It's not? Aside from momentum, and all that, could it not be a replacement
for pip and wheel?


> Wheels *are* the way if one or both of the following conditions hold:
>
> - you don't need to deal with build variants
> - you're building for a specific target environment
>
> That covers an awful lot of ground, but there's one thing it definitely
> doesn't cover: distributing multiple versions of NumPy built with different
> options and cohesive ecosystems on top of that.
>

Hmm -- I'm not sure; you could have an Anaconda-like repo built with
wheels, could you not? Granted, it would be easier to make a mistake and
pull incompatible wheels from two different wheelhouses, so there
is a real advantage to conda there.
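
To make the wheelhouse-mixing risk concrete, here is a rough Python sketch
(my own illustration, not anything pip or wheel ships). It pulls the
python/ABI/platform tags out of a wheel filename, following the PEP 427
naming convention, and compares them. The filenames are hypothetical. Note
that the tags say nothing about build variants such as which BLAS was used,
which is exactly the gap being discussed.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    python: str
    abi: str
    platform: str


def parse_wheel_tags(filename: str) -> WheelTags:
    """Extract the compatibility tags from a PEP 427 wheel filename."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    # name-version[-build]-python-abi-platform: 5 fields, or 6 with a build tag
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename layout: {filename!r}")
    python, abi, platform = parts[-3:]
    return WheelTags(python, abi, platform)


def same_target(wheel_a: str, wheel_b: str) -> bool:
    """True if both wheels declare identical python/ABI/platform tags."""
    return parse_wheel_tags(wheel_a) == parse_wheel_tags(wheel_b)


# Hypothetical filenames, for illustration only.
print(same_target("numpy-1.8.0-cp33-none-win32.whl",
                  "scipy-0.13.1-cp33-none-win32.whl"))
print(same_target("numpy-1.8.0-cp33-none-win32.whl",
                  "numpy-1.8.0-1mkl-cp33-none-win_amd64.whl"))
```

Two wheels can pass this check and still be built against different
libraries, so matching tags are necessary but not sufficient.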

> By contrast, conda already exists, and already works, as it was designed
> *specifically* to handle the scientific Python stack.
>
I'm not sure we know how well it works -- it works for Anaconda, and good point
about the scientific stack -- but does it work equally well for other stacks? Or
for mixing and matching?

>  This means that one key reason I want to recommend it for the cases
> where it is a good fit (i.e. the scientific Python stack) is so we can
> explicitly advise *against* using it in other cases where it will just add
> complexity without adding value.
>
I'm actually pretty concerned about this: lately the scipy community has
defined a core "scipy stack":

http://www.scipy.org/stackspec.html

Along with this is a push to encourage users to just go with a scipy
distribution to get that "stack":

http://www.scipy.org/install.html

and

http://ipython.org/install.html

I think this is in response to years of pain of each package trying to
build binaries for various platforms, and keeping it all in sync, etc. I
feel their pain, and "just go with Anaconda or Canopy" is good advice for
folks who want to get the "stack" up and running as easily as possible.

But it does not serve everyone else well -- web developers who need MPL
for some plotting, scientific users who need a desktop GUI toolkit,
Python newbies who want IPython but none of that other stuff...

What would serve all those folks well is a "standard build" of packages --
i.e. built to go with the python.org builds, that can be downloaded with:

pip install the_package.

And I think, with binary wheels, we have the tools to do that.

> Saying nothing is not an option, since people are already confused. Saying
> to never use it isn't an option either, since bootstrapping conda first
> *is* a substantially simpler cross-platform way to get up to date
> scientific Python software on to your system.
>
Again, it is Anaconda that helps here, not conda itself.

> Or
> how about a scientist that wants wxPython (to use Chris' example)?
> Apparently the conda repo doesn't include wxPython, so do they need to
> learn how to install pip into a conda environment? (Note that there's
> no wxPython wheel, so this isn't a good example yet, but I'd hope it
> will be in due course...)


Actually, if only it were as simple as "install pip" -- but as you point out,
there is no wxPython binary wheel. If there were, it would be
compatible with the python.org Python, and maybe not with Anaconda (would conda
catch that?)

> Looks like the conda stack is built around msvcr90, whereas python.org
> Python 3.3 is built around msvcr100.
> So conda is not interoperable *at all* with standard python.org Python
> 3.3 on Windows :-(


Again, Anaconda, the distribution, is not, but I assume conda, the package
manager, is. And IIUC, conda would catch that incompatibility if you
tried to install incompatible packages. That's the whole point, yes? And
this would help with the recent concerns from the Stackless folks about building
a Python binary for Windows with a newer MSVC (see python-dev).
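
For anyone who wants to check which C runtime a given interpreter was built
against, here is a small Python sketch. It is only an illustration: the
helper name and the lookup table are mine, and the table covers just the two
compilers mentioned in this thread (Visual Studio 2008 for msvcr90 and
Visual Studio 2010 for msvcr100). It relies on the compiler string reported
by platform.python_compiler().

```python
import platform
import re

# Visual C++ compiler versions and the C runtime DLL each one links against.
# Deliberately limited to the two runtimes discussed in this thread.
MSC_RUNTIMES = {
    1500: "msvcr90",   # Visual Studio 2008
    1600: "msvcr100",  # Visual Studio 2010
}


def msvc_runtime(compiler=None):
    """Return the C runtime name for an MSVC-built Python, else None."""
    if compiler is None:
        # e.g. "MSC v.1600 32 bit (Intel)" on a python.org Windows build
        compiler = platform.python_compiler()
    match = re.search(r"MSC v\.(\d+)", compiler)
    if match is None:
        return None  # not built with MSVC (or an unrecognised string)
    return MSC_RUNTIMES.get(int(match.group(1)))


# Check the interpreter running this script.
print(msvc_runtime())
```

On a non-Windows build, or one using a compiler outside the table, the
helper returns None rather than guessing.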

> - if there's no wheel
> - and you can't build it from source yourself
> - then you can try "pip install conda && conda init && conda install
> <pkg>" as a fallback option.
> And then we encourage the conda devs to follow the installation
> database standard properly (if they aren't already), so things
> installed with conda play nice with things installed with pip.
> It sounds like we also need to get them to ensure they're using the
> right compiler/C runtime on Windows so their packages are
> interoperable with the standard python.org installers.


Maybe we should just have conda talk to PyPI?

As it stands, one of the POINTS of Anaconda is that it ISN'T the standard
python.org installer!

But really, this just puts us back in the state that we want to avoid -- a
bunch of binary-incompatible builds out there to get confused by -- though,
again, at least conda apparently won't let you install binary-incompatible
packages...

> However, there haven't been any compelling examples presented other
> than the C runtime (which wheel needs to handle as part of the
> platform tag and/or the ABI tag) and the scientific stack,


Again, I'm pretty sure it doesn't even apply to the scientific stack in any
special way...

> At the moment, we're getting people trying to use conda as the base,
> and stuff falling apart at a later stage,


Still confused: conda the package manager, or Anaconda the distribution?

> well, except that the anaconda index covers non-python projects like "qt",
> which a private wheel index wouldn't cover (at least with the normal
> intended use of wheels)


Umm, why not? You couldn't have a PySide wheel???


-Chris

-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov