[Python-Dev] Software integrators vs end users (was Re: Language Summit notes)
Donald Stufft
donald at stufft.io
Fri Apr 18 22:50:44 CEST 2014
On Apr 18, 2014, at 4:22 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On 18 April 2014 15:39, Donald Stufft <donald at stufft.io> wrote:
>>
>> On Apr 18, 2014, at 3:18 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>>
>>> At this point, however, I'm mainly looking for consensus that there
>>> *are* two different problems to be solved here, and solving them both
>>> well in a single tool is likely to be nigh on impossible. (I'm aware
>>> we're really on the wrong list for that discussion, but I also think
>>> there's value in getting some broader python-dev awareness of this
>>> particular topic)
>>
>> I’m not sure about this? I mean yes those are two different areas, but I’m
>> not sure about the split between Conda and pip here. As far as I can tell
>> Conda is useful in the same way apt-get or yum is useful, you get a non
>> Python specific set of packages (because sometimes things aren’t pure
>> python) and you also remove a little bit of thinking about versions (although
>> honestly I think it’s possible to remove most of that for consumers of
>> packages).
>
> You also get the ability to use that same system to update *Python
> itself*, regardless of platform. Being notified of and consuming
> CPython updates on Windows, as well as consuming alternate versions on
> Linux distros, is currently not a well solved problem - with conda,
> it's no more complex than updating a PyPI package.
>
> That's one of the most attractive aspects for me - making Python 3.4,
> pypy, etc as easy to update with consistent cross platform
> instructions as Python packages are.
Ah right. I would never want that personally so I forgot about it. (I also don’t use
the system Python for development but I use a different tool for managing it).
>
>> To be quite frank, a lot of the benefit of Conda outside of the “I need something
>> that isn’t strictly Python” case is in the fact they can bootstrap compiled packages,
>> whereas with the pip/wheel/PyPI combination we need to convince authors to upload
>> their own binary packages (or at least develop something to make it easier,
>> like a build farm).
>
> Yep, that's a large part of why I think "divide and conquer" is the
> way to go here. While it isn't completely accurate (as most SaaS
> developers don't want to build C extensions from source) I think
> Guido's "build from source" vs "install a pre-built binary"
> distinction is still a reasonably good way to characterise it. For a
> distro like Fedora (or, even more so, RHEL), we're not going to trust
> a binary created by someone else if we can possibly avoid it, so
> upstream binaries aren't useful to us, but pip's ability to abstract
> away the vagaries of the upcoming metadata 2.0 migration is incredibly
> helpful. The other distros are in the same situation (we'll always
> feed source tarballs into our own build systems), and ditto for the
> conda folks. We need that for all sorts of reasons that potential new
> Python users don't care about, but continuing to meet our
> requirements, along with the free-for-all that is PyPI makes handling
> the binary distribution problem *much* harder for pip.
>
> By contrast, like any other distro, conda doesn't need to boil the
> ocean - it just needs to provide a filtered, up to date set of core
> packages that work well together. The advantage it has over other
> distros is that it is *cross-platform* - it works essentially the same
> way on Windows, Mac OS X and Linux. Most other package management systems are
> either platform specific or can't handle arbitrary binary
> dependencies. By being Python-centric (even if not Python specific),
> there's also a strong focus on updating the core packages more often
> than the Linux distros do.
>
> There's no "always use conda instead of pip" competition here - we
> need pip, we need wheels. But I see those as software integrator
> focused tools that would need a lot of additional work to provide a
> truly spectacularly compelling out of the box experience for new
> users. I don't think that's a useful way to expend effort - I think it
> makes more sense to work on a separate "here's the fully assembled
> environment for using Python as a tool to develop ideas" introduction,
> while pushing the "here's how to build your own custom environment
> from upstream parts" as a later step in a new user's journey towards
> full Python mastery (if they're interested in that path). Making sure
> that "pip install foo" does the right thing inside conda environments
> (if it doesn't already) should be all that is needed to ensure that
> random installation instructions on the internet still work.
So I’m not really worried about a competition or anything. I’m mostly worried
about confusing users. What you’re suggesting we give to users is *two* ways
to install Python packages (and 2 or 3 ways to virtualize a Python instance).
That provides extra cognitive burden for people who are new to Python. They
have to both understand that the packages you install from Conda are different
than the ones you install from pip, and that they come from different places.
If a new user is reasonably likely to have to use ``pip install`` when they are
using conda, then I think that provides a worse experience than only having to
use one tool to manage your packages. This confusion is going to be worse when
new users find a library they want to use and it tells them to use
``pip install`` (or even easy_install!) even if Anaconda itself already ships
that library.
So I think this could potentially in the short term make things easier for
people who need nothing but what conda provides (because of the bootstrapped
pre-compiled binaries), but I think ultimately you’re going to confuse them
until they know enough about the ecosystem to grasp the difference.
I’ve *personally* seen this first hand with Linux and the other OSes that
distribute stuff, where people get really confused about what the difference is
between pip and apt-get/yum/portmaster and when they’d use one over the other.
Adding another column in the matrix for new users seems to be something that is
only going to cause more confusion and pain.
As you said, the difference is often "precompiled vs source", and I think the
answer to that is to make it easier to use precompiled stuff in pip and PyPI
and not to push people to an entirely different stack and invalidate the vast
bulk of documentation (most of which already recommends pip) that those
users are likely to encounter.
Barring implementation problems (bad defaults, ux, etc) the fundamental
difference between conda and pip/easy_install is that conda manages more than
just Python packages (much like apt-get/yum does), and any discussion about what
to recommend where should focus on fundamental differences, as implementation
problems in *either* solution can be addressed (lack of binary/ux on pip,
security issues on Conda, etc).
-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA