[Python-Dev] Status of packaging in 3.3

Thu Jun 21 06:44:55 CEST 2012

On 06/20/2012 11:57 PM, Nick Coghlan wrote:
> On Thu, Jun 21, 2012 at 3:29 AM, PJ Eby<pje at telecommunity.com>  wrote:
>> On Wed, Jun 20, 2012 at 9:02 AM, Nick Coghlan<ncoghlan at gmail.com>  wrote:
>>>
>>> On Wed, Jun 20, 2012 at 9:46 PM, Antoine Pitrou<solipsis at pitrou.net>
>>> wrote:
>>>> Agreed, especially if the "proven in the wild" criterion is required
>>>> (people won't rush to another third-party distutils replacement, IMHO).
>>>
>>> The existence of setuptools means that "proven in the wild" is never
>>> going to fly - a whole lot of people use setuptools and easy_install
>>> happily, because they just don't care about the downsides it has in
>>> terms of loss of control of a system configuration.
>>
>>
>> Um, this may be a smidge off topic, but what "loss of control" are we
>> talking about here?  AFAIK, there isn't anything it does that you can't
>> override with command line options or the config file.  (In most cases,
>> standard distutils options or config files.)  Do you just mean that most
>> people use the defaults and don't care about there being other options?  And
>> if that's the case, which other options are you referring to?
>
> No, I mean there are design choices in setuptools that explain why
> many people don't like it and are irritated when software they want to
> use depends on it without a good reason. Clearly articulating the
> reasons that "just include setuptools" is no longer being considered
> as an option should be one of the goals of any PEPs associated with
> adding packaging back for 3.4.
>
> The reasons I'm personally aware of:
> - it's a unilateral runtime fork of the standard library that bears a
> lot of responsibility for the ongoing feature freeze in distutils.
> Standard assumptions about the behaviour of site and distutils cease
> to be valid once setuptools is installed
> - overuse of "*.pth" files and the associated sys.path changes for all
> Python programs running on a system. setuptools gleefully encourages
> the inclusion of non-trivial code snippets in *.pth files that will be
> executed by all programs.
> - advocacy for the "egg" format and the associated sys.path changes
> that result for all Python programs running on a system
> - too much magic that is enabled by default and is hard to switch off
> (e.g. http://rhodesmill.org/brandon/2009/eby-magic/)

All of these are really pretty minor issues compared with the main 
benefit of not needing to ship everything with everything else.  The 
killer feature is that developers can specify dependencies and users can 
have those dependencies installed automatically in a cross-platform way. 
Everything else is complete noise if this use case is not served.

IMO, the second and third things you mention above (use of pth files and 
eggs) are actually features when compared against the result of 
something like pip, which installs things using 
--single-version-externally-managed and then tries to manage the 
resulting potentially-intertwined directories.  Eggs are *easier* to 
manage than potentially overlapping files and directories 
installed into some other directory.  Either they exist or they don't. 
Either they're mentioned in a .pth file or they aren't.  It's not really 
that hard.
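
To make the "mentioned in a .pth file or not" point concrete, here is a 
minimal sketch using the stdlib site module.  The directory and file 
names are made up for illustration; the only behaviour relied on is that 
site.addsitedir() reads *.pth files and appends each listed path that 
exists to sys.path.

```python
import os
import site
import sys
import tempfile

# Hypothetical site directory containing one egg-style directory.
site_dir = tempfile.mkdtemp()
egg_dir = os.path.join(site_dir, "Demo-1.0-py3.egg")
os.mkdir(egg_dir)

# A .pth file lists one path per line, relative to the site directory.
# (Lines beginning with "import" are executed rather than treated as
# paths, which is the behaviour objected to in the quoted text.)
with open(os.path.join(site_dir, "demo.pth"), "w") as f:
    f.write("Demo-1.0-py3.egg\n")

# Process the directory the way the interpreter does for site-packages.
site.addsitedir(site_dir)

# The egg is importable if and only if it ended up on sys.path.
egg_on_path = os.path.abspath(egg_dir) in sys.path
print(egg_on_path)
```

Removing the line from the .pth file (or deleting the egg directory) is 
the whole "uninstall" story in this model.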

In any case, any tool that tries to manage distribution installation 
will need somewhere to keep distribution metadata.  It's a minor mystery 
to me why people think it could be done much better than in something 
very close to egg format.
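
As a rough illustration of how little is involved, the sketch below 
writes and reads back the two metadata files an egg typically carries in 
its EGG-INFO directory.  The distribution name, version and the 
"SomeLib" requirement are placeholders, and read_metadata() is a 
hypothetical helper, not part of any packaging tool.

```python
import os
import tempfile
from email.parser import Parser

# Hypothetical installed egg directory with its metadata subdirectory.
egg_info = os.path.join(tempfile.mkdtemp(), "Demo-1.0-py3.egg", "EGG-INFO")
os.makedirs(egg_info)

# PKG-INFO uses RFC 822-style headers; the values below are made up.
with open(os.path.join(egg_info, "PKG-INFO"), "w") as f:
    f.write("Metadata-Version: 1.0\nName: Demo\nVersion: 1.0\n")

# requires.txt lists one requirement per line; "SomeLib" is a placeholder.
with open(os.path.join(egg_info, "requires.txt"), "w") as f:
    f.write("SomeLib>=1.0\n")


def read_metadata(egg_info_dir):
    """Return (name, version, requirements) for one distribution."""
    with open(os.path.join(egg_info_dir, "PKG-INFO")) as f:
        headers = Parser().parse(f)
    with open(os.path.join(egg_info_dir, "requires.txt")) as f:
        requirements = [line.strip() for line in f if line.strip()]
    return headers["Name"], headers["Version"], requirements


metadata = read_metadata(egg_info)
print(metadata)
```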

> System administrators (and developers that think like system
> administrators when it comes to configuration management) *hate* what
> setuptools (and setuptools based installers) can do to their systems.
> It doesn't matter that package developers don't *have* to do those
> things - what matters is that the needs and concerns of system
> administrators simply don't appear to have been anywhere on the radar
> when setuptools was being designed. (If those concerns actually were
> taken into account at some point, it's sure hard to tell from the end
> result and the choices of default behaviour)

I think you mean easy_install here.  And I guess you mean managing .pth 
files.  Note that if you use pip, neither thing needs to happen.  And 
even easy_install lets you install a distribution that way (with 
--single-version-externally-managed).  So I think, as you mention, this 
is a matter of defaults (tool and/or flag defaults) rather than core 
functionality.

> setuptools is a masterful achievement built on shaky foundations that
> will work most of the time. However, when it doesn't work, you're
> probably screwed, and as soon as it's present on a system, you know
> that your assumptions about understanding the Python interpreter's
> startup sequences are probably off.

It's true setuptools is based on shaky foundations.  The rest of the 
stuff you say above is pretty darn specious, I think.

> The efforts around
> distutils2/packaging have been focused on taking the time to *fix the
> foundations first* rather than accepting the inevitable shortcomings
> of trying to build something in the middle of a swamp.
>
>> If the long-term goal is to draw setuptools users over to packaging, then
>> AFAIK the packaging effort is still missing a few things, like build-time
>> dependencies and alternatives to setuptools' entry points and "extras", as
>> well as the ability to integrate version control for building sdists
>> (without requiring the sdist's recipient to *also* have the version control
>> integration in order to build the package or recreate a new sdist).
>
> Right - clearly enumerating the features that draw people to use
> setuptools over just using distutils should be a key element in any
> PEP for 3.4
>
> I honestly think a big part of why packaging ended up being incomplete
> for 3.3 is that we still don't have a clearly documented answer to two
> critical questions:
> 1. Why do people choose setuptools over distutils?

Because it supports automated installation of dependencies.  Almost 
everything else is noise (although some of the other things that 
setuptools provides, like entry points and console scripts, are important 
noise).
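
For anyone unfamiliar with entry points: a distribution advertises a 
string of the form "name = package.module:attr", and a consumer resolves 
it to an object at runtime.  The sketch below shows only that resolution 
step, under the assumption of the simple form just described.  It is not 
how setuptools implements it (pkg_resources does considerably more), and 
resolve_entry_point() is a hypothetical helper.

```python
import importlib


def resolve_entry_point(spec):
    """Split 'name = package.module:attr' and import the target object."""
    name, _, target = spec.partition("=")
    module_name, _, attr_path = target.strip().partition(":")
    obj = importlib.import_module(module_name.strip())
    # Walk dotted attributes so "module:Class.method" style targets work.
    for attr in attr_path.strip().split("."):
        if attr:
            obj = getattr(obj, attr)
    return name.strip(), obj


# Use a stdlib function as the target so the example runs anywhere.
script_name, func = resolve_entry_point("to-json = json:dumps")
print(script_name, func([1, 2]))
```

A console script is the same idea with a generated wrapper executable 
that calls the resolved function.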

> 2. What's wrong with setuptools that meant the idea of including it
> directly in the stdlib was ultimately dropped and eventually replaced
> with the goal of incorporating distutils2?

Because distutils sucks and setuptools is based on distutils.  It's 
horrible to need to hack on.

Setuptools also has documentation which is effectively deltas to the 
distutils docs.  As a result, it's very painful to try to follow the 
setuptools docs.  IMO, it's not that the ideas in setuptools are bad, 
it's that setuptools requires a *lot* more docs to be consumable by 
normal humans, and those docs need to be a lot more accessible.

> I imagine there are answers to both of those questions embedded in
> past python-dev, distutils-sig, setuptools and distutils2 mailing list
> discussions, but that's no substitute for having them clearly
> documented in a PEP (or PEPs, given the scope of the questions).
>
> We've tried to shortcircuit this process twice now, first with "just
> include setuptools" back around 2.5, and again now with "just include
> distutils2 as packaging" for 3.3. It hasn't worked, so maybe it's time
> to try doing it properly and clearly articulating the desired end
> result. If the end goal is "the bulk of the setuptools feature set
> without the problematic features and default behaviours that make
> system administrators break out the torches and pitchforks", then we
> should *write that down* (and spell out the implications) rather than
> assuming that everyone knows the purpose of the exercise.

There's all kinds of built in conflict here wrt those pitchforks. 
Most of it is stupid.

System administrators tend to be stuck in a "one package to rule them 
all" model of deployment and that model *just can't work* on a system 
where you need repeatable deployments of multiple pieces of Python-based 
software which may require mutually exclusive different Python and 
library versions.  Trying to pretend it can work is just plain madness. 
Telling developers they must work on an exact replica of the 
production system in order to develop the software is also a terrible, 
unproductive idea.  This is a hopeless, 1990s waterfall model of 
deployment and development.

This is why packages like virtualenv and buildout are so popular.  Using 
them gets developers what they need.  Developers get repeatable 
cross-platform deployments without requiring special privilege, and this 
allows for a *reduction* in the system administrator's role in 
deployment.  Sometimes a certain type of system administrator can be a 
hindrance to deployment and maintenance, like sometimes a DBA can be a 
hindrance to a developer who just needs to add a damn table.

With the tools available today (Fabric, buildout, salt, virtualenv, 
pip), it's a heck of a lot easier to script a cross-platform deployment 
that will work simultaneously on Debian, Red Hat, BSD, and Mac OS X than 
it is to build system-level packages for multiple platforms or even 
*one* platform.  And to be honest, if a system administrator can't cope 
with the notion that he may need to forsake his system-level package 
installer and instead follow the instructions we give to him to type 
four or five commands to get a completely working system deployed or 
updated, he probably should not be a system administrator.  His job is 
going to quickly be taken by folks who *can* cope with such deployment 
mechanisms, like those of any cloud service: all the existing Python cloud 
deployment services handle distutils/setuptools installs just fine and 
these tend to be the *only* way you can get Python software installed 
into a system on them.

- C