Re: [Python-Dev] Status of packaging in 3.3

21 Jun 2012

      On 06/20/2012 11:57 PM, Nick Coghlan wrote:
...
On Thu, Jun 21, 2012 at 3:29 AM, PJ Eby<pje@telecommunity.com>  wrote:
...
On Wed, Jun 20, 2012 at 9:02 AM, Nick Coghlan<ncoghlan@gmail.com>  wrote:
...
On Wed, Jun 20, 2012 at 9:46 PM, Antoine Pitrou<solipsis@pitrou.net>
wrote:
...
Agreed, especially if the "proven in the wild" criterion is required
(people won't rush to another third-party distutils replacement, IMHO).
The existence of setuptools means that "proven in the wild" is never
going to fly - a whole lot of people use setuptools and easy_install
happily, because they just don't care about the downsides it has in
terms of loss of control of a system configuration.
Um, this may be a smidge off topic, but what "loss of control" are we
talking about here?  AFAIK, there isn't anything it does that you can't
override with command line options or the config file.  (In most cases,
standard distutils options or config files.)  Do you just mean that most
people use the defaults and don't care about there being other options?  And
if that's the case, which other options are you referring to?
No, I mean there are design choices in setuptools that explain why
many people don't like it and are irritated when software they want to
use depends on it without a good reason. Clearly articulating the
reasons that "just include setuptools" is no longer being considered
as an option should be one of the goals of any PEPs associated with
adding packaging back for 3.4.
The reasons I'm personally aware of:
- it's a unilateral runtime fork of the standard library that bears a
lot of responsibility for the ongoing feature freeze in distutils.
Standard assumptions about the behaviour of site and distutils cease
to be valid once setuptools is installed
- overuse of "*.pth" files and the associated sys.path changes for all
Python programs running on a system. setuptools gleefully encourages
the inclusion of non-trivial code snippets in *.pth files that will be
executed by all programs.
- advocacy for the "egg" format and the associated sys.path changes
that result for all Python programs running on a system
- too much magic that is enabled by default and is hard to switch off
(e.g. http://rhodesmill.org/brandon/2009/eby-magic/)
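For readers unfamiliar with the "*.pth" mechanism in the list above, here is a minimal sketch using only the standard library's site module. It assumes a throwaway temporary directory standing in for site-packages; the file name demo.pth, the directory extra_path and the environment variable PTH_DEMO_RAN are made up for illustration. It shows the two behaviours being discussed: a plain line in a .pth file extends sys.path, and a line starting with "import" is executed as code.

```python
import os
import site
import sys
import tempfile


def demo_pth_processing():
    """Show that site.addsitedir() extends sys.path and runs 'import' lines."""
    original_path = list(sys.path)
    os.environ.pop("PTH_DEMO_RAN", None)

    # A temporary directory plays the role of site-packages (an assumption
    # made for the demo; nothing is written to the real installation).
    with tempfile.TemporaryDirectory() as site_dir:
        extra_dir = os.path.join(site_dir, "extra_path")
        os.mkdir(extra_dir)

        with open(os.path.join(site_dir, "demo.pth"), "w") as pth:
            # A plain line names a directory to append to sys.path.
            pth.write("extra_path\n")
            # A line beginning with "import " is executed by the site module.
            pth.write("import os; os.environ['PTH_DEMO_RAN'] = '1'\n")

        # Process the directory the same way site-packages is processed.
        site.addsitedir(site_dir)

        wanted = os.path.normcase(os.path.abspath(extra_dir))
        path_added = any(
            os.path.normcase(os.path.abspath(entry)) == wanted
            for entry in sys.path
        )
        code_executed = os.environ.get("PTH_DEMO_RAN") == "1"

    # Undo the side effects so the demo leaves the process as it found it.
    sys.path[:] = original_path
    os.environ.pop("PTH_DEMO_RAN", None)
    return path_added, code_executed


if __name__ == "__main__":
    print(demo_pth_processing())
```

In a real installation the same processing happens at interpreter startup for every .pth file in site-packages, which is why the effect reaches all Python programs on the system.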
All of these are really pretty minor issues compared with the main 
benefit of not needing to ship everything with everything else.  The 
killer feature is that developers can specify dependencies and users can 
have those dependencies installed automatically in a cross-platform way. 
  Everything else is complete noise if this use case is not served.

IMO, the second and third things you mention above (use of pth files and 
eggs) are actually features when compared against the result of 
something like pip, which installs things using 
--single-version-externally-managed and then tries to manage the 
resulting potentially-intertwined directories.  Eggs are *easier* to 
manage than potentially overlapping files and directories 
installed into some other directory.  Either they exist or they don't. 
Either they're mentioned in a .pth file or they aren't.  It's not really 
that hard.

In any case, any tool that tries to manage distribution installation 
will need somewhere to keep distribution metadata.  It's a minor mystery 
to me why people think it could be done much better than in something 
very close to egg format.
...
System administrators (and developers that think like system
administrators when it comes to configuration management) *hate* what
setuptools (and setuptools based installers) can do to their systems.
It doesn't matter that package developers don't *have* to do those
things - what matters is that the needs and concerns of system
administrators simply don't appear to have been anywhere on the radar
when setuptools was being designed. (If those concerns actually were
taken into account at some point, it's sure hard to tell from the end
result and the choices of default behaviour)
I think you mean easy_install here.  And I guess you mean managing .pth 
files.  Note that if you use pip, neither thing needs to happen.  And 
even easy_install lets you install a distribution that way (with 
--single-version-externally-managed).  So I think, as you mention, this 
is a matter of defaults (tool and/or flag defaults) rather than core 
functionality.
...
setuptools is a masterful achievement built on shaky foundations that
will work most of the time. However, when it doesn't work, you're
probably screwed, and as soon as it's present on a system, you know
that your assumptions about understanding the Python interpreter's
startup sequences are probably off.
It's true setuptools is based on shaky foundations.  The rest of the 
stuff you say above is pretty darn specious, I think.
...
The efforts around
distutils2/packaging have been focused on taking the time to *fix the
foundations first* rather than accepting the inevitable shortcomings
of trying to build something in the middle of a swamp.
...
If the long-term goal is to draw setuptools users over to packaging, then
AFAIK the packaging effort is still missing a few things, like build-time
dependencies and alternatives to setuptools' entry points and "extras", as
well as the ability to integrate version control for building sdists
(without requiring the sdist's recipient to *also* have the version control
integration in order to build the package or recreate a new sdist).
Right - clearly enumerating the features that draw people to use
setuptools over just using distutils should be a key element in any
PEP for 3.4
I honestly think a big part of why packaging ended up being incomplete
for 3.3 is that we still don't have a clearly documented answer to two
critical questions:
1. Why do people choose setuptools over distutils?
Because it supports automated installation of dependencies.  Almost 
everything else is noise (although some of the other things that 
setuptools provides, like entry points and console scripts, are important 
noise).
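As a rough illustration of the features named in this answer, here is a sketch of the keyword arguments a setup.py might pass to setuptools. The project name exampleapp, the dependency somelib and the exampleapp.cli:main target are all hypothetical; only the install_requires and entry_points / console_scripts keywords are setuptools features.

```python
# Sketch of setup.py metadata for a hypothetical project. Every project,
# package and module name here is invented for illustration.
SETUP_KWARGS = {
    "name": "exampleapp",
    "version": "0.1",
    "packages": ["exampleapp"],
    # Dependencies an installer can fetch and install automatically.
    "install_requires": ["somelib>=1.0"],
    # A console script: installing the project creates an "exampleapp"
    # command that calls main() in the (hypothetical) exampleapp.cli module.
    "entry_points": {
        "console_scripts": ["exampleapp = exampleapp.cli:main"],
    },
}


def run_setup():
    """Hand the metadata to setuptools; a real setup.py would call this."""
    from setuptools import setup

    setup(**SETUP_KWARGS)
```

Plain distutils has no equivalent of install_requires that an installer acts on, which is the gap this answer is pointing at.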
...
2. What's wrong with setuptools that meant the idea of including it
directly in the stdlib was ultimately dropped and eventually replaced
with the goal of incorporating distutils2?
Because distutils sucks and setuptools is based on distutils.  It's 
horrible to need to hack on.

Setuptools also has documentation which is effectively deltas to the 
distutils docs.  As a result, it's very painful to try to follow the 
setuptools docs.  IMO, it's not that the ideas in setuptools are bad, 
it's that setuptools requires a *lot* more docs to be consumable by 
normal humans, and those docs need to be a lot more accessible.
...
I imagine there are answers to both of those questions embedded in
past python-dev, distutils-sig, setuptools and distutils2 mailing list
discussions, but that's no substitute for having them clearly
documented in a PEP (or PEPs, given the scope of the questions).
We've tried to shortcircuit this process twice now, first with "just
include setuptools" back around 2.5, and again now with "just include
distutils2 as packaging" for 3.3. It hasn't worked, so maybe it's time
to try doing it properly and clearly articulating the desired end
result. If the end goal is "the bulk of the setuptools feature set
without the problematic features and default behaviours that make
system administrators break out the torches and pitchforks", then we
should *write that down* (and spell out the implications) rather than
assuming that everyone knows the purpose of the exercise.
There's all kinds of built-in conflict here wrt those pitchforks. 
Most of it is stupid.

System administrators tend to be stuck in a "one package to rule them 
all" model of deployment and that model *just can't work* on a system 
where you need repeatable deployments of multiple pieces of Python-based 
software which may require mutually exclusive different Python and 
library versions.  Trying to pretend it can work is just plain madness. 
  Telling developers they must work on an exact replica of the 
production system in order to develop the software is also a terrible, 
unproductive idea.  This is a hopeless, 1990s waterfall model of 
deployment and development.

This is why packages like virtualenv and buildout are so popular.  Using 
them gets developers what they need.  Developers get repeatable 
cross-platform deployments without requiring special privilege, and this 
allows for a *reduction* in the system administrator's role in 
deployment.  Sometimes a certain type of system administrator can be a 
hindrance to deployment and maintenance, like sometimes a DBA can be a 
hindrance to a developer who just needs to add a damn table.

With the tools available today (Fabric, buildout, salt, virtualenv, 
pip), it's a heck of a lot easier to script a cross-platform deployment 
that will work simultaneously on Debian, Red Hat, BSD, and Mac OS X than 
it is to build system-level packages for multiple platforms or even 
*one* platform.  And to be honest, if a system administrator can't cope 
with the notion that he may need to forsake his system-level package 
installer and instead follow the instructions we give to him to type 
four or five commands to get a completely working system deployed or 
updated, he probably should not be a system administrator.  His job is 
going to quickly be taken by folks who *can* cope with such deployment 
mechanisms, as on any cloud service: all the existing Python cloud 
deployment services handle distutils/setuptools installs just fine and 
these tend to be the *only* way you can get Python software installed 
into a system on them.
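To make the "four or five commands" idea concrete, here is a minimal sketch of a scripted deployment, assuming the virtualenv command is on the PATH and the project ships a requirements.txt. The function names and the default file name are illustrative, and the script defaults to a dry run that only prints the commands.

```python
import os
import subprocess


def deployment_commands(env_dir, requirements="requirements.txt"):
    """Build the commands for an isolated, unprivileged install."""
    # virtualenv places executables in "Scripts" on Windows, "bin" elsewhere.
    bin_dir = "Scripts" if os.name == "nt" else "bin"
    pip = os.path.join(env_dir, bin_dir, "pip")
    return [
        ["virtualenv", env_dir],               # create the isolated environment
        [pip, "install", "-r", requirements],  # install pinned dependencies
    ]


def deploy(env_dir, requirements="requirements.txt", dry_run=True):
    """Print the commands, or run them when dry_run is False."""
    commands = deployment_commands(env_dir, requirements)
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.check_call(command)
    return commands


if __name__ == "__main__":
    deploy("env")
```

Nothing here touches the system-level package manager or needs root, which is the property the paragraph above is arguing for.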

- C

Chris McDonough