[Distutils] distlib updated - comments sought

Vinay Sajip vinay_sajip at yahoo.co.uk
Fri Oct 5 19:57:25 CEST 2012


Daniel Holth <dholth <at> gmail.com> writes:

> Bootstrapping is kinda annoying because Python doesn't include an
> installer for pip or buildout or ... and it can be hard to choose
> between the many excellent installers that are available on and off of
> pypi.

That was the point of packaging - to have something better than distutils
in the stdlib. The original goal was too ambitious to achieve in the 3.3
timeframe, and arguably some use cases weren't fully considered. Since
packaging is an infrastructure concern, I believe
(as others do) that there's a place for *something* in the stdlib that's
better than distutils, because distutils failed to meet changing needs.

That something may be distlib, or something like it, or nothing like it.
It's too early to say.

> ~1300 of the ~20000 packages on pypi have trouble using setup.py as
> their build system / metadata source format.

I'm not sure what you mean. Packages don't have trouble, people do. For
example, it may be possible for me to install a particular package from PyPI
on my system (=> "no trouble"), but the package may be hard to package for
Linux distros, because some of the installation logic happens in setup.py
or code called from it.

> For the ~1300 broken packages, distutils is awful because it is not
> really extensible, though setuptools tried.

Valiant effort by setuptools, but it could be considered a band-aid.
Obviously there are different opinions about setuptools, but it's hard
to argue against the fact that setuptools and pkg_resources are not
considered worthy of inclusion in the stdlib by python-dev.

> People have to install setuptools against their will because there is
> only one implementation of the pkg_resources API and 75% of the
> packages on pypi require setuptools.

Well, isn't that what packaging was (and distlib is) trying to remedy?

> Packaging has been in turmoil for years waiting for something.

Mainly, people with the time and inclination :-)
 
> In my estimation we're not saving the world here.

Just trying to improve the world a teeny little bit is a worthy goal;
saving the world is beyond the reach of most of us.

> The goal should be
> to fix 1,300 packages without breaking 19,000, to make bootstrapping
> easier, and to make setuptools optional but neither required nor
> prohibited.

I'm not sure that anyone is anticipating, or working towards, breaking
thousands of packages. Nothing is "prohibited" - people can use whatever they
want. But setuptools (or any other third-party package) can't be truly
optional while the stdlib lacks functionality which people need and which
setuptools provides.

So, perhaps the goal is just to offer more choices. People don't like change,
but change can't always be avoided. We can be optimistic that strategies will
be in place for mitigating the pain of migration for those who choose to
migrate (no coercion), just as 2to3 eases the transition from 2.x to 3.x
while forcing no one to move over.

I certainly don't believe the answer is to keep pkg_resources and setuptools
APIs as some kind of fossilised, inviolable standard.

I do believe we have to move away from custom installation code, which
distutils by its nature forced people to produce, and which leads to
problems for some people.

Although I haven't published all my results, as it's still work in progress,
for thousands of packages hosted on PyPI I managed to extract all the
metadata from setup() into package.yaml, and was then able to generate from
package.yaml a source archive which was semantically the same as the original.
So I'm optimistic that by working on refining the metadata extraction
mechanism, most of the packages on PyPI can be represented by an alternative
metadata format, which allows the existence of a multiplicity of solutions to
build, package and install Python software.
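
As a rough illustration of the general idea only (this is not the actual
extraction code, which is unpublished), one way to get at setup() metadata is
to run setup.py with a stand-in setup() that records its keyword arguments
instead of building anything. The names capture_setup_kwargs and fake_setup
are hypothetical, and the result is printed as JSON rather than written to
package.yaml, purely to keep the sketch within the stdlib. Note that this
executes the setup.py code, so it should only be pointed at trusted sources.

```python
import json
import sys
import types


def capture_setup_kwargs(setup_source):
    """Run setup.py source text and return the kwargs passed to setup().

    Hypothetical helper for illustration; not part of distlib or packaging.
    """
    captured = {}

    def fake_setup(**kwargs):
        # Record the metadata instead of building or installing anything.
        captured.update(kwargs)

    # Stand-in modules so that both "from setuptools import setup" and
    # "from distutils.core import setup" resolve to fake_setup.
    stubs = {}
    for name in ("setuptools", "distutils", "distutils.core"):
        module = types.ModuleType(name)
        module.setup = fake_setup
        stubs[name] = module
    stubs["distutils"].core = stubs["distutils.core"]

    # Temporarily shadow the real modules, then restore them afterwards.
    saved = {name: sys.modules.get(name) for name in stubs}
    sys.modules.update(stubs)
    try:
        exec(compile(setup_source, "setup.py", "exec"), {"__name__": "__main__"})
    finally:
        for name, module in saved.items():
            if module is None:
                sys.modules.pop(name, None)
            else:
                sys.modules[name] = module
    return captured


# A made-up setup.py, used only to exercise the helper.
sample = '''
from setuptools import setup
setup(name="example", version="0.1", py_modules=["example"])
'''

metadata = capture_setup_kwargs(sample)
print(json.dumps(metadata, indent=2, sort_keys=True))
```

A sketch like this only sees what is passed to setup(); any installation
logic that lives elsewhere in setup.py, or in code called from it, is exactly
the part that a declarative metadata format cannot capture this way.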

> It is wonderful to have distlib. I support it. I'm playing with the
> competing distlib2 implementation so that both APIs can be better, so
> we can find out which parts provide functionality that does not just
> have a different name in pkg_resources, and so that it can be possible
> to replace the implementation without changing the API. If your goal
> is to avoid "implementation defined behavior" it's a good idea to have
> two.

Let a thousand flowers bloom :-)

Regards,

Vinay Sajip


