[Distutils] A smaller step towards de-specializing setuptools/distutils

Donald Stufft donald at stufft.io
Thu Oct 29 13:31:57 EDT 2015


Hello!

So I've been reading the various threads (well, trying to; there's a lot going
on in all of them) and I'm trying to reconcile them all in my head and sort
out how things are going to look if we add them. I'm worried about adding more
complication to an already underspecified format and making it a lot harder for
tooling to work with these things.

I'm wondering if there isn't a smaller step we can take towards allowing better
interoperability with different build systems while also making the situation
clearer when someone is still using distutils or setuptools. For the sake of
argument, let's say we do something like this:

Standardize a key in setup.cfg that can be used for setup_requires. This will
be a static list of items which need to be installed in order to execute the
setup.py. Unlike the current key (which is inside of setup.py), you don't have
to deal with deferring the import of those items or with setuptools at all,
since your setup.py will not be invoked until after those items are installed.
For people who want to continue to use setuptools, this file is trivial to
parse with the standard library, so their setup.py can read the key and pass
it to setup_requires as well. The tool calling the setup.py can then handle
those dependencies, while older versions of those tools keep working and
nothing has to be listed twice.
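
As a rough sketch of the "trivial to parse with the standard library" part,
something like the following would do. The section name ``bootstrap`` and the
one-requirement-per-line layout are assumptions made up for illustration; the
proposal above only says "a key in setup.cfg".

```python
import configparser

# Hypothetical names: the proposal does not fix a section for the key.
SECTION = "bootstrap"
KEY = "setup_requires"


def parse_setup_requires(cfg_text):
    """Return the static setup_requires list from setup.cfg text.

    Returns None when the key is absent, so callers can tell
    "not declared" apart from "declared but empty".
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    if not parser.has_option(SECTION, KEY):
        return None
    raw = parser.get(SECTION, KEY)
    # One requirement per line; blank lines are ignored.
    return [line.strip() for line in raw.splitlines() if line.strip()]


example = """
[bootstrap]
setup_requires =
    setuptools>=18.0
    cffi
"""
print(parse_setup_requires(example))
```

A setup.py that still uses setuptools could call a helper like this and hand
the result to ``setup(setup_requires=...)``, so the list only lives in one
place.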

This can also help work to make setuptools less special. Right now you
basically have to just manually install setuptools prior to installing anything
that uses it (or at all for something like pip when installing from sdist).
With this we could pretty easily make it so that the rule is something like
"If there is a setup.cfg:setup_requires, then assume that everything that is
needed to execute the setup.py is listed there, else assume an implicit
'setuptools' requirement". This means that end users won't need to have
setuptools installed at all, because we'll treat it (mostly) like just another
build tool. This would also (for the first time) make it possible for things to
depend on a specific version of setuptools instead of having to support every
version of setuptools ever.
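
The rule quoted above could be sketched on the installer side roughly like
this. Again the ``bootstrap`` section name is a made-up placeholder, and
``build_requirements`` is a hypothetical helper, not anything pip has today.

```python
import configparser

SECTION = "bootstrap"  # hypothetical section name, not part of the proposal
KEY = "setup_requires"


def build_requirements(cfg_text):
    """Decide what must be installed before setup.py may be executed.

    cfg_text is the content of setup.cfg, or None if the file is missing.
    """
    parser = configparser.ConfigParser(interpolation=None)
    if cfg_text is not None:
        parser.read_string(cfg_text)
    if parser.has_option(SECTION, KEY):
        # The key is present: assume it lists everything that is needed,
        # even if that means setuptools is not installed at all.
        raw = parser.get(SECTION, KEY)
        return [line.strip() for line in raw.splitlines() if line.strip()]
    # No key (or no setup.cfg): fall back to the implicit requirement.
    return ["setuptools"]
```

Note that an explicitly empty key yields an empty list here, which is how a
project would say "my setup.py needs nothing beyond the standard library".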

I think that could solve the problem of "bootstrapping" the requirements to
execute a setup.py and extract that from being implicitly handled by
setuptools. That still doesn't handle the problem of making it possible to
actually invoke the now installed build system.

I think that what we should do is figure out the minimal interface we need from
a ``setup.py`` based on what already exists. This would be a much smaller API
surface than what exists today and wouldn't (ideally) include anything that is
an implementation detail of setuptools. We would also need to standardize on
what flags and arguments the various commands of setup.py MUST accept. This
would of course not require a setup.py to implement _only_ that particular
interface, so additional commands that setuptools already provides can stay,
they just won't be required. Off the top of my head, I think we'd probably
want to have the ``bdist_wheel``, ``sdist``, and ``build`` commands. We would
also need something like ``egg_info``, but I think we could probably diverge a
bit from what setuptools does now and make it ``dist_info`` so it's backed by
the same standard that Wheel already uses. I think that these four commands
can support
everything we actually _need_ to support. This isn't everything we actually
use today, but since this would be opt-in, we can actually be more restrictive
about what build/installation tools would call and what level of fallback we
need to support for them.
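
To make the "required versus merely allowed" distinction concrete, here is a
small sketch. The command names come from the paragraph above;
``missing_commands`` is a hypothetical helper, and how a tool would actually
discover which commands a given setup.py supports is left open.

```python
# The four commands suggested above as the mandatory interface.
REQUIRED_COMMANDS = frozenset({"bdist_wheel", "sdist", "build", "dist_info"})


def missing_commands(supported):
    """Return the required commands that a build system does not offer.

    Extra commands in `supported` are fine; they are simply not required.
    """
    return sorted(REQUIRED_COMMANDS - set(supported))


# A setuptools-like command set that has not grown dist_info yet.
print(missing_commands(["sdist", "build", "bdist_wheel", "egg_info", "develop"]))
# prints ['dist_info']
```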

The way it would work when what we have available is a sdist is something like
this:

We download a sdist and we notice that there is a setup.cfg:setup_requires.
This toggles a stricter mode of operation where we no longer attempt to do as
many fallbacks or hacks to try and work around various broken shims with
setuptools. We read this key and install those items needed to execute the
setup.py. Once we do that, then pip would invoke ``setup.py bdist_wheel`` and
build a wheel from that [1]. Once we have a wheel built, we'll feed that data
back into the resolver [2] and use the runtime dependency information from
within that wheel to continue resolving the dependencies.

OR

We have an "arbitrary directory" (VCS, local FS, whatever) on disk that is not
being installed as an editable. In this case we'll call ``setup.py sdist``
first, then feed that into the above.

OR

We have an "arbitrary directory" (VCS, local FS, whatever) on disk that is
being installed as an editable. In this case, we'll call
``setup.py build --inplace`` first, then do something to ensure that the
inplace directory is on sys.path. This is currently undefined because I don't
know exactly what we'd need to do to make this work, but I think it should be
possible and will be more consistent. We'll probably also need something like
``setup.py egg_info`` or ``setup.py dist_info``, but balancing backwards compat
with removing setuptools specifics is something we'd need to figure out.
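
Put together, the three cases above amount to a small decision table. The
sketch below only plans the ``setup.py`` invocations and does not run
anything; the function name and the ``"sdist"`` / ``"directory"`` labels are
invented for illustration, and ``build --inplace`` is taken as written above
rather than from any existing command.

```python
def plan_setup_py_commands(source, editable=False):
    """Return the setup.py invocations for one item, in order.

    source is "sdist" for a downloaded sdist or "directory" for an
    arbitrary directory (VCS checkout, local path, and so on).
    """
    if source == "sdist":
        if editable:
            raise ValueError("an sdist cannot be installed as an editable")
        # Build a wheel, then resolve and install from that wheel.
        return [["setup.py", "bdist_wheel"]]
    if source == "directory":
        if editable:
            # What has to happen after this (sys.path handling, metadata)
            # is still undefined in the proposal.
            return [["setup.py", "build", "--inplace"]]
        # Produce an sdist first, then treat it like any other sdist.
        return [["setup.py", "sdist"], ["setup.py", "bdist_wheel"]]
    raise ValueError("unknown source kind: %r" % (source,))


for kind, editable in [("sdist", False), ("directory", False), ("directory", True)]:
    print(kind, editable, plan_setup_py_commands(kind, editable))
```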

So why do I think something like this would be better?

* It uses interfaces that are already there (for the most part) which means
  that it's easier for people to adapt their current tooling to it, and to do
  it in a way that will work with existing legacy packages.

* It defines an interface that build systems must adhere to, instead of
  defining an interface that lets you query for how you actually interact with
  the build system (sort of a plugin system). This removes indirection and in
  my experience, interfaces are generally less problematic than "plugins".

* It doesn't drastically change the mechanics of how the underlying thing works
  so it's a much smaller iteration on existing concepts rather than throwing
  everything out and starting from scratch.

* It moves us towards an overall installation "path" that I think will be
  beneficial for us and will help to reduce the combinatorial explosion of ways
  that a set of dependencies can be installed.

* It will be easier to integrate setuptools into this new system (since it
  largely already implements it) which means that it's easier for projects to
  upgrade piecemeal to this new system since it would only require
  dropping a ``setup.cfg`` with a ``setup_requires``. This also means we could
  adjust setuptools (like adding a dist_info command) and have the implicit
  setup.cfg:setup_requires be 'setuptools>=somever' instead of just
  'setuptools'.

What this doesn't solve:

* This essentially doesn't solve any of the dynamic vs static metadata issues
  with the legacy sdist format, but that's OK because it's just a small
  layering (a new setup.cfg feature) of a new feature combined with just
  standardizing what already exists. Drastically altering the sdist format to
  try and shoe-horn static metadata in there is probably not something that is
  going to work well in practice (and needs a more thought out, newer format).

* Dynamically specified _build_ requirements (or really, build requirements at
  all other than via setup_requires). You can sort of kludge dynamically
  specified build requirements by making a sort of meta-package that you put in
  your static setup_requires that generates dynamic runtime requirements. I
  think we should keep this step simple though, work on breaking the
  dependency on setuptools/distutils in this step, and then wait to see if
  that is "enough" or if we need to layer on additional features (like dynamic
  or otherwise separately declared build requirements).


[1] This isn't related to the Wheel cache. In this stricter mode we would only
    ever build a wheel and install from that. If caching is on then we'll save
    that wheel and reuse it next time, if caching is not on then we'll just
    throw that wheel away at the end of the run.

[2] Pretending for a minute that we have a real resolver.

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



