[Distutils] moving things forward

Sat May 7 06:00:27 EDT 2016

tl;dr version

I think you're right that terminology can be confusing. I think the
definitions people typically work to are:

1. The "packaging" or "release" process - the process (run on a
developer's machine) of creating files that get published for users to
download and install.
2. The "install" process - the process (run on a user's machine) of
taking a published file and making it available in their environment.
This consists of separate steps:
  2a. Optional, and to be avoided wherever possible (by distribution
of wheels) - the "build" step that takes a published file and
configures (compiles) it for the user's environment
  2b. The "install" step (confusion alert! - the "install" step is
only one step of the "install" *process*) that puts the files in the
correct places on the user's machine.

We're not interested in trying to dictate the "packaging" process -
pip isn't involved in that process at all (see flit for a system that
lets projects build releases in a completely different way).

Sigh. Even the tl;dr version is too long :-)

On 7 May 2016 at 01:55, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Chris Barker wrote:
>>
>> But I think there is consensus here that build systems need to be
>> customisable -- which means arbitrary code may have to be run.
>
> I think different people are using the word "build" in
> different ways here.
>
> To my mind, "building" is what the developer of a package
> does, and a "build system" is what he uses to do it. I
> don't care how much arbitrary code gets run during that
> process.

That is correct, and I agree with you that making a build process like
this declarative is not particularly useful. However...

> But when I do "python setup.py install" or "pip install"
> or whatever the recommended way is going to be, from my
> point of view I'm not "building" the package, I'm
> *installing* it.

Unfortunately, "python setup.py install" does not work that way - it
builds the project and then installs the files. So whether you want to
or not, you're building. That's basically why we're trying to make
"pip install foo" the canonical way of installing packages. So let's
ignore "setup.py install" for the rest of this discussion.

Now, for "pip install foo", *if the foo project provides a wheel
compatible with your system* then what you expect is what you get - a
pure install with no build step.

The problem lies with projects that don't supply wheels, only source.
Or unusual systems that we can't expect projects to have wheels for.
Or local checkouts ("pip install ."). In those cases, it's necessary
to do a build before you can install.

So while we're aiming for "pip install" to do a pure install from a
binary distribution 80% or more of the time, we can't avoid the fact
that occasionally the install will need to run an implicit build step.
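
To make the distinction concrete, here is a rough sketch of the
decision described above. It is an illustration only, not pip's actual
logic: PublishedFile, plan_install and the single platform_tag field
are invented for this example (real wheel compatibility checks involve
more than one tag).

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class PublishedFile:
    """Hypothetical record for one file a project has published."""
    filename: str
    kind: str                           # "wheel" or "sdist"
    platform_tag: Optional[str] = None  # simplified; only used for wheels


def plan_install(files: List[PublishedFile], supported_tags: Set[str]) -> List[str]:
    """Return the steps needed to install, preferring a compatible wheel."""
    # Case 1: a compatible wheel exists, so this is a pure install.
    for f in files:
        if f.kind == "wheel" and f.platform_tag in supported_tags:
            return ["install " + f.filename]
    # Case 2: only source is usable, so an implicit build step comes first.
    for f in files:
        if f.kind == "sdist":
            return ["build wheel from " + f.filename, "install built wheel"]
    raise LookupError("no usable file published")


files = [
    PublishedFile("foo-1.0-py3-none-win_amd64.whl", "wheel", "win_amd64"),
    PublishedFile("foo-1.0.tar.gz", "sdist"),
]
print(plan_install(files, {"win_amd64"}))
print(plan_install(files, {"linux_x86_64"}))
```

The first call finds a matching wheel and plans a single install step;
the second has no matching wheel, so it falls back to building from
the sdist before installing.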

> Confusion arises because the process of installation may
> require running a C compiler to generate extension modules.
> But figuring out how to do that shouldn't require
> running arbitrary code supplied by the developer. All the
> tricky stuff should have been done before the package
> was released.

I'm not sure I follow what you are suggesting here. Do you expect that
projects should be able to publish something (it's not exactly a
sdist, but it's not a wheel either, as it doesn't contain everything
built) that (somehow) contains simplified instructions on how to
build the various C/Fortran extensions supplied in the bundle as
source code? That's an interesting idea, but how would it work in
practice? The bundles would need to be platform specific, I assume? Or
would the user need to put all the various details of his system into
a configuration file somewhere (and update that file whenever he
installs a new library, updates his compiler, or whatever)? How would
this cope with (for example) projects on Windows that *have* to be
compiled with mingw, and not with MSVC?

> If it's having trouble finding some library or header
> file or whatever on my system, I'd much rather have a
> nice, clear declarative config file that I can edit to
> tell it where to find them, than some overly clever
> piece of python code that's great when it works but
> a pain to unravel when it doesn't.

This sounds to me more like an attempt to replace the "build" part of
distutils/setuptools with a more declarative system. While that may be
a worthwhile goal (I genuinely have no opinion on that) it's largely
orthogonal to the current discussions. Except in the sense that if you
were to build such a system, the proposals currently on the table
would allow you to ask pip to use that build system rather than
setuptools:

# Invented syntax, because syntax is what we *haven't* agreed on yet :-)
[buildsystem]
requires=gregsbuild
build_command=gregsbuild --make-wheel

Then if a user on a system for which the project doesn't provide a
binary wheel tries to install the project, your build system will
be downloaded, and the "gregsbuild --make-wheel" command will be run
to make a wheel. That's a one-off event - the wheel will be cached on
the user's system for future use.
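
As a rough sketch of that flow, using the same invented syntax: the
section and key names come from the made-up example above, while
plan_build and the wheel_cache dict are hypothetical helpers for
illustration, not anything pip provides. Nothing is downloaded or
executed here; the function only lists the steps.

```python
import configparser
import shlex

# The invented [buildsystem] section from the example above.
CONFIG = """\
[buildsystem]
requires=gregsbuild
build_command=gregsbuild --make-wheel
"""


def plan_build(config_text, project, wheel_cache):
    """List the steps an installer might take for a project with no usable wheel."""
    # A wheel built earlier is reused, so the build is a one-off event.
    if project in wheel_cache:
        return ["install cached wheel " + wheel_cache[project]]
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    section = parser["buildsystem"]
    requires = section["requires"].split()
    command = shlex.split(section["build_command"])
    steps = ["download " + name for name in requires]
    steps.append("run " + " ".join(command))
    steps.append("cache and install resulting wheel")
    return steps


print(plan_build(CONFIG, "foo", {}))
print(plan_build(CONFIG, "foo", {"foo": "foo-1.0.whl"}))
```

The second call shows the cached case: once a wheel exists on the
user's system, no build system is fetched and no build command runs.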

I think the key point to understand is that of necessity, "pip
install" runs two *steps* - one is the obvious install step, the other
is a build step. We're working to reduce the number of cases where a
build step is needed as far as we can, but the discussion here is
about making life easier for projects who can't provide wheels for all
their users, and need the build *step* (the one run on the user's
machine, not the build *process* - which I'd describe as a "release"
or "packaging" process, as it creates the distribution files made
available to users - that runs on the developer's machine).

Paul