Re: [Distutils] A possible refactor/streamlining of PEP 517

July 14, 2017

On Fri, Jul 7, 2017 at 8:27 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
...
The latest round of discussions has been enlightening, as they have
allowed us to articulate that from pip's point of view, the key
requirement is to be able to tell a backend not to include anything
that wouldn't be included when building via an sdist.
From flit's point of view, Thomas wants frontends to be able to express
a preference between two different failure modes:
1. The frontend wants the wheel build to *guarantee* it exactly
matches going via the sdist path
2. The frontend wants a working wheel build more than it cares about
matching the sdist path
Both of these are handled neatly by my draft posted at the beginning
of this thread.

OTOH this whole 11th hour discussion of forcing every build system to
have in-tree and out-of-tree build support is solving some other
problem. I'm not entirely sure what that problem is -- I don't think
anyone has articulated it. Your "key requirement" is technically
vacuous -- by definition, *any* correct build backend will only
include the things that would be included when building via an sdist.

Some possible problems that I've seen mentioned in the thread include:

- pip doesn't trust build systems to properly support incremental
builds, so it wants to force them to throw away the build artifacts
after every build
- pip wants to enforce builds going via sdists
- Thomas / Ralf / I are frustrated at the idea of not supporting
incremental builds

But the in-tree/out-of-tree build proposal doesn't address any of
these problems. Part of the support for it seems to be that it
*sounds* like it might somehow provide a compromise between the folks
arguing over what pip should do by default, because it provides a way
of doing incremental builds that kind of looks like pip's current hack
for doing non-incremental builds. But I think this is illusory. We
still have to argue over what pip will actually do by default; all
this does is replace 1 way of doing incremental builds with 2 ways,
and if pip doesn't want to do incremental builds it'll ignore both.

Seriously, what problem is this solving? How can we even have a
discussion about whether it's the best solution when we don't know
that?

In my previous emails I was trying to avoid getting into the
nitty-gritty of actually critiquing the proposal, because I think it's
a fundamental mistake to even try to hash out this kind of complex
design at the last minute; for every issue I think of now there's
probably another one that none of us have noticed yet. But since
everyone else seems so gung-ho to ship this thing without even having
that minimal discussion, here are some concerns and questions:

If we require every project to support both in-tree and out-of-tree
builds, then projects that don't really support out-of-tree builds now
need to implement the copytree hack themselves, and they might get it
wrong. For example, if you don't correctly clear out the old build
tree before copying, your new build could potentially be corrupted by
artifacts from your old build. (I'm having flashbacks to the bug
reports we get on numpy from people who used setup.py install to
upgrade, and because it doesn't uninstall the old version they end up
with some combination of old and new versions overlaid on top of each
other.) This is a whole new potential failure mode that this proposal
is introducing. Is that acceptable?
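
To make that failure mode concrete, here is a minimal sketch of the
copytree hack done wrongly and done carefully. The function names and
file names are made up for illustration, and `dirs_exist_ok` assumes
Python 3.8 or later; no real backend is claimed to work this way.

```python
import shutil
import tempfile
from pathlib import Path


def naive_copy(src: Path, build_dir: Path) -> None:
    # Copies on top of whatever is already there, so files that were
    # deleted from the source since the last build survive in build_dir.
    shutil.copytree(src, build_dir, dirs_exist_ok=True)  # Python 3.8+


def clean_copy(src: Path, build_dir: Path) -> None:
    # Clear out the old build tree first so nothing stale leaks through.
    if build_dir.exists():
        shutil.rmtree(build_dir)
    shutil.copytree(src, build_dir)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        src = root / "src"
        src.mkdir()
        (src / "old_module.py").write_text("x = 1\n")
        (src / "kept.py").write_text("y = 2\n")

        naive_dir = root / "build_naive"
        clean_dir = root / "build_clean"
        naive_copy(src, naive_dir)
        clean_copy(src, clean_dir)

        # The project removes a module between two builds.
        (src / "old_module.py").unlink()
        naive_copy(src, naive_dir)
        clean_copy(src, clean_dir)

        # Report whether the deleted module is still in each build tree.
        return ((naive_dir / "old_module.py").exists(),
                (clean_dir / "old_module.py").exists())
```

With the naive copy the deleted module is still sitting in the build
tree on the second build, which is the same kind of old-plus-new overlay
as the setup.py install upgrade problem.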

In fact, I'm guessing that pip will not actually cache build
directories and re-use them, and will only ever supply empty
directories as the out-of-tree build dir. This means that lots of
projects are likely to be released without ever testing their
build-with-a-non-empty-out-of-tree-build-dir path, and thus it won't
work. What are the chances that this turns into another feature that
exists in theory but in practice it rusts over and can't be used,
because trying to do so would break too often?

What happens if a project switches from scons to cmake or vice-versa,
and they get passed a build tree that contains foreign build
artifacts? Are they prepared to detect this and do something sensible?
Traditionally reusing a build context (either an in-place build tree
or an out-of-place build tree) is something that only developers do,
and they do it manually, so this isn't a problem -- you just send an
occasional email to the list saying "hey I just flipped the switch on
the build systems, make sure to do a 'git clean' before your next
build", or people just reflexively do a 'git clean' any time they get
weird results. But now we're proposing to make this a first-class
feature to be used by automatic unsupervised build pipelines; are they
going to end up producing garbage?
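
For illustration, one way a backend could try to defend itself is to
stamp the build directory and wipe it when the stamp does not match.
This is only a sketch under assumptions: the stamp file name, the
`prepare_build_dir` helper, and the backend identifiers are
hypothetical, and nothing in the proposal requires or describes this.

```python
import shutil
import tempfile
from pathlib import Path

STAMP_NAME = ".build-backend-stamp"  # hypothetical marker file name


def prepare_build_dir(build_dir: Path, backend_id: str) -> bool:
    """Make build_dir safe to reuse; return True if it had to be wiped."""
    stamp = build_dir / STAMP_NAME
    wiped = False
    if build_dir.exists() and any(build_dir.iterdir()):
        previous = stamp.read_text() if stamp.is_file() else None
        if previous != backend_id:
            # Unknown or foreign artifacts: start over rather than guess.
            shutil.rmtree(build_dir)
            wiped = True
    build_dir.mkdir(parents=True, exist_ok=True)
    stamp.write_text(backend_id)
    return wiped


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        build_dir = Path(tmp) / "build"
        first = prepare_build_dir(build_dir, "scons")
        (build_dir / ".sconsign.dblite").write_text("stale state")
        second = prepare_build_dir(build_dir, "scons")
        kept = (build_dir / ".sconsign.dblite").exists()
        # The project switches build systems between releases.
        third = prepare_build_dir(build_dir, "cmake")
        gone = not (build_dir / ".sconsign.dblite").exists()
        return first, second, kept, third, gone
```

Every backend would have to remember to do something like this, and to
get it right, which is the point of the question.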

Are frontends allowed to move the out-of-tree build directory to
another parent directory? Another filesystem? Another machine? Another
operating system? What do they have to preserve if they try?
Timestamps? Inode numbers? (Using inode numbers as part of a change
detection algorithm is a totally reasonable thing for a build system
to do.)
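
As a sketch of why this matters, assume a build system that fingerprints
files by inode number, size, and modification time (the `fingerprint`
helper below is hypothetical, not taken from any particular tool). A
byte-identical copy in a new location then looks like a changed file:

```python
import os
import shutil
import tempfile
from pathlib import Path


def fingerprint(path: Path) -> tuple:
    # Cheap change detection: no need to read the file contents.
    st = os.stat(path)
    return (st.st_ino, st.st_size, st.st_mtime_ns)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        original = root / "module.c"
        original.write_text("int answer(void) { return 42; }\n")

        relocated = root / "relocated" / "module.c"
        relocated.parent.mkdir()
        shutil.copy2(original, relocated)  # copy2 also copies the mtime

        same_content = original.read_bytes() == relocated.read_bytes()
        same_fingerprint = fingerprint(original) == fingerprint(relocated)
        return same_content, same_fingerprint
```

The contents match but the fingerprints do not, because the copy is a
different inode. A frontend that relocates a build directory would
silently invalidate (or worse, confuse) a cache keyed this way.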

What if a source tree has previously been used for in-place builds, is
this allowed to make future out-of-place builds break? Off the top of
my head I know openssl's build system has an in-place XOR out-of-place
restriction [1]. It looks like CMake is documented to have one as well
[2]. There may be no automatic way to *ever* do an out-of-place build
in a tree that has previously had an in-place build; you might have to
throw out that tree and start over. Is a build system like this
compliant? (Notice from [1] that it sounds like openssl handles this
situation by printing a message to the console saying "lol this build
is probably broken idek" and then happily produces a broken build. Notice
from [2] that cmake might randomly decide to do an in-place build even
if you requested an out-of-place build.)

Speaking of which, why do we force backends to support both in-place
and out-of-place? Out-of-place is strictly more powerful (if it works
at all). Why not make that the only mode of operation?

Or: another idea that came up was just passing a flag to the build
backend saying whether the source tree was temporary or not, which has
the advantage that it's clearly defined (pip certainly knows whether
it's going to delete the source tree after it finishes the build), and
potentially side-steps a lot of these problems with managing the
out-of-tree build cache. Would that be better or worse? I don't know
how to answer that given that I don't know what problem we're trying
to solve here.

Is pip planning to enforce that the source directory is left
unmodified? If not, then do we expect that projects will actually skip
modifying the source tree in practice? It's *very* easy to
accidentally break this rule without realizing it. Will this happen
often enough to make this non-viable for whatever we're trying to do
here? (I guess it has something to do with getting trustworthy builds
from untrustworthy build systems, so this seems relevant.)
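
If a frontend did want to check this, a minimal sketch could snapshot
the source tree before and after the build and compare. Everything here
is an assumption for illustration: the `snapshot` and `modified_paths`
helpers are hypothetical, `careless_build` is a stand-in for a real
backend, and this is not something pip is known to do.

```python
import hashlib
import tempfile
from pathlib import Path


def snapshot(tree: Path) -> dict:
    """Map each file's relative path to a SHA-256 digest of its contents."""
    return {
        path.relative_to(tree).as_posix():
            hashlib.sha256(path.read_bytes()).hexdigest()
        for path in tree.rglob("*")
        if path.is_file()
    }


def modified_paths(before: dict, after: dict) -> list:
    """Paths that were added, removed, or changed between two snapshots."""
    every_path = before.keys() | after.keys()
    return sorted(p for p in every_path if before.get(p) != after.get(p))


def careless_build(tree: Path) -> None:
    # Stand-in for a backend that quietly writes a generated file
    # into the source tree during the build.
    (tree / "pkg" / "_speedups.c").write_text("/* generated */\n")


def demo() -> list:
    with tempfile.TemporaryDirectory() as tmp:
        tree = Path(tmp)
        (tree / "pkg").mkdir()
        (tree / "pkg" / "__init__.py").write_text("")
        before = snapshot(tree)
        careless_build(tree)
        return modified_paths(before, snapshot(tree))
```

Hashing every file on every build is not free, and it only detects the
violation after the fact; it does nothing to stop the backend from
having already used whatever it wrote.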

Does anyone know how widely supported out-of-place builds are *in real
life*? I know that all the major build frameworks have the
infrastructure, but I'm one of those weirdos who habitually does
out-of-place builds whenever I build software by hand, and back when I
used to build a lot of software by hand it was *very* common for this
mode of operation to be broken due to whatever weird hack was used
inside a specific project's build system. It's been ~10 years since I
did that often though; maybe things have gotten better?

------

I'm really not trying to be an asshole here :-(. Being the asshole is
extremely unpleasant, and I just had to throw myself in front of the
prepare_build_files train before everyone suddenly realized that
whoops, maybe it wasn't as great an idea as they thought. But like...
everyone does understand right that whatever we put in here, we're
stuck with forever? This isn't like some new project where you can
release 0.0.1 and then spend a few years noodling around with the API
before you release 1.0 and start promising backcompat. This is version
0.0.1 and 1.0 at the same time, and also we can't write any tests
until after we release it, and we may never be able to release a 2.0.
(Look at WSGI -- I mean, it's tremendously popular and influential,
obviously they did some things right, but the spec is full of awful
stuff that everyone hates and all its imitators dropped, but fixing it
is impossible.)

I just don't understand how everyone has the confidence that this
proposal is a mature solid thing that will stand the test of time.
Maybe it will! But how can you possibly know that when we haven't even
scratched the surface of all its implications? Shouldn't the
prepare_build_files thing be a clue that your judgement might not be
100% reliable on these things?

And the alternative is just like... go ahead and ship something that
only supports in-place builds directly (e.g. my draft at the top of
this thread), add out-of-place builds later as an optional extension
if it's useful, if the frontend really wants an out-of-place build it
can fall back on shutil.copytree (plus as a bonus it *knows* that the
backend can't make use of the resulting temporary build directory, so
it can throw it away instead of caching it, and there's zero chance of
cross-build pollution). If anything this seems like the end result
would be *superior* to this proposal, and I've seen zero evidence that
out-of-place builds are something we need to solve now.
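
The frontend-side fallback is small enough to sketch. This assumes a
backend that only knows how to build in place; `build_in_place` and
`toy_backend` are hypothetical stand-ins, not the hook names from any
draft of the PEP.

```python
import shutil
import tempfile
from pathlib import Path


def build_wheel_out_of_place(source_dir: Path, wheel_dir: Path,
                             build_in_place) -> str:
    """Copy the tree, build inside the copy, then discard the copy."""
    with tempfile.TemporaryDirectory() as scratch:
        copy = Path(scratch) / "src"
        shutil.copytree(source_dir, copy)
        # The backend may scribble in the copy as much as it likes;
        # the whole thing is deleted when this block exits.
        return build_in_place(copy, wheel_dir)


def toy_backend(source_dir: Path, wheel_dir: Path) -> str:
    # Stand-in for an in-place backend: leaves build artifacts in the
    # source tree and writes a placeholder wheel file.
    (source_dir / "build").mkdir()
    (source_dir / "build" / "temp.o").write_text("object code")
    name = "demo-0.0.1-py3-none-any.whl"
    (wheel_dir / name).write_text("not a real wheel")
    return name


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        source = root / "project"
        source.mkdir()
        (source / "setup.py").write_text("")
        wheels = root / "wheels"
        wheels.mkdir()

        name = build_wheel_out_of_place(source, wheels, toy_backend)
        source_untouched = (
            sorted(p.name for p in source.iterdir()) == ["setup.py"]
        )
        return source_untouched, (wheels / name).is_file()
```

Because the scratch copy never outlives the call, there is nothing to
cache and nothing from one build that can pollute the next.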

Can we please just not? I actually have a list of fiddly details that
need to be discussed about the core part of the proposal, but I don't
see how we'll ever get to the point of nailing down these kinds of
details when all the oxygen is going into this kind of proposal whose
implications are too complicated for us to even understand.
...
...
PEP 517 was written in 2015...
And PEP 426 was written in 2012. Standards development timelines can
get looong when the status quo at least kinda sorta works, and nobody
has commercial deadlines forcing them to push to standardize new
interfaces before a genuine consensus has developed :)
I humbly suggest that this isn't an immutable fact of nature, but
rather there are strategies that we can adopt intentionally to reduce
the chance of repeating PEP 426's fate. If we want to.

-n

[1] https://mta.openssl.org/pipermail/openssl-dev/2016-June/007364.html
[2] "Note: Before performing an out-of-source build, ensure that all
CMake generated in-source build information is removed" --
https://cmake.org/Wiki/CMake_FAQ#What_is_an_.22out-of-source.22_build.3F

-- 
Nathaniel J. Smith -- https://vorpus.org
