[Distutils] A possible refactor/streamlining of PEP 517

Thomas Kluyver thomas at kluyver.me.uk
Thu Jul 6 12:35:36 EDT 2017


Thanks Nick for the detailed reply. I have read it carefully, and you've
probably convinced me to get back on board. Some more responses inline:

On Thu, Jul 6, 2017, at 03:38 PM, Nick Coghlan wrote:
> While I can completely understand how the current debate over whether
> or not the prepare_input_for_build_wheel hook is necessary or not
> would make you feel that way, I hope I can convince you that we're
> really just quibbling over a genuinely trivial arcane technical detail
> that I'd never let get in the way of flit being a full-fledged
> participant in the Python packaging ecosystem.

To be clear, I don't particularly care for the hook. I can see that it's
something of a kludge between two competing approaches.

What is important to me is that if a user installs from source the
obvious way (pip install .), failure to build an sdist does not result
in a failure to install. The extra hook was one approach to that, but
it's also OK by me if the frontend tries to make an sdist and falls
back to either copytree or an inplace build.
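
As a rough sketch of that fallback (illustration only: the
SdistUnavailable exception, the prepare_build_dir function and the
build_sdist_and_unpack callable are hypothetical names, not part of
PEP 517 or pip):

```python
import shutil
from pathlib import Path


class SdistUnavailable(Exception):
    """Raised by a (hypothetical) backend that cannot build an sdist."""


def prepare_build_dir(source_dir, build_sdist_and_unpack, work_dir):
    """Return a directory to build the wheel from.

    build_sdist_and_unpack(source_dir, dest) is a hypothetical callable
    that builds an sdist, unpacks it into dest and returns that path.
    It raises SdistUnavailable when it cannot (e.g. no VCS on PATH).
    """
    source_dir, work_dir = Path(source_dir), Path(work_dir)
    try:
        return build_sdist_and_unpack(source_dir, work_dir / "from_sdist")
    except SdistUnavailable:
        # Fall back to copying the tree rather than failing the install.
        dest = work_dir / "copied_tree"
        shutil.copytree(source_dir, dest)
        return dest
```

The point is only that the sdist failure is caught and handled by the
frontend, so the user never sees it as an install failure.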

> That is, the current point of contention is specifically about how we
> want tools to behave when we're starting with a source directory that:
> 
> 1. Doesn't include VCS metadata (e.g. it's been exported as a tarball
> rather than cloned)
> 2. The build frontend doesn't want to use as the basis for an in-place
> build
> 3. The build frontend doesn't want to blindly copy into a separate
> build directory
> 
> So just by way of those preconditions, we're already well outside the
> most common package installation workflows.

One of my concerns in this debate is that this is presented as a very
rare corner case that we don't have to worry about too much. I agree
that it's not the most common case, but I think it's common enough that
we should care about making it easy, given that:

- Condition 1 also covers directories with VCS metadata where the VCS
tools are not on $PATH. Another case occurred to me recently: Windows
users who have installed git but not added it to the default PATH.
- Conditions 2 and 3 seem likely to be the default for a source install
with pip.

As an order of magnitude, I'd estimate this is ~10% of installs from a
source directory - which is to say, moderately common.
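
To make condition 1 concrete, here is a minimal sketch of the check a
tool would effectively be making. The function name and its parameters
are assumptions for illustration; only shutil.which and pathlib are
real APIs here. VCS metadata in the tree is not enough if the VCS
executable cannot be found:

```python
import shutil
from pathlib import Path


def vcs_metadata_usable(source_dir, vcs_dir=".git", vcs_exe="git",
                        which=shutil.which):
    """True only if the tree has VCS metadata AND the tool is on PATH."""
    has_metadata = (Path(source_dir) / vcs_dir).exists()
    has_tool = which(vcs_exe) is not None
    return has_metadata and has_tool
```

The Windows case above is the one where has_metadata is true but
has_tool is false.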

> That perspective is embodied in the hypothetical proposal to add a
> "--build-strategy" option to pip that would allow folks building
> wheels to choose between:
> 
> - creating and unpacking an sdist and building a wheel from that
> - copying the directory tree and building a wheel from that
> - building a wheel directly from the original directory
> 
> (Perhaps with a variant that tries to create and unpack the sdist
> first, and only if that fails falls back to copying the entire tree)

This could be useful flexibility for advanced users. But I worry that
pip will use the 'sdist' build strategy by default, and expect users to
handle cases where that fails. I think this would be a mistake. From a
user perspective, it would mean:

- "pip install ." is the recommended way to install from source, but in
some situations it doesn't work.
- Adding the mystic incantation "--build-strategy direct" makes it work,
yet makes no visible difference to the result.

Of course, I also have a vested interest in things not working this way:
I would get a steady trickle of people asking "why does flit require a
VCS to install from source?" From my perspective, it doesn't require
that, but I would be unable to 'fix' it.

Donald:
> I think it is a complete non-starter to suggest removing installation from sdist support from pip

I'm certainly not suggesting that (hopefully this was already clear, but
just in case ;-)

> the question then becomes do we want to try and push things towards only having *one* primary flow through the state machine of Python’s packaging, or do we want to support transitions that allow you to “skip” steps. 

My idealised view of the state machine is something like this:

wheel <-- source tree <--> sdist

I agree that there's a problem with losing important data when you go
[source tree --> sdist --> source tree] - in fact this is one of the
pain points I was trying to avoid with flit. But I don't like the idea
of solving that by saying that all wheels must have passed through an
sdist; it feels like a redundant there-and-back-again journey.

So how else could we tackle the systematic problem? It's definitely a
good idea to ensure that [stree --> sdist --> stree --> wheel] (where
stree is the source tree) doesn't miss out anything that
[stree --> wheel] includes, but I'd focus on doing this in developer
tools, e.g.:

1. Tools such as flit could check it when you're building a release
2. Tools running on CI services could build both and compare them
3. Bots could scan PyPI for projects with both a .whl and a .tar.gz,
build a wheel from the tarball, compare them, and notify the maintainer
if there's a problem.

In the short term, I reckon that option 2 is the most promising - we can make a
convenient pip-installable tool and promote it as good practice for
testing that your builds work. But in any case, I see a range of options
for tackling this while leaving open the direct [stree --> wheel]
pathway.
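
A minimal sketch of what such a comparison tool could do, assuming the
two wheels have already been built (one directly from the source tree,
one via an sdist). The function names are hypothetical, and this only
compares file names inside the archives, not file contents:

```python
import zipfile


def wheel_contents(path):
    """Return the set of file names inside a wheel (a zip archive)."""
    with zipfile.ZipFile(path) as zf:
        return {name for name in zf.namelist() if not name.endswith("/")}


def compare_wheels(direct_wheel, via_sdist_wheel):
    """Report files present in only one of the two wheels."""
    direct = wheel_contents(direct_wheel)
    via_sdist = wheel_contents(via_sdist_wheel)
    return {
        "missing_via_sdist": sorted(direct - via_sdist),
        "extra_via_sdist": sorted(via_sdist - direct),
    }
```

A non-empty "missing_via_sdist" list would indicate that the sdist is
dropping something the direct build includes.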

> When I looked at flit it also suffered the same problem if you forgot to commit a file to the VCS repository (which meant it wouldn’t get added to the sdist) 

You have to explicitly ignore a file to hit this. If you have untracked
but non-ignored files in your repo, flit will refuse to build an sdist
at all. I recognise that this is quite strict and still doesn't entirely
prevent the issue, and I may refine it in the future, but I hope it
makes such problems hard to hit accidentally.
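
For illustration, a check along these lines can be sketched with git's
own listing of untracked, non-ignored files. This is not flit's actual
code; the function names are made up, and it assumes a git repository
with git on PATH:

```python
import subprocess


def parse_ls_files(output):
    """Split NUL-separated `git ls-files -z` output into file names."""
    return [name.decode("utf-8", "surrogateescape")
            for name in output.split(b"\0") if name]


def untracked_files(repo_dir):
    """List untracked files that are not covered by ignore rules."""
    result = subprocess.run(
        ["git", "ls-files", "--others", "--exclude-standard", "-z"],
        cwd=repo_dir, check=True, stdout=subprocess.PIPE,
    )
    return parse_ls_files(result.stdout)


def refuse_if_untracked(stray):
    """Raise if any untracked, non-ignored files were found."""
    if stray:
        raise RuntimeError(
            "Refusing to build sdist; untracked files: " + ", ".join(stray)
        )
```

Calling refuse_if_untracked(untracked_files(".")) before building
would stop the build unless every file is either tracked or explicitly
ignored.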

Thomas

