[Distutils] moving things forward (was: wheel including files it shouldn't)
Nathaniel Smith
njs at pobox.com
Thu May 5 05:47:29 EDT 2016
On Wed, May 4, 2016 at 11:57 PM, Robert Collins
<robertc at robertcollins.net> wrote:
> On 5 May 2016 at 18:32, Nathaniel Smith <njs at pobox.com> wrote:
>> On Wed, May 4, 2016 at 10:42 PM, Robert Collins
>>...
>>> Yes, things will break: anyone using this will need a new pip, by
>>> definition. Not everyone will be willing to wait 10 years before using
>>> it :).
>>
>> Just to clarify (since we seem to agree): I meant that if pip starts
>> interpreting an existing setup.cfg thing, then the new-pip/old-package
>> situation could break, which would be bad.
>
> No. Old pip + new package will break; new pip + old package is entirely safe AFAICT.
We're talking past each other... I'm saying, *if* pip started
reinterpreting some existing thing as indicating setup-requirements,
*then* things would break. You're saying, pip isn't going to do that,
so they won't. So we're good :-)
>>>> - IMO an extremely valuable aspect of this new declarative
>>>> setup-requirements thing is that it gives us an opportunity to switch
>>>> to enforcing the accuracy of this metadata. Right now we have a swamp
>>>> we need to drain, where there's really no way to know what environment
>>>> any given setup.py needs to run. Sometimes there are setup_requires,
>>>> sometimes not; if there are setup_requires then sometimes they're
>>>
>>> Huh? I've not encountered any of this, ever. I'd love some examples to
>>> go look at. The only issue I've ever had with setup_requires is the
>>> easy_install stuff it ties into.
>>
>> I don't think I've ever seen a package that had accurate
>> setup_requires (outside the trivial case of packages where
>> setup_requires=[] is accurate). Scientific packages in particular
>> universally have undeclared setup requirements.
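To make that concrete, here's the classic pattern (a hypothetical
package, but you'll find the same shape all over scientific Python):

    # setup.py for a hypothetical package "frobnicate"
    import numpy  # <-- build requirement, declared nowhere
    from setuptools import setup, Extension

    setup(
        name="frobnicate",
        version="1.0",
        ext_modules=[
            Extension("frobnicate._core",
                      ["frobnicate/_core.c"],
                      include_dirs=[numpy.get_include()]),
        ],
        # No setup_requires=["numpy"], so this setup.py only runs if
        # numpy happens to be installed already.
    )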
>
> Are those requirements pip installable?
Either they are, or they will be soon.
> ..
>>> I'm very much against forcing isolated build environments as part of
>>> this effort. I get where you are coming from, but really it conflates
>>> two entirely separate things, and you'll *utterly* break building
>>> anything with dependencies that are e.g. SWIG based unless you
>>> increase the size of the PEP by about 10-fold. (That's not hyperbole, I
>>> think).
>>
>> Okay, now's my turn to be baffled :-). I literally have no idea what
>> you're talking about here. What would this 10x longer PEP be talking
>> about? Why would this be needed?
>
> Take an i386 Linux machine, and build something needing pyqt5 on it
> :). Currently, you apt-get/yum/dnf/etc install python-pyqt5, then run
> pip install. If using a virtualenv, you enable system site-packages.
>
> When you introduce isolation, the build will only have the standard
> library + whatever is declared as a dep: and pyqt5 has no source on
> PyPI.
>
> So the 10x thing is defining how the thing doing the isolation (e.g.
> pip) should handle things that can't be installed but are already
> available on the system.
>
> And that has to tunnel all the way out to the user, because it's
> context-specific; it's not an attribute of the dependencies per se
> (since new releases can add or remove this situation), nor of the
> consuming thing (same reason).
# User experience today on i386
$ pip install foo
<... error: missing pyqt5 ...>
$ apt install python-pyqt5
$ pip install foo

# User experience with build isolation on i386
$ pip install foo
<... error: missing pyqt5 ...>
$ apt install python-pyqt5
$ pip install --no-isolated-environment foo
It'd even be straightforward for pip to notice that the requirement
that it failed to satisfy is already satisfied by the ambient
environment, and suggest --no-isolated-environment as a solution.
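As a rough sketch of the kind of check I mean (invented names, not
actual pip internals):

    # Hypothetical helper inside pip's build-isolation code path
    import pkg_resources

    def suggest_no_isolation(failed_req):
        """If a build requirement that we failed to install into the
        isolated environment is already satisfied by the ambient
        environment, point the user at the escape hatch."""
        try:
            pkg_resources.require(str(failed_req))
        except (pkg_resources.DistributionNotFound,
                pkg_resources.VersionConflict):
            return  # genuinely unsatisfiable; nothing to suggest
        print("Note: %s is already installed on this system; "
              "retrying with --no-isolated-environment may work."
              % failed_req)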
> Ultimately, it's not even an interoperability question: pip could do
> isolated builds now, if it chose, and it has no ramifications as far
> as PEPs etc are concerned.
That's not true. In fact, it seems dangerously naive :-/
If pip just went ahead and flipped a switch to start doing isolated
builds now, then everything would burst into flame and there would be
a howling mob in the bug tracker. Sure, there's no PEP saying we
*can't* do that, but in practice it's utterly impossible.
If we roll out this feature without build isolation, then next year
we'll still be in the exact same situation we are today -- we'll have
the theoretical capability of enabling build isolation, but everything
would break if we flipped it on, so in practice we won't be able to.
The reason I'm being so intense about this is that AFAICT these are all true:
Premise 1: Without build isolation enabled by default, in practice
everyone will putter along putting up with broken builds all the
time. It's *incredibly* easy to forget to declare a build dependency;
it's the kind of mistake that every new user makes, and that
experienced users keep making too.
Premise 2: We can either enable build isolation together with the new
static bootstrap requirements, or we can never enable build isolation
at all, ever.
Conclusion: If we want to ever reach a state where builds are
reliable, we need to tie build isolation to the new static metadata.
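(To be concrete about what I mean by "static metadata": something
along the lines of a new section in setup.cfg -- the exact spelling
below is a strawman, not a spec:

    [bootstrap]
    requires = setuptools
               wheel
               numpy >= 1.10

i.e., a list that pip can read *without* running setup.py, and which
defines exactly what goes into the isolated build environment.)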
If you have some clever plan for how we could practically transition
to build isolation without having packages opt in via a new feature,
then that would be an interesting counter-argument; or an alternative
plan for how to reach a point where build requirements are accurate
without being enforced; or ...?
> ...
>> What are these things that aren't pip-installable and why isn't the
>> solution to fix that? I definitely don't want to break things that
>> work now, but providing new features that incentivize folks to clean
>> up their stuff is a good thing, surely? Yeah, it means that the
>> bootstrap-requirements stuff will take some time and cleanup to
>> spread, but that's life.
>
> We've a history in this group of biting off too much and things not
> getting executed. We're *still* in the final phases of deploying
> PEP-508, and it was conceptually trivial. I'm not arguing that we
> shouldn't make things better, I'm arguing that tying two separate
> things together because we *can* seems, based on the historical
> record, to be unwise.
My argument is not that we can, it's that we have to :-).
>> We've spent a huge amount of effort on reaching the point where pretty
>> much everything *can* be made pip installable. Heck, *PyQt5*, which is
>> my personal benchmark for a probably-totally-unpackageable package,
>> announced last week that they now have binary wheels on pypi for all
>> of Win/Mac/Linux:
>>
>> https://pypi.python.org/pypi/PyQt5/5.6
>>
>> I want to work towards a world where this stuff just works, not keep
>> holding the entire ecosystem back with compromise hacks to work around
>> a minority of broken packages.
>
> Sure, but the underlying problem here is that manylinux is a 90%
> solve: it's great for the common cases, but it doesn't actually solve
> the actual baseline problem: we can't represent the actual system
> dependencies needed to rebuild many Python packages.
>
> pyqt5 not having i386 is just one trivially egregious case. ARM32 and
> 64 are going to be super common, Power8 another one, let alone less
> common but still extant and used architectures like PPC, Itanium, or
> new ones like x86_32 [If I remember the abbreviation right - no, it's
> not i386].
(it's x32)
manylinux is helpful here, but it's not necessary -- build isolation
just requires that the dependencies be pip-installable, whether from
source or otherwise. In practice the wheel cache will kick in and
handle most of the work.
> Solve that underlying problem - great, then isolation becomes an
> optimisation question for things without manylinux wheels. But if we
> don't solve it then isolation becomes a 'Can build X at all' question,
> which is qualitatively different.
More like 'can build X at all (without adding one command line
option)'. And even this is only if you're in some kind of environment
that X upstream doesn't support -- no developer is going to make a
release of X with build isolation turned on unless build isolation
works on the platforms they care about.
> I'm all for solving the underlying problem, but not at the cost of
> *not solving* the 'easy_install is triggered' problem for another
> X months while that work takes place.
>
>
>>> The reality, AFAICT, is that most projects with undeclared build deps
>>> today get corrected fairly quickly: a bug is filed, folk fix it, and
>>> we move on. A robotic system that isolates everything such that folk
>>> *cannot* fix it is much less usable, and I'm very much in favour of
>>> pragmatism here.
>>
>> Again, in my world ~100% of packages have undeclared build deps...
>
> So - put a patch forward to pip to do isolated builds. If/when bug
> reports come in, we can solve them there. There's no standards body
> work involved in that as far as I can see....
See above...
-n
--
Nathaniel J. Smith -- https://vorpus.org