[Distutils] [Numpy-discussion] Proposal: stop supporting 'setup.py install'; start requiring 'pip install .' instead

Sat Nov 7 09:57:24 EST 2015

On 7 November 2015 at 13:55, Ralf Gommers <ralf.gommers at gmail.com> wrote:
> On Sat, Nov 7, 2015 at 2:02 PM, Paul Moore <p.f.moore at gmail.com> wrote:
>>
>> On 7 November 2015 at 01:26, Chris Barker - NOAA Federal
>> <chris.barker at noaa.gov> wrote:
>> > So what IS supposed to be used in the development workflow? The new
>> > mythical build system?
>
> I'd like to point out again that this is not just about development
> workflow. This is just as much about simply *installing* from a local git
> repo, or downloaded sources/sdist.

Possibly I'm misunderstanding here.

> The "pip install . should reinstall" discussion in
> https://github.com/pypa/pip/issues/536 is also pretty much the same
> argument.

Well, that one is about pip reinstalling if you install from a local
directory, and not skipping the install if the local directory version
is the same as the installed version. As I noted there, I'm OK with
this, it seems reasonable to me to say that if someone has a directory
of files, they may have updated something but not (yet) bumped the
version.

The debate over there has gone on to whether we force reinstall for a
local *file* (wheel or sdist) which I'm less comfortable with. But
that's being covered over there.

The discussion *here* is, I thought, about skipping build steps when
possible because you can reuse build artifacts. That's not "should pip
do the install?", but rather "*how* should pip do the install?"
Specifically, to reuse build artifacts it's necessary to *not* do what
pip currently does for all (non-editable) installs, which is to
isolate the build in a temporary directory and do a clean build.
That's a sensible debate to have, but it's very different from the
issue you referenced.
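
To make the "how" distinction concrete, here is a minimal sketch of the
two strategies. It is not pip's actual code. The `build` argument is a
hypothetical callable standing in for the real build step, and both
function names are invented for illustration.

```python
import shutil
import tempfile
from pathlib import Path


def isolated_build(source_dir, build):
    """Copy source_dir to a temporary directory and run build() there.

    `build` is a hypothetical callable that takes the directory to build
    in and returns some result.
    """
    with tempfile.TemporaryDirectory() as tmp:
        build_dir = Path(tmp) / "build"
        # Work on a copy, so the build cannot write into source_dir.
        shutil.copytree(source_dir, build_dir)
        return build(build_dir)


def in_place_build(source_dir, build):
    """Run build() directly in source_dir.

    Anything the build writes stays in source_dir, where a later build
    could pick it up again.
    """
    return build(Path(source_dir))
```

With the first function, whatever the build produces disappears with
the temporary directory, so every run starts from the copied sources.
With the second, build outputs stay next to the sources, which is what
makes incremental rebuilds possible.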

IMO, the discussions currently are complex enough that isolating
independent concerns is crucial if anyone is to keep track. (It
certainly is for me!)

>> Fair question. Unfortunately, the answer is honestly that there's no
>> simple answer - pip is not a bad option, but it's not its core use
>> case so there are some rough edges.
>
> My impression is that right now pip's core use-case is not "installing", but
> "installing from PyPI (and similar repos)". There are a lot of rough edges
> around installing from anything on your own hard drive.

Not true. The rough edges are around installing things where (a) you
don't want to rely on the invariant that name and version uniquely
identify an installation (that's issue 536) and (b) where you don't
want to do a clean build, because building is complex, slow, or
otherwise something you want to optimise (that's this discussion).

I routinely download wheels and use them to install. I also sometimes
download sdists and install from them, although 99.99% of the time, I
download them, build them into wheels and install them from wheels. It
*always* works exactly as I'd expect. But if I'm doing development, I
use -e. That seems to be the problem here: there are rough edges if
you want a development workflow that doesn't rely on editable
installs. I think that's what I already said :-)

>> I'd argue that the best way to use
>> pip is with pip install -e, but others in this thread have said that
>> doesn't suit their workflow, which is fine. I don't know of any other
>> really good options, though.
>>
>> I think it would be good to see if we can ensure pip is useful for
>> this use case as well, all I was pointing out was that people
>> shouldn't assume that it "should" work right now, and that changing it
>> to work might involve some trade-offs that we don't want to make, if
>> it compromises the core functionality of installing packages.
>
> It might be helpful to describe the actual trade-offs then, because as far
> as I can tell no one has actually described how this would either hurt
> another use-case or make pip internals much more complicated.

1. (For issue 536, not this thread) Pip and users can't rely on the
invariant that name and version uniquely identify a release. You could
have version 1.2dev4 installed, and it may have come from your local
working directory (with changes you made) or from a wheel that's on
your local hard drive that you built last week, or from the release on
PyPI you made last month. All 3 may behave differently. Also wheel
caching is based on name/version - it would need to be switched off in
cases where name/version doesn't guarantee repeatable code.
2. (For here) Builds are not isolated from what's in the development
directory. So if you have your sdist definition wrong, what you build
locally may work, but when you release it it may fail. Obviously that
can be fixed by proper development and testing practices, but pip is
designed currently to isolate builds to protect against mistakes like
this; we'd need to remove that protection for cases where we wanted to
do in-place builds.
3. The logic inside pip for doing builds is already pretty tricky.
Adding code to sometimes build in place and sometimes in a temporary
directory is going to make it even more complex. That might not be a
concern for end users, but it makes maintaining pip harder, and risks
there being subtle bugs in the logic that could bite end users. If you
want specifics, I can't give them at the moment, because I don't know
what the code to do the proposed in-place building would look like.
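
As an illustration of point 1, here is a toy cache keyed on name and
version. It is only a sketch of the general idea, not pip's wheel cache.
`WheelCache`, `get_or_build` and `fake_build` are hypothetical names
invented for the example.

```python
class WheelCache:
    """Toy cache keyed on (name, version). Not pip's real cache."""

    def __init__(self):
        self._built = {}

    def get_or_build(self, name, version, source, build):
        key = (name, version)
        # The source is never compared: same name and version is
        # assumed to mean same code.
        if key not in self._built:
            self._built[key] = build(source)
        return self._built[key]


def fake_build(source):
    """Hypothetical build step: just records what it was built from."""
    return "wheel built from: " + source


cache = WheelCache()
first = cache.get_or_build("demo", "1.2dev4", "old code", fake_build)
# The working directory was edited, but the version was not bumped.
second = cache.get_or_build("demo", "1.2dev4", "new code", fake_build)
```

Here `second` is still the artifact built from the old code, which is
why a cache like this would have to be bypassed whenever name and
version no longer guarantee the same code.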

I hope that helps. It's probably not as specific or explicit as you'd
like, but to be fair, nor is the proposal.

What we currently have on the table is "If 'pip (install/wheel) .' is
supposed to become the standard way to build things, then it should
probably build in-place by default." For my personal use cases, I
don't actually agree with any of that, but my use cases are not even
remotely like those of numpy developers, so I don't want to dismiss
the requirement. But if it's to go anywhere, it needs to be better
explained.

Just to be clear, *my* position (for projects simpler than numpy and
friends) is:

1. The standard way to install should be "pip install <requirement or wheel>".
2. The standard way to build should be "pip wheel <sdist or
directory>". The directory should be a clean checkout of something you
plan to release, with a unique version number.
3. The standard way to develop should be "pip install -e ."
4. Builds (pip wheel) should always unpack to a temporary location and
build there. When building from a directory, in effect build a sdist
and unpack it to the temporary location.
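
Rule 4 can be sketched roughly as below. This is a simplification under
stated assumptions: instead of producing a real sdist archive, it copies
only the files listed in a hypothetical `manifest` (relative paths) into
a temporary directory, which has a similar effect for the purpose of the
example. `build_from_manifest` and `build` are invented names, not part
of pip.

```python
import shutil
import tempfile
from pathlib import Path


def build_from_manifest(source_dir, manifest, build):
    """Copy only the manifest files to a temporary directory, build there.

    `manifest` stands in for the project's sdist definition and `build`
    for its real build step; both are assumptions made for this sketch.
    """
    source_dir = Path(source_dir)
    with tempfile.TemporaryDirectory() as tmp:
        for rel in manifest:
            target = Path(tmp) / rel
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(source_dir / rel, target)
        # Files missing from the manifest are simply not present here,
        # so a build that needs them fails now rather than after release.
        return build(Path(tmp))
```

A build run this way only ever sees what would ship in the sdist, which
is the protection an in-place build gives up.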

I hear the message that for things like numpy these rules won't work.
But I'm completely unclear on why. Sure, builds take ages unless done
incrementally. That's what pip install -e does; I don't understand why
that's not acceptable.

If the discussion needs to go to the next level of detail, maybe that
applies to the requirements as well as to the objections?

Paul

PS Alternatively, feel free to ignore my comments. I'm not likely to
ever have the time to code any of the proposals being discussed here,
but I won't block other pip developers either doing so or merging
code, so my comments are not intended as anything more than input from
someone who knows a bit about how pip is coded, how it's currently
used, and what issues our users currently encounter. Seriously - I'm
happy to say my piece and leave it at that if you prefer.