[Distutils] Bad setup.py dependencies on numpy, and 'pip upgrade' (was: Towards a simple and standard sdist format that isn't intertwined with distutils)

Nathaniel Smith njs at pobox.com
Sun Oct 11 02:22:12 CEST 2015


On Sun, Oct 4, 2015 at 1:02 PM, Paul Moore <p.f.moore at gmail.com> wrote:
[...]
> A
> common bug report for pip is users finding that their installs fail,
> because setup.py requires numpy to be installed in order to run, and
> yet pip is running setup.py egg-info precisely to find out what the
> requirements are. We tell the user that the setup.py is written
> incorrectly, and they should install numpy and retry the install, but
> it's not a good user experience. And from a selfish point of view,
> users blame *pip* for the consequences of a project whose code to
> generate the metadata is buggy. Those bug reports are a drain on the
> time of the pip developers, as well as a frustrating experience for
> the users.
[...]

This is mostly orthogonal to the other discussion, but it's a problem
relevant to distutils-sig where I think there's a lot of confusion, so
I just wanted to try and clarify what's happening here.

It's true that almost no packages accurately declare their
dependencies on numpy. This is for two reasons:

First, for packages that build-depend on numpy (e.g. they contain
extensions that need numpy's header files), distutils' design makes
things tricky, b/c you have to call np.get_include() to find the
headers and pass the result to setup(), which means importing numpy
before setup() has even run. But this isn't the main issue -- solving
it requires nasty hacks, but at least the more prominent projects are
gradually gaining those hacks.
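
To make "nasty hacks" a bit more concrete, here is a minimal sketch of
one common shape, assuming a setuptools-style build_ext command whose
finalize_options() leaves self.include_dirs as a list. The mixin name
is made up for illustration and real projects differ in the details.
The only point is that numpy gets imported when extensions are about
to be built, not when setup.py is merely loaded:

```python
class NumpyIncludeMixin:
    """Hypothetical mixin for a build_ext-style command class.

    numpy is imported inside finalize_options(), so merely loading
    setup.py (e.g. for 'setup.py egg_info') does not require numpy.
    """

    def finalize_options(self):
        # Let the real command populate self.include_dirs first.
        super().finalize_options()
        # Deferred import: runs only when extensions are about to be built.
        import numpy
        self.include_dirs.append(numpy.get_include())


# Assumed usage in setup.py (commented out so this sketch itself has no
# third-party imports):
#
#   from setuptools import setup
#   from setuptools.command.build_ext import build_ext
#
#   class build_ext_with_numpy(NumpyIncludeMixin, build_ext):
#       pass
#
#   setup(..., setup_requires=["numpy"],
#         cmdclass={"build_ext": build_ext_with_numpy})
```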

The main issue is that people really, really, REALLY hate the
recursive behavior of 'pip install -U'. They hate it *so much* --
especially the way it has a habit of suddenly trying to rebuild numpy
when all you were trying to do was upgrade some little pure-python
package -- that they actively refuse to accurately report their
project's dependencies, because install -U can't recurse over
dependencies that it can't see.
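
A toy model of why hiding the dependency "works" (this is not pip's
actual code, just a sketch; the package name littlepkg and the
function name are invented for illustration). A recursive upgrade
touches everything reachable through *declared* dependencies, so an
undeclared numpy is never visited:

```python
def packages_touched_by_recursive_upgrade(target, declared_deps):
    """Return every package reachable from target via declared dependencies."""
    touched = set()
    stack = [target]
    while stack:
        pkg = stack.pop()
        if pkg in touched:
            continue
        touched.add(pkg)
        # Only dependencies that appear in the metadata can be followed.
        stack.extend(declared_deps.get(pkg, ()))
    return touched


# Accurate metadata: upgrading littlepkg drags numpy along.
honest = {"littlepkg": ["numpy"], "numpy": []}
# Inaccurate metadata: numpy is invisible to the upgrade.
lying = {"littlepkg": [], "numpy": []}

print(sorted(packages_touched_by_recursive_upgrade("littlepkg", honest)))
# ['littlepkg', 'numpy']
print(sorted(packages_touched_by_recursive_upgrade("littlepkg", lying)))
# ['littlepkg']
```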

E.g., here are two setup.py files for extremely mainstream projects,
that both report accurate setup_requires and install_requires on numpy
if and only if numpy is not already installed; if it is installed then
they pretend that they don't need it:

  https://github.com/scipy/scipy/blob/master/setup.py#L200-L208
  https://github.com/statsmodels/statsmodels/blob/master/setup.py#L96-L156

Obviously this is terrible -- it means that the wheels for these
projects end up with incorrect dependencies (usually! it's kinda
random!), and even if you install from source then whether future
upgrades will work correctly depends on arbitrary details about how
your virtualenv was configured when you did the original install.
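
The pattern looks roughly like the sketch below. This is a
simplification for illustration, not the code from either project,
and the function names are mine. What it shows is that the metadata
you end up with is a function of the build environment rather than of
the project:

```python
import importlib.util


def numpy_is_installed():
    """True if numpy can be found in the current environment."""
    return importlib.util.find_spec("numpy") is not None


def conditional_requires(numpy_present):
    """Build setup() keyword arguments the way the conditional pattern does."""
    if numpy_present:
        # numpy is already there, so pretend we don't need it.
        return {"setup_requires": [], "install_requires": []}
    # numpy is missing, so report the dependency accurately.
    return {"setup_requires": ["numpy"], "install_requires": ["numpy"]}


# In a setup.py this would be used as (assumed usage):
#   setup(..., **conditional_requires(numpy_is_installed()))
```

A wheel built where numpy happens to be installed records no numpy
requirement at all, while one built in a clean environment records
it.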

But people care about this so much that you actually have prominent
developers going around and filing bugs on packages, complaining that
providing accurate metadata is a bug and that the packages should lie
instead. E.g.:

  https://github.com/pydata/patsy/issues/5

I'm not sure I agree with this conclusion, but the users have spoken.
AFAICT the only way to fix this problem and start getting packages
with accurate metadata is for pip to gain a non-recursive upgrade
mode.

The bug for that is
    https://github.com/pypa/pip/issues/59

AFAICT from reading that thread, work on this has stalled out because
of the following reasoning:
1) Everyone agrees that pip should have 'upgrade $PKG' and
'upgrade-all' commands, and 'install -U' should be deprecated/removed.
2) But implementing 'upgrade-all' is tricky and dangerous without
first fixing pip's dependency resolver.
3) Therefore we can't add 'upgrade' or 'upgrade-all' until after we
fix pip's dependency resolver.

I feel like there's a certain logical gap between (2) and (3)... we
could defer 'upgrade-all' until later but start supporting 'upgrade
$PKG' right now, couldn't we? (It'd be implemented as the equivalent
of 'pip install $PKG==$LATEST_VERSION', which is not scary looking at
all.)
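
As a sketch of what I mean (the function name is hypothetical, and
I'm assuming the latest version has already been looked up somewhere
else, e.g. from the index), 'upgrade $PKG' could amount to building
this command line:

```python
import sys


def upgrade_command(pkg, latest_version):
    """Return the argv for a non-recursive upgrade of a single package.

    latest_version is assumed to be known already; finding it is out
    of scope for this sketch.
    """
    # Note there is no -U here: the target is pinned to an exact version.
    return [sys.executable, "-m", "pip", "install",
            "{}=={}".format(pkg, latest_version)]


# Assumed usage:
#   import subprocess
#   subprocess.check_call(upgrade_command("somepkg", "1.2.3"))
```

Since -U isn't passed, my understanding is that pip leaves any
dependency alone if it is already installed and satisfies the
requirement, which is exactly the non-recursive behavior people are
asking for.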

-n

-- 
Nathaniel J. Smith -- http://vorpus.org

