[Distutils] Metadata 2.0: Warning if optional features are missing

Robert Collins robertc at robertcollins.net
Tue Dec 15 13:59:00 EST 2015

On 16 December 2015 at 07:30, Paul Moore <p.f.moore at gmail.com> wrote:
> On 15 December 2015 at 16:37, Michael Merickel <mmericke at gmail.com> wrote:
>> It seems to me this would be easily accomplished by declaring some extras
>> like "cext" as default-included and if the install fails someone can depend on
>> "sqlalchemy[-cext]". The UI isn't quite as nice as your proposal but reuses
>> existing machinery.
> Hmm, so sqlalchemy says it provides an extra "speedups" (or "cext") by
> default, checks for a compiler, and removes that extra from what's
> installed if there's no compiler available?
> Not sure why anyone would depend on sqlalchemy[-cext] - they should
> always depend on sqlalchemy, as there's no functional difference
> between with speedups and without. And the user should never specify
> the extra, all they do is install a compiler and rebuild.
>>> What I'd like to be able to do:
>>> 1. pip install sqlalchemy works, but shows a warning "optional feature
>>> speedups not installed - no C compiler"
>> Extras wouldn't give a nice message like this. The install would fail and
>> the user would have to guess as to why and then opt out of the default
>> extra. Perhaps some better error message could be displayed if the package
>> failed to install and had a default extra included to show how to opt out.
> But the install doesn't fail - it succeeds and works fine, just
> without some speedups. That's exactly what I want. In my particular
> case I was installing csvkit which depends on sqlalchemy. It doesn't
> (nor should it) say that it doesn't need the speedups, nor should I
> have to manually locate the specific dependency (from a list of many)
> and install it by hand before my install works. The current behaviour
> (pip install csvkit -> a working csvkit with no issues) is perfect.
> But if I later want to use SQLalchemy independently, or I find that a
> particular usage of csvkit is too slow, I want to know that there's a
> speedups module I can get by downloading a binary build or installing
> my own compiler. And I want to be able to install it transparently.
>>> 3. A command to reinstall the currently installed version with new options
>>>    pip install --add-options sqlalchemy[speedups]
>>>    (Note that a plain pip install doesn't do this, as it won't
>>> reinstall. And --upgrade or --ignore-installed will install newer
>>> versions).
>> This should be done by simply reinstalling the package via "pip install
>> sqlalchemy[speedups]". I doubt you need an extra --add-options flag to
>> compete with extras.
> I guess you're saying add [speedups] as a way of requesting a rebuild?
> But if the build fails, would that remove sqlalchemy, or leave the
> existing build there? (I'd hope the latter).
> Won't that say "sqlalchemy is already installed"? (I've never used
> extras with pip, so I don't know). Also what if there's been a newer
> version of sqlalchemy released? Won't it get that one?
> I specifically want to say here "just reinstall the exact version I
> have here, but try again to include optional stuff that I didn't get
> last time". (In practice I don't really care much if I upgrade, so
> --upgrade or --ignore-installed is probably fine in reality).
> Anyway, all of this requires people to implement it, in pip and build
> tools, as well as projects to adopt it. So it's not really important
> that the details get thrashed out right now, just that we establish
> whether it's a practical scenario to support, and get a feel for how
> much work it would be for projects to adopt it (if it's too much, they
> won't, and the feature will end up unused). So the fact that extras
> might be able to support this is the main point here, not the details
> of how it would work - so thanks.

I'm not sure that extras would support it cleanly.

I agree that it is a common use case; in general I'd say

1) consumers (users and depending projects) shouldn't need to know
about accelerators
2) some environments will need to be able to exclude them [known to build badly]
3) some consumers will need to be able to mandate them [either using
an accelerator-only feature, or their thing is known to be infeasible
without them]
4) there needs to be a means to get accelerators if they didn't
install the first time around
5) projects with accelerators shouldn't be forced to split the
accelerators into separate projects
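
To make points 1-4 concrete, here is a rough Python sketch of the
decision an installer would have to make. Everything in it is an
assumption for illustration: the Policy names, install_accelerator and
the build callable are hypothetical, not anything pip or setuptools
provides today.

```python
import warnings
from enum import Enum


class Policy(Enum):
    # Hypothetical policy names, for illustration only.
    DEFAULT = "default"  # try the accelerator, fall back with a warning (1, 4)
    EXCLUDE = "exclude"  # environment opts out, e.g. known to build badly (2)
    MANDATE = "mandate"  # consumer cannot work without it (3)


def install_accelerator(policy, build):
    """Return True if the accelerator ended up installed.

    `build` is a hypothetical callable returning True when the build works.
    """
    if policy is Policy.EXCLUDE:
        # Never attempt the build in an environment that excluded it.
        return False
    if build():
        return True
    if policy is Policy.MANDATE:
        # A mandated accelerator that cannot be built is a hard failure.
        raise RuntimeError("accelerator required but could not be built")
    # Default case: the install still succeeds, but the user is told.
    warnings.warn("optional accelerator not installed; using fallback")
    return False
```

Point 4 then amounts to calling this again later with the same policy
once a compiler (or a binary build) is available.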

Some issues with reusing extras are:
 - extras refer to things in the dependency graph, but since
distributions are both the installable things and the graph nodes,
foo[fast] is - as widely deployed today - entirely and only a list of
additional distributions.
 - there's no concept of 'default extra', and there is no clear path
for bringing it in compatibly, at least so far
 - we haven't worked through the UI implications of which end of the
relation this should be configured at: should consumers be specifying
them, or providers?
 - negative operators on extras are as yet undefined, and due to the
dependencies of an install being a graph, not a tree, a naive
definition is likely very hard to use IMO
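
As an illustration of the last point, here is a small Python sketch of
one naive definition, where each request is applied in the order the
graph walk happens to reach it. The "-name" form and apply_requests are
hypothetical; no such syntax is defined for extras today.

```python
def apply_requests(requests):
    """Apply extras requests in the order given; '-name' removes an extra."""
    enabled = set()
    for request in requests:
        if request.startswith("-"):
            enabled.discard(request[1:])
        else:
            enabled.add(request)
    return enabled


# Two dependents reach the same distribution by different paths.
from_a = "cext"   # hypothetical: project A asks for foo[cext]
from_b = "-cext"  # hypothetical: project B asks for foo[-cext]

print(apply_requests([from_a, from_b]))  # set()
print(apply_requests([from_b, from_a]))  # {'cext'}
```

The same two requirements give different results depending on
traversal order, which a tree would pin down but a graph does not.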

Recommends and suggests are an interesting way of modelling this, and
it's possible we don't need an exclude relation; rather, users should
blacklist them globally in the target environment somehow, which would
contain that particular complexity.
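
A minimal Python sketch of that idea, assuming a hypothetical
"recommends" relation next to the hard requirements and a blacklist
configured once for the whole environment. The resolve function and
the "speedups" node name are made up for illustration; whether that
node is a separate distribution or a build-time component is left open
here.

```python
def resolve(root, requires, recommends, blacklist=frozenset()):
    """Walk the graph from `root`; skip recommended nodes on the blacklist."""
    selected = set()
    stack = [root]
    while stack:
        name = stack.pop()
        if name in selected:
            continue  # the graph is not a tree: a node may be reached twice
        selected.add(name)
        # Hard requirements are always followed.
        stack.extend(requires.get(name, ()))
        # Recommendations are followed unless the environment excludes them.
        stack.extend(r for r in recommends.get(name, ()) if r not in blacklist)
    return selected


requires = {"csvkit": ["sqlalchemy"]}
recommends = {"sqlalchemy": ["speedups"]}  # hypothetical accelerator node

print(sorted(resolve("csvkit", requires, recommends)))
# ['csvkit', 'speedups', 'sqlalchemy']
print(sorted(resolve("csvkit", requires, recommends, blacklist={"speedups"})))
# ['csvkit', 'sqlalchemy']
```

Neither csvkit nor sqlalchemy needs a negative operator in its own
metadata; the exclusion lives entirely in the target environment.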


Robert Collins <rbtcollins at hpe.com>
Distinguished Technologist
HP Converged Cloud
