On Tue, 4 Sep 2018 at 11:28, Nathaniel Smith <njs@pobox.com> wrote:
On Tue, Sep 4, 2018 at 3:10 AM, Paul Moore <p.f.moore@gmail.com> wrote:
There's very much an 80-20 question here: we need to avoid letting the needs of the 20% of projects with unusual requirements complicate usage for the 80%. On the other hand, of course, leaving the specialist cases with no viable solution also isn't reasonable, so even if tags aren't practical here, finding a solution that allows projects to ship specialised binaries some other way would be good. Just as a completely un-thought-through suggestion, maybe we could have a mechanism where a small "generic" wheel can include pointers to specialised extra code that gets downloaded at install time?
Package X -> x-1.0-cp37-cp37m-win_amd64.whl (includes generic code)
    Metadata - Implementation links:
        If we have a GPU -> <link to an archive of code to be added to the install>
        If we don't have a GPU -> <link to an alternative non-GPU archive>
There's obviously a lot of unanswered questions here, but maybe something like this would be better than forcing everything into the wheel tags?
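To flesh the idea out just slightly, here's a rough sketch of what an installer might do with such metadata. Everything here is made up for illustration - the field names, the detect_gpu() helper and the URLs don't correspond to anything that exists today:

    # Purely hypothetical - no such metadata fields or installer hooks exist.
    # The generic wheel carries two "implementation links"; at install time
    # the installer picks one and unpacks it on top of the generic code.
    import shutil

    implementation_links = {
        "gpu": "https://example.com/x-1.0-gpu-extra.tar.gz",       # made-up URL
        "no-gpu": "https://example.com/x-1.0-nogpu-extra.tar.gz",  # made-up URL
    }

    def detect_gpu():
        # Placeholder for whatever detection logic we could agree on.
        return shutil.which("nvidia-smi") is not None

    def choose_extra_archive():
        return implementation_links["gpu" if detect_gpu() else "no-gpu"]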
I think you've reinvented Requires-Dist and PEP 508 markers :-). (The ones that look like '; python_version < "3.6"'.)
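For anyone who hasn't run into these, the existing markers look roughly like this in a package's metadata (the specific packages here are just illustrative examples):

    Requires-Dist: colorama; sys_platform == "win32"
    Requires-Dist: typing; python_version < "3.5"

so the question below is really whether something like has_gpu could join python_version, sys_platform and friends.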
Oh, I see. Yes, I have, haven't I? Aren't I clever (but forgetful)? :-)
Which IIUC was also Dustin's original suggestion: make it possible to write requirements like
tensorflow; not has_gpu
tensorflow-gpu; has_gpu
Yes, I'd seen that, but thought it was in terms of using those markers to say "this wheel is only valid on systems with/without a GPU" (which doesn't work, because pip checks that too late). But you're right, using it in requires-dist does the right thing.
But... do we actually know enough to define a "has_gpu" marker? It isn't literally "this system has a gpu", right, it's something more like "this system has an NVIDIA-brand GPU of a certain generation or later with their proprietary libraries installed"? Or something like that? There are actually lots of packages on PyPI with foo/foo-gpu pairs, e.g. strawberryfields, paddlepaddle, magenta, cntk, deepspeech, ... Do these -gpu packages all have the same environmental requirements, or is it different from package to package?
Yep, that's the killer question here. IMO, someone needs to come up with a concrete proposal, along the lines of "here's some Python code that returns a True/False value, and we want to name that value using the marker 'has_gpu' (or whatever)". There are then two debates:

1. Does that Python code return a value that's useful for a sufficiently large consensus of the projects who care about shipping GPU-enabled code?
2. Is has_gpu a sufficiently useful marker to warrant including in the packaging standards?

Question 1 is for the package maintainers to debate; question 2 is for distutils-sig (IMO). If the package maintainers aren't sufficiently motivated to co-operate and come up with a concrete proposal, then there's not much that the non-specialists on distutils-sig can do.
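To make question 1 concrete, the sort of thing such a proposal might contain is a small check like the one below. It's only a sketch - whether "the nvidia-smi binary is on PATH" (or any other single test) is actually the right definition is exactly what the maintainers would need to agree on:

    import shutil

    def has_gpu():
        # One possible (and very debatable) definition: an NVIDIA driver is
        # installed, approximated by finding the nvidia-smi executable on PATH.
        # Other projects might instead want to check for CUDA libraries, a
        # minimum compute capability, or a different vendor entirely - which
        # is the whole point of the debate.
        return shutil.which("nvidia-smi") is not None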
It would help if we had folks in the conversation who actually work on these packages :-/. Anyone have contacts on the Tensorflow team? (It'd also be good to talk to them about platform specifiers... the tensorflow "manylinux1" wheels are really ubuntu-only, but they intentionally lie about that b/c there is no ubuntu tag; maybe they're interested in fixing that...?)
Anyway, I don't see how we could add an environment marker without having a precise definition, and one that's useful for multiple packages. Which may or may not be possible here...
See? I'm reinventing things other people have suggested again ;-)
Another wacky idea, maybe worth thinking about: should we let packages specify their own auto-detection code that pip should run? E.g. you could have a PEP 508 requirement like "somepkg; extension[otherpackage.key] = ..." and that means "install otherpackage inside the target Python environment, look up otherpackage.key, and use its value to decide whether to install somepkg". Maybe that's too messy to be worth it, but if "gpu detection" isn't a well-defined problem then maybe it's the best approach? Though basically that's what sdists do right now, and IIUC how tensorflow-gpu-detect works. Maybe tensorflow-gpu-detect should become the standard tensorflow library, with an sdist only, and at install time it could decide whether to pull in 'tensorflow-gpu' or 'tensorflow-nogpu'...
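As a rough illustration of that last idea (not how any existing package actually does it, as far as I know), a detection-only sdist's setup.py could pick its real dependency at install time - the project name and the has_gpu() test are both hypothetical:

    # setup.py for a hypothetical dispatch sdist - a sketch only.
    # Because an sdist runs code on the target machine when it's built/installed,
    # it can inspect that machine and declare the "real" package as a dependency.
    import shutil
    from setuptools import setup

    def has_gpu():
        # Same caveat as before: this is one debatable definition of "has a GPU".
        return shutil.which("nvidia-smi") is not None

    setup(
        name="tensorflow-dispatch",  # hypothetical name
        version="1.0",
        install_requires=["tensorflow-gpu" if has_gpu() else "tensorflow-nogpu"],
    )

(Of course this only works while the project ships as an sdist; once a wheel is built, its metadata is frozen.)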
This feels to me like going back to runtime executable metadata. Maybe it's not, but I'd like to be careful here.

Paul