On Tue, Sep 4, 2018 at 3:10 AM, Paul Moore <p.f.moore@gmail.com> wrote:
> There's very much an 80-20 question here: we need to avoid letting the needs of the 20% of projects with unusual needs complicate usage for the 80%. On the other hand, of course, leaving the specialist cases with no viable solution also isn't reasonable, so even if tags aren't practical here, finding a solution that allows projects to ship specialised binaries some other way would be good. Just as a completely un-thought-through suggestion, maybe we could have a mechanism where a small "generic" wheel can include pointers to specialised extra code that gets downloaded at install time?
>
> Package X -> x-1.0-cp37-cp37m-win_amd64.whl (includes generic code)
>     Metadata - Implementation links:
>         If we have a GPU -> <link to an archive of code to be added to the install>
>         If we don't have a GPU -> <link to an alternative non-GPU archive>
>
> There are obviously a lot of unanswered questions here, but maybe something like this would be better than forcing everything into the wheel tags?
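Paul's "pointer wheel" idea could be sketched as installer-side dispatch over a metadata table. This is purely illustrative: the metadata field, the condition names, and the URLs below are invented for the sketch and exist in no current packaging standard.

```python
# Hypothetical sketch of the "pointer wheel" idea: the generic wheel's
# metadata maps environment conditions to extra archives that the installer
# would download at install time. All names and URLs here are invented.

IMPLEMENTATION_LINKS = {
    # condition name -> archive of code to be added to the install
    "gpu": "https://example.invalid/x-1.0-gpu-extras.tar.gz",
    "no-gpu": "https://example.invalid/x-1.0-cpu-extras.tar.gz",
}

def select_extra_archive(has_gpu: bool) -> str:
    """Pick which extra archive the installer should fetch."""
    return IMPLEMENTATION_LINKS["gpu" if has_gpu else "no-gpu"]

print(select_extra_archive(has_gpu=False))
```

The hard part, of course, is not the table lookup but defining and evaluating the conditions, which is the same problem the rest of the thread circles around.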
I think you've reinvented Requires-Dist and PEP 508 markers :-). (The ones that look like '; python_version < "3.6"'.) Which IIUC was also Dustin's original suggestion: make it possible to write requirements like

    tensorflow; not has_gpu
    tensorflow-gpu; has_gpu

But... do we actually know enough to define a "has_gpu" marker? It isn't literally "this system has a GPU", right? It's something more like "this system has an NVIDIA-brand GPU of a certain generation or later with their proprietary libraries installed"? Or something like that?

There are actually lots of packages on PyPI with foo/foo-gpu pairs, e.g. strawberryfields, paddlepaddle, magenta, cntk, deepspeech, ... Do these -gpu packages all have the same environmental requirements, or does it differ from package to package? It would help if we had folks in the conversation who actually work on these packages :-/. Anyone have contacts on the Tensorflow team? (It'd also be good to talk to them about platform specifiers... the tensorflow "manylinux1" wheels are really ubuntu-only, but they intentionally lie about that b/c there is no ubuntu tag; maybe they're interested in fixing that...?)

Anyway, I don't see how we could add an environment marker without having a precise definition, and one that's useful for multiple packages. Which may or may not be possible here...

One thing that would help would be if tensorflow-gpu could say "Provides-Dist: tensorflow", so that downstream packages can say "Requires-Dist: tensorflow" and pip won't freak out if the user has manually installed tensorflow-gpu instead. E.g. in the proposal at [1], you could have 'tensorflow' as one wheel and 'tensorflow[gpu]' as a second wheel that says "Provides-Dist: tensorflow". Conflicts-Dist would also be useful, though it might require a real resolver first.

Another wacky idea, maybe worth thinking about: should we let packages specify their own auto-detection code that pip should run? E.g.
you could have a PEP 508 requirement like "somepkg; extension[otherpackage.key] = ..." and that would mean "install otherpackage inside the target Python environment, look up otherpackage.key, and use its value to decide whether to install somepkg". Maybe that's too messy to be worth it, but if "GPU detection" isn't a well-defined problem then maybe it's the best approach? Though basically that's what sdists do right now, and IIUC that's how tensorflow-gpu-detect works. Maybe tensorflow-gpu-detect should become the standard tensorflow library, with an sdist only, and at install time it could decide whether to pull in 'tensorflow-gpu' or 'tensorflow-nogpu'...

-n

[1] https://mail.python.org/pipermail/distutils-sig/2015-October/027364.html

-- 
Nathaniel J. Smith -- https://vorpus.org
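[Editor's note: the install-time detection idea above, in the spirit of tensorflow-gpu-detect, might look like the sketch below. It assumes, purely for illustration, that "has a GPU" means "an NVIDIA CUDA runtime library is visible to the system loader" -- one of many possible definitions, which is exactly the ambiguity the thread is worried about. The function names are invented.]

```python
# Sketch of an sdist-only dispatcher package's install-time logic.
# "Has a GPU" is defined here, for illustration only, as "a CUDA runtime
# library can be located" -- one of several plausible definitions.
import ctypes.util

def has_nvidia_gpu_runtime() -> bool:
    """Best-effort check: is a CUDA runtime library visible to the loader?"""
    return any(
        ctypes.util.find_library(name) is not None
        for name in ("cudart", "cuda", "nvcuda")
    )

def choose_tensorflow_dist() -> str:
    """Decide which concrete package the dispatcher sdist would depend on."""
    return "tensorflow-gpu" if has_nvidia_gpu_runtime() else "tensorflow-nogpu"

print(choose_tensorflow_dist())
```

Whether this check is *correct* (driver generation, proprietary library versions, non-NVIDIA GPUs) is precisely the open question about defining a "has_gpu" marker.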