
On 2 Sep 2018, at 18:04, Nick Coghlan <ncoghlan@gmail.com> wrote:
On Sat, 1 Sep 2018 at 11:02, Tzu-ping Chung <uranusjr@gmail.com> wrote:
I’m not knowledgeable about GPUs, but from limited conversations with others, it is important to first decide what exactly the problem area is. Unlike the currently available environment markers, there is no reliable way to programmatically determine even whether a GPU is present, let alone what that GPU can actually do (not every GPU can be used by TensorFlow, for example).
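(To illustrate how vendor-specific even a presence check gets, a rough best-effort probe for one vendor might look like the sketch below; it assumes an NVIDIA driver on Linux, loads the CUDA driver API via ctypes, and still says nothing about whether a given framework can actually use the device:)

    import ctypes

    def has_nvidia_gpu():
        # Best-effort check: load the NVIDIA driver library and count devices.
        try:
            libcuda = ctypes.CDLL("libcuda.so.1")   # Linux-specific library name
        except OSError:
            return False                            # no NVIDIA driver installed at all
        if libcuda.cuInit(0) != 0:                  # CUDA_SUCCESS == 0
            return False
        count = ctypes.c_int(0)
        if libcuda.cuDeviceGetCount(ctypes.byref(count)) != 0:
            return False
        return count.value > 0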
As Tzu-Ping notes, using environment markers currently requires that there be a well-defined "Python equivalent" to explain how installers should calculate the install-time value of the environment marker.
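(For reference, this is roughly what an installer does with a marker today, e.g. via the packaging library; every marker variable has to map onto something the interpreter can compute locally at install time:)

    from packaging.markers import Marker

    # The installer evaluates the marker against the running interpreter's
    # environment; each variable needs a well-defined "Python equivalent".
    marker = Marker('platform_machine == "x86_64" and python_version >= "3.6"')
    print(marker.evaluate())   # True or False for this interpreter/platform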
However, even regular CPU detection has problems when it comes to environment markers: platform_machine reports x86_64 on a 64-bit CPU even when the current interpreter is built as a 32-bit binary, and there are other oddities, like Linux having two different 32-bit ABIs (i686, the original 32-bit ABI that predates x86_64, and x32, which uses the full x86_64 instruction set but with 32-bit pointers: https://github.com/pypa/pip/issues/4962). (See also https://github.com/pypa/pipenv/issues/2397 for some additional discussion.)
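(The mismatch is easy to see from within the interpreter itself: platform.machine() reports what the kernel says about the CPU, while the pointer size reflects how this particular interpreter was built:)

    import platform
    import struct
    import sys

    print(platform.machine())         # e.g. 'x86_64', even under a 32-bit build
    print(struct.calcsize("P") * 8)   # pointer size of *this* interpreter: 32 or 64
    print(sys.maxsize > 2**32)        # True only for a 64-bit interpreter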
This is primarily an indication that there is a missing API: one that reports the architecture the Python interpreter is built for, rather than the architecture of the CPU. Or maybe not: distutils.util.get_platform() could be taught to do the right thing here, as was already done for macOS in the (ancient) past (although it is probably better to introduce a new API, because of backward compatibility concerns). […]
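(For comparison, this is what the existing APIs report today; as noted above, on macOS get_platform() already reflects the build configuration, while on Linux it still follows the kernel:)

    import platform
    import sysconfig
    from distutils.util import get_platform

    print(platform.machine())        # what the kernel/CPU reports, e.g. 'x86_64'
    print(get_platform())            # e.g. 'linux-x86_64' or 'macosx-10.9-x86_64'
    print(sysconfig.get_platform())  # the same information via the sysconfig module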
Note that I don't think it's possible for folks to get away from the "3 projects" requirement if publishers want their users to be able to selectively *install* the GPU-optimised version - when you keep everything within one project, you don't need an environment marker at all; you just decide at import time which version you're actually going to import.
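(A minimal sketch of that import-time selection, with made-up module names - mypackage._gpu_impl and mypackage._cpu_impl are hypothetical:)

    # Hypothetical single-project layout: both variants ship in the same wheel.
    try:
        # The GPU build raises ImportError if no usable GPU/driver is found.
        from mypackage import _gpu_impl as _impl
    except ImportError:
        from mypackage import _cpu_impl as _impl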
What’s the problem with including GPU and non-GPU variants of code in a binary wheel, other than the size of the wheel? I tend to prefer binaries that work “everywhere”, even if that requires some more work when building them (such as including multiple variants of an extension with optimised code for different CPU features, for example SSE and non-SSE variants in the past).

Ronald
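(A rough sketch of that kind of runtime selection between CPU variants, using made-up module names - mypackage._ext_sse and mypackage._ext_plain - and a deliberately simplistic Linux-only feature check:)

    def _cpu_supports_sse():
        # Deliberately simplistic: read the CPU flags on Linux only.
        try:
            with open("/proc/cpuinfo") as f:
                flags = f.read()
        except OSError:
            return False
        return "sse" in flags.split()

    if _cpu_supports_sse():
        from mypackage import _ext_sse as _ext     # hypothetical optimised build
    else:
        from mypackage import _ext_plain as _ext   # hypothetical fallback build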