Re: [Numpy-discussion] New NEP: merging multiarray and umath

March 11, 2018

...
On 09.03.2018 at 02:06, Nathaniel Smith <njs@pobox.com> wrote:
On Thu, Mar 8, 2018 at 1:52 AM, Gregor Thalhammer
<gregor.thalhammer@gmail.com> wrote:
...
Hi,
a long time ago I wrote a wrapper to use optimised and parallelized math
functions from Intel's vector math library:
geggo/uvml: Provide vectorized math function (MKL) for numpy
I found it useful to inject (some of) the fast methods into numpy via
np.set_numeric_ops(), to gain more performance without changing my programs.
While this original project is outdated, I can imagine that a centralised
way to swap the implementation of math functions is useful. Therefore I
suggest keeping np.set_numeric_ops(), but admittedly I do not understand all
the technical implications of the proposed change.
The main part of the proposal is to merge the two libraries; the
question of whether to deprecate set_numeric_ops is a bit separate.
There's no technical obstacle to keeping it, except the usual issue of
having more cruft to maintain :-).
...
It's usually true that any monkeypatching interface will be useful to
someone under some circumstances, but we usually don't consider this a
good enough reason on its own to add and maintain these kinds of
interfaces. And an unfortunate side-effect of these kinds of hacky
interfaces is that they can end up removing the pressure to solve
problems properly. In this case, better solutions would include:
- Adding support for accelerated vector math libraries to NumPy
directly (e.g. MKL, yeppp)
- Overriding the inner loops inside ufuncs like numpy.add that
np.ndarray.__add__ ultimately calls. This would speed up all addition
(whether or not it uses Python + syntax), would be a more general
solution (e.g. you could monkeypatch np.exp to use MKL's fast
vectorized exp), would let you skip reimplementing all the tricky
shared bits of the ufunc logic, etc. Conceptually it's not even very
hacky, because we allow you to add new loops to existing ufuncs; making
it possible to replace existing loops wouldn't be a big stretch. (In
fact it's possible that we already allow this; I haven't checked.)
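A small sketch of the claim that the + operator on arrays ends up in the numpy.add ufunc. The Recorder subclass is hypothetical (it is not part of NumPy); it only logs the ufunc name through the __array_ufunc__ hook and then delegates to the normal implementation.

```python
import numpy as np

class Recorder(np.ndarray):
    """Hypothetical ndarray subclass that logs which ufunc an operation dispatches to."""
    calls = []

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        Recorder.calls.append(ufunc.__name__)
        # Strip the subclass so the call below runs the ordinary ndarray code path.
        plain = tuple(np.asarray(x) if isinstance(x, Recorder) else x for x in inputs)
        return getattr(ufunc, method)(*plain, **kwargs)

a = np.arange(3).view(Recorder)
total = a + 1           # Python + syntax, no explicit ufunc call
print(Recorder.calls)   # names of the ufuncs that were actually invoked
```

Because the operator is routed through the ufunc, anything that changes the ufunc's loops would affect both a + 1 and np.add(a, 1).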
So I still lean towards deprecating set_numeric_ops. It's not the most
crucial part of the proposal though; if it turns out to be too
controversial then I'll take it out.
Dear Nathaniel,

since you referred to your reply in your latest post in this thread, I am commenting here.

First, I agree that set_numeric_ops() is not very important for replacing numpy math functions with faster implementations, mostly because it covers only the basic operations (+, *, boolean operations), which are fast anyhow. Only pow can be accelerated by a substantial factor.

I also agree that adding support for optimised math function libraries directly to numpy might be a better solution than patching numpy. But in the past there have been a couple of proposals to add fast vectorised math functions directly to numpy, e.g. for a GSoC project. There have always been long discussions about maintainability, testing, vendor lock-in, and free versus non-free software, and all attempts failed. Only the Intel accelerated Python distribution claims to have boosted performance for transcendental functions, but I do not know how they achieved this or whether it could be integrated into the official numpy.

Therefore I think there is some need for an "official" way to swap numpy math functions at the user (Python) level at runtime. As Julian commented, you want this flexibility because of speed and accuracy trade-offs.

Just replacing the inner loop might be an alternative way, but I am not sure. Many optimised vector math libraries require contiguous arrays, so they don't fulfil the expectations numpy has for an inner loop, which may be handed strided data. So you would need to allocate memory, copy, and free memory for each call to the inner loop. I imagine this adds quite some overhead that you could avoid with a completely custom ufunc.
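The copy overhead can be sketched at the Python level. This is only an illustration under stated assumptions: contiguous_only_exp is a hypothetical stand-in for a vendor routine that refuses strided buffers (here it simply calls np.exp), and strided_exp is a hypothetical wrapper showing the extra allocation and copy that a strided input forces.

```python
import numpy as np

def contiguous_only_exp(src, dst):
    """Hypothetical stand-in for a vendor kernel that accepts only contiguous buffers."""
    if not (src.flags.c_contiguous and dst.flags.c_contiguous):
        raise ValueError("contiguous buffers required")
    np.exp(src, out=dst)

def strided_exp(x):
    """Apply the contiguous-only kernel to a possibly strided array."""
    if x.flags.c_contiguous:
        out = np.empty_like(x)
        contiguous_only_exp(x, out)      # fast path: no temporary input copy
        return out
    tmp_in = np.ascontiguousarray(x)     # extra allocation + copy of the input
    tmp_out = np.empty_like(tmp_in)
    contiguous_only_exp(tmp_in, tmp_out)
    return tmp_out

x = np.linspace(0.0, 1.0, 10)[::2]       # every second element: a strided view
y = strided_exp(x)
```

A custom ufunc could make this decision once per call on the whole array, whereas a replaced inner loop would have to make it every time the loop is invoked.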
On the other hand, setting up a ufunc from inner loop functions is easy: you can reuse all the numpy machinery. I disagree with you that you have to reimplement the whole ufunc machinery if you swap math functions at the ufunc level.
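As a rough Python-level analogue of this point (a C extension would presumably go through PyUFunc_FromFuncAndData instead), np.frompyfunc wraps a scalar function into a ufunc, so broadcasting and iteration come from numpy. The function scalar_logistic is just an illustrative example, and the object-dtype result of frompyfunc is converted back to float explicitly.

```python
import math
import numpy as np

def scalar_logistic(v):
    """Illustrative scalar kernel; it knows nothing about arrays or strides."""
    return 1.0 / (1.0 + math.exp(-v))

# Wrap the scalar function as a ufunc with 1 input and 1 output.
logistic = np.frompyfunc(scalar_logistic, 1, 1)

data = np.array([[0.0, 1.0], [2.0, 3.0]])
result = logistic(data).astype(float)    # frompyfunc returns object arrays
```

Only the per-element function was written by hand; shape handling and looping are reused from the ufunc machinery.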

Stupid question: how do I get the first argument of
 int PyUFunc_ReplaceLoopBySignature(PyUFuncObject* ufunc,
e.g. for np.add?
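A short sketch of what that first argument refers to: np.add is itself an instance of numpy.ufunc, which is the Python face of the C struct PyUFuncObject. From C one would presumably fetch the "add" attribute of the imported numpy module and cast the resulting PyObject* to PyUFuncObject*; that step is an assumption and is not shown here. The loops currently registered on the ufunc can be inspected from Python:

```python
import numpy as np

add_ufunc = np.add

# The Python-level object is a numpy.ufunc instance.
is_ufunc = isinstance(add_ufunc, np.ufunc)

# Each entry in .types names one registered inner loop by type signature.
has_double_loop = "dd->d" in add_ufunc.types

print(is_ufunc, add_ufunc.nin, add_ufunc.nout, has_double_loop)
```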

So, please consider this when refactoring/redesigning the ufunc module.

Gregor
...
-n
-- 
Nathaniel J. Smith -- https://vorpus.org
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion