[Numpy-discussion] New package to speed up ufunc inner loops

Matti Picus matti.picus at gmail.com
Tue Nov 3 10:54:24 EST 2020


Hi. On behalf of Quansight and RTOSHoldings, I would like to introduce 
"pnumpy", a package to speed up NumPy.

https://quansight.github.io/numpy-threading-extensions/stable/index.html


What is in it?

- Use "PyUFunc_ReplaceLoopBySignature" to hook all the ufunc inner loops

- When the inner loop is called with a large enough array, chunk the 
data and perform the iteration via a thread pool

- Add a different memory allocator for "ndarray" data (will require an 
appropriate API from NumPy)

- Allow using optimized loops above and beyond what NumPy provides

- Allow logging inner loop calls and parameters to learn about the 
current process and perhaps tune the performance accordingly
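
To make the chunking idea concrete, here is a rough Python-level sketch 
of "split a large array and run the pieces on a thread pool". It is not 
pnumpy's code: the package does this in C by hooking the inner loops, 
and the names chunked_binary_ufunc, MIN_PARALLEL_SIZE and N_WORKERS are 
made up for illustration. It relies on NumPy releasing the GIL inside 
ufunc loops for non-object dtypes, so the chunks can run concurrently.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np

# Hypothetical tuning knobs for this sketch; not taken from pnumpy.
MIN_PARALLEL_SIZE = 100_000  # below this, threading overhead likely dominates
N_WORKERS = 4

_pool = ThreadPoolExecutor(max_workers=N_WORKERS)


def chunked_binary_ufunc(ufunc, a, b):
    """Apply a binary ufunc to two equal-length 1-D arrays in chunks."""
    a = np.asarray(a)
    b = np.asarray(b)
    if a.ndim != 1 or a.shape != b.shape:
        raise ValueError("this sketch only handles equal-length 1-D inputs")

    # Small inputs: call the ufunc directly, as NumPy normally would.
    if a.size < MIN_PARALLEL_SIZE:
        return ufunc(a, b)

    # Ask the ufunc for its result dtype using empty slices (cheap).
    out = np.empty(a.shape, dtype=ufunc(a[:0], b[:0]).dtype)

    # Split the index range into N_WORKERS contiguous chunks.
    bounds = [int(i) for i in np.linspace(0, a.size, N_WORKERS + 1)]

    def work(lo, hi):
        # Each task writes into its own disjoint slice of `out`.
        ufunc(a[lo:hi], b[lo:hi], out=out[lo:hi])

    futures = [
        _pool.submit(work, lo, hi)
        for lo, hi in zip(bounds[:-1], bounds[1:])
    ]
    for f in futures:
        f.result()  # re-raise any exception from a worker
    return out
```

Calling chunked_binary_ufunc(np.add, x, y) on two large float64 arrays 
should give the same result as x + y. Whether it is actually faster 
depends on the array size, the operation and the machine, which is the 
kind of tuning the logging item above is meant to inform.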


The first release contains the hooking mechanism and the thread pool; 
the rest has been prototyped but is not ready for release. The idea 
behind the package is that a third-party package can try things out and 
iterate much faster than NumPy. If some of the ideas bear fruit, and do 
not add an undue maintenance burden to NumPy, the code can be ported to 
NumPy. I am not sure NumPy wishes to take upon itself the burden of 
managing threads, but a third-party package may be able to.


I am writing to the mailing list both to announce the pre-release under 
the wrong name, and, in accordance with the fair play rules[1], to 
request use of the "numpy" name in the package. We considered many 
options and in the end would like to propose "pnumpy" (the p stands for 
"parallel", "performant" or "preliminary", whichever you prefer).


Matti


[1] https://numpy.org/neps/nep-0036-fair-play.html#fair-play-rules


