[Numpy-discussion] New package to speed up ufunc inner loops

Sebastian Berg sebastian at sipsolutions.net
Wed Nov 4 17:15:37 EST 2020


On Tue, 2020-11-03 at 17:54 +0200, Matti Picus wrote:
> Hi. On behalf of Quansight and RTOSHoldings, I would like to
> introduce "pnumpy", a package to speed up NumPy.
> 
> https://quansight.github.io/numpy-threading-extensions/stable/index.html
> 

Nice to see these efforts, especially with the intention of possibly
upstreaming.  I hope we can improve the NumPy infrastructure to make
such experiments much easier and more powerful in the future! (And as I
mentioned, I had such things in mind with NEP 43, albeit as a possible
later extension, not an explicit goal.)

I am a bit curious about the actual performance improvements, even
without allowing more flexibility on the NumPy side.  My gut feeling
is that the variation would be fairly large: sometimes big improvements
due to parallelization, but often only added overhead because NumPy
does not give you deep enough control?
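
To make the trade-off above concrete, here is a minimal Python sketch of
the chunk-and-thread-pool idea. It is not how pnumpy works (pnumpy hooks
the C inner loops); the names `chunked_ufunc`, `n_chunks` and `min_size`
are hypothetical, and the threshold value is an arbitrary assumption. It
relies only on NumPy releasing the GIL inside ufunc loops for non-object
dtypes.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def chunked_ufunc(ufunc, x, n_chunks=4, min_size=100_000):
    """Apply a unary ufunc to a 1-D array, in parallel chunks if large.

    Hypothetical helper for illustration only; not part of NumPy or pnumpy.
    """
    x = np.ascontiguousarray(x)
    if x.ndim != 1:
        raise ValueError("this sketch only handles 1-D arrays")

    # Small arrays: the thread overhead would dominate, so call directly.
    if x.size < min_size:
        return ufunc(x)

    # Probe the result dtype on an empty slice, then preallocate the output.
    out = np.empty(x.shape, dtype=ufunc(x[:0]).dtype)

    # Split [0, size) into n_chunks contiguous, near-equal ranges.
    bounds = [i * x.size // n_chunks for i in range(n_chunks + 1)]

    def work(lo, hi):
        # Each task writes into its own disjoint slice of `out`.
        ufunc(x[lo:hi], out=out[lo:hi])

    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        # list() forces completion and re-raises any worker exception.
        list(pool.map(work, bounds[:-1], bounds[1:]))
    return out
```

Timing `chunked_ufunc(np.sqrt, x)` against plain `np.sqrt(x)` with
`timeit` for a range of array sizes is one way to see where the thread
overhead stops paying off on a given machine.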


As to the name, I don't have an issue with using `pnumpy`, although I
was never hugely concerned about it.

Initially I thought a longer name might be nicer, but the old(?)
accelerated-numpy or fast_numpy_loops don't seem that much clearer to
me.  In the end, I think it's just important to be clear that
this type of project patches/modifies NumPy and is not associated with
it directly.

It seems `pnumpy` is currently taken on PyPI, though, with a small
number of downloads: https://pypistats.org/packages/pnumpy
(although I wonder how many are actual users).

Cheers,

Sebastian


> 
> What is in it?
> 
> - use "PyUFunc_ReplaceLoopBySignature" to hook all the UFunc inner
> loops
> 
> - When the inner loop is called with a large enough array, chunk the 
> data and perform the iteration via a thread pool
> 
> - Add a different memory allocator for "ndarray" data (will require
> an appropriate API from NumPy)
> 
> - Allow using optimized loops above and beyond what NumPy provides
> 
> - Allow logging inner loop calls and parameters to learn about the 
> current process and perhaps tune the performance accordingly
> 
> 
> The first release contains the hooking mechanism and the thread pool,
> the rest has been prototyped but is not ready for release. The idea
> behind the package is that a third-party package can try things out
> and iterate much faster than NumPy. If some of the ideas bear fruit,
> and do not add an undue maintenance burden to NumPy, the code can be
> ported to NumPy. I am not sure NumPy wishes to take upon itself the
> burden of managing threads, but a third-party package may be able to.
> 
> 
> I am writing to the mailing list both to announce the pre-release
> under the wrong name, and, in accordance with the fair play rules[1],
> to request use of the "numpy" name in the package. We had considered
> many options, in the end would like to propose "pnumpy" (the p is
> either "parallel" or "performant" or "preliminary", whatever you
> desire).
> 
> 
> Matti
> 
> 
> [1] https://numpy.org/neps/nep-0036-fair-play.html#fair-play-rules
> 
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
> 
