Hello,
At Continuum Analytics we've been working on a submodule implementing a set of lineal algebra operations as generalized ufuncs. This allows specifying arrays of lineal algebra problems to be computed with a single Python call, allowing broadcasting as well. As the vectorization is handled in the kernel, this gives a speed edge on the operations. We think this could be useful to the community and we want to share the work done.
I've created a couple of pull-requests:
The first one contains a fix for a bug in the handling of certain signatures in the gufuncs. This was found while building the submodule. The fix was done by Mark Wiebe, so credit should go to him :). https://github.com/numpy/numpy/pull/2953
The second pull request contains the submodule itself and builds on top of the previous fix. It contains a rst file that explains the submodule, enumerates the functions implemented and details some implementation bits. The entry point to the module is in written in Python and contains detailed docstrings. https://github.com/numpy/numpy/pull/2954
We are open to discussion and to make improvements to the code if needed, in order to adapt to NumPy standards.
Thanks, Oscar.
On Thu, Jan 31, 2013 at 8:43 AM, Oscar Villellas oscar.villellas@continuum.io wrote:
Hello,
At Continuum Analytics we've been working on a submodule implementing a set of lineal algebra operations as generalized ufuncs. This allows specifying arrays of lineal algebra problems to be computed with a single Python call, allowing broadcasting as well. As the vectorization is handled in the kernel, this gives a speed edge on the operations. We think this could be useful to the community and we want to share the work done.
It certainly does look useful. My question is -- why do we need two complete copies of the linear algebra routine interfaces? Can we just replace the existing linalg functions with these new implementations? Or if not, what prevents it?
-n
On Thu, Jan 31, 2013 at 7:44 PM, Nathaniel Smith njs@pobox.com wrote:
On Thu, Jan 31, 2013 at 8:43 AM, Oscar Villellas oscar.villellas@continuum.io wrote:
Hello,
At Continuum Analytics we've been working on a submodule implementing a set of lineal algebra operations as generalized ufuncs. This allows specifying arrays of lineal algebra problems to be computed with a single Python call, allowing broadcasting as well. As the vectorization is handled in the kernel, this gives a speed edge on the operations. We think this could be useful to the community and we want to share the work done.
It certainly does look useful. My question is -- why do we need two complete copies of the linear algebra routine interfaces? Can we just replace the existing linalg functions with these new implementations? Or if not, what prevents it?
The error reporting would have to be bodged back in.
-- Robert Kern
On Thu, Jan 31, 2013 at 9:35 PM, Robert Kern robert.kern@gmail.com wrote:
On Thu, Jan 31, 2013 at 7:44 PM, Nathaniel Smith njs@pobox.com wrote:
On Thu, Jan 31, 2013 at 8:43 AM, Oscar Villellas oscar.villellas@continuum.io wrote:
Hello,
At Continuum Analytics we've been working on a submodule implementing a set of lineal algebra operations as generalized ufuncs. This allows specifying arrays of lineal algebra problems to be computed with a single Python call, allowing broadcasting as well. As the vectorization is handled in the kernel, this gives a speed edge on the operations. We think this could be useful to the community and we want to share the work done.
It certainly does look useful. My question is -- why do we need two complete copies of the linear algebra routine interfaces? Can we just replace the existing linalg functions with these new implementations? Or if not, what prevents it?
The error reporting would have to be bodged back in.
Error reporting is one part of the equation. Right now the functions in the linalg module perform some baking of the data. For example, in linalg eigenvalue functions for real arrays check if all the eigenvalues/eigenvectors in the result are real. If that's the case it returns arrays with a real dtype. There is no way to handle that in a gufunc without a performance hit. The gufunc version always returns a matrix with complex dtype (it will involve copying otherwise).
So this provides the flexibility of being able to use linalg functions as generalized ufuncs. It also provides some extra performance as there is little python wrapper code and when used as gufuncs buffer handling is quite efficient.
IMO it would be great if this could evolve into a complete replacement for linalg, that's for sure, but right now it is missing functionality and is not a drop-in replacement. In the other hand it provides value, so it is a worthy addition.
Cheers, Oscar
Hello,
I've updated the pull request following feedback: https://github.com/numpy/numpy/pull/2954
It caters all the comments raised up-to-now
On Thu, Jan 31, 2013 at 5:43 PM, Oscar Villellas oscar.villellas@continuum.io wrote:
Hello,
At Continuum Analytics we've been working on a submodule implementing a set of lineal algebra operations as generalized ufuncs. This allows specifying arrays of lineal algebra problems to be computed with a single Python call, allowing broadcasting as well. As the vectorization is handled in the kernel, this gives a speed edge on the operations. We think this could be useful to the community and we want to share the work done.
I've created a couple of pull-requests:
The first one contains a fix for a bug in the handling of certain signatures in the gufuncs. This was found while building the submodule. The fix was done by Mark Wiebe, so credit should go to him :). https://github.com/numpy/numpy/pull/2953
The second pull request contains the submodule itself and builds on top of the previous fix. It contains a rst file that explains the submodule, enumerates the functions implemented and details some implementation bits. The entry point to the module is in written in Python and contains detailed docstrings. https://github.com/numpy/numpy/pull/2954
We are open to discussion and to make improvements to the code if needed, in order to adapt to NumPy standards.
Thanks, Oscar.