dot/tensordot limitations

Timothy Hochberg has proposed a generalization of the matrix mechanism to support manipulating arrays of linear algebra objects. For example, one might have an array of matrices one wants to apply to an array of vectors, to yield an array of vectors:

In [88]: A = np.repeat(np.eye(3)[np.newaxis,...],2,axis=0)
In [89]: A
Out[89]:
array([[[ 1., 0., 0.],
        [ 0., 1., 0.],
        [ 0., 0., 1.]],
       [[ 1., 0., 0.],
        [ 0., 1., 0.],
        [ 0., 0., 1.]]])
In [90]: V = np.array([[1,0,0],[0,1,0]])

Currently, it is very clumsy to handle this kind of situation even with arrays, keeping track of dimensions by hand. For example, if one wants to multiply A by V "elementwise", one cannot simply use dot:

In [92]: np.dot(A,V.T)
Out[92]:
array([[[ 1., 0.],
        [ 0., 1.],
        [ 0., 0.]],
       [[ 1., 0.],
        [ 0., 1.],
        [ 0., 0.]]])

tensordot() suffers from the same problem. It arises because when you combine two multiindexed objects there are a number of ways you can do it:

A: Treat all indices as different and form all pairwise products (outer product):

In [93]: np.multiply.outer(A,V).shape
Out[93]: (2, 3, 3, 2, 3)

B: Contract the outer product on a pair of indices:

In [98]: np.tensordot(A,V,axes=(-1,-1)).shape
Out[98]: (2, 3, 2)

C: Take the diagonal along a pair of indices:

In [111]: np.diagonal(np.multiply.outer(A,V),axis1=0,axis2=3).shape
Out[111]: (3, 3, 3, 2)

What we want in this case is a combination of B and C:

In [106]: np.diagonal(np.tensordot(A,V,axes=(-1,-1)),axis1=0,axis2=2).T.shape
Out[106]: (2, 3)

but it cannot be done without constructing a larger array and pulling out its diagonal. If this seems like an exotic thing to want to do, suppose instead we have two arrays of vectors, and we want to evaluate the array of dot products. None of np.dot, np.tensordot, or np.inner produces the result you want; you have to multiply elementwise and sum. (This also precludes automatic use of BLAS, if indeed tensordot uses BLAS.)
Does it make sense to implement a generalized tensordot that can do this, or is it the sort of thing one should code by hand every time? Is there any way we can make it easier to do simple common operations, like taking the elementwise matrix-vector product of A and V?

The more general issue, of making linear algebra natural by keeping track of which indices are for elementwise operations and which should be used for dots (and how), is a whole other kettle of fish. I think for that someone should think hard about writing a full-featured multilinear algebra package (and might as well keep track of coordinate transformations too while they were at it), if that is really the goal.

Anne
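For the batched matrix-vector product Anne describes, NumPy versions later than this thread grew np.einsum and the matmul operator, which express the combined "contract on one index, take the diagonal on another" in a single step. A minimal sketch using her A and V:

```python
import numpy as np

A = np.repeat(np.eye(3)[np.newaxis, ...], 2, axis=0)  # shape (2, 3, 3)
V = np.array([[1., 0., 0.], [0., 1., 0.]])            # shape (2, 3)

# out[i] = A[i] @ V[i]: contract A's last axis against V's last axis
# while "diagonalizing" over the shared leading axis i, in one call.
out = np.einsum('ijk,ik->ij', A, V)

# The same thing via matmul, treating V as a stack of column vectors.
out2 = (A @ V[..., np.newaxis])[..., 0]
```

Both give the (2, 3) array of per-matrix products; since each A[i] here is the identity, the result equals V.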

You open a Pandora's box here: what should I do if I want to run an operation on the elements of an array when that operation is not in the library? The usual answers are either to use more memory ( (A*B).sum(axis=1) ) or more time ( [np.dot(A[i],B[i]) for i in range(len(A))] ).

Could your issue be rephrased like this: a way to iterate over an array at the C level, where the function/operation is also defined at the C level, without the overhead of Python?

Nadav.

-----Original message-----
From: numpy-discussion-bounces@scipy.org on behalf of Anne Archibald
Sent: Wed 30-April-08 01:41
To: Discussion of Numerical Python
Subject: [Numpy-discussion] dot/tensordot limitations
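For concreteness, the two workarounds Nadav names (trading memory against time) look like this for an array of row-wise dot products; the shapes and values here are illustrative:

```python
import numpy as np

A = np.arange(12.0).reshape(4, 3)  # four 3-vectors
B = np.ones((4, 3))

# More memory: materialize the full elementwise product, then reduce.
dots_mem = (A * B).sum(axis=1)

# More time: a Python-level loop, one np.dot call per pair of rows.
dots_loop = np.array([np.dot(A[i], B[i]) for i in range(len(A))])
```

Both produce the same length-4 array of dot products; the first allocates an intermediate the size of A, the second pays Python call overhead per row.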

Nadav Horesh wrote:
You open a Pandora's box here: what should I do if I want to run an operation on the elements of an array when that operation is not in the library? The usual answers are either to use more memory ( (A*B).sum(axis=1) ) or more time ( [np.dot(A[i],B[i]) for i in range(len(A))] ).
Could your issue be rephrased like this: a way to iterate over an array at the C level, where the function/operation is also defined at the C level, without the overhead of Python?
My plan for this is the general looping function listed on the NumPy Trac wiki, along with a weave-like kernel creation from Python code:

http://www.scipy.org/scipy/numpy/wiki/GeneralLoopingFunctions

I'd love to get time to do this, or to mentor someone else in doing it.

-Travis

Nadav Horesh wrote:
You open a Pandora's box here: what should I do if I want to run an operation on the elements of an array when that operation is not in the library? The usual answers are either to use more memory ( (A*B).sum(axis=1) ) or more time ( [np.dot(A[i],B[i]) for i in range(len(A))] ).
Could your issue be rephrased like this: a way to iterate over an array at the C level, where the function/operation is also defined at the C level, without the overhead of Python?
Quoting from the GeneralLoopingFunction wiki page:

There seems to be a general need for looping over functions that work on whole arrays and return other arrays. The function has information associated with it that states what the basic dimensionality of the inputs is, as well as the corresponding dimensionality of the outputs. This is in addition to information like supported data-types and so forth.

Then, when "larger" arrays are provided as inputs, the extra dimensions should be "broadcast" to create a looping that calls the underlying construct on the different sub-arrays and produces multiple outputs.

Thus, let's say we have a function, basic_inner, that takes two vectors and returns a scalar, where the signature of the function is known to take two 1-d arrays of the same shape and return a scalar. Then, when this same function is converted to a general looping function,

loopfuncs.inner(a, b)

where a is (3,5,N) and b is (5,N), will return an output that is (3,5) whose elements are constructed by calling the underlying function repeatedly on each piece.

Perl has a notion of threading that captures a part of this idea, I think. The concept is to have a more general function than the ufuncs, where the signature includes dimensionality so that the function does more than "element-by-element": it works "sub-array" by "sub-array".

Such a facility would prove very useful, I think, and would abstract many operations very well.

-Travis
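The basic_inner example from the wiki page can be imitated in later NumPy with np.vectorize and a gufunc-style signature — a pure-Python stand-in for the C-level facility Travis describes, not the proposed loopfuncs API itself:

```python
import numpy as np

# Core function: known to take two 1-d arrays of the same shape
# and return a scalar.
def basic_inner(a, b):
    return (a * b).sum()

# The signature '(n),(n)->()' declares the core dimensionality; the
# extra leading dimensions are broadcast ((3,5) against (5,)), and the
# core function is called once per pair of length-4 sub-arrays.
inner = np.vectorize(basic_inner, signature='(n),(n)->()')

a = np.ones((3, 5, 4))
b = np.ones((5, 4))
out = inner(a, b)  # shape (3, 5), one scalar per broadcast pair of vectors
```

This reproduces the wiki's (3,5,N) with (5,N) -> (3,5) behaviour, though with Python-level call overhead per sub-array rather than a C-level loop.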

On Tue, Apr 29, 2008 at 10:31 PM, Travis E. Oliphant <oliphant@enthought.com> wrote:
Nadav Horesh wrote:
You open a Pandora's box here: what should I do if I want to run an operation on the elements of an array when that operation is not in the library? The usual answers are either to use more memory ( (A*B).sum(axis=1) ) or more time ( [np.dot(A[i],B[i]) for i in range(len(A))] ).
Could your issue be rephrased like this: a way to iterate over an array at the C level, where the function/operation is also defined at the C level, without the overhead of Python?
Quoting from the GeneralLoopingFunction wiki page:

There seems to be a general need for looping over functions that work on whole arrays and return other arrays. The function has information associated with it that states what the basic dimensionality of the inputs is, as well as the corresponding dimensionality of the outputs. This is in addition to information like supported data-types and so forth.

Then, when "larger" arrays are provided as inputs, the extra dimensions should be "broadcast" to create a looping that calls the underlying construct on the different sub-arrays and produces multiple outputs.

Thus, let's say we have a function, basic_inner, that takes two vectors and returns a scalar, where the signature of the function is known to take two 1-d arrays of the same shape and return a scalar. Then, when this same function is converted to a general looping function,

loopfuncs.inner(a, b)

where a is (3,5,N) and b is (5,N), will return an output that is (3,5) whose elements are constructed by calling the underlying function repeatedly on each piece.

Perl has a notion of threading that captures a part of this idea, I think. The concept is to have a more general function than the ufuncs, where the signature includes dimensionality so that the function does more than "element-by-element": it works "sub-array" by "sub-array".

Such a facility would prove very useful, I think, and would abstract many operations very well.
Basically, elementwise operations on elements that are subarrays: generalized ufuncs, if you will. Sorting would be a unary operation in that context ;)

Chuck
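Chuck's sorting aside fits the same mold: a unary generalized ufunc with core signature (n)->(n). A sketch with np.vectorize, purely illustrative since np.sort already sorts along the last axis by default:

```python
import numpy as np

# Sorting as a "sub-array by sub-array" unary operation: the core acts
# on 1-d vectors, and any leading dimensions are looped over.
sort_rows = np.vectorize(np.sort, signature='(n)->(n)')

x = np.array([[3, 1, 2],
              [9, 7, 8]])
y = sort_rows(x)  # each row sorted independently
```

The result matches np.sort(x, axis=-1); the point is only that sorting is expressible in the gufunc vocabulary.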

On Tue, Apr 29, 2008 at 4:41 PM, Anne Archibald <peridot.faceted@gmail.com> wrote:
Timothy Hochberg has proposed a generalization of the matrix mechanism to support manipulating arrays of linear algebra objects. For example, one might have an array of matrices one wants to apply to an array of vectors, to yield an array of vectors:

In [88]: A = np.repeat(np.eye(3)[np.newaxis,...],2,axis=0)
In [89]: A
Out[89]:
array([[[ 1., 0., 0.],
        [ 0., 1., 0.],
        [ 0., 0., 1.]],
       [[ 1., 0., 0.],
        [ 0., 1., 0.],
        [ 0., 0., 1.]]])
In [90]: V = np.array([[1,0,0],[0,1,0]])

Currently, it is very clumsy to handle this kind of situation even with arrays, keeping track of dimensions by hand. For example, if one wants to multiply A by V "elementwise", one cannot simply use dot:
Let A have dimensions LxMxN and b dimensions LxN; then sum(A*b[:,newaxis,:], axis=-1) will do the trick. Example:

In [1]: A = ones((2,2,2))
In [2]: b = array([[1,2],[3,4]])
In [3]: A*b[:,newaxis,:]
Out[3]:
array([[[ 1., 2.],
        [ 1., 2.]],
       [[ 3., 4.],
        [ 3., 4.]]])
In [4]: sum(A*b[:,newaxis,:], axis=-1)
Out[4]:
array([[ 3., 3.],
       [ 7., 7.]])

Chuck
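A quick sanity check of the broadcast-and-sum recipe against the explicit Python loop, using arbitrary example shapes and modern NumPy's random API (which postdates this thread):

```python
import numpy as np

rng = np.random.default_rng(0)          # seed is arbitrary
A = rng.standard_normal((2, 3, 4))      # L x M x N
b = rng.standard_normal((2, 4))         # L x N

# Broadcast b against A's middle axis, then reduce over the last axis.
out = (A * b[:, np.newaxis, :]).sum(axis=-1)

# Explicit loop of matrix-vector products for comparison.
ref = np.array([np.dot(A[i], b[i]) for i in range(2)])
```

The two agree, at the cost (as noted earlier in the thread) of an intermediate array the size of A.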
participants (4)

- Anne Archibald
- Charles R Harris
- Nadav Horesh
- Travis E. Oliphant