Mailman 3 - NumPy-Discussion

A proposal for dot (or inner)
by Harald Hanche-Olsen Jan. 28, 2000

Jan. 28, 2000

I am having some problems relating to the current function dot (identical to matrixmultiply, though I haven't seen the equivalence in any documentation). Here is what the docs say: dot(a,b) Will return the dot product of a and b. This is equivalent to matrix multiply for 2d arrays (without the transpose). Somebody who does more linear algebra really needs to do this function right some day! Or the builtin doc string: >>> print Numeric.dot.__doc__ dot(a,b) returns matrix-multiplication between a and b. The product-sum is over the last dimension of a and the second-to-last dimension of b. First, this is misleading. It seems to me to indicate that b must have rank at least 2, which experiments indicate is not necessary. Instead, the rule appears to be to use the only axis of b if b has rank 1, and otherwise to use the second-to-last one. Frankly, I think this convention is ill motivated, hard to remember, and even harder to justify. As a mathematician, I can see only one reasonable default choice: One should sum over the last index of a, and the first index of b. Using the Einstein summation convention [*], that would mean that dot(a,b)[j,k,...,m,n,...] = a[j,k,...,i] * b[i,m,n,...] [*] that is, summing over repeated indices -- i in this example This would of course yield the current behaviour in the important cases where the rank of b is 1 or 2. But we could do better than this: Why not leave the choice up to the user? We could allow an optional third parameter which should be a pair of indices, indicating the axes to be summed over. The default value of this parameter would be (-1, 0). Returning to my example above, the user could then easily compute, for example, dot(a,b,(1,2))[j,k,...,m,n,...] = a[j,i,k,...] * b[m,n,i,...] while the current behaviour of dot would correspond to the new behaviour of dot(a,b,(-1,-2)) whenever b has rank at least 2. Actually, there is probably a lot of code out there that uses the current behaviour of dot. So I would propose leaving dot untouched, and introducing inner instead, with the behaviour I outlined above. We could even allow any number of pairs of axes to be summed over, for example inner(a,b,(1,2),(2,0))[k,l,...,m,n,...] = a[k,i,j,l,..] * b[j,m,i,n,...] With this notation, one can for example write the Hilbert-Schmidt inner product of two real 2x2 matrices (the sum of a[i,j]b[j,i] over all i and j) as inner(a,b,(0,1),(1,0)). If my proposal is accepted, the documentation should probably declare dot (and its alias matrixmultiply?) as deprecated and due to disappear in the future, with a pointer to its replacement inner. In the meantime, dot could in fact be replaced by a simple wrapper to inner: def dot(a,b): if len(b.shape) > 1: return inner(a,b,(-1,-2) else: return inner(a,b) (with the proper embellishments to allow it to be used with python sequences, of course). - Harald

1 0

install numPy with lapack
by fredrik Jan. 26, 2000

Jan. 26, 2000

This message was sent from Geocrawler.com by "fredrik" <fredriks(a)pinkfloyd.com> Be sure to reply to that address. How do i install the latest version of numPy if i have lapack already installed? I'm using distutils. Geocrawler.com - The Knowledge Archive

1 0

Proclamation: column-wise arrays
by Pearu Peterson Jan. 26, 2000

Jan. 26, 2000

Hi! Problem: Using Fortran routines from Python C/API is "tricky" when multi-dimensional arrays are passed in. Cause: Arrays in Fortran are stored in column-wise order while arrays in C are stored in row-wise order. Standard solutions: 1) Create a new C array; copy the data from the old one in column-wise order; pass the new array to fortran; copy changed array back to old one in row-wise order; deallocate the array. 2) Change the storage order of an array in place: element-wise swapping; pass the array to fortran; change the storage order back with element-wise swapping Why standard solutions are not good? 1) Additional memory allocation, that is problem for large arrays; Element-wise copying is time consuming (2 times). 2) It is good as no extra memory is needed but element-wise swapping (2 times) is approx. equivalent with the element-wise copying (4 times). Proclamation: Introduce a column-wise array to Numeric Python where data is stored in column-wise order that can be used specifically for fortran routines. Proposal sketch: 1) Introduce a new flag `row_order'to PyArrayObject structure: row_order == 1 -> the data is stored in row-wise order (default, as it is now) row_order == 0 -> the data is stored in column-wise order Note that now the concept of contiguousness depends on this flag. 2) Introduce new array "constructors" such as PyArray_CW_FromDims, PyArray_CW_FromDimsAndData, PyArray_CW_ContiguousFromObject, PyArray_CW_CopyFromObject, PyArray_CW_FromObject, etc. that all return arrays with row_order=0 and data stored in column-wise order (that is in case of contiguous results, otherwise strides feature is employd). 3) In order to operations between arrays (possibly with different storage order) would work correctly, many internal functions of NumPy C/API need to be modifyied. 4) anything else? What is the good of this? 1) The fact is that there is a large number of very good scietific tools freely available written in Fortran (Netlib, for instance). And I don't mean only Fortran 77 codes but also Fortran 90/95 codes. 2) Having Numeric Python arrays with data stored in column-wise order, calling Fortran routines from Python becomes really efficient and space-saving. 3) There should be little performance hit if, say, two arrays with different storage order are multiplied (compared to the operations between non-contiguous arrays in the current implementation). 4) I don't see any reason why older C/API modules would broke because of this change if it is carried out carefully enough. So, back-ward compability should be there. 5) anything else? What are against of this? 1) Lots of work but with current experience it should not be a problem. 2) The size of the code will grow. 3) I suppose that most people using Numerical Python will not care of calling Fortran routines from Python. Possible reasons: too "tricky" or no need. In the first case, the answer is that there are tools such as PyFort, f2py that solve this problem. In the later case, there is no problem:-) 4) anything else? I understand that my proposal is quite radical but taking into account that we want to use Python for many years to come, the use would be more pleasing if one cause of (constant) confusion would be less during this time. Best regards, Pearu

2 2

Hello
by Hassan Aurag Jan. 22, 2000

Jan. 22, 2000

Hi, I have just seen with pleasure that numpy is on sourceforge. So welcome. I am maintaining and writing a generic mathematical session manager, in which I have a numpy session. The project if you haven't noticed it is at http://gmath.sourceforge.net Now, this done, I'd like to know if you will be making rpm/tar.gz releases available on your site always and preferably on the anonymous ftp site for latest release, like in: ftp://numpy.sourceforge.net/latest/srpms .... If yes, then I'd be happy to make the app go fetch the packages from that site or another instead of making the srpms myself (which means possible bugs!). Thank you in advance. H. Aurag aurag(a)crm.umontreal.ca aurag(a)users.sourceforge.net

1 0

test
by Paul F. Dubois Jan. 20, 2000

Jan. 20, 2000

This is a test. Ignore. Paul

1 0