design issues  octave 'incompatibilities'
Dear all, I am new to numarray and as I am trying to use it daybyday I am now wondering about certain numeric/numarray design issues. Please forgive me / correct me as I probably misunderstood certain issues.  Why did you choose rowmajor instead of column major format as practiced by R/octave/matlab... Doesn't that mean performance problems as fortran/blas is optimized if you work with the transposed version ?  why do vectors have no 'orientation', i.e. there are only row but no column vectors (or why do you treat matrices/vectors differently, i.e. have vectors at all as a separate type) numarray: a=array([[1,2,3],[4,5,6],[7,8,9]])
a[0,:] array([1, 2, 3]) a[:,0] array([1, 4, 7])
vs. octave: a=[1,2,3;4,5,6;7,8,9];
a(1,:) ans = 1 2 3 a(:,1) ans = 1 4 7
 How can one obtain submatrices from full matrices: numarray gives only elements (which is very, very rarely needed and should IMHO rather be some extra function):
i=[0,1] j=[0,2] a[i,j] array([1, 6])
vs octave:
i=[1,2]; j=[1,3]; a(i,j) ans = 1 3 4 6
all one can do in numarray currently is as awkward as this:
a.take(i,0).take(j,1) array([[1, 3], [4, 6]])
also mixing slice and variable does not work a[:,i]  hmmhh seems that is a major issue for me...  why don't you allow iterating over the whole matrix via a single index ? i.e. a[3] Traceback (most recent call last): File "<stdin>", line 1, in ? IndexError: Index out of range vs octave
a(4) ans = 2
Are there more elegant ways to do this ? Which issues are likely to be fixed in the future ? Best regards, Soeren
Soeren Sonnenburg wrote:
Dear all,
I am new to numarray and as I am trying to use it daybyday I am now wondering about certain numeric/numarray design issues. Please forgive me / correct me as I probably misunderstood certain issues.
 Why did you choose rowmajor instead of column major format as practiced by R/octave/matlab... Doesn't that mean performance problems as fortran/blas is optimized if you work with the transposed version ?
Not everyone is interfacing with optimized FORTRAN code. Those who are are usually using ATLAS as their BLAS, and ATLAS has rowmajor versions of the BLAS subroutines. Rowmajor is C's convention and numarray/Numeric largely follow that. There are of course some performance issues when interfacing with FORTRAN code that expects columnmajor, but there would be other performance problems if numarray/Numeric were columnmajor and interfacing with rowmajor code.
 why do vectors have no 'orientation', i.e. there are only row but no column vectors (or why do you treat matrices/vectors differently, i.e. have vectors at all as a separate type)
You can certainly make column vectors. In [1]: a = arange(5) In [2]: a Out[2]: NumPy array, format: long [0 1 2 3 4] In [3]: a.shape = (5,1) In [4]: a Out[4]: NumPy array, format: long [[0] [1] [2] [3] [4]] Often, though, a "row" vector can also be used in place of a "column" vector. Although sometimes not:
 How can one obtain submatrices from full matrices:
In [6]: import numarray In [7]: a = numarray.arange(1, 10) In [8]: a.shape = (3,3) In [9]: a Out[9]: array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]) In [10]: i = [0,1] In [11]: j = [[0],[2]] In [12]: a[i,j] Out[12]: array([[1, 4], [3, 6]])
 why don't you allow iterating over the whole matrix via a single index ?
ravel()
Are there more elegant ways to do this ? Which issues are likely to be fixed in the future ?
None. They're not broken, just different from what you are used to.  Robert Kern rkern@ucsd.edu "In the fields of hell where the grass grows high Are the graves of dreams allowed to die."  Richard Harter
On Sat, 20050723 at 12:06 0700, Robert Kern wrote:
Soeren Sonnenburg wrote:
Dear all,
I am new to numarray and as I am trying to use it daybyday I am now wondering about certain numeric/numarray design issues. Please forgive me / correct me as I probably misunderstood certain issues.
 Why did you choose rowmajor instead of column major format as practiced by R/octave/matlab... Doesn't that mean performance problems as fortran/blas is optimized if you work with the transposed version ?
Not everyone is interfacing with optimized FORTRAN code. Those who are are usually using ATLAS as their BLAS, and ATLAS has rowmajor versions of the BLAS subroutines. Rowmajor is C's convention and numarray/Numeric largely follow that. There are of course some performance issues when interfacing with FORTRAN code that expects columnmajor, but there would be other performance problems if numarray/Numeric were columnmajor and interfacing with rowmajor code.
Well but why did you change to the C version then ? Well maybe if it is about optimizing stuff seriously one could work with the transpose anyway...
Often, though, a "row" vector can also be used in place of a "column" vector. Although sometimes not:
You are right, thanks!
Are there more elegant ways to do this ? Which issues are likely to be fixed in the future ?
None. They're not broken, just different from what you are used to.
Thanks a lot for your very helpful mail! Do you know whether mixing slices and arrays will be supported at some point a[:,[0,1]] ? Soeren.
Soeren Sonnenburg wrote:
On Sat, 20050723 at 12:06 0700, Robert Kern wrote:
Soeren Sonnenburg wrote:
Dear all,
I am new to numarray and as I am trying to use it daybyday I am now wondering about certain numeric/numarray design issues. Please forgive me / correct me as I probably misunderstood certain issues.
 Why did you choose rowmajor instead of column major format as practiced by R/octave/matlab... Doesn't that mean performance problems as fortran/blas is optimized if you work with the transposed version ?
Not everyone is interfacing with optimized FORTRAN code. Those who are are usually using ATLAS as their BLAS, and ATLAS has rowmajor versions of the BLAS subroutines. Rowmajor is C's convention and numarray/Numeric largely follow that. There are of course some performance issues when interfacing with FORTRAN code that expects columnmajor, but there would be other performance problems if numarray/Numeric were columnmajor and interfacing with rowmajor code.
Well but why did you change to the C version then ? Well maybe if it is about optimizing stuff seriously one could work with the transpose anyway...
Because numarray and Python are written in C. It's also consistent with listsoflists. In [8]: L = [[0, 1],[2,3]] In [9]: A = array(L) In [10]: L[1][0] Out[10]: 2 In [11]: A[1][0] Out[11]: 2 In [12]: A[1,0] Out[12]: 2  Robert Kern rkern@ucsd.edu "In the fields of hell where the grass grows high Are the graves of dreams allowed to die."  Richard Harter
On Jul 24, 2005, at 2:41 PM, Soeren Sonnenburg wrote:
Well but why did you change to the C version then ? Well maybe if it is about optimizing stuff seriously one could work with the transpose anyway...
To get a really solid answer to "why" you would have to ask those that wrote the original Numeric. (Jim Hugunin and Co.). My guess is was that it was just as much to preserve the following relation arr[i,j] = arr[i][j] (instead of arr[i,j] = arr[j][i]) But I could be wrong. Note that this is a confusing issue to many and often as not there are unstated assumptions that are repeatedly made by many that are *not* shared by everyone. To illustrate, there are at least two different approaches to handling this mismatch and it seems to me that many seem oblivious to the fact that there are two approaches. Approach 1: Keep the index convention the same. As a user of Numeric or numarray, you wish M[i,j] to mean the same as it does in Fortran/IDL/matlab... The consequence is that the data are ordered differently between Numeric/numarray and these other languages. This is seen as a data compatibility problem. Approach 2: Keep the data invariant and change the indexing convention. What was M[i,j] in Fortran is now M[j,i] in Numeric/numarray. This is not a data compatibility problem, but is now a brain compatibility problem ;) Since we deal with big data sets we have adopted 2 (no doubt to the consternation of many). This ought to be in a FAQ, it comes up enough to be :)
Do you know whether mixing slices and arrays will be supported at some point a[:,[0,1]] ?
Soeren.
No plans at the moment. We figured indexing was complicated enough as it was. I think Travis Oliphant did allow for this in Numeric3 (aka scipy.core); see the link below. But it may not mean what you think it should so you should check that out to see: http://cvs.sourceforge.net/viewcvs.py/numpy/Numeric3/PEP.txt?view=markup Perry Greenfield
Soeren Sonnenburg wrote:
 why do vectors have no 'orientation', i.e. there are only row but no column vectors (or why do you treat matrices/vectors differently, i.e. have vectors at all as a separate type)
I think the key to understanding this is that NumPy uses a fundamentally different data type that MATLAB ans it's derivatives. MATLAB was originally just what it is called: a "Matrix" laboratory. The basic data type of Matlab is a 2d matrix. Even a scalar is really a 1X! matrix. Matlab has a few tricks that can make these look like row and column vectors, etc, but they are really always matrices. On the other hand, NumPy arrays are Ndimensional, where N is theoretically unlimited. In practice, I think the max N is defined and compiled in, but you could change it and recompile if you wanted. In any case, many of us frequently use 3d and higher arrays, and they can be very useful. When thought of this way, you can see why there is no such thing as a column vs. a row vector. A vector is a onedimensional array: it has no orientation. However, NumPy does support NX1 and 1XN 2d arrays which can be very handy:
import numarray as N a = N.arange(5) a.shape = (1,1) a array([[0, 1, 2, 3, 4]]) b = N.arange(5) b.shape = (1,1) a array([0, 1, 2, 3, 4]) b array([[0], [1], [2], [3], [4]])
So a is a row vector and b is a column vector. If you multiply them, you get "array broadcasting":
a * b array([[ 0, 0, 0, 0, 0], [ 0, 1, 2, 3, 4], [ 0, 2, 4, 6, 8], [ 0, 3, 6, 9, 12], [ 0, 4, 8, 12, 16]])
This eliminates a LOT of extra duplicate arrays that you have to make in Matlab with meshgrid. When you index into an array, you reduce its rank (number of dimensions) by 1:
a = N.arange(27) a.shape = (3,3,3) a array([[[ 0, 1, 2], [ 3, 4, 5], [ 6, 7, 8]],
[[ 9, 10, 11], [12, 13, 14], [15, 16, 17]], [[18, 19, 20], [21, 22, 23], [24, 25, 26]]])
a.shape (3, 3, 3) a[1].shape (3, 3) a[1][1].shape (3,)
When you slice, you keep the rank the same:
a[1:2].shape (1, 3, 3)
This creates a way to make row and column "vectors" from your 2d array (matrix)
a = N.arange(25) a.shape = (5,5) a array([[ 0, 1, 2, 3, 4], [ 5, 6, 7, 8, 9], [10, 11, 12, 13, 14], [15, 16, 17, 18, 19], [20, 21, 22, 23, 24]])
To make a "row vector" (really a 1XN matrix)
a[0:1,:] array([[0, 1, 2, 3, 4]])
To make a "column vector" (really a NX1 matrix)
a[:,0:1] array([[ 0], [ 5], [10], [15], [20]])
I hope that helps: Chris  Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 5266959 voice 7600 Sand Point Way NE (206) 5266329 fax Seattle, WA 98115 (206) 5266317 main reception Chris.Barker@noaa.gov
Soeren Sonnenburg wrote:
Dear all,
I am new to numarray and as I am trying to use it daybyday I am now wondering about certain numeric/numarray design issues. Please forgive me / correct me as I probably misunderstood certain issues.
 Why did you choose rowmajor instead of column major format as practiced by R/octave/matlab... Doesn't that mean performance problems as fortran/blas is optimized if you work with the transposed version ?
 why do vectors have no 'orientation', i.e. there are only row but no column vectors (or why do you treat matrices/vectors differently, i.e. have vectors at all as a separate type)
I think others have responded to these questions well.
 How can one obtain submatrices from full matrices:
numarray gives only elements (which is very, very rarely needed and should IMHO rather be some extra function):
I thought about this quite a bit because I initially felt as you did before beginning work on scipy.base (Numeric3). I wondered if the right choice had been made. After some thought and discussion, I decided this approach was more general and flexible, and agreed with the numarray design. Therefore, this works in scipy.base the same. Also, scipy.base does allow mixing slices and lists like you want: import scipy.base a = scipy.base.array([[1,2,3],[4,5,6],[7,8,9]])
a[:,[0,1]] array([[1, 2], [4, 5], [7, 8]], 'l')
 why don't you allow iterating over the whole matrix via a single index ?
You can with scipy.base with a.flat[index] but ravel(a)[index] always works too. The reason is that Numeric uses the concept of reduceddimensionality arrays so that a[index] for a 2d array returns a 1d array not an element of the 2d array. It may be different then you are used to but I think it is a better choice. Best regards, Travis Oliphant
I missed this part and was reminded by Travis's message. On Jul 23, 2005, at 12:03 PM, Soeren Sonnenburg wrote:
 How can one obtain submatrices from full matrices:
numarray gives only elements (which is very, very rarely needed and should IMHO rather be some extra function):
i=[0,1] j=[0,2] a[i,j] array([1, 6])
vs octave:
i=[1,2]; j=[1,3]; a(i,j) ans = 1 3 4 6
Maybe for you it is rarely needed, but not for us. The situation is reversed. The latter is rarely used in our case. This is largely a reflection of whether your orientation is matrices or multidimensional arrays. In our case it is quite handy to select out a list of point in an image by giving a list of their respective indices (e.g., stars). True, I do see that others may need the other view, but then the question is which should get the convenience. Since Numeric/numarray are primarily oriented towards multidimensional arrays (e.g., operations are elementbyelement rather than matrix) it seemed to make sense to go this way for consistency, but I understand that there is another side to this. Perry
On Mon, 20050725 at 10:10 0400, Perry Greenfield wrote:
On Jul 24, 2005, at 2:41 PM, Soeren Sonnenburg wrote:
Well but why did you change to the C version then ? Well maybe if it is about optimizing stuff seriously one could work with the transpose anyway...
To get a really solid answer to "why" you would have to ask those that wrote the original Numeric. (Jim Hugunin and Co.). My guess is was that it was just as much to preserve the following relation
arr[i,j] = arr[i][j]
(instead of arr[i,j] = arr[j][i])
Ok, that sounds reasonable.
But I could be wrong.
Note that this is a confusing issue to many and often as not there are unstated assumptions that are repeatedly made by many that are *not* shared by everyone. To illustrate, there are at least two different approaches to handling this mismatch and it seems to me that many seem oblivious to the fact that there are two approaches.
Approach 1: Keep the index convention the same. As a user of Numeric or numarray, you wish M[i,j] to mean the same as it does in Fortran/IDL/matlab... The consequence is that the data are ordered differently between Numeric/numarray and these other languages. This is seen as a data compatibility problem.
Approach 2: Keep the data invariant and change the indexing convention. What was M[i,j] in Fortran is now M[j,i] in Numeric/numarray. This is not a data compatibility problem, but is now a brain compatibility problem ;)
well it might not be *that* bad in the end... only it is a tough decision to break with everything that is there (to put it to extreme: the world is wrong and but we did it right) and be compatible with C like array access... If one does so one needs to have very serious reasons to do so ... that is why I was asking.
Since we deal with big data sets we have adopted 2 (no doubt to the consternation of many).
hmmhh, there is no advantage with big datasets for any of the formats.
Do you know whether mixing slices and arrays will be supported at some point a[:,[0,1]] ?
Soeren.
No plans at the moment. We figured indexing was complicated enough as it was. I think Travis Oliphant did allow for this in Numeric3 (aka scipy.core); see the link below. But it may not mean what you think it should so you should check that out to see:
http://cvs.sourceforge.net/viewcvs.py/numpy/Numeric3/PEP.txt?view=markup
Hmmhh while we are at it, is there work on a consensus num* ? I've seen the PEP for inclusion of numarray, now I see numeric3 and scipy and also cvxopt  are people actually converging towards a single num* package ? Soeren.
On Mon, 20050725 at 08:59 0700, Chris Barker wrote:
Soeren Sonnenburg wrote:
 why do vectors have no 'orientation', i.e. there are only row but no column vectors (or why do you treat matrices/vectors differently, i.e. have vectors at all as a separate type)
I think the key to understanding this is that NumPy uses a fundamentally different data type that MATLAB and it's derivatives. MATLAB was originally just what it is called: a "Matrix" laboratory. The basic data type of Matlab is a 2d matrix. Even a scalar is really a 1X! matrix. Matlab has a few tricks that can make these look like row and column vectors, etc, but they are really always matrices.
Ok, I am realizing that R also distinguishes between vectors and matrices.
On the other hand, NumPy arrays are Ndimensional, where N is theoretically unlimited. In practice, I think the max N is defined and compiled in, but you could change it and recompile if you wanted. In any case, many of us frequently use 3d and higher arrays, and they can be very useful. When thought of this way, you can see why there is no
Well at least this is the same for octave/matlab.
such thing as a column vs. a row vector. A vector is a onedimensional array: it has no orientation.
This makes life more difficult if one wants to convert from octave/matlab > numarray and automated systems close to impossible. If vectors had the same properties/functions as matrices one would not have such problems, i.e. v^{transpose} * u == dot(v,u) and v*u > error
However, NumPy does support NX1 and 1XN 2d arrays which can be very handy:
import numarray as N a = N.arange(5) a.shape = (1,1) a array([[0, 1, 2, 3, 4]]) b = N.arange(5) b.shape = (1,1) a array([0, 1, 2, 3, 4]) b array([[0], [1], [2], [3], [4]])
So a is a row vector and b is a column vector. If you multiply them, you get "array broadcasting":
a * b array([[ 0, 0, 0, 0, 0], [ 0, 1, 2, 3, 4], [ 0, 2, 4, 6, 8], [ 0, 3, 6, 9, 12], [ 0, 4, 8, 12, 16]])
This eliminates a LOT of extra duplicate arrays that you have to make in Matlab with meshgrid.
In my eyes 'array broadcasting' is confusing and should rather be in a function like meshgrid and instead a*b should return matrixmultiply(a,b) ...
When you index into an array, you reduce its rank (number of dimensions) by 1:
a = N.arange(27) a.shape = (3,3,3) a array([[[ 0, 1, 2], [ 3, 4, 5], [ 6, 7, 8]],
[[ 9, 10, 11], [12, 13, 14], [15, 16, 17]],
[[18, 19, 20], [21, 22, 23], [24, 25, 26]]])
a.shape (3, 3, 3) a[1].shape (3, 3) a[1][1].shape (3,)
When you slice, you keep the rank the same:
a[1:2].shape (1, 3, 3)
This creates a way to make row and column "vectors" from your 2d array (matrix)
a = N.arange(25) a.shape = (5,5) a array([[ 0, 1, 2, 3, 4], [ 5, 6, 7, 8, 9], [10, 11, 12, 13, 14], [15, 16, 17, 18, 19], [20, 21, 22, 23, 24]])
To make a "row vector" (really a 1XN matrix)
a[0:1,:] array([[0, 1, 2, 3, 4]])
To make a "column vector" (really a NX1 matrix)
a[:,0:1] array([[ 0], [ 5], [10], [15], [20]])
I hope that helps:
Indeed it does  Thanks!! Unfortunately I am not at all happy now that '*' != matrixmultiply (but outerproduct) for vectors/matrices... I realize that with lists it is ok to grow them via slicing. x=[] x[0]=1 IndexError: list assignment index out of range x[0:0]=[1] x [1] that seems not to work with numarray ... or ? y=array() y[0]=1 TypeError: object does not support item assignment y[0:0]=array([1]) TypeError: object does not support item assignment Soeren.
Soeren Sonnenburg wrote:
On Mon, 20050725 at 08:59 0700, Chris Barker wrote:
[snip]
such thing as a column vs. a row vector. A vector is a onedimensional array: it has no orientation.
This makes life more difficult if one wants to convert from octave/matlab > numarray and automated systems close to impossible.
<shrug> That wasn't a design goal.
If vectors had the same properties/functions as matrices one would not have such problems, i.e. v^{transpose} * u == dot(v,u) and v*u > error
That would be a big problem for those of us who don't use Numeric just for linear algebra. These are general arrays, not matrices and vectors. [snip]
In my eyes 'array broadcasting' is confusing and should rather be in a function like meshgrid and instead a*b should return matrixmultiply(a,b) ...
Spend some time with it. It will probably grow on you. Numeric is not Matlab or Octave. It never will be, thank Gd. [snip]
I hope that helps:
Indeed it does  Thanks!! Unfortunately I am not at all happy now that '*' != matrixmultiply (but outerproduct) for vectors/matrices...
Again, Numeric is a package for arrays, not just linear algebra. Please spend some more time with Python and Numeric before deciding that they must be changed to match your preconceptions.
I realize that with lists it is ok to grow them via slicing.
x=[] x[0]=1 IndexError: list assignment index out of range x[0:0]=[1] x [1]
that seems not to work with numarray ... or ?
y=array() y[0]=1 TypeError: object does not support item assignment y[0:0]=array([1]) TypeError: object does not support item assignment
Python lists are designed to grow dynamically. Their memory is preallocated so that growing them is on average pretty cheap. Numeric arrays are not, nor will they be.  Robert Kern rkern@ucsd.edu "In the fields of hell where the grass grows high Are the graves of dreams allowed to die."  Richard Harter
On Mon, 20050725 at 17:47 0400, Perry Greenfield wrote:
I missed this part and was reminded by Travis's message.
On Jul 23, 2005, at 12:03 PM, Soeren Sonnenburg wrote:
 How can one obtain submatrices from full matrices:
numarray gives only elements (which is very, very rarely needed and should IMHO rather be some extra function):
i=[0,1] j=[0,2] a[i,j] array([1, 6])
vs octave:
i=[1,2]; j=[1,3]; a(i,j) ans = 1 3 4 6
Maybe for you it is rarely needed, but not for us. The situation is reversed. The latter is rarely used in our case. This is largely a reflection of whether your orientation is matrices or multidimensional arrays. In our case it is quite handy to select out a list of point in an image by giving a list of their respective indices (e.g., stars).
Hmmhh. I see that this again breaks with R/octave/matlab. One should not do so if there is no serious reason. It just makes life harder for every potential convert from any of these languages. A funktion take() would have served this purposed much better... but this is ofcourse all IMHO and I can see your point: You design it according to your (or your communities) requirements and different communities different needs... I am realizing that this must have been why cvxopt switched away from numarray/numeric. There slicing/indexing and '*' work as I would have expected: In [71]: a=matrix([1,2,3,4,5,6,7,8,9],size=(3,3)) In [72]: a Out[72]: <3x3 matrix, tc='d'> In [73]: b=matrix([1,2,3,4,5,6,7,8,9],size=(3,3)) In [74]: a*b Out[74]: <3x3 matrix, tc='d'> In [75]: print a 1.0000e+00 4.0000e+00 7.0000e+00 2.0000e+00 5.0000e+00 8.0000e+00 3.0000e+00 6.0000e+00 9.0000e+00 In [76]: print b 1.0000e+00 4.0000e+00 7.0000e+00 2.0000e+00 5.0000e+00 8.0000e+00 3.0000e+00 6.0000e+00 9.0000e+00 In [77]: print a*b 3.0000e+01 6.6000e+01 1.0200e+02 3.6000e+01 8.1000e+01 1.2600e+02 4.2000e+01 9.6000e+01 1.5000e+02 In [78]: print a[:,0] 1.0000e+00 2.0000e+00 3.0000e+00 In [79]: print a[0,1] 4.0 In [80]: print a[0,:] 1.0000e+00 4.0000e+00 7.0000e+00
True, I do see that others may need the other view, but then the question is which should get the convenience. Since Numeric/numarray are primarily oriented towards multidimensional arrays (e.g., operations are elementbyelement rather than matrix) it seemed to make sense to go this way for consistency, but I understand that there is another side to this.
It now seems very difficult for me to end up with a single numeric/matrix package that makes it into core python  which is at the same time very sad. Soeren
On Tue, 26 Jul 2005, Soeren Sonnenburg apparently wrote:
In my eyes 'array broadcasting' is confusing and should rather be in a function like meshgrid and instead a*b should return matrixmultiply(a,b) ...
It does, if you work with matrices. Try import scipy help(scipy.mat) Coming from GAUSS I had the same initial reaction. It did not last long. hth, Alan Isaac
On Jul 26, 2005, at 12:35 AM, Soeren Sonnenburg wrote:
Since we deal with big data sets we have adopted 2 (no doubt to the consternation of many).
hmmhh, there is no advantage with big datasets for any of the formats.
It is if you have to reorder the data, or use nonoptimum iteration through the array. The first is definitely slower.
Do you know whether mixing slices and arrays will be supported at some point a[:,[0,1]] ?
Soeren.
No plans at the moment. We figured indexing was complicated enough as it was. I think Travis Oliphant did allow for this in Numeric3 (aka scipy.core); see the link below. But it may not mean what you think it should so you should check that out to see:
http://cvs.sourceforge.net/viewcvs.py/numpy/Numeric3/PEP.txt? view=markup
Hmmhh while we are at it, is there work on a consensus num* ? I've seen the PEP for inclusion of numarray, now I see numeric3 and scipy and also cvxopt  are people actually converging towards a single num* package ?
That's what the goal of Numeric3 is as far as different underlying array engines. But if you mean a merging of the array/matrix viewpoints, I think you've answered your own question. Perry
Soeren Sonnenburg wrote:
Hmmhh. I see that this again breaks with R/octave/matlab. One should not do so if there is no serious reason. It just makes life harder for every potential convert from any of these languages.
If you're looking for a matlab clone, use octave or psilab, or.... Speaking as an exmatlab user, I much prefer the NumPy approach. The reason is that I very rarely did linear algebra, and generally used matlab as a general computational environment. I got very tired of having to type that "." before all my operators. I also love array broadcasting, it seems so much cleaner and efficient. When I moved from Matlab to NumPy, I missed these things: Integrated plotting: many years later, this is still weak, but at least for 2d matplotlib is wonderful. The large library of math functions: SciPy is moving to fill that gap. Automatic handling of IEEE special values: numarray now does that pretty well. That's what I missed. What I gained was a far more powerful and expressive language, AND a more powerful and flexible array type. I don't miss MATLAB at all. In fact, you'll find me on the matplotlib mailing list, often repeating the refrain: matplotlib is NOT MATLAB nor should it be!
It now seems very difficult for me to end up with a single numeric/matrix package that makes it into core python 
That is probably true.
which is at the same time very sad.
But I'm not sure we should be sad about it. What we all want is the package best suited to our own needs to be in the standard library. However, I'd rather the current situation than having a package poorly suited to my needs in the standard library. As this thread proves, there is no one kind of array package that would fit even most people's needs. However, there is some promise for the future: 1) SciPy base may serve to unify Numeric/numarray 2) Travis has introduced the "array interface" http://numeric.scipy.org/array_interface.html this would provide an efficient way for the various array and matrix packages to exchange data. It does have a hope of making it into the standard library, though even if it doesn't, if a wide variety of addon packages use it, then the same thing is accomplished. If fact, if packages that don't implement an array type, but might use one (like PIL, wxPython, etc) accept this interface, then any package that implements it can be used together with them. 3) What about PEP 225: Elementwise/Objectwise Operators? It's listed under: Deferred, Abandoned, Withdrawn, and Rejected PEPs Which of those applied here? I had high hope for it one time. By the way, I had not seen cvxopt before, thanks for the link. Perhaps it would serve as a better base for a fullfeatured linear algebra package than Numeric/numarray. Ideally, it could use the array interface, for easy exchange of data with other packages. Chris  Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 5266959 voice 7600 Sand Point Way NE (206) 5266329 fax Seattle, WA 98115 (206) 5266317 main reception Chris.Barker@noaa.gov
Soeren Sonnenburg wrote:
I am realizing that this must have been why cvxopt switched away from numarray/numeric. There slicing/indexing and '*' work as I would have expected:
cvxopt uses it's own classes because they did not feel that a general purpose array was needed. They wanted to define a matrix class with sparse matrix and dense matrix subclasses. In fact, cvxopt's matrix classes can not be used as ubiquitously as Numeric/numarray arrays. Everything is not a matrix. In fact, I would like to see more general linear algebra routines that allow people to more naturally deal with (for example) sixdimensional linear operators mapping from a threedimensional space to a threedimensional space. Currently, you are forced to perform an artificial rowscanning procedure just to interface with matrix libraries. Scipy can help with this kind of thing. I do not see cvxopt as a competing array implementation. At some point, hopefully cvxopt will be integrated with scipy. I am continually looking for feasible ways to make scipy more attractive to contributors. Everybody benefits when their is a standard infrastructure. For example, there are sparse matrices in SciPy. If cvxopt has better sparse matrix objects, I would love to use them. Hopefully, the array interface will assist on a more abstract scale so that memory reuse can occur for at least the dense cvxopt matrices.
It now seems very difficult for me to end up with a single numeric/matrix package that makes it into core python  which is at the
same time very sad.
Their are several issues here. But, yes a Matrix object will always be a separate object just as quaternions should be because they represent an interpretation to a memory block. In Numeric/numarray the focus is on generic multidimensional arrays. Therefore numeric operators must be elementby element. Note that Numeric does have a Matrix object that allows you to use '*' to represent matrix multiplication. It's only problem is that passing this object to a function usually returns an array again instead of a Matrix. Travis
On Tue, 20050726 at 09:30 0700, Chris Barker wrote:
Soeren Sonnenburg wrote:
Hmmhh. I see that this again breaks with R/octave/matlab. One should not do so if there is no serious reason. It just makes life harder for every potential convert from any of these languages.
If you're looking for a matlab clone, use octave or psilab, or....
No I am not, I am sold to python.
Speaking as an exmatlab user, I much prefer the NumPy approach. The reason is that I very rarely did linear algebra, and generally used matlab as a general computational environment. I got very tired of having to type that "." before all my operators. I also love array broadcasting, it seems so much cleaner and efficient.
Well coming from machine learning I am used to having just matrices.
When I moved from Matlab to NumPy, I missed these things:
Integrated plotting: many years later, this is still weak, but at least for 2d matplotlib is wonderful.
I agree here. [no single numeric package in python]
But I'm not sure we should be sad about it. What we all want is the package best suited to our own needs to be in the standard library. However, I'd rather the current situation than having a package poorly suited to my needs in the standard library. As this thread proves, there is no one kind of array package that would fit even most people's needs.
However, there is some promise for the future:
1) SciPy base may serve to unify Numeric/numarray
2) Travis has introduced the "array interface"
http://numeric.scipy.org/array_interface.html
this would provide an efficient way for the various array and matrix packages to exchange data. It does have a hope of making it into the standard library, though even if it doesn't, if a wide variety of addon packages use it, then the same thing is accomplished. If fact, if packages that don't implement an array type, but might use one (like PIL, wxPython, etc) accept this interface, then any package that implements it can be used together with them.
3) What about PEP 225: Elementwise/Objectwise Operators? It's listed under:
Deferred, Abandoned, Withdrawn, and Rejected PEPs
Which of those applied here? I had high hope for it one time.
By the way, I had not seen cvxopt before, thanks for the link. Perhaps it would serve as a better base for a fullfeatured linear algebra package than Numeric/numarray. Ideally, it could use the array interface, for easy exchange of data with other packages.
Actually that would be nice. Having had a closer look at cvxopt that might suit *my* needs more, but having an interface to get cvxopt matrices > numarray/numeric without trouble to get certain potentially missing functionality would be nice. Thanks, Soeren
On Wed, 20050727 at 02:02 0600, Travis Oliphant wrote:
Soeren Sonnenburg wrote:
I am realizing that this must have been why cvxopt switched away from numarray/numeric. There slicing/indexing and '*' work as I would have expected:
cvxopt uses it's own classes because they did not feel that a general purpose array was needed. They wanted to define a matrix class with sparse matrix and dense matrix subclasses. In fact, cvxopt's matrix classes can not be used as ubiquitously as Numeric/numarray arrays. Everything is not a matrix. In fact, I would like to see more general linear algebra routines that allow people to more naturally deal with (for example) sixdimensional linear operators mapping from a threedimensional space to a threedimensional space. Currently, you are forced to perform an artificial rowscanning procedure just to interface with matrix libraries. Scipy can help with this kind of thing.
Hmmhh,
I do not see cvxopt as a competing array implementation. At some point, hopefully cvxopt will be integrated with scipy. I am continually looking for feasible ways to make scipy more attractive to contributors. Everybody benefits when their is a standard infrastructure. For example, there are sparse matrices in SciPy. If cvxopt has better sparse matrix objects, I would love to use them. Hopefully, the array interface will assist on a more abstract scale so that memory reuse can occur for at least the dense cvxopt matrices.
I guess we now observe the different communities different expectations problem :/ In any case I agree that a standard infrastructure is very desirable. However it might come at a cost one might not want to pay, but still at least conversion functions from say cvxopt <> numarray are worth spending time on.
It now seems very difficult for me to end up with a single numeric/matrix package that makes it into core python  which is at the
same time very sad.
Their are several issues here. But, yes a Matrix object will always be a separate object just as quaternions should be because they represent an interpretation to a memory block. In Numeric/numarray the focus is on generic multidimensional arrays. Therefore numeric operators must be elementby element.
OK.
Note that Numeric does have a Matrix object that allows you to use '*' to represent matrix multiplication. It's only problem is that passing this object to a function usually returns an array again instead of a Matrix.
So the cvxopt approach is pretty valid, doing everything for matrices as they do, but allowing other types as 'int' etc.. Soeren
On Tue, 20050726 at 09:37 0400, Perry Greenfield wrote:
On Jul 26, 2005, at 12:35 AM, Soeren Sonnenburg wrote: [...]
Hmmhh while we are at it, is there work on a consensus num* ? I've seen the PEP for inclusion of numarray, now I see numeric3 and scipy and also cvxopt  are people actually converging towards a single num* package ?
That's what the goal of Numeric3 is as far as different underlying array engines. But if you mean a merging of the array/matrix viewpoints, I think you've answered your own question.
As Alan and Travis suggested Matrices scipy.mat, one could have matrices more or less independently of num* if one could convert them back and forth. Doing so would even allow for doing things differently, like getting submatrices, allowing for real matrixmatrix multiplication etc. However that of course kills consistency. Soeren
On Mon, 20050725 at 22:39 0700, Robert Kern wrote:
Soeren Sonnenburg wrote:
[snip]
In my eyes 'array broadcasting' is confusing and should rather be in a function like meshgrid and instead a*b should return matrixmultiply(a,b) ...
Spend some time with it. It will probably grow on you. Numeric is not [...] Again, Numeric is a package for arrays, not just linear algebra. Please spend some more time with Python and Numeric before deciding that they must be changed to match your preconceptions.
I realize that I just need plain matrices and operations on them so I am probably better off with cvxopt at that point.
I realize that with lists it is ok to grow them via slicing.
x=[] x[0]=1 IndexError: list assignment index out of range x[0:0]=[1] x [1]
that seems not to work with numarray ... or ?
y=array() y[0]=1 TypeError: object does not support item assignment y[0:0]=array([1]) TypeError: object does not support item assignment
Python lists are designed to grow dynamically. Their memory is preallocated so that growing them is on average pretty cheap. Numeric arrays are not, nor will they be.
Well I don't claim that this is/must be efficient. It is just what I would have expected to work if I use standard python arrays and now numarrays. Soeren.
participants (6)

Alan G Isaac

Chris Barker

Perry Greenfield

Robert Kern

Soeren Sonnenburg

Travis Oliphant