any better way to normalise a matrix
in my code i am trying to normalise a matrix as below

    mymatrix=matrix(..# items are of double type..can be negative values....)
    numrows,numcols=mymatrix.shape
    for i in range(numrows):
        temp=mymatrix[i].max()
        for j in range(numcols):
            mymatrix[i,j]=abs(mymatrix[i,j]/temp)

# i am using abs() to make neg vals positive

this way the code takes too much time to run for a case of numrows=25, and numcols=8100 etc.. being a beginner in numpy and python, i would like to know if this can be done more efficiently..can anyone advise?

thanx
dn
On 27/12/2007, devnew@gmail.com
in my code i am trying to normalise a matrix as below
    mymatrix=matrix(..# items are of double type..can be negative values....)
    numrows,numcols=mymatrix.shape
    for i in range(numrows):
        temp=mymatrix[i].max()
        for j in range(numcols):
            mymatrix[i,j]=abs(mymatrix[i,j]/temp)
# i am using abs() to make neg vals positive
this way the code takes too much time to run for a case of numrows=25, and numcols=8100 etc.. being a beginner in numpy and python ,i would like to know if this can be done more efficiently..can anyone advise?
Yes. A major advantage - really the reason for existence - of numpy is that you can do operations to whole matrices at once in a single instruction, without loops. This is convenient from a coding point of view, and it also allows numpy to greatly accelerate calculations. Writing code the way you have above takes no advantage of numpy's machinery, and it would probably run faster using python lists than numpy arrays.

I strongly recommend you read some numpy documentation; try starting with the tutorial: http://www.scipy.org/Tentative_NumPy_Tutorial

For example, to extract an array containing the maxima of each row of mymatrix, you can use the amax() function:

    temp = numpy.amax(mymatrix, axis=1)

Multiplying a whole matrix by a constant can be done simply as well:

    7*mymatrix

A final comment is that numpy is designed for computations with multidimensional data. Its basic data type is the array. It has a specialized data type called "matrix" which provides slightly more convenient matrix multiplication (in the sense of linear algebra). I recommend you do not use numpy's matrix class until you are more accustomed to the software.

Read the documentation. Start with the tutorial.

Good luck.
Anne
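As a rough illustration of the two whole-array operations mentioned above, here is a minimal sketch. It assumes a plain ndarray rather than the matrix class, uses Python 3 syntax, and the sample values and variable names are made up for the example.

```python
import numpy as np

# Hypothetical sample data: 2 rows, 3 columns, with some negative values.
a = np.array([[1.0, -4.0, 2.0],
              [-3.0, 0.5, 6.0]])

# Maximum of each row, computed in one call instead of a Python loop.
row_maxima = np.amax(a, axis=1)

# Multiplying every element by a constant, again without a loop.
scaled = 7 * a

print(row_maxima)
print(scaled)
```

With an ndarray, the row maxima come back as a 1-d array with one entry per row.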
try starting with the tutorial: http://www.scipy.org/Tentative_NumPy_Tutorial
For example, to extract an array containing the maxima of each row of mymatrix, you can use the amax() function:
temp = numpy.amax(mymatrix, axis=1)
Thanks. I had a tough time finding the functions; I will go through the tutorial.

dn
Anne had it right -- much of the point of numpy is to use nd-arrays as the powerful objects they are - not just containers. Below is a version of your code for comparison.

Note to numpy devs: I like the array methods a lot -- is there any particular reason there is no ndarray.abs(), or has it just not been added?

-Chris

    #!/usr/bin/env python
    """
    Simple example of normalizing an array
    """
    import numpy as N
    from numpy import random

    mymatrix = random.uniform(-100, 100, (3, 4))
    print("before:", mymatrix)

    # keep an untouched copy for the vectorized version
    mymatrix2 = mymatrix.copy()

    numrows, numcols = mymatrix.shape
    for i in range(numrows):
        temp = mymatrix[i].max()
        for j in range(numcols):
            mymatrix[i, j] = abs(mymatrix[i, j] / temp)

    print("old way:", mymatrix)

    ## "vectorized" way:
    # the "reshape" is a bit awkward, but it makes the 1-d result the right
    # shape to "broadcast" to the original array
    row_max = mymatrix2.max(axis=1).reshape((-1, 1))
    print(row_max)
    mymatrix2 = N.absolute(mymatrix2 / row_max)

    print("vectorized:", mymatrix2)

    if (mymatrix == mymatrix2).all():
        print("They are the same")

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker@noaa.gov
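For readers on a current NumPy, the reshape step can be avoided with the keepdims argument of max(). The following is only a sketch of the same row-wise normalisation under that assumption. The function name normalise_rows and the seeded random test data are hypothetical, chosen to match the 25 x 8100 case from the original question.

```python
import numpy as np


def normalise_rows(a):
    """Divide each row by its maximum and take absolute values.

    Hypothetical helper: mirrors the loop in the original post,
    but operates on the whole array at once.
    """
    a = np.asarray(a, dtype=float)
    # keepdims=True keeps the result as shape (numrows, 1),
    # so it broadcasts against the (numrows, numcols) array.
    row_max = a.max(axis=1, keepdims=True)
    return np.absolute(a / row_max)


# Made-up data of the size mentioned in the question.
rng = np.random.default_rng(0)
data = rng.uniform(-100, 100, (25, 8100))
result = normalise_rows(data)
print(result.shape)
```

As in the original loop, each row is divided by its maximum value, not by its largest absolute value, so an entry can come out greater than 1 when a negative element has a larger magnitude than the row maximum.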
On 28/12/2007, Christopher Barker
I like the array methods a lot -- is there any particular reason there is no ndarray.abs(), or has it just not been added?
Here I have to disagree with you. Numpy provides ufuncs as general powerful tools for operating on matrices. More can be added relatively easily, they provide not just the basic "apply" operation but also "outer" and others. Adding another way to accomplish the same operation just adds bulk to numpy. I don't even like myarray.max(), preferring numpy.amax() or numpy.maximum.reduce.

I realize we can't remove any array methods, but I don't think we should add an array method for every unary function - surely you don't want to see myarray.arctan()?

Anne
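To make the comparison concrete, here is a small sketch of the three spellings of a row maximum mentioned above, plus one use of a ufunc's "outer" method. The sample arrays are invented for illustration, and the code assumes Python 3.

```python
import numpy as np

# Hypothetical sample data.
a = np.array([[1.0, -4.0, 2.0],
              [-3.0, 0.5, 6.0]])

# Three ways to get the maximum of each row.
via_method = a.max(axis=1)                  # array method
via_function = np.amax(a, axis=1)           # module-level function
via_reduce = np.maximum.reduce(a, axis=1)   # method of the maximum ufunc

# "outer" applies a binary ufunc to every pair of elements.
table = np.multiply.outer(np.array([1, 2, 3]), np.array([10, 20]))

print(via_method, via_function, via_reduce)
print(table)
```

All three row-maximum calls should give the same values; which one reads best is the matter of taste being debated in this thread.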
Anne Archibald wrote:
Numpy provides ufuncs as general powerful tools for operating on matrices. More can be added relatively easily, they provide not just the basic "apply" operation but also "outer" and others. Adding another way to accomplish the same operation just adds bulk to numpy.
Maybe so, but if it were up to me, I'd have just the methods, and not the functions -- I like namespaces and OO design. I've always thought it was a spurious argument that these kinds of functions can operate on things that aren't numpy arrays -- it's true, but they all return arrays, so it's not like they really are universal.
Surely you don't want to see myarray.arctan()?
This was debated a fair bit on this list when the other methods were added. Personally, I like functions, rather than methods for things that "feel" like standard math -- trig functions, etc. I guess abs() kind of falls on the line. All I know is that I expected abs() to be a method this time.

-Chris
On 28/12/2007, Christopher Barker
Anne Archibald wrote:
Numpy provides ufuncs as general powerful tools for operating on matrices. More can be added relatively easily, they provide not just the basic "apply" operation but also "outer" and others. Adding another way to accomplish the same operation just adds bulk to numpy.
Maybe so, but if it were up to me, I'd have just the methods, and not the functions -- I like namespaces and OO design. I've always thought it was a spurious argument that these kinds of functions can operate on things that aren't numpy arrays -- it's true, but they all return arrays, so it's not like they really are universal.
It doesn't make a whole lot of sense to me to have n-ary methods (though I don't think we currently have any ternary ufuncs). How would you accommodate all the other operations ufuncs support?

The big problem with methods, it seems to me, is that if I want to add a new ufunc, it's relatively easy. In fact, with vectorize() I do that all the time (with all sorts of arities). But adding compiled ufuncs in a new module is relatively straightforward; it's not at all clear how a separate package could ever add new methods to the ndarray object. This is not hypothetical - scipy.special introduces many mathematical functions that either are or should be ufuncs.

As for whether this is "object-oriented", well, ufuncs are objects with a variety of methods; one could productively inherit from them (at least in principle, though I've never had occasion to).

Anyway, I don't know that we need to rehash the old discussion. I just wanted to point out the value of ufuncs as objects.

Anne
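As a sketch of how user code can add element-wise functions without touching the ndarray class, the example below wraps an ordinary two-argument Python function with numpy.vectorize and with numpy.frompyfunc. The function clipped_ratio and the sample arrays are hypothetical, invented only for this illustration. Note that numpy.vectorize returns a broadcasting wrapper rather than a true ufunc, while numpy.frompyfunc returns a ufunc object that works on object arrays.

```python
import numpy as np


def clipped_ratio(x, limit):
    """Hypothetical scalar function: |x| / limit, capped at 1.0."""
    return min(abs(x) / limit, 1.0)


# Wrap the scalar function so it broadcasts over arrays.
vec_ratio = np.vectorize(clipped_ratio)

values = np.array([[-4.0, 2.0],
                   [1.0, -8.0]])
limits = np.array([[4.0],
                   [2.0]])          # one limit per row, broadcast across columns
result = vec_ratio(values, limits)

# frompyfunc(func, nin, nout) builds an actual ufunc object,
# so ufunc methods such as "outer" are available on it.
as_ufunc = np.frompyfunc(clipped_ratio, 2, 1)
outer_table = as_ufunc.outer(np.array([-4.0, 2.0]),
                             np.array([4.0, 2.0])).astype(float)

print(result)
print(outer_table)
```

Both wrappers still call the Python function once per element, so they are a convenience for broadcasting rather than a speed-up comparable to compiled ufuncs.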
participants (3)
- Anne Archibald
- Christopher Barker
- devnew@gmail.com