array vs matrix, converting code from matlab
Dear numpy users,

I am converting some code from matlab to numpy/scipy, but I am still a bit confused by numpy, mostly by the array vs matrix issue. Looking at the scipy website, the matrix type looks the closest to matlab syntax, but I still have some questions:

- Under matlab, everything, including scalars, is a matrix in the matlab sense. In python, they are not. So if I want to handle the scalar case in a function which takes arrays, what should I do ? Having a special case for scalars sounds like a pain, so is asarray/asmatrix the best way to handle those cases, so that my function only deals with array types ?
- What is the difference between matrix and array, except syntax ? If I want to handle both in one function, what is the "best" method ? Using one type only (for example matrix), and calling asmatrix on all arguments accordingly ?

To convert my matlab code, I was thinking about using asmatrix on the arguments in all my functions, but I am not sure this is really the "right" way. I was hoping some other people would have experience with the same issues and could give me some general advice.

thank you,
David

P.S: it would be great to have this kind of information on the scipy website; right now, the scipy for matlab users part is a bit sparse... I am willing to change this once I understand the problem myself, of course :)
Hi! All,

I am also in the same process. And I would like to add one more question: in Matlab, for a 3D array or matrix, the indexing is a(i,j,k). In numpy, it becomes a[k-1,i-1,j-1]. Is there any way to make it a[i-1,j-1,k-1]? Or am I doing something wrong here?

Gen

On Apr 20, 2006, at 3:38 AM, David Cournapeau wrote:
Dear numpy users,
I am converting some code from matlab to numpy/scipy, but still a bit confused by numpy, mostly the array vs matrix issue. Looking at the scipy website, the matrix type looks the closest to matlab syntax, but I still have some issues:
- under matlab, everything, including scalars, are matrices in the matlab sense. In python, they are not. So, if I want to handle the scalar case in a function which takes arrays, what should I do ? Having a special case for scalars sounds like a pain, so is asarray/asmatrix the best way to handle those cases so my function only deals with array types ? - what is the difference between matrix and array, except syntax ? If I want to handle both in one function, what is the "best" method ? Using one type only (for example matrix), and using asmatrix on all arguments accordingly ?
To convert my matlab code, I was thinking about using asmatrix for arguments in all my functions, but I am not sure this is really the "right" way. I was hoping some other people would have some experience with the same issues, and could give me some general advice
thank you,
David
P.S: it would be great to have this kind of information on scipy website; right now, the scipy for matlab users part is a bit sparse... I am willing to change this once I understand the problem myself, of course:)
_______________________________________________ SciPy-user mailing list SciPy-user@scipy.net http://www.scipy.net/mailman/listinfo/scipy-user
Gennan Chen wrote:
Hi! All,
I am also in the same process. And I would like to add one more question:
In Matlab, for a 3D array or matrix, the indexing is a(i,j,k). In numpy, it becomes a[k-1,i-1,j-1]. Is there any way to make it a[i-1,j-1,k-1]? Or am I doing something wrong here?
In NumPy, arrays and matrices are by default in C-contiguous order, so that the last index varies the fastest. Matlab is based on Fortran originally and defines arrays in Fortran-contiguous order (the first index varies the fastest as you walk linearly through memory). The only time this really matters is if you are interfacing with some compiled code. Otherwise, how you think about indexing is up to you and how you define the array.

So, to make it a[i-1,j-1,k-1] you need to reshape the array from the way you defined it in MATLAB. It really does just depend on how you define things. Perhaps you could give a specific example so we could be more specific on how you would write the same thing in NumPy.

-Travis
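To make the order difference concrete, a short sketch with present-day NumPy (the `order` keyword selects Fortran-style traversal):

```python
import numpy as np

a = np.arange(6).reshape(2, 3)   # C order: the last index varies fastest

print(a.ravel())                 # [0 1 2 3 4 5]   walk along rows
print(a.ravel(order='F'))        # [0 3 1 4 2 5]   walk down columns, MATLAB-style
```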
Hi! Travis, Let's start with an example under matlab:
```matlab
d = [0:23]
k = reshape(d, 3,4,2)

k(:,:,1) =

     0     3     6     9
     1     4     7    10
     2     5     8    11

k(:,:,2) =

    12    15    18    21
    13    16    19    22
    14    17    20    23
```
under numpy:
```python
In [2]: d = numpy.asarray(range(0, 24), numpy.float32)

In [3]: d
Out[3]:
array([  0.,   1.,   2.,   3.,   4.,   5.,   6.,   7.,   8.,   9.,  10.,
        11.,  12.,  13.,  14.,  15.,  16.,  17.,  18.,  19.,  20.,  21.,
        22.,  23.], dtype=float32)

In [4]: k = d.reshape(3,4,2)

In [5]: k
Out[5]:
array([[[  0.,   1.],
        [  2.,   3.],
        [  4.,   5.],
        [  6.,   7.]],

       [[  8.,   9.],
        [ 10.,  11.],
        [ 12.,  13.],
        [ 14.,  15.]],

       [[ 16.,  17.],
        [ 18.,  19.],
        [ 20.,  21.],
        [ 22.,  23.]]], dtype=float32)
```

So, if I want to port my Matlab code, I need to pay attention to this. And since I have a lot of C/C++ mexing code in Matlab, I need to fix not just the 1-based vs 0-based issue but also this indexing issue. Any chance I can make the indexing work like Matlab's way? Or should I just hunker down...

Gen-Nan Chen, PhD
Chief Scientist
Research and Development Group
CorTechs Labs Inc (www.cortechs.net)
1020 Prospect St., #304, La Jolla, CA, 92037
Tel: 1-858-459-9700 ext 16
Fax: 1-858-459-9705
Email: gnchen@cortechs.net

On Apr 20, 2006, at 11:28 AM, Travis Oliphant wrote:
Gennan Chen wrote:
Hi! All,
I am also in the same process. And I would like to add one more question:
In Matlab, for a 3D array or matrix, the indexing is a(i,j,k). In numpy, it becomes a[k-1,i-1,j-1]. Is there any way to make it a[i-1,j-1,k-1]? Or am I doing something wrong here?
In NumPy, arrays and matrices are by default in C-contiguous order so that the last index varies the fastest. Matlab is based on Fortran originally and defines arrays in Fortran-contiguous order (the first index varies the fastest as you walk linearly through memory). The only time this really matters is if you are interfacing with some compiled code. Otherwise, how you think about indexing is up to you and how you define the array.
So, to make it a[i-1,j-1,k-1] you need to reshape the array from the way you defined it in MATLAB. It really does just depend on how you define things. Perhaps you could give a specific example so we could be more specific on how you would write the same thing in NumPy.
-Travis
Gennan Chen wrote:
Hi! Travis,
Let's start with an example under matlab:
```matlab
d = [0:23]
k = reshape(d, 3,4,2)

k(:,:,1) =

     0     3     6     9
     1     4     7    10
     2     5     8    11

k(:,:,2) =

    12    15    18    21
    13    16    19    22
    14    17    20    23
```
This is a FORTRAN-order reshaping. The linear sequence of values is reshaped into an array by varying the first index (the row) the fastest. NumPy, by default, uses C-contiguous order so that the last index varies the fastest as it places elements in the array. NumPy does have support for the FORTRAN-order, but there are a few constructs that don't support it: (arr.flat iterators are always in C-contiguous order and a.shape = (3,4,2) always assumes C-contiguous order for advancing through the elements). I don't know of anyone who has used the FORTRAN support to successfully convert MATLAB code so unless you want to be a guinea pig, you might want to just hunker down and convert to C-contiguous order. Alternatively you can just re-think the shape of your arrays in reverse: i.e. instead of creating a 3,4,2 array, create a 2,4,3 array and reverse all your indices.
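For instance, the example above can be reproduced exactly with the Fortran-order support (a sketch; `order='F'` is the keyword in current NumPy):

```python
import numpy as np

d = np.arange(24, dtype=np.float32)

# Fill the (3, 4, 2) array varying the first index fastest,
# like MATLAB's reshape(d, 3, 4, 2):
k = d.reshape(3, 4, 2, order='F')

# k[:, :, 0] now matches MATLAB's k(:,:,1):
assert np.array_equal(k[:, :, 0],
                      [[0, 3, 6, 9], [1, 4, 7, 10], [2, 5, 8, 11]])
```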
So, if I want to port my Matlab code, I need to pay attention to this. And since I have a lot of C/C++ mexing code in Matlab, I need to fix not just the 1-based vs 0-based issue but also this indexing issue. Any chance I can make the indexing work like Matlab's way? Or should I just hunker down...
As far as the indexing is concerned, the only way to alter it is to implement a new class that subtracts 1 from all the indices. You could also define "end" as a simple object and, when you see it, replace it with the size of the corresponding dimension of the array. Such a thing is possible (you could also implement it to reverse the order of all your indices and simulate a matlab-style array).

-Travis
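A minimal sketch of such a class (the name `MatArray` and the integer-only index handling are illustrative assumptions; slices and `end` would need more work):

```python
import numpy as np

class MatArray:
    """Hypothetical wrapper giving MATLAB-style 1-based indexing.
    A sketch only -- not a complete ndarray subclass."""

    def __init__(self, arr):
        self._arr = np.asarray(arr)

    def __getitem__(self, index):
        if not isinstance(index, tuple):
            index = (index,)
        # Subtract 1 from every integer index; slices and fancy
        # indexing would need similar treatment for real use.
        shifted = tuple(i - 1 if isinstance(i, int) else i for i in index)
        return self._arr[shifted]

a = MatArray(np.arange(24).reshape(2, 4, 3))
print(a[1, 1, 1])  # the element NumPy calls a[0, 0, 0]
```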
Hi! Travis,

Thanks for your suggestion. In fact, that is what I do: creating a class and using __getitem__ to swap the indices. Unfortunately, in medical imaging, the index (i,j,k) is always related to the FORTRAN way.

Gen

On Apr 20, 2006, at 1:44 PM, Travis Oliphant wrote:
Gennan Chen wrote:
Hi! Travis,
Let's start with an example under matlab:
```matlab
d = [0:23]
k = reshape(d, 3,4,2)

k(:,:,1) =

     0     3     6     9
     1     4     7    10
     2     5     8    11

k(:,:,2) =

    12    15    18    21
    13    16    19    22
    14    17    20    23
```
This is a FORTRAN-order reshaping. The linear sequence of values is reshaped into an array by varying the first index (the row) the fastest.
NumPy, by default, uses C-contiguous order so that the last index varies the fastest as it places elements in the array.
NumPy does have support for the FORTRAN-order, but there are a few constructs that don't support it: (arr.flat iterators are always in C-contiguous order and a.shape = (3,4,2) always assumes C-contiguous order for advancing through the elements). I don't know of anyone who has used the FORTRAN support to successfully convert MATLAB code so unless you want to be a guinea pig, you might want to just hunker down and convert to C-contiguous order. Alternatively you can just re-think the shape of your arrays in reverse: i.e. instead of creating a 3,4,2 array, create a 2,4,3 array and reverse all your indices.
So, if I want to port my Matlab code, I need to pay attention to this. And Since I have a lot of C/C++ mexing code in Matlab, I need to fix not just 1-0 based but also indexing issue here. Any chance I can make the indexing like the Matlab's way? Or I should just hunker down...
As far as the indexing is concerned, the only way to alter it is to implement a new class that subtracts 1 from all the indices. You could also define "end" as a simple object and, when you see it, replace it with the size of the corresponding dimension of the array. Such a thing is possible (you could also implement it to reverse the order of all your indices and simulate a matlab-style array).
-Travis
Gennan Chen wrote:
Hi! All,
I am also in the same process. And I would like to add one more question:
In Matlab, for a 3D array or matrix, the indexing is a(i,j,k). In numpy, it becomes a[k-1,i-1,j-1]. Is there any way to make it a[i-1,j-1,k-1]? Or am I doing something wrong here?
To be more specific, I am trying to convert a function which computes multivariate Gaussian densities. It should be able to handle the scalar case, the case where the mean is a vector, and the case where va is a vector (diagonal covariance matrix) or a square matrix (full covariance matrix). So, in matlab, I simply do:

```matlab
function [n, d, K, varmode] = gaussd_args(data, mu, var)

[n, d] = size(data);
[dm0, dm1] = size(mu);
[dv0, dv1] = size(var);
```

And I check that the dimensions are what I expect afterwards. Using arrays, I don't see a simple way to do that while passing scalar arguments to the functions. So either I should be using the matrix type (and using asmatrix on the arguments), or I should never pass scalars to the function, and always pass arrays. But maybe I've used matlab too much, and there is a much simpler way to do that in scipy. To sum it up, what is the convention in scipy when a function handles both scalars and arrays ? Is there an idiom to treat scalars and arrays of size 1 the same way, whatever the number of dimensions the arrays may have ?

David
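For what it's worth, a rough NumPy sketch of that size-probing (hypothetical code: atleast_2d promotes everything to 2-D, so scalars, vectors, and matrices all report two dimensions, much like MATLAB's size):

```python
import numpy as np

def gaussd_args(data, mu, var):
    # Promote everything to 2-D, mimicking MATLAB where even a
    # scalar has size [1, 1]:
    data, mu, var = np.atleast_2d(data, mu, var)
    n, d = data.shape
    return n, d, mu.shape, var.shape

print(gaussd_args(np.ones((5, 2)), np.zeros(2), 1.0))
# → (5, 2, (1, 2), (1, 1)): the scalar became (1, 1), the vector (1, 2)
```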
David Cournapeau wrote:
To sum it up, what is the convention in scipy when a function handles both scalar and arrays ? Is there an idiom to treat scalar and arrays of size 1 the same way, whatever the number of dimensions arrays may have ?
Very frequently, you can simply rely on the array broadcasting of the ufuncs and basic operations to do the work for you. I can't find a simple description of the broadcasting rules on the Web at the moment (big opportunity for a Wiki page), but very basically:

```python
In [1]: from numpy import *

In [2]: a = arange(20).reshape((4,5))

In [3]: a
Out[3]:
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])

In [4]: a + 10
Out[4]:
array([[10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24],
       [25, 26, 27, 28, 29]])
```

If you really do want scalars to be treated as arrays of size 1 (what dimensionality?), then you can usually use one of the atleast_* functions:

```python
In [5]: atleast*?
atleast_1d
atleast_2d
atleast_3d

In [6]: atleast_1d(10)
Out[6]: array([10])

In [7]: atleast_2d(10)
Out[7]: array([[10]])

In [8]: atleast_3d(10)
Out[8]: array([[[10]]])
```

--
Robert Kern
robert.kern@gmail.com

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
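Beyond the scalar case, the same mechanism lines up trailing dimensions, so a vector broadcasts against a 2-D array without any explicit tiling (a small sketch for completeness):

```python
import numpy as np

a = np.arange(20).reshape(4, 5)
row = np.arange(5)                 # shape (5,): added to every row
col = np.arange(4).reshape(4, 1)   # shape (4, 1): added to every column

print((a + row).shape)   # (4, 5)
print((a + col).shape)   # (4, 5)
```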
Robert Kern wrote:
David Cournapeau wrote:
To sum it up, what is the convention in scipy when a function handles both scalar and arrays ? Is there an idiom to treat scalar and arrays of size 1 the same way, whatever the number of dimensions arrays may have ?
Very frequently, you can simply rely on the array broadcasting of the ufuncs and basic operations to do the work for you. I can't find a simple description of the broadcasting rules on the Web at the moment (big opportunity for a Wiki page), but very basically:
```python
In [1]: from numpy import *

In [2]: a = arange(20).reshape((4,5))

In [3]: a
Out[3]:
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])

In [4]: a + 10
Out[4]:
array([[10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24],
       [25, 26, 27, 28, 29]])
```
I understand those cases; this is pretty similar to matlab, so I am used to it. But my problem is different (or maybe not?)
If you really do want scalars to be treated as arrays of size 1 (what dimensionality?), then you can usually use one of the atleast_* functions:
This looks exactly like what I am looking for. My problem for my function is the following (pseudo code):

```
foo(x, mu, va):
    if mu and va scalars:
        call scalar_implementation
        return result
    if mu and va rank 1:
        call scalar implementation on each element
    if mu rank 1 and va rank 2:
        call matrix implementation
```

and assume all arguments are always rank 2, even if they are "scalar" (size 1), a bit like in numpy.linalg, if I understood correctly (calling numpy.linalg.inv(1) does not work). It looks like those atleast* methods should do the work.

Actually, my problem is pretty similar to implementing a wrapper around numpy.linalg.inv which works in the scalar case and the rank 1 case (assuming rank 1 means diagonal). Are those atleast* functions expensive ? For small arrays, I don't care too much, but in the case of a big rank 1 array converted to a rank 2 array, do those functions need to copy the data ?

David
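A hypothetical sketch of such a wrapper (the name `inv_any` and the rank-1-means-diagonal convention are assumptions taken from the discussion, not an existing API):

```python
import numpy as np

def inv_any(va):
    """Dispatch on rank: scalar, diagonal (rank 1), or full matrix."""
    va = np.asarray(va, dtype=float)
    if va.ndim == 0:          # scalar "covariance"
        return 1.0 / va
    if va.ndim == 1:          # rank 1: diagonal covariance, invert elementwise
        return 1.0 / va
    return np.linalg.inv(va)  # rank 2: full covariance matrix

print(inv_any(2.0))                   # 0.5
print(inv_any(np.array([2.0, 4.0])))  # [0.5  0.25]
```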
David Cournapeau wrote:
Actually, my problem is pretty similar to implementing a wrapper around numpy.linalg.inv which works in the scalar case and the rank 1 case (assuming rank 1 means diagonal). Are those atleast* functions expensive ? For small arrays, I don't care too much, but in the case of a big rank 1 array converted to a rank 2 array, do those functions need to copy the data ?
No:

```python
def atleast_2d(*arys):
    """ Force a sequence of arrays to each be at least 2D.

         Description:
            Force an array to each be at least 2D.  If the array
            is 0D or 1D, the array is converted to a single
            row of values.  Otherwise, the array is unaltered.
         Arguments:
            arys -- arrays to be converted to 2 or more dimensional array.
         Returns:
            input array converted to at least 2D array.
    """
    res = []
    for ary in arys:
        res.append(array(ary, copy=False, ndmin=2))
    if len(res) == 1:
        return res[0]
    else:
        return res
```

--
Robert Kern
robert.kern@gmail.com
On 4/21/06, David Cournapeau <david@ar.media.kyoto-u.ac.jp> wrote:
Robert Kern wrote:
David Cournapeau wrote:
This looks exactly like what I am looking for. My problem for my function is the following (pseudo code):
foo(x, mu, va):
```
if mu and va scalars:
    call scalar_implementation
    return result
if mu and va rank 1:
    call scalar implementation on each element
if mu rank 1 and va rank 2:
    call matrix implementation
```
To handle the first two cases (scalar, and call scalar on every element), you should be able to use 'numpy.frompyfunc' to create a version of your scalar function that automatically works that way.

```python
def plusone(v):
    return v + 1

uf = numpy.frompyfunc(plusone, 1, 1)

>>> uf(1)
2
>>> uf([1,2,3])
array([2, 3, 4], dtype=object)
```
and assume all arguments are always rank 2, even if they are "scalar"
(size 1), a bit like in numpy.linalg, if I understood correctly (calling numpy.linalg.inv(1) does not work). It looks like those atleast* methods should do the work.
Actually, my problem is pretty similar to implementing a wrapper around numpy.linalg.inv which works in the scalar case and the rank 1 case (assuming rank 1 means diagonal). Are those atleast* functions expensive ? For small arrays, I don't care too much, but in the case of a big rank 1 array converted to a rank 2 array, do those functions need to copy the data ?
Looks like atleast_2d doesn't copy the data, so yes, it should be fast.

```python
>>> a = numpy.array([1,2,3,4])
>>> b = numpy.atleast_2d(a)
>>> a
array([1, 2, 3, 4])
>>> b
array([[1, 2, 3, 4]])
>>> a[1] = 0
>>> a
array([1, 0, 3, 4])
>>> b
array([[1, 0, 3, 4]])
```

--
William V. Baxter III
OLM Digital
Kono Dens Building Rm 302
1-8-8 Wakabayashi Setagaya-ku
Tokyo, Japan 154-0023
+81 (3) 3422-3380
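With a more recent NumPy, the no-copy behaviour can also be checked directly (np.shares_memory is a later addition to the library):

```python
import numpy as np

a = np.arange(4)
b = np.atleast_2d(a)

print(np.shares_memory(a, b))  # True: b is a view onto a's data, no copy made
```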
By the way, I'd be interested in an n-dimensional Gaussian function for NumPy/SciPy too. Anyone else interested in machine learning and/or bayesian methods? A port of Netlab (http://www.ncrg.aston.ac.uk/netlab/index.php) to SciPy would be great. --Bill
Bill Baxter wrote:
By the way, I'd be interested in an n-dimension Gaussian function for NumPy/SciPy too.
Anyone else interested in machine learning and or bayesian methods? A port of Netlab ( http://www.ncrg.aston.ac.uk/netlab/index.php) in SciPy would be great.

Actually, I am porting code for Gaussian Mixture Models with batch and online EM. I am first doing a pure python version to get an idea of scipy's capabilities, and then I intend to create the stub for a C implementation (which already exists for matlab, the core being independent of matlab). I am hoping to have a much cleaner and more extensible implementation (ie using other pdfs, and why not more general models) using the python language's capabilities (modules, inheritance, etc...).
I think porting netlab would be a huge task, and quite difficult; there is also torch (http://www.torch.ch/) which may be interesting to use (C++ code, BSD license). Having a machine learning tool box would be a step forward for scipy, I guess. David
Yeh, I'm not so interested in having a PyNetlab per se, i.e. api-for-api equivalent, as much as having similar functionality. But I reckon copying the design and organization of something that already exists would be faster than doing it completely from scratch. The bits I use are GMM, Gaussian Processes, RBF networks, (P)PCA. But I guess that's probably the bulk of the code right there. Anyway, I can dream, can't I? :-) --bb On 4/21/06, David Cournapeau <david@ar.media.kyoto-u.ac.jp> wrote:
Bill Baxter wrote:
By the way, I'd be interested in an n-dimension Gaussian function for NumPy/SciPy too.
Anyone else interested in machine learning and or bayesian methods? A port of Netlab ( http://www.ncrg.aston.ac.uk/netlab/index.php) in SciPy would be great.

Actually, I am porting code for Gaussian Mixture Models with batch and online EM. I am first doing a pure python version to get an idea of scipy's capabilities, and then I intend to create the stub for a C implementation (which already exists for matlab, the core being independent of matlab). I am hoping to have a much cleaner and more extensible implementation (ie using other pdfs, and why not more general models) using the python language's capabilities (modules, inheritance, etc...).
I think porting netlab would be a huge task, and quite difficult; there is also torch (http://www.torch.ch/) which may be interesting to use (C++ code, BSD license). Having a machine learning tool box would be a step forward for scipy, I guess.
David
One thing... I'm not sure why you think porting Netlab to SciPy would be such a huge task. It's a big task, sure. Porting to C++ would definitely be a huge task. But I would think porting to another high-level language like python would be a one-summer job for a reasonably clueful grad student. It's only 9000 lines of code excluding comments and blank lines:

```
[BAXTER-PC<7>netlab]> egrep -v '(^%|^$)' *.m | wc
   8971   41422  361276
```

I think converting 100 lines a day for 90 days is not unreasonable. That includes all the demos too. If you leave out the demos it's about half that:

```
[BAXTER-PC<18>netlab]> egrep -v '(^%|^\s*$)' `ls *.m | grep -v '^dem'` | wc
   4171   18725  156628
```

Ok, maybe it's still a little unreasonable. Alright, maybe it's not a 1-man summer job. I've also ignored testing and converting the comments, but the task is also fairly parallelizable. Probably a little team of 3 eager new grad students could do a bang-up job over a summer.

--bb

On 4/21/06, David Cournapeau <david@ar.media.kyoto-u.ac.jp> wrote:
Bill Baxter wrote:
By the way, I'd be interested in an n-dimension Gaussian function for NumPy/SciPy too.
Anyone else interested in machine learning and or bayesian methods? A port of Netlab ( http://www.ncrg.aston.ac.uk/netlab/index.php) in SciPy would be great.

Actually, I am porting code for Gaussian Mixture Models with batch and online EM. I am first doing a pure python version to get an idea of scipy's capabilities, and then I intend to create the stub for a C implementation (which already exists for matlab, the core being independent of matlab). I am hoping to have a much cleaner and more extensible implementation (ie using other pdfs, and why not more general models) using the python language's capabilities (modules, inheritance, etc...).
I think porting netlab would be a huge task, and quite difficult; there is also torch ( http://www.torch.ch/) which may be interesting to use (C++ code, BSD license). Having a machine learning tool box would be a step forward for scipy, I guess.
David
Bill Baxter wrote:

One thing... I'm not sure why you think porting Netlab to SciPy would be such a huge task. It's a big task, sure. Porting to C++ would definitely be a huge task.

Well, the nice thing with C++ is that you can plug it directly into python using swig and hand-coded wrapping code. It is actually one reason why I want to move to python: wrapping C code for matlab is awful (there is no way to control the memory handler, for example), and things like swig or boost::python are much better (without even taking into account that C and python have the same convention for indexing and row-major ordering). As the code is BSD, I think the licenses are compatible with scipy. I think in a summer internship, you could write good swig or boost::python extensions to have the wrapping mostly automated. Porting from matlab to scipy involves porting/testing all the code, whereas using C++ code involves mostly glue code. But maybe I am underestimating the difficulty...

David
Torch does look pretty nice. Yeh, providing wrappers for torch may be easier (and result in faster code) as long as their data format is relatively sane. I'm not really sure how one goes about interfacing numpy.arrays with external code, but it's certainly possible, since that's how the bulk of SciPy was written (by calling on external fortran or C code, not sure about C++). [info about NumPy and SWIG here if you haven't seen it already: http://www.scipy.org/Cookbook/SWIG_and_NumPy] The other problem with my estimate on time to port Matlab code is that a figure like 4200 lines doesn't reveal the real cost if one of those lines happens to be a call to something like Matlab's nonlinear optimization routines or something else for which there is currently no numpy equivalent. I don't think there are /many/ of those gotchas in Netlab, but eigs() is one of them. As far as I know SciPy has no function to get just a few eigenvalues without having to find them all. Nothing prevents it from being added to SciPy (matlab's eigs is just a wrapper for the freely available ARPACK) it just hasn't been done yet. Anyway if you're just wrapping existing C++, you know you're not going to run into rats' nests like that. --bb On 4/21/06, David Cournapeau <david@ar.media.kyoto-u.ac.jp> wrote:
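(For reference, the ARPACK wrapper Bill describes did later land in SciPy as scipy.sparse.linalg.eigs; a minimal sketch of getting just a few eigenvalues without computing the full spectrum:)

```python
import numpy as np
from scipy.sparse.linalg import eigs  # ARPACK-based; a later SciPy addition

rng = np.random.default_rng(0)
m = rng.standard_normal((100, 100))

# Only the 3 largest-magnitude eigenvalues, like MATLAB's eigs(m, 3):
vals, vecs = eigs(m, k=3)
print(vals.shape, vecs.shape)   # (3,) (100, 3)
```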
Bill Baxter wrote:

One thing... I'm not sure why you think porting Netlab to SciPy would be such a huge task. It's a big task, sure. Porting to C++ would definitely be a huge task.

Well, the nice thing with C++ is that you can plug it directly into python using swig and hand-coded wrapping code. It is actually one reason why I want to move to python: wrapping C code for matlab is awful (there is no way to control the memory handler, for example), and things like swig or boost::python are much better (without even taking into account that C and python have the same convention for indexing and row-major ordering). As the code is BSD, I think the licenses are compatible with scipy. I think in a summer internship, you could write good swig or boost::python extensions to have the wrapping mostly automated.
Porting from matlab to scipy involve porting/testing all the code, whereas using C++ code involve mostly glue-code. But maybe I am underestimating the difficulty...
David
Bill Baxter wrote:
The other problem with my estimate on time to port Matlab code is that a figure like 4200 lines doesn't reveal the real cost if one of those lines happens to be a call to something like Matlab's nonlinear optimization routines or something else for which there is currently no numpy equivalent.
Have you looked at scipy.optimize?

--
Robert Kern
robert.kern@gmail.com
Bill Baxter wrote:

One thing... I'm not sure why you think porting Netlab to SciPy would be such a huge task. It's a big task, sure. Porting to C++ would definitely be a huge task.

David Cournapeau wrote:

Well, the nice thing with C++ is that you can plug it directly into python using swig and hand-coded wrapping code. It is actually one reason why I want to move to python: wrapping C code for matlab is awful (there is no way to control the memory handler, for example), and things like swig or boost::python are much better.

Hello all

Just thought I'd throw in my 2 cents. I recently started on my masters thesis, which will probably focus on the application of Gaussian Mixture Models and Support Vector Machines to the problem of speaker verification. I would be very interested in cooperating on any effort to implement or wrap GMM and SVM code for SciPy.

As far as SVM libraries go, I quite like SVM-Light and libsvm, which both have Python wrappers. Unfortunately these don't integrate with NumPy at the moment.

http://www.cs.cornell.edu/~tomf/svmpython/
http://www.csie.ntu.edu.tw/~cjlin/libsvm/

I'm also interested in implementing some feature extraction algorithms and compensation techniques, such as Mel-Frequency Cepstral Coefficients (MFCC), RASTA filtering and others. There's some very nice MATLAB code that can serve as a starting point for these efforts:

http://www.ee.columbia.edu/labrosa/matlab/rastamat/

Anybody else doing speech recognition/speaker verification/etc. with NumPy and SciPy?

Regards,

Albert
David Cournapeau wrote:
Gennan Chen wrote:
Hi! All,
I am also in the same process. And I would like to add one more question:
In Matlab, for a 3D array or matrix, the indexing is a(i,j,k). In numpy, it becomes a[k-1,i-1,j-1]. Is there any way to make it a[i-1,j-1,k-1]? Or am I doing something wrong here?
To be more specific, I am trying to convert a function which computes multivariate Gaussian densities. It should be able to handle the scalar case, the case where the mean is a vector, and the case where va is a vector (diagonal covariance matrix) or a square matrix (full covariance matrix).
Also, both of you might want to take a look at this page and its links: http://www.scipy.org/NumPy_for_Matlab_Users

--
Robert Kern
robert.kern@gmail.com
Robert Kern wrote:
Also, both of you might want to take a look at this page and its links:
I think the page lacks some key points, at least concerning these rank issues, where numpy and matlab are quite different (for example, nowhere is it said that indexing can give an array whose rank is different from the original array's). Once I am sure I've got my head around the whole issue, I will try to complete it accordingly.

David
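A concrete instance of the rank point (integer indexing drops a dimension, while a slice keeps it):

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

print(a[0].shape)    # (4,)   integer index: rank drops from 2 to 1
print(a[0:1].shape)  # (1, 4) slice: rank is preserved
```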
On 4/21/06, David Cournapeau <david@ar.media.kyoto-u.ac.jp> wrote:
Robert Kern wrote:
Also, both of you might want to take a look at this page and its links:
I think the page lacks some key points, at least concerning these rank issues, where numpy and matlab are quite different (for example, nowhere is it said that indexing can give an array whose rank is different from the original array's). Once I am sure I've got my head around the whole issue, I will try to complete it accordingly.
That would be great. Despite the fact that there's a column for 'array' there, most of the page was written with the assumption that you're using matlab for linear algebra, and so will be doing most everything with 'matrix', not 'array'. Slicing a 2-index matrix always returns another 2-index matrix. But you have a very good point when it comes to 'array's. --bb
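The contrast Bill describes, sketched with numpy.matrix (still available in current NumPy, though plain arrays are now generally preferred):

```python
import numpy as np

m = np.matrix(np.arange(12).reshape(3, 4))
a = np.asarray(m)

print(m[0].shape)  # (1, 4): slicing a matrix always returns a 2-D matrix
print(a[0].shape)  # (4,)  : integer-indexing an array drops the dimension
```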
David Cournapeau wrote:
To be more specific, I am trying to convert a function which computes multivariate Gaussian densities. It should be able to handle the scalar case, the case where the mean is a vector, and the case where va is a vector (diagonal covariance matrix) or a square matrix (full covariance matrix). So, in matlab, I simply do:
function [n, d, K, varmode] = gaussd_args(data, mu, var)
```matlab
[n, d] = size(data);
[dm0, dm1] = size(mu);
[dv0, dv1] = size(var);
```
And I check that the dimensions are what I expect afterwards. Using arrays, I don't see a simple way to do that while passing scalar arguments to the functions. So either I should be using matrix type (and using asmatrix on the arguments), or I should never pass scalar to the function, and always pass arrays. But maybe I've used matlab too much, and there is a much simpler way to do that in scipy. To sum it up, what is the convention in scipy when a function handles both scalar and arrays ? Is there an idiom to treat scalar and arrays of size 1 the same way, whatever the number of dimensions arrays may have ?
You could use rank-0 arrays instead of scalars. For example, if your function were to wrap the arguments up with 'asarray', they'd then have the normal methods and attributes of arrays:

```python
def foo(x, mu, va):
    x = asarray(x)
    mu = asarray(mu)
    va = asarray(va)
    if mu.ndim == 0 and va.ndim == 0:
        call scalar_implementation
        return result
    if mu.ndim == 1 and va.ndim == 1:
        call scalar implementation on each element
    if mu.ndim == 1 and va.ndim == 2:
        call matrix implementation
```

-- Ed
participants (7)
- Albert Strasheim
- Bill Baxter
- David Cournapeau
- Ed Schofield
- Gennan Chen
- Robert Kern
- Travis Oliphant