Besides a proper programming paradigm, Python easily scales to large-scale number crunching: you can run large-matrix calculations with about 1/2 to 1/4 of the memory consumption of Matlab. It is not difficult to construct a program that runs over several computers (independent of their hardware and OS). If you are willing to invest a little time in software integration, you'll have access to powerful computing packages such as VTK, OpenCV, etc.
Nadav.
-----Original Message----- From: numpy-discussion-bounces@scipy.org on behalf of Christopher Barker Sent: Wed 25-Apr-07 20:24 To: Discussion of Numerical Python Cc: Subject: Re: [Numpy-discussion] matlab vs. python question
Neal Becker wrote:
I'm interested in this comparison
There have got to be comparisons on the web -- google away!
My few comments:
I happened to look on the matlab vendor's website, and found that it does have classes.
Matlab added classes in a fairly recent version, so technically, yes, it does support OO. However, OO aside, Python is, in many ways, a far more sophisticated and capable language. It is better suited to larger projects, and well suited to a wide variety of software development, rather than just numerical work. Indeed, python+numpy supports more sophisticated numerical work too (more data types, etc).
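As a quick sketch of the "more data types" point: NumPy lets you choose the element precision explicitly, which directly affects memory use (standard NumPy calls, nothing exotic):

```python
import numpy as np

# NumPy arrays can use whatever element type fits the problem,
# rather than defaulting everything to double precision:
a = np.zeros(1000, dtype=np.float64)    # doubles: 8 bytes per element
b = np.zeros(1000, dtype=np.float32)    # singles: 4 bytes per element
c = np.zeros(1000, dtype=np.int8)       # tiny integers: 1 byte per element

print(a.nbytes, b.nbytes, c.nbytes)     # 8000 4000 1000
```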
So, my reasons for using python+numpy (I did my entire dissertation with Matlab -- I loved it at the time):
* Better, more flexible language.
* I can use it for other things I do: web programming, sophisticated GUIs, etc.
* It integrates well with lots of other libraries
* It's free: both $$ and libre
What's better about Matlab:
* A wider selection of out-of-the-box numerical routines.
* Excellent integration of command-line and plotting.
In short, $$ and freedom aside, I think Matlab provides a slightly more productive environment for interactive experimentation and quickie prototypes and scripts, but a much less productive one for larger projects, or for people who need to do non-numerical work too.
just my $0.02
-Chris
Well - these threads always go on for a long time, but...
I've used matlab heavily for 10 years. I found that I had to use perl and C fairly heavily to get things done that matlab could not do well. Now that I've switched to numpy, scipy, and matplotlib, there is really nothing I miss in matlab. We would not attempt what we are doing now:
http://neuroimaging.scipy.org/
in matlab - it's just not the right tool for a large scale programming effort.
I agree that matlab has many attractions as a teaching tool and for small numeric processing scripts, but if you are writing a large to medium-sized application, I really don't think there is any comparison...
Matthew
Matthew Brett wrote: [...]
I agree that matlab has many attractions as a teaching tool and for small numeric processing scripts, but if you are writing a large to medium-sized application, I really don't think there is any comparison...
[...]
Matthew, there are also other (engineering) computational needs than just neuroimaging.
Coming from the field of control engineering, I don't think that at this moment there is any replacement for Matlab's graphical interface to solvers for nonlinear differential/difference equations, called Simulink. Its extension for discrete-event systems, called Stateflow, is also impressive. I know of no other tool (commercial or free) that offers such high productivity.
But what makes Matlab difficult to replace is that lots of other projects (commercial: Mathematica, Maple, ... and free: octave, maxima, scipy, ...) only offer computation and visualization, while engineers in my field also need INTERACTION OF THE SYSTEM WITH THE EXTERNAL WORLD. That is, compatibility with a real-time operating system and MOST available input-output (AD-DA) cards. Being able to acquire measurement data from an external industrial system, process it computationally (for instance, solving some Riccati matrix differential equations), visualize the data, and put the computed results back into the real system: this is what we need. Well, to be complete, I should say that the Scilab project is starting to be a competitor, but it is not completely free and the user comfort is not very high at the moment. (Perhaps joining efforts with the OpenModelica developers is the way to go for the Python community?)
I am absolutely sure that the Python (and Scipy and Numpy) project has the potential to fulfill these needs as well (and by starting to use these tools and sharing my own code I would like to contribute), but it is not the case at the moment. Without being negative or discouraging, I think it is fair to say that currently, for some people, it would be very difficult to switch completely to Python libraries.
But surely this is improving steadily.
Zdenek Hurak
Zdeněk Hurák wrote:
Coming from the field of control engineering, I don't think that at this moment there is any replacement for their graphical interface to solvers for nonlinear differential/difference equations called Simulink.
This is correct. This is the one thing that Python needs improvements in. Many of my colleagues use Simulink to design and model a digital signal processing element which they can then download to an FPGA using a third-party tool from Xilinx. There is no way they could replace that with Python/SciPy at this point (although of course it "could" be done).
I would love to see some good contributions in the area of Simulink-like work. There are several things out there that are good starts.
But what makes Matlab difficult to replace is that lots of other projects (commercial: Mathematica, Maple, ... and free: octave, maxima, scipy, ...) only offer computation and visualization, while engineers in my field also need INTERACTION OF THE SYSTEM WITH THE EXTERNAL WORLD. That is, compatibility with a real-time operating system and MOST available input-output (AD-DA) cards.
The only way to solve this is to get more users interested in making these kinds of things happen. Or to start a company that does this and charges for a special build of Python compatible with your favorite hardware.
Being able to acquire measurement data from an external industrial system, process them computationally (for instance, solving some Riccati matrix differential equations), visualize the data and put the computed results back to the real system, this is what we need.
This is doable in many respects already (look at what Andrew Straw has done for example), but it of course could be made "easier" to do. But, I'm not sure it will happen without the work of a company. It would be great if hardware manufacturers used Python and made sure their stuff worked right with it, but this requires many more users.
I am absolutely sure that the Python (and Scipy and Numpy) project has the potential to fulfill these needs as well (and by starting to use these tools and sharing my own code I would like to contribute), but it is not the case at the moment. Without being negative or discouraging, I think it is fair to say that currently, for some people, it would be very difficult to switch completely to Python libraries.
Yes, this is fair to say. As you have indicated, the situation is improving.
-Travis
Travis Oliphant wrote:
[...]
I would love to see some good contributions in the area of Simulink-like work. There are several things out there that are good starts.
Even though I praised Simulink highly in my previous contribution, I don't think that mimicking it would be a good way to go. That way (the way that, for instance, Scilab/Scicos is going) you will only ever be SECOND.
What I like (still as a newcomer) about Scipy/Numpy is that it does not try to be a poor man's Matlab. You (I mean the community) can take inspiration from its use, but please, don't look too much at Matlab. Let's create something better, even at the cost of not being compatible and perhaps requiring some learning effort from Matlab users. I keep my fingers crossed for the Octave project, which is a kind of free (GNU GPL) Matlab clone, but at the same time I can recognize the limits of being JUST A CLONE. The leaders of the Octave project do realize this danger, but users kind of dictate compatibility with Matlab.
To be more specific, there is some discussion going on in the systems & control community about "block-diagram vs. equation-based modelling". I am not a guru in dynamic systems modeling, but in my humble opinion it is worth considering something like the Modelica language http://www.modelica.org/ as the ground for modeling in Python. There is a young free implementation called OpenModelica http://www.ida.liu.se/~pelab/modelica/OpenModelica.html. A commercial implementation, called Dymola, is produced by Dynasim http://www.dynasim.com/ and can at least give some inspiration, showing that from a user's point of view it is also just graphical blocks being connected to build a model.
To make it clear, I am not a proponent of the above-mentioned tools; I just came across them a couple of days ago when searching for an alternative to Simulink (commercial or free), and I found the whole Modelica movement interesting and perhaps the only one with significant development effort behind it.
But what makes Matlab difficult to replace is that lots of other projects (commercial: Mathematica, Maple, ... and free: octave, maxima, scipy, ...) only offer computation and visualization, while engineers in my field also need INTERACTION OF THE SYSTEM WITH THE EXTERNAL WORLD. That is, compatibility with a real-time operating system and MOST available input-output (AD-DA) cards.
The only way to solve this is to get more users interested in making these kinds of things happen. Or to start a company that does this and charges for a special build of Python compatible with your favorite hardware.
I think it would not be necessary to start from scratch. There are some free projects for interfacing PCs to these cards, like COMEDI http://www.comedi.org/. There are also projects with real-time adaptations of common operating systems; for Linux, I know of two: RTAI https://www.rtai.org/ and RTLinux-GPL http://www.rtlinux-gpl.org/. I am not an expert in these domains, but it appears that there should be no problem interfacing to these tools. The only thing needed is the investment of developer effort, of course... :-)
Best regards,
Zdenek
On Thu, Apr 26, 2007 at 12:06:56PM +0200, Zdeněk Hurák wrote:
But what makes Matlab difficult to replace is that lots of other projects (commercial: Mathematica, Maple, ... and free: octave, maxima, scipy, ...) only offer computation and visualization, while engineers in my field also need INTERACTION OF THE SYSTEM WITH THE EXTERNAL WORLD. That is, compatibility with a real-time operating system and MOST available input-output (AD-DA) cards. Being able to acquire measurement data from an external industrial system, process it computationally (for instance, solving some Riccati matrix differential equations), visualize the data, and put the computed results back into the real system: this is what we need.
I am very surprised that you think Matlab is more suited for such a task. I have had the opposite experience.
I work in an experimental lab where we do rather heavy experimental physics (Bose-Einstein condensation). I am building an experiment from scratch and had to build a control framework for it. It is fully automated and needs some reasonably sophisticated logic to scan parameters, do on-the-fly processing of the data, store it on disk, display all this in a useful interface for the user, and feed all this back to the experiment.
Due to legacy reasons I built the software in Matlab. It was a bad experience.
First of all, linking to C libraries to control hardware is a real pain. Writing Mex files is not much fun, and Matlab's memory management is flawed. I have had segfaults for no obvious reason, and the problems were neither in my code nor in the drivers (I worked with the engineer who wrote the driver to diagnose this). Matlab is single-threaded, so all your calls to the hardware are blocking (unless you are willing to add threads in your mex file, but this is very dangerous as you are playing with a loophole in Matlab's mex loader, which keeps the mex in memory after it is executed). Finally, the lack of proper object-oriented programming and separate namespaces makes it hard to write good code to control instruments.
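By contrast, a blocking hardware call in Python can simply be pushed onto a worker thread with the standard library. A minimal sketch; `read_sample` here is a hypothetical stand-in for a real (blocking) driver call:

```python
import threading
import queue
import time

def read_sample():
    # Hypothetical stand-in for a blocking hardware call
    # (e.g. reading one value from an AD card).
    time.sleep(0.01)
    return 42.0

def acquire(n_samples, out_queue):
    # Runs in a background thread, so the main program is never blocked.
    for _ in range(n_samples):
        out_queue.put(read_sample())

samples = queue.Queue()
worker = threading.Thread(target=acquire, args=(5, samples))
worker.start()

# The main thread stays responsive and collects results as they arrive.
results = [samples.get() for _ in range(5)]
worker.join()
print(results)  # [42.0, 42.0, 42.0, 42.0, 42.0]
```

No special loophole is needed: the thread and the queue are ordinary, supported parts of the language.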
The lack of object-oriented programming, passing by reference, threads, and proper control of the GUI event loop also makes it very hard to write a complex interactive GUI.
Python provides all this. What it does not provide are pre-linked libraries to control hardware, but if you are trying to control exotic hardware, all you have is a C or C++ SDK. It is even worse when the hardware is homemade, as you have to write the library yourself, and trust me, I'd much rather write a hardware-control library that speaks to my custom-made electronics over a bus in Python than in Matlab (GPIB, ethernet, serial: Python has bindings for all of these).
I have had a second chance to build the computer control framework of a similar experiment. I took my chance to build it in Python (it was in a different lab, and I was replacing an existing C software that had become impossible to maintain). I never regretted it. Python has really been very good at it, due to all the different libraries available, the ease of linking to C, and the possibility of doing proper OOP.
I have written an article about this, which I submitted a month ago to CiSE. I have no news from them. The article is very poorly written (I had no hindsight about these matters when I started writing it, I submitted it because I was tired of seeing it dragging along, and now I wish I had reworked it a bit), but I think I raise some points about what you need to build a control framework for an experiment, and how Python can address those needs.
Anyway, IANAL, and I am not too sure whether releasing a preprint on a mailing list renders the article ineligible for CiSE, but I just put a version on http://gael-varoquaux.info/computers/agile_computer_control_of_an_experiment... . I hope it can help in understanding what is needed to control an experiment from Python, what can be improved, and what already exists and is so incredibly useful.
I am interested in comments. Keep in mind that I am working in a field where nobody is interested in computing, and I sure could use some advice. I have written quite a bit of software to control experiments in different labs with different languages, and I think I have acquired some experience and made a lot of progress, but I have also had to reinvent the wheel more than once. We all know that good software engineering books or articles readable by someone who knows little about computing are hard to find. Let us address this issue!
Gaël
On 4/26/07, Gael Varoquaux gael.varoquaux@normalesup.org wrote:
Anyway, IANAL, and I am not too sure whether releasing a preprint on a mailing list renders the article ineligible for CiSE, but I just put a version on http://gael-varoquaux.info/computers/agile_computer_control_of_an_experiment...
You're safe: there was a thread yesterday on the MPL list precisely on this, the CiSE policy (IEEE/AIP in reality) allows you to do this, as long as you replace it with their official PDF once/if it gets published and you clearly indicate their copyright terms. But hosting a preprint on your own site is allowed by their policies.
I can hunt down the specific references if you need them.
Cheers,
f
Besides a proper programming paradigm, Python easily scales to large-scale number crunching: you can run large-matrix calculations with about 1/2 to 1/4 of the memory consumption of Matlab.
Is that really true? (The large-matrix number crunching, not the proper programming paradigm ;-)
By no scientific means of evaluation, I was under the impression that the opposite was true to a smaller degree.
Also, a lot of the time when I'm dealing with extremely large matrices, they're of the sparse variety, which matlab currently handles with a bit more ease (and transparency) for the user.
-steve
On 4/26/2007 2:19 PM, Steve Lianoglou wrote:
Besides a proper programming paradigm, Python easily scales to large-scale number crunching: you can run large-matrix calculations with about 1/2 to 1/4 of the memory consumption of Matlab.
Is that really true? (The large-matrix number crunching, not the proper programming paradigm ;-)
By no scientific means of evaluation, I was under the impression that the opposite was true to a smaller degree.
Matlab has pass-by-value semantics, so you have to copy your data in and copy your data out for every function call. You can achieve the same effect in Python by pickling and unpickling arguments and return values, e.g. using this function decorator:
import cPickle as pickle

def Matlab_Semantics(f):
    '''
    Emulates Matlab's pass-by-value semantics: arguments and return
    values are serialized in and serialized out.

    Example:

        @Matlab_Semantics
        def foo(bar):
            pass
    '''
    func = f
    def wrapper(*args, **kwargs):
        args_in = pickle.loads(pickle.dumps(args))
        kwargs_in = {}
        for k in kwargs:
            kwargs_in[k] = pickle.loads(pickle.dumps(kwargs[k]))
        args_out = func(*args_in, **kwargs_in)
        args_out = pickle.loads(pickle.dumps(args_out))
        return args_out
    return wrapper
Imagine using these horrible semantics in several layers of function calls. That is exactly what Matlab does. Granted, Matlab optimizes function calls by using copy-on-write, so it will be efficient in some cases, but excessive cycles of copy-in and copy-out are usually what you get.
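A small usage sketch of the same idea (written against the standard `pickle` module so it runs unchanged): a decorated function that writes to its argument only mutates a private copy, exactly as a Matlab function would:

```python
import pickle

def matlab_semantics(f):
    # Same idea as the decorator above: deep-copy arguments and return
    # values via pickle, so the callee works on private copies, like
    # Matlab's pass-by-value.
    def wrapper(*args, **kwargs):
        args_in = pickle.loads(pickle.dumps(args))
        kwargs_in = pickle.loads(pickle.dumps(kwargs))
        return pickle.loads(pickle.dumps(f(*args_in, **kwargs_in)))
    return wrapper

@matlab_semantics
def scale(v, factor):
    for i in range(len(v)):
        v[i] *= factor      # mutates its private copy only
    return v

data = [1, 2, 3]
result = scale(data, 10)
print(result)  # [10, 20, 30]
print(data)    # [1, 2, 3]  (the caller's list is untouched)
```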
Sturla Molden
Sturla Molden wrote:
[...]
Matlab has pass-by-value semantics, so you have to copy your data in and copy your data out for every function call.
You are right about the semantics, but wrong about the consequences for copying, as matlab uses COW, and this works well in matlab. I have never noticed a big difference between matlab and python + numpy in memory consumption; I fail to see any reason why it would be significantly different (except that numpy does not have to use double, whereas matlab had to for a long time, and double is still the default in matlab).
David
On 4/26/2007 2:42 PM, David Cournapeau wrote:
You are true for the semantics, but wrong for the consequences on
copying, as matlab is using COW, and this works well in matlab.
It works well only if you don't change your input arguments. Never try to write to a matrix received as an argument in a function call. If you do, the memory expenditure may grow very rapidly.
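For contrast, NumPy passes arrays by reference, so an in-place write inside a function costs nothing extra, but the caller does see the change (a minimal sketch):

```python
import numpy as np

def zero_first(a):
    # In NumPy this writes straight into the caller's buffer; no copy
    # is made. In Matlab the same assignment would trigger a full
    # copy-on-write copy of the argument.
    a[0] = 0.0

x = np.array([7.0, 1.0, 2.0, 3.0])
zero_first(x)
print(x)  # [0. 1. 2. 3.]  (the caller's array was changed in place)
```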
But as long as NumPy does not use lazy evaluation, the difference is not as striking as it could be.
S.M.
Sturla Molden wrote:
[...]
Imagine using these horrible semantics in several layers of function calls. That is exactly what Matlab does. Granted, Matlab optimizes function calls by using copy-on-write, so it will be efficient in some cases, but excessive cycles of copy-in and copy-out are usually what you get.
That's interesting. How did you find this information?
On 4/26/2007 2:47 PM, Neal Becker wrote:
That's interesting. How did you find this information?
What information?
Matlab's pass-by-value semantics are well known to anyone who has ever used Matlab.
The MathWorks' employees have stated numerous times that Matlab uses copy-on-write to optimize function calls. I first learned of it in the usenet group comp.soft-sys.matlab.
Sturla Molden
Steve Lianoglou wrote:
Besides a proper programming paradigm, Python easily scales to large-scale number crunching: you can run large-matrix calculations with about 1/2 to 1/4 of the memory consumption of Matlab.
Is that really true? (The large-matrix number crunching, not the proper programming paradigm ;-)
By no scientific means of evaluation, I was under the impression that the opposite was true to a smaller degree.
Also, a lot of the times that I'm dealing with extremely large matrices, they're of the sparse variety, which matlab currently handles with a bit more ease (and transparency) to the user.
-steve
As Matlab links to Lapack, Atlas, and similar libraries for (at least part of) its linear algebra computations, if you link to these freely available libraries from whatever other language, you should get about the same results.
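For example, NumPy's linear-algebra routines call into the same LAPACK family of routines that Matlab links against, so results on dense problems are largely a property of the underlying library:

```python
import numpy as np

# numpy.linalg.solve dispatches to a LAPACK solver under the hood,
# the same kind of routine Matlab uses for A\b.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = np.linalg.solve(A, b)
print(x)  # [2. 3.]
```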
Zdenek