Pros and Cons of Python versus other array environments
Hi all, I've started a possibly controversial but hopefully informative page that tries to list some of the advantages of using Python+NumPy+Scipy+Matplotlib+IPython (I'm calling that combination PyLab) versus other array environments. The purpose is not to go into detail about semantic differences, but document higher-level differences that might help somebody decide whether or not they could use NumPy instead of some other environment. I've started with a comparison to MATLAB, based on an email response I sent to a friend earlier today. Additions and corrections welcome. -Travis O.
On 9/28/06, Travis Oliphant <oliphant.travis@ieee.org> wrote:
Hi all,
I've started a possibly controversial but hopefully informative page that tries to list some of the advantages of using Python+NumPy+Scipy+Matplotlib+IPython (I'm calling that combination PyLab) versus other array environments.
Great, I think this is important to have. For reference, the link is: http://www.scipy.org/NumPyProConPage Cheers, f
I think everybody on this list is pretty familiar with the pros of PyLab (although a good set of arguments in one place is a Good Idea). Perhaps it would be productive for us to start a discussion of cons here on the list, and decide how to mitigate them. Here I give some of my impressions on how the novice might see PyLab. I am teaching a class on data assimilation this semester, and I have decided to use PyLab. I had used MATLAB in the past, and I had many students who weren't familiar with how it worked and there were licensing issues. Most students could eventually figure out how to deal with the language; vectorization is always a tough thing to get. Also, they could purchase a student version of MATLAB for about $50 -- not free, but cheaper than many college textbooks. Using PyLab, I have a different set of issues. First is installation. I recommended Enthon python to the PC users, and I helped the Mac users deal with the various distributions. Even with a number of clickable installer packages, putting python on their computers was not straightforward. Then there is the issue with Python itself. Python is a more powerful language than MATLAB's core programming language, and students have a slightly steeper learning curve figuring all of that out. They suddenly have to deal with importing packages, zero-based indexing, many different kinds of sequences, methods, and a host of other issues. Additionally, numpy is a more powerful array tool, in my opinion, but it is also harder to learn. In the final analysis, students had similar issues working with MATLAB, but my impression is that they find MATLAB slightly easier. Also, many of them come to the class with MATLAB experience -- none of them have used python, let alone any of the scientific packages, before. In the long run, python is clearly better, but in the short term, I think MATLAB might be simpler. 
Finally, I think that one of the reasons MATLAB is successful is that it includes everything together, in one place. It does some things very poorly, but I was always willing to put up with MATLAB's weaknesses to remain within a single environment. PyLab is starting to feel like that, but there are still some pretty clear boundaries between the packages, and an almost overwhelming array of choices if you want to look for them. This is good if you are a geek and want plotting package X or numeric package Y and you still want everything else to work, but bad if you are a novice and just want things to work simply and smoothly together.
Thus, the first thing that would improve PyLab usage the most, for the majority of people migrating from MATLAB, would be a cohesive, easy to install package, like Enthon, for a wide variety of platforms. I think this is a goal that could be achieved in a year or so, given how things are going now.
Second, a good tutorial would be essential. This would be less complete and less formal than actual NumPy documentation, and would include information on using all of the PyLab packages together. There is a good start to this on the scipy wiki.
Third, I think it is important for all of the packages to work together. This has certainly happened in terms of compilation -- mpl and numpy have been released in lockstep so they work together despite major changes in numpy. The Python+NumPy+Scipy+Matplotlib+IPython suite seems like a great place to start, and I think more could be done to make these tools seem like one seamless superpackage.
-Rob
On Sep 28, 2006, at 12:09 PM, Fernando Perez wrote:
On 9/28/06, Travis Oliphant <oliphant.travis@ieee.org> wrote:
Hi all,
I've started a possibly controversial but hopefully informative page that tries to list some of the advantages of using Python+NumPy+Scipy+Matplotlib+IPython (I'm calling that combination PyLab) versus other array environments.
Great, I think this is important to have. For reference, the link is:
http://www.scipy.org/NumPyProConPage
Cheers,
f _______________________________________________ SciPy-user mailing list SciPy-user@scipy.org http://projects.scipy.org/mailman/listinfo/scipy-user
---- Rob Hetland, Associate Professor Dept. of Oceanography, Texas A&M University http://pong.tamu.edu/~rob phone: 979-458-0096, fax: 979-845-6331
I agree with Rob that python is slightly harder to figure out for complete beginners. And I agree that it lacks integration. I would like to have an application, with an icon on the Desktop, or in the menus, that you can start, and start right away typing calculations in, without importing packages or figuring out which shell to use; it might even have an editor. It would have proper branding, look pretty, have menus (including a help menu that would give help on python, scipy, and all the other packages)...
I am (like everybody) lacking time to do this, but I see enthought's envisage as a good starting point for this. It seems possible to integrate pylab into it, in order to have dockable pylab windows. It already has an editor and a shell. The shell is not as nice as ipython: it is pycrust, but I hope one day ipython will be easy to integrate in wxpython applications, and that it will have syntax highlighting and docstring popups like pycrust (beginners really love that).
I think developing such an application would definitely help our community get more exposure. I know this will not interest the people who are currently investing a lot of time on scipy/ipython, as they are aiming for the other end of the spectrum: difficult tasks where no good answers are available, like distributed computing. I think that we should still keep this in mind, and as pycrust, envisage, and other integration tools make progress, see if and when we can put together such an application. Maybe we should put up a wiki page to throw down some ideas about this.
Gaël
I really don't agree that "python is slightly harder to figure out for complete beginners" (Gael). (You knew SOMEBODY would be disagreeable). I've used/taught Matlab, Mathcad, Scilab, and Python ... not to mention Fortran, QBasic, VBasic, etc. In Python, I click on Idle and start calculating. Maybe I have to "import math," but that's it. Sure, there are some points I have to know, but believe me, it's a lot easier than starting a beginner on Mathcad. Scilab is very similar to Matlab (except that Scilab is free, and has a nicer syntax for functions). If I want array calculations, it's one more "import," but otherwise no more difficult than Matlab and friends, and much easier than Mathcad. Now, suppose I want to do something a little more complicated, and I want a function. In Python (and Scilab), I can define the function on the fly, and then use it. In Matlab, I have to save the function in an "m-file" before I can use it, which brings up all kinds of problems of where to put the file, what to name it, etc. Easy enough for us, maybe, but not for our prototypical "complete beginner." (It's also not esthetically pleasing ... but that's a different problem.) In Mathcad, simple functions are pretty easy; complex ones are pretty not easy, but there's a fair bit to learn before you can make any of them work. I wouldn't expect a student to write functions in any of these without at least some background, and the required background in Python is certainly no more than that in any of the others. The biggest difference, for me, is that Python can "keep going." Matlab/Scilab/Mathcad all hit the wall fairly quickly, in terms of program size and/or complexity. But my real peeve is that Matlab is incomplete (batteries NOT included). I'm "visiting faculty" (hired help - I've retired, but I'm teaching a course). There came a point where we needed to solve a small set of simultaneous nonlinear equations. 
The students here are required to have Matlab, so I said, "Just call up fsolve." Oops. In Matlab, that's not in the student version. They'd have to buy the "optimization toolbox" to get it. EVERY other math system has it ... but it costs extra in Matlab. I wasn't fond of Matlab before that, but that little incident really soured me on it. (So I taught them flowsheet tearing. If you know what that is, you're a Chemical Engineer, and you're old.) I've never taught Python to "complete beginners," but I _have_ taught Matlab and Mathcad, and I can't imagine that Python would be any more difficult to teach. Maybe a beginners tutorial aimed at scientific calculation would help (although they exist on the web), but I think that the problem is really more of perception than reality. john Gael Varoquaux wrote:
I agree with Rob that python is slightly harder to figure out for complete beginners. And I agree that it lacks integration. I would like to have an application, with an icon on the Desktop, or in the menus, that you can start, and start right away typing calculations in, without importing packages or figuring out which shell to use; it might even have an editor. It would have proper branding, look pretty, have menus (including a help menu that would give help on python, scipy, and all the other packages)...
I am (like everybody) lacking time to do this, but I see enthought's envisage as a good starting point for this. It seems possible to integrate pylab into it, in order to have dockable pylab windows. It already has an editor and a shell. The shell is not as nice as ipython: it is pycrust, but I hope one day ipython will be easy to integrate in wxpython applications, and that it will have syntax highlighting and docstring popups like pycrust (beginners really love that).
I think developing such an application would definitely help our community get more exposure. I know this will not interest the people who are currently investing a lot of time on scipy/ipython, as they are aiming for the other end of the spectrum: difficult tasks where no good answers are available, like distributed computing. I think that we should still keep this in mind, and as pycrust, envisage, and other integration tools make progress, see if and when we can put together such an application. Maybe we should put up a wiki page to throw down some ideas about this.
Gaël
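John's two points above, defining a function on the fly and handing a small set of simultaneous nonlinear equations to fsolve, can be sketched with SciPy roughly as follows. The two equations and the starting guess are made up for illustration and do not come from the thread; scipy.optimize.fsolve is the only library routine assumed.

```python
from scipy.optimize import fsolve


# The system is defined "on the fly"; no separate file is needed.
# (Illustrative equations: a circle of radius 2 and the hyperbola x*y = 1.)
def residuals(v):
    x, y = v
    return [x**2 + y**2 - 4.0,
            x * y - 1.0]


# Hand-picked starting guess; fsolve looks for a root near it.
solution = fsolve(residuals, [2.0, 0.5])

print(solution)
print(residuals(solution))  # both residuals should be close to zero
```

The same function object can be reused at the interpreter prompt with a different starting guess, which is the workflow John contrasts with saving an m-file first.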
John,
You have a good argument and I certainly won't fight you over that. I prefer python and find it easy to use. But I use object oriented programming, importing modules, functional programming, regexp, operator overloading, and some of my colleagues have difficulties with this. Maybe python can seem harder because it can be used at higher levels. I still think that it is lacking an IDE where you can "click and start typing".
-- Gaël
On 28/09/06, Gael Varoquaux <gael.varoquaux@normalesup.org> wrote:
John,
You have a good argument and I certainly won't fight you over that. I prefer python and find it easy to use. But I use object oriented programming, importing modules, functional programming, regexp, operator overloading, and some of my colleagues have difficulties with this.
This is a question of what *you* use - python is fairly good about keeping its fancy features out of the way of those who don't want to use them. The fact that you *can* write object-oriented programs doesn't affect the syntax you use to define a function (for example).
Maybe python can seem harder because it can be used at higher levels. I still think that it is lacking an IDE where you can "click and start typing".
This is a question of user interface, and it's a good one. I use ipython and vim, mostly, and am still getting the hang of making them work together well (I had been using reload(module) constantly, and I just found %run; I expect there are other things I still need, like a good way to use PDB). This is not a terribly convenient user interface, although it could perhaps be made into one with a bit of integration work (and a good tutorial). IDLE seems to be an attempt to provide a convenient user interface integrating debugger, interactive session, editor, and help, but I pretty consistently crash it after half an hour of use, losing all my state in all the aforementioned components.
One advantage of a single unified package is that people can start filing integration and UI bugs - "I start PyLab but my plots appear under all the other windows", "the editor has a different default directory than the interpreter", and so on.
As for missing features, I'd like to see even very basic symbolic calculation tools, so that (for example) I could just call a symbolic derivative function to supply a Jacobian to the minimizers. From a bit of web research, swiginac seems like the most promising alternative. (This should naturally be integrated with the orthogonal polynomials.)
A. M. Archibald
On Thu, Sep 28, 2006 at 04:06:23PM -0400, A. M. Archibald wrote:
As for missing features, I'd like to see even very basic symbolic calculation tools, so that (for example) I could just call a symbolic derivative function to supply a Jacobian to the minimizers. From a bit of web research, swiginac seems like the most promising alternative.
You can have a look at sympy too. It is very young, though. Gaël
Hi, AFAIK sympy does not support multivariable expressions or user-defined functions. These are useful in constructing Jacobians in the most syntactically neat way. But if you need something simple and in pure python right away, our PyDSTool project contains a modestly-featured symbolic toolbox in pure python, including functions to create Jacobians very easily. You don't have to keep the whole baggage of our project to use the module Symbolic.py. Although it currently only works with "old" SciPy/Numeric, we are working on migrating to NumPy, etc. right now. Also, it's not as professionally developed as swiginac/ginac and undoubtedly not as fast, but it has worked well for me on moderately-sized problems... e.g. for a vector field defined in a test example of SciPy's VODE integrator:
y0=Var('y0')
y1=Var('y1')
y2=Var('y2')

ydot0=Fun(-0.04*y0 + 1e4*y1*y2, [y0, y1, y2], 'ydot0')
ydot2=Fun(3e7*y1*y1, [y0, y1, y2], 'ydot2')
ydot1=Fun(-ydot0(y0,y1,y2)-ydot2(y0,y1,y2), [y0, y1, y2], 'ydot1')

F = Fun([ydot0(y0,y1,y2),ydot1(y0,y1,y2),ydot2(y0,y1,y2)], [y0,y1,y2], 'F')

# Diff returns a symbolic object so would like to turn back into a
# function of y0, y1, y2
jac=Fun(Diff(F,[y0,y1,y2]), [y0, y1, y2], 'Jacobian')

# Calls to this function return a symbolic object by default
jac(0.1, 0.3, 0.5)
QuantSpec Jacobian (ExpFuncSpec)

# so use
print jac(0.1, 0.3, 0.5)
[[-0.040000000000000001,5000.0,3000.0],[0.040000000000000001,-18005000.0,-3000.0],[0,18000000.0,0]]

# or use .tonumeric() because there are no free names left in the
# resulting expression
jac(0.1, 0.3, 0.5).tonumeric()
array([[ -4.00000000e-02,   5.00000000e+03,   3.00000000e+03],
       [  4.00000000e-02,  -1.80050000e+07,  -3.00000000e+03],
       [  0.00000000e+00,   1.80000000e+07,   0.00000000e+00]])

# The beauty of this is being able to use other symbols in the call,
# substitutions, etc.
x=Var('x')
j = jac(x, x, 0.5)
print j
[[-0.04,10000*0.5,10000*x],[0.040000000000000001,(-10000*0.5)-30000000*2*x,-10000*x],[0,60000000*x,0]]
jsubs = j.eval(x=10)
jsubs.tonumeric()
array([[ -4.00000000e-02,   5.00000000e+03,   1.00000000e+05],
       [  4.00000000e-02,  -6.00005000e+08,  -1.00000000e+05],
       [  0.00000000e+00,   6.00000000e+08,   0.00000000e+00]])
Automatic simplification of the resulting float-only sub-expressions (like "10000*0.5") isn't fully working yet for these "symbol arrays", but it does work in the scalar case. Anyway, check out the wiki page Symbolic at pydstool.sourceforge.net/ if you want more examples and documentation. Hope this is of interest! -Rob On Thu, 28 Sep 2006, Gael Varoquaux wrote:
On Thu, Sep 28, 2006 at 04:06:23PM -0400, A. M. Archibald wrote:
As for missing features, I'd like to see even very basic symbolic calculation tools, so that (for example) I could just call a symbolic derivative function to supply a Jacobian to the minimizers. From a bit of web research, swiginac seems like the most promising alternative.
You can have a look at sympy too. It is very young, though.
Gaël
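For comparison, the same Jacobian can be sketched with a present-day SymPy, which does handle multivariable expressions (the limitation mentioned above describes the very young 2006 project). This is not PyDSTool's API; the names F, jac, and jac_func below are chosen only for illustration, and the vector field is copied from the example above.

```python
import sympy as sp

y0, y1, y2 = sp.symbols("y0 y1 y2")

# Vector field from the VODE test example quoted above.
ydot0 = -0.04 * y0 + 1e4 * y1 * y2
ydot2 = 3e7 * y1**2
ydot1 = -ydot0 - ydot2

F = sp.Matrix([ydot0, ydot1, ydot2])

# Symbolic 3x3 Jacobian dF/d(y0, y1, y2).
jac = F.jacobian([y0, y1, y2])

# Turn the symbolic matrix into an ordinary numeric function.
jac_func = sp.lambdify((y0, y1, y2), jac)
J = jac_func(0.1, 0.3, 0.5)
print(J)

# Partial substitution with another symbol, then evaluation at x = 10.
x = sp.symbols("x")
j = jac.subs({y0: x, y1: x, y2: 0.5})
print(j.subs(x, 10))
```

A function like jac_func is the kind of object one would hand to an integrator or minimizer that accepts a user-supplied Jacobian.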
On Thu, Sep 28, 2006 at 07:34:37PM -0400, Robert Clewley wrote:
But if you need something simple and in pure python right away, our PyDSTool project contains a modestly-featured symbolic toolbox in pure python, including functions to create Jacobians very easily.
Nice, if I find some time I'll have a look.
Gaël
On Sep 28, 2006, at 3:06 PM, A. M. Archibald wrote:
This is a question of what *you* use - python is fairly good about keeping its fancy features out of the way of those who don't want to use them. The fact that you *can* write object-oriented programs doesn't affect the syntax you use to define a function (for example).
Yes, you can keep it simple, and not use classes out of the box. However, you are faced, immediately, with a bunch of different sequence objects -- at least tuples, lists, and arrays. Then there is the issue of methods, which most science students I know have not encountered. These things are not insurmountable, and the argument made below that a function in matlab must be a file is really powerful (and one I must have blotted from my memory).
All of the arguments made *for* PyLab are true -- you think so too, or you wouldn't be reading this. I have been a huge proponent of PyLab, and have taught seminars on it here at Texas A&M and Woods Hole to people who primarily use MATLAB. I have heard a number of objections or excuses that it all looks good, but.....
- it's hard to install
- I already know how to use MATLAB, and it works fine for me
- when do I find a week (or month or semester) to learn a new programming language
- I already have so many m-files that I would need to rewrite
Then there are the issues of bugs and beta quality software (not true anymore for numpy, but still true for mpl), small user community in your own research community (e.g., Oceanographers all use MATLAB), etc. We need to think about the objections of the people who are *not* already here, and make sure we have an easy way for them to join us on the true path...
-Rob
---- Rob Hetland, Associate Professor Dept. of Oceanography, Texas A&M University http://pong.tamu.edu/~rob phone: 979-458-0096, fax: 979-845-6331
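The sequence-type and method hurdles Rob describes can be shown in a few lines. This is only an illustrative sketch with made-up variable names, not part of any course material: the same operators behave differently on tuples and lists than on NumPy arrays, indexing starts at zero, and arrays carry their operations as methods.

```python
import numpy as np

as_tuple = (1, 2, 3)          # immutable sequence
as_list = [1, 2, 3]           # mutable, general-purpose sequence
as_array = np.array(as_list)  # homogeneous array with elementwise arithmetic

# "+" concatenates lists but adds arrays element by element.
print(as_list + as_list)      # concatenation
print(as_array + as_array)    # elementwise sum

# "*" repeats a list but scales an array.
print(as_list * 2)
print(as_array * 2)

# Indexing is zero-based for all three.
print(as_tuple[0], as_list[0], as_array[0])

# Methods: the operation is attached to the object itself.
print(as_array.sum(), as_array.mean())
```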
"Rob" == Rob Hetland <hetland@tamu.edu> writes:
Rob> All of the arguments made *for* PyLab are true -- you think so too, or you wouldn't be reading this. I have been a huge proponent of PyLab, and have taught seminars on it here at Texas A&M and Woods Hole to people who primarily use MATLAB. I have heard a number of objections or excuses that it all looks good, but.....
Rob> - it's hard to install
Rob> - I already know how to use MATLAB, and it works fine for me
Rob> - when do I find a week (or month or semester) to learn a new programming language
Rob> - I already have so many m-files that I would need to rewrite
The first thing this thread makes me think is: why does wikipedia work but wikis for scientific python not? If we followed Travis' lead and aggregated the collective wisdom on this thread into the wiki page, we would have something enduring for the masses. As it is, only geeks like us who read mailing lists or archives will benefit from it. Maybe this points to the problem: the primary users and developers of scientific computing in python are sufficiently technologically literate that they not only overcome the additional complexity, they need it and crave it.
I was a huge matlab user for almost a decade; I tried to write a book about matlab (see http://matplotlib.sf.net/matlab_cookbook.pdf, unfortunately as incomplete as the mpl cookbook and other documentation). At some point I "hit the wall" and could no longer be productive in matlab. The extra overhead of managing complex data structures, developing complex GUIs, and working with networked data and databases was consuming most of my programming energy. Yes, matlab provides you a simple, comprehensive interface, and a fairly complete set of numerical libs, but when you want to work with complex data in a realistic networked environment, you hit the limits of the language and environment pretty hard. Then you rewrite what you like about matlab in python and get on with it.
matlab is a great tool for beginners and intermediates. For experts, it has limitations which are hard to overcome. My advice to students: if you aspire to be an expert, bite the bullet now and build a set of tools that can scale with you on your ascent.
Also, realize that The Mathworks is like the crack dealer on the street: the first hit is free; once you are addicted it becomes quite expensive. An academic license or a student version sells for under $100. If you are a business and need the important toolkits, you are looking at 50K per year. If you are an entrepreneurial student and dream of starting your own business once you graduate, ask yourself what you could do with the extra cash saved from a single site license. If your fledgling business grows, ask yourself what you can do with the cash saved from 50 site licenses (hint, that is 2.5 million dollars a year).
If you are ready to spend the 2.5 million dollars, fine, but first try the following exercises in matlab and python
* download and parse a CSV file from a web server, eg http://ichart.finance.yahoo.com/table.csv?s=INTC&d=8&e=29&f=2006&g=d&a=6&b=9&c=1986&ignore=.csv (for a python implementation, see the matplotlib.finance module)
* fill out a web CGI form in matlab (hint: you can do it with the embedded JVM, a virtual machine running in a virtual machine)
* query a mysql database on linux, win32, and OS X with the same script and populate an array with the results
Now how much would you pay?
PS: it's been a while since I looked at that matlab cookbook I was working on. I find the following sections of the matlab PDF linked above fun in a historical light::
Alternatives to matlab
I am a devotee of open source software. I (almost exclusively) use linux as an operating system, emacs as an integrated development environment, python for small and large scale programming, C++ for numerics, and so on. Matlab is the only commercial piece of software I use regularly.
I really don't want to use it, mainly because it is so expensive. I work in an academic environment, where site licenses go for the incredibly cheap price of $75 per year, toolboxes included. Check out the commercial price list to get an idea of just how expensive it is outside of academia. I'll give you a hint. About as much as a new Lexus sport utility vehicle. So aside from my support for GNU and linux and open source software, I don't want to wake up some day outside the folds of academia having to pay for matlab. Every day I use matlab is another set of plotting and analysis functions that I come to rely on, which makes it increasingly hard to go cold turkey. Every once in a while I make an aborted attempt to give it up (I know it's not good for me) but I always find myself coming back. The main reason is the graphics -- the ease with which I can make publication quality figures that I just haven't found in competing, open source, free as in Richard Stallman (http://www.gnu.org/philosophy/free-sw.html), solutions. Free alternatives * python -- python is the one true language. I have written extensively in perl, C++, FORTRAN, BASIC, and yes matlab, and in python I have found the one true language. I say that with tongue in cheek -- there is no one true language, because the strengths of a language often imply its weaknesses. The classic trade offs between user friendliness and power, expressiveness and readability, development time and execution time. python solves all these problems for me because it is so clear syntactically, has so many great libraries built in, and so many great external libraries. In the final category, relevant to this discussion, is numpy (http://www.pfdubois.com/numpy) and its recent successor scipy (http://www.scipy.org). These libraries provide efficient C/C++/FORTRAN libraries, all wrapped in python, that give you a huge array of highly tested, optimized, numerical libraries, for free. 
And you can read and modify the source code at will, in large part obviating the classic problem of closed source (matlab) libraries: that in a few years, when another platform is dominant, your solution of today is no longer supported. With open source, your solution is supported as long as users continue to use it and support it. SGI was the proprietary platform of choice for high performance graphics software 5 years ago. Today, support and maintenance have become increasingly difficult and expensive. And while numerous graphics packages for scipy exist, none compare to the breadth, ease of use, generality and quality of the matlab libraries. Yet. As a general rule, open source solutions follow excellent closed source solutions with a short time lag. (Witness the gimp, an excellent drop in replacement for Photoshop.) So keep your eye on python for standardized, excellent graphics solutions in the near future. If you want to split the difference, python does support an interface to matlab called pymat (http://claymore.engineer.gvsu.edu/~steriana/Python/pymat.html), so you can do your number crunching in numpy, and pass the results off to matlab for plotting, thus minimizing your dependence on matlab until the final step of producing graphical output.
* octave (http://www.octave.org) Octave is an open source clone of matlab. Many m-files will run in octave without changes. But when you start to make plots, you'll hit incompatibilities. octave uses gnuplot for plotting, and the support, particularly for handle graphics, is limited, as is the quality of the graphics produced.
JDH
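Two of the exercises John lists, parsing a CSV file from a web server and filling an array from a database query, can be sketched with the Python standard library alone. Several things here are assumptions made for illustration: the helper names fetch_text and parse_closes are hypothetical, the sample rows are invented, the "Date" and "Close" column names are assumed rather than taken from the thread, and sqlite3 stands in for MySQL so the sketch runs without a server. The Yahoo URL from the message is not fetched.

```python
import csv
import io
import sqlite3
from urllib.request import urlopen


def fetch_text(url):
    """Download a URL and return its body as text (hypothetical helper)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


def parse_closes(csv_text):
    """Return (date, close) pairs from CSV text with 'Date' and 'Close' columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row["Date"], float(row["Close"])) for row in reader]


# Invented sample data standing in for a downloaded file;
# with network access one could call fetch_text(some_url) instead.
sample = "Date,Open,Close\n2006-09-27,20.1,20.5\n2006-09-28,20.5,20.8\n"
rows = parse_closes(sample)

# sqlite3 (in memory) is swapped in for MySQL; the query pattern is the same idea.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (day TEXT, close REAL)")
conn.executemany("INSERT INTO prices VALUES (?, ?)", rows)
closes = [close for (close,) in conn.execute("SELECT close FROM prices ORDER BY day")]
conn.close()

print(rows)
print(closes)
```

The resulting list can be passed to numpy.array if an array is wanted; the script itself contains nothing platform specific.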
On 28/09/06, John Hunter <jdhunter@ace.bsd.uchicago.edu> wrote:
"Rob" == Rob Hetland <hetland@tamu.edu> writes:
The first thing this thread makes me think is: why does wikipedia work but wikis for scientific python not. If we followed Travis' lead and aggregated the collective wisdom on this thread into the wiki page, we would have something enduring for the masses. As it is, only geeks like us who read mailing lists or archives will benefit from it. Maybe this points to the problem: the primary users and developers of scientific computing in python are sufficiently technologically literate that they not only overcome the additional complexity, they need it and crave it.
As an ex-Wikipedia addict, my first thought was to go and start hacking on the page. But there's no discussion page! You have to just go in and start mangling it and hope nobody minds (because if they do, all they can do is start changing it back, with acerbic comments...) A. M. Archibald
"A" == A M Archibald <peridot.faceted@gmail.com> writes:
A> As an ex-Wikipedia addict, my first thought was to go and start A> hacking on the page. But there's no discussion page! You have A> to just go in and start mangling it and hope nobody minds A> (because if they do, all they can do is start changing it back, A> with acerbic comments...) I could be wrong, but I think it safe to say that unless you are a psychopath, we would all be much obliged if you just jumped in and started hacking on the page. JDH
On Thu, Sep 28, 2006 at 11:17:23PM -0400, A. M. Archibald wrote:
As an ex-Wikipedia addict, my first thought was to go and start hacking on the page. But there's no discussion page! You have to just go in and start mangling it and hope nobody minds (because if they do, all they can do is start changing it back, with acerbic comments...)
I think you should just go ahead and do it; people don't fight over the wiki at scipy.org. If you have controversial changes to make, discuss them on the mailing list, but otherwise just edit the page.
Speaking of which, John, can we quote you on the wiki? And a question to everybody: where should we put that quote? On a dedicated page with a link to it from the ProConPage?
Gaël
"Gael" == Gael Varoquaux <gael.varoquaux@normalesup.org> writes:
Gael> Speaking of which, John, can we quote you on the wiki ? And Gael> a question to everybody, where should we put that quote ? On Gael> a dedicated page with a link to it from the ProConPage ? My general policy on people quoting me is "anywhere and everywhere" :-) I think we should concentrate on making the main page as comprehensive and useful as possible. So put as much there as you can. If you want to summarize arguments on the main page and provide links to supporting material do so. It's a wiki! JDH
As a newbie and non-programmer... If I could alter my knowledge/experiences in a flash, I would add some (not exactly sure what though) C programming after doing a thorough Python course. Then I would read a few good articles on open source and what it is. Get up to speed with the terminology used in mailing lists and what they refer to, and how to use a mailing list. Get people excited about open source. I think many first years (and non-programmers like myself) will then better understand what PyLab is about.
Also, I realised that I need to take my MATLAB cap off when working with PyLab. When I do that, I tend to get to solutions quicker. I think one should come to the party with an attitude of "This is a different tool to solve engineering/science problems with."
My biggest gripe is the inconsistent look of the wikis, but I suppose I could change that. Lot of work for nothing really...
As for easy installation: Enthought's a beauty (bit big though). A Windows PyLab installer and a Linux PyLab deb or rpm would be great... Is this realistic though, taking into account all the Linux flavours out there?
-- WH
On 28/09/06, Gael Varoquaux <gael.varoquaux@normalesup.org> wrote:
On Thu, Sep 28, 2006 at 11:17:23PM -0400, A. M. Archibald wrote:
As an ex-Wikipedia addict, my first thought was to go and start hacking on the page. But there's no discussion page! You have to just go in and start mangling it and hope nobody minds (because if they do, all they can do is start changing it back, with acerbic comments...)
I think you should just go ahead and do it; people don't fight over the wiki at scipy.org. If you have controversial changes to make, discuss them on the mailing list, but otherwise just edit the page.
Speaking of which, John, can we quote you on the wiki ? And a question to everybody, where should we put that quote ? On a dedicated page with a link to it from the ProConPage ?
Well, I'll let you judge about the psychopathy, but I created a discussion page and put some discussion there (all mine, so far, but please do add your own). A. M. Archibald
Hi John, * John Hunter <jdhunter@ace.bsd.uchicago.edu> wrote:
"Rob" == Rob Hetland <hetland@tamu.edu> writes:
Rob> All of the arguments made *for* PyLab are true -- you think so too, or you wouldn't be reading this. I have been a huge proponent of PyLab, and have taught seminars on it here at Texas A&M and Woods Hole to people who primarily use MATLAB. I have heard a number of objections or excuses that it all looks good, but.....
Rob> - it's hard to install
Rob> - I already know how to use MATLAB, and it works fine for me
Rob> - when do I find a week (or month or semester) to learn a new programming language
Rob> - I already have so many m-files that I would need to rewrite
[...]
So aside from my support for GNU and linux and open source software, I don't want to wake up some day outside the folds of academia having to pay for matlab. Every day I use matlab is another set of plotting and analysis functions that I come to rely on, which makes it increasingly hard to go cold turkey. Every once in a while I make an aborted attempt to give it up (I know it's not good for me) but I always find myself coming back. The main reason is the graphics -- the ease with which I can make publication quality figures that I just haven't found in competing, open source, free as in Richard Stallman (http://www.gnu.org/philosophy/free-sw.html), solutions.
Pretty interesting! I hope I can convince my Prof to go for Python using your arguments ... but I do not understand why there would be a problem with publication quality graphics. For 2D matplotlib should be enough; for 3D mayavi2/tvtk and python-vtk should be able to produce quality graphics too!? Greetings! Fabian
Fabian Braennstroem wrote:
Pretty interesting! I hope I can convince my Prof to go for Python using your arguments ... but I do not understand why there would be a problem with publication quality graphics. For 2D matplotlib should be enough; for 3D mayavi2/tvtk and python-vtk should be able to produce quality graphics too!?
I don't really understand either. As I said earlier, I have some gripes about matplotlib, but the export is not one of them. On the contrary, I would say it is a strong point of matplotlib; I always find matlab's way of exporting clumsy, and latex support in matplotlib is easier to use, too. The only problem I ever had with matplotlib when exporting to eps for articles was wasting half an hour finding a way to get a transparent background. David
On 9/29/06, David Cournapeau <david@ar.media.kyoto-u.ac.jp> wrote:
Fabian Braennstroem wrote:
... but I do not understand that there is a problem with publication quality graphics. For 2D matplotlib should be enough; for 3D mayavi2/tvtk and python-vtk should be able to produce quality graphics too!?
I don't really understand either. As I said earlier, I have some gripes about matplotlib, but the export is not one of them,
I think John's gripe about plotting was from a day gone by: John Hunter said: "PS: it's been a while since I looked at that matlab cookbook I was working on. I find the following sections of the matlab PDF linked above fun in a historical light::" --bb
I don't know the matplotlib history that well, but an educated guess is that matplotlib wasn't written at that time (1 October 2003), or at least it was in a very early stage.
Otto
_______________________________________________ SciPy-user mailing list SciPy-user@scipy.org http://projects.scipy.org/mailman/listinfo/scipy-user
"Otto" == Otto Tronarp <otto@tronarp.se> writes:
Otto> I don't know the matplotlib history that well but an educated guess is that matplotlib wasn't written at that time (1 October 2003), or at least it was in a very early stage.

Yes, exactly, I abandoned my matlab cookbook project to solve the problems addressed in that note and wrote matplotlib.

JDH
Gael Varoquaux wrote:
John,
[snippage]
Maybe python can seem harder because it can be used at higher levels. I still think that it is lacking an IDE where you can "click and start typing".
But that's exactly my point. For the most common sort of calculations one does as an undergraduate, using Python is exactly like using Matlab, with trivial differences. Here's a little homework problem - calculating the flow velocity in a packed column. This requires the solution of a quadratic equation:

Click on Idle. Then start typing:

    Python 2.5 (r25:51908, Sep 19 2006, 09:52:17) [MSC v.1310 32 bit (Intel)] on win32
    Type "copyright", "credits" or "license()" for more information.

    ****************************************************************
    Personal firewall software may warn about the connection IDLE
    makes to its subprocess using this computer's internal loopback
    interface. This connection is not visible on any external
    interface and no data is sent to or received from the Internet.
    ****************************************************************

    IDLE 1.2
    >>> from numpy import *   # Yes, I know, but ...
    >>> rho = 62.4
    >>> rhop = 160.           # Use decimal point, or use from __future ...
    >>> psi = 0.95
    >>> Dp = 0.02/12
    >>> mu = 1*6.72e-4
    >>> eps = 0.42
    >>> g = 32.174
    >>> a = 1.75*rho/(psi*Dp*eps**3)
    >>> b = 150.*mu*(1-eps)/((psi*Dp)**2*eps**3)
    >>> c = g*(rhop - rho)
    >>> print a,b,c
    14918.248001 314771.892136 3140.1824
    >>> roots([a,b,-c])
    array([-0.34783557, 0.00969792])

Velocity is 0.0097 ft/sec (the positive root; the inputs are in foot-pound-second units).
This is how the students use Matlab - the problems are longer, but the basic idea is the same. You also say: Gael: "But I use object oriented programming, importing modules, functional programming, regexp, operator overloading, and some of my colleagues have difficulties with this." Do they also have difficulty doing these in Matlab??????? (You can't even DO most of these in Matlab. Most Matlab users wouldn't even recognize the words.) (Ok, ok, ad hominem towards Matlab, I apologize.) john
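For readers on a current Python 3, john's packed-column calculation might be written roughly as below. This is only a sketch: it assumes NumPy is installed and imported as `np`, the variable names are taken from the session above, and the function name `packed_column_velocity` is made up for illustration.

```python
import numpy as np


def packed_column_velocity(rho=62.4, rhop=160.0, psi=0.95, Dp=0.02 / 12,
                           mu=1 * 6.72e-4, eps=0.42, g=32.174):
    """Return the positive root of a*v**2 + b*v - c = 0 (hypothetical helper)."""
    a = 1.75 * rho / (psi * Dp * eps**3)
    b = 150.0 * mu * (1 - eps) / ((psi * Dp) ** 2 * eps**3)
    c = g * (rhop - rho)
    # With a > 0 and c > 0 the discriminant b**2 + 4*a*c is positive,
    # so both roots are real and exactly one of them is positive.
    roots = np.roots([a, b, -c])
    return float(roots[roots > 0][0])


velocity = packed_column_velocity()
print(velocity)
```

Wrapping the calculation in a function is a design choice for reuse, not something the original session did; typing the same lines one by one at a prompt works just as well.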
I pretty much agree with Rob's points. Ease of installation and having all the basic tools packaged together (note, this doesn't mean everything) is a big deal to many users. Despite others' comments to the contrary, the fact that Python is more general than matlab will make it less seamless. Beginners can get confused over simple things like why math.add doesn't work on an array. The fact is that most of the rest of the Python world isn't array aware (and doesn't care). Matlab or IDL pretty much ensure that everything integrates with arrays. I don't think this issue will ever go away. We can minimize it, but we will never (imho) beat matlab in this respect. On the other hand, we can sell people on the fact that Python is much more useful for other things so they don't have to learn some other tool for those.

Perry

On Sep 28, 2006, at 2:13 PM, Rob Hetland wrote:
I think everybody on this list is pretty familiar with the pros of PyLab (although a good set of arguments in one place is a Good Idea).
Perhaps it would be productive for us to start a discussion of cons here on the list, and decide how to mitigate them. Here I give some of my impressions on how the novice might see PyLab.
I am teaching a class on data assimilation this semester, and I have decided to use PyLab. I had used MATLAB in the past, and I had many students who weren't familiar with how it worked and there were licensing issues. Most students could eventually figure out how to deal with the language; vectorization is always a tough thing to get. Also, they could purchase a student version of MATLAB for about $50 -- not free, but cheaper than many college textbooks.
Using PyLab, I have a different set of issues. First is installation. I recommended Enthon python to the PC users, and I helped the Mac users deal with the various distributions. Even with a number of clickable installer packages, putting python on their computers was not straightforward.
Then there is the issue with Python itself. Python is a more powerful language than MATLAB's core programming language, and students have a slightly steeper learning curve figuring all of that out. They suddenly have to deal with importing packages, zero-based indexing, many different kinds of sequences, methods, and a host of other issues. Additionally, numpy is a more powerful array tool, in my opinion, but it is also harder to learn. In the final analysis, students had similar issues working with MATLAB, but my impression is that they find MATLAB slightly easier. Also, many of them come to the class with MATLAB experience -- none of them have used python, let alone any of the scientific packages, before. In the long run, python is clearly better, but in the short term, I think MATLAB might be simpler.
Finally, I think that one of the reasons MATLAB is successful is that it includes everything together, in one place. It does some things very poorly, but I was always willing to put up with MATLAB's weaknesses to remain within a single environment. PyLab is starting to feel like that, but there are still some pretty clear boundaries between the packages, and an almost overwhelming array of choices if you want to look for them. This is good if you are a geek and want plotting package X or numeric package Y and you still want everything else to work, but bad if you are a novice and just want things to work simply and smoothly together.
Thus, the first thing that would improve PyLab usage the most, for the majority of people migrating from MATLAB, would be a cohesive, easy to install package. Like Enthon for a wide variety of platforms. I think this is a goal that could be achieved in a year or so, given how things are going now.
Also, a good tutorial would be essential. This would be less complete and less formal than actual NumPy documentation, and would include information on using all of the PyLab packages together. There is a good start to this on the scipy wiki.
Third, I think it is important for all of the packages to work together. This has certainly happened in terms of compilation -- mpl and numpy have been released in lockstep so they work together despite major changes in numpy. The Python+NumPy+Scipy+Matplotlib +IPython suite seems like a great place to start, and I think more could be done to make these tools seem like one seamless superpackage.
-Rob
---- Rob Hetland, Associate Professor Dept. of Oceanography, Texas A&M University http://pong.tamu.edu/~rob phone: 979-458-0096, fax: 979-845-6331
Quoting Perry Greenfield <perry@stsci.edu>:
Despite others' comments to the contrary, that Python is more general than matlab will make it less seamless. Beginners can get confused over simple things like why math.add doesn't work on an array. The fact is that most of the rest of the Python world isn't array aware (and don't care). Matlab or IDL pretty much ensure that everything integrates with arrays.
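The kind of confusion Perry describes can be sketched as follows. Note that the standard `math` module has no `add` function, so `math.sqrt` is used here as a stand-in; that substitution is an assumption about what was meant, not Perry's own example.

```python
import math

import numpy as np

values = np.array([1.0, 4.0, 9.0])

# The standard-library math functions expect a single scalar.
try:
    math.sqrt(values)
    math_accepts_arrays = True
except TypeError:
    math_accepts_arrays = False

# NumPy's counterpart applies the same operation element by element.
elementwise = np.sqrt(values)

print(math_accepts_arrays)
print(elementwise)
```

A beginner who has done `import math` naturally reaches for `math.sqrt` first, and the error message gives little hint that the array-aware version lives in a different module.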
This touches the subject of inconsistency, with math.add as one example. One of my "favorite" peeves is how the size (shape) is given to functions that generate matrices. In some functions the size is given as a tuple, in others the sizes of the different dimensions are given as separate arguments. Here are some examples:

To create an array with shape M, N you do:

    zeros((M, N))
    ones((M, N))
    rand(M, N)
    eye(M, N)

I'm sure more examples exist. Is it only me that finds that incredibly irritating?

Otto
On 9/29/06, Otto Tronarp <otto@tronarp.se> wrote:
This touches the subject of inconsistency, with math.add as one example. One of my "favorite" peeves is how the size (shape) is given to functions that generate matrices. In some functions the size is given as a tuple, in others the sizes of the different dimensions are given as separate arguments. Here are some examples:
To create an array with shape M, N you do:

    zeros((M, N))
    ones((M, N))
    rand(M, N)
    eye(M, N)

I'm sure more examples exist.
Yep, repmat(A,M,N) is another one. http://projects.scipy.org/scipy/numpy/ticket/292 But I think you'll find that rand at least is no longer there. Instead you're supposed to use random.random((M,N)) eye(M,N) is considered to be ok, because 1-D eye is pretty useless, and it's not clear what you'd want out of N-D identity matrix. There has been a lot of discussion about trying to unify on shape args always being tuples. --bb
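For reference, the two calling conventions discussed above can be put side by side. The sketch below reflects the functions as they exist in a current NumPy as far as I know, not necessarily the state of the library in 2006; `M` and `N` are arbitrary example sizes.

```python
import numpy as np

M, N = 2, 3

a = np.zeros((M, N))          # shape passed as one tuple
b = np.ones((M, N))           # shape passed as one tuple
c = np.eye(M, N)              # rows and columns as separate arguments
d = np.random.random((M, N))  # shape passed as one tuple
e = np.random.rand(M, N)      # dimensions as separate arguments

shapes = [arr.shape for arr in (a, b, c, d, e)]
print(shapes)
```

All five calls give an array of the same shape, which is exactly why the differing argument styles are easy to trip over.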
Quoting Bill Baxter <wbaxter@gmail.com>:
Yep, repmat(A,M,N) is another one. http://projects.scipy.org/scipy/numpy/ticket/292
But I think you'll find that rand at least is no longer there. Instead you're supposed to use random.random((M,N))
eye(M,N) is considered to be ok, because 1-D eye is pretty useless, and it's not clear what you'd want out of N-D identity matrix.
That is a valid point, but I would still argue for unified use of shape args. Assume that I have a function that does some calculations on a matrix and adds another like this:

    def foo(A, mxCreateFunc):
        # do some stuff with A
        return A + mxCreateFunc(A.shape)

If we had unified shape args I could do

    foo(A, random)  # Add some noise
    foo(A, zeros)   # Don't add noise
    foo(A, eye)     # ??

I must admit that I don't have any real world use cases for eye at hand, but I'm sure they are out there...

Otto
There has been a lot of discussion about trying to unify on shape args always being tuples.
--bb
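Otto's foo example can be made runnable along these lines. This is a sketch under assumptions: `make_matrix` is a renamed stand-in for his `mxCreateFunc`, and the small lambda that adapts `eye` to a shape tuple is my own illustration of how one might bridge the two conventions, not an existing NumPy facility.

```python
import numpy as np


def foo(A, make_matrix):
    """Add a matrix built by make_matrix, which must accept a shape tuple."""
    # ... do some stuff with A ...
    return A + make_matrix(A.shape)


A = np.ones((2, 2))

unchanged = foo(A, np.zeros)      # don't add noise
noisy = foo(A, np.random.random)  # add uniform noise drawn from [0, 1)

# eye takes separate arguments, so it needs a small adapter.
shifted = foo(A, lambda shape: np.eye(*shape))

print(unchanged)
print(shifted)
```

The need for the adapter on the last call is the inconsistency Otto is pointing at: every creation function that takes a tuple can be passed straight in, and the ones that do not cannot.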
On 9/29/06, Otto Tronarp <otto@tronarp.se> wrote:
Quoting Bill Baxter <wbaxter@gmail.com>:
eye(M,N) is considered to be ok, because 1-D eye is pretty useless, and it's not clear what you'd want out of N-D identity matrix.
That is a valid point, but I would still argue for unified use of shape args. Assume that I have a function that does some calculations on a matrix and adds another like this:
Yeh, I don't disagree with you. The other argument is that eye((3,3)) is more typing for interactive sessions, and seems weird to people coming from matlab. Why the extra parens? Matlab generally accepts either form for functions like that. Personally I wrote my own versions of eye(), rand(), zeros(), ones(), and empty() that work either with tuples or separate args. I think that's the best solution, but some folks here cringe at the thought of overloading Python functions in that way. It's true that overloading is not very pretty in Python, but the users don't have to look at the code in the library. :-) As long as it looks simple from the outside. (At least that's the C++ motto). --bb
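Bill's own wrappers are not shown in the thread, so the following is only a guess at what such "either form" functions could look like. The helper `_as_shape` and the wrapper bodies are hypothetical; only the underlying `np.zeros`, `np.ones`, `np.random.random` and `np.eye` calls are real NumPy functions.

```python
import numpy as np


def _as_shape(args):
    """Turn (M, N) or ((M, N),) into a shape tuple (hypothetical helper)."""
    if len(args) == 1 and isinstance(args[0], (tuple, list)):
        return tuple(args[0])
    return tuple(args)


def zeros(*args):
    return np.zeros(_as_shape(args))


def ones(*args):
    return np.ones(_as_shape(args))


def rand(*args):
    return np.random.random(_as_shape(args))


def eye(*args):
    # np.eye wants separate arguments, so unpack the normalized shape.
    return np.eye(*_as_shape(args))


print(zeros(2, 3).shape, zeros((2, 3)).shape)
print(eye(3).shape, eye((3, 3)).shape)
```

The cost of this convenience is that keyword arguments such as `dtype` are not forwarded in this sketch, which hints at why some people prefer a single explicit convention.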
I went back to the beginning of the thread, to find out what I was actually talking about. I interpreted the question to mean using Python vs Matlab the way Matlab is commonly used by students. I now see that this is too restrictive, but still, I think it's representative of a large class of users. So what, exactly, is the question? What sort of user do we mean? Somebody who has used other "array environments" would have no difficulty switching to Python. Someone who is completely new to computer computation would seem to me to be unlikely to use any advanced features of the language. Matlab has some specialized (as in "expensive") toolboxes for special problems; do we mean these? I'm familiar with the controls toolbox, and by omission, with the optimization toolbox. Neither has anything that an undergraduate student would use that isn't also in SciPy. I don't know anything about any of the other toolboxes. As an aside, I use Python with Jedit. It serves as a perfectly usable combination, at least as convenient as Matlab with its built-in editor. (I've used Scite and PSPad, too, but I personally like Jedit better.) I've got it set up so that I save and hit F5 to run ... as in Matlab.

john
On 9/28/06, Travis Oliphant <oliphant.travis@ieee.org> wrote:
The purpose is not to go into detail about semantic differences, but document higher-level differences that might help somebody decide whether or not they could use NumPy instead of some other environment. I've started with a comparison to MATLAB, based on an email response I sent to a friend earlier today.
I think this is an important resource to have around. In the context of choosing an array environment for a project where you need speed it's important that python's advantages are made clear and easy to communicate. That way we all get to use python :)
The numpy for matlab users page ( http://www.scipy.org/NumPy_for_Matlab_Users ) also lists a number of pros and cons. So far, I find the biggest cons to numpy to be

1) integration of plotting is not as good as matlab. You have to be careful about calling "show()" in matplotlib because of event-loop integration issues. Also no good 3D plotting solution. MayaVi is supposed to be good, but it would be better if it were all just built into matplotlib.

2) integration of debugging is not as good as matlab. In matlab when you stop at a breakpoint in your code, you get an interactive console where you can probe current values in your program, or create new ones etc. The Wing IDE has this, but I couldn't find any open source IDEs that did this.

--bb
Bill Baxter wrote:
The numpy for matlab users page ( http://www.scipy.org/NumPy_for_Matlab_Users ) also list a number of pros and cons.
2) integration of debugging is not as good as matlab. In matlab when you stop at a breakpoint in your code, you get an interactive console where you can probe current values in your program, or create new ones etc. The Wing IDE has this, but I couldn't find any open source IDEs that did this.
--bb
You might try the Eric IDE: http://www.die-offenbachs.de/detlev/eric3.html The GUI is a little complicated/full, but now that I've started to get the hang of it, I really like it. Instead of a console, it gives you a list of locals and globals while in debug mode. Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma rmay@rossby.ou.edu
On Thu, 28 Sep 2006, Ryan May wrote:
2) integration of debugging is not as good as matlab. In matlab when you stop at a breakpoint in your code, you get an interactive console where you can probe current values in your program, or create new ones etc. The Wing IDE has this, but I couldn't find any open source IDEs that did this.
While winpdb is not a full-blown IDE, it is a gui front end to pdb. It does a very good job of showing you what's going on, and you can examine the value of variables and do other explorations at breakpoints or while stepping over (or through) your code.

Rich
--
Richard B. Shepard, Ph.D.               |    The Environmental Permitting
Applied Ecosystem Services, Inc.(TM)    |            Accelerator
<http://www.appl-ecosys.com>     Voice: 503-667-4517      Fax: 503-667-8863
I checked out winpdb, but I didn't see any debug console the likes of Matlab or Wing IDE. I recall that it also integrates pretty well with Stani's Python Editor (SPE) to make something very close to an IDE, but I couldn't find the equivalent of the debug console I was looking for in that combo. Looking here: http://www.digitalpeers.com/pythondebugger/images/screenshot_winpdb.png there is a big 'console' window, but it looks like that 'console' is a console for the debugger, rather than a regular python shell, like a GUI version of what ipython gives you. That means you issue _debugger_ commands at the prompt, rather than plain old python expressions. IIRC, one of those debugger commands is indeed "eval a python expression", but that's an unnecessary layer of inconvenience. If there is a way to get a real python shell executing in the context of the program being currently debugged with ipython or with winpdb, I would certainly like to know about it, and would like to advertise that ability on the numpy for matlab users page. I was really quite surprised to not find any debuggers or IDEs that had this very useful feature of matlab.

--bb
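One standard-library route to something like the console Bill asks for is the `code` module, which can open a plain Python prompt over any namespace. The sketch below is an illustration under assumptions, not a description of what winpdb or ipython do: the function `running_total` is made up, and `push()` is used as a non-interactive stand-in for typing at the prompt so that the example runs unattended.

```python
import code


def running_total(values):
    total = 0
    for v in values:
        total += v
    # Calling code.interact(local=locals()) here would open an ordinary
    # Python prompt that can see values, total and v.
    return total, dict(locals())


total, frame_vars = running_total([1, 2, 3])

# Stand-in for typing at that prompt: push() runs one line of ordinary
# Python source inside the given namespace.
console = code.InteractiveConsole(locals=frame_vars)
console.push("mean = total / len(values)")

print(total)
print(frame_vars["mean"])
```

One limitation worth knowing: the prompt works on a copy of the function's local variables, so new names can be created and values inspected freely, but rebinding a name there does not generally change the variable in the running function.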
On 9/28/06, Bill Baxter <wbaxter@gmail.com> wrote:
The numpy for matlab users page ( http://www.scipy.org/NumPy_for_Matlab_Users ) also list a number of pros and cons.
So far, I find the biggest cons to numpy to be 1) integration of plotting is not as good as matlab. You have to be careful about calling "show()" in matplotlib because of event-loop integration issues. Also no good 3D plotting solution. MayaVi is supposed to be good, but it would be better if it were all just built into matplotlib. 2) integration of debugging is not as good as matlab. In matlab when you stop at a breakpoint in your code, you get an interactive console where you can probe current values in your program, or create new ones etc. The Wing IDE has this, but I couldn't find any open source IDEs that did this.
You may want to try ipython. It's a console program, not an IDE, but it does both of the above (no 3d plotting, just integrating 'intelligently' with mpl). I'll be happy to provide you with further details if you have questions. Cheers, f
On 9/29/06, Fernando Perez <fperez.net@gmail.com> wrote:
You may want to try ipython. It's a console program, not an IDE, but it does both of the above (no 3d plotting, just integrating 'intelligently' with mpl). I'll be happy to provide you with further details if you have questions.
Hey Fernando. I actually broke down and started using ipython instead of pyCrust recently, despite my dislike for being stuck in the lame Windows console. It is a great shell, (love the ? and "func arg" --> "func(arg)" features). Just sad that it's locked into text mode. Hopefully the ipython1 project will keep moving along so we can have ipython in a GUI before long. As for its debug console ability, I wasn't aware of that. I knew you could have it trigger the debugger at a breakpoint, but then you drop into a prompt with debugger syntax rather than normal python syntax, right? So you have to prefix every command with something. Is there some other mode that I'm not aware of that gives you a regular console? No gui also means setting breakpoints by line number or function name, no? --bb
On 9/28/06, Bill Baxter <wbaxter@gmail.com> wrote:
Hey Fernando. I actually broke down and started using ipython instead of pyCrust recently, despite my dislike for being stuck in the lame Windows console. It is a great shell, (love the ? and "func arg" --> "func(arg)" features). Just sad that it's locked into text mode. Hopefully the ipython1 project will keep moving along so we can have ipython in a GUI before long.
It's moving along...
As for its debug console ability, I wasn't aware of that. I knew you could have it trigger the debugger at a breakpoint, but then you drop into a prompt with debugger syntax rather than normal python syntax, right? So you have to prefix every command with something. Is there some other mode that I'm not aware of that gives you a regular console?
Well, the ipdb console is a primitive python console, but any single-line expression which is valid python will be directly evaluated. In addition, it has a few extra commands (type help to see them). I happen to find it quite satisfactory for most of my needs, but I'm sure better could be done.
No gui also means setting breakpoints by line number or function name, no?
Yes, that's certainly true. pdb is fairly gdb-like in that respect. I'm certainly /not/ claiming ipython to be an IDE, simply that it does have some useful features. For a certain class of users, probably those who prefer the emacs/vi/favorite editor + terminal combination to an IDE, it seems to do the trick quite nicely. And yes, things are moving along to make sure that it's even better in the future, with GUI integration, notebook-type environments and lots more. It's slowly but surely coming together. Cheers, f
Bill Baxter wrote:
Hey Fernando. I actually broke down and started using ipython instead of pyCrust recently, despite my dislike for being stuck in the lame Windows console. It is a great shell, (love the ? and "func arg" --> "func(arg)" features).
Every time I am stuck on Windows, I have the same problem. I don't know if it is doable, but you may want to see if ipython can work inside Console (I don't know how the console application is separated from the shell, or whether pyreadline can be used inside something other than cmd.exe): http://sourceforge.net/projects/console/ At least you can change the font, have tabbed consoles, and change the size of the window. David
On 9/29/06, David Cournapeau <david@ar.media.kyoto-u.ac.jp> wrote:
Every time I am stuck on Windows, I have the same problem. I don't know if it is doable, but you may want to see if ipython can work inside Console (I don't know how the console application is separated from the shell, or whether pyreadline can be used inside something other than cmd.exe):
http://sourceforge.net/projects/console/
At least, you can change the font, have tab consoles, and change the size of the window.
Just yesterday it was reported on the ipython list that it works like a charm. Ville Vainio (the trunk maintainer) was very pleased with it, after another user mentioned it on the list. It sounds like a good alternative for those who are forced to use Windows for one reason or another, and would like a sane terminal to work in. Cheers, f
Bill Baxter wrote:
The numpy for matlab users page ( http://www.scipy.org/NumPy_for_Matlab_Users ) also list a number of pros and cons.
So far, I find the biggest cons to numpy to be 1) integration of plotting is not as good as matlab. You have to be careful about calling "show()" in matplotlib because of event-loop integration issues. Also no good 3D plotting solution. MayaVi is supposed to be good, but it would be better if it were all just built into matplotlib. 2) integration of debugging is not as good as matlab. In matlab when you stop at a breakpoint in your code, you get an interactive console where you can probe current values in your program, or create new ones etc. The Wing IDE has this, but I couldn't find any open source IDEs that did this.
Concerning point 1, matplotlib is better than matlab for some things, but much worse for others (it may also be that I am just clueless). For example, interactive plotting is not great with matplotlib (zooming and so on), and redrawing is much slower (but I think this is a consequence of the flexibility of matplotlib). As someone said before, matlab makes it easier for beginners; but once you hit the wall, you hit it very hard :) I have been using scipy for a few months now, after several years of matlab, which I consider myself at least moderately knowledgeable about, and there is no going back for me. After 2 weeks, I was more or less as efficient in numpy as I was in matlab.

The things I consider much better in matlab are:
- interactive plots
- profiling -> this is the thing I am missing the most. If one of my modules is slow, I find it really hard to find where the problems are compared to matlab.
- size of community: in machine learning and signal processing, matlab is kind of pervasive. In machine learning, particularly, you have tons of software available on the internet for free.

Something nobody has mentioned before as a pro for numpy is integration with C code: this is really clumsy in matlab, and it works great with python (ctypes, swig, boost, pyrex are all great tools for that, depending on what you want to do).

David
Something nobody has mentioned before for pros for numpy is integration with C code: this is really clumsy in matlab, and it works great with python (ctypes, swig, boost, pyrex are all great tools for that, depending on what you want to do).
It is on the web page that started the discussion: "very easy to extend in C/C++ or Fortran" But I agree. It's nicer with python/numpy. Both extending with C/C++ and embedding in C/C++. I think I would say "easy" rather than "very easy", but that's quibbling... :-) --bb
Given the amount of discussion here, I'm starting to think maybe each of the lines on the comparison page (http://www.scipy.org/NumPyProConPage) should be a link to a whole page of discussion just about that particular issue. --bb
David Cournapeau wrote:
The things I consider much better in matlab are:
- profiling -> this is the thing I am missing the most. If one of my module is slow, I find it really hard to find where the problems are compared to matlab.
Hi David, I've never found profiling a problem, especially using prun in ipython. Have you tried this? Gary
Gary Ruben wrote:
Hi David, I've never found profiling a problem, especially using prun in ipython. Have you tried this?

I didn't know prun, but it looks like it is doing some profiling with hotshot. For example, on one of my packages, I get:
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      300   48.510    0.162   48.510    0.162 densities.py:109(_diag_gauss_den)
       10    7.500    0.750   59.930    5.993 gmm_em.py:122(sufficient_statistics)
       10    6.470    0.647   13.820    1.382 gmm_em.py:143(update_em)
     1050    6.310    0.006    6.310    0.006 :0(dot)
       10    3.150    0.315   52.070    5.207 gmm_em.py:233(multiple_gauss_den)
       51    1.740    0.034    1.740    0.034 :0(sum)
      600    1.510    0.003    1.510    0.003 :0(where)
        5    0.810    0.162    0.810    0.162 :0(double_vq)
        1    0.650    0.650    2.030    2.030 kmean.py:47(kmean)
        1    0.580    0.580    3.780    3.780 gmm_em.py:62(init_kmean)
   etc...

Basically, I know that sufficient_statistics is the culprit, and I know it is because of _diag_gauss_densities (this last point can only be known if you read the code, though, but I know this code quite well, as it is mine :) ). But now, how can I optimize this function? dot and sum are called everywhere in the code, so I don't know which calls are expensive where (not all calls are made with the same args, for example). So in my experience, this is not enough.

In matlab, once you do profiling, you can generate a really nice report in the form of one html file, and it gives you the time taken by all your code, per line (for the lines which matter). For people not familiar with matlab, I put an example here: http://www.ar.media.kyoto-u.ac.jp/members/david/profile_results/ (this is the new version of matlab we have at the lab; I am not familiar with it, it is much fancier than what I need and what used to be in older matlab versions, but this should give you an idea).

You first have an index of all top-level functions, and you can dig through it as deep as you want. Notice how you know, for a given function, which calls are made when and how often. I have no idea how difficult this would be to implement in python. I was told some months ago on the main python list that hotshot can give per-line profiling of python code, but this is not documented; also, it looks like it is possible to get the source code at runtime without too much difficulty in python.

I would be really surprised if nobody had tried to do something similar for python in general, because this is really useful. I have never found anything for python, but it may just be because I don't know the name for this kind of tool (I tried googling with terms such as "source profiling", without much success).

David
One excellent tool for drilling through these results is a KDE application called kcachegrind. It was written to visualize valgrind profiling results, but the file format is generic enough that someone wrote a script, hotshot2calltree, that converts hotshot results to it. I believe it comes with kcachegrind.

http://kcachegrind.sourceforge.net/cgi-bin/show.cgi

There is a new profiler that comes with 2.5 (but I believe is compatible with at least 2.4) called cProfile (available separately as lsprof). It too has a converter for kcachegrind.

http://codespeak.net/svn/user/arigo/hack/misc/lsprof/
http://www.gnome.org/~johan/lsprofcalltree.py

-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
Thanks for the tip: it looks like lsprof can give you the information per child function, which is the main weakness of previous python profilers IMHO. I cannot make it work right now on python 2.4; I will try at home with python 2.5.

David
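To make the "which caller is responsible for the expensive dot calls" question concrete, here is a minimal sketch using cProfile together with pstats from the standard library. The functions `dot`, `cheap_step` and `expensive_step` are hypothetical stand-ins written for this example, not the gmm_em code from the thread, and the syntax is that of current Python rather than 2.4/2.5.

```python
import cProfile
import io
import pstats


def dot(a, b):
    """Pure-Python stand-in for an array dot product."""
    return sum(x * y for x, y in zip(a, b))


def cheap_step(vec):
    # One short dot product per call.
    return dot(vec[:10], vec[:10])


def expensive_step(vec):
    # Fifty full-length dot products per call.
    total = 0
    for _ in range(50):
        total += dot(vec, vec)
    return total


def run():
    vec = list(range(2000))
    return cheap_step(vec) + expensive_step(vec)


# Collect profiling data for a single call to run().
profiler = cProfile.Profile()
profiler.enable()
result = run()
profiler.disable()

# Ask pstats who called dot(); the regular expression restricts the
# report to the function literally named "dot".
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative")
stats.print_callers(r"\(dot\)")
report = buffer.getvalue()
print(report)
```

The report lists each caller of `dot` on its own row with its call count and time, so a function that is used everywhere can still be attributed to the call site that dominates. This is per-caller information, not the per-line HTML report described for matlab.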
Hi Bill, If you're using Windows, the FOSS PyScripter IDE does this. The debugger gives you watches, conditional breakpoints, tooltip hover values etc. Unfortunately it's not available for Linux. The latest release version is here: <http://mmm-experts.com/Products.aspx?ProductID=4> The latest beta is available here: <http://www.optimizeddecisions.com/anonymous/> I use SciTE, ipython and PyScripter, but use PyScripter for most of my serious debugging. Gary R. Bill Baxter wrote:
2) integration of debugging is not as good as matlab. In matlab when you stop at a breakpoint in your code, you get an interactive console where you can probe current values in your program, or create new ones etc. The Wing IDE has this, but I couldn't find any open source IDEs that did this.
--bb
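For readers without an IDE, the standard-library debugger pdb gives a plain-terminal version of the "stop at a breakpoint and probe values" workflow described above. The following is a minimal sketch; `running_mean` and its `debug` flag are hypothetical names made up for this example.

```python
import pdb


def running_mean(values, debug=False):
    """Return the running mean of a sequence of numbers."""
    total = 0.0
    means = []
    for count, value in enumerate(values, start=1):
        total += value
        if debug:
            # Execution pauses here and a (Pdb) prompt opens in the terminal.
            pdb.set_trace()
        means.append(total / count)
    return means


# With debug=False the function runs straight through.
print(running_mean([2, 4, 6]))
```

Calling `running_mean([2, 4, 6], debug=True)` from a terminal pauses inside the loop; at the `(Pdb)` prompt, `p total` prints a variable, `!total = 0.0` assigns a new value, and `c` continues. This is not integrated with plotting or an editor the way the matlab console is.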
I'm a lurker on this list, but this thread piqued my interest. I am a hydrographer (coastal mapping), not a computer scientist, and I don't have much training in numerical computation (I did take the token applied math class in Matlab, of course). So, my perspective on numeric Python is as an end user in a production environment. To me (and most of the people I support), data analysis environments like Matlab are black boxes. Maybe I am not the numeric Python target audience.

We depend extensively on Matlab to do data analysis and plotting on our team. The vast majority of the scientists I work with struggle with programming, and hand coding an FFT in FORTRAN would be impossible (for example). Matlab, or something like it, is a necessary tool. Why not numeric Python?

In a nutshell, it still looks like alpha software compared to Matlab. Documentation is not ready for end users (and not professionally published). Some of the numeric libraries have been around for ages, but that only adds to the confusion, because there are numerous packages spread all over the Internet with a chain of dependencies that adds still more confusion. It still looks like a patchwork quilt rather than an organized system. Finally, most production environments outside of academia use Windows or maybe OSX. That means that (a) there is no compiler, so it's batteries included or it's dead; and (b) there is a real need for an integrated environment, because Emacs doesn't come installed on Windows.

Python itself is making great headway on Windows at least. In my field (mapping), the big commercial vendor included Python as its macro language, so there has been an explosion of interest in python scripting. The recent release of IronPython for .NET was fully supported by Microsoft, and there is every reason to believe that it will be a popular way to script and control the .NET framework. So, I think that average engineers and scientists are aware of Python the language and would be receptive to a data analysis package in Python so long as it was polished and well done. For example, the R statistical language has done a good job of packaging up R for Windows (I wish it were as well integrated into Gnome).

I am not trying to take pot shots at numeric python here, or FOSS, or Linux. I use all of these personally. I just can't convince myself that this is a safe recommendation for the folks I support.

-- David Finlayson
G'day David: I believe the weaknesses you list are well understood and are being addressed. I don't think it fair to say that SciPy looks like alpha software. It may not have a slick interface suitable for the most naive users, but it is quite powerful. You'll have to decide about the "safety" of recommending Python/NumPy/SciPy for your organization. I introduced Python into an engineering organization about six years ago as a scripting language for a piece of equipment we were developing and am just beginning to introduce NumPy. It's been a bumpy road, but Python has gained wider adoption than I expected. Many of our engineers can write an FFT in C or assembly, but balked at Python. Once they really used it, most grew to like it and only drop down to lower level code when necessary. I'd suggest looking for a niche requirement that SciPy can fill as a starting point. If it does well there, you can expand to other niches. If not, the risk should be minimal. Regards, Steve
-- Steven H. Rogers, Ph.D., steve@shrogers.com Weblog: http://shrogers.com/weblog "He who refuses to do arithmetic is doomed to talk nonsense." -- John McCarthy
Hi,

Although I am addicted to Python and a frequent user of Numpy/Scipy, I don't want to go into the advantages of Python/Scipy/Numpy, as this has been very well addressed in many previous posts. But I would like to stress some issues that I notice, and hear about more and more from other people, and that I believe may scare some potential users away from Scipy/Numpy.

I think we can draw parallels with many other situations, LaTeX/Word for example. I've got used to the "not so clear and direct" way of writing a 10-line letter in LaTeX, but I also got used to the very high quality of the output it produces. But there was a time when getting all the TeX packages to work together, setting up the outputs and so on was a long-term project, and required some head scratching. Quite possibly I would never have endured that if all I ever had to do was write 10-line letters. I did it because what was otherwise available did not meet the requirements of the project.

I am not saying that Numpy/Scipy is at that same stage, but, for example, only recently did we see an effort to converge on one array package (not to mention that an array type also exists in the standard python distribution, although it is different). There are other small quirks, which are sometimes a distribution's fault, like matplotlib failing to compile on Suse 10. These are minor problems once one realizes all the potential of Python/Numpy/Scipy, but overcoming, understanding and solving them may require a lot of motivation from the first-time user, and can scare away the "prospector" type of user, the one thinking "let me try this to see what I can do with it". The problem is that this creates a first impression of broken or alpha software that will not go away easily...

It's good to see that most of this is being addressed, but as someone pointed out, it still requires some travel to different sites and maybe some fiddling to get everything working together. Having one big package may not be feasible due to distribution restrictions, but a good improvement may be to work with the distributions to have rpms, debs and the like ready to use, even during the development stage.

Cheers, Marco Leite
David Finlayson wrote:
In a nutshell, it still looks like alpha software compared to Matlab. Documentation is not ready for end users (and not professionally published). Some of the numeric libraries have been around for ages, but that only adds to the confusion because there are numerous packages spread all over the Internet with a chain of dependencies that adds still more confusion. It still looks like a patchwork quilt rather than an organized system. Finally, most production environments outside of academia use Windows or maybe OSX. That means that (a) there is no compiler, it's batteries included or it's dead;
http://code.enthought.com/enthon/
and (b) there is a real need for an integrated environment because Emacs doesn't come installed on Windows.
The next release of Enthon will include the SPE IDE. http://pythonide.stani.be/ -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
How about a comparison of pylab and opencv? Would this be appropriate?
Not really, opencv is really quite specific for image analysis isn't it? Actually, if anyone has successfully compiled it on win32/python24, would you please consider sharing it ;') -jelle
participants (23): A. M. Archibald, Bill Baxter, David Cournapeau, David Finlayson, Fabian Braennstroem, Fernando Perez, Gael Varoquaux, Gary Ruben, Jelle Feringa / EZCT Architecture & Design Research, John Hassler, John Hunter, Marco Leite, Otto Tronarp, Perry Greenfield, Rich Shepard, Rob Hetland, Robert Clewley, Robert Kern, Ryan May, stephen emslie, Steven H. Rogers, Travis Oliphant, William Hunter