SciPy versus Matlab, Excel, and other tools

I asked a number of friends involved in science on a daily basis what tools they use day to day and for publication. The top answer for doing analysis and graphs seems to be Excel. Presentations are done in PowerPoint and then projected rather than printed, and publications are done in PDF format. One person was using Calcusyn for statistical analysis. Matlab seemed to be the most popular tool after Excel.

Excel can be manipulated via COM (and possibly AppleEvents), so it can be scripted from Python, but that would be a Windows and Mac solution. Is the big issue here something that will run under Unix? I'm assuming that the costs of Excel and Matlab are prohibitive for some people. Matlab in particular looks pretty expensive for an individual. Can Matlab be scripted from Python?

Of course, Excel doesn't have most of the libraries that SciPy has, so my investigations may have been too focused on the visualization aspect. Again, I'm not a scientific computing user, so I'm just trying to better understand the SciPy goals, so I can understand how SciPy relates to PythonCard.

Thanks,

ka
---
Kevin Altis
altis@semi-retired.com

Kevin, All,

just stumbled over this. Maybe it still helps. (Not speaking for "scipy.org" by any means.)

Kevin> I asked a number of friends involved in science on a daily
Kevin> basis what tools they use day to day and for publication. The
Kevin> top answer for doing analysis and graphs seems to be Excel and
Kevin> presentations are done in PowerPoint which are then projected
Kevin> rather than printed and then publications are done in PDF
Kevin> format.

Excel? What kind of "scientist" did you ask? I don't know any serious scientist who uses Excel for more than calculating a mean or getting a histogram. (Btw, these people also publish standard deviations for 2-value means... Well, it's not wrong, I guess)

Try plotting a dataset (simple x,y will do) with >1e6 points in Excel. Last time I tried (ok, years ago) the limit was 2**14:( I know the limit is bigger nowadays, but generated data sets are so as well:O

Not considering that Excel calculates "wrong" (by design(?))! There are papers in the scientific literature that test Excel's "numeric-engine"; Microsoft doesn't consider it worthwhile fixing "bugs" (that is, wrong numerical calculation results) that were pointed out half a decade (or more) ago. (Yes, I have read some of these papers. Sorry, I don't keep references of that stuff, but searching any reasonable literature database should provide you with the papers I am talking about. I do know a former colleague who has hardcopies of some of these papers, so I could contact him to get references if really needed.)

Origin, SigmaPlot, and such are used by many to plot (mostly 2d) data and do some analysis (low-scale usually, although these programs promise to do a lot for you nowadays). I actually use Origin myself all the time to prepare presentation graphics. If you have repeating tasks it may still be faster to write a specific solution yourself, though.

Kevin> Again, I'm not a scientific computing user, so I'm just trying
Kevin> to better understand the SciPy goals

I don't know what the goals of Eric, Travis, and Co. were when they started scipy. I think it is a really useful programming library for me. I use stuff like LAPACK or FFTW all the time writing data analysis software. Now I get that stuff within one package for my little python scripts, which do most of my day2day job (well, I haven't gotten them to *produce* the data I actually need, yet).

And I can tell you it is faster to write a small python script to do some analysis instead of loading 50 files into any of these ready-made programs, clicking through the menus for minutes, rearranging intermediate results from one function to another, ...

Yes, scripting of these apps should be a solution here. I never figured that out because it changes with every version, so I start back from scratch again.... Never had the passion to get that working. python worked right from the first line... Ok, knowing C/C++, Fortran, Pascal, Lisp, HTML, LaTeX, bash, whatever... You know what? These "languages" are actually designed to be useful, not put around a program designed to be colorful and menu-driven as its main goals.

I hope nobody (besides maybe Excel designers) feels offended, that is not my intention at all. If you are happy with Excel, good for you! If you are *sure* you don't have problems with precision, stick with it. Nothing is easier and faster than what you know. (Unless you know something better:)

Greetings,
Jochen
--
Einigkeit und Recht und Freiheit    http://www.Jochen-Kuepper.de
Liberté, Égalité, Fraternité        GnuPG key: 44BCCD8E
Sex, drugs and rock-n-roll
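As an illustration of the "small python script instead of loading 50 files by hand" point, here is a minimal sketch. It is not taken from the thread: the file naming (`run_*.dat`), the two-column layout, and the helper names `summarize`, `summarize_all`, and `write_demo_files` are all assumptions made up for the example, and it uses today's NumPy rather than the 2001 SciPy API. The demo data are synthetic so the script runs without real measurements.

```python
import glob
import os
import tempfile

import numpy as np


def summarize(path):
    """Load one two-column (x, y) text file and return mean, std, and FFT peak bin."""
    x, y = np.loadtxt(path, unpack=True)
    # Remove the mean so that bin 0 does not dominate the spectrum.
    spectrum = np.abs(np.fft.rfft(y - y.mean()))
    peak_bin = int(np.argmax(spectrum))
    return y.mean(), y.std(ddof=1), peak_bin


def summarize_all(pattern):
    """Apply summarize() to every file matching a glob pattern, in sorted order."""
    return {os.path.basename(p): summarize(p) for p in sorted(glob.glob(pattern))}


def write_demo_files(directory, n_files=50, n_points=256):
    """Hypothetical stand-in for real data: file k holds offset k plus k sine cycles."""
    x = np.arange(n_points)
    for k in range(1, n_files + 1):
        y = k + np.sin(2 * np.pi * k * x / n_points)
        np.savetxt(os.path.join(directory, "run_%02d.dat" % k), np.column_stack([x, y]))


with tempfile.TemporaryDirectory() as tmp:
    write_demo_files(tmp)
    results = summarize_all(os.path.join(tmp, "run_*.dat"))

# Show the first few summaries: file name, mean, sample std, strongest frequency bin.
for name, (mean, std, peak_bin) in list(results.items())[:3]:
    print(name, round(mean, 3), round(std, 3), peak_bin)
```

Swapping `summarize` for whatever analysis is actually needed leaves the loop over files unchanged, which is the repetitive part that menu-driven tools make slow.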

"Jochen" == Jochen Küpper <jochen@jochen-kuepper.de> writes:
Jochen> Excel? What kind of "scientist" did you ask? I don't know
Jochen> any serious scientist who uses Excel for more than
Jochen> calculating a mean or getting a histogram. (Btw, these
Jochen> people also publish standard deviations for 2-value
Jochen> means... Well, it's not wrong, I guess)

I sadly know many excellent scientists who primarily do analyses in excel. They do take a bit of time to get the analyses out, and the analyses tend to be correct, if simplistic (many corrected t-tests rather than a reasonably single linear (regression/anova) model).

Don't judge the tool-user by the tools, however, the mark of a good one is the ability to convert when presented with evidence :-).

best,
-tony

--
A.J. Rossini Rsrch. Asst. Prof. of Biostatistics
U. of Washington Biostatistics rossini@u.washington.edu
FHCRC/SCHARP/HIV Vaccine Trials Net rossini@scharp.org
-------- (fridays are probably at Rosen) --------
FHCRC: M-W: 206-667-7025 (fax=4812)|Voicemail is pretty sketchy/use Email
UW: T-Th: 206-543-1044 (fax=3286)|Change last 4 digits of phone to FAX
Rosen: (Mullins' Lab) Fridays, and I'm unreachable except by email.
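To make the contrast between "many corrected t-tests" and a single linear model concrete, here is a hedged sketch using present-day `scipy.stats`. The group names, sample sizes, and simulated data are invented for the illustration, and Bonferroni is only one possible correction; nothing here is drawn from the analyses Tony has in mind.

```python
from itertools import combinations

import numpy as np
from scipy import stats

# Simulated data (an assumption for the example): three groups, one shifted by 2 SD.
rng = np.random.default_rng(0)
groups = {
    "control": rng.normal(10.0, 1.0, size=30),
    "dose_a": rng.normal(10.0, 1.0, size=30),
    "dose_b": rng.normal(12.0, 1.0, size=30),
}

# Spreadsheet-style approach: one two-sample t-test per pair of groups,
# each p-value multiplied by the number of comparisons (Bonferroni).
pairs = list(combinations(groups, 2))
pairwise = {}
for a, b in pairs:
    t_stat, p_value = stats.ttest_ind(groups[a], groups[b])
    pairwise[(a, b)] = min(1.0, p_value * len(pairs))

# Single-model approach: a one-way ANOVA asks once whether any group mean differs.
f_stat, p_anova = stats.f_oneway(*groups.values())

for pair, p_corrected in pairwise.items():
    print(pair, "Bonferroni-corrected p =", p_corrected)
print("one-way ANOVA: F =", f_stat, "p =", p_anova)
```

The pairwise route needs one test per pair and a correction that grows with the number of groups, while the ANOVA is a single fit; a full regression model would additionally give effect estimates.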

I'm not from the camp of "Excel bad, Command line good" or vice versa. They both have very nice qualities for different problems. Many, many people use Excel because non-programmers can do quite sophisticated analysis using it. The grid/cell function approach is very powerful. Heck, you can even set up a 2D heat diffusion problem (finite-difference) in the cells very quickly and watch the solution converge to equilibrium. Programming/scripting is also available for more sophisticated stuff via VB, COM, etc. It is a programming model that has an easy interface and yet can scale to more sophisticated needs. I don't like VB much, but the general model is good -- and something we can learn from. Graphs that you can manipulate both from the command line and by clicking on the graph are very desirable.

Of course other tools like Origin, Matlab, IDL, Maple, SciGraphica, etc. all have things we can learn from too. I'm especially interested in borrowing ideas from SciGraphica as our plotting moves forward. We could just start using SciGraphica in SciPy, but it relies on GTK, which is a show-stopper on Windows (I know it is ported, but it ain't supported or stable, and it doesn't have a native look). We can also borrow some ideas from Sping, etc.

There is much work to do improving SciPy's graphics. Besides solidifying the majority of SciPy's interface, I'd say this is second in line as the most important next task -- well, documentation and unit tests rank way up there also...

see ya,
eric

----- Original Message -----
From: "A.J. Rossini" <rossini@blindglobe.net>
To: <scipy-dev@scipy.org>
Sent: Thursday, October 25, 2001 9:06 AM
Subject: Re: [SciPy-dev] SciPy versus Matlab, Excel, and other tools
"Jochen" == Jochen Küpper <jochen@jochen-kuepper.de> writes:
Jochen> Excel? What kind of "scientist" did you ask? I don't know Jochen> any serious scientist who uses Excel for more than Jochen> calculating a mean or getting a histogram. (Btw, these Jochen> people also publish standard deviations for 2-value Jochen> means... Well, it's not wrong, I guess)
I sadly know many excellent scientists who primarily do analyses in excel. They do take a bit of time to get the analyses out, and the analyses tend to be correct, if simplistic (many corrected t-tests rather than a reasonably single linear (regression/anova) model).
Don't judge the tool-user by the tools, however, the mark of a good one is the ability to convert when presented with evidence :-).
best, -tony
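The spreadsheet heat-diffusion setup eric describes (each cell holding the average of its four neighbours, recalculated until nothing changes) corresponds to Jacobi iteration on the discrete Laplace equation. A minimal NumPy sketch follows; the grid size, the boundary temperatures, the tolerance, and the function name `relax` are assumptions chosen for the example, not anything specified in the thread.

```python
import numpy as np


def relax(grid, tol=1e-6, max_iter=100_000):
    """Jacobi iteration: replace every interior cell by the mean of its 4 neighbours.

    Boundary cells are left untouched, like fixed-value cells in a spreadsheet.
    Returns the relaxed grid and the number of iterations performed.
    """
    grid = grid.astype(float).copy()
    for iteration in range(1, max_iter + 1):
        new = grid.copy()
        new[1:-1, 1:-1] = 0.25 * (
            grid[:-2, 1:-1] + grid[2:, 1:-1] + grid[1:-1, :-2] + grid[1:-1, 2:]
        )
        change = np.max(np.abs(new - grid))
        grid = new
        if change < tol:
            return grid, iteration
    return grid, max_iter


# Assumed setup: a 20x20 plate with the top edge held at 100 and the other edges at 0.
plate = np.zeros((20, 20))
plate[0, :] = 100.0

steady, n_iter = relax(plate)
print("iterations:", n_iter)
print("temperature near the centre:", steady[10, 10])
```

In a spreadsheet the same update is a single formula copied across the interior cells with iterative recalculation switched on; the script version makes the stopping criterion explicit.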

Don't shoot the messenger. :)

My purpose in bringing up Excel was more about the possibilities of using the graphing facilities via COM, which is a Windows-only solution so it is a non-starter based on Eric's messages. I'm not really a SciPy user, I was trying to better understand what facilities SciPy needed that would benefit from collaboration with PythonCard, other graphing tools or application frameworks. I'm actually starting work on a more general framework than PythonCard, which will be specifically for wxPython. The topic was brought up yesterday on the wxPython-users mailing list.

http://aspn.activestate.com/ASPN/Mail/Message/wxPython-users/812294

Anyway, the win32com tools have an example of driving Excel with Python and COM which I've included in part below. In the ActivePython distribution (default install location), the files are:

c:\python21\win32com\tests\testMSOffice.py
c:\python21\win32com\tests\testMSOfficeEvents.py

You can also handle events. When I did some Python COM work earlier this year I found the simplest solution was to use VB for the initial tests and to browse the COM object properties and methods and then modify the syntax as necessary for Python and then work in Python once the initial tests were done. Just something to keep in mind.

The Excel COM model is documented on MSDN. Here's the Excel 2000 model

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/modcore/html/deovrMicrosoftExcel2000.asp

ka
---

def TextExcel(xl):
    xl.Visible = 0
    if xl.Visible: raise error, "Visible property is true."
    xl.Visible = 1
    if not xl.Visible: raise error, "Visible property not true."

    if int(xl.Version[0])>=8:
        xl.Workbooks.Add()
    else:
        xl.Workbooks().Add()

    xl.Range("A1:C1").Value = (1,2,3)
    xl.Range("A2:C2").Value = ('x','y','z')
    xl.Range("A3:C3").Value = ('3','2','1')

    for i in xrange(20):
        xl.Cells(i+1,i+1).Value = "Hi %d" % i

    if xl.Range("A1").Value <> "Hi 0":
        raise error, "Single cell range failed"

    if xl.Range("A1:B1").Value <> ((Unicode("Hi 0"),2),):
        raise error, "flat-horizontal cell range failed"

    if xl.Range("A1:A2").Value <> ((Unicode("Hi 0"),),(Unicode("x"),)):
        raise error, "flat-vertical cell range failed"

    if xl.Range("A1:C3").Value <> ((Unicode("Hi 0"),2,3),(Unicode("x"),Unicode("Hi 1"),Unicode("z")),(3,2,Unicode("Hi 2"))):
        raise error, "square cell range failed"

    xl.Range("A1:C3").Value =((3,2,1),("x","y","z"),(1,2,3))

    if xl.Range("A1:C3").Value <> ((3,2,1),(Unicode("x"),Unicode("y"),Unicode("z")),(1,2,3)):
        raise error, "Range was not what I set it to!"

    # test dates out with Excel
    xl.Cells(5,1).Value = "Excel time"
    xl.Cells(5,2).Formula = "=Now()"

    import time
    xl.Cells(6,1).Value = "Python time"
    xl.Cells(6,2).Value = pythoncom.MakeTime(time.time())
    xl.Cells(6,2).NumberFormat = "d/mm/yy h:mm"
    xl.Columns("A:B").EntireColumn.AutoFit()

    xl.Workbooks(1).Close(0)
    xl.Quit()

def TestAll():
    try:
        TestWord()

        print "Starting Excel for Dynamic test..."
        xl = win32com.client.dynamic.Dispatch("Excel.Application")
        TextExcel(xl)

        try:
            print "Starting Excel 8 for generated excel8.py test..."
            mod = gencache.EnsureModule("{00020813-0000-0000-C000-000000000046}", 0, 1, 2, bForDemand=1)
            xl = win32com.client.Dispatch("Excel.Application")
            TextExcel(xl)
        except ImportError:
            print "Could not import the generated Excel 97 wrapper"

        try:
            import xl5en32
            mod = gencache.EnsureModule("{00020813-0000-0000-C000-000000000046}", 9, 1, 0)
            xl = win32com.client.Dispatch("Excel.Application.5")
            print "Starting Excel 95 for makepy test..."
            TextExcel(xl)
        except ImportError:
            print "Could not import the generated Excel 95 wrapper"

    except KeyboardInterrupt:
        print "*** Interrupted MSOffice test ***"
    except:
        traceback.print_exc()

if __name__=='__main__':
    TestAll()
    CheckClean()
    pythoncom.CoUninitialize()
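The excerpt above is Python 2 era code and depends on names defined elsewhere in the test file. For readers on a current interpreter, here is a much smaller Python 3 sketch of the same idea: dispatch Excel through COM, write cells, and read them back. The helper names `fill_sheet` and `main` are made up for this example; `main` assumes Windows, an installed Excel, and the pywin32 package, and is deliberately not called automatically.

```python
def fill_sheet(xl, rows):
    """Write rows of values into a new workbook cell by cell, then read them back.

    `xl` is expected to behave like an Excel.Application COM object:
    it needs `Workbooks.Add()` and `Cells(row, col).Value` (1-based indices).
    """
    xl.Workbooks.Add()
    for r, row in enumerate(rows, start=1):
        for c, value in enumerate(row, start=1):
            xl.Cells(r, c).Value = value
    return [
        [xl.Cells(r, c).Value for c in range(1, len(row) + 1)]
        for r, row in enumerate(rows, start=1)
    ]


def main():
    # Assumption: runs only on Windows with Excel and pywin32 installed.
    import win32com.client

    xl = win32com.client.Dispatch("Excel.Application")
    xl.Visible = True
    try:
        print(fill_sheet(xl, [(1, 2, 3), ("x", "y", "z")]))
    finally:
        xl.Workbooks(1).Close(False)  # close without saving
        xl.Quit()
```

Calling `main()` on a suitable Windows machine should start Excel, fill two rows, and print what Excel hands back. Keeping `fill_sheet` independent of the import means the cell-handling logic can be exercised with a stand-in object on other platforms.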
participants (4)
- eric jones
- Jochen Küpper
- Kevin Altis
- rossini@blindglobe.net