I'm helping my wife with programming for her economics thesis, which needs to calculate a "Multiple Linear Regression" on her data. Does anyone know of any (preferably though not necessarily free) software that can do this?

I'm working in Python, but I'm not limited to it, as I have fairly free access to other languages. I'm still looking for a library written in Python, but haven't had any luck.

My second thought was Matlab, but looking over the Matlab website, I couldn't find anything like this by a name I recognize. It looks like I might be able to construct something out of a combination of Sparse Matrices and Linear Regression, or perhaps the stuff for overdetermined Linear Equations? Another option may be LAPACK routines, but I'm not familiar with those.

Does anyone here have any experience with this kind of stuff? Is there a better place to ask? I'm about ready to take a shot at writing something myself, but I'd really rather avoid this if it's been done before.

-Jasper
Jasper Phillips <jasper@peak.org> writes:
I'm still looking for a library written in Python, but haven't had any luck.
Numerical Python has all the basic stuff, but you need to read in and arrange the data yourself. All linear regression problems ultimately become least-squares problems for a system of linear equations, which can be solved using LinearAlgebra.linear_least_squares. Konrad.
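To make the least-squares reduction above concrete, here is a minimal sketch in plain Python. It is not the `LinearAlgebra.linear_least_squares` routine itself: it assumes a small, well-conditioned data set and solves the normal equations (X^T X) b = X^T y directly. The function name `fit_linear_regression` and the sample numbers are made up for illustration. A library least-squares routine (in present-day NumPy the comparable call is `numpy.linalg.lstsq`) is the better choice for real data, since the normal equations lose accuracy when predictors are nearly collinear.

```python
def fit_linear_regression(rows, targets):
    """Fit y = b0 + b1*x1 + ... + bk*xk by ordinary least squares.

    rows    -- list of observations, each a list of k predictor values
    targets -- list of observed y values, one per observation
    Returns [b0, b1, ..., bk]. Hypothetical helper, for illustration only.
    """
    # Design matrix: prepend a column of ones for the intercept.
    X = [[1.0] + [float(v) for v in row] for row in rows]
    n = len(X[0])

    # Augmented normal equations [X^T X | X^T y].
    A = []
    for i in range(n):
        lhs = [sum(r[i] * r[j] for r in X) for j in range(n)]
        rhs = sum(r[i] * y for r, y in zip(X, targets))
        A.append(lhs + [rhs])

    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictors are linearly dependent")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n + 1):
                A[r][c] -= factor * A[col][c]

    # Back substitution.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(A[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (A[i][n] - known) / A[i][i]
    return coeffs


# Made-up data: two predictors per observation, one response each.
observations = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]]
responses = [9, 8, 19, 18, 26]
print(fit_linear_regression(observations, responses))
```

Reading in and arranging the thesis data, as noted above, is still up to the caller: each observation becomes one row of predictors and one response value.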
On Mon, Apr 29, 2002 at 03:13:44AM -0700, Jasper Phillips wrote:
I'm helping my wife with her History PhD, and have to deal with similar stuff. I found R to be a very useful environment for statistical computations. R is a free software clone of S-plus, which is to statistics what Matlab is to linear algebra and automation.

Pros:
- programming environment, with a high-level programming language
- extensive statistical and linalg library (using C and FORTRAN code)
- lots of third-party code available, covering a very wide range of situations
- Python bindings available if you don't want to learn the Scheme-like language
- tons of documentation available
- excellent support through the mailing lists
- GPL'd
- tons of ways to import data (ranging from CSV files to ODBC queries)
- 2 printed books available, at Springer Verlag
- postscript, png, wmf, X outputs, with precise control of the layout of the graphs and figures available for a nice colourful thesis

Cons:
- the language can be a bit weird at times (it took me some time to get used to '.' being used instead of '_' and vice versa in the scoping and variable naming), but you can use Python to script R, thanks to RPython
- it's quite a big piece of code, with a rather steep learning curve, and you need time to get inside it
- the documentation is aimed at professional statisticians. I had to dig back into my statistics courses and buy a couple of books on the topic for the software to become really useful. Asking newbie statistician questions on the r-help mailing list is off-topic
- the Springer Verlag books are very expensive (Modern Applied Statistics with S-plus costs something like 70 euros), but they are great

So you have a powerful tool available at your fingertips, designed to do precisely what you need. I think it's worth taking the time to look at it carefully. The more I get to understand the topic, the more ideas I get for new ways of exploring the data of my wife's PhD.

Alexandre Fayolle
--
LOGILAB, Paris (France).
http://www.logilab.com http://www.logilab.fr http://www.logilab.org
Narval, the first software agent available as free software (GPL).
On Mon, Apr 29, 2002 at 03:19:37PM +0200, Alexandre wrote:
Woops, I forgot to add a couple of URLs:

The R project website
http://www.r-project.org/

The Comprehensive R Archive Network (CRAN)
http://cran.r-project.org/

Using R from Python
http://rpy.sourceforge.net/

Using R from Python and Python from R (coding R extensions in Python)
http://www.omegahat.org/RSPython/

Cheers,

Alexandre Fayolle
--
LOGILAB, Paris (France).
Jasper Phillips wrote:
Jasper,

Use R (a free implementation of S). See http://www.r-project.org

If you are managing your data in Python and NumPy, you can "embed" R in Python and transparently send data to it using Walter Moreira's wonderful RPy module - see http://rpy.sf.net

Tim C
participants (4)
- Alexandre
- Jasper Phillips
- Konrad Hinsen
- Tim Churches