[Numpy-discussion] Multiple Linear Regression?
Alexandre
Alexandre.Fayolle at logilab.fr
Mon Apr 29 06:20:03 EDT 2002
On Mon, Apr 29, 2002 at 03:13:44AM -0700, Jasper Phillips wrote:
> I'm helping my wife with programming for her economics thesis, which needs
> to calculate a "Multiple Linear Regression" on her data.
>
> Does anyone know of any (preferably though not necessarily free) software
> that can do this? I'm working in Python, but not limited to it as I
> can relatively freely access other languages.
>
> I'm still looking for a library written in Python, but haven't had any luck.
>
I'm helping my wife with her History PhD, and have to deal with similar
stuff. I found R to be a very useful environment for statistical
computations. R is a free software clone of S-plus, which is to statistics
what Matlab is to linear algebra and automation.
Pros:
- programming environment, with a high-level programming language
- extensive statistical and linalg library (using C and FORTRAN code)
- lots of third party code available, covering a very wide range of
situations
- Python bindings available if you don't want to learn the Scheme-like
language
- Tons of documentation available
- Excellent support through the mailing lists
- GPL'd
- Tons of ways to import data (ranging from CSV files to ODBC queries)
- 2 printed books available from Springer Verlag
- postscript, png, wmf, X outputs, with precise control of the layout
of the graphs and figures available for a nice colourful thesis
Cons:
- the language can be a bit weird at times (it took me some time to get
used to '.' being used instead of '_' and vice versa in the scoping
and variable naming), but you can use Python to script R, thanks to
RPython
- it's quite a big piece of code, with a rather steep learning curve
and you need time to get inside it
- the documentation is aimed at professional statisticians. I had to
dig back in my statistics courses and to buy a couple of books on
that topic for the software to become really useful. Asking newbie
statistician questions on the r-help mailing list is off-topic
- the Springer Verlag books are very expensive (Modern Applied
Statistics with S-plus costs something like 70 euros), but they are
great
So you have a powerful tool available at your fingertips, designed to do
precisely what you need. I think it's worth taking the time to look at
it carefully. The more I get to understand the topic, the more ideas I
get for new ways of exploring the data of my wife's PhD.
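
For comparison with the Python route asked about in the quoted message, a
multiple linear regression can also be sketched directly with NumPy's
least-squares solver. This is only a minimal illustration, not something
from the original thread: the function name fit_multiple_linear_regression
and the sample data are made up, and it returns coefficients only, without
the standard errors, t-tests or diagnostics that R reports.

```python
import numpy as np


def fit_multiple_linear_regression(X, y):
    """Ordinary least squares fit of y = b0 + b1*x1 + ... + bk*xk.

    Hypothetical helper for illustration. X has one row per observation
    and one column per explanatory variable.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the first coefficient is the intercept.
    design = np.column_stack([np.ones(len(X)), X])
    # lstsq minimises the sum of squared residuals ||design @ coef - y||^2.
    coef, _residuals, _rank, _singular_values = np.linalg.lstsq(
        design, y, rcond=None
    )
    return coef


# Made-up, noise-free data generated from y = 1 + 2*x1 + 3*x2,
# so the fit should recover roughly those three coefficients.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
y = 1.0 + 2.0 * X[:, 0] + 3.0 * X[:, 1]

coef = fit_multiple_linear_regression(X, y)
print("intercept and slopes:", coef)
```

With real thesis data the residuals will not be zero, and the inference
around the coefficients (significance, confidence intervals) is where a
dedicated statistics environment earns its keep.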
Alexandre Fayolle
--
LOGILAB, Paris (France).
http://www.logilab.com http://www.logilab.fr http://www.logilab.org
Narval, the first software agent available as free software (GPL).