
Hello all,

Is there a way to get probability values for the various families of distributions in numpy? I.e., à la R:
pnorm(1.96, mean = 0 , sd = 1)
[1] 0.9750021 # for the normal
pt(1.65, df=100)
[1] 0.9489597 # for the Student t

Any suggestions would be greatly appreciated.

Mark Janikas
Product Engineer
ESRI, Geoprocessing
380 New York St.
Redlands, CA 92373
909-793-2853 (2563)
mjanikas@esri.com

Mark Janikas wrote:
Hello all,
Is there a way to get probability values for the various families of distributions in numpy? I.e., à la R:
We have a full complement of PDFs, CDFs, etc. in scipy.

In [1]: from scipy import stats

In [2]: stats.norm.pdf(1.96, loc=0.0, scale=1.0)
Out[2]: array(0.058440944333451469)

In [3]: stats.norm.cdf(1.96, loc=0.0, scale=1.0)
Out[3]: array(0.97500210485177952)

-- Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
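For completeness, Mark's second R call, pt(1.65, df=100), maps onto the same pattern; a sketch, with the degrees of freedom passed as the t distribution's shape parameter:

```python
from scipy import stats

# Student t CDF at 1.65 with 100 degrees of freedom,
# the scipy.stats spelling of R's pt(1.65, df=100).
print(stats.t.cdf(1.65, 100))  # approximately 0.9489597
```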

On Wed, 20 Dec 2006, Robert Kern apparently wrote:
We have a full complement of PDFs, CDFs, etc. in scipy.
This is my "most missed" functionality in NumPy. (For now I feel I cannot ask students to install SciPy.) Although it is a slippery slope, and I definitely do not want NumPy to slide down it, I would certainly not complain if this basic functionality were moved to NumPy...

Cheers,
Alan Isaac
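Alan's no-SciPy classroom case has at least a partial workaround: the normal CDF needs none of scipy's special functions. A minimal sketch, assuming only the standard library's math.erf (present in modern Python; at the time of this thread one would have coded a rational approximation instead):

```python
import math

def norm_cdf(x, mean=0.0, sd=1.0):
    """Normal CDF, mirroring R's pnorm(x, mean, sd), via the error function."""
    z = (x - mean) / (sd * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))

print(norm_cdf(1.96))  # approximately 0.9750021, matching pnorm(1.96)
```

The t CDF is harder: its closed form needs the incomplete beta function, which is exactly the kind of special function the thread is about.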

On Dec 20, 2006, at 8:41 PM, Alan G Isaac wrote:
On Wed, 20 Dec 2006, Robert Kern apparently wrote:
We have a full complement of PDFs, CDFs, etc. in scipy.
This is my "most missed" functionality in NumPy. (For now I feel I cannot ask students to install SciPy.)
If they're already installing numpy, isn't 98% of the work already done at that point? I'm pretty sure you can install scipy w/o the fortran dependency if you're not concerned about speed and whatnot, right? It should be a pretty easy install.

Besides ... what else do they have to do during the first week-and-a-half of the semester anyway, right? ;-)

-steve

On 20/12/06, Alan G Isaac <aisaac@american.edu> wrote:
On Wed, 20 Dec 2006, Robert Kern apparently wrote:
We have a full complement of PDFs, CDFs, etc. in scipy.
This is my "most missed" functionality in NumPy. (For now I feel I cannot ask students to install SciPy.) Although it is a slippery slope, and I definitely do not want NumPy to slide down it, I would certainly not complain if this basic functionality were moved to NumPy...
This is silly. If it were up to me I would rip out many of the fancy features from numpy and put them in scipy. It's really not very difficult to install, particularly if you don't much care how fast it is, or are using (say) a Linux distribution that packages it.

It seems to me that numpy should include only tools for basic calculations on arrays of numbers: the ufuncs, simple wrappers (dot, for example). Anything that requires nontrivial amounts of math (matrix inversion, statistical functions, generating random numbers from exponential distributions, and so on) should go in scipy.

If numpy were to satisfy everyone who says, "I like numpy, but I wish it included [their favourite feature from scipy] because I don't want to install scipy", numpy would grow to include everything in scipy.

Perhaps an alternative criterion would be "it can go in numpy if it has no external requirements". I think this is a mistake, since it means users have a monstrous headache figuring out what is in which package (for example, some of scipy.integrate depends on external tools and some does not). Moreover it damages the performance of numpy. For example, dot would be faster (for arrays that happen to be matrix-shaped, and possibly in general) if it could use ATLAS' routine from BLAS.

Of course, numpy is currently fettered by the need to maintain some sort of compatibility with Numeric and numarray; shortly it will have to worry about compatibility with previous versions of numpy as well.

A. M. Archibald

A. M. Archibald wrote:
On 20/12/06, Alan G Isaac <aisaac@american.edu> wrote:
This is my "most missed" functionality in NumPy. (For now I feel I cannot ask students to install SciPy.) Although it is a slippery slope, and I definitely do not want NumPy to slide down it, I would certainly not complain if this basic functionality were moved to NumPy... ... If numpy were to satisfy everyone who says, "I like numpy, but I wish it included [their favourite feature from scipy] because I don't want to install scipy", numpy would grow to include everything in scipy.
Well, my package manager just reported something like 800K for numpy and 20M for scipy, so I think we're not quite at the point of numpy taking over everything yet (if those numbers are actually meaningful; probably I'm missing something?).

I would also welcome it if some functionality could be moved to numpy, provided the size requirements are reasonably small. Currently I try to avoid depending on the scipy package to make my programs more portable, and I'm mostly successful, but not always. The p-value stuff in numpy would be helpful here, as Alan already said. Now I don't know if that stuff passes the size criterion; some expert would know that. But if it does, it would be nice if you could consider moving it over eventually.

Of course you need to strike a balance, and the optimum is debatable. But again, if scipy is really more than 20 times the size of numpy, and some frequently used things are not in numpy, is there really an urgent need to freeze numpy's set of functionality?

just a user's thought,
sven

Thanks for all the input so far. The only thing that seems odd about the omission of probability or quantile functions in NumPy is that all the random number generators are present in RandomArray. At any rate, hopefully this bit of functionality will be present in the future, but for now, IMO the library is awesome..... I am used to using R for math routines, and all my sparse matrix stuff is WAAAAAAY faster using the Python-NumPy combo!

Thanks to all for their insight,

MJ

Mark Janikas wrote:
Thanks for all the input so far. The only thing that seems odd about the omission of probability or quantile functions in NumPy is that all the random number generators are present in RandomArray.

A big part of the issue is that getting many of those PDFs into NumPy would require putting many special functions into NumPy (some of which are actually coded in Fortran).
I much prefer to make SciPy an easy install for as many people as possible and/or work on breaking up SciPy into modular components that can be installed separately if needed. This was my original intention --- to make NumPy as small as possible. Its current size is driven by backwards compatibility only.

-Travis

On Thursday 21 December 2006 16:10, Travis Oliphant wrote:
I much prefer to make SciPy an easy install for as many people as possible and/or work on breaking up SciPy into modular components that can be installed separately if needed.
Talking about that, what happened to those projects for modular installation of scipy? Robert promised us last month to explain what went wrong with his approach, but never had the time...

Pierre GM wrote:
On Thursday 21 December 2006 16:10, Travis Oliphant wrote:
I much prefer to make SciPy an easy install for as many people as possible and/or work on breaking up SciPy into modular components that can be installed separately if needed.
Talking about that, what happened to those projects for modular installation of scipy? Robert promised us last month to explain what went wrong with his approach, but never had the time...
I created a module (scipy_subpackages.py, IIRC) next to setup.py that essentially just served as a global configuration to inform all of the setup.py's what subpackages they were supposed to build (mostly just Lib/setup.py, actually). I then had a script run through the various collections of subpackages that I wanted to build, set the appropriate values in scipy_subpackages, and run setup() with the appropriate parameters to build an egg for each collection.

However, build/ apparently needs to be cleaned out between each egg; otherwise you contaminate later eggs.

-- Robert Kern
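A rough sketch of the driver Robert describes; every name here is hypothetical except scipy_subpackages.py, which he mentions, and the final step per collection is the build/ cleanup he found necessary between eggs:

```python
# Hypothetical collections: which scipy subpackages go into which egg.
COLLECTIONS = {
    "scipy_core": ["misc", "io"],
    "scipy_stats": ["stats", "special"],
}

def plan_builds(collections):
    """Return the sequence of steps the build script would perform."""
    steps = []
    for egg_name, subpackages in collections.items():
        # Rewrite the shared configuration module that each setup.py consults.
        steps.append(("write scipy_subpackages.py", egg_name, subpackages))
        steps.append(("run", "python setup.py bdist_egg"))
        # build/ must be wiped between eggs, or later eggs pick up
        # the previous collection's build products.
        steps.append(("clean", "rm -rf build/"))
    return steps

for step in plan_builds(COLLECTIONS):
    print(step)
```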

Robert Kern wrote:
Pierre GM wrote:
Talking about that, what happened to these projects of modular installation of scipy ? Robert promised us last month to explain what went wrong with his approach, but never had the time...
I created a module (scipy_subpackages.py, IIRC) next to setup.py that essentially just served as a global configuration to inform all of the setup.py's what subpackages they were supposed to build (mostly just Lib/setup.py, actually). I then had a script run through the various collections of subpackages that I wanted to build, set the appropriate values in scipy_subpackages, and run setup() with the appropriate parameters to build an egg for each collection.
However, build/ apparently needs to be cleaned out between each egg, otherwise you contaminate later eggs.
So, to put it "pointedly" (if that's the right word...?): Numpy should not get small functions from scipy -> because the size of scipy doesn't matter -> because scipy's modules will be installable as add-ons separately (and because there will be ready-to-use installers); however, nobody knows how to actually do that in practice?

<irony>Well that will convince my colleagues!</irony>

Please don't feel offended, I just want to make the point (as usual) that this way numpy is going to be a good library for other software projects, but not super-attractive for direct users (aka "matlab converts", although I personally don't come from matlab).

cheers,
sven

On Fri, Dec 22, 2006 at 10:47:33AM +0100, Sven Schreiber wrote:
Please don't feel offended, I just want to make the point (as usual) that this way numpy is going to be a good library for other software projects, but not super-attractive for direct users (aka "matlab converts", although I personally don't come from matlab).
I think that the equivalent of Matlab is more scipy than numpy. I think there is a misunderstanding here.

Gaël

Sven Schreiber wrote:
So, to put it "pointedly" (if that's the right word...?): Numpy should not get small functions from scipy -> because the size of scipy doesn't matter -> because scipy's modules will be installable as add-ons separately (and because there will be ready-to-use installers); however, nobody knows how to actually do that in practice?
I just want to make the point (as usual) that this way numpy is going to be a good library for other software projects, but not super-attractive for direct users (aka "matlab converts").
No one is denying that there is still work to do. I also think there are a lot of us (from "just users" to the major contributors) that WOULD like to see an easy-to-install package that can do everything (and more, and better) that MATLAB can. The only questions are:

A) How best to accomplish this (or work toward it anyway)?
B) Who's going to do the work?

As for (A), I think the consensus is pretty clear -- keep numpy focused on the basic array package, with some extras for backwards compatibility, and work towards a SciPy that has all the bells and whistles, preferably installable as separate packages (like Matlab "toolboxes", I suppose).

As for the above comments, if we are looking at the "Matlab converts", or more to the point, people looking for a comprehensive scientific/engineering computation package:

-- "The size of SciPy doesn't matter" -- True; after all, how big is MATLAB?

-- "Scipy's modules will be installable as add-ons separately" -- this is a good goal, and I think there has been progress there.

-- "Nobody knows how to actually do that in practice" -- well, it's not so much that nobody knows how to do it as that nobody has done it -- it's going to take work, but adding extra stuff to numpy takes work too; it's a matter of where you're going to focus the work.

Given the size of disk drives and the speed of Internet connections these days, I'm not sure it's that important to have the "core" part of SciPy very small -- but it does need to have easy installers. That approach provides opportunity, though -- breaking SciPy down into smaller packages requires expertise and consensus among the developers, but building an installer requires only one person to take the time to do it.

Yes, SciPy is too hard to build and install for an average newbie -- but it's gotten better, and it's not too hard for a savvy user who is willing to put some time in. The kind of person who is willing to put the time in to post to discussions on this group, for instance.
Please, rather than complaining that core developers aren't putting your personally desired "small function" into numpy, just take the time to build the installer you need -- we need one for OS-X; build it and put it up on pythonmac -- it's not that hard, and there are a lot of people here and on the scipy and python-mac lists that will help.

Now my rant: Please, please, please, could at least a few of the people that build packages take the time to make a simple installer for them and put them up on the web somewhere?

Now a question: One of the key difficulties in building SciPy is that parts of it depend on Fortran and LAPACK. We'd all like LAPACK to be built to support our particular hardware for best performance. However, would it be that hard to have SciPy build by default with generic LAPACK (kind of like numpy), and put installers built that way up on the web, along with instructions for those who want to re-build and optimize? For that matter, is it possible (let's say on Windows) to deliver SciPy with a set of multiple DLLs for LAPACK/BLAS, and have the appropriate ones chosen at install or run time?

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker@noaa.gov

Sven Schreiber wrote:
So, to put it "pointedly" (if that's the right word...?): Numpy should not get small functions from scipy -> because the size of scipy doesn't matter -> because scipy's modules will be installable as add-ons separately (and because there will be ready-to-use installers); however, nobody knows how to actually do that in practice?
Rather, to put it accurately, numpy should not get large chunks of scipy functionality that require FORTRAN dependencies for reasons that should be obvious from that description. scipy.stats.distributions is just such a chunk.

The ancillary point is that I think that, for those who do find the largeness and difficult-to-installness of scipy onerous, the best path forward is to work on the build process of scipy. And it will take *work* not wishes nor complaints nor <irony/> tags. And honestly, the more I see the latter, the less motivated I am to bother with the former.

-- Robert Kern

Robert Kern wrote:
Rather, to put it accurately, numpy should not get large chunks of scipy functionality that require FORTRAN dependencies for reasons that should be obvious from that description. scipy.stats.distributions is just such a chunk.
I was probably not very clear; I was referring to "small" functions. As you and others have pointed out, for the p-value stuff this apparently doesn't apply. Ok.
The ancillary point is that I think that, for those who do find the largeness and difficult-to-installness of scipy onerous, the best path forward is to work on the build process of scipy. And it will take *work* not wishes nor complaints nor <irony/> tags. And honestly, the more I see the latter, the less motivated I am to bother with the former.
My impression of the discussion was that many people said _nothing_ at all should ever be added to numpy, which sounded kind of fundamentalist to me. And part of the given justification was features of scipy that don't exist yet (fair enough), and that nobody is even working on. Of course I understand the lack of manpower, but then I think this state of affairs should be properly taken into account when arguing against moving (only small!) features from scipy into numpy. Hence my earlier post.

I also try to contribute to open-source projects where I can, and believe me, it would probably help my career more to just have my faculty pay for matlab and forget about numpy et al. You know, users' time is valuable, too. Unfortunately I don't have the skills to help with modularizing scipy (nor the time to acquire those skills).

Btw, that's why I like the idea of paying for stuff like documentation and other things that open-source projects often forget about because it's not fun for the developers. (Hey, I'm an economist...) So I would be willing to donate money for some of the dull tasks, for example. (I'm fully aware that would not cover the real cost of the work, just like in Travis' case with the numpy guide.)

Ok, that's enough, happy holidays,
Sven

Sven Schreiber wrote:
So, to put it "pointedly" (if that's the right word...?): Numpy should not get small functions from scipy -> because the size of scipy doesn't matter -> because scipy's modules will be installable as add-ons separately (and because there will be ready-to-use installers); however, nobody knows how to actually do that in practice?
<irony>Well that will convince my colleagues!</irony>
Please don't feel offended, I just want to make the point (as usual) that this way numpy is going to be a good library for other software projects, but not super-attractive for direct users (aka "matlab converts", although I personally don't come from matlab).
Don't worry about offending; we all recognize the weaknesses. I think you are pointing out something most of us already see. It is the combination of SciPy+NumPy+Matplotlib+IPython (+ perhaps a good IDE) that can succeed at being a MATLAB/IDL replacement for a lot of people. NumPy by itself isn't usually enough, and it's also important to keep NumPy as a library that can be used for other development.

What is also needed is a good "package" of it all --- like the Enthon distribution. This requires quite a bit of thankless work. Enthought has done quite a bit in this direction for Windows, but they have not had client demand to get it wrapped up for other platforms. I think people are hoping that eggs will help here, but it hasn't come to fruition. This is an area where SciPy could really use someone stepping up and taking charge.

I like the discussions that have taken place regarding documentation issues, as well as the work that went into making all of SciPy compile with gfortran (though perhaps not bug-free...). These are important steps that are much appreciated.

Best regards,

-Travis

Travis Oliphant wrote:
It is the combination of SciPy+NumPy+Matplotlib+IPython (+ perhaps a good IDE) that can succeed at being a MATLAB/IDL replacement for a lot of people.
What is also needed is a good "package" of it all --- like the Enthon distribution. This requires quite a bit of thankless work.
I know Robert put some serious effort into "MacEnthon" a while back, but is no longer maintaining that, which doesn't surprise me a bit -- that looked like a LOT of work. However, MacEnthon was much bigger than just the packages Travis listed above, and I think Travis has that right -- those are the key ones to do. Let's "just do it!" -- first we need to solve the Fortran+Universal binary problems though -- that seems to be the technical sticking point on OS-X.

Also, while the Enthon distribution is fabulous, they do tend to stay behind the bleeding edge a fair bit -- it would be nice to have the core packages with the latest and greatest on Windows and Linux too, all as one easy installer (or rpm or .deb or whatever for Linux).

-Chris

On Wed, 27 Dec 2006, Christopher Barker wrote:
Travis Oliphant wrote:
It is the combination of SciPy+NumPy+Matplotlib+IPython (+ perhaps a good IDE) that can succeed at being a MATLAB/IDL replacement for a lot of people.
What is also needed is a good "package" of it all --- like the Enthon distribution. This requires quite a bit of thankless work.
I know Robert put some serious effort into "MacEnthon" a while back, but is no longer maintaining that, which doesn't surprise me a bit -- that looked like a LOT of work.
However, MacEnthon was much bigger than just the packages Travis listed above, and I think Travis has that right -- those are the key ones to do. Let's "just do it!" -- first we need to solve the Fortran+Universal binary problems though -- that seems to be the technical sticking point on OS-X
Let me add a comment on the Fortran problem (which I assume to be the (lack of) Fortran compiler problem, right?). I have been working on f2py rewrite to support wrapping Fortran 90 types among other F90 constructs and as a result we have almost a complete Fortran parser in Python. It is relatively easy to use this parser to automatically convert Fortran 77 codes that we have in scipy to C codes whenever no Fortran compiler is available.

Due to lack of funding this work has been frozen for now, but I'd say that there is hope of resolving the Fortran compiler issues for any platform in the future.

Pearu

pearu@cens.ioc.ee wrote:
I have been working on f2py rewrite to support wrapping Fortran 90 types among other F90 constructs and as a result we have almost a complete Fortran parser in Python. It is relatively easy to use this parser to automatically convert Fortran 77 codes that we have in scipy to C codes whenever no Fortran compiler is available.
Cool! How is this different from/better than the old standby f2c?

One issue with f2c is that it required a pretty good set of libs to support stuff that Fortran had that C didn't -- complex numbers come to mind; I'm not sure what else is in libf2c. In fact, I've often wondered why scipy doesn't use f2c.

-Chris

Christopher Barker wrote:
pearu@cens.ioc.ee wrote:
I have been working on f2py rewrite to support wrapping Fortran 90 types among other F90 constructs and as a result we have almost a complete Fortran parser in Python. It is relatively easy to use this parser to automatically convert Fortran 77 codes that we have in scipy to C codes whenever no Fortran compiler is available.
Cool!
How is this different from/better than the old standby f2c?
One issue with f2c is that it required a pretty good set of libs to support stuff that Fortran had that C didn't -- complex numbers come to mind; I'm not sure what else is in libf2c.
In fact, I've often wondered why scipy doesn't use f2c.
Generally speaking, g77 was always more likely to work on more platforms with less hassle.

-- Robert Kern

I just discovered the Scipy Superpack for OS X: http://trichech.us/?page_id=4

Maybe this will help folks looking for an OS X SciPy build.

-Chris

On 12/20/06, A. M. Archibald <peridot.faceted@gmail.com> wrote:
Moreover it damages the performance of numpy. For example, dot would be faster (for arrays that happen to be matrix-shaped, and possibly in general) if it could use ATLAS' routine from BLAS.
I thought numpy uses ATLAS. Matrix multiplication in numpy is about as fast as in Octave. So it must be using ATLAS.
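Whether a given numpy build is actually linked against ATLAS can be checked directly: numpy.show_config() prints the BLAS/LAPACK libraries recorded at build time.

```python
import numpy

# Prints which BLAS/LAPACK implementation (ATLAS, MKL, OpenBLAS, or the
# unoptimized fallback) this particular numpy build was linked against.
numpy.show_config()
```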

On Thursday 21 December 2006 05:59, A. M. Archibald wrote:
[...]
I agree with most of the arguments above, so +1 --
>0,0<   Francesc Altet     http://www.carabos.com/
V V   Cárabos Coop. V.   Enjoy Data
 "-"

On Thursday 21 December 2006 05:59, A. M. Archibald wrote:
It seems to me that numpy should include only tools for basic calculations on arrays of numbers. The ufuncs, simple wrappers (dot, for example). Anything that requires nontrivial amounts of math (matrix inversion, statistical functions, generating random numbers from exponential distributions, and so on) should go in scipy.
As a user, I suggest that this becomes a reasonable goal when up-to-date SciPy installers are maintained for all target platforms. Unless you wish to exclude everyone who is intimidated when installation is less than trivial...

Until then, I suggest, the question of the proper functionality to bundle with NumPy remains open. Of course as a user I do not pretend to resolve such a question---recall that I mentioned the slippery slope in my post---but I do object to it being dismissed as "silly" when I offered a straightforward explanation. It is well understood that the current view of the developers is that, if anything, too much is already in NumPy. Any user comments are taking place within that context.

Alan Isaac

PS A question: is it a good thing if more students start using NumPy *now*? It looks to me like building community size is an important current goal for NumPy. Strip it down like you suggest and, aside from Windows users (and Macs are increasingly popular among my students), you'll have only the few that are not intimidated by building SciPy (which still has no installer for Python 2.5).

On 21/12/06, Alan G Isaac <aisaac@american.edu> wrote:
On Thursday 21 December 2006 05:59, A. M. Archibald wrote:
It seems to me that numpy should include only tools for basic calculations on arrays of numbers. The ufuncs, simple wrappers (dot, for example). Anything that requires nontrivial amounts of math (matrix inversion, statistical functions, generating random numbers from exponential distributions, and so on) should go in scipy.
As a user, I suggest that this becomes a reasonable goal when up-to-date SciPy installers are maintained for all target platforms. Unless you wish to exclude everyone who is intimidated when installation is less than trivial...
Until then, I suggest, the question of the proper functionality to bundle with NumPy remains open. Of course as a user I do not pretend to resolve such a question---recall that I mentioned the slippery slope in my post---but I do object to it being dismissed as "silly" when I offered a straightforward explanation.
It is well understood that the current view of the developers is that, if anything, too much is already in NumPy. Any user comments are taking place within that context.
Just to be clear: I am not a developer. I am a user who is frustrated with the difficulty of telling whether to look for a given feature in numpy or in scipy. (I have also never really had much difficulty installing scipy, either from the packages of one of several linux distributions or by compiling it from scratch.)

I suppose the basic difference of opinion here is that I think numpy has already taken too many steps down the slippery slope. Also, I don't think 20 megabytes is enough disk space to care about, and I think it is better in the long term to encourage the scipy developers to get the installers working than it is to jam all kinds of scientific functionality into this array package to avoid having to install the scientific computing package.
PS A question: is it a good thing if more students start using NumPy *now*? It looks to me like building community size is an important current goal for NumPy. Strip it down like you suggest and, aside from Windows users (and Macs are increasingly popular among my students), you'll have only the few that are not intimidated by building SciPy (which still has no installer for Python 2.5).
I didn't have to build scipy (though I have, it's not hard), and I don't use Windows. But no, I don't think it can be stripped down yet; the backward compatibility issue is currently important. I think moving scientific functionality from scipy to numpy is a step in the wrong direction, though. A. M. Archibald

A key thing to remember here is that each user has their particular set of "small things" that are all they need from scipy -- put us all together, and you have SciPy -- that's what it is for.
As a user, I suggest that this becomes a reasonable goal when up to date SciPy installers are maintained for all target platforms.
All it takes is someone to do it. Also, there was talk of "modularizing" scipy so that it would be easy to install only those bits you need -- in particular, the non-Fortran stuff should be trivial to build.
Macs are increasingly popular among my students
It can be a pain to build this kind of thing on OS-X, as Apple has not supported a Fortran compiler yet, but it can (and has) been done. In fact, the Mac is a great target for pre-built binaries, as there is only a small variety of hardware to support, and Apple supplies LAPACK/BLAS libs with the system. As for distributing it, the archive at: pythonmac.org/packages takes submissions from anyone -- just send a note to the pythonmac list -- that list is a great help in figuring out how to build stuff too. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov

Christopher Barker wrote:
It can be a pain to build this kind of thing on OS-X, as Apple has not supported a Fortran compiler yet, but it can (and has) been done. In fact, the Mac is a great target for pre-built binaries, as there is only a small variety of hardware to support, and Apple supplies LAPACK/BLAS libs with the system. As for distributing it, the archive at:
I'm always confused about how to distribute something like SciPy for the Mac. What exactly should be distributed? Is it possible to use distutils to get it done? I'd love to provide Mac binaries of SciPy / NumPy. Right now, we rely on people like Chris for that. -Travis

Travis Oliphant wrote:
I'm always confused about how to distribute something like SciPy for the MAC. What exactly should be distributed? Is it possible to use distutils to get it done?
To get a package format that is actually useful (bdist_dumb just doesn't cut it on any platform, really), you need to install something else. I prefer building eggs instead of mpkgs. So this is what I do: install setuptools, then create a script (I usually call it ./be) in the numpy/ and scipy/ directories to hold all of the options I use for building:

#!/bin/sh
pythonw2.5 -c "import setuptools; execfile('setup.py')" build_src build_clib --fcompiler=gnu95 build_ext --fcompiler=gnu95 build "$@"

Then, to build an egg:

$ ./be bdist_egg

You can then upload it to the Package Index (maybe. I had trouble uploading the Windows scipy binary that Gary Pajer sent me. I suspect that the Index rejected it because it was too large). Here are the outstanding issues as I see them:

* Using scipy requires that the FORTRAN runtime libraries that you compiled against be installed in the appropriate place, i.e. /usr/local/lib. This is annoying, since there are currently only tarballs available, so the user needs root access to install them. If an enterprising individual wants to make this situation better, he might try to make a framework out of the necessary libraries such that we can simply link against those. Frameworks are easier to install to different locations with less hassle. http://hpc.sourceforge.net/

* g77 does not work with the Universal Python build process, so we are stuck with gfortran.

* The GNU FORTRAN compilers that are available are architecture-specific. For us, that means that we cannot build Universal scipy binaries. If you build on an Intel Mac, the binaries will only work on Intel Macs; if you build on a PPC Mac, likewise, the resulting binaries only work on PPC Macs.

* In a related problem, I cannot link against ATLAS on Intels, and possibly not on PPCs, either (I haven't tried building with a Universal Python on PPC).
The Universal compile flags (notably "-arch ppc -arch intel") are used when compiling the numpy.distutils test programs for discovering ATLAS's version. Using a single-architecture ATLAS library causes those test programs to not link (since they are missing the _ATL_buildinfo symbol for the missing architecture). I've tried using lipo(1) to assemble a Universal ATLAS library from a PPC-built library and an Intel-built library, but this did not change the symptomology. Fortunately, part of ATLAS is already built into the Accelerate.framework provided with the OS and is automatically recognized by numpy.distutils. It's missing the C versions of LAPACK functions, so scipy.linalg.clapack will be empty. Also, I think numpy.distutils won't recognize that it is otherwise an ATLAS (_ATL_buildinfo is also missing from the framework), so it may not try to compile the C BLAS interfaces, either. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

Alan G Isaac wrote:
PS A question: is it a good thing if more students start using NumPy *now*? It looks to me like building community size is an important current goal for NumPy. Strip it down like you suggest and, aside from Windows users (and Macs are increasingly popular among my students), you'll have only the few who are not intimidated by building SciPy (which still has no installer for Python 2.5).
You mean a Windows installer? Yes, it does. http://sourceforge.net/project/showfiles.php?group_id=27747 -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

Alan G Isaac wrote:
Strip it down like you suggest and, aside from Windows users (and Macs are increasingly popular among my students), you'll have only the few who are not intimidated by building SciPy (which still has no installer for Python 2.5).
On Fri, 22 Dec 2006, Robert Kern apparently wrote:
You mean a Windows installer? Yes, it does. http://sourceforge.net/project/showfiles.php?group_id=27747
1. No, I meant a Mac installer. Sorry that was unclear. And let me be clear that I understand that if I really want one for my students, I should learn how to build one. (And if I get a moment to breathe, I'd like to learn how.) My point was only that lack of availability does have implications. 2. Re: the other message, I was not aware that moving scipy.stats.distributions into NumPy would complicate the NumPy build process (which is currently delightfully easy). Thank you, Alan Isaac
participants (13)
- A. M. Archibald
- Alan G Isaac
- Christopher Barker
- Francesc Altet
- Gael Varoquaux
- Keith Goodman
- Mark Janikas
- pearu@cens.ioc.ee
- Pierre GM
- Robert Kern
- Steve Lianoglou
- Sven Schreiber
- Travis Oliphant