Future directions for SciPy in light of meeting at Berkeley

I wanted to send an update to this list regarding the meeting at Berkeley that I attended. A lot of good discussions took place at the meeting that should stimulate larger feedback. Personally, I had far more to discuss before I had to leave, and so I hope that the discussions can continue.

I was looking to try and understand why, with an increasing number of scientific users of Python, relatively few people actually seem to want to contribute to scipy regularly, even becoming active developers. There are lots of people who seem to identify problems (though very often vague ones), but not many who seem able (through either time or interest constraints) to actually contribute to code, documentation, or infrastructure. Scipy is an open source project and relies on the self-selection process of open source contributors. It would seem that while the scipy conference demonstrates a continuing and even increasing use of Python for scientific computing, not as many of these users are scipy devotees. Why? I think the answers come down to a few issues which I will attempt to answer with proposals.

1) Plotting -- scipy's plotting wasn't good enough (we knew that) and the promised solution (chaco) took too long to emerge as a simple replacement. While the elements were all there for chaco to work, very few people knew that, and nobody stepped up to take chaco to the level that matplotlib, for example, has reached in terms of cross-GUI applicability and user-interface usability.

Proposal: Incorporate matplotlib as part of the scipy framework (replacing plt). Chaco is not there anymore, and the other two plotting solutions could stay as backward-compatible but non-progressing solutions. I have not talked to John about this, though I would like to. I think if some other packaging issues are addressed we might be able to get John to agree.

2) Installation problems -- I'm not completely clear on what the "installation problems" really are. I hear people talk about them, but Pearu has made significant strides to improve installation, so I'm not sure what precise issues remain. Yes, installing ATLAS can be a pain, but scipy doesn't require it. Yes, Fortran support can be a pain, but if you use g77 then it isn't a big deal. The reality, though, is that there is this perception of installation trouble, and it must be based on something. Let's find out what it is. Please speak up, users of the world!!!!

Proposal (just an idea to start discussion): Subdivide scipy into several super packages that install cleanly but can also be installed separately. Implement a CPAN-or-yum-like repository and query system for installing scientific packages.

Base package: scipy_core -- this super package should be easy to install (no Fortran) and should essentially be old Numeric. It was discussed at Berkeley that very likely Numeric3 should just be included here. I think this package should also include plotting, weave, scipy_distutils, and even f2py. Some of these could live in dual namespaces (i.e. both weave and scipy.weave are available on install).

scipy.traits
scipy.weave (weave)
scipy.plt (matplotlib)
scipy.numeric (Numeric3 -- uses ATLAS when installed later)
scipy.f2py
scipy.distutils
scipy.fft
scipy.linalg? (include something like lapack-lite for basic but slow functionality; installation of an improved package replaces this with ATLAS usage)
scipy.stats
scipy.util (everything else currently in scipy_core)
scipy.testing (testing facilities)

Each of these should be a separate package, installable and distributable separately (though there may be co-dependencies, so that scipy.plt would have to be distributed with scipy).

Libraries (each separately installable): scipy.lib -- there should be several sub-packages that could live under here. This is simply raw code with basic wrappers (kind of like a /usr/lib).

scipy.lib.lapack -- installation also updates narray and linalg (hooks to do that)
scipy.lib.blas -- installation updates narray and linalg
scipy.lib.quadpack
etc...

Extra sub-packages: named in a hierarchy to be determined and probably each dependent on a variety of scipy sub-packages.

I haven't fleshed this thing out yet, as you can tell. I'm mainly talking publicly to spur discussion. The basic idea is that we should force ourselves to distribute scipy in separate packages. This would force us to implement a yum-or-CPAN-like package repository, so that we define the interface as to how an additional module could be developed by someone, even maintained separately (with a different license), and simply inserted at an intelligent point under the scipy infrastructure. It would also allow installation/compilation issues to be handled on a more per-module basis, so that difficult ones could be noted. I think this would also help interested people get some of the enthought stuff put into the scipy hierarchy as well.

Thoughts and comments (and even half-working code) welcomed and encouraged...

-Travis O.
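The dual-namespace idea above (both weave and scipy.weave available on install) can be sketched with Python's module machinery. Everything in this sketch is a hypothetical illustration -- the "weave" stand-in module and its `inline` function are placeholders, not actual scipy code:

```python
import sys
import types

# Stand-in for an independently distributed package; the name "weave"
# and its single function are hypothetical placeholders.
weave = types.ModuleType("weave")
weave.inline = lambda code: "compiled: %s" % code

# A scipy "umbrella" package that third-party installs could hook into.
scipy_pkg = sys.modules.setdefault("scipy", types.ModuleType("scipy"))

# Register the same module object under both names, so that
# `import weave` and `import scipy.weave` resolve to one module.
sys.modules["weave"] = weave
sys.modules["scipy.weave"] = weave
scipy_pkg.weave = weave

import scipy.weave  # resolved straight from sys.modules

assert scipy.weave is sys.modules["weave"]
```

An installer hook along these lines is how a separately maintained package could "insert itself at an intelligent point under the scipy infrastructure" without scipy itself shipping the code.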

Travis Oliphant wrote:
> It would seem that while the scipy conference demonstrates a continuing and even increasing use of Python for scientific computing, not as many of these users are scipy devotees. Why? I think the answers come down to a few issues which I will attempt to answer with proposals.
> 1) Plotting

While plotting is important, I don't think that SciPy needs to offer plotting capabilities in order to become successful. Numerical Python doesn't include plotting, and it's hugely popular. I would think that installing Scipy-lite + (selection of SciPy-lib sub-packages) + (your favorite plotting package) separately is acceptable.

> 2) Installation problems

This is the real problem. I'm one of the maintainers of Biopython (python and C code for computational biology), which relies on Numerical Python. Now that Numerical Python is not being actively maintained, I'd love to be able to direct our users to SciPy instead. But as long as SciPy doesn't install out of the box with a python setup.py install, it's not viable as a replacement for Numerical Python. I'd spend the whole day dealing with installation problems from Biopython users.

There are three other reasons why I have not become a SciPy devotee, although I use Python for scientific computing all the time:

3) Numerical Python already does the job very well. There are few packages in SciPy that I actually need. Special functions would be nice, but it's easier to write your own module than to install SciPy.

4) SciPy looks bloated. It seems to try to do too many things, so that it becomes impossible to maintain SciPy well.

5) Uncertain future. With Numerical Python, we know what we get. I don't know what SciPy will look like in a few years (numarray? Numeric3? Numeric2?) and whether it will still have a trouble-free installation. So it's too risky for Biopython to go over to SciPy.

It's really unfortunate, because my impression is that the SciPy developers are smart people who write good code, which currently is not used as much as it could be because of these problems. I hope my comments will be helpful.

--Michiel.

On Mar 9, 2005, at 9:32, Michiel Jan Laurens de Hoon wrote:
> While plotting is important, I don't think that SciPy needs to offer plotting capabilities in order to become successful. Numerical Python doesn't include plotting, and it's...

Thanks for your three comments; they reflect exactly my views as well, so I'll just add a "+1" to them. There is only one aspect I would like to add: predictability of development.

Python has become my #1 tool in my everyday research over the last years. I haven't done any scientific computation for at least five years that did not involve some Python code. Which means that I am very much dependent on Python and some Python packages. Moreover, I publish computational methods that I develop in the form of Python code that is used by a community large enough to make support an important consideration.

There are only two kinds of computational tools on which I can accept being dependent: those that are supported by a sufficiently big and stable community that I don't need to worry about their disappearance or sudden mutation into something different, and those small enough that I can maintain them in a usable state myself if necessary. Python is in the first category, Numeric in the second. SciPy is in neither.

The proposed division of SciPy into separately installable, maintainable subpackages could make a big difference there. The core could actually be both easily maintainable and supported by a big enough community. So I am all for it, and I expect to contribute to such a loosely coupled package collection as well.

Konrad.
--
---------------------------------------------------------------------
Konrad Hinsen
Laboratoire Léon Brillouin, CEA Saclay,
91191 Gif-sur-Yvette Cedex, France
Tel.: +33-1 69 08 79 25
Fax: +33-1 69 08 82 61
E-Mail: khinsen@cea.fr
---------------------------------------------------------------------

Travis Oliphant wrote:
> 1) Plotting -- scipy's plotting wasn't good enough (we knew that) and the promised solution (chaco) took too long to emerge as a simple replacement. While the elements were all there for chaco to work, very few people knew that, and nobody stepped up to take chaco to the level that matplotlib, for example, has reached in terms of cross-GUI applicability and user-interface usability.
I actually looked at Chaco before I started working on pygist (which is now also included in SciPy, I think). My impression was that Chaco was under active development by enthought, and that they were not looking for developers to join in. When Chaco didn't come through, I tried several plotting packages for python that were around at the time, some of which were farther along than Chaco. In the end, I decided to work on pygist instead because it was already working (on unix/linux, at least) and seemed to be a better starting point for a cross-platform plotting package, which pygist is today. The other point is that different plotting packages have different advantages and disadvantages, so you may not be able to find a plotting package that suits everybody's needs. --Michiel.

Travis Oliphant wrote:
> Proposal (just an idea to start discussion):
> Subdivide scipy into several super packages that install cleanly but can also be installed separately. Implement a CPAN-or-yum-like repository and query system for installing scientific packages.

Yes! If SciPy could become a kind of scientific CPAN for python from which users can download the packages they need, it would be a real improvement. In the end, the meaning of SciPy might evolve into "the website where you can download scientific packages for python" rather than "a python package for scientific computing", and the SciPy developers might not feel OK with that.

> Base package:
> scipy_core -- this super package should be easy to install (no Fortran) and should essentially be old Numeric. It was discussed at Berkeley that very likely Numeric3 should just be included here.

+1.

> I think this package should also include plotting, weave, scipy_distutils, and even f2py.

I think you are underestimating the complexity of plotting software. Matplotlib relies on a number of other packages, which breaks the "easy to install" rule. Pygist doesn't rely on other packages, but (being the pygist maintainer) I know that in practice users can still run into trouble installing pygist (it's a little bit harder than installing Numerical Python). And if you do include pygist with scipy_core anyway, you may find out that some users want matplotlib after all. Since both pygist and matplotlib exist as separate packages, it's better to leave them out of scipy_core, I'd say.

--Michiel.

On Wed, 9 Mar 2005, Michiel Jan Laurens de Hoon wrote:
> Travis Oliphant wrote:
>> Proposal (just an idea to start discussion):
>> Subdivide scipy into several super packages that install cleanly but can also be installed separately. Implement a CPAN-or-yum-like repository and query system for installing scientific packages.
> Yes! If SciPy could become a kind of scientific CPAN for python from which users can download the packages they need, it would be a real improvement. In the end, the meaning of SciPy might evolve into "the website where you can download scientific packages for python" rather than "a python package for scientific computing", and the SciPy developers might not feel OK with that.

Personally, I would be OK with that. SciPy as a "download site" does not preclude it from also providing a "scipy package" as it is now. I am all in favor of refactoring the current scipy modules as much as possible.

Pearu

On Mar 9, 2005, at 8:32, Travis Oliphant wrote:
> 2) Installation problems -- I'm not completely clear on what the "installation problems" really are. I hear people talk about them, but Pearu has made significant strides to improve installation, so I'm not sure what precise issues remain. Yes, installing ATLAS can be a pain, but scipy doesn't require it. Yes, Fortran support can be a pain, but if you use g77 then it isn't a big deal. The reality, though, is that there is this perception of installation trouble and it must be based on something. Let's find out what it is. Please speak up, users of the world!!!!
One more comment on this: ease of installation depends a lot on the technical expertise of the people doing it. If you see SciPy as a package aimed at computational scientists and engineers, then you can indeed expect them to be able to handle some difficulties (though that doesn't mean that they are willing to if the quantity of trouble is too high).

But for me, scientific Python packages are not only modules used by me in my own scripts, but also building blocks in the assembly of end-user applications aimed at non-experts in computation. For example, my DomainFinder tool (http://dirac.cnrs-orleans.fr/DomainFinder) is used mostly by structural biologists. Most people in that community don't even know what a compiler is, so how can I expect them to install g77?

Konrad.

> Proposal (just an idea to start discussion):
> Subdivide scipy into several super packages that install cleanly but can also be installed separately. Implement a CPAN-or-yum-like repository and query system for installing scientific packages.
+1, I would be far more inclined to contribute if we could agree on such a structure.
> Extra sub-packages: named in a hierarchy to be determined and probably each dependent on a variety of scipy sub-packages.
> I haven't fleshed this thing out yet as you can tell. I'm mainly talking publicly to spur discussion. The basic idea is that we should force ourselves to distribute scipy in separate packages. This would force us to implement a yum-or-CPAN-like package repository, so that we define the interface as to how an additional module could be developed by someone, even maintained separately (with a different license), and simply inserted into an intelligent point under the scipy infrastructure.
Two comments:

1) We should consider the issue of licenses. For instance, the python wrappers for GSL and FFTW probably need to be GPL-licensed. These packages definitely need to be part of a repository, so there needs to be some kind of category for such packages, as their license is more restrictive.

2) If there is going to be a repository structure, it should provide for packages that can be installed independently of the scipy hierarchy. Packages that only require a dependency on the Numeric core should not require scipy_core. That makes sense if Numeric3 ever gets into core Python. Such packages could (and probably should) also live in a dual scipy namespace.

Peter

Hi Travis,
Travis Oliphant <oliphant@ee.byu.edu> writes:
> I was looking to try and understand why with an increasing number of scientific users of Python, relatively few people actually seem to want to contribute to scipy regularly, even becoming active developers. There are lots of people who seem to identify problems (though very often vague ones), but not many who seem able (either through time or interest constraints) to actually contribute to code, documentation, or infrastructure.

I think there are two issues here.

1. Finding developers. Unfortunately, I'm as clueless as anyone else. It looks to me that most folks who are capable of contributing are already occupied with other projects. The rest use scipy and are quite happy with it (except for the occasional problem). Others are either heavily invested in other solutions, or don't have the skill or time to contribute. I also think that there are a fair number of users who use scipy at some level or another but are quiet about it and don't have a chance to contribute. From what I can tell, the intersection of the set of people who possess good computing skills and also pursue numerical work from Python is still very small compared to other fields.

2. Packaging issues. More on this later.

[...]

> I think the answers come down to a few issues which I will attempt to answer with proposals.
> 1) Plotting -- scipy's plotting wasn't good enough (we knew

I am not sure what this has to do with scipy's utility. Do you mean to say that you'd like to have people start to use scipy to plot things and then hope that they contribute back to scipy's numeric algorithms? If all they did was use scipy for plotting, the only contributions would be towards plotting. If you only mean this as a convenience, then this seems like a packaging issue and not related to scipy.

Plotting is one part of the puzzle. You don't seem to mention any deficiencies with respect to numerical algorithms. This seems to suggest that apart from things like packaging and docs, the numeric side is pretty solid. Let me take this to an extreme: if plotting is deemed part of scipy's core, then how about f2py? It is definitely core functionality. So why not make f2py part of scipy? How about g77, g95, and gcc? The only direction this looks to be headed is to make a SciPy OS (== Enthon?). I think we are mixing packaging along with other issues here. To make it clear, I am not against incorporating matplotlib in scipy. I just think that the argument for its inclusion does not seem clear to me.

[...]

> 2) Installation problems -- I'm not completely clear on what the "installation problems" really are. I hear people talk about [...]
> Proposal (just an idea to start discussion):
> Subdivide scipy into several super packages that install cleanly but can also be installed separately. Implement a CPAN-or-yum-like repository and query system for installing scientific packages.

What does this have to do with scipy per se? This is more like a user convenience issue.

[scipy-sub-packages]

> I haven't fleshed this thing out yet as you can tell. I'm mainly talking publicly to spur discussion. The basic idea is that we should force ourselves to distribute scipy in separate packages. This would force us to implement a yum-or-CPAN-like package repository, so that we define the interface as to how an additional module could be developed by someone, even maintained separately (with a different license), and simply inserted into an intelligent point under the scipy infrastructure.

This is in general a good idea, but one that goes far beyond scipy itself. Joe Cooper mentioned that he had ideas on how to really do this in a cross-platform way. Many of us eagerly await his solution. :)

regards, prabhu

On Wed, 9 Mar 2005, Prabhu Ramachandran apparently wrote:
> What does this have to do with scipy per se? This is more like a user convenience issue.
I think the proposal is: development effort is a function of community size, and community size is a function of convenience as well as functionality. This seems right to me. Cheers, Alan Isaac

Hi All
> I think the proposal is: development effort is a function of community size,

Undeniably true!

> and community size is a function of convenience as well as functionality.

This is only partly true. I think the main barriers to more people using scipy are:

1. Not that many people actually know about it.
2. People aren't easily convinced to change from what they were taught to use as undergraduates (e.g. Matlab, IDL, Mathematica).

As it stands, I don't think scipy is particularly inconvenient to install or use.

On the two suggested improvements: I think incorporating matplotlib is an excellent idea. But I think the second suggestion, of separating Scipy into independent packages, will prove to be counter-productive. It might put people off even before they start, because instead of installing one package, they have a bewildering choice of many. And it could prove to be annoying to people using scipy who want to share or distribute code, with the requirement that both parties have scipy turning into a requirement that both parties have a specific combination of scipy packages.

Also, another reason why there might be a lack of developers is that there are people like me who find that scipy and matplotlib already do everything that they need. Which is good, right?

Cheers, Cory.
_______________________________________________
SciPy-user mailing list
SciPy-user@scipy.net
http://www.scipy.net/mailman/listinfo/scipy-user
--
))))))))))))))))))))))))))))))))))))))))))))
Cory Davis
Meteorology
School of GeoSciences
University of Edinburgh
King's Buildings
EDINBURGH EH9 3JZ
ph: +44(0)131 6505092 fax +44(0)131 6505780 cdavis@staffmail.ed.ac.uk cory@met.ed.ac.uk http://www.geos.ed.ac.uk/contacts/homes/cdavis ))))))))))))))))))))))))))))))))))))))))))))

Alan G Isaac writes:
> On Wed, 9 Mar 2005, Prabhu Ramachandran apparently wrote:
>> What does this have to do with scipy per se? This is more like a user convenience issue.
> I think the proposal is: development effort is a function of community size, and community size is a function of convenience as well as functionality.

To put it bluntly, I don't believe that someone who can't install scipy today is really capable of contributing code to scipy. I seriously doubt claims that scipy is scary or hard to install today. Therefore, the real problem does not appear to be convenience, and IMHO neither is functionality the problem.

My only point is this: I think Travis and Pearu have been doing a great job! I'd rather see them working on things like Numeric3 and core scipy functionality than spending time worrying about packaging, including other new packages, and making things more comfortable for the user (especially when these things are already taken care of).

Anyway, Joe's post about ASP's role is spot on! Thanks Joe. More on that thread.

cheers, prabhu

On 09.03.2005, at 17:52, Prabhu Ramachandran wrote:
> To put it bluntly, I don't believe that someone who can't install scipy today is really capable of contributing code to scipy.

True, but not quite to the point. I can install SciPy, but given that most of my code is written with the ultimate goal of being published and used by people with less technical experience, I need to take those people into account when choosing packages to build on.

> I seriously doubt claims that scipy is scary or hard to install today.

I get support questions from people who are not aware that they need root permissions to do "python setup.py install" on a standard Linux system. On that scale of expertise, scipy *is* scary.

Konrad.

I'd like to second Konrad's point and restate what I tried to articulate (probably poorly) at SciPy 04. How easy it is for me, as a developer, to install SciPy on my particular development platform (in my case OS X and Linux) is not the same as how easy it is to *deploy* an application which uses SciPy as a library to end-user clients (in my case on Windows). I had originally hoped that having the client simply install Enthon would suffice, but I wanted to use some features from wxPython 2.5.x (perhaps that's what I should have reconsidered). I tried combinations of having the client install packages separately and me using py2exe, but in the end my dependency on SciPy was small enough that it was easiest to just dump SciPy altogether. Just my 2c. (And I hope that it's clear that I do appreciate all the work that people have done and that I mean no offense by my comments.)

I only have a little to contribute at this point:
> Proposal: Incorporate matplotlib as part of the scipy framework (replacing plt).
While this is an admirable goal, I personally find scipy and matplotlib easy to install separately. The only difficulty (of course!) is the numarray/numeric split, so I have to be sure that I select numerix as Numeric in my .matplotlibrc file before typing 'ipython -pylab -p scipy', which actually works really well.
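For anyone unfamiliar with that setting: in the matplotlibrc format of that era, the array backend was chosen with a single line (this excerpt is illustrative, from memory of the config format of the time):

```
# excerpt from ~/.matplotlibrc -- select the array package matplotlib uses
numerix : Numeric    # either Numeric or numarray
```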
> 2) Installation problems -- I'm not completely clear on what the "installation problems" really are.
scipy and matplotlib are both very easy to install. Using ATLAS is the biggest pain, as Travis says, and one can do without it. Now that a simple 'python setup.py bdist_rpm' seems to work reliably, I for one am happy.

I think splitting scipy up into multiple subpackages isn't such a good idea. Perhaps I'm in the minority, but I find CPAN counter-intuitive, hard to use, and hard to keep track of in an RPM-based environment. Any large package is going to include a lot of stuff most people don't need, but like a NY Times ad used to say, "You might not read it all, but isn't it nice to know it's all there?"

I can tell you why I'm not contributing much code to the effort, at least in one recent instance. Since I'm still getting core dumps when I try to use optimize.leastsq with a defined Jacobian function, I dove into _minpackmodule.c and its associated routines last night. I'm at sea. I know enough Python to be dangerous, used LMDER from Fortran extensively while doing my Ph.D., and am pretty good at C, but am completely unfamiliar with the Python-C API. So I don't even know how to begin tracking the problem down.

Finally, as I mentioned at SciPy04, our particular physics department is at an undergraduate institution (no Ph.D. program), so we mainly produce majors who stop at the B.S. or M.S. degree. Their job market seems to want MATLAB skills, not Python, at the moment, so that's what the faculty are learning and teaching to their students. Many of them/us simply don't have the time to learn Python on top of that. Though, when I showed some colleagues how trivial it was to trim some unwanted bits out of data files they had using Python, I think I converted them.
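For reference, a minimal working use of optimize.leastsq with a user-supplied analytic Jacobian (the Dfun argument) looks like the sketch below. The linear model and data here are illustrative, not the code that crashed:

```python
import numpy as np
from scipy.optimize import leastsq

# Illustrative data from a known linear model y = a*x + b,
# so the fitted parameters can be checked against the truth.
x = np.linspace(0.0, 1.0, 20)
y = 2.0 * x + 1.0

def residuals(p):
    a, b = p
    return (a * x + b) - y

def jacobian(p):
    # Jacobian of the residuals: one row per data point, one column
    # per parameter (the layout leastsq expects with col_deriv=0).
    return np.column_stack((x, np.ones_like(x)))

p_fit, ier = leastsq(residuals, x0=[0.0, 0.0], Dfun=jacobian)
# an ier value of 1-4 signals success; p_fit should recover a=2, b=1
```

The Dfun wrapper is what ends up calling LMDER under the hood, which is where the reported crash would have to be chased.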

Hi,

Thanks for the excellent discussion - and this has really been said already, but just for clarity: it seems that SciPy has two intended markets.

The first is as a competitor to languages like Matlab and IDL. Here the ideal is that a Matlab or IDL user can just google, look, download, install, and have something with all the features they are used to sitting and looking at them saying "aren't I beautiful". I guess this is the point of ASP. Such a package will definitely need very good default plotting.

The second is open-source developers. Until we reach the ideal above, developers will need flexibility and independence of install options to minimize the support they have to offer for SciPy install issues.

So, aren't we suggesting providing a solution for both types of users? If cleverly done, can't we have nicely parsed separate packages for developers to use, which can also be downloaded as one big SciPy install? Over time, we can expect that individual installs will improve until we reach the necessary stability of the full install.

In the meantime, we also have a problem of the perception that efforts in numerical python are widely spread across developers and websites; this makes new users googling for Python and Matlab or IDL nervous. It would be a great help if those writing scientific projects for python could try to use the SciPy home as a base, even if at first the project is rather independent of SciPy itself - IPython being a good example.

Best, Matthew

On Mar 9, 2005, at 12:33 PM, Stephen Walton wrote:
>> 2) Installation problems -- I'm not completely clear on what the "installation problems" really are.
> scipy and matplotlib are both very easy to install. Using ATLAS is the biggest pain, as Travis says, and one can do without it. Now that a simple 'python setup.py bdist_rpm' seems to work reliably, I for one am happy.
On Mac OS X, using ATLAS should be pretty trivial because the OS already ships with an optimized implementation! The patch I created for Numeric was very short, and I'm pretty sure it's on the trunk (though last I packaged it, I had to make a trivial fix or two, which I reported on sourceforge). I haven't delved into SciPy's source in a really long time, so I'm not sure where changes would need to be made, but I think someone else should be fine to look at Numeric's setup.py and do what needs to be done to SciPy. FYI, matplotlib, the optimized Numeric, and several other Mac OS X packages are available in binary form here: http://pythonmac.org/packages/
> I think splitting scipy up into multiple subpackages isn't such a good idea. Perhaps I'm in the minority, but I find CPAN counter-intuitive, hard to use, and hard to keep track of in an RPM-based environment. Any large package is going to include a lot of stuff most people don't need, but like a NY Times ad used to say, "You might not read it all, but isn't it nice to know it's all there?"
I also think that a monolithic package is a pretty good idea until it begins to cause problems with the release cycle. Twisted had this problem at 1.3, and went through a major refactoring between then and 2.0 (which is almost out the door). Though Twisted 2.0 is technically many different packages, they still plan on maintaining a "sumo" package that includes all of the Twisted components, plus zope.interface (the only required dependency). There are still several optional dependencies not included, though (such as PyCrypto). SciPy could go this route, and simply market the "sumo" package to anyone who doesn't already know what they're doing. An experienced SciPy user may want to upgrade one particular component of SciPy as early as possible, but leave the rest be, for example. -bob

I had a lengthy discussion with Eric today and clarified some things in my mind about the future directions of scipy. The following is basically what we have decided. We are still interested in input so don't think the issues are closed, but I'm just giving people an idea of my (and Eric's as far as I understand it) thinking on scipy.

1) There will be a scipy_core package which will be essentially what Numeric has always been (plus a few easy-to-install extras already in current scipy_core). It will likely contain the functionality of (the names and placements will be similar to current scipy_core):

Numeric3 (actually called ndarray or narray or numstar or numerix or something....)
fft (based on C-only code -- no Fortran dependency)
linalg (a lite version -- no Fortran or ATLAS dependency)
stats (a lite version -- no Fortran dependency)
special (only C code -- no Fortran dependency)
weave
f2py? (still need to ask Pearu about this)
scipy_distutils and testing
matrix and polynomial classes
...others...?

We will push to make this an easy-to-install, effective replacement for Numeric and hopefully for numarray users as well. Therefore community input and assistance will be particularly important.

2) The rest of scipy will be a package (or a series of packages) of algorithms. We will not try to do plotting as part of scipy. The current plotting in scipy will be supported for a time, but users will be weaned off to other packages: matplotlib, pygist (for xplt -- and I will work to get any improvements for xplt into pygist itself), gnuplot, etc.

3) Having everything under a scipy namespace is not necessary, nor worth worrying about at this point.

My scipy-related focus over the next 5-6 months will be to get scipy_core to the point that most can agree it effectively replaces the basic tools of Numeric and numarray.

-Travis

Hey Travis, It sounds like the Berkeley meeting went well. I am glad that the Numeric3 project is going well and looks like it has a good chance to unify the Numeric/Numarray communities. I really appreciate you putting in so much effort into its implementation. I also appreciate all the work that Perry, Todd, and the others at STScI have done building Numarray. We've all learned a ton from it.

Most of the plans sound right to me (several questions/comments below). Much of SciPy has been structured this way already, but we have really never worked to make the core useful as a stand-alone package. Supporting lite and full versions of fft, linalg, and stats sounds potentially painful, but also worthwhile given the circumstances. Now:

1. How much of stats do we lose by removing the Fortran dependencies?

2. I do question whether weave should really be in this core. I think it was in scipy_core before because it was needed to build some of scipy.

3. Now that I think about it, I also wonder if f2py should really be there -- especially since we are explicitly removing any Fortran dependencies from the core.

4. I think keeping scipy an algorithms library and leaving plotting to other libraries is a good plan. At one point, the setup_xplt.py file was more than 1000 lines. It is much cleaner now, but dealing with X11, etc. does take maintenance work. Removing these libraries from scipy would decrease the maintenance effort and leave the plotting to matplotlib, chaco, and others.

5. I think having all the generic algorithm packages (signal, ga, stats, etc. -- basically all the packages that are there now) under the scipy namespace is a good idea. It prevents worry about colliding with other people's packages. However, I think domain-specific libraries (such as astropy) should be in their own namespace and shouldn't be in scipy.

thanks, eric

Travis Oliphant wrote:
I had a lengthy discussion with Eric today and clarified some things in my mind about the future directions of scipy. The following is basically what we have decided. We are still interested in input so don't think the issues are closed, but I'm just giving people an idea of my (and Eric's as far as I understand it) thinking on scipy.
1) There will be a scipy_core package which will be essentially what Numeric has always been (plus a few easy-to-install extras already in current scipy_core). It will likely contain the functionality of (the names and placements will be similar to current scipy_core):

Numeric3 (actually called ndarray or narray or numstar or numerix or something....)
fft (based on C-only code -- no Fortran dependency)
linalg (a lite version -- no Fortran or ATLAS dependency)
stats (a lite version -- no Fortran dependency)
special (only C code -- no Fortran dependency)
weave
f2py? (still need to ask Pearu about this)
scipy_distutils and testing
matrix and polynomial classes
...others...?
We will push to make this an easy-to-install effective replacement for Numeric and hopefully for numarray users as well. Therefore community input and assistance will be particularly important.
2) The rest of scipy will be a package (or a series of packages) of algorithms. We will not try to do plotting as part of scipy. The current plotting in scipy will be supported for a time, but users will be weaned off to other packages: matplotlib, pygist (for xplt -- and I will work to get any improvements for xplt into pygist itself), gnuplot, etc.
3) Having everything under a scipy namespace is not necessary, nor worth worrying about at this point.
My scipy-related focus over the next 5-6 months will be to get scipy_core to the point that most can agree it effectively replaces the basic tools of Numeric and numarray.
-Travis
_______________________________________________ SciPy-user mailing list SciPy-user@scipy.net http://www.scipy.net/mailman/listinfo/scipy-user

Hi, To clarify few technical details: On Wed, 9 Mar 2005, eric jones wrote:
1. How much of stats do we lose by removing the Fortran dependencies? 2. I do question whether weave should really be in this core. I think it was in scipy_core before because it was needed to build some of scipy.
At the moment scipy does not contain modules that need weave.
3. Now that I think about it, I also wonder if f2py should really be there -- especially since we are explicitly removing any fortran dependencies from the core.
f2py is not a Fortran-only tool. In scipy it has also been used to wrap C code (fft, atlas), and imho f2py should be used more, wherever possible.
Travis Oliphant wrote:
1) There will be a scipy_core package which will be essentially what Numeric has always been (plus a few easy to install extras already in current scipy_core). It will likely contain the functionality of (the names and placements will be similar to current scipy_core). Numeric3 (actually called ndarray or narray or numstar or numerix or something....) fft (based on c-only code -- no fortran dependency)
Hmm, what would be the default underlying fft library here? Currently in scipy it is Fortran fftpack. And when fftw is available, it is used instead.
linalg (a lite version -- no fortran or ATLAS dependency)
Again, what would be the underlying linear algebra library here? Numeric uses an f2c version of a lite lapack library. Shall we do the same but wrap the C code with f2py rather than by hand? f2c might also be useful in other cases to reduce the Fortran dependency, but only when it is critical to ease the scipy_core installation.
stats (a lite version --- no fortran dependency)
special (only c-code --- no fortran dependency)
weave
f2py? (still need to ask Pearu about this)
I am not against it; it actually would simplify many things (for scipy users it provides one less dependency to worry about, f2py bug fixes and new features are immediately available, etc). And I can always ship f2py as a standalone for non-scipy users.
scipy_distutils and testing matrix and polynomial classes
...others...?
There are a few pure Python modules (ppimport, machar, pexec, ...) in scipy_base that I have heard are very useful as standalone modules.
We will push to make this an easy-to-install effective replacement for Numeric and hopefully for numarray users as well. Therefore community input and assistance will be particularly important.
2) The rest of scipy will be a package (or a series of packages) of algorithms. We will not try to do plotting as part of scipy. The current plotting in scipy will be supported for a time, but users will be weaned off to other packages: matplotlib, pygist (for xplt -- and I will work to get any improvements for xplt into pygist itself), gnuplot, etc.
+1 for not doing plotting in scipy. Pearu

On 10.03.2005, at 09:49, Pearu Peterson wrote:
f2py is not a Fortran-only tool. In scipy it has also been used to wrap C code (fft, atlas), and imho f2py should be used more, wherever possible.
Good to know. I never looked at f2py because I don't use Fortran any more.
Hmm, what would be the default underlying fft library here? Currently in scipy it is Fortran fftpack. And when fftw is available, it is used instead.
How about an f2c version of FFTPACK? Plus keeping the option of using fftw if installed, of course.
Again, what would be the underlying linear algebra library here? Numeric uses f2c version of lite lapack library. Shall we do the same but wrapping the c codes with f2py rather than by hand? f2c might be useful
I like the idea of the f2c versions because they can easily be replaced by the original Fortran code for more speed. It might even be good to have scipy_core include the Fortran version as well and use it optionally during installation.

Konrad.
--
Konrad Hinsen
Laboratoire Leon Brillouin, CEA Saclay, 91191 Gif-sur-Yvette Cedex, France
Tel.: +33-1 69 08 79 25  Fax: +33-1 69 08 82 61
E-Mail: khinsen@cea.fr

Pearu Peterson wrote:
Travis Oliphant wrote:
1) There will be a scipy_core package which will be essentially what Numeric has always been (plus a few easy to install extras already in current scipy_core). ... linalg (a lite version -- no fortran or ATLAS dependency)
Again, what would be the underlying linear algebra library here? Numeric uses f2c version of lite lapack library. Shall we do the same but wrapping the c codes with f2py rather than by hand? f2c might be useful also in other cases to reduce fortran dependency, but only when it is critical to ease the scipy_core installation.
If I understand Travis correctly, the idea is to use Numeric as the basis for scipy_core, allowing current Numerical Python users to switch to scipy_core with a minimum of trouble. So why not use Numeric's lite lapack library directly? What is the advantage of repeating the c code wrapping (by f2py or by hand)? --Michiel.

On Sun, 13 Mar 2005, Michiel Jan Laurens de Hoon wrote:
Pearu Peterson wrote:
Travis Oliphant wrote:
1) There will be a scipy_core package which will be essentially what Numeric has always been (plus a few easy to install extras already in current scipy_core). ... linalg (a lite version -- no fortran or ATLAS dependency)
Again, what would be the underlying linear algebra library here? Numeric uses f2c version of lite lapack library. Shall we do the same but wrapping the c codes with f2py rather than by hand? f2c might be useful also in other cases to reduce fortran dependency, but only when it is critical to ease the scipy_core installation.
If I understand Travis correctly, the idea is to use Numeric as the basis for scipy_core, allowing current Numerical Python users to switch to scipy_core with a minimum of trouble. So why not use Numeric's lite lapack library directly? What is the advantage of repeating the c code wrapping (by f2py or by hand)?
First, I wouldn't repeat wrapping C codes by hand. But using f2py wrappers has the following advantages:

(i) maintaining the wrappers is easier (as the wrappers are generated);

(ii) one can easily link linalg_lite against an optimized lapack. This is certainly possible with current Numeric but for a smaller set of Fortran compilers than when using f2py-generated wrappers (for example, if a compiler produces uppercased symbol names then Numeric wrappers won't work);

(iii) scipy provides wrappers to a larger set of lapack subroutines than Numeric, and with f2py it is easier and less error-prone to add new wrappers to lapack functions than wrapping them by hand, i.e. extending f2py-generated linalg_lite is much easier than extending the current Numeric lapack_lite;

(iv) and finally, f2py-generated wrappers tend to be more efficient than Numeric's hand-coded wrappers.

Here are some benchmark results comparing scipy and Numeric linalg functions:

Finding matrix determinant
==================================
      |   contiguous    |  non-contiguous
----------------------------------------------
 size | scipy | Numeric | scipy | Numeric
   20 | 0.16  | 0.22    | 0.17  | 0.26     (secs for 2000 calls)
  100 | 0.29  | 0.41    | 0.28  | 0.56     (secs for 300 calls)
  500 | 0.31  | 0.36    | 0.33  | 0.45     (secs for 4 calls)

Finding matrix inverse
==================================
      |   contiguous    |  non-contiguous
----------------------------------------------
 size | scipy | Numeric | scipy | Numeric
   20 | 0.28  | 0.33    | 0.27  | 0.37     (secs for 2000 calls)
  100 | 0.64  | 1.06    | 0.64  | 1.24     (secs for 300 calls)
  500 | 0.83  | 1.10    | 0.84  | 1.18     (secs for 4 calls)

Solving system of linear equations
==================================
      |   contiguous    |  non-contiguous
----------------------------------------------
 size | scipy | Numeric | scipy | Numeric
   20 | 0.26  | 0.18    | 0.26  | 0.21     (secs for 2000 calls)
  100 | 0.31  | 0.35    | 0.31  | 0.52     (secs for 300 calls)
  500 | 0.33  | 0.34    | 0.35  | 0.41     (secs for 4 calls)

Remark: both scipy and Numeric are linked against the same ATLAS/Lapack library.

Pearu
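For readers who want to reproduce the "X secs for N calls" methodology above, a small timing harness with Python's timeit does the job. The 2x2 determinant below is a pure-Python stand-in for illustration only; Pearu's table timed scipy's and Numeric's linalg functions instead:

```python
# Sketch of the timing methodology behind the table above.
# det2 is a toy pure-Python determinant; the original benchmark
# timed scipy vs. Numeric determinant/inverse/solve routines.
import random
import timeit

def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    return a * d - b * c

m = [[random.random(), random.random()],
     [random.random(), random.random()]]
calls = 2000
secs = timeit.timeit(lambda: det2(m), number=calls)
print("%.2f secs for %d calls" % (secs, calls))
```

Varying the matrix size and the number of calls, as in the table, keeps each measurement long enough to be meaningful.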

Pearu Peterson wrote:
If I understand Travis correctly, the idea is to use Numeric as the basis for scipy_core, allowing current Numerical Python users to switch to scipy_core with a minimum of trouble. So why not use Numeric's lite lapack library directly? What is the advantage of repeating the c code wrapping (by f2py or by hand)?
First, I wouldn't repeat wrapping c codes by hand. But using f2py wrappers has the following advantages:
OK, I'm convinced. From a user perspective, it's important that the scipy_core linear algebra looks and feels like the Numerical Python linear algebra package. So if a user does
from LinearAlgebra import myfavoritefunction

s/he should not notice any difference other than "hey, my favorite function seems to be running faster now!"
--Michiel.
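Michiel's requirement - old `from LinearAlgebra import ...` code keeps working while a new implementation runs underneath - can be met with a small shim module. A minimal sketch (the module name matches Numeric's, but the function body here is a 2x2-only stand-in for the faster implementation):

```python
# Backward-compatibility shim: expose a new linear algebra
# implementation under the old "LinearAlgebra" module name,
# so existing Numeric user code runs unchanged.
import sys
import types

def _new_determinant(matrix):
    """Stand-in for the faster replacement (2x2 nested lists only)."""
    (a, b), (c, d) = matrix
    return a * d - b * c

# Build a module object and register it under the legacy name;
# imports consult sys.modules before searching the filesystem.
shim = types.ModuleType("LinearAlgebra")
shim.determinant = _new_determinant
sys.modules["LinearAlgebra"] = shim

# Old user code, unchanged:
from LinearAlgebra import determinant
print(determinant([[3, 1], [2, 4]]))  # -> 10
```

In a real release the shim would simply be a `LinearAlgebra.py` that re-exports the scipy_core functions under their Numeric names.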

On Mar 9, 2005, at 11:41 PM, eric jones wrote:
2. I do question whether weave should really be in this core. I think it was in scipy_core before because it was needed to build some of scipy. 3. Now that I think about it, I also wonder if f2py should really be there -- especially since we are explicitly removing any Fortran dependencies from the core.
It would seem to me that so long as: 1) both these tools have very general usefulness (and I think they do), and 2) they are not installation problems (I don't believe they are, since they themselves don't require any compilation of Fortran, C++ or whatever -- am I wrong on that?), then they are perfectly fine to go into the core. In fact, if they are used by any of the extra packages, they should be in the core to eliminate the extra step in the installation of those packages. Perry

Perry Greenfield wrote:
On Mar 9, 2005, at 11:41 PM, eric jones wrote:
2. I do question whether weave should really be in this core. I think it was in scipy_core before because it was needed to build some of scipy. 3. Now that I think about it, I also wonder if f2py should really be there -- especially since we are explicitly removing any Fortran dependencies from the core.
It would seem to me that so long as:
1) both these tools have very general usefulness (and I think they do), and 2) are not installation problems (I don't believe they are since they themselves don't require any compilation of Fortran, C++ or whatever--am I wrong on that?)
Then they are perfectly fine to go into the core. In fact, if they are used by any of the extra packages, they should be in the core to eliminate the extra step in the installation of those packages.
-0.

1) In der Beschraenkung zeigt sich der Meister. In other words, avoid software bloat.

2) f2py is a Fortran-Python interface generator; once the interface is created there is no need for the generator.

3) I'm sure f2py is useful, but I doubt that it has very general usefulness. There are lots of other useful Python packages, but we're not including them in scipy-core either.

4) f2py and weave don't fit in well with the rest of scipy-core, which is mainly standard numerical algorithms.

--Michiel.

Michiel Jan Laurens de Hoon wrote:
Perry Greenfield wrote:
On Mar 9, 2005, at 11:41 PM, eric jones wrote:
2. I do question whether weave should really be in this core. I think it was in scipy_core before because it was needed to build some of scipy. 3. Now that I think about it, I also wonder if f2py should really be there -- especially since we are explicitly removing any Fortran dependencies from the core.
It would seem to me that so long as:
1) both these tools have very general usefulness (and I think they do), and 2) are not installation problems (I don't believe they are since they themselves don't require any compilation of Fortran, C++ or whatever--am I wrong on that?)
Then they are perfectly fine to go into the core. In fact, if they are used by any of the extra packages, they should be in the core to eliminate the extra step in the installation of those packages.
-0. 1) In der Beschraenkung zeigt sich der Meister. In other words, avoid software bloat. 2) f2py is a Fortran-Python interface generator, once the interface is created there is no need for the generator. 3) I'm sure f2py is useful, but I doubt that it has very general usefulness. There are lots of other useful Python packages, but we're not including them in scipy-core either. 4) f2py and weave don't fit in well with the rest of scipy-core, which is mainly standard numerical algorithms.
I'm of the opinion that f2py and weave should go into the core.

1) Neither one requires Fortran and both install very, very easily.

2) These packages are fairly small but provide huge utility --- inlining Fortran or C code is an easy way to speed up Python. People who don't "need it" will never realize it's there.

3) Building the rest of scipy will need at least f2py already installed, and it would simplify the process.

4) Enthought packages (to be released in the future and of interest to scientists) rely on weave. Why not make that process easier with a single initial install?

5) It would encourage improvements of weave and f2py from the entire community.

6) The developers of f2py and weave are both scipy developers, and so it would make sense for their code that forms a foundation for other work to go into scipy_core.

-Travis

"TO" == Travis Oliphant <oliphant@ee.byu.edu> writes:
TO> I'm of the opinion that f2py and weave should go into the
TO> core.

If you are looking for feedback, I'd say +2 for that.

regards, prabhu

Travis Oliphant wrote:
I'm of the opinion that f2py and weave should go into the core. <(6 good points)>
The act of putting something into the core will encourage people to use it. My understanding of the idea of the core is that it is a minimal set of packages that various developers can use as a basis for their domain-specific stuff. One barrier to entry for people currently using the whole of SciPy is the ease-of-installation issue, and f2py and weave are easy to install, so that's not a problem. However, if I understand it correctly, neither weave nor f2py is the least bit useful without a compiler. If they are in the core, you are encouraging people to use them in their larger packages, which will then impose a dependency on compilers. This seems to me not to fit in with the purpose of the core, which is to be a SINGLE, robust, easy-to-install dependency that others can build on. I suggest that weave and f2py go into a "devel" or "high-performance" package instead.

-Chris
--
Christopher Barker, Ph.D.
Oceanographer, NOAA/OR&R/HAZMAT
7600 Sand Point Way NE, Seattle, WA 98115
(206) 526-6959 voice / (206) 526-6329 fax / (206) 526-6317 main reception
Chris.Barker@noaa.gov

I've got one more issue that might bear thinking about at this juncture: version control.

One issue that has been brought up in the discussion of using ndarrays as an interchange format with other packages is that those packages might well become dependent on a particular version of SciPy. For me, this brings up the issue that I might well want (or need) to have more than one version of SciPy installed at once, and be able to select which one is used at run time. If nothing else, it facilitates testing as new versions come out. I suggest a system similar to that recently added to wxPython:

import wxversion
wxversion.select("2.5")
import wx

See: http://wiki.wxpython.org/index.cgi/MultiVersionInstalls for more details. Between the wxPython list and others, a lot of pros and cons to doing this have been laid out. Honestly, there never really was a consensus among the wxPython community, but Robin decided to go for it, and I, for one, am very happy with it.

-Chris
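The wxversion mechanism Chris describes boils down to putting the requested version's install directory at the front of sys.path before the first import. A minimal sketch of such a selector (the "scipyversion" idea, directory layout, and paths are purely illustrative):

```python
# Sketch of a wxversion-style selector for a hypothetical
# "scipyversion" module.  The install directories below are
# made-up; a real system would discover them on disk.
import sys

_INSTALL_DIRS = {
    "0.3": "/usr/lib/python/scipy-0.3",
    "0.4": "/usr/lib/python/scipy-0.4",
}
_selected = None

def select(version):
    """Put the requested version first on sys.path.

    Must be called before 'import scipy', just like wxversion.select().
    """
    global _selected
    if _selected is not None and _selected != version:
        raise ValueError("version %s already selected" % _selected)
    sys.path.insert(0, _INSTALL_DIRS[version])
    _selected = version

select("0.4")
print(sys.path[0])  # the 0.4 install directory is now searched first
```

Refusing a second, conflicting select() call mirrors wxversion's behavior: once a version has been imported, switching at run time is not safe.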

Michiel Jan Laurens de Hoon wrote:
Perry Greenfield wrote:
[weave & f2py in the core]
Then they are perfectly fine to go into the core. In fact, if they are used by any of the extra packages, they should be in the core to eliminate the extra step in the installation of those packages.
-0. 1) In der Beschraenkung zeigt sich der Meister. In other words, avoid software bloat. 2) f2py is a Fortran-Python interface generator, once the interface is created there is no need for the generator. 3) I'm sure f2py is useful, but I doubt that it has very general usefulness. There are lots of other useful Python packages, but we're not including them in scipy-core either. 4) f2py and weave don't fit in well with the rest of scipy-core, which is mainly standard numerical algorithms.
I'd like to argue that these two tools are actually critically important in the core of a python for scientific computing toolkit, at its most basic layer. The reason is that python's dynamic runtime type checking makes it impossible to write efficient loop-based code, as we all know. And it is not always feasible to write all algorithms in terms of Numeric vector operations: sometimes you just need to write an indexed loop. At this point, the standard python answer is 'go write an extension module'. While writing extension modules by hand, from scratch, is not all that hard, it certainly presents a significant barrier for less experienced programmers.

And yet both weave and f2py make it incredibly easy to get working compiled array code in no time at all. I say this from direct experience, having pointed colleagues to weave and f2py for this very problem. After handing them some notes I have to get started, they've come back saying "I can't believe it was that easy: in a few minutes I had sped up the loop I needed with a bit of C, and now I can continue working on the problem I'm interested in". I know for a fact that if I'd told them to write a full extension module by hand, the result would have been quite different.

The reality is that, in scientific work, you are likely to run into this problem at a very early stage, much more so than for other kinds of python usage. For this reason, it is important that the basic toolset provides a clean solution from the start. At least that's been my experience.

Regards, f
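A concrete instance of the "indexed loop" problem Fernando describes is a recurrence, where each element depends on the previous result, so no single Numeric vector operation computes it. The function below is an illustrative sketch; with weave or f2py, the loop body is exactly what one would hand to the compiler:

```python
# The kind of loop that cannot be written as one vector operation:
# y[i] depends on y[i-1], so pure Python must iterate element by
# element -- precisely the loop weave.inline or f2py would compile.
def smooth(x, alpha=0.5):
    """Exponential smoothing: y[i] = alpha*x[i] + (1-alpha)*y[i-1]."""
    if not x:
        return []
    y = [0.0] * len(x)
    y[0] = x[0]
    for i in range(1, len(x)):   # data-dependent: not vectorizable
        y[i] = alpha * x[i] + (1 - alpha) * y[i - 1]
    return y

print(smooth([2.0, 0.0, 0.0]))  # -> [2.0, 1.0, 0.5]
```

In pure Python this loop pays interpreter overhead on every iteration; moving just its body to compiled code is the "bit of C" Fernando's colleagues were so happy with.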

On 16.03.2005, at 02:46, Fernando Perez wrote:
The reality is that, in scientific work, you are likely to run into this problem at a very early stage, much more so than for other kinds of python usage. For this reason, it is important that the basic toolset provides a clean solution from the start.
One can in fact argue that f2py, weave, and other tools (Pyrex comes to mind) are the logical extensions of Distutils, which is part of the Python core. As long as they can be installed without additional requirements (in particular without requiring the compilers that they need to work), I don't mind having them in the core distribution, though I would still have them as logically separate packages (i.e. not scipy.core.f2py but scipy.f2py).

Konrad.

Travis Oliphant wrote:
1) There will be a scipy_core package which will be essentially what Numeric has always been (plus a few easy to install extras already in current scipy_core). It will likely contain the functionality of (the names and placements will be similar to current scipy_core):

Numeric3 (actually called ndarray or narray or numstar or numerix or something....)
fft (based on c-only code -- no fortran dependency)
linalg (a lite version -- no fortran or ATLAS dependency)
stats (a lite version --- no fortran dependency)
special (only c-code --- no fortran dependency)
That would be great! If it can be installed as easily as Numerical Python (and I have no reason to believe it won't be), I will certainly point users to this package instead of the older Numerical Python. I'd be happy to help out here, but I guess most of this code is working fine already.
2) The rest of scipy will be a package (or a series of packages) of algorithms. We will not try to do plotting as part of scipy. The current plotting in scipy will be supported for a time, but users will be weaned off to other packages: matplotlib, pygist (for xplt -- and I will work to get any improvements for xplt into pygist itself), gnuplot, etc.
Let me know which improvements from xplt you want to include into pygist. It might also be a good idea to move the pygist web pages to scipy.org. --Michiel.

Can I put in a good word for Fortran? Not the language itself, but the available packages for it. I've always thought that one of the really good things about Scipy was the effort put into getting all those powerful, well tested, robust Fortran routines from Netlib inside Scipy. Without them, it seems to me that folks who just install the new scipy_base are going to re-invent a lot of wheels. Is it really that hard to install g77 on non-Linux platforms? Steve Walton

On Mar 10, 2005, at 18:33, Stephen Walton wrote:
Can I put in a good word for Fortran? Not the language itself, but the available packages for it. I've always thought that one of the really good things about Scipy was the effort put into getting all those powerful, well tested, robust Fortran routines from Netlib inside Scipy. Without them, it seems to me that folks who just install the new scipy_base are going to re-invent a lot of wheels.
Is it really that hard to install g77 on non-Linux platforms?
It takes some careful reading of the instructions, which in turn requires a good command of the English language, including some peculiar technical terms, and either some experience in software installation or a high intimidation threshold. It also takes a significant amount of time and disk space.

Konrad.

konrad.hinsen@laposte.net writes:
On Mar 10, 2005, at 18:33, Stephen Walton wrote:
Can I put in a good word for Fortran? Not the language itself, but the available packages for it. I've always thought that one of the really good things about Scipy was the effort put into getting all those powerful, well tested, robust Fortran routines from Netlib inside Scipy. Without them, it seems to me that folks who just install the new scipy_base are going to re-invent a lot of wheels.
Is it really that hard to install g77 on non-Linux platforms?
It takes some careful reading of the instructions, which in turn requires a good command of the English language, including some peculiar technical terms, and either some experience in software installation or a high intimidation threshold.
It also takes a significant amount of time and disk space.
Konrad.
I don't know about Windows, but on OS X it involves going to http://hpc.sourceforge.net/ and following the one paragraph of instructions. That could even be simplified if a .pkg were made... In fact, it's so easy to make a .pkg with PackageMaker that I've done it :-) I've put a .pkg of g77 3.4 for OS X (using the above binaries) at http://arbutus.mcmaster.ca/dmc/osx/ [Warning: unsupported and lightly-tested. I'll email Gaurav Khanna about making packages of his other binaries.] It'll run, install into /usr/local/g77v3.4, and make a symlink at /usr/local/bin/g77 to the right binary. (To compile SciPy with this, I have to add -lcc_dynamic to the libraries to link with. I've got a patch which I'll submit to the SciPy bug tracker for that, soonish.)

--
David M. Cooke
http://arbutus.physics.mcmaster.ca/dmc/
cookedm@physics.mcmaster.ca

On 10.03.2005, at 21:44, David M. Cooke wrote:
I don't know about Windows, but on OS X it involves going to http://hpc.sourceforge.net/ and following the one paragraph of instructions. That could even be simplified if a .pkg were made...
I wasn't thinking of Windows and OS X, but of the less common Unices. I did my last gcc/g77 installation three years ago on an Alpha station running whatever Compaq's Unix is called. It worked without any problems, but it still took me about two hours, and I am pretty experienced at installation work.

Konrad.

Stephen Walton wrote:
Can I put in a good word for Fortran? Not the language itself, but the available packages for it. I've always thought that one of the really good things about Scipy was the effort put into getting all those powerful, well tested, robust Fortran routines from Netlib inside Scipy. Without them, it seems to me that folks who just install the new scipy_base are going to re-invent a lot of wheels.
Is it really that hard to install g77 on non-Linux platforms?
I agree that Netlib should be in SciPy. But why should Netlib be in scipy_base? If SciPy evolves into a website of scientific packages for python, I presume Netlib will be in one of those packages, maybe even a package by itself. Such a package, together with a couple of binary installers for common platforms, will be appreciated by users and developers who need Netlib. But if Netlib is in scipy_base, you're effectively forcing most users to waste time on Fortran only to install something they don't need. In turn, those users will ask their developers for help if something goes wrong (or give up altogether). And those developers, also not willing to waste time on something they don't need, will tell their users to use Numerical Python instead of SciPy. --Michiel.

Michiel Jan Laurens de Hoon wrote:
I agree that Netlib should be in SciPy. But why should Netlib be in scipy_base?
It should not, and I'm sorry if my original message made it sound like I was advocating for that. I was mainly advocating for f2py to be in scipy_base.

"Travis" == Travis Oliphant <oliphant@ee.byu.edu> writes:
Travis> It would seem that while the scipy conference demonstrates a
Travis> continuing and even increasing use of Python for scientific
Travis> computing, not as many of these users are scipy devotees. Why?

Hi Travis, I like a lot of your proposal, and I want to throw a couple of additional ideas into the mix. There are two ideas about what scipy is: a collection of scientific algorithms and a general purpose scientific computing environment. On the first front, scipy has been a great success; on the second, less so. I think the following would be crucial to make such an effort a success (some of these are just restatements of your ideas with additional comments):

* Easy to install: it would probably be important to have a fault-tolerant install, so that even if a component fails, the parts that don't depend on it can continue. Matthew Knepley's build system might be an important tool to make this work right for source installs, rather than trying to push distutils too hard.

* A package repository and a way of specifying dependencies between the packages, allowing automated recursive downloads a la apt-get, yum, etc. So basically we have to come up with a package manager, and probably one that supports source as well as binary installs. Everyone knows this is a significant problem in python, and we're in a good place to tackle it in that we have experience distributing complex packages across platforms which are a mixture of python/C/C++/FORTRAN, so if we can make it work, it will probably work for all of python. I think we would want contributions from people who do packaging on OSX and win32, eg Bob Ippolito, Joe Cooper, Robert Kern, and others.

* Transparent support for Numeric, numarray and Numeric3 built into a compatibility layer, eg something like matplotlib.numerix, which enables the user to be shielded from past and future changes in the array package.
If you and the numarray developers can agree on that interface, that is an important start, because no matter how much success you have with Numeric3, Numeric 23.x and numarray will be in the wild for some time to come. Having all the major players come together and agree on a core interface layer would be a win. In practice, it works well in matplotlib.numerix.

* Buy-in from the developers of all the major packages that people want and need, so that the CVS / SVN can live on a single site which also has mailing lists etc. I think this is a possibility, actually; I'm open to it at least.

* Good tutorial, printable documentation, perhaps following a "dive into python" model with a "just-in-time" model of teaching the language; ie, task oriented.

A question I think should be addressed is whether scipy is the right vehicle for this aggregation. I know this has been a long-standing goal of yours and I appreciate your efforts to continue to make it happen. But there is a lot of residual belief that scipy is hard to install, founded partly in an old memory that refuses, sometimes irrationally, to die, and partly in people's continued difficulties. If we make a grand effort to unify into a coherent whole, we might be better off with a new name that doesn't carry the difficult-to-install connotation. And easy-to-install should be our #1 priority.

Another reason to consider a neutral name is that it wouldn't scare off a lot of people who want to use these tools but don't consider themselves to be scientists. In matplotlib, there are people who just want to make bar and pie charts, and in talks I've given many people are very happy when I tell them that I am interested in providing plotting capabilities outside the realm of scientific plotting. This is obviously a lot to bite off, but it could be made viable with some dedicated effort; python is like that.
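A numerix-style layer is, at its core, import-time backend selection. Here is a minimal, hypothetical sketch of the idea -- the function name `pick_backend` and the candidate list are illustrative, not matplotlib's actual API, and the stdlib `array` module is included only so the sketch runs on a bare interpreter:

```python
import importlib

def pick_backend(candidates):
    """Return the first importable module from candidates.

    Sketch of how a numerix-style compatibility layer could shield
    user code from whichever array package happens to be installed.
    """
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError("no array backend found: " + ", ".join(candidates))

# In 2005 the candidate list would have been ["numarray", "Numeric"];
# the stdlib "array" module is a stand-in so this runs anywhere.
backend = pick_backend(["numarray", "Numeric", "array"])
```

User code would then import names from `backend` rather than from a specific array package, so switching packages becomes a one-line change in the layer instead of an edit to every module.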
Another concern I have, though, is that it seems to duplicate a lot of the enthought effort to build a scientific python bundle -- they do a great job already for win32, and I think enthought editions for linux and OSX are in the works. The advantage of your approach is that it is modular rather than monolithic. To really make this work, I think enthought would need to be on board with it. Eg mayavi2 and traits2 are both natural candidates for inclusion into this beast, but both live in the enthought subversion tree. Much of what you describe seems to be parallel to the enthought python, which also provides scipy, numeric, ipython, mayavi, plotting, and so on.

I am hesitant to get too involved in the packaging game -- it's really hard and would take a lot of work. We might be better off each making little focused pieces, and letting packagers (pythonmac, fink, yum, debian, enthought, ...) do what they do well. Not totally opposed, mind you, just hesitant....

JDH

John Hunter wrote:
I think we would want contributions from people who do packaging on OSX and win32, eg Bob Ippolito, Joe Cooper, Robert Kern, and others.
Just a note about this. For OS-X, Jack Jansen developed PIMP, and the Package Manager App to go with it. Someone even made a wxPython-based Package Manager app also. It was designed to be platform independent from the start. I think part of the idea was that if it caught on on the Mac, maybe it would be adopted elsewhere. I think it's worth looking at.

However... the PIMP database maintenance has not been going very well. In fact, to some extent it's been abandoned, and replaced with a set of native OS-X .mpkg files. These are easy to install, and familiar to Mac users. This supports my idea from long ago: what we need is simply a set of packages in a platform-native format: Windows installers, rpms, .debs, .mpkg, etc. Whenever this comes up, it seems like people focus on nifty technological solutions for a package repository, which makes sense as we're all a bunch of programmers, but I'm not sure it gets the job done. A simple web site where you can download all the installers you need is fine.

-Chris
--
Christopher Barker, Ph.D.
Oceanographer
NOAA/OR&R/HAZMAT
7600 Sand Point Way NE, Seattle, WA 98115
(206) 526-6959 voice / (206) 526-6329 fax / (206) 526-6317 main reception
Chris.Barker@noaa.gov

Hello, On Wed, Mar 09, 2005 at 12:32:33AM -0700, Travis Oliphant wrote:
Subdivide scipy into several super packages that install cleanly but can also be installed separately. Implement a CPAN-or-yum-like repository and query system for installing scientific packages.
Please don't try to reinvent a repository and installation system specific to scipy. Under Unix, distribution and package systems are already solving this problem. Python folks have already reinvented part of the wheel with the Python Package Index, which can be updated in one command using distutils. If your goal is to have a unique reference for scientific tools, I think it would be better to set up a Python Scientific Package Index, or just use the existing one at http://www.python.org/pypi/. Packaging/installation/querying/upgrading is a complex task better left to dedicated existing tools, namely apt-get/yum/urpmi/portage/etc.

Regarding subdividing scipy into several packages installable separately under the same scipy base namespace umbrella, you should be aware that PyXML has had many problems doing the same (but PyXML has also been shadowing existing parts of the standard library, which may feel a bit too weird).

--
Nicolas Chauvat
logilab.fr - advanced computing and knowledge management services
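Whichever tool ends up doing the work, the core of an apt-get-style recursive fetch is just a depth-first walk over dependency metadata. A toy sketch, assuming a hypothetical repository index that maps each package name to the packages it requires (the names and the `install_order` helper are illustrative, not any real tool's API):

```python
def install_order(package, index):
    """Return a package and its dependencies, dependencies first.

    `index` stands in for hypothetical repository metadata mapping a
    package name to the list of packages it requires. Cycles are not
    handled; a real tool would need to detect and report them.
    """
    seen, order = set(), []

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        for dep in index.get(name, []):
            visit(dep)
        order.append(name)   # emit only after all dependencies

    visit(package)
    return order

# Toy index: scipy needs scipy_base and Numeric; scipy_base needs Numeric.
index = {"scipy": ["scipy_base", "Numeric"], "scipy_base": ["Numeric"]}
print(install_order("scipy", index))  # ['Numeric', 'scipy_base', 'scipy']
```

A repository server then only has to publish the index; the hard parts that existing tools already solve are everything around this walk -- version constraints, binary vs. source selection, and safe upgrades.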

Travis Oliphant <oliphant@ee.byu.edu> writes:
2) Installation problems -- I'm not completely clear on what the "installation problems" really are. I hear people talk about them, but Pearu has made significant strides to improve installation, so I'm not sure what precise issues remain. Yes, installing ATLAS can be a pain, but scipy doesn't require it. Yes, fortran support can be a pain, but if you use g77 then it isn't a big deal. The reality, though, is that there is this perception of installation trouble and it must be based on something. Let's find out what it is. Please speak up users of the world!!!!
While I am not a scientific user, I occasionally have a need for something like stats, linear algebra, or other such functions. I'm happy to install something (I'm using Python on Windows, so when I say "install", I mean "download and run a binary installer") but I'm a casual user, so I am not going to go to too much trouble.

First problem - no scipy Windows binaries for Python 2.4. I'm not going to downgrade my Python installation for the sake of scipy. Even assuming there were such binaries, I can't tell from the installer page whether I need to have Numeric, or is it included. Assuming I need to install it, the binaries say Numeric 23.5, with 23.1 available. But the latest Numeric is 23.8, and only 23.8 and 23.7 have Python 2.4 compatible Windows binaries. Stuck again. As for the PIII/P4SSE2 binaries, I don't know which of those I'd need, but that's OK, I'd go for "Generic", on the basis that speed isn't relevant to me...

There's no way on Windows that I'd even consider building scipy from source - my need for it simply isn't sufficient to justify the cost. As I say, this is from someone who is clearly not in the target audience of scipy, but maybe it is of use...

Paul.
--
A little inaccuracy sometimes saves tons of explanation -- Saki

Paul Moore wrote:
Travis Oliphant <oliphant@ee.byu.edu> writes
2) Installation problems -- I'm not completely clear on what the "installation problems" really are.
While I am not a scientific user, I occasionally have a need for something like stats, linear algebra, or other such functions. I'm happy to install something (I'm using Python on Windows, so when I say "install", I mean "download and run a binary installer") but I'm a casual user, so I am not going to go to too much trouble. ... There's no way on Windows that I'd even consider building scipy from source - my need for it simply isn't sufficient to justify the cost.
As I say, this is from someone who is clearly not in the target audience of scipy, but maybe it is of use...
I think you perfectly described the experience of a typical Biopython user. So as far as I'm concerned, you're squarely in the target audience of SciPy, if it intends to replace Numeric. --michiel.
participants (22)
- Alan G Isaac
- Bob Ippolito
- Chris Barker
- cookedm@physics.mcmaster.ca
- Cory Davis
- Daishi Harada
- eric jones
- Fernando Perez
- John Hunter
- konrad.hinsen@laposte.net
- Matthew Brett
- Michiel Jan Laurens de Hoon
- Nicolas Chauvat
- Paul Moore
- Pearu Peterson
- pearu@cens.ioc.ee
- Perry Greenfield
- Peter Verveer
- Prabhu Ramachandran
- Ralf Juengling
- Stephen Walton
- Travis Oliphant