RE: [SciPy-dev] scipy and ATLAS (in)dependency
Pearu Peterson wrote:
Hi,
Today I was reading "Special Session: Making Python attractive to General Scientists (Harrington, Greenfield)" notes in
http://www.scipy.org/wikis/scipy04/ConferenceSchedule
and I was a bit surprised by one of the conclusions: that maybe ATLAS optimization should be undone because of difficulties in building ATLAS.
Though, IMHO, building recent versions of the ATLAS libraries is not difficult at all on Linux platforms, and not even on MS Windows (there are step-by-step instructions available on the Scipy site); it just may be a very time-consuming process ;). It's not even an issue on Mac, as Scipy uses its vecLib framework. I can't say much about the situation on other Unix platforms such as IRIX, Sun, etc., because I lack access to such platforms. But most current and potential Scipy users are on MS Windows, Linux, or Mac anyway.
But that was not the point that surprised me. It was actually the fact that people seem to be unaware that Scipy can be built without an ATLAS dependency by using the Fortran sources of the BLAS and LAPACK libraries. Let me stress that nothing in Scipy specifically requires the ATLAS libraries: the corresponding interface in scipy.linalg is smart enough to pick up ATLAS-optimized routines when they are available and to use Fortran BLAS/LAPACK routines when they are not.
My point is that there is (almost) nothing to do to "undo ATLAS optimization" in Scipy. ATLAS is optional already. However, when ATLAS is not available then Scipy needs BLAS/LAPACK libraries that currently must be provided by the system or users must download them from Netlib. I think that BLAS/LAPACK libraries are the only external libraries that Scipy currently depends on.
To get rid of this dependency, I'd suggest including the sources of the BLAS/LAPACK libraries in Scipy and using them silently when optimized BLAS/LAPACK libraries are not available. This would be very similar to scipy.fftpack, which silently uses the Fortran fftpack sources when the FFTW libraries are not found.
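The selection order described here can be sketched in a few lines of Python. This is only an illustration of the proposed fallback, not Scipy's actual build logic; the function name `pick_linalg_backend` and the backend labels are hypothetical.

```python
# Hypothetical preference order: optimized libraries first, bundled
# Fortran reference sources last. The labels are illustrative only.
PREFERENCE = ("atlas", "system_blas_lapack", "bundled_fortran_sources")


def pick_linalg_backend(available):
    """Return the most preferred backend among those detected.

    `available` is any iterable of backend labels found at build time.
    The bundled sources are assumed to be always usable, so they act as
    the silent fallback when nothing better is detected.
    """
    found = set(available) | {"bundled_fortran_sources"}
    for name in PREFERENCE:
        if name in found:
            return name
    return PREFERENCE[-1]


if __name__ == "__main__":
    print(pick_linalg_backend(["system_blas_lapack"]))
    print(pick_linalg_backend([]))
```

With the sources bundled, the last entry is always reachable, so the build never fails for lack of an external linear algebra library.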
Just wanted to clear some things up.. Pearu
As one of the notetakers, I'll admit I don't have much personal experience with the issue, so I could have gotten the conversation wrong. But I think the gist was that some felt that there should be easy-to-install binary packaging for all popular platforms, not easy-to-build packaging (though of course one hopes that is also available). If having an optimized ATLAS got in the way of that goal, then it was argued that it would be better to have a slow version of the linear algebra libraries in the easy-to-install binaries. Those who needed the optimized versions could build them themselves. The stress was on making it very painless for people to install the general setup on their machines, and that meant binary installers. If the conclusion that optimized ATLAS made that harder (either for the people doing the binary packaging--thus it never happens--or for the people doing the installing) is wrong, then I should correct the notes. But I think that was the point of the comments.
Perry
I remember that the discussion mainly focused on binary packages also. It isn't that we should remove ATLAS from the options available to people building from CVS, but that ATLAS makes building RPMs harder to do cleanly (in an automated way) and also calls for many more versions of installers for win32. I think our approach has been to provide Pentium III optimized ATLAS in the Python -- Enthought Edition that we make available.

Joe Harrington took a poll to see how many people would object to having a single generic version of SciPy on the web site for each primary platform instead of multiple optimized binaries. 80-90% of the room thought this was a good idea to reduce the pain on the people maintaining packages. So:

1. I don't think we should remove ATLAS from the system, but it might be worthwhile to make it more obvious (on web pages or whatever) to people building from CVS that ATLAS isn't required.
2. Adding LAPACK/BLAS to the repository would add a lot of code to the repository. I'm not altogether opposed to this, but I'm not sure it is that great an idea either. 90% of serious users will link to a more optimized version (even if it is a generic version of ATLAS). This pushes me to -0 on the idea.
3. Building a single binary per platform for download from SciPy using a generic version of ATLAS is a reasonable trade-off between maintenance headache and performance (with "reasonable" here defined by a community poll).

thanks,
eric
_______________________________________________ Scipy-dev mailing list Scipy-dev@scipy.net http://www.scipy.net/mailman/listinfo/scipy-dev
eric jones wrote:
of installers for win32. I think our approach has been to provide Pentium III optimized ATLAS in the Python -- Enthought Edition that we make available.
I think that, at least on x86, a P-III is a very reasonable baseline for the 'easy' distribution, and I'd bet it is likely to satisfy most users. Those for whom the difference between an SSE-2, P-IV optimized ATLAS and a P-III one is truly significant can definitely take care of themselves and build it. Given that scipy.org already provides a number of precompiled ATLASes (thank you!), most of the hard part is already solved even for hand-builds.

Quite honestly, these days building scipy even from CVS is very easy (assuming you already have Numeric and F2PY in place). Pseudo shell script follows (make it a real one simply by defining ARCH and SCIPY_PATH):

    ARCH=P4SSE2_2HT   # set to the right string for your arch
    SCIPY_PATH=/your/path/to/scipy

    # Grab a prebuilt ATLAS from scipy.org, according to your architecture:
    wget http://www.scipy.org/download/atlasbinaries/linux/atlas3.6.0_Linux_${ARCH}.tgz

    # Unpack the tarball, this makes directory Linux_$ARCH:
    tar -xzf atlas3.6.0_Linux_${ARCH}.tgz

    ./install_atlas.py Linux_${ARCH}
    # This is a tiny script I wrote for this purpose, which just copies things
    # in the right place. It's trivial, but it saves me from actually having
    # to think what to do. Attached in case anybody cares.

    cd $SCIPY_PATH
    cvs -q up -P -d   # update CVS
    python setup.py install

That's it! And after you've done it once (so atlas is in place), you really just need to re-run the last two lines to update/reinstall. I used to cringe (years ago) at the thought of building scipy by hand, but these days it's honestly trivial.

But back to the original topic, I really think the P-III ATLAS is an excellent compromise point for most users: it keeps the maintainer's burden under control and makes a very good (if not tip-top optimal) click-and-run scipy install feasible for new users.

Cheers,
f
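The attached install_atlas.py is not reproduced in the archive. As a rough idea of what such a "copy things in the right place" helper might look like, here is a minimal sketch. It is not the original script: the function name, the assumed tarball layout (static `*.a` libraries and `*.h` headers somewhere under the unpacked directory) and the target directories are all assumptions.

```python
import shutil
from pathlib import Path


def install_atlas(src_dir, prefix):
    """Copy ATLAS libraries and headers from an unpacked tarball into `prefix`.

    Assumed layout (not confirmed by the thread): static libraries (*.a)
    and headers (*.h) live somewhere below `src_dir`. Libraries go to
    <prefix>/lib/atlas and headers to <prefix>/include/atlas.
    """
    src = Path(src_dir)
    lib_dir = Path(prefix) / "lib" / "atlas"
    inc_dir = Path(prefix) / "include" / "atlas"
    lib_dir.mkdir(parents=True, exist_ok=True)
    inc_dir.mkdir(parents=True, exist_ok=True)

    copied = []
    for path in sorted(src.rglob("*")):
        if not path.is_file():
            continue
        if path.suffix == ".a":
            target = lib_dir / path.name
        elif path.suffix == ".h":
            target = inc_dir / path.name
        else:
            continue  # ignore READMEs, makefiles, and anything else
        shutil.copy2(path, target)
        copied.append(target)
    return copied
```

A call such as `install_atlas("Linux_P4SSE2_2HT", "/usr/local")` would then play the role of the `./install_atlas.py Linux_${ARCH}` step, provided the build is told to look for ATLAS in that location.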
Quite honestly, these days building scipy even from CVS is very easy (...) python setup.py install

For at least some system administrators, scattering files into the system without adequate package management is not an option. Supported software has to fit into the system maintenance process, i.e. on most systems "apt-get upgrade" should keep the software up to date. At the moment scipy and its dependencies are not very rpm-friendly:

Numeric and SciPy are both statically linked against lapack and blas, or against atlas. While this may enhance speed, it also hands the architecture dependence over to SciPy.

The standard blas package, at least for Fedora Core 2, does not contain all the libraries needed to build SciPy. Thus, this package has to be replaced, which may interfere with other applications like octave and scilab.

F2PY's setup.py has some failing preconditions about the path when changing to the 'src' directory in line 49:

    F2PY Version 2.43.239_1835
    Traceback (most recent call last):
      File "setup.py", line 131, in ?
        if 'install' in sys.argv and need_scipy_distutils():
      File "setup.py", line 49, in need_scipy_distutils
        os.chdir('src')
    OSError: [Errno 2] No such file or directory: 'src'
    Fehler: Bad exit status from /var/tmp/rpm-tmp.43647 (%install)

This prevents the rpm from being built.

Both SciPy and Scipy_core have failing preconditions about the number of rpm packages built. E.g. on Fedora Core 2 the rpm packages built in build/bdist.linux-i686/rpm/RPMS/i386/ are SciPy-0.3.1_287.4340-1.i386.rpm and SciPy-debuginfo-0.3.1_287.4340-1.i386.rpm.

Another minor issue is that the spelling of the package names is inconsistent: SciPy vs. Scipy vs. scipy.

Ways to overcome these problems could be:

- Dynamically link atlas and lapack libraries and provide reasonable default rpm packages, e.g. i386, i686, athlon and x86_64, for them and standard packages (i386, noarch) for the rest.
- Statically link atlas and lapack libraries and provide reasonable default rpm packages.
- Provide src-rpm packages or spec-files.

Michael
On Mon, 11 Oct 2004, Michael Reimpell wrote:
Quite honestly, these days building scipy even from CVS is very easy (...) python setup.py install
For at least some system administrators, scattering files into the system without adequate package management is not an option. Supported software has to fit into the system maintenance process, i.e. on most systems "apt-get upgrade" should keep the software up to date. At the moment scipy and its dependencies are not very rpm-friendly:
See http://www.scipy.org/development/packagescipy.txt Btw, scipy and its dependencies are now available also as a part of Debian sid system thanks to José Fonseca, Alexandre Fayolle and Marco Presi. So, if RPM packagers would follow the above document then it should be possible to build RPMs without dependency conflicts.
Numeric and SciPy are both statically linked against lapack and blas, or against atlas. While this may enhance speed, it also hands the architecture dependence over to SciPy.
And so? Any software that is linked against atlas will have this dependence.
The standard blas package, at least for fedora core 2, does not contain all libraries needed to build SciPy. Thus, this package has to be replaced which may interfere with other applications like octave and scilab.
To build Scipy, you'll need the blas and lapack libraries at a minimum. You don't need atlas to build Scipy.
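One quick way to see which of these libraries a system exposes is the standard library's `ctypes.util.find_library`. The sketch below is only a rough probe: it reports shared libraries visible to the loader, so static libraries and development files may go undetected, and a hit does not prove that the package contains every routine Scipy needs. The function name `probe_linalg_libraries` is hypothetical.

```python
from ctypes.util import find_library


def probe_linalg_libraries(names=("atlas", "lapack", "blas")):
    """Map each library name to what the loader lookup reports, or None."""
    return {name: find_library(name) for name in names}


if __name__ == "__main__":
    for name, location in probe_linalg_libraries().items():
        print(f"{name}: {location or 'not found'}")
```

If `lapack` and `blas` are reported but `atlas` is not, that matches the minimal setup described above.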
F2PY's setup.py has some failing preconditions about the path when changing to the 'src' directory in line 49:

    F2PY Version 2.43.239_1835
    Traceback (most recent call last):
      File "setup.py", line 131, in ?
        if 'install' in sys.argv and need_scipy_distutils():
      File "setup.py", line 49, in need_scipy_distutils
        os.chdir('src')
    OSError: [Errno 2] No such file or directory: 'src'
    Fehler: Bad exit status from /var/tmp/rpm-tmp.43647 (%install)

This prevents the rpm from being built.
For some reason the src/fortranobject.{c,h} files were not written to the MANIFEST file when building RPMs with 'setup.py bdist_rpm'. I worked around this issue by including them in MANIFEST.in. Try again with F2PY-2.43.239_1844.
Both SciPy and Scipy_core have failing preconditions about the number of rpm packages built. E.g. on Fedora Core 2 the rpm packages built in build/bdist.linux-i686/rpm/RPMS/i386/ are SciPy-0.3.1_287.4340-1.i386.rpm and SciPy-debuginfo-0.3.1_287.4340-1.i386.rpm.
This is either a distutils or a Fedora Core 2 issue, not a Scipy one. Have you tried building rpms for any other Python package with extension modules using 'setup.py bdist_rpm' on your system? I'll bet you'll get similar failures.
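Assuming the failure comes from a packaging step that expects exactly one binary rpm and is surprised by the extra -debuginfo package, a tolerant check could separate the two kinds before counting. This is a hypothetical sketch, not distutils code; `split_rpms` is an illustrative name.

```python
def split_rpms(filenames):
    """Separate main binary rpms from their -debuginfo companions."""
    main, debuginfo = [], []
    for name in filenames:
        if "-debuginfo-" in name:
            debuginfo.append(name)
        else:
            main.append(name)
    return main, debuginfo


if __name__ == "__main__":
    built = [
        "SciPy-0.3.1_287.4340-1.i386.rpm",
        "SciPy-debuginfo-0.3.1_287.4340-1.i386.rpm",
    ]
    main, debuginfo = split_rpms(built)
    print("main packages:", main)
    print("debuginfo packages:", debuginfo)
```

A check on `len(main) == 1` would then pass for the two files reported above.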
Another minor issue is that the spelling of the package names is inconsistent: SciPy vs. Scipy vs. scipy.
I agree. I would drop using SciPy and use Scipy when referring to the project and scipy when referring to Python package.
Ways to overcome these problems could be: Dynamically link atlas and lapack libraries and provide reasonable default rpm packages, e.g. i386, i686, athlon and x86_64, for them and standard packages (i386, noarch) for the rest.
Statically link atlas and lapack libraries and provide reasonable default rpm packages.
Provide src-rpm packages or spec-files.
I hope that someone who is an expert in building RPMs can help resolve this issue. I can build binaries for a general Linux system and make Win32 installers, but I have neither the time nor the resources to deal with RPMs myself.
Regards,
Pearu
participants (5)
- eric jones
- Fernando Perez
- Michael Reimpell
- Pearu Peterson
- Perry Greenfield