Performance: matrix multiplication vs. Matlab
Hi all,

I would be glad if someone could help me with the following issue: from what I've read on the web, numpy should be about as fast as Matlab. However, when I do simple matrix multiplication, it consistently comes out about five times slower. I tested this using:

    A = 0.9 * numpy.matlib.ones((500, 100))
    B = 0.8 * numpy.matlib.ones((500, 100))

    def test():
        for i in range(1000):
            A * B.T

I also tried matrices ten times larger with a tenth as many iterations, used xrange instead of range and arrays instead of matrices, and tested on two different machines; the result always seems to be the same. Any idea what could be going wrong? I'm using ipython and Matlab R2008b.

Thanks,

David
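For reference, a minimal self-contained version of this benchmark (the 5x figure will of course depend on the machine and on the BLAS numpy was built against):

    import time
    import numpy
    import numpy.matlib

    A = 0.9 * numpy.matlib.ones((500, 100))
    B = 0.8 * numpy.matlib.ones((500, 100))

    def bench(f, n=1000):
        # time n calls of f and return the elapsed seconds
        t0 = time.time()
        for i in range(n):
            f()
        return time.time() - t0

    print("A * B.T:      %.3f s" % bench(lambda: A * B.T))
    print("dot(A, B.T):  %.3f s" % bench(lambda: numpy.dot(A, B.T)))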
Have a look at this thread:
http://www.mail-archive.com/numpy-discussion@scipy.org/msg13085.html
The speed difference is probably due to the fact that the matrix
multiplication does not call an optimized BLAS routine, e.g.
the ATLAS BLAS.
Sebastian
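A quick way to see which BLAS/LAPACK numpy was built against (the exact output varies between builds; look for atlas entries rather than the fallback lapack_lite):

    import numpy
    # prints the blas/lapack information recorded at build time
    numpy.__config__.show()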
Sebastian is right.

Since Matlab R2007 (I think that's the version), it has included support for multi-core architectures. On my Core 2 Quad here at the office, R2008b has no problem utilizing 100% cpu for large matrix multiplications.

If you download and build atlas and lapack from source and enable parallel threads in atlas, then compile numpy against these libraries, you should achieve similar if not better performance (since the atlas routines will be tuned to your system).

If you're on Windows, you need to do some trickery to get threading to work (the instructions are on the atlas website).

Chris
I should update after reading the thread Sebastian linked:
The current 1.3 version of numpy (I don't know about previous versions) uses
the optimized ATLAS BLAS routines for numpy.dot() if numpy was compiled with
these libraries. I've verified this on linux only, though it shouldn't be
any different on Windows, AFAIK.
chris
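On numpy builds of that era, one rough way to check whether dot() got the accelerated path is to look for the _dotblas extension (an internal detail, so treat this as a heuristic rather than a guarantee):

    try:
        # present only when numpy was compiled against an optimized BLAS
        from numpy.core import _dotblas
        print("dot() is using an optimized BLAS")
    except ImportError:
        print("no _dotblas: dot() falls back to the slow default")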
On Thu, Jun 4, 2009 at 10:56 PM, Chris Colbert wrote:
The current 1.3 version of numpy uses the optimized ATLAS BLAS routines for numpy.dot() if numpy was compiled with these libraries.

in the best of all possible worlds this would be done by a package maintainer....

Sebastian
Sebastian Walter wrote:
in the best of all possible worlds this would be done by a package maintainer....
Numpy packages on windows do use ATLAS, so I am not sure what you are referring to?

On a side note, correctly packaging ATLAS is almost inherently impossible, since the build method of ATLAS can never produce the same binary (even on the same machine), and the binary is optimized for the machine it was built on. So if you want the best speed, you should build ATLAS yourself - which is painful on Windows (you need cygwin).

On Windows, if you really care about speed, you should try linking against the Intel MKL. That's what Matlab uses internally in recent versions, so you would get the same speed. But that's rather involved.

cheers,

David
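For anyone attempting the MKL route, the usual starting point is a site.cfg section along these lines before building numpy (a sketch only - the paths and library names depend on your MKL version and layout, so check numpy's site.cfg.example for the exact keys):

    [mkl]
    # adjust for where your MKL is installed and which version you have
    library_dirs = /opt/intel/mkl/lib/32
    include_dirs = /opt/intel/mkl/include
    lapack_libs = mkl_lapack
    mkl_libs = mkl, guide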
On Fri, Jun 5, 2009 at 11:58 AM, David Cournapeau wrote:
Numpy packages on windows do use ATLAS, so I am not sure what you are referring to?

I'm on debian unstable and my numpy (version 1.2.1) uses an unoptimized blas. I had the impression that most people that use numpy are on linux. But apparently this is a misconception.

On a side note, correctly packaging ATLAS is almost inherently impossible [...] so if you want the best speed, you should build ATLAS yourself.

In the debian repositories there are different builds of atlas, so there could be different builds for numpy, too. But there aren't....

On windows, if you really care about speed, you should try linking against the Intel MKL.

How much faster is MKL than ATLAS?
Sebastian Walter wrote:
I'm on debian unstable and my numpy (version 1.2.1) uses an unoptimized blas.
Yes, it is because the packages on Linux are not well done in that respect (in their defense, the numpy build is far from being packaging friendly, and is both fragile and obscure).
I had the impression that most people that use numpy are on linux.
Sourceforge numbers tell a different story, at least. I think most users on the ML use linux, and certainly almost every developer uses linux or mac os x. But the ML already filters out most windows users - only geeks read MLs :) I am pretty sure a vast majority of numpy users never even bother to look for the ML.
in the debian repositories there are different builds of atlas so there could be different builds for numpy, too. But there aren't....
There are several problems:
- packagers (rightfully) hate to have many versions of the same software
- as for now, if ATLAS is detected, numpy is built differently than if it is linked against a conventional blas/lapack
- numpy on debian is not built with atlas support

But there is certainly no need to build one numpy version for every atlas: the linux loader can load the most appropriate library depending on your architecture, via the so-called hwcap flag. If your CPU supports SSE2, and you have ATLAS installed for SSE2, then the loader will automatically load the libraries there instead of the ones in /usr/lib by default. But because ATLAS is such a pain to support in binary form, only ancient versions of ATLAS are packaged anyway (3.6.*). So if you care so much, you should build your own.
On windows, if you really care about speed, you should try linking against the Intel MKL. That's what Matlab uses internally on recent versions, so you would get the same speed. But that's rather involved.
It really depends on the CPU, compiler, how atlas was compiled, etc... it can be anywhere from slightly faster to 10 times faster (if you are comparing against a very poorly optimized ATLAS). For some recent benchmarks: http://eigen.tuxfamily.org/index.php?title=Benchmark

cheers,

David
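One quick way to see which BLAS the loader actually resolves at runtime is ldd on one of numpy's compiled modules (the path below is illustrative - adjust for your Python version and install prefix, and it only applies when numpy is linked against the system blas/lapack):

    ldd /usr/lib/python2.5/site-packages/numpy/linalg/lapack_lite.so
    # with an SSE2 ATLAS package installed, the libblas/liblapack lines
    # should resolve to libraries under /usr/lib/sse2 rather than /usr/lib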
David Cournapeau wrote:
It really depends on the CPU, compiler, how atlas was compiled, etc...

David,

The eigen web site indicates that eigen achieves high performance without all the compilation difficulty of atlas. Does eigen have enough functionality to replace atlas in numpy? Presumably it would need C compatibility wrappers to emulate the BLAS functions. Would that kill its performance? Or be very difficult? (I'm asking from curiosity combined with complete ignorance. Until yesterday I had never even heard of eigen.)

Eric
Eric Firing wrote:
David,
The eigen web site indicates that eigen achieves high performance without all the compilation difficulty of atlas. Does eigen have enough functionality to replace atlas in numpy?
No, eigen does not provide a (complete) BLAS/LAPACK interface. I don't know if that's even a goal of eigen (it started as a project for KDE, to support high performance core computations for things like spreadsheets and co).

But even then, it would be a huge undertaking. For all its flaws, LAPACK is old, tested code, in a very stable language (F77). Eigen is:
- not mature
- heavily expression-template-based C++, meaning compilation takes ages + esoteric, impossible-to-decipher compilation errors. We have enough build problems already :)
- SSE dependency hardcoded, since it is set up at build time. That's going backward IMHO - I would rather see a numpy/scipy which can load the optimized code at runtime.

cheers,

David
2009/6/5 David Cournapeau
I would add that it relies on C++ compiler extensions (the restrict keyword), as does blitz. You unfortunately can't expect every compiler to support it unless the C++ committee finally adds it to the standard.

Matthieu

--
Information System Engineer, Ph.D.
Website: http://matthieu-brucher.developpez.com/
Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92
LinkedIn: http://www.linkedin.com/in/matthieubrucher
I'd caution anyone against using Atlas from the repos in Ubuntu 9.04, as the
package is broken:

https://bugs.launchpad.net/ubuntu/+source/atlas/+bug/363510

Just build Atlas yourself; you get better performance AND threading.
Building it is not the nightmare it sounds like. I think I've done it a
total of four times now, both 32-bit and 64-bit builds.
If you need help with it, just email me off list.
Cheers,
Chris
On Fri, Jun 5, 2009 at 2:37 PM, Chris Colbert
If you need help with it, just email me off list.
That's a nice offer. I tried building ATLAS on Debian a year or two ago and got stuck. Clear out your inbox!
since there is demand, and someone already emailed me, I'll put what I did in this post. It pretty much follows what's on the scipy website, with a couple other things I gleaned from reading the ATLAS install guide.

Here it goes; this is valid for Ubuntu 9.04 64-bit (# starts a comment when working in the terminal):

download lapack 3.2.1: http://www.netlib.org/lapack/lapack.tgz
download atlas 3.8.3: http://sourceforge.net/project/downloading.php?group_id=23725&filename=atlas3.8.3.tar.bz2&a=65663372

create folder /home/your-user-name/build/atlas    # this is where we build
create folder /home/your-user-name/build/lapack   # atlas and lapack

extract the folder lapack-3.2.1 to /home/your-user-name/build/lapack
extract the contents of atlas to /home/your-user-name/build/atlas

now in the terminal:

# remove g77 and get stuff we need
sudo apt-get remove g77
sudo apt-get install gfortran
sudo apt-get install build-essential
sudo apt-get install python-dev
sudo apt-get install python-setuptools
sudo easy_install nose

# build lapack
cd /home/your-user-name/build/lapack/lapack-3.2.1
cp INSTALL/make.inc.gfortran make.inc
gedit make.inc
#################
# in the make.inc file make sure the lines read
#   OPTS = -O2 -fPIC -m64
# and
#   NOOPTS = -O0 -fPIC -m64
# the -m64 flags build 64-bit code; if you want 32-bit, simply leave
# the -m64 flags out
#################
cd SRC
# this should build lapack without error
make

# build atlas
cd /home/your-user-name/build/atlas
# this is simply where we will build the atlas
# libs, you can name it what you want
mkdir Linux_X64SSE2
cd Linux_X64SSE2
# need to turn off cpu-throttling
sudo cpufreq-selector -g performance
# if you don't want 64-bit code remove the -b 64 flag. replace the
# number 2400 with your CPU frequency in MHz,
# i.e. my cpu is 2.53 GHz so i put 2530
../configure -b 64 -D c -DPentiumCPS=2400 -Fa -alg -fPIC --with-netlib-lapack=/home/your-user-name/build/lapack/lapack-3.2.1/Lapack_LINUX.a
# the configure step takes a bit, and should end without errors
# this takes a long time, go get some coffee; it should end without error
make build
# this will verify the build, also long running
make check
# this will test the performance of your build and give you feedback on
# it. your numbers should be close to the test numbers at the end
make time
cd lib
# builds single threaded .so's
make shared
# builds multithreaded .so's
make ptshared
# copies all of the atlas libs (and the lapack lib built with atlas)
# to our lib dir
sudo cp *.so /usr/local/lib/

# now we need to get and build numpy
download numpy 1.3.0: http://sourceforge.net/project/downloading.php?group_id=1369&filename=numpy-1.3.0.tar.gz&a=93506515
extract the folder numpy-1.3.0 to /home/your-user-name/build

# in the terminal
cd /home/your-user-name/build/numpy-1.3.0
cp site.cfg.example site.cfg
gedit site.cfg
###############################################
# in site.cfg uncomment the following lines and make them look like these
[DEFAULT]
library_dirs = /usr/local/lib
include_dirs = /usr/local/include

[blas_opt]
libraries = ptf77blas, ptcblas, atlas

[lapack_opt]
libraries = lapack, ptf77blas, ptcblas, atlas
###############################################
# if you want single threaded libs, uncomment those lines instead

# build numpy - should end without error
python setup.py build
# install numpy
python setup.py install
cd /home
sudo ldconfig
python
import numpy
numpy.test()   # this should run with no errors (skipped tests and known-fails are ok)
a = numpy.random.randn(6000, 6000)
numpy.dot(a, a)   # look at your cpu monitor and verify all cpu cores are at 100% if you built with threads
Celebrate with a beer!
Cheers!
Chris
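Once numpy is rebuilt, a rough way to sanity-check the speedup (the absolute number depends entirely on the hardware; the interesting part is the before/after difference):

    import time
    import numpy

    a = numpy.random.randn(2000, 2000)
    t0 = time.time()
    numpy.dot(a, a)
    # a threaded ATLAS should be many times faster here than the
    # reference blas; watch the cpu monitor while it runs
    print("%.2f s" % (time.time() - t0))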
Thanks for this excellent recipe.

I have not tried it out myself yet, but I will follow the instructions on a
clean Ubuntu 9.04 64-bit.
Best,
Minjae
On Sat, 2009-06-06 at 12:59 -0400, Chris Colbert wrote:
../configure -b 64 -D c -DPentiumCPS=2400 -Fa -alg -fPIC --with-netlib-lapack=/home/your-user-name/build/lapack/lapack-3.2.1/Lapack_LINUX.a
Many thanks Chris, I succeeded in building it.

The configure command above contained two problems that I had to correct to get it to work, though. In case other people are trying this, I used:

../configure -b 32 -D c -DPentiumCPS=1800 -Fa alg -fPIC --with-netlib-lapack=/home/your-user-name/build/lapack/lapack-3.2.1/lapack_LINUX.a

That is (in addition to the different -b switch for my 32-bit machine and the different processor speed): the dash before "alg" should be removed, and "Lapack_LINUX.a" should be "lapack_LINUX.a".

Gabriel
OK, perhaps I drank that beer too soon...

Now, numpy.test() hangs at:

test_pinv (test_defmatrix.TestProperties) ...

So perhaps something is wrong with ATLAS, even though the build went fine, and "make check" and "make ptcheck" reported no errors.

Gabriel
Gabriel Beckers wrote:
OK, perhaps I drank that beer too soon...
Now, numpy.test() hangs at:
test_pinv (test_defmatrix.TestProperties) ...
So perhaps something is wrong with ATLAS, even though the building went fine, and "make check" and "make ptcheck" reported no errors.
Maybe you did not use the same fortran compiler for atlas and numpy, or maybe it is something else. make check/make ptcheck do not test anything useful for avoiding problems with numpy, in my experience.

That's why compiling atlas by yourself is hard, and I generally advise against it: there is nothing intrinsically hard about it, but you need to know a lot of small details and platform oddities to get it right every time. That's just a waste of time in most cases IMHO, unless all you do with numpy is inverting big matrices.

cheers,

David
On Sun, Jun 07, 2009 at 06:37:21PM +0900, David Cournapeau wrote:
That's why compiling atlas by yourself is hard, and I generally advise against it: there is nothing intrinsically hard about it, but you need to know a lot of small details and platform oddities to get it right every time. That's just a waste of time in most cases IMHO, unless all you do with numpy is inverting big matrices,
Well, I do bootstrapping of PCAs, that is, SVDs. I can tell you, it makes a big difference, especially since I have 8 cores.

Gaël
Gael Varoquaux wrote:
Well, I do bootstrapping of PCAs, that is SVDs. I can tell you, it makes a big difference, especially since I have 8 cores.
hence *most* :) I doubt most numpy users need to do PCA on high-dimensional data.

cheers,

David
On Sun, 2009-06-07 at 19:00 +0900, David Cournapeau wrote:
hence *most* :) I doubt most numpy users need to do PCA on high-dimensional data.
OK, a quick look at the MDP website tells me that I am one of the exceptions (as Gaël's email already suggested).

Gabriel
On 7-Jun-09, at 6:12 AM, Gael Varoquaux wrote:
Well, I do bootstrapping of PCAs, that is SVDs. I can tell you, it makes a big difference, especially since I have 8 cores.
Just curious, Gael: how many PCs are you retaining? Have you tried iterative methods (i.e. the EM algorithm for PCA)?

David
On Mon, Jun 08, 2009 at 12:29:08AM -0400, David Warde-Farley wrote:
Just curious Gael: how many PC's are you retaining? Have you tried iterative methods (i.e. the EM algorithm for PCA)?
I am using the heuristic exposed in http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=4562996

We have very noisy and long time series. My experience is that most model-based heuristics for choosing the number of PCs retained give us way too many on this problem (they simply keep diverging if I add noise at the end of the time series). The algorithm we use gives us ~50 interesting PCs (each composed of 50 000 dimensions). That happens to be quite right based on our experience with the signal.

However, being fairly new to statistics, I am not aware of the EM algorithm that you mention. I'd be interested in a reference, to see if I can use that algorithm. The PCA bootstrap is time-consuming.

Thanks,
Gaël
Gael Varoquaux wrote:
I am using the heuristic exposed in http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=4562996
However, being fairly new to statistics, I am not aware of the EM algorithm that you mention. I'd be interested in a reference, to see if I can use that algorithm.
I would not be surprised if David had this paper in mind :)

http://www.cs.toronto.edu/~roweis/papers/empca.pdf

cheers,

David
On Mon, Jun 08, 2009 at 02:17:45PM +0900, David Cournapeau wrote:
However, being fairly new to statistics, I am not aware of the EM algorithm that you mention. I'd be interested in a reference, to see if I can use that algorithm.
I would not be surprised if David had this paper in mind :)
Excellent. Thanks to the Davids. I'll read that through. Gaël
On 8-Jun-09, at 1:17 AM, David Cournapeau wrote:
I would not be surprised if David had this paper in mind :)
Right you are :)

There is a slight trick to it, though, in that it won't produce an orthogonal basis on its own, just something that spans the principal subspace. So you typically have to at least extract the first PC independently to uniquely orient your basis. You can then either subtract off the projection of the data on the 1st PC and find the next one, one at a time, or extract a spanning set all at once and orthogonalize with respect to the first PC.

David
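For the curious, the core iteration from the Roweis paper is only a few lines. A minimal sketch (function name and defaults made up here), with the caveat David describes - the raw result only spans the principal subspace, hence the orthogonalization at the end:

    import numpy as np

    def em_pca(X, k, n_iter=50):
        # X: (d, n) array of centered data; k: number of components
        d, n = X.shape
        C = np.random.randn(d, k)          # random initial basis
        for it in range(n_iter):
            # E-step: least-squares projection of the data onto the basis
            Y = np.linalg.solve(np.dot(C.T, C), np.dot(C.T, X))
            # M-step: re-fit the basis given those projections
            C = np.dot(np.dot(X, Y.T), np.linalg.inv(np.dot(Y, Y.T)))
        # orthonormalize the spanning set; this still does not orient
        # the individual components, as described above
        Q, R = np.linalg.qr(C)
        return Q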
2009/6/8 David Warde-Farley
Also, Ch. Bishop has an article on using EM for PCA, "Probabilistic Principal Component Analysis", where I think he proves the equivalence as well.

Matthieu
Note that EM can be very slow to converge: http://www.cs.toronto.edu/~roweis/papers/emecgicml03.pdf

EM is great for churning out papers, not so great for getting real work done. Conjugate gradient is a much better tool, at least in my (and Salakhutdinov's) experience.

Btw, have you considered how much the Gaussianity assumption is hurting you?

Jason
--
Jason Rennie
Research Scientist, ITA Software
617-714-2645
http://www.itasoftware.com/
Jason Rennie wrote:
Note that EM can be very slow to converge:
http://www.cs.toronto.edu/~roweis/papers/emecgicml03.pdf
EM is great for churning out papers, not so great for getting real work done.
I think it depends on what you are doing - EM is used for 'real' work too, after all :)
Conjugate gradient is a much better tool, at least in my (and Salakhutdinov's) experience.
Thanks for the link, I was not aware of this work. What is the difference between the ECG method and the method proposed by Lange in [1]?

To avoid 'local trapping' of the parameter in EM methods, recursive EM [2] may also be a promising method, although it seems to me that it has not been used much, but I may well be wrong (I have seen several people using a simplified version of it without much theoretical consideration in speech processing).

cheers,

David

[1] "A gradient algorithm locally equivalent to the EM algorithm", Journal of the Royal Statistical Society, Series B (Methodological), 1995, vol. 57, no. 2, pp. 425-437
[2] "Online EM Algorithm for Latent Data Models", Olivier Cappé and Eric Moulines, Journal of the Royal Statistical Society, Series B (February 2009)
On Mon, Jun 8, 2009 at 8:55 AM, David Cournapeau wrote:
I think it depends on what you are doing - EM is used for 'real' work too, after all :)

Certainly, but EM is really just a mediocre gradient descent/hill climbing algorithm that is relatively easy to implement.

Thanks for the link, I was not aware of this work. What is the difference between the ECG method and the method proposed by Lange in [1]? To avoid 'local trapping' of the parameter in EM methods, recursive EM [2] may also be a promising method...

I hung out in the machine learning community appx. 1999-2007 and thought the Salakhutdinov work was extremely refreshing to see after listening to no end of papers applying EM to whatever was the hot topic at the time. :) I've certainly seen/heard about various fixes to EM, but I haven't seen convincing reason(s) to prefer it over proper gradient descent/hill climbing algorithms (besides its presentability and ease of implementation).

Cheers,

Jason
Jason Rennie wrote:
I hung out in the machine learning community appx. 1999-2007 and thought the Salakhutdinov work was extremely refreshing to see after listening to no end of papers applying EM to whatever was the hot topic at the time. :)
Isn't it true for any general framework that enjoys some popularity :)
I've certainly seen/heard about various fixes to EM, but I haven't seen convincing reason(s) to prefer it over proper gradient descent/hill climbing algorithms (besides its presentability and ease of implementation).
I think there are cases where gradient methods are not applicable (latent models where the complete data Y cannot be split into observed-hidden (O, H) variables), although I am not sure that's a very common case in machine learning.

cheers,

David
On Mon, Jun 8, 2009 at 11:02 AM, David Cournapeau wrote:
Isn't it true for any general framework that enjoys some popularity :)

Yup :)

I think there are cases where gradient methods are not applicable (latent models where the complete data Y cannot be split into observed-hidden (O, H) variables), although I am not sure that's a very common case in machine learning.

I won't argue with that. My bias has certainly been strongly influenced by the type of problems I've been exposed to. It'd be interesting to hear of a problem where one can't separate observed/hidden variables :)

Cheers,

Jason
On Mon, Jun 08, 2009 at 08:33:11AM -0400, Jason Rennie wrote:
EM is great for churning out papers, not so great for getting real work done.
That's just what I thought.
Btw, have you considered how much the Gaussianity assumption is hurting you?
I have. And the answer is: not much. But then, my order-selection method is just about selecting the non-gaussian components. And the non-orthogonality of the interesting 'independent' signals is small, in that subspace.

Gaël
On Mon, Jun 8, 2009 at 7:14 PM, David Warde-Farley
On 8-Jun-09, at 8:33 AM, Jason Rennie wrote:
Note that EM can be very slow to converge:
That's absolutely true, but EM for PCA can be a life saver in cases where diagonalizing (or even computing) the full covariance matrix is not a realistic option. Diagonalization can be a lot of wasted effort if all you care about are a few leading eigenvectors. EM also lets you deal with missing values in a principled way, which I don't think you can do with standard SVD.
EM certainly isn't a magic bullet but there are circumstances where it's appropriate. I'm a big fan of the ECG paper too. :)
Hi,

I've been following this with interest... although I'm not really familiar with the area. At the risk of drifting further off topic, I wondered if anyone could recommend an accessible review of these kinds of dimensionality reduction techniques... I am familiar with PCA and know of diffusion maps and ICA and others, but I'd never heard of EM and I don't really have any idea how they relate to each other and which might be better for one job or the other... so some sort of primer would be really handy.

Cheers

Robin
Robin wrote:
At the risk of drifting further off topic, I wondered if anyone could recommend an accessible review of these kinds of dimensionality reduction techniques...
I think the biggest problem is the 'babel tower' aspect of machine learning (the expression is from David H. Wolpert, I believe): practitioners in different subfields often use totally different words for more or less the same concepts (and many keep being rediscovered). For example, what ML people call PCA is called Karhunen-Loève in signal processing, and the concepts are quite similar.

Anyway, the book by Bishop is a pretty good reference by one of the leading researchers: http://research.microsoft.com/en-us/um/people/cmbishop/prml/

It can be read without much background besides basic 1st year calculus/linear algebra.

cheers,

David
David Cournapeau wrote:
Anyway, the book by Bishop is a pretty good reference by one of the leading researchers:
http://research.microsoft.com/en-us/um/people/cmbishop/prml/
I should have mentioned that it is the same Bishop as mentioned by Matthieu, and that chapter 12 deals with latent models with continuous latent variables, which is one way to consider PCA in a probabilistic framework.

David
2009/6/9 David Cournapeau
Anyway, the book by Bishop is a pretty good reference by one of the leading researchers:
http://research.microsoft.com/en-us/um/people/cmbishop/prml/
It can be read without much background besides basic 1st year calculus/linear algebra.
Bishop's book can be confusing at times, so I would also recommend going back to the original papers. It is sometimes easier to learn *with* researchers than from them!

Cheers
Stéfan
On 9-Jun-09, at 3:54 AM, David Cournapeau wrote:
For example, what ML people call PCA is called Karhunen-Loève in signal processing, and the concepts are quite similar.
Yup. This seems to be a nice set of review notes: http://www.ece.rutgers.edu/~orfanidi/ece525/svd.pdf

And going further than just PCA/KLT, tying it together with maximum likelihood factor analysis / linear dynamical systems / hidden Markov models: http://www.cs.toronto.edu/~roweis/papers/NC110201.pdf

David
David Warde-Farley wrote:
Yup. This seems to be a nice set of review notes: http://www.ece.rutgers.edu/~orfanidi/ece525/svd.pdf
This looks indeed like a very nice review from a signal processing approach. I never took the time to understand the similarities/differences/connections between traditional SP approaches and the machine learning approach. I wonder if the subspace methods à la PENCIL/MUSIC and co have a (useful) interpretation in a more ML approach; I never really thought about it. I guess other people have :)
And going further than just PCA/KLT, tying it together with maximum likelihood factor analysis/linear dynamical systems/hidden Markov models,
As much as I like this paper, I always felt that you miss a lot of insights when considering PCA only from a purely statistical POV. I really like the consideration of PCA within a function-approximation POV (chapter 9 of the Mallat book on wavelets is crystal clear, for example, and it is based on all those cool functional space theories like Besov spaces).

cheers,

David
2009/6/9 Robin
I wondered if anyone could recommend an accessible review of these kinds of dimensionality reduction techniques...
Hi,

Check Ch. Bishop's publication on Probabilistic Principal Component Analysis; you have there the parallel between the two (EM is in fact just a way of computing PPCA, and with some Gaussian assumptions, you get PCA).

Matthieu
2009/6/8 Gael Varoquaux
The algorithm we use gives us ~50 interesting PCs (each composed of 50 000 dimensions).
Hi,

Given the number of PCs, I think you may just be measuring noise. As said in several manifold reduction publications (such as the ones by Torbjorn Vik, who published on robust PCA for medical imaging), you cannot expect to have more than 4 or 5 meaningful PCs, due to the dimensionality curse. If you want 50 PCs, you have to have at least... 10^50 samples, which is quite a lot, let's say it this way.

According to the literature, a usual manifold can be described by 4 or 5 variables. If you have more, it may be that you are violating your hypothesis, here the linearity of your data (and as it is medical imaging, you know from the beginning that this hypothesis is wrong). So if you really want to find something meaningful and/or physical, you should use a real dimensionality reduction, preferably a non-linear one.

Just my 2 cents ;)

Matthieu
On Mon, Jun 08, 2009 at 08:58:29AM +0200, Matthieu Brucher wrote:
Given the number of PCs, I think you may just be measuring noise. [...] you cannot expect to have more than 4 or 5 meaningful PCs, due to the dimensionality curse.
I am not sure I am following you: I have time-varying signals. I am not taking a shot of the same process over and over again. My intuition tells me that I have more than 5 meaningful patterns.

Anyhow, I do some more analysis on top of that (ICA, actually), and I do find more than 5 patterns of interest that are not noise.

So maybe I should be using some non-linear dimensionality reduction, but what I am doing works, and I can write a generative model of it. Most importantly, it is actually quite computationally simple. However, if you can point me to methods that you believe are better (and tell me why you believe so), I am all ears.

Gaël
2009/6/8 Gael Varoquaux
I am not sure I am following you: I have time-varying signals. I am not taking a shot of the same process over and over again. My intuition tells me that I have more than 5 meaningful patterns.
How many samples do you have? 10000? a million? a billion? The problem with 50 PCs is that your search space is mostly empty, "thanks" to the curse of dimensionality. This means that you *should* not try to get a meaning for the 10th and following PCs.
Anyhow, I do some more analysis on top of that (ICA, actually), and I do find more than 5 patterns of interest that are not noise.
ICA suffers from the same problems as PCA. And I'm not even talking about the linearity hypothesis that is never respected.
So maybe I should be using some non-linear dimensionality reduction, but what I am doing works, and I can write a generative model of it. Most importantly, it is actually quite computationally simple.
That's thanks to linearity ;) The problem is that you will have a lot of confounds this way (your 50 PCs can in fact be the effect of 5 variables that are nonlinear).
However, if you can point me to methods that you believe are better (and tell me why you believe so), I am all ears.
My thesis was on nonlinear dimensionality reduction (this is why I believe so, especially in the medical imaging field), but it always needs some adaptation. It depends on what you want to do, the time you can use to process data, ... Suffice to say we started with PCA some years ago and switched to nonlinear reduction because of the emptiness of the search space and because of the nonlinearity of the brain space (no idea what my former lab is doing now, but it is used for DTI at least).

You should check some books on it, and you surely have to read something about the curse of dimensionality (at least if you want to get published, as people know about this issue in the medical field), even if you do not use nonlinear techniques.

Matthieu
On Mon, Jun 8, 2009 at 3:29 AM, Gael Varoquaux
Just curious: what's the actual shape of the array/data you run your PCA on? Number of time periods, size of cross-section at a point in time?

Josef
On Mon, Jun 08, 2009 at 09:02:12AM -0400, josef.pktd@gmail.com wrote:
what's the actual shape of the array/data you run your PCA on?
50 000 dimensions, 820 datapoints.
Number of time periods, size of cross section at point in time?
I am not sure what the question means. The data is sampled at 0.5Hz. Gaël
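At that shape - many more dimensions than samples - the decomposition is tractable with an economy-size SVD, since only min(n, d) = 820 components exist. A rough sketch, with random data standing in for the real signals:

    import numpy as np

    n, d = 820, 50000
    X = np.random.randn(n, d)     # stand-in for the (samples, dimensions) data
    X -= X.mean(axis=0)           # center each dimension

    # full_matrices=False keeps only the 820 informative components
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    pcs = Vt[:50]                 # first 50 principal directions
    explained = s[:50] ** 2 / (n - 1)   # their variances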
2009/6/8 Gael Varoquaux
50 000 dimensions, 820 datapoints.
You definitely can't expect to find 50 meaningful PCs. It's impossible to robustly get them with less than a thousand points!
On Mon, Jun 8, 2009 at 6:17 AM, Gael Varoquaux
50 000 dimensions, 820 datapoints.
Have you tried shuffling each time series, performing PCA, looking at the magnitude of the largest eigenvalue, then repeating many times? That will give you an idea of how large the noise can be. Then you can see how many eigenvectors of the unshuffled data have eigenvalues greater than the noise. It would be kind of an empirical approach to random matrix theory.
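A sketch of that shuffling test (a hypothetical helper, written for clarity rather than speed - at 50 000 dimensions you would combine it with the economy-size SVD above):

    import numpy as np

    def max_noise_eigenvalue(X, n_shuffles=20):
        # X: (n_samples, n_dims). Permute each column independently to
        # destroy correlations while keeping the marginal distributions.
        Xs = X.copy()
        biggest = []
        for rep in range(n_shuffles):
            for j in range(Xs.shape[1]):
                Xs[:, j] = np.random.permutation(Xs[:, j])
            Z = Xs - Xs.mean(axis=0)
            s = np.linalg.svd(Z, compute_uv=False)        # singular values
            biggest.append(s[0] ** 2 / (Z.shape[0] - 1))  # top eigenvalue
        return max(biggest)

    # keep only the components of the real data whose eigenvalues
    # exceed this empirical noise level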
On Mon, Jun 08, 2009 at 06:28:06AM -0700, Keith Goodman wrote:
On Mon, Jun 8, 2009 at 6:17 AM, Gael Varoquaux wrote:
On Mon, Jun 08, 2009 at 09:02:12AM -0400, josef.pktd@gmail.com wrote:
What's the actual shape of the array/data you run your PCA on?
50 000 dimensions, 820 datapoints.
Have you tried shuffling each time series, performing PCA, looking at the magnitude of the largest eigenvalue, then repeating many times? That will give you an idea of how large the noise can be. Then you can see how many eigenvectors of the unshuffled data have eigenvalues greater than the noise. It would be kind of the empirical approach to random matrix theory.
Yes, that's the kind of thing that is done in the paper I pointed out, and it is what I use to infer the number of PCs I retain. Gaël
On Sun, 2009-06-07 at 18:37 +0900, David Cournapeau wrote:
Maybe you did not use the same fortran compiler for atlas and numpy, or maybe something else went wrong. In my experience, make check/make ptcheck do not test anything useful for avoiding problems with numpy.
That's why compiling atlas by yourself is hard, and I generally advise against it: there is nothing intrinsically hard about it, but you need to know a lot of small details and platform oddities to get it right every time. That's just a waste of time in most cases IMHO, unless all you do with numpy is inverting big matrices.
cheers,
David
Hi David, I did: sudo apt-get remove g77 sudo apt-get install gfortran before starting the whole thing, so I assume that should take care of it. I am not sure how much I actually depend on Atlas for what I do, so your advice is well taken. One thing I can think of is PCA and ICA (of *big* matrices of float32 data), using the MDP toolbox mostly. I should find out to what extent Atlas is crucial specifically for that. All the best, Gabriel
On Sun, 2009-06-07 at 18:37 +0900, David Cournapeau wrote:
That's why compiling atlas by yourself is hard, and I generally advise against it: there is nothing intrinsically hard about it, but you need to know a lot of small details and platform oddities to get it right every time. That's just a waste of time in most cases IMHO, unless all you do with numpy is inverting big matrices.
I have been trying the intel mkl and the icc compiler instead, with no luck. I run into the same problem during setup as reported here: http://www.mail-archive.com/numpy-discussion@scipy.org/msg16595.html Sigh. I guess I should not get into these matters anyway; I am just a simple and humble user... As far as I understand, the Ubuntu atlas problems have been found for complex types, which I don't use except for fft. I guess I'll continue to use the ubuntu libraries then and hope for better days in the future. Best, Gabriel
Gabriel Beckers wrote:
On Sun, 2009-06-07 at 18:37 +0900, David Cournapeau wrote:
That's why compiling atlas by yourself is hard, and I generally advise against it: there is nothing intrinsically hard about it, but you need to know a lot of small details and platform oddities to get it right every time. That's just a waste of time in most cases IMHO, unless all you do with numpy is inverting big matrices.
I have been trying intel mkl and icc compiler instead, with no luck. I run into the same problem during setup as reported here:
http://www.mail-archive.com/numpy-discussion@scipy.org/msg16595.html
See #1131 on the numpy tracker - it has nothing to do with icc/mkl per se. cheers, David
When I had problems building atlas in the past (i.e. numpy.test() failed), it was a problem with my lapack build, not atlas. The netlib website gives instructions for building the lapack test suite. I suggest you do that and run the tests on lapack to make sure everything is kosher.
Chris
On Sun, Jun 7, 2009 at 5:52 AM, Gabriel Beckers
OK, perhaps I drank that beer too soon...
Now, numpy.test() hangs at:
test_pinv (test_defmatrix.TestProperties) ...
So perhaps something is wrong with ATLAS, even though the building went fine, and "make check" and "make ptcheck" reported no errors.
Gabriel
On Sun, 2009-06-07 at 10:20 +0200, Gabriel Beckers wrote:
On Sat, 2009-06-06 at 12:59 -0400, Chris Colbert wrote:
../configure -b 64 -D c -DPentiumCPS=2400 -Fa -alg -fPIC --with-netlib-lapack=/home/your-user-name/build/lapack/lapack-3.2.1/Lapack_LINUX.a
Many thanks Chris, I succeeded in building it.
The configure command above contained two problems that I had to correct to get it to work though.
In case other people are trying this, I used:
../configure -b 32 -D c -DPentiumCPS=1800 -Fa alg -fPIC --with-netlib-lapack=/home/your-user-name/build/lapack/lapack-3.2.1/lapack_LINUX.a
That is (in addition to the different -b switch for my 32-bit machine and the different processor speed): the dash before "alg" should be removed, and "Lapack_LINUX.a" should be "lapack_LINUX.a".
Gabriel
On Sun, Jun 7, 2009 at 2:52 AM, Gabriel Beckers
OK, perhaps I drank that beer too soon...
Now, numpy.test() hangs at:
test_pinv (test_defmatrix.TestProperties) ...
So perhaps something is wrong with ATLAS, even though the building went fine, and "make check" and "make ptcheck" reported no errors.
I ran into the same problem on 32-bit debian squeeze.
thanks for catching the typos!
Chris
(quoted message snipped)
Following these instructions I have the following problem when I import numpy. Does anyone know why this might be?

Thanks,
Jonathan.

>>> import numpy
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/jtaylor/lib/python2.5/site-packages/numpy/__init__.py", line 130, in <module>
    import add_newdocs
  File "/home/jtaylor/lib/python2.5/site-packages/numpy/add_newdocs.py", line 9, in <module>
    from lib import add_newdoc
  File "/home/jtaylor/lib/python2.5/site-packages/numpy/lib/__init__.py", line 13, in <module>
    from polynomial import *
  File "/home/jtaylor/lib/python2.5/site-packages/numpy/lib/polynomial.py", line 18, in <module>
    from numpy.linalg import eigvals, lstsq
  File "/home/jtaylor/lib/python2.5/site-packages/numpy/linalg/__init__.py", line 47, in <module>
    from linalg import *
  File "/home/jtaylor/lib/python2.5/site-packages/numpy/linalg/linalg.py", line 22, in <module>
    from numpy.linalg import lapack_lite
ImportError: /usr/local/lib/libptcblas.so: undefined symbol: ATL_cpttrsm
On Sat, Jun 6, 2009 at 12:59 PM, Chris Colbert
Since there is demand, and someone already emailed me, I'll put what I did in this post. It pretty much follows what's on the scipy website, with a couple of other things I gleaned from reading the ATLAS install guide:
Here it goes. This is valid for Ubuntu 9.04 64-bit (# starts a comment when working in the terminal):
download lapack 3.2.1: http://www.netlib.org/lapack/lapack.tgz
download atlas 3.8.3: http://sourceforge.net/project/downloading.php?group_id=23725&filename=atlas3.8.3.tar.bz2&a=65663372

create folder /home/your-user-name/build/atlas   # this is where we build
create folder /home/your-user-name/build/lapack  # atlas and lapack

extract the folder lapack-3.2.1 to /home/your-user-name/build/lapack
extract the contents of atlas to /home/your-user-name/build/atlas

now in the terminal:

# remove g77 and get the stuff we need
sudo apt-get remove g77
sudo apt-get install gfortran
sudo apt-get install build-essential
sudo apt-get install python-dev
sudo apt-get install python-setuptools
sudo easy_install nose

# build lapack
cd /home/your-user-name/build/lapack/lapack-3.2.1
cp INSTALL/make.inc.gfortran make.inc

gedit make.inc
#################
# in the make.inc file make sure the lines read
# OPTS = -O2 -fPIC -m64
# and
# NOOPTS = -O0 -fPIC -m64
# the -m64 flags build 64-bit code; if you want 32-bit, simply leave
# the -m64 flags out
#################

cd SRC

# this should build lapack without error
make

# build atlas
cd /home/your-user-name/build/atlas

# this is simply where we will build the atlas
# libs; you can name it what you want
mkdir Linux_X64SSE2
cd Linux_X64SSE2

# need to turn off cpu throttling
sudo cpufreq-selector -g performance

# if you don't want 64-bit code remove the -b 64 flag; replace the
# number 2400 with your CPU frequency in MHz
# (i.e. my cpu is 2.53 GHz so i put 2530)
../configure -b 64 -D c -DPentiumCPS=2400 -Fa -alg -fPIC --with-netlib-lapack=/home/your-user-name/build/lapack/lapack-3.2.1/Lapack_LINUX.a

# the configure step takes a bit, and should end without errors

# this takes a long time, go get some coffee; it should end without error
make build

# this will verify the build, also long running
make check

# this will test the performance of your build and give you feedback on
# it; your numbers should be close to the test numbers at the end
make time

cd lib

# builds single threaded .so's
make shared

# builds multithreaded .so's
make ptshared

# copies all of the atlas libs (and the lapack lib built with atlas)
# to our lib dir
sudo cp *.so /usr/local/lib/

# now we need to get and build numpy

download numpy 1.3.0: http://sourceforge.net/project/downloading.php?group_id=1369&filename=numpy-1.3.0.tar.gz&a=93506515

extract the folder numpy-1.3.0 to /home/your-user-name/build

# in the terminal
cd /home/your-user-name/build/numpy-1.3.0
cp site.cfg.example site.cfg

gedit site.cfg
###############################################
# in site.cfg uncomment the following lines and make them look like these
[DEFAULT]
library_dirs = /usr/local/lib
include_dirs = /usr/local/include

[blas_opt]
libraries = ptf77blas, ptcblas, atlas

[lapack_opt]
libraries = lapack, ptf77blas, ptcblas, atlas
###################################################
# if you want single threaded libs, uncomment those lines instead

# build numpy - should end without error
python setup.py build

# install numpy
python setup.py install

cd /home

sudo ldconfig

python
>>> import numpy
>>> numpy.test()   # this should run with no errors (skipped tests and known-fails are ok)
>>> a = numpy.random.randn(6000, 6000)
>>> numpy.dot(a, a)   # look at your cpu monitor and verify all cpu cores are at 100% if you built with threads
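If you want a number rather than a cpu monitor, a rough timing check of the new build (the matrix size is illustrative):

    import numpy, time

    a = numpy.random.randn(2000, 2000)
    t0 = time.time()
    numpy.dot(a, a)
    print "2000x2000 dot took %.2f seconds" % (time.time() - t0)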
Celebrate with a beer!
Cheers!
Chris
On Sat, Jun 6, 2009 at 10:42 AM, Keith Goodman wrote:
On Fri, Jun 5, 2009 at 2:37 PM, Chris Colbert wrote:
I'll caution anyone against using Atlas from the repos in Ubuntu 9.04, as the package is broken:
https://bugs.launchpad.net/ubuntu/+source/atlas/+bug/363510
Just build Atlas yourself; you get better performance AND threading. Building it is not the nightmare it sounds like. I think I've done it a total of four times now, both 32-bit and 64-bit builds.
If you need help with it, just email me off list.
That's a nice offer. I tried building ATLAS on Debian a year or two ago and got stuck.
Clear out your inbox!
On 17-Jul-09, at 3:57 PM, Jonathan Taylor wrote:
File "/home/jtaylor/lib/python2.5/site-packages/numpy/linalg/ __init__.py", line 47, in <module> from linalg import * File "/home/jtaylor/lib/python2.5/site-packages/numpy/linalg/ linalg.py", line 22, in <module> from numpy.linalg import lapack_lite ImportError: /usr/local/lib/libptcblas.so: undefined symbol: ATL_cpttrsm
It doesn't look like your ATLAS is linked together properly, specifically fblas. What fortran compiler are you using? What does ldd /usr/local/lib/libptcblas.so say? I seem to recall this sort of thing happening when g77 and gfortran get mixed up together... David
On 17-Jul-09, at 4:20 PM, David Warde-Farley wrote:
It doesn't look like your ATLAS is linked together properly, specifically fblas. What fortran compiler are you using?
ImportError: /usr/local/lib/libptcblas.so: undefined symbol: ATL_cpttrsm
Errr, never mind. I seem to have very selective vision and saw that as 'ptf77blas.so'. Suffice it to say it's an ATLAS build problem, and you seem to be doing everything right given the commands. You remembered to run ldconfig? David
Jonathan,
What does "ldd /home/jtaylor/lib/python2.5/site-packages/numpy/linalg/lapack_lite.so"
say ?
You need to make sure that it's using the libraries in /usr/local/lib.
You can remove the ones in /usr/lib or "export
LD_LIBRARY_PATH=/usr/local/lib/:$LD_LIBRARY_PATH".
Hope it helps.
Best,
N
On Fri, Jul 17, 2009 at 3:57 PM, Jonathan Taylor wrote: (quoted message and build instructions snipped)
-- Nicolas Pinto Ph.D. Candidate, Brain & Computer Sciences Massachusetts Institute of Technology, USA http://web.mit.edu/pinto
Sorry. I meant to update this thread after I had resolved my issue.
This was indeed one problem. I had to set LD_LIBRARY_PATH.
I also had another odd problem that I will spell out here in the hope that I save someone some trouble. Specifically, be very sure that the path to the compiled blas is correct when you configure ATLAS, because configure does not indicate any problems if it is not. I tried compiling blas with make -j3 to get all my cores compiling at the same time, but this actually caused a failure that I did not notice. It did create a temp_LINUX.a file in the right place, so I configured ATLAS against that. Alas, many of the symbols needed were not contained in this file, as blas had failed to compile. This was fairly hard to debug, but once I got blas recompiled properly without the -j3 switch I was able to follow the rest of the steps and everything works well.
Thanks,
Jonathan.
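For anyone following the same path, a cheap sanity check in the spirit of Jonathan's lesson is to confirm that the archive actually contains the routines ATLAS will need before configuring against it. A sketch (the path is illustrative, and nm from binutils is assumed to be installed):

    import subprocess

    archive = "/home/your-user-name/build/lapack/lapack-3.2.1/lapack_LINUX.a"
    symbols = subprocess.Popen(["nm", archive],
                               stdout=subprocess.PIPE).communicate()[0]
    # dgetrf_ (LU factorization) is one routine that must be present
    print "dgetrf_ found" if "dgetrf_" in symbols else "dgetrf_ MISSING - rebuild lapack"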
On Sun, Jul 19, 2009 at 11:35 PM, Nicolas Pinto wrote: (quoted reply and build instructions snipped)
2009/6/4 David Paul Reichert
Hi all,
I would be glad if someone could help me with the following issue:
From what I've read on the web it appears to me that numpy should be about as fast as matlab. However, when I do simple matrix multiplication, it consistently appears to be about 5 times slower. I tested this using
A = 0.9 * numpy.matlib.ones((500,100)) B = 0.8 * numpy.matlib.ones((500,100))
def test(): for i in range(1000): A*B.T
I also used ten times larger matrices with ten times less iterations, used xrange instead of range, arrays instead of matrices, and tested it on two different machines, and the result always seems to be the same.
Any idea what could go wrong? I'm using ipython and matlab R2008b.
Apart from the implementation issues people have chimed in about already, it's worth noting that the speed of matrix multiplication depends on the memory layout of the matrices. So generating B instead directly as a 100 by 500 matrix might affect the speed substantially (I'm not sure in which direction). If MATLAB's matrices have a different memory order, that might be a factor as well. Anne
Thanks,
David
-- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
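A quick way to probe the layout effect Anne describes (a sketch; the sizes match the example above, and with a good BLAS the difference may be small, since optimized routines handle transposed operands natively):

    import numpy, time

    a = numpy.random.randn(500, 100)     # C (row-major) order, numpy's default
    b = numpy.asfortranarray(a)          # same values, Fortran (column-major) order

    for label, m in [("C order", a), ("Fortran order", b)]:
        t0 = time.time()
        for i in xrange(1000):
            numpy.dot(m, m.T)
        print label, time.time() - t0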
On 4-Jun-09, at 5:03 PM, Anne Archibald wrote:
Apart from the implementation issues people have chimed in about already, it's worth noting that the speed of matrix multiplication depends on the memory layout of the matrices. So generating B instead directly as a 100 by 500 matrix might affect the speed substantially (I'm not sure in which direction). If MATLAB's matrices have a different memory order, that might be a factor as well.
AFAIK Matlab matrices are always Fortran ordered. Does anyone know if the defaults on Mac OS X (vecLib/Accelerate) support multicore? Is there any sense in compiling ATLAS on OS X (I know it can be done)? David
David Warde-Farley wrote:
On 4-Jun-09, at 5:03 PM, Anne Archibald wrote:
Apart from the implementation issues people have chimed in about already, it's worth noting that the speed of matrix multiplication depends on the memory layout of the matrices. So generating B instead directly as a 100 by 500 matrix might affect the speed substantially (I'm not sure in which direction). If MATLAB's matrices have a different memory order, that might be a factor as well.
AFAIK Matlab matrices are always Fortran ordered.
Does anyone know if the defaults on Mac OS X (vecLib/Accelerate) support multicore? Is there any sense in compiling ATLAS on OS X (I know it can be done)?
It may be worthwhile if you use a recent gcc and a recent ATLAS. Multithread support is supposed to be much better in 3.9.* compared to 3.6.* (which is likely the version used in vecLib/Accelerate). The main issue I could foresee is clashes between vecLib/Accelerate and Atlas if you mix software that uses one with software that uses the other. For the OP's question: recent matlab versions use the MKL, which is likely to give higher performance than ATLAS, especially on Windows (compilers on that platform are ancient, and building atlas with native compilers on windows requires superhuman patience). David
Thanks for the replies so far.
I had already tested using an already transposed matrix in the loop; it didn't make any difference. Oh and btw, I'm on (Scientific) Linux.
I used the Enthought distribution, but I guess I'll have to get
my hands dirty and try to get that Atlas thing working (I'm not
a Linux expert though). My simulations pretty much consist of
matrix multiplications, so if I don't get rid of that factor 5,
I pretty much have to get back to Matlab.
When you said Atlas is going to be optimized for my system, does
that mean I should compile everything on each machine separately?
I.e. I have a not-so-great desktop machine and one of those bigger
multicore things available...
Cheers
David
Quoting David Cournapeau
(quoted message snipped)
-- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
Hi David,

Let me suggest that you try the latest version of Ubuntu (9.04/Jaunty), which was released two months ago. It sounds like you are effectively using release 5 of RedHat Linux, which was originally released May 2007. There have been updates (5.1, 5.2, 5.3), but, if my memory serves me correctly, RedHat updates are more focused on fixing bugs and security issues than on improving functionality. Ubuntu does a full, new release every 6 months, so you don't have to wait as long to see improvements. Ubuntu also has a tremendously better package management system. You generally shouldn't be installing packages by hand as it sounds like you are doing.

This post suggests that the latest version of Ubuntu is up-to-date wrt ATLAS:

http://www.mail-archive.com/numpy-discussion@scipy.org/msg13102.html

Jason

On Fri, Jun 5, 2009 at 5:44 AM, David Paul Reichert <D.P.Reichert@sms.ed.ac.uk> wrote:
(quoted message snipped)
-- Jason Rennie Research Scientist, ITA Software 617-714-2645 http://www.itasoftware.com/
I've noticed that with scipy 0.7.0 + numpy 1.2.1, importing the factorial function from the scipy module always seems to fail when scipy is installed as a zipped ".egg" file. When the project is installed as an unzipped directory it works fine. Is there any reason why this function should not be egg-safe? My test to verify this was pretty simple: I just installed my scipy egg (made by extracting the Windows, Python 2.4 Superpack) with the easy_install command. Whenever I install it with the "-Z" option (to uncompress) it works fine. With the "-z" option it always fails. Thanks! Sal
On Fri, Jun 5, 2009 at 9:25 AM, Fadhley Salim
I've noticed that with scipy 0.7.0 + numpy 1.2.1, importing the factorial function from the scipy module always seems to fail when scipy is installed as a zipped ".egg" file. When the project is installed as an unzipped directory it works fine.
Is there any reason why this function should not be egg-safe?
My test to verify this was pretty simple: I just installed my scipy egg (made by extracting the Windows, Python 2.4 Superpack) with the easy_install command. Whenever I install it with the "-Z" option (to uncompress) it works fine. With the "-z" option it always fails.
I don't think numpy/scipy are zip safe: the numpy packageloader uses os.path to find files. I would expect that you are not able to import anything; factorial might just be the first function that is loaded. easy_install usually does a check for zip-safety, and if it unpacks the egg it usually means it's not zip safe, for example because of the use of __file__. That's my guess, Josef
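A minimal illustration of the pattern josef describes (a hypothetical module, for illustration only, not scipy's actual code):

    import os

    # a package module that locates a resource next to its own source file
    _data = os.path.join(os.path.dirname(__file__), 'data', 'table.txt')

    def load_table():
        # fine for an unzipped install, where __file__ is a real path;
        # inside a zipped .egg, __file__ points into the archive,
        # so open() raises IOError
        return open(_data).read()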
I don't think numpy/scipy are zip safe, the numpy packageloader uses os.path to find files.
Evidently! :-) But the strange thing is that all this worked fine with Scipy 0.6.0 - it's only since 0.7.0 was released that this started going wrong. Sal
Hi,
Thanks for the suggestion.
Unfortunately I'm using university managed machines here, so
I have no control over the distribution, not even root access.
However, I just downloaded the latest Enthought distribution,
which uses numpy 1.3, and now numpy is only 30% to 60% slower
than matlab, instead of 5 times slower. I can live with that.
(whether it uses atlas now or not, I don't know).
Cheers
David
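For what it's worth, the atlas question can be answered from the interpreter. A sketch (the _dotblas check applies to numpy 1.2/1.3, where dot() is only BLAS-accelerated if that optional extension was built):

    import numpy

    numpy.show_config()   # lists the BLAS/LAPACK numpy was built against

    try:
        import numpy.core._dotblas
        print "dot() is using an optimized BLAS"
    except ImportError:
        print "dot() is using numpy's slow fallback"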
Quoting Jason Rennie
(quoted message snipped)
-- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
participants (19)

- Anne Archibald
- Chris Colbert
- David Cournapeau
- David Paul Reichert
- David Warde-Farley
- Eric Firing
- Fadhley Salim
- Gabriel Beckers
- Gael Varoquaux
- Jason Rennie
- Jonathan Taylor
- josef.pktd@gmail.com
- Keith Goodman
- Matthieu Brucher
- Minjae Kim
- Nicolas Pinto
- Robin
- Sebastian Walter
- Stéfan van der Walt