PCA principal component analysis
Alexander Schmolck
a.schmolck at gmx.net
Wed Apr 9 13:09:57 EDT 2003
s.thuriez at laposte.net (sebastien) writes:
> Hi,
>
> Are there any PCA analysis tools for Python?
What is the analysis tool supposed to do?
Maybe this will do what you want, once you have downloaded and installed Numeric:
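# Warning: hackish, not properly tested, ripped-out bit of code ahead,
# so no guarantees whatsoever.
# Anyway, it should at least sort of give you the idea.
# Try pca(X); if that doesn't do what you want, try pca(t(X)).
# Note: M is not mean-centered here; subtract the column means first
# if you want a proper covariance matrix.
from Numeric import take, dot, shape, argsort, where, sqrt, transpose as t
from LinearAlgebra import eigenvectors

def pca(M):
    "Perform PCA on M, return eigenvalues and eigenvectors, sorted."
    T, N = shape(M)
    # if there are fewer rows T than columns N, use
    # snapshot method
    if T < N:
        C = dot(M, t(M))
        evals, evecsC = eigenvectors(C)
        # HACK: make sure evals are not negative
        # (a zero eigenvalue will still break the division below)
        evals = where(evals < 0, 0, evals)
        # eigenvectors() returns the vectors as rows, hence t(evecsC);
        # evecs now holds one unit-length eigenvector per column
        evecs = 1./sqrt(evals) * dot(t(M), t(evecsC))
        # rescale so evals match those of the covariance matrix K
        evals = evals / T
    else:
        # calculate covariance matrix
        K = 1./T * dot(t(M), M)
        evals, evecs = eigenvectors(K)
        # rows -> columns, to match the layout of the other branch
        evecs = t(evecs)
    # sort the eigenvalues and eigenvectors, descending order
    order = argsort(evals)[::-1]
    evecs = take(evecs, order, 1)
    evals = take(evals, order)
    # one eigenvector per row
    return evals, t(evecs)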
You can download Numeric and use it to compute the eigenvalues and
eigenvectors of an array.
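For anyone reading this with a current Python: the same idea can be sketched
with NumPy, Numeric's successor. This is only a rough sketch under a few
assumptions, not a drop-in replacement for the snippet above: it assumes
NumPy is installed, it centers the columns itself, it drops directions with
(numerically) zero variance in the snapshot case, and the name pca_eig is
made up for illustration. numpy.linalg.eigh returns eigenvectors as columns,
in ascending order of eigenvalue.

```python
import numpy as np

def pca_eig(M):
    """PCA of M (T observations x N variables) via an eigendecomposition.

    Returns (evals, evecs): eigenvalues in descending order and the
    matching unit-length eigenvectors, one per row.
    """
    M = np.asarray(M, dtype=float)
    M = M - M.mean(axis=0)                    # center each column
    T, N = M.shape
    if T < N:
        # snapshot method: decompose the smaller T x T matrix instead
        evals, evecsC = np.linalg.eigh(M @ M.T / T)
        evals = np.clip(evals, 0.0, None)     # remove negative round-off
        keep = evals > 1e-12 * evals.max()    # drop zero-variance directions
        evals, evecsC = evals[keep], evecsC[:, keep]
        # map back to N-dimensional space and normalise to unit length
        evecs = M.T @ evecsC / np.sqrt(T * evals)
    else:
        # eigendecomposition of the N x N covariance matrix
        evals, evecs = np.linalg.eigh(M.T @ M / T)
    order = np.argsort(evals)[::-1]           # descending eigenvalues
    return evals[order], evecs[:, order].T
```

Calling pca_eig(X) on a T x N array gives the variances along the principal
axes and the axes themselves; projecting the centered data is then
dot(X - X.mean(axis=0), evecs.T).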
> If it does, do you have any idea on how well it would scale ?
It should scale fine. If you experience speed problems, configure Numeric 23
with ATLAS support (you have to install ATLAS and LAPACK first, of course).
For large matrices, this should be *much* faster than handwritten C code that
doesn't use ATLAS.
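On the scaling question, one common alternative (a different technique from
the eigenvector code above, so take it as a sketch) is to skip the covariance
matrix altogether and take a singular value decomposition of the centered
data. The example assumes NumPy rather than Numeric, and pca_svd is a
made-up name. The heavy lifting is done by LAPACK underneath, so an optimized
BLAS/LAPACK build helps here in the same way.

```python
import numpy as np

def pca_svd(M):
    """PCA of M (T observations x N variables) via SVD.

    Returns (evals, evecs): variances along the principal axes in
    descending order, and the axes themselves, one per row.
    """
    M = np.asarray(M, dtype=float)
    M = M - M.mean(axis=0)                 # center each column
    T = M.shape[0]
    # full_matrices=False keeps only min(T, N) components;
    # singular values come back sorted in descending order
    _, s, Vt = np.linalg.svd(M, full_matrices=False)
    return s**2 / T, Vt
```

The squared singular values divided by T are the eigenvalues of the
covariance matrix (1/T) * M^T M, which is why no separate sorting step or
snapshot branch is needed.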
>
> I have already seen PyClimate (but it is not available for Windows,
> which will be one of the targets). Are there any LAPACK-like packages?
Yes, Numeric and scipy. (www.numpy.org, www.scipy.org, I should think)
'as