PCA principal component analysis

Curzio Basso Curzio.Basso at unibas.ch
Mon Apr 14 11:41:04 CEST 2003


On Fri, 2003-04-11 at 16:39, Colin J. Williams wrote:

> From my recollection, PCA can be applied to either the covariance 
> matrix or the correlation matrix. 

PCA of data is done by finding the eigenvectors of the covariance matrix,
but these are best obtained from the SVD of the data matrix.

That is:

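X is the data matrix, sample vectors are the columns (with the mean
sample already subtracted from each column)
C = X*X' is the covariance matrix (up to a constant factor 1/(n-1),
where n is the number of samples)
C*P = P*L is the eigenproblem, the columns of P store the components,
L is a diagonal matrix storing the eigenvalues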
If X=U*S*V' through SVD, then

C=X*X'=(U*S*V')*(U*S*V')'=(U*S*V')*(V*S'*U')=U*(S*S')*U'

(because V'*V=I from the property of SVD)

=> C*U=U*(S*S')

(because U'*U=I as well)

which means P=U and L=(S*S').
So, the columns of U are the principal components, and the squares of the
diagonal elements of S (which is a diagonal matrix) are the eigenvalues,
i.e. the variance of the data along each component (up to a normalisation
by the number of samples).
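
A quick numerical check of the above, as a minimal sketch with NumPy. The
random data, the variable names and the 1/(n-1) normalisation are my own
assumptions for illustration, not part of the derivation itself. It compares
the eigendecomposition of C with the SVD of X:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 3 variables (rows), 200 samples (columns).
# Each row gets a different scale so the eigenvalues are well separated.
X = rng.normal(size=(3, 200)) * np.array([[3.0], [1.0], [0.3]])
X = X - X.mean(axis=1, keepdims=True)   # subtract the mean sample

n = X.shape[1]
C = X @ X.T / (n - 1)                   # sample covariance matrix

# Route 1: eigendecomposition of C.
# eigh returns eigenvalues in ascending order, so reverse them.
evals, evecs = np.linalg.eigh(C)
evals, evecs = evals[::-1], evecs[:, ::-1]

# Route 2: SVD of the data matrix (singular values are descending).
U, s, Vt = np.linalg.svd(X, full_matrices=False)
svd_evals = s**2 / (n - 1)

# Eigenvalues should agree; eigenvectors should agree up to sign.
eigenvalues_match = np.allclose(evals, svd_evals)
components_match = np.allclose(np.abs(evecs.T @ U), np.eye(3), atol=1e-8)
print(eigenvalues_match, components_match)
```

The sign of each component is arbitrary in both routes, which is why the
comparison uses absolute values. For data with many more dimensions than
samples, the SVD route also avoids forming C explicitly.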

I hope I did not mess things up...

-- 
Curzio Basso <Curzio.Basso at unibas.ch>





