[scikit-learn] unclear help file for sklearn.decomposition.pca

Ismael Lemhadri lemhadri at stanford.edu
Sun Oct 15 21:42:56 EDT 2017


Dear all,
The help file for the PCA class is unclear about the preprocessing
applied to the data.
You can check on line 410 here:
https://github.com/scikit-learn/scikit-learn/blob/ef5cb84a/sklearn/decomposition/pca.py#L410
that the matrix is centered but NOT scaled before the singular value
decomposition is performed.
However, the help file does not mention this anywhere.
This is confusing for someone who, like me, just wanted to check that
PCA and np.linalg.svd give the same results. In academic settings, students
are often asked to compare different methods and to check that they yield
the same results. I expect that many students have run into this problem
before...
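
To make the comparison concrete, here is a minimal sketch of the check I
have in mind. The random data, the variable names, and the choice of
svd_solver="full" are my own assumptions for illustration, and the
singular_values_ attribute assumes a recent scikit-learn release. PCA and
np.linalg.svd should agree once the data is centered (but not scaled), up
to the sign of each component:

```python
import numpy as np
from sklearn.decomposition import PCA

# Illustrative data: nonzero column means and very different column scales.
rng = np.random.RandomState(0)
X = rng.normal(loc=10.0, scale=[1.0, 10.0, 100.0], size=(50, 3))

# svd_solver="full" is chosen here so PCA runs an exact SVD.
pca = PCA(n_components=3, svd_solver="full").fit(X)

# Center each column, but do NOT divide by its standard deviation.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

# Singular values agree; components agree up to a sign flip per row.
same_singular_values = np.allclose(pca.singular_values_, S)
same_components = np.allclose(np.abs(pca.components_), np.abs(Vt))

# Without centering, the SVD of the raw matrix does not match PCA.
_, S_raw, _ = np.linalg.svd(X, full_matrices=False)
raw_matches = np.allclose(pca.singular_values_, S_raw)

print(same_singular_values, same_components, raw_matches)
```

If the columns were also divided by their standard deviations before the
SVD, I would not expect the results to match PCA on the original data.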
Best,
Ismael Lemhadri