<div dir="ltr"><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">Dear all,</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">The help file for the PCA class is unclear about the preprocessing applied to the data.</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">You can check on line 410 here: </div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><a href="https://github.com/scikit-learn/scikit-learn/blob/ef5cb84a/sklearn/decomposition/pca.py#L410" target="_blank">https://github.com/scikit-<wbr>learn/scikit-learn/blob/<wbr>ef5cb84a/sklearn/<wbr>decomposition/pca.py#L410</a></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">that the matrix is centered but NOT scaled before the singular value decomposition is performed.</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">However, the help file does not mention this.</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">This is confusing for someone who, like me, just wanted to check that PCA and np.linalg.svd give the same results. In academic settings, students are often asked to compare different methods and to verify that they yield the same results. I expect that many students have run into this problem before.</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">Best,</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">Ismael Lemhadri</div></div>
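<div class="gmail_default" style="font-family:arial,helvetica,sans-serif">P.S. Here is a minimal sketch of the comparison described above. It assumes a scikit-learn version that exposes the singular_values_ attribute (0.19 or later, as far as I know); the random data and all variable names are made up for illustration. The two results should agree only once the columns are centered by hand, and the component vectors may differ in sign.</div>

```python
import numpy as np
from sklearn.decomposition import PCA

# Illustrative data (an assumption): columns with different means and scales.
rng = np.random.RandomState(0)
scales = np.array([1.0, 5.0, 0.5, 2.0])
offsets = np.array([10.0, -3.0, 0.0, 7.0])
X = rng.normal(size=(50, 4)) * scales + offsets

# Center each column by subtracting its mean; do NOT divide by the
# standard deviation, since PCA does not scale the data either.
X_centered = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

# Fit PCA on the raw data; it centers the data internally.
pca = PCA(n_components=4, svd_solver="full").fit(X)

# Singular values should match the SVD of the centered matrix.
singular_values_match = np.allclose(pca.singular_values_, S)

# Principal axes should match the right singular vectors up to sign.
components_match = np.allclose(np.abs(pca.components_), np.abs(Vt))

# For contrast: the SVD of the uncentered matrix gives different values.
S_raw = np.linalg.svd(X, compute_uv=False)
raw_match = np.allclose(pca.singular_values_, S_raw)

print(singular_values_match, components_match, raw_match)
```

<div class="gmail_default" style="font-family:arial,helvetica,sans-serif">If a comparison on standardized data is wanted instead, the columns would have to be scaled explicitly (for example with sklearn.preprocessing.StandardScaler) before calling either method.</div>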