[scikit-learn] Should we standardize data before PCA?
James Melenkevitz
jmmelen at yahoo.com
Sun May 27 15:13:18 EDT 2018
And this is the scikit-learn page on the importance of feature scaling: http://scikit-learn.org/stable/auto_examples/preprocessing/plot_scaling_importance.html
On Saturday, May 26, 2018, 10:10:32 PM PDT, Shiheng Duan <shiduan at ucdavis.edu> wrote:
Thanks.
Do you mean that if feature one has a larger standard deviation than feature two, after z-scoring they will have the same weight? In that case, it is a bias, right? Feature one should be more important than feature two in the PCA.
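To make the question concrete, here is a minimal sketch on synthetic data (the two features, their scales, and the variable names are assumptions made up for illustration, not taken from the dataset discussed in this thread). It fits PCA on two correlated features with very different spreads, once on the raw values and once after z-scoring with StandardScaler.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic, illustrative data: two correlated features,
# the first with a much larger standard deviation.
rng = np.random.RandomState(0)
a = rng.normal(size=1000)
b = 0.5 * a + rng.normal(size=1000)
X = np.column_stack([100 * a, b])

# PCA on the raw data: the first component follows the high-variance feature.
pca_raw = PCA(n_components=2).fit(X)

# PCA after z-scoring: both features have unit variance, so PCA works
# on the correlation structure and weights them equally.
X_z = StandardScaler().fit_transform(X)
pca_z = PCA(n_components=2).fit(X_z)

print("raw first component:     ", pca_raw.components_[0])
print("z-scored first component:", pca_z.components_[0])
```

In this sketch the raw first component is almost entirely feature one, while after z-scoring the two loadings have equal magnitude. Whether that equal weighting is a bias or the desired behaviour depends on whether the original spread of each feature is meaningful for the problem.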
On Thu, May 24, 2018 at 5:09 PM, Michael Eickenberg <michael.eickenberg at gmail.com> wrote:
Hi,
That totally depends on the nature of your data and on whether the standard deviations of the individual feature axes/columns of your data carry some form of importance measure. Note that PCA will bias its loadings towards columns with large standard deviations, all else being held equal. For example, if you have z-scored columns, and then you choose one column and multiply it by, say, 1000, then that column will likely show up as your first component (if 1000 is comparable to or large with respect to the number of features you are using).
Does this help?
Michael
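The thought experiment above can be tried directly. The following is a rough sketch on synthetic Gaussian data (the sample size, the number of features, and the choice of column index 2 are arbitrary assumptions for illustration): all columns are z-scored, one column is multiplied by 1000, and PCA is fitted on the result.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic, illustrative data: 500 samples, 5 independent features.
rng = np.random.RandomState(0)
X = rng.normal(size=(500, 5))

# z-score every column, then inflate one arbitrarily chosen column by 1000.
X_scaled = StandardScaler().fit_transform(X)
X_scaled[:, 2] *= 1000

pca = PCA(n_components=2).fit(X_scaled)

# Index of the feature with the largest absolute loading on the first component.
first_loading = pca.components_[0]
dominant = int(np.argmax(np.abs(first_loading)))
ratio = pca.explained_variance_ratio_[0]

print("dominant feature in first component:", dominant)
print("explained variance ratio of first component:", ratio)
```

Here the first component is essentially the inflated column alone, and it accounts for nearly all of the explained variance, even though the column carries no more structure than the others.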
On Thu, May 24, 2018 at 4:39 PM, Shiheng Duan <shiduan at ucdavis.edu> wrote:
Hello all,
I wonder whether it is necessary or correct to apply a z-score transformation before PCA. I didn't see any preprocessing of the face images in the example "Faces recognition example using eigenfaces and SVMs", link: http://scikit-learn.org/stable/auto_examples/applications/plot_face_recognition.html#sphx-glr-auto-examples-applications-plot-face-recognition-py
I am working on a similar dataset and got a weird result when I standardized the data before PCA. The components figure has a strong gradient and doesn't make any sense. Any ideas about the reason?
Thanks.
_______________________________________________
scikit-learn mailing list
scikit-learn at python.org
https://mail.python.org/mailman/listinfo/scikit-learn