And this you have likely seen already on Wikipedia:
https://en.wikipedia.org/wiki/Principal_component_analysis

"... PCA is mostly used as a tool in exploratory data analysis and for making predictive models. It's often used to visualize genetic distance and relatedness between populations. PCA can be done by eigenvalue decomposition of a data covariance (or correlation) matrix or singular value decomposition of a data matrix, usually after mean centering (and normalizing or using Z-scores) the data matrix for each attribute. The results of a PCA are usually discussed in terms of component scores, sometimes called factor scores (the transformed variable values corresponding to a particular data point), and loadings (the weight by which each standardized original variable should be multiplied to get the component score) ..."
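For what it's worth, here is a minimal NumPy sketch of the two routes the quoted passage mentions (eigendecomposition of the covariance matrix vs. SVD of the mean-centered data matrix); the toy data and variable names are made up purely for illustration:

import numpy as np

rng = np.random.RandomState(0)
X = rng.randn(100, 5) * np.array([1.0, 2.0, 0.5, 3.0, 1.5])  # toy data, 5 features

# Route 1: eigendecomposition of the covariance matrix
Xc = X - X.mean(axis=0)                  # mean centering
cov = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending order
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]

# Route 2: SVD of the centered data matrix
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

# Singular values relate to the covariance eigenvalues by S**2 / (n - 1) ...
print(np.allclose(S**2 / (len(X) - 1), eigvals))
# ... and the right singular vectors are the same components, up to sign
print(np.allclose(np.abs(Vt), np.abs(eigvecs.T)))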
            
            <div id="ydpc0f43bafyahoo_quoted_8214646923" class="ydpc0f43bafyahoo_quoted">
                <div style="font-family:'Helvetica Neue', Helvetica, Arial, sans-serif;font-size:13px;color:#26282a;">
                    
                    <div>
                        On Saturday, May 26, 2018, 10:10:32 PM PDT, Shiheng Duan <shiduan@ucdavis.edu> wrote:
                    </div>
                    <div><br></div>
                    <div><br></div>
                    <div><div id="ydpc0f43bafyiv2807525728"><div><div dir="ltr">Thanks. <div><br clear="none"></div><div>Do you mean that if feature one has a larger derivation than feature two, after zscore they will have the same weight? In that case, it is a bias, right? The feature one should be more important than feature two in the PCA. </div></div><div class="ydpc0f43bafyiv2807525728gmail_extra"><br clear="none"><div class="ydpc0f43bafyiv2807525728yqt7425748386" id="ydpc0f43bafyiv2807525728yqtfd71683"><div class="ydpc0f43bafyiv2807525728gmail_quote">On Thu, May 24, 2018 at 5:09 PM, Michael Eickenberg <span dir="ltr"><<a shape="rect" href="mailto:michael.eickenberg@gmail.com" rel="nofollow" target="_blank">michael.eickenberg@gmail.com</a>></span> wrote:<br clear="none"><blockquote class="ydpc0f43bafyiv2807525728gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><div dir="ltr">Hi,<div><br clear="none"></div><div>that totally depends on the nature of your data and whether the standard deviation of individual feature axes/columns of your data carry some form of importance measure. Note that PCA will bias its loadings towards columns with large standard deviations all else being held equal (meaning that if you have zscored columns, and then you choose one column and multiply it by, say 1000, then that component will likely show up as your first component [if 1000 is comparable or large wrt the number of features you are using])</div><div><br clear="none"></div><div>Does this help?</div><div>Michael</div></div><div class="ydpc0f43bafyiv2807525728gmail_extra"><br clear="none"><div class="ydpc0f43bafyiv2807525728gmail_quote"><div><div class="ydpc0f43bafyiv2807525728h5">On Thu, May 24, 2018 at 4:39 PM, Shiheng Duan <span dir="ltr"><<a shape="rect" href="mailto:shiduan@ucdavis.edu" rel="nofollow" target="_blank">shiduan@ucdavis.edu</a>></span> wrote:<br clear="none"></div></div><blockquote class="ydpc0f43bafyiv2807525728gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><div><div class="ydpc0f43bafyiv2807525728h5"><div dir="ltr">Hello all,<div><br clear="none"></div><div style="text-align:left;">I wonder is it necessary or correct to do z score transformation before PCA? I didn't see any preprocessing for face image in the example of Faces recognition example using eigenfaces and SVMs, link:<a shape="rect" href="http://scikit-learn.org/stable/auto_examples/applications/plot_face_recognition.html#sphx-glr-auto-examples-applications-plot-face-recognition-py" rel="nofollow" target="_blank">http://scikit-learn.org/s table/auto_examples/applicatio ns/plot_face_recognition.html# sphx-glr-auto-examples- applications-plot-face- recognition-py</a></div><div style="text-align:left;"><br clear="none"></div><div style="text-align:left;">I am doing on a similar dataset and got a weird result if I standardized data before PCA. The components figure will have a strong gradient and it doesn't make any sense. Any ideas about the reason? </div><div style="text-align:left;"><br clear="none"></div><div style="text-align:left;">Thanks. </div></div>
<br clear="none"></div></div>______________________________ _________________<br clear="none">
scikit-learn mailing list<br clear="none">
<a shape="rect" href="mailto:scikit-learn@python.org" rel="nofollow" target="_blank">scikit-learn@python.org</a><br clear="none">
<a shape="rect" href="https://mail.python.org/mailman/listinfo/scikit-learn" rel="nofollow" target="_blank">https://mail.python.org/mailma n/listinfo/scikit-learn</a><br clear="none">
<br clear="none"></blockquote></div><br clear="none"></div>
<br clear="none">______________________________ _________________<br clear="none">
scikit-learn mailing list<br clear="none">
<a shape="rect" href="mailto:scikit-learn@python.org" rel="nofollow" target="_blank">scikit-learn@python.org</a><br clear="none">
<a shape="rect" href="https://mail.python.org/mailman/listinfo/scikit-learn" rel="nofollow" target="_blank">https://mail.python.org/ mailman/listinfo/scikit-learn</a><br clear="none">
<br clear="none"></blockquote></div><br clear="none"></div></div></div></div><div class="ydpc0f43bafyqt7425748386" id="ydpc0f43bafyqtfd02873">_______________________________________________<br clear="none">scikit-learn mailing list<br clear="none"><a shape="rect" href="mailto:scikit-learn@python.org" rel="nofollow" target="_blank">scikit-learn@python.org</a><br clear="none"><a shape="rect" href="https://mail.python.org/mailman/listinfo/scikit-learn" rel="nofollow" target="_blank">https://mail.python.org/mailman/listinfo/scikit-learn</a><br clear="none"></div></div>