<div dir="ltr">I am attempting to validate the output of an L2 normalization function:<div><br></div><div><b>data_l2 = preprocessing.normalize(data, norm='l2') </b> # raw data is below at end of this email</div><div><br></div><div>output: </div><div><pre style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline">array([[ 0.57649683, 0.53806371, 0.61492995],
[-0.53806371, -0.57649683, -0.61492995],
[ 0.3359268 , 0.90089461, -0.2748492 ],
[ 0.6676851 , -0.39566524, -0.63059148],
[-0.70710678, 0. , 0.70710678],
[-0.63116874, 0.45083482, 0.63116874]])</pre><pre style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline"><br></pre><pre style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline">Each row is a set of three features of one observation.</pre><pre style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline"><br></pre><pre style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline">My understanding is that the sum of the squared values of an instance (row) should be virtually equal to 1 (normalized).</pre><pre style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline"><br></pre><pre style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline"><font face="monospace, monospace"><b>Problem - 1:</b></font></pre><pre 
style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline"><font face="monospace, monospace">The np.square() function is returning the absolute value of the sum of the three features, even when the sum of the squares is clearly negative.</font></pre><pre style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline"><font face="monospace, monospace"><font color="#000000" style="font-size:small"><span style="font-size:14px">np.square(-0.53806371) returns </span></font><span style="font-size:small;color:rgb(34,34,34)">0.28951255601896408; however, (-0.53806371**2) returns -0.2895125560189641.</span><br></font></pre><pre style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline"><font face="monospace, monospace"><span style="font-size:small;color:rgb(34,34,34)">The correct square of </span>-0.53806371 is -0.2895125560189641 (a negative number); even my 10-year-old calculator gets it right.</font></pre><pre style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline"><font face="monospace, monospace">I can find nothing in the numpy documentation that indicates np.square() always returns the absolute value instead of the correctly signed value.</font></pre><pre 
style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline"><font face="monospace, monospace"><b>Question:</b></font></pre><pre style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline"><font face="monospace, monospace">Is there a way to force np.square() to return the correctly signed square value not the absolute value?</font></pre><pre style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline"><font face="monospace, monospace"><br></font></pre><pre style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline"><font face="monospace, monospace"><b>Problem - 2:</b></font></pre><pre style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline"><font face="monospace, monospace">For some of the observations (rows), the sum of the squared values (which should be virtually 1), are nowhere near 1.</font></pre><pre 
style="box-sizing:border-box;overflow:auto;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;border:0px;border-radius:0px;vertical-align:baseline"><font face="monospace, monospace"><font color="#000000"><span style="font-size:14px;white-space:pre-wrap">
<br></span></font></font></pre><pre style="box-sizing:border-box;overflow:auto;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;border:0px;border-radius:0px;vertical-align:baseline"><font face="monospace, monospace"><font color="#000000"><span style="font-size:14px;white-space:pre-wrap">print(0.57649683**2 + 0.53806371**2 + 0.61492995**2)  # row 1<br></span></font></font></pre><pre style="box-sizing:border-box;overflow:auto;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;border:0px;border-radius:0px;vertical-align:baseline"><pre style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline"><span style="font-family:arial,sans-serif">0.9999999944260154</span> (this is virtually 1)</pre></pre><pre style="box-sizing:border-box;overflow:auto;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;border:0px;border-radius:0px;vertical-align:baseline"><font face="monospace, monospace"><font color="#000000"><span style="font-size:14px;white-space:pre-wrap"><br></span></font></font></pre><pre style="box-sizing:border-box;overflow:auto;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;border:0px;border-radius:0px;vertical-align:baseline"><font color="#000000"><span style="font-size:14px;white-space:pre-wrap">print(-0.63116874**2 + 0.45083482**2 + 0.63116874**2)  # row 6</span></font></pre><pre style="box-sizing:border-box;overflow:auto;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;border:0px;border-radius:0px;vertical-align:baseline"><pre 
style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline"><span style="font-family:arial,sans-serif">0.203252034924 </span> (<b>this is nowhere near 1</b>)</pre><pre style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline"><br></pre><pre style="box-sizing:border-box;overflow:auto;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;border:0px;border-radius:0px;vertical-align:baseline"><font face="monospace, monospace"><font color="#000000"><span style="font-size:14px;white-space:pre-wrap">The sum of the squared values of an instance (row) should be virtually equal to 1.</span></font></font></pre><pre style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline"><br></pre><pre style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline"><b>Question:</b></pre><pre style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline">Is preprocessing.normalize(data, norm='l2') messing up, or is the raw data I am feeding into the normalization routine too 
unrealistic (I made it up from both positive and negative numbers)?</pre><pre style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline"><br></pre><font face="monospace, monospace"><font color="#000000"><span style="font-size:14px;white-space:pre-wrap"><b>Raw Data</b>
</span></font></font><pre style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline">array([[ 1.5, 1.4, 1.6],
[-1.4, -1.5, -1.6],
[ 2.2, 5.9, -1.8],
[ 5.4, -3.2, -5.1],
[-1.4, 0. , 1.4],
[-1.4, 1. , 1.4]])</pre><font face="monospace, monospace"><font color="#000000"><span style="font-size:14px;white-space:pre-wrap">
Thanks: Chris</span></font></font></pre><pre style="box-sizing:border-box;overflow:auto;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;border:0px;border-radius:0px;vertical-align:baseline"><font face="monospace, monospace"><font color="#000000"><span style="font-size:14px;white-space:pre-wrap"><br></span></font></font></pre><pre style="box-sizing:border-box;overflow:auto;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;border:0px;border-radius:0px;vertical-align:baseline"><font face="monospace, monospace"><font color="#000000"><span style="font-size:14px;white-space:pre-wrap">P.S.: Not a real world problem, just trying to understand the functionality of scikit-learn. Have only been working with the package for two weeks.</span></font></font></pre></div></div>
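<div>One way to check the row sums is sketched below. It assumes that L2 normalization divides each row by its Euclidean norm; the helpers <b>l2_normalize_rows</b> and <b>sum_of_squares</b> are hypothetical plain-Python stand-ins written for this check, not scikit-learn or NumPy functions. The raw data is the array from this email. Note that in Python the unary minus binds less tightly than **, so the last two lines print the same value with and without parentheses for comparison.</div>

```python
import math

# Raw data copied from this email.
data = [
    [1.5, 1.4, 1.6],
    [-1.4, -1.5, -1.6],
    [2.2, 5.9, -1.8],
    [5.4, -3.2, -5.1],
    [-1.4, 0.0, 1.4],
    [-1.4, 1.0, 1.4],
]


def l2_normalize_rows(rows):
    """Hypothetical helper: divide each row by its Euclidean (L2) norm."""
    normalized = []
    for row in rows:
        norm = math.sqrt(sum(value * value for value in row))
        # Leave all-zero rows unchanged to avoid dividing by zero.
        normalized.append([value / norm for value in row] if norm else list(row))
    return normalized


def sum_of_squares(row):
    """Square each stored value, so its sign is part of what gets squared."""
    return sum(value * value for value in row)


data_l2 = l2_normalize_rows(data)
row_sums = [sum_of_squares(row) for row in data_l2]
for index, total in enumerate(row_sums, start=1):
    print(f"row {index}: {total:.12f}")

# Unary minus binds less tightly than ** in Python.
print(-0.63116874 ** 2)    # parsed as -(0.63116874 ** 2)
print((-0.63116874) ** 2)  # squares the negative value itself
```

<div>If every printed row sum is within rounding error of 1, the normalized rows have unit L2 norm. The two final print calls show how the placement of the minus sign changes what ** is applied to.</div>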