<div dir="ltr">(normalize(X) * normalize(X)).sum(axis=1) works fine here.<div><br></div><div>But I was unaware of these quirks in how powers of negative numbers are evaluated:</div><div><br></div><div>NumPy consistently returns nan when a negative float is raised to a non-integer power. For an integer exponent (or a float with an integer value) the result is real, so squaring a negative float gives a positive number. I assume this follows C conventions?</div><div><br></div><div>Plain Python, on the other hand, looks strange at first:</div><div><br></div><div>NumPy:</div><div><div>>>> np.array(-.6) ** 2.1</div><div>nan</div><div><div>>>> np.array(-.6+0j) ** 2.1</div><div>(0.32532987876940411+0.10570608538524294j)</div></div><div><br></div><div>In Python 3.6.2 the unparenthesised expression returns a negative real number whose magnitude equals the modulus of the complex power:</div><div>>>> -.6 ** 2.1</div><div>-0.3420720779420435</div><div><div>>>> (-.6 + 0j) ** 2.1</div><div>(0.3253298787694041+0.10570608538524294j)</div></div><div><div>>>> (((-.6 + 0j) ** 2.1).real ** 2 + ((-.6 + 0j) ** 2.1).imag ** 2) ** .5</div><div>0.3420720779420434</div></div><div><br></div><div>Putting the LHS in parentheses gives the complex power itself:</div><div><br></div><div>>>> (-.6) ** 2.1</div><div>(0.3253298787694041+0.10570608538524294j)</div></div><div><br></div><div>At <a href="https://docs.python.org/3/reference/expressions.html">https://docs.python.org/3/reference/expressions.html</a>:</div><div><div class="gmail-section" id="gmail-the-power-operator"><p style="text-align:justify;line-height:22.4px">Raising a negative number to a fractional power results in a <a class="gmail-reference gmail-internal" href="https://docs.python.org/3/library/functions.html#complex" title="complex" style="color:rgb(99,99,187);text-decoration-line:none"><code class="gmail-xref gmail-py gmail-py-class gmail-docutils gmail-literal" style="background-color:transparent;padding:0px 1px;font-size:15.44px;font-family:monospace,sans-serif;border-radius:3px"><span class="gmail-pre" 
style="hyphens: none;">complex</span></code></a> number. (In earlier versions it raised a <a class="gmail-reference gmail-internal" href="https://docs.python.org/3/library/exceptions.html#ValueError" title="ValueError" style="color:rgb(99,99,187);text-decoration-line:none"><code class="gmail-xref gmail-py gmail-py-exc gmail-docutils gmail-literal" style="background-color:transparent;padding:0px 1px;font-size:15.44px;font-family:monospace,sans-serif;border-radius:3px"><span class="gmail-pre" style="hyphens: none;">ValueError</span></code></a>.)</p><p style="text-align:justify;line-height:22.4px">By "in earlier versions" it means Python 2. The difference between the parenthesised and unparenthesised forms is not a CPython bug but operator precedence: the same section of the docs says that ** binds more tightly than a unary operator on its left, so -.6 ** 2.1 is parsed as -(.6 ** 2.1), a negated real power. Only (-.6) ** 2.1 actually raises a negative number to a fractional power.</p><p style="text-align:justify;line-height:22.4px">The same rule explains both of your problems. -0.53806371**2 is -(0.53806371**2), so it comes out negative, whereas np.square(-0.53806371) squares the negative number itself and is correctly positive; the square of a real number is never negative. In your row 6 check, the leading -0.63116874**2 is subtracted rather than added; written as (-0.63116874)**2, the sum is virtually 1.</p></div><div class="gmail-section" id="gmail-unary-arithmetic-and-bitwise-operations"><span id="gmail-unary" style="font-family:&quot;Lucida Grande&quot;,Arial,sans-serif;font-size:16px"></span></div></div></div><div class="gmail_extra"><br><div class="gmail_quote">On 8 October 2017 at 16:08, Christopher Pfeifer <span dir="ltr">&lt;<a href="mailto:chrispfeifer8557@gmail.com" target="_blank">chrispfeifer8557@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">I am attempting to validate the output of an L2 normalization function:<div><br></div><div><b>data_l2 = preprocessing.normalize(data, norm='l2') </b>        # raw data is below at end of this email</div><div><br></div><div>output: </div><div><pre style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline">array([[ 0.57649683,  0.53806371,  0.61492995],
       [-0.53806371, -0.57649683, -0.61492995],
       [ 0.3359268 ,  0.90089461, -0.2748492 ],
       [ 0.6676851 , -0.39566524, -0.63059148],
       [-0.70710678,  0.        ,  0.70710678],
       [-0.63116874,  0.45083482,  0.63116874]])</pre><pre style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline"><br></pre><pre style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline">Each row being a set of three features of an observation</pre><pre style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline"><br></pre><pre style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline">I am under the belief that the sum of the 'squared' values of an instance (row) should be virtually equal to 1 (normalized).</pre><pre style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline"><br></pre><pre style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline"><font face="monospace, monospace"><b>Problem - 1:</b></font></pre><pre 
style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline"><font face="monospace, monospace">the np.square() function is returning the absolute value of the sum of the three features, even when the sum of the squares is clearly negative.</font></pre><pre style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline"><font face="monospace, monospace"><font color="#000000" style="font-size:small"><span style="font-size:14px">np.square(-0.53806371) returns </span></font><span style="font-size:small;color:rgb(34,34,34)">0.28951255601896408    however, (-0.53806371**2)    returns    -0.2895125560189641</span><br></font></pre><pre style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline"><font face="monospace, monospace"><span style="font-size:small;color:rgb(34,34,34)">The correct square of </span>-0.53806371 is  -0.2895125560189641 (a negative number), even my 10 year old calculator gets it right.</font></pre><pre style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline"><font face="monospace, monospace">I can find nothing in the numpy documentation that indicates np.square() always returns the absolute value, instead of the correctly signed value.</font></pre><pre 
style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline"><font face="monospace, monospace"><b>Question:</b></font></pre><pre style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline"><font face="monospace, monospace">Is there a way to force np.square() to return the correctly signed square value not the absolute value?</font></pre><pre style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline"><font face="monospace, monospace"><br></font></pre><pre style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline"><font face="monospace, monospace"><b>Problem - 2:</b></font></pre><pre style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline"><font face="monospace, monospace">For some of the observations (rows), the sum of the squared values (which should be virtually 1), are nowhere near 1.</font></pre><pre 
style="box-sizing:border-box;overflow:auto;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;border:0px;border-radius:0px;vertical-align:baseline"><font face="monospace, monospace"><font color="#000000"><span style="font-size:14px;white-space:pre-wrap">
<br></span></font></font></pre><pre style="box-sizing:border-box;overflow:auto;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;border:0px;border-radius:0px;vertical-align:baseline"><font face="monospace, monospace"><font color="#000000"><span style="font-size:14px;white-space:pre-wrap">print 0.57649683**2 + 0.53806371**2 +  0.61492995**2      row 1<br></span></font></font></pre><pre style="box-sizing:border-box;overflow:auto;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;border:0px;border-radius:0px;vertical-align:baseline"><pre style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline"><span style="font-family:arial,sans-serif">0.9999999944260154</span>  (this is virtually 1)</pre></pre><pre style="box-sizing:border-box;overflow:auto;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;border:0px;border-radius:0px;vertical-align:baseline"><font face="monospace, monospace"><font color="#000000"><span style="font-size:14px;white-space:pre-wrap"><br></span></font></font></pre><pre style="box-sizing:border-box;overflow:auto;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;border:0px;border-radius:0px;vertical-align:baseline"><font color="#000000"><span style="font-size:14px;white-space:pre-wrap">print -0.63116874**2 + 0.45083482**2  +  0.63116874**2    row 6</span></font></pre><pre style="box-sizing:border-box;overflow:auto;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;border:0px;border-radius:0px;vertical-align:baseline"><pre 
style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline"><span style="font-family:arial,sans-serif">0.203252034924 </span>  (<b>this is nowhere near 1</b>)</pre><pre style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline"><br></pre><pre style="box-sizing:border-box;overflow:auto;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;border:0px;border-radius:0px;vertical-align:baseline"><font face="monospace, monospace"><font color="#000000"><span style="font-size:14px;white-space:pre-wrap">sum of the 'squared' values of an instance (row) should be virtually equal to 1.</span></font></font></pre><pre style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline"><br></pre><pre style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline"><b>Question:</b></pre><pre style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline">Is the preprocessing.normalize(data, norm='l2') messing up, or is my raw data being fed into the normalization routine to 
unrealistic (I made it up of both positive and negative numbers.</pre><pre style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline"><br></pre><font face="monospace, monospace"><font color="#000000"><span style="font-size:14px;white-space:pre-wrap"><b>Raw Data</b>
</span></font></font><pre style="box-sizing:border-box;overflow:auto;font-size:14px;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline">array([[ 1.5,  1.4,  1.6],
       [-1.4, -1.5, -1.6],
       [ 2.2,  5.9, -1.8],
       [ 5.4, -3.2, -5.1],
       [-1.4,  0. ,  1.4],
       [-1.4,  1. ,  1.4]])</pre><font face="monospace, monospace"><font color="#000000"><span style="font-size:14px;white-space:pre-wrap">
Thanks: Chris</span></font></font></pre><pre style="box-sizing:border-box;overflow:auto;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;border:0px;border-radius:0px;vertical-align:baseline"><font face="monospace, monospace"><font color="#000000"><span style="font-size:14px;white-space:pre-wrap"><br></span></font></font></pre><pre style="box-sizing:border-box;overflow:auto;padding:0px;margin-top:0px;margin-bottom:0px;line-height:inherit;word-break:break-all;word-wrap:break-word;border:0px;border-radius:0px;vertical-align:baseline"><font face="monospace, monospace"><font color="#000000"><span style="font-size:14px;white-space:pre-wrap">P.S.: Not a real world problem, just trying to understand the functionality of scikit-learn. Have only been working with the package for two weeks.</span></font></font></pre></div></div>
<br>______________________________<wbr>_________________<br>
scikit-learn mailing list<br>
<a href="mailto:scikit-learn@python.org">scikit-learn@python.org</a><br>
<a href="https://mail.python.org/mailman/listinfo/scikit-learn" rel="noreferrer" target="_blank">https://mail.python.org/<wbr>mailman/listinfo/scikit-learn</a><br>
<br></blockquote></div><br></div>
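<div dir="ltr">For anyone who wants to reproduce the checks discussed in this thread, here is a minimal sketch in plain Python, so it runs without NumPy or scikit-learn. The l2_normalize helper is a hypothetical stand-in I wrote for illustration, not the actual implementation behind preprocessing.normalize(data, norm='l2'); the raw data is copied from the quoted message.</div>

```python
import math

# Raw data copied from the quoted message.
data = [
    [1.5, 1.4, 1.6],
    [-1.4, -1.5, -1.6],
    [2.2, 5.9, -1.8],
    [5.4, -3.2, -5.1],
    [-1.4, 0.0, 1.4],
    [-1.4, 1.0, 1.4],
]


def l2_normalize(rows):
    """Divide each row by its Euclidean norm.

    Hypothetical stand-in for norm='l2'; assumes no row is all zeros.
    """
    out = []
    for row in rows:
        norm = math.sqrt(sum(x * x for x in row))
        out.append([x / norm for x in row])
    return out


normalized = l2_normalize(data)

# Squaring the stored values (not negated literals) gives row sums of 1,
# up to floating-point rounding.
row_sums = [sum(x ** 2 for x in row) for row in normalized]

# With a literal, ** is evaluated before the unary minus on its left.
unparenthesised = -0.63116874 ** 2   # parsed as -(0.63116874 ** 2)
parenthesised = (-0.63116874) ** 2   # squares the negative number itself

# In Python 3, a negative base with a fractional exponent gives a complex
# result; its modulus equals 0.6 ** 2.1.
z = (-0.6) ** 2.1

print(row_sums)
print(unparenthesised, parenthesised)
print(z, abs(z))
```

<div dir="ltr">The row sums are computed from values held in variables, so the unary minus is already part of each number before it is squared; the precedence issue only appears when a negative literal is typed directly in front of **.</div>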