<div dir="ltr">
<div dir="ltr"><div>Hi, I created an algorithm that may solve linear
regression problems with lower time complexity than singular value
decomposition (SVD). It only requires the gradient and the diagonal of the
Hessian to calculate the optimal weights. I have attached the TensorFlow code
below. I haven't been able to get it working in pure NumPy yet, but I'm
sure someone will be able to port it if it really does what it purports to
do.</div><div><br></div><div><pre style="background-color:rgb(255,255,255);color:rgb(0,0,0);font-family:"Courier New";font-size:9pt">import numpy as np
import tensorflow as tf

# Synthetic dataset: the target is sqrt(0..9); the features are
# powers of the target plus a bias column.
Y = np.arange(10).reshape(10, 1)**0.5

bias_X = np.ones(10).reshape(10, 1)
X_feature1 = Y**3
X_feature2 = Y**4
X_feature3 = Y**5
X = np.concatenate((bias_X, X_feature1, X_feature2, X_feature3), axis=1)

num_features = 4

X_in = tf.placeholder(tf.float64, [None, num_features])
Y_in = tf.placeholder(tf.float64, [None, 1])
W = tf.placeholder(tf.float64, [num_features, 1])

W_squeezed = tf.squeeze(W)
Y_hat = tf.expand_dims(tf.tensordot(X_in, W_squeezed, ([1], [0])), axis=1)

# Note: this is the squared mean of the residuals, not the mean
# squared error. (The loss is fed through Y_in; the original draft
# accidentally referenced the NumPy array Y here.)
loss = tf.reduce_mean(Y_in - Y_hat)**2

gradient = tf.gradients(loss, [W_squeezed])[0]
gradient_2nd = tf.diag_part(tf.hessians(loss, [W_squeezed])[0])

# Jump each weight to the vertex of its one-dimensional parabola,
# scaled down by the number of features.
vertex_offset = -gradient / gradient_2nd / num_features

W_star = tf.expand_dims(W_squeezed + vertex_offset, axis=1)

with tf.Session() as sess:
    random_W = np.random.normal(size=(num_features, 1)).astype(np.float64)
    random_loss, optimal_W, grad_val, grad_2nd_val = sess.run(
        [loss, W_star, gradient, gradient_2nd],
        feed_dict={X_in: X, Y_in: Y, W: random_W})
    print('Random loss:', random_loss)
    print('Gradient:', grad_val)
    print('2nd-order gradient:', grad_2nd_val)

    print('W:')
    print(random_W)
    print()
    print('W*:')
    print(optimal_W)
    print()

    optimal_loss = sess.run(loss, feed_dict={X_in: X, Y_in: Y, W: optimal_W})
    print('Optimal loss:', optimal_loss)</pre></div></div>
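For reference, the same update can be sketched in plain NumPy by writing the gradient and diagonal Hessian of this particular loss out analytically: for L(w) = (mean(y - Xw))**2 with mean residual m, the gradient is -2*m*mean(X_j) and the diagonal Hessian entry is 2*mean(X_j)**2. This is my own derivation of the step, not a full port of the code above, so treat it as a sketch:

```python
import numpy as np

def vertex_jump(X, y, w):
    """One vertex-jump step for loss(w) = (mean(y - X @ w))**2.

    The gradient and diagonal Hessian are written out analytically,
    so no autodiff framework is needed.
    """
    d = X.shape[1]
    m = np.mean(y - X @ w)            # mean residual
    col_means = X.mean(axis=0)
    grad = -2.0 * m * col_means       # dL/dw_j
    hess_diag = 2.0 * col_means**2    # d^2 L / dw_j^2
    return w - grad / hess_diag / d   # same -grad/diag(H)/num_features step

# Same toy data as the TensorFlow version above.
y = np.arange(10)**0.5
X = np.stack([np.ones(10), y**3, y**4, y**5], axis=1)

w0 = np.random.normal(size=4)
w_star = vertex_jump(X, y, w0)
print('loss before:', np.mean(y - X @ w0)**2)
print('loss after: ', np.mean(y - X @ w_star)**2)  # drops to (numerically) zero
```

Note that the single step lands exactly on the minimum of this loss because the squared-mean-residual loss is a rank-one quadratic in w, which is also why dividing the Newton-style offset by num_features works here.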
</div>