<div dir="ltr">Thanks for the help.<div><br></div><div>What I am actually doing is computing the gradient of a least-squares objective, that is,</div><div><br></div><div>     X.T.dot(X.dot(beta) - Y)</div><div><br></div><div>If X is such that X.dot(beta) is fast (i.e. matvec is fast), am I missing a "simple" optimization here at the cost of a copy? Alternatively, if X is such that vecmat is fast, what is the best way to do this? A copy seems easiest, possibly combined with the previous "simple" optimization.</div><div>Based on my understanding of the other replies, I would guess that if X2=X.copy(), then the fastest way would be </div><div><br></div><div>    (X2.dot(beta) - Y).dot(X)</div><div><br></div><div>This doesn't pan out in my example; the winner is</div><div><br></div><div>    X2.T.dot(X2.dot(beta) - Y)</div><div><br></div><div>which is about the same as</div><div><br></div><div>    (X2.dot(beta)-Y).dot(X2)</div><div><br></div><div>I made a small gist: <span style="font-variant-ligatures:no-common-ligatures;font-family:Menlo;font-size:11px"><a href="https://gist.github.com/da7b2ef6ef109511af06a9cebbfc8ed1">https://gist.github.com/da7b2ef6ef109511af06a9cebbfc8ed1</a></span></div><div><span style="font-variant-ligatures:no-common-ligatures;font-family:Menlo;font-size:11px"><br></span></div><div><span style="font-variant-ligatures:no-common-ligatures;font-family:Menlo;font-size:11px">One difference I see between a NumPy array with the same strides and the array loaded from the MAT file is the ALIGNED flag.</span></div>
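<div><br></div><div>For reference, a minimal sketch of the variants compared above. It is not the code from the gist: the sizes, the random data and the Fortran-ordered X (used here only as a stand-in for the array loaded from the MAT file) are all assumptions. It checks that the forms agree numerically and prints the layout flags; it does no timing, since that depends on the machine and the BLAS in use.</div>

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 50  # illustrative sizes, not taken from the gist

# Assumed stand-in for the MAT-file array: column-major (Fortran) order.
X = np.asfortranarray(rng.standard_normal((n, p)))
beta = rng.standard_normal(p)
Y = rng.standard_normal(n)

# ndarray.copy() defaults to order='C', so X2 is a row-major copy of X.
X2 = X.copy()

residual = X2.dot(beta) - Y

# Two algebraically equivalent forms of the gradient X^T (X beta - Y).
grad_matvec = X2.T.dot(residual)  # transposed view times vector
grad_vecmat = residual.dot(X2)    # vector times matrix, no transpose

# Same quantity computed from the original column-major array.
grad_original = X.T.dot(X.dot(beta) - Y)

# Layout and alignment flags of both arrays.
for name, arr in (("X", X), ("X2", X2)):
    print(name, arr.strides,
          "C:", arr.flags["C_CONTIGUOUS"],
          "F:", arr.flags["F_CONTIGUOUS"],
          "ALIGNED:", arr.flags["ALIGNED"])
```

<div>Wrapping each expression in timeit on the real data would show which form wins for a given layout.</div>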


</div><div class="gmail_extra"><br><div class="gmail_quote">On Sat, Aug 26, 2017 at 10:10 AM, Stephan Hoyer <span dir="ltr">&lt;<a href="mailto:shoyer@gmail.com" target="_blank">shoyer@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span class="">On Sat, Aug 26, 2017 at 12:09 AM, Jonathan Taylor <span dir="ltr">&lt;<a href="mailto:jonathan.taylor@stanford.edu" target="_blank">jonathan.taylor@stanford.edu</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div>So, matvec is just slower because of strides and where numpy retrieves data? Is there a simple way to do this besides a copy? I can easily afford the copy, just wondering.</div></div></blockquote><div><br></div></span><div>No, the only way to change the strides of an array with the same data is to make a copy.</div><div><br></div><div>Array operations will always be fastest when the smallest strides are along the axis iterated over in the innermost (summed) loop. So the existing strides of your matrix are not sub-optimal in general, just for this specific operation. They would be suitable, for example, in a vector-matrix multiply.</div></div></div></div>
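<div><br></div><div>To make the point about strides concrete, here is a small sketch. The matrix is a made-up column-major array of float64 ones, not the one from this thread. It shows that a transpose only swaps the strides on the same buffer, while a different stride pattern for the same logical array requires a copy.</div>

```python
import numpy as np

# Hypothetical column-major matrix of float64 (itemsize 8 bytes).
X = np.asfortranarray(np.ones((1000, 20)))
print(X.strides)    # (8, 8000): the smallest stride is along axis 0

# Transposing returns a view: strides are swapped, no data is moved.
Xt = X.T
print(Xt.strides)   # (8000, 8)
print(np.shares_memory(X, Xt))   # True

# A row-major layout of the same values needs a new buffer.
X_c = np.ascontiguousarray(X)
print(X_c.strides)  # (160, 8): the smallest stride is now along axis 1
print(np.shares_memory(X, X_c))  # False
```

<div>With the original layout, each column of X is contiguous in memory, which is why it suits a vector-matrix product better than a matrix-vector product.</div>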


<br></blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature" data-smartmail="gmail_signature">Jonathan Taylor                          <br>Dept. of Statistics                      <br>Sequoia Hall, 137                          <br>390 Serra Mall<br>Stanford, CA 94305<br>Tel:   650.723.9230<br>Fax:   650.725.8977<br>Web: <a href="http://www-stat.stanford.edu/~jtaylo" target="_blank">http://www-stat.stanford.edu/~jtaylo</a></div>

</div>