<div dir="ltr"><div>I started a ufunc to compute the sum of squared differences <a href="https://gist.github.com/mattharrigan/6f678b3d6df5efd236fc23bfb59fd3bd">here</a>.  It is about 4x faster and uses half the memory compared to np.sum(np.square(x-c)).  This could significantly speed up common operations like std and var (where c=np.mean(x)).  It is faster because it makes a single pass over the data instead of 3, and also because the inner loop is specialized for the common reduce case, which might be an optimization worth considering more generally.<br><br>I think I have answered my questions #2 and #3 above.<br><br></div>Can someone please point me to an example where "data" was used in a ufunc inner loop?  How can that value be set at runtime?  Thanks<br></div><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Nov 4, 2016 at 5:33 PM, Sebastian Berg <span dir="ltr">&lt;<a href="mailto:sebastian@sipsolutions.net" target="_blank">sebastian@sipsolutions.net</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On Fr, 2016-11-04 at 15:42 -0400, Matthew Harrigan wrote:<br>

> I didn't notice identity before.  Seems like frompyfunc always sets<br>

> it to None.  If it were zero maybe it would work as desired here.<br>

><br>

> In the writing your own ufunc doc, I was wondering if the pointer to<br>

> data could be used to get a constant at runtime.  If not, what could<br>

> that be used for?<br>

> static void double_logit(char **args, npy_intp *dimensions,<br>

>                          npy_intp* steps, void* data)<br>

> Why would the numerical accuracy be any different?  The subtraction<br>

> and square operations look identical and I thought np.sum just calls<br>

> np.add.reduce, so the reduction step uses the same code and would<br>

> therefore have the same accuracy.<br>

><br>

<br>

</span>Sorry, did not read it carefully, I guess `c` is the mean, so you are<br>

doing the two pass method.<br>

<span class="HOEnZb"><font color="#888888"><br>

- Sebastian<br>

</font></span><div class="HOEnZb"><div class="h5"><br>

<br>

> Thanks<br>

><br>

> On Fri, Nov 4, 2016 at 1:56 PM, Sebastian Berg &lt;sebastian@sipsolutions.net&gt; wrote:<br>

> > On Fr, 2016-11-04 at 13:11 -0400, Matthew Harrigan wrote:<br>

> > > I was reading this and got thinking about whether a ufunc could compute<br>

> > > the sum of squared differences in a single pass without a temporary<br>

> > > array.  The Python code below demonstrates a possible approach.<br>

> > ><br>

> > > import numpy as np<br>

> > > x = np.arange(10)<br>

> > > c = 1.0<br>

> > > def add_square_diff(x1, x2):<br>

> > >     return x1 + (x2-c)**2<br>

> > > ufunc = np.frompyfunc(add_square_diff, 2, 1)<br>

> > > print(ufunc.reduce(x) - x[0] + (x[0]-c)**2)<br>

> > > print(np.sum(np.square(x-c)))<br>

> > ><br>

> > > I have (at least) 4 questions:<br>

> > > 1. Is it possible to pass run time constants to a ufunc written in C<br>

> > > for use in its inner loop, and if so how?<br>

> ><br>

> > I don't think it's anticipated, since a ufunc could in most cases use a<br>

> > third argument, but a 3 arg ufunc can't be reduced. Not sure if there<br>

> > might be some trickery possible.<br>

> ><br>

> > > 2. Is it possible to pass an initial value to reduce to avoid the<br>

> > > clean up required for the first element?<br>

> ><br>

> > This is normally the identity. But I think the identity can only be 0, 1 or<br>

> > -1 right now. The identity is what the output array gets<br>

> > initialized with (which effectively makes it the first value passed<br>

> > into the inner loop).<br>

> ><br>

> > > 3. Does that ufunc work, or are there special cases which cause it to<br>

> > > fall apart?<br>

> > > 4. Would a very specialized ufunc such as this be considered for<br>

> > > incorporation in numpy, since it would help reduce the time and memory of<br>

> > > functions already in numpy?<br>

> > ><br>

> ><br>

> > Might be mixing up things, however, IIRC the single pass approach has<br>

> > bad numerical accuracy, so I doubt that it is a good default<br>

> > algorithm.<br>

> ><br>

> > - Sebastian<br>

> ><br>

> ><br>

> > > Thank you,<br>

> > > Matt<br>

> > > ______________________________<wbr>_________________<br>

> > > NumPy-Discussion mailing list<br>

> > > <a href="mailto:NumPy-Discussion@scipy.org">NumPy-Discussion@scipy.org</a><br>

> > > <a href="https://mail.scipy.org/mailman/listinfo/numpy-discussion" rel="noreferrer" target="_blank">https://mail.scipy.org/<wbr>mailman/listinfo/numpy-<wbr>discussion</a><br>

> ><br>

</div></div><br>
<br></blockquote></div><br></div>
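<div>To make the accuracy point in this thread concrete, here is a minimal pure-Python sketch. It is not the C ufunc from the gist. It folds acc + (v - c)**2 over the data with an explicit initial value of 0.0, which is how an identity of 0 would avoid the first-element clean-up, and compares the resulting two-pass variance with the textbook one-pass formula. The helper names sum_square_diff, var_two_pass and var_single_pass are made up for this illustration.</div>

```python
from functools import reduce
import math


def sum_square_diff(values, c):
    """Fold acc + (v - c)**2 over values, starting from an initial 0.0."""
    # The explicit initial value plays the role of an identity of 0, so the
    # first element needs no special clean-up after the reduction.
    return reduce(lambda acc, v: acc + (v - c) ** 2, values, 0.0)


def var_two_pass(values):
    """Population variance: one pass for the mean, one for the squared differences."""
    mean = math.fsum(values) / len(values)
    return sum_square_diff(values, mean) / len(values)


def var_single_pass(values):
    """Textbook one-pass formula (sum(v*v) - sum(v)**2 / n) / n; prone to cancellation."""
    n = len(values)
    total = sum(values)
    total_sq = sum(v * v for v in values)
    return (total_sq - total * total / n) / n


small = [float(v) for v in range(10)]
shifted = [1e9 + v for v in small]  # same spread as small, but with a large offset

# Same quantity as np.sum(np.square(x - c)) for x = arange(10), c = 1.0
print(sum_square_diff(small, 1.0))
# Subtracting the mean first keeps the large offset out of the squares
print(var_two_pass(shifted))
# Squaring before subtracting loses the small spread to rounding
print(var_single_pass(shifted))
```

<div>With c equal to the mean, the reduction above is the second pass of the two-pass method, so it should not suffer from the cancellation that affects the one-pass formula on data with a large offset.</div>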