[Numpy-discussion] ufunc for sum of squared difference

Julian Taylor jtaylor.debian at googlemail.com
Fri Nov 11 13:38:51 EST 2016


here is an old unfinished PR adding max_abs which has a similar purpose
as yours:
https://github.com/numpy/numpy/pull/5509

So far I know data is set during runtime construction and is part of the
ufunc object, so it is not really very suitable to pass in constants on
call.
Its intended purpose is pass in functions to be called in generic loops
like PyUFunc_d_d.

Having an option to mark arguments of a ufunc as special in reductions
could be useful, e.g. it would allow a potential
(fused-)multiply-and-add ufunc to be used to implement a weighted sum.

On 11.11.2016 17:25, Matthew Harrigan wrote:
> I started a ufunc to compute the sum of square differences here
> <https://gist.github.com/mattharrigan/6f678b3d6df5efd236fc23bfb59fd3bd>.  It
> is about 4x faster and uses half the memory compared to
> np.sum(np.square(x-c)).  This could significantly speed up common
> operations like std and var (where c=np.mean(x).  It faster because its
> a single pass instead of 3, and also because the inner loop is
> specialized for the common reduce case, which might be an optimization
> considering more generally.
> 
> I think I have answered my question #2&3 above.
> 
> Can someone please point me to an example where "data" was used in a
> ufunc inner loop?  How can that value be set at runtime?  Thanks
> 
> On Fri, Nov 4, 2016 at 5:33 PM, Sebastian Berg
> <sebastian at sipsolutions.net <mailto:sebastian at sipsolutions.net>> wrote:
> 
>     On Fr, 2016-11-04 at 15:42 -0400, Matthew Harrigan wrote:
>     > I didn't notice identity before.  Seems like frompyfunc always sets
>     > it to None.  If it were zero maybe it would work as desired here.
>     >
>     > In the writing your own ufunc doc, I was wondering if the pointer to
>     > data could be used to get a constant at runtime.  If not, what could
>     > that be used for?
>     > static void double_logit(char **args, npy_intp *dimensions,
>     >                             npy_intp* steps, void* data)
>     > Why would the numerical accuracy be any different?  The subtraction
>     > and square operations look identical and I thought np.sum just calls
>     > np.add.reduce, so the reduction step uses the same code and would
>     > therefore have the same accuracy.
>     >
> 
>     Sorry, did not read it carefully, I guess `c` is the mean, so you are
>     doing the two pass method.
> 
>     - Sebastian
> 
> 
>     > Thanks
>     >
>     > On Fri, Nov 4, 2016 at 1:56 PM, Sebastian Berg <sebastian at sipsolution
>     > s.net <http://s.net>> wrote:
>     > > On Fr, 2016-11-04 at 13:11 -0400, Matthew Harrigan wrote:
>     > > > I was reading this and got thinking about if a ufunc could
>     > > compute
>     > > > the sum of squared differences in a single pass without a
>     > > temporary
>     > > > array.  The python code below demonstrates a possible approach.
>     > > >
>     > > > import numpy as np
>     > > > x = np.arange(10)
>     > > > c = 1.0
>     > > > def add_square_diff(x1, x2):
>     > > >     return x1 + (x2-c)**2
>     > > > ufunc = np.frompyfunc(add_square_diff, 2, 1)
>     > > > print(ufunc.reduce(x) - x[0] + (x[0]-c)**2)
>     > > > print(np.sum(np.square(x-c)))
>     > > >
>     > > > I have (at least) 4 questions:
>     > > > 1. Is it possible to pass run time constants to a ufunc written
>     > > in C
>     > > > for use in its inner loop, and if so how?
>     > >
>     > > I don't think its anticipated, since a ufunc could in most cases
>     > > use a
>     > > third argument, but a 3 arg ufunc can't be reduced. Not sure if
>     > > there
>     > > might be some trickery possible.
>     > >
>     > > > 2. Is it possible to pass an initial value to reduce to avoid the
>     > > > clean up required for the first element?
>     > >
>     > > This is the identity normally. But the identity can only be 0, 1 or
>     > > -1
>     > > right now I think. The identity is what the output array gets
>     > > initialized with (which effectively makes it the first value passed
>     > > into the inner loop).
>     > >
>     > > > 3. Does that ufunc work, or are there special cases which cause
>     > > it to
>     > > > fall apart?
>     > > > 4. Would a very specialized ufunc such as this be considered for
>     > > > incorporating in numpy since it would help reduce time and memory
>     > > of
>     > > > functions already in numpy?
>     > > >
>     > >
>     > > Might be mixing up things, however, IIRC the single pass approach
>     > > has a
>     > > bad numerical accuracy, so that I doubt that it is a good default
>     > > algorithm.
>     > >
>     > > - Sebastian
>     > >
>     > >
>     > > > Thank you,
>     > > > Matt
>     > > > _______________________________________________
>     > > > NumPy-Discussion mailing list
>     > > > NumPy-Discussion at scipy.org <mailto:NumPy-Discussion at scipy.org>
>     > > > https://mail.scipy.org/mailman/listinfo/numpy-discussion
>     <https://mail.scipy.org/mailman/listinfo/numpy-discussion>
>     > > _______________________________________________
>     > > NumPy-Discussion mailing list
>     > > NumPy-Discussion at scipy.org <mailto:NumPy-Discussion at scipy.org>
>     > > https://mail.scipy.org/mailman/listinfo/numpy-discussion
>     <https://mail.scipy.org/mailman/listinfo/numpy-discussion>
>     > >
>     > _______________________________________________
>     > NumPy-Discussion mailing list
>     > NumPy-Discussion at scipy.org <mailto:NumPy-Discussion at scipy.org>
>     > https://mail.scipy.org/mailman/listinfo/numpy-discussion
>     <https://mail.scipy.org/mailman/listinfo/numpy-discussion>
> 
>     _______________________________________________
>     NumPy-Discussion mailing list
>     NumPy-Discussion at scipy.org <mailto:NumPy-Discussion at scipy.org>
>     https://mail.scipy.org/mailman/listinfo/numpy-discussion
>     <https://mail.scipy.org/mailman/listinfo/numpy-discussion>
> 
> 
> 
> 
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> https://mail.scipy.org/mailman/listinfo/numpy-discussion
> 




More information about the NumPy-Discussion mailing list