[Numpy-discussion] ufunc for sum of squared difference

Matthew Harrigan harrigan.matthew at gmail.com
Fri Nov 11 16:06:15 EST 2016


My possibly poorly thought out API would be something like
np.add_sq_diff.set_c(42.0).reduce(x), basically add a setter method which
returned self.  Thanks for explaining what data was intended to do.  All of
this is really just a dance around the requirement that reduce only works
on binary functions.  Would there be any interest in extending reduce to
allow for arbitrary numbers of input and output arguments?

On Fri, Nov 11, 2016 at 1:38 PM, Julian Taylor <
jtaylor.debian at googlemail.com> wrote:

> here is an old unfinished PR adding max_abs which has a similar purpose
> as yours:
> https://github.com/numpy/numpy/pull/5509
>
> So far I know data is set during runtime construction and is part of the
> ufunc object, so it is not really very suitable to pass in constants on
> call.
> Its intended purpose is pass in functions to be called in generic loops
> like PyUFunc_d_d.
>
> Having an option to mark arguments of a ufunc as special in reductions
> could be useful, e.g. it would allow a potential
> (fused-)multiply-and-add ufunc to be used to implement a weighted sum.
>
> On 11.11.2016 17:25, Matthew Harrigan wrote:
> > I started a ufunc to compute the sum of square differences here
> > <https://gist.github.com/mattharrigan/6f678b3d6df5efd236fc23bfb59fd3bd>.
> It
> > is about 4x faster and uses half the memory compared to
> > np.sum(np.square(x-c)).  This could significantly speed up common
> > operations like std and var (where c=np.mean(x).  It faster because its
> > a single pass instead of 3, and also because the inner loop is
> > specialized for the common reduce case, which might be an optimization
> > considering more generally.
> >
> > I think I have answered my question #2&3 above.
> >
> > Can someone please point me to an example where "data" was used in a
> > ufunc inner loop?  How can that value be set at runtime?  Thanks
> >
> > On Fri, Nov 4, 2016 at 5:33 PM, Sebastian Berg
> > <sebastian at sipsolutions.net <mailto:sebastian at sipsolutions.net>> wrote:
> >
> >     On Fr, 2016-11-04 at 15:42 -0400, Matthew Harrigan wrote:
> >     > I didn't notice identity before.  Seems like frompyfunc always sets
> >     > it to None.  If it were zero maybe it would work as desired here.
> >     >
> >     > In the writing your own ufunc doc, I was wondering if the pointer
> to
> >     > data could be used to get a constant at runtime.  If not, what
> could
> >     > that be used for?
> >     > static void double_logit(char **args, npy_intp *dimensions,
> >     >                             npy_intp* steps, void* data)
> >     > Why would the numerical accuracy be any different?  The subtraction
> >     > and square operations look identical and I thought np.sum just
> calls
> >     > np.add.reduce, so the reduction step uses the same code and would
> >     > therefore have the same accuracy.
> >     >
> >
> >     Sorry, did not read it carefully, I guess `c` is the mean, so you are
> >     doing the two pass method.
> >
> >     - Sebastian
> >
> >
> >     > Thanks
> >     >
> >     > On Fri, Nov 4, 2016 at 1:56 PM, Sebastian Berg
> <sebastian at sipsolution
> >     > s.net <http://s.net>> wrote:
> >     > > On Fr, 2016-11-04 at 13:11 -0400, Matthew Harrigan wrote:
> >     > > > I was reading this and got thinking about if a ufunc could
> >     > > compute
> >     > > > the sum of squared differences in a single pass without a
> >     > > temporary
> >     > > > array.  The python code below demonstrates a possible approach.
> >     > > >
> >     > > > import numpy as np
> >     > > > x = np.arange(10)
> >     > > > c = 1.0
> >     > > > def add_square_diff(x1, x2):
> >     > > >     return x1 + (x2-c)**2
> >     > > > ufunc = np.frompyfunc(add_square_diff, 2, 1)
> >     > > > print(ufunc.reduce(x) - x[0] + (x[0]-c)**2)
> >     > > > print(np.sum(np.square(x-c)))
> >     > > >
> >     > > > I have (at least) 4 questions:
> >     > > > 1. Is it possible to pass run time constants to a ufunc written
> >     > > in C
> >     > > > for use in its inner loop, and if so how?
> >     > >
> >     > > I don't think its anticipated, since a ufunc could in most cases
> >     > > use a
> >     > > third argument, but a 3 arg ufunc can't be reduced. Not sure if
> >     > > there
> >     > > might be some trickery possible.
> >     > >
> >     > > > 2. Is it possible to pass an initial value to reduce to avoid
> the
> >     > > > clean up required for the first element?
> >     > >
> >     > > This is the identity normally. But the identity can only be 0, 1
> or
> >     > > -1
> >     > > right now I think. The identity is what the output array gets
> >     > > initialized with (which effectively makes it the first value
> passed
> >     > > into the inner loop).
> >     > >
> >     > > > 3. Does that ufunc work, or are there special cases which cause
> >     > > it to
> >     > > > fall apart?
> >     > > > 4. Would a very specialized ufunc such as this be considered
> for
> >     > > > incorporating in numpy since it would help reduce time and
> memory
> >     > > of
> >     > > > functions already in numpy?
> >     > > >
> >     > >
> >     > > Might be mixing up things, however, IIRC the single pass approach
> >     > > has a
> >     > > bad numerical accuracy, so that I doubt that it is a good default
> >     > > algorithm.
> >     > >
> >     > > - Sebastian
> >     > >
> >     > >
> >     > > > Thank you,
> >     > > > Matt
> >     > > > _______________________________________________
> >     > > > NumPy-Discussion mailing list
> >     > > > NumPy-Discussion at scipy.org <mailto:NumPy-Discussion at scipy.org>
> >     > > > https://mail.scipy.org/mailman/listinfo/numpy-discussion
> >     <https://mail.scipy.org/mailman/listinfo/numpy-discussion>
> >     > > _______________________________________________
> >     > > NumPy-Discussion mailing list
> >     > > NumPy-Discussion at scipy.org <mailto:NumPy-Discussion at scipy.org>
> >     > > https://mail.scipy.org/mailman/listinfo/numpy-discussion
> >     <https://mail.scipy.org/mailman/listinfo/numpy-discussion>
> >     > >
> >     > _______________________________________________
> >     > NumPy-Discussion mailing list
> >     > NumPy-Discussion at scipy.org <mailto:NumPy-Discussion at scipy.org>
> >     > https://mail.scipy.org/mailman/listinfo/numpy-discussion
> >     <https://mail.scipy.org/mailman/listinfo/numpy-discussion>
> >
> >     _______________________________________________
> >     NumPy-Discussion mailing list
> >     NumPy-Discussion at scipy.org <mailto:NumPy-Discussion at scipy.org>
> >     https://mail.scipy.org/mailman/listinfo/numpy-discussion
> >     <https://mail.scipy.org/mailman/listinfo/numpy-discussion>
> >
> >
> >
> >
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion at scipy.org
> > https://mail.scipy.org/mailman/listinfo/numpy-discussion
> >
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> https://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20161111/8d83ff1d/attachment.html>


More information about the NumPy-Discussion mailing list