[Numpy-discussion] PR to add an initializer kwarg to ufunc.reduce (and similar functions)

Sebastian Berg sebastian at sipsolutions.net
Mon Mar 26 12:48:47 EDT 2018


On Mon, 2018-03-26 at 11:53 -0400, Hameer Abbasi wrote:
> It'll need to be thought out for object arrays and subclasses. But
> for
> Regular numeric stuff, Numpy uses fmin and this would have the
> desired
> effect.

I do not want to block this, but I would like a clearer opinion about
this issue, `np.nansum` as Benjamin noted would require something like:

np.nansum([np.nan], default=np.nan)

because

np.sum([1], initializer=np.nan)
np.nansum([1], initializer=np.nan)

would both give NaN if the logic is the same as the current `np.sum`.
And yes, I guess for fmin/fmax NaN happens to work. And then there are
many nonsense reduces which could make sense with `initializer`.

Now nansum is not implemented in a way that could make use of the new
kwarg anyway, so maybe it does not matter in some sense. We can in
principle use `default` in nansum and at some point possibly add
`default` to the normal ufuncs. If we argue like that, the only
annoying thing is the `object` dtype which confuses the two use cases
currently.

This confusion IMO is not harmless, because I might want to use it
(e.g. sum with initializer=5), and I would expect things like dropping
in `decimal.Decimal` to work most of the time, while here it would give
silently bad results.

- Sebastian





> On 26/03/2018 at 17:45, Sebastian wrote: On Mon, 2018-03-26 at
> 11:39 -0400, Hameer Abbasi wrote: That is the idea, but NaN functions
> are in a separate branch for another PR to be discussed later. You
> can
> see it on my fork, if you're interested. Except that as far as I
> understand I am not sure it will help much with it, since it is not a
> default, but an initializer. Initializing to NaN would just make all
> results NaN. - Sebastian On 26/03/2018 at 17:35, Benjamin wrote: Hmm,
> this is neat. I imagine it would finally give some people a choice on
> what np.nansum([np.nan]) should return? It caused a huge hullabeloo a
> few years ago when we changed it from returning NaN to returning
> zero.
> Ben Root On Mon, Mar 26, 2018 at 11:16 AM, Sebastian Berg
> <sebastian at sipsolutions.net> wrote: OK, the new documentation is
> actually clear: initializer : scalar, optional The value with which
> to
> start the reduction. Defaults to the `~numpy.ufunc.identity` of the
> ufunc. If ``None`` is given, the first element of the reduction is
> used, and an error is thrown if the reduction is empty. If
> ``a.dtype``
> is ``object``, then the initializer is _only_ used if reduction is
> empty. I would actually like to say that I do not like the object
> special case much (and it is probably the reason why I was confused),
> nor am I quite sure this is what helps a lot? Logically, I would
> argue
> there are two things: 1. initializer/start (always used) 2. default
> (oly used for empty reductions) For example, I might like to give
> `np.nan` as the default for some empty reductions, this will not
> work.
> I understand that this is a minimal invasive PR and I am not sure I
> find the solution bad enough to really dislike it, but what do other
> think? My first expectation was the default behaviour (in all cases,
> not just object case) for some reason. To be honest, for now I just
> wonder a bit: How hard would it be to do both, or is that too
> annoying? It would at least get rid of that annoying thing with
> object
> ufuncs (which currently have a default, but not really an
> identity/initializer). Best, Sebastian On Mon, 2018-03-26 at 08:20
> -0400, Hameer Abbasi wrote: > Actually, the behavior right now isn’t
> that of `default` but that of > `initializer` or `start`. > > This
> was
> discussed further down in the PR but to reiterate: > `np.sum([10],
> initializer=5)` becomes `15`. > > Also, `np.min([5], initializer=0)`
> becomes `0`, so it isn’t really > the default value, it’s the initial
> value among which the reduction > is performed. > > This was the
> reason to call it initializer in the first place. I like > `initial`
> and `initial_value` as well, and `start` also makes sense > but isn’t
> descriptive enough. > > Hameer > Sent from Astro for Mac > > > On Mar
> 26, 2018 at 12:06, Sebastian Berg <sebastian at sipsolutions.ne > > t>
> wrote: > > > > Initializer or this sounds fine to me. As an other
> data
> point which > > I > > think has been mentioned before, `sum` uses
> start and min/max use > > default. `start` does not work, unless we
> also change the code to > > always use the identity if given
> (currently that is not the case), > > in > > which case it might be
> nice. However, "start" seems a bit like > > solving > > a different
> issue in any case. > > > > Anyway, mostly noise. I really like adding
> this, the only thing > > worth > > discussing a bit is the name :). >
> - Sebastian > > > > > > On Mon, 2018-03-26 at 05:57 -0400, Hameer
> Abbasi wrote: > > > It calls it `initializer` - See
> https://docs.python.org/3.5/libra > > > ry/f > > >
> unctools.html#functools.reduce > > > > > > Sent from Astro for Mac >
> On Mar 26, 2018 at 09:54, Eric Wieser <wieser.eric+numpy at gmail. > > >
> > com> > > > > wrote: > > > > > > > > It turns out I mispoke -
> 
> functools.reduce calls the argument > > > > `initial` > > > > > > > >
> On Mon, 26 Mar 2018 at 00:17 Stephan Hoyer <shoyer at gmail.com> > > > >
> wrote: > > > > > This looks like a very logical addition to the
> reduce
> > > > > > interface. > > > > > It has my support! > > > > > > > > > >
> 
> I would have preferred the more descriptive name > > > > >
> "initial_value", > > > > > but consistency with functools.reduce
> makes
> a compelling case > > > > > for > > > > > "initializer". > > > > > >
> >
> > > > On Sun, Mar 25, 2018 at 1:15 PM Eric Wieser <wieser.eric+nump >
> > > > > y at gm > > > > > ail.com> wrote: To reiterate my comments in
> > > > > the
> 
> issue - I'm in favor of > this. > > > > > > > > > > > > It seems seem
> especially valuable for identity-less > > > > > > functions > > > > >
> > (`min`, `max`, `lcm`), and the argument name is consistent > > > >
> > >
> > with > `functools.reduce`. too. > > > > > > > > > > > > The only
> 
> argument I can see against merging this would be > > > > > >
> `kwarg`-creep of `reduce`, and I think this has enough use > > > > >
> cases to justify that. > > > > > > > > > > > > I'd like to merge in a
> few days, if no one else has any > > > > > > opinions. > > > > > > >
> Eric > > > > > > > > > > > > On Fri, 16 Mar 2018 at 10:13 Hameer
> Abbasi <einstein.edison > > > > > > @gma > > > > > > il.com> wrote: >
> > > > > > > Hello, everyone. I’ve submitted a PR to add a initializer
> > > > > > > > kwarg to ufunc.reduce. This is useful in a few cases, >
> > > > > > > 
> > > > > > > e.g., > > > > > > > it allows one to supply a “default”
> 
> value for identity- > > > > > > > less > > > > > > > ufunc
> reductions,
> and specify an initial value for > > > > > > > reductions such as sum
> (other than zero.) > > > > > > > > > > > > Please feel free to review
> or leave feedback, (although I > > > > > think Eric and Marten have
> picked it apart pretty well). > > > >
> https://github.com/numpy/numpy/pull/10635 > > > > > Thanks, > > > > >
> > > > > > > > > > Hameer > > > > Sent from Astro for Mac > > > > > >
> > > > > > > > > > >
> > > > > > > > 
> > > > > > > > _______________________________________________ > > > >
> > > > 
> > > > NumPy-Discussion mailing list > > > > > > >
> 
> NumPy-Discussion at python.org > > > > > > >
> https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > >
> _______________________________________________ > > > > >
> NumPy-Discussion mailing list > > > > > > NumPy-Discussion at python.org
> > > > > > > https://mail.python.org/mailman/listinfo/numpy-discussion
> > > > > > > _______________________________________________ > > > > >
> 
> NumPy-Discussion mailing list > > > > > NumPy-Discussion at python.org >
> https://mail.python.org/mailman/listinfo/numpy-discussion > >
> _______________________________________________ > > > >
> NumPy-Discussion mailing list > > > > NumPy-Discussion at python.org > >
> https://mail.python.org/mailman/listinfo/numpy-discussion > > > >
> _______________________________________________ > > > NumPy-
> Discussion
> mailing list > > > NumPy-Discussion at python.org > > >
> https://mail.python.org/mailman/listinfo/numpy-discussion > > > >
> _______________________________________________ > > NumPy-Discussion
> mailing list > > NumPy-Discussion at python.org > >
> https://mail.python.org/mailman/listinfo/numpy-discussion > >
> _______________________________________________ > NumPy-Discussion
> mailing list > NumPy-Discussion at python.org >
> https://mail.python.org/mailman/listinfo/numpy-discussion
> _______________________________________________ NumPy-Discussion
> mailing list NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
> _______________________________________________ NumPy-Discussion
> mailing list NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: This is a digitally signed message part
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20180326/2e79573b/attachment.sig>


More information about the NumPy-Discussion mailing list