The difficulty in supporting object arrays is that func.reduce(arr, initial=func.identity)
and func.reduce(arr)
have different meanings - whereas with the current patch, they are equivalent.
On Mon, 2018-03-26 at 12:59 -0400, Hameer Abbasi wrote:
> That may be complicated. Currently, the identity isn't used in object
> dtype reductions. We may need to change that, which could cause a
> whole lot of other backwards incompatible changes. For example, sum
> actually including zero in object reductions. Or we could pass in a
> flag saying an initializer was passed in to change that behaviour. If
> this is agreed upon and someone is kind enough to point me to the
> code, I'd be willing to make this change.
I realize the implication, I am not suggesting to change the default
behaviour (when no initial=... is passed), I would think about
deprecating it, but probably only if we also have the `default`
argument, since otherwise you cannot replicate the old behaviour.
What I think I would like to see is to change how it works if (and only
if) the initializer is passed in. Yes, this will require holding on to
some extra information since you will have to know/remember whether the
"identity" was passed in or defined otherwise.
I did not check the code, but I would hope that it is not awfully
tricky to do that.
- Sebastian
PS: A side note, but I see your emails as a single block of text with
no/broken new-lines.
> On 26/03/2018 at 18:54,
> Sebastian wrote: On Mon, 2018-03-26 at 18:48 +0200, Sebastian Berg
> wrote: On Mon, 2018-03-26 at 11:53 -0400, Hameer Abbasi wrote: It'll
> need to be thought out for object arrays and subclasses. But for
> Regular numeric stuff, Numpy uses fmin and this would have the
> desired
> effect. I do not want to block this, but I would like a clearer
> opinion about this issue, `np.nansum` as Benjamin noted would require
> something like: np.nansum([np.nan], default=np.nan) because
> np.sum([1], initializer=np.nan) np.nansum([1], initializer=np.nan)
> would both give NaN if the logic is the same as the current `np.sum`.
> And yes, I guess for fmin/fmax NaN happens to work. And then there
> are
> many nonsense reduces which could make sense with `initializer`. Now
> nansum is not implemented in a way that could make use of the new
> kwarg anyway, so maybe it does not matter in some sense. We can in
> principle use `default` in nansum and at some point possibly add
> `default` to the normal ufuncs. If we argue like that, the only
> annoying thing is the `object` dtype which confuses the two use cases
> currently. This confusion IMO is not harmless, because I might want
> to
> use it (e.g. sum with initializer=5), and I would expect things like
> dropping in `decimal.Decimal` to work most of the time, while here it
> would give silently bad results. In other words: I am very very much
> in favor if you get rid that object dtype special case. I frankly not
> see why not (except that it needs a bit more code change). If given
> explicitly, we might as well force the use and not do the funny stuff
> which is designed to be more type agnostic! If it happens to fail due
> to not being type agnostic, it will at least fail loudly. If you
> leave
> that object special case I am *very* hesitant about it. That I think
> I
> would like a `default` argument as well, is another issue and it can
> wait to another day. - Sebastian - Sebastian On 26/03/2018 at 17:45,
> Sebastian wrote: On Mon, 2018-03-26 at 11:39 -0400, Hameer Abbasi
> wrote: That is the idea, but NaN functions are in a separate branch
> for another PR to be discussed later. You can see it on my fork, if
> you're interested. Except that as far as I understand I am not sure
> it
> will help much with it, since it is not a default, but an
> initializer.
> Initializing to NaN would just make all results NaN. - Sebastian On
> 26/03/2018 at 17:35, Benjamin wrote: Hmm, this is neat. I imagine it
> would finally give some people a choice on what np.nansum([np.nan])
> should return? It caused a huge hullabeloo a few years ago when we
> changed it from returning NaN to returning zero. Ben Root On Mon, Mar
> 26, 2018 at 11:16 AM, Sebastian Berg <sebastian@sipsolutions.net>
> wrote: OK, the new documentation is actually clear: initializer :
> scalar, optional The value with which to start the reduction.
> Defaults
> to the `~numpy.ufunc.identity` of the ufunc. If ``None`` is given,
> the
> first element of the reduction is used, and an error is thrown if the
> reduction is empty. If ``a.dtype`` is ``object``, then the
> initializer
> is _only_ used if reduction is empty. I would actually like to say
> that I do not like the object special case much (and it is probably
> the reason why I was confused), nor am I quite sure this is what
> helps
> a lot? Logically, I would argue there are two things: 1.
> initializer/start (always used) 2. default (oly used for empty
> reductions) For example, I might like to give `np.nan` as the default
> for some empty reductions, this will not work. I understand that this
> is a minimal invasive PR and I am not sure I find the solution bad
> enough to really dislike it, but what do other think? My first
> expectation was the default behaviour (in all cases, not just object
> case) for some reason. To be honest, for now I just wonder a bit: How
> hard would it be to do both, or is that too annoying? It would at
> least get rid of that annoying thing with object ufuncs (which
> currently have a default, but not really an identity/initializer).
> Best, Sebastian On Mon, 2018-03-26 at 08:20 -0400, Hameer Abbasi
> wrote: > Actually, the behavior right now isn’t that of `default` but
> that of > `initializer` or `start`. > > This was discussed further
> down in the PR but to reiterate: > `np.sum([10], initializer=5)`
> becomes `15`. > > Also, `np.min([5], initializer=0)` becomes `0`, so
> it isn’t really > the default value, it’s the initial value among
> which the reduction > is performed. > > This was the reason to call
> it
> initializer in the first place. I like > `initial` and
> `initial_value`
> as well, and `start` also makes sense > but isn’t descriptive enough.
> > > Hameer > Sent from Astro for Mac > > > On Mar 26, 2018 at 12:06,
>
> Sebastian Berg <sebastian@sipsolutions.ne > > t> wrote: > > > >
> Initializer or this sounds fine to me. As an other data point which >
> > I > > think has been mentioned before, `sum` uses start and min/max
>
> use > > default. `start` does not work, unless we also change the
> code
> to > > always use the identity if given (currently that is not the
> case), > > in > > which case it might be nice. However, "start" seems
> a bit like > > solving > > a different issue in any case. > > > >
> Anyway, mostly noise. I really like adding this, the only thing > >
> worth > > discussing a bit is the name :). - Sebastian > > > > > > On
> Mon, 2018-03-26 at 05:57 -0400, Hameer Abbasi wrote: > > > It calls
> it
> `initializer` - See https://docs.python.org/3.5/libra > > > ry/f > >
> >
> unctools.html#functools.reduce > > > > > > Sent from Astro for Mac On
> Mar 26, 2018 at 09:54, Eric Wieser <wieser.eric+numpy@gmail. > > com>
> > > > > wrote: > > > > > > > > It turns out I mispoke -
>
> functools.reduce calls the argument > > > > `initial` > > > > > > >
> On
> Mon, 26 Mar 2018 at 00:17 Stephan Hoyer <shoyer@gmail.com> > > >
> wrote: > > > > > This looks like a very logical addition to the
> reduce
> interface. > > > > > It has my support! > > > > > > > > > I would
> have
> preferred the more descriptive name > > > > > "initial_value", > > >
> >
> > but consistency with functools.reduce makes a compelling case > > >
> > > for > > > > > "initializer". > > > > > On Sun, Mar 25, 2018 at
>
> 1:15 PM Eric Wieser <wieser.eric+nump y@gm > > > > > ail.com> wrote:
> To reiterate my comments in the issue - I'm in favor of > this. > > >
> > > > > > > > > > It seems seem especially valuable for identity-less
> > > > > > >
> > > > > > > functions > > > > (`min`, `max`, `lcm`), and the argument
>
> name is consistent > > > with > `functools.reduce`. too. > > > > > >
> >
> > > > > > The only argument I can see against merging this would be >
> > > > > > `kwarg`-creep of `reduce`, and I think this has enough use
> > > > > > >
> > > >
> > > > cases to justify that. > > > > > > > > > > > > I'd like to
> > > > merge
>
> in a few days, if no one else has any > > > > > > opinions. > > > > >
> > Eric > > > > > > > > > > > > On Fri, 16 Mar 2018 at 10:13 Hameer
>
> Abbasi <einstein.edison > > > > > > @gma > > > > > > il.com> wrote:
> Hello, everyone. I’ve submitted a PR to add a initializer kwarg to
> ufunc.reduce. This is useful in a few cases, e.g., > > > > > > > it
> allows one to supply a “default” value for identity- > > > > > > >
> less > > > > > > > ufunc reductions, and specify an initial value for
> > > > > > > > reductions such as sum (other than zero.) > > > > > > >
> > > > > >
> > > > > > Please feel free to review or leave feedback, (although I >
> > > > >
> > > > > think Eric and Marten have picked it apart pretty well). > >
> > > > > >
> >
> > https://github.com/numpy/numpy/pull/10635 > > > > > Thanks, > > > >
>
> Hameer > > > > Sent from Astro for Mac > > > > >
> _______________________________________________ > > > NumPy-
> Discussion
> mailing list > > > > > > > NumPy-Discussion@python.org > > > > > > >
> https://mail.python.org/mailman/listinfo/numpy-discussion > > > > >
> _______________________________________________ > > > > >
> NumPy-Discussion mailing list > > > > > > NumPy-Discussion@python.o
> rg
> https://mail.python.org/mailman/listinfo/numpy-discussi on
> _______________________________________________ > > > >
> NumPy-Discussion mailing list > > > > > NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion > >
> _______________________________________________ > > > >
> NumPy-Discussion mailing list > > > > NumPy-Discussion@python.org >
> https://mail.python.org/mailman/listinfo/numpy-discussion > > > >
> _______________________________________________ > > > NumPy-
> Discussion mailing list > > > NumPy-Discussion@python.org > > >
> https://mail.python.org/mailman/listinfo/numpy-discussion > > > >
> _______________________________________________ > > NumPy- Discussion
> mailing list > > NumPy-Discussion@python.org > >
> https://mail.python.org/mailman/listinfo/numpy-discussion > >
> _______________________________________________ > NumPy-Discussion
> mailing list > NumPy-Discussion@python.org >
> https://mail.python.org/mailman/listinfo/numpy-discussion
> _______________________________________________ NumPy-Discussion
> mailing list NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
> _______________________________________________ NumPy-Discussion
> mailing list NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
> _______________________________________________ NumPy-Discussion
> mailing list NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
> _______________________________________________ NumPy-Discussion
> mailing list NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion