[Numpy-discussion] Experimental `like=` attribute for array creation functions

Fri Aug 14 07:21:17 EDT 2020

Hi all,

This thread has IMO drifted very far from its original purpose, so I
decided to start a new thread specifically for the general NEP procedure
discussion. Please check your mail for the subject "NEP Procedure Discussion".

On the topic of this thread, I'll try to rewrite NEP-35 to make it more
accessible and ping back here once I have a PR for that. Is there anything
else that's pressing here? If there is and I missed/forgot about it, please
let me know.
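
Since the naming question came up, a minimal sketch of the behaviour being
discussed may help. It assumes a NumPy version where `like=` is available
(1.20 or later), and `MyArray` is a hypothetical duck array made up purely
for illustration; it is not part of NumPy or any downstream library.

```python
import numpy as np


class MyArray:
    """Hypothetical duck array: a thin wrapper around a NumPy array."""

    def __init__(self, data):
        self.data = np.asarray(data)

    def __array_function__(self, func, types, args, kwargs):
        # NumPy strips `like=` before calling this hook, so re-running
        # `func` here performs the plain NumPy creation; we then wrap it.
        return MyArray(func(*args, **kwargs))


ref = MyArray([[0.0, 0.0], [0.0, 0.0]])  # reference object, shape (2, 2)

# `like=` takes the *type* of the result from `ref`, not its shape.
made = np.asarray([1, 2, 3], like=ref)
print(type(made).__name__, made.data.shape)

# `ones_like` is the function that mimics shape and dtype.
same_shape = np.ones_like(ref.data)
print(type(same_shape).__name__, same_shape.shape)
```

Under those assumptions the first call should hand back a `MyArray` holding
three elements, while `ones_like` returns a plain (2, 2) ndarray, which is
the distinction behind the `like=` versus `typeof=` naming concern.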

Best,
Peter

On Fri, Aug 14, 2020 at 5:00 AM Juan Nunez-Iglesias <jni at fastmail.com>
wrote:

> Hello everyone again!
>
> A few clarifications about my proposal of external peer review:
>
> - Yes, all this work is public and announced on the mailing list. However,
> I don’t think there’s a single person in this discussion or even this whole
> ecosystem that does not have a more immediately-pressing and also virtually
> infinite to-do list, so it’s unreasonable to expect that generally they
> would do more than glance at the stuff in the mailing list. In the peer
> review analogy, the mailing list is like the arXiv or Biorxiv stream — yep,
> anyone can see the stuff on there and comment, but most people just don’t
> have the time or attention to grab onto that. The only reason I stopped to
> comment here is Sebastian’s “Imma merge, YOLO!”, which had me raising my
> eyebrows real high. 😂 Especially for something that would expand the NumPy
> API!
>
> - So, my proposal is that there needs to be an *editor* of NEPs who takes
> responsibility, once they are themselves satisfied with the NEP, for
> seeking out external reviewers and pinging them individually and asking
> them if they would be ok to review.
>
> - A good friend who does screenwriting once told me, “don’t use all your
> proofreaders at once”. You want to get feedback, improve things, then get
> feedback from a *totally independent* new person who can see the document
> with fresh eyes.
>
> Obviously, all of the above slows things down. But “alone we go fast,
> together we go far”. The point of a NEP is to document critical decisions
> for the long term health of the project. If the documentation is
> insufficient, it defeats the whole purpose. Might as well just implement
> stuff and skip the whole NEP process. (Side note: Stephan, I for one would
> definitely appreciate an update to existing NEPs if there’s obvious ways
> they can be improved!)
>
> I do think that NEP templates should be strict, and I don’t think that is
> incompatible with plain, jargon-free text. The NEP template and guidelines
> should specify that, and that the motivation should be understandable by a
> casual NumPy user — the kind described by Ilhan, for whom bare NumPy
> actually meets all their needs. Maybe they’ve also used PyTorch but they’ve
> never really had cause to mix them or write a program that worked with both
> kinds of arrays.
>
> Ditto for backwards compatibility — everyone should be clear when their
> existing code is going to be broken. Actually NEP18 broke so much of my
> code, but its Backward compatibility section basically says all good!
> https://numpy.org/neps/nep-0018-array-function-protocol.html#backward-compatibility
>
>
> Anywho, as always, none of this is criticism of the work done; I thank you
> all, and am eternally grateful for all the hard work everyone is doing to
> keep the ecosystem from fragmenting. I’m just hoping that this discussion
> can improve the process going forward!
>
> And, yes, apologies to Peter, I know from repeated personal experience how
> frustrating it can be to have last-minute drive-by objections after months
> of consensus building! But I think in the end every time that happened the
> end result was better — I hope the same is true here! And yes, I’ll
> reiterate Ralf’s point: my concerns are about the NEP process itself rather
> than this one. I’ll summarise my proposal:
>
> - strict NEP template. NEPs with missing sections will not be accepted.
> - sections Abstract, Motivation, and Backwards Compatibility should be
> understandable at a high level by casual users with ~zero background on the
> topic
> - enforce the above with at least two independent rounds of coordinated
> peer review.
>
> Thank you,
>
> Juan.
>
> On 14 Aug 2020, at 5:29 am, Stephan Hoyer <shoyer at gmail.com> wrote:
>
> On Thu, Aug 13, 2020 at 5:22 AM Ralf Gommers <ralf.gommers at gmail.com>
> wrote:
>
>> Thanks for raising these concerns Ilhan and Juan, and for answering
>> Peter. Let me give my perspective as well.
>>
>> To start with, this is not specifically about Peter's NEP and PR. NEP 35
>> simply follows the pattern set by previous PRs, and given its tight scope
>> is less difficult to understand than other NEPs on such technical topics.
>> Peter has done a lot of things right, and is close to the finish line.
>>
>>
>> On Thu, Aug 13, 2020 at 12:02 PM Peter Andreas Entschev <
>> peter at entschev.com> wrote:
>>
>>>
>>> > I think arriving at an agreement would be much faster if there were an
>>> executive summary of who this is intended for and what the regular usage
>>> is. Because, no offense, all I see is "dispatch", "__array_function__"
>>> and a lot of technical details of which I am absolutely ignorant.
>>>
>>> This is what I intended to do in the Usage Guidance [2] section. Could
>>> you elaborate on what more information you'd want to see there? Or is
>>> it just a matter of reorganizing the NEP a bit to try and summarize
>>> such things right at the top?
>>>
>>
>> We adapted the NEP template [6] several times last year to try and
>> improve this. We also specified in there that NEP content sent to the
>> mailing list should only contain the sections: Abstract, Motivation and
>> Scope, Usage and Impact, and Backwards compatibility. This is to ensure we
>> fully understand the "why" and "what" before the "how". Unfortunately that
>> template and procedure haven't been exercised much yet, only in NEP 38 [7]
>> and partially in NEP 41 [8].
>>
>> If we have long-time maintainers of SciPy (Ilhan and myself),
>> scikit-image (Juan) and CuPy (Leo, on the PR review) all saying they don't
>> understand the goals, relevance, target audience, or how they're supposed
>> to use a new feature, that indicates that the people doing the writing and
>> having the discussion are doing something wrong at a very fundamental level.
>>
>>
>> At this point I'm pretty disappointed in and tired of how we write and
>> discuss NEPs on technical topics like dispatching, dtypes and the like.
>> People literally refuse to write down concrete motivations, goals and
>> non-goals, code that's problematic now and will be better/working post-NEP
>> and usage examples before launching into extensive discussion of the gory
>> details of the internals. I'm not sure what to do about it. Completely
>> separate API and behavior proposals from implementation proposals? Make
>> separate "API" and "internals" teams with the likes of Juan, Ilhan and Leo
>> on the API team which then needs to approve every API change in new NEPs?
>> Offer to co-write NEPs if someone is willing but doesn't understand how to
>> go about it? Keep the current structure/process but veto further approvals
>> until NEP authors get it right?
>>
>
> I think the NEP template is great, and we should try to be more diligent
> about following it!
>
> My own NEP 37 (__array_module__) is probably a good example of poor
> presentation due to not following the template structure. It goes pretty
> deep into low-level motivation and some implementation details before usage
> examples.
>
> Speaking just for myself, I would have appreciated a friendly nudge to use
> the template. Certainly I think it would be fine to require using the
> template for newly submitted NEPs. I did not remember about it when I
> started drafting NEP 37, and it definitely would have helped. I may still
> try to do a revision at some point to use the template structure.
>
>
>> I want to make an exception for merging the current NEP, for which the
>> plan is to merge it as experimental to try in downstream PRs and get more
>> experience. That does mean that master will be in an unreleasable state by
>> the way, which is unusual and it'd be nice to get Chuck's explicit OK for
>> that. But after that, I think we need a change here. I would like to hear
>> what everyone thinks is the shape that change should take - any of my above
>> suggestions, or something else?
>>
>>
>>
>>> > Finally as a minor point, I know we are mostly (ex-)academics but this
>>> necessity of formal language on NEPs is self-imposed (probably PEPs are to
>>> blame) and not quite helping. It can be a bit more descriptive in my
>>> external opinion.
>>>
>>> TBH, I don't really know how to solve that point, so if you have any
>>> specific suggestions, that's certainly welcome. I understand the
>>> frustration for a reader trying to understand all the details, with
>>> many being only described in NEP-18 [3], but we also strive to avoid
>>> rewriting things that are written elsewhere, which would also
>>> overburden those who are aware of what's being discussed.
>>>
>>>
>>> > I also share Ilhan’s concern (and I mentioned this in a previous NEP
>>> discussion) that NEPs are getting pretty inaccessible. In a sense these are
>>> difficult topics and readers should be expected to have *some* familiarity
>>> with the topics being discussed, but perhaps more effort should be put into
>>> the context/motivation/background of a NEP before accepting it. One way to
>>> ensure this might be to require a final proofreading step by someone who
>>> has not been involved at all in the discussions, like peer review does for
>>> papers.
>>>
>>
>> Some variant of this proposal would be my preference.
>>
>> Cheers,
>> Ralf
>>
>>
>>> [1] https://github.com/numpy/numpy/issues/14441#issuecomment-529969572
>>> [2] https://numpy.org/neps/nep-0035-array-creation-dispatch-with-array-function.html#usage-guidance
>>> [3] https://numpy.org/neps/nep-0018-array-function-protocol.html
>>> [4] https://numpy.org/neps/nep-0000.html#nep-workflow
>>> [5] https://mail.python.org/pipermail/numpy-discussion/2019-October/080176.html
>>
>>
>> [6] https://github.com/numpy/numpy/blob/master/doc/neps/nep-template.rst
>> [7] https://github.com/numpy/numpy/blob/master/doc/neps/nep-0038-SIMD-optimizations.rst
>> [8] https://github.com/numpy/numpy/blob/master/doc/neps/nep-0041-improved-dtype-support.rst
>>
>>
>>
>>>
>>>
>>> On Thu, Aug 13, 2020 at 3:44 AM Juan Nunez-Iglesias <jni at fastmail.com>
>>> wrote:
>>> >
>>> > I’ve generally been on the “let the NumPy devs worry about it” side of
>>> things, but I do agree with Ilhan that `like=` is confusing and `typeof=`
>>> would be a much more appropriate name for that parameter.
>>> >
>>> > I do think library writers are NumPy users and so I wouldn’t really
>>> make that distinction, though. Users writing their own analysis code could
>>> very well be interested in writing code using numpy functions that will
>>> transparently work when the input is a CuPy array or whatever.
>>> >
>>> > I also share Ilhan’s concern (and I mentioned this in a previous NEP
>>> discussion) that NEPs are getting pretty inaccessible. In a sense these are
>>> difficult topics and readers should be expected to have *some* familiarity
>>> with the topics being discussed, but perhaps more effort should be put into
>>> the context/motivation/background of a NEP before accepting it. One way to
>>> ensure this might be to require a final proofreading step by someone who
>>> has not been involved at all in the discussions, like peer review does for
>>> papers.
>>> >
>>> > Food for thought.
>>> >
>>> > Juan.
>>> >
>>> > On 13 Aug 2020, at 9:24 am, Ilhan Polat <ilhanpolat at gmail.com> wrote:
>>> >
>>> > For what it's worth, as a potential consumer in SciPy: neither the NEP
>>> nor the PR really says anything about how the regular users of NumPy
>>> will benefit from this. If only 3rd parties are going to benefit
>>> from it, I am not sure adding a new keyword to an already confusing
>>> function is the right thing to do.
>>> >
>>> > Let me clarify,
>>> >
>>> > - This is already a very (I mean extremely very) easy keyword name to
>>> confuse with ones_like, zeros_like and by its nature any other
>>> interpretation. It is not signalling anything about the functionality that
>>> is being discussed. I would seriously consider reserving such obvious names
>>> for really obvious tasks. Because you would also expect the shape and ndim
>>> would be mimicked by the "like"d argument but it turns out it is acting
>>> more like "typeof=" and not "like=" at all. Because if we follow the
>>> semantics it reads as "make your argument asarray like the other thing" but
>>> it is actually doing, "make your argument an array with the other thing's
>>> type" which might not be an array after all.
>>> >
>>> > - Again, if this is meant for downstream libraries (because that's
>>> what I got out of the PR discussion, cupy, dask, and JAX were the only
>>> examples I could read) then hiding it in another function and writing with
>>> capital letters "this is not meant for numpy users" would be a much more
>>> convenient way to separate the target audience and regular users.
>>> numpy.astypedarray([[some data], [...]], type_of=x) or whatever else it may
>>> be would be quite clean and to the point with no ambiguous keywords.
>>> >
>>> > I think arriving at an agreement would be much faster if there were an
>>> executive summary of who this is intended for and what the regular usage
>>> is. Because, no offense, all I see is "dispatch", "__array_function__"
>>> and a lot of technical details of which I am absolutely ignorant.
>>> >
>>> > Finally as a minor point, I know we are mostly (ex-)academics but this
>>> necessity of formal language on NEPs is self-imposed (probably PEPs are to
>>> blame) and not quite helping. It can be a bit more descriptive in my
>>> external opinion.
>>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>