[Numpy-discussion] Experimental `like=` attribute for array creation functions

Ralf Gommers ralf.gommers at gmail.com
Sun Aug 16 07:41:08 EDT 2020


On Fri, Aug 14, 2020 at 12:23 PM Peter Andreas Entschev <peter at entschev.com>
wrote:

> Hi all,
>
> This thread has IMO drifted very far from its original purpose. Because of
> that, I decided to start a new thread specifically for the general NEP
> procedure discussion; please check your mail for the "NEP Procedure
> Discussion" subject.
>

Thanks Peter. For future reference: it's better to just edit the thread
subject rather than start over completely, since people want to reply to
previous content. I will now copy over, by hand, the comments I'd like to
reply to into the other thread.


> On the topic of this thread, I'll try to rewrite NEP-35 to make it more
> accessible and ping back here once I have a PR for that.
>

Thanks!

Cheers,
Ralf

> Is there anything else that's pressing here? If there is and I
> missed/forgot about it, please let me know.
>
> Best,
> Peter
>
> On Fri, Aug 14, 2020 at 5:00 AM Juan Nunez-Iglesias <jni at fastmail.com>
> wrote:
>
>> Hello everyone again!
>>
>> A few clarifications about my proposal of external peer review:
>>
>> - Yes, all this work is public and announced on the mailing list.
>> However, I don’t think there’s a single person in this discussion or even
>> this whole ecosystem that does not have a more immediately-pressing and
>> also virtually infinite to-do list, so it’s unreasonable to expect that
>> generally they would do more than glance at the stuff in the mailing list.
>> In the peer review analogy, the mailing list is like the arXiv or Biorxiv
>> stream — yep, anyone can see the stuff on there and comment, but most
>> people just don’t have the time or attention to grab onto that. The only
>> reason I stopped to comment here is Sebastian’s “Imma merge, YOLO!”, which
>> had me raising my eyebrows real high. 😂 Especially for something that
>> would expand the NumPy API!
>>
>> - So, my proposal is that there needs to be an *editor* of NEPs who takes
>> responsibility, once they are themselves satisfied with the NEP, for
>> seeking out external reviewers and pinging them individually and asking
>> them if they would be ok to review.
>>
>> - A good friend who does screenwriting once told me, “don’t use all your
>> proofreaders at once”. You want to get feedback, improve things, then get
>> feedback from a *totally independent* new person who can see the document
>> with fresh eyes.
>>
>> Obviously, all of the above slows things down. But “alone we go fast,
>> together we go far”. The point of a NEP is to document critical decisions
>> for the long term health of the project. If the documentation is
>> insufficient, it defeats the whole purpose. Might as well just implement
>> stuff and skip the whole NEP process. (Side note: Stephan, I for one would
>> definitely appreciate an update to existing NEPs if there’s obvious ways
>> they can be improved!)
>>
>> I do think that NEP templates should be strict, and I don’t think that is
>> incompatible with plain, jargon-free text. The NEP template and guidelines
>> should specify that, and that the motivation should be understandable by a
>> casual NumPy user — the kind described by Ilhan, for whom bare NumPy
>> actually meets all their needs. Maybe they’ve also used PyTorch but they’ve
>> never really had cause to mix them or write a program that worked with both
>> kinds of arrays.
>>
>> Ditto for backwards compatibility — everyone should be clear when their
>> existing code is going to be broken. Actually NEP18 broke so much of my
>> code, but its Backward compatibility section basically says all good!
>> https://numpy.org/neps/nep-0018-array-function-protocol.html#backward-compatibility
>>
>>
>> Anywho, as always, none of this is criticism of the work done — I thank you
>> all, and am eternally grateful for all the hard work everyone is doing to
>> keep the ecosystem from fragmenting. I’m just hoping that this discussion
>> can improve the process going forward!
>>
>> And, yes, apologies to Peter, I know from repeated personal experience
>> how frustrating it can be to have last-minute drive-by objections after
>> months of consensus building! But I think in the end every time that
>> happened the end result was better — I hope the same is true here! And yes,
>> I’ll reiterate Ralf’s point: my concerns are about the NEP process itself
>> rather than this one. I’ll summarise my proposal:
>>
>> - strict NEP template. NEPs with missing sections will not be accepted.
>> - sections Abstract, Motivation, and Backwards Compatibility should be
>> understandable at a high level by casual users with ~zero background on the
>> topic
>> - enforce the above with at least two independent rounds of coordinated
>> peer review.
>>
>> Thank you,
>>
>> Juan.
>>
>> On 14 Aug 2020, at 5:29 am, Stephan Hoyer <shoyer at gmail.com> wrote:
>>
>> On Thu, Aug 13, 2020 at 5:22 AM Ralf Gommers <ralf.gommers at gmail.com>
>> wrote:
>>
>>> Thanks for raising these concerns Ilhan and Juan, and for answering
>>> Peter. Let me give my perspective as well.
>>>
>>> To start with, this is not specifically about Peter's NEP and PR. NEP 35
>>> simply follows the pattern set by previous PRs, and given its tight scope
>>> is less difficult to understand than other NEPs on such technical topics.
>>> Peter has done a lot of things right, and is close to the finish line.
>>>
>>>
>>> On Thu, Aug 13, 2020 at 12:02 PM Peter Andreas Entschev <
>>> peter at entschev.com> wrote:
>>>
>>>>
>>>> > I think, arriving at an agreement would be much faster if there is an
>>>> executive summary of who this is intended for and what the regular usage
>>>> is. Because with no offense, all I see is "dispatch", "__array_function__"
>>>> and a lot of technical details of which I am absolutely ignorant.
>>>>
>>>> This is what I intended to do in the Usage Guidance [2] section. Could
>>>> you elaborate on what more information you'd want to see there? Or is
>>>> it just a matter of reorganizing the NEP a bit to try and summarize
>>>> such things right at the top?
>>>>
>>>
>>> We adapted the NEP template [6] several times last year to try and
>>> improve this. We also specified in there that NEP content sent to the
>>> mailing list should only contain the sections: Abstract, Motivation and
>>> Scope, Usage and Impact, and Backwards compatibility. This is to ensure we
>>> fully understand the "why" and "what" before the "how". Unfortunately that
>>> template and procedure haven't been exercised much yet, only in NEP 38 [7]
>>> and partially in NEP 41 [8].
>>>
>>> If we have long-time maintainers of SciPy (Ilhan and myself),
>>> scikit-image (Juan) and CuPy (Leo, on the PR review) all saying they don't
>>> understand the goals, relevance, target audience, or how they're supposed
>>> to use a new feature, that indicates that the people doing the writing and
>>> having the discussion are doing something wrong at a very fundamental level.
>>>
>>>
>>> At this point I'm pretty disappointed in and tired of how we write and
>>> discuss NEPs on technical topics like dispatching, dtypes and the like.
>>> People literally refuse to write down concrete motivations, goals and
>>> non-goals, code that's problematic now and will be better/working post-NEP
>>> and usage examples before launching into extensive discussion of the gory
>>> details of the internals. I'm not sure what to do about it. Completely
>>> separate API and behavior proposals from implementation proposals? Make
>>> separate "API" and "internals" teams with the likes of Juan, Ilhan and Leo
>>> on the API team which then needs to approve every API change in new NEPs?
>>> Offer to co-write NEPs if someone is willing but doesn't understand how to
>>> go about it? Keep the current structure/process but veto further approvals
>>> until NEP authors get it right?
>>>
>>
>> I think the NEP template is great, and we should try to be more diligent
>> about following it!
>>
>> My own NEP 37 (__array_module__) is probably a good example of poor
>> presentation due to not following the template structure. It goes pretty
>> deep into low-level motivation and some implementation details before usage
>> examples.
>>
>> Speaking just for myself, I would have appreciated a friendly nudge to
>> use the template. Certainly I think it would be fine to require using the
>> template for newly submitted NEPs. I did not remember about it when I
>> started drafting NEP 37, and it definitely would have helped. I may still
>> try to do a revision at some point to use the template structure.
>>
>>
>>> I want to make an exception for merging the current NEP, for which the
>>> plan is to merge it as experimental to try in downstream PRs and get more
>>> experience. That does mean that master will be in an unreleasable state by
>>> the way, which is unusual and it'd be nice to get Chuck's explicit OK for
>>> that. But after that, I think we need a change here. I would like to hear
>>> what everyone thinks is the shape that change should take - any of my above
>>> suggestions, or something else?
>>>
>>>
>>>
>>>> > Finally as a minor point, I know we are mostly (ex-)academics but
>>>> this necessity of formal language on NEPs is self-imposed (probably PEPs
>>>> are to blame) and not quite helping. It can be a bit more descriptive in my
>>>> external opinion.
>>>>
>>>> TBH, I don't really know how to solve that point, so if you have any
>>>> specific suggestions, that's certainly welcome. I understand the
>>>> frustration for a reader trying to understand all the details, with
>>>> many being only described in NEP-18 [3], but we also strive to avoid
>>>> rewriting things that are written elsewhere, which would also
>>>> overburden those who are aware of what's being discussed.
>>>>
>>>>
>>>> > I also share Ilhan’s concern (and I mentioned this in a previous NEP
>>>> discussion) that NEPs are getting pretty inaccessible. In a sense these are
>>>> difficult topics and readers should be expected to have *some* familiarity
>>>> with the topics being discussed, but perhaps more effort should be put into
>>>> the context/motivation/background of a NEP before accepting it. One way to
>>>> ensure this might be to require a final proofreading step by someone who
>>>> has not been involved at all in the discussions, like peer review does for
>>>> papers.
>>>>
>>>
>>> Some variant of this proposal would be my preference.
>>>
>>> Cheers,
>>> Ralf
>>>
>>>
>>>> [1] https://github.com/numpy/numpy/issues/14441#issuecomment-529969572
>>>> [2]
>>>> https://numpy.org/neps/nep-0035-array-creation-dispatch-with-array-function.html#usage-guidance
>>>> [3] https://numpy.org/neps/nep-0018-array-function-protocol.html
>>>> [4] https://numpy.org/neps/nep-0000.html#nep-workflow
>>>> [5]
>>>> https://mail.python.org/pipermail/numpy-discussion/2019-October/080176.html
>>>
>>>
>>> [6] https://github.com/numpy/numpy/blob/master/doc/neps/nep-template.rst
>>> [7]
>>> https://github.com/numpy/numpy/blob/master/doc/neps/nep-0038-SIMD-optimizations.rst
>>> [8]
>>> https://github.com/numpy/numpy/blob/master/doc/neps/nep-0041-improved-dtype-support.rst
>>>
>>>
>>>
>>>>
>>>>
>>>> On Thu, Aug 13, 2020 at 3:44 AM Juan Nunez-Iglesias <jni at fastmail.com>
>>>> wrote:
>>>> >
>>>> > I’ve generally been on the “let the NumPy devs worry about it” side
>>>> of things, but I do agree with Ilhan that `like=` is confusing and
>>>> `typeof=` would be a much more appropriate name for that parameter.
>>>> >
>>>> > I do think library writers are NumPy users and so I wouldn’t really
>>>> make that distinction, though. Users writing their own analysis code could
>>>> very well be interested in writing code using numpy functions that will
>>>> transparently work when the input is a CuPy array or whatever.
>>>> >
>>>> > I also share Ilhan’s concern (and I mentioned this in a previous NEP
>>>> discussion) that NEPs are getting pretty inaccessible. In a sense these are
>>>> difficult topics and readers should be expected to have *some* familiarity
>>>> with the topics being discussed, but perhaps more effort should be put into
>>>> the context/motivation/background of a NEP before accepting it. One way to
>>>> ensure this might be to require a final proofreading step by someone who
>>>> has not been involved at all in the discussions, like peer review does for
>>>> papers.
>>>> >
>>>> > Food for thought.
>>>> >
>>>> > Juan.
>>>> >
>>>> > On 13 Aug 2020, at 9:24 am, Ilhan Polat <ilhanpolat at gmail.com> wrote:
>>>> >
>>>> > For what it's worth, as a potential consumer in SciPy, it really
>>>> doesn't say anything (either in the NEP or the PR) about how the regular
>>>> users of NumPy will benefit from this. If only 3rd parties are going to
>>>> benefit from it, I am not sure adding a new keyword to an already confusing
>>>> function is the right thing to do.
>>>> >
>>>> > Let me clarify,
>>>> >
>>>> > - This is already a very (I mean extremely very) easy keyword name to
>>>> confuse with ones_like, zeros_like and by its nature any other
>>>> interpretation. It is not signalling anything about the functionality that
>>>> is being discussed. I would seriously consider reserving such obvious names
>>>> for really obvious tasks. Because you would also expect the shape and ndim
>>>> would be mimicked by the "like"d argument but it turns out it is acting
>>>> more like "typeof=" and not "like=" at all. Because if we follow the
>>>> semantics it reads as "make your argument asarray like the other thing" but
>>>> it is actually doing, "make your argument an array with the other thing's
>>>> type" which might not be an array after all.
>>>> >
>>>> > - Again, if this is meant for downstream libraries (because that's
>>>> what I got out of the PR discussion, cupy, dask, and JAX were the only
>>>> examples I could read) then hiding it in another function and writing with
>>>> capital letters "this is not meant for numpy users" would be a much more
>>>> convenient way to separate the target audience and regular users.
>>>> numpy.astypedarray([[some data], [...]], type_of=x) or whatever else it may
>>>> be would be quite clean and to the point with no ambiguous keywords.
>>>> >
>>>> > I think, arriving at an agreement would be much faster if there is an
>>>> executive summary of who this is intended for and what the regular usage
>>>> is. Because with no offense, all I see is "dispatch", "__array_function__"
>>>> and a lot of technical details of which I am absolutely ignorant.
>>>> >
>>>> > Finally as a minor point, I know we are mostly (ex-)academics but
>>>> this necessity of formal language on NEPs is self-imposed (probably PEPs
>>>> are to blame) and not quite helping. It can be a bit more descriptive in my
>>>> external opinion.
>>>>
>>> _______________________________________________
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion at python.org
>>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>>
>
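
For readers skimming the quoted naming concern (that `like=` reads as if it
mimics shape, the way ones_like and zeros_like do, while it actually only
selects the type of the result), here is a toy, pure-Python sketch of that
distinction. It is not NumPy's implementation: `TaggedList`, `_from_data`,
and the local `asarray` / `ones_like` functions are hypothetical stand-ins
made up for illustration, and the real mechanism in NumPy goes through the
`__array_function__` protocol described in NEP 18.

```python
# Toy model only: every name below is a hypothetical stand-in, not NumPy API.


class TaggedList(list):
    """Stand-in for a third-party array type (think CuPy or Dask)."""

    @classmethod
    def _from_data(cls, data):
        # Hypothetical creation hook; NumPy itself uses __array_function__.
        return cls(data)


def asarray(data, like=None):
    """Build a container from `data`; `like` only selects the result type."""
    if like is None:
        return list(data)
    # Dispatch on the *type* of the reference object. Its contents and
    # length are never inspected.
    return type(like)._from_data(data)


def ones_like(reference):
    """Mimic both the length and the type of `reference`, filled with ones."""
    return type(reference)([1] * len(reference))


ref = TaggedList([0, 0, 0, 0, 0])

a = asarray([1, 2], like=ref)  # takes ref's type, keeps the data's length
b = ones_like(ref)             # takes ref's type and ref's length

print(type(a).__name__, len(a))  # TaggedList 2
print(type(b).__name__, len(b))  # TaggedList 5
```

In this sketch the reference passed as `like=` contributes nothing but its
type, which is the behaviour the suggested `typeof=` spelling was meant to
convey.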