From ralf.gommers at gmail.com Sat Aug 1 07:21:43 2020 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 1 Aug 2020 12:21:43 +0100 Subject: [Numpy-discussion] participating in AI Code-In? Message-ID: Hi all, We got an invitation to participate in AI Code-In ( https://aicode-in.github.io/AICode-In/). It's a new initiative, seems a bit GSoC like, but created by and for middle/high schoolers. We'd have to create tasks to work on (more like tagging/creation actionable issues than a full project), and provide some mentoring bandwidth. It seems well-organized and because it's a new initiative it may be smaller and more "early adopter" than GSoC. Would anyone be interested to participate as a mentor and/or lead the NumPy organization participation? Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From sabertooth2022 at gmail.com Sat Aug 1 07:25:22 2020 From: sabertooth2022 at gmail.com (Saber Tooth) Date: Sat, 1 Aug 2020 16:55:22 +0530 Subject: [Numpy-discussion] participating in AI Code-In? In-Reply-To: References: Message-ID: Hi Ralf , I'd be glad and more than interested to take part in CodeIn as a Mentor if there is no issue . Thanks , Mrinal On Sat, 1 Aug, 2020, 4:52 pm Ralf Gommers, wrote: > Hi all, > > We got an invitation to participate in AI Code-In ( > https://aicode-in.github.io/AICode-In/). It's a new initiative, seems a > bit GSoC like, but created by and for middle/high schoolers. We'd have to > create tasks to work on (more like tagging/creation actionable issues than > a full project), and provide some mentoring bandwidth. > > It seems well-organized and because it's a new initiative it may be > smaller and more "early adopter" than GSoC. Would anyone be interested to > participate as a mentor and/or lead the NumPy organization participation? > > Cheers, > Ralf > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sabertooth2022 at gmail.com Sat Aug 1 07:40:32 2020 From: sabertooth2022 at gmail.com (Saber Tooth) Date: Sat, 1 Aug 2020 17:10:32 +0530 Subject: [Numpy-discussion] participating in AI Code-In? In-Reply-To: References: Message-ID: Hi Ralf , I have quite some experience in Computer Vision where I developed a model to detect different type of currency notes and use object detection to augment 3d objects on these currency notes , there I relied up opencv python libraries and NumPy arrays for detection . I'd like to apply for mentoring role . https://github.com/mrityagi/ARnote Here is the link to my repo Thanks , Mrinal On Sat, 1 Aug, 2020, 4:55 pm Saber Tooth, wrote: > Hi Ralf , > I'd be glad and more than interested to take part in CodeIn as a Mentor if > there is no issue . > > Thanks , > Mrinal > > On Sat, 1 Aug, 2020, 4:52 pm Ralf Gommers, wrote: > >> Hi all, >> >> We got an invitation to participate in AI Code-In ( >> https://aicode-in.github.io/AICode-In/). It's a new initiative, seems a >> bit GSoC like, but created by and for middle/high schoolers. We'd have to >> create tasks to work on (more like tagging/creation actionable issues than >> a full project), and provide some mentoring bandwidth. >> >> It seems well-organized and because it's a new initiative it may be >> smaller and more "early adopter" than GSoC. 
Would anyone be interested to >> participate as a mentor and/or lead the NumPy organization participation? >> >> Cheers, >> Ralf >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From yashboss2000 at gmail.com Sat Aug 1 09:51:05 2020 From: yashboss2000 at gmail.com (yash varshney) Date: Sat, 1 Aug 2020 19:21:05 +0530 Subject: [Numpy-discussion] participating in AI Code-In? In-Reply-To: References: Message-ID: Hey Ralf, This is great, I would love to participate as a mentor in this wonderful opportunity. Brief about me: I am well experienced in Computer Vision as well as in NLP. I have done Tensorflow-in-Practice Specialization, How to win kaggle Competitions, NLP Specialization (ongoing) courses. I have participated in kaggle competitions and have studied 3 courses of Data Science in my college. Also, presently I'm working as a mentee in SPDX community ( under Linux Foundation) via CommunityBridge Mentorship program by Linux Foundation. I also a main contributor in DFFML (dataflow facilitator for Machine Learning) org under PSF. Thanks, would love to hear from you soon. Regards, Yash Varshney B18038 IIT Mandi, H.P., India On Sat, Aug 1, 2020, 4:52 PM Ralf Gommers wrote: > Hi all, > > We got an invitation to participate in AI Code-In ( > https://aicode-in.github.io/AICode-In/). It's a new initiative, seems a > bit GSoC like, but created by and for middle/high schoolers. We'd have to > create tasks to work on (more like tagging/creation actionable issues than > a full project), and provide some mentoring bandwidth. > > It seems well-organized and because it's a new initiative it may be > smaller and more "early adopter" than GSoC. Would anyone be interested to > participate as a mentor and/or lead the NumPy organization participation? > > Cheers, > Ralf > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sat Aug 1 14:52:11 2020 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 1 Aug 2020 19:52:11 +0100 Subject: [Numpy-discussion] a summary function to get a quick glimpse on the contents of a numpy array In-Reply-To: References: Message-ID: On Fri, Jul 31, 2020 at 1:40 PM Peter Steinbach wrote: > Dear numpy devs and interested readers, > > as a day-to-day user, it occurred to me that having a quick look into the > contents and extents of arrays is well doable with > numpy. numpy offers a rich set of methods for this. However, very often I > oversee myself and others that one just wants to see > if the values of an array have a certain min/max or mean or how wide the > range of values are. > > I hence sat down to write a summary function that returns a string of > hand-packed summary statistics for a quick inspection. I > propose to include it into numpy and would love to have your feedback on > this idea before I submit a PR. 
Here is the core > functionality: > > Examples > -------- > >>> a = np.random.normal(size=20) > >>> print(summary(a)) > min 25perc mean stdev median > 75perc max > -2.289870 -2.265757 -0.083213 1.115033 -0.162885 > -2.217532 1.639802 > >>> a = np.reshape(a, newshape=(4,5)) > >>> print(summary(a,axis=1)) > min 25perc mean stdev median > 75perc max > 0 -0.976279 -0.974090 0.293003 1.009383 0.466814 > -0.969712 1.519695 > 1 -0.468854 -0.467739 0.184139 0.649378 -0.036762 > -0.465510 1.303144 > 2 -2.289870 -2.276455 -0.324450 1.230031 -0.289008 > -2.249625 1.111107 > 3 -1.782239 -1.777304 -0.485546 1.259598 -1.236190 > -1.767434 1.639802 > > So you see, it is merely a tiny helper function that can aid practitioners > and data scientists to get a quick insight on what an > array contains. > > first off, here is the code: > > https://github.com/psteinb/numpy/blob/summary-function/numpy/lib/utils.py#L1021 > > I put it there as I am not sure at this point, if the community would > appreciate such a function or not. Judging from the tests, > lib/utils.py appears to a be place for undocumented functions. So to > resolve this and prepare a proper PR, please let me know > where this summary function could reside! > This seems to be more the domain of scipy.stats and statsmodels. Statsmodels already does a good job with this; in SciPy there's stats.describe ( https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.describe.html) which is quite similar to what you're proposing. Could you think about whether scipy.stats.describe does what you want, and if there's room to improve it (perhaps add a `__repr__` and/or a `__html_repr__` for pretty-printing)? Cheers, Ralf > Second, please give me your thoughts on the summary function's output? > Should the number of digits be configurable? Should the > columns be configurable? Is is ok to honor the axis parameter which is > found in so many numpy functions? > > Last but not least, let me stress that this is my first time contribution > to numpy. I love the library and would like to > contribute something back. So bear with me, if my code violates best > practices in your community for now. I'll bite my teeth > into the formalities of a github PR once I get support from the community > and the core devs. > > I think that a summary function would be a valuable addition to numpy! > Best, > Peter > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From b.sipocz+numpylist at gmail.com Sun Aug 2 03:03:48 2020 From: b.sipocz+numpylist at gmail.com (Brigitta Sipocz) Date: Sun, 2 Aug 2020 00:03:48 -0700 Subject: [Numpy-discussion] participating in AI Code-In? In-Reply-To: References: Message-ID: Hi, At first sight, the competition element seems a bit weird approach for the open source setting. Do you see a way how it can work out well? (Google also has a code-in for HS students. Has numpy ever tried it? (all the mentors I talked to at the last gsoc summit said it takes more time to mentor than gsoc, but I guess maybe it's partly due to the fact that is different, if we count all the pre-coding period efforts put into gsoc by the wider community, it also adds up significantly)). Cheers, Brigitta On Sat, 1 Aug 2020, 04:22 Ralf Gommers, wrote: > Hi all, > > We got an invitation to participate in AI Code-In ( > https://aicode-in.github.io/AICode-In/). 
It's a new initiative, seems a > bit GSoC like, but created by and for middle/high schoolers. We'd have to > create tasks to work on (more like tagging/creation actionable issues than > a full project), and provide some mentoring bandwidth. > > It seems well-organized and because it's a new initiative it may be > smaller and more "early adopter" than GSoC. Would anyone be interested to > participate as a mentor and/or lead the NumPy organization participation? > > Cheers, > Ralf > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at gmail.com Mon Aug 3 14:09:22 2020 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Mon, 3 Aug 2020 14:09:22 -0400 Subject: [Numpy-discussion] New random.Generator method: permuted Message-ID: In one of the previous weekly zoom meetings, it was suggested to ping the mailing list about an updated PR that implements the `permuted` method for the Generator class in numpy.random. The relevant issue is https://github.com/numpy/numpy/issues/5173 and the PR is https://github.com/numpy/numpy/pull/15121 The new method (as it would be called from Python) is permuted(x, axis=None, out=None) The CircleCI rendering of the docstring from the pull request is https://14745-908607-gh.circle-artifacts.com/0/doc/build/html/reference/random/generated/numpy.random.Generator.permuted.html The new method is an alternative to the existing `shuffle` and `permutation` methods. It handles the `axis` parameter similar to how the sort methods do, i.e. when `axis` is given, the slices along the axis are shuffled independently. This new documentation (added as part of the pull request) explains the API of the various related methods: https://14745-908607-gh.circle-artifacts.com/0/doc/build/html/reference/random/generator.html#permutations Additional feedback on the implementation of `permuted` in the pull request is welcome. Further discussion of the API should be held in the issue gh-5173 (but please familiarize yourself with the discussion of the API in gh-5173--there has already been quite a long discussion of several different APIs). Thanks, Warren From cv1038 at wildcats.unh.edu Mon Aug 3 20:39:51 2020 From: cv1038 at wildcats.unh.edu (Chris Vavaliaris) Date: Mon, 3 Aug 2020 17:39:51 -0700 (MST) Subject: [Numpy-discussion] Add Chebyshev (cosine) transforms implemented via FFTs Message-ID: <1596501591921-0.post@n7.nabble.com> PR #16999: https://github.com/numpy/numpy/pull/16999 Hello all, this PR adds the two 1D Chebyshev transform functions `chebyfft` and `ichebyfft` into the `numpy.fft` module, utilizing the real FFTs `rfft` and `irfft`, respectively. As far as I understand, `pockefft` does not support cosine transforms natively; for this reason, an even extension of the input vector is constructed, whose real FFT corresponds to a cosine transform. The motivation behind these two additions is the ability to quickly perform direct and inverse Chebyshev transforms with `numpy`, without the need to write scripts that do the necessary (although minor) modifications. Chebyshev transforms are used often e.g. in the spectral integration of PDE problems; thus, I believe having them implemented in `numpy` would be useful to many people in the community. 
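To make the even-extension trick concrete, here is a rough sketch (the
helper name and normalization below are only illustrative, not the exact
API proposed in the PR):

import numpy as np

def cheby_coeffs(x):
    # x holds samples f(cos(pi*j/N)), j = 0..N.
    # Build the even extension [x_0, ..., x_N, x_{N-1}, ..., x_1] of
    # length 2N; its real FFT is then a cosine transform of x.
    N = len(x) - 1
    ext = np.concatenate([x, x[-2:0:-1]])
    c = np.fft.rfft(ext).real / N
    c[0] /= 2   # halve the endpoint bins (standard DCT-I weighting)
    c[-1] /= 2
    return c

As a quick check, feeding it samples of f(x) = x on the Chebyshev points,
i.e. np.cos(np.pi * np.arange(N + 1) / N), returns [0, 1, 0, ..., 0] up to
rounding, which are the coefficients of T_1.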
I'm happy to get comments/feedback on this feature, and on whether it's something more people would be interested in. Also, I'm not entirely sure what part of this functionality is/isn't present in `scipy`, so that the two `fft` modules remain consistent with one another. Best, Chris -- Sent from: http://numpy-discussion.10968.n7.nabble.com/ From ralf.gommers at gmail.com Tue Aug 4 06:54:21 2020 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 4 Aug 2020 11:54:21 +0100 Subject: [Numpy-discussion] Add Chebyshev (cosine) transforms implemented via FFTs In-Reply-To: <1596501591921-0.post@n7.nabble.com> References: <1596501591921-0.post@n7.nabble.com> Message-ID: On Tue, Aug 4, 2020 at 1:49 AM Chris Vavaliaris wrote: > PR #16999: https://github.com/numpy/numpy/pull/16999 > > Hello all, > this PR adds the two 1D Chebyshev transform functions `chebyfft` and > `ichebyfft` into the `numpy.fft` module, utilizing the real FFTs `rfft` and > `irfft`, respectively. As far as I understand, `pockefft` does not support > cosine transforms natively; for this reason, an even extension of the input > vector is constructed, whose real FFT corresponds to a cosine transform. > > The motivation behind these two additions is the ability to quickly perform > direct and inverse Chebyshev transforms with `numpy`, without the need to > write scripts that do the necessary (although minor) modifications. > Chebyshev transforms are used often e.g. in the spectral integration of PDE > problems; thus, I believe having them implemented in `numpy` would be > useful > to many people in the community. > > I'm happy to get comments/feedback on this feature, and on whether it's > something more people would be interested in. Also, I'm not entirely sure > what part of this functionality is/isn't present in `scipy`, so that the > two > `fft` modules remain consistent with one another. > Hi Chris, that's a good question. scipy.fft is a superset of numpy.fft, and the functionality included in NumPy is really only the basics that are needed in many fields. The reason for the duplication stems from way back when we had no wheels and SciPy was very hard to install. So I don't think there's anything we'd add to numpy.fft at this point. As I commented on your PR, it would be useful to add some references and applications, and then make your proposal on the scipy-dev list. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Tue Aug 4 15:51:54 2020 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 04 Aug 2020 14:51:54 -0500 Subject: [Numpy-discussion] NumPy Community Meeting Wednesday Message-ID: <8dbf1486b3760142190eb7ca3fd1b34affda7f24.camel@sipsolutions.net> Hi all, There will be a NumPy Community meeting Wednesday Agust 5th at 1pm Pacific Time (20:00 UTC). Everyone is invited and encouraged to join in and edit the work-in-progress meeting topics and notes at: https://hackmd.io/76o-IxCjQX2mOXO_wwkcpg?both Best wishes Sebastian -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From charlesr.harris at gmail.com Tue Aug 4 21:09:58 2020 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 4 Aug 2020 19:09:58 -0600 Subject: [Numpy-discussion] Add Chebyshev (cosine) transforms implemented via FFTs In-Reply-To: References: <1596501591921-0.post@n7.nabble.com> Message-ID: On Tue, Aug 4, 2020 at 4:55 AM Ralf Gommers wrote: > > > On Tue, Aug 4, 2020 at 1:49 AM Chris Vavaliaris > wrote: > >> PR #16999: https://github.com/numpy/numpy/pull/16999 >> >> Hello all, >> this PR adds the two 1D Chebyshev transform functions `chebyfft` and >> `ichebyfft` into the `numpy.fft` module, utilizing the real FFTs `rfft` >> and >> `irfft`, respectively. As far as I understand, `pockefft` does not support >> cosine transforms natively; for this reason, an even extension of the >> input >> vector is constructed, whose real FFT corresponds to a cosine transform. >> >> The motivation behind these two additions is the ability to quickly >> perform >> direct and inverse Chebyshev transforms with `numpy`, without the need to >> write scripts that do the necessary (although minor) modifications. >> Chebyshev transforms are used often e.g. in the spectral integration of >> PDE >> problems; thus, I believe having them implemented in `numpy` would be >> useful >> to many people in the community. >> >> I'm happy to get comments/feedback on this feature, and on whether it's >> something more people would be interested in. Also, I'm not entirely sure >> what part of this functionality is/isn't present in `scipy`, so that the >> two >> `fft` modules remain consistent with one another. >> > > Hi Chris, that's a good question. scipy.fft is a superset of numpy.fft, > and the functionality included in NumPy is really only the basics that are > needed in many fields. The reason for the duplication stems from way back > when we had no wheels and SciPy was very hard to install. So I don't think > there's anything we'd add to numpy.fft at this point. > > As I commented on your PR, it would be useful to add some references and > applications, and then make your proposal on the scipy-dev list. > > Chebfun is based around this method, they use series with possibly thousands of terms. Trefethen is a big fan of Chebyshev polynomials. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Tue Aug 4 22:15:02 2020 From: shoyer at gmail.com (Stephan Hoyer) Date: Tue, 4 Aug 2020 19:15:02 -0700 Subject: [Numpy-discussion] Add Chebyshev (cosine) transforms implemented via FFTs In-Reply-To: References: <1596501591921-0.post@n7.nabble.com> Message-ID: On Tue, Aug 4, 2020 at 6:10 PM Charles R Harris wrote: > > > On Tue, Aug 4, 2020 at 4:55 AM Ralf Gommers > wrote: > >> >> >> On Tue, Aug 4, 2020 at 1:49 AM Chris Vavaliaris >> wrote: >> >>> PR #16999: https://github.com/numpy/numpy/pull/16999 >>> >>> Hello all, >>> this PR adds the two 1D Chebyshev transform functions `chebyfft` and >>> `ichebyfft` into the `numpy.fft` module, utilizing the real FFTs `rfft` >>> and >>> `irfft`, respectively. As far as I understand, `pockefft` does not >>> support >>> cosine transforms natively; for this reason, an even extension of the >>> input >>> vector is constructed, whose real FFT corresponds to a cosine transform. 
>>> >>> The motivation behind these two additions is the ability to quickly >>> perform >>> direct and inverse Chebyshev transforms with `numpy`, without the need to >>> write scripts that do the necessary (although minor) modifications. >>> Chebyshev transforms are used often e.g. in the spectral integration of >>> PDE >>> problems; thus, I believe having them implemented in `numpy` would be >>> useful >>> to many people in the community. >>> >>> I'm happy to get comments/feedback on this feature, and on whether it's >>> something more people would be interested in. Also, I'm not entirely sure >>> what part of this functionality is/isn't present in `scipy`, so that the >>> two >>> `fft` modules remain consistent with one another. >>> >> >> Hi Chris, that's a good question. scipy.fft is a superset of numpy.fft, >> and the functionality included in NumPy is really only the basics that are >> needed in many fields. The reason for the duplication stems from way back >> when we had no wheels and SciPy was very hard to install. So I don't think >> there's anything we'd add to numpy.fft at this point. >> >> As I commented on your PR, it would be useful to add some references and >> applications, and then make your proposal on the scipy-dev list. >> >> > Chebfun is based around this method, > they use series with possibly thousands of terms. Trefethen is a big fan of > Chebyshev polynomials. > I am quite sure that Chebyshev transforms are useful, but it does feel like something more directly suitable for SciPy than NumPy. The current division for submodules like numpy.fft/scipy.fft and numpy.linalg/scipy.linalg exists for outdated historical reasons, but at this point it is easiest for users to understand if has SciPy has a strict superset of NumPy's functionality here. Chuck > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Wed Aug 5 12:13:56 2020 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Wed, 05 Aug 2020 11:13:56 -0500 Subject: [Numpy-discussion] New random.Generator method: permuted In-Reply-To: References: Message-ID: <8a00e14e7e1c50b6dc82ff0f6c9fc62740912505.camel@sipsolutions.net> On Mon, 2020-08-03 at 14:09 -0400, Warren Weckesser wrote: > In one of the previous weekly zoom meetings, it was suggested > to ping the mailing list about an updated PR that implements > the `permuted` method for the Generator class in numpy.random. > The relevant issue is > > https://github.com/numpy/numpy/issues/5173 > > and the PR is > > https://github.com/numpy/numpy/pull/15121 > > The new method (as it would be called from Python) is > > permuted(x, axis=None, out=None) > I like the proposed API and name personally, and think we should go ahead with it. It is a useful complement to `shuffle` (and sorting). The followup questions of adding `shuffled`, and what to do about `permutation` are important, but I agree with viewing them as a second step. This API has been discussed a few times in various depths, so I assume that `permuted` as a name and API has largely settle down, and reached consensus (at last if there is not more activity here or on the PR). So, as a heads up, I am planning to review and push that forward in the next days, but more discussion is of course welcome. We still have time to decide differently. 
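For concreteness, here is a small usage sketch based on the docstring in
the PR (illustrative only, since the details can still change before the
release):

import numpy as np

rng = np.random.default_rng(12345)
x = np.arange(12).reshape(3, 4)

# shuffle the contents of each row independently; x itself is unchanged
y = rng.permuted(x, axis=1)

# shuffle the contents of each column independently, in place, via out=
rng.permuted(x, axis=0, out=x)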
Cheers, Sebastian > The CircleCI rendering of the docstring from the pull request is > > > https://14745-908607-gh.circle-artifacts.com/0/doc/build/html/reference/random/generated/numpy.random.Generator.permuted.html > > The new method is an alternative to the existing `shuffle` and > `permutation` methods. It handles the `axis` parameter similar > to how the sort methods do, i.e. when `axis` is given, the slices > along the axis are shuffled independently. This new documentation > (added as part of the pull request) explains the API of the various > related methods: > > > https://14745-908607-gh.circle-artifacts.com/0/doc/build/html/reference/random/generator.html#permutations > > Additional feedback on the implementation of `permuted` in the > pull request is welcome. Further discussion of the API should > be held in the issue gh-5173 (but please familiarize yourself > with the discussion of the API in gh-5173--there has already > been quite a long discussion of several different APIs). > > Thanks, > > Warren > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From ryan.c.cooper at uconn.edu Wed Aug 5 14:58:05 2020 From: ryan.c.cooper at uconn.edu (cooperrc) Date: Wed, 5 Aug 2020 11:58:05 -0700 (MST) Subject: [Numpy-discussion] Building Numpy Documentation Message-ID: <1596653885292-0.post@n7.nabble.com> I'm trying to build NumPy and its documentation from the current git repo, but I'm hitting a snag. I keep getting a RuntimeError: I'm trying to build NumPy inside the cloned repository from my fork. I'm running Arch (kernel 5.7.12) with gcc and gcc-libs installed. I'm using a fresh conda environment that has only installed Python 3.8 and Cython. Any way I can troubleshoot this issue? -- Sent from: http://numpy-discussion.10968.n7.nabble.com/ From ralf.gommers at gmail.com Wed Aug 5 15:02:10 2020 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Wed, 5 Aug 2020 20:02:10 +0100 Subject: [Numpy-discussion] Building Numpy Documentation In-Reply-To: <1596653885292-0.post@n7.nabble.com> References: <1596653885292-0.post@n7.nabble.com> Message-ID: On Wed, Aug 5, 2020 at 7:58 PM cooperrc wrote: > I'm trying to build NumPy and its documentation from the current git repo, > but I'm hitting a snag. I keep getting a RuntimeError: > > > I'm trying to build NumPy inside the cloned repository from my fork. I'm > running Arch (kernel 5.7.12) with gcc and gcc-libs installed. I'm using a > fresh conda environment that has only installed Python 3.8 and Cython. > > Any way I can troubleshoot this issue? > Opening an issue and including the command you're running and the full build/test log ending in that RuntimeError would be the way to get the input you need. Cheers, Ralf > > > > -- > Sent from: http://numpy-discussion.10968.n7.nabble.com/ > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From numpy_gsod at bigriver.xyz Wed Aug 5 15:15:58 2020 From: numpy_gsod at bigriver.xyz (Ben Nathanson) Date: Wed, 5 Aug 2020 15:15:58 -0400 Subject: [Numpy-discussion] Add Chebyshev (cosine) transforms implemented via FFTs In-Reply-To: References: <1596501591921-0.post@n7.nabble.com> Message-ID: > scipy.fft is a superset of numpy.fft, and the functionality included in NumPy is really only the basics that are needed in many fields. Exactly this sentence might be useful on top of the FFT page. Is the right page reference/routines.fft.html? I can submit a PR. From ralf.gommers at gmail.com Wed Aug 5 16:01:41 2020 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Wed, 5 Aug 2020 21:01:41 +0100 Subject: [Numpy-discussion] Add Chebyshev (cosine) transforms implemented via FFTs In-Reply-To: References: <1596501591921-0.post@n7.nabble.com> Message-ID: On Wed, Aug 5, 2020 at 8:16 PM Ben Nathanson wrote: > > scipy.fft is a superset of numpy.fft, and the functionality included in > NumPy is really only the basics that are needed in many fields. > > Exactly this sentence might be useful on top of the FFT page. > > Is the right page reference/routines.fft.html? I can submit a PR. > A PR would be great, thanks Ben. And yes, that's the right page. Cheers, Ralf _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ryan.c.cooper at uconn.edu Wed Aug 5 16:35:16 2020 From: ryan.c.cooper at uconn.edu (cooperrc) Date: Wed, 5 Aug 2020 13:35:16 -0700 (MST) Subject: [Numpy-discussion] Building Numpy Documentation In-Reply-To: References: <1596653885292-0.post@n7.nabble.com> Message-ID: <1596659716279-0.post@n7.nabble.com> Thanks! I'll post the issue and full output. -- Sent from: http://numpy-discussion.10968.n7.nabble.com/ From cv1038 at wildcats.unh.edu Wed Aug 5 20:16:01 2020 From: cv1038 at wildcats.unh.edu (Chris Val) Date: Wed, 5 Aug 2020 17:16:01 -0700 (MST) Subject: [Numpy-discussion] Add Chebyshev (cosine) transforms implemented via FFTs In-Reply-To: References: <1596501591921-0.post@n7.nabble.com> Message-ID: <1596672961102-0.post@n7.nabble.com> Stephan Hoyer-2 wrote > On Tue, Aug 4, 2020 at 6:10 PM Charles R Harris < > charlesr.harris@ > > > wrote: > >> >> >> On Tue, Aug 4, 2020 at 4:55 AM Ralf Gommers < > ralf.gommers@ > > >> wrote: >> >>> >>> >>> On Tue, Aug 4, 2020 at 1:49 AM Chris Vavaliaris < > cv1038 at .unh > > >>> wrote: >>> >>>> PR #16999: https://github.com/numpy/numpy/pull/16999 >>>> >>>> Hello all, >>>> this PR adds the two 1D Chebyshev transform functions `chebyfft` and >>>> `ichebyfft` into the `numpy.fft` module, utilizing the real FFTs `rfft` >>>> and >>>> `irfft`, respectively. As far as I understand, `pockefft` does not >>>> support >>>> cosine transforms natively; for this reason, an even extension of the >>>> input >>>> vector is constructed, whose real FFT corresponds to a cosine >>>> transform. >>>> >>>> The motivation behind these two additions is the ability to quickly >>>> perform >>>> direct and inverse Chebyshev transforms with `numpy`, without the need >>>> to >>>> write scripts that do the necessary (although minor) modifications. >>>> Chebyshev transforms are used often e.g. in the spectral integration of >>>> PDE >>>> problems; thus, I believe having them implemented in `numpy` would be >>>> useful >>>> to many people in the community. 
>>>> >>>> I'm happy to get comments/feedback on this feature, and on whether it's >>>> something more people would be interested in. Also, I'm not entirely >>>> sure >>>> what part of this functionality is/isn't present in `scipy`, so that >>>> the >>>> two >>>> `fft` modules remain consistent with one another. >>>> >>> >>> Hi Chris, that's a good question. scipy.fft is a superset of numpy.fft, >>> and the functionality included in NumPy is really only the basics that >>> are >>> needed in many fields. The reason for the duplication stems from way >>> back >>> when we had no wheels and SciPy was very hard to install. So I don't >>> think >>> there's anything we'd add to numpy.fft at this point. >>> >>> As I commented on your PR, it would be useful to add some references and >>> applications, and then make your proposal on the scipy-dev list. >>> >>> >> Chebfun <https://github.com/chebfun/chebfun> is based around this >> method, >> they use series with possibly thousands of terms. Trefethen is a big fan >> of >> Chebyshev polynomials. >> > > I am quite sure that Chebyshev transforms are useful, but it does feel > like > something more directly suitable for SciPy than NumPy. The current > division > for submodules like numpy.fft/scipy.fft and numpy.linalg/scipy.linalg > exists for outdated historical reasons, but at this point it is easiest > for > users to understand if has SciPy has a strict superset of NumPy's > functionality here. > > > Chuck >> _______________________________________________ >> NumPy-Discussion mailing list >> > NumPy-Discussion@ >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion@ > https://mail.python.org/mailman/listinfo/numpy-discussion Thank you all for the replies and feedback! I now have a better understanding of the differences between the NumPy and SciPy FFT modules; it certainly looks like SciPy would be a more appropriate place for such a feature. > Chebfun is based around this method, they use series with possibly > thousands of terms. Trefethen is a big fan of Chebyshev polynomials. > > Chuck Thank you Chuck for your comment; yes I'm aware of Chebfun and of Trefethen's work in general, it's mostly the work of his and some of his past grad students that got me interested in Chebyshev methods in the first place! Chris -- Sent from: http://numpy-discussion.10968.n7.nabble.com/ From kevin.k.sheppard at gmail.com Fri Aug 7 09:00:12 2020 From: kevin.k.sheppard at gmail.com (Kevin Sheppard) Date: Fri, 7 Aug 2020 14:00:12 +0100 Subject: [Numpy-discussion] Replacement for Rackspace Message-ID: <2B5B9B49-80D8-46A8-B12F-84C438A0ED4D@hxcore.ol> An HTML attachment was scrubbed... URL: From andy.terrel at gmail.com Fri Aug 7 09:18:39 2020 From: andy.terrel at gmail.com (Andy Ray Terrel) Date: Fri, 7 Aug 2020 08:18:39 -0500 Subject: [Numpy-discussion] Replacement for Rackspace In-Reply-To: <2B5B9B49-80D8-46A8-B12F-84C438A0ED4D@hxcore.ol> References: <2B5B9B49-80D8-46A8-B12F-84C438A0ED4D@hxcore.ol> Message-ID: If you are looking for servers, I can help with the NumFOCUS allocation from AWS. But anaconda.org will mean less work managing infrastructure. 
On Fri, Aug 7, 2020 at 8:01 AM Kevin Sheppard wrote: > The Rackspace hosted wheel endpoints at > > > > > https://7933911d6844c6c53a7d-47bd50c35cd79bd838daf386af554a83.ssl.cf2.rackcdn.com/ > > > > and > > > > > https://3f23b170c54c2533c070-1c8a9b3114517dc5fe17b7c3f8c63a43.ssl.cf2.rackcdn.com/ > > > > seem to not be working. I know NumPy, SciPy, pandas and scikit-learn are > all using a common end point on anacona.org. Statsmodels is preparing > for release, and the wheel builder at > https://github.com/MacPython/statsmodels-wheels is failing at upload. Is > there any shared resource for uploading nightlies and release wheels? Or > should we just use a separate account on anaconda.org? > > > > Thanks, > > Kevin > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ryan.c.cooper at uconn.edu Fri Aug 7 10:12:46 2020 From: ryan.c.cooper at uconn.edu (cooperrc) Date: Fri, 7 Aug 2020 07:12:46 -0700 (MST) Subject: [Numpy-discussion] Building Numpy Documentation In-Reply-To: <1596659716279-0.post@n7.nabble.com> References: <1596653885292-0.post@n7.nabble.com> <1596659716279-0.post@n7.nabble.com> Message-ID: <1596809566264-0.post@n7.nabble.com> For future reference, I opened and closed issue 17016 on github. The culprit was a `Broken Toolchain` due to a mismatch between Arch's newer ld and conda's older ld. Solution was to move the default ~/conda/envs/doc-build-38/compiler_compat/ld to ~/conda/envs/doc-build-38/compiler_compat/bak_ld Then, the build went smoothly. -- Sent from: http://numpy-discussion.10968.n7.nabble.com/ From p.j.a.cock at googlemail.com Fri Aug 7 12:23:23 2020 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Fri, 7 Aug 2020 17:23:23 +0100 Subject: [Numpy-discussion] Replacement for Rackspace In-Reply-To: <2B5B9B49-80D8-46A8-B12F-84C438A0ED4D@hxcore.ol> References: <2B5B9B49-80D8-46A8-B12F-84C438A0ED4D@hxcore.ol> Message-ID: Ah - this is unwelcome news. See https://mail.python.org/pipermail/scipy-dev/2020-February/023990.html and https://github.com/matthew-brett/multibuild/issues/304 There are quite a few project's using the multibuild system now... Peter On Fri, Aug 7, 2020 at 2:01 PM Kevin Sheppard wrote: > The Rackspace hosted wheel endpoints at > > > > > https://7933911d6844c6c53a7d-47bd50c35cd79bd838daf386af554a83.ssl.cf2.rackcdn.com/ > > > > and > > > > > https://3f23b170c54c2533c070-1c8a9b3114517dc5fe17b7c3f8c63a43.ssl.cf2.rackcdn.com/ > > > > seem to not be working. I know NumPy, SciPy, pandas and scikit-learn are > all using a common end point on anacona.org. Statsmodels is preparing > for release, and the wheel builder at > https://github.com/MacPython/statsmodels-wheels is failing at upload. Is > there any shared resource for uploading nightlies and release wheels? Or > should we just use a separate account on anaconda.org? > > > > Thanks, > > Kevin > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ralf.gommers at gmail.com Fri Aug 7 17:35:26 2020 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Fri, 7 Aug 2020 22:35:26 +0100 Subject: [Numpy-discussion] Replacement for Rackspace In-Reply-To: <2B5B9B49-80D8-46A8-B12F-84C438A0ED4D@hxcore.ol> References: <2B5B9B49-80D8-46A8-B12F-84C438A0ED4D@hxcore.ol> Message-ID: On Fri, Aug 7, 2020 at 2:00 PM Kevin Sheppard wrote: > The Rackspace hosted wheel endpoints at > > > > > https://7933911d6844c6c53a7d-47bd50c35cd79bd838daf386af554a83.ssl.cf2.rackcdn.com/ > > > > and > > > > > https://3f23b170c54c2533c070-1c8a9b3114517dc5fe17b7c3f8c63a43.ssl.cf2.rackcdn.com/ > > > > seem to not be working. I know NumPy, SciPy, pandas and scikit-learn are > all using a common end point on anacona.org. Statsmodels is preparing > for release, and the wheel builder at > https://github.com/MacPython/statsmodels-wheels is failing at upload. Is > there any shared resource for uploading nightlies and release wheels? Or > should we just use a separate account on anaconda.org? > Copying the numpy-wheels Azure/TravisCI code for this should work, it's pretty concise, e.g.: https://github.com/MacPython/numpy-wheels/blob/master/azure/posix.yml#L87 Not sure about the account credentials, Matti would know. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ilhanpolat at gmail.com Sun Aug 9 18:15:06 2020 From: ilhanpolat at gmail.com (Ilhan Polat) Date: Mon, 10 Aug 2020 00:15:06 +0200 Subject: [Numpy-discussion] Type declaration to include all valid numerical NumPy types for Cython Message-ID: Hi all, As you might have seen my recent mails in Cython list, I'm trying to cook up an input validator for the linalg.solve() function. The machinery of SciPy linalg is as follows: Some input comes in passes through np.asarray() then depending on the resulting dtype of the numpy array we choose a LAPACK flavor (s,d,c,z) and off it goes through f2py to lalaland and comes back with some result. For the backslash polyalgorithm I need the arrays to be contiguous (C- or F- doesn't matter) and any of the four (possibly via making new copies) float, double, float complex, double complex after the intake because we are using wrapped fortran code (LAPACK) in SciPy. So my difficulty is how to type such function input, say, ctypedef fused numeric_numpy_t: bint cnp.npy_bool cnp.int_t cnp.intp_t cnp.int8_t cnp.int16_t cnp.int32_t cnp.int64_t cnp.uint8_t cnp.uint16_t cnp.uint32_t cnp.uint64_t cnp.float32_t cnp.float64_t cnp.complex64_t cnp.complex128_t Is this acceptable or something else needs to be used? Then there is the storyof np.complex256 and mysterious np.float16. Then there is the Linux vs Windows platform dependence issue and possibly some more that I can't comprehend. Then there are datetime, str, unicode etc. that need to be rejected. So this is quickly getting out of hand for my small brain. To be honest, I am a bit running out of steam working with this issue even though I managed to finish the actual difficult algorithmic part but got stuck here. I am quite surprised how fantastically complicated and confusing both NumPy and Cython docs about this stuff. Shouldn't we keep a generic fused type for such usage? Or maybe there already exists but I don't know and would be really grateful for pointers. 
Here I wrote a dummy typed Cython function just for type checking: cpdef inline bint ncc( numeric_numpy_t[:, :] a): print(a.is_f_contig()) print(a.is_c_contig()) return a.is_f_contig() or a.is_c_contig() And this is a dummy loop (with aliases) just to check whether fused type is working or not (on windows I couldn't make it work for float16). for x in (np.uint, np.uintc, np.uintp, np.uint0, np.uint8, np.uint16, np.uint32, np.uint64, np.int, np.intc, np.intp, np.int0, np.int8, np.int16, np.int32,np.int64, np.float, np.float32, np.float64, np.float_, np.complex, np.complex64, np.complex128, np.complex_): print(x) C = np.arange(25., dtype=x).reshape(5, 5) ncc(C) Thanks in advance, ilhan -------------- next part -------------- An HTML attachment was scrubbed... URL: From ewm at redtetrahedron.org Sun Aug 9 20:49:37 2020 From: ewm at redtetrahedron.org (Eric Moore) Date: Sun, 9 Aug 2020 20:49:37 -0400 Subject: [Numpy-discussion] Type declaration to include all valid numerical NumPy types for Cython In-Reply-To: References: Message-ID: If that is really all you need, then the version in python is: def convert_one(a): """ Converts input with arbitrary layout and dtype to a blas/lapack compatible dtype with either C or F order. Acceptable objects are passed through without making copies. """ a_arr = np.asarray(a) dtype = np.result_type(a_arr, 1.0) # need to handle these separately if dtype == np.longdouble: dtype = np.dtype('d') elif dtype == np.clongdouble: dtype = np.dtype('D') elif dtype == np.float16: dtype = np.dtype('f') # explicitly force a copy if a_arr isn't one segment return np.array(a_arr, dtype, copy=not a_arr.flags.forc, order='K') In Cython, you could just run exactly this code and it's probably fine. The could also be rewritten using the C calls if you really wanted. You need to either provide your own or use a casting table and the copy / conversion routines from somewhere. Cython, to my knowledge, doesn't provide these things, but Numpy does. Eric On Sun, Aug 9, 2020 at 6:16 PM Ilhan Polat wrote: > Hi all, > > As you might have seen my recent mails in Cython list, I'm trying to cook > up an input validator for the linalg.solve() function. The machinery of > SciPy linalg is as follows: > > Some input comes in passes through np.asarray() then depending on the > resulting dtype of the numpy array we choose a LAPACK flavor (s,d,c,z) and > off it goes through f2py to lalaland and comes back with some result. > > For the backslash polyalgorithm I need the arrays to be contiguous (C- or > F- doesn't matter) and any of the four (possibly via making new copies) > float, double, float complex, double complex after the intake because we > are using wrapped fortran code (LAPACK) in SciPy. So my difficulty is how > to type such function input, say, > > ctypedef fused numeric_numpy_t: > bint > cnp.npy_bool > cnp.int_t > cnp.intp_t > cnp.int8_t > cnp.int16_t > cnp.int32_t > cnp.int64_t > cnp.uint8_t > cnp.uint16_t > cnp.uint32_t > cnp.uint64_t > cnp.float32_t > cnp.float64_t > cnp.complex64_t > cnp.complex128_t > > Is this acceptable or something else needs to be used? Then there is the > storyof np.complex256 and mysterious np.float16. Then there is the Linux vs > Windows platform dependence issue and possibly some more that I can't > comprehend. Then there are datetime, str, unicode etc. that need to be > rejected. So this is quickly getting out of hand for my small brain. 
> > To be honest, I am a bit running out of steam working with this issue even > though I managed to finish the actual difficult algorithmic part but got > stuck here. I am quite surprised how fantastically complicated and > confusing both NumPy and Cython docs about this stuff. Shouldn't we keep a > generic fused type for such usage? Or maybe there already exists but I > don't know and would be really grateful for pointers. > > Here I wrote a dummy typed Cython function just for type checking: > > cpdef inline bint ncc( numeric_numpy_t[:, :] a): > print(a.is_f_contig()) > print(a.is_c_contig()) > > return a.is_f_contig() or a.is_c_contig() > > And this is a dummy loop (with aliases) just to check whether fused type > is working or not (on windows I couldn't make it work for float16). > > for x in (np.uint, np.uintc, np.uintp, np.uint0, np.uint8, np.uint16, > np.uint32, > np.uint64, np.int, np.intc, np.intp, np.int0, np.int8, np.int16, > np.int32,np.int64, np.float, np.float32, np.float64, np.float_, > np.complex, np.complex64, np.complex128, np.complex_): > print(x) > C = np.arange(25., dtype=x).reshape(5, 5) > ncc(C) > > > Thanks in advance, > ilhan > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ilhanpolat at gmail.com Mon Aug 10 05:24:19 2020 From: ilhanpolat at gmail.com (Ilhan Polat) Date: Mon, 10 Aug 2020 11:24:19 +0200 Subject: [Numpy-discussion] Type declaration to include all valid numerical NumPy types for Cython In-Reply-To: References: Message-ID: Yes it seems like I don't have any other option anyways. There is a bit of a penalty but I guess this should do the trick. Thanks Eric (again! :D) On Mon, Aug 10, 2020 at 2:51 AM Eric Moore wrote: > If that is really all you need, then the version in python is: > > def convert_one(a): > """ > Converts input with arbitrary layout and dtype to a blas/lapack > compatible dtype with either C or F order. Acceptable objects are > passed > through without making copies. > """ > > a_arr = np.asarray(a) > dtype = np.result_type(a_arr, 1.0) > > # need to handle these separately > if dtype == np.longdouble: > dtype = np.dtype('d') > elif dtype == np.clongdouble: > dtype = np.dtype('D') > elif dtype == np.float16: > dtype = np.dtype('f') > > # explicitly force a copy if a_arr isn't one segment > return np.array(a_arr, dtype, copy=not a_arr.flags.forc, order='K') > > In Cython, you could just run exactly this code and it's probably fine. > The could also be rewritten using the C calls if you really wanted. > > You need to either provide your own or use a casting table and the copy / > conversion routines from somewhere. Cython, to my knowledge, doesn't > provide these things, but Numpy does. > > Eric > > On Sun, Aug 9, 2020 at 6:16 PM Ilhan Polat wrote: > >> Hi all, >> >> As you might have seen my recent mails in Cython list, I'm trying to cook >> up an input validator for the linalg.solve() function. The machinery of >> SciPy linalg is as follows: >> >> Some input comes in passes through np.asarray() then depending on the >> resulting dtype of the numpy array we choose a LAPACK flavor (s,d,c,z) and >> off it goes through f2py to lalaland and comes back with some result. 
>> >> For the backslash polyalgorithm I need the arrays to be contiguous (C- or >> F- doesn't matter) and any of the four (possibly via making new copies) >> float, double, float complex, double complex after the intake because we >> are using wrapped fortran code (LAPACK) in SciPy. So my difficulty is how >> to type such function input, say, >> >> ctypedef fused numeric_numpy_t: >> bint >> cnp.npy_bool >> cnp.int_t >> cnp.intp_t >> cnp.int8_t >> cnp.int16_t >> cnp.int32_t >> cnp.int64_t >> cnp.uint8_t >> cnp.uint16_t >> cnp.uint32_t >> cnp.uint64_t >> cnp.float32_t >> cnp.float64_t >> cnp.complex64_t >> cnp.complex128_t >> >> Is this acceptable or something else needs to be used? Then there is the >> storyof np.complex256 and mysterious np.float16. Then there is the Linux vs >> Windows platform dependence issue and possibly some more that I can't >> comprehend. Then there are datetime, str, unicode etc. that need to be >> rejected. So this is quickly getting out of hand for my small brain. >> >> To be honest, I am a bit running out of steam working with this issue >> even though I managed to finish the actual difficult algorithmic part but >> got stuck here. I am quite surprised how fantastically complicated and >> confusing both NumPy and Cython docs about this stuff. Shouldn't we keep a >> generic fused type for such usage? Or maybe there already exists but I >> don't know and would be really grateful for pointers. >> >> Here I wrote a dummy typed Cython function just for type checking: >> >> cpdef inline bint ncc( numeric_numpy_t[:, :] a): >> print(a.is_f_contig()) >> print(a.is_c_contig()) >> >> return a.is_f_contig() or a.is_c_contig() >> >> And this is a dummy loop (with aliases) just to check whether fused type >> is working or not (on windows I couldn't make it work for float16). >> >> for x in (np.uint, np.uintc, np.uintp, np.uint0, np.uint8, np.uint16, >> np.uint32, >> np.uint64, np.int, np.intc, np.intp, np.int0, np.int8, >> np.int16, >> np.int32,np.int64, np.float, np.float32, np.float64, np.float_, >> np.complex, np.complex64, np.complex128, np.complex_): >> print(x) >> C = np.arange(25., dtype=x).reshape(5, 5) >> ncc(C) >> >> >> Thanks in advance, >> ilhan >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Mon Aug 10 11:30:23 2020 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 10 Aug 2020 10:30:23 -0500 Subject: [Numpy-discussion] Experimental `like=` attribute for array creation functions Message-ID: Hi all, as a heads up that Peter Entschev has a PR open to add `like=` to most array creation functions, my current plan is to merge it soon as a preliminary API and bring it up again before the actual release (in a few months). This allows overriding for array-likes, e.g. 
it will allow: arr = np.asarray([3], like=dask_array) type(arr) is dask.array.Array This was proposed in NEP 35: https://numpy.org/neps/nep-0035-array-creation-dispatch-with-array-function.html Although that has not been accepted as of now, the PR is: https://github.com/numpy/numpy/pull/16935 This was discussed in a smaller group, and is an attempt to see how we can make the array-function protocol viable to allow packages such as sklearn to work with non-NumPy arrays. As of now, this would be experimental and can revisit it before the actual NumPy release. We should probably discuss accepting NEP 35 more. At this time, I hope that we can put in the functionality to facilitate this discussion, rather the other way around. If anyone feels nervous about this step, I would be happy to document that we will not include it in the next release unless the NEP is accepted first, or at least hide it behind an environment variable. Cheers, Sebastian -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From einstein.edison at gmail.com Mon Aug 10 11:35:14 2020 From: einstein.edison at gmail.com (Hameer Abbasi) Date: Mon, 10 Aug 2020 17:35:14 +0200 Subject: [Numpy-discussion] Experimental =?utf-8?Q?=60like=3D=60_?=attribute for array creation functions In-Reply-To: References: Message-ID: <7ca98625-53ea-47cd-a027-d9c902742fed@Canary> Hi, We should have a higher-bandwidth meeting/communication for all stakeholders, and particularly some library authors, to see what would be good for them. We should definitely have language in the NEP that says it won?t be in a release unless the NEP is accepted. Best regards, Hameer Abbasi -- Sent from Canary (https://canarymail.io) > On Monday, Aug 10, 2020 at 5:31 PM, Sebastian Berg wrote: > Hi all, > > as a heads up that Peter Entschev has a PR open to add `like=` to > most array creation functions, my current plan is to merge it soon as a preliminary API and bring it up again before the actual release (in a few months). This allows overriding for array-likes, e.g. it will allow: > > > arr = np.asarray([3], like=dask_array) > type(arr) is dask.array.Array > > This was proposed in NEP 35: > > https://numpy.org/neps/nep-0035-array-creation-dispatch-with-array-function.html > > Although that has not been accepted as of now, the PR is: > > https://github.com/numpy/numpy/pull/16935 > > > This was discussed in a smaller group, and is an attempt to see how we > can make the array-function protocol viable to allow packages such as > sklearn to work with non-NumPy arrays. > > As of now, this would be experimental and can revisit it before the > actual NumPy release. We should probably discuss accepting NEP 35 > more. At this time, I hope that we can put in the functionality to > facilitate this discussion, rather the other way around. > > If anyone feels nervous about this step, I would be happy to document > that we will not include it in the next release unless the NEP is > accepted first, or at least hide it behind an environment variable. > > Cheers, > > Sebastian > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From sebastian at sipsolutions.net Mon Aug 10 15:36:41 2020 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 10 Aug 2020 14:36:41 -0500 Subject: [Numpy-discussion] Experimental `like=` attribute for array creation functions In-Reply-To: <7ca98625-53ea-47cd-a027-d9c902742fed@Canary> References: <7ca98625-53ea-47cd-a027-d9c902742fed@Canary> Message-ID: <9d9ad7a26241564ec3f14866accfe840b226e1dc.camel@sipsolutions.net> On Mon, 2020-08-10 at 17:35 +0200, Hameer Abbasi wrote: > Hi, > > We should have a higher-bandwidth meeting/communication for all > stakeholders, and particularly some library authors, to see what > would be good for them. > > We should definitely have language in the NEP that says it won?t be > in a release unless the NEP is accepted. In that case, I think the important part is to have language right now in the implementation, although that can refer to the NEP itself of course. You can't expect everyone who may be tempted to use it to actually read the NEP draft, at least not without pointing it out. I will say that I think it is not very high risk, because I think annoying or not, the argument could be deprecated again with a transition short phase. Admittedly, that argument only works if we have a replacement solution. Cheers, Sebastian > > Best regards, > Hameer Abbasi > > -- > Sent from Canary (https://canarymail.io) > > > On Monday, Aug 10, 2020 at 5:31 PM, Sebastian Berg < > > sebastian at sipsolutions.net (mailto:sebastian at sipsolutions.net)> > > wrote: > > Hi all, > > > > as a heads up that Peter Entschev has a PR open to add `like=` to > > most array creation functions, my current plan is to merge it soon > > as a preliminary API and bring it up again before the actual > > release (in a few months). This allows overriding for array-likes, > > e.g. it will allow: > > > > > > arr = np.asarray([3], like=dask_array) > > type(arr) is dask.array.Array > > > > This was proposed in NEP 35: > > > > https://numpy.org/neps/nep-0035-array-creation-dispatch-with-array-function.html > > > > Although that has not been accepted as of now, the PR is: > > > > https://github.com/numpy/numpy/pull/16935 > > > > > > This was discussed in a smaller group, and is an attempt to see how > > we > > can make the array-function protocol viable to allow packages such > > as > > sklearn to work with non-NumPy arrays. > > > > As of now, this would be experimental and can revisit it before the > > actual NumPy release. We should probably discuss accepting NEP 35 > > more. At this time, I hope that we can put in the functionality to > > facilitate this discussion, rather the other way around. > > > > If anyone feels nervous about this step, I would be happy to > > document > > that we will not include it in the next release unless the NEP is > > accepted first, or at least hide it behind an environment variable. > > > > Cheers, > > > > Sebastian > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From pwang at anaconda.com Mon Aug 10 15:54:38 2020 From: pwang at anaconda.com (Peter Wang) Date: Mon, 10 Aug 2020 14:54:38 -0500 Subject: [Numpy-discussion] Replacement for Rackspace In-Reply-To: <2B5B9B49-80D8-46A8-B12F-84C438A0ED4D@hxcore.ol> References: <2B5B9B49-80D8-46A8-B12F-84C438A0ED4D@hxcore.ol> Message-ID: FWIW, we're happy to provide wheel hosting for statsmodels on anaconda.org. -Peter On Fri, Aug 7, 2020 at 8:01 AM Kevin Sheppard wrote: > The Rackspace hosted wheel endpoints at > > > > > https://7933911d6844c6c53a7d-47bd50c35cd79bd838daf386af554a83.ssl.cf2.rackcdn.com/ > > > > and > > > > > https://3f23b170c54c2533c070-1c8a9b3114517dc5fe17b7c3f8c63a43.ssl.cf2.rackcdn.com/ > > > > seem to not be working. I know NumPy, SciPy, pandas and scikit-learn are > all using a common end point on anacona.org. Statsmodels is preparing > for release, and the wheel builder at > https://github.com/MacPython/statsmodels-wheels is failing at upload. Is > there any shared resource for uploading nightlies and release wheels? Or > should we just use a separate account on anaconda.org? > > > > Thanks, > > Kevin > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matti.picus at gmail.com Mon Aug 10 16:19:42 2020 From: matti.picus at gmail.com (Matti Picus) Date: Mon, 10 Aug 2020 23:19:42 +0300 Subject: [Numpy-discussion] Replacement for Rackspace In-Reply-To: References: <2B5B9B49-80D8-46A8-B12F-84C438A0ED4D@hxcore.ol> Message-ID: <304cbb1c-1871-0da4-097c-70d1c7c6d8e9@gmail.com> On 8/10/20 10:54 PM, Peter Wang wrote: > FWIW, we're happy to provide wheel hosting for statsmodels on > anaconda.org . > > -Peter > > On Fri, Aug 7, 2020 at 8:01 AM Kevin Sheppard > > wrote: > > The Rackspace hosted wheel endpoints at > > https://7933911d6844c6c53a7d-47bd50c35cd79bd838daf386af554a83.ssl.cf2.rackcdn.com/ > > and > > https://3f23b170c54c2533c070-1c8a9b3114517dc5fe17b7c3f8c63a43.ssl.cf2.rackcdn.com/ > > seem to not be working.? I know NumPy, SciPy, pandas and > scikit-learn are all using a common end point on anacona.org > . Statsmodels is preparing for? release, and > the wheel builder at > https://github.com/MacPython/statsmodels-wheels is failing at > upload.? Is there any shared resource for uploading nightlies and > release wheels?? Or should we just use a separate account on > anaconda.org ? > > Thanks, > > Kevin > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > Thanks Peter, anaconda is generously hosting projects at https://anaconda.org/scipy-wheels-nightly/ (for weekly development releases that can be used to test downstream projects) and https://anaconda.org/multibuild-wheels-staging (for staging wheels to be tested for release on PyPI). The trick is that CI needs a token so it can upload to those organizations. Kevin, we can either add you to the groups you can create a token, or one of the current members could create tokens and transport them safely to Kevin. Please disucss it with me (or one of the other members https://anaconda.org/multibuild-wheels-staging/groups). 
Matti From p.j.a.cock at googlemail.com Mon Aug 10 17:39:17 2020 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Mon, 10 Aug 2020 22:39:17 +0100 Subject: [Numpy-discussion] Replacement for Rackspace In-Reply-To: <304cbb1c-1871-0da4-097c-70d1c7c6d8e9@gmail.com> References: <2B5B9B49-80D8-46A8-B12F-84C438A0ED4D@hxcore.ol> <304cbb1c-1871-0da4-097c-70d1c7c6d8e9@gmail.com> Message-ID: Hi Matti, Is this an open invitation to the wider Numpy ecosystem? I am interested on behalf of Biopython which was using the donated Rackspace for multibuild wheel staging prior to PyPy release (although having weekly test releases sounds interesting too). I would be happy to continue this discussion off list if you prefer, Thank you, Peter On Mon, Aug 10, 2020 at 9:20 PM Matti Picus wrote: > > On 8/10/20 10:54 PM, Peter Wang wrote: > > FWIW, we're happy to provide wheel hosting for statsmodels on > > anaconda.org . > > > > -Peter > > > > On Fri, Aug 7, 2020 at 8:01 AM Kevin Sheppard > > > wrote: > > > > The Rackspace hosted wheel endpoints at > > > > > https://7933911d6844c6c53a7d-47bd50c35cd79bd838daf386af554a83.ssl.cf2.rackcdn.com/ > > > > and > > > > > https://3f23b170c54c2533c070-1c8a9b3114517dc5fe17b7c3f8c63a43.ssl.cf2.rackcdn.com/ > > > > seem to not be working. I know NumPy, SciPy, pandas and > > scikit-learn are all using a common end point on anacona.org > > . Statsmodels is preparing for release, and > > the wheel builder at > > https://github.com/MacPython/statsmodels-wheels is failing at > > upload. Is there any shared resource for uploading nightlies and > > release wheels? Or should we just use a separate account on > > anaconda.org ? > > > > Thanks, > > > > Kevin > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > Thanks Peter, anaconda is generously hosting projects at > https://anaconda.org/scipy-wheels-nightly/ (for weekly development > releases that can be used to test downstream projects) and > https://anaconda.org/multibuild-wheels-staging (for staging wheels to be > tested for release on PyPI). > > > The trick is that CI needs a token so it can upload to those > organizations. Kevin, we can either add you to the groups you can create > a token, or one of the current members could create tokens and transport > them safely to Kevin. Please disucss it with me (or one of the other > members https://anaconda.org/multibuild-wheels-staging/groups). > > Matti > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Mon Aug 10 18:16:34 2020 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 10 Aug 2020 23:16:34 +0100 Subject: [Numpy-discussion] Experimental `like=` attribute for array creation functions In-Reply-To: <9d9ad7a26241564ec3f14866accfe840b226e1dc.camel@sipsolutions.net> References: <7ca98625-53ea-47cd-a027-d9c902742fed@Canary> <9d9ad7a26241564ec3f14866accfe840b226e1dc.camel@sipsolutions.net> Message-ID: On Mon, Aug 10, 2020 at 8:37 PM Sebastian Berg wrote: > On Mon, 2020-08-10 at 17:35 +0200, Hameer Abbasi wrote: > > Hi, > > > > We should have a higher-bandwidth meeting/communication for all > > stakeholders, and particularly some library authors, to see what > > would be good for them. 
> I'm not sure that helps. At this point there's little progress since the last meeting, I think the plan is unchanged: we need implementations of all the options on offer, and then try them out in PRs for scikit-learn, SciPy and perhaps another package who's maintainers are interested, to test like=, __array_module__ in realistic situations. > > > We should definitely have language in the NEP that says it won?t be > > in a release unless the NEP is accepted. > > In that case, I think the important part is to have language right now > in the implementation, although that can refer to the NEP itself of > course. > You can't expect everyone who may be tempted to use it to actually read > the NEP draft, at least not without pointing it out. > Agreed, I think the decision is on this list not in the NEP, and to make sure we won't forget we need an issue opened with the 1.20 milestone. Cheers, Ralf > I will say that I think it is not very high risk, because I think > annoying or not, the argument could be deprecated again with a > transition short phase. Admittedly, that argument only works if we have > a replacement solution. > > Cheers, > > Sebastian > > > > > > Best regards, > > Hameer Abbasi > > > > -- > > Sent from Canary (https://canarymail.io) > > > > > On Monday, Aug 10, 2020 at 5:31 PM, Sebastian Berg < > > > sebastian at sipsolutions.net (mailto:sebastian at sipsolutions.net)> > > > wrote: > > > Hi all, > > > > > > as a heads up that Peter Entschev has a PR open to add `like=` to > > > most array creation functions, my current plan is to merge it soon > > > as a preliminary API and bring it up again before the actual > > > release (in a few months). This allows overriding for array-likes, > > > e.g. it will allow: > > > > > > > > > arr = np.asarray([3], like=dask_array) > > > type(arr) is dask.array.Array > > > > > > This was proposed in NEP 35: > > > > > > > https://numpy.org/neps/nep-0035-array-creation-dispatch-with-array-function.html > > > > > > Although that has not been accepted as of now, the PR is: > > > > > > https://github.com/numpy/numpy/pull/16935 > > > > > > > > > This was discussed in a smaller group, and is an attempt to see how > > > we > > > can make the array-function protocol viable to allow packages such > > > as > > > sklearn to work with non-NumPy arrays. > > > > > > As of now, this would be experimental and can revisit it before the > > > actual NumPy release. We should probably discuss accepting NEP 35 > > > more. At this time, I hope that we can put in the functionality to > > > facilitate this discussion, rather the other way around. > > > > > > If anyone feels nervous about this step, I would be happy to > > > document > > > that we will not include it in the next release unless the NEP is > > > accepted first, or at least hide it behind an environment variable. > > > > > > Cheers, > > > > > > Sebastian > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at python.org > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From matti.picus at gmail.com Tue Aug 11 01:33:42 2020 From: matti.picus at gmail.com (Matti Picus) Date: Tue, 11 Aug 2020 08:33:42 +0300 Subject: [Numpy-discussion] Replacement for Rackspace In-Reply-To: References: <2B5B9B49-80D8-46A8-B12F-84C438A0ED4D@hxcore.ol> <304cbb1c-1871-0da4-097c-70d1c7c6d8e9@gmail.com> Message-ID: An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Tue Aug 11 17:15:41 2020 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 11 Aug 2020 16:15:41 -0500 Subject: [Numpy-discussion] NumPy Development Meeting Today - Triage Focus Message-ID: <191985aa84e634e8c3f4aff72389f66f4c114b32.camel@sipsolutions.net> Hi all, Our bi-weekly triage-focused NumPy development meeting is tomorrow (Wednesday, August 12th) at 11 am Pacific Time (18:00 UTC). Everyone is invited to join in and edit the work-in-progress meeting topics and notes: https://hackmd.io/68i_JvOYQfy9ERiHgXMPvg I encourage everyone to notify us of issues or PRs that you feel should be prioritized or simply discussed briefly. Just comment on it so we can label it, or add your PR/issue to this weeks topics for discussion. Best regards Sebastian -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From ilhanpolat at gmail.com Wed Aug 12 19:24:57 2020 From: ilhanpolat at gmail.com (Ilhan Polat) Date: Thu, 13 Aug 2020 01:24:57 +0200 Subject: [Numpy-discussion] Experimental `like=` attribute for array creation functions In-Reply-To: References: <7ca98625-53ea-47cd-a027-d9c902742fed@Canary> <9d9ad7a26241564ec3f14866accfe840b226e1dc.camel@sipsolutions.net> Message-ID: For what is worth, as a potential consumer in SciPy, it really doesn't say anything (both in NEP and the PR) about how the regular users of NumPy will benefit from this. If only and only 3rd parties are going to benefit from it, I am not sure adding a new keyword to an already confusing function is the right thing to do. Let me clarify, - This is already a very (I mean extremely very) easy keyword name to confuse with ones_like, zeros_like and by its nature any other interpretation. It is not signalling anything about the functionality that is being discussed. I would seriously consider reserving such obvious names for really obvious tasks. Because you would also expect the shape and ndim would be mimicked by the "like"d argument but it turns out it is acting more like "typeof=" and not "like=" at all. Because if we follow the semantics it reads as "make your argument asarray like the other thing" but it is actually doing, "make your argument an array with the other thing's type" which might not be an array after all. - Again, if this is meant for downstream libraries (because that's what I got out of the PR discussion, cupy, dask, and JAX were the only examples I could read) then hiding it in another function and writing with capital letters "this is not meant for numpy users" would be a much more convenient way to separate the target audience and regular users. numpy.astypedarray([[some data], [...]], type_of=x) or whatever else it may be would be quite clean and to the point with no ambiguous keywords. I think, arriving to an agreement would be much faster if there is an executive summary of who this is intended for and what the regular usage is. 
Because with no offense, all I see is "dispatch", "_array_function_" and a lot of technical details of which I am absolutely ignorant. Finally as a minor point, I know we are mostly (ex-)academics but this necessity of formal language on NEPs is self-imposed (probably PEPs are to blame) and not quite helping. It can be a bit more descriptive in my external opinion. best, ilhan On Tue, Aug 11, 2020 at 12:18 AM Ralf Gommers wrote: > > > On Mon, Aug 10, 2020 at 8:37 PM Sebastian Berg > wrote: > >> On Mon, 2020-08-10 at 17:35 +0200, Hameer Abbasi wrote: >> > Hi, >> > >> > We should have a higher-bandwidth meeting/communication for all >> > stakeholders, and particularly some library authors, to see what >> > would be good for them. >> > > I'm not sure that helps. At this point there's little progress since the > last meeting, I think the plan is unchanged: we need implementations of all > the options on offer, and then try them out in PRs for scikit-learn, SciPy > and perhaps another package who's maintainers are interested, to test > like=, __array_module__ in realistic situations. > > > > >> > We should definitely have language in the NEP that says it won?t be >> > in a release unless the NEP is accepted. >> >> In that case, I think the important part is to have language right now >> in the implementation, although that can refer to the NEP itself of >> course. >> You can't expect everyone who may be tempted to use it to actually read >> the NEP draft, at least not without pointing it out. >> > > Agreed, I think the decision is on this list not in the NEP, and to make > sure we won't forget we need an issue opened with the 1.20 milestone. > > Cheers, > Ralf > > >> I will say that I think it is not very high risk, because I think >> annoying or not, the argument could be deprecated again with a >> transition short phase. Admittedly, that argument only works if we have >> a replacement solution. >> >> Cheers, >> >> Sebastian >> >> >> > >> > Best regards, >> > Hameer Abbasi >> > >> > -- >> > Sent from Canary (https://canarymail.io) >> > >> > > On Monday, Aug 10, 2020 at 5:31 PM, Sebastian Berg < >> > > sebastian at sipsolutions.net (mailto:sebastian at sipsolutions.net)> >> > > wrote: >> > > Hi all, >> > > >> > > as a heads up that Peter Entschev has a PR open to add `like=` to >> > > most array creation functions, my current plan is to merge it soon >> > > as a preliminary API and bring it up again before the actual >> > > release (in a few months). This allows overriding for array-likes, >> > > e.g. it will allow: >> > > >> > > >> > > arr = np.asarray([3], like=dask_array) >> > > type(arr) is dask.array.Array >> > > >> > > This was proposed in NEP 35: >> > > >> > > >> https://numpy.org/neps/nep-0035-array-creation-dispatch-with-array-function.html >> > > >> > > Although that has not been accepted as of now, the PR is: >> > > >> > > https://github.com/numpy/numpy/pull/16935 >> > > >> > > >> > > This was discussed in a smaller group, and is an attempt to see how >> > > we >> > > can make the array-function protocol viable to allow packages such >> > > as >> > > sklearn to work with non-NumPy arrays. >> > > >> > > As of now, this would be experimental and can revisit it before the >> > > actual NumPy release. We should probably discuss accepting NEP 35 >> > > more. At this time, I hope that we can put in the functionality to >> > > facilitate this discussion, rather the other way around. 
>> > > >> > > If anyone feels nervous about this step, I would be happy to >> > > document >> > > that we will not include it in the next release unless the NEP is >> > > accepted first, or at least hide it behind an environment variable. >> > > >> > > Cheers, >> > > >> > > Sebastian >> > > >> > > _______________________________________________ >> > > NumPy-Discussion mailing list >> > > NumPy-Discussion at python.org >> > > https://mail.python.org/mailman/listinfo/numpy-discussion >> > >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at python.org >> > https://mail.python.org/mailman/listinfo/numpy-discussion >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jni at fastmail.com Wed Aug 12 21:44:33 2020 From: jni at fastmail.com (Juan Nunez-Iglesias) Date: Thu, 13 Aug 2020 11:44:33 +1000 Subject: [Numpy-discussion] Experimental `like=` attribute for array creation functions In-Reply-To: References: <7ca98625-53ea-47cd-a027-d9c902742fed@Canary> <9d9ad7a26241564ec3f14866accfe840b226e1dc.camel@sipsolutions.net> Message-ID: <96330BE4-1CA2-4451-8FE5-357CFA7E4EDC@fastmail.com> I?ve generally been on the ?let the NumPy devs worry about it? side of things, but I do agree with Ilhan that `like=` is confusing and `typeof=` would be a much more appropriate name for that parameter. I do think library writers are NumPy users and so I wouldn?t really make that distinction, though. Users writing their own analysis code could very well be interested in writing code using numpy functions that will transparently work when the input is a CuPy array or whatever. I also share Ilhan?s concern (and I mentioned this in a previous NEP discussion) that NEPs are getting pretty inaccessible. In a sense these are difficult topics and readers should be expected to have *some* familiarity with the topics being discussed, but perhaps more effort should be put into the context/motivation/background of a NEP before accepting it. One way to ensure this might be to require a final proofreading step by someone who has not been involved at all in the discussions, like peer review does for papers. Food for thought. Juan. > On 13 Aug 2020, at 9:24 am, Ilhan Polat wrote: > > For what is worth, as a potential consumer in SciPy, it really doesn't say anything (both in NEP and the PR) about how the regular users of NumPy will benefit from this. If only and only 3rd parties are going to benefit from it, I am not sure adding a new keyword to an already confusing function is the right thing to do. > > Let me clarify, > > - This is already a very (I mean extremely very) easy keyword name to confuse with ones_like, zeros_like and by its nature any other interpretation. It is not signalling anything about the functionality that is being discussed. I would seriously consider reserving such obvious names for really obvious tasks. Because you would also expect the shape and ndim would be mimicked by the "like"d argument but it turns out it is acting more like "typeof=" and not "like=" at all. 
Because if we follow the semantics it reads as "make your argument asarray like the other thing" but it is actually doing, "make your argument an array with the other thing's type" which might not be an array after all. > > - Again, if this is meant for downstream libraries (because that's what I got out of the PR discussion, cupy, dask, and JAX were the only examples I could read) then hiding it in another function and writing with capital letters "this is not meant for numpy users" would be a much more convenient way to separate the target audience and regular users. numpy.astypedarray([[some data], [...]], type_of=x) or whatever else it may be would be quite clean and to the point with no ambiguous keywords. > > I think, arriving to an agreement would be much faster if there is an executive summary of who this is intended for and what the regular usage is. Because with no offense, all I see is "dispatch", "_array_function_" and a lot of technical details of which I am absolutely ignorant. > > Finally as a minor point, I know we are mostly (ex-)academics but this necessity of formal language on NEPs is self-imposed (probably PEPs are to blame) and not quite helping. It can be a bit more descriptive in my external opinion. > > best, > ilhan > > > > > > > > On Tue, Aug 11, 2020 at 12:18 AM Ralf Gommers > wrote: > > > On Mon, Aug 10, 2020 at 8:37 PM Sebastian Berg > wrote: > On Mon, 2020-08-10 at 17:35 +0200, Hameer Abbasi wrote: > > Hi, > > > > We should have a higher-bandwidth meeting/communication for all > > stakeholders, and particularly some library authors, to see what > > would be good for them. > > I'm not sure that helps. At this point there's little progress since the last meeting, I think the plan is unchanged: we need implementations of all the options on offer, and then try them out in PRs for scikit-learn, SciPy and perhaps another package who's maintainers are interested, to test like=, __array_module__ in realistic situations. > > > > > > We should definitely have language in the NEP that says it won?t be > > in a release unless the NEP is accepted. > > In that case, I think the important part is to have language right now > in the implementation, although that can refer to the NEP itself of > course. > You can't expect everyone who may be tempted to use it to actually read > the NEP draft, at least not without pointing it out. > > Agreed, I think the decision is on this list not in the NEP, and to make sure we won't forget we need an issue opened with the 1.20 milestone. > > Cheers, > Ralf > > > I will say that I think it is not very high risk, because I think > annoying or not, the argument could be deprecated again with a > transition short phase. Admittedly, that argument only works if we have > a replacement solution. > > Cheers, > > Sebastian > > > > > > Best regards, > > Hameer Abbasi > > > > -- > > Sent from Canary (https://canarymail.io ) > > > > > On Monday, Aug 10, 2020 at 5:31 PM, Sebastian Berg < > > > sebastian at sipsolutions.net (mailto:sebastian at sipsolutions.net )> > > > wrote: > > > Hi all, > > > > > > as a heads up that Peter Entschev has a PR open to add `like=` to > > > most array creation functions, my current plan is to merge it soon > > > as a preliminary API and bring it up again before the actual > > > release (in a few months). This allows overriding for array-likes, > > > e.g. 
it will allow: > > > > > > > > > arr = np.asarray([3], like=dask_array) > > > type(arr) is dask.array.Array > > > > > > This was proposed in NEP 35: > > > > > > https://numpy.org/neps/nep-0035-array-creation-dispatch-with-array-function.html > > > > > > Although that has not been accepted as of now, the PR is: > > > > > > https://github.com/numpy/numpy/pull/16935 > > > > > > > > > This was discussed in a smaller group, and is an attempt to see how > > > we > > > can make the array-function protocol viable to allow packages such > > > as > > > sklearn to work with non-NumPy arrays. > > > > > > As of now, this would be experimental and can revisit it before the > > > actual NumPy release. We should probably discuss accepting NEP 35 > > > more. At this time, I hope that we can put in the functionality to > > > facilitate this discussion, rather the other way around. > > > > > > If anyone feels nervous about this step, I would be happy to > > > document > > > that we will not include it in the next release unless the NEP is > > > accepted first, or at least hide it behind an environment variable. > > > > > > Cheers, > > > > > > Sebastian > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at python.org > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From peter at entschev.com Thu Aug 13 06:56:26 2020 From: peter at entschev.com (Peter Andreas Entschev) Date: Thu, 13 Aug 2020 12:56:26 +0200 Subject: [Numpy-discussion] Experimental `like=` attribute for array creation functions In-Reply-To: <96330BE4-1CA2-4451-8FE5-357CFA7E4EDC@fastmail.com> References: <7ca98625-53ea-47cd-a027-d9c902742fed@Canary> <9d9ad7a26241564ec3f14866accfe840b226e1dc.camel@sipsolutions.net> <96330BE4-1CA2-4451-8FE5-357CFA7E4EDC@fastmail.com> Message-ID: > I am not sure adding a new keyword to an already confusing function is the right thing to do. Could you clarify what is the confusing function in question? > This is already a very (I mean extremely very) easy keyword name to confuse with ones_like, zeros_like and by its nature any other interpretation. To be fair, the usage is the same. Therefore empty_like(downstream_array, ...) and empty(downstream_array, ..., like=downstream_array) should have the exact same behavior, which is arguably redundant now. > It is not signalling anything about the functionality that is being discussed. I would seriously consider reserving such obvious names for really obvious tasks. Because you would also expect the shape and ndim would be mimicked by the "like"d argument but it turns out it is acting more like "typeof=" and not "like=" at all. 
I understand this can be confusing, and naming was one of the hardest discussions as there's no clear unambiguous name to use for this keyword, "like=" was simply the name that got closer to converging during discussions. At the same time I think "typeof=" is perhaps a better name than "like=", it could be very much confusing with "dtype=", and that would possibly just shift the confusion. > Again, if this is meant for downstream libraries (because that's what I got out of the PR discussion, cupy, dask, and JAX were the only examples I could read) then hiding it in another function and writing with capital letters "this is not meant for numpy users" would be a much more convenient way to separate the target audience and regular users. The problem with this approach is that the __array_function__ protocol relies on downstream libraries implementing functions with the same signature (for example, Dask and CuPy both implement an "array" function that matches NumPy). The purpose of __array_function__ and NEP-35 is to introduce only minimal changes to both NumPy's API and downstream libraries. Of course adding new functions for such cases would work, but IMO it would defeat the purpose of __array_function__ in general as it would require a considerable amount of work in downstream libraries, and we discussed this previously deciding that an argument is better than many new functions [1]. > I think, arriving to an agreement would be much faster if there is an executive summary of who this is intended for and what the regular usage is. Because with no offense, all I see is "dispatch", "_array_function_" and a lot of technical details of which I am absolutely ignorant. This is what I intended to do in the Usage Guidance [2] section. Could you elaborate on what more information you'd want to see there? Or is it just a matter of reorganizing the NEP a bit to try and summarize such things right at the top? > Finally as a minor point, I know we are mostly (ex-)academics but this necessity of formal language on NEPs is self-imposed (probably PEPs are to blame) and not quite helping. It can be a bit more descriptive in my external opinion. TBH, I don't really know how to solve that point, so if you have any specific suggestions, that's certainly welcome. I understand the frustration for a reader trying to understand all the details, with many being only described in NEP-18 [3], but we also strive to avoid rewriting things that are written elsewhere, which would also overburden those who are aware of what's being discussed. > I've generally been on the 'let the NumPy devs worry about it' side of things, but I do agree with Ilhan that `like=` is confusing and `typeof=` would be a much more appropriate name for that parameter. To be clear, I have no strong opinion on renaming it, I'm fine either way but I think it's unrealistic to expect that we find somewhat short, unambiguous and properly descriptive names in a single name. If the preference now shifts towards the "typeof=" name, we can change it, but "like=" was really named after "empty_like" and similar functions. > I do think library writers are NumPy users and so I wouldn't really make that distinction, though. Users writing their own analysis code could very well be interested in writing code using numpy functions that will transparently work when the input is a CuPy array or whatever.
I'm guessing this is somewhat of a loose definition of "library", to some extent if you really need "like=" it means that you're writing your own functions around the NumPy API (and that IMO is a library, even if you call it something else), rather than just writing your application on top of the existing NumPy API. I'm also happy to rephrase that in the NEP if people feel it should be done. > I also share Ilhan?s concern (and I mentioned this in a previous NEP discussion) that NEPs are getting pretty inaccessible. In a sense these are difficult topics and readers should be expected to have *some* familiarity with the topics being discussed, but perhaps more effort should be put into the context/motivation/background of a NEP before accepting it. One way to ensure this might be to require a final proofreading step by someone who has not been involved at all in the discussions, like peer review does for papers. This is a good point, and we do always notify people over the mailing list of new NEPs as per NEP-0 [4], which was done for NEP-35 [5] (originally NEP-33, but renamed due to other open NEPs at that time), unfortunately not many concerns were raised about that back then. Best, Peter [1] https://github.com/numpy/numpy/issues/14441#issuecomment-529969572 [2] https://numpy.org/neps/nep-0035-array-creation-dispatch-with-array-function.html#usage-guidance [3] https://numpy.org/neps/nep-0018-array-function-protocol.html [4] https://numpy.org/neps/nep-0000.html#nep-workflow [5] https://mail.python.org/pipermail/numpy-discussion/2019-October/080176.html On Thu, Aug 13, 2020 at 3:44 AM Juan Nunez-Iglesias wrote: > > I?ve generally been on the ?let the NumPy devs worry about it? side of things, but I do agree with Ilhan that `like=` is confusing and `typeof=` would be a much more appropriate name for that parameter. > > I do think library writers are NumPy users and so I wouldn?t really make that distinction, though. Users writing their own analysis code could very well be interested in writing code using numpy functions that will transparently work when the input is a CuPy array or whatever. > > I also share Ilhan?s concern (and I mentioned this in a previous NEP discussion) that NEPs are getting pretty inaccessible. In a sense these are difficult topics and readers should be expected to have *some* familiarity with the topics being discussed, but perhaps more effort should be put into the context/motivation/background of a NEP before accepting it. One way to ensure this might be to require a final proofreading step by someone who has not been involved at all in the discussions, like peer review does for papers. > > Food for thought. > > Juan. > > On 13 Aug 2020, at 9:24 am, Ilhan Polat wrote: > > For what is worth, as a potential consumer in SciPy, it really doesn't say anything (both in NEP and the PR) about how the regular users of NumPy will benefit from this. If only and only 3rd parties are going to benefit from it, I am not sure adding a new keyword to an already confusing function is the right thing to do. > > Let me clarify, > > - This is already a very (I mean extremely very) easy keyword name to confuse with ones_like, zeros_like and by its nature any other interpretation. It is not signalling anything about the functionality that is being discussed. I would seriously consider reserving such obvious names for really obvious tasks. 
Because you would also expect the shape and ndim would be mimicked by the "like"d argument but it turns out it is acting more like "typeof=" and not "like=" at all. Because if we follow the semantics it reads as "make your argument asarray like the other thing" but it is actually doing, "make your argument an array with the other thing's type" which might not be an array after all. > > - Again, if this is meant for downstream libraries (because that's what I got out of the PR discussion, cupy, dask, and JAX were the only examples I could read) then hiding it in another function and writing with capital letters "this is not meant for numpy users" would be a much more convenient way to separate the target audience and regular users. numpy.astypedarray([[some data], [...]], type_of=x) or whatever else it may be would be quite clean and to the point with no ambiguous keywords. > > I think, arriving to an agreement would be much faster if there is an executive summary of who this is intended for and what the regular usage is. Because with no offense, all I see is "dispatch", "_array_function_" and a lot of technical details of which I am absolutely ignorant. > > Finally as a minor point, I know we are mostly (ex-)academics but this necessity of formal language on NEPs is self-imposed (probably PEPs are to blame) and not quite helping. It can be a bit more descriptive in my external opinion. > > best, > ilhan > > > > > > > > On Tue, Aug 11, 2020 at 12:18 AM Ralf Gommers wrote: >> >> >> >> On Mon, Aug 10, 2020 at 8:37 PM Sebastian Berg wrote: >>> >>> On Mon, 2020-08-10 at 17:35 +0200, Hameer Abbasi wrote: >>> > Hi, >>> > >>> > We should have a higher-bandwidth meeting/communication for all >>> > stakeholders, and particularly some library authors, to see what >>> > would be good for them. >> >> >> I'm not sure that helps. At this point there's little progress since the last meeting, I think the plan is unchanged: we need implementations of all the options on offer, and then try them out in PRs for scikit-learn, SciPy and perhaps another package who's maintainers are interested, to test like=, __array_module__ in realistic situations. >> >> >>> > >>> > We should definitely have language in the NEP that says it won?t be >>> > in a release unless the NEP is accepted. >>> >>> In that case, I think the important part is to have language right now >>> in the implementation, although that can refer to the NEP itself of >>> course. >>> You can't expect everyone who may be tempted to use it to actually read >>> the NEP draft, at least not without pointing it out. >> >> >> Agreed, I think the decision is on this list not in the NEP, and to make sure we won't forget we need an issue opened with the 1.20 milestone. >> >> Cheers, >> Ralf >> >>> >>> I will say that I think it is not very high risk, because I think >>> annoying or not, the argument could be deprecated again with a >>> transition short phase. Admittedly, that argument only works if we have >>> a replacement solution. 
>>> >>> Cheers, >>> >>> Sebastian >>> >>> >>> > >>> > Best regards, >>> > Hameer Abbasi >>> > >>> > -- >>> > Sent from Canary (https://canarymail.io) >>> > >>> > > On Monday, Aug 10, 2020 at 5:31 PM, Sebastian Berg < >>> > > sebastian at sipsolutions.net (mailto:sebastian at sipsolutions.net)> >>> > > wrote: >>> > > Hi all, >>> > > >>> > > as a heads up that Peter Entschev has a PR open to add `like=` to >>> > > most array creation functions, my current plan is to merge it soon >>> > > as a preliminary API and bring it up again before the actual >>> > > release (in a few months). This allows overriding for array-likes, >>> > > e.g. it will allow: >>> > > >>> > > >>> > > arr = np.asarray([3], like=dask_array) >>> > > type(arr) is dask.array.Array >>> > > >>> > > This was proposed in NEP 35: >>> > > >>> > > https://numpy.org/neps/nep-0035-array-creation-dispatch-with-array-function.html >>> > > >>> > > Although that has not been accepted as of now, the PR is: >>> > > >>> > > https://github.com/numpy/numpy/pull/16935 >>> > > >>> > > >>> > > This was discussed in a smaller group, and is an attempt to see how >>> > > we >>> > > can make the array-function protocol viable to allow packages such >>> > > as >>> > > sklearn to work with non-NumPy arrays. >>> > > >>> > > As of now, this would be experimental and can revisit it before the >>> > > actual NumPy release. We should probably discuss accepting NEP 35 >>> > > more. At this time, I hope that we can put in the functionality to >>> > > facilitate this discussion, rather the other way around. >>> > > >>> > > If anyone feels nervous about this step, I would be happy to >>> > > document >>> > > that we will not include it in the next release unless the NEP is >>> > > accepted first, or at least hide it behind an environment variable. >>> > > >>> > > Cheers, >>> > > >>> > > Sebastian >>> > > >>> > > _______________________________________________ >>> > > NumPy-Discussion mailing list >>> > > NumPy-Discussion at python.org >>> > > https://mail.python.org/mailman/listinfo/numpy-discussion >>> > >>> > _______________________________________________ >>> > NumPy-Discussion mailing list >>> > NumPy-Discussion at python.org >>> > https://mail.python.org/mailman/listinfo/numpy-discussion >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at python.org >>> https://mail.python.org/mailman/listinfo/numpy-discussion >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion From ralf.gommers at gmail.com Thu Aug 13 08:21:56 2020 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Thu, 13 Aug 2020 13:21:56 +0100 Subject: [Numpy-discussion] Experimental `like=` attribute for array creation functions In-Reply-To: References: <7ca98625-53ea-47cd-a027-d9c902742fed@Canary> <9d9ad7a26241564ec3f14866accfe840b226e1dc.camel@sipsolutions.net> <96330BE4-1CA2-4451-8FE5-357CFA7E4EDC@fastmail.com> Message-ID: Thanks for raising these concerns Ilhan and Juan, and for answering Peter. Let me give my perspective as well. 
To start with, this is not specifically about Peter's NEP and PR. NEP 35 simply follows the pattern set by previous PRs, and given its tight scope is less difficult to understand than other NEPs on such technical topics. Peter has done a lot of things right, and is close to the finish line. On Thu, Aug 13, 2020 at 12:02 PM Peter Andreas Entschev wrote: > > > I think, arriving to an agreement would be much faster if there is an > executive summary of who this is intended for and what the regular usage > is. Because with no offense, all I see is "dispatch", "_array_function_" > and a lot of technical details of which I am absolutely ignorant. > > This is what I intended to do in the Usage Guidance [2] section. Could > you elaborate on what more information you'd want to see there? Or is > it just a matter of reorganizing the NEP a bit to try and summarize > such things right at the top? > We adapted the NEP template [6] several times last year to try and improve this. And specified in there as well that NEP content set to the mailing list should only contain the sections: Abstract, Motivation and Scope, Usage and Impact, and Backwards compatibility. This to ensure we fully understand the "why" and "what" before the "how". Unfortunately that template and procedure hasn't been exercised much yet, only in NEP 38 [7] and partially in NEP 41 [8]. If we have long-time maintainers of SciPy (Ilhan and myself), scikit-image (Juan) and CuPy (Leo, on the PR review) all saying they don't understand the goals, relevance, target audience, or how they're supposed to use a new feature, that indicates that the people doing the writing and having the discussion are doing something wrong at a very fundamental level. At this point I'm pretty disappointed in and tired of how we write and discuss NEPs on technical topics like dispatching, dtypes and the like. People literally refuse to write down concrete motivations, goals and non-goals, code that's problematic now and will be better/working post-NEP and usage examples before launching into extensive discussion of the gory details of the internals. I'm not sure what to do about it. Completely separate API and behavior proposals from implementation proposals? Make separate "API" and "internals" teams with the likes of Juan, Ilhan and Leo on the API team which then needs to approve every API change in new NEPs? Offer to co-write NEPs if someone is willing but doesn't understand how to go about it? Keep the current structure/process but veto further approvals until NEP authors get it right? I want to make an exception for merging the current NEP, for which the plan is to merge it as experimental to try in downstream PRs and get more experience. That does mean that master will be in an unreleasable state by the way, which is unusual and it'd be nice to get Chuck's explicit OK for that. But after that, I think we need a change here. I would like to hear what everyone thinks is the shape that change should take - any of my above suggestions, or something else? > > Finally as a minor point, I know we are mostly (ex-)academics but this > necessity of formal language on NEPs is self-imposed (probably PEPs are to > blame) and not quite helping. It can be a bit more descriptive in my > external opinion. > > TBH, I don't really know how to solve that point, so if you have any > specific suggestions, that's certainly welcome. 
I understand the > frustration for a reader trying to understand all the details, with > many being only described in NEP-18 [3], but we also strive to avoid > rewriting things that are written elsewhere, which would also > overburden those who are aware of what's being discussed. > > > > I also share Ilhan?s concern (and I mentioned this in a previous NEP > discussion) that NEPs are getting pretty inaccessible. In a sense these are > difficult topics and readers should be expected to have *some* familiarity > with the topics being discussed, but perhaps more effort should be put into > the context/motivation/background of a NEP before accepting it. One way to > ensure this might be to require a final proofreading step by someone who > has not been involved at all in the discussions, like peer review does for > papers. > Some variant of this proposal would be my preference. Cheers, Ralf > [1] https://github.com/numpy/numpy/issues/14441#issuecomment-529969572 > [2] > https://numpy.org/neps/nep-0035-array-creation-dispatch-with-array-function.html#usage-guidance > [3] https://numpy.org/neps/nep-0018-array-function-protocol.html > [4] https://numpy.org/neps/nep-0000.html#nep-workflow > [5] > https://mail.python.org/pipermail/numpy-discussion/2019-October/080176.html [6] https://github.com/numpy/numpy/blob/master/doc/neps/nep-template.rst [7] https://github.com/numpy/numpy/blob/master/doc/neps/nep-0038-SIMD-optimizations.rst [8] https://github.com/numpy/numpy/blob/master/doc/neps/nep-0041-improved-dtype-support.rst > > > On Thu, Aug 13, 2020 at 3:44 AM Juan Nunez-Iglesias > wrote: > > > > I?ve generally been on the ?let the NumPy devs worry about it? side of > things, but I do agree with Ilhan that `like=` is confusing and `typeof=` > would be a much more appropriate name for that parameter. > > > > I do think library writers are NumPy users and so I wouldn?t really make > that distinction, though. Users writing their own analysis code could very > well be interested in writing code using numpy functions that will > transparently work when the input is a CuPy array or whatever. > > > > I also share Ilhan?s concern (and I mentioned this in a previous NEP > discussion) that NEPs are getting pretty inaccessible. In a sense these are > difficult topics and readers should be expected to have *some* familiarity > with the topics being discussed, but perhaps more effort should be put into > the context/motivation/background of a NEP before accepting it. One way to > ensure this might be to require a final proofreading step by someone who > has not been involved at all in the discussions, like peer review does for > papers. > > > > Food for thought. > > > > Juan. > > > > On 13 Aug 2020, at 9:24 am, Ilhan Polat wrote: > > > > For what is worth, as a potential consumer in SciPy, it really doesn't > say anything (both in NEP and the PR) about how the regular users of NumPy > will benefit from this. If only and only 3rd parties are going to benefit > from it, I am not sure adding a new keyword to an already confusing > function is the right thing to do. > > > > Let me clarify, > > > > - This is already a very (I mean extremely very) easy keyword name to > confuse with ones_like, zeros_like and by its nature any other > interpretation. It is not signalling anything about the functionality that > is being discussed. I would seriously consider reserving such obvious names > for really obvious tasks. 
Because you would also expect the shape and ndim > would be mimicked by the "like"d argument but it turns out it is acting > more like "typeof=" and not "like=" at all. Because if we follow the > semantics it reads as "make your argument asarray like the other thing" but > it is actually doing, "make your argument an array with the other thing's > type" which might not be an array after all. > > > > - Again, if this is meant for downstream libraries (because that's what > I got out of the PR discussion, cupy, dask, and JAX were the only examples > I could read) then hiding it in another function and writing with capital > letters "this is not meant for numpy users" would be a much more convenient > way to separate the target audience and regular users. > numpy.astypedarray([[some data], [...]], type_of=x) or whatever else it may > be would be quite clean and to the point with no ambiguous keywords. > > > > I think, arriving to an agreement would be much faster if there is an > executive summary of who this is intended for and what the regular usage > is. Because with no offense, all I see is "dispatch", "_array_function_" > and a lot of technical details of which I am absolutely ignorant. > > > > Finally as a minor point, I know we are mostly (ex-)academics but this > necessity of formal language on NEPs is self-imposed (probably PEPs are to > blame) and not quite helping. It can be a bit more descriptive in my > external opinion. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ilhanpolat at gmail.com Thu Aug 13 09:43:56 2020 From: ilhanpolat at gmail.com (Ilhan Polat) Date: Thu, 13 Aug 2020 15:43:56 +0200 Subject: [Numpy-discussion] Experimental `like=` attribute for array creation functions In-Reply-To: References: <7ca98625-53ea-47cd-a027-d9c902742fed@Canary> <9d9ad7a26241564ec3f14866accfe840b226e1dc.camel@sipsolutions.net> <96330BE4-1CA2-4451-8FE5-357CFA7E4EDC@fastmail.com> Message-ID: To maybe lighten up the discussion a bit and to make my outsider confusion more tangible, let me start by apologizing for diving head first without weighing the past luggage :-) I always forget how much effort goes into these things and for outsiders like me, it's a matter of dipping the finger and tasting it just before starting to complain how much salt is missing etc. What I was mentioning about NEPs wasn't only related specifically to this one by the way. It's the generic feeling that I have. First let me start what I mean by NumPy users and downstreamers distinction. This is very much related to how data-science and huge-array users are magnetizing every tool out there in the Python world which is fine though the majority of number-crunchers have nothing to do with any of GPU/Parallelism/ClusterUsage etc. Hence when I mention NumPy users, think of people who use NumPy as its own right with no duck-typing and nothing related to subclassing. Just straightforward array creation and lots of ops on these arrays. For those people (I'm one of them), this option brings in a keyword that we would never use. And it gets into many major functions (linspace and others mentioned somewhere). So it has a very appealing name but has nothing to do with me in an already very crowded namespace and keyword catalogue. That's basically a UX issue to be addressed (under the assumption that users like me are the majority). Either making its name as esoteric as possible so I naturally stay away from it or I don't see it. 
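To make that concrete with a tiny sketch (the dask part is only my reading of the proposal, borrowing Sebastian's earlier example, so treat it as illustrative rather than the definitive behaviour):

    import numpy as np

    # The plain NumPy user: array creation as always, `like=` never enters the picture.
    x = np.linspace(0.0, 1.0, 101)
    y = np.zeros((3, 3))

    # The duck-array user that NEP 35 targets (hypothetical here, this is what the PR would enable):
    # import dask.array as da
    # d = da.ones(10)
    # arr = np.asarray([3], like=d)  # would come back as a dask.array.Array, not an ndarray

So the first group only sees one more keyword in already crowded signatures, which is the UX point I am trying to make.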
This has absolutely nothing to do with looking down on the downstream libraries. They are flat-out amazing and the more we can support them the merrier. Using yet another metaphor, I was hoping that NumPy would have a loading dock for heavy duty deliveries for downstream projects or specialized array creations and won't disturb the regular customer entrance. Because if I look at this page https://numpy.org/doc/stable/referenc/routines.array-creation.html, there are a lot of functions and I think most of them are candidates to gain this keyword. I wish I can comment on a viable alternative but I really cannot understand the _array_xxxx_ discussions since they fly way over my head no matter how many times I tried. So that's why I naively mentioned the "np.astypedarray" or "np.asarray_but_not_numpy_array" or whatever. Now I see that it is even more complicated and I generated extra noise. So you can just ignore my previous suggestions. Except that I want to draw attention to the UX problem and I'd like to leave it at that. The other point is about the NEP stuff. I think I need to elaborate. If the NEPs are meant for internal NumPy discussions, then by all means, crank up the pointer*-meter to 11 and dive into it, totally fine with me. But if you also want to get feedback from outside, then probably a few lines of code examples for mere mortals would go a long way. Also it would make the discussion much more streamlined in my humble opinion. What I was trying to get at was that almost all NEPs read like a legal document that I want to agree as soon as possible. Because they often come without any or minimal amount of code in it. In NEP35 for example, there are nice code blocks in function dispatching but I guess it's not meant for me. Because it is only decorating asarray with some black magic happening there somehow (I guess). So I can't even comprehend what the proposition would mean for the regular, friendly, anti-duck users. But I am pretty sure it is about dispatching something because the word is repeated ~20 times :-) Thus the feedback would be limited. That was also what I meant there. But again I totally understand the complexity of these issues. So I'm not expecting to understand all details of NumPy machinery in a single NEP. But anyways, hope this clarifies a few things that I failed to convey in my previous mail. ilhan On Thu, Aug 13, 2020 at 2:23 PM Ralf Gommers wrote: > Thanks for raising these concerns Ilhan and Juan, and for answering Peter. > Let me give my perspective as well. > > To start with, this is not specifically about Peter's NEP and PR. NEP 35 > simply follows the pattern set by previous PRs, and given its tight scope > is less difficult to understand than other NEPs on such technical topics. > Peter has done a lot of things right, and is close to the finish line. > > > On Thu, Aug 13, 2020 at 12:02 PM Peter Andreas Entschev < > peter at entschev.com> wrote: > >> >> > I think, arriving to an agreement would be much faster if there is an >> executive summary of who this is intended for and what the regular usage >> is. Because with no offense, all I see is "dispatch", "_array_function_" >> and a lot of technical details of which I am absolutely ignorant. >> >> This is what I intended to do in the Usage Guidance [2] section. Could >> you elaborate on what more information you'd want to see there? Or is >> it just a matter of reorganizing the NEP a bit to try and summarize >> such things right at the top? 
>> > > We adapted the NEP template [6] several times last year to try and improve > this. And specified in there as well that NEP content set to the mailing > list should only contain the sections: Abstract, Motivation and Scope, > Usage and Impact, and Backwards compatibility. This to ensure we fully > understand the "why" and "what" before the "how". Unfortunately that > template and procedure hasn't been exercised much yet, only in NEP 38 [7] > and partially in NEP 41 [8]. > > If we have long-time maintainers of SciPy (Ilhan and myself), scikit-image > (Juan) and CuPy (Leo, on the PR review) all saying they don't understand > the goals, relevance, target audience, or how they're supposed to use a new > feature, that indicates that the people doing the writing and having the > discussion are doing something wrong at a very fundamental level. > > At this point I'm pretty disappointed in and tired of how we write and > discuss NEPs on technical topics like dispatching, dtypes and the like. > People literally refuse to write down concrete motivations, goals and > non-goals, code that's problematic now and will be better/working post-NEP > and usage examples before launching into extensive discussion of the gory > details of the internals. I'm not sure what to do about it. Completely > separate API and behavior proposals from implementation proposals? Make > separate "API" and "internals" teams with the likes of Juan, Ilhan and Leo > on the API team which then needs to approve every API change in new NEPs? > Offer to co-write NEPs if someone is willing but doesn't understand how to > go about it? Keep the current structure/process but veto further approvals > until NEP authors get it right? > > I want to make an exception for merging the current NEP, for which the > plan is to merge it as experimental to try in downstream PRs and get more > experience. That does mean that master will be in an unreleasable state by > the way, which is unusual and it'd be nice to get Chuck's explicit OK for > that. But after that, I think we need a change here. I would like to hear > what everyone thinks is the shape that change should take - any of my above > suggestions, or something else? > > > >> > Finally as a minor point, I know we are mostly (ex-)academics but this >> necessity of formal language on NEPs is self-imposed (probably PEPs are to >> blame) and not quite helping. It can be a bit more descriptive in my >> external opinion. >> >> TBH, I don't really know how to solve that point, so if you have any >> specific suggestions, that's certainly welcome. I understand the >> frustration for a reader trying to understand all the details, with >> many being only described in NEP-18 [3], but we also strive to avoid >> rewriting things that are written elsewhere, which would also >> overburden those who are aware of what's being discussed. >> >> >> > I also share Ilhan?s concern (and I mentioned this in a previous NEP >> discussion) that NEPs are getting pretty inaccessible. In a sense these are >> difficult topics and readers should be expected to have *some* familiarity >> with the topics being discussed, but perhaps more effort should be put into >> the context/motivation/background of a NEP before accepting it. One way to >> ensure this might be to require a final proofreading step by someone who >> has not been involved at all in the discussions, like peer review does for >> papers. >> > > Some variant of this proposal would be my preference. 
> > Cheers, > Ralf > > >> [1] https://github.com/numpy/numpy/issues/14441#issuecomment-529969572 >> [2] >> https://numpy.org/neps/nep-0035-array-creation-dispatch-with-array-function.html#usage-guidance >> [3] https://numpy.org/neps/nep-0018-array-function-protocol.html >> [4] https://numpy.org/neps/nep-0000.html#nep-workflow >> [5] >> https://mail.python.org/pipermail/numpy-discussion/2019-October/080176.html > > > [6] https://github.com/numpy/numpy/blob/master/doc/neps/nep-template.rst > [7] > https://github.com/numpy/numpy/blob/master/doc/neps/nep-0038-SIMD-optimizations.rst > [8] > https://github.com/numpy/numpy/blob/master/doc/neps/nep-0041-improved-dtype-support.rst > > > >> >> >> On Thu, Aug 13, 2020 at 3:44 AM Juan Nunez-Iglesias >> wrote: >> > >> > I?ve generally been on the ?let the NumPy devs worry about it? side of >> things, but I do agree with Ilhan that `like=` is confusing and `typeof=` >> would be a much more appropriate name for that parameter. >> > >> > I do think library writers are NumPy users and so I wouldn?t really >> make that distinction, though. Users writing their own analysis code could >> very well be interested in writing code using numpy functions that will >> transparently work when the input is a CuPy array or whatever. >> > >> > I also share Ilhan?s concern (and I mentioned this in a previous NEP >> discussion) that NEPs are getting pretty inaccessible. In a sense these are >> difficult topics and readers should be expected to have *some* familiarity >> with the topics being discussed, but perhaps more effort should be put into >> the context/motivation/background of a NEP before accepting it. One way to >> ensure this might be to require a final proofreading step by someone who >> has not been involved at all in the discussions, like peer review does for >> papers. >> > >> > Food for thought. >> > >> > Juan. >> > >> > On 13 Aug 2020, at 9:24 am, Ilhan Polat wrote: >> > >> > For what is worth, as a potential consumer in SciPy, it really doesn't >> say anything (both in NEP and the PR) about how the regular users of NumPy >> will benefit from this. If only and only 3rd parties are going to benefit >> from it, I am not sure adding a new keyword to an already confusing >> function is the right thing to do. >> > >> > Let me clarify, >> > >> > - This is already a very (I mean extremely very) easy keyword name to >> confuse with ones_like, zeros_like and by its nature any other >> interpretation. It is not signalling anything about the functionality that >> is being discussed. I would seriously consider reserving such obvious names >> for really obvious tasks. Because you would also expect the shape and ndim >> would be mimicked by the "like"d argument but it turns out it is acting >> more like "typeof=" and not "like=" at all. Because if we follow the >> semantics it reads as "make your argument asarray like the other thing" but >> it is actually doing, "make your argument an array with the other thing's >> type" which might not be an array after all. >> > >> > - Again, if this is meant for downstream libraries (because that's what >> I got out of the PR discussion, cupy, dask, and JAX were the only examples >> I could read) then hiding it in another function and writing with capital >> letters "this is not meant for numpy users" would be a much more convenient >> way to separate the target audience and regular users. 
>> numpy.astypedarray([[some data], [...]], type_of=x) or whatever else it may >> be would be quite clean and to the point with no ambiguous keywords. >> > >> > I think, arriving to an agreement would be much faster if there is an >> executive summary of who this is intended for and what the regular usage >> is. Because with no offense, all I see is "dispatch", "_array_function_" >> and a lot of technical details of which I am absolutely ignorant. >> > >> > Finally as a minor point, I know we are mostly (ex-)academics but this >> necessity of formal language on NEPs is self-imposed (probably PEPs are to >> blame) and not quite helping. It can be a bit more descriptive in my >> external opinion. >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From peter at entschev.com Thu Aug 13 09:47:02 2020 From: peter at entschev.com (Peter Andreas Entschev) Date: Thu, 13 Aug 2020 15:47:02 +0200 Subject: [Numpy-discussion] Experimental `like=` attribute for array creation functions In-Reply-To: References: <7ca98625-53ea-47cd-a027-d9c902742fed@Canary> <9d9ad7a26241564ec3f14866accfe840b226e1dc.camel@sipsolutions.net> <96330BE4-1CA2-4451-8FE5-357CFA7E4EDC@fastmail.com> Message-ID: > We adapted the NEP template [6] several times last year to try and improve this. And specified in there as well that NEP content set to the mailing list should only contain the sections: Abstract, Motivation and Scope, Usage and Impact, and Backwards compatibility. This to ensure we fully understand the "why" and "what" before the "how". Unfortunately that template and procedure hasn't been exercised much yet, only in NEP 38 [7] and partially in NEP 41 [8]. > > If we have long-time maintainers of SciPy (Ilhan and myself), scikit-image (Juan) and CuPy (Leo, on the PR review) all saying they don't understand the goals, relevance, target audience, or how they're supposed to use a new feature, that indicates that the people doing the writing and having the discussion are doing something wrong at a very fundamental level. I'm more than happy to edit the NEP and try to clarify all the concerns. However, it gets pretty difficult to do so when I as an author don't understand where the difficulty is. Ilhan, Juan and Ralf now pointed out things that are missing/unclear, but no comment was made in that regard when I sent the NEP, my point being: I couldn't fix what I didn't know was a problem to others. > At this point I'm pretty disappointed in and tired of how we write and discuss NEPs on technical topics like dispatching, dtypes and the like. People literally refuse to write down concrete motivations, goals and non-goals, code that's problematic now and will be better/working post-NEP and usage examples before launching into extensive discussion of the gory details of the internals. I'm not sure what to do about it. Honestly, I don't really understand this. From my perspective, there are two ways to deal with such things: 1. Templates are to be taken mainly as _guidelines_ rather than _hardlines_, and the current text of NEP-35 definitely falls in the first category; 2. Templates are _hardlines_ and to be guided/enforced by maintainers at some point (maybe before merging the PR?). 
If 2 is the desired case for NumPy, which sounds a lot like what is wanted from NEP-35 and other NEPs generally, maintainers should let the authors know as early as possible that something isn't following the template's hardlines and it should be corrected. I don't mean any of this to remove myself of any responsibility, but would like to express my frustration that a 10 month-old NEP is only now getting so much pushback for being unclear after its implementation is nearing completion. > I want to make an exception for merging the current NEP, for which the plan is to merge it as experimental to try in downstream PRs and get more experience. That does mean that master will be in an unreleasable state by the way, which is unusual and it'd be nice to get Chuck's explicit OK for that. I don't quite understand this either, why would that leave master in an unreleasable state? Best, Peter On Thu, Aug 13, 2020 at 2:21 PM Ralf Gommers wrote: > > Thanks for raising these concerns Ilhan and Juan, and for answering Peter. Let me give my perspective as well. > > To start with, this is not specifically about Peter's NEP and PR. NEP 35 simply follows the pattern set by previous PRs, and given its tight scope is less difficult to understand than other NEPs on such technical topics. Peter has done a lot of things right, and is close to the finish line. > > > On Thu, Aug 13, 2020 at 12:02 PM Peter Andreas Entschev wrote: >> >> >> > I think, arriving to an agreement would be much faster if there is an executive summary of who this is intended for and what the regular usage is. Because with no offense, all I see is "dispatch", "_array_function_" and a lot of technical details of which I am absolutely ignorant. >> >> This is what I intended to do in the Usage Guidance [2] section. Could >> you elaborate on what more information you'd want to see there? Or is >> it just a matter of reorganizing the NEP a bit to try and summarize >> such things right at the top? > > > We adapted the NEP template [6] several times last year to try and improve this. And specified in there as well that NEP content set to the mailing list should only contain the sections: Abstract, Motivation and Scope, Usage and Impact, and Backwards compatibility. This to ensure we fully understand the "why" and "what" before the "how". Unfortunately that template and procedure hasn't been exercised much yet, only in NEP 38 [7] and partially in NEP 41 [8]. > > If we have long-time maintainers of SciPy (Ilhan and myself), scikit-image (Juan) and CuPy (Leo, on the PR review) all saying they don't understand the goals, relevance, target audience, or how they're supposed to use a new feature, that indicates that the people doing the writing and having the discussion are doing something wrong at a very fundamental level. > > At this point I'm pretty disappointed in and tired of how we write and discuss NEPs on technical topics like dispatching, dtypes and the like. People literally refuse to write down concrete motivations, goals and non-goals, code that's problematic now and will be better/working post-NEP and usage examples before launching into extensive discussion of the gory details of the internals. I'm not sure what to do about it. Completely separate API and behavior proposals from implementation proposals? Make separate "API" and "internals" teams with the likes of Juan, Ilhan and Leo on the API team which then needs to approve every API change in new NEPs? 
Offer to co-write NEPs if someone is willing but doesn't understand how to go about it? Keep the current structure/process but veto further approvals until NEP authors get it right? > > I want to make an exception for merging the current NEP, for which the plan is to merge it as experimental to try in downstream PRs and get more experience. That does mean that master will be in an unreleasable state by the way, which is unusual and it'd be nice to get Chuck's explicit OK for that. But after that, I think we need a change here. I would like to hear what everyone thinks is the shape that change should take - any of my above suggestions, or something else? > > >> >> > Finally as a minor point, I know we are mostly (ex-)academics but this necessity of formal language on NEPs is self-imposed (probably PEPs are to blame) and not quite helping. It can be a bit more descriptive in my external opinion. >> >> TBH, I don't really know how to solve that point, so if you have any >> specific suggestions, that's certainly welcome. I understand the >> frustration for a reader trying to understand all the details, with >> many being only described in NEP-18 [3], but we also strive to avoid >> rewriting things that are written elsewhere, which would also >> overburden those who are aware of what's being discussed. >> >> >> > I also share Ilhan?s concern (and I mentioned this in a previous NEP discussion) that NEPs are getting pretty inaccessible. In a sense these are difficult topics and readers should be expected to have *some* familiarity with the topics being discussed, but perhaps more effort should be put into the context/motivation/background of a NEP before accepting it. One way to ensure this might be to require a final proofreading step by someone who has not been involved at all in the discussions, like peer review does for papers. > > > Some variant of this proposal would be my preference. > > Cheers, > Ralf > >> >> [1] https://github.com/numpy/numpy/issues/14441#issuecomment-529969572 >> [2] https://numpy.org/neps/nep-0035-array-creation-dispatch-with-array-function.html#usage-guidance >> [3] https://numpy.org/neps/nep-0018-array-function-protocol.html >> [4] https://numpy.org/neps/nep-0000.html#nep-workflow >> [5] https://mail.python.org/pipermail/numpy-discussion/2019-October/080176.html > > > [6] https://github.com/numpy/numpy/blob/master/doc/neps/nep-template.rst > [7] https://github.com/numpy/numpy/blob/master/doc/neps/nep-0038-SIMD-optimizations.rst > [8] https://github.com/numpy/numpy/blob/master/doc/neps/nep-0041-improved-dtype-support.rst > > >> >> >> >> On Thu, Aug 13, 2020 at 3:44 AM Juan Nunez-Iglesias wrote: >> > >> > I?ve generally been on the ?let the NumPy devs worry about it? side of things, but I do agree with Ilhan that `like=` is confusing and `typeof=` would be a much more appropriate name for that parameter. >> > >> > I do think library writers are NumPy users and so I wouldn?t really make that distinction, though. Users writing their own analysis code could very well be interested in writing code using numpy functions that will transparently work when the input is a CuPy array or whatever. >> > >> > I also share Ilhan?s concern (and I mentioned this in a previous NEP discussion) that NEPs are getting pretty inaccessible. In a sense these are difficult topics and readers should be expected to have *some* familiarity with the topics being discussed, but perhaps more effort should be put into the context/motivation/background of a NEP before accepting it. 
One way to ensure this might be to require a final proofreading step by someone who has not been involved at all in the discussions, like peer review does for papers. >> > >> > Food for thought. >> > >> > Juan. >> > >> > On 13 Aug 2020, at 9:24 am, Ilhan Polat wrote: >> > >> > For what is worth, as a potential consumer in SciPy, it really doesn't say anything (both in NEP and the PR) about how the regular users of NumPy will benefit from this. If only and only 3rd parties are going to benefit from it, I am not sure adding a new keyword to an already confusing function is the right thing to do. >> > >> > Let me clarify, >> > >> > - This is already a very (I mean extremely very) easy keyword name to confuse with ones_like, zeros_like and by its nature any other interpretation. It is not signalling anything about the functionality that is being discussed. I would seriously consider reserving such obvious names for really obvious tasks. Because you would also expect the shape and ndim would be mimicked by the "like"d argument but it turns out it is acting more like "typeof=" and not "like=" at all. Because if we follow the semantics it reads as "make your argument asarray like the other thing" but it is actually doing, "make your argument an array with the other thing's type" which might not be an array after all. >> > >> > - Again, if this is meant for downstream libraries (because that's what I got out of the PR discussion, cupy, dask, and JAX were the only examples I could read) then hiding it in another function and writing with capital letters "this is not meant for numpy users" would be a much more convenient way to separate the target audience and regular users. numpy.astypedarray([[some data], [...]], type_of=x) or whatever else it may be would be quite clean and to the point with no ambiguous keywords. >> > >> > I think, arriving to an agreement would be much faster if there is an executive summary of who this is intended for and what the regular usage is. Because with no offense, all I see is "dispatch", "_array_function_" and a lot of technical details of which I am absolutely ignorant. >> > >> > Finally as a minor point, I know we are mostly (ex-)academics but this necessity of formal language on NEPs is self-imposed (probably PEPs are to blame) and not quite helping. It can be a bit more descriptive in my external opinion. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion From ralf.gommers at gmail.com Thu Aug 13 10:13:07 2020 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Thu, 13 Aug 2020 15:13:07 +0100 Subject: [Numpy-discussion] Experimental `like=` attribute for array creation functions In-Reply-To: References: <7ca98625-53ea-47cd-a027-d9c902742fed@Canary> <9d9ad7a26241564ec3f14866accfe840b226e1dc.camel@sipsolutions.net> <96330BE4-1CA2-4451-8FE5-357CFA7E4EDC@fastmail.com> Message-ID: On Thu, Aug 13, 2020 at 2:47 PM Peter Andreas Entschev wrote: > > We adapted the NEP template [6] several times last year to try and > improve this. And specified in there as well that NEP content set to the > mailing list should only contain the sections: Abstract, Motivation and > Scope, Usage and Impact, and Backwards compatibility. This to ensure we > fully understand the "why" and "what" before the "how". Unfortunately that > template and procedure hasn't been exercised much yet, only in NEP 38 [7] > and partially in NEP 41 [8]. 
> > > > If we have long-time maintainers of SciPy (Ilhan and myself), > scikit-image (Juan) and CuPy (Leo, on the PR review) all saying they don't > understand the goals, relevance, target audience, or how they're supposed > to use a new feature, that indicates that the people doing the writing and > having the discussion are doing something wrong at a very fundamental level. > > I'm more than happy to edit the NEP and try to clarify all the > concerns. Thanks Peter. Let me reiterate, you did a lot of things right, have been happy to adapt when given feedback, and your willingness to go back and fix things up now is much appreciated (and I'm happy to help). No criticism of your work or attitude intended, on the contrary. > However, it gets pretty difficult to do so when I as an > author don't understand where the difficulty is. Ilhan, Juan and Ralf > now pointed out things that are missing/unclear, but no comment was > made in that regard when I sent the NEP, my point being: I couldn't > fix what I didn't know was a problem to others. > Yes of course, I totally understand that. > > At this point I'm pretty disappointed in and tired of how we write and > discuss NEPs on technical topics like dispatching, dtypes and the like. > People literally refuse to write down concrete motivations, goals and > non-goals, code that's problematic now and will be better/working post-NEP > and usage examples before launching into extensive discussion of the gory > details of the internals. I'm not sure what to do about it. > > Honestly, I don't really understand this. From my perspective, there > are two ways to deal with such things: > > 1. Templates are to be taken mainly as _guidelines_ rather than > _hardlines_, and the current text of NEP-35 definitely falls in the > first category; > 2. Templates are _hardlines_ and to be guided/enforced by maintainers > at some point (maybe before merging the PR?). > > If 2 is the desired case for NumPy, which sounds a lot like what is > wanted from NEP-35 and other NEPs generally, maintainers should let > the authors know as early as possible that something isn't following > the template's hardlines and it should be corrected. Yes agreed, maintainers should do this. It was always meant as something in between, "please follow but deviate if needed". If essential elements are missing, I think that should be flagged earlier going forward. As a concrete example: Stephan (the main author of __array_function__) was still fuzzy on the functions covered and whether it solves array coercion, in the last 24 hours*. You answered by pointing to concrete code in Dask and Xarray. That code, why it doesn't work well now but will work with like=, should be at the top of the NEP as concrete problem statement / code examples. It's quite unfortunate that no maintainer explicitly requested this many months ago. * https://github.com/numpy/numpy/pull/16935#issuecomment-673379038 I don't mean any of this to remove myself of any responsibility, but would > like to > express my frustration that a 10 month-old NEP is only now getting so > much pushback for being unclear after its implementation is nearing > completion. > Totally understandable. I think part of the problem is that people only weigh in when they see concrete "this part is for you, and here's how you use it to solve problem X". As for me personally, if I'm saying things now that I didn't manage to respond to earlier (specific to your NEP), I apologize. 
10 months ago I was in the middle of an intercontinental move and a new-ish job getting busier fast. Again, apologies and no criticism of your work. > > > I want to make an exception for merging the current NEP, for which the > plan is to merge it as experimental to try in downstream PRs and get more > experience. That does mean that master will be in an unreleasable state by > the way, which is unusual and it'd be nice to get Chuck's explicit OK for > that. > > I don't quite understand this either, why would that leave master in > an unreleasable state? > That's what Sebastian proposed yesterday: let's merge right now, open issues for all the things being brought up right now, and deal with them pre-1.20-release. I'm saying I'm fine with that, but then we actually need to go back and finalize the discussions before the next release. Cheers, Ralf > Best, > Peter > > On Thu, Aug 13, 2020 at 2:21 PM Ralf Gommers > wrote: > > > > Thanks for raising these concerns Ilhan and Juan, and for answering > Peter. Let me give my perspective as well. > > > > To start with, this is not specifically about Peter's NEP and PR. NEP 35 > simply follows the pattern set by previous PRs, and given its tight scope > is less difficult to understand than other NEPs on such technical topics. > Peter has done a lot of things right, and is close to the finish line. > > > > > > On Thu, Aug 13, 2020 at 12:02 PM Peter Andreas Entschev < > peter at entschev.com> wrote: > >> > >> > >> > I think, arriving to an agreement would be much faster if there is an > executive summary of who this is intended for and what the regular usage > is. Because with no offense, all I see is "dispatch", "_array_function_" > and a lot of technical details of which I am absolutely ignorant. > >> > >> This is what I intended to do in the Usage Guidance [2] section. Could > >> you elaborate on what more information you'd want to see there? Or is > >> it just a matter of reorganizing the NEP a bit to try and summarize > >> such things right at the top? > > > > > > We adapted the NEP template [6] several times last year to try and > improve this. And specified in there as well that NEP content set to the > mailing list should only contain the sections: Abstract, Motivation and > Scope, Usage and Impact, and Backwards compatibility. This to ensure we > fully understand the "why" and "what" before the "how". Unfortunately that > template and procedure hasn't been exercised much yet, only in NEP 38 [7] > and partially in NEP 41 [8]. > > > > If we have long-time maintainers of SciPy (Ilhan and myself), > scikit-image (Juan) and CuPy (Leo, on the PR review) all saying they don't > understand the goals, relevance, target audience, or how they're supposed > to use a new feature, that indicates that the people doing the writing and > having the discussion are doing something wrong at a very fundamental level. > > > > At this point I'm pretty disappointed in and tired of how we write and > discuss NEPs on technical topics like dispatching, dtypes and the like. > People literally refuse to write down concrete motivations, goals and > non-goals, code that's problematic now and will be better/working post-NEP > and usage examples before launching into extensive discussion of the gory > details of the internals. I'm not sure what to do about it. Completely > separate API and behavior proposals from implementation proposals? 
Make > separate "API" and "internals" teams with the likes of Juan, Ilhan and Leo > on the API team which then needs to approve every API change in new NEPs? > Offer to co-write NEPs if someone is willing but doesn't understand how to > go about it? Keep the current structure/process but veto further approvals > until NEP authors get it right? > > > > I want to make an exception for merging the current NEP, for which the > plan is to merge it as experimental to try in downstream PRs and get more > experience. That does mean that master will be in an unreleasable state by > the way, which is unusual and it'd be nice to get Chuck's explicit OK for > that. But after that, I think we need a change here. I would like to hear > what everyone thinks is the shape that change should take - any of my above > suggestions, or something else? > > > > > >> > >> > Finally as a minor point, I know we are mostly (ex-)academics but > this necessity of formal language on NEPs is self-imposed (probably PEPs > are to blame) and not quite helping. It can be a bit more descriptive in my > external opinion. > >> > >> TBH, I don't really know how to solve that point, so if you have any > >> specific suggestions, that's certainly welcome. I understand the > >> frustration for a reader trying to understand all the details, with > >> many being only described in NEP-18 [3], but we also strive to avoid > >> rewriting things that are written elsewhere, which would also > >> overburden those who are aware of what's being discussed. > >> > >> > >> > I also share Ilhan?s concern (and I mentioned this in a previous NEP > discussion) that NEPs are getting pretty inaccessible. In a sense these are > difficult topics and readers should be expected to have *some* familiarity > with the topics being discussed, but perhaps more effort should be put into > the context/motivation/background of a NEP before accepting it. One way to > ensure this might be to require a final proofreading step by someone who > has not been involved at all in the discussions, like peer review does for > papers. > > > > > > Some variant of this proposal would be my preference. > > > > Cheers, > > Ralf > > > >> > >> [1] https://github.com/numpy/numpy/issues/14441#issuecomment-529969572 > >> [2] > https://numpy.org/neps/nep-0035-array-creation-dispatch-with-array-function.html#usage-guidance > >> [3] https://numpy.org/neps/nep-0018-array-function-protocol.html > >> [4] https://numpy.org/neps/nep-0000.html#nep-workflow > >> [5] > https://mail.python.org/pipermail/numpy-discussion/2019-October/080176.html > > > > > > [6] https://github.com/numpy/numpy/blob/master/doc/neps/nep-template.rst > > [7] > https://github.com/numpy/numpy/blob/master/doc/neps/nep-0038-SIMD-optimizations.rst > > [8] > https://github.com/numpy/numpy/blob/master/doc/neps/nep-0041-improved-dtype-support.rst > > > > > >> > >> > >> > >> On Thu, Aug 13, 2020 at 3:44 AM Juan Nunez-Iglesias > wrote: > >> > > >> > I?ve generally been on the ?let the NumPy devs worry about it? side > of things, but I do agree with Ilhan that `like=` is confusing and > `typeof=` would be a much more appropriate name for that parameter. > >> > > >> > I do think library writers are NumPy users and so I wouldn?t really > make that distinction, though. Users writing their own analysis code could > very well be interested in writing code using numpy functions that will > transparently work when the input is a CuPy array or whatever. 
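Concretely, the kind of end-user code meant there might look something like this. A sketch only: `rescale` is a made-up name, and it assumes NumPy with the experimental `like=` plus an input from a library such as CuPy that implements `__array_function__`:

    import numpy as np

    def rescale(x):
        # x can be a numpy.ndarray, a cupy.ndarray, ... (1-D for simplicity).
        # Building the grid with like=x keeps the result on x's backend
        # instead of silently coercing everything back to a NumPy array.
        grid = np.linspace(0.0, 1.0, num=x.shape[0], like=x)
        return (x - x.min()) * grid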
> >> > > >> > I also share Ilhan?s concern (and I mentioned this in a previous NEP > discussion) that NEPs are getting pretty inaccessible. In a sense these are > difficult topics and readers should be expected to have *some* familiarity > with the topics being discussed, but perhaps more effort should be put into > the context/motivation/background of a NEP before accepting it. One way to > ensure this might be to require a final proofreading step by someone who > has not been involved at all in the discussions, like peer review does for > papers. > >> > > >> > Food for thought. > >> > > >> > Juan. > >> > > >> > On 13 Aug 2020, at 9:24 am, Ilhan Polat wrote: > >> > > >> > For what is worth, as a potential consumer in SciPy, it really > doesn't say anything (both in NEP and the PR) about how the regular users > of NumPy will benefit from this. If only and only 3rd parties are going to > benefit from it, I am not sure adding a new keyword to an already confusing > function is the right thing to do. > >> > > >> > Let me clarify, > >> > > >> > - This is already a very (I mean extremely very) easy keyword name to > confuse with ones_like, zeros_like and by its nature any other > interpretation. It is not signalling anything about the functionality that > is being discussed. I would seriously consider reserving such obvious names > for really obvious tasks. Because you would also expect the shape and ndim > would be mimicked by the "like"d argument but it turns out it is acting > more like "typeof=" and not "like=" at all. Because if we follow the > semantics it reads as "make your argument asarray like the other thing" but > it is actually doing, "make your argument an array with the other thing's > type" which might not be an array after all. > >> > > >> > - Again, if this is meant for downstream libraries (because that's > what I got out of the PR discussion, cupy, dask, and JAX were the only > examples I could read) then hiding it in another function and writing with > capital letters "this is not meant for numpy users" would be a much more > convenient way to separate the target audience and regular users. > numpy.astypedarray([[some data], [...]], type_of=x) or whatever else it may > be would be quite clean and to the point with no ambiguous keywords. > >> > > >> > I think, arriving to an agreement would be much faster if there is an > executive summary of who this is intended for and what the regular usage > is. Because with no offense, all I see is "dispatch", "_array_function_" > and a lot of technical details of which I am absolutely ignorant. > >> > > >> > Finally as a minor point, I know we are mostly (ex-)academics but > this necessity of formal language on NEPs is self-imposed (probably PEPs > are to blame) and not quite helping. It can be a bit more descriptive in my > external opinion. > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From peter at entschev.com Thu Aug 13 10:14:15 2020 From: peter at entschev.com (Peter Andreas Entschev) Date: Thu, 13 Aug 2020 16:14:15 +0200 Subject: [Numpy-discussion] Experimental `like=` attribute for array creation functions In-Reply-To: References: <7ca98625-53ea-47cd-a027-d9c902742fed@Canary> <9d9ad7a26241564ec3f14866accfe840b226e1dc.camel@sipsolutions.net> <96330BE4-1CA2-4451-8FE5-357CFA7E4EDC@fastmail.com> Message-ID: Ilhan, Thanks, that does clarify things. I think the main point -- and correct me here if I'm still wrong -- is that we want the NEP to have some very clear example of when/why/how to use it, preferably as early in the text as possible, maybe just below the Abstract, in a Motivation and Scope section, as the NEP Template [6] pointed out by Ralf earlier suggests. That is a totally valid ask, and I'll try to address it as soon as possible (hopefully today or tomorrow). To the point of whether NEPs are to be read by users, I normally don't expect users to be required to read and understand those NEPs other than by pure curiosity. If we need them to do so, then there's definitely a big problem in the API. This may sound counterintuitive with what I said before about the "like=" name, but that's really the piece of the NumPy API that I, with a somewhat reasonable understanding of arrays, don't quite get or like: for instance "asarray" and "like" sound like exactly the same thing, but they're not in the NumPy context, and on the other hand it's quite difficult to find a reasonable name to clarify that. And once more, I do like the "typeof=" suggestion more than "like=" to be perfectly honest, I'm just afraid it could be mistaken for the "dtype=" keyword somehow and thus still not solve the clarity problem. Going back to users reading NEPs or not, I would really expect that the docstring from the function is sufficiently clear to keep users off of it, but still give them an understanding of why that exists. The current docstring is in [9]; please do comment on it if you have ideas of how to make it more accessible to users. You also mentioned you'd like the name to be as esoteric as possible, do you have any suggestions for an esoteric name that is hopefully unambiguous too? Naming has definitely been very much on the table since the NEP was written, but the consensus was more that "like=" is reasonably similar enough in both application and the name itself to "empty_like" and derived functions, that's why we just stuck to it. Best, Peter [9] https://github.com/numpy/numpy/pull/16935/files#diff-e5969453e399f2d32519d305b2582da9R16-R22 On Thu, Aug 13, 2020 at 3:43 PM Ilhan Polat wrote: > > To maybe lighten up the discussion a bit and to make my outsider confusion more tangible, let me start by apologizing for diving head first without weighing the past luggage :-) I always forget how much effort goes into these things and for outsiders like me, it's a matter of dipping the finger and tasting it just before starting to complain how much salt is missing etc. What I was mentioning about NEPs wasn't only related specifically to this one by the way. It's the generic feeling that I have. > > First let me start what I mean by NumPy users and downstreamers distinction. This is very much related to how data-science and huge-array users are magnetizing every tool out there in the Python world which is fine though the majority of number-crunchers have nothing to do with any of GPU/Parallelism/ClusterUsage etc. 
Hence when I mention NumPy users, think of people who use NumPy as its own right with no duck-typing and nothing related to subclassing. Just straightforward array creation and lots of ops on these arrays. For those people (I'm one of them), this option brings in a keyword that we would never use. And it gets into many major functions (linspace and others mentioned somewhere). So it has a very appealing name but has nothing to do with me in an already very crowded namespace and keyword catalogue. That's basically a UX issue to be addressed (under the assumption that users like me are the majority). Either making its name as esoteric as possible so I naturally stay away from it or I don't see it. This has absolutely nothing to do with looking down on the downstream libraries. They are flat-out amazing and the more we can support them the merrier. > > Using yet another metaphor, I was hoping that NumPy would have a loading dock for heavy duty deliveries for downstream projects or specialized array creations and won't disturb the regular customer entrance. Because if I look at this page https://numpy.org/doc/stable/referenc/routines.array-creation.html, there are a lot of functions and I think most of them are candidates to gain this keyword. I wish I can comment on a viable alternative but I really cannot understand the _array_xxxx_ discussions since they fly way over my head no matter how many times I tried. So that's why I naively mentioned the "np.astypedarray" or "np.asarray_but_not_numpy_array" or whatever. Now I see that it is even more complicated and I generated extra noise. So you can just ignore my previous suggestions. Except that I want to draw attention to the UX problem and I'd like to leave it at that. > > The other point is about the NEP stuff. I think I need to elaborate. If the NEPs are meant for internal NumPy discussions, then by all means, crank up the pointer*-meter to 11 and dive into it, totally fine with me. But if you also want to get feedback from outside, then probably a few lines of code examples for mere mortals would go a long way. Also it would make the discussion much more streamlined in my humble opinion. What I was trying to get at was that almost all NEPs read like a legal document that I want to agree as soon as possible. Because they often come without any or minimal amount of code in it. In NEP35 for example, there are nice code blocks in function dispatching but I guess it's not meant for me. Because it is only decorating asarray with some black magic happening there somehow (I guess). So I can't even comprehend what the proposition would mean for the regular, friendly, anti-duck users. But I am pretty sure it is about dispatching something because the word is repeated ~20 times :-) Thus the feedback would be limited. That was also what I meant there. But again I totally understand the complexity of these issues. So I'm not expecting to understand all details of NumPy machinery in a single NEP. > > But anyways, hope this clarifies a few things that I failed to convey in my previous mail. > ilhan > > > > On Thu, Aug 13, 2020 at 2:23 PM Ralf Gommers wrote: >> >> Thanks for raising these concerns Ilhan and Juan, and for answering Peter. Let me give my perspective as well. >> >> To start with, this is not specifically about Peter's NEP and PR. NEP 35 simply follows the pattern set by previous PRs, and given its tight scope is less difficult to understand than other NEPs on such technical topics. 
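For what it's worth, plain NumPy usage is meant to stay untouched: `like=` is keyword-only and defaults to None, so code that never passes it behaves exactly as before. A quick sketch, assuming a NumPy version that has the experimental keyword:

    import numpy as np

    a = np.asarray([1, 2, 3])   # unchanged behaviour, plain ndarray
    b = np.linspace(0, 1, 5)    # unchanged behaviour, plain ndarray

    # Only code that explicitly opts in passes a reference array; with a
    # NumPy reference the result is still just an ndarray:
    c = np.zeros(4, like=np.eye(2))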
Peter has done a lot of things right, and is close to the finish line. >> >> >> On Thu, Aug 13, 2020 at 12:02 PM Peter Andreas Entschev wrote: >>> >>> >>> > I think, arriving to an agreement would be much faster if there is an executive summary of who this is intended for and what the regular usage is. Because with no offense, all I see is "dispatch", "_array_function_" and a lot of technical details of which I am absolutely ignorant. >>> >>> This is what I intended to do in the Usage Guidance [2] section. Could >>> you elaborate on what more information you'd want to see there? Or is >>> it just a matter of reorganizing the NEP a bit to try and summarize >>> such things right at the top? >> >> >> We adapted the NEP template [6] several times last year to try and improve this. And specified in there as well that NEP content set to the mailing list should only contain the sections: Abstract, Motivation and Scope, Usage and Impact, and Backwards compatibility. This to ensure we fully understand the "why" and "what" before the "how". Unfortunately that template and procedure hasn't been exercised much yet, only in NEP 38 [7] and partially in NEP 41 [8]. >> >> If we have long-time maintainers of SciPy (Ilhan and myself), scikit-image (Juan) and CuPy (Leo, on the PR review) all saying they don't understand the goals, relevance, target audience, or how they're supposed to use a new feature, that indicates that the people doing the writing and having the discussion are doing something wrong at a very fundamental level. >> >> At this point I'm pretty disappointed in and tired of how we write and discuss NEPs on technical topics like dispatching, dtypes and the like. People literally refuse to write down concrete motivations, goals and non-goals, code that's problematic now and will be better/working post-NEP and usage examples before launching into extensive discussion of the gory details of the internals. I'm not sure what to do about it. Completely separate API and behavior proposals from implementation proposals? Make separate "API" and "internals" teams with the likes of Juan, Ilhan and Leo on the API team which then needs to approve every API change in new NEPs? Offer to co-write NEPs if someone is willing but doesn't understand how to go about it? Keep the current structure/process but veto further approvals until NEP authors get it right? >> >> I want to make an exception for merging the current NEP, for which the plan is to merge it as experimental to try in downstream PRs and get more experience. That does mean that master will be in an unreleasable state by the way, which is unusual and it'd be nice to get Chuck's explicit OK for that. But after that, I think we need a change here. I would like to hear what everyone thinks is the shape that change should take - any of my above suggestions, or something else? >> >> >>> >>> > Finally as a minor point, I know we are mostly (ex-)academics but this necessity of formal language on NEPs is self-imposed (probably PEPs are to blame) and not quite helping. It can be a bit more descriptive in my external opinion. >>> >>> TBH, I don't really know how to solve that point, so if you have any >>> specific suggestions, that's certainly welcome. I understand the >>> frustration for a reader trying to understand all the details, with >>> many being only described in NEP-18 [3], but we also strive to avoid >>> rewriting things that are written elsewhere, which would also >>> overburden those who are aware of what's being discussed. 
>>> >>> >>> > I also share Ilhan?s concern (and I mentioned this in a previous NEP discussion) that NEPs are getting pretty inaccessible. In a sense these are difficult topics and readers should be expected to have *some* familiarity with the topics being discussed, but perhaps more effort should be put into the context/motivation/background of a NEP before accepting it. One way to ensure this might be to require a final proofreading step by someone who has not been involved at all in the discussions, like peer review does for papers. >> >> >> Some variant of this proposal would be my preference. >> >> Cheers, >> Ralf >> >>> >>> [1] https://github.com/numpy/numpy/issues/14441#issuecomment-529969572 >>> [2] https://numpy.org/neps/nep-0035-array-creation-dispatch-with-array-function.html#usage-guidance >>> [3] https://numpy.org/neps/nep-0018-array-function-protocol.html >>> [4] https://numpy.org/neps/nep-0000.html#nep-workflow >>> [5] https://mail.python.org/pipermail/numpy-discussion/2019-October/080176.html >> >> >> [6] https://github.com/numpy/numpy/blob/master/doc/neps/nep-template.rst >> [7] https://github.com/numpy/numpy/blob/master/doc/neps/nep-0038-SIMD-optimizations.rst >> [8] https://github.com/numpy/numpy/blob/master/doc/neps/nep-0041-improved-dtype-support.rst >> >> >>> >>> >>> >>> On Thu, Aug 13, 2020 at 3:44 AM Juan Nunez-Iglesias wrote: >>> > >>> > I?ve generally been on the ?let the NumPy devs worry about it? side of things, but I do agree with Ilhan that `like=` is confusing and `typeof=` would be a much more appropriate name for that parameter. >>> > >>> > I do think library writers are NumPy users and so I wouldn?t really make that distinction, though. Users writing their own analysis code could very well be interested in writing code using numpy functions that will transparently work when the input is a CuPy array or whatever. >>> > >>> > I also share Ilhan?s concern (and I mentioned this in a previous NEP discussion) that NEPs are getting pretty inaccessible. In a sense these are difficult topics and readers should be expected to have *some* familiarity with the topics being discussed, but perhaps more effort should be put into the context/motivation/background of a NEP before accepting it. One way to ensure this might be to require a final proofreading step by someone who has not been involved at all in the discussions, like peer review does for papers. >>> > >>> > Food for thought. >>> > >>> > Juan. >>> > >>> > On 13 Aug 2020, at 9:24 am, Ilhan Polat wrote: >>> > >>> > For what is worth, as a potential consumer in SciPy, it really doesn't say anything (both in NEP and the PR) about how the regular users of NumPy will benefit from this. If only and only 3rd parties are going to benefit from it, I am not sure adding a new keyword to an already confusing function is the right thing to do. >>> > >>> > Let me clarify, >>> > >>> > - This is already a very (I mean extremely very) easy keyword name to confuse with ones_like, zeros_like and by its nature any other interpretation. It is not signalling anything about the functionality that is being discussed. I would seriously consider reserving such obvious names for really obvious tasks. Because you would also expect the shape and ndim would be mimicked by the "like"d argument but it turns out it is acting more like "typeof=" and not "like=" at all. 
Because if we follow the semantics it reads as "make your argument asarray like the other thing" but it is actually doing, "make your argument an array with the other thing's type" which might not be an array after all. >>> > >>> > - Again, if this is meant for downstream libraries (because that's what I got out of the PR discussion, cupy, dask, and JAX were the only examples I could read) then hiding it in another function and writing with capital letters "this is not meant for numpy users" would be a much more convenient way to separate the target audience and regular users. numpy.astypedarray([[some data], [...]], type_of=x) or whatever else it may be would be quite clean and to the point with no ambiguous keywords. >>> > >>> > I think, arriving to an agreement would be much faster if there is an executive summary of who this is intended for and what the regular usage is. Because with no offense, all I see is "dispatch", "_array_function_" and a lot of technical details of which I am absolutely ignorant. >>> > >>> > Finally as a minor point, I know we are mostly (ex-)academics but this necessity of formal language on NEPs is self-imposed (probably PEPs are to blame) and not quite helping. It can be a bit more descriptive in my external opinion. >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion From peter at entschev.com Thu Aug 13 10:33:52 2020 From: peter at entschev.com (Peter Andreas Entschev) Date: Thu, 13 Aug 2020 16:33:52 +0200 Subject: [Numpy-discussion] Experimental `like=` attribute for array creation functions In-Reply-To: References: <7ca98625-53ea-47cd-a027-d9c902742fed@Canary> <9d9ad7a26241564ec3f14866accfe840b226e1dc.camel@sipsolutions.net> <96330BE4-1CA2-4451-8FE5-357CFA7E4EDC@fastmail.com> Message-ID: Ralf, I know none of it is a criticism of my work or directly of anybody else's work. I was just making a couple of general points (or questions really): 1. What is accepted as a reasonably clear NEP? It seems to point that a NEP _must_ follow the Template 2. Should the NEP Template be followed as a hardline? Personally, I think that would be fine in general, and diverging seems to be only an option of when additional information is necessary, but less should not be acceptable. And to be perfectly clear, none of what I said is a criticism to anybody in particular, but it's a frustration about the process seemingly not clear in itself for either authors or maintainers, thus my two points above. I apologize if any of what I said so far has been taken as a personal criticism to someone, it was definitely not meant that way. Finally, I like Juan's previous suggestion that someone not involved in the discussion proof-reading would be a great idea, I'm not sure if that's achievable in practice though. However, I think that discussion is a bit out of context, so I'll try to address the unclear parts of this NEP in a PR and we could continue the general discussion of the NEP process in a different thread if people wish to do so. Best, Peter On Thu, Aug 13, 2020 at 4:13 PM Ralf Gommers wrote: > > > > On Thu, Aug 13, 2020 at 2:47 PM Peter Andreas Entschev wrote: >> >> > We adapted the NEP template [6] several times last year to try and improve this. 
And specified in there as well that NEP content set to the mailing list should only contain the sections: Abstract, Motivation and Scope, Usage and Impact, and Backwards compatibility. This to ensure we fully understand the "why" and "what" before the "how". Unfortunately that template and procedure hasn't been exercised much yet, only in NEP 38 [7] and partially in NEP 41 [8]. >> > >> > If we have long-time maintainers of SciPy (Ilhan and myself), scikit-image (Juan) and CuPy (Leo, on the PR review) all saying they don't understand the goals, relevance, target audience, or how they're supposed to use a new feature, that indicates that the people doing the writing and having the discussion are doing something wrong at a very fundamental level. >> >> I'm more than happy to edit the NEP and try to clarify all the >> concerns. > > > Thanks Peter. Let me reiterate, you did a lot of things right, have been happy to adapt when given feedback, and your willingness to go back and fix things up now is much appreciated (and I'm happy to help). No criticism of your work or attitude intended, on the contract. > >> >> However, it gets pretty difficult to do so when I as an >> author don't understand where the difficulty is. Ilhan, Juan and Ralf >> now pointed out things that are missing/unclear, but no comment was >> made in that regard when I sent the NEP, my point being: I couldn't >> fix what I didn't know was a problem to others. > > > Yes of course, I totally understand that. > >> >> > At this point I'm pretty disappointed in and tired of how we write and discuss NEPs on technical topics like dispatching, dtypes and the like. People literally refuse to write down concrete motivations, goals and non-goals, code that's problematic now and will be better/working post-NEP and usage examples before launching into extensive discussion of the gory details of the internals. I'm not sure what to do about it. >> >> Honestly, I don't really understand this. From my perspective, there >> are two ways to deal with such things: >> >> 1. Templates are to be taken mainly as _guidelines_ rather than >> _hardlines_, and the current text of NEP-35 definitely falls in the >> first category; >> 2. Templates are _hardlines_ and to be guided/enforced by maintainers >> at some point (maybe before merging the PR?). >> >> If 2 is the desired case for NumPy, which sounds a lot like what is >> wanted from NEP-35 and other NEPs generally, maintainers should let >> the authors know as early as possible that something isn't following >> the template's hardlines and it should be corrected. > > > Yes agreed, maintainers should do this. It was always meant as something in between, "please follow but deviate if needed". If essential elements are missing, I think that should be flagged earlier going forward. > > As a concrete example: Stephan (the main author of __array_function__) was still fuzzy on the functions covered and whether it solves array coercion, in the last 24 hours*. You answered by pointing to concrete code in Dask and Xarray. That code, why it doesn't work well now but will work with like=, should be at the top of the NEP as concrete problem statement / code examples. It's quite unfortunate that no maintainer explicitly requested this many months ago. 
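Something along these lines is probably what that problem statement would show. A sketch for illustration only, not the actual Dask or Xarray code referred to above (`pad_edges` is a made-up helper, and it assumes NumPy with the experimental `like=`):

    import numpy as np

    def pad_edges(x, n):
        # x may be a NumPy, Dask or CuPy array (1-D here).  Today np.zeros()
        # can only return a NumPy ndarray, so for a CuPy or Dask x the padding
        # ends up on the wrong backend (host memory, eager instead of lazy, ...)
        # unless the helper special-cases every library:
        #     edge = np.zeros(n)
        # With the proposed keyword, the creation call defers to x's own library:
        edge = np.zeros(n, like=x)
        # np.concatenate already dispatches via __array_function__ (NEP 18), so
        # the rest runs on whichever backend x came from.
        return np.concatenate([edge, x, edge])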
> > * https://github.com/numpy/numpy/pull/16935#issuecomment-673379038 > >> I don't mean any of this to remove myself of any responsibility, but would like to >> express my frustration that a 10 month-old NEP is only now getting so >> much pushback for being unclear after its implementation is nearing >> completion. > > > Totally understandable. I think part of the problem is that people only weigh in when they see concrete "this part is for you, and here's how you use it to solve problem X". > > As for me personally, if I'm saying things now that I didn't manage to respond to earlier (specific to your NEP), I apologize. 10 months ago I was in the middle of an intercontinental move and a new-ish job getting busier fast. Again, apologies and no criticism of your work. > >> >> >> > I want to make an exception for merging the current NEP, for which the plan is to merge it as experimental to try in downstream PRs and get more experience. That does mean that master will be in an unreleasable state by the way, which is unusual and it'd be nice to get Chuck's explicit OK for that. >> >> I don't quite understand this either, why would that leave master in >> an unreleasable state? > > > That's what Sebastian proposed yesterday: let's merge right now, open issues for all the things being brought up right now, and deal with them pre-1.20-release. I'm saying I'm fine with that, but then we actually need to go back and finalize the discussions before the next release. > > Cheers, > Ralf > > > > >> >> Best, >> Peter >> >> On Thu, Aug 13, 2020 at 2:21 PM Ralf Gommers wrote: >> > >> > Thanks for raising these concerns Ilhan and Juan, and for answering Peter. Let me give my perspective as well. >> > >> > To start with, this is not specifically about Peter's NEP and PR. NEP 35 simply follows the pattern set by previous PRs, and given its tight scope is less difficult to understand than other NEPs on such technical topics. Peter has done a lot of things right, and is close to the finish line. >> > >> > >> > On Thu, Aug 13, 2020 at 12:02 PM Peter Andreas Entschev wrote: >> >> >> >> >> >> > I think, arriving to an agreement would be much faster if there is an executive summary of who this is intended for and what the regular usage is. Because with no offense, all I see is "dispatch", "_array_function_" and a lot of technical details of which I am absolutely ignorant. >> >> >> >> This is what I intended to do in the Usage Guidance [2] section. Could >> >> you elaborate on what more information you'd want to see there? Or is >> >> it just a matter of reorganizing the NEP a bit to try and summarize >> >> such things right at the top? >> > >> > >> > We adapted the NEP template [6] several times last year to try and improve this. And specified in there as well that NEP content set to the mailing list should only contain the sections: Abstract, Motivation and Scope, Usage and Impact, and Backwards compatibility. This to ensure we fully understand the "why" and "what" before the "how". Unfortunately that template and procedure hasn't been exercised much yet, only in NEP 38 [7] and partially in NEP 41 [8]. >> > >> > If we have long-time maintainers of SciPy (Ilhan and myself), scikit-image (Juan) and CuPy (Leo, on the PR review) all saying they don't understand the goals, relevance, target audience, or how they're supposed to use a new feature, that indicates that the people doing the writing and having the discussion are doing something wrong at a very fundamental level. 
>> > >> > At this point I'm pretty disappointed in and tired of how we write and discuss NEPs on technical topics like dispatching, dtypes and the like. People literally refuse to write down concrete motivations, goals and non-goals, code that's problematic now and will be better/working post-NEP and usage examples before launching into extensive discussion of the gory details of the internals. I'm not sure what to do about it. Completely separate API and behavior proposals from implementation proposals? Make separate "API" and "internals" teams with the likes of Juan, Ilhan and Leo on the API team which then needs to approve every API change in new NEPs? Offer to co-write NEPs if someone is willing but doesn't understand how to go about it? Keep the current structure/process but veto further approvals until NEP authors get it right? >> > >> > I want to make an exception for merging the current NEP, for which the plan is to merge it as experimental to try in downstream PRs and get more experience. That does mean that master will be in an unreleasable state by the way, which is unusual and it'd be nice to get Chuck's explicit OK for that. But after that, I think we need a change here. I would like to hear what everyone thinks is the shape that change should take - any of my above suggestions, or something else? >> > >> > >> >> >> >> > Finally as a minor point, I know we are mostly (ex-)academics but this necessity of formal language on NEPs is self-imposed (probably PEPs are to blame) and not quite helping. It can be a bit more descriptive in my external opinion. >> >> >> >> TBH, I don't really know how to solve that point, so if you have any >> >> specific suggestions, that's certainly welcome. I understand the >> >> frustration for a reader trying to understand all the details, with >> >> many being only described in NEP-18 [3], but we also strive to avoid >> >> rewriting things that are written elsewhere, which would also >> >> overburden those who are aware of what's being discussed. >> >> >> >> >> >> > I also share Ilhan?s concern (and I mentioned this in a previous NEP discussion) that NEPs are getting pretty inaccessible. In a sense these are difficult topics and readers should be expected to have *some* familiarity with the topics being discussed, but perhaps more effort should be put into the context/motivation/background of a NEP before accepting it. One way to ensure this might be to require a final proofreading step by someone who has not been involved at all in the discussions, like peer review does for papers. >> > >> > >> > Some variant of this proposal would be my preference. >> > >> > Cheers, >> > Ralf >> > >> >> >> >> [1] https://github.com/numpy/numpy/issues/14441#issuecomment-529969572 >> >> [2] https://numpy.org/neps/nep-0035-array-creation-dispatch-with-array-function.html#usage-guidance >> >> [3] https://numpy.org/neps/nep-0018-array-function-protocol.html >> >> [4] https://numpy.org/neps/nep-0000.html#nep-workflow >> >> [5] https://mail.python.org/pipermail/numpy-discussion/2019-October/080176.html >> > >> > >> > [6] https://github.com/numpy/numpy/blob/master/doc/neps/nep-template.rst >> > [7] https://github.com/numpy/numpy/blob/master/doc/neps/nep-0038-SIMD-optimizations.rst >> > [8] https://github.com/numpy/numpy/blob/master/doc/neps/nep-0041-improved-dtype-support.rst >> > >> > >> >> >> >> >> >> >> >> On Thu, Aug 13, 2020 at 3:44 AM Juan Nunez-Iglesias wrote: >> >> > >> >> > I?ve generally been on the ?let the NumPy devs worry about it? 
side of things, but I do agree with Ilhan that `like=` is confusing and `typeof=` would be a much more appropriate name for that parameter. >> >> > >> >> > I do think library writers are NumPy users and so I wouldn?t really make that distinction, though. Users writing their own analysis code could very well be interested in writing code using numpy functions that will transparently work when the input is a CuPy array or whatever. >> >> > >> >> > I also share Ilhan?s concern (and I mentioned this in a previous NEP discussion) that NEPs are getting pretty inaccessible. In a sense these are difficult topics and readers should be expected to have *some* familiarity with the topics being discussed, but perhaps more effort should be put into the context/motivation/background of a NEP before accepting it. One way to ensure this might be to require a final proofreading step by someone who has not been involved at all in the discussions, like peer review does for papers. >> >> > >> >> > Food for thought. >> >> > >> >> > Juan. >> >> > >> >> > On 13 Aug 2020, at 9:24 am, Ilhan Polat wrote: >> >> > >> >> > For what is worth, as a potential consumer in SciPy, it really doesn't say anything (both in NEP and the PR) about how the regular users of NumPy will benefit from this. If only and only 3rd parties are going to benefit from it, I am not sure adding a new keyword to an already confusing function is the right thing to do. >> >> > >> >> > Let me clarify, >> >> > >> >> > - This is already a very (I mean extremely very) easy keyword name to confuse with ones_like, zeros_like and by its nature any other interpretation. It is not signalling anything about the functionality that is being discussed. I would seriously consider reserving such obvious names for really obvious tasks. Because you would also expect the shape and ndim would be mimicked by the "like"d argument but it turns out it is acting more like "typeof=" and not "like=" at all. Because if we follow the semantics it reads as "make your argument asarray like the other thing" but it is actually doing, "make your argument an array with the other thing's type" which might not be an array after all. >> >> > >> >> > - Again, if this is meant for downstream libraries (because that's what I got out of the PR discussion, cupy, dask, and JAX were the only examples I could read) then hiding it in another function and writing with capital letters "this is not meant for numpy users" would be a much more convenient way to separate the target audience and regular users. numpy.astypedarray([[some data], [...]], type_of=x) or whatever else it may be would be quite clean and to the point with no ambiguous keywords. >> >> > >> >> > I think, arriving to an agreement would be much faster if there is an executive summary of who this is intended for and what the regular usage is. Because with no offense, all I see is "dispatch", "_array_function_" and a lot of technical details of which I am absolutely ignorant. >> >> > >> >> > Finally as a minor point, I know we are mostly (ex-)academics but this necessity of formal language on NEPs is self-imposed (probably PEPs are to blame) and not quite helping. It can be a bit more descriptive in my external opinion. 
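For readers trying to picture what the keyword under discussion actually does: the sketch below is only an illustration, not text from the NEP or the PR. It assumes the experimental `like=` argument proposed in NEP 35 is available; `downstream_func` is a hypothetical helper and `x` stands for any array object that implements `__array_function__` (a CuPy or Dask array, say).

    import numpy as np

    def downstream_func(x):
        # With `like=x`, np.arange is dispatched through x's
        # __array_function__, so `weights` is created by whichever
        # library implements `x` (CuPy, Dask, ...) instead of always
        # being a NumPy ndarray.
        weights = np.arange(len(x), like=x)
        return x * weights

Without `like=`, the intermediate array here would always be a NumPy ndarray, which, depending on the library `x` comes from, either triggers an unwanted conversion or fails outright; closing that gap is the point of the keyword, and it is also why it behaves more like "use the type of this object" than "make something that looks like it".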
>> > >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at python.org >> > https://mail.python.org/mailman/listinfo/numpy-discussion >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion From sebastian at sipsolutions.net Thu Aug 13 10:47:43 2020 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Thu, 13 Aug 2020 09:47:43 -0500 Subject: [Numpy-discussion] Experimental `like=` attribute for array creation functions In-Reply-To: References: <7ca98625-53ea-47cd-a027-d9c902742fed@Canary> <9d9ad7a26241564ec3f14866accfe840b226e1dc.camel@sipsolutions.net> <96330BE4-1CA2-4451-8FE5-357CFA7E4EDC@fastmail.com> Message-ID: <859079d7b3ab9b4b08e32b46bf78b529e5f69955.camel@sipsolutions.net> On Thu, 2020-08-13 at 15:47 +0200, Peter Andreas Entschev wrote: > > We adapted the NEP template [6] several times last year to try and > > improve this. And specified in there as well that NEP content set > > to the mailing list should only contain the sections: Abstract, > > Motivation and Scope, Usage and Impact, and Backwards > > compatibility. This to ensure we fully understand the "why" and > > "what" before the "how". Unfortunately that template and procedure > > hasn't been exercised much yet, only in NEP 38 [7] and partially in > > NEP 41 [8]. > > > > If we have long-time maintainers of SciPy (Ilhan and myself), > > scikit-image (Juan) and CuPy (Leo, on the PR review) all saying > > they don't understand the goals, relevance, target audience, or how > > they're supposed to use a new feature, that indicates that the > > people doing the writing and having the discussion are doing > > something wrong at a very fundamental level. > > I'm more than happy to edit the NEP and try to clarify all the > concerns. However, it gets pretty difficult to do so when I as an > author don't understand where the difficulty is. Ilhan, Juan and Ralf > now pointed out things that are missing/unclear, but no comment was > made in that regard when I sent the NEP, my point being: I couldn't > fix what I didn't know was a problem to others. > > > At this point I'm pretty disappointed in and tired of how we write > > and discuss NEPs on technical topics like dispatching, dtypes and > > the like. People literally refuse to write down concrete > > motivations, goals and non-goals, code that's problematic now and > > will be better/working post-NEP and usage examples before launching > > into extensive discussion of the gory details of the internals. I'm > > not sure what to do about it. > > Honestly, I don't really understand this. From my perspective, there > are two ways to deal with such things: > > 1. Templates are to be taken mainly as _guidelines_ rather than > _hardlines_, and the current text of NEP-35 definitely falls in the > first category; > 2. Templates are _hardlines_ and to be guided/enforced by maintainers > at some point (maybe before merging the PR?). > > If 2 is the desired case for NumPy, which sounds a lot like what is > wanted from NEP-35 and other NEPs generally, maintainers should let > the authors know as early as possible that something isn't following > the template's hardlines and it should be corrected. 
I don't mean any > of this to remove myself of any responsibility, but would like to > express my frustration that a 10 month-old NEP is only now getting so > much pushback for being unclear after its implementation is nearing > completion. > > > I want to make an exception for merging the current NEP, for which > > the plan is to merge it as experimental to try in downstream PRs > > and get more experience. That does mean that master will be in an > > unreleasable state by the way, which is unusual and it'd be nice to > > get Chuck's explicit OK for that. > > I don't quite understand this either, why would that leave master in > an unreleasable state? > Well, a few points are not discussed to the end yet. The name is one that did not get much attention yet. Maybe because nobody had much concerns about it yet, or maybe it was just lower on the priority list. To be clear: I am fully prepared to pull this out of master before release or probably rather disable it in release versions. An alternative could be an environment variable (an env variable will not stop actual adoption, but we may be fine with that). And unless NEP 35 is accepted, that probably has to be the default, fortunately there is still some time until the next release. - Sebastian > Best, > Peter > > On Thu, Aug 13, 2020 at 2:21 PM Ralf Gommers > wrote: > > Thanks for raising these concerns Ilhan and Juan, and for answering > > Peter. Let me give my perspective as well. > > > > To start with, this is not specifically about Peter's NEP and PR. > > NEP 35 simply follows the pattern set by previous PRs, and given > > its tight scope is less difficult to understand than other NEPs on > > such technical topics. Peter has done a lot of things right, and is > > close to the finish line. > > > > > > On Thu, Aug 13, 2020 at 12:02 PM Peter Andreas Entschev < > > peter at entschev.com> wrote: > > > > > > > I think, arriving to an agreement would be much faster if there > > > > is an executive summary of who this is intended for and what > > > > the regular usage is. Because with no offense, all I see is > > > > "dispatch", "_array_function_" and a lot of technical details > > > > of which I am absolutely ignorant. > > > > > > This is what I intended to do in the Usage Guidance [2] section. > > > Could > > > you elaborate on what more information you'd want to see there? > > > Or is > > > it just a matter of reorganizing the NEP a bit to try and > > > summarize > > > such things right at the top? > > > > We adapted the NEP template [6] several times last year to try and > > improve this. And specified in there as well that NEP content set > > to the mailing list should only contain the sections: Abstract, > > Motivation and Scope, Usage and Impact, and Backwards > > compatibility. This to ensure we fully understand the "why" and > > "what" before the "how". Unfortunately that template and procedure > > hasn't been exercised much yet, only in NEP 38 [7] and partially in > > NEP 41 [8]. > > > > If we have long-time maintainers of SciPy (Ilhan and myself), > > scikit-image (Juan) and CuPy (Leo, on the PR review) all saying > > they don't understand the goals, relevance, target audience, or how > > they're supposed to use a new feature, that indicates that the > > people doing the writing and having the discussion are doing > > something wrong at a very fundamental level. > > > > At this point I'm pretty disappointed in and tired of how we write > > and discuss NEPs on technical topics like dispatching, dtypes and > > the like. 
People literally refuse to write down concrete > > motivations, goals and non-goals, code that's problematic now and > > will be better/working post-NEP and usage examples before launching > > into extensive discussion of the gory details of the internals. I'm > > not sure what to do about it. Completely separate API and behavior > > proposals from implementation proposals? Make separate "API" and > > "internals" teams with the likes of Juan, Ilhan and Leo on the API > > team which then needs to approve every API change in new NEPs? > > Offer to co-write NEPs if someone is willing but doesn't understand > > how to go about it? Keep the current structure/process but veto > > further approvals until NEP authors get it right? > > > > I want to make an exception for merging the current NEP, for which > > the plan is to merge it as experimental to try in downstream PRs > > and get more experience. That does mean that master will be in an > > unreleasable state by the way, which is unusual and it'd be nice to > > get Chuck's explicit OK for that. But after that, I think we need a > > change here. I would like to hear what everyone thinks is the shape > > that change should take - any of my above suggestions, or something > > else? > > > > > > > > Finally as a minor point, I know we are mostly (ex-)academics > > > > but this necessity of formal language on NEPs is self-imposed > > > > (probably PEPs are to blame) and not quite helping. It can be a > > > > bit more descriptive in my external opinion. > > > > > > TBH, I don't really know how to solve that point, so if you have > > > any > > > specific suggestions, that's certainly welcome. I understand the > > > frustration for a reader trying to understand all the details, > > > with > > > many being only described in NEP-18 [3], but we also strive to > > > avoid > > > rewriting things that are written elsewhere, which would also > > > overburden those who are aware of what's being discussed. > > > > > > > > > > I also share Ilhan?s concern (and I mentioned this in a > > > > previous NEP discussion) that NEPs are getting pretty > > > > inaccessible. In a sense these are difficult topics and readers > > > > should be expected to have *some* familiarity with the topics > > > > being discussed, but perhaps more effort should be put into the > > > > context/motivation/background of a NEP before accepting it. One > > > > way to ensure this might be to require a final proofreading > > > > step by someone who has not been involved at all in the > > > > discussions, like peer review does for papers. > > > > Some variant of this proposal would be my preference. > > > > Cheers, > > Ralf > > > > > [1] > > > https://github.com/numpy/numpy/issues/14441#issuecomment-529969572 > > > [2] > > > https://numpy.org/neps/nep-0035-array-creation-dispatch-with-array-function.html#usage-guidance > > > [3] https://numpy.org/neps/nep-0018-array-function-protocol.html > > > [4] https://numpy.org/neps/nep-0000.html#nep-workflow > > > [5] > > > https://mail.python.org/pipermail/numpy-discussion/2019-October/080176.html > > > > [6] > > https://github.com/numpy/numpy/blob/master/doc/neps/nep-template.rst > > [7] > > https://github.com/numpy/numpy/blob/master/doc/neps/nep-0038-SIMD-optimizations.rst > > [8] > > https://github.com/numpy/numpy/blob/master/doc/neps/nep-0041-improved-dtype-support.rst > > > > > > > > > > > > > On Thu, Aug 13, 2020 at 3:44 AM Juan Nunez-Iglesias < > > > jni at fastmail.com> wrote: > > > > I?ve generally been on the ?let the NumPy devs worry about it? 
> > > > side of things, but I do agree with Ilhan that `like=` is > > > > confusing and `typeof=` would be a much more appropriate name > > > > for that parameter. > > > > > > > > I do think library writers are NumPy users and so I wouldn?t > > > > really make that distinction, though. Users writing their own > > > > analysis code could very well be interested in writing code > > > > using numpy functions that will transparently work when the > > > > input is a CuPy array or whatever. > > > > > > > > I also share Ilhan?s concern (and I mentioned this in a > > > > previous NEP discussion) that NEPs are getting pretty > > > > inaccessible. In a sense these are difficult topics and readers > > > > should be expected to have *some* familiarity with the topics > > > > being discussed, but perhaps more effort should be put into the > > > > context/motivation/background of a NEP before accepting it. One > > > > way to ensure this might be to require a final proofreading > > > > step by someone who has not been involved at all in the > > > > discussions, like peer review does for papers. > > > > > > > > Food for thought. > > > > > > > > Juan. > > > > > > > > On 13 Aug 2020, at 9:24 am, Ilhan Polat > > > > wrote: > > > > > > > > For what is worth, as a potential consumer in SciPy, it really > > > > doesn't say anything (both in NEP and the PR) about how the > > > > regular users of NumPy will benefit from this. If only and only > > > > 3rd parties are going to benefit from it, I am not sure adding > > > > a new keyword to an already confusing function is the right > > > > thing to do. > > > > > > > > Let me clarify, > > > > > > > > - This is already a very (I mean extremely very) easy keyword > > > > name to confuse with ones_like, zeros_like and by its nature > > > > any other interpretation. It is not signalling anything about > > > > the functionality that is being discussed. I would seriously > > > > consider reserving such obvious names for really obvious tasks. > > > > Because you would also expect the shape and ndim would be > > > > mimicked by the "like"d argument but it turns out it is acting > > > > more like "typeof=" and not "like=" at all. Because if we > > > > follow the semantics it reads as "make your argument asarray > > > > like the other thing" but it is actually doing, "make your > > > > argument an array with the other thing's type" which might not > > > > be an array after all. > > > > > > > > - Again, if this is meant for downstream libraries (because > > > > that's what I got out of the PR discussion, cupy, dask, and JAX > > > > were the only examples I could read) then hiding it in another > > > > function and writing with capital letters "this is not meant > > > > for numpy users" would be a much more convenient way to > > > > separate the target audience and regular users. > > > > numpy.astypedarray([[some data], [...]], type_of=x) or whatever > > > > else it may be would be quite clean and to the point with no > > > > ambiguous keywords. > > > > > > > > I think, arriving to an agreement would be much faster if there > > > > is an executive summary of who this is intended for and what > > > > the regular usage is. Because with no offense, all I see is > > > > "dispatch", "_array_function_" and a lot of technical details > > > > of which I am absolutely ignorant. > > > > > > > > Finally as a minor point, I know we are mostly (ex-)academics > > > > but this necessity of formal language on NEPs is self-imposed > > > > (probably PEPs are to blame) and not quite helping. 
It can be a > > > > bit more descriptive in my external opinion. > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From ilhanpolat at gmail.com Thu Aug 13 12:46:30 2020 From: ilhanpolat at gmail.com (Ilhan Polat) Date: Thu, 13 Aug 2020 18:46:30 +0200 Subject: [Numpy-discussion] Experimental `like=` attribute for array creation functions In-Reply-To: References: <7ca98625-53ea-47cd-a027-d9c902742fed@Canary> <9d9ad7a26241564ec3f14866accfe840b226e1dc.camel@sipsolutions.net> <96330BE4-1CA2-4451-8FE5-357CFA7E4EDC@fastmail.com> Message-ID: Yes, the underlying gory details should be spelled out of course but if it is also modifying/adding to API then it is best to sound the horn and invite zombies to take a stab at it. Often people arrive with interesting use-cases that you wouldn't have thought about. And I am very familiar with the pushback feeling you are having right now, probably internally shouting "where have you been all this time you slackers?". As you might have seen me asking questions here and Cython lists, when I am done with some new feature over SciPy, it is also going to be a very very long and tiring process. I am really not looking forward to it :-) but I guess it is part of the deal. Maybe I can give some comfort that if more people start to flock over that means it has morphed into a finished product so people can shoot. But, I honestly thought this was a new NEP, that's a mistake on my part. For the like, typeof and other candidates, by esoteric I mean foreign enough to most users. We already have a nice candidate I think; ehm... "dispatch" or "dispatch_like" or something like that, nobody sober enough would confuse this with any other. And since this won't be typed in daily usage, or so I understood, I guess it is ok to make it verbose. But still take it as an initial guess and feel free to dismiss. I still would be in a platonic love with "numpy.DIY" or "numpy.hermes" namespace with a nice "bring your own _array_function_" service. On Thu, Aug 13, 2020 at 4:16 PM Peter Andreas Entschev wrote: > Ilhan, > > Thanks, that does clarify things. > > I think the main point -- and correct me here if I'm still wrong -- is > that we want the NEP to have some very clear example of when/why/how > to use it, preferably as early in the text as possible, maybe just > below the Abstract, in a Motivation and Scope section, as the NEP > Template [6] pointed out to by Ralf earlier suggests. That is a > totally valid ask, and I'll try to address it as soon as possible > (hopefully today or tomorrow). > > To the point of whether NEPs are to be read by users, I normally don't > expect users to be required to read and understand those NEPs other > than by pure curiosity. If we need them to do so, then there's > definitely a big problem in the API. 
This may sound counterintuitive > with what I said before about the "like=" name, but that's really the > piece of the NumPy API that I with a somewhat reasonable understand of > arrays don't quite get or like, for instance "asarray" and "like" > sound exactly the same thing, but they're not in the NumPy context, > and on the other hand it's quite difficult to find a reasonable name > to clarify that. And once more, I do like the "typeof=" suggestion > more than "like=" to be perfectly honest, I'm just afraid it could be > mistaken by the "dtype=" keyword somehow and thus still not solve the > clarity problem. Going back to users reading NEPs or not, I would > really expect that the docstring from the function is sufficiently > clear to keep users off of it, but still give them an understanding of > why that exists, the current docstring is in [9], please do comment on > it if you have ideas of how to make it more accessible to users. > > You also mentioned you'd like that the name is as esoteric as > possible, do you have any suggestions for an esoteric name that is > hopefully unambiguous too? Naming has definitely been very much on the > table since the NEP was written, but the consensus was more that > "like=" is reasonably similar enough in both application and the name > itself to "empty_like" and derived functions, that's why we just stuck > to it. > > Best, > Peter > > [9] > https://github.com/numpy/numpy/pull/16935/files#diff-e5969453e399f2d32519d305b2582da9R16-R22 > > On Thu, Aug 13, 2020 at 3:43 PM Ilhan Polat wrote: > > > > To maybe lighten up the discussion a bit and to make my outsider > confusion more tangible, let me start by apologizing for diving head first > without weighing the past luggage :-) I always forget how much effort goes > into these things and for outsiders like me, it's a matter of dipping the > finger and tasting it just before starting to complain how much salt is > missing etc. What I was mentioning about NEPs wasn't only related > specifically to this one by the way. It's the generic feeling that I have. > > > > First let me start what I mean by NumPy users and downstreamers > distinction. This is very much related to how data-science and huge-array > users are magnetizing every tool out there in the Python world which is > fine though the majority of number-crunchers have nothing to do with any of > GPU/Parallelism/ClusterUsage etc. Hence when I mention NumPy users, think > of people who use NumPy as its own right with no duck-typing and nothing > related to subclassing. Just straightforward array creation and lots of ops > on these arrays. For those people (I'm one of them), this option brings in > a keyword that we would never use. And it gets into many major functions > (linspace and others mentioned somewhere). So it has a very appealing name > but has nothing to do with me in an already very crowded namespace and > keyword catalogue. That's basically a UX issue to be addressed (under the > assumption that users like me are the majority). Either making its name as > esoteric as possible so I naturally stay away from it or I don't see it. > This has absolutely nothing to do with looking down on the downstream > libraries. They are flat-out amazing and the more we can support them the > merrier. > > > > Using yet another metaphor, I was hoping that NumPy would have a loading > dock for heavy duty deliveries for downstream projects or specialized array > creations and won't disturb the regular customer entrance. 
Because if I > look at this page > https://numpy.org/doc/stable/referenc/routines.array-creation.html, there > are a lot of functions and I think most of them are candidates to gain this > keyword. I wish I can comment on a viable alternative but I really cannot > understand the _array_xxxx_ discussions since they fly way over my head no > matter how many times I tried. So that's why I naively mentioned the > "np.astypedarray" or "np.asarray_but_not_numpy_array" or whatever. Now I > see that it is even more complicated and I generated extra noise. So you > can just ignore my previous suggestions. Except that I want to draw > attention to the UX problem and I'd like to leave it at that. > > > > The other point is about the NEP stuff. I think I need to elaborate. If > the NEPs are meant for internal NumPy discussions, then by all means, crank > up the pointer*-meter to 11 and dive into it, totally fine with me. But if > you also want to get feedback from outside, then probably a few lines of > code examples for mere mortals would go a long way. Also it would make the > discussion much more streamlined in my humble opinion. What I was trying to > get at was that almost all NEPs read like a legal document that I want to > agree as soon as possible. Because they often come without any or minimal > amount of code in it. In NEP35 for example, there are nice code blocks in > function dispatching but I guess it's not meant for me. Because it is only > decorating asarray with some black magic happening there somehow (I guess). > So I can't even comprehend what the proposition would mean for the regular, > friendly, anti-duck users. But I am pretty sure it is about dispatching > something because the word is repeated ~20 times :-) Thus the feedback > would be limited. That was also what I meant there. But again I totally > understand the complexity of these issues. So I'm not expecting to > understand all details of NumPy machinery in a single NEP. > > > > But anyways, hope this clarifies a few things that I failed to convey in > my previous mail. > > ilhan > > > > > > > > On Thu, Aug 13, 2020 at 2:23 PM Ralf Gommers > wrote: > >> > >> Thanks for raising these concerns Ilhan and Juan, and for answering > Peter. Let me give my perspective as well. > >> > >> To start with, this is not specifically about Peter's NEP and PR. NEP > 35 simply follows the pattern set by previous PRs, and given its tight > scope is less difficult to understand than other NEPs on such technical > topics. Peter has done a lot of things right, and is close to the finish > line. > >> > >> > >> On Thu, Aug 13, 2020 at 12:02 PM Peter Andreas Entschev < > peter at entschev.com> wrote: > >>> > >>> > >>> > I think, arriving to an agreement would be much faster if there is > an executive summary of who this is intended for and what the regular usage > is. Because with no offense, all I see is "dispatch", "_array_function_" > and a lot of technical details of which I am absolutely ignorant. > >>> > >>> This is what I intended to do in the Usage Guidance [2] section. Could > >>> you elaborate on what more information you'd want to see there? Or is > >>> it just a matter of reorganizing the NEP a bit to try and summarize > >>> such things right at the top? > >> > >> > >> We adapted the NEP template [6] several times last year to try and > improve this. And specified in there as well that NEP content set to the > mailing list should only contain the sections: Abstract, Motivation and > Scope, Usage and Impact, and Backwards compatibility. 
This to ensure we > fully understand the "why" and "what" before the "how". Unfortunately that > template and procedure hasn't been exercised much yet, only in NEP 38 [7] > and partially in NEP 41 [8]. > >> > >> If we have long-time maintainers of SciPy (Ilhan and myself), > scikit-image (Juan) and CuPy (Leo, on the PR review) all saying they don't > understand the goals, relevance, target audience, or how they're supposed > to use a new feature, that indicates that the people doing the writing and > having the discussion are doing something wrong at a very fundamental level. > >> > >> At this point I'm pretty disappointed in and tired of how we write and > discuss NEPs on technical topics like dispatching, dtypes and the like. > People literally refuse to write down concrete motivations, goals and > non-goals, code that's problematic now and will be better/working post-NEP > and usage examples before launching into extensive discussion of the gory > details of the internals. I'm not sure what to do about it. Completely > separate API and behavior proposals from implementation proposals? Make > separate "API" and "internals" teams with the likes of Juan, Ilhan and Leo > on the API team which then needs to approve every API change in new NEPs? > Offer to co-write NEPs if someone is willing but doesn't understand how to > go about it? Keep the current structure/process but veto further approvals > until NEP authors get it right? > >> > >> I want to make an exception for merging the current NEP, for which the > plan is to merge it as experimental to try in downstream PRs and get more > experience. That does mean that master will be in an unreleasable state by > the way, which is unusual and it'd be nice to get Chuck's explicit OK for > that. But after that, I think we need a change here. I would like to hear > what everyone thinks is the shape that change should take - any of my above > suggestions, or something else? > >> > >> > >>> > >>> > Finally as a minor point, I know we are mostly (ex-)academics but > this necessity of formal language on NEPs is self-imposed (probably PEPs > are to blame) and not quite helping. It can be a bit more descriptive in my > external opinion. > >>> > >>> TBH, I don't really know how to solve that point, so if you have any > >>> specific suggestions, that's certainly welcome. I understand the > >>> frustration for a reader trying to understand all the details, with > >>> many being only described in NEP-18 [3], but we also strive to avoid > >>> rewriting things that are written elsewhere, which would also > >>> overburden those who are aware of what's being discussed. > >>> > >>> > >>> > I also share Ilhan?s concern (and I mentioned this in a previous NEP > discussion) that NEPs are getting pretty inaccessible. In a sense these are > difficult topics and readers should be expected to have *some* familiarity > with the topics being discussed, but perhaps more effort should be put into > the context/motivation/background of a NEP before accepting it. One way to > ensure this might be to require a final proofreading step by someone who > has not been involved at all in the discussions, like peer review does for > papers. > >> > >> > >> Some variant of this proposal would be my preference. 
> >> > >> Cheers, > >> Ralf > >> > >>> > >>> [1] https://github.com/numpy/numpy/issues/14441#issuecomment-529969572 > >>> [2] > https://numpy.org/neps/nep-0035-array-creation-dispatch-with-array-function.html#usage-guidance > >>> [3] https://numpy.org/neps/nep-0018-array-function-protocol.html > >>> [4] https://numpy.org/neps/nep-0000.html#nep-workflow > >>> [5] > https://mail.python.org/pipermail/numpy-discussion/2019-October/080176.html > >> > >> > >> [6] > https://github.com/numpy/numpy/blob/master/doc/neps/nep-template.rst > >> [7] > https://github.com/numpy/numpy/blob/master/doc/neps/nep-0038-SIMD-optimizations.rst > >> [8] > https://github.com/numpy/numpy/blob/master/doc/neps/nep-0041-improved-dtype-support.rst > >> > >> > >>> > >>> > >>> > >>> On Thu, Aug 13, 2020 at 3:44 AM Juan Nunez-Iglesias > wrote: > >>> > > >>> > I?ve generally been on the ?let the NumPy devs worry about it? side > of things, but I do agree with Ilhan that `like=` is confusing and > `typeof=` would be a much more appropriate name for that parameter. > >>> > > >>> > I do think library writers are NumPy users and so I wouldn?t really > make that distinction, though. Users writing their own analysis code could > very well be interested in writing code using numpy functions that will > transparently work when the input is a CuPy array or whatever. > >>> > > >>> > I also share Ilhan?s concern (and I mentioned this in a previous NEP > discussion) that NEPs are getting pretty inaccessible. In a sense these are > difficult topics and readers should be expected to have *some* familiarity > with the topics being discussed, but perhaps more effort should be put into > the context/motivation/background of a NEP before accepting it. One way to > ensure this might be to require a final proofreading step by someone who > has not been involved at all in the discussions, like peer review does for > papers. > >>> > > >>> > Food for thought. > >>> > > >>> > Juan. > >>> > > >>> > On 13 Aug 2020, at 9:24 am, Ilhan Polat > wrote: > >>> > > >>> > For what is worth, as a potential consumer in SciPy, it really > doesn't say anything (both in NEP and the PR) about how the regular users > of NumPy will benefit from this. If only and only 3rd parties are going to > benefit from it, I am not sure adding a new keyword to an already confusing > function is the right thing to do. > >>> > > >>> > Let me clarify, > >>> > > >>> > - This is already a very (I mean extremely very) easy keyword name > to confuse with ones_like, zeros_like and by its nature any other > interpretation. It is not signalling anything about the functionality that > is being discussed. I would seriously consider reserving such obvious names > for really obvious tasks. Because you would also expect the shape and ndim > would be mimicked by the "like"d argument but it turns out it is acting > more like "typeof=" and not "like=" at all. Because if we follow the > semantics it reads as "make your argument asarray like the other thing" but > it is actually doing, "make your argument an array with the other thing's > type" which might not be an array after all. > >>> > > >>> > - Again, if this is meant for downstream libraries (because that's > what I got out of the PR discussion, cupy, dask, and JAX were the only > examples I could read) then hiding it in another function and writing with > capital letters "this is not meant for numpy users" would be a much more > convenient way to separate the target audience and regular users. 
> numpy.astypedarray([[some data], [...]], type_of=x) or whatever else it may > be would be quite clean and to the point with no ambiguous keywords. > >>> > > >>> > I think, arriving to an agreement would be much faster if there is > an executive summary of who this is intended for and what the regular usage > is. Because with no offense, all I see is "dispatch", "_array_function_" > and a lot of technical details of which I am absolutely ignorant. > >>> > > >>> > Finally as a minor point, I know we are mostly (ex-)academics but > this necessity of formal language on NEPs is self-imposed (probably PEPs > are to blame) and not quite helping. It can be a bit more descriptive in my > external opinion. > >> > >> _______________________________________________ > >> NumPy-Discussion mailing list > >> NumPy-Discussion at python.org > >> https://mail.python.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Thu Aug 13 15:29:27 2020 From: shoyer at gmail.com (Stephan Hoyer) Date: Thu, 13 Aug 2020 12:29:27 -0700 Subject: [Numpy-discussion] Experimental `like=` attribute for array creation functions In-Reply-To: References: <7ca98625-53ea-47cd-a027-d9c902742fed@Canary> <9d9ad7a26241564ec3f14866accfe840b226e1dc.camel@sipsolutions.net> <96330BE4-1CA2-4451-8FE5-357CFA7E4EDC@fastmail.com> Message-ID: On Thu, Aug 13, 2020 at 5:22 AM Ralf Gommers wrote: > Thanks for raising these concerns Ilhan and Juan, and for answering Peter. > Let me give my perspective as well. > > To start with, this is not specifically about Peter's NEP and PR. NEP 35 > simply follows the pattern set by previous PRs, and given its tight scope > is less difficult to understand than other NEPs on such technical topics. > Peter has done a lot of things right, and is close to the finish line. > > > On Thu, Aug 13, 2020 at 12:02 PM Peter Andreas Entschev < > peter at entschev.com> wrote: > >> >> > I think, arriving to an agreement would be much faster if there is an >> executive summary of who this is intended for and what the regular usage >> is. Because with no offense, all I see is "dispatch", "_array_function_" >> and a lot of technical details of which I am absolutely ignorant. >> >> This is what I intended to do in the Usage Guidance [2] section. Could >> you elaborate on what more information you'd want to see there? Or is >> it just a matter of reorganizing the NEP a bit to try and summarize >> such things right at the top? >> > > We adapted the NEP template [6] several times last year to try and improve > this. And specified in there as well that NEP content set to the mailing > list should only contain the sections: Abstract, Motivation and Scope, > Usage and Impact, and Backwards compatibility. This to ensure we fully > understand the "why" and "what" before the "how". Unfortunately that > template and procedure hasn't been exercised much yet, only in NEP 38 [7] > and partially in NEP 41 [8]. 
> > If we have long-time maintainers of SciPy (Ilhan and myself), scikit-image > (Juan) and CuPy (Leo, on the PR review) all saying they don't understand > the goals, relevance, target audience, or how they're supposed to use a new > feature, that indicates that the people doing the writing and having the > discussion are doing something wrong at a very fundamental level. > > At this point I'm pretty disappointed in and tired of how we write and > discuss NEPs on technical topics like dispatching, dtypes and the like. > People literally refuse to write down concrete motivations, goals and > non-goals, code that's problematic now and will be better/working post-NEP > and usage examples before launching into extensive discussion of the gory > details of the internals. I'm not sure what to do about it. Completely > separate API and behavior proposals from implementation proposals? Make > separate "API" and "internals" teams with the likes of Juan, Ilhan and Leo > on the API team which then needs to approve every API change in new NEPs? > Offer to co-write NEPs if someone is willing but doesn't understand how to > go about it? Keep the current structure/process but veto further approvals > until NEP authors get it right? > I think the NEP template is great, and we should try to be more diligent about following it! My own NEP 37 (__array_module__) is probably a good example of poor presentation due to not following the template structure. It goes pretty deep into low-level motivation and some implementation details before usage examples. Speaking just for myself, I would have appreciated a friendly nudge to use the template. Certainly I think it would be fine to require using the template for newly submitted NEPs. I did not remember about it when I started drafting NEP 37, and it definitely would have helped. I may still try to do a revision at some point to use the template structure. > I want to make an exception for merging the current NEP, for which the > plan is to merge it as experimental to try in downstream PRs and get more > experience. That does mean that master will be in an unreleasable state by > the way, which is unusual and it'd be nice to get Chuck's explicit OK for > that. But after that, I think we need a change here. I would like to hear > what everyone thinks is the shape that change should take - any of my above > suggestions, or something else? > > > >> > Finally as a minor point, I know we are mostly (ex-)academics but this >> necessity of formal language on NEPs is self-imposed (probably PEPs are to >> blame) and not quite helping. It can be a bit more descriptive in my >> external opinion. >> >> TBH, I don't really know how to solve that point, so if you have any >> specific suggestions, that's certainly welcome. I understand the >> frustration for a reader trying to understand all the details, with >> many being only described in NEP-18 [3], but we also strive to avoid >> rewriting things that are written elsewhere, which would also >> overburden those who are aware of what's being discussed. >> >> >> > I also share Ilhan?s concern (and I mentioned this in a previous NEP >> discussion) that NEPs are getting pretty inaccessible. In a sense these are >> difficult topics and readers should be expected to have *some* familiarity >> with the topics being discussed, but perhaps more effort should be put into >> the context/motivation/background of a NEP before accepting it. 
One way to >> ensure this might be to require a final proofreading step by someone who >> has not been involved at all in the discussions, like peer review does for >> papers. >> > > Some variant of this proposal would be my preference. > > Cheers, > Ralf > > >> [1] https://github.com/numpy/numpy/issues/14441#issuecomment-529969572 >> [2] >> https://numpy.org/neps/nep-0035-array-creation-dispatch-with-array-function.html#usage-guidance >> [3] https://numpy.org/neps/nep-0018-array-function-protocol.html >> [4] https://numpy.org/neps/nep-0000.html#nep-workflow >> [5] >> https://mail.python.org/pipermail/numpy-discussion/2019-October/080176.html > > > [6] https://github.com/numpy/numpy/blob/master/doc/neps/nep-template.rst > [7] > https://github.com/numpy/numpy/blob/master/doc/neps/nep-0038-SIMD-optimizations.rst > [8] > https://github.com/numpy/numpy/blob/master/doc/neps/nep-0041-improved-dtype-support.rst > > > >> >> >> On Thu, Aug 13, 2020 at 3:44 AM Juan Nunez-Iglesias >> wrote: >> > >> > I?ve generally been on the ?let the NumPy devs worry about it? side of >> things, but I do agree with Ilhan that `like=` is confusing and `typeof=` >> would be a much more appropriate name for that parameter. >> > >> > I do think library writers are NumPy users and so I wouldn?t really >> make that distinction, though. Users writing their own analysis code could >> very well be interested in writing code using numpy functions that will >> transparently work when the input is a CuPy array or whatever. >> > >> > I also share Ilhan?s concern (and I mentioned this in a previous NEP >> discussion) that NEPs are getting pretty inaccessible. In a sense these are >> difficult topics and readers should be expected to have *some* familiarity >> with the topics being discussed, but perhaps more effort should be put into >> the context/motivation/background of a NEP before accepting it. One way to >> ensure this might be to require a final proofreading step by someone who >> has not been involved at all in the discussions, like peer review does for >> papers. >> > >> > Food for thought. >> > >> > Juan. >> > >> > On 13 Aug 2020, at 9:24 am, Ilhan Polat wrote: >> > >> > For what is worth, as a potential consumer in SciPy, it really doesn't >> say anything (both in NEP and the PR) about how the regular users of NumPy >> will benefit from this. If only and only 3rd parties are going to benefit >> from it, I am not sure adding a new keyword to an already confusing >> function is the right thing to do. >> > >> > Let me clarify, >> > >> > - This is already a very (I mean extremely very) easy keyword name to >> confuse with ones_like, zeros_like and by its nature any other >> interpretation. It is not signalling anything about the functionality that >> is being discussed. I would seriously consider reserving such obvious names >> for really obvious tasks. Because you would also expect the shape and ndim >> would be mimicked by the "like"d argument but it turns out it is acting >> more like "typeof=" and not "like=" at all. Because if we follow the >> semantics it reads as "make your argument asarray like the other thing" but >> it is actually doing, "make your argument an array with the other thing's >> type" which might not be an array after all. 
>> >
>> > - Again, if this is meant for downstream libraries (because that's what
>> I got out of the PR discussion, cupy, dask, and JAX were the only examples
>> I could read) then hiding it in another function and writing with capital
>> letters "this is not meant for numpy users" would be a much more convenient
>> way to separate the target audience and regular users.
>> numpy.astypedarray([[some data], [...]], type_of=x) or whatever else it may
>> be would be quite clean and to the point with no ambiguous keywords.
>> >
>> > I think, arriving to an agreement would be much faster if there is an
>> executive summary of who this is intended for and what the regular usage
>> is. Because with no offense, all I see is "dispatch", "_array_function_"
>> and a lot of technical details of which I am absolutely ignorant.
>> >
>> > Finally as a minor point, I know we are mostly (ex-)academics but this
>> necessity of formal language on NEPs is self-imposed (probably PEPs are to
>> blame) and not quite helping. It can be a bit more descriptive in my
>> external opinion.
>>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From asmeurer at gmail.com Thu Aug 13 17:14:42 2020
From: asmeurer at gmail.com (Aaron Meurer)
Date: Thu, 13 Aug 2020 15:14:42 -0600
Subject: [Numpy-discussion] Use of booleans in slices
Message-ID: 

I noticed that np.bool_.__index__() gives a DeprecationWarning

>>> np.bool_(True).__index__()
__main__:1: DeprecationWarning: In future, it will be an error for
'np.bool_' scalars to be interpreted as an index
1

This is good, because booleans don't actually act like integers in
indexing contexts. However, raw Python bools also allow __index__()

>>> True.__index__()
1

A consequence of this is that NumPy slices allow booleans, as long as
they are the Python type (if you use the NumPy bool_ type you get the
deprecation warning).

>>> a = np.arange(10)
>>> a[True:]
array([1, 2, 3, 4, 5, 6, 7, 8, 9])

Should this behavior also be considered deprecated? Presumably
deprecating bool.__index__() in Python is a no-go, but it could be
deprecated in NumPy contexts (in the pure Python collections, booleans
don't have a special indexing meaning anyway).

Interestingly, places that use a shape don't allow booleans (I guess
they don't necessarily use __index__()?)

>>> np.empty((True,))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: an integer is required

Aaron Meurer

From jni at fastmail.com Thu Aug 13 23:00:50 2020
From: jni at fastmail.com (Juan Nunez-Iglesias)
Date: Fri, 14 Aug 2020 13:00:50 +1000
Subject: [Numpy-discussion] Experimental `like=` attribute for array creation functions
In-Reply-To: 
References: <7ca98625-53ea-47cd-a027-d9c902742fed@Canary>
 <9d9ad7a26241564ec3f14866accfe840b226e1dc.camel@sipsolutions.net>
 <96330BE4-1CA2-4451-8FE5-357CFA7E4EDC@fastmail.com>
Message-ID: 

Hello everyone again!

A few clarifications about my proposal of external peer review:

- Yes, all this work is public and announced on the mailing list. However,
I don't think there's a single person in this discussion or even this whole
ecosystem that does not have a more immediately-pressing and also virtually
infinite to-do list, so it's unreasonable to expect that generally they
would do more than glance at the stuff in the mailing list.
In the peer review analogy, the mailing list is like the arXiv or Biorxiv
stream -- yep, anyone can see the stuff on there and comment, but most
people just don't have the time or attention to grab onto that. The only
reason I stopped to comment here is Sebastian's "Imma merge, YOLO!", which
had me raising my eyebrows real high. Especially for something that would
expand the NumPy API!

- So, my proposal is that there needs to be an *editor* of NEPs who takes
responsibility, once they are themselves satisfied with the NEP, for seeking
out external reviewers and pinging them individually and asking them if they
would be ok to review.

- A good friend who does screenwriting once told me, "don't use all your
proofreaders at once". You want to get feedback, improve things, then
feedback from a *totally independent* new person who can see the document
with fresh eyes.

Obviously, all of the above slows things down. But "alone we go fast,
together we go far". The point of a NEP is to document critical decisions
for the long term health of the project. If the documentation is
insufficient, it defeats the whole purpose. Might as well just implement
stuff and skip the whole NEP process. (Side note: Stephan, I for one would
definitely appreciate an update to existing NEPs if there's obvious ways
they can be improved!)

I do think that NEP templates should be strict, and I don't think that is
incompatible with plain, jargon-free text. The NEP template and guidelines
should specify that, and that the motivation should be understandable by a
casual NumPy user -- the kind described by Ilhan, for whom bare NumPy
actually meets all their needs. Maybe they've also used PyTorch but they've
never really had cause to mix them or write a program that worked with both
kinds of arrays.

Ditto for backwards compatibility -- everyone should be clear when their
existing code is going to be broken. Actually NEP18 broke so much of my
code, but its Backward compatibility section basically says all good!
https://numpy.org/neps/nep-0018-array-function-protocol.html#backward-compatibility

Anywho, as always, none of this is criticism to work done -- I thank you
all, and am eternally grateful for all the hard work everyone is doing to
keep the ecosystem from fragmenting. I'm just hoping that this discussion
can improve the process going forward! And, yes, apologies to Peter, I know
from repeated personal experience how frustrating it can be to have
last-minute drive-by objections after months of consensus building! But I
think in the end every time that happened the end result was better -- I
hope the same is true here!

And yes, I'll reiterate Ralf's point: my concerns are about the NEP process
itself rather than this one. I'll summarise my proposal:

- strict NEP template. NEPs with missing sections will not be accepted.
- sections Abstract, Motivation, and Backwards Compatibility should be
understandable at a high level by casual users with ~zero background on the
topic
- enforce the above with at least two independent rounds of coordinated peer
review.

Thank you,

Juan.

> On 14 Aug 2020, at 5:29 am, Stephan Hoyer wrote:
>
> On Thu, Aug 13, 2020 at 5:22 AM Ralf Gommers > wrote:
> Thanks for raising these concerns Ilhan and Juan, and for answering Peter. Let me give my perspective as well.
>
> To start with, this is not specifically about Peter's NEP and PR. NEP 35 simply follows the pattern set by previous PRs, and given its tight scope is less difficult to understand than other NEPs on such technical topics.
Peter has done a lot of things right, and is close to the finish line. > > > On Thu, Aug 13, 2020 at 12:02 PM Peter Andreas Entschev > wrote: > > > I think, arriving to an agreement would be much faster if there is an executive summary of who this is intended for and what the regular usage is. Because with no offense, all I see is "dispatch", "_array_function_" and a lot of technical details of which I am absolutely ignorant. > > This is what I intended to do in the Usage Guidance [2] section. Could > you elaborate on what more information you'd want to see there? Or is > it just a matter of reorganizing the NEP a bit to try and summarize > such things right at the top? > > We adapted the NEP template [6] several times last year to try and improve this. And specified in there as well that NEP content set to the mailing list should only contain the sections: Abstract, Motivation and Scope, Usage and Impact, and Backwards compatibility. This to ensure we fully understand the "why" and "what" before the "how". Unfortunately that template and procedure hasn't been exercised much yet, only in NEP 38 [7] and partially in NEP 41 [8]. > > If we have long-time maintainers of SciPy (Ilhan and myself), scikit-image (Juan) and CuPy (Leo, on the PR review) all saying they don't understand the goals, relevance, target audience, or how they're supposed to use a new feature, that indicates that the people doing the writing and having the discussion are doing something wrong at a very fundamental level. > > At this point I'm pretty disappointed in and tired of how we write and discuss NEPs on technical topics like dispatching, dtypes and the like. People literally refuse to write down concrete motivations, goals and non-goals, code that's problematic now and will be better/working post-NEP and usage examples before launching into extensive discussion of the gory details of the internals. I'm not sure what to do about it. Completely separate API and behavior proposals from implementation proposals? Make separate "API" and "internals" teams with the likes of Juan, Ilhan and Leo on the API team which then needs to approve every API change in new NEPs? Offer to co-write NEPs if someone is willing but doesn't understand how to go about it? Keep the current structure/process but veto further approvals until NEP authors get it right? > > I think the NEP template is great, and we should try to be more diligent about following it! > > My own NEP 37 (__array_module__) is probably a good example of poor presentation due to not following the template structure. It goes pretty deep into low-level motivation and some implementation details before usage examples. > > Speaking just for myself, I would have appreciated a friendly nudge to use the template. Certainly I think it would be fine to require using the template for newly submitted NEPs. I did not remember about it when I started drafting NEP 37, and it definitely would have helped. I may still try to do a revision at some point to use the template structure. > > I want to make an exception for merging the current NEP, for which the plan is to merge it as experimental to try in downstream PRs and get more experience. That does mean that master will be in an unreleasable state by the way, which is unusual and it'd be nice to get Chuck's explicit OK for that. But after that, I think we need a change here. I would like to hear what everyone thinks is the shape that change should take - any of my above suggestions, or something else? 
> > > > Finally as a minor point, I know we are mostly (ex-)academics but this necessity of formal language on NEPs is self-imposed (probably PEPs are to blame) and not quite helping. It can be a bit more descriptive in my external opinion. > > TBH, I don't really know how to solve that point, so if you have any > specific suggestions, that's certainly welcome. I understand the > frustration for a reader trying to understand all the details, with > many being only described in NEP-18 [3], but we also strive to avoid > rewriting things that are written elsewhere, which would also > overburden those who are aware of what's being discussed. > > > > I also share Ilhan?s concern (and I mentioned this in a previous NEP discussion) that NEPs are getting pretty inaccessible. In a sense these are difficult topics and readers should be expected to have *some* familiarity with the topics being discussed, but perhaps more effort should be put into the context/motivation/background of a NEP before accepting it. One way to ensure this might be to require a final proofreading step by someone who has not been involved at all in the discussions, like peer review does for papers. > > Some variant of this proposal would be my preference. > > Cheers, > Ralf > > > [1] https://github.com/numpy/numpy/issues/14441#issuecomment-529969572 > [2] https://numpy.org/neps/nep-0035-array-creation-dispatch-with-array-function.html#usage-guidance > [3] https://numpy.org/neps/nep-0018-array-function-protocol.html > [4] https://numpy.org/neps/nep-0000.html#nep-workflow > [5] https://mail.python.org/pipermail/numpy-discussion/2019-October/080176.html > > [6] https://github.com/numpy/numpy/blob/master/doc/neps/nep-template.rst > [7] https://github.com/numpy/numpy/blob/master/doc/neps/nep-0038-SIMD-optimizations.rst > [8] https://github.com/numpy/numpy/blob/master/doc/neps/nep-0041-improved-dtype-support.rst > > > > > > On Thu, Aug 13, 2020 at 3:44 AM Juan Nunez-Iglesias > wrote: > > > > I?ve generally been on the ?let the NumPy devs worry about it? side of things, but I do agree with Ilhan that `like=` is confusing and `typeof=` would be a much more appropriate name for that parameter. > > > > I do think library writers are NumPy users and so I wouldn?t really make that distinction, though. Users writing their own analysis code could very well be interested in writing code using numpy functions that will transparently work when the input is a CuPy array or whatever. > > > > I also share Ilhan?s concern (and I mentioned this in a previous NEP discussion) that NEPs are getting pretty inaccessible. In a sense these are difficult topics and readers should be expected to have *some* familiarity with the topics being discussed, but perhaps more effort should be put into the context/motivation/background of a NEP before accepting it. One way to ensure this might be to require a final proofreading step by someone who has not been involved at all in the discussions, like peer review does for papers. > > > > Food for thought. > > > > Juan. > > > > On 13 Aug 2020, at 9:24 am, Ilhan Polat > wrote: > > > > For what is worth, as a potential consumer in SciPy, it really doesn't say anything (both in NEP and the PR) about how the regular users of NumPy will benefit from this. If only and only 3rd parties are going to benefit from it, I am not sure adding a new keyword to an already confusing function is the right thing to do. 
> > > > Let me clarify, > > > > - This is already a very (I mean extremely very) easy keyword name to confuse with ones_like, zeros_like and by its nature any other interpretation. It is not signalling anything about the functionality that is being discussed. I would seriously consider reserving such obvious names for really obvious tasks. Because you would also expect the shape and ndim would be mimicked by the "like"d argument but it turns out it is acting more like "typeof=" and not "like=" at all. Because if we follow the semantics it reads as "make your argument asarray like the other thing" but it is actually doing, "make your argument an array with the other thing's type" which might not be an array after all. > > > > - Again, if this is meant for downstream libraries (because that's what I got out of the PR discussion, cupy, dask, and JAX were the only examples I could read) then hiding it in another function and writing with capital letters "this is not meant for numpy users" would be a much more convenient way to separate the target audience and regular users. numpy.astypedarray([[some data], [...]], type_of=x) or whatever else it may be would be quite clean and to the point with no ambiguous keywords. > > > > I think, arriving to an agreement would be much faster if there is an executive summary of who this is intended for and what the regular usage is. Because with no offense, all I see is "dispatch", "_array_function_" and a lot of technical details of which I am absolutely ignorant. > > > > Finally as a minor point, I know we are mostly (ex-)academics but this necessity of formal language on NEPs is self-imposed (probably PEPs are to blame) and not quite helping. It can be a bit more descriptive in my external opinion. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From peter at entschev.com Fri Aug 14 07:21:00 2020 From: peter at entschev.com (Peter Andreas Entschev) Date: Fri, 14 Aug 2020 13:21:00 +0200 Subject: [Numpy-discussion] NEP Procedure Discussion Message-ID: Hi all, During the discussion about NEP-35, there have been lots of discussions around the NEP process itself. In the interest of allowing people who are mostly interested in this discussion and to avoid drifting so much off-topic in that thread, I'm starting this new thread to discuss the NEP procedure. A few questions that have been raised so far: - Is the NEP Template [1] a guideline to be strictly followed or a suggestion for authors? - Who should decide when a NEP is sufficiently clear? - Should a NEP PR be merged at all until it's sufficiently clear or should it only be merged even in Draft state only after it's sufficiently clear? - What parts of the NEP are necessary to be clear for everyone? Just Abstract? Motivation and Scope? Everything, including the real technical details of implementation? - Would it be possible to have proof-readers -- preferably people who are not at all involved in the NEP's topic? Please feel free to comment on that and add any major points I might have missed. 
Best, Peter [1] https://github.com/numpy/numpy/blob/master/doc/neps/nep-template.rst From peter at entschev.com Fri Aug 14 07:21:17 2020 From: peter at entschev.com (Peter Andreas Entschev) Date: Fri, 14 Aug 2020 13:21:17 +0200 Subject: [Numpy-discussion] Experimental `like=` attribute for array creation functions In-Reply-To: References: <7ca98625-53ea-47cd-a027-d9c902742fed@Canary> <9d9ad7a26241564ec3f14866accfe840b226e1dc.camel@sipsolutions.net> <96330BE4-1CA2-4451-8FE5-357CFA7E4EDC@fastmail.com> Message-ID: Hi all, This thread has IMO drifted very far from its original purpose, due to that I decided to start a new thread specifically for the general NEP procedure discussed, please check your mail for "NEP Procedure Discussion" subject. On the topic of this thread, I'll try to rewrite NEP-35 to make it more accessible and ping back here once I have a PR for that. Is there anything else that's pressing here? If there is and I missed/forgot about it, please let me know. Best, Peter On Fri, Aug 14, 2020 at 5:00 AM Juan Nunez-Iglesias wrote: > Hello everyone again! > > A few clarifications about my proposal of external peer review: > > - Yes, all this work is public and announced on the mailing list. However, > I don?t think there?s a single person in this discussion or even this whole > ecosystem that does not have a more immediately-pressing and also virtually > infinite to-do list, so it?s unreasonable to expect that generally they > would do more than glance at the stuff in the mailing list. In the peer > review analogy, the mailing list is like the arXiv or Biorxiv stream ? yep, > anyone can see the stuff on there and comment, but most people just don?t > have the time or attention to grab onto that. The only reason I stopped to > comment here is Sebastian?s ?Imma merge, YOLO!?, which had me raising my > eyebrows real high. ? Especially for something that would expand the NumPy > API! > > - So, my proposal is that there needs to be an *editor* of NEPs who takes > responsibility, once they are themselves satisfied with the NEP, for > seeking out external reviewers and pinging them individually and asking > them if they would be ok to review. > > - A good friend who does screenwriting once told me, ?don?t use all your > proofreaders at once?. You want to get feedback, improve things, then > feedback from a *totally independent* new person who can see the document > with fresh eyes. > > Obviously, all of the above slows things down. But ?alone we go fast, > together we go far?. The point of a NEP is to document critical decisions > for the long term health of the project. If the documentation is > insufficient, it defeats the whole purpose. Might as well just implement > stuff and skip the whole NEP process. (Side note: Stephan, I for one would > definitely appreciate an update to existing NEPs if there?s obvious ways > they can be improved!) > > I do think that NEP templates should be strict, and I don?t think that is > incompatible with plain, jargon-free text. The NEP template and guidelines > should specify that, and that the motivation should be understandable by a > casual NumPy user ? the kind described by Ilhan, for whom bare NumPy > actually meets all their needs. Maybe they?ve also used PyTorch but they?ve > never really had cause to mix them or write a program that worked with both > kinds of arrays. > > Ditto for backwards compatibility ? everyone should be clear when their > existing code is going to be broken. 
Actually NEP18 broke so much of my > code, but its Backward compatibility section basically says all good! > https://numpy.org/neps/nep-0018-array-function-protocol.html#backward-compatibility > > > Anywho, as always, none of this is criticism to work done ? I thank you > all, and am eternally grateful for all the hard work everyone is doing to > keep the ecosystem from fragmenting. I?m just hoping that this discussion > can improve the process going forward! > > And, yes, apologies to Peter, I know from repeated personal experience how > frustrating it can be to have last-minute drive-by objections after months > of consensus building! But I think in the end every time that happened the > end result was better ? I hope the same is true here! And yes, I?ll > reiterate Ralf?s point: my concerns are about the NEP process itself rather > than this one. I?ll summarise my proposal: > > - strict NEP template. NEPs with missing sections will not be accepted. > - sections Abstract, Motivation, and Backwards Compatibility should be > understandable at a high level by casual users with ~zero background on the > topic > - enforce the above with at least two independent rounds of coordinated > peer review. > > Thank you, > > Juan. > > On 14 Aug 2020, at 5:29 am, Stephan Hoyer wrote: > > On Thu, Aug 13, 2020 at 5:22 AM Ralf Gommers > wrote: > >> Thanks for raising these concerns Ilhan and Juan, and for answering >> Peter. Let me give my perspective as well. >> >> To start with, this is not specifically about Peter's NEP and PR. NEP 35 >> simply follows the pattern set by previous PRs, and given its tight scope >> is less difficult to understand than other NEPs on such technical topics. >> Peter has done a lot of things right, and is close to the finish line. >> >> >> On Thu, Aug 13, 2020 at 12:02 PM Peter Andreas Entschev < >> peter at entschev.com> wrote: >> >>> >>> > I think, arriving to an agreement would be much faster if there is an >>> executive summary of who this is intended for and what the regular usage >>> is. Because with no offense, all I see is "dispatch", "_array_function_" >>> and a lot of technical details of which I am absolutely ignorant. >>> >>> This is what I intended to do in the Usage Guidance [2] section. Could >>> you elaborate on what more information you'd want to see there? Or is >>> it just a matter of reorganizing the NEP a bit to try and summarize >>> such things right at the top? >>> >> >> We adapted the NEP template [6] several times last year to try and >> improve this. And specified in there as well that NEP content set to the >> mailing list should only contain the sections: Abstract, Motivation and >> Scope, Usage and Impact, and Backwards compatibility. This to ensure we >> fully understand the "why" and "what" before the "how". Unfortunately that >> template and procedure hasn't been exercised much yet, only in NEP 38 [7] >> and partially in NEP 41 [8]. >> >> If we have long-time maintainers of SciPy (Ilhan and myself), >> scikit-image (Juan) and CuPy (Leo, on the PR review) all saying they don't >> understand the goals, relevance, target audience, or how they're supposed >> to use a new feature, that indicates that the people doing the writing and >> having the discussion are doing something wrong at a very fundamental level. >> >> >> At this point I'm pretty disappointed in and tired of how we write and >> discuss NEPs on technical topics like dispatching, dtypes and the like. 
>> People literally refuse to write down concrete motivations, goals and >> non-goals, code that's problematic now and will be better/working post-NEP >> and usage examples before launching into extensive discussion of the gory >> details of the internals. I'm not sure what to do about it. Completely >> separate API and behavior proposals from implementation proposals? Make >> separate "API" and "internals" teams with the likes of Juan, Ilhan and Leo >> on the API team which then needs to approve every API change in new NEPs? >> Offer to co-write NEPs if someone is willing but doesn't understand how to >> go about it? Keep the current structure/process but veto further approvals >> until NEP authors get it right? >> > > I think the NEP template is great, and we should try to be more diligent > about following it! > > My own NEP 37 (__array_module__) is probably a good example of poor > presentation due to not following the template structure. It goes pretty > deep into low-level motivation and some implementation details before usage > examples. > > Speaking just for myself, I would have appreciated a friendly nudge to use > the template. Certainly I think it would be fine to require using the > template for newly submitted NEPs. I did not remember about it when I > started drafting NEP 37, and it definitely would have helped. I may still > try to do a revision at some point to use the template structure. > > >> I want to make an exception for merging the current NEP, for which the >> plan is to merge it as experimental to try in downstream PRs and get more >> experience. That does mean that master will be in an unreleasable state by >> the way, which is unusual and it'd be nice to get Chuck's explicit OK for >> that. But after that, I think we need a change here. I would like to hear >> what everyone thinks is the shape that change should take - any of my above >> suggestions, or something else? >> >> >> >>> > Finally as a minor point, I know we are mostly (ex-)academics but this >>> necessity of formal language on NEPs is self-imposed (probably PEPs are to >>> blame) and not quite helping. It can be a bit more descriptive in my >>> external opinion. >>> >>> TBH, I don't really know how to solve that point, so if you have any >>> specific suggestions, that's certainly welcome. I understand the >>> frustration for a reader trying to understand all the details, with >>> many being only described in NEP-18 [3], but we also strive to avoid >>> rewriting things that are written elsewhere, which would also >>> overburden those who are aware of what's being discussed. >>> >>> >>> > I also share Ilhan?s concern (and I mentioned this in a previous NEP >>> discussion) that NEPs are getting pretty inaccessible. In a sense these are >>> difficult topics and readers should be expected to have *some* familiarity >>> with the topics being discussed, but perhaps more effort should be put into >>> the context/motivation/background of a NEP before accepting it. One way to >>> ensure this might be to require a final proofreading step by someone who >>> has not been involved at all in the discussions, like peer review does for >>> papers. >>> >> >> Some variant of this proposal would be my preference. 
>> >> Cheers, >> Ralf >> >> >>> [1] https://github.com/numpy/numpy/issues/14441#issuecomment-529969572 >>> [2] >>> https://numpy.org/neps/nep-0035-array-creation-dispatch-with-array-function.html#usage-guidance >>> [3] https://numpy.org/neps/nep-0018-array-function-protocol.html >>> [4] https://numpy.org/neps/nep-0000.html#nep-workflow >>> [5] >>> https://mail.python.org/pipermail/numpy-discussion/2019-October/080176.html >> >> >> [6] https://github.com/numpy/numpy/blob/master/doc/neps/nep-template.rst >> [7] >> https://github.com/numpy/numpy/blob/master/doc/neps/nep-0038-SIMD-optimizations.rst >> [8] >> https://github.com/numpy/numpy/blob/master/doc/neps/nep-0041-improved-dtype-support.rst >> >> >> >>> >>> >>> On Thu, Aug 13, 2020 at 3:44 AM Juan Nunez-Iglesias >>> wrote: >>> > >>> > I?ve generally been on the ?let the NumPy devs worry about it? side of >>> things, but I do agree with Ilhan that `like=` is confusing and `typeof=` >>> would be a much more appropriate name for that parameter. >>> > >>> > I do think library writers are NumPy users and so I wouldn?t really >>> make that distinction, though. Users writing their own analysis code could >>> very well be interested in writing code using numpy functions that will >>> transparently work when the input is a CuPy array or whatever. >>> > >>> > I also share Ilhan?s concern (and I mentioned this in a previous NEP >>> discussion) that NEPs are getting pretty inaccessible. In a sense these are >>> difficult topics and readers should be expected to have *some* familiarity >>> with the topics being discussed, but perhaps more effort should be put into >>> the context/motivation/background of a NEP before accepting it. One way to >>> ensure this might be to require a final proofreading step by someone who >>> has not been involved at all in the discussions, like peer review does for >>> papers. >>> > >>> > Food for thought. >>> > >>> > Juan. >>> > >>> > On 13 Aug 2020, at 9:24 am, Ilhan Polat wrote: >>> > >>> > For what is worth, as a potential consumer in SciPy, it really doesn't >>> say anything (both in NEP and the PR) about how the regular users of NumPy >>> will benefit from this. If only and only 3rd parties are going to benefit >>> from it, I am not sure adding a new keyword to an already confusing >>> function is the right thing to do. >>> > >>> > Let me clarify, >>> > >>> > - This is already a very (I mean extremely very) easy keyword name to >>> confuse with ones_like, zeros_like and by its nature any other >>> interpretation. It is not signalling anything about the functionality that >>> is being discussed. I would seriously consider reserving such obvious names >>> for really obvious tasks. Because you would also expect the shape and ndim >>> would be mimicked by the "like"d argument but it turns out it is acting >>> more like "typeof=" and not "like=" at all. Because if we follow the >>> semantics it reads as "make your argument asarray like the other thing" but >>> it is actually doing, "make your argument an array with the other thing's >>> type" which might not be an array after all. >>> > >>> > - Again, if this is meant for downstream libraries (because that's >>> what I got out of the PR discussion, cupy, dask, and JAX were the only >>> examples I could read) then hiding it in another function and writing with >>> capital letters "this is not meant for numpy users" would be a much more >>> convenient way to separate the target audience and regular users. 
>>> numpy.astypedarray([[some data], [...]], type_of=x) or whatever else it may >>> be would be quite clean and to the point with no ambiguous keywords. >>> > >>> > I think, arriving to an agreement would be much faster if there is an >>> executive summary of who this is intended for and what the regular usage >>> is. Because with no offense, all I see is "dispatch", "_array_function_" >>> and a lot of technical details of which I am absolutely ignorant. >>> > >>> > Finally as a minor point, I know we are mostly (ex-)academics but this >>> necessity of formal language on NEPs is self-imposed (probably PEPs are to >>> blame) and not quite helping. It can be a bit more descriptive in my >>> external opinion. >>> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ilhanpolat at gmail.com Fri Aug 14 08:35:53 2020 From: ilhanpolat at gmail.com (Ilhan Polat) Date: Fri, 14 Aug 2020 14:35:53 +0200 Subject: [Numpy-discussion] NEP Procedure Discussion In-Reply-To: References: Message-ID: Also, not to be a complete slacker, I'd like to add to this list; - How can I help as an external lib maintainer? - Do you even want us to get involved before the final draft? Or wait until internal discussion finishes? On Fri, Aug 14, 2020 at 1:23 PM Peter Andreas Entschev wrote: > Hi all, > > During the discussion about NEP-35, there have been lots of > discussions around the NEP process itself. In the interest of allowing > people who are mostly interested in this discussion and to avoid > drifting so much off-topic in that thread, I'm starting this new > thread to discuss the NEP procedure. > > A few questions that have been raised so far: > > - Is the NEP Template [1] a guideline to be strictly followed or a > suggestion for authors? > - Who should decide when a NEP is sufficiently clear? > - Should a NEP PR be merged at all until it's sufficiently clear or > should it only be merged even in Draft state only after it's > sufficiently clear? > - What parts of the NEP are necessary to be clear for everyone? Just > Abstract? Motivation and Scope? Everything, including the real > technical details of implementation? > - Would it be possible to have proof-readers -- preferably people who > are not at all involved in the NEP's topic? > > Please feel free to comment on that and add any major points I might > have missed. > > Best, > Peter > > [1] https://github.com/numpy/numpy/blob/master/doc/neps/nep-template.rst > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From adrin.jalali at gmail.com  Fri Aug 14 10:04:36 2020
From: adrin.jalali at gmail.com (Adrin)
Date: Fri, 14 Aug 2020 16:04:36 +0200
Subject: [Numpy-discussion] NEP Procedure Discussion
In-Reply-To: 
References: 
Message-ID: 

Somewhat relevant, this is the discussion around the same topic we've
been having in scikit-learn:
https://github.com/scikit-learn/enhancement_proposals/pull/30

On Fri, Aug 14, 2020 at 2:36 PM Ilhan Polat wrote:

> Also, not to be a complete slacker, I'd like to add to this list;
>
> - How can I help as an external lib maintainer?
> - Do you even want us to get involved before the final draft? Or wait
> until internal discussion finishes?
>
> On Fri, Aug 14, 2020 at 1:23 PM Peter Andreas Entschev wrote:
>
>> Hi all,
>>
>> During the discussion about NEP-35, there have been lots of
>> discussions around the NEP process itself. In the interest of allowing
>> people who are mostly interested in this discussion and to avoid
>> drifting so much off-topic in that thread, I'm starting this new
>> thread to discuss the NEP procedure.
>>
>> A few questions that have been raised so far:
>>
>> - Is the NEP Template [1] a guideline to be strictly followed or a
>> suggestion for authors?
>> - Who should decide when a NEP is sufficiently clear?
>> - Should a NEP PR be merged at all until it's sufficiently clear or
>> should it only be merged even in Draft state only after it's
>> sufficiently clear?
>> - What parts of the NEP are necessary to be clear for everyone? Just
>> Abstract? Motivation and Scope? Everything, including the real
>> technical details of implementation?
>> - Would it be possible to have proof-readers -- preferably people who
>> are not at all involved in the NEP's topic?
>>
>> Please feel free to comment on that and add any major points I might
>> have missed.
>>
>> Best,
>> Peter
>>
>> [1] https://github.com/numpy/numpy/blob/master/doc/neps/nep-template.rst
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From sebastian at sipsolutions.net  Fri Aug 14 10:09:39 2020
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Fri, 14 Aug 2020 09:09:39 -0500
Subject: [Numpy-discussion] Use of booleans in slices
In-Reply-To: 
References: 
Message-ID: <733a94eb028de2cd1a259a4a6e32252c90719e60.camel@sipsolutions.net>

This is because slicing with a boolean has no alternative meaning I can
think of that it could be confused with [1]. NumPy even used to reject
it, but there seems to be no reason to add maintenance/code complexity
(i.e. duplicating checks Python already provides) just to reject bools.

There used to be a reason to use `__index__()` in slices, because Python
did not support it. But now Python has caught up.

- Sebastian

[1] I have never seen anyone index with a bool, and I have no conception
of what `arr[masked:not_masked]` would mean. There is not a small step
from `arr[True]` to `arr[True:]`.
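To illustrate the point, a small sketch (assuming NumPy 1.19-era
behaviour, where `np.bool_.__index__()` is deprecated; the `as_index`
helper is purely illustrative and not part of NumPy):

import operator
import warnings

import numpy as np

a = np.arange(10)

# A plain Python bool implements __index__(), so it is silently accepted
# as a slice bound: a[True:] behaves exactly like a[1:].
print(True.__index__())   # 1
print(a[True:])           # [1 2 3 4 5 6 7 8 9]

# A NumPy bool goes through np.bool_.__index__() instead, which should
# emit the DeprecationWarning quoted below.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    a[np.True_:]
print([str(w.message) for w in caught])

# A defensive helper for code that wants to reject bools explicitly
# before treating a value as an integer index.
def as_index(value):
    if isinstance(value, (bool, np.bool_)):
        raise TypeError("booleans are not valid indices or slice bounds here")
    return operator.index(value)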
On Thu, 2020-08-13 at 15:14 -0600, Aaron Meurer wrote: > I noticed that np.bool_.__index__() gives a DeprecationWarning > > > > > np.bool_(True).__index__() > __main__:1: DeprecationWarning: In future, it will be an error for > 'np.bool_' scalars to be interpreted as an index > 1 > > This is good, because booleans don't actually act like integers in > indexing contexts. However, raw Python bools also allow __index__() > > > > > True.__index__() > 1 > > A consequence of this is that NumPy slices allow booleans, as long as > they are the Python type (if you use the NumPy bool_ type you get the > deprecation warning). > > > > > a = np.arange(10) > > > > a[True:] > array([1, 2, 3, 4, 5, 6, 7, 8, 9]) > > Should this behavior also be considered deprecated? Presumably > deprecating bool.__index__() in Python is a no-go, but it could be > deprecated in NumPy contexts (in the pure Python collections, > booleans > don't have a special indexing meaning anyway). > > Interestingly, places that use a shape don't allow booleans (I guess > they don't necessarily use __index__()?) > > > > > np.empty((True,)) > Traceback (most recent call last): > File "", line 1, in > TypeError: an integer is required > > Aaron Meurer > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From melissawm at gmail.com Fri Aug 14 17:20:04 2020 From: melissawm at gmail.com (=?UTF-8?Q?Melissa_Mendon=C3=A7a?=) Date: Fri, 14 Aug 2020 18:20:04 -0300 Subject: [Numpy-discussion] Documentation Team meeting - Monday August 17 In-Reply-To: References: Message-ID: Hi all! This is a reminder that our next Documentation Team meeting will be on *Monday, August 17* at 3PM UTC**. If you wish to join on Zoom, you need to use this link https://zoom.us/j/420005230 Here's the permanent hackmd document with the meeting notes (still being updated in the next few days!): https://hackmd.io/oB_boakvRqKR-_2jRV-Qjg Hope to see you around! ** You can click this link to get the correct time at your timezone: https://www.timeanddate.com/worldclock/fixedtime.html?msg=NumPy+Documentation+Team+Meeting&iso=20200817T15&p1=1440&ah=1 *** You can add the NumPy community calendar to your google calendar by clicking this link: https://calendar.google.com/calendar/r?cid=YmVya2VsZXkuZWR1X2lla2dwaWdtMjMyamJobGRzZmIyYzJqODFjQGdyb3VwLmNhbGVuZGFyLmdvb2dsZS5jb20 - Melissa -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun Aug 16 07:41:08 2020 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 16 Aug 2020 12:41:08 +0100 Subject: [Numpy-discussion] Experimental `like=` attribute for array creation functions In-Reply-To: References: <7ca98625-53ea-47cd-a027-d9c902742fed@Canary> <9d9ad7a26241564ec3f14866accfe840b226e1dc.camel@sipsolutions.net> <96330BE4-1CA2-4451-8FE5-357CFA7E4EDC@fastmail.com> Message-ID: On Fri, Aug 14, 2020 at 12:23 PM Peter Andreas Entschev wrote: > Hi all, > > This thread has IMO drifted very far from its original purpose, due to > that I decided to start a new thread specifically for the general NEP > procedure discussed, please check your mail for "NEP Procedure Discussion" > subject. > Thanks Peter. 
For future reference: better to just edit the thread subject, but not start over completely - people want to reply to previous content. I will copy over comments I'd like to reply to to the other thread by hand now. > On the topic of this thread, I'll try to rewrite NEP-35 to make it more > accessible and ping back here once I have a PR for that. > Thanks! Cheers, Ralf Is there anything else that's pressing here? If there is and I > missed/forgot about it, please let me know. > > Best, > Peter > > On Fri, Aug 14, 2020 at 5:00 AM Juan Nunez-Iglesias > wrote: > >> Hello everyone again! >> >> A few clarifications about my proposal of external peer review: >> >> - Yes, all this work is public and announced on the mailing list. >> However, I don?t think there?s a single person in this discussion or even >> this whole ecosystem that does not have a more immediately-pressing and >> also virtually infinite to-do list, so it?s unreasonable to expect that >> generally they would do more than glance at the stuff in the mailing list. >> In the peer review analogy, the mailing list is like the arXiv or Biorxiv >> stream ? yep, anyone can see the stuff on there and comment, but most >> people just don?t have the time or attention to grab onto that. The only >> reason I stopped to comment here is Sebastian?s ?Imma merge, YOLO!?, which >> had me raising my eyebrows real high. ? Especially for something that >> would expand the NumPy API! >> >> - So, my proposal is that there needs to be an *editor* of NEPs who takes >> responsibility, once they are themselves satisfied with the NEP, for >> seeking out external reviewers and pinging them individually and asking >> them if they would be ok to review. >> >> - A good friend who does screenwriting once told me, ?don?t use all your >> proofreaders at once?. You want to get feedback, improve things, then >> feedback from a *totally independent* new person who can see the document >> with fresh eyes. >> >> Obviously, all of the above slows things down. But ?alone we go fast, >> together we go far?. The point of a NEP is to document critical decisions >> for the long term health of the project. If the documentation is >> insufficient, it defeats the whole purpose. Might as well just implement >> stuff and skip the whole NEP process. (Side note: Stephan, I for one would >> definitely appreciate an update to existing NEPs if there?s obvious ways >> they can be improved!) >> >> I do think that NEP templates should be strict, and I don?t think that is >> incompatible with plain, jargon-free text. The NEP template and guidelines >> should specify that, and that the motivation should be understandable by a >> casual NumPy user ? the kind described by Ilhan, for whom bare NumPy >> actually meets all their needs. Maybe they?ve also used PyTorch but they?ve >> never really had cause to mix them or write a program that worked with both >> kinds of arrays. >> >> Ditto for backwards compatibility ? everyone should be clear when their >> existing code is going to be broken. Actually NEP18 broke so much of my >> code, but its Backward compatibility section basically says all good! >> https://numpy.org/neps/nep-0018-array-function-protocol.html#backward-compatibility >> >> >> Anywho, as always, none of this is criticism to work done ? I thank you >> all, and am eternally grateful for all the hard work everyone is doing to >> keep the ecosystem from fragmenting. I?m just hoping that this discussion >> can improve the process going forward! 
>> >> And, yes, apologies to Peter, I know from repeated personal experience >> how frustrating it can be to have last-minute drive-by objections after >> months of consensus building! But I think in the end every time that >> happened the end result was better ? I hope the same is true here! And yes, >> I?ll reiterate Ralf?s point: my concerns are about the NEP process itself >> rather than this one. I?ll summarise my proposal: >> >> - strict NEP template. NEPs with missing sections will not be accepted. >> - sections Abstract, Motivation, and Backwards Compatibility should be >> understandable at a high level by casual users with ~zero background on the >> topic >> - enforce the above with at least two independent rounds of coordinated >> peer review. >> >> Thank you, >> >> Juan. >> >> On 14 Aug 2020, at 5:29 am, Stephan Hoyer wrote: >> >> On Thu, Aug 13, 2020 at 5:22 AM Ralf Gommers >> wrote: >> >>> Thanks for raising these concerns Ilhan and Juan, and for answering >>> Peter. Let me give my perspective as well. >>> >>> To start with, this is not specifically about Peter's NEP and PR. NEP 35 >>> simply follows the pattern set by previous PRs, and given its tight scope >>> is less difficult to understand than other NEPs on such technical topics. >>> Peter has done a lot of things right, and is close to the finish line. >>> >>> >>> On Thu, Aug 13, 2020 at 12:02 PM Peter Andreas Entschev < >>> peter at entschev.com> wrote: >>> >>>> >>>> > I think, arriving to an agreement would be much faster if there is an >>>> executive summary of who this is intended for and what the regular usage >>>> is. Because with no offense, all I see is "dispatch", "_array_function_" >>>> and a lot of technical details of which I am absolutely ignorant. >>>> >>>> This is what I intended to do in the Usage Guidance [2] section. Could >>>> you elaborate on what more information you'd want to see there? Or is >>>> it just a matter of reorganizing the NEP a bit to try and summarize >>>> such things right at the top? >>>> >>> >>> We adapted the NEP template [6] several times last year to try and >>> improve this. And specified in there as well that NEP content set to the >>> mailing list should only contain the sections: Abstract, Motivation and >>> Scope, Usage and Impact, and Backwards compatibility. This to ensure we >>> fully understand the "why" and "what" before the "how". Unfortunately that >>> template and procedure hasn't been exercised much yet, only in NEP 38 [7] >>> and partially in NEP 41 [8]. >>> >>> If we have long-time maintainers of SciPy (Ilhan and myself), >>> scikit-image (Juan) and CuPy (Leo, on the PR review) all saying they don't >>> understand the goals, relevance, target audience, or how they're supposed >>> to use a new feature, that indicates that the people doing the writing and >>> having the discussion are doing something wrong at a very fundamental level. >>> >>> >>> At this point I'm pretty disappointed in and tired of how we write and >>> discuss NEPs on technical topics like dispatching, dtypes and the like. >>> People literally refuse to write down concrete motivations, goals and >>> non-goals, code that's problematic now and will be better/working post-NEP >>> and usage examples before launching into extensive discussion of the gory >>> details of the internals. I'm not sure what to do about it. Completely >>> separate API and behavior proposals from implementation proposals? 
Make >>> separate "API" and "internals" teams with the likes of Juan, Ilhan and Leo >>> on the API team which then needs to approve every API change in new NEPs? >>> Offer to co-write NEPs if someone is willing but doesn't understand how to >>> go about it? Keep the current structure/process but veto further approvals >>> until NEP authors get it right? >>> >> >> I think the NEP template is great, and we should try to be more diligent >> about following it! >> >> My own NEP 37 (__array_module__) is probably a good example of poor >> presentation due to not following the template structure. It goes pretty >> deep into low-level motivation and some implementation details before usage >> examples. >> >> Speaking just for myself, I would have appreciated a friendly nudge to >> use the template. Certainly I think it would be fine to require using the >> template for newly submitted NEPs. I did not remember about it when I >> started drafting NEP 37, and it definitely would have helped. I may still >> try to do a revision at some point to use the template structure. >> >> >>> I want to make an exception for merging the current NEP, for which the >>> plan is to merge it as experimental to try in downstream PRs and get more >>> experience. That does mean that master will be in an unreleasable state by >>> the way, which is unusual and it'd be nice to get Chuck's explicit OK for >>> that. But after that, I think we need a change here. I would like to hear >>> what everyone thinks is the shape that change should take - any of my above >>> suggestions, or something else? >>> >>> >>> >>>> > Finally as a minor point, I know we are mostly (ex-)academics but >>>> this necessity of formal language on NEPs is self-imposed (probably PEPs >>>> are to blame) and not quite helping. It can be a bit more descriptive in my >>>> external opinion. >>>> >>>> TBH, I don't really know how to solve that point, so if you have any >>>> specific suggestions, that's certainly welcome. I understand the >>>> frustration for a reader trying to understand all the details, with >>>> many being only described in NEP-18 [3], but we also strive to avoid >>>> rewriting things that are written elsewhere, which would also >>>> overburden those who are aware of what's being discussed. >>>> >>>> >>>> > I also share Ilhan?s concern (and I mentioned this in a previous NEP >>>> discussion) that NEPs are getting pretty inaccessible. In a sense these are >>>> difficult topics and readers should be expected to have *some* familiarity >>>> with the topics being discussed, but perhaps more effort should be put into >>>> the context/motivation/background of a NEP before accepting it. One way to >>>> ensure this might be to require a final proofreading step by someone who >>>> has not been involved at all in the discussions, like peer review does for >>>> papers. >>>> >>> >>> Some variant of this proposal would be my preference. 
>>> >>> Cheers, >>> Ralf >>> >>> >>>> [1] https://github.com/numpy/numpy/issues/14441#issuecomment-529969572 >>>> [2] >>>> https://numpy.org/neps/nep-0035-array-creation-dispatch-with-array-function.html#usage-guidance >>>> [3] https://numpy.org/neps/nep-0018-array-function-protocol.html >>>> [4] https://numpy.org/neps/nep-0000.html#nep-workflow >>>> [5] >>>> https://mail.python.org/pipermail/numpy-discussion/2019-October/080176.html >>> >>> >>> [6] https://github.com/numpy/numpy/blob/master/doc/neps/nep-template.rst >>> [7] >>> https://github.com/numpy/numpy/blob/master/doc/neps/nep-0038-SIMD-optimizations.rst >>> [8] >>> https://github.com/numpy/numpy/blob/master/doc/neps/nep-0041-improved-dtype-support.rst >>> >>> >>> >>>> >>>> >>>> On Thu, Aug 13, 2020 at 3:44 AM Juan Nunez-Iglesias >>>> wrote: >>>> > >>>> > I?ve generally been on the ?let the NumPy devs worry about it? side >>>> of things, but I do agree with Ilhan that `like=` is confusing and >>>> `typeof=` would be a much more appropriate name for that parameter. >>>> > >>>> > I do think library writers are NumPy users and so I wouldn?t really >>>> make that distinction, though. Users writing their own analysis code could >>>> very well be interested in writing code using numpy functions that will >>>> transparently work when the input is a CuPy array or whatever. >>>> > >>>> > I also share Ilhan?s concern (and I mentioned this in a previous NEP >>>> discussion) that NEPs are getting pretty inaccessible. In a sense these are >>>> difficult topics and readers should be expected to have *some* familiarity >>>> with the topics being discussed, but perhaps more effort should be put into >>>> the context/motivation/background of a NEP before accepting it. One way to >>>> ensure this might be to require a final proofreading step by someone who >>>> has not been involved at all in the discussions, like peer review does for >>>> papers. >>>> > >>>> > Food for thought. >>>> > >>>> > Juan. >>>> > >>>> > On 13 Aug 2020, at 9:24 am, Ilhan Polat wrote: >>>> > >>>> > For what is worth, as a potential consumer in SciPy, it really >>>> doesn't say anything (both in NEP and the PR) about how the regular users >>>> of NumPy will benefit from this. If only and only 3rd parties are going to >>>> benefit from it, I am not sure adding a new keyword to an already confusing >>>> function is the right thing to do. >>>> > >>>> > Let me clarify, >>>> > >>>> > - This is already a very (I mean extremely very) easy keyword name to >>>> confuse with ones_like, zeros_like and by its nature any other >>>> interpretation. It is not signalling anything about the functionality that >>>> is being discussed. I would seriously consider reserving such obvious names >>>> for really obvious tasks. Because you would also expect the shape and ndim >>>> would be mimicked by the "like"d argument but it turns out it is acting >>>> more like "typeof=" and not "like=" at all. Because if we follow the >>>> semantics it reads as "make your argument asarray like the other thing" but >>>> it is actually doing, "make your argument an array with the other thing's >>>> type" which might not be an array after all. 
>>>> > >>>> > - Again, if this is meant for downstream libraries (because that's >>>> what I got out of the PR discussion, cupy, dask, and JAX were the only >>>> examples I could read) then hiding it in another function and writing with >>>> capital letters "this is not meant for numpy users" would be a much more >>>> convenient way to separate the target audience and regular users. >>>> numpy.astypedarray([[some data], [...]], type_of=x) or whatever else it may >>>> be would be quite clean and to the point with no ambiguous keywords. >>>> > >>>> > I think, arriving to an agreement would be much faster if there is an >>>> executive summary of who this is intended for and what the regular usage >>>> is. Because with no offense, all I see is "dispatch", "_array_function_" >>>> and a lot of technical details of which I am absolutely ignorant. >>>> > >>>> > Finally as a minor point, I know we are mostly (ex-)academics but >>>> this necessity of formal language on NEPs is self-imposed (probably PEPs >>>> are to blame) and not quite helping. It can be a bit more descriptive in my >>>> external opinion. >>>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at python.org >>> https://mail.python.org/mailman/listinfo/numpy-discussion >>> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun Aug 16 08:12:23 2020 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 16 Aug 2020 13:12:23 +0100 Subject: [Numpy-discussion] NEP Procedure Discussion In-Reply-To: References: Message-ID: On Fri, Aug 14, 2020 at 1:36 PM Ilhan Polat wrote: > Also, not to be a complete slacker, I'd like to add to this list; > > - How can I help as an external lib maintainer? > - Do you even want us to get involved before the final draft? Or wait > until internal discussion finishes? > Yes, before. The internal discussion (at least of the type that's now dominant) should come after, and in a way is less important. A NEP talks to different audiences, depending on the topic. It should start by talking to the people who are impacted by the end result of the particular proposal. Most of the time, these are end users and downstream library authors. Sometimes, for example the umath-multiarray merger, that's NumPy developers. But that's a small minority of cases. Part of the issue here is that we don't have explicit roles, like a company that develops software products would have. We do have them, they're hidden though. Typically one would have: - Customers - A product manager (technical background, marketing/sales role) - Engineering manager - Software architects (sometimes multiple layers, e.g. system architect and component architects) - Domain specialists People in these roles have conversations at different levels, and the feedback travelling up and down that chain improves the product so it's both fit for purpose and of high technical quality. 
The current NEP conversations are equivalent to domain specialists and software architects talking about implementation while assuming to fully understand customer needs. And then when a customer asks a question, telling them "we understand your problem, and see our code does multiple dispatch and is fast C code, so it will solve the problem - please wait 6 months and then you can buy the new version of our product". That's maybe exaggerated, but not by much. Especially for already established products (like NumPy) that are being improved, getting the customer's problem statement and constraints clear has to come first. There will be some iteration, e.g. once there's a prototype new constraints or additional benefits will be discovered, and that refines the outcomes. > On Fri, Aug 14, 2020 at 1:23 PM Peter Andreas Entschev > wrote: > >> Hi all, >> >> During the discussion about NEP-35, there have been lots of >> discussions around the NEP process itself. In the interest of allowing >> people who are mostly interested in this discussion and to avoid >> drifting so much off-topic in that thread, I'm starting this new >> thread to discuss the NEP procedure. >> >> A few questions that have been raised so far: >> >> - Is the NEP Template [1] a guideline to be strictly followed or a >> suggestion for authors? >> > I agree with Juan, who said "strict NEP template. NEPs with missing sections will not be accepted". - Who should decide when a NEP is sufficiently clear? >> > Juan said: "So, my proposal is that there needs to be an *editor* of NEPs who takes responsibility, once they are themselves satisfied with the NEP, for seeking out external reviewers and pinging them individually and asking them if they would be ok to review." I quite like that too. It would be great to have a pool of NEP editors, because relying on a single editor for everything would be too much for that person. This may be a place where interested downstream library authors like Ilhan and Juan can be really helpful. > - Should a NEP PR be merged at all until it's sufficiently clear or >> should it only be merged even in Draft state only after it's >> sufficiently clear? >> > I propose: merging as Draft once the sections up to Backwards Compatibility are clear enough, while implementation can still be rough but at least outlines the direction. > - What parts of the NEP are necessary to be clear for everyone? Just >> Abstract? Motivation and Scope? Everything, including the real >> technical details of implementation? >> - Would it be possible to have proof-readers -- preferably people who >> are not at all involved in the NEP's topic? >> > Juan said "enforce the above with at least two independent rounds of coordinated peer review". I'm not sure two rounds are necessary for every single NEP (e.g. the website redesign one was pretty straightforward), but for complex technical NEPs it does seem like a good idea. In many cases, doing one round of review with 1-2 people *before* submitting as a PR could be very beneficial. Cheers, Ralf >> [1] https://github.com/numpy/numpy/blob/master/doc/neps/nep-template.rst >> >> _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From peter at entschev.com  Mon Aug 17 15:55:30 2020
From: peter at entschev.com (Peter Andreas Entschev)
Date: Mon, 17 Aug 2020 21:55:30 +0200
Subject: [Numpy-discussion] Experimental `like=` attribute for array creation functions
In-Reply-To: 
References: <7ca98625-53ea-47cd-a027-d9c902742fed@Canary> <9d9ad7a26241564ec3f14866accfe840b226e1dc.camel@sipsolutions.net> <96330BE4-1CA2-4451-8FE5-357CFA7E4EDC@fastmail.com>
Message-ID: 

As discussed, I've opened a PR
https://github.com/numpy/numpy/pull/17093 attempting to clarify some of
the writing and to follow the NEP Template. As suggested in the
template, please find below the top part of NEP-35 (up to and including
the Backward Compatibility section). Please feel free to comment and
suggest improvements or point out what may still be unclear; personally
I would prefer comments directly on the PR if possible.

============================================================
NEP 35 - Array Creation Dispatching With __array_function__
============================================================

:Author: Peter Andreas Entschev
:Status: Draft
:Type: Standards Track
:Created: 2019-10-15
:Updated: 2020-08-17
:Resolution:

Abstract
--------

We propose the introduction of a new keyword argument ``like=`` to all
array creation functions. This argument permits the creation of an array
based on a non-NumPy reference array passed via that argument, resulting
in an array defined by the downstream library that implements the
reference array's type and the ``__array_function__`` protocol. With
this we address one of that protocol's shortcomings, as described by
NEP 18 [1]_.

Motivation and Scope
--------------------

Many libraries implement the NumPy API, such as Dask for graph
computing, CuPy for GPGPU computing, xarray for N-D labeled arrays, etc.
All the libraries mentioned have yet another thing in common: they have
also adopted the ``__array_function__`` protocol. The protocol defines a
mechanism allowing a user to directly use the NumPy API as a dispatcher
based on the input array type. In essence, dispatching means users are
able to pass a downstream array, such as a Dask array, directly to one
of NumPy's compute functions, and NumPy will be able to automatically
recognize that and send the work back to Dask's implementation of that
function, which will define the return value. For example:

.. code:: python

    import numpy as np
    import dask.array

    x = dask.array.arange(5)    # Creates dask.array
    np.sum(x)                   # Returns dask.array

Note above how we called Dask's implementation of ``sum`` via the NumPy
namespace by calling ``np.sum``, and the same would apply if we had a
CuPy array or any other array from a library that adopts
``__array_function__``. This allows writing code that is agnostic to the
implementation library, thus users can write their code once and still
be able to use different array implementations according to their needs.

Unfortunately, ``__array_function__`` has limitations, one of them being
array creation functions. In the example above, NumPy was able to call
Dask's implementation because the input array was a Dask array. The same
is not true for array creation functions: in that example, the input of
``arange`` is simply the integer ``5``, which provides no information
about the array type that should result. That is where a reference array
passed via the ``like=`` argument proposed here can help, as it provides
NumPy with the information required to create the expected type of
array.
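To make that gap concrete, below is a minimal sketch of the intended
behaviour. It assumes Dask is installed and a NumPy build that already
includes the experimental ``like=`` keyword; apart from that, only
``np.arange``, ``np.sum`` and ``dask.array.arange`` from the example
above are used:

.. code:: python

    import numpy as np
    import dask.array as da

    d = da.arange(5)        # A Dask array

    # Dispatching already works when an array is an input:
    np.sum(d)               # Returns a Dask array via __array_function__

    # Array creation has no array input to dispatch on:
    np.arange(5)            # Always returns a plain np.ndarray

    # With the proposed keyword, the reference array selects the
    # implementation; `d` is only consulted for its type, never modified:
    np.arange(5, like=d)    # Expected to return a Dask array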
The new ``like=`` keyword proposed is solely intended to identify the
downstream library to dispatch to, and the object passed to it is used
only as a reference, meaning that no modifications, copies or processing
will be performed on that object. We expect that this functionality will
be mostly useful to library developers, allowing them to create new
arrays for internal usage based on arrays passed by the user, preventing
unnecessary creation of NumPy arrays that would ultimately lead to an
additional conversion into a downstream array type.

Support for Python 2.7 has been dropped since NumPy 1.17, therefore we
make use of the keyword-only argument standard described in PEP-3102
[2]_ to implement ``like=``, thus preventing it from being passed by
position.

.. _neps.like-kwarg.usage-and-impact:

Usage and Impact
----------------

To understand the intended use for ``like=``, and before we move to more
complex cases, consider the following illustrative example consisting
only of NumPy and CuPy arrays:

.. code:: python

    import numpy as np
    import cupy

    def my_pad(arr, padding):
        padding = np.array(padding, like=arr)
        return np.concatenate((padding, arr, padding))

    my_pad(np.arange(5), [-1, -1])     # Returns np.ndarray
    my_pad(cupy.arange(5), [-1, -1])   # Returns cupy.core.core.ndarray

Note in the ``my_pad`` function above how ``arr`` is used as a reference
to dictate what array type the padding should have, before concatenating
the arrays to produce the result. On the other hand, if ``like=`` wasn't
used, the NumPy case would still work, but CuPy wouldn't allow this kind
of automatic conversion, ultimately raising a
``TypeError: Only cupy arrays can be concatenated`` exception.

Now we should look at how a library like Dask could benefit from
``like=``. Before we understand that, it's important to understand a bit
about Dask basics and how it ensures correctness with
``__array_function__``. Note that Dask can compute different sorts of
objects, like dataframes, bags and arrays; here we will focus strictly
on arrays, which are the objects we can use ``__array_function__`` with.

Dask uses a graph computing model, meaning it breaks a large problem
down into many smaller problems and merges their results to reach the
final result. To break the problem down into smaller ones, Dask also
breaks arrays into smaller arrays that it calls "chunks". A Dask array
can thus consist of one or more chunks and they may be of different
types. However, in the context of ``__array_function__``, Dask only
allows chunks of the same type; for example, a Dask array can be formed
of several NumPy arrays or several CuPy arrays, but not a mix of both.

To avoid mismatched types during compute, Dask keeps an attribute
``_meta`` as part of its array throughout computation. This attribute is
used both to predict the output type at graph creation time and to
create any intermediary arrays that are necessary within some function's
computation. Going back to our previous example, we can use ``_meta``
information to identify what kind of array we would use for padding, as
seen below:
.. code:: python

    import numpy as np
    import cupy
    import dask.array as da
    from dask.array.utils import meta_from_array

    def my_pad(arr, padding):
        padding = np.array(padding, like=meta_from_array(arr))
        return np.concatenate((padding, arr, padding))

    # Returns dask.array
    my_pad(da.arange(5), [-1, -1])

    # Returns dask.array
    my_pad(da.from_array(cupy.arange(5)), [-1, -1])

Note how ``chunktype`` in the return value above changes from
``numpy.ndarray`` in the first ``my_pad`` call to ``cupy.ndarray`` in
the second. To enable proper identification of the array type we use
Dask's utility function ``meta_from_array``, which was introduced as
part of the work to support ``__array_function__``, allowing Dask to
handle ``_meta`` appropriately. That function is primarily targeted at
the library's internal usage to ensure chunks are created with correct
types. Without the ``like=`` argument, it would be impossible to ensure
``my_pad`` creates a padding array with a type matching that of the
input array, which would cause a ``TypeError`` exception to be raised by
CuPy, just as discussed above for the plain CuPy case.

Backward Compatibility
----------------------

This proposal does not raise any backward compatibility issues within
NumPy, given that it only introduces a new keyword argument to existing
array creation functions, with a default ``None`` value, thus not
changing current behavior.

On Sun, Aug 16, 2020 at 1:41 PM Ralf Gommers wrote:
>
> On Fri, Aug 14, 2020 at 12:23 PM Peter Andreas Entschev wrote:
>>
>> Hi all,
>>
>> This thread has IMO drifted very far from its original purpose, due to
>> that I decided to start a new thread specifically for the general NEP
>> procedure discussed, please check your mail for "NEP Procedure
>> Discussion" subject.
>
> Thanks Peter. For future reference: better to just edit the thread
> subject, but not start over completely - people want to reply to
> previous content. I will copy over comments I'd like to reply to to the
> other thread by hand now.
>
>> On the topic of this thread, I'll try to rewrite NEP-35 to make it
>> more accessible and ping back here once I have a PR for that.
>
> Thanks!
>
> Cheers,
> Ralf
>
>> Is there anything else that's pressing here? If there is and I
>> missed/forgot about it, please let me know.
>>
>> Best,
>> Peter
>>
>> On Fri, Aug 14, 2020 at 5:00 AM Juan Nunez-Iglesias wrote:
>>>
>>> Hello everyone again!
>>>
>>> A few clarifications about my proposal of external peer review:
>>>
>>> - Yes, all this work is public and announced on the mailing list.
>>> However, I don't think there's a single person in this discussion or
>>> even this whole ecosystem that does not have a more
>>> immediately-pressing and also virtually infinite to-do list, so it's
>>> unreasonable to expect that generally they would do more than glance
>>> at the stuff in the mailing list. In the peer review analogy, the
>>> mailing list is like the arXiv or Biorxiv stream - yep, anyone can
>>> see the stuff on there and comment, but most people just don't have
>>> the time or attention to grab onto that. The only reason I stopped to
>>> comment here is Sebastian's "Imma merge, YOLO!", which had me raising
>>> my eyebrows real high. Especially for something that would expand the
>>> NumPy API!
>>>
>>> - So, my proposal is that there needs to be an *editor* of NEPs who
>>> takes responsibility, once they are themselves satisfied with the
>>> NEP, for seeking out external reviewers and pinging them individually
>>> and asking them if they would be ok to review.
On Sun, Aug 16, 2020 at 1:41 PM Ralf Gommers wrote:
>
> On Fri, Aug 14, 2020 at 12:23 PM Peter Andreas Entschev wrote:
>>
>> Hi all,
>>
>> This thread has IMO drifted very far from its original purpose, due to that I decided to start a new thread specifically for the general NEP procedure discussed, please check your mail for "NEP Procedure Discussion" subject.
>
> Thanks Peter. For future reference: better to just edit the thread subject, but not start over completely - people want to reply to previous content. I will copy over comments I'd like to reply to to the other thread by hand now.
>
>> On the topic of this thread, I'll try to rewrite NEP-35 to make it more accessible and ping back here once I have a PR for that.
>
> Thanks!
>
> Cheers,
> Ralf
>
>> Is there anything else that's pressing here? If there is and I missed/forgot about it, please let me know.
>>
>> Best,
>> Peter
>>
>> On Fri, Aug 14, 2020 at 5:00 AM Juan Nunez-Iglesias wrote:
>>>
>>> Hello everyone again!
>>>
>>> A few clarifications about my proposal of external peer review:
>>>
>>> - Yes, all this work is public and announced on the mailing list. However, I don't think there's a single person in this discussion or even this whole ecosystem that does not have a more immediately-pressing and also virtually infinite to-do list, so it's unreasonable to expect that generally they would do more than glance at the stuff in the mailing list. In the peer review analogy, the mailing list is like the arXiv or Biorxiv stream - yep, anyone can see the stuff on there and comment, but most people just don't have the time or attention to grab onto that. The only reason I stopped to comment here is Sebastian's "Imma merge, YOLO!", which had me raising my eyebrows real high. Especially for something that would expand the NumPy API!
>>>
>>> - So, my proposal is that there needs to be an *editor* of NEPs who takes responsibility, once they are themselves satisfied with the NEP, for seeking out external reviewers and pinging them individually and asking them if they would be ok to review.
>>>
>>> - A good friend who does screenwriting once told me, "don't use all your proofreaders at once". You want to get feedback, improve things, then feedback from a *totally independent* new person who can see the document with fresh eyes.
>>>
>>> Obviously, all of the above slows things down. But "alone we go fast, together we go far". The point of a NEP is to document critical decisions for the long term health of the project. If the documentation is insufficient, it defeats the whole purpose. Might as well just implement stuff and skip the whole NEP process. (Side note: Stephan, I for one would definitely appreciate an update to existing NEPs if there's obvious ways they can be improved!)
>>>
>>> I do think that NEP templates should be strict, and I don't think that is incompatible with plain, jargon-free text. The NEP template and guidelines should specify that, and that the motivation should be understandable by a casual NumPy user - the kind described by Ilhan, for whom bare NumPy actually meets all their needs. Maybe they've also used PyTorch but they've never really had cause to mix them or write a program that worked with both kinds of arrays.
>>>
>>> Ditto for backwards compatibility - everyone should be clear when their existing code is going to be broken. Actually NEP18 broke so much of my code, but its Backward compatibility section basically says all good! https://numpy.org/neps/nep-0018-array-function-protocol.html#backward-compatibility
>>>
>>> Anywho, as always, none of this is criticism to work done - I thank you all, and am eternally grateful for all the hard work everyone is doing to keep the ecosystem from fragmenting. I'm just hoping that this discussion can improve the process going forward!
>>>
>>> And, yes, apologies to Peter, I know from repeated personal experience how frustrating it can be to have last-minute drive-by objections after months of consensus building! But I think in the end every time that happened the end result was better - I hope the same is true here! And yes, I'll reiterate Ralf's point: my concerns are about the NEP process itself rather than this one. I'll summarise my proposal:
>>>
>>> - strict NEP template. NEPs with missing sections will not be accepted.
>>> - sections Abstract, Motivation, and Backwards Compatibility should be understandable at a high level by casual users with ~zero background on the topic
>>> - enforce the above with at least two independent rounds of coordinated peer review.
>>>
>>> Thank you,
>>>
>>> Juan.
>>>
>>> On 14 Aug 2020, at 5:29 am, Stephan Hoyer wrote:
>>>
>>> On Thu, Aug 13, 2020 at 5:22 AM Ralf Gommers wrote:
>>>>
>>>> Thanks for raising these concerns Ilhan and Juan, and for answering Peter. Let me give my perspective as well.
>>>>
>>>> To start with, this is not specifically about Peter's NEP and PR. NEP 35 simply follows the pattern set by previous PRs, and given its tight scope is less difficult to understand than other NEPs on such technical topics. Peter has done a lot of things right, and is close to the finish line.
>>>>
>>>> On Thu, Aug 13, 2020 at 12:02 PM Peter Andreas Entschev wrote:
>>>>>
>>>>> > I think, arriving to an agreement would be much faster if there is an executive summary of who this is intended for and what the regular usage is. Because with no offense, all I see is "dispatch", "_array_function_" and a lot of technical details of which I am absolutely ignorant.
>>>>>
>>>>> This is what I intended to do in the Usage Guidance [2] section.
Could >>>>> you elaborate on what more information you'd want to see there? Or is >>>>> it just a matter of reorganizing the NEP a bit to try and summarize >>>>> such things right at the top? >>>> >>>> >>>> We adapted the NEP template [6] several times last year to try and improve this. And specified in there as well that NEP content set to the mailing list should only contain the sections: Abstract, Motivation and Scope, Usage and Impact, and Backwards compatibility. This to ensure we fully understand the "why" and "what" before the "how". Unfortunately that template and procedure hasn't been exercised much yet, only in NEP 38 [7] and partially in NEP 41 [8]. >>>> >>>> If we have long-time maintainers of SciPy (Ilhan and myself), scikit-image (Juan) and CuPy (Leo, on the PR review) all saying they don't understand the goals, relevance, target audience, or how they're supposed to use a new feature, that indicates that the people doing the writing and having the discussion are doing something wrong at a very fundamental level. >>>> >>>> At this point I'm pretty disappointed in and tired of how we write and discuss NEPs on technical topics like dispatching, dtypes and the like. People literally refuse to write down concrete motivations, goals and non-goals, code that's problematic now and will be better/working post-NEP and usage examples before launching into extensive discussion of the gory details of the internals. I'm not sure what to do about it. Completely separate API and behavior proposals from implementation proposals? Make separate "API" and "internals" teams with the likes of Juan, Ilhan and Leo on the API team which then needs to approve every API change in new NEPs? Offer to co-write NEPs if someone is willing but doesn't understand how to go about it? Keep the current structure/process but veto further approvals until NEP authors get it right? >>> >>> >>> I think the NEP template is great, and we should try to be more diligent about following it! >>> >>> My own NEP 37 (__array_module__) is probably a good example of poor presentation due to not following the template structure. It goes pretty deep into low-level motivation and some implementation details before usage examples. >>> >>> Speaking just for myself, I would have appreciated a friendly nudge to use the template. Certainly I think it would be fine to require using the template for newly submitted NEPs. I did not remember about it when I started drafting NEP 37, and it definitely would have helped. I may still try to do a revision at some point to use the template structure. >>> >>>> >>>> I want to make an exception for merging the current NEP, for which the plan is to merge it as experimental to try in downstream PRs and get more experience. That does mean that master will be in an unreleasable state by the way, which is unusual and it'd be nice to get Chuck's explicit OK for that. But after that, I think we need a change here. I would like to hear what everyone thinks is the shape that change should take - any of my above suggestions, or something else? >>>> >>>> >>>>> >>>>> > Finally as a minor point, I know we are mostly (ex-)academics but this necessity of formal language on NEPs is self-imposed (probably PEPs are to blame) and not quite helping. It can be a bit more descriptive in my external opinion. >>>>> >>>>> TBH, I don't really know how to solve that point, so if you have any >>>>> specific suggestions, that's certainly welcome. 
I understand the frustration for a reader trying to understand all the details, with many being only described in NEP-18 [3], but we also strive to avoid rewriting things that are written elsewhere, which would also overburden those who are aware of what's being discussed.
>>>>>
>>>>> > I also share Ilhan's concern (and I mentioned this in a previous NEP discussion) that NEPs are getting pretty inaccessible. In a sense these are difficult topics and readers should be expected to have *some* familiarity with the topics being discussed, but perhaps more effort should be put into the context/motivation/background of a NEP before accepting it. One way to ensure this might be to require a final proofreading step by someone who has not been involved at all in the discussions, like peer review does for papers.
>>>>
>>>> Some variant of this proposal would be my preference.
>>>>
>>>> Cheers,
>>>> Ralf
>>>>
>>>>> [1] https://github.com/numpy/numpy/issues/14441#issuecomment-529969572
>>>>> [2] https://numpy.org/neps/nep-0035-array-creation-dispatch-with-array-function.html#usage-guidance
>>>>> [3] https://numpy.org/neps/nep-0018-array-function-protocol.html
>>>>> [4] https://numpy.org/neps/nep-0000.html#nep-workflow
>>>>> [5] https://mail.python.org/pipermail/numpy-discussion/2019-October/080176.html
>>>>
>>>> [6] https://github.com/numpy/numpy/blob/master/doc/neps/nep-template.rst
>>>> [7] https://github.com/numpy/numpy/blob/master/doc/neps/nep-0038-SIMD-optimizations.rst
>>>> [8] https://github.com/numpy/numpy/blob/master/doc/neps/nep-0041-improved-dtype-support.rst
>>>>
>>>>> On Thu, Aug 13, 2020 at 3:44 AM Juan Nunez-Iglesias wrote:
>>>>> >
>>>>> > I've generally been on the "let the NumPy devs worry about it" side of things, but I do agree with Ilhan that `like=` is confusing and `typeof=` would be a much more appropriate name for that parameter.
>>>>> >
>>>>> > I do think library writers are NumPy users and so I wouldn't really make that distinction, though. Users writing their own analysis code could very well be interested in writing code using numpy functions that will transparently work when the input is a CuPy array or whatever.
>>>>> >
>>>>> > I also share Ilhan's concern (and I mentioned this in a previous NEP discussion) that NEPs are getting pretty inaccessible. In a sense these are difficult topics and readers should be expected to have *some* familiarity with the topics being discussed, but perhaps more effort should be put into the context/motivation/background of a NEP before accepting it. One way to ensure this might be to require a final proofreading step by someone who has not been involved at all in the discussions, like peer review does for papers.
>>>>> >
>>>>> > Food for thought.
>>>>> >
>>>>> > Juan.
>>>>> >
>>>>> > On 13 Aug 2020, at 9:24 am, Ilhan Polat wrote:
>>>>> >
>>>>> > For what is worth, as a potential consumer in SciPy, it really doesn't say anything (both in NEP and the PR) about how the regular users of NumPy will benefit from this. If only and only 3rd parties are going to benefit from it, I am not sure adding a new keyword to an already confusing function is the right thing to do.
>>>>> >
>>>>> > Let me clarify,
>>>>> >
>>>>> > - This is already a very (I mean extremely very) easy keyword name to confuse with ones_like, zeros_like and by its nature any other interpretation. It is not signalling anything about the functionality that is being discussed.
I would seriously consider reserving such obvious names for really obvious tasks. Because you would also expect the shape and ndim would be mimicked by the "like"d argument but it turns out it is acting more like "typeof=" and not "like=" at all. Because if we follow the semantics it reads as "make your argument asarray like the other thing" but it is actually doing, "make your argument an array with the other thing's type" which might not be an array after all. >>>>> > >>>>> > - Again, if this is meant for downstream libraries (because that's what I got out of the PR discussion, cupy, dask, and JAX were the only examples I could read) then hiding it in another function and writing with capital letters "this is not meant for numpy users" would be a much more convenient way to separate the target audience and regular users. numpy.astypedarray([[some data], [...]], type_of=x) or whatever else it may be would be quite clean and to the point with no ambiguous keywords. >>>>> > >>>>> > I think, arriving to an agreement would be much faster if there is an executive summary of who this is intended for and what the regular usage is. Because with no offense, all I see is "dispatch", "_array_function_" and a lot of technical details of which I am absolutely ignorant. >>>>> > >>>>> > Finally as a minor point, I know we are mostly (ex-)academics but this necessity of formal language on NEPs is self-imposed (probably PEPs are to blame) and not quite helping. It can be a bit more descriptive in my external opinion. >>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at python.org >>>> https://mail.python.org/mailman/listinfo/numpy-discussion >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at python.org >>> https://mail.python.org/mailman/listinfo/numpy-discussion >>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at python.org >>> https://mail.python.org/mailman/listinfo/numpy-discussion >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion From melissawm at gmail.com Mon Aug 17 15:59:40 2020 From: melissawm at gmail.com (=?UTF-8?Q?Melissa_Mendon=C3=A7a?=) Date: Mon, 17 Aug 2020 16:59:40 -0300 Subject: [Numpy-discussion] Announcing 2020 Google Season of Docs Technical Writers for NumPy Message-ID: Hello all, I'm pleased to announce that NumPy was awarded two slots in the Google Season of Docs program (you can see the full results here: https://developers.google.com/season-of-docs/docs/participants). The selected projects are - "NumPy Documentation for Community Education", by Ryan Cooper (Proposal: https://developers.google.com/season-of-docs/docs/participants/project-numpy-cooperrc ) - "High level restructuring and end user focus", by kubedoc (Proposal: https://developers.google.com/season-of-docs/docs/participants/project-numpy-kubedoc ) We appreciate all projects that were submitted and thank all participants for their efforts in putting together their proposals. Also, if you wish to contribute documentation to NumPy on a volunteer basis, you are welcome to do so! 
Cheers,

- Melissa
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ralf.gommers at gmail.com Mon Aug 17 16:34:53 2020
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Mon, 17 Aug 2020 21:34:53 +0100
Subject: [Numpy-discussion] start of an array (tensor) and dataframe API standardization initiative
Message-ID: 

Hi all,

I'd like to share this announcement blog post about the creation of a consortium for array and dataframe API standardization here: https://data-apis.org/blog/announcing_the_consortium/. It's still in the beginning stages, but starting to take shape. We have participation from one or more maintainers of most array and tensor libraries - NumPy, TensorFlow, PyTorch, MXNet, Dask, JAX, Xarray. Stephan Hoyer, Travis Oliphant and myself have been providing input from a NumPy perspective.

The effort is very much related to some of the interoperability work we've been doing in NumPy (e.g. it could provide an answer to what's described in https://numpy.org/neps/nep-0037-array-module.html#requesting-restricted-subsets-of-numpy-s-api ).

At this point we're looking for feedback from maintainers at a high level (see the blog post for details). Also important: the python-record-api tooling and data in its repo has very granular API usage data, of the kind we could really use when making decisions that impact backwards compatibility.

Cheers,
Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From sebastian at sipsolutions.net Tue Aug 18 10:01:53 2020
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Tue, 18 Aug 2020 09:01:53 -0500
Subject: [Numpy-discussion] NumPy Community Meeting Wednesday
Message-ID: <3843fd878138cff21a73bfabd7dae3e44158c340.camel@sipsolutions.net>

Hi all,

There will be a NumPy Community meeting Wednesday August 19th at 1pm Pacific Time (20:00 UTC). Everyone is invited and encouraged to join in and edit the work-in-progress meeting topics and notes at: https://hackmd.io/76o-IxCjQX2mOXO_wwkcpg?both

Best wishes

Sebastian
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: This is a digitally signed message part
URL: 

From einstein.edison at gmail.com Tue Aug 18 10:14:38 2020
From: einstein.edison at gmail.com (Hameer Abbasi)
Date: Tue, 18 Aug 2020 16:14:38 +0200
Subject: [Numpy-discussion] [Release] PyData/Sparse 0.11.0
Message-ID: <6a3da074-45c5-428a-9c96-14ba4cca3704@Canary>

Hello,

I'm happy to announce the release of PyData/Sparse 0.11.0, available to download via pip and conda-forge. PyData/Sparse is a library that provides sparse N-dimensional arrays for the PyData ecosystem.

The official website and documentation is available at: https://sparse.pydata.org
The sources and bug tracker: https://github.com/pydata/sparse
The changelog for this release can be viewed at: https://sparse.pydata.org/en/0.11.0/changelog.html

Best Regards,
Hameer Abbasi
--
Sent from Canary (https://canarymail.io)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From asmeurer at gmail.com Wed Aug 19 20:07:05 2020
From: asmeurer at gmail.com (Aaron Meurer)
Date: Wed, 19 Aug 2020 18:07:05 -0600
Subject: [Numpy-discussion] What is up with raw boolean indices (like a[False])?
In-Reply-To: 
References: 
Message-ID: 

> > 3. If you have multiple advanced indexing you get annoying broadcasting
> > of all of these. That is *always* confusing for boolean indices.
> > 0-D should not be too special there...

OK, now that I am learning more about advanced indexing, this statement is confusing to me. It seems that scalar boolean indices do not broadcast. For example:

>>> np.arange(2)[False, np.array([True, False])]
array([], dtype=int64)
>>> np.arange(2)[tuple(np.broadcast_arrays(False, np.array([True, False])))]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed

And indeed, the docs even say, as you noted, "the nonzero equivalence for Boolean arrays does not hold for zero dimensional boolean arrays," which I guess also applies to the broadcasting.

From what I can tell, the logic is that all integer and boolean arrays (and scalar ints) are broadcast together, *except* for boolean scalars. Then the first boolean scalar is replaced with and(all boolean scalars) and the rest are removed from the index. Then that index adds a length 1 axis if it is True and 0 if it is False.

So they don't broadcast, but rather "fake broadcast". I still contend that it would be much more useful if True were a synonym for newaxis and False worked like newaxis but instead added a length 0 axis. Alternately, True and False scalars should behave exactly like all other boolean arrays with no exceptions (i.e., work like np.nonzero(), broadcast, etc.). This would be less useful, but more consistent.

Aaron Meurer
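A minimal sketch of the scalar-boolean behaviour described in the message above, assuming current NumPy semantics (the array values are arbitrary and chosen only for illustration):

>>> import numpy as np
>>> a = np.arange(3)
>>> a[True]                           # scalar True acts like adding a length-1 axis
array([[0, 1, 2]])
>>> a[False].shape                    # scalar False adds a length-0 axis instead
(0, 3)
>>> a[np.array([True, False, True])]  # a boolean *array* selects elements as usual
array([0, 2])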
From sebastian at sipsolutions.net Wed Aug 19 20:55:03 2020
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Wed, 19 Aug 2020 19:55:03 -0500
Subject: [Numpy-discussion] What is up with raw boolean indices (like a[False])?
In-Reply-To: 
References: 
Message-ID: <0c3ec8485726fc06dc60ee0c5d803d75c99b7e90.camel@sipsolutions.net>

On Wed, 2020-08-19 at 18:07 -0600, Aaron Meurer wrote:
> > > 3. If you have multiple advanced indexing you get annoying broadcasting
> > > of all of these. That is *always* confusing for boolean indices.
> > > 0-D should not be too special there...
>
> OK, now that I am learning more about advanced indexing, this
> statement is confusing to me. It seems that scalar boolean indices do
> not broadcast. For example:

Well, broadcasting means you broadcast the *nonzero result* unless I am very confused... There is a reason I dismissed it. We could (and arguably should) just deprecate it. And I have doubts anyone would even notice.

> >>> np.arange(2)[False, np.array([True, False])]
> array([], dtype=int64)
> >>> np.arange(2)[tuple(np.broadcast_arrays(False, np.array([True, False])))]
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed
>
> And indeed, the docs even say, as you noted, "the nonzero equivalence
> for Boolean arrays does not hold for zero dimensional boolean arrays,"
> which I guess also applies to the broadcasting.

I actually think that probably also holds. Nonzero just behaves weird for 0-D arrays (because it returns a tuple). But since broadcasting the nonzero result is so weird, and since 0-D booleans require some additional logic and don't generalize 100% (code wise), I won't rule out there are differences.

> From what I can tell, the logic is that all integer and boolean
> arrays

Did you try that? Because as I said above, IIRC broadcasting the boolean array without first calling `nonzero` isn't really whats going on. And I don't know how it could be whats going on, since adding dimensions to a boolean index would have much more implications?

- Sebastian

> (and scalar ints) are broadcast together, *except* for boolean
> scalars. Then the first boolean scalar is replaced with and(all
> boolean scalars) and the rest are removed from the index. Then that
> index adds a length 1 axis if it is True and 0 if it is False.
>
> So they don't broadcast, but rather "fake broadcast". I still contend
> that it would be much more useful, if True were a synonym for newaxis
> and False worked like newaxis but instead added a length 0 axis.
> Alternately, True and False scalars should behave exactly like all
> other boolean arrays with no exceptions (i.e., work like np.nonzero(),
> broadcast, etc.). This would be less useful, but more consistent.
>
> Aaron Meurer
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: This is a digitally signed message part
URL: 

From asmeurer at gmail.com Wed Aug 19 21:37:51 2020
From: asmeurer at gmail.com (Aaron Meurer)
Date: Wed, 19 Aug 2020 19:37:51 -0600
Subject: [Numpy-discussion] Why does fancy indexing work like this?
In-Reply-To: 
References: <66ed40caf93a7f24672ce511370ce38176a6eff1.camel@sipsolutions.net> <3f4ba7b2363e17e1d08be90f19aff38ae9860eca.camel@sipsolutions.net> <8773e52959d499e668f06d60d49a068bc6a77d43.camel@sipsolutions.net>
Message-ID: 

These cases don't give any deprecation warnings in NumPy master:

>>> np.arange(0)[np.array([0]), False]
array([], dtype=int64)
>>> np.arange(0).reshape((0, 0))[np.array([0]), np.array([], dtype=int)]
array([], dtype=int64)

Is that intentional?

Aaron Meurer

On Thu, Jul 23, 2020 at 12:18 PM Aaron Meurer wrote:
> > After writing this, I realized that I actually remember the *opposite*
> > discussion occurring before. I think in some of the equality
> > deprecations, we actually raise the new error due to an internal
> > try/except clause. And there was a complaint that its confusing that a
> > non-deprecation-warning is raised when the error will only happen with
> > DeprecationWarnings being set to error.
> >
> > - Sebastian
>
> I noticed that warnings.catch_warnings does the right thing with
> warnings that are raised alongside an exception (although it is a bit
> clunky to use).
>
> Aaron Meurer

From sebastian at sipsolutions.net Wed Aug 19 22:18:16 2020
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Wed, 19 Aug 2020 21:18:16 -0500
Subject: [Numpy-discussion] Why does fancy indexing work like this?
In-Reply-To: 
References: <66ed40caf93a7f24672ce511370ce38176a6eff1.camel@sipsolutions.net> <3f4ba7b2363e17e1d08be90f19aff38ae9860eca.camel@sipsolutions.net> <8773e52959d499e668f06d60d49a068bc6a77d43.camel@sipsolutions.net>
Message-ID: 

On Wed, 2020-08-19 at 19:37 -0600, Aaron Meurer wrote:
> These cases don't give any deprecation warnings in NumPy master:
>
> >>> np.arange(0)[np.array([0]), False]
> array([], dtype=int64)
> >>> np.arange(0).reshape((0, 0))[np.array([0]), np.array([], dtype=int)]
> array([], dtype=int64)
>
> Is that intentional?

I guess it follows from `np.array([[1]])[[], [10]]` also not failing currently. And that was intentional not to deprecate when out-of-bound indices broadcast away. But I am not sure I actually think that was the better choice. My initial choice was that this would be an error as well, and I still slightly prefer it, but don't feel it matters much.

- Sebastian
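A small sketch of the behaviour Sebastian refers to above, assuming current NumPy semantics; the in-bounds contrast case and the exact wording of the error are added here for illustration only:

>>> import numpy as np
>>> # The empty first index broadcasts the pair of indices to size 0,
>>> # so the out-of-bounds value 10 is never actually checked:
>>> np.array([[1]])[[], [10]]
array([], dtype=int64)
>>> # With a non-empty first index the bounds check does fire:
>>> np.array([[1]])[[0], [10]]
Traceback (most recent call last):
  ...
IndexError: index 10 is out of bounds for axis 1 with size 1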
> > Aaron Meurer
> >
> > On Thu, Jul 23, 2020 at 12:18 PM Aaron Meurer wrote:
> > > After writing this, I realized that I actually remember the *opposite*
> > > discussion occurring before. I think in some of the equality
> > > deprecations, we actually raise the new error due to an internal
> > > try/except clause. And there was a complaint that its confusing that a
> > > non-deprecation-warning is raised when the error will only happen with
> > > DeprecationWarnings being set to error.
> > >
> > > - Sebastian
> >
> > I noticed that warnings.catch_warnings does the right thing with
> > warnings that are raised alongside an exception (although it is a bit
> > clunky to use).
> >
> > Aaron Meurer
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: This is a digitally signed message part
URL: 

From ryan.c.cooper at uconn.edu Thu Aug 20 13:10:58 2020
From: ryan.c.cooper at uconn.edu (cooperrc)
Date: Thu, 20 Aug 2020 10:10:58 -0700 (MST)
Subject: [Numpy-discussion] Feature requests/Enhancements for upper-level engineering students
Message-ID: <1597943458736-0.post@n7.nabble.com>

Greetings,

As the Fall semester is fast approaching (10 days away for us at UConn), we are looking for senior design (also called capstone) projects for the 2020-2021 school year. The COVID situation has strengthened the need for remote work. The process here is that students are assigned to projects by late September. Then, they have 6 main deliverables over the course of 2 semesters:

1. Initial Fall Presentation (~Oct)
2. Final Fall Presentation (~Dec)
3. Mid-year report (~Jan)
4. Initial Spring Presentation (~Mar)
5. Final Spring Presentation (~Apr)
6. Final report (~May)

My question to the NumPy community is: Are there any features or enhancements that would be nice to have, but might not have a team dedicated to the idea? I would be happy to advise any projects that people are interested in proposing. I would like to hear what people think would be worthwhile for students to build together.

Some background, these students have all used Python and Matlab for mechanical engineering applications like linear regression, modal analyses, ode integration, and root solving. They learn quickly, but may not be interested in UX/UI design problems.

--
Sent from: http://numpy-discussion.10968.n7.nabble.com/

From asmeurer at gmail.com Thu Aug 20 14:21:50 2020
From: asmeurer at gmail.com (Aaron Meurer)
Date: Thu, 20 Aug 2020 12:21:50 -0600
Subject: [Numpy-discussion] What is up with raw boolean indices (like a[False])?
In-Reply-To: <0c3ec8485726fc06dc60ee0c5d803d75c99b7e90.camel@sipsolutions.net>
References: <0c3ec8485726fc06dc60ee0c5d803d75c99b7e90.camel@sipsolutions.net>
Message-ID: 

You're right. I was confusing the broadcasting logic for boolean arrays.
However, I did find this example

>>> np.arange(10).reshape((2, 5))[np.array([[0, 0, 0, 0, 0]], dtype=np.int64), False]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes (1,5) (0,)

That certainly seems to imply there is some broadcasting being done.

Aaron Meurer

On Wed, Aug 19, 2020 at 6:55 PM Sebastian Berg wrote:
>
> On Wed, 2020-08-19 at 18:07 -0600, Aaron Meurer wrote:
> > > > 3. If you have multiple advanced indexing you get annoying broadcasting
> > > > of all of these. That is *always* confusing for boolean indices.
> > > > 0-D should not be too special there...
> >
> > OK, now that I am learning more about advanced indexing, this
> > statement is confusing to me. It seems that scalar boolean indices do
> > not broadcast. For example:
>
> Well, broadcasting means you broadcast the *nonzero result* unless I am
> very confused... There is a reason I dismissed it. We could (and
> arguably should) just deprecate it. And I have doubts anyone would
> even notice.
>
> > >>> np.arange(2)[False, np.array([True, False])]
> > array([], dtype=int64)
> > >>> np.arange(2)[tuple(np.broadcast_arrays(False, np.array([True, False])))]
> > Traceback (most recent call last):
> >   File "<stdin>", line 1, in <module>
> > IndexError: too many indices for array: array is 1-dimensional, but 2
> > were indexed
> >
> > And indeed, the docs even say, as you noted, "the nonzero equivalence
> > for Boolean arrays does not hold for zero dimensional boolean arrays,"
> > which I guess also applies to the broadcasting.
>
> I actually think that probably also holds. Nonzero just behave weird
> for 0D because arrays (because it returns a tuple).
> But since broadcasting the nonzero result is so weird, and since 0-D
> booleans require some additional logic and don't generalize 100% (code
> wise), I won't rule out there are differences.
>
> > From what I can tell, the logic is that all integer and boolean
> > arrays
>
> Did you try that? Because as I said above, IIRC broadcasting the
> boolean array without first calling `nonzero` isn't really whats going
> on. And I don't know how it could be whats going on, since adding
> dimensions to a boolean index would have much more implications?
>
> - Sebastian
>
> > (and scalar ints) are broadcast together, *except* for boolean
> > scalars. Then the first boolean scalar is replaced with and(all
> > boolean scalars) and the rest are removed from the index. Then that
> > index adds a length 1 axis if it is True and 0 if it is False.
> >
> > So they don't broadcast, but rather "fake broadcast". I still contend
> > that it would be much more useful, if True were a synonym for newaxis
> > and False worked like newaxis but instead added a length 0 axis.
> > Alternately, True and False scalars should behave exactly like all
> > other boolean arrays with no exceptions (i.e., work like
> > np.nonzero(), broadcast, etc.). This would be less useful, but more consistent.
> > > > Aaron Meurer > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion From kmckenna at baselinesw.com Thu Aug 20 17:21:57 2020 From: kmckenna at baselinesw.com (KevinBaselinesw) Date: Thu, 20 Aug 2020 14:21:57 -0700 (MST) Subject: [Numpy-discussion] Feature requests/Enhancements for upper-level engineering students In-Reply-To: <1597943458736-0.post@n7.nabble.com> References: <1597943458736-0.post@n7.nabble.com> Message-ID: <1597958517453-0.post@n7.nabble.com> would your team be interested in contributing to my port of Numpy to .NET? https://github.com/Quansight-Labs/numpy.net I have the vast majority of the Numpy core working as a pure .NET library. All of the other libraries that rely on Numpy are not ported. I am sure we could find some good projects for your team to work on. These would be "green field" projects and would likely be great learning opportunities for them. -- Sent from: http://numpy-discussion.10968.n7.nabble.com/ From sebastian at sipsolutions.net Thu Aug 20 17:50:17 2020 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Thu, 20 Aug 2020 16:50:17 -0500 Subject: [Numpy-discussion] What is up with raw boolean indices (like a[False])? In-Reply-To: References: <0c3ec8485726fc06dc60ee0c5d803d75c99b7e90.camel@sipsolutions.net> Message-ID: <7fe7ee7bff9b3cc72518513b1005fd2048ea368a.camel@sipsolutions.net> On Thu, 2020-08-20 at 12:21 -0600, Aaron Meurer wrote: > You're right. I was confusing the broadcasting logic for boolean > arrays. > > However, I did find this example > > > > > np.arange(10).reshape((2, 5))[np.array([[0, 0, 0, 0, 0]], > > > > dtype=np.int64), False] > Traceback (most recent call last): > File "", line 1, in > IndexError: shape mismatch: indexing arrays could not be broadcast > together with shapes (1,5) (0,) > > That certainly seems to imply there is some broadcasting being done. Yes, it broadcasts the array after converting it with `nonzero`, i.e. its much the same as: indices = [[0, 0, 0, 0, 0]], *np.nonzero(False) indices = np.broadcast_arrays(*indices) will give the same result (see also `np.ix_` which converts booleans as well for this reason, to give you outer indexing). I was half way through a mock-up/pseudo code, but thought you likely wasn't sure it was ending up clear. It sounds like things are probably falling into place for you (if they are not, let me know what might help you): 1. Convert all boolean indices into a series of integer indices using `np.nonzero(index)` 2. For True/False scalars, that doesn't work, because `np.nonzero()`. `nonzero` gave us an index array (which is good, we obviously want one), but we need to index into `boolean_index.ndim == 0` dimensions! So that won't work, the approach using `nonzero` cannot generalize here, although boolean indices generalize perfectly. The solution to the dilemma is simple: If we have to index one dimension, but should be indexing zero, then we simply add that dimension to the original array (or at least pretend there was an additional dimension). 3. Do normal indexing with the result *including broadcasting*, we forget it was converted. 
The other way to solve it would be to always reshape the original array to combine all axes being indexed by a single boolean index into one axis and then index it using `np.flatnonzero`. (But that would get a different result if you try to broadcast!) In any case, I am not sure I would bother with making sense of this, except for sports! Its pretty much nonsense and I think the time understanding it is probably better spend deprecating it. The only reason I did not Deprecate itt before, is that I tried to do be minimal in the changes when I rewrote advanced indexing (and generalized boolean scalars correctly) long ago. That was likely the right start/choice at the time, since there were much bigger fish to catch, but I do not think anything is holding us back now. Cheers, Sebastian > > Aaron Meurer > > On Wed, Aug 19, 2020 at 6:55 PM Sebastian Berg > wrote: > > On Wed, 2020-08-19 at 18:07 -0600, Aaron Meurer wrote: > > > > > 3. If you have multiple advanced indexing you get annoying > > > > > broadcasting > > > > > of all of these. That is *always* confusing for boolean > > > > > indices. > > > > > 0-D should not be too special there... > > > > > > OK, now that I am learning more about advanced indexing, this > > > statement is confusing to me. It seems that scalar boolean > > > indices do > > > not broadcast. For example: > > > > Well, broadcasting means you broadcast the *nonzero result* unless > > I am > > very confused... There is a reason I dismissed it. We could (and > > arguably should) just deprecate it. And I have doubts anyone would > > even notice. > > > > > > > > np.arange(2)[False, np.array([True, False])] > > > array([], dtype=int64) > > > > > > np.arange(2)[tuple(np.broadcast_arrays(False, > > > > > > np.array([True, > > > > > > False])))] > > > Traceback (most recent call last): > > > File "", line 1, in > > > IndexError: too many indices for array: array is 1-dimensional, > > > but 2 > > > were indexed > > > > > > And indeed, the docs even say, as you noted, "the nonzero > > > equivalence > > > for Boolean arrays does not hold for zero dimensional boolean > > > arrays," > > > which I guess also applies to the broadcasting. > > > > I actually think that probably also holds. Nonzero just behave > > weird > > for 0D because arrays (because it returns a tuple). > > But since broadcasting the nonzero result is so weird, and since 0- > > D > > booleans require some additional logic and don't generalize 100% > > (code > > wise), I won't rule out there are differences. > > > > > From what I can tell, the logic is that all integer and boolean > > > arrays > > > > Did you try that? Because as I said above, IIRC broadcasting the > > boolean array without first calling `nonzero` isn't really whats > > going > > on. And I don't know how it could be whats going on, since adding > > dimensions to a boolean index would have much more implications? > > > > - Sebastian > > > > > > > (and scalar ints) are broadcast together, *except* for boolean > > > scalars. Then the first boolean scalar is replaced with and(all > > > boolean scalars) and the rest are removed from the index. Then > > > that > > > index adds a length 1 axis if it is True and 0 if it is False. > > > > > > So they don't broadcast, but rather "fake broadcast". I still > > > contend > > > that it would be much more useful, if True were a synonym for > > > newaxis > > > and False worked like newaxis but instead added a length 0 axis. 
> > > Alternately, True and False scalars should behave exactly like > > > all > > > other boolean arrays with no exceptions (i.e., work like > > > np.nonzero(), > > > broadcast, etc.). This would be less useful, but more consistent. > > > > > > Aaron Meurer > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at python.org > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From sebastian at sipsolutions.net Thu Aug 20 17:55:40 2020 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Thu, 20 Aug 2020 16:55:40 -0500 Subject: [Numpy-discussion] What is up with raw boolean indices (like a[False])? In-Reply-To: <7fe7ee7bff9b3cc72518513b1005fd2048ea368a.camel@sipsolutions.net> References: <0c3ec8485726fc06dc60ee0c5d803d75c99b7e90.camel@sipsolutions.net> <7fe7ee7bff9b3cc72518513b1005fd2048ea368a.camel@sipsolutions.net> Message-ID: <3e3b8ffedf5efbcf2d9f071a2bcedacfe38d7820.camel@sipsolutions.net> On Thu, 2020-08-20 at 16:50 -0500, Sebastian Berg wrote: > On Thu, 2020-08-20 at 12:21 -0600, Aaron Meurer wrote: > > You're right. I was confusing the broadcasting logic for boolean > > arrays. > > > > However, I did find this example > > > > > > > np.arange(10).reshape((2, 5))[np.array([[0, 0, 0, 0, 0]], > > > > > dtype=np.int64), False] > > Traceback (most recent call last): > > File "", line 1, in > > IndexError: shape mismatch: indexing arrays could not be broadcast > > together with shapes (1,5) (0,) > > > > That certainly seems to imply there is some broadcasting being > > done. > > Yes, it broadcasts the array after converting it with `nonzero`, i.e. > its much the same as: > > indices = [[0, 0, 0, 0, 0]], *np.nonzero(False) > indices = np.broadcast_arrays(*indices) > > will give the same result (see also `np.ix_` which converts booleans > as > well for this reason, to give you outer indexing). > I was half way through a mock-up/pseudo code, but thought you likely > wasn't sure it was ending up clear. It sounds like things are > probably > falling into place for you (if they are not, let me know what might > help you): Sorry editing error up there, in short I hope those steps sense to you, note that the broadcasting is basically part of a later "integer only" indexing step, and the `nonzero` part is pre-processing. > > 1. Convert all boolean indices into a series of integer indices using > `np.nonzero(index)` > > 2. For True/False scalars, that doesn't work, because `np.nonzero()`. > > `nonzero` gave us an index array (which is good, we obviously want > > one), but we need to index into `boolean_index.ndim == 0` > dimensions! > So that won't work, the approach using `nonzero` cannot generalize > > here, although boolean indices generalize perfectly. 
> > The solution to the dilemma is simple: If we have to index one > dimension, but should be indexing zero, then we simply add that > dimension to the original array (or at least pretend there was > an additional dimension). > > 3. Do normal indexing with the result *including broadcasting*, > we forget it was converted. > > The other way to solve it would be to always reshape the original > array > to combine all axes being indexed by a single boolean index into one > axis and then index it using `np.flatnonzero`. (But that would get a > different result if you try to broadcast!) > > > In any case, I am not sure I would bother with making sense of this, > except for sports! > Its pretty much nonsense and I think the time understanding it is > probably better spend deprecating it. The only reason I did not > Deprecate itt before, is that I tried to do be minimal in the changes > when I rewrote advanced indexing (and generalized boolean scalars > correctly) long ago. That was likely the right start/choice at the > time, since there were much bigger fish to catch, but I do not think > anything is holding us back now. > > Cheers, > > Sebastian > > > > Aaron Meurer > > > > On Wed, Aug 19, 2020 at 6:55 PM Sebastian Berg > > wrote: > > > On Wed, 2020-08-19 at 18:07 -0600, Aaron Meurer wrote: > > > > > > 3. If you have multiple advanced indexing you get annoying > > > > > > broadcasting > > > > > > of all of these. That is *always* confusing for boolean > > > > > > indices. > > > > > > 0-D should not be too special there... > > > > > > > > OK, now that I am learning more about advanced indexing, this > > > > statement is confusing to me. It seems that scalar boolean > > > > indices do > > > > not broadcast. For example: > > > > > > Well, broadcasting means you broadcast the *nonzero result* > > > unless > > > I am > > > very confused... There is a reason I dismissed it. We could (and > > > arguably should) just deprecate it. And I have doubts anyone > > > would > > > even notice. > > > > > > > > > > np.arange(2)[False, np.array([True, False])] > > > > array([], dtype=int64) > > > > > > > np.arange(2)[tuple(np.broadcast_arrays(False, > > > > > > > np.array([True, > > > > > > > False])))] > > > > Traceback (most recent call last): > > > > File "", line 1, in > > > > IndexError: too many indices for array: array is 1-dimensional, > > > > but 2 > > > > were indexed > > > > > > > > And indeed, the docs even say, as you noted, "the nonzero > > > > equivalence > > > > for Boolean arrays does not hold for zero dimensional boolean > > > > arrays," > > > > which I guess also applies to the broadcasting. > > > > > > I actually think that probably also holds. Nonzero just behave > > > weird > > > for 0D because arrays (because it returns a tuple). > > > But since broadcasting the nonzero result is so weird, and since > > > 0- > > > D > > > booleans require some additional logic and don't generalize 100% > > > (code > > > wise), I won't rule out there are differences. > > > > > > > From what I can tell, the logic is that all integer and boolean > > > > arrays > > > > > > Did you try that? Because as I said above, IIRC broadcasting the > > > boolean array without first calling `nonzero` isn't really whats > > > going > > > on. And I don't know how it could be whats going on, since adding > > > dimensions to a boolean index would have much more implications? > > > > > > - Sebastian > > > > > > > > > > (and scalar ints) are broadcast together, *except* for boolean > > > > scalars. 
Then the first boolean scalar is replaced with and(all > > > > boolean scalars) and the rest are removed from the index. Then > > > > that > > > > index adds a length 1 axis if it is True and 0 if it is False. > > > > > > > > So they don't broadcast, but rather "fake broadcast". I still > > > > contend > > > > that it would be much more useful, if True were a synonym for > > > > newaxis > > > > and False worked like newaxis but instead added a length 0 > > > > axis. > > > > Alternately, True and False scalars should behave exactly like > > > > all > > > > other boolean arrays with no exceptions (i.e., work like > > > > np.nonzero(), > > > > broadcast, etc.). This would be less useful, but more > > > > consistent. > > > > > > > > Aaron Meurer > > > > _______________________________________________ > > > > NumPy-Discussion mailing list > > > > NumPy-Discussion at python.org > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at python.org > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From asmeurer at gmail.com Thu Aug 20 18:00:46 2020 From: asmeurer at gmail.com (Aaron Meurer) Date: Thu, 20 Aug 2020 16:00:46 -0600 Subject: [Numpy-discussion] What is up with raw boolean indices (like a[False])? In-Reply-To: <3e3b8ffedf5efbcf2d9f071a2bcedacfe38d7820.camel@sipsolutions.net> References: <0c3ec8485726fc06dc60ee0c5d803d75c99b7e90.camel@sipsolutions.net> <7fe7ee7bff9b3cc72518513b1005fd2048ea368a.camel@sipsolutions.net> <3e3b8ffedf5efbcf2d9f071a2bcedacfe38d7820.camel@sipsolutions.net> Message-ID: Just to be clear, what exactly do you think should be deprecated? Boolean scalar indices in general, or just boolean scalars combined with other arrays, or something else? Aaron Meurer On Thu, Aug 20, 2020 at 3:56 PM Sebastian Berg wrote: > > On Thu, 2020-08-20 at 16:50 -0500, Sebastian Berg wrote: > > On Thu, 2020-08-20 at 12:21 -0600, Aaron Meurer wrote: > > > You're right. I was confusing the broadcasting logic for boolean > > > arrays. > > > > > > However, I did find this example > > > > > > > > > np.arange(10).reshape((2, 5))[np.array([[0, 0, 0, 0, 0]], > > > > > > dtype=np.int64), False] > > > Traceback (most recent call last): > > > File "", line 1, in > > > IndexError: shape mismatch: indexing arrays could not be broadcast > > > together with shapes (1,5) (0,) > > > > > > That certainly seems to imply there is some broadcasting being > > > done. > > > > Yes, it broadcasts the array after converting it with `nonzero`, i.e. > > its much the same as: > > > > indices = [[0, 0, 0, 0, 0]], *np.nonzero(False) > > indices = np.broadcast_arrays(*indices) > > > > will give the same result (see also `np.ix_` which converts booleans > > as > > well for this reason, to give you outer indexing). > > I was half way through a mock-up/pseudo code, but thought you likely > > wasn't sure it was ending up clear. 
It sounds like things are > > probably > > falling into place for you (if they are not, let me know what might > > help you): > > Sorry editing error up there, in short I hope those steps sense to you, > note that the broadcasting is basically part of a later "integer only" > indexing step, and the `nonzero` part is pre-processing. > > > > > 1. Convert all boolean indices into a series of integer indices using > > `np.nonzero(index)` > > > > 2. For True/False scalars, that doesn't work, because `np.nonzero()`. > > > > `nonzero` gave us an index array (which is good, we obviously want > > > > one), but we need to index into `boolean_index.ndim == 0` > > dimensions! > > So that won't work, the approach using `nonzero` cannot generalize > > > > here, although boolean indices generalize perfectly. > > > > The solution to the dilemma is simple: If we have to index one > > dimension, but should be indexing zero, then we simply add that > > dimension to the original array (or at least pretend there was > > an additional dimension). > > > > 3. Do normal indexing with the result *including broadcasting*, > > we forget it was converted. > > > > The other way to solve it would be to always reshape the original > > array > > to combine all axes being indexed by a single boolean index into one > > axis and then index it using `np.flatnonzero`. (But that would get a > > different result if you try to broadcast!) > > > > > > In any case, I am not sure I would bother with making sense of this, > > except for sports! > > Its pretty much nonsense and I think the time understanding it is > > probably better spend deprecating it. The only reason I did not > > Deprecate itt before, is that I tried to do be minimal in the changes > > when I rewrote advanced indexing (and generalized boolean scalars > > correctly) long ago. That was likely the right start/choice at the > > time, since there were much bigger fish to catch, but I do not think > > anything is holding us back now. > > > > Cheers, > > > > Sebastian > > > > > > > Aaron Meurer > > > > > > On Wed, Aug 19, 2020 at 6:55 PM Sebastian Berg > > > wrote: > > > > On Wed, 2020-08-19 at 18:07 -0600, Aaron Meurer wrote: > > > > > > > 3. If you have multiple advanced indexing you get annoying > > > > > > > broadcasting > > > > > > > of all of these. That is *always* confusing for boolean > > > > > > > indices. > > > > > > > 0-D should not be too special there... > > > > > > > > > > OK, now that I am learning more about advanced indexing, this > > > > > statement is confusing to me. It seems that scalar boolean > > > > > indices do > > > > > not broadcast. For example: > > > > > > > > Well, broadcasting means you broadcast the *nonzero result* > > > > unless > > > > I am > > > > very confused... There is a reason I dismissed it. We could (and > > > > arguably should) just deprecate it. And I have doubts anyone > > > > would > > > > even notice. 
> > > > > > > > > > > > np.arange(2)[False, np.array([True, False])] > > > > > array([], dtype=int64) > > > > > > > > np.arange(2)[tuple(np.broadcast_arrays(False, > > > > > > > > np.array([True, > > > > > > > > False])))] > > > > > Traceback (most recent call last): > > > > > File "", line 1, in > > > > > IndexError: too many indices for array: array is 1-dimensional, > > > > > but 2 > > > > > were indexed > > > > > > > > > > And indeed, the docs even say, as you noted, "the nonzero > > > > > equivalence > > > > > for Boolean arrays does not hold for zero dimensional boolean > > > > > arrays," > > > > > which I guess also applies to the broadcasting. > > > > > > > > I actually think that probably also holds. Nonzero just behave > > > > weird > > > > for 0D because arrays (because it returns a tuple). > > > > But since broadcasting the nonzero result is so weird, and since > > > > 0- > > > > D > > > > booleans require some additional logic and don't generalize 100% > > > > (code > > > > wise), I won't rule out there are differences. > > > > > > > > > From what I can tell, the logic is that all integer and boolean > > > > > arrays > > > > > > > > Did you try that? Because as I said above, IIRC broadcasting the > > > > boolean array without first calling `nonzero` isn't really whats > > > > going > > > > on. And I don't know how it could be whats going on, since adding > > > > dimensions to a boolean index would have much more implications? > > > > > > > > - Sebastian > > > > > > > > > > > > > (and scalar ints) are broadcast together, *except* for boolean > > > > > scalars. Then the first boolean scalar is replaced with and(all > > > > > boolean scalars) and the rest are removed from the index. Then > > > > > that > > > > > index adds a length 1 axis if it is True and 0 if it is False. > > > > > > > > > > So they don't broadcast, but rather "fake broadcast". I still > > > > > contend > > > > > that it would be much more useful, if True were a synonym for > > > > > newaxis > > > > > and False worked like newaxis but instead added a length 0 > > > > > axis. > > > > > Alternately, True and False scalars should behave exactly like > > > > > all > > > > > other boolean arrays with no exceptions (i.e., work like > > > > > np.nonzero(), > > > > > broadcast, etc.). This would be less useful, but more > > > > > consistent. 
> > > > > > > > > > Aaron Meurer > > > > > _______________________________________________ > > > > > NumPy-Discussion mailing list > > > > > NumPy-Discussion at python.org > > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > > > > > > > _______________________________________________ > > > > NumPy-Discussion mailing list > > > > NumPy-Discussion at python.org > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at python.org > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion From sebastian at sipsolutions.net Thu Aug 20 18:37:43 2020 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Thu, 20 Aug 2020 17:37:43 -0500 Subject: [Numpy-discussion] What is up with raw boolean indices (like a[False])? In-Reply-To: (sfid-20200821_000224_575232_ABA06109) References: <0c3ec8485726fc06dc60ee0c5d803d75c99b7e90.camel@sipsolutions.net> <7fe7ee7bff9b3cc72518513b1005fd2048ea368a.camel@sipsolutions.net> <3e3b8ffedf5efbcf2d9f071a2bcedacfe38d7820.camel@sipsolutions.net> (sfid-20200821_000224_575232_ABA06109) Message-ID: <5a76b9753c84a324d78cf30d78fc1a3459467c9d.camel@sipsolutions.net> On Thu, 2020-08-20 at 16:00 -0600, Aaron Meurer wrote: > Just to be clear, what exactly do you think should be deprecated? > Boolean scalar indices in general, or just boolean scalars combined > with other arrays, or something else? My angle is that we should allow only: * Any number of integer array indices (ideally only explicitly with `arr.vindex[]`, but we do not have that luxury right now.) * A single boolean index (array or scalar is identical) but no mix of the above (including multiple boolean indices). Because I think they are at least one level more confusing than multiple advanced indices. I admit, I forgot that the broadcasting logic is fine in this case: arr = np.zeros((2, 3)) arr[[True], np.array(3)] where the advanced index is also a scalar index. In that case the result is straight forward, since broadcasting does not affect `np.array(3)`. I am happy to be wrong about that assessment, but I think your opinion on it could likely push us towards just doing a Deprecation. The only use case for "multiple boolean indices" that I could think of was this: arr = np.diag([1, 2, 3, 4]) # 2-d square array indx = arr.diagonal() > 2 # mask for each row and column masked_diagonal = arr[indx, indx] print(repr(masked_diagonal)) # array([3, 4]) and my guess is that the reaction to that code is a: "Wait what?!" That code might seem reasonable, but it only works if you have the exact same number of `True` values in the two indices. And if you have the exact same number but two different arrays, then I fail to reason about the result without doing the `nonzero` step, which I think indicates that there just is no logical concept for it. So, I think we may be better of forcing the few power-user who may have found a use for this type of nugget to use `np.nonzero()` or find another solution. 
- Sebastian > > Aaron Meurer > > On Thu, Aug 20, 2020 at 3:56 PM Sebastian Berg > wrote: > > On Thu, 2020-08-20 at 16:50 -0500, Sebastian Berg wrote: > > > On Thu, 2020-08-20 at 12:21 -0600, Aaron Meurer wrote: > > > > You're right. I was confusing the broadcasting logic for > > > > boolean > > > > arrays. > > > > > > > > However, I did find this example > > > > > > > > > > > np.arange(10).reshape((2, 5))[np.array([[0, 0, 0, 0, 0]], > > > > > > > dtype=np.int64), False] > > > > Traceback (most recent call last): > > > > File "", line 1, in > > > > IndexError: shape mismatch: indexing arrays could not be > > > > broadcast > > > > together with shapes (1,5) (0,) > > > > > > > > That certainly seems to imply there is some broadcasting being > > > > done. > > > > > > Yes, it broadcasts the array after converting it with `nonzero`, > > > i.e. > > > its much the same as: > > > > > > indices = [[0, 0, 0, 0, 0]], *np.nonzero(False) > > > indices = np.broadcast_arrays(*indices) > > > > > > will give the same result (see also `np.ix_` which converts > > > booleans > > > as > > > well for this reason, to give you outer indexing). > > > I was half way through a mock-up/pseudo code, but thought you > > > likely > > > wasn't sure it was ending up clear. It sounds like things are > > > probably > > > falling into place for you (if they are not, let me know what > > > might > > > help you): > > > > Sorry editing error up there, in short I hope those steps sense to > > you, > > note that the broadcasting is basically part of a later "integer > > only" > > indexing step, and the `nonzero` part is pre-processing. > > > > > 1. Convert all boolean indices into a series of integer indices > > > using > > > `np.nonzero(index)` > > > > > > 2. For True/False scalars, that doesn't work, because > > > `np.nonzero()`. > > > > > > `nonzero` gave us an index array (which is good, we obviously > > > want > > > > > > one), but we need to index into `boolean_index.ndim == 0` > > > dimensions! > > > So that won't work, the approach using `nonzero` cannot > > > generalize > > > > > > here, although boolean indices generalize perfectly. > > > > > > The solution to the dilemma is simple: If we have to index one > > > dimension, but should be indexing zero, then we simply add > > > that > > > dimension to the original array (or at least pretend there was > > > an additional dimension). > > > > > > 3. Do normal indexing with the result *including broadcasting*, > > > we forget it was converted. > > > > > > The other way to solve it would be to always reshape the original > > > array > > > to combine all axes being indexed by a single boolean index into > > > one > > > axis and then index it using `np.flatnonzero`. (But that would > > > get a > > > different result if you try to broadcast!) > > > > > > > > > In any case, I am not sure I would bother with making sense of > > > this, > > > except for sports! > > > Its pretty much nonsense and I think the time understanding it is > > > probably better spend deprecating it. The only reason I did not > > > Deprecate itt before, is that I tried to do be minimal in the > > > changes > > > when I rewrote advanced indexing (and generalized boolean scalars > > > correctly) long ago. That was likely the right start/choice at > > > the > > > time, since there were much bigger fish to catch, but I do not > > > think > > > anything is holding us back now. 
> > > > > > Cheers, > > > > > > Sebastian > > > > > > > > > > Aaron Meurer > > > > > > > > On Wed, Aug 19, 2020 at 6:55 PM Sebastian Berg > > > > wrote: > > > > > On Wed, 2020-08-19 at 18:07 -0600, Aaron Meurer wrote: > > > > > > > > 3. If you have multiple advanced indexing you get > > > > > > > > annoying > > > > > > > > broadcasting > > > > > > > > of all of these. That is *always* confusing for > > > > > > > > boolean > > > > > > > > indices. > > > > > > > > 0-D should not be too special there... > > > > > > > > > > > > OK, now that I am learning more about advanced indexing, > > > > > > this > > > > > > statement is confusing to me. It seems that scalar boolean > > > > > > indices do > > > > > > not broadcast. For example: > > > > > > > > > > Well, broadcasting means you broadcast the *nonzero result* > > > > > unless > > > > > I am > > > > > very confused... There is a reason I dismissed it. We could > > > > > (and > > > > > arguably should) just deprecate it. And I have doubts anyone > > > > > would > > > > > even notice. > > > > > > > > > > > > > > np.arange(2)[False, np.array([True, False])] > > > > > > array([], dtype=int64) > > > > > > > > > np.arange(2)[tuple(np.broadcast_arrays(False, > > > > > > > > > np.array([True, > > > > > > > > > False])))] > > > > > > Traceback (most recent call last): > > > > > > File "", line 1, in > > > > > > IndexError: too many indices for array: array is 1- > > > > > > dimensional, > > > > > > but 2 > > > > > > were indexed > > > > > > > > > > > > And indeed, the docs even say, as you noted, "the nonzero > > > > > > equivalence > > > > > > for Boolean arrays does not hold for zero dimensional > > > > > > boolean > > > > > > arrays," > > > > > > which I guess also applies to the broadcasting. > > > > > > > > > > I actually think that probably also holds. Nonzero just > > > > > behave > > > > > weird > > > > > for 0D because arrays (because it returns a tuple). > > > > > But since broadcasting the nonzero result is so weird, and > > > > > since > > > > > 0- > > > > > D > > > > > booleans require some additional logic and don't generalize > > > > > 100% > > > > > (code > > > > > wise), I won't rule out there are differences. > > > > > > > > > > > From what I can tell, the logic is that all integer and > > > > > > boolean > > > > > > arrays > > > > > > > > > > Did you try that? Because as I said above, IIRC broadcasting > > > > > the > > > > > boolean array without first calling `nonzero` isn't really > > > > > whats > > > > > going > > > > > on. And I don't know how it could be whats going on, since > > > > > adding > > > > > dimensions to a boolean index would have much more > > > > > implications? > > > > > > > > > > - Sebastian > > > > > > > > > > > > > > > > (and scalar ints) are broadcast together, *except* for > > > > > > boolean > > > > > > scalars. Then the first boolean scalar is replaced with > > > > > > and(all > > > > > > boolean scalars) and the rest are removed from the index. > > > > > > Then > > > > > > that > > > > > > index adds a length 1 axis if it is True and 0 if it is > > > > > > False. > > > > > > > > > > > > So they don't broadcast, but rather "fake broadcast". I > > > > > > still > > > > > > contend > > > > > > that it would be much more useful, if True were a synonym > > > > > > for > > > > > > newaxis > > > > > > and False worked like newaxis but instead added a length 0 > > > > > > axis. 
> > > > > > Alternately, True and False scalars should behave exactly > > > > > > like > > > > > > all > > > > > > other boolean arrays with no exceptions (i.e., work like > > > > > > np.nonzero(), > > > > > > broadcast, etc.). This would be less useful, but more > > > > > > consistent. > > > > > > > > > > > > Aaron Meurer > > > > > > _______________________________________________ > > > > > > NumPy-Discussion mailing list > > > > > > NumPy-Discussion at python.org > > > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > > > > > > > > > > _______________________________________________ > > > > > NumPy-Discussion mailing list > > > > > NumPy-Discussion at python.org > > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > > > > NumPy-Discussion mailing list > > > > NumPy-Discussion at python.org > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at python.org > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From asmeurer at gmail.com Thu Aug 20 19:08:39 2020 From: asmeurer at gmail.com (Aaron Meurer) Date: Thu, 20 Aug 2020 17:08:39 -0600 Subject: [Numpy-discussion] What is up with raw boolean indices (like a[False])? In-Reply-To: <5a76b9753c84a324d78cf30d78fc1a3459467c9d.camel@sipsolutions.net> References: <0c3ec8485726fc06dc60ee0c5d803d75c99b7e90.camel@sipsolutions.net> <7fe7ee7bff9b3cc72518513b1005fd2048ea368a.camel@sipsolutions.net> <3e3b8ffedf5efbcf2d9f071a2bcedacfe38d7820.camel@sipsolutions.net> <5a76b9753c84a324d78cf30d78fc1a3459467c9d.camel@sipsolutions.net> Message-ID: On Thu, Aug 20, 2020 at 4:38 PM Sebastian Berg wrote: > > On Thu, 2020-08-20 at 16:00 -0600, Aaron Meurer wrote: > > Just to be clear, what exactly do you think should be deprecated? > > Boolean scalar indices in general, or just boolean scalars combined > > with other arrays, or something else? > > My angle is that we should allow only: > > * Any number of integer array indices (ideally only explicitly > with `arr.vindex[]`, but we do not have that luxury right now.) > > * A single boolean index (array or scalar is identical) > > but no mix of the above (including multiple boolean indices). > > Because I think they are at least one level more confusing than > multiple advanced indices. > > I admit, I forgot that the broadcasting logic is fine in this case: > > arr = np.zeros((2, 3)) > arr[[True], np.array(3)] > > where the advanced index is also a scalar index. In that case the > result is straight forward, since broadcasting does not affect > `np.array(3)`. > > > I am happy to be wrong about that assessment, but I think your opinion > on it could likely push us towards just doing a Deprecation. 
> The only use case for "multiple boolean indices" that I could think of > was this: > > arr = np.diag([1, 2, 3, 4]) # 2-d square array > indx = arr.diagonal() > 2 # mask for each row and column > masked_diagonal = arr[indx, indx] > print(repr(masked_diagonal)) > # array([3, 4]) > > and my guess is that the reaction to that code is a: "Wait what?!" > > That code might seem reasonable, but it only works if you have the > exact same number of `True` values in the two indices. > And if you have the exact same number but two different arrays, then I > fail to reason about the result without doing the `nonzero` step, which > I think indicates that there just is no logical concept for it. > > > So, I think we may be better of forcing the few power-user who may have > found a use for this type of nugget to use `np.nonzero()` or find > another solution. Well I'm cautious because despite implementing the logic for all this, I'm a bit divorced from most use-cases. So I don't have a great feeling for what is currently being used. For example, is it possible to have a situation where you build a mask out of an expression, like a[x > 0] or whatever, where the mask expression could be any number of dimensions depending on the input values? And if so, does the current logic for scalar booleans do the right thing when the number of dimensions happens to be 0. Mixing nonscalar boolean and integer arrays seems fine, as far as the logic is concerned. I'm not really sure if it makes sense semantically. I'll have to think about it more. The thing that has the most odd corner cases in the indexing logic is boolean scalars. It would be nice if you could treat them uniformly with the same logic as other boolean arrays, but they have special cases everywhere. This is in contrast with integer scalars which perfectly match the logic of integer arrays with the shape == (). Maybe I'm just not looking at it from the right angle. I don't know. In ndindex, I've left the "arrays separated by slices, ellipses, or newaxes" case unimplemented. Travis Oliphant told me he thinks it was a mistake and it would be better to not allow it. I've also left boolean scalars mixed with other arrays unimplemented because I don't want to waste more time trying to figure out what is going on in the example I posted earlier (though what you wrote helps). I have nonscalar boolean arrays mixed with integer arrays working just fine, and the logic isn't really any different than it would be if I only supported them separately. Aaron Meurer > > - Sebastian > > > > > > Aaron Meurer > > > > On Thu, Aug 20, 2020 at 3:56 PM Sebastian Berg > > wrote: > > > On Thu, 2020-08-20 at 16:50 -0500, Sebastian Berg wrote: > > > > On Thu, 2020-08-20 at 12:21 -0600, Aaron Meurer wrote: > > > > > You're right. I was confusing the broadcasting logic for > > > > > boolean > > > > > arrays. > > > > > > > > > > However, I did find this example > > > > > > > > > > > > > np.arange(10).reshape((2, 5))[np.array([[0, 0, 0, 0, 0]], > > > > > > > > dtype=np.int64), False] > > > > > Traceback (most recent call last): > > > > > File "", line 1, in > > > > > IndexError: shape mismatch: indexing arrays could not be > > > > > broadcast > > > > > together with shapes (1,5) (0,) > > > > > > > > > > That certainly seems to imply there is some broadcasting being > > > > > done. > > > > > > > > Yes, it broadcasts the array after converting it with `nonzero`, > > > > i.e. 
> > > > its much the same as: > > > > > > > > indices = [[0, 0, 0, 0, 0]], *np.nonzero(False) > > > > indices = np.broadcast_arrays(*indices) > > > > > > > > will give the same result (see also `np.ix_` which converts > > > > booleans > > > > as > > > > well for this reason, to give you outer indexing). > > > > I was half way through a mock-up/pseudo code, but thought you > > > > likely > > > > wasn't sure it was ending up clear. It sounds like things are > > > > probably > > > > falling into place for you (if they are not, let me know what > > > > might > > > > help you): > > > > > > Sorry editing error up there, in short I hope those steps sense to > > > you, > > > note that the broadcasting is basically part of a later "integer > > > only" > > > indexing step, and the `nonzero` part is pre-processing. > > > > > > > 1. Convert all boolean indices into a series of integer indices > > > > using > > > > `np.nonzero(index)` > > > > > > > > 2. For True/False scalars, that doesn't work, because > > > > `np.nonzero()`. > > > > > > > > `nonzero` gave us an index array (which is good, we obviously > > > > want > > > > > > > > one), but we need to index into `boolean_index.ndim == 0` > > > > dimensions! > > > > So that won't work, the approach using `nonzero` cannot > > > > generalize > > > > > > > > here, although boolean indices generalize perfectly. > > > > > > > > The solution to the dilemma is simple: If we have to index one > > > > dimension, but should be indexing zero, then we simply add > > > > that > > > > dimension to the original array (or at least pretend there was > > > > an additional dimension). > > > > > > > > 3. Do normal indexing with the result *including broadcasting*, > > > > we forget it was converted. > > > > > > > > The other way to solve it would be to always reshape the original > > > > array > > > > to combine all axes being indexed by a single boolean index into > > > > one > > > > axis and then index it using `np.flatnonzero`. (But that would > > > > get a > > > > different result if you try to broadcast!) > > > > > > > > > > > > In any case, I am not sure I would bother with making sense of > > > > this, > > > > except for sports! > > > > Its pretty much nonsense and I think the time understanding it is > > > > probably better spend deprecating it. The only reason I did not > > > > Deprecate itt before, is that I tried to do be minimal in the > > > > changes > > > > when I rewrote advanced indexing (and generalized boolean scalars > > > > correctly) long ago. That was likely the right start/choice at > > > > the > > > > time, since there were much bigger fish to catch, but I do not > > > > think > > > > anything is holding us back now. > > > > > > > > Cheers, > > > > > > > > Sebastian > > > > > > > > > > > > > Aaron Meurer > > > > > > > > > > On Wed, Aug 19, 2020 at 6:55 PM Sebastian Berg > > > > > wrote: > > > > > > On Wed, 2020-08-19 at 18:07 -0600, Aaron Meurer wrote: > > > > > > > > > 3. If you have multiple advanced indexing you get > > > > > > > > > annoying > > > > > > > > > broadcasting > > > > > > > > > of all of these. That is *always* confusing for > > > > > > > > > boolean > > > > > > > > > indices. > > > > > > > > > 0-D should not be too special there... > > > > > > > > > > > > > > OK, now that I am learning more about advanced indexing, > > > > > > > this > > > > > > > statement is confusing to me. It seems that scalar boolean > > > > > > > indices do > > > > > > > not broadcast. 
For example: > > > > > > > > > > > > Well, broadcasting means you broadcast the *nonzero result* > > > > > > unless > > > > > > I am > > > > > > very confused... There is a reason I dismissed it. We could > > > > > > (and > > > > > > arguably should) just deprecate it. And I have doubts anyone > > > > > > would > > > > > > even notice. > > > > > > > > > > > > > > > > np.arange(2)[False, np.array([True, False])] > > > > > > > array([], dtype=int64) > > > > > > > > > > np.arange(2)[tuple(np.broadcast_arrays(False, > > > > > > > > > > np.array([True, > > > > > > > > > > False])))] > > > > > > > Traceback (most recent call last): > > > > > > > File "", line 1, in > > > > > > > IndexError: too many indices for array: array is 1- > > > > > > > dimensional, > > > > > > > but 2 > > > > > > > were indexed > > > > > > > > > > > > > > And indeed, the docs even say, as you noted, "the nonzero > > > > > > > equivalence > > > > > > > for Boolean arrays does not hold for zero dimensional > > > > > > > boolean > > > > > > > arrays," > > > > > > > which I guess also applies to the broadcasting. > > > > > > > > > > > > I actually think that probably also holds. Nonzero just > > > > > > behave > > > > > > weird > > > > > > for 0D because arrays (because it returns a tuple). > > > > > > But since broadcasting the nonzero result is so weird, and > > > > > > since > > > > > > 0- > > > > > > D > > > > > > booleans require some additional logic and don't generalize > > > > > > 100% > > > > > > (code > > > > > > wise), I won't rule out there are differences. > > > > > > > > > > > > > From what I can tell, the logic is that all integer and > > > > > > > boolean > > > > > > > arrays > > > > > > > > > > > > Did you try that? Because as I said above, IIRC broadcasting > > > > > > the > > > > > > boolean array without first calling `nonzero` isn't really > > > > > > whats > > > > > > going > > > > > > on. And I don't know how it could be whats going on, since > > > > > > adding > > > > > > dimensions to a boolean index would have much more > > > > > > implications? > > > > > > > > > > > > - Sebastian > > > > > > > > > > > > > > > > > > > (and scalar ints) are broadcast together, *except* for > > > > > > > boolean > > > > > > > scalars. Then the first boolean scalar is replaced with > > > > > > > and(all > > > > > > > boolean scalars) and the rest are removed from the index. > > > > > > > Then > > > > > > > that > > > > > > > index adds a length 1 axis if it is True and 0 if it is > > > > > > > False. > > > > > > > > > > > > > > So they don't broadcast, but rather "fake broadcast". I > > > > > > > still > > > > > > > contend > > > > > > > that it would be much more useful, if True were a synonym > > > > > > > for > > > > > > > newaxis > > > > > > > and False worked like newaxis but instead added a length 0 > > > > > > > axis. > > > > > > > Alternately, True and False scalars should behave exactly > > > > > > > like > > > > > > > all > > > > > > > other boolean arrays with no exceptions (i.e., work like > > > > > > > np.nonzero(), > > > > > > > broadcast, etc.). This would be less useful, but more > > > > > > > consistent. 
> > > > > > > > > > > > > > Aaron Meurer > > > > > > > _______________________________________________ > > > > > > > NumPy-Discussion mailing list > > > > > > > NumPy-Discussion at python.org > > > > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > > > > > > > > > > > > > _______________________________________________ > > > > > > NumPy-Discussion mailing list > > > > > > NumPy-Discussion at python.org > > > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > > > > > NumPy-Discussion mailing list > > > > > NumPy-Discussion at python.org > > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > > > > > > > _______________________________________________ > > > > NumPy-Discussion mailing list > > > > NumPy-Discussion at python.org > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at python.org > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion From sebastian at sipsolutions.net Thu Aug 20 20:17:32 2020 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Thu, 20 Aug 2020 19:17:32 -0500 Subject: [Numpy-discussion] What is up with raw boolean indices (like a[False])? In-Reply-To: References: <0c3ec8485726fc06dc60ee0c5d803d75c99b7e90.camel@sipsolutions.net> <7fe7ee7bff9b3cc72518513b1005fd2048ea368a.camel@sipsolutions.net> <3e3b8ffedf5efbcf2d9f071a2bcedacfe38d7820.camel@sipsolutions.net> <5a76b9753c84a324d78cf30d78fc1a3459467c9d.camel@sipsolutions.net> Message-ID: On Thu, 2020-08-20 at 17:08 -0600, Aaron Meurer wrote: > On Thu, Aug 20, 2020 at 4:38 PM Sebastian Berg > wrote: > > On Thu, 2020-08-20 at 16:00 -0600, Aaron Meurer wrote: > > > Just to be clear, what exactly do you think should be deprecated? > > > Boolean scalar indices in general, or just boolean scalars > > > combined > > > with other arrays, or something else? > > > > My angle is that we should allow only: > > > > * Any number of integer array indices (ideally only explicitly > > with `arr.vindex[]`, but we do not have that luxury right now.) > > > > * A single boolean index (array or scalar is identical) > > > > but no mix of the above (including multiple boolean indices). > > > > Because I think they are at least one level more confusing than > > multiple advanced indices. > > > > I admit, I forgot that the broadcasting logic is fine in this case: > > > > arr = np.zeros((2, 3)) > > arr[[True], np.array(3)] > > > > where the advanced index is also a scalar index. In that case the > > result is straight forward, since broadcasting does not affect > > `np.array(3)`. > > > > > > I am happy to be wrong about that assessment, but I think your > > opinion > > on it could likely push us towards just doing a Deprecation. 
> > The only use case for "multiple boolean indices" that I could think > > of > > was this: > > > > arr = np.diag([1, 2, 3, 4]) # 2-d square array > > indx = arr.diagonal() > 2 # mask for each row and column > > masked_diagonal = arr[indx, indx] > > print(repr(masked_diagonal)) > > # array([3, 4]) > > > > and my guess is that the reaction to that code is a: "Wait what?!" > > > > That code might seem reasonable, but it only works if you have the > > exact same number of `True` values in the two indices. > > And if you have the exact same number but two different arrays, > > then I > > fail to reason about the result without doing the `nonzero` step, > > which > > I think indicates that there just is no logical concept for it. > > > > > > So, I think we may be better of forcing the few power-user who may > > have > > found a use for this type of nugget to use `np.nonzero()` or find > > another solution. > > Well I'm cautious because despite implementing the logic for all > this, > I'm a bit divorced from most use-cases. So I don't have a great > feeling for what is currently being used. For example, is it possible > to have a situation where you build a mask out of an expression, like > a[x > 0] or whatever, where the mask expression could be any number > of I am not sure anyone does it, but I certainly can think of ways to use this functionality: ``` def good_images(image_or_stack): """Filter dark images image_or_stack : ndarray (..., N, M, 3) Returns ------- good_images : ndarray (K, N, M, 3) Returns all good images as a one dimensional stack for further processing, where `K` is the number of good images. """ assert image_or_stack.ndim >= 3 assert image_or_stack.shape[-1] == 3 # 3 colors, fixed. average_brightness = image_or_stack.mean((-3, -2, -1)) return image_or_stack[average_brigthness, ...] ``` Note that the above uses a single True/False if you pass in a single image. > dimensions depending on the input values? And if so, does the current > logic for scalar booleans do the right thing when the number of > dimensions happens to be 0. > > Mixing nonscalar boolean and integer arrays seems fine, as far as the > logic is concerned. I'm not really sure if it makes sense > semantically. I'll have to think about it more. The thing that has > the > most odd corner cases in the indexing logic is boolean scalars. It I think they are perfectly fine semantically, but they definitely do require special handling. Although the reason for that special handling is that we have to implement boolean indices using integer array indices and that is not possible without additional logic. If you browse the NumPy code, you will see there is a `HAS_0D_BOOL` macro (basically enum), to distinguish: internal_indx = np.nonzero(False) and: internal_indx = np.nonzero([False]) because the first effectively inserts a new dimension and then indices it, while the former just indices an existing dimension. > would be nice if you could treat them uniformly with the same logic > as > other boolean arrays, but they have special cases everywhere. This is > in contrast with integer scalars which perfectly match the logic of > integer arrays with the shape == (). Maybe I'm just not looking at it > from the right angle. I don't know. I hope the example above helps you, I think you should always remember the two rules of boolean indexing mentioned somewhere in the docs: * A boolean array indexes into `arr.ndim` dimensions, and effectively removes them. * A boolean array index adds a single input array. 
I guess, I should have written that mock-up code (maybe you can help improve the NumPy docs, although I guess this might be too technical): ``` def preprocess_boolean_indices(arr, indices): """Take an array and indices and returns a new array and new indices without any boolean ones. NOTE: Code will not handle None or Ellipsis """ new_indices = [] for axis, index in enumerate(indices): if not is_boolean_index(index): new_indices.append(index) # Check whether dimensions match here! new_indices.extend(np.nonzero(indices)) if index.ndim == 0: # nonzero result added an index, but we # should index into 0-dimensions, so add one. # (Ellipsis or None would mean `axis` is incorrect) arr = np.expand_dims(arr, axis) return arr, indices prep_arr, prep_indices = preprocess_boolean_indices(arr, indices) arr[indices] == prep_arr[prep_indices] ``` That is ugly, but the issue is not in the semantics of 0-D booleans, but rather in the translating boolean indices to integer indices. > In ndindex, I've left the "arrays separated by slices, ellipses, or > newaxes" case unimplemented. Travis Oliphant told me he thinks it was > a mistake and it would be better to not allow it. I've also left Yeah, either always transpose or just refuse the "separated by" cases. It is an interesting angle to only support the cases where axis insertion can be done as "expected", I remember mainly the discussion to just always transpose. > boolean scalars mixed with other arrays unimplemented because I don't > want to waste more time trying to figure out what is going on in the > example I posted earlier (though what you wrote helps). I have Absolutely agree with that step (I don't know if you are careful with scalars and 0D arrays, it would be the only issue I can think of). > nonscalar boolean arrays mixed with integer arrays working just fine, > and the logic isn't really any different than it would be if I only > supported them separately. Right, the implementation is likely straight forward. But the semantics of it is pretty weird (or impossible), almost any trial will show that, I think: arr = np.arange(12).reshape(3, 4) arr # array([[ 0, 1, 2, 3], # [ 4, 5, 6, 7], # [ 8, 9, 10, 11]]) arr[[True, False, True], [True, False, False, False]] # array([0, 8]) OK, you can reason about that, but only because there is a single boolean True in the second array (and then gets broadcast. arr[[True, False, True], [True, False, True, False]] # array([ 0, 10]) Ok, we can reason about this, but at that point we have to align the True values from the first index with those from the second (effectively convert the two indices to integer ones in our heads). But what is the meaning of aligning true values? I am sure there is none, except in very special cases. To proof this, lets try: arr[[True, True, True], [True, False, True, False]] which gives a broadcasting error :). So yeah, I guess you can find "meaning" for it but it seems just too strange, and even if you do using two integer indices will make things much clearer and less error prone. - Sebastian > Aaron Meurer > > > - Sebastian > > > > > > > Aaron Meurer > > > > > > On Thu, Aug 20, 2020 at 3:56 PM Sebastian Berg > > > wrote: > > > > On Thu, 2020-08-20 at 16:50 -0500, Sebastian Berg wrote: > > > > > On Thu, 2020-08-20 at 12:21 -0600, Aaron Meurer wrote: > > > > > > You're right. I was confusing the broadcasting logic for > > > > > > boolean > > > > > > arrays. 
> > > > > > > > > > > > However, I did find this example > > > > > > > > > > > > > > > np.arange(10).reshape((2, 5))[np.array([[0, 0, 0, 0, > > > > > > > > > 0]], > > > > > > > > > dtype=np.int64), False] > > > > > > Traceback (most recent call last): > > > > > > File "", line 1, in > > > > > > IndexError: shape mismatch: indexing arrays could not be > > > > > > broadcast > > > > > > together with shapes (1,5) (0,) > > > > > > > > > > > > That certainly seems to imply there is some broadcasting > > > > > > being > > > > > > done. > > > > > > > > > > Yes, it broadcasts the array after converting it with > > > > > `nonzero`, > > > > > i.e. > > > > > its much the same as: > > > > > > > > > > indices = [[0, 0, 0, 0, 0]], *np.nonzero(False) > > > > > indices = np.broadcast_arrays(*indices) > > > > > > > > > > will give the same result (see also `np.ix_` which converts > > > > > booleans > > > > > as > > > > > well for this reason, to give you outer indexing). > > > > > I was half way through a mock-up/pseudo code, but thought you > > > > > likely > > > > > wasn't sure it was ending up clear. It sounds like things are > > > > > probably > > > > > falling into place for you (if they are not, let me know what > > > > > might > > > > > help you): > > > > > > > > Sorry editing error up there, in short I hope those steps sense > > > > to > > > > you, > > > > note that the broadcasting is basically part of a later > > > > "integer > > > > only" > > > > indexing step, and the `nonzero` part is pre-processing. > > > > > > > > > 1. Convert all boolean indices into a series of integer > > > > > indices > > > > > using > > > > > `np.nonzero(index)` > > > > > > > > > > 2. For True/False scalars, that doesn't work, because > > > > > `np.nonzero()`. > > > > > > > > > > `nonzero` gave us an index array (which is good, we > > > > > obviously > > > > > want > > > > > > > > > > one), but we need to index into `boolean_index.ndim == 0` > > > > > dimensions! > > > > > So that won't work, the approach using `nonzero` cannot > > > > > generalize > > > > > > > > > > here, although boolean indices generalize perfectly. > > > > > > > > > > The solution to the dilemma is simple: If we have to index > > > > > one > > > > > dimension, but should be indexing zero, then we simply add > > > > > that > > > > > dimension to the original array (or at least pretend there > > > > > was > > > > > an additional dimension). > > > > > > > > > > 3. Do normal indexing with the result *including > > > > > broadcasting*, > > > > > we forget it was converted. > > > > > > > > > > The other way to solve it would be to always reshape the > > > > > original > > > > > array > > > > > to combine all axes being indexed by a single boolean index > > > > > into > > > > > one > > > > > axis and then index it using `np.flatnonzero`. (But that > > > > > would > > > > > get a > > > > > different result if you try to broadcast!) > > > > > > > > > > > > > > > In any case, I am not sure I would bother with making sense > > > > > of > > > > > this, > > > > > except for sports! > > > > > Its pretty much nonsense and I think the time understanding > > > > > it is > > > > > probably better spend deprecating it. The only reason I did > > > > > not > > > > > Deprecate itt before, is that I tried to do be minimal in the > > > > > changes > > > > > when I rewrote advanced indexing (and generalized boolean > > > > > scalars > > > > > correctly) long ago. 
That was likely the right start/choice > > > > > at > > > > > the > > > > > time, since there were much bigger fish to catch, but I do > > > > > not > > > > > think > > > > > anything is holding us back now. > > > > > > > > > > Cheers, > > > > > > > > > > Sebastian > > > > > > > > > > > > > > > > Aaron Meurer > > > > > > > > > > > > On Wed, Aug 19, 2020 at 6:55 PM Sebastian Berg > > > > > > wrote: > > > > > > > On Wed, 2020-08-19 at 18:07 -0600, Aaron Meurer wrote: > > > > > > > > > > 3. If you have multiple advanced indexing you get > > > > > > > > > > annoying > > > > > > > > > > broadcasting > > > > > > > > > > of all of these. That is *always* confusing for > > > > > > > > > > boolean > > > > > > > > > > indices. > > > > > > > > > > 0-D should not be too special there... > > > > > > > > > > > > > > > > OK, now that I am learning more about advanced > > > > > > > > indexing, > > > > > > > > this > > > > > > > > statement is confusing to me. It seems that scalar > > > > > > > > boolean > > > > > > > > indices do > > > > > > > > not broadcast. For example: > > > > > > > > > > > > > > Well, broadcasting means you broadcast the *nonzero > > > > > > > result* > > > > > > > unless > > > > > > > I am > > > > > > > very confused... There is a reason I dismissed it. We > > > > > > > could > > > > > > > (and > > > > > > > arguably should) just deprecate it. And I have doubts > > > > > > > anyone > > > > > > > would > > > > > > > even notice. > > > > > > > > > > > > > > > > > > np.arange(2)[False, np.array([True, False])] > > > > > > > > array([], dtype=int64) > > > > > > > > > > > np.arange(2)[tuple(np.broadcast_arrays(False, > > > > > > > > > > > np.array([True, > > > > > > > > > > > False])))] > > > > > > > > Traceback (most recent call last): > > > > > > > > File "", line 1, in > > > > > > > > IndexError: too many indices for array: array is 1- > > > > > > > > dimensional, > > > > > > > > but 2 > > > > > > > > were indexed > > > > > > > > > > > > > > > > And indeed, the docs even say, as you noted, "the > > > > > > > > nonzero > > > > > > > > equivalence > > > > > > > > for Boolean arrays does not hold for zero dimensional > > > > > > > > boolean > > > > > > > > arrays," > > > > > > > > which I guess also applies to the broadcasting. > > > > > > > > > > > > > > I actually think that probably also holds. Nonzero just > > > > > > > behave > > > > > > > weird > > > > > > > for 0D because arrays (because it returns a tuple). > > > > > > > But since broadcasting the nonzero result is so weird, > > > > > > > and > > > > > > > since > > > > > > > 0- > > > > > > > D > > > > > > > booleans require some additional logic and don't > > > > > > > generalize > > > > > > > 100% > > > > > > > (code > > > > > > > wise), I won't rule out there are differences. > > > > > > > > > > > > > > > From what I can tell, the logic is that all integer and > > > > > > > > boolean > > > > > > > > arrays > > > > > > > > > > > > > > Did you try that? Because as I said above, IIRC > > > > > > > broadcasting > > > > > > > the > > > > > > > boolean array without first calling `nonzero` isn't > > > > > > > really > > > > > > > whats > > > > > > > going > > > > > > > on. And I don't know how it could be whats going on, > > > > > > > since > > > > > > > adding > > > > > > > dimensions to a boolean index would have much more > > > > > > > implications? 
> > > > > > > > > > > > > > - Sebastian > > > > > > > > > > > > > > > > > > > > > > (and scalar ints) are broadcast together, *except* for > > > > > > > > boolean > > > > > > > > scalars. Then the first boolean scalar is replaced with > > > > > > > > and(all > > > > > > > > boolean scalars) and the rest are removed from the > > > > > > > > index. > > > > > > > > Then > > > > > > > > that > > > > > > > > index adds a length 1 axis if it is True and 0 if it is > > > > > > > > False. > > > > > > > > > > > > > > > > So they don't broadcast, but rather "fake broadcast". I > > > > > > > > still > > > > > > > > contend > > > > > > > > that it would be much more useful, if True were a > > > > > > > > synonym > > > > > > > > for > > > > > > > > newaxis > > > > > > > > and False worked like newaxis but instead added a > > > > > > > > length 0 > > > > > > > > axis. > > > > > > > > Alternately, True and False scalars should behave > > > > > > > > exactly > > > > > > > > like > > > > > > > > all > > > > > > > > other boolean arrays with no exceptions (i.e., work > > > > > > > > like > > > > > > > > np.nonzero(), > > > > > > > > broadcast, etc.). This would be less useful, but more > > > > > > > > consistent. > > > > > > > > > > > > > > > > Aaron Meurer > > > > > > > > _______________________________________________ > > > > > > > > NumPy-Discussion mailing list > > > > > > > > NumPy-Discussion at python.org > > > > > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > > > > > > > > > > > > > > > > _______________________________________________ > > > > > > > NumPy-Discussion mailing list > > > > > > > NumPy-Discussion at python.org > > > > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > > > > > > NumPy-Discussion mailing list > > > > > > NumPy-Discussion at python.org > > > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > > > > > > > > > > _______________________________________________ > > > > > NumPy-Discussion mailing list > > > > > NumPy-Discussion at python.org > > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > > _______________________________________________ > > > > NumPy-Discussion mailing list > > > > NumPy-Discussion at python.org > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at python.org > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From ottohirr at gmail.com Fri Aug 21 17:32:03 2020 From: ottohirr at gmail.com (Otto Hirr) Date: Fri, 21 Aug 2020 14:32:03 -0700 Subject: [Numpy-discussion] Migrating code to eliminate references to numpy/core/include/numpy/npy_3kcompat.h (?) Message-ID: Greetings, tl;dr: Need to remove npy_3kcompat.h from any source code location since Python2 is no longer supported. 
Background: Ran into an error after compiling various apps regarding npy_PyErr_ChainExceptionsCause() being undefined at runtime. Poking around shows that this is related to npy_3kcompat.h, which seems to be for Python2 compatibility. (There is a differently named counterpart for Python3.) This seems to be used in a couple of numpy/core/src/multiarray files. Seems that since Python2 is no longer supported, cleaning up cruft of P2 will improve code and should eliminate the problem I ran into, (or else move an error report to something else...) Is this reasonable to eliminate? Regards, ..Otto (Still getting my feet wet in numpy, scipy, etc.) From charlesr.harris at gmail.com Fri Aug 21 22:49:06 2020 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 21 Aug 2020 20:49:06 -0600 Subject: [Numpy-discussion] Migrating code to eliminate references to numpy/core/include/numpy/npy_3kcompat.h (?) In-Reply-To: References: Message-ID: On Fri, Aug 21, 2020 at 3:32 PM Otto Hirr wrote: > Greetings, > > tl;dr: > Need to remove npy_3kcompat.h from any source code location since > Python2 is no longer supported. > > Background: > Ran into an error after compiling various apps regarding > npy_PyErr_ChainExceptionsCause() being undefined at runtime. > > Poking around shows that this is related to npy_3kcompat.h, which > seems to be for Python2 compatibility. > (There is a differently named counterpart for Python3.) > > This seems to be used in a couple of numpy/core/src/multiarray files. > > Seems that since Python2 is no longer supported, cleaning up cruft of > P2 will improve code and should eliminate the problem I ran into, (or > else move an error report to something else...) > > Is this reasonable to eliminate? > > The migration is already underway, but will take time. Even after NumPy is clean we will keep it because we also need to maintain compatibility across Python 3 versions and to avoid breaking downstream projects who might be using it. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Aug 21 23:39:53 2020 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 21 Aug 2020 21:39:53 -0600 Subject: [Numpy-discussion] Feature requests/Enhancements for upper-level engineering students In-Reply-To: <1597943458736-0.post@n7.nabble.com> References: <1597943458736-0.post@n7.nabble.com> Message-ID: On Thu, Aug 20, 2020 at 11:11 AM cooperrc wrote: > Greetings, > As the Fall semester is fast approaching (10 days away for us at UConn), we > are looking for senior design (also called capstone) projects for the > 2020-2021 school year. The COVID situation has strengthened the need for > remote work. > The process here is that students are assigned to projects by late > September. Then, they have 6 main deliverables over the course of 2 > semesters: > 1. Initial Fall Presentation (~Oct) > 2. Final Fall Presentation (~Dec) > 3. Mid-year report (~Jan) > 4. Initial Spring Presentation (~Mar) > 5. Final Spring Presntation (~Apr) > 6. Final report (~May) > > My question to the NumPy community is: Are there any features or > enhancements that would be nice to have, but might not have a team > dedicated > to the idea? > > I would be happy to advise any projects that people are interested in > proposing. I would like to hear what people think would be worthwhile for > students to build together. 
Some background, these students have all used > Python and Matlab for mechanical engineering applications like linear > regression, modal analyses, ode integration, and root solving. They learn > quickly, but may not be interested in UX/UI design problems. > > > Thanks for the inquiry. We are always looking for new people who have the time and inclination to make a contribution to NumPy, but NumPy core probably isn't a good choice for class projects. Work on NumPy core requires C and CPython C-API expertise and experienced programmers generally take 3-6 months to come up to speed, the learning curve is just too steep for most students. NumPy also needs to be very careful about maintaining compatibility with existing downstream projects and in introducing new features. I suspect students would enjoy a faster moving project. There is a lot of work on the website and online documentation that is moving faster than NumPy core, but that sounds like it might be out of scope for your classes. If not, let us know. If you can think of new projects based on NumPy, that might work better. They could be written in Python and the students could release them on PyPI if so inclined. I suspect there are several ongoing projects that are more engineering oriented than NumPy and the current Python Science stack could use more engineering applications. Perhaps others more familiar with that area could make suggestions. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From asmeurer at gmail.com Mon Aug 24 17:31:42 2020 From: asmeurer at gmail.com (Aaron Meurer) Date: Mon, 24 Aug 2020 15:31:42 -0600 Subject: [Numpy-discussion] Why does fancy indexing work like this? In-Reply-To: References: <66ed40caf93a7f24672ce511370ce38176a6eff1.camel@sipsolutions.net> <3f4ba7b2363e17e1d08be90f19aff38ae9860eca.camel@sipsolutions.net> <8773e52959d499e668f06d60d49a068bc6a77d43.camel@sipsolutions.net> Message-ID: On Wed, Aug 19, 2020 at 8:18 PM Sebastian Berg wrote: > > On Wed, 2020-08-19 at 19:37 -0600, Aaron Meurer wrote: > > These cases don't give any deprecation warnings in NumPy master: > > > > > > > np.arange(0)[np.array([0]), False] > > array([], dtype=int64) > > > > > np.arange(0).reshape((0, 0))[np.array([0]), np.array([], > > > > > dtype=int)] > > array([], dtype=int64) > > > > Is that intentional? > > I guess it follows from `np.array([[1]])[[], [10]]` also not failing > currently. Sure, I think that's the same thing (sorry if my example is "too trivial". I was copy-pasting a hypothesis shrunk example). > > And that was intentional not to deprecate when out-of-bound indices > broadcast away. But I am not sure I actually think that was the better > choice. My initial choice was that this would be an error as well, and > I still slightly prefer it, but don't feel it matters much. There's an inconsistency here, which is that out-of-bounds indices that are broadcast away are not bounds checked unless they are scalar indices, in which case they are. 
>>> a = np.empty((1, 1)) >>> a[np.array([], dtype=int), np.array([10])] array([], dtype=float64) >>> a[np.array([], dtype=int), 10] Traceback (most recent call last): File "", line 1, in IndexError: index 10 is out of bounds for axis 1 with size 1 >>> np.broadcast_arrays(np.array([], dtype=int), np.array([10])) [array([], dtype=int64), array([], dtype=int64)] >>> np.broadcast_arrays(np.array([], dtype=int), 10) [array([], dtype=int64), array([], dtype=int64)] This breaks the rule that scalar integer indices have the same semantics as integer arrays with shape (). Aaron Meurer > > - Sebastian > > > > > Aaron Meurer > > > > On Thu, Jul 23, 2020 at 12:18 PM Aaron Meurer > > wrote: > > > > After writing this, I realized that I actually remember the > > > > *opposite* > > > > discussion occurring before. I think in some of the equality > > > > deprecations, we actually raise the new error due to an internal > > > > try/except clause. And there was a complaint that its confusing > > > > that a > > > > non-deprecation-warning is raised when the error will only happen > > > > with > > > > DeprecationWarnings being set to error. > > > > > > > > - Sebastian > > > > > > I noticed that warnings.catch_warnings does the right thing with > > > warnings that are raised alongside an exception (although it is a > > > bit > > > clunky to use). > > > > > > Aaron Meurer > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion From sebastian at sipsolutions.net Mon Aug 24 19:07:42 2020 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 24 Aug 2020 18:07:42 -0500 Subject: [Numpy-discussion] Why does fancy indexing work like this? In-Reply-To: References: <66ed40caf93a7f24672ce511370ce38176a6eff1.camel@sipsolutions.net> <3f4ba7b2363e17e1d08be90f19aff38ae9860eca.camel@sipsolutions.net> <8773e52959d499e668f06d60d49a068bc6a77d43.camel@sipsolutions.net> Message-ID: <3adb973e04b621eb643b016b61be9ce42bdcf3b6.camel@sipsolutions.net> On Mon, 2020-08-24 at 15:31 -0600, Aaron Meurer wrote: > On Wed, Aug 19, 2020 at 8:18 PM Sebastian Berg > wrote: > > On Wed, 2020-08-19 at 19:37 -0600, Aaron Meurer wrote: > > > These cases don't give any deprecation warnings in NumPy master: > > > > > > > > > np.arange(0)[np.array([0]), False] > > > array([], dtype=int64) > > > > > > np.arange(0).reshape((0, 0))[np.array([0]), np.array([], > > > > > > dtype=int)] > > > array([], dtype=int64) > > > > > > Is that intentional? > > > > I guess it follows from `np.array([[1]])[[], [10]]` also not > > failing > > currently. > > Sure, I think that's the same thing (sorry if my example is "too > trivial". I was copy-pasting a hypothesis shrunk example). > > > And that was intentional not to deprecate when out-of-bound indices > > broadcast away. But I am not sure I actually think that was the > > better > > choice. My initial choice was that this would be an error as well, > > and > > I still slightly prefer it, but don't feel it matters much. > > There's an inconsistency here, which is that out-of-bounds indices > that are broadcast away are not bounds checked unless they are scalar > indices, in which case they are. 
> > > > > a = np.empty((1, 1)) > > > > a[np.array([], dtype=int), np.array([10])] > array([], dtype=float64) > > > > a[np.array([], dtype=int), 10] > Traceback (most recent call last): > File "", line 1, in > IndexError: index 10 is out of bounds for axis 1 with size 1 > > > > np.broadcast_arrays(np.array([], dtype=int), np.array([10])) > [array([], dtype=int64), array([], dtype=int64)] > > > > np.broadcast_arrays(np.array([], dtype=int), 10) > [array([], dtype=int64), array([], dtype=int64)] > > This breaks the rule that scalar integer indices have the same > semantics as integer arrays with shape (). > Good observation. I agree, that is a subtle inconsistency for 0-D objects! (To be precise, I expect 0-D arrays behave identically to integers, since they will be optimized out of the "advanced index" part of the indexing operation). I suppose this may be an argument for always checking indices even when they are broadcast away? I am not certain how straight forward, or even desirable, it is to fix it so that 0-D integer arrays/integers can be "broadcast away". - Sebastian > Aaron Meurer > > > - Sebastian > > > > > Aaron Meurer > > > > > > On Thu, Jul 23, 2020 at 12:18 PM Aaron Meurer > > > > > > wrote: > > > > > After writing this, I realized that I actually remember the > > > > > *opposite* > > > > > discussion occurring before. I think in some of the equality > > > > > deprecations, we actually raise the new error due to an > > > > > internal > > > > > try/except clause. And there was a complaint that its > > > > > confusing > > > > > that a > > > > > non-deprecation-warning is raised when the error will only > > > > > happen > > > > > with > > > > > DeprecationWarnings being set to error. > > > > > > > > > > - Sebastian > > > > > > > > I noticed that warnings.catch_warnings does the right thing > > > > with > > > > warnings that are raised alongside an exception (although it is > > > > a > > > > bit > > > > clunky to use). > > > > > > > > Aaron Meurer > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at python.org > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From sebastian at sipsolutions.net Tue Aug 25 12:31:48 2020 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 25 Aug 2020 11:31:48 -0500 Subject: [Numpy-discussion] NumPy Development Meeting Today - Triage Focus Message-ID: <5877079dc9ab10c71bd1b6a7033e937a171a1a6c.camel@sipsolutions.net> Hi all, Our bi-weekly triage-focused NumPy development meeting is tomorrow (Wednesday, August 26th) at 11 am Pacific Time (18:00 UTC). Everyone is invited to join in and edit the work-in-progress meeting topics and notes: https://hackmd.io/68i_JvOYQfy9ERiHgXMPvg I encourage everyone to notify us of issues or PRs that you feel should be prioritized or simply discussed briefly. Just comment on it so we can label it, or add your PR/issue to this weeks topics for discussion. 
Best regards Sebastian -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From charlesr.harris at gmail.com Thu Aug 27 17:50:17 2020 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 27 Aug 2020 15:50:17 -0600 Subject: [Numpy-discussion] Dropping manylinux1 wheels for NumPy 1.20. Message-ID: Hi All, The 32 bit manylinux1 wheels are proving problematic, see https://github.com/numpy/numpy/issues/17174. One proposed solution is to only release manylinux2010 linux wheels for the NumPy 1.20 release. Thoughts? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Fri Aug 28 10:37:15 2020 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 28 Aug 2020 15:37:15 +0100 Subject: [Numpy-discussion] Dropping manylinux1 wheels for NumPy 1.20. In-Reply-To: References: Message-ID: Hi, On Thu, Aug 27, 2020 at 10:51 PM Charles R Harris wrote: > > Hi All, > > The 32 bit manylinux1 wheels are proving problematic, see https://github.com/numpy/numpy/issues/17174. One proposed solution is to only release manylinux2010 linux wheels for the NumPy 1.20 release. Thoughts? I think it may still be too early to discontinue manylinux1, sadly. Systems requiring manylinux1 are those with: pip < 19.0 (Jan 2019) [1] Linux distribution older than around 2010 (glibc < 2.12) [2] I did a PyPI BigQuery [3] just now, editing to give results for 32, and 64 bit (by changing the manylinux wheel name matching regexp). Then I processed a bit with Pandas [4]. It looks like about 34% of PyPI manylinux*_i686 downloads are for systems that actually need manylinux1, and about 17% of manylinux*_x86_64. See the table in [4] for a listing of the top 10 entries. Cheers, Matthew [1] https://github.com/pypa/manylinux [2] https://www.python.org/dev/peps/pep-0571/ [3] https://gist.github.com/e3901b344b8d81f5633908347b1b333e [4] https://gist.github.com/0f624ddbc34bc3db8bcae23e3eeb7b54 From ralf.gommers at gmail.com Fri Aug 28 11:50:20 2020 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Fri, 28 Aug 2020 16:50:20 +0100 Subject: [Numpy-discussion] Dropping manylinux1 wheels for NumPy 1.20. In-Reply-To: References: Message-ID: On Fri, Aug 28, 2020 at 3:38 PM Matthew Brett wrote: > Hi, > > On Thu, Aug 27, 2020 at 10:51 PM Charles R Harris > wrote: > > > > Hi All, > > > > The 32 bit manylinux1 wheels are proving problematic, see > https://github.com/numpy/numpy/issues/17174. One proposed solution is to > only release manylinux2010 linux wheels for the NumPy 1.20 release. > Thoughts? > > I think it may still be too early to discontinue manylinux1, sadly. > > Systems requiring manylinux1 are those with: > > pip < 19.0 (Jan 2019) [1] > Linux distribution older than around 2010 (glibc < 2.12) [2] > > I did a PyPI BigQuery [3] just now, editing to give results for 32, > and 64 bit (by changing the manylinux wheel name matching regexp). > > Then I processed a bit with Pandas [4]. > > It looks like about 34% of PyPI manylinux*_i686 downloads are for > systems that actually need manylinux1, Note that a large fraction of that will be CI systems that get default Ubuntu pip (18.1 mostly), and could be very easily updated just like we do in our own CI (a simple `pip install -U pip`). If you really want to get to a small percentage of pip <19.1, we can wait for another 5 years. Which seems undesirable. 
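(As an aside, for anyone wondering which side of the line a particular box falls on: the two criteria Matthew listed, pip >= 19.0 and glibc >= 2.12, can be checked locally with a rough sketch like the one below. This is only an illustration of the cut-offs quoted in this thread, not the real wheel-tag selection logic, which lives in pip/packaging; the helper calls used here, `platform.libc_ver()` and `LooseVersion`, are convenient stand-ins and not what pip itself uses.)

```
# Rough local check of the manylinux2010 criteria quoted in this thread
# (pip >= 19.0 and glibc >= 2.12).  Illustration only.
import platform

import pip
from distutils.version import LooseVersion

pip_ok = LooseVersion(pip.__version__) >= LooseVersion("19.0")

# libc the Python interpreter was linked against, e.g. ('glibc', '2.31');
# returns ('', '') on non-glibc platforms, which the check treats as "no".
libc, libc_version = platform.libc_ver()
glibc_ok = libc == "glibc" and LooseVersion(libc_version) >= LooseVersion("2.12")

if pip_ok and glibc_ok:
    print("manylinux2010 wheels are usable here")
else:
    print("this environment would still need manylinux1 wheels (or an sdist)")
```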
While I agree that we can keep manylinux1 around for a little longer (maybe another year or so?), gating it on Linux distro pip version defaults would be odd. Pip is still very immature; people with a several years old Pip would be well-served by having to upgrade it. Cheers, Ralf and about 17% of > manylinux*_x86_64. See the table in [4] for a listing of the top 10 > entries. > > Cheers, > > Matthew > > [1] https://github.com/pypa/manylinux > [2] https://www.python.org/dev/peps/pep-0571/ > [3] https://gist.github.com/e3901b344b8d81f5633908347b1b333e > [4] https://gist.github.com/0f624ddbc34bc3db8bcae23e3eeb7b54 > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Fri Aug 28 12:00:01 2020 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 28 Aug 2020 17:00:01 +0100 Subject: [Numpy-discussion] Dropping manylinux1 wheels for NumPy 1.20. In-Reply-To: References: Message-ID: Hi, Updated for Numpy wheels only - BigQuery [1], Notebook [2]. 41% of 32-bit wheels need manylinux1, 30% of 64-bit wheels. Ralf - agreed we shouldn't wait too long for old pip - but maybe we need to think of some way of reminding people with old pip to upgrade? Cheers, Matthew [1] https://gist.github.com/dc410698ca9e422aec08e4554eac6678 [2] https://gist.github.com/77879cb58b28b3d05c3c14b8a45687e8 On Fri, Aug 28, 2020 at 3:37 PM Matthew Brett wrote: > > Hi, > > On Thu, Aug 27, 2020 at 10:51 PM Charles R Harris > wrote: > > > > Hi All, > > > > The 32 bit manylinux1 wheels are proving problematic, see https://github.com/numpy/numpy/issues/17174. One proposed solution is to only release manylinux2010 linux wheels for the NumPy 1.20 release. Thoughts? > > I think it may still be too early to discontinue manylinux1, sadly. > > Systems requiring manylinux1 are those with: > > pip < 19.0 (Jan 2019) [1] > Linux distribution older than around 2010 (glibc < 2.12) [2] > > I did a PyPI BigQuery [3] just now, editing to give results for 32, > and 64 bit (by changing the manylinux wheel name matching regexp). > > Then I processed a bit with Pandas [4]. > > It looks like about 34% of PyPI manylinux*_i686 downloads are for > systems that actually need manylinux1, and about 17% of > manylinux*_x86_64. See the table in [4] for a listing of the top 10 > entries. > > Cheers, > > Matthew > > [1] https://github.com/pypa/manylinux > [2] https://www.python.org/dev/peps/pep-0571/ > [3] https://gist.github.com/e3901b344b8d81f5633908347b1b333e > [4] https://gist.github.com/0f624ddbc34bc3db8bcae23e3eeb7b54 From melissawm at gmail.com Fri Aug 28 14:49:11 2020 From: melissawm at gmail.com (=?UTF-8?Q?Melissa_Mendon=C3=A7a?=) Date: Fri, 28 Aug 2020 15:49:11 -0300 Subject: [Numpy-discussion] Documentation Team meeting - Monday August 31 In-Reply-To: References: Message-ID: Hi all! This is a reminder that our next Documentation Team meeting will be on *Monday, August 31* at 3PM UTC**. If you wish to join on Zoom, you need to use this link https://zoom.us/j/420005230 Here's the permanent hackmd document with the meeting notes (still being updated in the next few days!): https://hackmd.io/oB_boakvRqKR-_2jRV-Qjg Hope to see you around! 
** You can click this link to get the correct time at your timezone: https://www.timeanddate.com/worldclock/fixedtime.html?msg=NumPy+Documentation+Team+Meeting&iso=20200831T15&p1=1440&ah=1 *** You can add the NumPy community calendar to your google calendar by clicking this link: https://calendar.google.com/calendar/r?cid=YmVya2VsZXkuZWR1X2lla2dwaWdtMjMyamJobGRzZmIyYzJqODFjQGdyb3VwLmNhbGVuZGFyLmdvb2dsZS5jb20 - Melissa -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Fri Aug 28 16:06:09 2020 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Fri, 28 Aug 2020 21:06:09 +0100 Subject: [Numpy-discussion] Dropping manylinux1 wheels for NumPy 1.20. In-Reply-To: References: Message-ID: On Fri, Aug 28, 2020 at 5:01 PM Matthew Brett wrote: > Hi, > > Updated for Numpy wheels only - BigQuery [1], Notebook [2]. > > 41% of 32-bit wheels need manylinux1, 30% of 64-bit wheels. > Thanks for doing the analysis, very useful data. > Ralf - agreed we shouldn't wait too long for old pip - but maybe we > need to think of some way of reminding people with old pip to upgrade? > I don't think there's much we can do unfortunately. That's up to Pip itself, and it may have good reasons not to nag people to upgrade. Cheers, Ralf > Cheers, > > Matthew > > [1] https://gist.github.com/dc410698ca9e422aec08e4554eac6678 > [2] https://gist.github.com/77879cb58b28b3d05c3c14b8a45687e8 > > On Fri, Aug 28, 2020 at 3:37 PM Matthew Brett > wrote: > > > > Hi, > > > > On Thu, Aug 27, 2020 at 10:51 PM Charles R Harris > > wrote: > > > > > > Hi All, > > > > > > The 32 bit manylinux1 wheels are proving problematic, see > https://github.com/numpy/numpy/issues/17174. One proposed solution is to > only release manylinux2010 linux wheels for the NumPy 1.20 release. > Thoughts? > > > > I think it may still be too early to discontinue manylinux1, sadly. > > > > Systems requiring manylinux1 are those with: > > > > pip < 19.0 (Jan 2019) [1] > > Linux distribution older than around 2010 (glibc < 2.12) [2] > > > > I did a PyPI BigQuery [3] just now, editing to give results for 32, > > and 64 bit (by changing the manylinux wheel name matching regexp). > > > > Then I processed a bit with Pandas [4]. > > > > It looks like about 34% of PyPI manylinux*_i686 downloads are for > > systems that actually need manylinux1, and about 17% of > > manylinux*_x86_64. See the table in [4] for a listing of the top 10 > > entries. > > > > Cheers, > > > > Matthew > > > > [1] https://github.com/pypa/manylinux > > [2] https://www.python.org/dev/peps/pep-0571/ > > [3] https://gist.github.com/e3901b344b8d81f5633908347b1b333e > > [4] https://gist.github.com/0f624ddbc34bc3db8bcae23e3eeb7b54 > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.j.a.cock at googlemail.com Sat Aug 29 14:30:44 2020 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Sat, 29 Aug 2020 19:30:44 +0100 Subject: [Numpy-discussion] Staging Biopython wheels on Anaconda.org, was: Replacement for Rackspace In-Reply-To: References: <2B5B9B49-80D8-46A8-B12F-84C438A0ED4D@hxcore.ol> <304cbb1c-1871-0da4-097c-70d1c7c6d8e9@gmail.com> Message-ID: Hi Matti, If your offer still stands, I'd like to transition Biopython from the donated Rackspace storage to Anaconda as discussed a few weeks back. 
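(For the record, my working assumption about the upload step itself is sketched below; the organization name comes from Matti's earlier message quoted further down, while the ANACONDA_STAGING_TOKEN variable name, the wheelhouse/ directory and the exact anaconda-client invocation are guesses on my part, so please correct me if the multibuild setup does this differently.)

# Illustrative sketch only: push freshly built wheels to the staging
# organization with the anaconda-client CLI, using a token exposed to CI
# through an environment variable (the variable name here is hypothetical).
import glob
import os
import subprocess

token = os.environ["ANACONDA_STAGING_TOKEN"]  # hypothetical variable name
for wheel in sorted(glob.glob("wheelhouse/*.whl")):
    subprocess.run(
        ["anaconda", "-t", token, "upload",
         "--user", "multibuild-wheels-staging", wheel],
        check=True,
    )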
I presume this means updating the WHEELHOUSE credentials in our multi-wheel repository, both for TravisCI and AppVeyor? https://github.com/biopython/biopython-wheels Do you need any other information from me, or the Biopython team? Thank you, Peter, (On behalf of Biopython) On Tue, Aug 11, 2020 at 6:34 AM Matti Picus wrote: > > On 8/11/20 12:39 AM, Peter Cock wrote: > > Hi Matti, > > Is this an open invitation to the wider Numpy ecosystem? I am > interested on behalf of Biopython which was using the donated > Rackspace for multibuild wheel staging prior to PyPy release > (although having weekly test releases sounds interesting too). > > I would be happy to continue this discussion off list if you prefer, > > Thank you, > > Peter > > On Mon, Aug 10, 2020 at 9:20 PM Matti Picus wrote: > >> anaconda is generously hosting projects at >> https://anaconda.org/scipy-wheels-nightly/ (for weekly development >> releases that can be used to test downstream projects) and >> https://anaconda.org/multibuild-wheels-staging (for staging wheels to be >> tested for release on PyPI). >> >> >> The trick is that CI needs a token so it can upload to those >> organizations. Kevin, we can either add you to the groups you can create >> a token, or one of the current members could create tokens and transport >> them safely to Kevin. Please disucss it with me (or one of the other >> members https://anaconda.org/multibuild-wheels-staging/groups). >> >> Matti > > > Yes, it is a general invitation. My mail is in the message. > > I guess we should set up some kind of janitor task to remove older > packages from the hosting space as usage goes up. > > Matti > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ryan.c.cooper at uconn.edu Sat Aug 29 15:34:16 2020 From: ryan.c.cooper at uconn.edu (cooperrc) Date: Sat, 29 Aug 2020 12:34:16 -0700 (MST) Subject: [Numpy-discussion] Feature requests/Enhancements for upper-level engineering students In-Reply-To: References: <1597943458736-0.post@n7.nabble.com> Message-ID: <1598729656988-0.post@n7.nabble.com> Charles R Harris wrote > On Thu, Aug 20, 2020 at 11:11 AM cooperrc < > ryan.c.cooper@ > > wrote: > >> Greetings, >> As the Fall semester is fast approaching (10 days away for us at UConn), >> we >> are looking for senior design (also called capstone) projects for the >> 2020-2021 school year. The COVID situation has strengthened the need for >> remote work. >> The process here is that students are assigned to projects by late >> September. Then, they have 6 main deliverables over the course of 2 >> semesters: >> 1. Initial Fall Presentation (~Oct) >> 2. Final Fall Presentation (~Dec) >> 3. Mid-year report (~Jan) >> 4. Initial Spring Presentation (~Mar) >> 5. Final Spring Presntation (~Apr) >> 6. Final report (~May) >> >> My question to the NumPy community is: Are there any features or >> enhancements that would be nice to have, but might not have a team >> dedicated >> to the idea? >> >> I would be happy to advise any projects that people are interested in >> proposing. I would like to hear what people think would be worthwhile for >> students to build together. Some background, these students have all used >> Python and Matlab for mechanical engineering applications like linear >> regression, modal analyses, ode integration, and root solving. 
They learn >> quickly, but may not be interested in UX/UI design problems. >> >> >> > Thanks for the inquiry. We are always looking for new people who have the > time and inclination to make a contribution to NumPy, but NumPy core > probably isn't a good choice for class projects. Work on NumPy core > requires C and CPython C-API expertise and experienced programmers > generally take 3-6 months to come up to speed, the learning curve is just > too steep for most students. NumPy also needs to be very careful about > maintaining compatibility with existing downstream projects and in > introducing new features. I suspect students would enjoy a > faster moving project. > > There is a lot of work on the website and online documentation that is > moving faster than NumPy core, but that sounds like it might be out of > scope for your classes. If not, let us know. > > If you can think of new projects based on NumPy, that might work better. > They could be written in Python and the students could release them on > PyPI > if so inclined. I suspect there are several ongoing projects that are more > engineering oriented than NumPy and the current Python Science stack could > use more engineering applications. Perhaps others more familiar with that > area could make suggestions. > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion@ > https://mail.python.org/mailman/listinfo/numpy-discussion Thanks for the feedback Chuck. I'll poke around and brainstorm. -- Sent from: http://numpy-discussion.10968.n7.nabble.com/ From ryan.c.cooper at uconn.edu Sat Aug 29 15:33:18 2020 From: ryan.c.cooper at uconn.edu (cooperrc) Date: Sat, 29 Aug 2020 12:33:18 -0700 (MST) Subject: [Numpy-discussion] Feature requests/Enhancements for upper-level engineering students In-Reply-To: <1597958517453-0.post@n7.nabble.com> References: <1597943458736-0.post@n7.nabble.com> <1597958517453-0.post@n7.nabble.com> Message-ID: <1598729598952-0.post@n7.nabble.com> KevinBaselinesw wrote > would your team be interested in contributing to my port of Numpy to .NET? > > https://github.com/Quansight-Labs/numpy.net > > I have the vast majority of the Numpy core working as a pure .NET library. > > All of the other libraries that rely on Numpy are not ported. I am sure we > could find some good projects for your team to work on. These would be > "green field" projects and would likely be great learning opportunities > for > them. > > > > -- > Sent from: http://numpy-discussion.10968.n7.nabble.com/ > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion@ > https://mail.python.org/mailman/listinfo/numpy-discussion I don't have any experience in .NET, so I don't know how much help I could lend/advise projects. -- Sent from: http://numpy-discussion.10968.n7.nabble.com/ From einstein.edison at gmail.com Mon Aug 31 08:09:25 2020 From: einstein.edison at gmail.com (Hameer Abbasi) Date: Mon, 31 Aug 2020 14:09:25 +0200 Subject: [Numpy-discussion] [ANN] PyData/Sparse 0.11.1 Message-ID: <4cd080dc-8e39-4f24-9b7d-1391c21091da@Canary> Hello, I?m happy to announce the release of PyData/Sparse 0.11.1, available to download via pip and conda-forge. PyData/Sparse is a library that provides sparse N-dimensional arrays for the PyData ecosystem. This is a bugfix release, with a fix for the regression in dot for very small values. 
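(For anyone who has not used the library before, a minimal usage sketch of the dot path is below; the shapes and density are arbitrary, and this is a generic sanity check rather than a reproducer for the fixed regression.)

import numpy as np
import sparse

# Two random sparse 2-D arrays in COO format (about 10% non-zero entries).
a = sparse.random((500, 500), density=0.1)
b = sparse.random((500, 500), density=0.1)

# dot on sparse inputs returns a sparse result; compare with dense NumPy.
c = sparse.dot(a, b)
assert np.allclose(c.todense(), np.dot(a.todense(), b.todense()))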
The official website and documentation are available at: https://sparse.pydata.org The sources and bug tracker: https://github.com/pydata/sparse The changelog for this release can be viewed at: https://sparse.pydata.org/en/0.11.1/changelog.html Best Regards, Hameer Abbasi -- Sent from Canary (https://canarymail.io/) -------------- next part -------------- An HTML attachment was scrubbed... URL: