From josh.craig.wilson at gmail.com  Sun Nov  1 09:37:39 2020
From: josh.craig.wilson at gmail.com (Joshua Wilson)
Date: Sun, 1 Nov 2020 06:37:39 -0800
Subject: [Numpy-discussion] Ndarray static typing: Order of generic types
In-Reply-To: 
References: 
Message-ID: 

> Just to speak for myself, I don't think the precise choice matters very
> much. There are arguments for consistency both ways.

I agree with this. In the absence of strong theoretical considerations I'd
fall back to a practical one: we can make ndarray generic over dtype
_right now_, while for shape we will need to wait 1+ years for the
variadic type variable PEP to settle etc. To me that suggests:

- Do ndarray[DType] now
- When the shape stuff is ready, do ndarray[DType, ShapeStuff] (or however
  ShapeStuff ends up being spelled)
- Write a mypy plugin that rewrites ndarray[DType] to
  ndarray[DType, AnyShape] (or whatever) for backwards compatibility

On Thu, Oct 29, 2020 at 1:37 PM Stephan Hoyer wrote:
>
> On Wed, Oct 28, 2020 at 2:44 PM bas van beek wrote:
>>
>> Hey all,
>>
>> With the recent merging of numpy/numpy#16759 we're at the point where
>> `ndarray` can be made generic w.r.t. its dtype and shape.
>>
>> An open question which still remains is the order in which these two
>> parameters should appear (numpy/numpy#16547):
>>
>> - `ndarray[Dtype, Shape]`
>> - `ndarray[Shape, Dtype]`
>
> Hi Bas,
>
> Thanks for driving this forward!
>
> Just to speak for myself, I don't think the precise choice matters very
> much. There are arguments for consistency both ways. In the end Dtype
> and Shape are different enough that I doubt it will be a point of
> confusion.
>
> Also, I would guess many users will define their own type aliases, so
> can write something more succinct like Float64[shape] rather than
> ndarray[float64, shape]. We might even consider including some of these
> in numpy.typing.
>
> Cheers,
> Stephan
>
>> There has been some discussion about this question in issue 16547, but
>> a consensus has not yet been reached. Most people seem to slightly
>> prefer one option over the other.
>>
>> Are there any further thoughts on this subject?
>>
>> Regards,
>> Bas van Beek
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion

From mark.harfouche at gmail.com  Sun Nov  1 18:27:30 2020
From: mark.harfouche at gmail.com (Mark Harfouche)
Date: Sun, 1 Nov 2020 18:27:30 -0500
Subject: [Numpy-discussion] NumPy 1.20.x branch in two weeks
In-Reply-To: 
References: 
Message-ID: 

I know it seems silly, but would an amendment to NEP29 be reasonable?

Many downstream packages look to numpy to understand what versions should
be supported, and NEP29 gave some good guidance. That said, if it is worth
ignoring, or revisiting, some clarity on how to apply NEP29 given recent
developments would be appreciated.

Best,

Mark

On Sat, Oct 31, 2020 at 8:24 AM Ralf Gommers wrote:

>
>
> On Thu, Oct 29, 2020 at 2:25 PM Charles R Harris <
> charlesr.harris at gmail.com> wrote:
>
>> Hi All,
>>
>> Time to start planning for the 1.20.x branch. These are my thoughts at
>> the moment:
>>
>> - Keep support for Python 3.6. Python 3.7 came out in June 2018,
>>   which seems too recent to be our oldest supported version.
>> - Drop Python 3.6 for 1.21.x, that will make the oldest supported
>>   version about three years old.
>> - Drop manylinux1 for 1.21.x. It would be nice to drop earlier, but
>>   manylinux2010 is pretty recent.
>>
>> There were 33 wheels in the 1.19.3 release, I think we can live with that
>> for 1.20.x. I'm more worried about our tools aging out. After Python has
>> settled into its yearly release cycle, I think we will end up supporting
>> the latest 4 versions.
>>
>> Thoughts?
>>
>
> Seems reasonable to me.
>
> Cheers,
> Ralf
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From currurant at gmail.com  Sun Nov  1 18:54:46 2020
From: currurant at gmail.com (Currurant)
Date: Sun, 1 Nov 2020 16:54:46 -0700 (MST)
Subject: [Numpy-discussion] Efficient way to draw multinomial distribution random samples
Message-ID: <1604274886591-0.post@n7.nabble.com>

I realized that neither numpy.random.multinomial nor rng.multinomial has
the ability to draw from different multinomial distributions at the same
time, like what MATLAB mnrnd() does here:

https://www.mathworks.com/help/stats/mnrnd.html

Also, I have asked this question on Stack Overflow:

https://stackoverflow.com/questions/64529620/is-there-an-efficient-way-to-generate-multinomial-random-variables-in-parallel?noredirect=1#comment114131565_64529620

It seems like this is something good to add to numpy.random, since it
would be much faster than using loops when you have many multinomial
distributions to draw from.

--
Sent from: http://numpy-discussion.10968.n7.nabble.com/

From charlesr.harris at gmail.com  Sun Nov  1 19:50:59 2020
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sun, 1 Nov 2020 17:50:59 -0700
Subject: [Numpy-discussion] NumPy 1.20.x branch in two weeks
In-Reply-To: 
References: 
Message-ID: 

On Sun, Nov 1, 2020 at 4:28 PM Mark Harfouche wrote:

> I know it seems silly, but would an amendment to NEP29 be reasonable?
>
> Many downstream packages look to numpy to understand what versions should
> be supported, and NEP29 gave some good guidance. That said, if it is worth
> ignoring, or revisiting, some clarity on how to apply NEP29 given recent
> developments would be appreciated.
>
> Best,
>
> Mark
>
Do you think the proposal is not in compliance? There is no requirement
that we drop anything more than 42 months old, it is just recommended. The
change in the Python release cycle has created some difficulty. With the
yearly cycle, 4 yearly Python releases will cover 3-4 years, which seems
reasonable and we can probably drop to 3 releases towards the end, but
with 3.7 coming 18 months after 3.6, four releases is on the long side,
and three releases on the short side, so keeping 3.6 is the conservative
choice. Once the yearly cycle sets in I think we will be fine.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From kevin.k.sheppard at gmail.com  Sun Nov  1 19:58:27 2020
From: kevin.k.sheppard at gmail.com (Kevin Sheppard)
Date: Mon, 2 Nov 2020 00:58:27 +0000
Subject: [Numpy-discussion] Efficient way to draw multinomial distribution random samples
In-Reply-To: <1604274886591-0.post@n7.nabble.com>
References: <1604274886591-0.post@n7.nabble.com>
Message-ID: 

This is in the pending PR. Hopefully out in 1.20.
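In the meantime, looping over the probability vectors with the existing
`Generator.multinomial` API is the usual workaround. A minimal sketch of
that loop (the trial count and probability rows below are made up for
illustration):

    import numpy as np

    rng = np.random.default_rng(12345)

    # Hypothetical inputs: one trial count and a (2, 3) stack of
    # probability vectors, mirroring MATLAB's mnrnd(n, P).
    n = 100
    pvals = np.array([[0.2, 0.3, 0.5],
                      [0.6, 0.3, 0.1]])

    # Workaround: one multinomial draw per probability vector.
    counts = np.stack([rng.multinomial(n, p) for p in pvals])
    print(counts.shape)        # (2, 3)
    print(counts.sum(axis=1))  # [100 100], one total per row

The pending PR should make this per-row Python loop unnecessary.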
Kevin On Sun, Nov 1, 2020, 23:55 Currurant wrote: > I realized that neither numpy.random.multinomial nor rng.multinomial has > the > ability to draw from different multinomial distributions at the same time > like what MATLAB mnrnd() does here: > > https://www.mathworks.com/help/stats/mnrnd.html > > Also, I have asked this question on StackOverFlow: > > > https://stackoverflow.com/questions/64529620/is-there-an-efficient-way-to-generate-multinomial-random-variables-in-parallel?noredirect=1#comment114131565_64529620 > > It seems like this is something good to add to numpy.random, since it would > be much more faster when you have many multinomial distributions to draw > from---using loops. > > > > -- > Sent from: http://numpy-discussion.10968.n7.nabble.com/ > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mark.harfouche at gmail.com Sun Nov 1 20:47:25 2020 From: mark.harfouche at gmail.com (Mark Harfouche) Date: Sun, 1 Nov 2020 20:47:25 -0500 Subject: [Numpy-discussion] NumPy 1.20.x branch in two weeks In-Reply-To: References: Message-ID: > > > Do you think the proposal is not in compliance? There is no requirement > that we drop anything more than 42 months old, it is just recommended. The > change in the Python release cycle has created some difficulty. With the > yearly cycle, 4 python yearly releases will cover 3-4 years, which seems > reasonable and we can probably drop to 3 releases towards the end, but with > 3.7 coming 18 months after 3.6, four releases is on the long side, and > three releases on the short side, so keeping 3.6 is the conservative > choice. Once the yearly cycle sets in I think we will be fine. > > Chuck > I believe that it really helps to "lead by example". I don't mean to reference threads that you have all participated in, but the discussion in: https://mail.python.org/pipermail/scipy-dev/2020-August/024336.html Makes it clear to me at least, that downstream will follow the example that numpy sets. At the time of writing, it was anticipated that Python 3.7, 3.8, and maybe 3.9 would exist in Nov 1st. The support table https://numpy.org/neps/nep-0029-deprecation_policy.html#support-table suggests that any release July 23 should only support 3.7. Barring COVID delays, it seems natural that in Nov 2020, support for Python 3.6 be dropped or that the NEP be revised. These decisions are hard, and take up alot of mental capacity, if the support window needs revisiting, that is fine, it just really helps to be able to point to a document (which is what NEP29 seemed to do). -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun Nov 1 21:03:38 2020 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 1 Nov 2020 19:03:38 -0700 Subject: [Numpy-discussion] NumPy 1.20.x branch in two weeks In-Reply-To: References: Message-ID: On Sun, Nov 1, 2020 at 6:48 PM Mark Harfouche wrote: > >> Do you think the proposal is not in compliance? There is no requirement >> that we drop anything more than 42 months old, it is just recommended. The >> change in the Python release cycle has created some difficulty. 
With the >> yearly cycle, 4 python yearly releases will cover 3-4 years, which seems >> reasonable and we can probably drop to 3 releases towards the end, but with >> 3.7 coming 18 months after 3.6, four releases is on the long side, and >> three releases on the short side, so keeping 3.6 is the conservative >> choice. Once the yearly cycle sets in I think we will be fine. >> >> Chuck >> > > I believe that it really helps to "lead by example". > > I don't mean to reference threads that you have all participated in, but > the discussion in: > https://mail.python.org/pipermail/scipy-dev/2020-August/024336.html > > Makes it clear to me at least, that downstream will follow the example > that numpy sets. > > At the time of writing, it was anticipated that Python 3.7, 3.8, and maybe > 3.9 would exist in Nov 1st. > The support table > https://numpy.org/neps/nep-0029-deprecation_policy.html#support-table > suggests that any release July 23 should only support 3.7. > > Barring COVID delays, it seems natural that in Nov 2020, support for > Python 3.6 be dropped or that the NEP be revised. > > These decisions are hard, and take up alot of mental capacity, if the > support window needs revisiting, that is fine, it just really helps to be > able to point to a document (which is what NEP29 seemed to do). > > The problem is that if we drop 3.6 the oldest version of Python will only be 30 months old, not 36. Dropping 3.6 for 1.20.x will make it 36 months, which is the recommended minimum coverage. I made sure that the language did not preclude longer support periods in any case. It would be helpful here if more people would comment, I would be happy to go with the shorter period if a majority of downstream projects want to go that way. It's not that I love 3.6, but there is no compelling reason to drop it, as there was for 3.5, at least that I am aware of. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From jeffreback at gmail.com Sun Nov 1 21:07:52 2020 From: jeffreback at gmail.com (Jeff Reback) Date: Sun, 1 Nov 2020 21:07:52 -0500 Subject: [Numpy-discussion] NumPy 1.20.x branch in two weeks In-Reply-To: References: Message-ID: pandas has already dropped 3.6 support in our coming 1.2 release (nov 2020); 1.1.x supports 3.6 > On Nov 1, 2020, at 9:04 PM, Charles R Harris wrote: > > ? > > > On Sun, Nov 1, 2020 at 6:48 PM Mark Harfouche wrote: >>> >>> Do you think the proposal is not in compliance? There is no requirement that we drop anything more than 42 months old, it is just recommended. The change in the Python release cycle has created some difficulty. With the yearly cycle, 4 python yearly releases will cover 3-4 years, which seems reasonable and we can probably drop to 3 releases towards the end, but with 3.7 coming 18 months after 3.6, four releases is on the long side, and three releases on the short side, so keeping 3.6 is the conservative choice. Once the yearly cycle sets in I think we will be fine. >>> >>> Chuck >> >> I believe that it really helps to "lead by example". >> >> I don't mean to reference threads that you have all participated in, but the discussion in: >> https://mail.python.org/pipermail/scipy-dev/2020-August/024336.html >> >> Makes it clear to me at least, that downstream will follow the example that numpy sets. >> >> At the time of writing, it was anticipated that Python 3.7, 3.8, and maybe 3.9 would exist in Nov 1st. 
>> The support table https://numpy.org/neps/nep-0029-deprecation_policy.html#support-table >> suggests that any release July 23 should only support 3.7. >> >> Barring COVID delays, it seems natural that in Nov 2020, support for Python 3.6 be dropped or that the NEP be revised. >> >> These decisions are hard, and take up alot of mental capacity, if the support window needs revisiting, that is fine, it just really helps to be able to point to a document (which is what NEP29 seemed to do). >> > > The problem is that if we drop 3.6 the oldest version of Python will only be 30 months old, not 36. Dropping 3.6 for 1.20.x will make it 36 months, which is the recommended minimum coverage. I made sure that the language did not preclude longer support periods in any case. > > It would be helpful here if more people would comment, I would be happy to go with the shorter period if a majority of downstream projects want to go that way. It's not that I love 3.6, but there is no compelling reason to drop it, as there was for 3.5, at least that I am aware of. > > Chuck > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From millman at berkeley.edu Sun Nov 1 21:44:03 2020 From: millman at berkeley.edu (Jarrod Millman) Date: Sun, 1 Nov 2020 18:44:03 -0800 Subject: [Numpy-discussion] NumPy 1.20.x branch in two weeks In-Reply-To: References: Message-ID: NetworkX is currently planning to support 3.6 for our coming 2.6 release (dec 2020) and 3.0 release (early 2021). We had originally thought about following NEP 29. But I assumed it had been abandoned, since neither NumPy nor SciPy dropped Python 3.6 on Jun 23, 2020. NetworkX is likely to continue supporting whatever versions of Python both NumPy and SciPy support regardless of what NEP 29 says. I wouldn't be surprised if other projects do the same thing. Jarrod From millman at berkeley.edu Sun Nov 1 21:54:53 2020 From: millman at berkeley.edu (Jarrod Millman) Date: Sun, 1 Nov 2020 18:54:53 -0800 Subject: [Numpy-discussion] NumPy 1.20.x branch in two weeks In-Reply-To: References: Message-ID: I also misunderstood the purpose of the NEP. I assumed it was intended to encourage projects to drop old versions of Python. Other people have viewed the NEP similarly: https://github.com/networkx/networkx/issues/4027 If the intention of the NEP is to specify that projects not drop old version of Python too early, I don't think it is obvious from the NEP. It would be helpful if you added a simple motivation statement near the top of the document. Something like: ## Motivation and Scope The purpose of the NEP is to ensure projects in the scientific Python ecosystem don't drop support for old version of Python and NumPy too soon. On Sun, Nov 1, 2020 at 6:44 PM Jarrod Millman wrote: > > NetworkX is currently planning to support 3.6 for our coming 2.6 > release (dec 2020) and 3.0 release (early 2021). We had originally > thought about following NEP 29. But I assumed it had been abandoned, > since neither NumPy nor SciPy dropped Python 3.6 on Jun 23, 2020. > > NetworkX is likely to continue supporting whatever versions of Python > both NumPy and SciPy support regardless of what NEP 29 says. I > wouldn't be surprised if other projects do the same thing. 
>
> Jarrod

From stefanv at berkeley.edu  Sun Nov  1 22:47:04 2020
From: stefanv at berkeley.edu (Stefan van der Walt)
Date: Sun, 01 Nov 2020 19:47:04 -0800
Subject: [Numpy-discussion] NumPy 1.20.x branch in two weeks
In-Reply-To: 
References: 
Message-ID: 

On Sun, Nov 1, 2020, at 18:54, Jarrod Millman wrote:
> I also misunderstood the purpose of the NEP. I assumed it was
> intended to encourage projects to drop old versions of Python. Other
> people have viewed the NEP similarly:
> https://github.com/networkx/networkx/issues/4027

Of all the packages, it makes sense for NumPy to behave most
conservatively with deprecations. The NEP suggests allowable support
periods, but as far as I recall does not enforce minimal support.

Stephan Hoyer had a good recommendation on how we can clarify the NEP to
be easier to intuit. Stephan, shall we make an amendment to the NEP with
your idea?

Best regards,
Stéfan

From kevin.k.sheppard at gmail.com  Mon Nov  2 02:12:34 2020
From: kevin.k.sheppard at gmail.com (Kevin Sheppard)
Date: Mon, 2 Nov 2020 07:12:34 +0000
Subject: [Numpy-discussion] NumPy 1.20.x branch in two weeks
In-Reply-To: 
References: ,
Message-ID: 

An HTML attachment was scrubbed...
URL: 

From shoyer at gmail.com  Mon Nov  2 02:47:18 2020
From: shoyer at gmail.com (Stephan Hoyer)
Date: Sun, 1 Nov 2020 23:47:18 -0800
Subject: [Numpy-discussion] NumPy 1.20.x branch in two weeks
In-Reply-To: 
References: 
Message-ID: 

On Sun, Nov 1, 2020 at 7:47 PM Stefan van der Walt 
wrote:

> On Sun, Nov 1, 2020, at 18:54, Jarrod Millman wrote:
> > I also misunderstood the purpose of the NEP. I assumed it was
> > intended to encourage projects to drop old versions of Python. Other
> > people have viewed the NEP similarly:
> > https://github.com/networkx/networkx/issues/4027
>
> Of all the packages, it makes sense for NumPy to behave most
> conservatively with deprecations. The NEP suggests allowable support
> periods, but as far as I recall does not enforce minimal support.
>
> Stephan Hoyer had a good recommendation on how we can clarify the NEP to
> be easier to intuit. Stephan, shall we make an amendment to the NEP with
> your idea?
>

For reference, here was my proposed revision:
https://github.com/numpy/numpy/pull/14086#issuecomment-649287648

Specifically, rather than saying "the latest release of NumPy supports all
versions of Python released in the 42 months before NumPy's release", it
says "NumPy will only require versions of Python that were released more
than 24 months ago". In practice, this works out to the same thing (at
least given Python's old 18 month release cycle).

This changes the definition of the support window (in a way that I think
is clearer and that works better for infrequent releases), but there is
still the question of how large that window should be for NumPy. My
personal opinion is that somewhere in the range of 24-36 months would be
appropriate.

> Best regards,
> Stéfan
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ralf.gommers at gmail.com  Mon Nov  2 03:01:38 2020
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Mon, 2 Nov 2020 08:01:38 +0000
Subject: [Numpy-discussion] NumPy 1.20.x branch in two weeks
In-Reply-To: 
References: 
Message-ID: 

On Mon, Nov 2, 2020 at 7:47 AM Stephan Hoyer wrote:

> On Sun, Nov 1, 2020 at 7:47 PM Stefan van der Walt 
> wrote:
>
>> On Sun, Nov 1, 2020, at 18:54, Jarrod Millman wrote:
>> > I also misunderstood the purpose of the NEP. I assumed it was
>> > intended to encourage projects to drop old versions of Python.
> > It was. It is. I think the NEP is very clear on that. Honestly we should just follow the NEP and drop 3.6 now for both NumPy and SciPy, I just am tired of arguing for it - which the NEP should have prevented being necessary, and I don't want to do again right now, so this will probably be my last email on this thread. Other >> > people have viewed the NEP similarly: >> > https://github.com/networkx/networkx/issues/4027 >> >> Of all the packages, it makes sense for NumPy to behave most >> conservatively with depreciations. The NEP suggests allowable support >> periods, but as far as I recall does not enforce minimal support. >> > It doesn't *enforce* it, but the recommendation is very clear. It would be good to follow it. >> Stephan Hoyer had a good recommendation on how we can clarify the NEP to >> be easier to intuit. Stephan, shall we make an ammendment to the NEP with >> your idea? >> > > For reference, here was my proposed revision: > https://github.com/numpy/numpy/pull/14086#issuecomment-649287648 > > Specifically, rather than saying "the latest release of NumPy supports all > versions of Python released in the 42 months before NumPy's release", it > says "NumPy will only require versions of Python that were released more > than 24 months ago". In practice, this works out to the same thing (at > least given Python's old 18 month release cycle). > > This changes the definition of the support window (in a way that I think > is clearer and that works better for infrequent releases), but there is > still the question of how large that window should be for NumPy. > I'm not sure it's clearer, the current NEP has a nice graphic and literally says "a project with a major or minor version release in November 2020 should support Python 3.7 and newer."). However happy to adopt it if it makes others happy - in the end it comes down to the same thing: it's recommended to drop Python 3.6 now. My personal opinion is that somewhere in the range of 24-36 months would be > appropriate. > +1 Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From deak.andris at gmail.com Mon Nov 2 07:22:06 2020 From: deak.andris at gmail.com (Andras Deak) Date: Mon, 2 Nov 2020 13:22:06 +0100 Subject: [Numpy-discussion] Do not understand what f2py is reporting In-Reply-To: References: Message-ID: On Sun, Nov 1, 2020 at 2:33 AM Samuel Dupree wrote: > > I'm attempting to build wrappers around two Fortran routines. One is a > Fortran 77 subroutine (see file gravity_derivs.f) that calls a Fortran > 90 package that performs automatic differentiation (see file > auto_deriv.f90). > > I'm running he Anaconda distribution for Python 3.7.6 on a Mac Pro > (2019) under Mac OS X Catalina (ver. 10.15.6). The version of NumPy I'm > running is 1.18.3. The commands I used to attempt the build are > contained in the file auto_deriv_build. The messages output by f2py are > captured in the file auto_derivs_build_report.txt. > > I don't understand the cause behind the error messages I got, so any > advice would be welcomed. > > Sam Dupree. Hi Sam, I've got a partial solution. I haven't used f2py yet but at least the error from your first `f2py` call seems straightforward. Near the top: Line #119 in gravity_derivs.f:" integer * 4 degree" updatevars: no name pattern found for entity='*4degree'. Skipping. This shows that the fortran code gets parsed as `(integer) (*4degree)`. That can't be right. 
There might be a way to tell f2py to do this right, but anyway I could make your code compile by replacing every such declaration with `integer * 4 :: degree` etc (i.e. adding double colons everywhere). Once that's fixed your first f2py call raises another error: Fatal Error: Cannot open module file ?deriv_class.mod? for reading at (1): No such file or directory I could generate these mod files by manually running `gfortran -c auto_deriv.f90`. After that the .mod files appear and your first `f2py` call will succed. You can now `import gravity_derivs`, but of course this will lead to an error because `auto_deriv` is not available in python. Unfortunately your _second_` f2py` call also dies on `auto_deriv.f90`, with such offending lines: In: :auto_deriv:auto_deriv.f90:ad_auxiliary get_parameters: got "invalid syntax (, line 1)" on '(/((i, i=j,n), j=1,n)/)' I'm guessing that again f2py can't parse that syntax. My hunch is that if you can get f2py to work with `auto_deriv.f90` you should first run that. This should hopefully generate the .mod files after which the second call to `f2py` with `gravity_derivs.f` should work. If `f2py` doesn't generate the .mod files you could at worst run your fortran compiler yourself between the two calls to `f2py`. Cheers, Andr?s > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion From jni at fastmail.com Mon Nov 2 07:49:45 2020 From: jni at fastmail.com (Juan Nunez-Iglesias) Date: Mon, 02 Nov 2020 06:49:45 -0600 Subject: [Numpy-discussion] NumPy 1.20.x branch in two weeks In-Reply-To: References: Message-ID: <65b3fe9e-6f6a-4c5e-943e-e5747076eb0d@www.fastmail.com> I like Ralf's email, and most of all I agree that the existing wording is clearer. My view on the NEP is that it does not mandate dropping support, but encourage it. In my projects I would drop it if I had use for Python 3.7+ features. It so happens that we want to use PEP-593 so we were grateful for NEP-29 giving us "permission" to drop 3.6. I would suggest that 3.6 be dropped immediately if there are any open PRs that would benefit from it, or code cleanups that it would enable. The point of the NEP is to short-circuit discussion about whether it's "worth" dropping 3.6. If it's valuable at all, do it. Thanks all, Juan. On Mon, 2 Nov 2020, at 2:01 AM, Ralf Gommers wrote: > > > On Mon, Nov 2, 2020 at 7:47 AM Stephan Hoyer wrote: >> On Sun, Nov 1, 2020 at 7:47 PM Stefan van der Walt wrote: >>> On Sun, Nov 1, 2020, at 18:54, Jarrod Millman wrote: >>> > I also misunderstood the purpose of the NEP. I assumed it was >>> > intended to encourage projects to drop old versions of Python. > > It was. It is. I think the NEP is very clear on that. Honestly we should just follow the NEP and drop 3.6 now for both NumPy and SciPy, I just am tired of arguing for it - which the NEP should have prevented being necessary, and I don't want to do again right now, so this will probably be my last email on this thread. > > >>> Other >>> > people have viewed the NEP similarly: >>> > https://github.com/networkx/networkx/issues/4027 >>> >>> Of all the packages, it makes sense for NumPy to behave most conservatively with depreciations. The NEP suggests allowable support periods, but as far as I recall does not enforce minimal support. > > It doesn't *enforce* it, but the recommendation is very clear. It would be good to follow it. 
> >>> >>> Stephan Hoyer had a good recommendation on how we can clarify the NEP to be easier to intuit. Stephan, shall we make an ammendment to the NEP with your idea? >> >> For reference, here was my proposed revision: >> https://github.com/numpy/numpy/pull/14086#issuecomment-649287648 >> Specifically, rather than saying "the latest release of NumPy supports all versions of Python released in the 42 months before NumPy's release", it says "NumPy will only require versions of Python that were released more than 24 months ago". In practice, this works out to the same thing (at least given Python's old 18 month release cycle). >> >> This changes the definition of the support window (in a way that I think is clearer and that works better for infrequent releases), but there is still the question of how large that window should be for NumPy. > > I'm not sure it's clearer, the current NEP has a nice graphic and literally says "a project with a major or minor version release in November 2020 should support Python 3.7 and newer."). However happy to adopt it if it makes others happy - in the end it comes down to the same thing: it's recommended to drop Python 3.6 now. > >> My personal opinion is that somewhere in the range of 24-36 months would be appropriate. > > +1 > > Cheers, > Ralf > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Nov 2 11:37:28 2020 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 2 Nov 2020 09:37:28 -0700 Subject: [Numpy-discussion] NumPy 1.19.4 release Message-ID: Hi All, On behalf of the NumPy team I am pleased to announce the release of NumPy 1.19.4. NumPy 1.19.4 is a quick release to revert the OpenBLAS library version. It was hoped that the 0.3.12 OpenBLAS version used in 1.19.3 would work around the Microsoft fmod bug, but problems in some docker environments turned up. Instead, 1.19.4 will use the older library and run a sanity check on import, raising an error if the problem is detected. Microsoft is aware of the problem and has promised a fix, users should upgrade when it becomes available. This release supports Python 3.6-3.9. NumPy Wheels for this release can be downloaded from PyPI , source archives, release notes, and wheel hashes are available on Github . Linux users will need pip >= 0.19.3 in order to install manylinux2010 and manylinux2014 wheels. *Contributors* A total of 1 people contributed to this release. People with a "+" by their names contributed a patch for the first time. - Charles Harris *Pull requests merged* A total of 2 pull requests were merged for this release. - #17679: MAINT: Add check for Windows 10 version 2004 bug. - #17680: REV: Revert OpenBLAS to 1.19.2 version for 1.19.4 Cheers, Charles Harris -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Mon Nov 2 13:48:46 2020 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 02 Nov 2020 12:48:46 -0600 Subject: [Numpy-discussion] NumPy 1.20.x branch in two weeks In-Reply-To: <65b3fe9e-6f6a-4c5e-943e-e5747076eb0d@www.fastmail.com> References: <65b3fe9e-6f6a-4c5e-943e-e5747076eb0d@www.fastmail.com> Message-ID: On Mon, 2020-11-02 at 06:49 -0600, Juan Nunez-Iglesias wrote: > I like Ralf's email, and most of all I agree that the existing > wording is clearer. 
> > My view on the NEP is that it does not mandate dropping support, but > encourage it. In my projects I would drop it if I had use for Python > 3.7+ features. It so happens that we want to use PEP-593 so we were > grateful for NEP-29 giving us "permission" to drop 3.6. > > I would suggest that 3.6 be dropped immediately if there are any open > PRs that would benefit from it, or code cleanups that it would > enable. The point of the NEP is to short-circuit discussion about > whether it's "worth" dropping 3.6. If it's valuable at all, do it. > Probably the only thing that requires 3.7 in NumPy at this time is the module level `__getattr__`, which is used only for deprecations (and to make the financial removal slightly more gentle). I am not sure if PyPy already has stable support for 3.7 yet? Although PyPy is maybe not a big priority. We don't have to support 3.6 and I don't care if we do. Until this discussion my assumption was we would probably drop it. But, current master is tested against 3.6, so the main work seems release related. If Chuck thinks that is no hassle I don't mind if NumPy is a bit more conservative than NEP 29. Or is there a danger of setting a precedent where projects are wrongly expected to keep support just because NumPy still has it, so that NumPy not being conservative actually helps everyone? - Sebastian > Thanks all, > > Juan. > > On Mon, 2 Nov 2020, at 2:01 AM, Ralf Gommers wrote: > > > > On Mon, Nov 2, 2020 at 7:47 AM Stephan Hoyer > > wrote: > > > On Sun, Nov 1, 2020 at 7:47 PM Stefan van der Walt < > > > stefanv at berkeley.edu> wrote: > > > > On Sun, Nov 1, 2020, at 18:54, Jarrod Millman wrote: > > > > > I also misunderstood the purpose of the NEP. I assumed it > > > > > was > > > > > intended to encourage projects to drop old versions of > > > > > Python. > > > > It was. It is. I think the NEP is very clear on that. Honestly we > > should just follow the NEP and drop 3.6 now for both NumPy and > > SciPy, I just am tired of arguing for it - which the NEP should > > have prevented being necessary, and I don't want to do again right > > now, so this will probably be my last email on this thread. > > > > > > > > Other > > > > > people have viewed the NEP similarly: > > > > > https://github.com/networkx/networkx/issues/4027 > > > > > > > > Of all the packages, it makes sense for NumPy to behave most > > > > conservatively with depreciations. The NEP suggests allowable > > > > support periods, but as far as I recall does not enforce > > > > minimal support. > > > > It doesn't *enforce* it, but the recommendation is very clear. It > > would be good to follow it. > > > > > > Stephan Hoyer had a good recommendation on how we can clarify > > > > the NEP to be easier to intuit. Stephan, shall we make an > > > > ammendment to the NEP with your idea? > > > > > > For reference, here was my proposed revision: > > > https://github.com/numpy/numpy/pull/14086#issuecomment-649287648 > > > Specifically, rather than saying "the latest release of NumPy > > > supports all versions of Python released in the 42 months before > > > NumPy's release", it says "NumPy will only require versions of > > > Python that were released more than 24 months ago". In practice, > > > this works out to the same thing (at least given Python's old 18 > > > month release cycle). 
> > > > > > This changes the definition of the support window (in a way that > > > I think is clearer and that works better for infrequent > > > releases), but there is still the question of how large that > > > window should be for NumPy. > > > > I'm not sure it's clearer, the current NEP has a nice graphic and > > literally says "a project with a major or minor version release in > > November 2020 should support Python 3.7 and newer."). However happy > > to adopt it if it makes others happy - in the end it comes down to > > the same thing: it's recommended to drop Python 3.6 now. > > > > > My personal opinion is that somewhere in the range of 24-36 > > > months would be appropriate. > > > > +1 > > > > Cheers, > > Ralf > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From sdupree at speakeasy.net Mon Nov 2 23:26:00 2020 From: sdupree at speakeasy.net (Samuel Dupree) Date: Mon, 2 Nov 2020 23:26:00 -0500 Subject: [Numpy-discussion] Do not understand what f2py is reporting In-Reply-To: References: Message-ID: <7426aa38-b667-36ec-46cd-66c6935eb928@speakeasy.net> Andras, Thank you for respond to my post. I sincerely appreciate it. Following your advice, I replaced "integer * 4" with "integer" and I was able to generate the signature files for gravity_derivs.f. The problem now is generating the signature file for auto_deriv.f90. I agree that f2py has a problem with In: :auto_deriv:auto_deriv.f90:ad_auxiliary get_parameters: got "invalid syntax (, line 1)" on '(/((i,i=j,n), j=1,n)/)' I'm not sure I understand why f2py has a problem with this syntax. Is there documentation that talks to what Fortran77, Fortran 90/95 syntax f2py will and will not accept? Sam Dupree. On November/02/2020 07:22:06, Andras Deak wrote: > On Sun, Nov 1, 2020 at 2:33 AM Samuel Dupree wrote: >> I'm attempting to build wrappers around two Fortran routines. One is a >> Fortran 77 subroutine (see file gravity_derivs.f) that calls a Fortran >> 90 package that performs automatic differentiation (see file >> auto_deriv.f90). >> >> I'm running he Anaconda distribution for Python 3.7.6 on a Mac Pro >> (2019) under Mac OS X Catalina (ver. 10.15.6). The version of NumPy I'm >> running is 1.18.3. The commands I used to attempt the build are >> contained in the file auto_deriv_build. The messages output by f2py are >> captured in the file auto_derivs_build_report.txt. >> >> I don't understand the cause behind the error messages I got, so any >> advice would be welcomed. >> >> Sam Dupree. > Hi Sam, > > I've got a partial solution. > I haven't used f2py yet but at least the error from your first `f2py` > call seems straightforward. Near the top: > > Line #119 in gravity_derivs.f:" integer * 4 degree" > updatevars: no name pattern found for entity='*4degree'. Skipping. > > This shows that the fortran code gets parsed as `(integer) > (*4degree)`. That can't be right. 
There might be a way to tell f2py to > do this right, but anyway I could make your code compile by replacing > every such declaration with `integer * 4 :: degree` etc (i.e. adding > double colons everywhere). > Once that's fixed your first f2py call raises another error: > > Fatal Error: Cannot open module file ?deriv_class.mod? for reading > at (1): No such file or directory > > I could generate these mod files by manually running `gfortran -c > auto_deriv.f90`. After that the .mod files appear and your first > `f2py` call will succed. > You can now `import gravity_derivs`, but of course this will lead to > an error because `auto_deriv` is not available in python. > Unfortunately your _second_` f2py` call also dies on `auto_deriv.f90`, > with such offending lines: > > In: :auto_deriv:auto_deriv.f90:ad_auxiliary > get_parameters: got "invalid syntax (, line 1)" on '(/((i, > i=j,n), j=1,n)/)' > > I'm guessing that again f2py can't parse that syntax. > My hunch is that if you can get f2py to work with `auto_deriv.f90` you > should first run that. This should hopefully generate the .mod files > after which the second call to `f2py` with `gravity_derivs.f` should > work. If `f2py` doesn't generate the .mod files you could at worst run > your fortran compiler yourself between the two calls to `f2py`. > Cheers, > > Andr?s > >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > From mark.harfouche at gmail.com Tue Nov 3 09:17:54 2020 From: mark.harfouche at gmail.com (Mark Harfouche) Date: Tue, 3 Nov 2020 09:17:54 -0500 Subject: [Numpy-discussion] NumPy 1.20.x branch in two weeks In-Reply-To: References: <65b3fe9e-6f6a-4c5e-943e-e5747076eb0d@www.fastmail.com> Message-ID: Juan made a pretty good argument for keeping 3.6 support in the next scikit-image release, let me try to paraphrase: - Since nobody has made the PR to explicitly drop python 3.6 from the scikit-image build matrix, we will continue to support it, but if somebody were to make the PR, I (Juan) would support it. As for supporting PyPy: it already exists in the build matrix AFAICT. Breaking PyPy would be a deliberate action, as opposed to an accidental byproduct of dropping CPython 3.6. On Mon, Nov 2, 2020, 13:50 Sebastian Berg wrote: > On Mon, 2020-11-02 at 06:49 -0600, Juan Nunez-Iglesias wrote: > > I like Ralf's email, and most of all I agree that the existing > > wording is clearer. > > > > My view on the NEP is that it does not mandate dropping support, but > > encourage it. In my projects I would drop it if I had use for Python > > 3.7+ features. It so happens that we want to use PEP-593 so we were > > grateful for NEP-29 giving us "permission" to drop 3.6. > > > > I would suggest that 3.6 be dropped immediately if there are any open > > PRs that would benefit from it, or code cleanups that it would > > enable. The point of the NEP is to short-circuit discussion about > > whether it's "worth" dropping 3.6. If it's valuable at all, do it. > > > > Probably the only thing that requires 3.7 in NumPy at this time is the > module level `__getattr__`, which is used only for deprecations (and to > make the financial removal slightly more gentle). > I am not sure if PyPy already has stable support for 3.7 yet? 
Although > PyPy is maybe not a big priority. > > We don't have to support 3.6 and I don't care if we do. Until this > discussion my assumption was we would probably drop it. > > But, current master is tested against 3.6, so the main work seems > release related. If Chuck thinks that is no hassle I don't mind if > NumPy is a bit more conservative than NEP 29. > > Or is there a danger of setting a precedent where projects are wrongly > expected to keep support just because NumPy still has it, so that NumPy > not being conservative actually helps everyone? > > - Sebastian > > > > Thanks all, > > > > Juan. > > > > On Mon, 2 Nov 2020, at 2:01 AM, Ralf Gommers wrote: > > > > > > On Mon, Nov 2, 2020 at 7:47 AM Stephan Hoyer > > > wrote: > > > > On Sun, Nov 1, 2020 at 7:47 PM Stefan van der Walt < > > > > stefanv at berkeley.edu> wrote: > > > > > On Sun, Nov 1, 2020, at 18:54, Jarrod Millman wrote: > > > > > > I also misunderstood the purpose of the NEP. I assumed it > > > > > > was > > > > > > intended to encourage projects to drop old versions of > > > > > > Python. > > > > > > It was. It is. I think the NEP is very clear on that. Honestly we > > > should just follow the NEP and drop 3.6 now for both NumPy and > > > SciPy, I just am tired of arguing for it - which the NEP should > > > have prevented being necessary, and I don't want to do again right > > > now, so this will probably be my last email on this thread. > > > > > > > > > > > Other > > > > > > people have viewed the NEP similarly: > > > > > > https://github.com/networkx/networkx/issues/4027 > > > > > > > > > > Of all the packages, it makes sense for NumPy to behave most > > > > > conservatively with depreciations. The NEP suggests allowable > > > > > support periods, but as far as I recall does not enforce > > > > > minimal support. > > > > > > It doesn't *enforce* it, but the recommendation is very clear. It > > > would be good to follow it. > > > > > > > > Stephan Hoyer had a good recommendation on how we can clarify > > > > > the NEP to be easier to intuit. Stephan, shall we make an > > > > > ammendment to the NEP with your idea? > > > > > > > > For reference, here was my proposed revision: > > > > https://github.com/numpy/numpy/pull/14086#issuecomment-649287648 > > > > Specifically, rather than saying "the latest release of NumPy > > > > supports all versions of Python released in the 42 months before > > > > NumPy's release", it says "NumPy will only require versions of > > > > Python that were released more than 24 months ago". In practice, > > > > this works out to the same thing (at least given Python's old 18 > > > > month release cycle). > > > > > > > > This changes the definition of the support window (in a way that > > > > I think is clearer and that works better for infrequent > > > > releases), but there is still the question of how large that > > > > window should be for NumPy. > > > > > > I'm not sure it's clearer, the current NEP has a nice graphic and > > > literally says "a project with a major or minor version release in > > > November 2020 should support Python 3.7 and newer."). However happy > > > to adopt it if it makes others happy - in the end it comes down to > > > the same thing: it's recommended to drop Python 3.6 now. > > > > > > > My personal opinion is that somewhere in the range of 24-36 > > > > months would be appropriate. 
> > > > > > +1 > > > > > > Cheers, > > > Ralf > > > > > > > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at python.org > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matti.picus at gmail.com Tue Nov 3 10:54:24 2020 From: matti.picus at gmail.com (Matti Picus) Date: Tue, 3 Nov 2020 17:54:24 +0200 Subject: [Numpy-discussion] New package to speed up ufunc inner loops Message-ID: Hi. On behalf of Quansight and RTOSHoldings, I would like to introduce "pnumpy", a package to speed up NumPy. https://quansight.github.io/numpy-threading-extensions/stable/index.html What is in it? - use "PyUFunc_ReplaceLoopBySignature" to hook all the UFunc inner loops - When the inner loop is called with a large enough array, chunk the data and perform the iteration via a thread pool - Add a different memory allocator for "ndarray" data (will require an appropriate API from NumPy) - Allow using optimized loops above and beyond what NumPy provides - Allow logging inner loop calls and parameters to learn about the current process and perhaps tune the performance accordingly The first release contains the hooking mechanism and the thread pool, the rest has been prototyped but is not ready for release. The idea behind the package is that a third-party package can try things out and iterate much faster than NumPy. If some of the ideas bear fruit, and do not add an undue maintenance burden to NumPy, the code can be ported to NumPy. I am not sure NumPy wishes to take upon itself the burden of managing threads, but a third-party package may be able to. I am writing to the mailing list both to announce the pre-release under the wrong name, and, in accordance with the fair play rules[1], to request use of the "numpy" name in the package. We had considered many options, in the end would like to propose "pnumpy" (the p is either "parallel" or "performant" or "preliminary", whatever you desire). Matti [1] https://numpy.org/neps/nep-0036-fair-play.html#fair-play-rules From tcaswell at gmail.com Tue Nov 3 13:49:39 2020 From: tcaswell at gmail.com (Thomas Caswell) Date: Tue, 3 Nov 2020 13:49:39 -0500 Subject: [Numpy-discussion] NumPy 1.20.x branch in two weeks In-Reply-To: References: <65b3fe9e-6f6a-4c5e-943e-e5747076eb0d@www.fastmail.com> Message-ID: I am in favor of dropping py36 for np1.20, I think it would be good to lead by example. Similar to pandas, the next Matplotlib release (3.4 targeted for Dec/Jan) will not support py36. Tom On Tue, Nov 3, 2020 at 9:18 AM Mark Harfouche wrote: > Juan made a pretty good argument for keeping 3.6 support in the next > scikit-image release, let me try to paraphrase: > > - Since nobody has made the PR to explicitly drop python 3.6 from the > scikit-image build matrix, we will continue to support it, but if somebody > were to make the PR, I (Juan) would support it. > > As for supporting PyPy: it already exists in the build matrix AFAICT. > Breaking PyPy would be a deliberate action, as opposed to an accidental > byproduct of dropping CPython 3.6. 
> > On Mon, Nov 2, 2020, 13:50 Sebastian Berg > wrote: > >> On Mon, 2020-11-02 at 06:49 -0600, Juan Nunez-Iglesias wrote: >> > I like Ralf's email, and most of all I agree that the existing >> > wording is clearer. >> > >> > My view on the NEP is that it does not mandate dropping support, but >> > encourage it. In my projects I would drop it if I had use for Python >> > 3.7+ features. It so happens that we want to use PEP-593 so we were >> > grateful for NEP-29 giving us "permission" to drop 3.6. >> > >> > I would suggest that 3.6 be dropped immediately if there are any open >> > PRs that would benefit from it, or code cleanups that it would >> > enable. The point of the NEP is to short-circuit discussion about >> > whether it's "worth" dropping 3.6. If it's valuable at all, do it. >> > >> >> Probably the only thing that requires 3.7 in NumPy at this time is the >> module level `__getattr__`, which is used only for deprecations (and to >> make the financial removal slightly more gentle). >> I am not sure if PyPy already has stable support for 3.7 yet? Although >> PyPy is maybe not a big priority. >> >> We don't have to support 3.6 and I don't care if we do. Until this >> discussion my assumption was we would probably drop it. >> >> But, current master is tested against 3.6, so the main work seems >> release related. If Chuck thinks that is no hassle I don't mind if >> NumPy is a bit more conservative than NEP 29. >> >> Or is there a danger of setting a precedent where projects are wrongly >> expected to keep support just because NumPy still has it, so that NumPy >> not being conservative actually helps everyone? >> >> - Sebastian >> >> >> > Thanks all, >> > >> > Juan. >> > >> > On Mon, 2 Nov 2020, at 2:01 AM, Ralf Gommers wrote: >> > > >> > > On Mon, Nov 2, 2020 at 7:47 AM Stephan Hoyer >> > > wrote: >> > > > On Sun, Nov 1, 2020 at 7:47 PM Stefan van der Walt < >> > > > stefanv at berkeley.edu> wrote: >> > > > > On Sun, Nov 1, 2020, at 18:54, Jarrod Millman wrote: >> > > > > > I also misunderstood the purpose of the NEP. I assumed it >> > > > > > was >> > > > > > intended to encourage projects to drop old versions of >> > > > > > Python. >> > > >> > > It was. It is. I think the NEP is very clear on that. Honestly we >> > > should just follow the NEP and drop 3.6 now for both NumPy and >> > > SciPy, I just am tired of arguing for it - which the NEP should >> > > have prevented being necessary, and I don't want to do again right >> > > now, so this will probably be my last email on this thread. >> > > >> > > >> > > > > Other >> > > > > > people have viewed the NEP similarly: >> > > > > > https://github.com/networkx/networkx/issues/4027 >> > > > > >> > > > > Of all the packages, it makes sense for NumPy to behave most >> > > > > conservatively with depreciations. The NEP suggests allowable >> > > > > support periods, but as far as I recall does not enforce >> > > > > minimal support. >> > > >> > > It doesn't *enforce* it, but the recommendation is very clear. It >> > > would be good to follow it. >> > > >> > > > > Stephan Hoyer had a good recommendation on how we can clarify >> > > > > the NEP to be easier to intuit. Stephan, shall we make an >> > > > > ammendment to the NEP with your idea? 
>> > > > >> > > > For reference, here was my proposed revision: >> > > > https://github.com/numpy/numpy/pull/14086#issuecomment-649287648 >> > > > Specifically, rather than saying "the latest release of NumPy >> > > > supports all versions of Python released in the 42 months before >> > > > NumPy's release", it says "NumPy will only require versions of >> > > > Python that were released more than 24 months ago". In practice, >> > > > this works out to the same thing (at least given Python's old 18 >> > > > month release cycle). >> > > > >> > > > This changes the definition of the support window (in a way that >> > > > I think is clearer and that works better for infrequent >> > > > releases), but there is still the question of how large that >> > > > window should be for NumPy. >> > > >> > > I'm not sure it's clearer, the current NEP has a nice graphic and >> > > literally says "a project with a major or minor version release in >> > > November 2020 should support Python 3.7 and newer."). However happy >> > > to adopt it if it makes others happy - in the end it comes down to >> > > the same thing: it's recommended to drop Python 3.6 now. >> > > >> > > > My personal opinion is that somewhere in the range of 24-36 >> > > > months would be appropriate. >> > > >> > > +1 >> > > >> > > Cheers, >> > > Ralf >> > > >> > > >> > > >> > > _______________________________________________ >> > > NumPy-Discussion mailing list >> > > NumPy-Discussion at python.org >> > > https://mail.python.org/mailman/listinfo/numpy-discussion >> > > >> > >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at python.org >> > https://mail.python.org/mailman/listinfo/numpy-discussion >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -- Thomas Caswell tcaswell at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From b.sipocz+numpylist at gmail.com Tue Nov 3 14:58:11 2020 From: b.sipocz+numpylist at gmail.com (Brigitta Sipocz) Date: Tue, 3 Nov 2020 11:58:11 -0800 Subject: [Numpy-discussion] NumPy 1.20.x branch in two weeks In-Reply-To: References: <65b3fe9e-6f6a-4c5e-943e-e5747076eb0d@www.fastmail.com> Message-ID: Hi, For what it's worth, python 3.6 is also dropped for astropy 4.2 (RC1 to be released in the next few days). We haven't yet formally adopted NEP29, but are very close to it peding some word smithing, and no one from the dev team was fighting for keeping support for 3.6. or numpy 1.16. Cheers, Brigitta On Tue, 3 Nov 2020 at 10:53, Thomas Caswell wrote: > I am in favor of dropping py36 for np1.20, I think it would be good to > lead by example. > > Similar to pandas, the next Matplotlib release (3.4 targeted for Dec/Jan) > will not support py36. > > Tom > > > > On Tue, Nov 3, 2020 at 9:18 AM Mark Harfouche > wrote: > >> Juan made a pretty good argument for keeping 3.6 support in the next >> scikit-image release, let me try to paraphrase: >> >> - Since nobody has made the PR to explicitly drop python 3.6 from the >> scikit-image build matrix, we will continue to support it, but if somebody >> were to make the PR, I (Juan) would support it. >> >> As for supporting PyPy: it already exists in the build matrix AFAICT. 
>> Breaking PyPy would be a deliberate action, as opposed to an accidental >> byproduct of dropping CPython 3.6. >> >> On Mon, Nov 2, 2020, 13:50 Sebastian Berg >> wrote: >> >>> On Mon, 2020-11-02 at 06:49 -0600, Juan Nunez-Iglesias wrote: >>> > I like Ralf's email, and most of all I agree that the existing >>> > wording is clearer. >>> > >>> > My view on the NEP is that it does not mandate dropping support, but >>> > encourage it. In my projects I would drop it if I had use for Python >>> > 3.7+ features. It so happens that we want to use PEP-593 so we were >>> > grateful for NEP-29 giving us "permission" to drop 3.6. >>> > >>> > I would suggest that 3.6 be dropped immediately if there are any open >>> > PRs that would benefit from it, or code cleanups that it would >>> > enable. The point of the NEP is to short-circuit discussion about >>> > whether it's "worth" dropping 3.6. If it's valuable at all, do it. >>> > >>> >>> Probably the only thing that requires 3.7 in NumPy at this time is the >>> module level `__getattr__`, which is used only for deprecations (and to >>> make the financial removal slightly more gentle). >>> I am not sure if PyPy already has stable support for 3.7 yet? Although >>> PyPy is maybe not a big priority. >>> >>> We don't have to support 3.6 and I don't care if we do. Until this >>> discussion my assumption was we would probably drop it. >>> >>> But, current master is tested against 3.6, so the main work seems >>> release related. If Chuck thinks that is no hassle I don't mind if >>> NumPy is a bit more conservative than NEP 29. >>> >>> Or is there a danger of setting a precedent where projects are wrongly >>> expected to keep support just because NumPy still has it, so that NumPy >>> not being conservative actually helps everyone? >>> >>> - Sebastian >>> >>> >>> > Thanks all, >>> > >>> > Juan. >>> > >>> > On Mon, 2 Nov 2020, at 2:01 AM, Ralf Gommers wrote: >>> > > >>> > > On Mon, Nov 2, 2020 at 7:47 AM Stephan Hoyer >>> > > wrote: >>> > > > On Sun, Nov 1, 2020 at 7:47 PM Stefan van der Walt < >>> > > > stefanv at berkeley.edu> wrote: >>> > > > > On Sun, Nov 1, 2020, at 18:54, Jarrod Millman wrote: >>> > > > > > I also misunderstood the purpose of the NEP. I assumed it >>> > > > > > was >>> > > > > > intended to encourage projects to drop old versions of >>> > > > > > Python. >>> > > >>> > > It was. It is. I think the NEP is very clear on that. Honestly we >>> > > should just follow the NEP and drop 3.6 now for both NumPy and >>> > > SciPy, I just am tired of arguing for it - which the NEP should >>> > > have prevented being necessary, and I don't want to do again right >>> > > now, so this will probably be my last email on this thread. >>> > > >>> > > >>> > > > > Other >>> > > > > > people have viewed the NEP similarly: >>> > > > > > https://github.com/networkx/networkx/issues/4027 >>> > > > > >>> > > > > Of all the packages, it makes sense for NumPy to behave most >>> > > > > conservatively with depreciations. The NEP suggests allowable >>> > > > > support periods, but as far as I recall does not enforce >>> > > > > minimal support. >>> > > >>> > > It doesn't *enforce* it, but the recommendation is very clear. It >>> > > would be good to follow it. >>> > > >>> > > > > Stephan Hoyer had a good recommendation on how we can clarify >>> > > > > the NEP to be easier to intuit. Stephan, shall we make an >>> > > > > ammendment to the NEP with your idea? 
>>> > > > >>> > > > For reference, here was my proposed revision: >>> > > > https://github.com/numpy/numpy/pull/14086#issuecomment-649287648 >>> > > > Specifically, rather than saying "the latest release of NumPy >>> > > > supports all versions of Python released in the 42 months before >>> > > > NumPy's release", it says "NumPy will only require versions of >>> > > > Python that were released more than 24 months ago". In practice, >>> > > > this works out to the same thing (at least given Python's old 18 >>> > > > month release cycle). >>> > > > >>> > > > This changes the definition of the support window (in a way that >>> > > > I think is clearer and that works better for infrequent >>> > > > releases), but there is still the question of how large that >>> > > > window should be for NumPy. >>> > > >>> > > I'm not sure it's clearer, the current NEP has a nice graphic and >>> > > literally says "a project with a major or minor version release in >>> > > November 2020 should support Python 3.7 and newer."). However happy >>> > > to adopt it if it makes others happy - in the end it comes down to >>> > > the same thing: it's recommended to drop Python 3.6 now. >>> > > >>> > > > My personal opinion is that somewhere in the range of 24-36 >>> > > > months would be appropriate. >>> > > >>> > > +1 >>> > > >>> > > Cheers, >>> > > Ralf >>> > > >>> > > >>> > > >>> > > _______________________________________________ >>> > > NumPy-Discussion mailing list >>> > > NumPy-Discussion at python.org >>> > > https://mail.python.org/mailman/listinfo/numpy-discussion >>> > > >>> > >>> > _______________________________________________ >>> > NumPy-Discussion mailing list >>> > NumPy-Discussion at python.org >>> > https://mail.python.org/mailman/listinfo/numpy-discussion >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at python.org >>> https://mail.python.org/mailman/listinfo/numpy-discussion >>> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > > > -- > Thomas Caswell > tcaswell at gmail.com > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Wed Nov 4 01:49:45 2020 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Wed, 04 Nov 2020 00:49:45 -0600 Subject: [Numpy-discussion] NumPy Development Meeting Wednesday - Triage Focus (US switched times last week) Message-ID: Hi all, Our bi-weekly triage-focused NumPy development meeting is today (Wednesday, November 4th) at 11 am Pacific Time (18:00 UTC). Everyone is invited to join in and edit the work-in-progress meeting topics and notes: https://hackmd.io/68i_JvOYQfy9ERiHgXMPvg I encourage everyone to notify us of issues or PRs that you feel should be prioritized, discussed, or reviewed. Best regards Sebastian -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From ralf.gommers at gmail.com Wed Nov 4 16:43:16 2020 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Wed, 4 Nov 2020 21:43:16 +0000 Subject: [Numpy-discussion] New package to speed up ufunc inner loops In-Reply-To: References: Message-ID: On Tue, Nov 3, 2020 at 3:54 PM Matti Picus wrote: > Hi. On behalf of Quansight and RTOSHoldings, I would like to introduce > "pnumpy", a package to speed up NumPy. > > https://quansight.github.io/numpy-threading-extensions/stable/index.html > > > What is in it? > > - use "PyUFunc_ReplaceLoopBySignature" to hook all the UFunc inner loops > > - When the inner loop is called with a large enough array, chunk the > data and perform the iteration via a thread pool > > - Add a different memory allocator for "ndarray" data (will require an > appropriate API from NumPy) > > - Allow using optimized loops above and beyond what NumPy provides > > - Allow logging inner loop calls and parameters to learn about the > current process and perhaps tune the performance accordingly > > > The first release contains the hooking mechanism and the thread pool, > the rest has been prototyped but is not ready for release. The idea > behind the package is that a third-party package can try things out and > iterate much faster than NumPy. If some of the ideas bear fruit, and do > not add an undue maintenance burden to NumPy, the code can be ported to > NumPy. I am not sure NumPy wishes to take upon itself the burden of > managing threads, but a third-party package may be able to. > > > I am writing to the mailing list both to announce the pre-release under > the wrong name, and, in accordance with the fair play rules[1], to > request use of the "numpy" name in the package. We had considered many > options; in the end we would like to propose "pnumpy" (the p is either > "parallel" or "performant" or "preliminary", whatever you desire). > Thanks Matti! Obviously as another Quansight employee I have a conflict of interest here, so let me just say I wasn't involved with choosing the `pnumpy` name, but I already commented internally that using "numpy" as part of the package name would probably be fine, given that Matti is the main author and the intent is to migrate the useful parts into NumPy itself. Hopefully someone else can comment, maybe Stéfan as the "fair play" NEP author? Cheers, Ralf > > Matti > > > [1] https://numpy.org/neps/nep-0036-fair-play.html#fair-play-rules > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From asmeurer at gmail.com Wed Nov 4 16:47:40 2020 From: asmeurer at gmail.com (Aaron Meurer) Date: Wed, 4 Nov 2020 14:47:40 -0700 Subject: [Numpy-discussion] New package to speed up ufunc inner loops In-Reply-To: References: Message-ID: I hope this isn't too off topic, but this "fair play" NEP reads like it is a set of additional restrictions on the NumPy license, which if it is, would make NumPy no longer open source by the OSI definition. I think the NEP should be much clearer that these are requests but not requirements. Aaron Meurer On Wed, Nov 4, 2020 at 2:44 PM Ralf Gommers wrote: > > > On Tue, Nov 3, 2020 at 3:54 PM Matti Picus wrote: >> >> Hi. 
On behalf of Quansight and RTOSHoldings, I would like to introduce >> "pnumpy", a package to speed up NumPy. >> >> https://quansight.github.io/numpy-threading-extensions/stable/index.html >> >> >> What is in it? >> >> - use "PyUFunc_ReplaceLoopBySignature" to hook all the UFunc inner loops >> >> - When the inner loop is called with a large enough array, chunk the >> data and perform the iteration via a thread pool >> >> - Add a different memory allocator for "ndarray" data (will require an >> appropriate API from NumPy) >> >> - Allow using optimized loops above and beyond what NumPy provides >> >> - Allow logging inner loop calls and parameters to learn about the >> current process and perhaps tune the performance accordingly >> >> >> The first release contains the hooking mechanism and the thread pool, >> the rest has been prototyped but is not ready for release. The idea >> behind the package is that a third-party package can try things out and >> iterate much faster than NumPy. If some of the ideas bear fruit, and do >> not add an undue maintenance burden to NumPy, the code can be ported to >> NumPy. I am not sure NumPy wishes to take upon itself the burden of >> managing threads, but a third-party package may be able to. >> >> >> I am writing to the mailing list both to announce the pre-release under >> the wrong name, and, in accordance with the fair play rules[1], to >> request use of the "numpy" name in the package. We had considered many >> options; in the end we would like to propose "pnumpy" (the p is either >> "parallel" or "performant" or "preliminary", whatever you desire). > > > Thanks Matti! > > Obviously as another Quansight employee I have a conflict of interest here, so let me just say I wasn't involved with choosing the `pnumpy` name, but I already commented internally that using "numpy" as part of the package name would probably be fine, given that Matti is the main author and the intent is to migrate the useful parts into NumPy itself. > > Hopefully someone else can comment, maybe Stéfan as the "fair play" NEP author? > > Cheers, > Ralf > > >> >> >> Matti >> >> >> [1] https://numpy.org/neps/nep-0036-fair-play.html#fair-play-rules >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion From robert.kern at gmail.com Wed Nov 4 17:01:29 2020 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 4 Nov 2020 17:01:29 -0500 Subject: [Numpy-discussion] New package to speed up ufunc inner loops In-Reply-To: References: Message-ID: On Wed, Nov 4, 2020 at 4:49 PM Aaron Meurer wrote: > I hope this isn't too off topic, but this "fair play" NEP reads like > it is a set of additional restrictions on the NumPy license, which if > it is, would make NumPy no longer open source by the OSI definition. I > think the NEP should be much clearer that these are requests but not > requirements. > FWIW, I don't read the NEP like that. Aside from the trademark on the name "NumPy", which _are_ enforceable requirements but are orthogonal to the copyright license, I see enough "request-like" language on everything else. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Wed Nov 4 17:15:37 2020 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Wed, 04 Nov 2020 16:15:37 -0600 Subject: [Numpy-discussion] New package to speed up ufunc inner loops In-Reply-To: References: Message-ID: <615e34a86cd0fd6bedb4c4d053f7c9006e1a28c5.camel@sipsolutions.net> On Tue, 2020-11-03 at 17:54 +0200, Matti Picus wrote: > Hi. On behalf of Quansight and RTOSHoldings, I would like to > introduce > "pnumpy", a package to speed up NumPy. > > https://quansight.github.io/numpy-threading-extensions/stable/index.html > Nice to see these efforts, especially with the intention of possible upstreaming. I hope we can improve the NumPy infrastructure to make experiments like these much easier and more powerful in the future! (And as I mentioned, I had such things in mind with NEP 43, albeit as a possible later extension, not an explicit goal.) I am a bit curious about the actual performance improvements, even without allowing more flexibility on the NumPy side. My gut feeling is that there will be fairly large variations, with sometimes big improvements due to parallelization but often only added overheads, due to NumPy not giving you deep enough control.
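A quick way to get a feeling for this would be to time a large ufunc call before and after enabling the hooks. A minimal sketch follows; note the enabling call is hypothetical here, since I have not checked the exact pnumpy entry point, so consult the package docs for the real name:

import timeit
import numpy as np

a = np.random.rand(10_000_000)

# Baseline: plain NumPy inner loops on a large array.
print("baseline:", timeit.timeit(lambda: np.sin(a), number=10))

import pnumpy
pnumpy.initialize()  # hypothetical enabling call; see the pnumpy docs

# The same timing should now exercise the chunked, threaded inner loops.
print("hooked:  ", timeit.timeit(lambda: np.sin(a), number=10))

Comparing a few dtypes and array sizes this way should show where the thread pool wins and where the chunking overhead dominates.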
As to the name, I don't have an issue with using `pnumpy`, although I was never hugely concerned about it. Initially I thought a longer name might be nicer, but the old(?) accelerated-numpy or fast_numpy_loops doesn't seem that much clearer to me. I guess in the end, I think it's just important to be clear that this type of project patches/modifies NumPy and is not associated with it directly. It seems `pnumpy` is already taken on PyPI, though, with a small number of downloads: https://pypistats.org/packages/pnumpy (although I wonder how many are actual users). Cheers, Sebastian > > What is in it? > > - use "PyUFunc_ReplaceLoopBySignature" to hook all the UFunc inner > loops > > - When the inner loop is called with a large enough array, chunk the > data and perform the iteration via a thread pool > > - Add a different memory allocator for "ndarray" data (will require > an > appropriate API from NumPy) > > - Allow using optimized loops above and beyond what NumPy provides > > - Allow logging inner loop calls and parameters to learn about the > current process and perhaps tune the performance accordingly > > > The first release contains the hooking mechanism and the thread > pool, > the rest has been prototyped but is not ready for release. The idea > behind the package is that a third-party package can try things out > and > iterate much faster than NumPy. If some of the ideas bear fruit, and > do > not add an undue maintenance burden to NumPy, the code can be ported > to > NumPy. I am not sure NumPy wishes to take upon itself the burden of > managing threads, but a third-party package may be able to. > > > I am writing to the mailing list both to announce the pre-release > under > the wrong name, and, in accordance with the fair play rules[1], to > request use of the "numpy" name in the package. We had considered > many > options; in the end we would like to propose "pnumpy" (the p is either > "parallel" or "performant" or "preliminary", whatever you desire). > > > Matti > > > [1] https://numpy.org/neps/nep-0036-fair-play.html#fair-play-rules > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From stefanv at berkeley.edu Wed Nov 4 17:20:22 2020 From: stefanv at berkeley.edu (Stefan van der Walt) Date: Wed, 04 Nov 2020 14:20:22 -0800 Subject: [Numpy-discussion] New package to speed up ufunc inner loops In-Reply-To: References: Message-ID: On Wed, Nov 4, 2020, at 13:47, Aaron Meurer wrote: > I hope this isn't too off topic, but this "fair play" NEP reads like > it is a set of additional restrictions on the NumPy license, which if > it is, would make NumPy no longer open source by the OSI definition. I > think the NEP should be much clearer that these are requests but not > requirements. Specifically, the NEP is worded as follows: """ This document aims to define a minimal set of rules that, when followed, will be considered good-faith efforts in line with the expectations of the NumPy developers. ... When in doubt, please talk to us first. We may suggest an alternative; at minimum, we'll be prepared. """ There is no language of forced restriction. The heading in question is "Do not reuse the NumPy name for projects not developed by the NumPy community". Matti is a member of our community, and while the project may be sponsored by others, he is doing exactly what the NEP recommends: discussing the issue with the community. Community members should weigh in if they see an issue with the naming. I don't think this is a particularly good name for a package (not easy to pronounce, does not indicate functionality of the package), but I don't personally have an issue with it either. Best regards, Stéfan From asmeurer at gmail.com Wed Nov 4 17:54:18 2020 From: asmeurer at gmail.com (Aaron Meurer) Date: Wed, 4 Nov 2020 15:54:18 -0700 Subject: [Numpy-discussion] New package to speed up ufunc inner loops In-Reply-To: References: Message-ID: On Wed, Nov 4, 2020 at 3:02 PM Robert Kern wrote: > > On Wed, Nov 4, 2020 at 4:49 PM Aaron Meurer wrote: >> >> I hope this isn't too off topic, but this "fair play" NEP reads like >> it is a set of additional restrictions on the NumPy license, which if >> it is, would make NumPy no longer open source by the OSI definition. I >> think the NEP should be much clearer that these are requests but not >> requirements. > > > FWIW, I don't read the NEP like that. Aside from the trademark on the name "NumPy", which _are_ enforceable requirements but are orthogonal to the copyright license, I see enough "request-like" language on everything else. To be clear, I don't read it like that either. But I also implicitly understand that this is the intention of the document, because I know that NumPy wouldn't actually place restrictions like these on its license. My point is just that the document ought to be clearer about this, as I can easily see someone misinterpreting it, especially if they aren't close enough to the community that they would implicitly understand that it is only a set of guidelines. > There is no language of forced restriction. The language you quoted reads ambiguously to me. It isn't forced, but it also isn't obviously nonforced. "Please talk to us first" is the sort of language I would expect to see for software that is commercially licensed and can only be used with permission. All the bullet points say "do not", which sounds forced to me. And the trademark thing makes it even more confusing because even if you read the rest as "only guidelines", it isn't clear if this is somehow an exception. 
Again, *I* understand the purpose of this document, but I think the way it is currently written it could easily be misinterpreted by someone else. Aaron Meurer > > -- > Robert Kern > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion From stefanv at berkeley.edu Wed Nov 4 18:27:09 2020 From: stefanv at berkeley.edu (Stefan van der Walt) Date: Wed, 04 Nov 2020 15:27:09 -0800 Subject: [Numpy-discussion] New package to speed up ufunc inner loops In-Reply-To: References: Message-ID: <31e9ec2c-2c6a-4757-910f-a68f2bf328f3@www.fastmail.com> On Wed, Nov 4, 2020, at 14:54, Aaron Meurer wrote: > Again, *I* understand the purpose of this document, but I think the > way it is currently written it could easily be misinterpreted by > someone else. Misinterpreted in what way? That they would think we have an ability to enforce the guidelines? We *are* trying to encourage certain behavior here. If they read it and, out of abundant caution, reach out to us, that's a fine outcome. What negative outcomes do you foresee? Stéfan From robert.kern at gmail.com Wed Nov 4 18:29:31 2020 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 4 Nov 2020 18:29:31 -0500 Subject: [Numpy-discussion] New package to speed up ufunc inner loops In-Reply-To: References: Message-ID: On Wed, Nov 4, 2020 at 5:55 PM Aaron Meurer wrote: > On Wed, Nov 4, 2020 at 3:02 PM Robert Kern wrote: > > > > On Wed, Nov 4, 2020 at 4:49 PM Aaron Meurer wrote: > >> > >> I hope this isn't too off topic, but this "fair play" NEP reads like > >> it is a set of additional restrictions on the NumPy license, which if > >> it is, would make NumPy no longer open source by the OSI definition. I > >> think the NEP should be much clearer that these are requests but not > >> requirements. > > > > > > FWIW, I don't read the NEP like that. Aside from the trademark on the > name "NumPy", which _are_ enforceable requirements but are orthogonal to > the copyright license, I see enough "request-like" language on everything > else. > > To be clear, I don't read it like that either. But I also implicitly > understand that this is the intention of the document, because I know > that NumPy wouldn't actually place restrictions like these on its > license. My point is just that the document ought to be clearer about > this, as I can easily see someone misinterpreting it, especially if > they aren't close enough to the community that they would implicitly > understand that it is only a set of guidelines. > > > There is no language of forced restriction. > > The language you quoted reads ambiguously to me. It isn't forced, but > it also isn't obviously nonforced. "Please talk to us first" is the > sort of language I would expect to see for software that is > commercially licensed and can only be used with permission. All the > bullet points say "do not", which sounds forced to me. And the > trademark thing makes it even more confusing because even if you read > the rest as "only guidelines", it isn't clear if this is somehow an > exception. > If you pick out an individual sentence and consider it in isolation, sure. But there's a significant amount of context in the Abstract, Motivation, and Scope sections that preface the rules. And the discussion of many of the rules explicitly covers ways to "break" the rules if you have to. We use "rule" language in many contexts besides legally-enforceable contracts and licenses. 
Again, *I* understand the purpose of this document, but I think the > way it is currently written it could easily be misinterpreted by > someone else. > I'm willing to wait for someone to actually misinterpret it. That's not to say that there isn't clearer language that could be drafted. The NEP is still in Draft stage. But if you think it could be clearer, please propose specific edits to the draft. Like with unclear documentation, it's the person who finds the current docs insufficient/confusing/unclear that is in the best position to recommend the language that would have helped them. Collaboration helps. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From asmeurer at gmail.com Wed Nov 4 19:21:20 2020 From: asmeurer at gmail.com (Aaron Meurer) Date: Wed, 4 Nov 2020 17:21:20 -0700 Subject: [Numpy-discussion] New package to speed up ufunc inner loops In-Reply-To: References: Message-ID: > Misinterpreted in what way? That they would think we have an ability to enforce the guidelines? We *are* trying to encourage certain behavior here. If they read it and, our of abundant caution reach out to us, that's a fine outcome. > What negative outcomes do you foresee? That it is a legal requirement, as part of the license to use NumPy. The negative outcome is that someone reads the document and believes NumPy to not actually be open source software. > That's not to say that there isn't clearer language that could be drafted. The NEP is still in Draft stage. But if you think it could be clearer, please propose specific edits to the draft. Like with unclear documentation, it's the person who finds the current docs insufficient/confusing/unclear that is in the best position to recommend the language that would have helped them. Collaboration helps. I disagree. The best person to write documentation is the person who actually understands the package. I already noted that I don't actually understand the actual situation with the trademark, for instance. I don't really understand why there is pushback for making NEP clearer. Also "like with unclear documentation", if someone says that documentation is unclear, you should take their word for it that it actually is, and improve it, rather than somehow trying to argue that they actually aren't confused. But as I noted, this is already off topic for the original discussion here, and since there's apparently no interest in improving the NEP wording, I'll drop it. Aaron Meurer From stefanv at berkeley.edu Wed Nov 4 19:29:41 2020 From: stefanv at berkeley.edu (Stefan van der Walt) Date: Wed, 04 Nov 2020 16:29:41 -0800 Subject: [Numpy-discussion] New package to speed up ufunc inner loops In-Reply-To: References: Message-ID: <7932850c-f115-4b57-b7e2-251e10e5a9dc@www.fastmail.com> On Wed, Nov 4, 2020, at 16:21, Aaron Meurer wrote: > But as I noted, this is already off topic for the original discussion > here, and since there's apparently no interest in improving the NEP > wording, I'll drop it. I was trying to understand where, specifically, the language falls short, and what to do about improving it. Perhaps a sentence making it clear that this is not a licensing issue will assuage your concerns? If not, please help me understand where statements are overly strong, unclear, or insufficient in coverage. 
Best regards, Stéfan From robert.kern at gmail.com Wed Nov 4 19:42:17 2020 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 4 Nov 2020 19:42:17 -0500 Subject: [Numpy-discussion] New package to speed up ufunc inner loops In-Reply-To: References: Message-ID: On Wed, Nov 4, 2020 at 7:22 PM Aaron Meurer wrote: > > > That's not to say that there isn't clearer language that could be > drafted. The NEP is still in Draft stage. But if you think it could be > clearer, please propose specific edits to the draft. Like with unclear > documentation, it's the person who finds the current docs > insufficient/confusing/unclear that is in the best position to recommend > the language that would have helped them. Collaboration helps. > > I disagree. The best person to write documentation is the person who > actually understands the package. I already noted that I don't > actually understand the actual situation with the trademark, for > instance. > Rather, I meant that the best person to fix confusing language is the person who was confused, after consulting with the authors/experts to come to a consensus about what was intended. > I don't really understand why there is pushback for making NEP > clearer. Also "like with unclear documentation", if someone says that > documentation is unclear, you should take their word for it that it > actually is, and improve it, rather than somehow trying to argue that > they actually aren't confused. > I'm not. I'm saying that I don't know how to make it more clear to those people because I'm not experiencing it like they are. The things I could think to add are the same kinds of things that were already stated explicitly in the Abstract, Motivation, and Scope. It seems like Stefan is in the same boat. Authors need editors, but the editor can't just say "rewrite!" I don't know what kind of assumptions and context this hypothetical reader is bringing to this reading that are leading to confusion. Sometimes it's clear, but not for me here (and, more relevantly, not for Stefan). Do you think this needs a complete revamp? Or just an additional sentence to explicitly state that this does not add additional legal restrictions to the copyright license? -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From tyler.je.reddy at gmail.com Wed Nov 4 22:25:24 2020 From: tyler.je.reddy at gmail.com (Tyler Reddy) Date: Wed, 4 Nov 2020 20:25:24 -0700 Subject: [Numpy-discussion] ANN: SciPy 1.5.4 Message-ID: Hi all, On behalf of the SciPy development team I'm pleased to announce the release of SciPy 1.5.4, which is a bug-fix release that includes Python 3.9 wheels and a more complete fix for build issues on Xcode 12. Sources and binary wheels can be found at: https://pypi.org/project/scipy/ and at: https://github.com/scipy/scipy/releases/tag/v1.5.4 One of a few ways to install this release with pip: pip install scipy==1.5.4 ===================== SciPy 1.5.4 Release Notes ===================== SciPy 1.5.4 is a bug-fix release with no new features compared to 1.5.3. Importantly, wheels are now available for Python 3.9 and a more complete fix has been applied for issues building with Xcode 12. Authors ====== * Peter Bell * CJ Carey * Andrew McCluskey + * Andrew Nelson * Tyler Reddy * Eli Rykoff + * Ian Thomas + A total of 7 people contributed to this release. People with a "+" by their names contributed a patch for the first time. This list of names is automatically generated, and may not be fully complete. 
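For anyone who wants to check a downloaded artifact against the checksum lists at the end of this announcement, a minimal sketch (the filename is just an example):

import hashlib

# Compute the SHA256 of a downloaded file and compare it by eye with the
# matching value in the "Checksums" section below.
with open("scipy-1.5.4.tar.gz", "rb") as f:
    print(hashlib.sha256(f.read()).hexdigest())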
Issues closed for 1.5.4 ------------------------------- * `#12763 `__: ndimage.fourier_ellipsoid segmentation fault * `#12789 `__: TestConvolve2d.test_large_array failing on Windows ILP64 CI job * `#12857 `__: sparse A[0,:] = ndarray is ok, A[:,0] = ndarray ValueError from... * `#12860 `__: BUG: Build failure with Xcode 12 * `#12935 `__: Failure to build with Python 3.9.0 on macOS * `#12966 `__: MAINT: lint_diff.py on some backport PRs * `#12988 `__: BUG: Highly multi-dimensional \`gaussian_kde\` giving \`-inf\`... Pull requests for 1.5.4 ------------------------------ * `#12790 `__: TST: Skip TestConvolve2d.test_large_array if not enough memory * `#12851 `__: BUG: sparse: fix inner indexed assignment of a 1d array * `#12875 `__: BUG: segfault in ndimage.fourier_ellipsoid with length-1 dims * `#12937 `__: CI: macOS3.9 testing * `#12957 `__: MAINT: fixes XCode 12/ python 3.9.0 build for 1.5.x maint branch * `#12959 `__: CI: add Windows Python 3.9 to CI * `#12974 `__: MAINT: Run lint_diff.py against the merge target and only for... * `#12978 `__: DOC: next_fast_len output doesn't match docstring * `#12979 `__: BUG: fft.next_fast_len should accept keyword arguments * `#12989 `__: BUG: improved the stability of kde for highly (1000s) multi-dimension... * `#13017 `__: BUG: Add explicit cast to _tmp sum. * `#13022 `__: TST: xfail test_maxiter_worsening() Checksums ========= MD5 ~~~ 09a446e10033c3132f1f257e3f4d9735 scipy-1.5.4-cp36-cp36m-macosx_10_9_x86_64.whl 25e58fde2fd4eb6c7717719db85e368b scipy-1.5.4-cp36-cp36m-manylinux1_i686.whl 2c9705cd57788ad79ea0c1015208f41f scipy-1.5.4-cp36-cp36m-manylinux1_x86_64.whl d0fb84f3ff45e4149698fbc662ac4d47 scipy-1.5.4-cp36-cp36m-manylinux2014_aarch64.whl f94f0e274cd2960ecb2d8751632e098c scipy-1.5.4-cp36-cp36m-win32.whl f56f4d5b67fccc49fb64331c28bdf7d1 scipy-1.5.4-cp36-cp36m-win_amd64.whl 33e0843f8619b78547866579134a733b scipy-1.5.4-cp37-cp37m-macosx_10_9_x86_64.whl 6720a406d82bd08c4370b665d5eddeb9 scipy-1.5.4-cp37-cp37m-manylinux1_i686.whl eafc3bc8a12d41cb348c73b54ad25ad5 scipy-1.5.4-cp37-cp37m-manylinux1_x86_64.whl 1174418ae0614d621acdb49faeaadcb8 scipy-1.5.4-cp37-cp37m-manylinux2014_aarch64.whl 5ca53c5cd6828498c0a41c3ae747a34b scipy-1.5.4-cp37-cp37m-win32.whl cdb91a7db9cf79b7446680f8d106aabc scipy-1.5.4-cp37-cp37m-win_amd64.whl 02a29a4eec9c61c30aef7439138fe1b3 scipy-1.5.4-cp38-cp38-macosx_10_9_x86_64.whl ce8e02167763493374c4bea807139a1b scipy-1.5.4-cp38-cp38-manylinux1_i686.whl 65ec027bfa6bed805dac62744b45c693 scipy-1.5.4-cp38-cp38-manylinux1_x86_64.whl c358b4b332cc9dbcd1eadc229d8b019e scipy-1.5.4-cp38-cp38-manylinux2014_aarch64.whl 492ec3bfe082229076a83d74cfa51d7e scipy-1.5.4-cp38-cp38-win32.whl d5d12211502429f3bc3074b12ca1f541 scipy-1.5.4-cp38-cp38-win_amd64.whl da25e7ac777e8b1b6cd7f117f163e6d2 scipy-1.5.4-cp39-cp39-macosx_10_9_x86_64.whl 12275e3578eb17065081d83d329d18db scipy-1.5.4-cp39-cp39-manylinux1_i686.whl 6778d670f75f536921c3d38e44517280 scipy-1.5.4-cp39-cp39-manylinux1_x86_64.whl efda61c74b29ffe714b6b842ec369a19 scipy-1.5.4-cp39-cp39-manylinux2014_aarch64.whl 107204c14328df879c5fc941e7829389 scipy-1.5.4-cp39-cp39-win32.whl ed6970f7538d38dd91a42950bd6843b7 scipy-1.5.4-cp39-cp39-win_amd64.whl 293401ee7ac354a2f2313373b497f40e scipy-1.5.4.tar.gz d446ec7a6b0bc44484389ab7589eccf5 scipy-1.5.4.tar.xz 47d0dabdc684475bc2aac7e8db9eea6f scipy-1.5.4.zip SHA256 ~~~~~~ 4f12d13ffbc16e988fa40809cbbd7a8b45bc05ff6ea0ba8e3e41f6f4db3a9e47 scipy-1.5.4-cp36-cp36m-macosx_10_9_x86_64.whl a254b98dbcc744c723a838c03b74a8a34c0558c9ac5c86d5561703362231107d 
scipy-1.5.4-cp36-cp36m-manylinux1_i686.whl 368c0f69f93186309e1b4beb8e26d51dd6f5010b79264c0f1e9ca00cd92ea8c9 scipy-1.5.4-cp36-cp36m-manylinux1_x86_64.whl 4598cf03136067000855d6b44d7a1f4f46994164bcd450fb2c3d481afc25dd06 scipy-1.5.4-cp36-cp36m-manylinux2014_aarch64.whl e98d49a5717369d8241d6cf33ecb0ca72deee392414118198a8e5b4c35c56340 scipy-1.5.4-cp36-cp36m-win32.whl 65923bc3809524e46fb7eb4d6346552cbb6a1ffc41be748535aa502a2e3d3389 scipy-1.5.4-cp36-cp36m-win_amd64.whl 9ad4fcddcbf5dc67619379782e6aeef41218a79e17979aaed01ed099876c0e62 scipy-1.5.4-cp37-cp37m-macosx_10_9_x86_64.whl f87b39f4d69cf7d7529d7b1098cb712033b17ea7714aed831b95628f483fd012 scipy-1.5.4-cp37-cp37m-manylinux1_i686.whl 25b241034215247481f53355e05f9e25462682b13bd9191359075682adcd9554 scipy-1.5.4-cp37-cp37m-manylinux1_x86_64.whl fa789583fc94a7689b45834453fec095245c7e69c58561dc159b5d5277057e4c scipy-1.5.4-cp37-cp37m-manylinux2014_aarch64.whl d6d25c41a009e3c6b7e757338948d0076ee1dd1770d1c09ec131f11946883c54 scipy-1.5.4-cp37-cp37m-win32.whl 2c872de0c69ed20fb1a9b9cf6f77298b04a26f0b8720a5457be08be254366c6e scipy-1.5.4-cp37-cp37m-win_amd64.whl e360cb2299028d0b0d0f65a5c5e51fc16a335f1603aa2357c25766c8dab56938 scipy-1.5.4-cp38-cp38-macosx_10_9_x86_64.whl 3397c129b479846d7eaa18f999369a24322d008fac0782e7828fa567358c36ce scipy-1.5.4-cp38-cp38-manylinux1_i686.whl 168c45c0c32e23f613db7c9e4e780bc61982d71dcd406ead746c7c7c2f2004ce scipy-1.5.4-cp38-cp38-manylinux1_x86_64.whl 213bc59191da2f479984ad4ec39406bf949a99aba70e9237b916ce7547b6ef42 scipy-1.5.4-cp38-cp38-manylinux2014_aarch64.whl 634568a3018bc16a83cda28d4f7aed0d803dd5618facb36e977e53b2df868443 scipy-1.5.4-cp38-cp38-win32.whl b03c4338d6d3d299e8ca494194c0ae4f611548da59e3c038813f1a43976cb437 scipy-1.5.4-cp38-cp38-win_amd64.whl 3d5db5d815370c28d938cf9b0809dade4acf7aba57eaf7ef733bfedc9b2474c4 scipy-1.5.4-cp39-cp39-macosx_10_9_x86_64.whl 6b0ceb23560f46dd236a8ad4378fc40bad1783e997604ba845e131d6c680963e scipy-1.5.4-cp39-cp39-manylinux1_i686.whl ed572470af2438b526ea574ff8f05e7f39b44ac37f712105e57fc4d53a6fb660 scipy-1.5.4-cp39-cp39-manylinux1_x86_64.whl 8c8d6ca19c8497344b810b0b0344f8375af5f6bb9c98bd42e33f747417ab3f57 scipy-1.5.4-cp39-cp39-manylinux2014_aarch64.whl d84cadd7d7998433334c99fa55bcba0d8b4aeff0edb123b2a1dfcface538e474 scipy-1.5.4-cp39-cp39-win32.whl cc1f78ebc982cd0602c9a7615d878396bec94908db67d4ecddca864d049112f2 scipy-1.5.4-cp39-cp39-win_amd64.whl 4a453d5e5689de62e5d38edf40af3f17560bfd63c9c5bd228c18c1f99afa155b scipy-1.5.4.tar.gz 5c87347bfe2db6e23d391aa226584f6b280248c0ca71e08f26f1faf9d7a76bc9 scipy-1.5.4.tar.xz e0bcc10c133a151937550bb42301c56439d34098b1b8f9dd18c5919d604edd37 scipy-1.5.4.zip -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Thu Nov 5 10:21:44 2020 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Thu, 05 Nov 2020 09:21:44 -0600 Subject: [Numpy-discussion] Officially drop Python 3.6 from NumPy 1.20 (was: NumPy 1.20.x branch in two weeks) In-Reply-To: References: <65b3fe9e-6f6a-4c5e-943e-e5747076eb0d@www.fastmail.com> Message-ID: <1e974b9d8d746dac2a79ecb507ea180788e04404.camel@sipsolutions.net> Hi all, just to note: We discussed this yesterday briefly and decided to drop official support for 3.6 in the 1.20 release. We never had ambition to support 1.20 and there seems advantage in dropping it, if mainly for clarity and consistency with many other projects. If you disagree with this decision, please just bring it up so we can reconsider. 
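For downstream projects that want to mirror the change, the usual mechanism is the `python_requires` package metadata. A minimal sketch, assuming a setuptools-based setup.py (the package name is hypothetical):

from setuptools import setup

setup(
    name="example-project",    # hypothetical package name
    version="1.0",
    python_requires=">=3.7",   # mirrors NumPy 1.20 dropping Python 3.6
    install_requires=["numpy"],
)

With that metadata in place, pip on Python 3.6 will fall back to the last compatible release rather than installing one that no longer supports it.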
Cheers, Sebastian PS: We may keep testing on 3.6 for the moment, at least for PyPy for technical reasons. On Tue, 2020-11-03 at 11:58 -0800, Brigitta Sipocz wrote: > Hi, > > For what it's worth, python 3.6 is also dropped for astropy 4.2 (RC1 > to be > released in the next few days). We haven't yet formally adopted > NEP29, but > are very close to it peding some word smithing, and no one from the > dev > team was fighting for keeping support for 3.6. or numpy 1.16. > > Cheers, > Brigitta > > On Tue, 3 Nov 2020 at 10:53, Thomas Caswell > wrote: > > > I am in favor of dropping py36 for np1.20, I think it would be good > > to > > lead by example. > > > > Similar to pandas, the next Matplotlib release (3.4 targeted for > > Dec/Jan) > > will not support py36. > > > > Tom > > > > > > > > On Tue, Nov 3, 2020 at 9:18 AM Mark Harfouche < > > mark.harfouche at gmail.com> > > wrote: > > > > > Juan made a pretty good argument for keeping 3.6 support in the > > > next > > > scikit-image release, let me try to paraphrase: > > > > > > - Since nobody has made the PR to explicitly drop python 3.6 from > > > the > > > scikit-image build matrix, we will continue to support it, but if > > > somebody > > > were to make the PR, I (Juan) would support it. > > > > > > As for supporting PyPy: it already exists in the build matrix > > > AFAICT. > > > Breaking PyPy would be a deliberate action, as opposed to an > > > accidental > > > byproduct of dropping CPython 3.6. > > > > > > On Mon, Nov 2, 2020, 13:50 Sebastian Berg < > > > sebastian at sipsolutions.net> > > > wrote: > > > > > > > On Mon, 2020-11-02 at 06:49 -0600, Juan Nunez-Iglesias wrote: > > > > > I like Ralf's email, and most of all I agree that the > > > > > existing > > > > > wording is clearer. > > > > > > > > > > My view on the NEP is that it does not mandate dropping > > > > > support, but > > > > > encourage it. In my projects I would drop it if I had use for > > > > > Python > > > > > 3.7+ features. It so happens that we want to use PEP-593 so > > > > > we were > > > > > grateful for NEP-29 giving us "permission" to drop 3.6. > > > > > > > > > > I would suggest that 3.6 be dropped immediately if there are > > > > > any open > > > > > PRs that would benefit from it, or code cleanups that it > > > > > would > > > > > enable. The point of the NEP is to short-circuit discussion > > > > > about > > > > > whether it's "worth" dropping 3.6. If it's valuable at all, > > > > > do it. > > > > > > > > > > > > > Probably the only thing that requires 3.7 in NumPy at this time > > > > is the > > > > module level `__getattr__`, which is used only for deprecations > > > > (and to > > > > make the financial removal slightly more gentle). > > > > I am not sure if PyPy already has stable support for 3.7 yet? > > > > Although > > > > PyPy is maybe not a big priority. > > > > > > > > We don't have to support 3.6 and I don't care if we do. Until > > > > this > > > > discussion my assumption was we would probably drop it. > > > > > > > > But, current master is tested against 3.6, so the main work > > > > seems > > > > release related. If Chuck thinks that is no hassle I don't mind > > > > if > > > > NumPy is a bit more conservative than NEP 29. > > > > > > > > Or is there a danger of setting a precedent where projects are > > > > wrongly > > > > expected to keep support just because NumPy still has it, so > > > > that NumPy > > > > not being conservative actually helps everyone? > > > > > > > > - Sebastian > > > > > > > > > > > > > Thanks all, > > > > > > > > > > Juan. 
> > > > > > > > > > On Mon, 2 Nov 2020, at 2:01 AM, Ralf Gommers wrote: > > > > > > On Mon, Nov 2, 2020 at 7:47 AM Stephan Hoyer < > > > > > > shoyer at gmail.com> > > > > > > wrote: > > > > > > > On Sun, Nov 1, 2020 at 7:47 PM Stefan van der Walt < > > > > > > > stefanv at berkeley.edu> wrote: > > > > > > > > On Sun, Nov 1, 2020, at 18:54, Jarrod Millman wrote: > > > > > > > > > I also misunderstood the purpose of the NEP. I > > > > > > > > > assumed it > > > > > > > > > was > > > > > > > > > intended to encourage projects to drop old versions > > > > > > > > > of > > > > > > > > > Python. > > > > > > > > > > > > It was. It is. I think the NEP is very clear on that. > > > > > > Honestly we > > > > > > should just follow the NEP and drop 3.6 now for both NumPy > > > > > > and > > > > > > SciPy, I just am tired of arguing for it - which the NEP > > > > > > should > > > > > > have prevented being necessary, and I don't want to do > > > > > > again right > > > > > > now, so this will probably be my last email on this thread. > > > > > > > > > > > > > > > > > > > > Other > > > > > > > > > people have viewed the NEP similarly: > > > > > > > > > https://github.com/networkx/networkx/issues/4027 > > > > > > > > > > > > > > > > Of all the packages, it makes sense for NumPy to behave > > > > > > > > most > > > > > > > > conservatively with depreciations. The NEP suggests > > > > > > > > allowable > > > > > > > > support periods, but as far as I recall does not > > > > > > > > enforce > > > > > > > > minimal support. > > > > > > > > > > > > It doesn't *enforce* it, but the recommendation is very > > > > > > clear. It > > > > > > would be good to follow it. > > > > > > > > > > > > > > Stephan Hoyer had a good recommendation on how we can > > > > > > > > clarify > > > > > > > > the NEP to be easier to intuit. Stephan, shall we make > > > > > > > > an > > > > > > > > ammendment to the NEP with your idea? > > > > > > > > > > > > > > For reference, here was my proposed revision: > > > > > > > https://github.com/numpy/numpy/pull/14086#issuecomment-649287648 > > > > > > > Specifically, rather than saying "the latest release of > > > > > > > NumPy > > > > > > > supports all versions of Python released in the 42 months > > > > > > > before > > > > > > > NumPy's release", it says "NumPy will only require > > > > > > > versions of > > > > > > > Python that were released more than 24 months ago". In > > > > > > > practice, > > > > > > > this works out to the same thing (at least given Python's > > > > > > > old 18 > > > > > > > month release cycle). > > > > > > > > > > > > > > This changes the definition of the support window (in a > > > > > > > way that > > > > > > > I think is clearer and that works better for infrequent > > > > > > > releases), but there is still the question of how large > > > > > > > that > > > > > > > window should be for NumPy. > > > > > > > > > > > > I'm not sure it's clearer, the current NEP has a nice > > > > > > graphic and > > > > > > literally says "a project with a major or minor version > > > > > > release in > > > > > > November 2020 should support Python 3.7 and newer."). > > > > > > However happy > > > > > > to adopt it if it makes others happy - in the end it comes > > > > > > down to > > > > > > the same thing: it's recommended to drop Python 3.6 now. > > > > > > > > > > > > > My personal opinion is that somewhere in the range of 24- > > > > > > > 36 > > > > > > > months would be appropriate. 
> > > > > > > > > > > > +1 > > > > > > > > > > > > Cheers, > > > > > > Ralf > > > > > > > > > > > > > > > > > > > > > > > > _______________________________________________ > > > > > > NumPy-Discussion mailing list > > > > > > NumPy-Discussion at python.org > > > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > > > > > > > > > > _______________________________________________ > > > > > NumPy-Discussion mailing list > > > > > NumPy-Discussion at python.org > > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > > _______________________________________________ > > > > NumPy-Discussion mailing list > > > > NumPy-Discussion at python.org > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at python.org > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > -- > > Thomas Caswell > > tcaswell at gmail.com > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From sebastian at sipsolutions.net Thu Nov 5 11:55:08 2020 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Thu, 05 Nov 2020 10:55:08 -0600 Subject: [Numpy-discussion] Add sliding_window_view method to numpy In-Reply-To: References: Message-ID: Hi all, just a brief note that I merged this proposal: https://github.com/numpy/numpy/pull/17394 adding `np.sliding_window_view` into the 1.20 release of NumPy. There was only one public API change, and that is that the `shape` argument is now called `window_shape`. This is still a good time for feedback in case you have a better idea e.g. for the function or parameter names. Cheers, Sebastian On Mon, 2020-10-12 at 08:39 +0000, Zimmermann Klaus wrote: > Hello, > > I would like to draw the attention of this list to PR #17394 [1] that > adds the implementation of a sliding window view to numpy. > > Having a sliding window view in numpy is a longstanding open issue > (cf > #7753 [2] from 2016). A brief summary of the discussions surrounding > it > can be found in the description of the PR. > > This PR implements a sliding window view based on stride tricks. > Following the discussion in issue #7753, a first implementation was > provided by Fanjin Zeng in PR #10771. After some discussion, that PR > stalled and I picked up the issue in the present PR #17394. It is > based > on the first implementation, but follows the changed API as suggested > by > Eric Wieser. > > Code reviews have been provided by Bas van Beek, Stephen Hoyer, and > Eric > Wieser. Sebastian Berg added the "62 - Python API" label. > > > Do you think this is suitable for inclusion in numpy? > > Do you consider the PR ready? > > Do you have suggestions or requests? > > > Thanks for your time and consideration! 
> Klaus > > > [1] https://github.com/numpy/numpy/pull/17394 > [2] https://github.com/numpy/numpy/issues/7753 > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From ralf.gommers at gmail.com Thu Nov 5 14:15:17 2020 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Thu, 5 Nov 2020 19:15:17 +0000 Subject: [Numpy-discussion] Add sliding_window_view method to numpy In-Reply-To: References: Message-ID: On Thu, Nov 5, 2020 at 4:56 PM Sebastian Berg wrote: > Hi all, > > just a brief note that I merged this proposal: > > https://github.com/numpy/numpy/pull/17394 > > adding `np.sliding_window_view` into the 1.20 release of NumPy. > > There was only one public API change, and that is that the `shape` > argument is now called `window_shape`. > > This is still a good time for feedback in case you have a better idea > e.g. for the function or parameter names. > The old PR had this in the lib.stride_tricks namespace. Seeing it in the main namespace is unexpected and likely will lead to issues/questions, given that such an overlapping view is going to behave in ways the average user will be surprised by. It may also lead to requests for other array/tensor libraries to implement this. I don't see any discussion on this in PR 17394, it looks like a decision by the PR author that no one commented on - reconsider that? Cheers, Ralf > > Cheers, > > Sebastian > > > > On Mon, 2020-10-12 at 08:39 +0000, Zimmermann Klaus wrote: > > Hello, > > > > I would like to draw the attention of this list to PR #17394 [1] that > > adds the implementation of a sliding window view to numpy. > > > > Having a sliding window view in numpy is a longstanding open issue > > (cf > > #7753 [2] from 2016). A brief summary of the discussions surrounding > > it > > can be found in the description of the PR. > > > > This PR implements a sliding window view based on stride tricks. > > Following the discussion in issue #7753, a first implementation was > > provided by Fanjin Zeng in PR #10771. After some discussion, that PR > > stalled and I picked up the issue in the present PR #17394. It is > > based > > on the first implementation, but follows the changed API as suggested > > by > > Eric Wieser. > > > > Code reviews have been provided by Bas van Beek, Stephan Hoyer, and > > Eric > > Wieser. Sebastian Berg added the "62 - Python API" label. > > > > > > Do you think this is suitable for inclusion in numpy? > > > > Do you consider the PR ready? > > > > Do you have suggestions or requests? > > > > > > Thanks for your time and consideration! > > Klaus > > > > > > [1] https://github.com/numpy/numpy/pull/17394 > > [2] https://github.com/numpy/numpy/issues/7753 > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From noamraph at gmail.com Thu Nov 5 15:12:24 2020 From: noamraph at gmail.com (Noam Yorav-Raphael) Date: Thu, 5 Nov 2020 22:12:24 +0200 Subject: [Numpy-discussion] datetime64: Remove deprecation warning when constructing with timezone Message-ID: Hi, I suggest removing the deprecation warning when constructing a datetime64 with a timezone. For example, this is the current behavior: >>> np.datetime64('2020-11-05 16:00+0200') :1: DeprecationWarning: parsing timezone aware datetimes is deprecated; this will raise an error in the future numpy.datetime64('2020-11-05T14:00') I suggest removing the deprecation warning because I find this to be a useful behavior, and because it is a correct behavior. The manual says: "The datetime object represents a single moment in time... Datetimes are always stored based on POSIX time, with an epoch of 1970-01-01T00:00Z." So 2020-11-05T16:00+0200 is indeed the moment in time represented by np.datetime64('2020-11-05T14:00'). I just used this to restrict my data set to records created after a certain moment. It was easier for me to write the moment in my local time and add "+0200" than to figure out the moment representation in UTC. So this is my simple suggestion: remove the deprecation warning. Beyond that, I have 3 ideas for changing the repr of datetime64 that I would like to discuss. 1. Add "Z" at the end, for example, numpy.datetime64('2020-11-05T14:00Z'). This will make it clear to which moment it refers. I think this is significant - I had to dig quite a bit to realize that datetime64('2020-11-05T14:00') means 14:00 UTC. 2. Replace the 'T' with a space. I just find it much easier to read '2020-11-05 14:00Z' than '2020-11-05T14:00Z'. The long sequence of characters makes it hard for my brain to parse. 3. This will require discussion, but will be very convenient: have the repr display the time using the environment time zone, including a time offset. So, in my specific time zone (+0200), I will have: repr(np.datetime64('2020-11-05 14:00Z')) == "numpy.datetime64('2020-11-05T16:00+0200')" I'm sure the pros and cons of having an environment-dependent repr should be discussed. But I will list some pros: 1. It's very convenient - it's immediately obvious to me to which moment 2020-11-05 16:00+0200 refers. 2. It's well defined - I may collect timestamps from machines with different time zones, and I will be able to know to which exact moment each timestamp refers. 3. It's very simple - I could compare any two timestamps, I don't have to worry about time zones. I would be happy to hear your thoughts. Thanks, Noam -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Thu Nov 5 15:51:36 2020 From: shoyer at gmail.com (Stephan Hoyer) Date: Thu, 5 Nov 2020 12:51:36 -0800 Subject: [Numpy-discussion] Add sliding_window_view method to numpy In-Reply-To: References: Message-ID: On Thu, Nov 5, 2020 at 11:16 AM Ralf Gommers wrote: > > > On Thu, Nov 5, 2020 at 4:56 PM Sebastian Berg > wrote: > >> Hi all, >> >> just a brief note that I merged this proposal: >> >> https://github.com/numpy/numpy/pull/17394 >> >> adding `np.sliding_window_view` into the 1.20 release of NumPy. >> >> There was only one public API change, and that is that the `shape` >> argument is now called `window_shape`. >> >> This is still a good time for feedback in case you have a better idea >> e.g. for the function or parameter names. >> > > The old PR had this in the lib.stride_tricks namespace. 
Seeing it in the > main namespace is unexpected and likely will lead to issues/questions, > given that such an overlapping view is going to do behave in ways the > average user will be surprised by. It may also lead to requests for other > array/tensor libraries to implement this. I don't see any discussion on > this in PR 17394, it looks like a decision by the PR author that no one > commented on - reconsider that? > > Cheers, > Ralf > +1 let's keep this in the lib.stride_tricks namespace. > > > > >> >> Cheers, >> >> Sebastian >> >> >> >> On Mon, 2020-10-12 at 08:39 +0000, Zimmermann Klaus wrote: >> > Hello, >> > >> > I would like to draw the attention of this list to PR #17394 [1] that >> > adds the implementation of a sliding window view to numpy. >> > >> > Having a sliding window view in numpy is a longstanding open issue >> > (cf >> > #7753 [2] from 2016). A brief summary of the discussions surrounding >> > it >> > can be found in the description of the PR. >> > >> > This PR implements a sliding window view based on stride tricks. >> > Following the discussion in issue #7753, a first implementation was >> > provided by Fanjin Zeng in PR #10771. After some discussion, that PR >> > stalled and I picked up the issue in the present PR #17394. It is >> > based >> > on the first implementation, but follows the changed API as suggested >> > by >> > Eric Wieser. >> > >> > Code reviews have been provided by Bas van Beek, Stephen Hoyer, and >> > Eric >> > Wieser. Sebastian Berg added the "62 - Python API" label. >> > >> > >> > Do you think this is suitable for inclusion in numpy? >> > >> > Do you consider the PR ready? >> > >> > Do you have suggestions or requests? >> > >> > >> > Thanks for your time and consideration! >> > Klaus >> > >> > >> > [1] https://github.com/numpy/numpy/pull/17394 >> > [2] https://github.com/numpy/numpy/issues/7753 >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at python.org >> > https://mail.python.org/mailman/listinfo/numpy-discussion >> > >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wieser.eric+numpy at gmail.com Thu Nov 5 16:04:21 2020 From: wieser.eric+numpy at gmail.com (Eric Wieser) Date: Thu, 5 Nov 2020 21:04:21 +0000 Subject: [Numpy-discussion] datetime64: Remove deprecation warning when constructing with timezone In-Reply-To: References: Message-ID: Without weighing in yet on how I feel about the deprecation, you can see some discussion about why this was originally deprecated in the PR that introduced the warning: https://github.com/numpy/numpy/pull/6453 Eric On Thu, Nov 5, 2020, 20:13 Noam Yorav-Raphael wrote: > Hi, > > I suggest removing the deprecation warning when constructing a datetime64 > with a timezone. For example, this is the current behavior: > > >>> np.datetime64('2020-11-05 16:00+0200') > :1: DeprecationWarning: parsing timezone aware datetimes is > deprecated; this will raise an error in the future > numpy.datetime64('2020-11-05T14:00') > > I suggest removing the deprecation warning because I find this to be a > useful behavior, and because it is a correct behavior. 
The manual says: > "The datetime object represents a single moment in time... Datetimes are > always stored based on POSIX time, with an epoch of 1970-01-01T00:00Z." > So 2020-11-05T16:00+0200 is indeed the moment in time represented by > np.datetime64('2020-11-05T14:00'). > > I just used this to restrict my data set to records created after a > certain moment. It was easier for me to write the moment in my local time > and add "+0200" than to figure out the moment representation in UTC. > > So this is my simple suggestion: remove the deprecation warning. > > > Beyond that, I have 3 ideas for changing the repr of datetime64 that I > would like to discuss. > > 1. Add "Z" at the end, for example, numpy.datetime64('2020-11-05T14:00Z'). > This will make it clear to which moment it refers. I think this is > significant - I had to dig quite a bit to realize that > datetime64('2020-11-05T14:00') means 14:00 UTC. > > 2. Replace the 'T' with a space. I just find it much easier to read > '2020-11-05 14:00Z' than '2020-11-05T14:00Z'. The long sequence of > characters makes it hard for my brain to parse. > > 3. This will require discussion, but will be very convenient: have the > repr display the time using the environment time zone, including a time > offset. So, in my specific time zone (+0200), I will have: > > repr(np.datetime64('2020-11-05 14:00Z')) == > "numpy.datetime64('2020-11-05T16:00+0200')" > > I'm sure the pros and cons of having an environment-dependent repr should > be discussed. But I will list some pros: > 1. It's very convenient - it's immediately obvious to me to which moment > 2020-11-05 16:00+0200 refers. > 2. It's well defined - I may collect timestamps from machines with > different time zones, and I will be able to know to which exact moment each > timestamp refers. > 3. It's very simple - I could compare any two timestamps, I don't have to > worry about time zones. > > I would be happy to hear your thoughts. > > Thanks, > Noam > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Thu Nov 5 18:35:41 2020 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Thu, 05 Nov 2020 17:35:41 -0600 Subject: [Numpy-discussion] Add sliding_window_view method to numpy In-Reply-To: References: Message-ID: On Thu, 2020-11-05 at 12:51 -0800, Stephan Hoyer wrote: > On Thu, Nov 5, 2020 at 11:16 AM Ralf Gommers > wrote: > > > > > On Thu, Nov 5, 2020 at 4:56 PM Sebastian Berg < > > sebastian at sipsolutions.net> > > wrote: > > > > > Hi all, > > > > > > just a brief note that I merged this proposal: > > > > > > https://github.com/numpy/numpy/pull/17394 > > > > > > adding `np.sliding_window_view` into the 1.20 release of NumPy. > > > > > > There was only one public API change, and that is that the > > > `shape` > > > argument is now called `window_shape`. > > > > > > This is still a good time for feedback in case you have a better > > > idea > > > e.g. for the function or parameter names. > > > > > > > The old PR had this in the lib.stride_tricks namespace. Seeing it > > in the > > main namespace is unexpected and likely will lead to > > issues/questions, > > given that such an overlapping view is going to do behave in ways > > the > > average user will be surprised by. It may also lead to requests for > > other > > array/tensor libraries to implement this. 
I don't see any > > discussion on > > this in PR 17394, it looks like a decision by the PR author that no > > one > > commented on - reconsider that? > > > > Cheers, > > Ralf > > > > +1 let's keep this in the lib.stride_tricks namespace. > I have no reservations against having it in the main namespace and am happy either way (it can still be exposed later in any case). It is the conservative choice and maybe it is an uncommon enough function that it deserves being a bit hidden... But I am curious, it sounds like you both have very strong reservations, and I would like to understand them better. The behaviour can be surprising, but that is why the default is a read-only view. I do not think it is worse than `np.broadcast_to` in this regard. (It is nowhere near as dangerous as `as_strided`.) It is true that it is specific to NumPy (memory model). So that is maybe a good enough reason right now. But I am not sure that stuffing things into a pretty hidden `np.lib.*` namespace is a great long term solution either. There is very little useful functionality hidden away in `np.lib.*` currently. Cheers, Sebastian > > > > > > > > > > Cheers, > > > > > > Sebastian > > > > > > > > > > > > On Mon, 2020-10-12 at 08:39 +0000, Zimmermann Klaus wrote: > > > > Hello, > > > > > > > > I would like to draw the attention of this list to PR #17394 > > > > [1] that > > > > adds the implementation of a sliding window view to numpy. > > > > > > > > Having a sliding window view in numpy is a longstanding open > > > > issue > > > > (cf > > > > #7753 [2] from 2016). A brief summary of the discussions > > > > surrounding > > > > it > > > > can be found in the description of the PR. > > > > > > > > This PR implements a sliding window view based on stride > > > > tricks. > > > > Following the discussion in issue #7753, a first implementation > > > > was > > > > provided by Fanjin Zeng in PR #10771. After some discussion, > > > > that PR > > > > stalled and I picked up the issue in the present PR #17394. It > > > > is > > > > based > > > > on the first implementation, but follows the changed API as > > > > suggested > > > > by > > > > Eric Wieser. > > > > > > > > Code reviews have been provided by Bas van Beek, Stephen Hoyer, > > > > and > > > > Eric > > > > Wieser. Sebastian Berg added the "62 - Python API" label. > > > > > > > > > > > > Do you think this is suitable for inclusion in numpy? > > > > > > > > Do you consider the PR ready? > > > > > > > > Do you have suggestions or requests? > > > > > > > > > > > > Thanks for your time and consideration!
> > > > Klaus > > > > > > > > > > > > [1] https://github.com/numpy/numpy/pull/17394 > > > > [2] https://github.com/numpy/numpy/issues/7753 > > > > _______________________________________________ > > > > NumPy-Discussion mailing list > > > > NumPy-Discussion at python.org > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at python.org > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From shoyer at gmail.com Thu Nov 5 18:44:48 2020 From: shoyer at gmail.com (Stephan Hoyer) Date: Thu, 5 Nov 2020 15:44:48 -0800 Subject: [Numpy-discussion] datetime64: Remove deprecation warning when constructing with timezone In-Reply-To: References: Message-ID: I can try to dig up the old discussions, but datetime64 used to implement both (1) and (3), and this was updated in a very intentional way. Datetime64 now works like Python's own time-zone naive datetime.datetime objects. The documentation referencing "Z" should be updated -- datetime64 can be in any timezone you like. Timezone aware datetime objects are certainly useful, but NumPy's datetime64 was restricted to UTC. The consensus was that it was worse to have UTC-only rather than timezone-naive-only. NumPy's datetime64 is often used for data analysis purposes, for which automatic conversion to the local timezone of the computer running the analysis is often counter-productive. If you care about timezone conversions, I would highly recommend looking into pandas's Timestamp class for this purpose. In the future, this would be a good use-case for a new custom NumPy dtype. (The existing np.datetime64 code cannot easily handle multiple timezones.) On Thu, Nov 5, 2020 at 1:04 PM Eric Wieser wrote: > Without weighing in yet on how I feel about the deprecation, you can see > some discussion about why this was originally deprecated in the PR that > introduced the warning: > > https://github.com/numpy/numpy/pull/6453 > > Eric > > On Thu, Nov 5, 2020, 20:13 Noam Yorav-Raphael wrote: > >> Hi, >> >> I suggest removing the deprecation warning when constructing a datetime64 >> with a timezone. For example, this is the current behavior: >> >> >>> np.datetime64('2020-11-05 16:00+0200') >> :1: DeprecationWarning: parsing timezone aware datetimes is >> deprecated; this will raise an error in the future >> numpy.datetime64('2020-11-05T14:00') >> >> I suggest removing the deprecation warning because I find this to be a >> useful behavior, and because it is a correct behavior. The manual says: >> "The datetime object represents a single moment in time... Datetimes are >> always stored based on POSIX time, with an epoch of 1970-01-01T00:00Z." >> So 2020-11-05T16:00+0200 is indeed the moment in time represented by >> np.datetime64('2020-11-05T14:00'). >> >> I just used this to restrict my data set to records created after a >> certain moment. 
It was easier for me to write the moment in my local time >> and add "+0200" than to figure out the moment representation in UTC. >> >> So this is my simple suggestion: remove the deprecation warning. >> >> >> Beyond that, I have 3 ideas for changing the repr of datetime64 that I >> would like to discuss. >> >> 1. Add "Z" at the end, for example, >> numpy.datetime64('2020-11-05T14:00Z'). This will make it clear to which >> moment it refers. I think this is significant - I had to dig quite a bit to >> realize that datetime64('2020-11-05T14:00') means 14:00 UTC. >> >> 2. Replace the 'T' with a space. I just find it much easier to read >> '2020-11-05 14:00Z' than '2020-11-05T14:00Z'. The long sequence of >> characters makes it hard for my brain to parse. >> >> 3. This will require discussion, but will be very convenient: have the >> repr display the time using the environment time zone, including a time >> offset. So, in my specific time zone (+0200), I will have: >> >> repr(np.datetime64('2020-11-05 14:00Z')) == >> "numpy.datetime64('2020-11-05T16:00+0200')" >> >> I'm sure the pros and cons of having an environment-dependent repr should >> be discussed. But I will list some pros: >> 1. It's very convenient - it's immediately obvious to me to which moment >> 2020-11-05 16:00+0200 refers. >> 2. It's well defined - I may collect timestamps from machines with >> different time zones, and I will be able to know to which exact moment each >> timestamp refers. >> 3. It's very simple - I could compare any two timestamps, I don't have to >> worry about time zones. >> >> I would be happy to hear your thoughts. >> >> Thanks, >> Noam >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Thu Nov 5 19:39:06 2020 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Thu, 05 Nov 2020 18:39:06 -0600 Subject: [Numpy-discussion] Add sliding_window_view method to numpy In-Reply-To: References: Message-ID: <1cdb0b720f09845d03ccfdc2e171f98d7e925ee3.camel@sipsolutions.net> On Thu, 2020-11-05 at 17:35 -0600, Sebastian Berg wrote: > On Thu, 2020-11-05 at 12:51 -0800, Stephan Hoyer wrote: > > On Thu, Nov 5, 2020 at 11:16 AM Ralf Gommers < > > ralf.gommers at gmail.com> > > wrote: > > > > > On Thu, Nov 5, 2020 at 4:56 PM Sebastian Berg < > > > sebastian at sipsolutions.net> > > > wrote: > > > > > > > Hi all, > > > > > > > > just a brief note that I merged this proposal: > > > > > > > > https://github.com/numpy/numpy/pull/17394 > > > > > > > > adding `np.sliding_window_view` into the 1.20 release of NumPy. > > > > > > > > There was only one public API change, and that is that the > > > > `shape` > > > > argument is now called `window_shape`. > > > > > > > > This is still a good time for feedback in case you have a > > > > better > > > > idea > > > > e.g. for the function or parameter names. > > > > > > > > > > The old PR had this in the lib.stride_tricks namespace. Seeing it > > > in the > > > main namespace is unexpected and likely will lead to > > > issues/questions, > > > given that such an overlapping view is going to do behave in ways > > > the > > > average user will be surprised by. 
It may also lead to requests > > > for > > > other > > > array/tensor libraries to implement this. I don't see any > > > discussion on > > > this in PR 17394, it looks like a decision by the PR author that > > > no > > > one > > > commented on - reconsider that? > > > > > > Cheers, > > > Ralf > > > > > > > +1 let's keep this in the lib.stride_tricks namespace. > > > > I have no reservations against having it in the main namespace and am > happy either way (it can still be exposed later in any case). It is > the > conservative choice and maybe it is an uncommon enough function that > it > deserves being a bit hidden... In any case, it's the safe bet for NumPy 1.20 at least so I opened a PR: https://github.com/numpy/numpy/pull/17720 Name changes, etc. are also possible of course. I still think it might be nice to find a better place for this type of function than `np.lib.stride_tricks` though, but dunno... - Sebastian > > But I am curious, it sounds like you have both very strong > reservations, and I would like to understand them better. > > The behaviour can be surprising, but that is why the default is a > read-only view. I do not think it is worse than `np.broadcast_to` in this > regard. (It is nowhere near as dangerous as `as_strided`.) > > It is true that it is specific to NumPy (memory model). So that is > maybe a good enough reason right now. But I am not sure that > stuffing > things into a pretty hidden `np.lib.*` namespaces is a great long > term > solution either. There is very little useful functionality hidden > away > in `np.lib.*` currently. > > Cheers, > > Sebastian > > > > > > > > > > Cheers, > > > > > Sebastian > > > > > > > > > > > > > > > On Mon, 2020-10-12 at 08:39 +0000, Zimmermann Klaus wrote: > > > > > Hello, > > > > > > > > > > I would like to draw the attention of this list to PR #17394 > > > > > [1] that > > > > > adds the implementation of a sliding window view to numpy. > > > > > > > > > > Having a sliding window view in numpy is a longstanding open > > > > > issue > > > > > (cf > > > > > #7753 [2] from 2016). A brief summary of the discussions > > > > > surrounding > > > > > it > > > > > can be found in the description of the PR. > > > > > > > > > > This PR implements a sliding window view based on stride > > > > > tricks. > > > > > Following the discussion in issue #7753, a first > > > > > implementation > > > > > was > > > > > provided by Fanjin Zeng in PR #10771. After some discussion, > > > > > that PR > > > > > stalled and I picked up the issue in the present PR #17394. > > > > > It > > > > > is > > > > > based > > > > > on the first implementation, but follows the changed API as > > > > > suggested > > > > > by > > > > > Eric Wieser. > > > > > > > > > > Code reviews have been provided by Bas van Beek, Stephen > > > > > Hoyer, > > > > > and > > > > > Eric > > > > > Wieser. Sebastian Berg added the "62 - Python API" label. > > > > > > > > > > > > > > > Do you think this is suitable for inclusion in numpy? > > > > > > > > > > Do you consider the PR ready? > > > > > > > > > > Do you have suggestions or requests? > > > > > > > > > > > > > > > Thanks for your time and consideration!
> > > > > Klaus > > > > > > > > > > > > > > > [1] https://github.com/numpy/numpy/pull/17394 > > > > > [2] https://github.com/numpy/numpy/issues/7753 > > > > > _______________________________________________ > > > > > NumPy-Discussion mailing list > > > > > NumPy-Discussion at python.org > > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > > > > > > > _______________________________________________ > > > > NumPy-Discussion mailing list > > > > NumPy-Discussion at python.org > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at python.org > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From klaus.zimmermann at smhi.se Fri Nov 6 04:45:42 2020 From: klaus.zimmermann at smhi.se (Zimmermann Klaus) Date: Fri, 6 Nov 2020 09:45:42 +0000 Subject: [Numpy-discussion] Add sliding_window_view method to numpy In-Reply-To: <1cdb0b720f09845d03ccfdc2e171f98d7e925ee3.camel@sipsolutions.net> References: <1cdb0b720f09845d03ccfdc2e171f98d7e925ee3.camel@sipsolutions.net> Message-ID: <32e8736e-55ed-1155-3da5-003d907c4e65@smhi.se> Hi all, I have absolutely no problem keeping this out of the main namespace. In fact I'd like to point out that it was not my idea. Rather, it was proposed by Bas van Beek in the comments [1,2] and received a little more scrutiny from Eric Wieser in [3]. The reason that it didn't receive the scrutiny it probably deserves is that it got a bit mangled up with the array dispatch discussion; sorry for that. On the subject matter, I am also curious about the potential for confusion. What other behavior could one expect from a sliding window view with this shape? As I said, I am completely fine with keeping this out of the main namespace, but I agree with Sebastian's comment, that `np.lib.stride_tricks` is perhaps not the best namespace. The reason from my point of view is that stride tricks is really a technical (and slightly ominous) name that might throw off more application-oriented programmers from finding and using this function. Thinking of my scientist colleagues, I think those are exactly the kind of users that could benefit from such a prototyping tool.
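To make the prototyping use case concrete, here is a rough sketch of what I have in mind - untested here, and assuming the function stays reachable as np.lib.stride_tricks.sliding_window_view, as in the follow-up PR - a 3-point moving average in two lines:

>>> import numpy as np
>>> windows = np.lib.stride_tricks.sliding_window_view(np.arange(6), window_shape=3)
>>> windows
array([[0, 1, 2],
       [1, 2, 3],
       [2, 3, 4],
       [3, 4, 5]])
>>> windows.mean(axis=-1)  # 3-point moving average over the original data
array([1., 2., 3., 4.])

The view itself copies no data; only the reduction at the end allocates the (much smaller) result.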
Cheers Klaus [1] https://github.com/numpy/numpy/pull/17394#issuecomment-700998618 [2] https://github.com/numpy/numpy/pull/17394#discussion_r498215468 [3] https://github.com/numpy/numpy/pull/17394#discussion_r498724340 On 06/11/2020 01:39, Sebastian Berg wrote: > On Thu, 2020-11-05 at 17:35 -0600, Sebastian Berg wrote: >> On Thu, 2020-11-05 at 12:51 -0800, Stephan Hoyer wrote: >>> On Thu, Nov 5, 2020 at 11:16 AM Ralf Gommers < >>> ralf.gommers at gmail.com> >>> wrote: >>> >>>> On Thu, Nov 5, 2020 at 4:56 PM Sebastian Berg < >>>> sebastian at sipsolutions.net> >>>> wrote: >>>> >>>>> Hi all, >>>>> >>>>> just a brief note that I merged this proposal: >>>>> >>>>> https://github.com/numpy/numpy/pull/17394 >>>>> >>>>> adding `np.sliding_window_view` into the 1.20 release of NumPy. >>>>> >>>>> There was only one public API change, and that is that the >>>>> `shape` >>>>> argument is now called `window_shape`. >>>>> >>>>> This is still a good time for feedback in case you have a >>>>> better >>>>> idea >>>>> e.g. for the function or parameter names. >>>>> >>>> >>>> The old PR had this in the lib.stride_tricks namespace. Seeing it >>>> in the >>>> main namespace is unexpected and likely will lead to >>>> issues/questions, >>>> given that such an overlapping view is going to do behave in ways >>>> the >>>> average user will be surprised by. It may also lead to requests >>>> for >>>> other >>>> array/tensor libraries to implement this. I don't see any >>>> discussion on >>>> this in PR 17394, it looks like a decision by the PR author that >>>> no >>>> one >>>> commented on - reconsider that? >>>> >>>> Cheers, >>>> Ralf >>>> >>> >>> +1 let's keep this in the lib.stride_tricks namespace. >>> >> >> I have no reservations against having it in the main namespace and am >> happy either way (it can still be exposed later in any case). It is >> the >> conservative choice and maybe it is an uncommon enough function that >> it >> deserves being a bit hidden... > > > In any case, its the safe bet for NumPy 1.20 at least so I opened a PR: > > https://github.com/numpy/numpy/pull/17720 > > Name changes, etc. are also possible of course. > > I still think it might be nice to find a better place for this type of > function that `np.lib.stride_tricks` though, but dunno... > > - Sebastian > > > >> >> But I am curious, it sounds like you have both very strong >> reservations, and I would like to understand them better. >> >> The behaviour can be surprising, but that is why the default is a >> read- >> only view. I do not think it is worse than `np.broadcast_to` in this >> regard. (It is nowhere near as dangerous as `as_strided`.) >> >> It is true that it is specific to NumPy (memory model). So that is >> maybe a good enough reason right now. But I am not sure that >> stuffing >> things into a pretty hidden `np.lib.*` namespaces is a great long >> term >> solution either. There is very little useful functionality hidden >> away >> in `np.lib.*` currently. >> >> Cheers, >> >> Sebastian >> >>>> >>>> >>>>> Cheers, >>>>> >>>>> Sebastian >>>>> >>>>> >>>>> >>>>> On Mon, 2020-10-12 at 08:39 +0000, Zimmermann Klaus wrote: >>>>>> Hello, >>>>>> >>>>>> I would like to draw the attention of this list to PR #17394 >>>>>> [1] that >>>>>> adds the implementation of a sliding window view to numpy. >>>>>> >>>>>> Having a sliding window view in numpy is a longstanding open >>>>>> issue >>>>>> (cf >>>>>> #7753 [2] from 2016). A brief summary of the discussions >>>>>> surrounding >>>>>> it >>>>>> can be found in the description of the PR. 
>>>>>> >>>>>> This PR implements a sliding window view based on stride >>>>>> tricks. >>>>>> Following the discussion in issue #7753, a first >>>>>> implementation >>>>>> was >>>>>> provided by Fanjin Zeng in PR #10771. After some discussion, >>>>>> that PR >>>>>> stalled and I picked up the issue in the present PR #17394. >>>>>> It >>>>>> is >>>>>> based >>>>>> on the first implementation, but follows the changed API as >>>>>> suggested >>>>>> by >>>>>> Eric Wieser. >>>>>> >>>>>> Code reviews have been provided by Bas van Beek, Stephen >>>>>> Hoyer, >>>>>> and >>>>>> Eric >>>>>> Wieser. Sebastian Berg added the "62 - Python API" label. >>>>>> >>>>>> >>>>>> Do you think this is suitable for inclusion in numpy? >>>>>> >>>>>> Do you consider the PR ready? >>>>>> >>>>>> Do you have suggestions or requests? >>>>>> >>>>>> >>>>>> Thanks for your time and consideration! >>>>>> Klaus >>>>>> >>>>>> >>>>>> [1] https://github.com/numpy/numpy/pull/17394 >>>>>> [2] https://github.com/numpy/numpy/issues/7753 >>>>>> _______________________________________________ >>>>>> NumPy-Discussion mailing list >>>>>> NumPy-Discussion at python.org >>>>>> https://mail.python.org/mailman/listinfo/numpy-discussion >>>>>> >>>>> >>>>> _______________________________________________ >>>>> NumPy-Discussion mailing list >>>>> NumPy-Discussion at python.org >>>>> https://mail.python.org/mailman/listinfo/numpy-discussion >>>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at python.org >>>> https://mail.python.org/mailman/listinfo/numpy-discussion >>>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at python.org >>> https://mail.python.org/mailman/listinfo/numpy-discussion >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > From noamraph at gmail.com Fri Nov 6 05:47:46 2020 From: noamraph at gmail.com (Noam Yorav-Raphael) Date: Fri, 6 Nov 2020 12:47:46 +0200 Subject: [Numpy-discussion] datetime64: Remove deprecation warning when constructing with timezone In-Reply-To: References: Message-ID: Hi, I actually arrived at this by first trying to use pandas.Timestamp and getting very frustrated about it. With pandas, I get: >>> pd.Timestamp.now() Timestamp('2020-11-06 09:45:24.249851') I find the whole notion of a "timezone naive timestamp" to be nearly meaningless. A timestamp should mean a moment in time (as the current numpy documentation defines very well). A "naive timestamp" doesn't mean anything. It's exactly like a "unit naive length". I can have a Length type which just takes a number, and be very happy that it works whether my "unit zone" is inches or centimeters. So "Length(3)" will mean 3 cm in most of the world and 3 inches in the US. But then, if I get "Length(3)" from someone, I can't be sure what length it refers to. So currently, this happens with pandas timestamps: >>> os.environ['TZ'] = 'UTC'; time.tzset() ... t0 = pd.Timestamp.now() ... time.sleep(1) ... os.environ['TZ'] = 'EST-5'; time.tzset() ... t1 = pd.Timestamp.now() ... t1 - t0 Timedelta('0 days 05:00:01.001583') This is not just theoretical - I actually need to work with data from several devices, each in its own time zone.
And I need to know that I won't get such meaningless results. And you can even get something like this: >>> t0 = pd.Timestamp.now() ... time.sleep(10) ... t1 = pd.Timestamp.now() ... t1 - t0 Timedelta('0 days 01:00:10.001583') if the first measurement happened to be in winter time and the second measurement happened to be in daylight saving time. The solution is simple, and is what datetime64 used to do before the change - have a type that just represents a moment in time. It's not "in UTC" - it just stores the number of seconds that passed since an agreed moment in time (which in my time zone is 1970-01-01 02:00+0200, and is more commonly written as 1970-01-01 00:00Z - it's the exact same moment). I think it would make things clearer if I mention that there are operations that are not dealing with timestamps. For example, it's meaningless to ask what is the year of a timestamp - it may depend on the time zone. These are always *human*-related questions that depend on certain human conventions. We can call them "calendar questions". For these types of questions, a type that includes both a timestamp and a timezone offset (in minutes from UTC) can be useful. Some questions even require full timezone information, meaning a function that defines what's the timezone offset for each moment. However, I don't think numpy should deal with those calendar issues. As a very simple example, even for "timestamp+offset" types, it's not clear how to compare them - should values with the same timestamp and different offsets be considered equal or not? And in virtually all of my data analysis, this calendar aspect has nothing to do with the questions I'm trying to answer. I have a suggestion. Instead of changing datetime64 (which I consider to be ill-defined, but never mind), add a new type called "timestamp64". It will have the exact same behavior as datetime64 had before the change, except that its only allowed units will be seconds, milliseconds, microseconds and nanoseconds. Removing the longer units will make it clear that it doesn't deal with calendar and dates. Also, all the business day functionality will not be applicable to timestamp64. In order to get calendar information (such as the year) from timestamp64, you will have to manually convert it to python's datetime (or to np.datetime64) with an explicit timezone (utc, local, an offset, or a timezone object). What do you think? Thanks, Noam On Fri, Nov 6, 2020 at 1:45 AM Stephan Hoyer wrote: > I can try to dig up the old discussions, but datetime64 used to implement > both (1) and (3), and this was updated in a very intentional way. > Datetime64 now works like Python's own time-zone naive datetime.datetime > objects. The documentation referencing "Z" should be updated -- datetime64 > can be in any timezone you like. > > Timezone aware datetime objects are certainly useful, but NumPy's > datetime64 was restricted to UTC. The consensus was that it was worse to > have UTC-only rather than timezone-naive-only. NumPy's datetime64 is often > used for data analysis purposes, for which automatic conversion to the > local timezone of the computer running the analysis is often > counter-productive. > > If you care about timezone conversions, I would highly recommend looking > into pandas's Timestamp class for this purpose. In the future, this would > be a good use-case for a new custom NumPy dtype. (The existing > np.datetime64 code cannot easily handle multiple timezones.)
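(To spell out the manual conversion I mentioned above: it is only a couple of lines with the standard library - a rough, untested sketch, with numpy imported as np, and the result described for my +0200 zone:

>>> import numpy as np
>>> from datetime import timezone
>>> ts = np.datetime64('2020-11-05T14:00')
>>> ts.item().replace(tzinfo=timezone.utc).astimezone()

The last line attaches UTC to the stored moment and only then renders it in the local zone, which for me gives 2020-11-05 16:00+02:00 - the exact same moment.)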
> > On Thu, Nov 5, 2020 at 1:04 PM Eric Wieser > wrote: > >> Without weighing in yet on how I feel about the deprecation, you can see >> some discussion about why this was originally deprecated in the PR that >> introduced the warning: >> >> https://github.com/numpy/numpy/pull/6453 >> >> Eric >> >> On Thu, Nov 5, 2020, 20:13 Noam Yorav-Raphael wrote: >> >>> Hi, >>> >>> I suggest removing the deprecation warning when constructing a >>> datetime64 with a timezone. For example, this is the current behavior: >>> >>> >>> np.datetime64('2020-11-05 16:00+0200') >>> :1: DeprecationWarning: parsing timezone aware datetimes is >>> deprecated; this will raise an error in the future >>> numpy.datetime64('2020-11-05T14:00') >>> >>> I suggest removing the deprecation warning because I find this to be a >>> useful behavior, and because it is a correct behavior. The manual says: >>> "The datetime object represents a single moment in time... Datetimes are >>> always stored based on POSIX time, with an epoch of 1970-01-01T00:00Z." >>> So 2020-11-05T16:00+0200 is indeed the moment in time represented by >>> np.datetime64('2020-11-05T14:00'). >>> >>> I just used this to restrict my data set to records created after a >>> certain moment. It was easier for me to write the moment in my local time >>> and add "+0200" than to figure out the moment representation in UTC. >>> >>> So this is my simple suggestion: remove the deprecation warning. >>> >>> >>> Beyond that, I have 3 ideas for changing the repr of datetime64 that I >>> would like to discuss. >>> >>> 1. Add "Z" at the end, for example, >>> numpy.datetime64('2020-11-05T14:00Z'). This will make it clear to which >>> moment it refers. I think this is significant - I had to dig quite a bit to >>> realize that datetime64('2020-11-05T14:00') means 14:00 UTC. >>> >>> 2. Replace the 'T' with a space. I just find it much easier to read >>> '2020-11-05 14:00Z' than '2020-11-05T14:00Z'. The long sequence of >>> characters makes it hard for my brain to parse. >>> >>> 3. This will require discussion, but will be very convenient: have the >>> repr display the time using the environment time zone, including a time >>> offset. So, in my specific time zone (+0200), I will have: >>> >>> repr(np.datetime64('2020-11-05 14:00Z')) == >>> "numpy.datetime64('2020-11-05T16:00+0200')" >>> >>> I'm sure the pros and cons of having an environment-dependent repr >>> should be discussed. But I will list some pros: >>> 1. It's very convenient - it's immediately obvious to me to which moment >>> 2020-11-05 16:00+0200 refers. >>> 2. It's well defined - I may collect timestamps from machines with >>> different time zones, and I will be able to know to which exact moment each >>> timestamp refers. >>> 3. It's very simple - I could compare any two timestamps, I don't have >>> to worry about time zones. >>> >>> I would be happy to hear your thoughts. >>> >>> Thanks, >>> Noam >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at python.org >>> https://mail.python.org/mailman/listinfo/numpy-discussion >>> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ralf.gommers at gmail.com Fri Nov 6 09:58:23 2020 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Fri, 6 Nov 2020 14:58:23 +0000 Subject: [Numpy-discussion] Add sliding_window_view method to numpy In-Reply-To: <32e8736e-55ed-1155-3da5-003d907c4e65@smhi.se> References: <1cdb0b720f09845d03ccfdc2e171f98d7e925ee3.camel@sipsolutions.net> <32e8736e-55ed-1155-3da5-003d907c4e65@smhi.se> Message-ID: On Fri, Nov 6, 2020 at 9:51 AM Zimmermann Klaus wrote: > Hi all, > > > I have absolutely no problem keeping this out of the main namespace. > > In fact I'd like to point out that it was not my idea. Rather, it was > proposed by Bas van Beek in the comments [1,2] and received a little > more scrutiny from Eric Wieser in [3]. > Thanks, between two PRs with that many comments, I couldn't figure that out - just saw the commit that made the change. > The reason that it didn't receive the scrutiny it probably deserves is > that it got a bit mangled up with the array dispatch discussion; sorry > for that. > No worries at all. This is why we announce new features on the mailing list. > On the subject matter, I am also curious about the potential for > confusion. What other behavior could one expect from a sliding window > view with this shape? > > As I said, I am completely fine with keeping this out of the main > namespace, but I agree with Sebastian's comment, that > `np.lib.stride_tricks` is perhaps not the best namespace. I agree that that's not a great namespace. There's multiple issues with namespaces, we basically have three good ones (fft, linalg, random) and a bunch of other ones that range from questionable to terrible. See https://github.com/numpy/numpy/blob/master/numpy/tests/test_public_api.py#L127 for details. This would be a good thing to work on - making the `numpy.lib` namespace not bleed into `numpy` via `import *` is one thing to do there, and there's many others. But given backwards compat constraints it's not easy. > The reason > from my point of view is that stride tricks is really a technical (and > slightly ominous) name that might throw of more application oriented > programmers from finding and using this function. Thinking of my > scientist colleagues, I think those are exactly the kind of users that > could benefit from such a prototyping tool. > That phrasing is one of a number of concerns. NumPy is normally not in the business of providing things that are okay as a prototyping tool, but are potentially extremely slow (as pointed out in the Notes section of the docstring). A function like that would basically not be the right tool for almost anything in, e.g., SciPy - it requires an iterative algorithm. In NumPy we don't prefer performance at all costs, but in general it's pretty decent rather than "Numba or Cython may gain you 100x here". Other issues include: 2) It is very specific to NumPy's memory model (as pointed out by you and Sebastian) - just like the rest of stride_tricks 3) It has "view" in the name, which doesn't quite make sense for the main namespace (also connected to point 2 above). 4) The cost of putting something in the main namespace for other array/tensor libraries is large. Many other libraries, e.g. CuPy, Dask, TensorFlow, PyTorch, JAX, MXNet, aim to reimplement part or all of the main NumPy namespace as well as possible. This would trigger discussions and likely many person-weeks of work for others. 5) It's a useful function, but it's very much on the margins of NumPy's scope. It could easily have gone into, for example, scipy.signal.
At this point the bar for functions going into the main namespace should be (and is) high. All this taken together means it's not even a toss-up for me. If it were just one or two of these points, maybe. But given all the above, I'm pretty confident saying "it does not belong in the main namespace". Cheers, Ralf > > Cheers > Klaus > > > > [1] https://github.com/numpy/numpy/pull/17394#issuecomment-700998618 > [2] https://github.com/numpy/numpy/pull/17394#discussion_r498215468 > [3] https://github.com/numpy/numpy/pull/17394#discussion_r498724340 > > On 06/11/2020 01:39, Sebastian Berg wrote: > > On Thu, 2020-11-05 at 17:35 -0600, Sebastian Berg wrote: > >> On Thu, 2020-11-05 at 12:51 -0800, Stephan Hoyer wrote: > >>> On Thu, Nov 5, 2020 at 11:16 AM Ralf Gommers < > >>> ralf.gommers at gmail.com> > >>> wrote: > >>> > >>>> On Thu, Nov 5, 2020 at 4:56 PM Sebastian Berg < > >>>> sebastian at sipsolutions.net> > >>>> wrote: > >>>> > >>>>> Hi all, > >>>>> > >>>>> just a brief note that I merged this proposal: > >>>>> > >>>>> https://github.com/numpy/numpy/pull/17394 > >>>>> > >>>>> adding `np.sliding_window_view` into the 1.20 release of NumPy. > >>>>> > >>>>> There was only one public API change, and that is that the > >>>>> `shape` > >>>>> argument is now called `window_shape`. > >>>>> > >>>>> This is still a good time for feedback in case you have a > >>>>> better > >>>>> idea > >>>>> e.g. for the function or parameter names. > >>>>> > >>>> > >>>> The old PR had this in the lib.stride_tricks namespace. Seeing it > >>>> in the > >>>> main namespace is unexpected and likely will lead to > >>>> issues/questions, > >>>> given that such an overlapping view is going to do behave in ways > >>>> the > >>>> average user will be surprised by. It may also lead to requests > >>>> for > >>>> other > >>>> array/tensor libraries to implement this. I don't see any > >>>> discussion on > >>>> this in PR 17394, it looks like a decision by the PR author that > >>>> no > >>>> one > >>>> commented on - reconsider that? > >>>> > >>>> Cheers, > >>>> Ralf > >>>> > >>> > >>> +1 let's keep this in the lib.stride_tricks namespace. > >>> > >> > >> I have no reservations against having it in the main namespace and am > >> happy either way (it can still be exposed later in any case). It is > >> the > >> conservative choice and maybe it is an uncommon enough function that > >> it > >> deserves being a bit hidden... > > > > > > In any case, its the safe bet for NumPy 1.20 at least so I opened a PR: > > > > https://github.com/numpy/numpy/pull/17720 > > > > Name changes, etc. are also possible of course. > > > > I still think it might be nice to find a better place for this type of > > function that `np.lib.stride_tricks` though, but dunno... > > > > - Sebastian > > > > > > > >> > >> But I am curious, it sounds like you have both very strong > >> reservations, and I would like to understand them better. > >> > >> The behaviour can be surprising, but that is why the default is a > >> read- > >> only view. I do not think it is worse than `np.broadcast_to` in this > >> regard. (It is nowhere near as dangerous as `as_strided`.) > >> > >> It is true that it is specific to NumPy (memory model). So that is > >> maybe a good enough reason right now. But I am not sure that > >> stuffing > >> things into a pretty hidden `np.lib.*` namespaces is a great long > >> term > >> solution either. There is very little useful functionality hidden > >> away > >> in `np.lib.*` currently. 
> >> > >> Cheers, > >> > >> Sebastian > >> > >>>> > >>>> > >>>>> Cheers, > >>>>> > >>>>> Sebastian > >>>>> > >>>>> > >>>>> > >>>>> On Mon, 2020-10-12 at 08:39 +0000, Zimmermann Klaus wrote: > >>>>>> Hello, > >>>>>> > >>>>>> I would like to draw the attention of this list to PR #17394 > >>>>>> [1] that > >>>>>> adds the implementation of a sliding window view to numpy. > >>>>>> > >>>>>> Having a sliding window view in numpy is a longstanding open > >>>>>> issue > >>>>>> (cf > >>>>>> #7753 [2] from 2016). A brief summary of the discussions > >>>>>> surrounding > >>>>>> it > >>>>>> can be found in the description of the PR. > >>>>>> > >>>>>> This PR implements a sliding window view based on stride > >>>>>> tricks. > >>>>>> Following the discussion in issue #7753, a first > >>>>>> implementation > >>>>>> was > >>>>>> provided by Fanjin Zeng in PR #10771. After some discussion, > >>>>>> that PR > >>>>>> stalled and I picked up the issue in the present PR #17394. > >>>>>> It > >>>>>> is > >>>>>> based > >>>>>> on the first implementation, but follows the changed API as > >>>>>> suggested > >>>>>> by > >>>>>> Eric Wieser. > >>>>>> > >>>>>> Code reviews have been provided by Bas van Beek, Stephen > >>>>>> Hoyer, > >>>>>> and > >>>>>> Eric > >>>>>> Wieser. Sebastian Berg added the "62 - Python API" label. > >>>>>> > >>>>>> > >>>>>> Do you think this is suitable for inclusion in numpy? > >>>>>> > >>>>>> Do you consider the PR ready? > >>>>>> > >>>>>> Do you have suggestions or requests? > >>>>>> > >>>>>> > >>>>>> Thanks for your time and consideration! > >>>>>> Klaus > >>>>>> > >>>>>> > >>>>>> [1] https://github.com/numpy/numpy/pull/17394 > >>>>>> [2] https://github.com/numpy/numpy/issues/7753 > >>>>>> _______________________________________________ > >>>>>> NumPy-Discussion mailing list > >>>>>> NumPy-Discussion at python.org > >>>>>> https://mail.python.org/mailman/listinfo/numpy-discussion > >>>>>> > >>>>> > >>>>> _______________________________________________ > >>>>> NumPy-Discussion mailing list > >>>>> NumPy-Discussion at python.org > >>>>> https://mail.python.org/mailman/listinfo/numpy-discussion > >>>>> > >>>> _______________________________________________ > >>>> NumPy-Discussion mailing list > >>>> NumPy-Discussion at python.org > >>>> https://mail.python.org/mailman/listinfo/numpy-discussion > >>>> > >>> > >>> _______________________________________________ > >>> NumPy-Discussion mailing list > >>> NumPy-Discussion at python.org > >>> https://mail.python.org/mailman/listinfo/numpy-discussion > >> > >> _______________________________________________ > >> NumPy-Discussion mailing list > >> NumPy-Discussion at python.org > >> https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jbrockmendel at gmail.com Fri Nov 6 10:57:41 2020 From: jbrockmendel at gmail.com (Brock Mendel) Date: Fri, 6 Nov 2020 07:57:41 -0800 Subject: [Numpy-discussion] datetime64: Remove deprecation warning when constructing with timezone In-Reply-To: References: Message-ID: > I find the whole notion of a "timezone naive timestamp" to be nearly meaningless >From the perspective of, say, the dateutil parser, what would you do with "2020-11-06 07:48"? If you assume it's UTC you'll be wrong in this case. If you assume it is in your local timezone, you'll be wrong in Europe. Timezone-naive datetimes are an abstraction for exactly this case. >>> t0 = pd.Timestamp.now() You can use `pd.Timestamp.now("UTC")`. See also https://mail.python.org/archives/list/datetime-sig at python.org/thread/PT4JWJLYBE5R2QASVBPZLHH37ULJQR43/ , https://github.com/pandas-dev/pandas/issues/22451 On Fri, Nov 6, 2020 at 2:48 AM Noam Yorav-Raphael wrote: > Hi, > > I actually arrived at this by first trying to use pandas.Timestamp and > getting very frustrated about it. With pandas, I get: > > >>> pd.Timestamp.now() > Timestamp('2020-11-06 09:45:24.249851') > > I find the whole notion of a "timezone naive timestamp" to be nearly > meaningless. A timestamp should mean a moment in time (as the current numpy > documentation defines very well). A "naive timestamp" doesn't mean > anything. It's exactly like a "unit naive length". I can have a Length type > which just takes a number, and be very happy that it works both if my "unit > zone" is inches or centimeters. So "Length(3)" will mean 3 cm in most of > the world and 3 inches in the US. But then, if I get "Length(3)" from > someone, I can't be sure what length it refers to. > > So currently, this happens with pandas timestamps: > > >>> os.environ['TZ'] = 'UTC'; time.tzset() > ... t0 = pd.Timestamp.now() > ... time.sleep(1) > ... os.environ['TZ'] = 'EST-5'; time.tzset() > ... t1 = pd.Timestamp.now() > ... t1 - t0 > Timedelta('0 days 05:00:01.001583') > > This is not just theoretical - I actually need to work with data from > several devices, each in its own time zone. And I need to know that I won't > get such meaningless results. > > And you can even get something like this: > > >>> t0 = pd.Timestamp.now() > ... time.sleep(10) > ... t1 = pd.Timestamp.now() > ... t1 - t0 > Timedelta('0 days 01:00:10.001583') > > if the first measurement happened to be in winter time and the second > measurement happened to be in daylight saving time. > > The solution is simple, and is what datetime64 used to do before the > change - have a type that just represents a moment in time. It's not "in > UTC" - it just stores the number of seconds that passed since an agreed > moment in time (which is usually 1970-01-01 02:00+0200, which is more > commonly referred to as 1970-01-01 00:00Z - it's the exact same moment). > > I think it would make things clearer if I'll mention that there are > operations that are not dealing with timestamps. For example, it's > meaningless to ask what is the year of a timestamp - it may depend on the > time zone. These are always *human* related questions, that depend on > certain human conventions. We can call them "calendar questions". For these > types of questions, a type that includes both a timestamp and a timezone > offset (in minutes from UTC) can be useful. Some questions even require > full timezone information, meaning a function that defines what's the > timezone offset for each moment. 
However, I don't think numpy should deal > with those calendar issues. As a very simple example, even for > "timestamp+offset" types, it's not clear how to compare them - should > values with the same timestamp and different offsets be considered equal or > not? And in virtually all of my data analysis, this calendar aspect has > nothing to do with the questions I'm trying to answer. > > I have a suggestion. Instead of changing datetime64 (which I consider to > be ill-defined, but never mind), add a new type called "timestamp64". It > will have the exact same behavior as datetime64 had before the change, > except that its only allowed units will be seconds, milliseconds, > microseconds and nanoseconds. Removing the longer units will make it clear > that it doesn't deal with calendar and dates. Also, all the business day > functionality will not be applicable to timestamp64. In order to get > calendar information (such as the year) from timestamp64, you will have to > manually convert it to python's datetime (or to np.datetime64) with an > explicit timezone (utc, local, an offset, or a timezone object). > > What do you think? > > Thanks, > Noam > > > > > > On Fri, Nov 6, 2020 at 1:45 AM Stephan Hoyer wrote: > >> I can try to dig up the old discussions, but datetime64 used to implement >> both (1) and (3), and this was updated in a very intentional way. >> Datetime64 now works like Python's own time-zone naive datetime.datetime >> objects. The documentation referencing "Z" should be updated -- datetime64 >> can be in any timezone you like. >> >> Timezone aware datetime objects are certainly useful, but NumPy's >> datetime64 was restricted to UTC. The consensus was that it was worse to >> have UTC-only rather than timezone-naive-only. NumPy's datetime64 is often >> used for data analysis purposes, for which automatic conversion to the >> local timezone of the computer running the analysis is often >> counter-productive. >> >> If you care about timezone conversions, I would highly recommend looking >> into pandas's Timestamp class for this purpose. In the future, this would >> be a good use-case for a new custom NumPy dtype. (The existing >> np.datetime64 code cannot easily handle multiple timezones.) >> >> On Thu, Nov 5, 2020 at 1:04 PM Eric Wieser >> wrote: >> >>> Without weighing in yet on how I feel about the deprecation, you can see >>> some discussion about why this was originally deprecated in the PR that >>> introduced the warning: >>> >>> https://github.com/numpy/numpy/pull/6453 >>> >>> Eric >>> >>> On Thu, Nov 5, 2020, 20:13 Noam Yorav-Raphael >>> wrote: >>> >>>> Hi, >>>> >>>> I suggest removing the deprecation warning when constructing a >>>> datetime64 with a timezone. For example, this is the current behavior: >>>> >>>> >>> np.datetime64('2020-11-05 16:00+0200') >>>> :1: DeprecationWarning: parsing timezone aware datetimes is >>>> deprecated; this will raise an error in the future >>>> numpy.datetime64('2020-11-05T14:00') >>>> >>>> I suggest removing the deprecation warning because I find this to be a >>>> useful behavior, and because it is a correct behavior. The manual says: >>>> "The datetime object represents a single moment in time... Datetimes are >>>> always stored based on POSIX time, with an epoch of 1970-01-01T00:00Z." >>>> So 2020-11-05T16:00+0200 is indeed the moment in time represented by >>>> np.datetime64('2020-11-05T14:00'). >>>> >>>> I just used this to restrict my data set to records created after a >>>> certain moment. 
It was easier for me to write the moment in my local time >>>> and add "+0200" than to figure out the moment representation in UTC. >>>> >>>> So this is my simple suggestion: remove the deprecation warning. >>>> >>>> >>>> Beyond that, I have 3 ideas for changing the repr of datetime64 that I >>>> would like to discuss. >>>> >>>> 1. Add "Z" at the end, for example, >>>> numpy.datetime64('2020-11-05T14:00Z'). This will make it clear to which >>>> moment it refers. I think this is significant - I had to dig quite a bit to >>>> realize that datetime64('2020-11-05T14:00') means 14:00 UTC. >>>> >>>> 2. Replace the 'T' with a space. I just find it much easier to read >>>> '2020-11-05 14:00Z' than '2020-11-05T14:00Z'. The long sequence of >>>> characters makes it hard for my brain to parse. >>>> >>>> 3. This will require discussion, but will be very convenient: have the >>>> repr display the time using the environment time zone, including a time >>>> offset. So, in my specific time zone (+0200), I will have: >>>> >>>> repr(np.datetime64('2020-11-05 14:00Z')) == >>>> "numpy.datetime64('2020-11-05T16:00+0200')" >>>> >>>> I'm sure the pros and cons of having an environment-dependent repr >>>> should be discussed. But I will list some pros: >>>> 1. It's very convenient - it's immediately obvious to me to which >>>> moment 2020-11-05 16:00+0200 refers. >>>> 2. It's well defined - I may collect timestamps from machines with >>>> different time zones, and I will be able to know to which exact moment each >>>> timestamp refers. >>>> 3. It's very simple - I could compare any two timestamps, I don't have >>>> to worry about time zones. >>>> >>>> I would be happy to hear your thoughts. >>>> >>>> Thanks, >>>> Noam >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at python.org >>>> https://mail.python.org/mailman/listinfo/numpy-discussion >>>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at python.org >>> https://mail.python.org/mailman/listinfo/numpy-discussion >>> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From klaus.zimmermann at smhi.se Fri Nov 6 11:03:00 2020 From: klaus.zimmermann at smhi.se (Zimmermann Klaus) Date: Fri, 6 Nov 2020 16:03:00 +0000 Subject: [Numpy-discussion] Add sliding_window_view method to numpy In-Reply-To: References: <1cdb0b720f09845d03ccfdc2e171f98d7e925ee3.camel@sipsolutions.net> <32e8736e-55ed-1155-3da5-003d907c4e65@smhi.se> Message-ID: Hi, On 06/11/2020 15:58, Ralf Gommers wrote: > On Fri, Nov 6, 2020 at 9:51 AM Zimmermann Klaus > > wrote: > I have absolutely no problem keeping this out of the main namespace. > > In fact I'd like to point out that it was not my idea. Rather, it was > proposed by Bas van Beek in the comments [1,2] and received a little > more scrutiny from Eric Wieser in [3]. > > Thanks, between two PRs with that many comments, I couldn't figure that > out - just saw the commit that make the change. Understandable, no worries. > On the subject matter, I am also curious about the potential for > confusion. 
What other behavior could one expect from a sliding window > view with this shape? > > As I said, I am completely fine with keeping this out of the main > namespace, but I agree with Sebastian's comment, that > `np.lib.stride_tricks` is perhaps not the best namespace. > > > I agree that that's not a great namespace. There's multiple issues with > namespaces, we basically have three good ones (fft, linalg, random) and > a bunch of other ones that range from questionable to terrible. See > https://github.com/numpy/numpy/blob/master/numpy/tests/test_public_api.py#L127 > > for details. > > This would be a good thing to work on - making the `numpy.lib` namespace > not bleed into `numpy` via `import *` is one thing to do there, and > there's many others. But given backwards compat constraints it's not easy. I understand cleaning up all the namespaces is a giant task, so far, far out of scope here. As said before, I also completely agree to keep it out of the main namespace (though I will still argue below :P). I was just wondering if, off the top of your head, an existing, better fit comes to mind? > The reason > from my point of view is that stride tricks is really a technical (and > slightly ominous) name that might throw of more application oriented > programmers from finding and using this function. Thinking of my > scientist colleagues, I think those are exactly the kind of users that > could benefit from such a prototyping tool. > > > That phrasing is one of a number of concerns. NumPy is normally not in > the business of providing things that are okay as a prototyping tool, > but are potentially extremely slow (as pointed out in the Notes section > of the docstring). A function like that would basically not be the right > tool for almost anything in, e.g., SciPy - it requires an iterative > algorithm. In NumPy we don't prefer performance at all costs, but in > general it's pretty decent rather than "Numba or Cython may gain you > 100x here". I still think that the performance concern is a bit overblown. Yes, applications with large windows can need more FLOPs by an equally large factor. But most such applications will use small to moderate windows. Furthermore, this view focuses only on FLOPs. In my current field of climate science (and many others), that is almost never the limiting factor. Memory demands are far more problematic and, incidentally, those are more likely to increase in other methods that require the storage of ancillary, temporary data. > Other issues include: > 2) It is very specific to NumPy's memory model (as pointed out by you > and Sebastian) - just like the rest of stride_tricks Not wrong, but on the other hand, that memory model is not exotic. C, Fortran, and any number of other languages play very nicely with this, as do important downstream libraries like dask. > 3) It has "view" in the name, which doesn't quite make sense for the > main namespace (also connected to point 2 above). Ok. > 4) The cost of putting something in the main namespace for other > array/tensor libraries is large. Maybe other libraries, e.g. CuPy, Dask, > TensorFlow, PyTorch, JAX, MXNet, aim to reimplement part or all of the > main NumPy namespace as well as possible. This would trigger discussions > and likely many person-weeks of work for others. Agreed. Though I have to say that my whole motivation comes from corresponding issues in dask that were specifically waiting for (the older version of) this PR (see [1, 2,...]).
But I understand that dask is effectively much closer to the numpy memory model than, say, CuPy, so don't take this to mean it should be in the main namespace. > 5) It's a useful function, but it's very much on the margins of NumPy's > scope. It could easily have gone into, for example, scipy.signal. At > this point the bar for functions going into the main namespace should be> (and is) high. I agree that the bar for the main namespace should be high! > All this taken together means it's not even a toss-up for me. If it were > just one or two of these points, maybe. But given all the above, I'm > pretty confident saying "it does not belong in the main namespace". Again, I am happy with that. Thanks for your thoughts and work! I really appreciate it! Cheers Klaus [1] https://github.com/dask/dask/issues/4659 [2] https://github.com/pydata/xarray/issues/3608 [3] https://github.com/pandas-dev/pandas/issues/26959 > > > Cheers > Klaus > > > > [1] https://github.com/numpy/numpy/pull/17394#issuecomment-700998618 > > [2] https://github.com/numpy/numpy/pull/17394#discussion_r498215468 > > [3] https://github.com/numpy/numpy/pull/17394#discussion_r498724340 > > > On 06/11/2020 01:39, Sebastian Berg wrote: > > On Thu, 2020-11-05 at 17:35 -0600, Sebastian Berg wrote: > >> On Thu, 2020-11-05 at 12:51 -0800, Stephan Hoyer wrote: > >>> On Thu, Nov 5, 2020 at 11:16 AM Ralf Gommers < > >>> ralf.gommers at gmail.com > > >>> wrote: > >>> > >>>> On Thu, Nov 5, 2020 at 4:56 PM Sebastian Berg < > >>>> sebastian at sipsolutions.net > > >>>> wrote: > >>>> > >>>>> Hi all, > >>>>> > >>>>> just a brief note that I merged this proposal: > >>>>> > >>>>>? ? ?https://github.com/numpy/numpy/pull/17394 > > >>>>> > >>>>> adding `np.sliding_window_view` into the 1.20 release of NumPy. > >>>>> > >>>>> There was only one public API change, and that is that the > >>>>> `shape` > >>>>> argument is now called `window_shape`. > >>>>> > >>>>> This is still a good time for feedback in case you have a > >>>>> better > >>>>> idea > >>>>> e.g. for the function or parameter names. > >>>>> > >>>> > >>>> The old PR had this in the lib.stride_tricks namespace. Seeing it > >>>> in the > >>>> main namespace is unexpected and likely will lead to > >>>> issues/questions, > >>>> given that such an overlapping view is going to do behave in ways > >>>> the > >>>> average user will be surprised by. It may also lead to requests > >>>> for > >>>> other > >>>> array/tensor libraries to implement this. I don't see any > >>>> discussion on > >>>> this in PR 17394, it looks like a decision by the PR author that > >>>> no > >>>> one > >>>> commented on - reconsider that? > >>>> > >>>> Cheers, > >>>> Ralf > >>>> > >>> > >>> +1 let's keep this in the lib.stride_tricks namespace. > >>> > >> > >> I have no reservations against having it in the main namespace and am > >> happy either way (it can still be exposed later in any case). It is > >> the > >> conservative choice and maybe it is an uncommon enough function that > >> it > >> deserves being a bit hidden... > > > > > > In any case, its the safe bet for NumPy 1.20 at least so I opened > a PR: > > > >? ? ?https://github.com/numpy/numpy/pull/17720 > > > > > Name changes, etc. are also possible of course. > > > > I still think it might be nice to find a better place for this type of > > function that `np.lib.stride_tricks` though, but dunno... > > > > - Sebastian > > > > > > > >> > >> But I am curious, it sounds like you have both very strong > >> reservations, and I would like to understand them better. 
> >> The behaviour can be surprising, but that is why the default is a
> >> read-only view. I do not think it is worse than `np.broadcast_to` in this
> >> regard. (It is nowhere near as dangerous as `as_strided`.)
> >>
> >> It is true that it is specific to NumPy (memory model). So that is
> >> maybe a good enough reason right now. But I am not sure that stuffing
> >> things into a pretty hidden `np.lib.*` namespace is a great long term
> >> solution either. There is very little useful functionality hidden away
> >> in `np.lib.*` currently.
> >>
> >> Cheers,
> >>
> >> Sebastian
> >>
> >>>>> Cheers,
> >>>>>
> >>>>> Sebastian
> >>>>>
> >>>>> On Mon, 2020-10-12 at 08:39 +0000, Zimmermann Klaus wrote:
> >>>>>> Hello,
> >>>>>>
> >>>>>> I would like to draw the attention of this list to PR #17394 [1] that
> >>>>>> adds the implementation of a sliding window view to numpy.
> >>>>>>
> >>>>>> Having a sliding window view in numpy is a longstanding open issue (cf
> >>>>>> #7753 [2] from 2016). A brief summary of the discussions surrounding it
> >>>>>> can be found in the description of the PR.
> >>>>>>
> >>>>>> This PR implements a sliding window view based on stride tricks.
> >>>>>> Following the discussion in issue #7753, a first implementation was
> >>>>>> provided by Fanjin Zeng in PR #10771. After some discussion, that PR
> >>>>>> stalled and I picked up the issue in the present PR #17394. It is based
> >>>>>> on the first implementation, but follows the changed API as suggested by
> >>>>>> Eric Wieser.
> >>>>>>
> >>>>>> Code reviews have been provided by Bas van Beek, Stephen Hoyer, and Eric
> >>>>>> Wieser. Sebastian Berg added the "62 - Python API" label.
> >>>>>>
> >>>>>> Do you think this is suitable for inclusion in numpy?
> >>>>>>
> >>>>>> Do you consider the PR ready?
> >>>>>>
> >>>>>> Do you have suggestions or requests?
> >>>>>>
> >>>>>> Thanks for your time and consideration!
>>>>>> Klaus
>>>>>>
>>>>>> [1] https://github.com/numpy/numpy/pull/17394
>>>>>> [2] https://github.com/numpy/numpy/issues/7753

From melissawm at gmail.com Sat Nov 7 07:48:39 2020
From: melissawm at gmail.com (Melissa Mendonça)
Date: Sat, 7 Nov 2020 09:48:39 -0300
Subject: [Numpy-discussion] Documentation Team meeting - Monday November 9
In-Reply-To: References: Message-ID:

Hi all!

This is a reminder that our next Documentation Team meeting will be on *Monday, November 9* at 3PM UTC** (PLEASE MIND THE RECENT TIME CHANGES AND SEE IF THEY APPLY TO YOUR AREA)

If you wish to join on Zoom, **you need to use this NEW link**

https://zoom.us/j/96219574921?pwd=VTRNeGwwOUlrYVNYSENpVVBRRjlkZz09

Here's the permanent hackmd document with the meeting notes (still being updated in the next few days!):

https://hackmd.io/oB_boakvRqKR-_2jRV-Qjg

Hope to see you around!

** You can click this link to get the correct time in your timezone: https://www.timeanddate.com/worldclock/fixedtime.html?msg=NumPy+Documentation+Team+Meeting&iso=20201109T15&p1=1440&ah=1

*** You can add the NumPy community calendar to your google calendar by clicking this link: https://calendar.google.com/calendar/r?cid=YmVya2VsZXkuZWR1X2lla2dwaWdtMjMyamJobGRzZmIyYzJqODFjQGdyb3VwLmNhbGVuZGFyLmdvb2dsZS5jb20

- Melissa
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From noamraph at gmail.com Sat Nov 7 15:22:20 2020
From: noamraph at gmail.com (Noam Yorav-Raphael)
Date: Sat, 7 Nov 2020 22:22:20 +0200
Subject: [Numpy-discussion] datetime64: Remove deprecation warning when constructing with timezone
In-Reply-To: References: Message-ID:

On Fri, Nov 6, 2020 at 5:58 PM Brock Mendel wrote:
>
> > I find the whole notion of a "timezone naive timestamp" to be nearly meaningless
>
> From the perspective of, say, the dateutil parser, what would you do with
> "2020-11-06 07:48"? If you assume it's UTC you'll be wrong in this case.
> If you assume it is in your local timezone, you'll be wrong in Europe.
> Timezone-naive datetimes are an abstraction for exactly this case.

I'm not sure what you mean by "the perspective of the dateutil parser". Indeed, "2020-11-06 07:48" is not a well-defined timestamp, since it doesn't define a specific moment in time. If you ask what a timestamp type should do when constructed from such a string, I can think of two reasonable alternatives. One is to just not allow it, and perhaps provide a .from_local() method which makes it explicit. The other is to allow it, and make it clear that when an offset is not defined, the environment's timezone is used to convert the string to a timestamp. I wouldn't use the third alternative, which is to parse it as UTC, since it adds little convenience - it's easy to append a "Z" to the string.

> >>> t0 = pd.Timestamp.now()
>
> You can use `pd.Timestamp.now("UTC")`. See also
> https://mail.python.org/archives/list/datetime-sig at python.org/thread/PT4JWJLYBE5R2QASVBPZLHH37ULJQR43/ ,
> https://github.com/pandas-dev/pandas/issues/22451

Thanks for pointing this out. However, this doesn't work:

>>> pd.Timestamp.fromtimestamp(time.time(), 'UTC')
Traceback (most recent call last):
...
TypeError: fromtimestamp() takes exactly 2 positional arguments (3 given)

Also, this doesn't work:

>>> t0 = pd.Timestamp.now('UTC')
... t1 = pd.Timestamp.now('Asia/Jerusalem')
... t1 - t0
Traceback (most recent call last):
...
TypeError: Timestamp subtraction must have the same timezones or no timezones

Also, this doesn't do what it probably should:

>>> pd.Timestamp.now('UTC'), pd.Timestamp.now().tz_localize('UTC')
(Timestamp('2020-11-07 20:18:38.719603+0000', tz='UTC'), Timestamp('2020-11-08 01:18:38.719701+0000', tz='UTC'))

(I have no idea how the second result was calculated, but it's wrong. It should have been equal to the first.)

So, pd.Timestamp is crap. I think that adding np.timestamp64 may finally bring a sane timestamp type to Python.

Thanks,
Noam

> On Fri, Nov 6, 2020 at 2:48 AM Noam Yorav-Raphael wrote:
>>
>> Hi,
>>
>> I actually arrived at this by first trying to use pandas.Timestamp and getting very frustrated by it. With pandas, I get:
>>
>> >>> pd.Timestamp.now()
>> Timestamp('2020-11-06 09:45:24.249851')
>>
>> I find the whole notion of a "timezone naive timestamp" to be nearly meaningless. A timestamp should mean a moment in time (as the current numpy documentation defines very well). A "naive timestamp" doesn't mean anything. It's exactly like a "unit naive length". I can have a Length type which just takes a number, and be very happy that it works both if my "unit zone" is inches or centimeters. So "Length(3)" will mean 3 cm in most of the world and 3 inches in the US. But then, if I get "Length(3)" from someone, I can't be sure what length it refers to.
>>
>> So currently, this happens with pandas timestamps:
>>
>> >>> os.environ['TZ'] = 'UTC'; time.tzset()
>> ... t0 = pd.Timestamp.now()
>> ... time.sleep(1)
>> ... os.environ['TZ'] = 'EST-5'; time.tzset()
>> ... t1 = pd.Timestamp.now()
>> ... t1 - t0
>> Timedelta('0 days 05:00:01.001583')
>>
>> This is not just theoretical - I actually need to work with data from
>> several devices, each in its own time zone. And I need to know that I won't
>> get such meaningless results.
>>
>> And you can even get something like this:
>>
>> >>> t0 = pd.Timestamp.now()
>> ... time.sleep(10)
>> ... t1 = pd.Timestamp.now()
>> ...
t1 - t0 >> Timedelta('0 days 01:00:10.001583') >> >> if the first measurement happened to be in winter time and the second >> measurement happened to be in daylight saving time. >> >> The solution is simple, and is what datetime64 used to do before the >> change - have a type that just represents a moment in time. It's not "in >> UTC" - it just stores the number of seconds that passed since an agreed >> moment in time (which is usually 1970-01-01 02:00+0200, which is more >> commonly referred to as 1970-01-01 00:00Z - it's the exact same moment). >> >> I think it would make things clearer if I'll mention that there are >> operations that are not dealing with timestamps. For example, it's >> meaningless to ask what is the year of a timestamp - it may depend on the >> time zone. These are always *human* related questions, that depend on >> certain human conventions. We can call them "calendar questions". For these >> types of questions, a type that includes both a timestamp and a timezone >> offset (in minutes from UTC) can be useful. Some questions even require >> full timezone information, meaning a function that defines what's the >> timezone offset for each moment. However, I don't think numpy should deal >> with those calendar issues. As a very simple example, even for >> "timestamp+offset" types, it's not clear how to compare them - should >> values with the same timestamp and different offsets be considered equal or >> not? And in virtually all of my data analysis, this calendar aspect has >> nothing to do with the questions I'm trying to answer. >> >> I have a suggestion. Instead of changing datetime64 (which I consider to >> be ill-defined, but never mind), add a new type called "timestamp64". It >> will have the exact same behavior as datetime64 had before the change, >> except that its only allowed units will be seconds, milliseconds, >> microseconds and nanoseconds. Removing the longer units will make it clear >> that it doesn't deal with calendar and dates. Also, all the business day >> functionality will not be applicable to timestamp64. In order to get >> calendar information (such as the year) from timestamp64, you will have to >> manually convert it to python's datetime (or to np.datetime64) with an >> explicit timezone (utc, local, an offset, or a timezone object). >> >> What do you think? >> >> Thanks, >> Noam >> >> >> >> >> >> On Fri, Nov 6, 2020 at 1:45 AM Stephan Hoyer wrote: >> >>> I can try to dig up the old discussions, but datetime64 used to >>> implement both (1) and (3), and this was updated in a very intentional way. >>> Datetime64 now works like Python's own time-zone naive datetime.datetime >>> objects. The documentation referencing "Z" should be updated -- datetime64 >>> can be in any timezone you like. >>> >>> Timezone aware datetime objects are certainly useful, but NumPy's >>> datetime64 was restricted to UTC. The consensus was that it was worse to >>> have UTC-only rather than timezone-naive-only. NumPy's datetime64 is often >>> used for data analysis purposes, for which automatic conversion to the >>> local timezone of the computer running the analysis is often >>> counter-productive. >>> >>> If you care about timezone conversions, I would highly recommend looking >>> into pandas's Timestamp class for this purpose. In the future, this would >>> be a good use-case for a new custom NumPy dtype. (The existing >>> np.datetime64 code cannot easily handle multiple timezones.) 
>>> >>> On Thu, Nov 5, 2020 at 1:04 PM Eric Wieser >>> wrote: >>> >>>> Without weighing in yet on how I feel about the deprecation, you can >>>> see some discussion about why this was originally deprecated in the PR that >>>> introduced the warning: >>>> >>>> https://github.com/numpy/numpy/pull/6453 >>>> >>>> Eric >>>> >>>> On Thu, Nov 5, 2020, 20:13 Noam Yorav-Raphael >>>> wrote: >>>> >>>>> Hi, >>>>> >>>>> I suggest removing the deprecation warning when constructing a >>>>> datetime64 with a timezone. For example, this is the current behavior: >>>>> >>>>> >>> np.datetime64('2020-11-05 16:00+0200') >>>>> :1: DeprecationWarning: parsing timezone aware datetimes is >>>>> deprecated; this will raise an error in the future >>>>> numpy.datetime64('2020-11-05T14:00') >>>>> >>>>> I suggest removing the deprecation warning because I find this to be a >>>>> useful behavior, and because it is a correct behavior. The manual says: >>>>> "The datetime object represents a single moment in time... Datetimes are >>>>> always stored based on POSIX time, with an epoch of 1970-01-01T00:00Z." >>>>> So 2020-11-05T16:00+0200 is indeed the moment in time represented by >>>>> np.datetime64('2020-11-05T14:00'). >>>>> >>>>> I just used this to restrict my data set to records created after a >>>>> certain moment. It was easier for me to write the moment in my local time >>>>> and add "+0200" than to figure out the moment representation in UTC. >>>>> >>>>> So this is my simple suggestion: remove the deprecation warning. >>>>> >>>>> >>>>> Beyond that, I have 3 ideas for changing the repr of datetime64 that I >>>>> would like to discuss. >>>>> >>>>> 1. Add "Z" at the end, for example, >>>>> numpy.datetime64('2020-11-05T14:00Z'). This will make it clear to which >>>>> moment it refers. I think this is significant - I had to dig quite a bit to >>>>> realize that datetime64('2020-11-05T14:00') means 14:00 UTC. >>>>> >>>>> 2. Replace the 'T' with a space. I just find it much easier to read >>>>> '2020-11-05 14:00Z' than '2020-11-05T14:00Z'. The long sequence of >>>>> characters makes it hard for my brain to parse. >>>>> >>>>> 3. This will require discussion, but will be very convenient: have the >>>>> repr display the time using the environment time zone, including a time >>>>> offset. So, in my specific time zone (+0200), I will have: >>>>> >>>>> repr(np.datetime64('2020-11-05 14:00Z')) == >>>>> "numpy.datetime64('2020-11-05T16:00+0200')" >>>>> >>>>> I'm sure the pros and cons of having an environment-dependent repr >>>>> should be discussed. But I will list some pros: >>>>> 1. It's very convenient - it's immediately obvious to me to which >>>>> moment 2020-11-05 16:00+0200 refers. >>>>> 2. It's well defined - I may collect timestamps from machines with >>>>> different time zones, and I will be able to know to which exact moment each >>>>> timestamp refers. >>>>> 3. It's very simple - I could compare any two timestamps, I don't have >>>>> to worry about time zones. >>>>> >>>>> I would be happy to hear your thoughts. 
>>>>> >>>>> Thanks, >>>>> Noam >>>>> _______________________________________________ >>>>> NumPy-Discussion mailing list >>>>> NumPy-Discussion at python.org >>>>> https://mail.python.org/mailman/listinfo/numpy-discussion >>>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at python.org >>>> https://mail.python.org/mailman/listinfo/numpy-discussion >>>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at python.org >>> https://mail.python.org/mailman/listinfo/numpy-discussion >>> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From noamraph at gmail.com Sat Nov 7 15:57:20 2020 From: noamraph at gmail.com (Noam Yorav-Raphael) Date: Sat, 7 Nov 2020 22:57:20 +0200 Subject: [Numpy-discussion] Proposal: add the timestamp64 type Message-ID: Hi, (I'm repeating things I wrote under the "datetime64: Remove deprecation warning..." thread, since I'm now proposing a new solution.) I propose to add a new type called "timestamp64". It will be a pure timestamp, meaning that it represents a moment in time (as seconds/ms/us/ns since the epoch), without any timezone information. It will have the exact same behavior as datetime64 had before version 1.11, except that its only allowed units will be seconds, milliseconds, microseconds and nanoseconds. Removing the longer units will make it clear that it doesn't deal with calendar and dates. Also, all the business day functionality will not be applicable to timestamp64. In order to get calendar information (such as the year) from timestamp64, you will have to manually convert it to python's datetime (or perhaps to np.datetime64) with an explicit timezone (utc, local, an offset, or a timezone object). This is needed because since the change introduced in 1.11, datetime64 no longer represents a timestamp, but rather a date and time of an abstract calendar. So given a datetime64, it is not possible to get an actual timestamp without knowing the timezone to which the datetime64 refers. If the datetime64 is in a timezone with daylight saving time, it can even be ambiguous, since the same written hour will occur twice on the transition from DST to winter time. I would like it to work like this: >>> np.timestamp64.now() numpy.timestamp64('2020-11-07 22:42:52.871159+0200') >>> np.timestamp64.now('s') numpy.timestamp64('2020-11-07 22:42:52+0200') >>> np.timestamp64(1604781916, 's') numpy.timestamp64('2020-11-07 22:42:52+0200') >>> np.timestamp64('2020-11-07 20:42:52Z') numpy.timestamp64('2020-11-07 22:42:52+0200') * timestamp64.now() will get an optional string argument with the base units. If not given, I think 'us' is a good default. * The repr will format the timestamp using the environment's timezone. * I like the repr to not include a 'T' between the date and the time. I find it much easier to read. * I tend to think that it should be allowed to construct a timestamp64 from an ISO8601 string without a timezone offset, in which case the environment's timezone will be used to convert it to a timestamp. 
So in the Asia/Jerusalem timezone it will look like: >>> np.timestamp64('2020-11-07 22:42:52') numpy.timestamp64('2020-11-07 22:42:52+0200') >>> np.timestamp64('2020-08-01 22:00:00') numpy.timestamp64('2020-08-01 22:00:00+0300') If I implement this, could it be added to numpy? Thanks, Noam -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Nov 9 20:17:44 2020 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 9 Nov 2020 18:17:44 -0700 Subject: [Numpy-discussion] Python 3.6 has been dropped from NumPy 1.20 Message-ID: Hi All, The subject says it all: Python 3.6 has been dropped for the NumPy 1.20 release. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Tue Nov 10 05:46:44 2020 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 10 Nov 2020 10:46:44 +0000 Subject: [Numpy-discussion] Python 3.6 has been dropped from NumPy 1.20 In-Reply-To: References: Message-ID: On Tue, Nov 10, 2020 at 1:18 AM Charles R Harris wrote: > Hi All, > > The subject says it all: Python 3.6 has been dropped for the NumPy 1.20 > release. > That's great, thanks! Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Tue Nov 10 13:19:53 2020 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 10 Nov 2020 18:19:53 +0000 Subject: [Numpy-discussion] start of an array (tensor) and dataframe API standardization initiative In-Reply-To: References: Message-ID: Hi all, I'd like to share an update on this topic. The draft array API standard is now ready for wider review: - Blog post: https://data-apis.org/blog/array_api_standard_release - Array API standard document: https://data-apis.github.io/array-api/latest/ - Repo: https://github.com/data-apis/array-api/ It would be great if people - and in particular, NumPy maintainers - could have a look at it and see if that looks sensible from a NumPy perspective and whether the goals and benefits of adopting it are described clearly enough and are compelling. I'm sure a NEP will be needed for proposing adoption of the standard once it is closer to completion, and work out what that means for interaction with the array protocol NEPs and/or NEP 37, and how an implementation would look. It's a bit early for that now, I'm thinking maybe by the end of the year. Some initial discussion now would be useful though, since it's easier to make changes now rather than when that API standard is already further along. Cheers, Ralf On Mon, Aug 17, 2020 at 9:34 PM Ralf Gommers wrote: > Hi all, > > I'd like to share this announcement blog post about the creation of a > consortium for array and dataframe API standardization here: > https://data-apis.org/blog/announcing_the_consortium/. It's still in the > beginning stages, but starting to take shape. We have participation from > one or more maintainers of most array and tensor libraries - NumPy, > TensorFlow, PyTorch, MXNet, Dask, JAX, Xarray. Stephan Hoyer, Travis > Oliphant and myself have been providing input from a NumPy perspective. > > The effort is very much related to some of the interoperability work we've > been doing in NumPy (e.g. it could provide an answer to what's described in > https://numpy.org/neps/nep-0037-array-module.html#requesting-restricted-subsets-of-numpy-s-api > ). > > At this point we're looking for feedback from maintainers at a high level > (see the blog post for details). 
> > Also important: the python-record-api tooling and data in its repo has > very granular API usage data, of the kind we could really use when making > decisions that impact backwards compatibility. > > Cheers, > Ralf > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Tue Nov 10 16:20:52 2020 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 10 Nov 2020 15:20:52 -0600 Subject: [Numpy-discussion] NumPy Community Meeting Wednesday Message-ID: Hi all, There will be a NumPy Community meeting Wednesday November 11th at 1pm Pacific Time (20:00 UTC). Everyone is invited and encouraged to join in and edit the work-in-progress meeting topics and notes at: https://hackmd.io/76o-IxCjQX2mOXO_wwkcpg?both Best wishes Sebastian -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From ilhanpolat at gmail.com Wed Nov 11 05:55:39 2020 From: ilhanpolat at gmail.com (Ilhan Polat) Date: Wed, 11 Nov 2020 11:55:39 +0100 Subject: [Numpy-discussion] start of an array (tensor) and dataframe API standardization initiative In-Reply-To: References: Message-ID: This is great work. Thanks to everyone who contributed. Very clean user-interface too. One question: Can we propose feature requests already or is that discussion closed? On Tue, Nov 10, 2020 at 7:21 PM Ralf Gommers wrote: > Hi all, > > I'd like to share an update on this topic. The draft array API standard is > now ready for wider review: > > - Blog post: https://data-apis.org/blog/array_api_standard_release > - Array API standard document: > https://data-apis.github.io/array-api/latest/ > - Repo: https://github.com/data-apis/array-api/ > > It would be great if people - and in particular, NumPy maintainers - could > have a look at it and see if that looks sensible from a NumPy perspective > and whether the goals and benefits of adopting it are described clearly > enough and are compelling. > > I'm sure a NEP will be needed for proposing adoption of the standard once > it is closer to completion, and work out what that means for interaction > with the array protocol NEPs and/or NEP 37, and how an implementation would > look. It's a bit early for that now, I'm thinking maybe by the end of the > year. Some initial discussion now would be useful though, since it's easier > to make changes now rather than when that API standard is already further > along. > > Cheers, > Ralf > > > On Mon, Aug 17, 2020 at 9:34 PM Ralf Gommers > wrote: > >> Hi all, >> >> I'd like to share this announcement blog post about the creation of a >> consortium for array and dataframe API standardization here: >> https://data-apis.org/blog/announcing_the_consortium/. It's still in the >> beginning stages, but starting to take shape. We have participation from >> one or more maintainers of most array and tensor libraries - NumPy, >> TensorFlow, PyTorch, MXNet, Dask, JAX, Xarray. Stephan Hoyer, Travis >> Oliphant and myself have been providing input from a NumPy perspective. >> >> The effort is very much related to some of the interoperability work >> we've been doing in NumPy (e.g. it could provide an answer to what's >> described in >> https://numpy.org/neps/nep-0037-array-module.html#requesting-restricted-subsets-of-numpy-s-api >> ). >> >> At this point we're looking for feedback from maintainers at a high level >> (see the blog post for details). 
>>
>> Also important: the python-record-api tooling and data in its repo has
>> very granular API usage data, of the kind we could really use when making
>> decisions that impact backwards compatibility.
>>
>> Cheers,
>> Ralf
>>
>> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From compl.yue at icloud.com Wed Nov 11 07:14:35 2020
From: compl.yue at icloud.com (YueCompl)
Date: Wed, 11 Nov 2020 20:14:35 +0800
Subject: [Numpy-discussion] start of an array (tensor) and dataframe API standardization initiative
In-Reply-To: References: Message-ID: <38D4DE93-1F6D-4F09-820F-2E335FDF0674@icloud.com>

This is great!

I'm working on a Haskell-based mmap shared-array lib, with a Python-like surface-language API. I would adhere to such a standard very willingly.

From a quick skim I can't find dataframe-related info - is that scheduled for the future? Will it take Pandas as the primary reference?

Thanks with best regards,
Compl

> On 2020-11-11, at 02:19, Ralf Gommers wrote:
>
> Hi all,
>
> I'd like to share an update on this topic. The draft array API standard is now ready for wider review:
>
> - Blog post: https://data-apis.org/blog/array_api_standard_release
> - Array API standard document: https://data-apis.github.io/array-api/latest/
> - Repo: https://github.com/data-apis/array-api/
>
> It would be great if people - and in particular, NumPy maintainers - could have a look at it and see if that looks sensible from a NumPy perspective and whether the goals and benefits of adopting it are described clearly enough and are compelling.
>
> I'm sure a NEP will be needed for proposing adoption of the standard once it is closer to completion, and work out what that means for interaction with the array protocol NEPs and/or NEP 37, and how an implementation would look. It's a bit early for that now, I'm thinking maybe by the end of the year. Some initial discussion now would be useful though, since it's easier to make changes now rather than when that API standard is already further along.
>
> Cheers,
> Ralf
>
>
> On Mon, Aug 17, 2020 at 9:34 PM Ralf Gommers wrote:
> Hi all,
>
> I'd like to share this announcement blog post about the creation of a consortium for array and dataframe API standardization here: https://data-apis.org/blog/announcing_the_consortium/ . It's still in the beginning stages, but starting to take shape. We have participation from one or more maintainers of most array and tensor libraries - NumPy, TensorFlow, PyTorch, MXNet, Dask, JAX, Xarray. Stephan Hoyer, Travis Oliphant and myself have been providing input from a NumPy perspective.
>
> The effort is very much related to some of the interoperability work we've been doing in NumPy (e.g. it could provide an answer to what's described in https://numpy.org/neps/nep-0037-array-module.html#requesting-restricted-subsets-of-numpy-s-api ).
>
> At this point we're looking for feedback from maintainers at a high level (see the blog post for details).
> > Cheers, > Ralf > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Wed Nov 11 07:57:03 2020 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Wed, 11 Nov 2020 12:57:03 +0000 Subject: [Numpy-discussion] start of an array (tensor) and dataframe API standardization initiative In-Reply-To: References: Message-ID: On Wed, Nov 11, 2020 at 10:56 AM Ilhan Polat wrote: > This is great work. Thanks to everyone who contributed. Very clean > user-interface too. > > One question: Can we propose feature requests already or is that > discussion closed? > It's not closed, this is the start of community review so if things are missing or need changing, now is a good time to bring them up - please have a look at CONTRIBUTING.md in the array-api repo. What I would personally expect is that most discussion will be about the bigger picture topics and about the clarity of the document. There may be some individual functions that are important to add, if that's what you have in mind I would recommend looking at some merged PRs to see how the analysis is done (e.g. usage data, comparison between existing libraries). https://github.com/data-apis/array-api/pull/42 is a good example. Cheers, Ralf > On Tue, Nov 10, 2020 at 7:21 PM Ralf Gommers > wrote: > >> Hi all, >> >> I'd like to share an update on this topic. The draft array API standard >> is now ready for wider review: >> >> - Blog post: https://data-apis.org/blog/array_api_standard_release >> - Array API standard document: >> https://data-apis.github.io/array-api/latest/ >> - Repo: https://github.com/data-apis/array-api/ >> >> It would be great if people - and in particular, NumPy maintainers - >> could have a look at it and see if that looks sensible from a NumPy >> perspective and whether the goals and benefits of adopting it are described >> clearly enough and are compelling. >> >> I'm sure a NEP will be needed for proposing adoption of the standard once >> it is closer to completion, and work out what that means for interaction >> with the array protocol NEPs and/or NEP 37, and how an implementation would >> look. It's a bit early for that now, I'm thinking maybe by the end of the >> year. Some initial discussion now would be useful though, since it's easier >> to make changes now rather than when that API standard is already further >> along. >> >> Cheers, >> Ralf >> >> >> On Mon, Aug 17, 2020 at 9:34 PM Ralf Gommers >> wrote: >> >>> Hi all, >>> >>> I'd like to share this announcement blog post about the creation of a >>> consortium for array and dataframe API standardization here: >>> https://data-apis.org/blog/announcing_the_consortium/. It's still in >>> the beginning stages, but starting to take shape. We have participation >>> from one or more maintainers of most array and tensor libraries - NumPy, >>> TensorFlow, PyTorch, MXNet, Dask, JAX, Xarray. Stephan Hoyer, Travis >>> Oliphant and myself have been providing input from a NumPy perspective. >>> >>> The effort is very much related to some of the interoperability work >>> we've been doing in NumPy (e.g. it could provide an answer to what's >>> described in >>> https://numpy.org/neps/nep-0037-array-module.html#requesting-restricted-subsets-of-numpy-s-api >>> ). 
>>>
>>> At this point we're looking for feedback from maintainers at a high
>>> level (see the blog post for details).
>>>
>>> Also important: the python-record-api tooling and data in its repo has
>>> very granular API usage data, of the kind we could really use when making
>>> decisions that impact backwards compatibility.
>>>
>>> Cheers,
>>> Ralf
>>>
>>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ralf.gommers at gmail.com Wed Nov 11 08:00:31 2020
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Wed, 11 Nov 2020 13:00:31 +0000
Subject: [Numpy-discussion] start of an array (tensor) and dataframe API standardization initiative
In-Reply-To: <38D4DE93-1F6D-4F09-820F-2E335FDF0674@icloud.com>
References: <38D4DE93-1F6D-4F09-820F-2E335FDF0674@icloud.com>
Message-ID:

On Wed, Nov 11, 2020 at 12:15 PM YueCompl wrote:
> This is great!
>
> I'm working on a Haskell-based mmap shared-array lib, with a Python-like
> surface-language API. I would adhere to such a standard very willingly.

Awesome. Library authors from other languages are definitely another audience we had in mind, so glad to hear it's helpful.

> From a quick skim I can't find dataframe-related info - is that scheduled for
> the future? Will it take Pandas as the primary reference?

Yes, that is planned but will take a while longer. Dataframes are less mature, and Pandas itself is still very much in flux (the first proposal after the 1.0 release was "let's deprecate for 2.0"), so it's a more complex puzzle. Pandas is an important reference, but I'd expect the end result to deviate more from Pandas than the array API differs from NumPy.

Cheers,
Ralf

> Thanks with best regards,
> Compl
>
>
> On 2020-11-11, at 02:19, Ralf Gommers wrote:
>
> Hi all,
>
> I'd like to share an update on this topic. The draft array API standard is now ready for wider review:
>
> - Blog post: https://data-apis.org/blog/array_api_standard_release
> - Array API standard document: https://data-apis.github.io/array-api/latest/
> - Repo: https://github.com/data-apis/array-api/
>
> It would be great if people - and in particular, NumPy maintainers - could have a look at it and see if that looks sensible from a NumPy perspective and whether the goals and benefits of adopting it are described clearly enough and are compelling.
>
> I'm sure a NEP will be needed for proposing adoption of the standard once it is closer to completion, and work out what that means for interaction with the array protocol NEPs and/or NEP 37, and how an implementation would look. It's a bit early for that now, I'm thinking maybe by the end of the year. Some initial discussion now would be useful though, since it's easier to make changes now rather than when that API standard is already further along.
>
> Cheers,
> Ralf
>
>
> On Mon, Aug 17, 2020 at 9:34 PM Ralf Gommers wrote:
> Hi all,
>
> I'd like to share this announcement blog post about the creation of a consortium for array and dataframe API standardization here: https://data-apis.org/blog/announcing_the_consortium/ . It's still in the beginning stages, but starting to take shape.
We have participation from >> one or more maintainers of most array and tensor libraries - NumPy, >> TensorFlow, PyTorch, MXNet, Dask, JAX, Xarray. Stephan Hoyer, Travis >> Oliphant and myself have been providing input from a NumPy perspective. >> >> The effort is very much related to some of the interoperability work >> we've been doing in NumPy (e.g. it could provide an answer to what's >> described in >> https://numpy.org/neps/nep-0037-array-module.html#requesting-restricted-subsets-of-numpy-s-api >> ). >> >> At this point we're looking for feedback from maintainers at a high level >> (see the blog post for details). >> >> Also important: the python-record-api tooling and data in its repo has >> very granular API usage data, of the kind we could really use when making >> decisions that impact backwards compatibility. >> >> Cheers, >> Ralf >> >> _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From noamraph at gmail.com Wed Nov 11 08:07:12 2020 From: noamraph at gmail.com (Noam Yorav-Raphael) Date: Wed, 11 Nov 2020 15:07:12 +0200 Subject: [Numpy-discussion] Proposal: add the timestamp64 type In-Reply-To: References: Message-ID: I added discussing my proposal to the upcoming meeting agenda. I thought of a refinement. Since numpy data types don't have static methods, instead of using "timestamp64.now()" it could be another function of the constructor. So timestamp64() will return the current timestamp in microseconds, and timestamp64('s'), timestamp64('ms'), timestamp64('us') and timestamp64('ns') will return the current timestamp in the given unit. This makes the interface even simpler! Cheers, Noam On Sat, Nov 7, 2020 at 10:57 PM Noam Yorav-Raphael wrote: > Hi, > > (I'm repeating things I wrote under the "datetime64: Remove deprecation > warning..." thread, since I'm now proposing a new solution.) > > I propose to add a new type called "timestamp64". It will be a pure > timestamp, meaning that it represents a moment in time (as seconds/ms/us/ns > since the epoch), without any timezone information. It will have the exact > same behavior as datetime64 had before version 1.11, except that its only > allowed units will be seconds, milliseconds, microseconds and nanoseconds. > Removing the longer units will make it clear that it doesn't deal with > calendar and dates. Also, all the business day functionality will not be > applicable to timestamp64. In order to get calendar information (such as > the year) from timestamp64, you will have to manually convert it to > python's datetime (or perhaps to np.datetime64) with an explicit timezone > (utc, local, an offset, or a timezone object). > > This is needed because since the change introduced in 1.11, datetime64 no > longer represents a timestamp, but rather a date and time of an abstract > calendar. So given a datetime64, it is not possible to get an actual > timestamp without knowing the timezone to which the datetime64 refers. If > the datetime64 is in a timezone with daylight saving time, it can even be > ambiguous, since the same written hour will occur twice on the transition > from DST to winter time. 
>
> I would like it to work like this:
>
> >>> np.timestamp64.now()
> numpy.timestamp64('2020-11-07 22:42:52.871159+0200')
>
> >>> np.timestamp64.now('s')
> numpy.timestamp64('2020-11-07 22:42:52+0200')
>
> >>> np.timestamp64(1604781916, 's')
> numpy.timestamp64('2020-11-07 22:42:52+0200')
>
> >>> np.timestamp64('2020-11-07 20:42:52Z')
> numpy.timestamp64('2020-11-07 22:42:52+0200')
>
> * timestamp64.now() will accept an optional string argument with the base
> units. If not given, I think 'us' is a good default.
> * The repr will format the timestamp using the environment's timezone.
> * I like the repr to not include a 'T' between the date and the time. I
> find it much easier to read.
> * I tend to think that it should be allowed to construct a timestamp64
> from an ISO 8601 string without a timezone offset, in which case the
> environment's timezone will be used to convert it to a timestamp. So in the
> Asia/Jerusalem timezone it will look like:
>
> >>> np.timestamp64('2020-11-07 22:42:52')
> numpy.timestamp64('2020-11-07 22:42:52+0200')
>
> >>> np.timestamp64('2020-08-01 22:00:00')
> numpy.timestamp64('2020-08-01 22:00:00+0300')
>
> If I implement this, could it be added to numpy?
>
> Thanks,
> Noam
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From matti.picus at gmail.com Thu Nov 12 08:53:46 2020
From: matti.picus at gmail.com (Matti Picus)
Date: Thu, 12 Nov 2020 15:53:46 +0200
Subject: [Numpy-discussion] start of an array (tensor) and dataframe API standardization initiative
In-Reply-To: References: Message-ID:

On 11/10/20 8:19 PM, Ralf Gommers wrote:
> Hi all,
>
> I'd like to share an update on this topic. The draft array API
> standard is now ready for wider review:
>
> - Blog post: https://data-apis.org/blog/array_api_standard_release
> - Array API standard document: https://data-apis.github.io/array-api/latest/
> - Repo: https://github.com/data-apis/array-api/
>
> It would be great if people - and in particular, NumPy maintainers -
> could have a look at it and see if that looks sensible from a NumPy
> perspective and whether the goals and benefits of adopting it are
> described clearly enough and are compelling.

I think it is compelling for a first version. The test suite and benchmark suite will be valuable tools. I hope future versions standardize complex numbers as a dtype.

I realize there is a limit to the breadth of the scope of functions to be covered. Is there a page that lists them in one place? For instance, I tried to look up what the standard has to say on issue https://github.com/numpy/numpy/issues/17760 about using bincount on uint64 arrays. It took me a while to figure out that bincount was not in the API (although unique(..., return_counts) is).

Matti

From stefano.miccoli at polimi.it Thu Nov 12 11:04:18 2020
From: stefano.miccoli at polimi.it (Stefano Miccoli)
Date: Thu, 12 Nov 2020 16:04:18 +0000
Subject: [Numpy-discussion] Proposal: add the timestamp64 type (Noam Yorav-Raphael)
In-Reply-To: References: Message-ID: <9DE45866-E937-48A0-ADB4-FCED4FB30790@polimi.it>

On 11 Nov 2020, at 18:00, numpy-discussion-request at python.org wrote:

I propose to add a new type called "timestamp64". It will be a pure timestamp, meaning that it represents a moment in time (as seconds/ms/us/ns since the epoch), without any timezone information.

Sorry, but I really don't see the usefulness of another time-stamping format based on POSIX time. Indeed POSIX time is based on a naive approximation of UTC and is ambiguous across leap seconds. Quoting from Wikipedia:

"The Unix time number 1483228800 is thus ambiguous: it can refer either to the start of the leap second (2016-12-31 23:59:60) or to the end of it, one second later (2017-01-01 00:00:00). In the theoretical case when a negative leap second occurs, no ambiguity is caused, but instead there is a range of Unix time numbers that do not refer to any point in UTC time at all."
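To make this concrete, here is a quick sketch with the standard library (numpy's datetime64 behaves the same way in this respect):

>>> from datetime import datetime, timezone
>>> datetime.fromtimestamp(1483228800, tz=timezone.utc)
datetime.datetime(2017, 1, 1, 0, 0, tzinfo=datetime.timezone.utc)
>>> datetime(2016, 12, 31, 23, 59, 60)  # the leap second itself
Traceback (most recent call last):
  ...
ValueError: second must be in 0..59

The leap second simply has no representation: the timestamp 1483228800 maps only to the moment after it.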
Precision time-stamping is quite a complex task: you can use UTC, TAI, GPS, just to mention the most used timescales. And how do you deal with timestamps in the past, when timekeeping was based on the Earth's rotation and not on atomic clocks ticking at (approximately) 1 SI-second frequency?

In my opinion time-stamping should be application dependent, and I doubt that the new "timestamp64" could be beneficial to the numpy community.

Best regards,

Stefano
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From matti.picus at gmail.com Thu Nov 12 11:40:22 2020
From: matti.picus at gmail.com (Matti Picus)
Date: Thu, 12 Nov 2020 18:40:22 +0200
Subject: [Numpy-discussion] Proposal: add the timestamp64 type (Noam Yorav-Raphael)
In-Reply-To: <9DE45866-E937-48A0-ADB4-FCED4FB30790@polimi.it>
References: <9DE45866-E937-48A0-ADB4-FCED4FB30790@polimi.it>
Message-ID: <54968e8c-b9b3-83c0-e651-28f3518ce7de@gmail.com>

On 11/12/20 6:04 PM, Stefano Miccoli wrote:
>
>> On 11 Nov 2020, at 18:00, numpy-discussion-request at python.org wrote:
>>
>> I propose to add a new type called "timestamp64". It will be a pure
>> timestamp, meaning that it represents a moment in time (as
>> seconds/ms/us/ns since the epoch), without any timezone information.
>
> Sorry, but I really don't see the usefulness of another time-stamping
> format based on POSIX time. Indeed POSIX time is based on a naive
> approximation of UTC and is ambiguous across leap seconds. Quoting
> from Wikipedia
>
> ...

In a one-on-one discussion with Noam in a pre-community call (which, ironically, we had time for since we both messed up the meeting time-zone change) we reached the conclusion that the request is to clarify whether NumPy's datetime64 represents TAI time [0] or POSIX time, with a preference for TAI time. The documentation mentions POSIX time [1]. As Stefano points out, there is a difference of some tens of seconds (currently 37 s) between POSIX (or Unix) time and TAI time. In practice numpy simply stores an int64 value to represent the datetime64, and relies on others to convert it. The leap second might be getting lost in the conversions. So it might make sense to clarify exactly how those conversions deal with leap seconds, and to choose which one we mean when we use datetime64. Noam, please correct me if I am mistaken.

Matti

[0] https://en.wikipedia.org/wiki/International_Atomic_Time
[1] https://numpy.org/doc/stable/reference/arrays.datetime.html#datetime-units

From noamraph at gmail.com Thu Nov 12 16:13:53 2020
From: noamraph at gmail.com (Noam Yorav-Raphael)
Date: Thu, 12 Nov 2020 23:13:53 +0200
Subject: [Numpy-discussion] Proposal: add the timestamp64 type (Noam Yorav-Raphael)
In-Reply-To: <54968e8c-b9b3-83c0-e651-28f3518ce7de@gmail.com>
References: <9DE45866-E937-48A0-ADB4-FCED4FB30790@polimi.it> <54968e8c-b9b3-83c0-e651-28f3518ce7de@gmail.com>
Message-ID:

Hi Matti and Stefano,

My understanding is that datetime64 was decided to be neither TAI nor posix time, but rather to represent an abstract calendar point, like datetime.datetime without a specified timezone. This can usually be converted into posix time given a timezone (although in the "repeated" hour between DST and winter time there will be ambiguity!)
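For example, if we assume a value is meant as UTC, the conversion is just a cast (a sketch in plain numpy):

>>> import numpy as np
>>> d = np.datetime64('2020-11-05 14:00')  # no timezone attached
>>> d.astype('datetime64[s]').astype('int64')  # posix time, *if* we agree this meant UTC
1604584800

The integer is only meaningful once everybody agrees on the timezone the value was written in - with any other assumed zone it denotes a different moment.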
If it is agreed by all users that a datetime64 represents the time in UTC, it is the same as posix time. I would like to have a type that is defined to be equivalent to posix time.

I don't agree with Stefano - I think that posix time is very useful (as its ubiquity shows), and I think that a type that is defined to be posix time would also be very useful. I think that posix time is well suited for the vast majority of use cases. Indeed, there are use cases where you should take leap seconds into account, but those are rare. In practice, a leap second would be presented by the OS as a second that actually takes more than a second. This actually happens all the time even without leap seconds - when your computer automatically syncs with NTP, it adjusts the time continuously, so applications will not experience "time bumps". If you want to make sure that the intervals you measure are correct, you should use something like time.monotonic().

So, most users are not interested in very precise time measurements, but rather in knowing what happened before what, and roughly when. For this, posix time is great - it's very simple, and it does the job. In some cases you need to take leap seconds into account, but in those cases just using the computer clock will not give you the precision you need no matter what, so you'll need specialized software anyway.

I think that posix time is great, and since it's very easy to make wrong decisions that seem to work until you discover they don't (such as discovering too late that local time won't work when you are not sure of the time zone, or when you switch from DST to winter time), a sane and simple default is important.

Cheers,
Noam

On Thu, Nov 12, 2020 at 6:41 PM Matti Picus wrote:
>
> On 11/12/20 6:04 PM, Stefano Miccoli wrote:
> >
> >> On 11 Nov 2020, at 18:00, numpy-discussion-request at python.org wrote:
> >>
> >> I propose to add a new type called "timestamp64". It will be a pure
> >> timestamp, meaning that it represents a moment in time (as
> >> seconds/ms/us/ns since the epoch), without any timezone information.
> >
> > Sorry, but I really don't see the usefulness of another time-stamping
> > format based on POSIX time. Indeed POSIX time is based on a naive
> > approximation of UTC and is ambiguous across leap seconds. Quoting
> > from Wikipedia
> >
> > ...
>
> In a one-on-one discussion with Noam in a pre-community call (which,
> ironically, we had time for since we both messed up the meeting
> time-zone change) we reached the conclusion that the request is to
> clarify whether NumPy's datetime64 represents TAI time [0] or POSIX
> time, with a preference for TAI time. The documentation mentions POSIX
> time [1]. As Stefano points out, there is a difference of some tens of
> seconds (currently 37 s) between POSIX (or Unix) time and TAI time. In
> practice numpy simply stores an int64 value to represent the datetime64,
> and relies on others to convert it. The leap second might be getting lost
> in the conversions. So it might make sense to clarify exactly how those
> conversions deal with leap seconds, and to choose which one we mean when
> we use datetime64. Noam, please correct me if I am mistaken.
>
> Matti
>
> [0] https://en.wikipedia.org/wiki/International_Atomic_Time
> [1] https://numpy.org/doc/stable/reference/arrays.datetime.html#datetime-units
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From daniele at grinta.net Thu Nov 12 17:45:34 2020
From: daniele at grinta.net (Daniele Nicolodi)
Date: Thu, 12 Nov 2020 23:45:34 +0100
Subject: [Numpy-discussion] Proposal: add the timestamp64 type (Noam Yorav-Raphael)
In-Reply-To: <54968e8c-b9b3-83c0-e651-28f3518ce7de@gmail.com>
References: <9DE45866-E937-48A0-ADB4-FCED4FB30790@polimi.it> <54968e8c-b9b3-83c0-e651-28f3518ce7de@gmail.com>
Message-ID:

On 12/11/2020 17:40, Matti Picus wrote:
> In a one-on-one discussion with Noam in a pre-community call (which,
> ironically, we had time for since we both messed up the meeting
> time-zone change) we reached the conclusion that the request is to
> clarify whether NumPy's datetime64 represents TAI time [0] or POSIX
> time, with a preference for TAI time. The documentation mentions POSIX
> time [1]. As Stefano points out, there is a difference of some tens of
> seconds (currently 37 s) between POSIX (or Unix) time and TAI time. In
> practice numpy simply stores an int64 value to represent the datetime64,
> and relies on others to convert it. The leap second might be getting lost
> in the conversions. So it might make sense to clarify exactly how those
> conversions deal with leap seconds, and to choose which one we mean when
> we use datetime64. Noam, please correct me if I am mistaken.

Unix time is a representation of the UTC timescale that counts one-second intervals starting from a defined epoch. It deals with leap seconds either by skipping one interval (this has never happened so far) or by repeating an interval, so that two moments in time that on the UTC timescale are separated by one second (for example 2016-12-31 23:59:59 and 2016-12-31 23:59:60) are represented in the same way, and thus the conversion from Unix time to UTC is ambiguous during this one second. This has happened 37 times since 1972.

This comes with the nice property that minutes, hours and days always have the same duration (in Unix time), so converting from the Unix time representation to a date and hour and vice versa is fairly easy. The drawbacks are, as seen above, an ambiguity on leap seconds and the fact that the trivial computation of time intervals does not take leap seconds into account, and thus may come up short by a few seconds (any time interval across 2016-12-31 23:59:59 is off by at least one second if computed by simply subtracting Unix times).

I don't think these two drawbacks are important for Numpy (or any other general purpose library). As things stand, it is not even possible, in Python, with or without Numpy, to create a datetime or datetime64 object from the time "2016-12-31 23:59:60" (neither accepts the existence of a minute with 61 seconds), thus the ambiguity issue is not an issue in practice. The time interval issue may matter for some applications, but the ones affected are aware of the issue and have means to deal with it (the most common one being taking a day off on the days leap seconds are introduced).
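For instance, here is the interval issue in two lines of plain datetime64 arithmetic (which, like Unix time, knows nothing about the leap second inserted at the end of 2016):

>>> import numpy as np
>>> np.datetime64('2017-01-01T00:00:00') - np.datetime64('2016-12-31T23:59:59')
numpy.timedelta64(1,'s')

Two SI seconds actually elapsed on the UTC timescale between these two instants, but the subtraction reports one.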
I think documenting that datetime64 is a representation of fixed time intervals since a conventional epoch, neglecting leap seconds, is easy to explain and implement, and allows for easy interoperability with the rest of the world. What advantage would making datetime64 explicitly a representation of TAI bring?

One disadvantage would be that `np.datetime64(datetime.now())` would be harder to support, as we would be trying to match a point in time on the UTC time scale to a point in time on the TAI time scale. This is trivial for past times (we just need to adjust for the right offset), but it is impossible to do correctly for dates in the future, because we cannot predict future leap second insertions. This would, for example, make timestamp conversions not reproducible across announcements of leap second insertions.

Cheers,
Dan

From sebastian at sipsolutions.net Thu Nov 12 20:48:45 2020
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Thu, 12 Nov 2020 19:48:45 -0600
Subject: [Numpy-discussion] API, NEP: Inclusion of the experimental `like=` argument in NumPy 1.20 (we currently lean to yes)
Message-ID: <2303a2659afbd366a24f5c321a36d6baec4703de.camel@sipsolutions.net>

Hi all,

TL;DR: Should NumPy add a `like=` argument to array creation functions? This is an extension of the `__array_function__` protocol, useful when working with array-like objects other than NumPy arrays. Including it effectively means we preliminarily accept NEP 35. Note that without any feedback here, the current default is to include it in the upcoming NumPy 1.20 release.

Long version:

Users who only work with NumPy arrays and no alternative array objects are not affected by this (but will see a "useless" keyword argument). However, Dask and CuPy asked for the addition of a `like=` keyword argument to array creation functions (list below at [1]) in the proposed NEP 35:

https://numpy.org/neps/nep-0035-array-creation-dispatch-with-array-function.html

This is an extension of the `__array_function__` protocol. I will refer to the well-written NEP for details.
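To give a feel for the API, a minimal sketch (with a plain ndarray as the reference the call behaves as usual; passing e.g. a Dask or CuPy array instead dispatches the creation to that library, assuming it implements the protocol):

>>> import numpy as np
>>> ref = np.arange(3)     # stand-in for e.g. a cupy/dask array
>>> np.zeros(3, like=ref)  # dispatches on the type of `ref`
array([0., 0., 0.])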
From stefanv at berkeley.edu  Thu Nov 12 23:22:10 2020
From: stefanv at berkeley.edu (Stefan van der Walt)
Date: Thu, 12 Nov 2020 20:22:10 -0800
Subject: [Numpy-discussion] API, NEP: Inclusion of the experimental `like=` argument in NumPy 1.20 (we currently lean to yes)
In-Reply-To: <2303a2659afbd366a24f5c321a36d6baec4703de.camel@sipsolutions.net>
References: <2303a2659afbd366a24f5c321a36d6baec4703de.camel@sipsolutions.net>
Message-ID: <4256d0d9-b2f8-416d-97de-2f40516ed638@www.fastmail.com>

On Thu, Nov 12, 2020, at 17:48, Sebastian Berg wrote:
> [3] There is the "middle ground": We could require an environment
> variable to activate it. But we discussed it briefly at the community
> meeting as well, and I think the consensus was there is probably no
> good argument for that (e.g. it would mean the argument doesn't show
> up in the online documentation).

As long as we don't follow this option (3), I think the changes are innocuous. If it helps some libraries out there, I see no reason not to include it.

Stéfan

From stefano.miccoli at polimi.it  Fri Nov 13 05:06:45 2020
From: stefano.miccoli at polimi.it (Stefano Miccoli)
Date: Fri, 13 Nov 2020 10:06:45 +0000
Subject: [Numpy-discussion] Proposal: add the timestamp64 type
In-Reply-To: References: Message-ID:

Discussion on time is endless! (Sorry for the extra noise on the mailing list, but I would like to clarify some points.)

If I got it right, np.datetime64 is defined by these points.

1) Internal representation: a 64-bit signed integer *plus* a time unit. The time unit can be expressed as
- a valid SI unit (the SI second and all decimal subunits down to the attosecond)
- a non-SI unit accepted for use with the SI (minute, hour, day)
- a date unit (week, month, year)

2) Conversion routines: a bijective map from the internal representation to a proleptic Gregorian calendar [0], assuming a fixed epoch of 1970-01-01T00:00Z. The mapping neglects leap seconds and is not time-zone aware.

I think that the current choice of 2) is a sensible one: I agree with Dan that it is useful to a wide audience, easy to compute, and not ambiguous. I would discourage any attempt to implement in numpy more complex mappings which are aware of time-zones and leap seconds and, why not, of the wide array of other time scales and time representations in use: this is a very complex task, and a nightmare from the point of view of maintenance. Other specialised libraries exist, like astropy.time [1] or dateutil [2], for this purpose.

However, the docs of numpy.datetime64 should be updated to explicitly mention the use of the proleptic Gregorian calendar, and to better clarify how the date units (month, year) are handled when cast to other, shorter units like seconds, etc.

Stefano

[0] https://en.wikipedia.org/wiki/Proleptic_Gregorian_calendar
[1] https://docs.astropy.org/en/stable/time/
[2] https://dateutil.readthedocs.io/en/stable/
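
A tiny check of point 1) with NumPy (the stored integer is just a count of the attached time unit since the epoch):

    import numpy as np

    t = np.datetime64('2017-01-01T00:00:00', 's')
    print(t.astype('int64'))                       # 1483228800, seconds since the epoch
    print(np.datetime64(t, 'ms').astype('int64'))  # 1483228800000, same instant in ms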
From matti.picus at gmail.com  Fri Nov 13 05:59:03 2020
From: matti.picus at gmail.com (Matti Picus)
Date: Fri, 13 Nov 2020 12:59:03 +0200
Subject: [Numpy-discussion] shipping manylinux1 wheels - when do we stop?
Message-ID: <0c8996ca-6674-b340-b6c7-92ea1eec2e1d@gmail.com>

The question of manylinux1 wheels came up enough that I wrote a blog post about it. In short: for 1.21 I would like to ship only manylinux2014 and up. Here is the blog post
https://labs.quansight.org/blog/2020/11/manylinux1-is-obsolete-manylinux2010-is-almost-eol-what-is-next/

Matti

From charlesr.harris at gmail.com  Fri Nov 13 10:57:37 2020
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 13 Nov 2020 08:57:37 -0700
Subject: [Numpy-discussion] shipping manylinux1 wheels - when do we stop?
In-Reply-To: <0c8996ca-6674-b340-b6c7-92ea1eec2e1d@gmail.com>
References: <0c8996ca-6674-b340-b6c7-92ea1eec2e1d@gmail.com>
Message-ID:

On Fri, Nov 13, 2020 at 3:59 AM Matti Picus wrote:
> The question of manylinux1 wheels came up enough that I wrote a blog
> post about it. In short: for 1.21 I would like to ship only
> manylinux2014 and up. Here is the blog post
> https://labs.quansight.org/blog/2020/11/manylinux1-is-obsolete-manylinux2010-is-almost-eol-what-is-next/

Good job summarizing the information. I looked at the code for how Python supports pip and it seems build-dependent; it isn't part of the Python library, so I'm not sure that pip and Python versions are strongly associated. I didn't find any such list when looking for it.

Chuck

From efrem.braun at gmail.com  Fri Nov 13 11:35:53 2020
From: efrem.braun at gmail.com (efremdan1)
Date: Fri, 13 Nov 2020 09:35:53 -0700 (MST)
Subject: [Numpy-discussion] How did Numpy get its latest version of the documentation to appear at the top of Google search results?
Message-ID: <1605285353468-0.post@n7.nabble.com>

I'm working with Bokeh (https://docs.bokeh.org/en/latest/), another open-source Python package. The developers would like to have the latest version of their documentation appear at the top of Google search results when users search for information, but knowledge of how to do this is lacking.

I've noticed that Numpy seems to have gotten this problem figured out, e.g., googling "numpy interpolate" results in the first hit being https://numpy.org/doc/stable/reference/generated/numpy.interp.html. This is unlike Python itself, where googling "python string formatting" results in the first hit being https://docs.python.org/3.4/library/string.html.

So apparently someone in the Numpy developer world knows how to set up the doc pages in a manner that allows for this. Would that person be willing to post to the Bokeh message board on the topic (https://discourse.bokeh.org/t/some-unsolicited-feedback/6643/17) with some advice?

Thank you!

--
Sent from: http://numpy-discussion.10968.n7.nabble.com/

From ilhanpolat at gmail.com  Fri Nov 13 11:43:15 2020
From: ilhanpolat at gmail.com (Ilhan Polat)
Date: Fri, 13 Nov 2020 17:43:15 +0100
Subject: [Numpy-discussion] How did Numpy get its latest version of the documentation to appear at the top of Google search results?
In-Reply-To: <1605285353468-0.post@n7.nabble.com>
References: <1605285353468-0.post@n7.nabble.com>
Message-ID:

Have a look here for "some" background: https://github.com/scipy/docs.scipy.org/issues/39

On Fri, Nov 13, 2020 at 5:37 PM efremdan1 wrote:
> I'm working with Bokeh (https://docs.bokeh.org/en/latest/), another
> open-source Python package.
> The developers would like to have the latest version of their
> documentation appear at the top of Google search results when users
> search for information, but knowledge of how to do this is lacking.
>
> I've noticed that Numpy seems to have gotten this problem figured out,
> e.g., googling "numpy interpolate" results in the first hit being
> https://numpy.org/doc/stable/reference/generated/numpy.interp.html.
> This is unlike Python itself, where googling "python string formatting"
> results in the first hit being https://docs.python.org/3.4/library/string.html.
>
> So apparently someone in the Numpy developer world knows how to set up
> the doc pages in a manner that allows for this. Would that person be
> willing to post to the Bokeh message board on the topic
> (https://discourse.bokeh.org/t/some-unsolicited-feedback/6643/17) with
> some advice?
>
> Thank you!
>
> --
> Sent from: http://numpy-discussion.10968.n7.nabble.com/
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion

From kevin.k.sheppard at gmail.com  Fri Nov 13 11:45:38 2020
From: kevin.k.sheppard at gmail.com (Kevin Sheppard)
Date: Fri, 13 Nov 2020 16:45:38 +0000
Subject: [Numpy-discussion] How did Numpy get its latest version of the documentation to appear at the top of Google search results?
In-Reply-To: <1605285353468-0.post@n7.nabble.com>
References: <1605285353468-0.post@n7.nabble.com>
Message-ID: <3B1E5BFF-36C3-4CC7-9D36-AE45C8EB92D5@hxcore.ol>

An HTML attachment was scrubbed...

From asmeurer at gmail.com  Fri Nov 13 14:20:06 2020
From: asmeurer at gmail.com (Aaron Meurer)
Date: Fri, 13 Nov 2020 12:20:06 -0700
Subject: [Numpy-discussion] How did Numpy get its latest version of the documentation to appear at the top of Google search results?
In-Reply-To: References: Message-ID:

I'm unclear from that issue what exactly was done that ended up working. Was it the "moved permanently" redirect, or something else? Did you use the webmaster tools?

"Moved permanently" redirects aren't an option if you want to host old version docs but still have Google default to "latest". For SymPy we got so tired of people ending up at old docs versions that we just removed them (so now we only have "latest" and "dev"). We don't support old versions anyway.

But another problem I noticed is that I had a fork of our docs repo on GitHub with the gh-pages branch, and people were ending up at the version of the docs on my fork (I discovered this from looking at the webmaster tools for my domain and seeing that those pages were being clicked on from search results).

But two things I can recommend:

- Make sure the latest version of your docs uses "latest" in the URL, instead of a version number. That way when people copy the URL to create a link, it will always point to the latest version (it looks like Bokeh already does this).

- Poke around at the Google webmaster tools. There's a lot of good stuff there, including a lot of good data on how people end up on your site via Google searches.
Aaron Meurer

On Fri, Nov 13, 2020 at 9:43 AM Ilhan Polat wrote:
> Have a look here for "some" background: https://github.com/scipy/docs.scipy.org/issues/39
>
> On Fri, Nov 13, 2020 at 5:37 PM efremdan1 wrote:
>> I'm working with Bokeh (https://docs.bokeh.org/en/latest/), another
>> open-source Python package. The developers would like to have the latest
>> version of their documentation appear at the top of Google search results
>> when users search for information, but knowledge of how to do this is
>> lacking.
>>
>> I've noticed that Numpy seems to have gotten this problem figured out,
>> e.g., googling "numpy interpolate" results in the first hit being
>> https://numpy.org/doc/stable/reference/generated/numpy.interp.html. This
>> is unlike Python itself, where googling "python string formatting"
>> results in the first hit being https://docs.python.org/3.4/library/string.html.
>>
>> So apparently someone in the Numpy developer world knows how to set up
>> the doc pages in a manner that allows for this. Would that person be
>> willing to post to the Bokeh message board on the topic
>> (https://discourse.bokeh.org/t/some-unsolicited-feedback/6643/17) with
>> some advice?
>>
>> Thank you!
>>
>> --
>> Sent from: http://numpy-discussion.10968.n7.nabble.com/
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion

From ralf.gommers at gmail.com  Fri Nov 13 18:09:27 2020
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Fri, 13 Nov 2020 23:09:27 +0000
Subject: [Numpy-discussion] start of an array (tensor) and dataframe API standardization initiative
In-Reply-To: References: Message-ID:

On Thu, Nov 12, 2020 at 1:54 PM Matti Picus wrote:
>
> On 11/10/20 8:19 PM, Ralf Gommers wrote:
> > Hi all,
> >
> > I'd like to share an update on this topic. The draft array API
> > standard is now ready for wider review:
> >
> > - Blog post: https://data-apis.org/blog/array_api_standard_release
> > - Array API standard document: https://data-apis.github.io/array-api/latest/
> > - Repo: https://github.com/data-apis/array-api/
> >
> > It would be great if people - and in particular, NumPy maintainers -
> > could have a look at it and see if that looks sensible from a NumPy
> > perspective and whether the goals and benefits of adopting it are
> > described clearly enough and are compelling.
>
> I think it is compelling for a first version. The test suite and
> benchmark suite will be valuable tools. I hope future versions
> standardize complex numbers as a dtype.

Yes, that's definitely a desire - when implementations are there/ready. At the moment most libraries have very incomplete support for complex dtypes, largely because they're not very important for deep learning. Also NumPy's implementations/choices are shaky in places, and that's being turned up by the PyTorch effort that's ongoing now to implement complex dtype support in a NumPy-compatible way.

> I realize there is a limit to the breadth of the scope of functions to
> be covered. Is there a page that lists them in one place? For instance,
> I tried to look up what the standard has to say on issue
> https://github.com/numpy/numpy/issues/17760 about using bincount on
> uint64 arrays. It took me a while to figure out that bincount was not
> in the API (although unique(..., return_counts) is).

That's a good idea and still missing, thanks for asking. The test suite that's in development has a complete list [1]. In the document itself Sphinx search works, but it should be easier to get a complete overview perhaps (although it requires some thought - the NumPy docs don't have everything on one page either).
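
For reference, the counting idiom that is in the standard -- a quick check with NumPy as one implementation:

    import numpy as np

    # bincount is not in the draft standard, but unique(..., return_counts=True) is
    values, counts = np.unique(np.array([1, 2, 2, 7], dtype=np.uint64),
                               return_counts=True)
    print(values)  # [1 2 7]
    print(counts)  # [1 2 1]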
[1] https://github.com/data-apis/array-api-tests/tree/master/array_api_tests/function_stubs

Cheers,
Ralf

> Matti
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion

From matti.picus at gmail.com  Sat Nov 14 12:42:04 2020
From: matti.picus at gmail.com (Matti Picus)
Date: Sat, 14 Nov 2020 19:42:04 +0200
Subject: [Numpy-discussion] shipping manylinux1 wheels - when do we stop?
In-Reply-To: References: <0c8996ca-6674-b340-b6c7-92ea1eec2e1d@gmail.com>
Message-ID:

On 11/13/20 5:57 PM, Charles R Harris wrote:
> Good job summarizing the information. I looked at the code for how
> Python supports pip and it seems build-dependent; it isn't part of the
> Python library, so I'm not sure that pip and Python versions are
> strongly associated. I didn't find any such list when looking for it.
>
> Chuck

In order to ascertain what version of pip is shipped with what version of CPython, I checked the package vendored into the CPython repo at "Lib/ensurepip/_bundled". In the table below, pip 19.0 was the first to support manylinux2010, 19.3 was the first to support manylinux2014. The release dates came from https://github.com/python/cpython/releases

    version   release date    pip
    3.7.2     Dec 23, 2018    18.1
    3.7.3     Mar 25, 2019    19.0
    3.7.8     June 27, 2020   20.1
    3.8.0-2   -               19.2
    3.8.4     July 13, 2020   20.1
    3.9.0     -               20.2
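
(The bundled version can also be queried directly on any given interpreter -- a small sketch using only the stdlib:)

    import ensurepip

    # reports the pip version vendored in Lib/ensurepip/_bundled,
    # i.e. what `python -m ensurepip` would install
    print(ensurepip.version())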
Matti

From robbmcleod at gmail.com  Mon Nov 16 12:26:17 2020
From: robbmcleod at gmail.com (Robert McLeod)
Date: Mon, 16 Nov 2020 09:26:17 -0800
Subject: [Numpy-discussion] manylinux wheels with Github Actions
Message-ID:

Everyone,

I'm looking to move `numexpr` into GitHub Actions for the combination of testing but also building wheels for deployment to PyPI. I've never really been satisfied with the Appveyor/Travis combo, and the documentation around Actions seems to be a lot stronger. I thought I might ask for advice on the mailing list before I jump in.

Does anyone here know of good, working examples of such for scientific packages that I could crib as a guide? I would want/need to make use of manylinux1 Docker images for building Linux wheels.

https://github.com/pypa/manylinux

This is an example recipe that builds Cython using a custom Docker image. It's a nice starting point, but it would be preferable to use the official pypa images.

https://github.com/marketplace/actions/python-wheels-manylinux-build

Kivy is a working example with a pretty complete test and build setup (although it's a bit overcomplicated for my purposes):

https://github.com/kivy/kivy/tree/master/.github/workflows

Anyone have any experiences to share with test and deploy via Actions?

Robert

--
Robert McLeod
robbmcleod at gmail.com
robert.mcleod at hitachi-hhtc.ca

From larson.eric.d at gmail.com  Mon Nov 16 12:34:43 2020
From: larson.eric.d at gmail.com (Eric Larson)
Date: Mon, 16 Nov 2020 12:34:43 -0500
Subject: [Numpy-discussion] manylinux wheels with Github Actions
In-Reply-To: References: Message-ID:

I have had good experiences for about a year now using cibuildwheel on Azure for VisPy and on python-rtmixer to deploy Linux, Windows, and macOS wheels to PyPI automatically when a release is tagged. It wasn't difficult to set up rtmixer after David Hoese did the heavy lifting sorting out everything with VisPy.

I haven't used other methods so I can't really offer any comparison, but the overhead for those two fairly simple (from a build-and-deploy perspective) projects at least has seemed pretty low.

My 2c,
Eric

On Mon, Nov 16, 2020 at 12:27 PM Robert McLeod wrote:
> Everyone,
>
> I'm looking to move `numexpr` into GitHub Actions for the combination of
> testing but also building wheels for deployment to PyPI. I've never really
> been satisfied with the Appveyor/Travis combo, and the documentation around
> Actions seems to be a lot stronger. I thought I might ask for advice on the
> mailing list before I jump in.
>
> Does anyone here know of good, working examples of such for scientific
> packages that I could crib as a guide? I would want/need to make use of
> manylinux1 Docker images for building Linux wheels.
>
> https://github.com/pypa/manylinux
>
> This is an example recipe that builds Cython using a custom Docker image.
> It's a nice starting point, but it would be preferable to use the official
> pypa images.
>
> https://github.com/marketplace/actions/python-wheels-manylinux-build
>
> Kivy is a working example with a pretty complete test and build setup
> (although it's a bit overcomplicated for my purposes):
>
> https://github.com/kivy/kivy/tree/master/.github/workflows
>
> Anyone have any experiences to share with test and deploy via Actions?
>
> Robert
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion

From andyfaff at gmail.com  Mon Nov 16 17:47:50 2020
From: andyfaff at gmail.com (Andrew Nelson)
Date: Tue, 17 Nov 2020 09:47:50 +1100
Subject: [Numpy-discussion] manylinux wheels with Github Actions
In-Reply-To: References: Message-ID:

For my project (refnx) I solely use GH Actions to test and make wheels. In my workflow (https://github.com/refnx/refnx/blob/master/.github/workflows/pythonpackage.yml) I make a 3.7/3.8/3.9 matrix across Linux/macOS/Windows. First of all I make the wheels for all those combos, then install the wheels, then run tests on the installed wheel (thereby checking that the libraries are all delocated nicely). The final step in the workflow is uploading the wheel artefacts (stored somewhere on GH).

To make the Linux 2010/2014 wheels I use PyPA docker images (https://github.com/refnx/refnx/blob/master/.github/workflows/pythonpackage.yml#L70). I specify what platform I want to make wheels for as an environment variable that is used to select the correct docker image ["manylinux2010_x86_64", "manylinux2014_x86_64"]. There are other images available.
The docker image runs a script to make all the wheels (https://github.com/refnx/refnx/blob/master/tools/build_manylinux_wheels.sh) and puts them in a wheelhouse, which is then uploaded as an artefact along with the macOS and Windows wheels.

I decided to make the deployment step to PyPI a manual process, just so I can check that everything went OK. When I've just made a release tag, I download the artefacts corresponding to the release commit, then upload them from my personal PC.

It all works really well.

From sebastian at sipsolutions.net  Tue Nov 17 23:58:54 2020
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Tue, 17 Nov 2020 22:58:54 -0600
Subject: [Numpy-discussion] NumPy Development Meeting Wednesday - Triage Focus
Message-ID:

Hi all,

Our bi-weekly triage-focused NumPy development meeting is tomorrow (Wednesday, November 18th) at 11 am Pacific Time (18:00 UTC). Everyone is invited to join in and edit the work-in-progress meeting topics and notes: https://hackmd.io/68i_JvOYQfy9ERiHgXMPvg

I encourage everyone to notify us of issues or PRs that you feel should be prioritized, discussed, or reviewed.

Best regards,
Sebastian

From melissawm at gmail.com  Thu Nov 19 15:33:39 2020
From: melissawm at gmail.com (Melissa Mendonça)
Date: Thu, 19 Nov 2020 17:33:39 -0300
Subject: [Numpy-discussion] A second CZI grant for NumPy and OpenBLAS
Message-ID:

Hi all,

I'm happy to announce that NumPy and OpenBLAS have (again) received a joint grant from the Chan Zuckerberg Initiative.

Here is the official press release [1], medium article [2], and the full list of grantees [3]. I also wrote a blog post about it to explain a bit more about the details of the proposal [4].

For NumPy, this funding is meant to support activities related to the Documentation Team (maintenance, onboarding of new team members), but also maintenance and development work for f2py. For this, we are glad to be able to support Pearu Peterson and hope we can improve the state of Fortran integration [5] in Python.

The grant is for a year of work, starting on January 1st, 2021. I hope this can have a positive impact on the community and I am grateful to be able to be a part of it.

All comments, questions and considerations are welcome!

Cheers,

Melissa

[1] https://chanzuckerberg.com/newsroom/czi-awards-4-7-million-for-open-source-software-and-organizations-advancing-open-science/
[2] https://cziscience.medium.com/scaling-open-infrastructure-and-reproducibility-in-biomedicine-69546a399747
[3] https://chanzuckerberg.com/eoss/proposals/?cycle=3
[4] https://labs.quansight.org/blog/2020/11/a-second-czi-grant-for-numpy-and-openblas/
[5] https://github.com/numpy/numpy/issues/14938

From stefanv at berkeley.edu  Thu Nov 19 16:09:25 2020
From: stefanv at berkeley.edu (Stefan van der Walt)
Date: Thu, 19 Nov 2020 13:09:25 -0800
Subject: [Numpy-discussion] A second CZI grant for NumPy and OpenBLAS
In-Reply-To: References: Message-ID:

On Thu, Nov 19, 2020, at 12:33, Melissa Mendonça wrote:
> I'm happy to announce that NumPy and OpenBLAS have (again) received a
> joint grant from the Chan Zuckerberg Initiative.

That's fantastic news -- congratulations, Melissa!
Stéfan

From dillon.niederhut at gmail.com  Fri Nov 20 12:15:37 2020
From: dillon.niederhut at gmail.com (Dillon Niederhut)
Date: Fri, 20 Nov 2020 11:15:37 -0600
Subject: [Numpy-discussion] A second CZI grant for NumPy and OpenBLAS
In-Reply-To: References: Message-ID:

Congratulations Melissa!

On Thu, Nov 19, 2020 at 2:34 PM Melissa Mendonça wrote:
> Hi all,
>
> I'm happy to announce that NumPy and OpenBLAS have (again) received a
> joint grant from the Chan Zuckerberg Initiative.
>
> Here is the official press release [1], medium article [2], and the full
> list of grantees [3]. I also wrote a blog post about it to explain a bit
> more about the details of the proposal [4].
>
> For NumPy, this funding is meant to support activities related to the
> Documentation Team (maintenance, onboarding of new team members), but also
> maintenance and development work for f2py. For this, we are glad to be able
> to support Pearu Peterson and hope we can improve the state of Fortran
> integration [5] in Python.
>
> The grant is for a year of work, starting on January 1st, 2021. I hope
> this can have a positive impact on the community and I am grateful to be
> able to be a part of it.
>
> All comments, questions and considerations are welcome!
>
> Cheers,
>
> Melissa
>
> [1] https://chanzuckerberg.com/newsroom/czi-awards-4-7-million-for-open-source-software-and-organizations-advancing-open-science/
> [2] https://cziscience.medium.com/scaling-open-infrastructure-and-reproducibility-in-biomedicine-69546a399747
> [3] https://chanzuckerberg.com/eoss/proposals/?cycle=3
> [4] https://labs.quansight.org/blog/2020/11/a-second-czi-grant-for-numpy-and-openblas/
> [5] https://github.com/numpy/numpy/issues/14938
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion

From melissawm at gmail.com  Fri Nov 20 17:10:11 2020
From: melissawm at gmail.com (Melissa Mendonça)
Date: Fri, 20 Nov 2020 19:10:11 -0300
Subject: [Numpy-discussion] Documentation Team meeting - Monday November 23
In-Reply-To: References: Message-ID:

Hi all!

First, let me apologize for the reminder that I sent last time: because of the time change I ended up making a mistake with the timezones.

Our next Documentation Team meeting will be on *Monday, November 23* at ***4PM UTC***. All are welcome - you don't need to already be a contributor to join. If you have questions or are curious about what we're doing, we'll be happy to meet you!

If you wish to join on Zoom, **you need to use this NEW link**
https://zoom.us/j/96219574921?pwd=VTRNeGwwOUlrYVNYSENpVVBRRjlkZz09

Here's the permanent hackmd document with the meeting notes (still being updated in the next few days!):
https://hackmd.io/oB_boakvRqKR-_2jRV-Qjg

Hope to see you around!

** You can click this link to get the correct time at your timezone:
https://www.timeanddate.com/worldclock/fixedtime.html?msg=NumPy+Documentation+Team+Meeting&iso=20201123T16&p1=1440&ah=1

*** You can add the NumPy community calendar to your google calendar by clicking this link:
https://calendar.google.com/calendar/r?cid=YmVya2VsZXkuZWR1X2lla2dwaWdtMjMyamJobGRzZmIyYzJqODFjQGdyb3VwLmNhbGVuZGFyLmdvb2dsZS5jb20

- Melissa
From tcaswell at gmail.com  Fri Nov 20 22:33:47 2020
From: tcaswell at gmail.com (Thomas Caswell)
Date: Fri, 20 Nov 2020 22:33:47 -0500
Subject: [Numpy-discussion] manylinux wheels with Github Actions
In-Reply-To: References: Message-ID:

Matplotlib has also migrated to building wheels via GitHub Actions and it has been working well.

Tom

On Mon, Nov 16, 2020, 17:48 Andrew Nelson wrote:
> For my project (refnx) I solely use GH Actions to test and make wheels. In
> my workflow (https://github.com/refnx/refnx/blob/master/.github/workflows/pythonpackage.yml)
> I make a 3.7/3.8/3.9 matrix across Linux/macOS/Windows. First of all I make
> the wheels for all those combos, then install the wheels, then run tests on
> the installed wheel (thereby checking that the libraries are all delocated
> nicely). The final step in the workflow is uploading the wheel artefacts
> (stored somewhere on GH).
>
> To make the Linux 2010/2014 wheels I use PyPA docker images
> (https://github.com/refnx/refnx/blob/master/.github/workflows/pythonpackage.yml#L70).
> I specify what platform I want to make wheels for as an environment
> variable that is used to select the correct docker image
> ["manylinux2010_x86_64", "manylinux2014_x86_64"]. There are other images
> available. The docker image runs a script to make all the wheels
> (https://github.com/refnx/refnx/blob/master/tools/build_manylinux_wheels.sh)
> and puts them in a wheelhouse, which is then uploaded as an artefact along
> with the macOS and Windows wheels.
>
> I decided to make the deployment step to PyPI a manual process, just so I
> can check that everything went OK. When I've just made a release tag I
> download the artefacts corresponding to the release commit, then upload
> them from my personal PC.
>
> It all works really well.
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion

From matti.picus at gmail.com  Sat Nov 21 14:28:27 2020
From: matti.picus at gmail.com (Matti Picus)
Date: Sat, 21 Nov 2020 21:28:27 +0200
Subject: [Numpy-discussion] Changing the size of PyArrayObject_fields (the ndarray c-struct)
Message-ID: <35a5a1b7-d3bd-0cf5-1698-c0d7be586fac@gmail.com>

PyArrayObject_fields is the c-struct that underlies ndarray. It is defined in ndarraytypes.h [0]. Since version 1.7, we have been trying to hide it from the public C-API so that we can freely modify it; the structure has the comment:

 * It has been recommended to use the inline functions defined below
 * (PyArray_DATA and friends) to access fields here for a number of
 * releases. Direct access to the members themselves is deprecated.
 * To ensure that your code does not use deprecated access,
 * #define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION
 * (or NPY_1_8_API_VERSION or higher as required).

In order to clean up buffer exports, Sebastian suggested (and I pushed for supporting) PR 16938 [1], which would add a new field to the struct. As Eric pointed out on the pull request, this would change the size of the struct, meaning users of the struct (i.e., those subclassing it in C) would have to be very careful interacting with NumPy-generated objects which may have changed sizes.

Or should we give up and declare that we cannot change the size of the struct until we release a NumPy 2.0?

Are there real-world cases that changing the size of the struct would break?
I admit I have an agenda to further modify the struct in upcoming versions to better support things like alternative data memory allocator strategies, so in my opinion it would be a shame if we are stuck forever with the current struct.

Matti

[0] https://github.com/numpy/numpy/blob/v1.19.4/numpy/core/include/numpy/ndarraytypes.h#L659
[1] https://github.com/numpy/numpy/pull/16938

From charlesr.harris at gmail.com  Sat Nov 21 16:42:41 2020
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sat, 21 Nov 2020 14:42:41 -0700
Subject: [Numpy-discussion] Changing the size of PyArrayObject_fields (the ndarray c-struct)
In-Reply-To: <35a5a1b7-d3bd-0cf5-1698-c0d7be586fac@gmail.com>
References: <35a5a1b7-d3bd-0cf5-1698-c0d7be586fac@gmail.com>
Message-ID:

On Sat, Nov 21, 2020 at 12:28 PM Matti Picus wrote:
> PyArrayObject_fields is the c-struct that underlies ndarray. It is
> defined in ndarraytypes.h [0]. Since version 1.7, we have been trying to
> hide it from the public C-API so that we can freely modify it; the
> structure has the comment:
>
>  * It has been recommended to use the inline functions defined below
>  * (PyArray_DATA and friends) to access fields here for a number of
>  * releases. Direct access to the members themselves is deprecated.
>  * To ensure that your code does not use deprecated access,
>  * #define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION
>  * (or NPY_1_8_API_VERSION or higher as required).
>
> In order to clean up buffer exports, Sebastian suggested (and I pushed
> for supporting) PR 16938 [1] which would add a new field to the struct.
> As Eric pointed out on the pull request, this would change the size of
> the struct, meaning users of the struct (i.e., subclassing it in C)
> would have to be very careful interacting with NumPy-generated objects
> which may have changed sizes.
>
> Or should we give up and declare that we cannot change the size of the
> struct until we release a NumPy 2.0?
>
> Are there real-world cases that changing the size of the struct would
> break? I admit I have an agenda to further modify the struct in upcoming
> versions to better support things like alternative data memory allocator
> strategies, so in my opinion it would be a shame if we are stuck forever
> with the current struct.
>
> Matti

I think the risk is small and this will probably work; the potential problem is if people have extended the essentially private structure in C code. That said, it might be better to put it in after 1.20.x is branched so that it is out there for others to test against, and perhaps implement fixes if needed. We do need to make this move at some point and can't stay fixed on an old structure forever, so the proposed change is something of a warning shot.

Chuck

From sebastian at sipsolutions.net  Sat Nov 21 20:50:41 2020
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Sat, 21 Nov 2020 19:50:41 -0600
Subject: [Numpy-discussion] Changing the size of PyArrayObject_fields (the ndarray c-struct)
In-Reply-To: References: <35a5a1b7-d3bd-0cf5-1698-c0d7be586fac@gmail.com>
Message-ID: <241a1d0b64ea923966154a30cdeedb21f22739de.camel@sipsolutions.net>

On Sat, 2020-11-21 at 14:42 -0700, Charles R Harris wrote:
> On Sat, Nov 21, 2020 at 12:28 PM Matti Picus wrote:
> > PyArrayObject_fields is the c-struct that underlies ndarray. It is
> > defined in ndarraytypes.h [0].
> > Since version 1.7, we have been trying to
> > hide it from the public C-API so that we can freely modify it; the
> > structure has the comment:

TL;DR: Unless we find real examples of affected users, I currently think we should just do it. I don't see much gain in pushing it off (but I don't care). I simply expect exceedingly few affected users compared to small gains for everyone else (including our flexibility for future improvements).

> > As Eric pointed out on the pull request, this would change the size of
> > the struct, meaning users of the struct (i.e., subclassing it in C)
> > would have to be very careful interacting with NumPy-generated objects
> > which may have changed sizes.

Access to the struct remains ABI compatible. Only those relying on the size will have issues. Using the struct size is mainly relevant when subclassing in C. Just to note: Cython appears not relevant [0].

Adapting to this change may not be pretty, but it should not be very tricky to work around [1]. Without fixes/recompilation you hopefully get an error. But arbitrarily weird things could happen. (Most likely a crash.) I currently hope that few enough users will run into this that we can just help everyone who reports an issue...

> I think the risk is small and this will probably work, the potential
> problem is if people have extended the essentially private structure
> in C code. That said, it might be better to put it in after 1.20.x is
> branched so that it is out there for others to test against, and
> perhaps implement

I don't mind waiting. Although, I also don't really see us learning much, since only the bigger projects tend to test against master. Currently, I am aware of only one tiny project that will run into this, and the author of it seemed fine with having to adapt [2]. If we find more projects that are broken by this, I may change my mind. Until then, I think we should go ahead.

For those who got this far... Aside from code cleanup, the PR also reduces some overheads. Some quick, approximate timings:

* `memoryview(arr)` is 20% faster. This also helps typed memoryviews in cython, e.g.:

    cdef myfunc(double[::1] data):
        pass

has the same speed-up (about 15% faster without any function body).

* `arr[...]` is about 20+% faster. This is because the PR speeds up deletion of every NumPy array object by a small bit. I do not know whether this actually matters for real world code, likely not significantly... (It would be cool to have a pool of real world benchmarks to test this type of thing.)

Cheers,
Sebastian

[0] My attempt at this using:

    cdef class myarr(np.ndarray):
        pass

failed. It achieved nothing but crashes with current NumPy.

[1] There are three approaches:

1.
Just recompile (using the "deprecated api"):

    struct MyArraySubclass {
        /* reserve space for the parent ndarray struct */
        char base[sizeof(PyArrayObject_fields)];
        int my_field;
    };

And in the module init function, you should add:

    if (sizeof(PyArrayObject_fields) < PyArrayObject_Type.tp_basicsize) {
        PyErr_SetString(PyExc_RuntimeError,
                "Module not binary compatible with NumPy, please recompile.");
        return -1;
    }

2. Do the same as 1., but add a small constant:

    sizeof(PyArrayObject_fields) + constant

That way you can compile with an old NumPy version, but ensure compatibility with newer versions. (You still need the check while loading the module.)

3. You go to lengths to achieve 100% binary compatibility no matter what and avoid `sizeof(PyArrayObject_fields)` entirely:

    size_t offset_of_myfields = PyArrayObject_Type.tp_basicsize + alignment_padding;
    MyArraySubclass->tp_basicsize = offset_of_myfields + sizeof(my_fields);

And to access `my_fields` you have to use a small macro such as:

    #define MY_FIELDS(obj) ((my_fields *)((char *)obj + offset_of_myfields))

My personal guess is that solution 2. is sufficient for most small projects, as it allows you to compile in a forward-compatible way, but the last is the perfectly clean solution of course.

[2] https://github.com/patmiller/bignumpy

Also note that it may be that the project can be easily replaced with a simpler solution that does not require C-subclassing at all.
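
(For the curious, the size the import-time check compares against is also visible from Python -- a small sketch; the real check in approach 1 has to happen in C before the subclass type is used:)

    import numpy as np

    # tp_basicsize of the ndarray type, i.e. the number of bytes a C
    # subclass must reserve for the parent struct; this is what grows
    # when a field is added
    print(np.ndarray.__basicsize__)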
> fixes if needed. We do need to make this move at some point and can't
> stay fixed on an old structure forever, so the proposed change is
> something of a warning shot.
>
> Chuck

From ralf.gommers at gmail.com  Mon Nov 23 08:03:26 2020
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Mon, 23 Nov 2020 13:03:26 +0000
Subject: [Numpy-discussion] Fwd: [NumFOCUS Projects] Please Read: NumFOCUS Summit 2020
In-Reply-To: References: Message-ID:

FYI, this may be interesting to some maintainers. I can forward invites if you let me know you want one.

Cheers,
Ralf

---------- Forwarded message ---------
From: Nicole Foster
Date: Sat, Nov 21, 2020 at 4:46 PM
Subject: [NumFOCUS Projects] Please Read: NumFOCUS Summit 2020
To: Fiscally Sponsored Project Representatives, Affiliated Projects

Hello NumFOCUS Projects!

As you know, things have been a little different for us this year thanks to COVID-19. We won't be able to hold our usual in-person Summit, but this has prompted us to present some content that will reach those who wouldn't be able to attend otherwise. In place of this year's event we have chosen to hold a few online sessions that *ALL NumFOCUS project maintainers* *are invited to attend*. The sessions will take place via live Zoom meeting. Sessions will be recorded for those that still can't make the time slots, but we encourage everyone to attend the live sessions and interact with the presenters. The time slots were chosen in an attempt to accommodate as many folks as possible. Tentative schedule is below:

    Session Title                                             Presenter               Day & Time                               Duration
    Sponsoring Open Source Software with Government R&D
      Funding - The PALISADE Case Study                       Kurt Rohloff            Thursday, December 3rd - 5:00 p.m. UTC   1 hr
    Best Practices for Survey Design and Evaluation           Abdoul Karim Coulibaly  Wednesday, December 2nd - 5:00 p.m. UTC  1 1/2 hr
    Legal Q & A                                               Pam Chestek             Friday, December 4th - 5:00 p.m. UTC     1 1/2 hr
    Optimizing Engagement with the Community--Optuna's Story  Crissman Loomis         Friday, December 4th - 9:00 a.m. UTC     1 hr
    Social Media and Communications Best Practices            TBD                     TBD                                      1 hr
    Social Session                                            NumFOCUS                TBD                                      1 hr

*For the Legal Q & A we ask that you please send us your questions before the session by replying to this email. Pam can answer most legal questions in these areas: Copyright, Trademark, Open source licensing, IT contracts and Marketing*

*I will be sending out invitations to the session meetings as soon as they are available so please keep an eye out for those. We hope you all can attend!*

Best,
Nicole

--
Nicole Foster
Executive Operations Administrator, NumFOCUS
nicole at numfocus.org
512-831-2870 x102

--
You received this message because you are subscribed to the Google Groups "Fiscally Sponsored Project Representatives" group. To unsubscribe from this group and stop receiving emails from it, send an email to projects+unsubscribe at numfocus.org. To view this discussion on the web visit https://groups.google.com/a/numfocus.org/d/msgid/projects/CAJLwxPH0Qsr85iBVQ13Ay4oCS89YXp4FvPPFBpMLAbWo5m5wBQ%40mail.gmail.com .

From thomasbbrunner at gmail.com  Mon Nov 23 19:12:36 2020
From: thomasbbrunner at gmail.com (Thomas)
Date: Tue, 24 Nov 2020 01:12:36 +0100
Subject: [Numpy-discussion] NumPy Feature Request: Function to wrap angles to range [ 0, 2*pi] or [ -pi, pi ]
Message-ID:

Hi,

I have a proposal for a feature and I hope this is the right place to post this.

The idea is to have a function to map any input angle to the range of [ 0, 2*pi ] or [ -pi, pi ].

There already is a function called 'unwrap' that does the opposite, so I'd suggest calling this function 'wrap'.
Example usage:

    # wrap to range [ 0, 2*pi ]
    >>> np.wrap([ -2*pi, -pi, 0, 4*pi ])
    [0, pi, 0, 2*pi]

There is some ambiguity regarding what the solution should be for the extremes. An example would be an input of 4*pi, as both 0 and 2*pi would be valid mappings.

There has been interest in this topic in the community (see https://stackoverflow.com/questions/15927755/opposite-of-numpy-unwrap).

Similar functions exist for Matlab (see https://de.mathworks.com/help/map/ref/wrapto2pi.html). They solved the ambiguity by mapping "positive multiples of 2*pi map to 2*pi and negative multiples of 2*pi map to 0" for the 0 to 2*pi case.

From njs at pobox.com  Mon Nov 23 20:49:07 2020
From: njs at pobox.com (Nathaniel Smith)
Date: Mon, 23 Nov 2020 17:49:07 -0800
Subject: [Numpy-discussion] NumPy Feature Request: Function to wrap angles to range [ 0, 2*pi] or [ -pi, pi ]
In-Reply-To: References: Message-ID:

How would this proposed function compare to using the modulo operator, like 'arr % (2*pi)'?

On Mon, Nov 23, 2020, 16:13 Thomas wrote:
> Hi,
>
> I have a proposal for a feature and I hope this is the right place to
> post this.
>
> The idea is to have a function to map any input angle to the range of
> [ 0, 2*pi ] or [ -pi, pi ].
>
> There already is a function called 'unwrap' that does the opposite, so
> I'd suggest calling this function 'wrap'.
> _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > From thomasbbrunner at gmail.com Tue Nov 24 04:25:57 2020 From: thomasbbrunner at gmail.com (Thomas) Date: Tue, 24 Nov 2020 10:25:57 +0100 Subject: [Numpy-discussion] NumPy Feature Request: Function to wrap angles to range [ 0, 2*pi] or [ -pi, pi ] In-Reply-To: <097a376b-d153-d6b7-8e6f-d305d444c099@grinta.net> References: <097a376b-d153-d6b7-8e6f-d305d444c099@grinta.net> Message-ID: Like Nathaniel said, it would not improve much when compared to the modulo operator. It could handle the edge cases better, but really the biggest benefit would be that it is more convenient. And as the "unwrap" function already exists, people would expect that and look for a function for the inverse operation (at least I did). On Tue, 24 Nov 2020 at 09:22, Daniele Nicolodi wrote: > On 24/11/2020 02:49, Nathaniel Smith wrote: > > How would this proposed function compare to using the modulo operator, > > like 'arr % (2*pi)'? > > I wrote almost the same word bu word reply, before realizing that taking > the modulo looses the sign. The correct operation is slightly more > complex (untested): > > def wrap(alpha): > return (alpha + np.pi) % 2.0 * np.pi - np.pi > > However, I don't think there is much value in adding something so > trivial as a function to numpy: I cannot think to any commonly used > algorithm that requires wrapping the phase, and it is going to be an > infinite source of bikesheeding whether the wrapped range should be > [-pi, pi) or (-pi, pi] or (0, 2*pi] or [0, 2*pi) > > Cheers, > Dan > > > > On Mon, Nov 23, 2020, 16:13 Thomas > > wrote: > > > > Hi, > > > > I have a proposal for a feature and I hope this is the right place > > to post this. > > > > The idea is to have a function to map any input angle to the range > > of [ 0, 2*pi ] or [ - pi, pi ]. > > > > There already is a function called 'unwrap' that does the opposite, > > so I'd suggest calling this function 'wrap'. > > > > Example usage: > > # wrap to range [ 0, 2*pi ] > > >>> np.wrap([ -2*pi, -pi, 0, 4*pi ]) > > [0, pi, 0, 2*pi] > > > > There is some ambiguity regarding what the solution should be for > > the extremes. An example would be an input of 4*pi, as both 0 and > > 2*pi would be valid mappings. > > > > There has been interest for this topic in the community > > (see > https://stackoverflow.com/questions/15927755/opposite-of-numpy-unwrap > > < > https://stackoverflow.com/questions/15927755/opposite-of-numpy-unwrap>). > > > > Similar functions exist for Matlab > > (see https://de.mathworks.com/help/map/ref/wrapto2pi.html > > ). They solved > > the ambiguity by mapping "positive multiples of 2*pi map to 2*pi and > > negative multiples of 2*pi map to 0." for the 0 to 2*pi case. 
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion at python.org
https://mail.python.org/mailman/listinfo/numpy-discussion

From thomasbbrunner at gmail.com  Tue Nov 24 04:25:57 2020
From: thomasbbrunner at gmail.com (Thomas)
Date: Tue, 24 Nov 2020 10:25:57 +0100
Subject: [Numpy-discussion] NumPy Feature Request: Function to wrap angles to range [ 0, 2*pi] or [ -pi, pi ]
In-Reply-To: <097a376b-d153-d6b7-8e6f-d305d444c099@grinta.net>
References: <097a376b-d153-d6b7-8e6f-d305d444c099@grinta.net>
Message-ID:

Like Nathaniel said, it would not improve much when compared to the modulo operator.

It could handle the edge cases better, but really the biggest benefit would be that it is more convenient. And as the "unwrap" function already exists, people would expect that and look for a function for the inverse operation (at least I did).

On Tue, 24 Nov 2020 at 09:22, Daniele Nicolodi wrote:
> On 24/11/2020 02:49, Nathaniel Smith wrote:
> > How would this proposed function compare to using the modulo operator,
> > like 'arr % (2*pi)'?
>
> I wrote almost the same word by word reply, before realizing that taking
> the modulo loses the sign. The correct operation is slightly more
> complex (untested):
>
>     def wrap(alpha):
>         return (alpha + np.pi) % (2.0 * np.pi) - np.pi
>
> However, I don't think there is much value in adding something so
> trivial as a function to numpy: I cannot think of any commonly used
> algorithm that requires wrapping the phase, and it is going to be an
> infinite source of bikeshedding whether the wrapped range should be
> [-pi, pi) or (-pi, pi] or (0, 2*pi] or [0, 2*pi).
>
> Cheers,
> Dan
>
> > On Mon, Nov 23, 2020, 16:13 Thomas wrote:
> > > Hi,
> > >
> > > I have a proposal for a feature and I hope this is the right place
> > > to post this.
> > >
> > > The idea is to have a function to map any input angle to the range
> > > of [ 0, 2*pi ] or [ -pi, pi ].
> > >
> > > There already is a function called 'unwrap' that does the opposite,
> > > so I'd suggest calling this function 'wrap'.
> > >
> > > Example usage:
> > >
> > >     # wrap to range [ 0, 2*pi ]
> > >     >>> np.wrap([ -2*pi, -pi, 0, 4*pi ])
> > >     [0, pi, 0, 2*pi]
> > >
> > > There is some ambiguity regarding what the solution should be for
> > > the extremes. An example would be an input of 4*pi, as both 0 and
> > > 2*pi would be valid mappings.
> > >
> > > There has been interest in this topic in the community (see
> > > https://stackoverflow.com/questions/15927755/opposite-of-numpy-unwrap).
> > >
> > > Similar functions exist for Matlab (see
> > > https://de.mathworks.com/help/map/ref/wrapto2pi.html). They solved
> > > the ambiguity by mapping "positive multiples of 2*pi map to 2*pi and
> > > negative multiples of 2*pi map to 0" for the 0 to 2*pi case.

From daniele at grinta.net  Tue Nov 24 06:37:15 2020
From: daniele at grinta.net (Daniele Nicolodi)
Date: Tue, 24 Nov 2020 12:37:15 +0100
Subject: [Numpy-discussion] NumPy Feature Request: Function to wrap angles to range [ 0, 2*pi] or [ -pi, pi ]
In-Reply-To: References: <097a376b-d153-d6b7-8e6f-d305d444c099@grinta.net>
Message-ID:

On 24/11/2020 10:25, Thomas wrote:
> Like Nathaniel said, it would not improve much when compared to the
> modulo operator.
>
> It could handle the edge cases better, but really the biggest benefit
> would be that it is more convenient.

Which edge cases? Better how?

> And as the "unwrap" function already exists,

The unwrap() function exists because it is not as trivial.

> people would expect that and look for a function for the inverse
> operation (at least I did).

What is your use of a wrap() function? I cannot think of any.

Cheers,
Dan

From malyasova.viktoriya at yandex.ru  Tue Nov 24 07:49:18 2020
From: malyasova.viktoriya at yandex.ru (Viktoriya Malyasova)
Date: Tue, 24 Nov 2020 15:49:18 +0300
Subject: [Numpy-discussion] Added Rivest-Floyd selection algorithm as an option to numpy.partition
Message-ID: <18811251606219790@mail.yandex.ru>

An HTML attachment was scrubbed...

From ralf.gommers at gmail.com  Tue Nov 24 09:46:45 2020
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Tue, 24 Nov 2020 14:46:45 +0000
Subject: [Numpy-discussion] NumPy Feature Request: Function to wrap angles to range [ 0, 2*pi] or [ -pi, pi ]
In-Reply-To: References: <097a376b-d153-d6b7-8e6f-d305d444c099@grinta.net>
Message-ID:

On Tue, Nov 24, 2020 at 11:37 AM Daniele Nicolodi wrote:
> On 24/11/2020 10:25, Thomas wrote:
> > Like Nathaniel said, it would not improve much when compared to the
> > modulo operator.
> >
> > It could handle the edge cases better, but really the biggest benefit
> > would be that it is more convenient.
>
> Which edge cases? Better how?
>
> > And as the "unwrap" function already exists,
>
> The unwrap() function exists because it is not as trivial.

I agree, we prefer not to add trivial functions like this. To help those few people that may need this, maybe just add the one-liner Daniele gave to the Notes section of unwrap()?
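
(A quick check of Daniele's one-liner, with the 2*pi grouping made explicit; this maps to the half-open interval [-pi, pi):)

    import numpy as np

    def wrap(alpha):
        # map angles to [-pi, pi)
        return (alpha + np.pi) % (2.0 * np.pi) - np.pi

    print(wrap(np.array([-2 * np.pi, -np.pi, 0.0, 4 * np.pi])))
    # [ 0.         -3.14159265  0.          0.        ]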
Cheers,
Ralf

> > people would expect that and look for a function for the inverse
> > operation (at least I did).
>
> What is your use of a wrap() function? I cannot think of any.
>
> Cheers,
> Dan
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion

From pierre.augier at univ-grenoble-alpes.fr  Tue Nov 24 10:47:05 2020
From: pierre.augier at univ-grenoble-alpes.fr (PIERRE AUGIER)
Date: Tue, 24 Nov 2020 16:47:05 +0100 (CET)
Subject: [Numpy-discussion] Comment published in Nature Astronomy about The ecological impact of computing with Python
Message-ID: <1941664094.43615172.1606232825156.JavaMail.zimbra@univ-grenoble-alpes.fr>

Hi,

I recently took a bit of time to study the comment "The ecological impact of high-performance computing in astrophysics" published in Nature Astronomy (Zwart, 2020, https://www.nature.com/articles/s41550-020-1208-y, https://arxiv.org/pdf/2009.11295.pdf), where it is stated that "Best however, for the environment is to abandon Python for a more environmentally friendly (compiled) programming language.".

I wrote a simple Python-Numpy implementation of the problem used for this study (https://www.nbabel.org) and, accelerated by Transonic-Pythran, it's very efficient. Here are some numbers (elapsed times in s, smaller is better):

| # particles | Py  | C++ | Fortran | Julia |
|-------------|-----|-----|---------|-------|
| 1024        | 29  | 55  | 41      | 45    |
| 2048        | 123 | 231 | 166     | 173   |

The code and a modified figure are here: https://github.com/paugier/nbabel (There is no check on the results for https://www.nbabel.org, so one still has to be very careful.)

I think that the Numpy community should spend a bit of energy to show what can be done with the existing tools to get very high performance (and low CO2 production) with Python. This work could be the basis of a serious reply to the comment by Zwart (2020).

Unfortunately the Python solution in https://www.nbabel.org is very bad in terms of performance (and therefore CO2 production). It is also true for most of the Python solutions for the Computer Language Benchmarks Game in https://benchmarksgame-team.pages.debian.net/benchmarksgame/ (codes here https://salsa.debian.org/benchmarksgame-team/benchmarksgame#what-else).

We could try to fix this so that people see that in many cases, it is not necessary to "abandon Python for a more environmentally friendly (compiled) programming language". One of the longest and hardest tasks would be to implement the different cases of the Computer Language Benchmarks Game in standard and modern Python-Numpy. Then, optimizing and accelerating such code should be doable and we should be able to get very good performance at least for some cases. Good news for this project: (i) the first point can be done by anyone with good knowledge in Python-Numpy (many potential workers), (ii) for some cases, there are already good Python implementations, and (iii) the work can easily be parallelized.
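
To give a flavor of what "moving the compiled-interpreted boundary outside the hot loops" looks like, here is a minimal sketch (assuming Transonic's `boost` decorator with Pythran as the backend; not the actual nbabel code):

    import numpy as np
    from transonic import boost

    @boost
    def norm2(x: "float[:]"):
        # explicit hot loop, compiled ahead of time by Pythran;
        # falls back to plain Python-NumPy when no backend is used
        total = 0.0
        for i in range(x.shape[0]):
            total += x[i] * x[i]
        return total

    print(norm2(np.ones(10)))  # 10.0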
From ilhanpolat at gmail.com Tue Nov 24 12:11:13 2020
From: ilhanpolat at gmail.com (Ilhan Polat)
Date: Tue, 24 Nov 2020 18:11:13 +0100
Subject: [Numpy-discussion] Comment published in Nature Astronomy about The ecological impact of computing with Python
In-Reply-To: <1941664094.43615172.1606232825156.JavaMail.zimbra@univ-grenoble-alpes.fr>
References: <1941664094.43615172.1606232825156.JavaMail.zimbra@univ-grenoble-alpes.fr>

Do we have to take it seriously to start with? Because, with absolutely no
offense meant, I am having significant difficulty doing so.

On Tue, Nov 24, 2020 at 4:58 PM PIERRE AUGIER <pierre.augier at univ-grenoble-alpes.fr> wrote:
> [...]
From sebastian at sipsolutions.net Tue Nov 24 12:25:02 2020
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Tue, 24 Nov 2020 11:25:02 -0600
Subject: [Numpy-discussion] Comment published in Nature Astronomy about The ecological impact of computing with Python
In-Reply-To: <1941664094.43615172.1606232825156.JavaMail.zimbra@univ-grenoble-alpes.fr>
References: <1941664094.43615172.1606232825156.JavaMail.zimbra@univ-grenoble-alpes.fr>
Message-ID: <11dd8b14a9b25d8d1125e62079025ab629cb4648.camel@sipsolutions.net>

On Tue, 2020-11-24 at 16:47 +0100, PIERRE AUGIER wrote:
> [...]
> Is there already something planned to answer Zwart (2020)?

I don't think there is any need for rebuttal. The author is right: you
should not write the core of an N-body simulation in Python :). I
completely disagree with the focus on programming languages/tooling, quite
honestly. A PhD who writes performance-critical code must get the
education necessary to do it well. That may mean learning something beyond
Python, but not replacing Python entirely.

In one point the opinion notes:

    NumPy, for example, is mostly used for its advanced array handling and
    support functions. Using these will reduce runtime and, therefore,
    also carbon emission, but optimization is generally stopped as soon as
    the calculation runs within an unconsciously determined reasonable
    amount of time, such as the coffee-refill timescale or a holiday
    weekend.

IMO, this applies to any other programming language just as much. If your
correlation is fast enough, you will not invest time in implementing an
FFT-based algorithm.
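To make that concrete, the algorithmic gap looks like this (a minimal
sketch; np.correlate uses the direct O(n^2) method):

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.standard_normal(4096)
    y = rng.standard_normal(4096)

    # Direct cross-correlation: O(n**2) operations, all in compiled C.
    c_direct = np.correlate(x, y, mode="full")

    # FFT-based equivalent: O(n log n) operations.
    n = x.size + y.size - 1
    nfft = 1 << (n - 1).bit_length()   # next power of two >= n
    c_fft = np.fft.irfft(
        np.fft.rfft(x, nfft) * np.fft.rfft(y[::-1], nfft), nfft
    )[:n]

    assert np.allclose(c_direct, c_fft)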
If you iterate your array in Fortran order instead of C order in your C++
program (which new users may just do randomly), you are likely to waste
more(!) CPU cycles than if you were using NumPy :). Personally, I am
always curious how much of that "GPUs are faster" factor is actually due
to the effort spent on making it faster...

My angle is that in the end, it is far more about technical knowledge than
about using the "right" language. An example: at an old workplace we had
had some simulations running five times slower, because years earlier
someone forgot to set `RELEASE=True` in the default config, always
compiling in debug mode! But honestly, if it was 5 times faster, we would
probably have done at least 3 times as many simulations :). Aside from
that, most complex C/C++ programs can probably be sped up significantly
just as well.

In the end, my main reading is that code running on power-hungry machines
(clusters, workstations) should maybe be audited for performance. Yes!
(Although even then, resources tend to get used, no matter how much you
have!)

As for actually doing something to reduce the carbon footprint, I think
the vast majority of our users would have more impact if they throttle
their CPUs a bit rather than worry about what tool they use to do their
job :).

Cheers,

Sebastian

From andy.terrel at gmail.com Tue Nov 24 12:27:52 2020
From: andy.terrel at gmail.com (Andy Ray Terrel)
Date: Tue, 24 Nov 2020 11:27:52 -0600
Subject: [Numpy-discussion] Comment published in Nature Astronomy about The ecological impact of computing with Python
References: <1941664094.43615172.1606232825156.JavaMail.zimbra@univ-grenoble-alpes.fr>

I think we, the community, do have to take it seriously. NumPy and the
rest of the ecosystem are trying to raise money to hire developers. This
sentiment, which is much wider than a single paper, is a prevalent
roadblock.

-- Andy

On Tue, Nov 24, 2020 at 11:12 AM Ilhan Polat <ilhanpolat at gmail.com> wrote:
> Do we have to take it seriously to start with? Because, with absolutely
> no offense meant, I am having significant difficulty doing so.
> > [...]

From Jerome.Kieffer at esrf.fr Tue Nov 24 12:41:45 2020
From: Jerome.Kieffer at esrf.fr (Jerome Kieffer)
Date: Tue, 24 Nov 2020 18:41:45 +0100
Subject: [Numpy-discussion] Comment published in Nature Astronomy about The ecological impact of computing with Python
In-Reply-To: <1941664094.43615172.1606232825156.JavaMail.zimbra@univ-grenoble-alpes.fr>
Message-ID: <20201124184145.49580b75@patagonia>

Hi Pierre,

I agree with your point of view: the author wants to demonstrate that C++
and Fortran are better than Python... and environmentally speaking he has
some evidence.

We develop with Python, Cython, Numpy, and OpenCL, and what annoys me most
is the compilation time needed during the development of those statically
typed, ahead-of-time-compiled extensions (C++, C, Fortran).

Clearly the author wants to get his article viral, and in a sense he
managed :). But he did not mention Julia / Numba and other JIT-compiled
languages (including Matlab?) that probably outperform C++ / Fortran once
development and test time are taken into account. Besides this, the OpenMP
parallelism (implicitly advertised) is far from scaling well on
multi-socket systems, and other programming paradigms are needed to
extract the best performance from supercomputers.

Cheers,

Jerome

From compl.yue at icloud.com Tue Nov 24 13:21:53 2020
From: compl.yue at icloud.com (YueCompl)
Date: Wed, 25 Nov 2020 02:21:53 +0800
Subject: [Numpy-discussion] Comment published in Nature Astronomy about The ecological impact of computing with Python
In-Reply-To: <1941664094.43615172.1606232825156.JavaMail.zimbra@univ-grenoble-alpes.fr>
References: <1941664094.43615172.1606232825156.JavaMail.zimbra@univ-grenoble-alpes.fr>
Message-ID: <488160C0-E5A0-40DA-81EB-55266B81495D@icloud.com>

Is there some community interest to develop fusion-based high-performance
array programming? Something like
https://github.com/AccelerateHS/accelerate#an-embedded-language-for-accelerated-array-computations,
but that embedded DSL is far less pleasing than Python as the surface
language for optimized Numpy code in C.

I imagine that we might be able to transpile a Numpy program into fused
LLVM IR, then deploy part as host code on CPUs and part as CUDA code on
GPUs?
I know Numba is already doing the array part, but it is too limited in
addressing more complex non-array data structures. I had been approaching
~20K separate data series with some intermediate variables for each; it
took up to 30+ GB of RAM and kept compiling, yet gave no result after 10+
hours.

Compl

> On 2020-11-24, at 23:47, PIERRE AUGIER <pierre.augier at univ-grenoble-alpes.fr> wrote:
> [...]
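For readers who have not used it, the array part that Numba does handle
can be sketched like this (a minimal sketch, assuming Numba is installed;
the function is made up for illustration):

    import numpy as np
    from numba import njit

    @njit
    def fused_norm(xs, ys, zs):
        # The products and the sum are fused into a single compiled
        # loop: no temporary arrays are allocated along the way.
        total = 0.0
        for i in range(xs.size):
            total += xs[i] * xs[i] + ys[i] * ys[i] + zs[i] * zs[i]
        return np.sqrt(total)

    # The first call triggers compilation; later calls run at native speed.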
From sebastian at sipsolutions.net Tue Nov 24 13:22:50 2020
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Tue, 24 Nov 2020 12:22:50 -0600
Subject: [Numpy-discussion] Comment published in Nature Astronomy about The ecological impact of computing with Python
In-Reply-To: <20201124184145.49580b75@patagonia>
References: <1941664094.43615172.1606232825156.JavaMail.zimbra@univ-grenoble-alpes.fr> <20201124184145.49580b75@patagonia>

On Tue, 2020-11-24 at 18:41 +0100, Jerome Kieffer wrote:
> [...]

As an interesting aside: algorithms may have actually improved *more* than
computational speed when it comes to performance [1]. That shows the
impressive scale and complexity of efficient code.

So, I could possibly argue that the most important thing may well be
accessibility of algorithms. And I think that is what a large chunk of
Scientific Python packages are all about.

Whether or not that has an impact on the environment...

Cheers,

Sebastian

[1] This was the first resource I found, I am sure there are plenty:
https://www.lanl.gov/conferences/salishan/salishan2004/womble.pdf
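One everyday NumPy illustration of that accessibility is selection instead
of sorting, already built in (a minimal sketch):

    import numpy as np

    rng = np.random.default_rng(0)
    a = rng.standard_normal(10_000_000)
    k = a.size // 2

    # Median element via a full sort: O(n log n).
    m_sort = np.sort(a)[k]

    # The same element via selection (introselect): O(n) on average.
    m_part = np.partition(a, k)[k]

    assert m_sort == m_part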
From einstein.edison at gmail.com Tue Nov 24 13:27:59 2020
From: einstein.edison at gmail.com (Hameer Abbasi)
Date: Tue, 24 Nov 2020 19:27:59 +0100
Subject: [Numpy-discussion] Comment published in Nature Astronomy about The ecological impact of computing with Python
In-Reply-To: <488160C0-E5A0-40DA-81EB-55266B81495D@icloud.com>
References: <1941664094.43615172.1606232825156.JavaMail.zimbra@univ-grenoble-alpes.fr> <488160C0-E5A0-40DA-81EB-55266B81495D@icloud.com>

Hello,

We're trying to do a part of this in the TACO team, and with a Python
wrapper in the form of PyData/Sparse. It will allow abstract
array handling/scheduling to take place, but there are a bunch of
constraints, the most important one being that a C compiler cannot be
required at runtime.

However, this may take a while to materialize, as we need an LLVM backend,
a Python wrapper (matching the NumPy API), and support for arbitrary
functions (like universal functions).

https://github.com/tensor-compiler/taco
http://fredrikbk.com/publications/kjolstad-thesis.pdf

--
Sent from Canary (https://canarymail.io)

> On Dienstag, Nov. 24, 2020 at 7:22 PM, YueCompl <compl.yue at icloud.com> wrote:
> [...]
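On the PyData/Sparse side, the NumPy-matching interface already exists
today; roughly (a minimal sketch, assuming the `sparse` package is
installed; check the current docs, since API details may differ between
versions):

    import numpy as np
    import sparse

    # A 2-D COO array with ~1% nonzeros that behaves much like an ndarray.
    s = sparse.random((1000, 1000), density=0.01)

    col_sums = s.sum(axis=0)     # reductions follow the NumPy API
    d = s.todense()              # convert back when needed
    assert np.allclose(col_sums.todense(), d.sum(axis=0))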
From ilhanpolat at gmail.com Tue Nov 24 13:50:41 2020
From: ilhanpolat at gmail.com (Ilhan Polat)
Date: Tue, 24 Nov 2020 19:50:41 +0100
Subject: [Numpy-discussion] Comment published in Nature Astronomy about The ecological impact of computing with Python
References: <1941664094.43615172.1606232825156.JavaMail.zimbra@univ-grenoble-alpes.fr> <488160C0-E5A0-40DA-81EB-55266B81495D@icloud.com>

Measuring the running time of a program in an arbitrary programming
language is not an objective metric.
Otherwise we would force everyone to code in Assembler and be done as
quickly as possible. Hire 5 people to come to the workplace for 6 months
to optimize it, and their transportation alone will eat into the savings.
There is a reason for not doing so. Alternatively, any time shaved off
from this will be spent on the extremely inefficient i9 laptops that
developers use while debugging the type issues. As the author themselves
admits, the development speed would justify the loss encountered from the
actual code running. So this study is suggestive at the very best and,
like my rebuttal, very difficult to verify.

I do Industrial IoT for a living, and while I wholeheartedly agree with
the intentions, I would seriously question the power metrics given here,
because similarly I can easily show a steel factory to be very efficient
if I am not careful. Especially tying the code quality to the programming
language is a very slippery slope that I have been hearing from Fortran
people for the last 20 years.

> I think we, the community, do have to take it seriously. NumPy and the
> rest of the ecosystem are trying to raise money to hire developers. This
> sentiment, which is much wider than a single paper, is a prevalent
> roadblock.

I don't get this sentence.

On Tue, Nov 24, 2020 at 7:29 PM Hameer Abbasi <einstein.edison at gmail.com> wrote:
> [...]
From ben.v.root at gmail.com Tue Nov 24 13:52:51 2020
From: ben.v.root at gmail.com (Benjamin Root)
Date: Tue, 24 Nov 2020 13:52:51 -0500
Subject: [Numpy-discussion] Comment published in Nature Astronomy about The ecological impact of computing with Python
References: <1941664094.43615172.1606232825156.JavaMail.zimbra@univ-grenoble-alpes.fr> <20201124184145.49580b75@patagonia>

Given that AWS and Azure have both made commitments to have their data
centers be carbon neutral, and given that electricity and heat production
make up ~25% of GHG pollution, I find these sorts of
power-usage-analysis-for-the-sake-of-the-environment to be a bit
disingenuous. Especially since GHG pollution from power generation is
forecast to shrink as more power is generated by alternative means. I am
fine with improving Python performance, but let's not fool ourselves into
thinking that it is going to have any meaningful impact on the environment.

Ben Root

https://sustainability.aboutamazon.com/environment/the-cloud?energyType=true
https://azure.microsoft.com/en-au/global-infrastructure/sustainability/#energy-innovations
https://www.epa.gov/ghgemissions/global-greenhouse-gas-emissions-data

On Tue, Nov 24, 2020 at 1:25 PM Sebastian Berg <sebastian at sipsolutions.net> wrote:
> [...]
From charlesr.harris at gmail.com Tue Nov 24 14:06:40 2020
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 24 Nov 2020 12:06:40 -0700
Subject: [Numpy-discussion] Comment published in Nature Astronomy about The ecological impact of computing with Python
References: <1941664094.43615172.1606232825156.JavaMail.zimbra@univ-grenoble-alpes.fr> <20201124184145.49580b75@patagonia>

On Tue, Nov 24, 2020 at 11:54 AM Benjamin Root <ben.v.root at gmail.com> wrote:
> [...] I am fine with improving Python performance, but let's not fool
> ourselves into thinking that it is going to have any meaningful impact
> on the environment.

Bingo. I lived through the Freon ozone panic that lasted for 20 years,
even after the key reaction rate was remeasured and found to be 75-100
times slower than that used in the research that started the panic. The
models never recovered, but the panic persisted until it magically
disappeared in 1994. There are still ozone holes over the Antarctic; last
time I looked they were explained as due to an influx of cold air.

If you want to deal with GHG, push nuclear power.

Chuck

From ben.v.root at gmail.com Tue Nov 24 14:27:17 2020
From: ben.v.root at gmail.com (Benjamin Root)
Date: Tue, 24 Nov 2020 14:27:17 -0500
Subject: [Numpy-discussion] Comment published in Nature Astronomy about The ecological impact of computing with Python
References: <1941664094.43615172.1606232825156.JavaMail.zimbra@univ-grenoble-alpes.fr> <20201124184145.49580b75@patagonia>

Digressing here, but the ozone hole over the Antarctic was always going to
take time to recover because of the approximately 50-year residence time
of the CFCs in the upper atmosphere. Cold temperatures can actually speed
up depletion because of certain ice crystal formations that give a boost
to the CFC+sunlight+O3 reaction rate. Note that it doesn't mean that 50
years are needed to get rid of all CFCs in the atmosphere; it is just a
measure of the amount of time it is expected to take for half of the gas
that is already there to be removed. That doesn't account for the amount
of time it has taken for CFC usage to drop in the first place, and the
fact that there is still CFC pollution occurring (albeit far less than in
the 80's).
Ben Root

https://ozone.unep.org/nasa-provides-first-direct-evidence-ozone-hole-recovery
https://csl.noaa.gov/assessments/ozone/1998/faq11.html

On Tue, Nov 24, 2020 at 2:07 PM Charles R Harris <charlesr.harris at gmail.com> wrote:
> [...]

From charlesr.harris at gmail.com Tue Nov 24 14:43:01 2020
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 24 Nov 2020 12:43:01 -0700
Subject: [Numpy-discussion] Comment published in Nature Astronomy about The ecological impact of computing with Python
References: <1941664094.43615172.1606232825156.JavaMail.zimbra@univ-grenoble-alpes.fr> <20201124184145.49580b75@patagonia>

On Tue, Nov 24, 2020 at 12:28 PM Benjamin Root <ben.v.root at gmail.com> wrote:
> [...]

Out of curiosity, has the ice crystal acceleration been established in the
lab? I recall it being proposed to help save the models, but that was a
long time ago. IIRC, another reaction rate was remeasured in 2005 and
found to be 10X lower than thought, but I don't recall which one. I've
been looking for a good recent review article to see what the current
status is. The funding mostly disappeared after 1994, along with several
careers. Freon is still used -- off the books -- in several countries, a
phenomenon now seen with increasing coal generation.

Chuck
From alan.isaac at gmail.com Tue Nov 24 14:50:07 2020
From: alan.isaac at gmail.com (Alan G. Isaac)
Date: Tue, 24 Nov 2020 14:50:07 -0500
Subject: [Numpy-discussion] Comment published in Nature Astronomy about The ecological impact of computing with Python
References: <1941664094.43615172.1606232825156.JavaMail.zimbra@univ-grenoble-alpes.fr> <20201124184145.49580b75@patagonia>

On 11/24/2020 2:06 PM, Charles R Harris wrote:
> There are still ozone holes over the Antarctic, last time I looked they
> were explained as due to an influx of cold air.

I believe industrial CFC usage, which has fallen since the Montreal
Protocol, is still considered the primary culprit in ozone layer thinning.
Is there a particular model you have in mind? (Ideally one with publicly
available source code and some data.)

On 11/24/2020 2:06 PM, Charles R Harris wrote:
> If you want to deal with GHG, push nuclear power.

Yes. However, solar is becoming competitive in some regions for cost per
watt, and avoids the worst waste disposal issues.

fwiw,
Alan Isaac

From asmeurer at gmail.com Tue Nov 24 16:10:37 2020
From: asmeurer at gmail.com (Aaron Meurer)
Date: Tue, 24 Nov 2020 14:10:37 -0700
Subject: [Numpy-discussion] Comment published in Nature Astronomy about The ecological impact of computing with Python
In-Reply-To: <1941664094.43615172.1606232825156.JavaMail.zimbra@univ-grenoble-alpes.fr>
References: <1941664094.43615172.1606232825156.JavaMail.zimbra@univ-grenoble-alpes.fr>

This always seems like such a ridiculous argument. If CO2 emissions are
directly proportional to the time it takes for a program to run, then
there's no real need to concern ourselves with it. People already have a
direct reason to avoid programs that take a long time to run, namely, that
they take a long time to run. If I have two codes that compute the same
thing and one takes a week and the other takes a few minutes, then
obviously I will choose the one that takes a few minutes, and my decision
will have nothing to do with ecological impact. The real issue with CO2
emissions lies in instances where the agency is completely removed and the
people damaging the environment don't suffer any ill effects from it.

It would be more intellectually honest to try to determine why it is that
people choose Python, an apparently very slow language, to do
high-performance computing. If one spends even a moment thinking about
this, and actually looking at what the real scientific Python community
does, one would realize that simply having a fast core in Python is enough
for the majority of performance. NumPy array expressions are fast because
the core loops are fast, and those dominate the runtime for the majority
of uses. And for instances where that isn't fast enough, e.g., when
writing a looping algorithm directly, there are multiple tools that allow
writing fast Python or Python-like code, such as Numba, Cython, Pythran,
PyPy, and so on.

Aaron Meurer

On Tue, Nov 24, 2020 at 8:57 AM PIERRE AUGIER <pierre.augier at univ-grenoble-alpes.fr> wrote:
> [...]
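The fast-core point is easy to see in a quick measurement (a minimal
sketch; absolute numbers are machine-dependent):

    import numpy as np
    from timeit import timeit

    a = np.random.default_rng(0).standard_normal(1_000_000)

    def py_loop(arr):
        total = 0.0
        for x in arr:                   # one interpreter step per element
            total += x * x
        return total

    def np_expr(arr):
        return float(np.dot(arr, arr))  # the loop runs in compiled C

    t_py = timeit(lambda: py_loop(a), number=10)
    t_np = timeit(lambda: np_expr(a), number=10)
    # t_py / t_np is typically in the hundreds.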
> > I wrote a simple Python-Numpy implementation of the problem used for this study (https://www.nbabel.org) and, accelerated by Transonic-Pythran, it's very efficient. Here are some numbers (elapsed times in s, smaller is better): > > | # particles | Py | C++ | Fortran | Julia | > |-------------|-----|-----|---------|-------| > | 1024 | 29 | 55 | 41 | 45 | > | 2048 | 123 | 231 | 166 | 173 | > > The code and a modified figure are here: https://github.com/paugier/nbabel (There is no check on the results for https://www.nbabel.org, so one still has to be very careful.) > > I think that the Numpy community should spend a bit of energy to show what can be done with the existing tools to get very high performance (and low CO2 production) with Python. This work could be the basis of a serious reply to the comment by Zwart (2020). > > Unfortunately the Python solution in https://www.nbabel.org is very bad in terms of performance (and therefore CO2 production). It is also true for most of the Python solutions for the Computer Language Benchmarks Game in https://benchmarksgame-team.pages.debian.net/benchmarksgame/ (codes here https://salsa.debian.org/benchmarksgame-team/benchmarksgame#what-else). > > We could try to fix this so that people see that in many cases, it is not necessary to "abandon Python for a more environmentally friendly (compiled) programming language". One of the longest and hardest task would be to implement the different cases of the Computer Language Benchmarks Game in standard and modern Python-Numpy. Then, optimizing and accelerating such code should be doable and we should be able to get very good performance at least for some cases. Good news for this project, (i) the first point can be done by anyone with good knowledge in Python-Numpy (many potential workers), (ii) for some cases, there are already good Python implementations and (iii) the work can easily be parallelized. > > It is not a criticism, but the (beautiful and very nice) new Numpy website https://numpy.org/ is not very convincing in terms of performance. It's written "Performant The core of NumPy is well-optimized C code. Enjoy the flexibility of Python with the speed of compiled code." It's true that the core of Numpy is well-optimized C code but to seriously compete with C++, Fortran or Julia in terms of numerical performance, one needs to use other tools to move the compiled-interpreted boundary outside the hot loops. So it could be reasonable to mention such tools (in particular Numba, Pythran, Cython and Transonic). > > Is there already something planned to answer to Zwart (2020)? > > Any opinions or suggestions on this potential project? > > Pierre > > PS: Of course, alternative Python interpreters (PyPy, GraalPython, Pyjion, Pyston, etc.) could also be used, especially if HPy (https://github.com/hpyproject/hpy) is successful (C core of Numpy written in HPy, Cython able to produce HPy code, etc.). However, I tend to be a bit skeptical in the ability of such technologies to reach very high performance for low-level Numpy code (performance that can be reached by replacing whole Python functions with optimized compiled code). Of course, I hope I'm wrong! IMHO, it does not remove the need for a successful HPy! 
> --
> Pierre Augier - CR CNRS           http://www.legi.grenoble-inp.fr
> LEGI (UMR 5519) Laboratoire des Ecoulements Geophysiques et Industriels
> BP53, 38041 Grenoble Cedex, France          tel:+33.4.56.52.86.16
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion

From thomasbbrunner at gmail.com  Tue Nov 24 17:54:09 2020
From: thomasbbrunner at gmail.com (Thomas)
Date: Tue, 24 Nov 2020 23:54:09 +0100
Subject: [Numpy-discussion] NumPy Feature Request: Function to wrap angles to range [ 0, 2*pi] or [ -pi, pi ]
In-Reply-To: 
References: <097a376b-d153-d6b7-8e6f-d305d444c099@grinta.net>
Message-ID: 

I use my own implementation of the wrap function in kinematics and
kinetics (robotics). Solutions beyond [0, 2pi] or [-pi, pi] can cause
some problems when combined with learning algorithms, so we wrap them.

Interestingly, today I reviewed code for a teammate. He had the exact
same problem, but did not think much about it and solved it with if-else
statements.

But yes, maybe this is too specific and trivial for a Numpy function.
Thanks for taking the time to look into it!

On Tue, 24 Nov 2020 at 15:47, Ralf Gommers wrote:
> On Tue, Nov 24, 2020 at 11:37 AM Daniele Nicolodi wrote:
>> On 24/11/2020 10:25, Thomas wrote:
>>> Like Nathaniel said, it would not improve much when compared to the
>>> modulo operator.
>>>
>>> It could handle the edge cases better, but really the biggest benefit
>>> would be that it is more convenient.
>>
>> Which edge cases? Better how?
>>
>>> And as the "unwrap" function already exists,
>>
>> The unwrap() function exists because it is not as trivial.
>
> I agree, we prefer not to add trivial functions like this. To help those
> few people that may need this, maybe just add the one-liner Daniele gave
> to the Notes section of unwrap()?
>
> Cheers,
> Ralf
>
>>> people would expect that and look for a function for the inverse
>>> operation (at least I did).
>>
>> What is your use of a wrap() function? I cannot think of any.
>>
>> Cheers,
>> Dan
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From sebastian at sipsolutions.net  Tue Nov 24 20:04:26 2020
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Tue, 24 Nov 2020 19:04:26 -0600
Subject: [Numpy-discussion] NumPy Community Meeting Wednesday -- One hour earlier from now on!
Message-ID: <0f62311e1dd4f21833c6d6694fb1803316fc9a2f.camel@sipsolutions.net>

Hi all,

There will be a NumPy Community meeting Wednesday November 25th at 12pm
Pacific Time (19:00 UTC). Everyone is invited and encouraged to join in
and edit the work-in-progress meeting topics and notes at:

https://hackmd.io/76o-IxCjQX2mOXO_wwkcpg?both

Best wishes

Sebastian
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: This is a digitally signed message part
URL: 

From ewm at redtetrahedron.org  Wed Nov 25 00:33:47 2020
From: ewm at redtetrahedron.org (Eric Moore)
Date: Wed, 25 Nov 2020 00:33:47 -0500
Subject: [Numpy-discussion] NumPy Feature Request: Function to wrap angles to range [ 0, 2*pi] or [ -pi, pi ]
In-Reply-To: 
References: <097a376b-d153-d6b7-8e6f-d305d444c099@grinta.net>
Message-ID: 

On Tue, Nov 24, 2020 at 6:38 AM Daniele Nicolodi wrote:
> [...]
>
> What is your use of a wrap() function? I cannot think of any.
>
> Cheers,
> Dan

For what it's worth, this kind of range reduction can be extremely
nontrivial depending on the needs of your application. Look at the
efforts needed to ensure that the trigonometric functions give good
results. This is discussed in this paper:
https://www.csee.umbc.edu/~phatak/645/supl/Ng-ArgReduction.pdf.
I don't think that this belongs in Numpy, but it certainly isn't a
one-liner.

Best,

Eric
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
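For reference, the one-liners under discussion are simple for well-scaled
inputs (a sketch in plain NumPy; the precision caveats Eric raises apply
for very large angles):

    import numpy as np

    angles = np.array([-3.5 * np.pi, 0.5, 7.0])

    # Wrap to [0, 2*pi):
    wrapped = np.mod(angles, 2 * np.pi)

    # Wrap to [-pi, pi):
    centered = np.mod(angles + np.pi, 2 * np.pi) - np.pi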
From compl.yue at icloud.com  Wed Nov 25 02:55:09 2020
From: compl.yue at icloud.com (YueCompl)
Date: Wed, 25 Nov 2020 15:55:09 +0800
Subject: [Numpy-discussion] Comment published in Nature Astronomy about The ecological impact of computing with Python
In-Reply-To: 
References: <1941664094.43615172.1606232825156.JavaMail.zimbra@univ-grenoble-alpes.fr>
 <20201124184145.49580b75@patagonia>
Message-ID: <0489E90B-F8AD-41F3-8007-FA96261A5F37@icloud.com>

I'm imagining a study of programmer and maintainer time spent on a given
problem, tackled in different programming languages; maybe Python can be
shown, on the contrary, to reduce GHG. It goes like this: many
programmers/administrators/managers eat beef or the like as they grow up,
cattle produce a great amount of GHG, and optimization experts need more
years to graduate... So Numpy may actually use less emission to gain
greater yields in work demanding optimization, i.e. the ecosystem scales
up to more people who are not optimization experts themselves, yet get
serious work done.

> On 2020-11-25, at 02:52, Benjamin Root wrote:
>
> Given that AWS and Azure have both made commitments to have their data
> centers be carbon neutral, and given that electricity and heat production
> make up ~25% of GHG pollution, I find these sorts of
> power-usage-analysis-for-the-sake-of-the-environment to be a bit
> disingenuous. Especially since GHG pollution from power generation is
> forecast to shrink as more power is generated by alternative means. I am
> fine with improving python performance, but let's not fool ourselves into
> thinking that it is going to have any meaningful impact on the
> environment.
>
> Ben Root
>
> https://sustainability.aboutamazon.com/environment/the-cloud?energyType=true
> https://azure.microsoft.com/en-au/global-infrastructure/sustainability/#energy-innovations
> https://www.epa.gov/ghgemissions/global-greenhouse-gas-emissions-data
>
> On Tue, Nov 24, 2020 at 1:25 PM Sebastian Berg wrote:
>> On Tue, 2020-11-24 at 18:41 +0100, Jerome Kieffer wrote:
>>> Hi Pierre,
>>>
>>> I agree with your point of view: the author wants to demonstrate that
>>> C++ and Fortran are better than Python... and environmentally speaking
>>> he has some evidence.
>>>
>>> We develop with Python, Cython, Numpy, and OpenCL and what annoys me
>>> most is the compilation time needed for the development of those
>>> statically typed, ahead-of-time compiled extensions (C++, C, Fortran).
>>>
>>> Clearly the author wants his article to go viral and in a sense he
>>> managed :). But he did not mention Julia / Numba and other JIT-compiled
>>> languages (including Matlab?) that are probably outperforming the
>>> C++ / Fortran when considering the development time and test time.
>>> Besides this, the OpenMP parallelism (implicitly advertised) is far
>>> from scaling well on multi-socket systems and other programming
>>> paradigms are needed to extract the best performance from
>>> supercomputers.
>>
>> As an interesting aside: Algorithms may have actually improved *more*
>> than computational speed when it comes to performance [1]. That shows
>> the impressive scale and complexity of efficient code.
>>
>> So, I could possibly argue that the most important thing may well be
>> accessibility of algorithms. And I think that is what a large chunk of
>> Scientific Python packages are all about.
>>
>> Whether or not that has an impact on the environment...
>>
>> Cheers,
>>
>> Sebastian
>>
>> [1] This was the first resource I found, I am sure there are plenty:
>> https://www.lanl.gov/conferences/salishan/salishan2004/womble.pdf
>>
>>> Cheers,
>>>
>>> Jerome
>>>
>>> _______________________________________________
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion at python.org
>>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From compl.yue at icloud.com  Wed Nov 25 03:06:51 2020
From: compl.yue at icloud.com (YueCompl)
Date: Wed, 25 Nov 2020 16:06:51 +0800
Subject: [Numpy-discussion] Comment published in Nature Astronomy about The ecological impact of computing with Python
In-Reply-To: 
References: <1941664094.43615172.1606232825156.JavaMail.zimbra@univ-grenoble-alpes.fr>
 <488160C0-E5A0-40DA-81EB-55266B81495D@icloud.com>
Message-ID: 

Great to know.

Skimmed through the project readme: so TACO is currently generating C code
as an intermediate language. If the purpose is about tensors, why not
Numba's llvmlite for it?

I'm aware that scheduling code tends not to be array programs, and
llvmlite may have been tailored too much toward optimizing more general
programs well. How is TACO going in this regard?

Compl

> On 2020-11-25, at 02:27, Hameer Abbasi wrote:
>
> Hello,
>
> We're trying to do a part of this in the TACO team, and with a Python
> wrapper in the form of PyData/Sparse.
> It will allow abstract array/scheduling to take place, but there are a
> bunch of constraints, the most important one being that a C compiler
> cannot be required at runtime.
>
> However, this may take a while to materialize, as we need an LLVM
> backend, and a Python wrapper (matching the NumPy API), and support for
> arbitrary functions (like universal functions).
>
> https://github.com/tensor-compiler/taco
> http://fredrikbk.com/publications/kjolstad-thesis.pdf
>
> --
> Sent from Canary
>
> On Tuesday, Nov. 24, 2020 at 7:22 PM, YueCompl wrote:
>> Is there some community interest to develop fusion-based
>> high-performance array programming? Something like
>> https://github.com/AccelerateHS/accelerate#an-embedded-language-for-accelerated-array-computations,
>> but that embedded DSL is far less pleasing compared to Python as the
>> surface language for optimized Numpy code in C.
>>
>> I imagine that we might be able to transpile a Numpy program into fused
>> LLVM IR, then deploy part as host code on CPUs and part as CUDA code on
>> GPUs?
>>
>> I know Numba is already doing the array part, but it is too limited in
>> addressing more complex non-array data structures. I had been
>> approaching ~20K separate data series with some intermediate variables
>> for each; it took up to 30+GB of RAM and kept compiling, yet gave no
>> result after 10+ hours.
>>
>> Compl
>>
>>> On 2020-11-24, at 23:47, PIERRE AUGIER wrote:
>>> [...]
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From einstein.edison at gmail.com  Wed Nov 25 04:17:22 2020
From: einstein.edison at gmail.com (Hameer Abbasi)
Date: Wed, 25 Nov 2020 10:17:22 +0100
Subject: [Numpy-discussion] Comment published in Nature Astronomy about The ecological impact of computing with Python
In-Reply-To: 
References: <1941664094.43615172.1606232825156.JavaMail.zimbra@univ-grenoble-alpes.fr>
 <488160C0-E5A0-40DA-81EB-55266B81495D@icloud.com>
Message-ID: <8cb3f78f-70a6-4e60-aff6-e5050501a5cf@Canary>

Hello,

TACO consists of three things:

- An array API
- A scheduling language
- A language for describing sparse modes of the tensor

So it combines arrays with scheduling, and also sparse tensors, for a lot
of different applications. It also includes an auto-scheduler. The code
thus generated is on par with or faster than, e.g., MKL and other
equivalent libraries, with the ability to do fusion for arbitrary
expressions. This is, for more complicated expressions involving sparse
operands, big-O superior to composing the operations.
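A rough illustration of the composition-versus-fusion point, sketched in
plain NumPy rather than TACO's API (the shapes here are arbitrary):
evaluating operation by operation materializes intermediates, while a
single fused expression lets the library pick an order that avoids them:

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.random((200, 300))
    B = rng.random((300, 400))
    x = rng.random(400)

    # Composed: materializes the (200, 400) intermediate A @ B.
    y1 = (A @ B) @ x

    # Fused expression: einsum sees the whole contraction and, with
    # optimize=True, picks an order that avoids the large intermediate.
    y2 = np.einsum('ij,jk,k->i', A, B, x, optimize=True)

    np.testing.assert_allclose(y1, y2)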
The limitations are:

- Right now, it can only compute Einstein-summation type expressions;
  we're (along with Rawn, another member of the TACO team) trying to
  extend that to any kind of point-wise expressions and reductions (such
  as exp(tensor), sum(tensor), ...).
- It requires a C compiler at runtime. We're writing an LLVM backend for
  it that will hopefully remove that requirement.
- It can't do arbitrary non-pointwise functions, e.g. SVD or inverse.
  This is a long way from being completely solved.

As for why not Numba/llvmlite: re-writing TACO is a large task that would
be hard to do; wrapping/extending it is much easier.

Best regards,
Hameer Abbasi

--
Sent from Canary (https://canarymail.io)

> On Wednesday, Nov. 25, 2020 at 9:07 AM, YueCompl wrote:
> [...]
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From compl.yue at icloud.com  Wed Nov 25 05:06:56 2020
From: compl.yue at icloud.com (YueCompl)
Date: Wed, 25 Nov 2020 18:06:56 +0800
Subject: [Numpy-discussion] Comment published in Nature Astronomy about The ecological impact of computing with Python
In-Reply-To: <8cb3f78f-70a6-4e60-aff6-e5050501a5cf@Canary>
References: <1941664094.43615172.1606232825156.JavaMail.zimbra@univ-grenoble-alpes.fr>
 <488160C0-E5A0-40DA-81EB-55266B81495D@icloud.com>
 <8cb3f78f-70a6-4e60-aff6-e5050501a5cf@Canary>
Message-ID: <7BB5AABE-13CA-42B3-B030-3AD3F41C01F8@icloud.com>

Yeah, I get it. llvmlite would only do composition, while TACO is doing
fusion. This is more promising!

Best regards,
Compl

> On 2020-11-25, at 17:17, Hameer Abbasi wrote:
> [...]
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ralf.gommers at gmail.com  Wed Nov 25 13:21:21 2020
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Wed, 25 Nov 2020 18:21:21 +0000
Subject: [Numpy-discussion] NumPy Community Meeting Wednesday -- One hour earlier from now on!
In-Reply-To: <0f62311e1dd4f21833c6d6694fb1803316fc9a2f.camel@sipsolutions.net>
References: <0f62311e1dd4f21833c6d6694fb1803316fc9a2f.camel@sipsolutions.net>
Message-ID: 

On Wed, Nov 25, 2020 at 1:05 AM Sebastian Berg wrote:
> Hi all,
>
> There will be a NumPy Community meeting Wednesday November 25th at 12pm
> Pacific Time (19:00 UTC).

Should be 20:00 UTC (~1.5 hrs from now)

Cheers,
Ralf

> Everyone is invited and encouraged to join in and edit the
> work-in-progress meeting topics and notes at:
>
> https://hackmd.io/76o-IxCjQX2mOXO_wwkcpg?both
>
> Best wishes
>
> Sebastian
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ralf.gommers at gmail.com  Thu Nov 26 09:17:18 2020
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Thu, 26 Nov 2020 14:17:18 +0000
Subject: [Numpy-discussion] Add sliding_window_view method to numpy
In-Reply-To: 
References: <1cdb0b720f09845d03ccfdc2e171f98d7e925ee3.camel@sipsolutions.net>
 <32e8736e-55ed-1155-3da5-003d907c4e65@smhi.se>
Message-ID: 

On Fri, Nov 6, 2020 at 4:03 PM Zimmermann Klaus wrote:
> Hi,
>
> On 06/11/2020 15:58, Ralf Gommers wrote:
>> On Fri, Nov 6, 2020 at 9:51 AM Zimmermann Klaus wrote:
>>> I have absolutely no problem keeping this out of the main namespace.
>>>
>>> In fact I'd like to point out that it was not my idea. Rather, it was
>>> proposed by Bas van Beek in the comments [1,2] and received a little
>>> more scrutiny from Eric Wieser in [3].
>>
>> Thanks, between two PRs with that many comments, I couldn't figure that
>> out - just saw the commit that made the change.
>
> Understandable, no worries.
>
>>> On the subject matter, I am also curious about the potential for
>>> confusion. What other behavior could one expect from a sliding window
>>> view with this shape?
>>>
>>> As I said, I am completely fine with keeping this out of the main
>>> namespace, but I agree with Sebastian's comment, that
>>> `np.lib.stride_tricks` is perhaps not the best namespace.
>>
>> I agree that that's not a great namespace. There are multiple issues
>> with namespaces; we basically have three good ones (fft, linalg,
>> random) and a bunch of other ones that range from questionable to
>> terrible. See
>> https://github.com/numpy/numpy/blob/master/numpy/tests/test_public_api.py#L127
>> for details.
>>
>> This would be a good thing to work on - making the `numpy.lib`
>> namespace not bleed into `numpy` via `import *` is one thing to do
>> there, and there are many others. But given backwards compat
>> constraints it's not easy.
>
> I understand cleaning up all the namespaces is a giant task, so far, far
> out of scope here. As said before, I also completely agree to keep it
> out of the main namespace (though I will still argue below :P).
>
> I was just wondering if, off the top of your head, an existing, better
> fit comes to mind?

Not really. Outside of stride_tricks there's nothing that quite fits.
This function is more in scope for something like scipy.signal.

Cheers,
Ralf
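For context, basic usage of the function under discussion, as merged for
NumPy 1.20 in the np.lib.stride_tricks namespace (a small self-contained
example):

    import numpy as np
    from numpy.lib.stride_tricks import sliding_window_view

    x = np.arange(6)
    v = sliding_window_view(x, window_shape=3)
    # v is a read-only view of shape (4, 3):
    # [[0 1 2], [1 2 3], [2 3 4], [3 4 5]]
    print(v.mean(axis=-1))  # moving average, without copying x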
Thinking of my > > scientist colleagues, I think those are exactly the kind of users > that > > could benefit from such a prototyping tool. > > > > > > That phrasing is one of a number of concerns. NumPy is normally not in > > the business of providing things that are okay as a prototyping tool, > > but are potentially extremely slow (as pointed out in the Notes section > > of the docstring). A function like that would basically not be the right > > tool for almost anything in, e.g., SciPy - it requires an iterative > > algorithm. In NumPy we don't prefer performance at all costs, but in > > general it's pretty decent rather than "Numba or Cython may gain you > > 100x here". > > I still think that the performance concern is a bit overblown. Yes, > application with large windows can need more FLOPs by an equally large > factor. But most such applications will use small to moderate windows. > Furthermore, this view focuses only on FLOPs. In my current field of > climate science (and many others), that is almost never the limiting > factor. Memory demands are far more problematic and incidentally, those > are more likely to increase in other methods that require the storage of > ancillary, temporary data. > > > Other issues include: > > 2) It is very specific to NumPy's memory model (as pointed out by you > > and Sebastian) - just like the rest of stride_tricks > Not wrong, but on the other hand, that memory model is not exotic. C, > Fortran, and any number of other languages play very nicely with this, > just as important downstream libraries like dask. > > > 3) It has "view" in the name, which doesn't quite make sense for the > > main namespace (also connected to point 2 above). > Ok. > > > 4) The cost of putting something in the main namespace for other > > array/tensor libraries is large. Maybe other libraries, e.g. CuPy, Dask, > > TensorFlow, PyTorch, JAX, MXNet, aim to reimplement part or all of the > > main NumPy namespace as well as possible. This would trigger discussions > > and likely many person-weeks of work for others. > Agreed. Though I have to say that my whole motivation comes from > corresponding issues in dask that where specifically waiting for (the > older version of) this PR (see [1, 2,...]). But I understand that dask > is effectively much closer to the numpy memory model than, say, CuPy, so > don't take this to mean it should be in the main namespace. > > > 5) It's a useful function, but it's very much on the margins of NumPy's > > scope. It could easily have gone into, for example, scipy.signal. At > > this point the bar for functions going into the main namespace should > be> (and is) high. > I agree that the bar for the main namespace should be high! > > > All this taken together means it's not even a toss-up for me. If it were > > just one or two of these points, maybe. But given all the above, I'm > > pretty confident saying "it does not belong in the main namespace". > Again, I am happy with that. > > > Thanks for your thoughts and work! I really appreciate it! 
> > Cheers > Klaus > > [1] https://github.com/dask/dask/issues/4659 > [2] https://github.com/pydata/xarray/issues/3608 > [3] https://github.com/pandas-dev/pandas/issues/26959 > > > > > > > > Cheers > > Klaus > > > > > > > > [1] https://github.com/numpy/numpy/pull/17394#issuecomment-700998618 > > > > [2] https://github.com/numpy/numpy/pull/17394#discussion_r498215468 > > > > [3] https://github.com/numpy/numpy/pull/17394#discussion_r498724340 > > > > > > On 06/11/2020 01:39, Sebastian Berg wrote: > > > On Thu, 2020-11-05 at 17:35 -0600, Sebastian Berg wrote: > > >> On Thu, 2020-11-05 at 12:51 -0800, Stephan Hoyer wrote: > > >>> On Thu, Nov 5, 2020 at 11:16 AM Ralf Gommers < > > >>> ralf.gommers at gmail.com > > > >>> wrote: > > >>> > > >>>> On Thu, Nov 5, 2020 at 4:56 PM Sebastian Berg < > > >>>> sebastian at sipsolutions.net > > > >>>> wrote: > > >>>> > > >>>>> Hi all, > > >>>>> > > >>>>> just a brief note that I merged this proposal: > > >>>>> > > >>>>> https://github.com/numpy/numpy/pull/17394 > > > > >>>>> > > >>>>> adding `np.sliding_window_view` into the 1.20 release of NumPy. > > >>>>> > > >>>>> There was only one public API change, and that is that the > > >>>>> `shape` > > >>>>> argument is now called `window_shape`. > > >>>>> > > >>>>> This is still a good time for feedback in case you have a > > >>>>> better > > >>>>> idea > > >>>>> e.g. for the function or parameter names. > > >>>>> > > >>>> > > >>>> The old PR had this in the lib.stride_tricks namespace. Seeing > it > > >>>> in the > > >>>> main namespace is unexpected and likely will lead to > > >>>> issues/questions, > > >>>> given that such an overlapping view is going to do behave in > ways > > >>>> the > > >>>> average user will be surprised by. It may also lead to requests > > >>>> for > > >>>> other > > >>>> array/tensor libraries to implement this. I don't see any > > >>>> discussion on > > >>>> this in PR 17394, it looks like a decision by the PR author that > > >>>> no > > >>>> one > > >>>> commented on - reconsider that? > > >>>> > > >>>> Cheers, > > >>>> Ralf > > >>>> > > >>> > > >>> +1 let's keep this in the lib.stride_tricks namespace. > > >>> > > >> > > >> I have no reservations against having it in the main namespace > and am > > >> happy either way (it can still be exposed later in any case). It > is > > >> the > > >> conservative choice and maybe it is an uncommon enough function > that > > >> it > > >> deserves being a bit hidden... > > > > > > > > > In any case, its the safe bet for NumPy 1.20 at least so I opened > > a PR: > > > > > > https://github.com/numpy/numpy/pull/17720 > > > > > > > > Name changes, etc. are also possible of course. > > > > > > I still think it might be nice to find a better place for this > type of > > > function that `np.lib.stride_tricks` though, but dunno... > > > > > > - Sebastian > > > > > > > > > > > >> > > >> But I am curious, it sounds like you have both very strong > > >> reservations, and I would like to understand them better. > > >> > > >> The behaviour can be surprising, but that is why the default is a > > >> read- > > >> only view. I do not think it is worse than `np.broadcast_to` in > this > > >> regard. (It is nowhere near as dangerous as `as_strided`.) > > >> > > >> It is true that it is specific to NumPy (memory model). So that is > > >> maybe a good enough reason right now. But I am not sure that > > >> stuffing > > >> things into a pretty hidden `np.lib.*` namespaces is a great long > > >> term > > >> solution either. 
There is very little useful functionality hidden > > >> away > > >> in `np.lib.*` currently. > > >> > > >> Cheers, > > >> > > >> Sebastian > > >> > > >>>> > > >>>> > > >>>>> Cheers, > > >>>>> > > >>>>> Sebastian > > >>>>> > > >>>>> > > >>>>> > > >>>>> On Mon, 2020-10-12 at 08:39 +0000, Zimmermann Klaus wrote: > > >>>>>> Hello, > > >>>>>> > > >>>>>> I would like to draw the attention of this list to PR #17394 > > >>>>>> [1] that > > >>>>>> adds the implementation of a sliding window view to numpy. > > >>>>>> > > >>>>>> Having a sliding window view in numpy is a longstanding open > > >>>>>> issue > > >>>>>> (cf > > >>>>>> #7753 [2] from 2016). A brief summary of the discussions > > >>>>>> surrounding > > >>>>>> it > > >>>>>> can be found in the description of the PR. > > >>>>>> > > >>>>>> This PR implements a sliding window view based on stride > > >>>>>> tricks. > > >>>>>> Following the discussion in issue #7753, a first > > >>>>>> implementation > > >>>>>> was > > >>>>>> provided by Fanjin Zeng in PR #10771. After some discussion, > > >>>>>> that PR > > >>>>>> stalled and I picked up the issue in the present PR #17394. > > >>>>>> It > > >>>>>> is > > >>>>>> based > > >>>>>> on the first implementation, but follows the changed API as > > >>>>>> suggested > > >>>>>> by > > >>>>>> Eric Wieser. > > >>>>>> > > >>>>>> Code reviews have been provided by Bas van Beek, Stephen > > >>>>>> Hoyer, > > >>>>>> and > > >>>>>> Eric > > >>>>>> Wieser. Sebastian Berg added the "62 - Python API" label. > > >>>>>> > > >>>>>> > > >>>>>> Do you think this is suitable for inclusion in numpy? > > >>>>>> > > >>>>>> Do you consider the PR ready? > > >>>>>> > > >>>>>> Do you have suggestions or requests? > > >>>>>> > > >>>>>> > > >>>>>> Thanks for your time and consideration! 
> > >>>>>> Klaus
> > >>>>>>
> > >>>>>> [1] https://github.com/numpy/numpy/pull/17394
> > >>>>>> [2] https://github.com/numpy/numpy/issues/7753

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion at python.org
https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From pierre.augier at univ-grenoble-alpes.fr  Thu Nov 26 16:14:40 2020
From: pierre.augier at univ-grenoble-alpes.fr (PIERRE AUGIER)
Date: Thu, 26 Nov 2020 22:14:40 +0100 (CET)
Subject: [Numpy-discussion] Showing by examples how Python-Numpy can be efficient even for computationally intensive tasks
In-Reply-To: 
References: <1941664094.43615172.1606232825156.JavaMail.zimbra@univ-grenoble-alpes.fr>
Message-ID: <304723263.47365520.1606425280293.JavaMail.zimbra@univ-grenoble-alpes.fr>

I changed the email subject because I'd like to focus less on CO2 (a very
interesting subject, but not my focus here) and more on computing...

----- Original Message -----
> From: "Andy Ray Terrel"
> To: "numpy-discussion"
> Sent: Tuesday, 24 November 2020 18:27:52
> Subject: Re: [Numpy-discussion] Comment published in Nature Astronomy
> about The ecological impact of computing with Python

> I think we, the community, do have to take it seriously. NumPy and the
> rest of the ecosystem are trying to raise money to hire developers. This
> sentiment, which is much wider than a single paper, is a prevalent
> roadblock.
>
> -- Andy

I agree. I don't know if it is a matter of scientific field, but I tend to
hear more and more people explaining that they don't use Python because of
performance, or telling me that they don't have performance problems
because they don't use Python.
Some communities (I won't give names) communicate a lot on the bad
performance of Python-Numpy. I am well aware that performance is in many
cases not so important, but it is not a good thing to have such a bad
reputation. I think we have to show what is doable with Python-Numpy code
to get very good performance.

----- Original Message -----
> From: "Sebastian Berg"
> Sent: Tuesday, 24 November 2020 18:25:02
> Subject: Re: [Numpy-discussion] Comment published in Nature Astronomy
> about The ecological impact of computing with Python

>> Is there already something planned to answer to Zwart (2020)?
>
> I don't think there is any need for rebuttal. The author is right:
> you should not write the core of an N-Body simulation in Python :).
> I completely disagree with the focus on programming languages/tooling,
> quite honestly.

I'm not a fan of this focus either. But we have to realize that many
people think like that and are sensitive to such arguments. Being so bad
in all benchmark games does not help the scientific Python community
(especially in the long term).

> A PhD who writes performance critical code must get the education
> necessary to do it well. That may mean learning something beyond Python,
> but not replacing Python entirely.

I'm really not sure. Or at least that depends on the type of performance
critical code. I see many students or scientists who sometimes need to
write a few functions that are not super inefficient. For many people, I
don't see why they would need to learn and use another language.

I did my PhD (in turbulence) with Fortran (and Matlab) and I have really
nothing against Fortran. However, I'm really happy that in my group we
code nearly everything in Python (+ a bit of C++ for the fun). For
example, Fluidsim (https://foss.heptapod.net/fluiddyn/fluidsim) is ~100%
Python and I know that it is very efficient (more efficient than many
alternatives written with a lot of C++/Fortran). I realize that it
wouldn't be possible for all kinds of code (and fluidsim uses fluidfft,
written in C++ / Cython / Python), but being 100% Python has a lot of
advantages (I won't list them here).

For an N-body simulation, why not use Python? Using Python, you get a
very readable, clear and efficient implementation (see
https://github.com/paugier/nbabel), even faster than what you can get
with easy C++/Fortran/Julia. IMHO, it is just what one needs for most
PhDs in astronomy. Of course, for many things, one needs native
languages! Have a look at the C++ code produced by Pythran, it's
beautiful! But I don't think every scientist who writes critical code has
to become an expert in C++ or Fortran (or Julia).

I also sometimes have to read and use C++ and Fortran codes written by
scientists. Sometimes (often), I tend to think that they would be more
productive with other tools to reach the same performance. One can say it
is only a matter of education and not of tooling, but using serious tools
does not make you a serious developer, and reaching the level in
C++/Fortran needed to write efficient, clean, readable and maintainable
code is not so easy for a PhD student or scientist who has other things
to do.

Python-Numpy is so slow for some algorithms that many Python-Numpy users
would benefit from knowing how to accelerate it.
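A minimal sketch of the kind of accelerated function Pierre describes
below, using Transonic's @jit (the loop body here is illustrative and
simplified, not the actual nbabel code):

    import numpy as np
    from transonic import jit

    @jit
    def compute_accelerations(accelerations, masses, positions):
        # Direct O(N^2) summation over all particle pairs.
        nb_particles = masses.size
        for i in range(nb_particles - 1):
            for j in range(i + 1, nb_particles):
                delta = positions[i] - positions[j]
                distance3 = (delta**2).sum() ** 1.5
                accelerations[i] -= masses[j] / distance3 * delta
                accelerations[j] += masses[i] / distance3 * delta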
Just an example, with some elapsed times (in s) for the N-body problem
(see https://github.com/paugier/nbabel#smaller-benchmarks-between-different-python-solutions):

| Transonic-Pythran | Transonic-Numba | High-level Numpy | PyPy OOP | PyPy lists |
|-------------------|-----------------|------------------|----------|------------|
| 0.48              | 3.91            | 686              | 87       | 15         |

For comparison, we have for this case `{"c++": 0.85, "Fortran": 0.62,
"Julia": 2.57}`.

Note that by just adding `from transonic import jit` to the simple
high-level Numpy code and then decorating the function
`compute_accelerations` with `@jit`, the elapsed time decreases to 8 s
(an ~85x speedup, with Pythran 0.9.8).

I conclude from these types of results that we need to tell Python users
how to accelerate their Python-Numpy codes when they feel the need of it.
I think acceleration tools should be mentioned on the Numpy website. I
also think we should spend a bit of energy to play some benchmark games.

It would be much better if we could change the widespread idea on Python
performance for numerical problems from "Python is very slow and
ineffective for most algorithms" to "interpreted Python can be very slow
but with the existing Python accelerators, one can be extremely efficient
with Python".

Pierre

> On Tue, Nov 24, 2020 at 11:12 AM Ilhan Polat <ilhanpolat at gmail.com>
> wrote:
>
> Do we have to take it seriously to start with? Because, with absolutely
> no offense meant, I am having significant difficulty doing so.
>
> On Tue, Nov 24, 2020 at 4:58 PM PIERRE AUGIER
> <pierre.augier at univ-grenoble-alpes.fr> wrote:
> [...]
It is also true for most of the Python solutions for the Computer > Language Benchmarks Game in [ > https://benchmarksgame-team.pages.debian.net/benchmarksgame/ | > https://benchmarksgame-team.pages.debian.net/benchmarksgame/ ] (codes here [ > https://salsa.debian.org/benchmarksgame-team/benchmarksgame#what-else | > https://salsa.debian.org/benchmarksgame-team/benchmarksgame#what-else ] ). > > We could try to fix this so that people see that in many cases, it is not > necessary to "abandon Python for a more environmentally friendly (compiled) > programming language". One of the longest and hardest task would be to > implement the different cases of the Computer Language Benchmarks Game in > standard and modern Python-Numpy. Then, optimizing and accelerating such code > should be doable and we should be able to get very good performance at least > for some cases. Good news for this project, (i) the first point can be done by > anyone with good knowledge in Python-Numpy (many potential workers), (ii) for > some cases, there are already good Python implementations and (iii) the work > can easily be parallelized. > > It is not a criticism, but the (beautiful and very nice) new Numpy website [ > https://numpy.org/ | https://numpy.org/ ] is not very convincing in terms of > performance. It's written "Performant The core of NumPy is well-optimized C > code. Enjoy the flexibility of Python with the speed of compiled code." It's > true that the core of Numpy is well-optimized C code but to seriously compete > with C++, Fortran or Julia in terms of numerical performance, one needs to use > other tools to move the compiled-interpreted boundary outside the hot loops. So > it could be reasonable to mention such tools (in particular Numba, Pythran, > Cython and Transonic). > > Is there already something planned to answer to Zwart (2020)? > > Any opinions or suggestions on this potential project? > > Pierre > > PS: Of course, alternative Python interpreters (PyPy, GraalPython, Pyjion, > Pyston, etc.) could also be used, especially if HPy ( [ > https://github.com/hpyproject/hpy | https://github.com/hpyproject/hpy ] ) is > successful (C core of Numpy written in HPy, Cython able to produce HPy code, > etc.). However, I tend to be a bit skeptical in the ability of such > technologies to reach very high performance for low-level Numpy code > (performance that can be reached by replacing whole Python functions with > optimized compiled code). Of course, I hope I'm wrong! IMHO, it does not remove > the need for a successful HPy! 
From ralf.gommers at gmail.com  Thu Nov 26 17:48:40 2020
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Thu, 26 Nov 2020 22:48:40 +0000
Subject: [Numpy-discussion] Showing by examples how Python-Numpy can be
 efficient even for computationally intensive tasks
In-Reply-To: <304723263.47365520.1606425280293.JavaMail.zimbra@univ-grenoble-alpes.fr>
References: <1941664094.43615172.1606232825156.JavaMail.zimbra@univ-grenoble-alpes.fr>
 <304723263.47365520.1606425280293.JavaMail.zimbra@univ-grenoble-alpes.fr>
Message-ID: 

On Thu, Nov 26, 2020 at 9:15 PM PIERRE AUGIER
<pierre.augier at univ-grenoble-alpes.fr> wrote:

> I conclude from these types of results that we need to tell Python users
> how to accelerate their Python-Numpy codes when they feel the need for
> it. I think acceleration tools should be mentioned on the Numpy website.
> I also think we should spend a bit of energy to play some benchmark
> games.

Good point, added an issue for it on the website repo:
https://github.com/numpy/numpy.org/issues/370

Cheers,
Ralf

From ralf.gommers at gmail.com  Thu Nov 26 18:10:00 2020
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Thu, 26 Nov 2020 23:10:00 +0000
Subject: [Numpy-discussion] Added Rivest-Floyd selection algorithm as an
 option to numpy.partition
In-Reply-To: <18811251606219790@mail.yandex.ru>
References: <18811251606219790@mail.yandex.ru>
Message-ID: 

On Tue, Nov 24, 2020 at 12:56 PM Viktoriya Malyasova
<malyasova.viktoriya at yandex.ru> wrote:

> Hello everyone!
>
> I've implemented the Rivest-Floyd selection algorithm as a second option
> for the partition method. I found it works about 1.5 times faster on
> average for big array sizes; here are average times (in s) for finding a
> median:
>
> | array length | introselect | rivest_floyd |
> |--------------|-------------|--------------|
> | 10           | 4.6e-05     | 4.4e-05      |
> | 100          | 5.5e-05     | 4.7e-05      |
> | 1000         | 6.9e-05     | 6.5e-05      |
> | 10000        | 3.1e-04     | 2.3e-04      |
> | 100000       | 2.9e-03     | 2.0e-03      |
> | 1000000      | 2.9e-02     | 2.0e-02      |
>
> I've created a pull request https://github.com/numpy/numpy/pull/17813 and
> implemented the reviewers' suggestions and fixes. Do you think this
> feature should be added? I am new to open source, sorry if I am doing
> anything wrong.

Hi Viktoriya, welcome! It looks like you're doing everything right, and the
reviews so far are positive.

Cheers,
Ralf
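For context, selection via `np.partition` places the k-th smallest element
at index k, which is how a median can be found without a full sort. A small
sketch is below; note that `kind='rivest_floyd'` is the value proposed in
the PR and is hypothetical until it is merged (released NumPy only accepts
'introselect'):

```python
import numpy as np

a = np.random.default_rng(42).random(1_000_001)
k = a.size // 2

# np.partition puts the k-th order statistic at index k; for an
# odd-length array this element is exactly the median.
median = np.partition(a, k, kind="introselect")[k]
assert median == np.median(a)

# With the PR as proposed, the new algorithm would be selected the same
# way (hypothetical until merged):
# median = np.partition(a, k, kind="rivest_floyd")[k]
```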
From Jerome.Kieffer at esrf.fr  Fri Nov 27 02:20:20 2020
From: Jerome.Kieffer at esrf.fr (Jerome Kieffer)
Date: Fri, 27 Nov 2020 08:20:20 +0100
Subject: [Numpy-discussion] Showing by examples how Python-Numpy can be
 efficient even for computationally intensive tasks
In-Reply-To: <304723263.47365520.1606425280293.JavaMail.zimbra@univ-grenoble-alpes.fr>
References: <1941664094.43615172.1606232825156.JavaMail.zimbra@univ-grenoble-alpes.fr>
 <304723263.47365520.1606425280293.JavaMail.zimbra@univ-grenoble-alpes.fr>
Message-ID: <20201127082020.2b6dfaef@patagonia>

On Thu, 26 Nov 2020 22:14:40 +0100 (CET)
PIERRE AUGIER wrote:

> I changed the email subject because I'd like to focus less on CO2 (a very
> interesting subject, but not my focus here) and more on computing...

Hi Pierre,

We may turn the problem the other way around: one should focus more on the
algorithm than on the programming language. I would like to share one
example with you, where we published how to speed up a crystallographic
computation written in Python:

https://onlinelibrary.wiley.com/iucr/doi/10.1107/S1600576719008471

One referee asked us to validate against equivalent C and Fortran code. The
C code was as fast as Pythran or Cython, and Fortran was still faster (the
std of the Fortran-compiled runtime was much smaller, which allowed Fortran
to be faster by 3 std!). But I consider the difference to be marginal at
this level!

If one considers the "Moore law", i.e. the time needed for "performance" to
double in different aspects of computing, one gets 18 to 24 months for the
number of transistors in a processor, 18 years for compilers, and 2 years
(on average) for the development of new algorithms. In this sense one
should focus more on the algorithm used.

Table 1 of the article is especially interesting: pure Python is 10x slower
than proper Numpy code, and parallel Pythran is 50x faster than Numpy (on
the given computer), but using the proper algorithm, i.e. an FFT in this
case, is 13000x faster!

So I believe that Python, with its expressivity, helps a lot in
understanding the algorithm and hence in designing faster code.

Cheers,

Jerome
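To make this algorithm-versus-micro-optimization point self-contained, here
is a generic sketch (not the crystallographic code from the paper)
contrasting a direct O(N^2) autocorrelation with the mathematically
identical O(N log N) FFT route:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(4096)

# Direct autocorrelation: O(N**2) multiply-adds.
direct = np.correlate(x, x, mode="full")

# Same quantity via the FFT (Wiener-Khinchin): O(N log N).
nfft = 2 * x.size  # >= 2*N - 1, so circular wrap-around cannot mix lags
X = np.fft.rfft(x, nfft)
acf = np.fft.irfft(X * np.conj(X), nfft)
# Negative lags are stored at the end of the circular result; reassemble
# them in front of the non-negative lags to match np.correlate's layout.
fft_based = np.concatenate((acf[-(x.size - 1):], acf[:x.size]))

assert np.allclose(direct, fft_based)
```

Already at N = 4096 the FFT route is faster by a large factor, and the gap
only grows with N; no amount of compiling the direct loop closes an
O(N / log N) gap.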
From klaus.zimmermann at smhi.se  Fri Nov 27 10:10:24 2020
From: klaus.zimmermann at smhi.se (Zimmermann Klaus)
Date: Fri, 27 Nov 2020 15:10:24 +0000
Subject: [Numpy-discussion] Add sliding_window_view method to numpy
In-Reply-To: 
References: <1cdb0b720f09845d03ccfdc2e171f98d7e925ee3.camel@sipsolutions.net>
 <32e8736e-55ed-1155-3da5-003d907c4e65@smhi.se>
Message-ID: <98a7ea1f-2296-70e0-feea-8c34c0070301@smhi.se>

Hi Ralf,

On 26/11/2020 15:17, Ralf Gommers wrote:
> On Fri, Nov 6, 2020 at 4:03 PM Zimmermann Klaus wrote:
>
> > I was just wondering if, off the top of your head, an existing, better
> > fit comes to mind?
>
> Not really. Outside of stride_tricks there's nothing that quite fits.
> This function is more in scope for something like scipy.signal.

Alright, let's keep it as is then.

Thanks and cheers
Klaus
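(For readers joining the thread here: the function under discussion shipped
in NumPy 1.20 as `np.lib.stride_tricks.sliding_window_view`. A brief usage
sketch:)

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

x = np.arange(10.0)

# Every length-4 window at every valid offset, as a zero-copy view:
windows = sliding_window_view(x, window_shape=4)
print(windows.shape)          # (7, 4)

# Typical use: a moving average without materializing the windows.
print(windows.mean(axis=-1))  # [1.5 2.5 3.5 4.5 5.5 6.5 7.5]
```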
> Cheers,
> Ralf
>
>     The reason from my point of view is that stride tricks is really a
>     technical (and slightly ominous) name that might throw off more
>     application-oriented programmers from finding and using this
>     function. Thinking of my scientist colleagues, I think those are
>     exactly the kind of users that could benefit from such a prototyping
>     tool.
>
> > That phrasing is one of a number of concerns. NumPy is normally not in
> > the business of providing things that are okay as a prototyping tool
> > but are potentially extremely slow (as pointed out in the Notes
> > section of the docstring). A function like that would basically not be
> > the right tool for almost anything in, e.g., SciPy - it requires an
> > iterative algorithm. In NumPy we don't prefer performance at all
> > costs, but in general it's pretty decent rather than "Numba or Cython
> > may gain you 100x here".
>
> I still think that the performance concern is a bit overblown. Yes,
> applications with large windows can need more FLOPs by an equally large
> factor. But most such applications will use small to moderate windows.
> Furthermore, this view focuses only on FLOPs. In my current field of
> climate science (and many others), that is almost never the limiting
> factor. Memory demands are far more problematic, and incidentally, those
> are more likely to increase in other methods that require the storage of
> ancillary, temporary data.
>
> > Other issues include:
> > 2) It is very specific to NumPy's memory model (as pointed out by you
> > and Sebastian) - just like the rest of stride_tricks
>
> Not wrong, but on the other hand, that memory model is not exotic. C,
> Fortran, and any number of other languages play very nicely with this,
> as do important downstream libraries like dask.
>
> > 3) It has "view" in the name, which doesn't quite make sense for the
> > main namespace (also connected to point 2 above).
>
> Ok.
>
> > 4) The cost of putting something in the main namespace for other
> > array/tensor libraries is large. Many other libraries, e.g. CuPy,
> > Dask, TensorFlow, PyTorch, JAX, MXNet, aim to reimplement part or all
> > of the main NumPy namespace as well as possible. This would trigger
> > discussions and likely many person-weeks of work for others.
>
> Agreed. Though I have to say that my whole motivation comes from
> corresponding issues in dask that were specifically waiting for (the
> older version of) this PR (see [1, 2, ...]). But I understand that dask
> is effectively much closer to the numpy memory model than, say, CuPy, so
> don't take this to mean it should be in the main namespace.
>
> > 5) It's a useful function, but it's very much on the margins of
> > NumPy's scope. It could easily have gone into, for example,
> > scipy.signal. At this point the bar for functions going into the main
> > namespace should be (and is) high.
>
> I agree that the bar for the main namespace should be high!
>
> > All this taken together means it's not even a toss-up for me. If it
> > were just one or two of these points, maybe. But given all the above,
> > I'm pretty confident saying "it does not belong in the main namespace".
>
> Again, I am happy with that.
>
> Thanks for your thoughts and work! I really appreciate it!
>
> Cheers
> Klaus
>
> [1] https://github.com/dask/dask/issues/4659
> [2] https://github.com/pydata/xarray/issues/3608
> [3] https://github.com/pandas-dev/pandas/issues/26959
>
>     Cheers
>     Klaus
>
>     [1] https://github.com/numpy/numpy/pull/17394#issuecomment-700998618
>     [2] https://github.com/numpy/numpy/pull/17394#discussion_r498215468
>     [3] https://github.com/numpy/numpy/pull/17394#discussion_r498724340
>
>     On 06/11/2020 01:39, Sebastian Berg wrote:
>     > On Thu, 2020-11-05 at 17:35 -0600, Sebastian Berg wrote:
>     >> On Thu, 2020-11-05 at 12:51 -0800, Stephan Hoyer wrote:
>     >>> On Thu, Nov 5, 2020 at 11:16 AM Ralf Gommers
>     >>> <ralf.gommers at gmail.com> wrote:
>     >>>
>     >>>> On Thu, Nov 5, 2020 at 4:56 PM Sebastian Berg
>     >>>> <sebastian at sipsolutions.net> wrote:
>     >>>>
>     >>>>> Hi all,
>     >>>>>
>     >>>>> just a brief note that I merged this proposal:
>     >>>>>
>     >>>>>     https://github.com/numpy/numpy/pull/17394
>     >>>>>
>     >>>>> adding `np.sliding_window_view` into the 1.20 release of
>     >>>>> NumPy.
>     >>>>>
>     >>>>> There was only one public API change, and that is that the
>     >>>>> `shape` argument is now called `window_shape`.
>     >>>>>
>     >>>>> This is still a good time for feedback in case you have a
>     >>>>> better idea, e.g. for the function or parameter names.
>     >>>>
>     >>>> The old PR had this in the lib.stride_tricks namespace. Seeing
>     >>>> it in the main namespace is unexpected and likely will lead to
>     >>>> issues/questions, given that such an overlapping view is going
>     >>>> to behave in ways the average user will be surprised by. It may
>     >>>> also lead to requests for other array/tensor libraries to
>     >>>> implement this. I don't see any discussion on this in PR 17394,
>     >>>> it looks like a decision by the PR author that no one commented
>     >>>> on - reconsider that?
>     >>>>
>     >>>> Cheers,
>     >>>> Ralf
>     >>>
>     >>> +1 let's keep this in the lib.stride_tricks namespace.
>     >>
>     >> I have no reservations against having it in the main namespace
>     >> and am happy either way (it can still be exposed later in any
>     >> case). It is the conservative choice, and maybe it is an uncommon
>     >> enough function that it deserves being a bit hidden...
>     >
>     > In any case, it's the safe bet for NumPy 1.20 at least, so I
>     > opened a PR:
>     >
>     >     https://github.com/numpy/numpy/pull/17720
>     >
>     > Name changes, etc. are also possible of course.
>     >
>     > I still think it might be nice to find a better place for this
>     > type of function than `np.lib.stride_tricks` though, but dunno...
>     >
>     > - Sebastian
>     >
>     >> But I am curious: it sounds like you both have very strong
>     >> reservations, and I would like to understand them better.
>     >>
>     >> The behaviour can be surprising, but that is why the default is
>     >> a read-only view. I do not think it is worse than
>     >> `np.broadcast_to` in this regard. (It is nowhere near as
>     >> dangerous as `as_strided`.)
>     >>
>     >> It is true that it is specific to NumPy (memory model). So that
>     >> is maybe a good enough reason right now. But I am not sure that
>     >> stuffing things into a pretty hidden `np.lib.*` namespace is a
>     >> great long-term solution either. There is very little useful
>     >> functionality hidden away in `np.lib.*` currently.
>     >>
>     >> Cheers,
>     >>
>     >> Sebastian
>     >>>>> Cheers,
>     >>>>>
>     >>>>> Sebastian
>     >>>>>
>     >>>>> On Mon, 2020-10-12 at 08:39 +0000, Zimmermann Klaus wrote:
>     >>>>>> Hello,
>     >>>>>>
>     >>>>>> I would like to draw the attention of this list to PR #17394
>     >>>>>> [1] that adds the implementation of a sliding window view to
>     >>>>>> numpy.
>     >>>>>>
>     >>>>>> Having a sliding window view in numpy is a longstanding open
>     >>>>>> issue (cf #7753 [2] from 2016). A brief summary of the
>     >>>>>> discussions surrounding it can be found in the description
>     >>>>>> of the PR.
>     >>>>>>
>     >>>>>> This PR implements a sliding window view based on stride
>     >>>>>> tricks. Following the discussion in issue #7753, a first
>     >>>>>> implementation was provided by Fanjin Zeng in PR #10771.
>     >>>>>> After some discussion, that PR stalled and I picked up the
>     >>>>>> issue in the present PR #17394. It is based on the first
>     >>>>>> implementation, but follows the changed API as suggested by
>     >>>>>> Eric Wieser.
>     >>>>>>
>     >>>>>> Code reviews have been provided by Bas van Beek, Stephen
>     >>>>>> Hoyer, and Eric Wieser. Sebastian Berg added the "62 -
>     >>>>>> Python API" label.
>     >>>>>>
>     >>>>>> Do you think this is suitable for inclusion in numpy?
>     >>>>>>
>     >>>>>> Do you consider the PR ready?
>     >>>>>>
>     >>>>>> Do you have suggestions or requests?
>     >>>>>>
>     >>>>>> Thanks for your time and consideration!
>     >>>>>> Klaus
>     >>>>>>
>     >>>>>> [1] https://github.com/numpy/numpy/pull/17394
>     >>>>>> [2] https://github.com/numpy/numpy/issues/7753

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion at python.org
https://mail.python.org/mailman/listinfo/numpy-discussion

From charlesr.harris at gmail.com  Fri Nov 27 11:04:54 2020
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 27 Nov 2020 09:04:54 -0700
Subject: [Numpy-discussion] NumPy master branch is now open for 1.21
 development
Message-ID: 

Hi All,

The maintenance/1.20.x branch has been made and master is now open for 1.21
development.

Chuck