From ralf.gommers at gmail.com Mon Jan 1 20:39:10 2018 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 2 Jan 2018 14:39:10 +1300 Subject: [Numpy-discussion] NEP process PR In-Reply-To: References: Message-ID: On Wed, Dec 13, 2017 at 11:45 AM, Jarrod Millman wrote: > Hi all, > > I've started working on the proposal discussed in this thread: > https://mail.python.org/pipermail/numpy-discussion/ > 2017-December/077481.html > here: > https://github.com/numpy/numpy/pull/10213 Thanks Jarrod! Stefan, Marten and I reviewed in the meantime, and all looks good. Probably good to accept the PR in a few days unless there are other comments by then. Ralf > > You can see how I modified PEP 1 here: > https://github.com/numpy/numpy/pull/10213/commits/ > eaf788940dee7d0f1c7922fac70a87144de89656 > > I used numbers (i.e., ``nep-0000.rst``) in the file names, since it > seems more standard and numbers are easier to refer to in discussions. > I don't have a strong preference, so I am happy to change it. If > using numbers seems reasonable, I will number the existing NEPs. > Moreover, if the preamble seems reasonable, I will go through the > existing NEPs and make sure they all have compliant headers. For now, > I think auto-generating the index is unnecessary. Once we are happy > with the purpose and template NEPs as well as automatically publish to > GH pages, we can always go back and write a little script to > autogenerate the index using the preamble information. > > Finally, I started preparing to host the NEPs online at: > http://numpy.github.io/neps > If that seems reasonable, I will need someone to create a neps repo. > > We also need to decide whether to move ``numpy/doc/neps`` to the > master branch of the new neps repo or whether to leave it where it is. > I don't have a strong opinion either way. However, if no one else > minds leaving it where it is, not moving it is slightly less work. > Either way, the work is trivial. Regardless of where the source > resides, we can host the generated web page in the same location > (i.e., http://numpy.github.io/neps). > > It is probably also worth having > https://docs.scipy.org/doc/numpy/neps/ > redirect to > http://numpy.github.io/neps > > Best regards, > Jarrod > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jo7ueb at gmail.com Tue Jan 2 10:37:16 2018 From: jo7ueb at gmail.com (Yasunori Endo) Date: Tue, 02 Jan 2018 15:37:16 +0000 Subject: [Numpy-discussion] Direct GPU support on NumPy Message-ID: Hi I recently started working with Python and GPU, found that there're lot's of libraries provides ndarray like interface such as CuPy/PyOpenCL/PyCUDA/etc. I got so confused which one to use. Is there any reason not to support GPU computation directly on the NumPy itself? I want NumPy to support GPU computation as a standard. If the reason is just about human resources, I'd like to try implementing GPU support on my NumPy fork. My goal is to create standard NumPy interface which supports both CUDA and OpenCL, and more devices if available. Are there other reason not to support GPU on NumPy? Thanks. -- Yasunori Endo -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From lev at columbia.edu Tue Jan 2 12:49:34 2018 From: lev at columbia.edu (Lev E Givon) Date: Tue, 2 Jan 2018 12:49:34 -0500 Subject: [Numpy-discussion] Direct GPU support on NumPy In-Reply-To: References: Message-ID: On Tue, Jan 2, 2018 at 10:37 AM, Yasunori Endo wrote: > Hi > > I recently started working with Python and GPU, > found that there're lot's of libraries provides > ndarray like interface such as CuPy/PyOpenCL/PyCUDA/etc. > I got so confused which one to use. > > Is there any reason not to support GPU computation > directly on the NumPy itself? > I want NumPy to support GPU computation as a standard. > > If the reason is just about human resources, > I'd like to try implementing GPU support on my NumPy fork. > My goal is to create standard NumPy interface which supports > both CUDA and OpenCL, and more devices if available. > > Are there other reason not to support GPU on NumPy? > > Thanks. > -- > Yasunori Endo Check out numba - it may already address some of your needs: https://numba.pydata.org/ -- Lev E. Givon, PhD http://lebedov.github.io From jerome.kieffer at esrf.fr Tue Jan 2 15:22:01 2018 From: jerome.kieffer at esrf.fr (Jerome Kieffer) Date: Tue, 2 Jan 2018 21:22:01 +0100 Subject: [Numpy-discussion] Direct GPU support on NumPy In-Reply-To: References: Message-ID: <20180102212201.69978472@mac13.esrf.fr> On Tue, 02 Jan 2018 15:37:16 +0000 Yasunori Endo wrote: > If the reason is just about human resources, > I'd like to try implementing GPU support on my NumPy fork. > My goal is to create standard NumPy interface which supports > both CUDA and OpenCL, and more devices if available. I think this initiative already exists ... something which merges the approach of cuda and opencl but I have no idea on the momentum behind it. > Are there other reason not to support GPU on NumPy? yes. Matlab has such support and the performances gain are in the order of 2x vs 10x when addressing the GPU directly. All the time is spent in sending data back & forth. Numba is indeed a good candidate bu limited to the PTX assembly (i.e. cuda, hence nvidia hardware) Cheers, Jerome From stefan at seefeld.name Tue Jan 2 15:38:41 2018 From: stefan at seefeld.name (Stefan Seefeld) Date: Tue, 2 Jan 2018 15:38:41 -0500 Subject: [Numpy-discussion] Direct GPU support on NumPy In-Reply-To: <20180102212201.69978472@mac13.esrf.fr> References: <20180102212201.69978472@mac13.esrf.fr> Message-ID: On 02.01.2018 15:22, Jerome Kieffer wrote: > On Tue, 02 Jan 2018 15:37:16 +0000 > Yasunori Endo wrote: > >> If the reason is just about human resources, >> I'd like to try implementing GPU support on my NumPy fork. >> My goal is to create standard NumPy interface which supports >> both CUDA and OpenCL, and more devices if available. > I think this initiative already exists ... something which merges the > approach of cuda and opencl but I have no idea on the momentum behind > it. > >> Are there other reason not to support GPU on NumPy? > yes. Matlab has such support and the performances gain are in the order > of 2x vs 10x when addressing the GPU directly. All the time is spent in > sending data back & forth. Numba is indeed a good candidate bu limited > to the PTX assembly (i.e. cuda, hence nvidia hardware) This suggests a new, higher-level data model which supports replicating data into different memory spaces (e.g. host and GPU). Then users (or some higher layer in the software stack) can dispatch operations to suitable implementations to minimize data movement. 
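As a rough sketch of what that kind of dispatch can look like from user code today (assuming CuPy is installed as the GPU backend; the normalize() function below is an invented example, not an existing API), one can pick the implementation based on where the data already lives:

    import numpy as np
    import cupy as cp

    def normalize(x):
        # Choose numpy or cupy depending on where x already resides,
        # so this function itself forces no host<->device copies.
        xp = cp.get_array_module(x)
        return (x - xp.mean(x)) / xp.std(x)

    normalize(np.arange(10.0))   # dispatches to numpy on the host
    normalize(cp.arange(10.0))   # dispatches to cupy on the GPU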
Given NumPy's current raw-pointer C API this seems difficult to implement, though, as it is very hard to track memory aliases. Regards, Stefan -- ...ich hab' noch einen Koffer in Berlin... -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.png Type: image/png Size: 1478 bytes Desc: not available URL: From jo7ueb at gmail.com Tue Jan 2 16:21:03 2018 From: jo7ueb at gmail.com (Yasunori Endo) Date: Tue, 02 Jan 2018 21:21:03 +0000 Subject: [Numpy-discussion] Direct GPU support on NumPy In-Reply-To: References: <20180102212201.69978472@mac13.esrf.fr> Message-ID: Hi all Numba looks so nice library to try. Thanks for the information. This suggests a new, higher-level data model which supports replicating > data into different memory spaces (e.g. host and GPU). Then users (or some > higher layer in the software stack) can dispatch operations to suitable > implementations to minimize data movement. > > Given NumPy's current raw-pointer C API this seems difficult to implement, > though, as it is very hard to track memory aliases. > I understood modifying numpy.ndarray for GPU is technically difficult. So my next primitive question is why NumPy doesn't offer ndarray like interface (e.g. numpy.gpuarray)? I wonder why everybody making *separate* library, making user confused. Is there any policy that NumPy refuse standard GPU implementation? Thanks. -- Yasunori Endo -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthieu.brucher at gmail.com Tue Jan 2 16:36:30 2018 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Tue, 2 Jan 2018 22:36:30 +0100 Subject: [Numpy-discussion] Direct GPU support on NumPy In-Reply-To: References: <20180102212201.69978472@mac13.esrf.fr> Message-ID: Hi, Let's say that Numpy provides a GPU version on GPU. How would that work with all the packages that expect the memory to be allocated on CPU? It's not that Numpy refuses a GPU implementation, it's that it wouldn't solve the problem of GPU/CPU having different memory. When/if nVidia decides (finally) that memory should be also accessible from the CPU (like AMD APU), then this argument is actually void. Matthieu 2018-01-02 22:21 GMT+01:00 Yasunori Endo : > Hi all > > Numba looks so nice library to try. > Thanks for the information. > > This suggests a new, higher-level data model which supports replicating >> data into different memory spaces (e.g. host and GPU). Then users (or some >> higher layer in the software stack) can dispatch operations to suitable >> implementations to minimize data movement. >> >> Given NumPy's current raw-pointer C API this seems difficult to >> implement, though, as it is very hard to track memory aliases. >> > I understood modifying numpy.ndarray for GPU is technically difficult. > > So my next primitive question is why NumPy doesn't offer > ndarray like interface (e.g. numpy.gpuarray)? > I wonder why everybody making *separate* library, making user confused. > Is there any policy that NumPy refuse standard GPU implementation? > > Thanks. > > > -- > Yasunori Endo > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > -- Quantitative analyst, Ph.D. Blog: http://blog.audio-tk.com/ LinkedIn: http://www.linkedin.com/in/matthieubrucher -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From robert.kern at gmail.com Tue Jan 2 16:38:40 2018 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 2 Jan 2018 13:38:40 -0800 Subject: [Numpy-discussion] Direct GPU support on NumPy In-Reply-To: References: <20180102212201.69978472@mac13.esrf.fr> Message-ID: On Tue, Jan 2, 2018 at 1:21 PM, Yasunori Endo wrote: > > Hi all > > Numba looks so nice library to try. > Thanks for the information. > >> This suggests a new, higher-level data model which supports replicating data into different memory spaces (e.g. host and GPU). Then users (or some higher layer in the software stack) can dispatch operations to suitable implementations to minimize data movement. >> >> Given NumPy's current raw-pointer C API this seems difficult to implement, though, as it is very hard to track memory aliases. > > I understood modifying numpy.ndarray for GPU is technically difficult. > > So my next primitive question is why NumPy doesn't offer > ndarray like interface (e.g. numpy.gpuarray)? > I wonder why everybody making *separate* library, making user confused. Because there is no settled way to do this. All of those separate library implementations are trying different approaches. We are learning from each of their attempts. They can each move at their own pace rather than being tied down to numpy's slow rate of development and strict backwards compatibility requirements. They can try new things and aren't limited to their first mistakes. The user may well be confused by all of the different options currently available. I don't think that's avoidable: there are lots of meaningful options. Picking just one to stick into numpy is a disservice to the community that needs the other options. > Is there any policy that NumPy refuse standard GPU implementation? Not officially, but I'm pretty sure that there is no appetite among the developers for incorporating proper GPU support into numpy (by which I mean that a user would build numpy with certain settings then make use of the GPU using just numpy APIs). numpy is a mature project with a relatively small development team. Much of that effort is spent more on maintenance than new development. What there is appetite for is to listen to the needs of the GPU-using libraries and making sure that numpy's C and Python APIs are flexible enough to do what the GPU libraries need. This ties into the work that's being done to make ndarray subclasses better and formalizing the notions of an "array-like" interface that things like pandas Series, etc. can implement and play well with the rest of numpy. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at seefeld.name Tue Jan 2 17:04:14 2018 From: stefan at seefeld.name (Stefan Seefeld) Date: Tue, 2 Jan 2018 17:04:14 -0500 Subject: [Numpy-discussion] Direct GPU support on NumPy In-Reply-To: References: <20180102212201.69978472@mac13.esrf.fr> Message-ID: <823d3343-886f-2b9d-69b5-ecfffc81bcc0@seefeld.name> On 02.01.2018 16:36, Matthieu Brucher wrote: > Hi, > > Let's say that Numpy provides a GPU version on GPU. How would that > work with all the packages that expect the memory to be allocated on CPU? > It's not that Numpy refuses a GPU implementation, it's that it > wouldn't solve the problem of GPU/CPU having different memory. When/if > nVidia decides (finally) that memory should be also accessible from > the CPU (like AMD APU), then this argument is actually void. I actually doubt that. Sure, having a unified memory is convenient for the programmer. 
But as long as copying data between host and GPU is orders of magnitude slower than copying data locally, performance will suffer. Addressing this performance issue requires some NUMA-like approach, moving the operation to where the data resides, rather than treating all data locations equal. Stefan -- ...ich hab' noch einen Koffer in Berlin... -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.png Type: image/png Size: 1478 bytes Desc: not available URL: From harrigan.matthew at gmail.com Tue Jan 2 20:35:38 2018 From: harrigan.matthew at gmail.com (Matthew Harrigan) Date: Tue, 2 Jan 2018 20:35:38 -0500 Subject: [Numpy-discussion] Direct GPU support on NumPy In-Reply-To: <823d3343-886f-2b9d-69b5-ecfffc81bcc0@seefeld.name> References: <20180102212201.69978472@mac13.esrf.fr> <823d3343-886f-2b9d-69b5-ecfffc81bcc0@seefeld.name> Message-ID: Is it possible to have NumPy use a BLAS/LAPACK library that is GPU accelerated for certain problems? Any recommendations or readme's on how that might be set up? The other packages are nice but I would really love to just use scipy/sklearn and have decompositions, factorizations, etc for big matrices go a little faster without recoding the algorithms. Thanks On Tue, Jan 2, 2018 at 5:04 PM, Stefan Seefeld wrote: > On 02.01.2018 16:36, Matthieu Brucher wrote: > > Hi, > > Let's say that Numpy provides a GPU version on GPU. How would that work > with all the packages that expect the memory to be allocated on CPU? > It's not that Numpy refuses a GPU implementation, it's that it wouldn't > solve the problem of GPU/CPU having different memory. When/if nVidia > decides (finally) that memory should be also accessible from the CPU (like > AMD APU), then this argument is actually void. > > > I actually doubt that. Sure, having a unified memory is convenient for the > programmer. But as long as copying data between host and GPU is orders of > magnitude slower than copying data locally, performance will suffer. > Addressing this performance issue requires some NUMA-like approach, moving > the operation to where the data resides, rather than treating all data > locations equal. > > [image: Stefan] > > -- > > ...ich hab' noch einen Koffer in Berlin... > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.png Type: image/png Size: 1478 bytes Desc: not available URL: From lev at columbia.edu Tue Jan 2 20:56:21 2018 From: lev at columbia.edu (Lev E Givon) Date: Tue, 2 Jan 2018 20:56:21 -0500 Subject: [Numpy-discussion] Direct GPU support on NumPy In-Reply-To: References: <20180102212201.69978472@mac13.esrf.fr> <823d3343-886f-2b9d-69b5-ecfffc81bcc0@seefeld.name> Message-ID: On Jan 2, 2018 8:35 PM, "Matthew Harrigan" wrote: Is it possible to have NumPy use a BLAS/LAPACK library that is GPU accelerated for certain problems? Any recommendations or readme's on how that might be set up? The other packages are nice but I would really love to just use scipy/sklearn and have decompositions, factorizations, etc for big matrices go a little faster without recoding the algorithms. 
Thanks

Depending on what operation you want to accelerate, scikit-cuda may provide a scipy-like interface to GPU-based implementations that you can use. It isn't a drop-in replacement for numpy/scipy, however.

http://github.com/lebedov/scikit-cuda

L
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From gael.varoquaux at normalesup.org Wed Jan 3 02:00:54 2018
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Wed, 3 Jan 2018 08:00:54 +0100
Subject: [Numpy-discussion] Direct GPU support on NumPy
In-Reply-To:
References: <20180102212201.69978472@mac13.esrf.fr> <823d3343-886f-2b9d-69b5-ecfffc81bcc0@seefeld.name>
Message-ID: <20180103070054.GH754811@phare.normalesup.org>

> The other packages are nice but I would really love to just use scipy/
> sklearn and have decompositions, factorizations, etc for big matrices
> go a little faster without recoding the algorithms. Thanks

If you have very big matrices, scikit-learn's PCA already uses randomized linear algebra, which buys you more than GPUs.

Gaël

From stefanv at berkeley.edu Thu Jan 4 15:43:16 2018
From: stefanv at berkeley.edu (Stefan van der Walt)
Date: Thu, 04 Jan 2018 12:43:16 -0800
Subject: [Numpy-discussion] NEP process PR
In-Reply-To:
References:
Message-ID: <1515098596.2948656.1224548360.0B291606@webmail.messagingengine.com>

On Mon, Jan 1, 2018, at 17:39, Ralf Gommers wrote:
>
> On Wed, Dec 13, 2017 at 11:45 AM, Jarrod Millman wrote:
>> Hi all,
>>
>> I've started working on the proposal discussed in this thread:
>> https://mail.python.org/pipermail/numpy-discussion/2017-December/077481.html
>> here:
>> https://github.com/numpy/numpy/pull/10213
>
> Thanks Jarrod!
>
> Stefan, Marten and I reviewed in the meantime, and all looks good.
> Probably good to accept the PR in a few days unless there are other
> comments by then.

Since the PR has been merged, I went ahead and created the two relevant repos, `numpy/neps` and `numpy/devdocs`, and gave Jarrod admin access. He'll take it from here, but feel free to change names etc. as needed.

Stéfan

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From stefanv at berkeley.edu Thu Jan 4 20:02:02 2018
From: stefanv at berkeley.edu (Stefan van der Walt)
Date: Thu, 04 Jan 2018 17:02:02 -0800
Subject: [Numpy-discussion] Position at BIDS (UC Berkeley) to work on NumPy
Message-ID: <1515114122.3037386.1224777800.7DFD06B2@webmail.messagingengine.com>

Hi everyone,

Chuck suggested that I send a reminder that the Berkeley Institute for Data Science (BIDS) is hiring scientific Python Developers to contribute to NumPy. You can read more about the new positions here:

https://bids.berkeley.edu/news/bids-receives-sloan-foundation-grant-contribute-numpy-development

If you enjoy collaborative work as well as the technical challenges posed by numerical computing, this is an excellent opportunity to play a fundamental role in the development of one of the most impactful libraries in the entire Python ecosystem.
Best regards St?fan Job link: https://jobsprod.is.berkeley.edu/psc/jobsprod/EMPLOYEE/HRMS/c/HRS_HRAM.HRS_CE.GBL?Page=HRS_CE_JOB_DTL&Action=A&JobOpeningId=24142&SiteId=1&PostingSeq=1 From allanhaldane at gmail.com Fri Jan 5 14:47:35 2018 From: allanhaldane at gmail.com (Allan Haldane) Date: Fri, 5 Jan 2018 14:47:35 -0500 Subject: [Numpy-discussion] PowerPC machine available from OSUOSL Message-ID: <1add5c32-40db-c088-34d5-a51da4719902@gmail.com> Hi all, For building and testing Numpy on the PowerPC arch, I've requested and obtained a VM instance at OSUOSL, which provides free hosting for open-source projects. This was suggested by @npanpaliya on github. http://osuosl.org/services/powerdev/request_hosting/ I have an immediate use for it to fix some float128 ppc problems, but it can be useful to other numpy devs too. If you are a numpy dev and want access to a ppc system for testing, I can gladly create an account for you. For now, just send me an email with your desired username and a public ssh key. We can discuss permissions and management depending on interest. === VM details === It is single node, Ubuntu 16.04.1, ppc64 POWER (Big Endian) arch, "m1_small" flavor, with usage policy at http://osuosl.org/services/hosting/policy/ I've installed the packages needed to build and test numpy. I ran tests: Numpy 1.13.3 has 2 unimportant test failures, but as expected 1.14 has ~20 more failures related to float128 reprs. === Other services to Consider === I did not request it but they provide a "Power CI" service which allows automated testing of ppc arch on github. If we want this I can look into it more. We might also consider asking for a node with ppc64le (Little Endian) arch for testing, but maybe the big-endian one is enough. Cheers, Allan From m.h.vankerkwijk at gmail.com Fri Jan 5 17:50:08 2018 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Fri, 5 Jan 2018 17:50:08 -0500 Subject: [Numpy-discussion] PowerPC machine available from OSUOSL In-Reply-To: <1add5c32-40db-c088-34d5-a51da4719902@gmail.com> References: <1add5c32-40db-c088-34d5-a51da4719902@gmail.com> Message-ID: Doing CI on a different architecture, especially big-endian, would seem very useful. (Indeed, I'll look into it for my own project, of reading radio baseband data -- we run it on a BGQ, so more constant checking would be good to have). But it may be that we're a bit too big for CI, and that it is better to test at release time (as I guess you're doing!). -- Marten From charlesr.harris at gmail.com Sat Jan 6 20:00:20 2018 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 6 Jan 2018 18:00:20 -0700 Subject: [Numpy-discussion] NumPy 1.14.0 release Message-ID: Hi All, On behalf of the NumPy team, I am pleased to announce NumPy 1.14.0. Numpy 1.14.0 is the result of seven months of work and contains a large number of bug fixes and new features, along with several changes with potential compatibility issues. The major change that users will notice are the stylistic changes in the way numpy arrays and scalars are printed, a change that will affect doctests. See the release notes for details on how to preserve the old style printing when needed. A major decision affecting future development concerns the schedule for dropping Python 2.7 support in the runup to 2020. The decision has been made to support 2.7 for all releases made in 2018, with the last release being designated a long term release with support for bug fixes extending through the end of 2019. 
Starting from January, 2019 support for 2.7 will be dropped in all new releases. More details can be found in the relevant NEP . This release supports Python 2.7 and 3.4 - 3.6. Wheels for the release are available on PyPI. Source tarballs, zipfiles, release notes, and the changelog are available on github . *Highlights* - The ``np.einsum`` function uses BLAS when possible - ``genfromtxt``, ``loadtxt``, ``fromregex`` and ``savetxt`` can now handle files with arbitrary Python supported encoding. - Major improvements to printing of NumPy arrays and scalars. *New functions* - ``parametrize``: decorator added to numpy.testing - ``chebinterpolate``: Interpolate function at Chebyshev points. - ``format_float_positional`` and ``format_float_scientific`` : format floating-point scalars unambiguously with control of rounding and padding. - ``PyArray_ResolveWritebackIfCopy`` and ``PyArray_SetWritebackIfCopyBase``, new C-API functions useful in achieving PyPy compatibity. *Contributors* A total of 100 people contributed to this release. People with a "+" by their names contributed a patch for the first time. * Alexey Brodkin + * Allan Haldane * Andras Deak + * Andrew Lawson + * Anna Chiara + * Antoine Pitrou * Bernhard M. Wiedemann + * Bob Eldering + * Brandon Carter * CJ Carey * Charles Harris * Chris Lamb * Christoph Boeddeker + * Christoph Gohlke * Daniel Hrisca + * Daniel Smith * Danny Hermes * David Freese * David Hagen * David Linke + * David Schaefer + * Dillon Niederhut + * Egor Panfilov + * Emilien Kofman * Eric Wieser * Erik Bray + * Erik Quaeghebeur + * Garry Polley + * Gunjan + * Han Shen + * Henke Adolfsson + * Hidehiro NAGAOKA + * Hemil Desai + * Hong Xu + * Iryna Shcherbina + * Jaime Fernandez * James Bourbeau + * Jamie Townsend + * Jarrod Millman * Jean Helie + * Jeroen Demeyer + * John Goetz + * John Kirkham * John Zwinck * Jonathan Helmus * Joseph Fox-Rabinovitz * Joseph Paul Cohen + * Joshua Leahy + * Julian Taylor * J?rg D?pfert + * Keno Goertz + * Kevin Sheppard + * Kexuan Sun + * Konrad Kapp + * Kristofor Maynard + * Licht Takeuchi + * Lo?c Est?ve * Lukas Mericle + * Marten van Kerkwijk * Matheus Portela + * Matthew Brett * Matti Picus * Michael Lamparski + * Michael Odintsov + * Michael Schnaitter + * Michael Seifert * Mike Nolta * Nathaniel J. Smith * Nelle Varoquaux + * Nicholas Del Grosso + * Nico Schl?mer + * Oleg Zabluda + * Oleksandr Pavlyk * Pauli Virtanen * Pim de Haan + * Ralf Gommers * Robert T. McGibbon + * Roland Kaufmann * Sebastian Berg * Serhiy Storchaka + * Shitian Ni + * Spencer Hill + * Srinivas Reddy Thatiparthy + * Stefan Winkler + * Stephan Hoyer * Steven Maude + * SuperBo + * Thomas K?ppe + * Toon Verstraelen * Vedant Misra + * Warren Weckesser * Wirawan Purwanto + * Yang Li + * Ziyan Zhou + * chaoyu3 + * orbit-stabilizer + * solarjoe * wufangjie + * xoviat + * ?lie Gouzien + Cheers, Charles Harris -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun Jan 7 12:37:58 2018 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 8 Jan 2018 06:37:58 +1300 Subject: [Numpy-discussion] NumPy 1.14.0 release In-Reply-To: References: Message-ID: On Sun, Jan 7, 2018 at 2:00 PM, Charles R Harris wrote: > Hi All, > > On behalf of the NumPy team, I am pleased to announce NumPy 1.14.0. > Thanks for doing the heavy lifting to get this release out the door Chuck! Ralf -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From allanhaldane at gmail.com Sun Jan 7 15:59:06 2018 From: allanhaldane at gmail.com (Allan Haldane) Date: Sun, 7 Jan 2018 15:59:06 -0500 Subject: [Numpy-discussion] NumPy 1.14.0 release In-Reply-To: References: Message-ID: <4e07370e-51ff-03b7-848a-c9080cdb433f@gmail.com> On 01/07/2018 12:37 PM, Ralf Gommers wrote: > > > On Sun, Jan 7, 2018 at 2:00 PM, Charles R Harris > > wrote: > > Hi All, > > On behalf of the NumPy team, I am pleased to announce NumPy 1.14.0. > > > Thanks for doing the heavy lifting to get this release out the door Chuck! > > Ralf Yes, I am always very impressed and appreciative of all the work Chuck does for Numpy. Thank you very much! Allan From njs at pobox.com Sun Jan 7 23:24:41 2018 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 7 Jan 2018 20:24:41 -0800 Subject: [Numpy-discussion] NumPy 1.14.0 release In-Reply-To: <4e07370e-51ff-03b7-848a-c9080cdb433f@gmail.com> References: <4e07370e-51ff-03b7-848a-c9080cdb433f@gmail.com> Message-ID: On Sun, Jan 7, 2018 at 12:59 PM, Allan Haldane wrote: > On 01/07/2018 12:37 PM, Ralf Gommers wrote: >> >> >> >> On Sun, Jan 7, 2018 at 2:00 PM, Charles R Harris >> > wrote: >> >> Hi All, >> >> On behalf of the NumPy team, I am pleased to announce NumPy 1.14.0. >> >> >> Thanks for doing the heavy lifting to get this release out the door Chuck! >> >> Ralf > > > Yes, I am always very impressed and appreciative of all the work Chuck does > for Numpy. Thank you very much! +1 -- Nathaniel J. Smith -- https://vorpus.org From me.vinob at gmail.com Mon Jan 8 06:20:15 2018 From: me.vinob at gmail.com (Vinodhini Balusamy) Date: Mon, 8 Jan 2018 22:20:15 +1100 Subject: [Numpy-discussion] array - dimension size of 1-D and 2-D examples In-Reply-To: References: <449981B3-DE05-4595-9A13-C129CFFBC51F@gmail.com> Message-ID: <08FA01E4-A372-4BBA-BFCE-B75669804B3C@gmail.com> Missed this mail. Thanks Derek For the clarification provided. Kind Rgds, Vinodhini > On 31 Dec 2017, at 10:11 am, Derek Homeier wrote: > > On 30 Dec 2017, at 5:38 pm, Vinodhini Balusamy wrote: >> >> Just one more question from the details you have provided which from my understanding strongly seems to be Design >> [DEREK] You cannot create a regular 2-dimensional integer array from one row of length 3 >>> and a second one of length 0. Thus np.array chooses the next most basic type of >>> array it can fit your input data in >> > Indeed, the general philosophy is to preserve the structure and type of your input data > as far as possible, i.e. a list is turned into a 1d-array, a list of lists (or tuples etc?) into > a 2d-array,_ if_ the sequences are of equal length (even if length 1). > As long as there is an unambiguous way to convert the data into an array (see below). > >> Which is the case, only if an second one of length 0 is given. >> What about the case 1 : >>>>> x12 = np.array([[1,2,3]]) >>>>> x12 >> array([[1, 2, 3]]) >>>>> print(x12) >> [[1 2 3]] >>>>> x12.ndim >> 2 >>>>> >>>>> >> This seems to take 2 dimension. > > Yes, structurally this is equivalent to your second example > >> also, >>>> x12 = np.array([[1,2,3],[0,0,0]]) >>>> print(x12) > [[1 2 3] > [0 0 0]] >>>> x12.ndim > 2 > >> I presumed the above case and the case where length 0 is provided to be treated same(I mean same behaviour). >> Correct me if I am wrong. >> > In this case there is no unambiguous way to construct the array - you would need a shape (2, 3) > array to store the two lists with 3 elements in the first list. 
Obviously x12[0] would be np.array([1,2,3]), > but what should be the value of x12[1], if the second list is empty - it could be zeros, or repeating x12[0], > or simply undefined. np.array([1, 2, 3], [4]]) would be even less clearly defined. > These cases where there is no obvious ?right? way to create the array have usually been discussed at > some length, but I don?t know if this is fully documented in some place. For the essentials, see > > https://docs.scipy.org/doc/numpy/reference/routines.array-creation.html > > note also the upcasting rules if you have e.g. a mix of integers and reals or complex numbers, > and also how to control shape or data type explicitly with the respective keywords. > > Derek > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion From njs at pobox.com Tue Jan 9 03:24:17 2018 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 9 Jan 2018 00:24:17 -0800 Subject: [Numpy-discussion] RFC: comments to BLAS committee from numpy/scipy devs Message-ID: Hi all, As mentioned earlier [1][2], there's work underway to revise and update the BLAS standard -- e.g. we might get support for strided arrays and lose xerbla! There's a draft at [3]. They're interested in feedback from users, so I've written up a first draft of comments about what we would like as NumPy/SciPy developers. This is very much a first attempt -- I know we have lots of people who are more expert on BLAS than me on these lists :-). Please let me know what you think. -n [1] https://mail.python.org/pipermail/numpy-discussion/2017-November/077420.html [2] https://mail.python.org/pipermail/scipy-dev/2017-November/022267.html [3] https://docs.google.com/document/d/1DY4ImZT1coqri2382GusXgBTTTVdBDvtD5I14QHp9OE/edit ----- # Comments from NumPy / SciPy developers on "A Proposal for a Next-Generation BLAS" These are comments on [A Proposal for a Next-Generation BLAS](https://docs.google.com/document/d/1DY4ImZT1coqri2382GusXgBTTTVdBDvtD5I14QHp9OE/edit#) (version as of 2017-12-13), from the perspective of the developers of the NumPy and SciPy libraries. We hope this feedback is useful, and welcome further discussion. ## Who are we? NumPy and SciPy are the two foundational libraries of the Python numerical ecosystem, and one of their duties is to wrap BLAS and expose it for the use of other Python libraries. (NumPy primarily provides a GEMM wrapper, while SciPy exposes more specialized operations.) It's unclear how many users we have exactly, but we certainly ship multiple million copies of BLAS every month, and provide one of the most popular numerical toolkits for both novice and expert users. Looking at the original BLAS and LAPACK interfaces, it often seems that their imagined user is something like a classic supercomputer consumer, who writes code directly in Fortran or C against the BLAS API, and where the person writing the code and running the code are the same. NumPy/SciPy are coming from a very different perspective: our users generally know nothing about the details of the underlying BLAS; they just want to describe their problem in some high-level way, and the library is responsible for making it happen as efficiently as possible, and is often integrated into some larger system (e.g. a real-time analytics platform embedded in a web server). When it comes to our BLAS usage, we mostly use only a small subset of the routines. 
However, as "consumer software" used by a wide variety of users with differing degrees of technical expertise, we're expected to Just Work on a wide variety of systems, and with as many different vendor BLAS libraries as possible. On the other hand, the fact that we're working with Python means we don't tend to worry about small inefficiencies that will be lost in the noise in any case, and are willing to sacrifice some performance to get more reliable operation across our diverse userbase.

## Comments on specific aspects of the proposal

### Data Layout

We are **strongly in favor** of the proposal to support arbitrary strided data layouts. Ideally, this would support strides *specified in bytes* (allowing for unaligned data layouts), and allow for truly arbitrary strides, including *zero or negative* values. However, we think it's fine if some of the weirder cases suffer a performance penalty.

Rationale: NumPy -- and thus, most of the scientific Python ecosystem -- only has one way of representing an array: the `numpy.ndarray` type, which is an arbitrary dimensional tensor with arbitrary strides. It is common to encounter matrices with non-trivial strides. For example::

    # Make a 3-dimensional tensor, 10 x 9 x 8
    t = np.zeros((10, 9, 8))
    # Considering this as a stack of eight 10x9 matrices, extract the first:
    mat = t[:, :, 0]

Now `mat` has non-trivial strides on both axes. (If running this in a Python interpreter, you can see this by looking at the value of `mat.strides`.) Another case where interesting strides arise is when performing ["broadcasting"](https://docs.scipy.org/doc/numpy-1.13.0/user/basics.broadcasting.html), which is the name for NumPy's rules for stretching arrays to make their shapes match. For example, in an expression like::

    np.array([1, 2, 3]) + 1

the scalar `1` is "broadcast" to create a vector `[1, 1, 1]`. This is accomplished without allocating memory, by creating a vector with settings length = 3, strides = 0 -- so all the elements share a single location in memory. Similarly, by using negative strides we can reverse an array without allocating memory::

    a = np.array([1, 2, 3])
    a_flipped = a[::-1]

Now `a_flipped` has the value `[3, 2, 1]`, while sharing storage with the array `a = [1, 2, 3]`. Misaligned data is also possible (e.g. an array of 8-byte doubles with a 9-byte stride), though it arises more rarely. (An example of when it might occur is in an on-disk data format that alternates between storing a double value and then a single byte value, which is then memory-mapped.)

While this array representation is very flexible and useful, it makes interfacing with BLAS a challenge: how do you perform a GEMM operation on two arrays with arbitrary strides? Currently, NumPy attempts to detect a number of special cases: if the strides in both arrays imply a column-major layout, then call BLAS directly; if one of them has strides corresponding to a row-major layout, then set the corresponding `transA`/`transB` argument, etc. -- and if all else fails, either copy the data into a contiguous buffer, or else fall back on a naive triple-nested-loop GEMM implementation. (There's also a check where if we can determine through examining the arrays' data pointers and strides that they're actually transposes of each other, then we instead dispatch to SYRK.)

So for us, native stride support in the BLAS has two major advantages:

- It allows a wider variety of operations to be transparently handled
  using high-speed kernels.

- It reduces the complexity of the NumPy/BLAS interface code.
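To make the special-case detection described above concrete, here is a much-simplified Python sketch of the kind of layout check wrapper code currently performs before a column-major GEMM call (the `gemm_layout` helper and its exact rules are illustrative only, not NumPy's actual C implementation)::

    import numpy as np

    def gemm_layout(a):
        # Classify a 2-D array's memory layout for a Fortran-order GEMM.
        m, n = a.shape
        s0, s1 = (s // a.itemsize for s in a.strides)
        if s0 == 1 and s1 >= m:
            return "N"      # column-major: pass through directly
        if s1 == 1 and s0 >= n:
            return "T"      # row-major: use the transpose argument instead
        return "copy"       # anything else: copy to a contiguous buffer first

    mat = np.zeros((10, 9, 8))[:, :, 0]    # non-trivial strides on both axes
    gemm_layout(mat)                       # -> "copy" under today's interface
    gemm_layout(np.asfortranarray(mat))    # -> "N"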
Any kind of stride support produces both of these advantages to some extent. The ideal would be if BLAS supports strides specified in bytes (not elements), and allowing truly arbitrary values, because in this case, we can simply pass through our arrays directly to the library without having to do any fiddly checking at all. Note that for these purposes, it's OK if "weird" arrays pay a speed penalty: if the BLAS didn't support them, we'd just have to fall back on some even worse strategy for handling them ourselves. And this approach would mean that if some BLAS vendors do at some point decide to optimize these cases, then our users will immediately see the advantages. ### Error-reporting We are **strongly in favor** of the draft's proposal to replace XERBLA with proper error codes. XERBLA is fine for one-off programs in Fortran, but awful for wrapper libraries like NumPy/SciPy. ### Handling of exceptional values A recurring issue in low-level linear-algebra libraries is that they may crash, freeze, or produce unexpected results when receiving non-finite inputs. Of course as a wrapper library, NumPy/SciPy can't control what kind of strange data our users might pass in, and we have to produce sensible results regardless, so we sometimes accumulate hacks like having to validate all incoming data for exceptional values We support the proposal's recommendation of "NaN interpretation (4)". We also note that contrary to the note under "Interpretation (1)"; in the R statistical environment NaN does *not* represent missing data. R maintains a clear distinction between "NA" (missing) values and "NaN" (invalid) values. This is important for various statistical procedures such as imputation, where you might want to estimate the true value of an unmeasured variable, but you don't want to estimate the true value of 0/0. For efficiency, they do perform some calculations on NA values by storing them as NaNs with a specific payload; in the R context, therefore, it's particularly valuable if operations **correctly propagate NaN payloads**. NumPy does not yet provide first-class missing value support, but we plan to add it within the next 1-2 years, and when we do it seems likely that we'll have similar needs to R. ### BLAS G2 NumPy has a rich type system, including 16-, 32-, and 64-bit floats (plus extended precision on some platforms), a similar set of complex values, and is further extensible with new types. We have a similarly rich dispatch system for operations that attempts to find the best matching optimized-loop for an arbitrary set of input types. We're thus quite interested in the proposed mixed-type operations, and if implemented we'll probably start transparently taking advantage of them (i.e., in cases where we're currently upcasting to perform a requested operation, we may switch to using the native precision operation without requiring any changes in user code). Two comments, though: 1. The various tricks to make names less redundant (e.g., specifying just one type if they're all the same) are convenient for those calling these routines by hand, but rather inconvenient for those of us who will be generating a table mapping different type combinations to different functions. When designing a compression scheme for names, please remember that the scheme will have to be unambiguously implemented in code by multiple projects. Another option to consider: ask vendors to implement the full "uncompressed" names, and then provide a shim library mapping short "convenience" names to their full form. 2. 
Regarding the suggestion that the naming scheme will allow for a wide variety of differently typed routines, but that only some subset will be implemented by any particular vendor: we are worried about this subset business. For a library like NumPy that wants to wrap "all" the provided routines, while supporting "all" the different BLAS libraries ... how will this work? **How do we know which routines any particular library provides?** What if new routines are added to the list later? ### Reproducible BLAS This is interesting, but we don't have any particularly useful comments. ### Batch BLAS NumPy/SciPy actually provide batched operations in many cases, currently implemented using a `for` loop around individual calls to BLAS/LAPACK. However, our interface is relatively restricted compared to some of the proposals considered here: generally all we need is the ability to perform an operation on two "stacks" of size-homogenous matrices, with the offset between matrices determined by an arbitrary (potentially zero) stride. ### Fixed-point BLAS This is interesting, but we don't have any particularly useful comments. -- Nathaniel J. Smith -- https://vorpus.org From ilhanpolat at gmail.com Tue Jan 9 06:40:02 2018 From: ilhanpolat at gmail.com (Ilhan Polat) Date: Tue, 9 Jan 2018 12:40:02 +0100 Subject: [Numpy-discussion] [SciPy-Dev] RFC: comments to BLAS committee from numpy/scipy devs In-Reply-To: References: Message-ID: I couldn't find an item to place this but I think ilaenv and also calling the function twice (one with lwork=-1 and reading the optimal block size and the call the function again properly with lwork=) in LAPACK needs to be gotten rid of. That's a major annoyance during the wrapping of LAPACK routines for SciPy. I don't know if this is realistic but the values ilaenv needed can be computed once (or again if hardware is changed) at the install and can be read off by the routines. On Jan 9, 2018 09:25, "Nathaniel Smith" wrote: > Hi all, > > As mentioned earlier [1][2], there's work underway to revise and > update the BLAS standard -- e.g. we might get support for strided > arrays and lose xerbla! There's a draft at [3]. They're interested in > feedback from users, so I've written up a first draft of comments > about what we would like as NumPy/SciPy developers. This is very much > a first attempt -- I know we have lots of people who are more expert > on BLAS than me on these lists :-). Please let me know what you think. > > -n > > [1] https://mail.python.org/pipermail/numpy-discussion/ > 2017-November/077420.html > [2] https://mail.python.org/pipermail/scipy-dev/2017-November/022267.html > [3] https://docs.google.com/document/d/1DY4ImZT1coqri2382GusXgBTTTVdB > DvtD5I14QHp9OE/edit > > ----- > > # Comments from NumPy / SciPy developers on "A Proposal for a > Next-Generation BLAS" > > These are comments on [A Proposal for a Next-Generation > BLAS](https://docs.google.com/document/d/1DY4ImZT1coqri2382GusXgBTTTVdB > DvtD5I14QHp9OE/edit#) > (version as of 2017-12-13), from the perspective of the developers of > the NumPy and SciPy libraries. We hope this feedback is useful, and > welcome further discussion. > > ## Who are we? > > NumPy and SciPy are the two foundational libraries of the Python > numerical ecosystem, and one of their duties is to wrap BLAS and > expose it for the use of other Python libraries. (NumPy primarily > provides a GEMM wrapper, while SciPy exposes more specialized > operations.) 
> [...]
> > ### BLAS G2 > > NumPy has a rich type system, including 16-, 32-, and 64-bit floats > (plus extended precision on some platforms), a similar set of complex > values, and is further extensible with new types. We have a similarly > rich dispatch system for operations that attempts to find the best > matching optimized-loop for an arbitrary set of input types. We're > thus quite interested in the proposed mixed-type operations, and if > implemented we'll probably start transparently taking advantage of > them (i.e., in cases where we're currently upcasting to perform a > requested operation, we may switch to using the native precision > operation without requiring any changes in user code). > > Two comments, though: > > 1. The various tricks to make names less redundant (e.g., specifying > just one type if they're all the same) are convenient for those > calling these routines by hand, but rather inconvenient for those of > us who will be generating a table mapping different type combinations > to different functions. When designing a compression scheme for names, > please remember that the scheme will have to be unambiguously > implemented in code by multiple projects. Another option to consider: > ask vendors to implement the full "uncompressed" names, and then > provide a shim library mapping short "convenience" names to their full > form. > > 2. Regarding the suggestion that the naming scheme will allow for a > wide variety of differently typed routines, but that only some subset > will be implemented by any particular vendor: we are worried about > this subset business. For a library like NumPy that wants to wrap > "all" the provided routines, while supporting "all" the different BLAS > libraries ... how will this work? **How do we know which routines any > particular library provides?** What if new routines are added to the > list later? > > ### Reproducible BLAS > > This is interesting, but we don't have any particularly useful comments. > > ### Batch BLAS > > NumPy/SciPy actually provide batched operations in many cases, > currently implemented using a `for` loop around individual calls to > BLAS/LAPACK. However, our interface is relatively restricted compared > to some of the proposals considered here: generally all we need is the > ability to perform an operation on two "stacks" of size-homogenous > matrices, with the offset between matrices determined by an arbitrary > (potentially zero) stride. > > ### Fixed-point BLAS > > This is interesting, but we don't have any particularly useful comments. > > > -- > Nathaniel J. Smith -- https://vorpus.org > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Martin.Gfeller at swisscom.com Tue Jan 9 07:27:58 2018 From: Martin.Gfeller at swisscom.com (Martin.Gfeller at swisscom.com) Date: Tue, 9 Jan 2018 12:27:58 +0000 Subject: [Numpy-discussion] array - dimension size of 1-D and 2-D examples Message-ID: Hi Derek I have a related question: Given: a = numpy.array([[0,1,2],[3,4]]) assert a.ndim == 1 b = numpy.array([[0,1,2],[3,4,5]]) assert b.ndim == 2 Is there an elegant way to force b to remain a 1-dim object array? I have a use case where normally the sublists are of different lengths, but I get a completely different structure when they are (coincidentally in my case) of the same length. 
Thanks and best regards, Martin Martin Gfeller, Swisscom / Enterprise / Banking / Products / Quantax Message: 1 Date: Sun, 31 Dec 2017 00:11:48 +0100 From: Derek Homeier To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] array - dimension size of 1-D and 2-D examples Message-ID: Content-Type: text/plain; charset=utf-8 On 30 Dec 2017, at 5:38 pm, Vinodhini Balusamy wrote: > > Just one more question from the details you have provided which from > my understanding strongly seems to be Design [DEREK] You cannot create > a regular 2-dimensional integer array from one row of length 3 >> and a second one of length 0. Thus np.array chooses the next most >> basic type of array it can fit your input data in > Indeed, the general philosophy is to preserve the structure and type of your input data as far as possible, i.e. a list is turned into a 1d-array, a list of lists (or tuples etc?) into a 2d-array,_ if_ the sequences are of equal length (even if length 1). As long as there is an unambiguous way to convert the data into an array (see below). > Which is the case, only if an second one of length 0 is given. > What about the case 1 : > >>> x12 = np.array([[1,2,3]]) > >>> x12 > array([[1, 2, 3]]) > >>> print(x12) > [[1 2 3]] > >>> x12.ndim > 2 > >>> > >>> > This seems to take 2 dimension. Yes, structurally this is equivalent to your second example > also, >>> x12 = np.array([[1,2,3],[0,0,0]]) >>> print(x12) [[1 2 3] [0 0 0]] >>> x12.ndim 2 > I presumed the above case and the case where length 0 is provided to be treated same(I mean same behaviour). > Correct me if I am wrong. > In this case there is no unambiguous way to construct the array - you would need a shape (2, 3) array to store the two lists with 3 elements in the first list. Obviously x12[0] would be np.array([1,2,3]), but what should be the value of x12[1], if the second list is empty - it could be zeros, or repeating x12[0], or simply undefined. np.array([1, 2, 3], [4]]) would be even less clearly defined. These cases where there is no obvious ?right? way to create the array have usually been discussed at some length, but I don?t know if this is fully documented in some place. For the essentials, see https://docs.scipy.org/doc/numpy/reference/routines.array-creation.html note also the upcasting rules if you have e.g. a mix of integers and reals or complex numbers, and also how to control shape or data type explicitly with the respective keywords. Derek From sebastian at sipsolutions.net Tue Jan 9 08:47:41 2018 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 09 Jan 2018 14:47:41 +0100 Subject: [Numpy-discussion] array - dimension size of 1-D and 2-D examples In-Reply-To: References: Message-ID: <1515505661.23088.1.camel@sipsolutions.net> On Tue, 2018-01-09 at 12:27 +0000, Martin.Gfeller at swisscom.com wrote: > Hi Derek > > I have a related question: > > Given: > > a = numpy.array([[0,1,2],[3,4]]) > assert a.ndim == 1 > b = numpy.array([[0,1,2],[3,4,5]]) > assert b.ndim == 2 > > Is there an elegant way to force b to remain a 1-dim object array? > You will have to create an empty object array and assign the lists to it. ``` b = np.empty(len(l), dtype=object) b[...] = l ``` > I have a use case where normally the sublists are of different > lengths, but I get a completely different structure when they are > (coincidentally in my case) of the same length. 
> > Thanks and best regards, Martin > > > Martin Gfeller, Swisscom / Enterprise / Banking / Products / Quantax > > Message: 1 > Date: Sun, 31 Dec 2017 00:11:48 +0100 > From: Derek Homeier > To: Discussion of Numerical Python > Subject: Re: [Numpy-discussion] array - dimension size of 1-D and 2-D > examples > Message-ID: > en.de> > Content-Type: text/plain; charset=utf-8 > > On 30 Dec 2017, at 5:38 pm, Vinodhini Balusamy > wrote: > > > > Just one more question from the details you have provided which > > from > > my understanding strongly seems to be Design [DEREK] You cannot > > create > > a regular 2-dimensional integer array from one row of length 3 > > > and a second one of length 0. Thus np.array chooses the next > > > most > > > basic type of array it can fit your input data in > > Indeed, the general philosophy is to preserve the structure and type > of your input data as far as possible, i.e. a list is turned into a > 1d-array, a list of lists (or tuples etc?) into a 2d-array,_ if_ the > sequences are of equal length (even if length 1). > As long as there is an unambiguous way to convert the data into an > array (see below). > > > Which is the case, only if an second one of length 0 is given. > > What about the case 1 : > > > > > x12 = np.array([[1,2,3]]) > > > > > x12 > > > > array([[1, 2, 3]]) > > > > > print(x12) > > > > [[1 2 3]] > > > > > x12.ndim > > > > 2 > > > > > > > > > > > > > > This seems to take 2 dimension. > > Yes, structurally this is equivalent to your second example > > > also, > > > > x12 = np.array([[1,2,3],[0,0,0]]) > > > > print(x12) > > [[1 2 3] > [0 0 0]] > > > > x12.ndim > > 2 > > > I presumed the above case and the case where length 0 is provided > > to be treated same(I mean same behaviour). > > Correct me if I am wrong. > > > > In this case there is no unambiguous way to construct the array - you > would need a shape (2, 3) array to store the two lists with 3 > elements in the first list. Obviously x12[0] would be > np.array([1,2,3]), but what should be the value of x12[1], if the > second list is empty - it could be zeros, or repeating x12[0], or > simply undefined. np.array([1, 2, 3], [4]]) would be even less > clearly defined. > These cases where there is no obvious ?right? way to create the array > have usually been discussed at some length, but I don?t know if this > is fully documented in some place. For the essentials, see > > https://docs.scipy.org/doc/numpy/reference/routines.array-creation.ht > ml > > note also the upcasting rules if you have e.g. a mix of integers and > reals or complex numbers, and also how to control shape or data type > explicitly with the respective keywords. > > Derek > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From njs at pobox.com Tue Jan 9 19:34:04 2018 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 9 Jan 2018 16:34:04 -0800 Subject: [Numpy-discussion] [SciPy-Dev] RFC: comments to BLAS committee from numpy/scipy devs In-Reply-To: References: Message-ID: On Tue, Jan 9, 2018 at 3:40 AM, Ilhan Polat wrote: > I couldn't find an item to place this but I think ilaenv and also calling > the function twice (one with lwork=-1 and reading the optimal block size and > the call the function again properly with lwork=) in LAPACK needs to > be gotten rid of. > > That's a major annoyance during the wrapping of LAPACK routines for SciPy. > > I don't know if this is realistic but the values ilaenv needed can be > computed once (or again if hardware is changed) at the install and can be > read off by the routines. Unfortunately I think this effort is just to revise BLAS, not LAPACK. Maybe you should try starting a conversation with the LAPACK developers though ? I don't know much about how they work but maybe they'd be interested in feedback. -n -- Nathaniel J. Smith -- https://vorpus.org From njs at pobox.com Tue Jan 9 19:37:18 2018 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 9 Jan 2018 16:37:18 -0800 Subject: [Numpy-discussion] [SciPy-Dev] RFC: comments to BLAS committee from numpy/scipy devs In-Reply-To: References: Message-ID: On Tue, Jan 9, 2018 at 12:53 PM, Tyler Reddy wrote: > One common issue in computational geometry is the need to operate rapidly on > arrays with "heterogeneous shapes." > > So, an array that has rows with different numbers of columns -- shape (1,3) > for the first polygon and shape (1, 12) for the second polygon and so on. > > This seems like a particularly nasty scenario when the loss of "homogeneity" > in shape precludes traditional vectorization -- I think numpy effectively > converts these to dtype=object, etc. I don't > think is necessarily a BLAS issue since wrapping comp. geo. libraries does > happen in a subset of cases to handle this, but if there's overlap in > utility you could pass it along I suppose. You might be interested in this discussion of "Batch BLAS": https://docs.google.com/document/d/1DY4ImZT1coqri2382GusXgBTTTVdBDvtD5I14QHp9OE/edit#heading=h.pvsif1mxvaqq I didn't get into it in the draft response, because it didn't seem like something where NumPy/SciPy have any useful experience to offer, but it sounds like there are people worrying about this case. -n -- Nathaniel J. Smith -- https://vorpus.org From andyfaff at gmail.com Wed Jan 10 17:58:41 2018 From: andyfaff at gmail.com (Andrew Nelson) Date: Thu, 11 Jan 2018 09:58:41 +1100 Subject: [Numpy-discussion] Understanding np.min behaviour with nan Message-ID: I'm having some trouble understanding the behaviour of np.min when used with NaN. In the documentation of np.amin it says: "NaN values are propagated, that is if at least one item is NaN, the corresponding min value will be NaN as well. ". It doesn't say that in some circumstances there will be RuntimeWarning's raised (presumably depending on the errstate). 
Consider the following: >>> import numpy as np >>> np.min([1, np.nan]) /Users/andrew/miniconda3/envs/dev3/lib/python3.6/site-packages/numpy/core/_methods.py:29: RuntimeWarning: invalid value encountered in reduce return umr_minimum(a, axis, None, out, keepdims) nan >>> np.min([1, 2, np.nan]) nan >>> np.min([np.nan, 1, 2]) nan >>> np.min([np.nan, 1]) nan Why is there a RuntimeWarning for the first example, but not the others? Is this expected behaviour? -- _____________________________________ Dr. Andrew Nelson _____________________________________ -------------- next part -------------- An HTML attachment was scrubbed... URL: From andyfaff at gmail.com Wed Jan 10 18:03:25 2018 From: andyfaff at gmail.com (Andrew Nelson) Date: Thu, 11 Jan 2018 10:03:25 +1100 Subject: [Numpy-discussion] Understanding np.min behaviour with nan In-Reply-To: References: Message-ID: Further to my last message, why is the warning only raised once? ``` >>> import numpy as np >>> np.min([1, np.nan]) /Users/andrew/miniconda3/envs/dev3/lib/python3.6/site-packages/numpy/core/_methods.py:29: RuntimeWarning: invalid value encountered in reduce return umr_minimum(a, axis, None, out, keepdims) nan >>> np.min([1, np.nan]) nan >>> ``` -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Wed Jan 10 18:21:27 2018 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 10 Jan 2018 15:21:27 -0800 Subject: [Numpy-discussion] Understanding np.min behaviour with nan In-Reply-To: References: Message-ID: On Wed, Jan 10, 2018 at 3:03 PM, Andrew Nelson wrote: > > Further to my last message, why is the warning only raised once? > > ``` > >>> import numpy as np > >>> np.min([1, np.nan]) > /Users/andrew/miniconda3/envs/dev3/lib/python3.6/site-packages/numpy/core/_methods.py:29: RuntimeWarning: invalid value encountered in reduce > return umr_minimum(a, axis, None, out, keepdims) > nan > >>> np.min([1, np.nan]) > nan > >>> > ``` This is default behavior for warnings. https://docs.python.org/3/library/warnings.html#the-warnings-filter It also explains the previous results. The same warning would have been issued from the same place in each of the variations you tried. Since the warnings mechanism had already seen that RuntimeWarning with the same message from the same code location, they were not printed. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From gerrit.holl at gmail.com Wed Jan 10 18:21:57 2018 From: gerrit.holl at gmail.com (Gerrit Holl) Date: Wed, 10 Jan 2018 23:21:57 +0000 Subject: [Numpy-discussion] Understanding np.min behaviour with nan In-Reply-To: References: Message-ID: On 10 January 2018 at 23:03, Andrew Nelson wrote: > Further to my last message, why is the warning only raised once? Warnings are only raised once because that is the default behaviour for warnings. The warning is issued every time but only displayed on the first occurence. You can control this with the warnings module: https://docs.python.org/3.6/library/warnings.html Gerrit. From andyfaff at gmail.com Wed Jan 10 19:24:25 2018 From: andyfaff at gmail.com (Andrew Nelson) Date: Thu, 11 Jan 2018 11:24:25 +1100 Subject: [Numpy-discussion] Understanding np.min behaviour with nan In-Reply-To: References: Message-ID: > The same warning would have been issued from the same place in each of the variations you tried. That's not the case, I tried with np.min([1, 2, 3, np.nan]) in a fresh interpreter and no warning was raised. 
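For completeness, one way to rule the warning-filter explanation in or out is to force every occurrence to be shown; a minimal sketch (not something I've run here, just for reference):

```
import warnings
import numpy as np

# Show every RuntimeWarning occurrence rather than only the first one
# per code location (which is the default filtering behaviour).
warnings.simplefilter("always", RuntimeWarning)

np.min([1, np.nan])   # an "invalid value encountered in reduce" warning,
np.min([1, np.nan])   # if raised at all, should now print on both calls
```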
Furthermore on my work computer (conda python3.6.2 with pip installed numpy) I can't get the problem to show at all: >>> import numpy as np >>> np.version.version '1.14.0' >>> np.min([1., 2., 3., 4., np.nan]) nan >>> np.min([1., 2., 3., np.nan, 4.]) nan >>> np.min([1., 2., np.nan, 3., 4.]) nan >>> np.min([1., np.nan, 2., 3., 4.]) nan >>> np.min([np.nan, 1., 2., 3., 4.]) nan >>> np.min([np.nan, 1.]) nan >>> np.min([np.nan, 1., np.nan]) nan >>> np.min([1., np.nan]) nan >>> np.seterr(all='raise') {'divide': 'warn', 'over': 'warn', 'under': 'ignore', 'invalid': 'warn'} >>> np.min([1., np.nan]) nan >>> np.min([np.nan, 1.]) nan >>> np.min([np.nan, 1., 2., 3., 4.]) nan >>> np.min([np.nan, 1., 2., 3., 4.]) nan The context for these questions is the sudden CI fails I'm observing for scipy on appveyor - https://ci.appveyor.com/project/scipy/scipy/build/1.0.1444/job/n05ptntm0xxjklvt -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Jan 11 11:51:21 2018 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 11 Jan 2018 09:51:21 -0700 Subject: [Numpy-discussion] Understanding np.min behaviour with nan In-Reply-To: References: Message-ID: On Wed, Jan 10, 2018 at 5:24 PM, Andrew Nelson wrote: > > The same warning would have been issued from the same place in each of > the variations you tried. > > That's not the case, I tried with np.min([1, 2, 3, np.nan]) in a fresh > interpreter and no warning was raised. > > Furthermore on my work computer (conda python3.6.2 with pip installed > numpy) I can't get the problem to show at all: > > >>> import numpy as np > >>> np.version.version > '1.14.0' > >>> np.min([1., 2., 3., 4., np.nan]) > nan > >>> np.min([1., 2., 3., np.nan, 4.]) > nan > >>> np.min([1., 2., np.nan, 3., 4.]) > nan > >>> np.min([1., np.nan, 2., 3., 4.]) > nan > >>> np.min([np.nan, 1., 2., 3., 4.]) > nan > >>> np.min([np.nan, 1.]) > nan > >>> np.min([np.nan, 1., np.nan]) > nan > >>> np.min([1., np.nan]) > nan > >>> np.seterr(all='raise') > {'divide': 'warn', 'over': 'warn', 'under': 'ignore', 'invalid': 'warn'} > >>> np.min([1., np.nan]) > nan > >>> np.min([np.nan, 1.]) > nan > >>> np.min([np.nan, 1., 2., 3., 4.]) > nan > >>> np.min([np.nan, 1., 2., 3., 4.]) > nan > > > The context for these questions is the sudden CI fails I'm observing for > scipy on appveyor - https://ci.appveyor.com/project/scipy/scipy/build/1.0. > 1444/job/n05ptntm0xxjklvt > Different compilers/libraries respond differently to nans. If the change in scipy is recent, probably compiler flags or something similar has changed in the test environment. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From spluque at gmail.com Fri Jan 12 16:12:16 2018 From: spluque at gmail.com (Seb) Date: Fri, 12 Jan 2018 15:12:16 -0600 Subject: [Numpy-discussion] custom Welch method for power spectral density Message-ID: <87po6e234f.fsf@gmail.com> Hello, I'm trying to compute a power spectral density of a signal, using the Welch method, in the broad sense; i.e. splitting the signal into segments for deriving smoother spectra. This is well implemented in scipy.signal.welch. However, I'd like to use exponentially increasing (power 2) segment length to dampen increasing variance in spectra at higher frequencies. Before hacking the scipy.signal.spectral module for this, I'd appreciate any tips on available packages/modules that allow for this kind of binning scheme, or other suggestions. 
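For concreteness, the fixed-segment estimate I would be starting from is essentially the following (a minimal sketch with made-up signal parameters):

```
import numpy as np
from scipy import signal

fs = 1000.0                                    # sampling frequency, Hz
t = np.arange(0, 10, 1 / fs)
x = np.sin(2 * np.pi * 10 * t) + 0.5 * np.random.randn(t.size)

# Standard Welch estimate: a single fixed segment length (nperseg) is
# used across the whole spectrum; the idea would be to vary the segment
# length by powers of 2 across the frequency range instead.
f, Pxx = signal.welch(x, fs=fs, nperseg=1024)
```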
Thanks, -- Seb From robert.kern at gmail.com Fri Jan 12 16:32:11 2018 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 12 Jan 2018 13:32:11 -0800 Subject: [Numpy-discussion] custom Welch method for power spectral density In-Reply-To: <87po6e234f.fsf@gmail.com> References: <87po6e234f.fsf@gmail.com> Message-ID: On Fri, Jan 12, 2018 at 1:12 PM, Seb wrote: > > Hello, > > I'm trying to compute a power spectral density of a signal, using the > Welch method, in the broad sense; i.e. splitting the signal into > segments for deriving smoother spectra. This is well implemented in > scipy.signal.welch. However, I'd like to use exponentially increasing > (power 2) segment length to dampen increasing variance in spectra at > higher frequencies. Before hacking the scipy.signal.spectral module for > this, I'd appreciate any tips on available packages/modules that allow > for this kind of binning scheme, or other suggestions. Not entirely sure about this kind of binning scheme per se, but you may want to look at multitaper spectral estimation methods. The Welch method can be viewed as a poor-man's multitaper. Multitaper methods give you better control over the resolution/variance tradeoff that may help with your problem. Googling for "python multitaper" gives you several options; I haven't used any of them in anger, so I don't have a single recommendation for you. The nitime documentation provides more information about multitaper methods that may be useful to you: http://nipy.org/nitime/examples/multi_taper_spectral_estimation.html -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From spluque at gmail.com Sat Jan 13 00:08:01 2018 From: spluque at gmail.com (Seb) Date: Fri, 12 Jan 2018 23:08:01 -0600 Subject: [Numpy-discussion] custom Welch method for power spectral density In-Reply-To: (Robert Kern's message of "Fri, 12 Jan 2018 13:32:11 -0800") References: <87po6e234f.fsf@gmail.com> Message-ID: <878td2ny6m.fsf@otaria.sebmel.org> On Fri, 12 Jan 2018 13:32:11 -0800, Robert Kern wrote: [...] > Not entirely sure about this kind of binning scheme per se, but you > may want to look at multitaper spectral estimation methods. The Welch > method can be viewed as a poor-man's multitaper. Multitaper methods > give you better control over the resolution/variance tradeoff that may > help with your problem. Googling for "python multitaper" gives you > several options; I haven't used any of them in anger, so I don't have > a single recommendation for you. The nitime documentation provides > more information about multitaper methods that may be useful to you: > http://nipy.org/nitime/examples/multi_taper_spectral_estimation.html Very interesting documentation and suggestion. A test application of this looks promising. Thanks, -- Seb From josef.pktd at gmail.com Sat Jan 13 16:25:43 2018 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 13 Jan 2018 16:25:43 -0500 Subject: [Numpy-discussion] NumPy 1.14.0 release In-Reply-To: References: <4e07370e-51ff-03b7-848a-c9080cdb433f@gmail.com> Message-ID: statsmodels does not work with numpy 1.4.0 Besides the missing WarningsManager there seems to be 22 errors or failures from changes in numpy behavior, mainly from recarrays again. 
Josef From ben.v.root at gmail.com Sat Jan 13 16:49:31 2018 From: ben.v.root at gmail.com (Benjamin Root) Date: Sat, 13 Jan 2018 16:49:31 -0500 Subject: [Numpy-discussion] NumPy 1.14.0 release In-Reply-To: References: <4e07370e-51ff-03b7-848a-c9080cdb433f@gmail.com> Message-ID: I assume you mean 1.14.0, rather than 1.4.0? Did recarrays change? I didn't see anything in the release notes. On Sat, Jan 13, 2018 at 4:25 PM, wrote: > statsmodels does not work with numpy 1.4.0 > > Besides the missing WarningsManager there seems to be 22 errors or > failures from changes in numpy behavior, mainly from recarrays again. > > Josef > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Sat Jan 13 17:15:46 2018 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 13 Jan 2018 17:15:46 -0500 Subject: [Numpy-discussion] NumPy 1.14.0 release In-Reply-To: References: <4e07370e-51ff-03b7-848a-c9080cdb433f@gmail.com> Message-ID: On Sat, Jan 13, 2018 at 4:49 PM, Benjamin Root wrote: > I assume you mean 1.14.0, rather than 1.4.0? Yes, typo > > Did recarrays change? I didn't see anything in the release notes. I didn't look at the details. thequackdaddy is doing the hunting. maybe it's the loadtxt interaction one problem is with converters https://github.com/statsmodels/statsmodels/pull/4205/files (statsmodels has too much "legacy" code from pre-pandas days.) Josef > > On Sat, Jan 13, 2018 at 4:25 PM, wrote: >> >> statsmodels does not work with numpy 1.4.0 >> >> Besides the missing WarningsManager there seems to be 22 errors or >> failures from changes in numpy behavior, mainly from recarrays again. >> >> Josef >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > From wieser.eric+numpy at gmail.com Sat Jan 13 22:35:39 2018 From: wieser.eric+numpy at gmail.com (Eric Wieser) Date: Sun, 14 Jan 2018 03:35:39 +0000 Subject: [Numpy-discussion] NumPy 1.14.0 release In-Reply-To: References: <4e07370e-51ff-03b7-848a-c9080cdb433f@gmail.com> Message-ID: Did recarrays change? I didn?t see anything in the release notes. Not directly, but structured arrays did , for which recarrays are really just a thin and somewhat buggy wrapper. ? On Sat, 13 Jan 2018 at 14:19 wrote: > On Sat, Jan 13, 2018 at 4:49 PM, Benjamin Root > wrote: > > I assume you mean 1.14.0, rather than 1.4.0? > > Yes, typo > > > > Did recarrays change? I didn't see anything in the release notes. > > I didn't look at the details. thequackdaddy is doing the hunting. > maybe it's the loadtxt interaction > one problem is with converters > https://github.com/statsmodels/statsmodels/pull/4205/files > > (statsmodels has too much "legacy" code from pre-pandas days.) > > Josef > > > > > > On Sat, Jan 13, 2018 at 4:25 PM, wrote: > >> > >> statsmodels does not work with numpy 1.4.0 > >> > >> Besides the missing WarningsManager there seems to be 22 errors or > >> failures from changes in numpy behavior, mainly from recarrays again. 
> >> > >> Josef > >> _______________________________________________ > >> NumPy-Discussion mailing list > >> NumPy-Discussion at python.org > >> https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Sun Jan 14 06:35:15 2018 From: matthew.brett at gmail.com (Matthew Brett) Date: Sun, 14 Jan 2018 11:35:15 +0000 Subject: [Numpy-discussion] NumPy 1.14.0 release In-Reply-To: References: <4e07370e-51ff-03b7-848a-c9080cdb433f@gmail.com> Message-ID: Hi, On Sun, Jan 14, 2018 at 3:35 AM, Eric Wieser wrote: > Did recarrays change? I didn?t see anything in the release notes. > > Not directly, but structured arrays did, for which recarrays are really just > a thin and somewhat buggy wrapper. Oh dear oh dear - for some reason I had completely missed these changes, and the justification for them. They do exactly the kind of thing that Konrad Hinsen was complaining about before, with justification, which is to change the behavior of previous code, without an intervening (long) period of raising an error. In this case, the benefits of these changes seem small, compared to the inevitable breakage and silently changed results they will cause. Is there any chance of reversing them? Cheers, Matthew From sebastian at sipsolutions.net Sun Jan 14 07:10:29 2018 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Sun, 14 Jan 2018 13:10:29 +0100 Subject: [Numpy-discussion] NumPy 1.14.0 release In-Reply-To: References: <4e07370e-51ff-03b7-848a-c9080cdb433f@gmail.com> Message-ID: <1515931829.23458.5.camel@sipsolutions.net> On Sun, 2018-01-14 at 11:35 +0000, Matthew Brett wrote: > Hi, > > On Sun, Jan 14, 2018 at 3:35 AM, Eric Wieser > wrote: > > Did recarrays change? I didn?t see anything in the release notes. > > > > Not directly, but structured arrays did, for which recarrays are > > really just > > a thin and somewhat buggy wrapper. > > Oh dear oh dear - for some reason I had completely missed these > changes, and the justification for them. > > They do exactly the kind of thing that Konrad Hinsen was complaining > about before, with justification, which is to change the behavior of > previous code, without an intervening (long) period of raising an > error. In this case, the benefits of these changes seem small, > compared to the inevitable breakage and silently changed results they > will cause. > > Is there any chance of reversing them? > Without knowing the change, there is always a chance of (temporary) reversal and for unexpected complications its probably the safest default if there is no agreement anyway. - Sebastian > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From charlesr.harris at gmail.com Sun Jan 14 11:30:48 2018 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 14 Jan 2018 09:30:48 -0700 Subject: [Numpy-discussion] NumPy 1.14.0 release In-Reply-To: References: <4e07370e-51ff-03b7-848a-c9080cdb433f@gmail.com> Message-ID: On Sun, Jan 14, 2018 at 4:35 AM, Matthew Brett wrote: > Hi, > > On Sun, Jan 14, 2018 at 3:35 AM, Eric Wieser > wrote: > > Did recarrays change? I didn?t see anything in the release notes. > > > > Not directly, but structured arrays did, for which recarrays are really > just > > a thin and somewhat buggy wrapper. > > Oh dear oh dear - for some reason I had completely missed these > changes, and the justification for them. > See https://github.com/numpy/numpy/pull/6053. It actually goes back a couple of years. > > They do exactly the kind of thing that Konrad Hinsen was complaining > about before, with justification, which is to change the behavior of > previous code, without an intervening (long) period of raising an > error. In this case, the benefits of these changes seem small, > compared to the inevitable breakage and silently changed results they > will cause. > > Is there any chance of reversing them? > Maybe, we'll see how things go. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From allanhaldane at gmail.com Sun Jan 14 12:34:28 2018 From: allanhaldane at gmail.com (Allan Haldane) Date: Sun, 14 Jan 2018 12:34:28 -0500 Subject: [Numpy-discussion] NumPy 1.14.0 release In-Reply-To: References: <4e07370e-51ff-03b7-848a-c9080cdb433f@gmail.com> Message-ID: <7ab70d5e-bd6a-0ad7-963d-b489e6c4320e@gmail.com> On 01/14/2018 11:30 AM, Charles R Harris wrote: > > > On Sun, Jan 14, 2018 at 4:35 AM, Matthew Brett > wrote: > > Hi, > > On Sun, Jan 14, 2018 at 3:35 AM, Eric Wieser > > > wrote: > > Did recarrays change? I didn?t see anything in the release notes. > > > > Not directly, but structured arrays did, for which recarrays are really just > > a thin and somewhat buggy wrapper. > > Oh dear oh dear - for some reason I had completely missed these > changes, and the justification for them. > > > See https://github.com/numpy/numpy/pull/6053. It actually goes back a > couple of years. > > > They do exactly the kind of thing that Konrad Hinsen was complaining > about before, with justification, which is to change the behavior of > previous code, without an intervening (long) period of raising an > error.? In this case, the benefits of these changes seem small, > compared to the inevitable breakage and silently changed results they > will cause. > > Is there any chance of reversing them? Of course the goal was to make things backwards-compatible; If some part of the changes is breaking a lot of code we will revert, or find a way to stop breaking code. It's not yet clear to me which exact changes are causing the problem. The statsmodel failures may be related to what we are working on in one of these different issues: https://github.com/numpy/numpy/issues/10344 https://github.com/numpy/numpy/issues/10387 https://github.com/numpy/numpy/issues/10394 I will be checking the statsmodel unit tests as we fix things, to make sure they pass in the end. 
Allan From charlesr.harris at gmail.com Sun Jan 14 12:34:36 2018 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 14 Jan 2018 10:34:36 -0700 Subject: [Numpy-discussion] NumPy 1.14.0 release In-Reply-To: References: <4e07370e-51ff-03b7-848a-c9080cdb433f@gmail.com> Message-ID: On Sun, Jan 14, 2018 at 9:30 AM, Charles R Harris wrote: > > > On Sun, Jan 14, 2018 at 4:35 AM, Matthew Brett > wrote: > >> Hi, >> >> On Sun, Jan 14, 2018 at 3:35 AM, Eric Wieser >> wrote: >> > Did recarrays change? I didn?t see anything in the release notes. >> > >> > Not directly, but structured arrays did, for which recarrays are really >> just >> > a thin and somewhat buggy wrapper. >> >> Oh dear oh dear - for some reason I had completely missed these >> changes, and the justification for them. >> > > See https://github.com/numpy/numpy/pull/6053. It actually goes back a > couple of years. > And I wonder how many of these are related to https://github.com/numpy/numpy/issues/10344, which came about because there was previously an attempt to fix up malformed inputs. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From deak.andris at gmail.com Thu Jan 18 10:54:48 2018 From: deak.andris at gmail.com (Andras Deak) Date: Thu, 18 Jan 2018 16:54:48 +0100 Subject: [Numpy-discussion] building numpy with python3.7 In-Reply-To: <1125979235.6298287.1513757752706.JavaMail.zimbra@saao.ac.za> References: <1237981928.6063979.1513267195600.JavaMail.zimbra@saao.ac.za> <280195652.6103380.1513332661356.JavaMail.zimbra@saao.ac.za> <389896254.6106478.1513334545873.JavaMail.zimbra@saao.ac.za> <1524113059.6201087.1513592408374.JavaMail.zimbra@saao.ac.za> <1125979235.6298287.1513757752706.JavaMail.zimbra@saao.ac.za> Message-ID: Hello, After failing with several attempts to build numpy on python 3.7.0a4, the combo that worked with pip was cython 0.28a0 (current master) numpy 1.15.0.dev0 (current master) in a fresh, clean venv. Older cython (0.27.3 where the aforementioned issue seems to have been solved https://github.com/cython/cython/issues/1955) didn't help. Stable numpy 1.14.0 didn't seem to work even with cython master. This seems to be the same setup that Thomas mentioned, I mostly want to note that anything less doesn't seem to compile yet. Andr?s On Wed, Dec 20, 2017 at 9:15 AM, Hannes Breytenbach wrote: > Hi Chuck > > I'm using Cython 0.28a0. > > Hannes > > ________________________________ > From: "Charles R Harris" > To: "Discussion of Numerical Python" > Sent: Tuesday, December 19, 2017 10:01:40 PM > Subject: Re: [Numpy-discussion] building numpy with python3.7 > > > > On Mon, Dec 18, 2017 at 3:20 AM, Hannes Breytenbach > wrote: >> >> OK, thanks for the link to the issue. I'm not using virtual environments >> - I built python3.7 (and 2.7) with the `make altinstall` method. I managed >> to get the numpy build working on 3.7 by removing the >> `random/mtrand/mtrand.c` file so that it gets (re-)generated during the >> build using the latest cython version. >> >> Thanks for the help! > > > Just to be sure, which Cython version is that? > > Chuck > > !DSPAM:5a39704114261779167816! > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > > !DSPAM:5a39704114261779167816! 
> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > From charlesr.harris at gmail.com Thu Jan 18 12:19:26 2018 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 18 Jan 2018 10:19:26 -0700 Subject: [Numpy-discussion] building numpy with python3.7 In-Reply-To: References: <1237981928.6063979.1513267195600.JavaMail.zimbra@saao.ac.za> <280195652.6103380.1513332661356.JavaMail.zimbra@saao.ac.za> <389896254.6106478.1513334545873.JavaMail.zimbra@saao.ac.za> <1524113059.6201087.1513592408374.JavaMail.zimbra@saao.ac.za> <1125979235.6298287.1513757752706.JavaMail.zimbra@saao.ac.za> Message-ID: On Thu, Jan 18, 2018 at 8:54 AM, Andras Deak wrote: > Hello, > > After failing with several attempts to build numpy on python 3.7.0a4, > the combo that worked with pip was > cython 0.28a0 (current master) > numpy 1.15.0.dev0 (current master) > in a fresh, clean venv. > Older cython (0.27.3 where the aforementioned issue seems to have been > solved https://github.com/cython/cython/issues/1955) didn't help. > Stable numpy 1.14.0 didn't seem to work even with cython master. > This seems to be the same setup that Thomas mentioned, I mostly want > to note that anything less doesn't seem to compile yet. Where did you get 1.14.0 source? If it was with pip, it was generated using cython 0.26.1. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From deak.andris at gmail.com Thu Jan 18 12:34:49 2018 From: deak.andris at gmail.com (Andras Deak) Date: Thu, 18 Jan 2018 18:34:49 +0100 Subject: [Numpy-discussion] building numpy with python3.7 In-Reply-To: References: <1237981928.6063979.1513267195600.JavaMail.zimbra@saao.ac.za> <280195652.6103380.1513332661356.JavaMail.zimbra@saao.ac.za> <389896254.6106478.1513334545873.JavaMail.zimbra@saao.ac.za> <1524113059.6201087.1513592408374.JavaMail.zimbra@saao.ac.za> <1125979235.6298287.1513757752706.JavaMail.zimbra@saao.ac.za> Message-ID: On Thursday, January 18, 2018, Charles R Harris wrote: > > > On Thu, Jan 18, 2018 at 8:54 AM, Andras Deak wrote: >> >> Hello, >> >> After failing with several attempts to build numpy on python 3.7.0a4, >> the combo that worked with pip was >> cython 0.28a0 (current master) >> numpy 1.15.0.dev0 (current master) >> in a fresh, clean venv. >> Older cython (0.27.3 where the aforementioned issue seems to have been >> solved https://github.com/cython/cython/issues/1955) didn't help. >> Stable numpy 1.14.0 didn't seem to work even with cython master. >> This seems to be the same setup that Thomas mentioned, I mostly want >> to note that anything less doesn't seem to compile yet. > > Where did you get 1.14.0 source? If it was with pip, it was generated using cython 0.26.1. > Chuck Yes, I did with my last attempts (in earlier attempts I did pull earlier versions from github but only before I realized I had to upgrade cython altogether). So that's probably it; sorry, I had no understanding of the workings of pip. I'll retry with cython stable and numpy from source, and only post again if I find something surprising in order to reduce further noise. Thanks, Andr?s -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Fri Jan 19 05:30:05 2018 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 19 Jan 2018 10:30:05 +0000 Subject: [Numpy-discussion] Behavior of rint? 
Message-ID: Hi, Sorry for my confusion, but I noticed (as a result of the discussion here [1]) that np.rint and the fallback C function [2] seem to round to even. But - my impression was that C rint, by default, rounds down [3]. Is numpy rint not behaving the same way as the GNU C library rint? In [4]: np.rint(np.arange(0.5, 11)) Out[4]: array([ 0., 2., 2., 4., 4., 6., 6., 8., 8., 10., 10.]) In [5]: np.round(np.arange(0.5, 11)) Out[5]: array([ 0., 2., 2., 4., 4., 6., 6., 8., 8., 10., 10.]) Cheers, Matthew [1] https://github.com/nipy/dipy/issues/1402 [2] https://github.com/numpy/numpy/blob/master/numpy/core/src/npymath/npy_math_internal.h.src#L290 [3] https://www.gnu.org/software/libc/manual/html_node/Rounding-Functions.html From charlesr.harris at gmail.com Fri Jan 19 08:41:15 2018 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 19 Jan 2018 06:41:15 -0700 Subject: [Numpy-discussion] Behavior of rint? In-Reply-To: References: Message-ID: On Fri, Jan 19, 2018 at 3:30 AM, Matthew Brett wrote: > Hi, > > Sorry for my confusion, but I noticed (as a result of the discussion > here [1]) that np.rint and the fallback C function [2] seem to round > to even. But - my impression was that C rint, by default, rounds down > [3]. Is numpy rint not behaving the same way as the GNU C library > rint? > > In [4]: np.rint(np.arange(0.5, 11)) > Out[4]: array([ 0., 2., 2., 4., 4., 6., 6., 8., 8., 10., 10.]) > > In [5]: np.round(np.arange(0.5, 11)) > Out[5]: array([ 0., 2., 2., 4., 4., 6., 6., 8., 8., 10., 10.]) > The GNU C documentation says that rint "round(s) x to an integer value according to the current rounding mode." The rounding mode is determined by settings in the FPU control word. Numpy runs with it set to round to even, although, IIRC, there is a bug on windows where the library is not setting those bits correctly. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Jan 19 08:51:28 2018 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 19 Jan 2018 06:51:28 -0700 Subject: [Numpy-discussion] Behavior of rint? In-Reply-To: References: Message-ID: On Fri, Jan 19, 2018 at 6:41 AM, Charles R Harris wrote: > > > On Fri, Jan 19, 2018 at 3:30 AM, Matthew Brett > wrote: > >> Hi, >> >> Sorry for my confusion, but I noticed (as a result of the discussion >> here [1]) that np.rint and the fallback C function [2] seem to round >> to even. But - my impression was that C rint, by default, rounds down >> [3]. Is numpy rint not behaving the same way as the GNU C library >> rint? >> >> In [4]: np.rint(np.arange(0.5, 11)) >> Out[4]: array([ 0., 2., 2., 4., 4., 6., 6., 8., 8., 10., 10.]) >> >> In [5]: np.round(np.arange(0.5, 11)) >> Out[5]: array([ 0., 2., 2., 4., 4., 6., 6., 8., 8., 10., 10.]) >> > > The GNU C documentation says that rint "round(s) x to an integer value > according to the current rounding mode." The rounding mode is determined by > settings in the FPU control word. Numpy runs with it set to round to even, > although, IIRC, there is a bug on windows where the library is not setting > those bits correctly. > Round to even is also the Python default rounding mode. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Fri Jan 19 09:48:31 2018 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 19 Jan 2018 14:48:31 +0000 Subject: [Numpy-discussion] Behavior of rint? 
In-Reply-To: References: Message-ID: Hi Chuck, Thanks for the replies, they are very helpful. On Fri, Jan 19, 2018 at 1:51 PM, Charles R Harris wrote: > > > On Fri, Jan 19, 2018 at 6:41 AM, Charles R Harris > wrote: >> >> >> >> On Fri, Jan 19, 2018 at 3:30 AM, Matthew Brett >> wrote: >>> >>> Hi, >>> >>> Sorry for my confusion, but I noticed (as a result of the discussion >>> here [1]) that np.rint and the fallback C function [2] seem to round >>> to even. But - my impression was that C rint, by default, rounds down >>> [3]. Is numpy rint not behaving the same way as the GNU C library >>> rint? >>> >>> In [4]: np.rint(np.arange(0.5, 11)) >>> Out[4]: array([ 0., 2., 2., 4., 4., 6., 6., 8., 8., 10., 10.]) >>> >>> In [5]: np.round(np.arange(0.5, 11)) >>> Out[5]: array([ 0., 2., 2., 4., 4., 6., 6., 8., 8., 10., 10.]) >> >> >> The GNU C documentation says that rint "round(s) x to an integer value >> according to the current rounding mode." The rounding mode is determined by >> settings in the FPU control word. Numpy runs with it set to round to even, >> although, IIRC, there is a bug on windows where the library is not setting >> those bits correctly. > > > Round to even is also the Python default rounding mode. Do you mean that it is Python setting the FPU control word? Or do we set it? Do you happen to know where that is in the source? I did a quick grep just now without anything obvious. Cheers, Matthew From robert.kern at gmail.com Fri Jan 19 09:55:57 2018 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 19 Jan 2018 23:55:57 +0900 Subject: [Numpy-discussion] Moving NumPy's PRNG Forward Message-ID: tl;dr: I think that our stream-compatibility policy is holding us back, and I think we can come up with a way forward with a new policy that will allow us to innovate without seriously compromising our reliability. To recap, our current policy for numpy.random is that we guarantee that the stream of random numbers from any of the methods of a seeded `RandomState` does not change from version to version (at least on the same hardware, OS, compiler, etc.), except in the case where we are fixing correctness bugs. That is, for code like this: prng = np.random.RandomState(12345) x = prng.normal(10.0, 3.0, size=100) `x` will be the same exact floats regardless of what version of numpy was installed. There seems to be a lot of pent-up motivation to improve on the random number generation, in particular the distributions, that has been blocked by our policy. I think we've lost a few potential first-time contributors that have run up against this wall. We have been pondering ways to allow for adding new core PRNGs and improve the distribution methods while maintaining stream-compatibility for existing code. Kevin Sheppard, in particular, has been working hard to implement new core PRNGs with a common API. https://github.com/bashtage/ng-numpy-randomstate Kevin has also been working to implement the several proposals that have been made to select different versions of distribution implementations. In particular, one idea is to pass something to the RandomState constructor to select a specific version of distributions (or switch out the core PRNG). Note that to satisfy the policy, the simplest method of seeding a RandomState will always give you the oldest version: what we have now. Kevin has recently come to the conclusion that it's not technically feasible to add the version-selection at all if we keep the stream-compatibility policy. 
https://github.com/numpy/numpy/pull/10124#issuecomment-350876221 I would argue that our current policy isn't providing the value that it claims to. In the first place, there are substantial holes in the reproducibility of streams across platforms. All of the integers (internal and external) are C `long`s, so integer overflows can cause variable streams if you use any of the rejection algorithms involving integers across Windows and Linux. Plain-old floating point arithmetic differences between platforms can cause similar issues (though rarer). Our policy of fixing bugs interferes with strict reproducibility. And our changes to non-random routines can interfere with the ability to reproduce the results of the whole software, independent of the PRNG stream. The multivariate normal implementation is even more vulnerable, as it uses `np.linalg` routines that may be affected by which LAPACK library numpy is built against, much less changes that we might make to them in the normal course of development. At the time I established the policy (2008-9), there was significantly less tooling around for pinning versions of software. The PyPI/pip/setuptools ecosystem was in its infancy, VMs were slow cumbersome beasts mostly used to run Windows programs unavailable on Linux, and containerization a la Docker was merely a dream. A lot of resources have been put into reproducible research since then that pins the whole stack from OS libraries on up. The need to have stream-compatibility across numpy versions for the purpose of reproducible research is much diminished. I think that we can relax the strict stream-compatibility policy to allow innovation without giving up much practically-usable stability. Let's compare with Python's policy: https://docs.python.org/3.6/library/random.html#notes-on-reproducibility """ Most of the random module's algorithms and seeding functions are subject to change across Python versions, but two aspects are guaranteed not to change: * If a new seeding method is added, then a backward compatible seeder will be offered. * The generator's random() method will continue to produce the same sequence when the compatible seeder is given the same seed. """ I propose that we adopt a similar policy. This would immediately resolve many of the issues blocking innovation in the random distributions. Improvements to the distributions could be made at the same rhythm as normal features. No version-selection API would be required as you select the version by installing the desired version of numpy. By default, everyone gets the latest, best versions of the sampling algorithms. Selecting a different core PRNG could be easily achieved as ng-numpy-randomstate does it, by instantiating different classes. The different incompatible ways to initialize different core PRNGs (with unique features like selectable streams and the like) are transparently handled: different classes have different constructors. There is no need to jam all options for all core PRNGs into a single constructor. I would add a few more of the simpler distribution methods to the list that *is* guaranteed to remain stream-compatible, probably `randint()` and `bytes()` and maybe a couple of others. I would appreciate input on the matter. The current API should remain available and working, but not necessarily with the same algorithms.
That is, for code like the following: prng = np.random.RandomState(12345) x = prng.normal(10.0, 3.0, size=100) `x` is still guaranteed to be 100 normal variates with the appropriate mean and standard deviation, but they might be computed by the ziggurat method from PCG-generated bytes (or whatever new default core PRNG we have). As an alternative, we may also want to leave `np.random.RandomState` entirely fixed in place as deprecated legacy code that is never updated. This would allow current unit tests that depend on the stream-compatibility that we previously promised to still pass until they decide to update. Development would move to a different class hierarchy with new names. I am personally not at all interested in preserving any stream compatibility for the `numpy.random.*` aliases or letting the user swap out the core PRNG for the global PRNG that underlies them. `np.random.seed()` should be discouraged (if not outright deprecated) in favor of explicitly passing around instances. In any case, we have a lot of different options to discuss if we decide to relax our stream-compatibility policy. At the moment, I'm not pushing for any particular changes to the code, just the policy in order to enable a more wide-ranging field of options that we have been able to work with so far. Thanks. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From madsipsen at gmail.com Fri Jan 19 09:56:48 2018 From: madsipsen at gmail.com (Mads Ipsen) Date: Fri, 19 Jan 2018 15:56:48 +0100 Subject: [Numpy-discussion] Behavior of rint? In-Reply-To: References: Message-ID: I am confused . Shouldn't rint round to nearest integer. http://en.cppreference.com/w/cpp/numeric/math/rint Regards Mads On Jan 19, 2018 15:50, "Matthew Brett" wrote: > Hi Chuck, > > Thanks for the replies, they are very helpful. > > On Fri, Jan 19, 2018 at 1:51 PM, Charles R Harris > wrote: > > > > > > On Fri, Jan 19, 2018 at 6:41 AM, Charles R Harris > > wrote: > >> > >> > >> > >> On Fri, Jan 19, 2018 at 3:30 AM, Matthew Brett > > >> wrote: > >>> > >>> Hi, > >>> > >>> Sorry for my confusion, but I noticed (as a result of the discussion > >>> here [1]) that np.rint and the fallback C function [2] seem to round > >>> to even. But - my impression was that C rint, by default, rounds down > >>> [3]. Is numpy rint not behaving the same way as the GNU C library > >>> rint? > >>> > >>> In [4]: np.rint(np.arange(0.5, 11)) > >>> Out[4]: array([ 0., 2., 2., 4., 4., 6., 6., 8., 8., 10., 10.]) > >>> > >>> In [5]: np.round(np.arange(0.5, 11)) > >>> Out[5]: array([ 0., 2., 2., 4., 4., 6., 6., 8., 8., 10., 10.]) > >> > >> > >> The GNU C documentation says that rint "round(s) x to an integer value > >> according to the current rounding mode." The rounding mode is > determined by > >> settings in the FPU control word. Numpy runs with it set to round to > even, > >> although, IIRC, there is a bug on windows where the library is not > setting > >> those bits correctly. > > > > > > Round to even is also the Python default rounding mode. > > Do you mean that it is Python setting the FPU control word? Or do we > set it? Do you happen to know where that is in the source? I did a > quick grep just now without anything obvious. > > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From robert.kern at gmail.com Fri Jan 19 10:03:39 2018 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 20 Jan 2018 00:03:39 +0900 Subject: [Numpy-discussion] Behavior of rint? In-Reply-To: References: Message-ID: On Fri, Jan 19, 2018 at 11:56 PM, Mads Ipsen wrote: > > I am confused . Shouldn't rint round to nearest integer. http://en.cppreference.com/w/cpp/numeric/math/rint It does. Matthew was asking specifically about its behavior when it is rounding numbers ending in .5, not the general case. Since that's the only real difference between rounding routines, we often recognize that from the context and speak in a somewhat elliptical way (i.e. just "round-to-even" instead of "round to the nearest integer and rounding numbers ending in .5 to the nearest even integer"). -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Jan 19 10:24:29 2018 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 19 Jan 2018 08:24:29 -0700 Subject: [Numpy-discussion] Behavior of rint? In-Reply-To: References: Message-ID: On Fri, Jan 19, 2018 at 7:48 AM, Matthew Brett wrote: > Hi Chuck, > > Thanks for the replies, they are very helpful. > > On Fri, Jan 19, 2018 at 1:51 PM, Charles R Harris > wrote: > > > > > > On Fri, Jan 19, 2018 at 6:41 AM, Charles R Harris > > wrote: > >> > >> > >> > >> On Fri, Jan 19, 2018 at 3:30 AM, Matthew Brett > > >> wrote: > >>> > >>> Hi, > >>> > >>> Sorry for my confusion, but I noticed (as a result of the discussion > >>> here [1]) that np.rint and the fallback C function [2] seem to round > >>> to even. But - my impression was that C rint, by default, rounds down > >>> [3]. Is numpy rint not behaving the same way as the GNU C library > >>> rint? > >>> > >>> In [4]: np.rint(np.arange(0.5, 11)) > >>> Out[4]: array([ 0., 2., 2., 4., 4., 6., 6., 8., 8., 10., 10.]) > >>> > >>> In [5]: np.round(np.arange(0.5, 11)) > >>> Out[5]: array([ 0., 2., 2., 4., 4., 6., 6., 8., 8., 10., 10.]) > >> > >> > >> The GNU C documentation says that rint "round(s) x to an integer value > >> according to the current rounding mode." The rounding mode is > determined by > >> settings in the FPU control word. Numpy runs with it set to round to > even, > >> although, IIRC, there is a bug on windows where the library is not > setting > >> those bits correctly. > > > > > > Round to even is also the Python default rounding mode. > > Do you mean that it is Python setting the FPU control word? Or do we > set it? Do you happen to know where that is in the source? I did a > quick grep just now without anything obvious. > I can't find official (PEP) documentation, but googling indicates that in Python 3, `round` rounds to even, and in Python 2 it rounds up. See also https://docs.python.org/3/whatsnew/3.0.html. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Fri Jan 19 10:27:26 2018 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 19 Jan 2018 15:27:26 +0000 Subject: [Numpy-discussion] Behavior of rint? In-Reply-To: References: Message-ID: Hi, On Fri, Jan 19, 2018 at 3:24 PM, Charles R Harris wrote: > > > On Fri, Jan 19, 2018 at 7:48 AM, Matthew Brett > wrote: >> >> Hi Chuck, >> >> Thanks for the replies, they are very helpful. 
>> >> On Fri, Jan 19, 2018 at 1:51 PM, Charles R Harris >> wrote: >> > >> > >> > On Fri, Jan 19, 2018 at 6:41 AM, Charles R Harris >> > wrote: >> >> >> >> >> >> >> >> On Fri, Jan 19, 2018 at 3:30 AM, Matthew Brett >> >> >> >> wrote: >> >>> >> >>> Hi, >> >>> >> >>> Sorry for my confusion, but I noticed (as a result of the discussion >> >>> here [1]) that np.rint and the fallback C function [2] seem to round >> >>> to even. But - my impression was that C rint, by default, rounds down >> >>> [3]. Is numpy rint not behaving the same way as the GNU C library >> >>> rint? >> >>> >> >>> In [4]: np.rint(np.arange(0.5, 11)) >> >>> Out[4]: array([ 0., 2., 2., 4., 4., 6., 6., 8., 8., 10., 10.]) >> >>> >> >>> In [5]: np.round(np.arange(0.5, 11)) >> >>> Out[5]: array([ 0., 2., 2., 4., 4., 6., 6., 8., 8., 10., 10.]) >> >> >> >> >> >> The GNU C documentation says that rint "round(s) x to an integer value >> >> according to the current rounding mode." The rounding mode is >> >> determined by >> >> settings in the FPU control word. Numpy runs with it set to round to >> >> even, >> >> although, IIRC, there is a bug on windows where the library is not >> >> setting >> >> those bits correctly. >> > >> > >> > Round to even is also the Python default rounding mode. >> >> Do you mean that it is Python setting the FPU control word? Or do we >> set it? Do you happen to know where that is in the source? I did a >> quick grep just now without anything obvious. > > > I can't find official (PEP) documentation, but googling indicates that in > Python 3, `round` rounds to even, and in Python 2 it rounds up. See also > https://docs.python.org/3/whatsnew/3.0.html. But I guess this could be the Python implementation of round, rather than rint and the FPU control word? I'm asking because the question arose about npy_rint at the C level ... Cheers, Matthew From charlesr.harris at gmail.com Fri Jan 19 10:29:38 2018 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 19 Jan 2018 08:29:38 -0700 Subject: [Numpy-discussion] Behavior of rint? In-Reply-To: References: Message-ID: On Fri, Jan 19, 2018 at 8:24 AM, Charles R Harris wrote: > > > On Fri, Jan 19, 2018 at 7:48 AM, Matthew Brett > wrote: > >> Hi Chuck, >> >> Thanks for the replies, they are very helpful. >> >> On Fri, Jan 19, 2018 at 1:51 PM, Charles R Harris >> wrote: >> > >> > >> > On Fri, Jan 19, 2018 at 6:41 AM, Charles R Harris >> > wrote: >> >> >> >> >> >> >> >> On Fri, Jan 19, 2018 at 3:30 AM, Matthew Brett < >> matthew.brett at gmail.com> >> >> wrote: >> >>> >> >>> Hi, >> >>> >> >>> Sorry for my confusion, but I noticed (as a result of the discussion >> >>> here [1]) that np.rint and the fallback C function [2] seem to round >> >>> to even. But - my impression was that C rint, by default, rounds down >> >>> [3]. Is numpy rint not behaving the same way as the GNU C library >> >>> rint? >> >>> >> >>> In [4]: np.rint(np.arange(0.5, 11)) >> >>> Out[4]: array([ 0., 2., 2., 4., 4., 6., 6., 8., 8., 10., 10.]) >> >>> >> >>> In [5]: np.round(np.arange(0.5, 11)) >> >>> Out[5]: array([ 0., 2., 2., 4., 4., 6., 6., 8., 8., 10., 10.]) >> >> >> >> >> >> The GNU C documentation says that rint "round(s) x to an integer value >> >> according to the current rounding mode." The rounding mode is >> determined by >> >> settings in the FPU control word. Numpy runs with it set to round to >> even, >> >> although, IIRC, there is a bug on windows where the library is not >> setting >> >> those bits correctly. 
>> > >> > >> > Round to even is also the Python default rounding mode. >> >> Do you mean that it is Python setting the FPU control word? Or do we >> set it? Do you happen to know where that is in the source? I did a >> quick grep just now without anything obvious. >> > > I can't find official (PEP) documentation, but googling indicates that in > Python 3, `round` rounds to even, and in Python 2 it rounds up. See also > https://docs.python.org/3/whatsnew/3.0.html. > Note that this applies to floating point operations in general where there are usually a couple of extra guard bits for the mantissa, and the mantissa is the result of rounding to even. The reason for this choice is that always rounding in the same direction leads to a small consistent bias, whereas rounding to even effectively randomizes the error. One grows order N, the other as the square root. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Jan 19 10:45:36 2018 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 19 Jan 2018 08:45:36 -0700 Subject: [Numpy-discussion] Behavior of rint? In-Reply-To: References: Message-ID: On Fri, Jan 19, 2018 at 8:27 AM, Matthew Brett wrote: > Hi, > > On Fri, Jan 19, 2018 at 3:24 PM, Charles R Harris > wrote: > > > > > > On Fri, Jan 19, 2018 at 7:48 AM, Matthew Brett > > wrote: > >> > >> Hi Chuck, > >> > >> Thanks for the replies, they are very helpful. > >> > >> On Fri, Jan 19, 2018 at 1:51 PM, Charles R Harris > >> wrote: > >> > > >> > > >> > On Fri, Jan 19, 2018 at 6:41 AM, Charles R Harris > >> > wrote: > >> >> > >> >> > >> >> > >> >> On Fri, Jan 19, 2018 at 3:30 AM, Matthew Brett > >> >> > >> >> wrote: > >> >>> > >> >>> Hi, > >> >>> > >> >>> Sorry for my confusion, but I noticed (as a result of the discussion > >> >>> here [1]) that np.rint and the fallback C function [2] seem to round > >> >>> to even. But - my impression was that C rint, by default, rounds > down > >> >>> [3]. Is numpy rint not behaving the same way as the GNU C library > >> >>> rint? > >> >>> > >> >>> In [4]: np.rint(np.arange(0.5, 11)) > >> >>> Out[4]: array([ 0., 2., 2., 4., 4., 6., 6., 8., 8., 10., > 10.]) > >> >>> > >> >>> In [5]: np.round(np.arange(0.5, 11)) > >> >>> Out[5]: array([ 0., 2., 2., 4., 4., 6., 6., 8., 8., 10., > 10.]) > >> >> > >> >> > >> >> The GNU C documentation says that rint "round(s) x to an integer > value > >> >> according to the current rounding mode." The rounding mode is > >> >> determined by > >> >> settings in the FPU control word. Numpy runs with it set to round to > >> >> even, > >> >> although, IIRC, there is a bug on windows where the library is not > >> >> setting > >> >> those bits correctly. > >> > > >> > > >> > Round to even is also the Python default rounding mode. > >> > >> Do you mean that it is Python setting the FPU control word? Or do we > >> set it? Do you happen to know where that is in the source? I did a > >> quick grep just now without anything obvious. > > > > > > I can't find official (PEP) documentation, but googling indicates that in > > Python 3, `round` rounds to even, and in Python 2 it rounds up. See also > > https://docs.python.org/3/whatsnew/3.0.html. > > But I guess this could be the Python implementation of round, rather > than rint and the FPU control word? I'm asking because the question > arose about npy_rint at the C level ... > I'm pretty sure Python sets the FPU control word. Note that Python itself doesn't have a public interface for setting it, nor does Java. 
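A quick way to see the effect from the interpreter (a small sketch; both should send the .5 cases to the nearest even integer):

```
import numpy as np

# Halfway cases go to the nearest *even* integer in both cases.
print(np.rint([0.5, 1.5, 2.5, 3.5]))             # [0. 2. 2. 4.]
print([round(x) for x in (0.5, 1.5, 2.5, 3.5)])  # [0, 2, 2, 4] on Python 3
```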
The GNU library documentation has the following: Round to nearest. This is the default mode. It should be used unless there is a specific need for one of the others. In this mode results are rounded to the nearest representable value. If the result is midway between two representable values, the even representable is chosen. *Even* here means the lowest-order bit is zero. This rounding mode prevents statistical bias and guarantees numeric stability: round-off errors in a lengthy calculation will remain smaller than half of FLT_EPSILON. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Jan 19 10:50:55 2018 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 19 Jan 2018 08:50:55 -0700 Subject: [Numpy-discussion] Behavior of rint? In-Reply-To: References: Message-ID: On Fri, Jan 19, 2018 at 8:45 AM, Charles R Harris wrote: > > > On Fri, Jan 19, 2018 at 8:27 AM, Matthew Brett > wrote: > >> Hi, >> >> On Fri, Jan 19, 2018 at 3:24 PM, Charles R Harris >> wrote: >> > >> > >> > On Fri, Jan 19, 2018 at 7:48 AM, Matthew Brett > > >> > wrote: >> >> >> >> Hi Chuck, >> >> >> >> Thanks for the replies, they are very helpful. >> >> >> >> On Fri, Jan 19, 2018 at 1:51 PM, Charles R Harris >> >> wrote: >> >> > >> >> > >> >> > On Fri, Jan 19, 2018 at 6:41 AM, Charles R Harris >> >> > wrote: >> >> >> >> >> >> >> >> >> >> >> >> On Fri, Jan 19, 2018 at 3:30 AM, Matthew Brett >> >> >> >> >> >> wrote: >> >> >>> >> >> >>> Hi, >> >> >>> >> >> >>> Sorry for my confusion, but I noticed (as a result of the >> discussion >> >> >>> here [1]) that np.rint and the fallback C function [2] seem to >> round >> >> >>> to even. But - my impression was that C rint, by default, rounds >> down >> >> >>> [3]. Is numpy rint not behaving the same way as the GNU C library >> >> >>> rint? >> >> >>> >> >> >>> In [4]: np.rint(np.arange(0.5, 11)) >> >> >>> Out[4]: array([ 0., 2., 2., 4., 4., 6., 6., 8., 8., 10., >> 10.]) >> >> >>> >> >> >>> In [5]: np.round(np.arange(0.5, 11)) >> >> >>> Out[5]: array([ 0., 2., 2., 4., 4., 6., 6., 8., 8., 10., >> 10.]) >> >> >> >> >> >> >> >> >> The GNU C documentation says that rint "round(s) x to an integer >> value >> >> >> according to the current rounding mode." The rounding mode is >> >> >> determined by >> >> >> settings in the FPU control word. Numpy runs with it set to round to >> >> >> even, >> >> >> although, IIRC, there is a bug on windows where the library is not >> >> >> setting >> >> >> those bits correctly. >> >> > >> >> > >> >> > Round to even is also the Python default rounding mode. >> >> >> >> Do you mean that it is Python setting the FPU control word? Or do we >> >> set it? Do you happen to know where that is in the source? I did a >> >> quick grep just now without anything obvious. >> > >> > >> > I can't find official (PEP) documentation, but googling indicates that >> in >> > Python 3, `round` rounds to even, and in Python 2 it rounds up. See also >> > https://docs.python.org/3/whatsnew/3.0.html. >> >> But I guess this could be the Python implementation of round, rather >> than rint and the FPU control word? I'm asking because the question >> arose about npy_rint at the C level ... >> > > I'm pretty sure Python sets the FPU control word. Note that Python itself > doesn't have a public interface for setting it, nor does Java. The GNU > library documentation has the following: > > Round to nearest. > This is the default mode. It should be used unless there is a specific > need for one of the others. 
In this mode results are rounded to the nearest > representable value. If the result is midway between two representable > values, the even representable is chosen. *Even* here means the > lowest-order bit is zero. This rounding mode prevents statistical bias and > guarantees numeric stability: round-off errors in a lengthy calculation > will remain smaller than half of FLT_EPSILON. > > A classic reference for rounding is TAOCP (section 4.2.2 in the Third Edition). I read that many years ago, so don't recall the details. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Fri Jan 19 12:27:10 2018 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 19 Jan 2018 12:27:10 -0500 Subject: [Numpy-discussion] Moving NumPy's PRNG Forward In-Reply-To: References: Message-ID: On Fri, Jan 19, 2018 at 9:55 AM, Robert Kern wrote: > tl;dr: I think that our stream-compatibility policy is holding us back, > and I think we can come up with a way forward with a new policy that will > allow us to innovate without seriously compromising our reliability. > > To recap, our current policy for numpy.random is that we guarantee that > the stream of random numbers from any of the methods of a seeded > `RandomState` does not change from version to version (at least on the same > hardware, OS, compiler, etc.), except in the case where we are fixing > correctness bugs. That is, for code like this: > > prng = np.random.RandomState(12345) > x = prng.normal(10.0, 3.0, size=100) > > `x` will be the same exact floats regardless of what version of numpy was > installed. > > There seems to be a lot of pent-up motivation to improve on the random > number generation, in particular the distributions, that has been blocked > by our policy. I think we've lost a few potential first-time contributors > that have run up against this wall. We have been pondering ways to allow > for adding new core PRNGs and improve the distribution methods while > maintaining stream-compatibility for existing code. Kevin Sheppard, in > particular, has been working hard to implement new core PRNGs with a common > API. > > https://github.com/bashtage/ng-numpy-randomstate > > Kevin has also been working to implement the several proposals that have > been made to select different versions of distribution implementations. In > particular, one idea is to pass something to the RandomState constructor to > select a specific version of distributions (or switch out the core PRNG). > Note that to satisfy the policy, the simplest method of seeding a > RandomState will always give you the oldest version: what we have now. > > Kevin has recently come to the conclusion that it's not technically > feasible to add the version-selection at all if we keep the > stream-compatibility policy. > > https://github.com/numpy/numpy/pull/10124#issuecomment-350876221 > > I would argue that our current policy isn't providing the value that it > claims to. In the first place, there are substantial holes in the > reproducibility of streams across platforms. All of the integers (internal > and external) are C `long`s, so integer overflows can cause variable > streams if you use any of the rejection algorithms involving integers > across Windows and Linux. Plain-old floating point arithmetic differences > between platforms can cause similar issues (though rarer). Our policy of > fixing bugs interferes with strict reproducibility. 
And our changes to > non-random routines can interfere with the ability to reproduce the results > of the whole software, independent of the PRNG stream. The multivariate > normal implementation is even more vulnerable, as it uses `np.linalg` > routines that may be affected by which LAPACK library numpy is built > against much less changes that we might make to them in the normal course > of development. > > At the time I established the policy (2008-9), there was significantly > less tooling around for pinning versions of software. The > PyPI/pip/setuptools ecosystem was in its infancy, VMs were slow cumbersome > beasts mostly used to run Windows programs unavailable on Linux, and > containerization a la Docker was merely a dream. A lot of resources have > been put into reproducible research since then that pins the whole stack > from OS libraries on up. The need to have stream-compatibility across numpy > versions for the purpose of reproducible research is much diminished. > > I think that we can relax the strict stream-compatibility policy to allow > innovation without giving up much practically-usable stability. Let's > compare with Python's policy: > > https://docs.python.org/3.6/library/random.html#notes-on-reproducibility > > """ > Most of the random module?s algorithms and seeding functions are subject > to change across Python versions, but two aspects are guaranteed not to > change: > > * If a new seeding method is added, then a backward compatible seeder will > be offered. > * The generator?s random() method will continue to produce the same > sequence when the compatible seeder is given the same seed. > """ > > I propose that we adopt a similar policy. This would immediately resolve > many of the issues blocking innovation in the random distributions. > Improvements to the distributions could be made at the same rhythm as > normal features. No version-selection API would be required as you select > the version by installing the desired version of numpy. By default, > everyone gets the latest, best versions of the sampling algorithms. > Selecting a different core PRNG could be easily achieved as > ng-numpy-randomstate does it, by instantiating different classes. The > different incompatible ways to initialize different core PRNGs (with unique > features like selectable streams and the like) are transparently handled: > different classes have different constructors. There is no need to jam all > options for all core PRNGs into a single constructor. > > I would add a few more of the simpler distribution methods to the list > that *is* guaranteed to remain stream-compatible, probably `randint()` and > `bytes()` and maybe a couple of others. I would appreciate input on the > matter. > > The current API should remain available and working, but not necessarily > with the same algorithms. That is, for code like the following: > > prng = np.random.RandomState(12345) > x = prng.normal(10.0, 3.0, size=100) > > `x` is still guaranteed to be 100 normal variates with the appropriate > mean and standard deviation, but they might be computed by the ziggurat > method from PCG-generated bytes (or whatever new default core PRNG we have). > > As an alternative, we may also want to leave `np.random.RandomState` > entirely fixed in place as deprecated legacy code that is never updated. > This would allow current unit tests that depend on the stream-compatibility > that we previously promised to still pass until they decide to update. 
> Development would move to a different class hierarchy with new names. > > I am personally not at all interested in preserving any stream > compatibility for the `numpy.random.*` aliases or letting the user swap out > the core PRNG for the global PRNG that underlies them. `np.random.seed()` > should be discouraged (if not outright deprecated) in favor of explicitly > passing around instances. > > In any case, we have a lot of different options to discuss if we decide to > relax our stream-compatibility policy. At the moment, I'm not pushing for > any particular changes to the code, just the policy in order to enable a > more wide-ranging field of options that we have been able to work with so > far. > I'm not sure I fully understand Is the proposal to drop stream-backward compatibility completely for the future or just a one time change? > No version-selection API would be required as you select the version by installing the desired version of numpy. That's not useful if we want to have unit tests that run in the same way across numpy versions. There are many unit tests that rely on fixed streams and have hard coded results that rely on specific numbers (up to floating point, numerical noise). Giving up stream compatibility would essentially kill using np.random for these unit tests. Similar, reproducibility from another user, e.g. in notebooks, would break without stream compatibility across numpy versions. One possibility is to keep the current stream-compatible np.random version and maintain it in future for those usecases, and add a new "high-performance" version with the new features. Josef > > Thanks. > > -- > Robert Kern > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Fri Jan 19 12:57:12 2018 From: shoyer at gmail.com (Stephan Hoyer) Date: Fri, 19 Jan 2018 17:57:12 +0000 Subject: [Numpy-discussion] Moving NumPy's PRNG Forward In-Reply-To: References: Message-ID: On Fri, Jan 19, 2018 at 6:57 AM Robert Kern wrote: > As an alternative, we may also want to leave `np.random.RandomState` > entirely fixed in place as deprecated legacy code that is never updated. > This would allow current unit tests that depend on the stream-compatibility > that we previously promised to still pass until they decide to update. > Development would move to a different class hierarchy with new names. > I like this alternative, but I would hesitate to call it "deprecated". Users who care about exact reproducibility across NumPy versions (e.g., for testing) are probably less concerned about performance, and could continue to use it. New random number generator classes could implement their own guarantees about compatibility across their methods. I am personally not at all interested in preserving any stream > compatibility for the `numpy.random.*` aliases or letting the user swap out > the core PRNG for the global PRNG that underlies them. `np.random.seed()` > should be discouraged (if not outright deprecated) in favor of explicitly > passing around instances. > I agree that np.random.seed() should be discouraged, but it feels very late in NumPy's development to remove it. If we do alter the random number streams for numpy.random.*, it seems that we should probably issue a warning (at least for a several major versions) whenever numpy.random.seed() is called. 
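Concretely, that could be as simple as the rough sketch below (the `_legacy_global` name is made up here, just to stand in for whatever object ends up holding the module-level state):

import warnings

def seed(s=None):
    # hypothetical wrapper around the existing global seeding
    warnings.warn("np.random.seed is discouraged; create and pass around "
                  "an explicit generator instance instead", FutureWarning)
    return _legacy_global.seed(s)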
This could get pretty noisy. I guess that's all the more incentive to switch to random state objects. -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.e.creasey.00 at googlemail.com Fri Jan 19 16:18:40 2018 From: p.e.creasey.00 at googlemail.com (Peter Creasey) Date: Fri, 19 Jan 2018 13:18:40 -0800 Subject: [Numpy-discussion] Moving NumPy's PRNG Forward Message-ID: > Date: Fri, 19 Jan 2018 23:55:57 +0900 > From: Robert Kern > > tl;dr: I think that our stream-compatibility policy is holding us back, and > I think we can come up with a way forward with a new policy that will allow > us to innovate without seriously compromising our reliability. > > I propose that we adopt a similar policy. This would immediately resolve > many of the issues blocking innovation in the random distributions. > Improvements to the distributions could be made at the same rhythm as > normal features. No version-selection API would be required as you select > the version by installing the desired version of numpy. By default, > everyone gets the latest, best versions of the sampling algorithms. > Selecting a different core PRNG could be easily achieved as > ng-numpy-randomstate does it, by instantiating different classes. The > different incompatible ways to initialize different core PRNGs (with unique > features like selectable streams and the like) are transparently handled: > different classes have different constructors. There is no need to jam all > options for all core PRNGs into a single constructor. +1 I think I have a general comment that random streams are probably over-used in testing, in particular: 1. If you need a small amount (100s) of random values you should just spit them out to code/files for repeatability and, 2. If you need a large amount of random data then maybe you want to rethink your testing strategy. In particular if you need millions of points to hit a few edge cases then your code is significantly more mature (low failure rate) than your test suite (inefficient), and you could probably target your tests a bit more (I'm as guilty of this as others). Having said that I don't think NumPy today is in a position to fight the inertia of 2 - numpy.random has given everyone easy tools to make large and (nearly always) reproducible sequences, and they get used a lot. On the other hand staying on top of performance requires an active code-base, and for a reproducible PRNG almost any change is a breaking-change. Allowing non-backwards-compatible streams concurrently with the old style seems the logical way forward. So +1. Best, Peter From robert.kern at gmail.com Fri Jan 19 17:34:12 2018 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 20 Jan 2018 07:34:12 +0900 Subject: [Numpy-discussion] Moving NumPy's PRNG Forward In-Reply-To: References: Message-ID: On Sat, Jan 20, 2018 at 2:27 AM, wrote: > I'm not sure I fully understand > Is the proposal to drop stream-backward compatibility completely for the future or just a one time change? For all future. > > No version-selection API would be required as you select the version by installing the desired version of numpy. > > That's not useful if we want to have unit tests that run in the same way across numpy versions. > > There are many unit tests that rely on fixed streams and have hard coded results that rely on specific numbers (up to floating point, numerical noise). > Giving up stream compatibility would essentially kill using np.random for these unit tests. This is a use case that I am sensitive to. 
However, it should be noted that relying on the exact stream for unit tests makes you vulnerable to platform differences. That's something that we've never guaranteed (because we can't). That said, there are some of the simpler distributions that are more robust to such things, and those are fairly typical in unit tests. As I mentioned, I am open to a small set of methods that we do guarantee stream-compatibility for. I think that unit tests are the obvious use case that should determine what that set is. Unit tests rarely need `noncentral_chisquare()`, for instance. I'd also be willing to make the API a little clunkier in order to maintain the stable set of methods. For example, two methods that are common in unit testing are `normal()` and `choice()`, but those have been the target of the most attempted innovation. I'd be willing to leave them alone while providing other methods that do the same thing but are allowed to innovate. > Similar, reproducibility from another user, e.g. in notebooks, would break without stream compatibility across numpy versions. That is the reproducible-research use case that I discussed already. I argued that the stability that our policy actually provides is rather more muted than what it seems on its face. > One possibility is to keep the current stream-compatible np.random version and maintain it in future for those usecases, and add a new "high-performance" version with the new features. That is one of the alternatives I raised. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Fri Jan 19 17:50:31 2018 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 20 Jan 2018 07:50:31 +0900 Subject: [Numpy-discussion] Moving NumPy's PRNG Forward In-Reply-To: References: Message-ID: On Sat, Jan 20, 2018 at 2:57 AM, Stephan Hoyer wrote: > > On Fri, Jan 19, 2018 at 6:57 AM Robert Kern wrote: >> >> As an alternative, we may also want to leave `np.random.RandomState` entirely fixed in place as deprecated legacy code that is never updated. This would allow current unit tests that depend on the stream-compatibility that we previously promised to still pass until they decide to update. Development would move to a different class hierarchy with new names. > > I like this alternative, but I would hesitate to call it "deprecated". Users who care about exact reproducibility across NumPy versions (e.g., for testing) are probably less concerned about performance, and could continue to use it. I would be careful about that because quite a few of the methods are not stable across platforms, even on the same numpy version. If you want to declare that some part of the np.random API is stable for such purposes, we need to curate a subset of the methods anyways. As a one-off thing, this alternative proposes to declare that all of `np.random.RandomState` is stable across versions, but we can't guarantee that all of it is unconditionally stable for exact reproducibility. We can make a guarantee for a smaller subset of methods, though. To your point, though, if we freeze the current `RandomState`, we can make that guarantee for a larger subset of the methods than we would for the new API. So I guess I talked myself around to your view, but I would be a bit more cautious in how we advertise the stability of the frozen `RandomState` API. > New random number generator classes could implement their own guarantees about compatibility across their methods. 
> >> I am personally not at all interested in preserving any stream compatibility for the `numpy.random.*` aliases or letting the user swap out the core PRNG for the global PRNG that underlies them. `np.random.seed()` should be discouraged (if not outright deprecated) in favor of explicitly passing around instances. > > I agree that np.random.seed() should be discouraged, but it feels very late in NumPy's development to remove it. > > If we do alter the random number streams for numpy.random.*, it seems that we should probably issue a warning (at least for a several major versions) whenever numpy.random.seed() is called. This could get pretty noisy. I guess that's all the more incentive to switch to random state objects. True. I like that. The reason I think that it might be worth an exception is that it has been a moral hazard. People aren't just writing correct but improvable code (relying on `np.random.*` methods but seeding exactly once at the start of their single-threaded simulation) but they've been writing incorrect and easily-broken code. For example: np.random.seed(seed) np.random.shuffle(x_train) np.random.seed(seed) np.random.shuffle(labels_train) -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Fri Jan 19 18:12:56 2018 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 19 Jan 2018 15:12:56 -0800 Subject: [Numpy-discussion] Moving NumPy's PRNG Forward In-Reply-To: References: Message-ID: On Fri, Jan 19, 2018 at 6:55 AM, Robert Kern wrote: [...] > There seems to be a lot of pent-up motivation to improve on the random > number generation, in particular the distributions, that has been blocked by > our policy. I think we've lost a few potential first-time contributors that > have run up against this wall. We have been pondering ways to allow for > adding new core PRNGs and improve the distribution methods while maintaining > stream-compatibility for existing code. Kevin Sheppard, in particular, has > been working hard to implement new core PRNGs with a common API. > > https://github.com/bashtage/ng-numpy-randomstate > > Kevin has also been working to implement the several proposals that have > been made to select different versions of distribution implementations. In > particular, one idea is to pass something to the RandomState constructor to > select a specific version of distributions (or switch out the core PRNG). > Note that to satisfy the policy, the simplest method of seeding a > RandomState will always give you the oldest version: what we have now. > > Kevin has recently come to the conclusion that it's not technically feasible > to add the version-selection at all if we keep the stream-compatibility > policy. > > https://github.com/numpy/numpy/pull/10124#issuecomment-350876221 > > I would argue that our current policy isn't providing the value that it > claims to. I agree that relaxing our policy would be better than the status quo. Before making any decisions, though, I'd like to make sure we understand the alternatives and their trade-offs. 
Specifically, I think the main alternative would be the following approach to versioning: 1) make RandomState's state be a tuple (underlying RNG algorithm, underlying RNG state, distribution version) 2) zero-argument initialization/seeding, like RandomState() or rstate.seed(), sets the state to: (our recommended RNG algorithm, os.urandom(...), version=LATEST_VERSION) 3) for backcompat, single-argument seeding like RandomState(123) or rstate.seed(123), sets the state to: (mersenne twister, expand_mt_seed(123), version=0) 4) also allow seeding to explicitly control all the parameters, like RandomState(PCG_XSL_RR(123), version=12) or whatever 5) the distribution functions are implemented like: def normal(*args, **kwargs): if self.version < 3: return self._normal_box_muller(*args, **kwargs) elif self.version < 8: return self._normal_ziggurat_v1(*args, **kwargs) else: # version >= 8 return self._normal_ziggurat_v2(*args, **kwargs) Advantages: fully backwards compatible; preserves the compatibility guarantee (such as it is); users who use the default seeding automatically get the highest speed and quality Disadvantages: users who specify seeds explicitly get old/slow distributions (but of course that's the point of compatibility); we have to keep the old distribution code around forever (but this is not too hard; it just sits in some function and we never touch it). Kevin, is this the version that you think is non-viable? Is the above a good description of the advantages/disadvantages? -n -- Nathaniel J. Smith -- https://vorpus.org From robert.kern at gmail.com Sat Jan 20 07:07:39 2018 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 20 Jan 2018 21:07:39 +0900 Subject: [Numpy-discussion] Moving NumPy's PRNG Forward In-Reply-To: References: Message-ID: On Sat, Jan 20, 2018 at 7:34 AM, Robert Kern wrote: > > On Sat, Jan 20, 2018 at 2:27 AM, wrote: > > > I'm not sure I fully understand > > Is the proposal to drop stream-backward compatibility completely for the future or just a one time change? > > For all future. To color this a little, while we'll have a burst of activity for the first round to fix all of the mistakes I baked in early, I do not expect the pace of change after that to be very large. While we are going to relax the strictness of the policy, we should carefully weigh the benefit of a change against the pain of changing the stream, and explore ways to implement the improvement that would retain the stream. We should take more care making such changes in `np.random` than in other parts of numpy. When we get to drafting the NEP, I'd be happy to include language to this effect. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndbecker2 at gmail.com Sat Jan 20 07:19:59 2018 From: ndbecker2 at gmail.com (Neal Becker) Date: Sat, 20 Jan 2018 12:19:59 +0000 Subject: [Numpy-discussion] Moving NumPy's PRNG Forward In-Reply-To: References: Message-ID: On Fri, Jan 19, 2018 at 6:13 PM Nathaniel Smith wrote: > ... > I agree that relaxing our policy would be better than the status quo. > Before making any decisions, though, I'd like to make sure we > understand the alternatives and their trade-offs. 
Specifically, I > think the main alternative would be the following approach to > versioning: > > 1) make RandomState's state be a tuple (underlying RNG algorithm, > underlying RNG state, distribution version) > 2) zero-argument initialization/seeding, like RandomState() or > rstate.seed(), sets the state to: (our recommended RNG algorithm, > os.urandom(...), version=LATEST_VERSION) > 3) for backcompat, single-argument seeding like RandomState(123) or > rstate.seed(123), sets the state to: (mersenne twister, > expand_mt_seed(123), version=0) > 4) also allow seeding to explicitly control all the parameters, like > RandomState(PCG_XSL_RR(123), version=12) or whatever > 5) the distribution functions are implemented like: > > def normal(*args, **kwargs): > if self.version < 3: > return self._normal_box_muller(*args, **kwargs) > elif self.version < 8: > return self._normal_ziggurat_v1(*args, **kwargs) > else: # version >= 8 > return self._normal_ziggurat_v2(*args, **kwargs) > I like this suggestion, but I suggest to modify it so that a zero-argument initialization or 1-argument seeding will initialize to a global value, which would default to backcompat, but could be changed. Then my old code would by default produce the same old results, but adding 1 line at the top switches to faster code if I want. -------------- next part -------------- An HTML attachment was scrubbed... URL: From allanhaldane at gmail.com Sun Jan 21 21:48:37 2018 From: allanhaldane at gmail.com (Allan Haldane) Date: Sun, 21 Jan 2018 21:48:37 -0500 Subject: [Numpy-discussion] Multiple-field indexing: view vs copy in 1.14+ Message-ID: <196d4e06-5d46-aa58-cb8f-7a033c30aaed@gmail.com> Hello all, We are making a decision (again) about what to do about the behavior of multiple-field indexing of structured arrays: Should it return a view or a copy, and on what release schedule? As a reminder, this refers to operations like (1.13 behavior): >>> a = np.zeros(3, dtype=[('a', 'i4'), ('b', 'i4'), ('c', 'f4')]) >>> a[['a', 'c']] array([(0, 0.), (0, 0.), (0, 0.)], dtype=[('a', ' References: <196d4e06-5d46-aa58-cb8f-7a033c30aaed@gmail.com> Message-ID: Hi Allan, I think on the consistency argument is perhaps the most important: views are very powerful and in many ways one *counts* on them happening, especially in working with large arrays. They really should be used everywhere it is possible. In this respect, I think one has to weigh breakage of some code against time spent solving unexpected bugs because a view is *not* taken (the change in MaskedArray to ensure the mask is always viewed instead of copied is another example of trying to move as much as possible in that direction). Anyway, in favour of views. All the best, Marten From josef.pktd at gmail.com Mon Jan 22 10:53:13 2018 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 22 Jan 2018 10:53:13 -0500 Subject: [Numpy-discussion] Multiple-field indexing: view vs copy in 1.14+ In-Reply-To: <196d4e06-5d46-aa58-cb8f-7a033c30aaed@gmail.com> References: <196d4e06-5d46-aa58-cb8f-7a033c30aaed@gmail.com> Message-ID: On Sun, Jan 21, 2018 at 9:48 PM, Allan Haldane wrote: > Hello all, > > We are making a decision (again) about what to do about the > behavior of multiple-field indexing of structured arrays: Should > it return a view or a copy, and on what release schedule? 
> > As a reminder, this refers to operations like (1.13 behavior): > > >>> a = np.zeros(3, dtype=[('a', 'i4'), ('b', 'i4'), ('c', 'f4')]) > >>> a[['a', 'c']] > array([(0, 0.), (0, 0.), (0, 0.)], > dtype=[('a', ' > In numpy 1.14.0 we made this return a view instead of a copy, but > downstream test failures suggest we reconsider. In our current > implementation for 1.14.1, we have reverted this change, but > still plan to go through with it in 1.15. > > See here for our discussion the problem and solutions: > https://github.com/numpy/numpy/pull/10411 > > The two main options we have discussed are either to try to make > the change in 1.15, or never make the change at all and always > return a copy. > > Here are some pros and cons: > > Pros (change to view in 1.15) > ============================= > > * Views are useful and convenient. Other forms of indexing also > often return views so this is more consistent. > * This change has been planned since numpy 1.7 in 2009, > and there have been visible FutureWarnings about it since > then. Anyone whose code will break should have seen the > warnings. It has been extensively warned about in recent > release notes. > * Past discussions have supported the change. See my comment in > the PR with many links to them and to other history. > * Users have requested the change on the list. > * Possibly a majority of the reported code failures were not > actually caused by the change, but by another bug (#8100) > involving np.load/np.save which this change exposed. If we > push it off to 1.15, we will have time to fix this other bug. > (There were no FutureWarnings for this breakage, of course). > * The code that really will break is of the form > a[['a', 'c']].view('i8') > because the returned itemsize is different. This has > raised FutureWarnings since numpy 1.7, and no users reported > failures due to this change. In the PR we still try to > mitigate this breakage by introducing a new method > `pack_fields`, which converts the result into the 1.13 form, > so that > np.pack_fields(a[['a', 'c']]).view('i8') > will work. > > > Cons (keep returning a copy) > ============================ > > * The extra convenience is not really that much, and fancy > indexing also returns a copy instead of a view, so there is > a precedent there. > * We want to minimize compatibility breaks with old behavior. > We've had a fair amount of discussion and complaints about > how we break things in general. > * We have lived with a "copy" for 8 years now. At some point the > behavior gets set in stone for compatibility reasons. > * Users have written to the list and github about their code > breaking in 1.14.0. As far as I am aware, they all refer > to the #8100 problem. > * If a new function `pack_fields` is needed to guard against > mishaps with the view behavior, that seems like a sign that > keeping the copy behavior is the best option from an API > perspective. > > My initial vote is go with the change in 1.15: The "view" code > that will ultimately break (not the code related to #8100) has > been sending FutureWarnings for many years, and I am not aware of > any user complaints involving it: All the complaints so far > would be fixed with #8100 in 1.15. > > (Note based on a linked mailing list thread, 2012 might be the last time I looked more closely at structured dtypes. So some of what I understand might be outdated.) views on structured dtypes are very important, but viewing them as standard arrays with standard dtypes is the main part that I had used. 
Essentially structured dtypes are useless for any computation, e.g. just some simple reduce operation. To work with them we need a standard view. I think the usecase that fails in statsmodels (except there is no test failure anymore because we switched to using pandas in the unit test) cls.confint_res = cls.results[['acvar_lb','acvar_ub']].view((float, > 2)) E ValueError: Changing the dtype to a subarray type is only supported if the total itemsize is unchanged This is similar to the above example a[['a', 'c']].view('i8') but it doesn't try to combine fields. In many examples where I used structured dtypes a long time ago, switched between consistent views as either a standard array of subsets or as .structured dtypes. For this usecase it wouldn't matter whether a[['a', 'c']] returns a view or copy, as long as we can get the second view that is consistent with the selected part of the memory. This would also be independent of whether numpy pads internally and adjusts the strides if possible or not. >>> np.__version__ '1.11.2' >>> a = np.ones(5, dtype=[('a', 'i8'), ('b', 'f8'), ('c', 'f8')]) >>> a array([(1, 1.0, 1.0), (1, 1.0, 1.0), (1, 1.0, 1.0), (1, 1.0, 1.0), (1, 1.0, 1.0)], dtype=[('a', '>> a.mean(0) Traceback (most recent call last): File "", line 1, in a.mean(0) File "C:\...\python-3.4.4.amd64\lib\site-packages\numpy\core\_methods.py", line 65, in _mean ret = umr_sum(arr, axis, dtype, out, keepdims) TypeError: cannot perform reduce with flexible type >>> a[['b', 'c']].mean(0) Traceback (most recent call last): File "", line 1, in a[['b', 'c']].mean(0) File "C:\...\python-3.4.4.amd64\lib\site-packages\numpy\core\_methods.py", line 65, in _mean ret = umr_sum(arr, axis, dtype, out, keepdims) TypeError: cannot perform reduce with flexible type >>> a[['b', 'c']].view(('f8', 2)).mean(0) array([ 1., 1.]) >>> a[['b', 'c']].view(('f8', 2)).dtype dtype('float64') Aside The plan is that statsmodels will drop all usage and support for rec_arays/structured dtypes in the following release (0.10). Then structured dtypes are free (from our perspective) to provide low level struct support instead of pretending to be dataframe_like. Josef > Feel free to also discuss the related proposed change, to make > np.diag return a view instead of a copy. That change has > not been implemented yet, only proposed. > Cheers, > Allan > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon Jan 22 11:07:52 2018 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 22 Jan 2018 11:07:52 -0500 Subject: [Numpy-discussion] Multiple-field indexing: view vs copy in 1.14+ In-Reply-To: References: <196d4e06-5d46-aa58-cb8f-7a033c30aaed@gmail.com> Message-ID: On Mon, Jan 22, 2018 at 10:53 AM, wrote: > > > On Sun, Jan 21, 2018 at 9:48 PM, Allan Haldane > wrote: > >> Hello all, >> >> We are making a decision (again) about what to do about the >> behavior of multiple-field indexing of structured arrays: Should >> it return a view or a copy, and on what release schedule? 
>> >> As a reminder, this refers to operations like (1.13 behavior): >> >> >>> a = np.zeros(3, dtype=[('a', 'i4'), ('b', 'i4'), ('c', 'f4')]) >> >>> a[['a', 'c']] >> array([(0, 0.), (0, 0.), (0, 0.)], >> dtype=[('a', '> >> In numpy 1.14.0 we made this return a view instead of a copy, but >> downstream test failures suggest we reconsider. In our current >> implementation for 1.14.1, we have reverted this change, but >> still plan to go through with it in 1.15. >> >> See here for our discussion the problem and solutions: >> https://github.com/numpy/numpy/pull/10411 >> >> The two main options we have discussed are either to try to make >> the change in 1.15, or never make the change at all and always >> return a copy. >> >> Here are some pros and cons: >> >> Pros (change to view in 1.15) >> ============================= >> >> * Views are useful and convenient. Other forms of indexing also >> often return views so this is more consistent. >> * This change has been planned since numpy 1.7 in 2009, >> and there have been visible FutureWarnings about it since >> then. Anyone whose code will break should have seen the >> warnings. It has been extensively warned about in recent >> release notes. >> * Past discussions have supported the change. See my comment in >> the PR with many links to them and to other history. >> * Users have requested the change on the list. >> * Possibly a majority of the reported code failures were not >> actually caused by the change, but by another bug (#8100) >> involving np.load/np.save which this change exposed. If we >> push it off to 1.15, we will have time to fix this other bug. >> (There were no FutureWarnings for this breakage, of course). >> * The code that really will break is of the form >> a[['a', 'c']].view('i8') >> because the returned itemsize is different. This has >> raised FutureWarnings since numpy 1.7, and no users reported >> failures due to this change. In the PR we still try to >> mitigate this breakage by introducing a new method >> `pack_fields`, which converts the result into the 1.13 form, >> so that >> np.pack_fields(a[['a', 'c']]).view('i8') >> will work. >> >> >> Cons (keep returning a copy) >> ============================ >> >> * The extra convenience is not really that much, and fancy >> indexing also returns a copy instead of a view, so there is >> a precedent there. >> * We want to minimize compatibility breaks with old behavior. >> We've had a fair amount of discussion and complaints about >> how we break things in general. >> * We have lived with a "copy" for 8 years now. At some point the >> behavior gets set in stone for compatibility reasons. >> * Users have written to the list and github about their code >> breaking in 1.14.0. As far as I am aware, they all refer >> to the #8100 problem. >> * If a new function `pack_fields` is needed to guard against >> mishaps with the view behavior, that seems like a sign that >> keeping the copy behavior is the best option from an API >> perspective. >> >> My initial vote is go with the change in 1.15: The "view" code >> that will ultimately break (not the code related to #8100) has >> been sending FutureWarnings for many years, and I am not aware of >> any user complaints involving it: All the complaints so far >> would be fixed with #8100 in 1.15. >> >> > (Note based on a linked mailing list thread, 2012 might be the last time I > looked more closely at structured dtypes. > So some of what I understand might be outdated.) 
> > > views on structured dtypes are very important, but viewing them as > standard arrays with standard dtypes is the main part that I had used. > Essentially structured dtypes are useless for any computation, e.g. just > some simple reduce operation. To work with them we need a standard view. > > I think the usecase that fails in statsmodels (except there is no test > failure anymore because we switched to using pandas in the unit test) > do add a detail here results is a recarray created from a csv file with results = genfromtxt(open(filename, "rb"), delimiter=",", names=True,dtype=float) ['acvar_lb','acvar_ub'] are the last two columns, so this corresponds to my example below where AFAIU no padding is necessary to get a view. > > > cls.confint_res = cls.results[['acvar_lb','acvar > _ub']].view((float, > > > 2)) > E ValueError: Changing the dtype to a subarray type is only > supported if the total itemsize is unchanged > > > This is similar to the above example > a[['a', 'c']].view('i8') > but it doesn't try to combine fields. > > In many examples where I used structured dtypes a long time ago, switched > between consistent views as either a standard array of subsets or as > .structured dtypes. > For this usecase it wouldn't matter whether a[['a', 'c']] returns a view > or copy, as long as we can get the second view that is consistent with the > selected part of the memory. This would also be independent of whether > numpy pads internally and adjusts the strides if possible or not. > > >>> np.__version__ > '1.11.2' > > >>> a = np.ones(5, dtype=[('a', 'i8'), ('b', 'f8'), ('c', 'f8')]) > >>> a > array([(1, 1.0, 1.0), (1, 1.0, 1.0), (1, 1.0, 1.0), (1, 1.0, 1.0), > (1, 1.0, 1.0)], > dtype=[('a', ' > >>> a.mean(0) > Traceback (most recent call last): > File "", line 1, in > a.mean(0) > File "C:\...\python-3.4.4.amd64\lib\site-packages\numpy\core\_methods.py", > line 65, in _mean > ret = umr_sum(arr, axis, dtype, out, keepdims) > TypeError: cannot perform reduce with flexible type > > >>> a[['b', 'c']].mean(0) > Traceback (most recent call last): > File "", line 1, in > a[['b', 'c']].mean(0) > File "C:\...\python-3.4.4.amd64\lib\site-packages\numpy\core\_methods.py", > line 65, in _mean > ret = umr_sum(arr, axis, dtype, out, keepdims) > TypeError: cannot perform reduce with flexible type > > >>> a[['b', 'c']].view(('f8', 2)).mean(0) > array([ 1., 1.]) > >>> a[['b', 'c']].view(('f8', 2)).dtype > dtype('float64') > > > Aside The plan is that statsmodels will drop all usage and support for > rec_arays/structured dtypes > in the following release (0.10). > Then structured dtypes are free (from our perspective) to provide low > level struct support > instead of pretending to be dataframe_like. > > Josef > > > >> Feel free to also discuss the related proposed change, to make >> np.diag return a view instead of a copy. That change has >> not been implemented yet, only proposed. > > >> Cheers, >> Allan >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From allanhaldane at gmail.com Mon Jan 22 11:13:17 2018 From: allanhaldane at gmail.com (Allan Haldane) Date: Mon, 22 Jan 2018 11:13:17 -0500 Subject: [Numpy-discussion] Multiple-field indexing: view vs copy in 1.14+ In-Reply-To: References: <196d4e06-5d46-aa58-cb8f-7a033c30aaed@gmail.com> Message-ID: <9a64e741-c9c7-d538-c2a1-c7761d0cf518@gmail.com> On 01/22/2018 10:53 AM, josef.pktd at gmail.com wrote: > > This is similar to the above example > a[['a', 'c']].view('i8') > but it doesn't try to combine fields. > > In? many examples where I used structured dtypes a long time ago, > switched between consistent views as either a standard array of subsets > or as .structured dtypes. > For this usecase it wouldn't matter whether a[['a', 'c']] returns a view > or copy, as long as we can get the second view that is consistent with > the selected part of the memory. This would also be independent of > whether numpy pads internally and adjusts the strides if possible or not. > >>>> np.__version__ > '1.11.2' > >>>> a = np.ones(5, dtype=[('a', 'i8'), ('b', 'f8'), ('c', 'f8')]) >>>> a > array([(1, 1.0, 1.0), (1, 1.0, 1.0), (1, 1.0, 1.0), (1, 1.0, 1.0), > ? ? ? ?(1, 1.0, 1.0)], > ? ? ? dtype=[('a', ' >>>> a[['b', 'c']].view(('f8', 2)).mean(0) > array([ 1.,? 1.]) >>>> a[['b', 'c']].view(('f8', 2)).dtype > dtype('float64') Hmm, this did not raise a FutureWarning in 11.2, so I was not quite right in my message. It looks like this particular line only started raising FutureWarnings in 1.12.0. > Aside The plan is that statsmodels will drop all usage and support for > rec_arays/structured dtypes > in the following release (0.10). > Then structured dtypes are free (from our perspective) to provide low > level struct support > instead of pretending to be dataframe_like. Your use of structured arrays is "pandas-like", ie you are using it tabular data manipulation. In numpy 1.13 we updated the structured docs to discourage this. Of course users can do what they want, but here is what the new docs say: Structured arrays are designed for low-level manipulation of structured data, for example, for interpreting binary blobs. Structured datatypes are designed to mimic 'structs' in the C language, making them also useful for interfacing with C code. For these purposes, numpy supports specialized features such as subarrays and nested datatypes, and allows manual control over the memory layout of the structure. For simple manipulation of tabular data other pydata projects, such as pandas, xarray, or DataArray, provide higher-level interfaces that may be more suitable. These projects may also give better performance for tabular data analysis because the C-struct-like memory layout of structured arrays can lead to poor cache behavior. Allan From allanhaldane at gmail.com Mon Jan 22 11:23:28 2018 From: allanhaldane at gmail.com (Allan Haldane) Date: Mon, 22 Jan 2018 11:23:28 -0500 Subject: [Numpy-discussion] Multiple-field indexing: view vs copy in 1.14+ In-Reply-To: References: <196d4e06-5d46-aa58-cb8f-7a033c30aaed@gmail.com> Message-ID: <5022ecfd-a6f4-ddab-39a6-cc0fcbd392f9@gmail.com> On 01/22/2018 10:53 AM, josef.pktd at gmail.com wrote: > On Sun, Jan 21, 2018 at 9:48 PM, Allan Haldane > In? many examples where I used structured dtypes a long time ago, > switched between consistent views as either a standard array of subsets > or as .structured dtypes. 
> For this usecase it wouldn't matter whether a[['a', 'c']] returns a view > or copy, as long as we can get the second view that is consistent with > the selected part of the memory. This would also be independent of > whether numpy pads internally and adjusts the strides if possible or not. > >>>> np.__version__ > '1.11.2' > >>>> a = np.ones(5, dtype=[('a', 'i8'), ('b', 'f8'), ('c', 'f8')]) >>>> a > array([(1, 1.0, 1.0), (1, 1.0, 1.0), (1, 1.0, 1.0), (1, 1.0, 1.0), > ? ? ? ?(1, 1.0, 1.0)], > ? ? ? dtype=[('a', '>>> a[['b', 'c']].view(('f8', 2)).dtype > dtype('float64') Thanks for a real example to think about. I just want to note that I thought of another way to "fix" this for 1.15 which does not involve "pack_fields", which is a[['b', 'c']].astype('f8,f8').view(('f8', 2)) Which is back-compatible will numpy back to 1.7, I think. So that's another option to ease the transition. Allan From josef.pktd at gmail.com Mon Jan 22 11:24:30 2018 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 22 Jan 2018 11:24:30 -0500 Subject: [Numpy-discussion] Multiple-field indexing: view vs copy in 1.14+ In-Reply-To: <9a64e741-c9c7-d538-c2a1-c7761d0cf518@gmail.com> References: <196d4e06-5d46-aa58-cb8f-7a033c30aaed@gmail.com> <9a64e741-c9c7-d538-c2a1-c7761d0cf518@gmail.com> Message-ID: On Mon, Jan 22, 2018 at 11:13 AM, Allan Haldane wrote: > On 01/22/2018 10:53 AM, josef.pktd at gmail.com wrote: > >> >> This is similar to the above example >> a[['a', 'c']].view('i8') >> but it doesn't try to combine fields. >> >> In many examples where I used structured dtypes a long time ago, >> switched between consistent views as either a standard array of subsets or >> as .structured dtypes. >> For this usecase it wouldn't matter whether a[['a', 'c']] returns a view >> or copy, as long as we can get the second view that is consistent with the >> selected part of the memory. This would also be independent of whether >> numpy pads internally and adjusts the strides if possible or not. >> >> np.__version__ >>>>> >>>> '1.11.2' >> >> a = np.ones(5, dtype=[('a', 'i8'), ('b', 'f8'), ('c', 'f8')]) >>>>> a >>>>> >>>> array([(1, 1.0, 1.0), (1, 1.0, 1.0), (1, 1.0, 1.0), (1, 1.0, 1.0), >> (1, 1.0, 1.0)], >> dtype=[('a', '> >> a[['b', 'c']].view(('f8', 2)).mean(0) >>>>> >>>> array([ 1., 1.]) >> >>> a[['b', 'c']].view(('f8', 2)).dtype >>>>> >>>> dtype('float64') >> > > Hmm, this did not raise a FutureWarning in 11.2, so I was not quite right > in my message. It looks like this particular line only started raising > FutureWarnings in 1.12.0. > > Aside The plan is that statsmodels will drop all usage and support for >> rec_arays/structured dtypes >> in the following release (0.10). >> Then structured dtypes are free (from our perspective) to provide low >> level struct support >> instead of pretending to be dataframe_like. >> > > Your use of structured arrays is "pandas-like", ie you are using it > tabular data manipulation. In numpy 1.13 we updated the structured docs to > discourage this. Of course users can do what they want, but here is what > the new docs say: > > Structured arrays are designed for low-level > manipulation of structured data, for example, for > interpreting binary blobs. Structured datatypes are > designed to mimic 'structs' in the C language, making > them also useful for interfacing with C code. For these > purposes, numpy supports specialized features such as > subarrays and nested datatypes, and allows manual > control over the memory layout of the structure. 
> > For simple manipulation of tabular data other pydata > projects, such as pandas, xarray, or DataArray, provide > higher-level interfaces that may be more suitable. These > projects may also give better performance for tabular > data analysis because the C-struct-like memory layout of > structured arrays can lead to poor cache behavior. > > Once upon a time .... The test code was written in June 2010 In Oct/Nov 2017 we switched to pandas for loading the data but not for the reference `results` to avoid numpy recarray warnings. In Jan 2018 we switched to pandas also for the reference results statsmodels has a lot of "legacy" code especially in the datasets and unit tests, when recarrays were still the appropriate precursor to pandas. recarrays are built on structured dtypes, and were not just supposed to be low level C-structs. Josef > > Allan > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From allanhaldane at gmail.com Mon Jan 22 11:46:08 2018 From: allanhaldane at gmail.com (Allan Haldane) Date: Mon, 22 Jan 2018 11:46:08 -0500 Subject: [Numpy-discussion] Multiple-field indexing: view vs copy in 1.14+ In-Reply-To: <5022ecfd-a6f4-ddab-39a6-cc0fcbd392f9@gmail.com> References: <196d4e06-5d46-aa58-cb8f-7a033c30aaed@gmail.com> <5022ecfd-a6f4-ddab-39a6-cc0fcbd392f9@gmail.com> Message-ID: <02f98251-e0b7-968a-c450-f33ce9d9972e@gmail.com> On 01/22/2018 11:23 AM, Allan Haldane wrote: > I just want to note that I > thought of another way to "fix" this for 1.15 which does not involve > "pack_fields", which is > > ??? a[['b', 'c']].astype('f8,f8').view(('f8', 2)) > > Which is back-compatible will numpy back to 1.7, I think. Apologies, this is not back-compatible, do not use it. I forgot that past versions of numpy had a weird quirk that this will replace all the structured data with 0s. Allan From andyfaff at gmail.com Tue Jan 23 05:04:06 2018 From: andyfaff at gmail.com (Andrew Nelson) Date: Tue, 23 Jan 2018 21:04:06 +1100 Subject: [Numpy-discussion] with np.testing.assert_raises still requires nose. Message-ID: Dear np list, from a recent scipy PR (https://github.com/scipy/scipy/pull/8322) it appears that the `np.testing.assert_raises` context manager still requires nose to be installed: https://ci.appveyor.com/project/scipy/scipy/build/1.0.1550/job/s5nq8xqd0n72feqf The appveyor test is using numpy-1.14.0 This doesn't seem to be an issue on our travis tests, only the windows tests. Is this a bug? -- _____________________________________ Dr. Andrew Nelson _____________________________________ -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Tue Jan 23 11:34:25 2018 From: matthew.brett at gmail.com (Matthew Brett) Date: Tue, 23 Jan 2018 16:34:25 +0000 Subject: [Numpy-discussion] with np.testing.assert_raises still requires nose. In-Reply-To: References: Message-ID: Hi, On Tue, Jan 23, 2018 at 10:04 AM, Andrew Nelson wrote: > Dear np list, > from a recent scipy PR (https://github.com/scipy/scipy/pull/8322) it appears > that the `np.testing.assert_raises` context manager still requires nose to > be installed: > https://ci.appveyor.com/project/scipy/scipy/build/1.0.1550/job/s5nq8xqd0n72feqf > > The appveyor test is using numpy-1.14.0 > > This doesn't seem to be an issue on our travis tests, only the windows > tests. Is this a bug? 
I noticed that too. For my own project I ended up adding a replacement stub: """ from unittest import TestCase assert_raises = TestCase().assertRaises """ I think that's all nose is doing, but I couldn't find the code path in a quick grep. Cheers, Matthew From solarjoe at posteo.org Thu Jan 25 09:51:40 2018 From: solarjoe at posteo.org (Joe) Date: Thu, 25 Jan 2018 15:51:40 +0100 Subject: [Numpy-discussion] Using np.frombuffer and cffi.buffer on array of C structs (problem with struct member padding) In-Reply-To: <878td2ny6m.fsf@otaria.sebmel.org> References: <87po6e234f.fsf@gmail.com> <878td2ny6m.fsf@otaria.sebmel.org> Message-ID: <8bd9f09ce2eb55140bc65775201ed47c@posteo.de> Hello, how I could dynamically handle the dtype of a structured array when reading an array of C structs with np.frombuffer (regarding the member padding in the struct). So far I manually adjusted the dtype of the structured array and added a field for the padding, this works on a small scale. The structs are not defined within my code, but within a third party and basically I am looking for no-worry hassle free way to handle this, because there are a lot of structs Is there some smart way to do this in Numpy? So far the best approach seems to parse the struct with the cffi functions ffi.sizeof(), ffi.offsetof() and maybe ffi.alignof() to find out where the padding happens and add it dynamically to the dtype. But maybe someone has a smarter idea how to solve this. You can find a more detailed description and a working example here: https://stackoverflow.com/questions/48423725/how-to-handle-member-padding-in-struct-when-reading-cffi-buffer-with-numpy-fromb Kind regards, Joe From chris.barker at noaa.gov Thu Jan 25 11:31:26 2018 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Thu, 25 Jan 2018 08:31:26 -0800 Subject: [Numpy-discussion] Using np.frombuffer and cffi.buffer on array of C structs (problem with struct member padding) In-Reply-To: <8bd9f09ce2eb55140bc65775201ed47c@posteo.de> References: <87po6e234f.fsf@gmail.com> <878td2ny6m.fsf@otaria.sebmel.org> <8bd9f09ce2eb55140bc65775201ed47c@posteo.de> Message-ID: Sent from my iPhone > On Jan 25, 2018, at 6:51 AM, Joe wrote: > > Hello, > > how I could dynamically handle the dtype of a structured array when reading > an array of C structs with np.frombuffer (regarding the member padding in the struct). > > So far I manually adjusted the dtype of the structured array and added a field for the padding, > this works on a small scale. > The structs are not defined within my code, but within a third party and > basically I am looking for no-worry hassle free way to handle this, because there are a lot of structs > > Is there some smart way to do this in Numpy? The numpy dtype constructor takes an ?align? keyword that will pad it for you. However, if these strict are coming from a lib compiled by a third party, I?m not sure you can count on the alignment rules being the same. So maybe you will need to use the cffi functions :-( -CHB > > So far the best approach seems to parse the struct with the cffi functions > ffi.sizeof(), ffi.offsetof() and maybe ffi.alignof() to find out where the padding > happens and add it dynamically to the dtype. But maybe someone has a > smarter idea how to solve this. 
> > You can find a more detailed description and a working example here: > > https://stackoverflow.com/questions/48423725/how-to-handle-member-padding-in-struct-when-reading-cffi-buffer-with-numpy-fromb > > Kind regards, > Joe > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion From chris.barker at noaa.gov Thu Jan 25 11:33:05 2018 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Thu, 25 Jan 2018 08:33:05 -0800 Subject: [Numpy-discussion] Using np.frombuffer and cffi.buffer on array of C structs (problem with struct member padding) In-Reply-To: References: <87po6e234f.fsf@gmail.com> <878td2ny6m.fsf@otaria.sebmel.org> <8bd9f09ce2eb55140bc65775201ed47c@posteo.de> Message-ID: The numpy dtype constructor takes an ?align? keyword that will pad it for you. https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.dtype.html -CHB -------------- next part -------------- An HTML attachment was scrubbed... URL: From allanhaldane at gmail.com Thu Jan 25 13:01:46 2018 From: allanhaldane at gmail.com (Allan Haldane) Date: Thu, 25 Jan 2018 13:01:46 -0500 Subject: [Numpy-discussion] Using np.frombuffer and cffi.buffer on array of C structs (problem with struct member padding) In-Reply-To: References: <87po6e234f.fsf@gmail.com> <878td2ny6m.fsf@otaria.sebmel.org> <8bd9f09ce2eb55140bc65775201ed47c@posteo.de> Message-ID: <88dbefc1-ab01-8efe-420c-102e13e3aaed@gmail.com> There is a new section discussing alignment in the numpy 1.14 structured array docs, which has some hints about interfacing with C structs. These new 1.14 docs are not online yet on scipy.org, but in the meantime you can view them here: https://ahaldane.github.io/user/basics.rec.html#automatic-byte-offsets-and-alignment (That links specifically to the discussion of alignments and padding). Allan On 01/25/2018 11:33 AM, Chris Barker - NOAA Federal wrote: > >> >> The numpy dtype constructor takes an ?align? keyword that will pad it >> for you. > > https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.dtype.html > > -CHB > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > From stefanv at berkeley.edu Thu Jan 25 13:16:31 2018 From: stefanv at berkeley.edu (Stefan van der Walt) Date: Thu, 25 Jan 2018 10:16:31 -0800 Subject: [Numpy-discussion] Multiple-field indexing: view vs copy in 1.14+ In-Reply-To: References: <196d4e06-5d46-aa58-cb8f-7a033c30aaed@gmail.com> Message-ID: <20180125181631.l5cxygykjshpp44f@fastmail.com> On Mon, 22 Jan 2018 10:11:08 -0500, Marten van Kerkwijk wrote: >I think on the consistency argument is perhaps the most important: >views are very powerful and in many ways one *counts* on them >happening, especially in working with large arrays. I had the same gut feeling, but the fancy indexing example made me pause: In [9]: x = np.arange(12, dtype=float).reshape((3, 4)) In [10]: p = x[[0, 1]] # copy of data Then: In [11]: x = np.array([(0, 1), (2, 3)], dtype=[('a', int), ('b', int)]) In [12]: p = x[['a', 'b']] # copy of data, but proposal will change that We're not doing the same kind of indexing here exactly (in one case we grab elements, in the other parts of elements), but the view behavior may still break the "mental expectation". 
Fortunately, there's already other proof that this operatoin is not exactly fancy indexing: In [15]: x[['a', 'a']] --------------------------------------------------------------------------- KeyError Traceback (most recent call last) in () ----> 1 x[['a', 'a']] KeyError: 'duplicate field of name a' Not copying wherever possible feels like an important principle to uphold, so I am +1. St?fan From m.h.vankerkwijk at gmail.com Thu Jan 25 13:49:18 2018 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Thu, 25 Jan 2018 13:49:18 -0500 Subject: [Numpy-discussion] Multiple-field indexing: view vs copy in 1.14+ In-Reply-To: <20180125181631.l5cxygykjshpp44f@fastmail.com> References: <196d4e06-5d46-aa58-cb8f-7a033c30aaed@gmail.com> <20180125181631.l5cxygykjshpp44f@fastmail.com> Message-ID: On Thu, Jan 25, 2018 at 1:16 PM, Stefan van der Walt wrote: > On Mon, 22 Jan 2018 10:11:08 -0500, Marten van Kerkwijk wrote: >> >> I think on the consistency argument is perhaps the most important: >> views are very powerful and in many ways one *counts* on them >> happening, especially in working with large arrays. > > > I had the same gut feeling, but the fancy indexing example made me > pause: > > In [9]: x = np.arange(12, dtype=float).reshape((3, 4)) > > In [10]: p = x[[0, 1]] # copy of data > > Then: > > In [11]: x = np.array([(0, 1), (2, 3)], dtype=[('a', int), ('b', int)]) > > In [12]: p = x[['a', 'b']] # copy of data, but proposal will change that > > We're not doing the same kind of indexing here exactly (in one case we > grab elements, in the other parts of elements), but the view behavior > may still break the "mental expectation". A bit off-topic, but maybe this is another argument to just allow `x['a', 'b']` -- I never understood why a tuple was not the appropriate iterable for getting multiple items from a record. -- Marten From josef.pktd at gmail.com Thu Jan 25 15:56:28 2018 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 25 Jan 2018 15:56:28 -0500 Subject: [Numpy-discussion] Multiple-field indexing: view vs copy in 1.14+ In-Reply-To: References: <196d4e06-5d46-aa58-cb8f-7a033c30aaed@gmail.com> <20180125181631.l5cxygykjshpp44f@fastmail.com> Message-ID: On Thu, Jan 25, 2018 at 1:49 PM, Marten van Kerkwijk < m.h.vankerkwijk at gmail.com> wrote: > On Thu, Jan 25, 2018 at 1:16 PM, Stefan van der Walt > wrote: > > On Mon, 22 Jan 2018 10:11:08 -0500, Marten van Kerkwijk wrote: > >> > >> I think on the consistency argument is perhaps the most important: > >> views are very powerful and in many ways one *counts* on them > >> happening, especially in working with large arrays. > > > > > > I had the same gut feeling, but the fancy indexing example made me > > pause: > > > > In [9]: x = np.arange(12, dtype=float).reshape((3, 4)) > > > > In [10]: p = x[[0, 1]] # copy of data > > > > Then: > > > > In [11]: x = np.array([(0, 1), (2, 3)], dtype=[('a', int), ('b', int)]) > > > > In [12]: p = x[['a', 'b']] # copy of data, but proposal will change that > What does this do? p = x[['a', 'b']].copy() My impression is that the problems with the view are because the padded view doesn't behave like a "standard" dtype or array, i.e. the follow-up behavior is the problematic part. Josef > > > > We're not doing the same kind of indexing here exactly (in one case we > > grab elements, in the other parts of elements), but the view behavior > > may still break the "mental expectation". 
> > A bit off-topic, but maybe this is another argument to just allow > `x['a', 'b']` -- I never understood why a tuple was not the > appropriate iterable for getting multiple items from a record. > > -- Marten > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Thu Jan 25 18:06:56 2018 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 25 Jan 2018 15:06:56 -0800 Subject: [Numpy-discussion] Setting custom dtypes and 1.14 Message-ID: Hi all, I'm pretty sure this is the same thing as recently discussed on this list about 1.14, but to confirm: I had failures in my code with an upgrade for 1.14 -- turns out it was a single line in a single test fixture, so no big deal, but a regression just the same, with no deprecation warning. I was essentially doing this: In [*48*]: dt Out[*48*]: dtype([('time', ' in () ----> 1 full = np.array(zip(time, uv), dtype=dt) ValueError: setting an array element with a sequence. It took some poking, but the solution was to do: full = np.array(zip(time, (tuple(w) *for* w *in* uv)), dtype=dt) That is, convert the values to nested tuples, rather than an array in a tuple, or a list in a tuple. As I said, my problem is solved, but to confirm: 1) This is a known change with good reason? 2) My solution was the best (only) one -- the only way to set a nested dtype like that is with tuples? If so, then I think we should: A) improve the error message. "ValueError: setting an array element with a sequence." Is not really clear -- I spent a while trying to figure out how I could set a nested dtype like that without a sequence? and I was actually using a ndarray, so it wasn't even a generic sequence. And a tuple is a sequence, too... I had a vague recollection that in some circumstances, numpy treats tuples and lists (and arrays) differently (fancy indexing??), so I tried the tuple thing and that worked. But I've been around numpy a long time -- that could have been very very confusing to many people. So could the message be changed to something like: "ValueError: setting an array element with a generic sequence. Only the tuple type can be used in this context." or something like that -- I'm not sure where else this same error message might pop up, so that could be totally inappropriate. 2) maybe add a .totuple()method to ndarray, much like the .tolist() method? that would have been handy here. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From allanhaldane at gmail.com Thu Jan 25 19:06:18 2018 From: allanhaldane at gmail.com (Allan Haldane) Date: Thu, 25 Jan 2018 19:06:18 -0500 Subject: [Numpy-discussion] Setting custom dtypes and 1.14 In-Reply-To: References: Message-ID: <66031cd1-b872-203b-bdcc-a4416efbf946@gmail.com> On 01/25/2018 06:06 PM, Chris Barker wrote: > Hi all, > > I'm pretty sure this is the same thing as recently discussed on this > list about 1.14, but to confirm: > > I had failures in my code with an upgrade for 1.14 -- turns out it was a > single line in a single test fixture, so no big deal, but a regression > just the same, with no deprecation warning. 
> > I was essentially doing this: > > In [*48*]: dt > > Out[*48*]: dtype([('time', ' ' > > In [*49*]: uv > > Out[*49*]:? > > array([[1., 1.], > > ?? ? ? [1., 1.], > > ?? ? ? [1., 1.], > > ?? ? ? [1., 1.]]) > > > In [*50*]: time > > Out[*50*]: array([1, 1, 1, 1]) > > > In [*51*]: full = np.array(zip(time, uv), dtype=dt) > > --------------------------------------------------------------------------- > > ValueError? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? Traceback (most recent call last) > > in () > > ----> 1full =np.array(zip(time,uv),dtype=dt) > > > ValueError: setting an array element with a sequence. > > > > It took some poking, but the solution was to do: > > full = np.array(zip(time, (tuple(w) *for*w *in*uv)), dtype=dt) > > > That is, convert the values to nested tuples, rather than an array in a > tuple, or a list in a tuple. > > As I said, my problem is solved, but to confirm: > > 1) This is a known change with good reason? This change is a little different from what we discussed before. The change occurred because the old assignment behavior was dangerous, and was not doing what you thought. If you modify your dtype above changing both 'f8' fields to 'f4', you will see you get very strange results: Your array gets filled in with the values (1, ( 0., 1.875)). Here's what happened: Previously, numpy was *not* iterating your data as a sequence. Instead, if numpy did not find a tuple it would interpret the data a a raw buffer and copy the value byte-by-byte, ignoring endianness, casting, stride, etc. You can get even weirder results if you do `uv = uv.astype('i4')`, for example. It happened to work for you because ndarrays expose a buffer interface, and you were assigning using exactly the same type and endianness. In 1.14 the fix was to disallow this 'buffer' assignment for structured arrays, it was causing quite confusing bugs. Unstructured "void" arrays still do this though. > 2) My solution was the best (only) one -- the only way to set a nested > dtype like that is with tuples? Right, our solution was to only allow assignment from tuples. We might be able to relax that for structured scalars, but for arrays I remember one consideration was to avoid confusion with array broadcasting: If you do >>> x = np.zeros(2, dtype='i4,i4') >>> x[:] = np.array([3, 4]) >>> x array([(3, 3), (4, 4)], dtype=[('f0', '>> x[:] = (3, 4) >>> x array([(3, 4), (3, 4)], dtype=[('f0', ' If so, then I think we should: > > A) improve the error message. > > "ValueError: setting an array element with a sequence." > > Is not really clear -- I spent a while trying to figure out how I could > set a nested dtype like that without a sequence? and I was actually > using a ndarray, so it wasn't even a generic sequence. And a tuple is a > sequence, too... > > I had a vague recollection that in some circumstances, numpy treats > tuples and lists (and arrays) differently (fancy indexing??), so I tried > the tuple thing and that worked. But I've been around numpy a long time > -- that could have been very very confusing to many people. > > So could the message be changed to something like: > > "ValueError: setting an array element with a generic sequence. Only the > tuple type can be used in this context." > > or something like that -- I'm not sure where else this same error > message might pop up, so that could be totally inappropriate. Good idea. I'll see if we can do it for 1.14.1. > 2) maybe add a?.totuple()method to ndarray, much like the .tolist() > method? that would have been handy here. 
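For the underlying task here, building a structured array out of a plain 1-D array and an (N, 2) array, there is also a tuple-free route: allocate the structured array first and assign each field from the plain arrays. A minimal sketch, assuming a two-field dtype whose second field is a 2-element subarray (the field name 'uv' below is made up for the example, since the original dtype repr is not fully shown above):

    import numpy as np

    # hypothetical dtype: a scalar 'time' field plus a 2-element 'uv' subarray field
    dt = np.dtype([('time', 'f8'), ('uv', 'f8', (2,))])

    time = np.array([1., 1., 1., 1.])
    uv = np.ones((4, 2))

    # allocate, then assign field by field; whole-element tuple assignment
    # never comes into play, and no intermediate list of tuples is built
    full = np.zeros(len(time), dtype=dt)
    full['time'] = time
    full['uv'] = uv

Per-field assignment like this also scales better for large arrays than zipping the rows into tuples.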
>> -Chris
>
>
> --
>
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR&R            (206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115       (206) 526-6317   main reception
>
> Chris.Barker at noaa.gov
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>

From chris.barker at noaa.gov  Thu Jan 25 20:53:39 2018
From: chris.barker at noaa.gov (Chris Barker - NOAA Federal)
Date: Thu, 25 Jan 2018 17:53:39 -0800
Subject: [Numpy-discussion] Setting custom dtypes and 1.14
In-Reply-To: <66031cd1-b872-203b-bdcc-a4416efbf946@gmail.com>
References: <66031cd1-b872-203b-bdcc-a4416efbf946@gmail.com>
Message-ID:

> On Jan 25, 2018, at 4:06 PM, Allan Haldane wrote:

>> 1) This is a known change with good reason?

> ... The change occurred because the old assignment behavior was dangerous, and
> was not doing what you thought.

OK, that's a good reason!

>> A) improve the error message.
>
> Good idea. I'll see if we can do it for 1.14.1.

What do folks think about a totuple() method? Even before this I've
wanted that. But in this case, it seems particularly useful.

-CHB

From kevin.k.sheppard at gmail.com  Fri Jan 26 11:14:07 2018
From: kevin.k.sheppard at gmail.com (Kevin Sheppard)
Date: Fri, 26 Jan 2018 16:14:07 +0000
Subject: [Numpy-discussion] Moving NumPy's PRNG Forward
Message-ID:

I am a firm believer that the current situation is not sustainable. There
are a lot of improvements that can practically be incorporated. While many
of these are performance related, there are also improvements in accuracy
over some ranges of parameters that cannot be incorporated. I also think
that perfect stream reproducibility is a bit of a myth across versions,
since this really would require identical OS, compiler and possibly CPU
for some of the generators that produce floats.

I believe there is a case for separating the random generator from core
NumPy. Some points that favor becoming a subproject:

1. It is a pure consumer of the NumPy API. Other parts of the API do not
depend on random.
2. A stand-alone package could be installed alongside many different
versions of core NumPy, which would reduce the pressure on freezing the
stream.

In terms of what is needed, I think that the underlying PRNG should be
swappable. This will provide a simple mechanism to allow certain types of
advancement while easily providing backward compatibility. In the current
design this is very hard and requires compiling many nearly identical
copies of RandomState. In pseudocode, something like

standard_normal(prng)

where prng is a basic class that retains the PRNG state and has a small
set of core random number generators that belong to the underlying PRNG --
probably something like int32, int64, double, and possibly int53. I am not
advocating explicitly passing the PRNG as an argument, but having
generators which can take any suitable PRNG would add a lot of flexibility
in terms of taking advantage of improvements in the underlying PRNGs (see,
e.g., xoroshiro128/xorshift1024). The "small" core PRNG would have
responsibility over state and streams. The remainder of the module would
transform the underlying PRNG into the required distributions. This would
also simplify making improvements, since old versions could be saved or
improved versions could be added to the API.
For example, from numpy.random import standard_normal, prng # Preferred versions standard_normal(prng) # Ziggurat from numpy.random.legacy import standard_normal_bm, mt19937 # legacy generators standard_normal_bm(mt19937) # Box-Muller Kevin -------------- next part -------------- An HTML attachment was scrubbed... URL: From allanhaldane at gmail.com Fri Jan 26 13:48:19 2018 From: allanhaldane at gmail.com (Allan Haldane) Date: Fri, 26 Jan 2018 13:48:19 -0500 Subject: [Numpy-discussion] Setting custom dtypes and 1.14 In-Reply-To: References: <66031cd1-b872-203b-bdcc-a4416efbf946@gmail.com> Message-ID: <2c7a541a-dcc0-32f1-b18b-3ca35a34bb71@gmail.com> On 01/25/2018 08:53 PM, Chris Barker - NOAA Federal wrote: >> On Jan 25, 2018, at 4:06 PM, Allan Haldane wrote: > >>> 1) This is a known change with good reason? > >> . The >> change occurred because the old assignment behavior was dangerous, and >> was not doing what you thought. > > OK, that?s a good reason! > >>> A) improve the error message. >> >> Good idea. I'll see if we can do it for 1.14.1. > > What do folks think about a totuple() method ? even before this I?ve > wanted that. But in this case, it seems particularly useful. > > -CHB Two thoughts: 1. `totuple` makes most sense for 2d arrays. But what should it do for 1d or 3+d arrays? I suppose it could make the last dimension a tuple, so 1d arrays would give a list of tuples of size 1. 2. structured array's .tolist() already returns a list of tuples. If we have a 2d structured array, would it add one more layer of tuples? That would raise an exception if read back in by `np.array` with the same dtype. These points make me think that instead of a `.totuple` method, this might be more suitable as a new function in np.lib.recfunctions. If the goal is to help manipulate structured arrays, that submodule is appropriate since it already has other functions do manipulate fields in similar ways. What about calling it `pack_last_axis`? def pack_last_axis(arr, names=None): if arr.names: return arr names = names or ['f{}'.format(i) for i in range(arr.shape[-1])] return arr.view([(n, arr.dtype) for n in names]).squeeze(-1) Then you could do: >>> pack_last_axis(uv).tolist() to get a list of tuples. Allan From wieser.eric+numpy at gmail.com Fri Jan 26 14:17:20 2018 From: wieser.eric+numpy at gmail.com (Eric Wieser) Date: Fri, 26 Jan 2018 19:17:20 +0000 Subject: [Numpy-discussion] Setting custom dtypes and 1.14 In-Reply-To: <2c7a541a-dcc0-32f1-b18b-3ca35a34bb71@gmail.com> References: <66031cd1-b872-203b-bdcc-a4416efbf946@gmail.com> <2c7a541a-dcc0-32f1-b18b-3ca35a34bb71@gmail.com> Message-ID: Why is the list of tuples a useful thing to have in the first place? If the goal is to convert an array into a structured array, you can do that far more efficiently with: def make_tup_dtype(arr): """ Attempt to make a type capable of viewing the last axis of an array, even if it is non-contiguous. Unfortunately `.view` doesn't allow us to use this dtype in that case, which needs a patch... 
""" n_fields = arr.shape[-1] step = arr.strides[-1] descr = dict(names=[], formats=[], offsets=[], itemsize=step * n_fields) for i in range(n_fields): descr['names'].append('f{}'.format(i)) descr['offsets'].append(step * i) descr['formats'].append(arr.dtype) return np.dtype(descr) Used as: >>> arr = np.arange(6).reshape(3, 2)>>> arr.view(make_tup_dtype(arr)).squeeze(axis=-1) array([(0, 1), (2, 3), (4, 5)], dtype=[('f0', ' wrote: > On 01/25/2018 08:53 PM, Chris Barker - NOAA Federal wrote: > >> On Jan 25, 2018, at 4:06 PM, Allan Haldane > wrote: > > > >>> 1) This is a known change with good reason? > > > >> . The > >> change occurred because the old assignment behavior was dangerous, and > >> was not doing what you thought. > > > > OK, that?s a good reason! > > > >>> A) improve the error message. > >> > >> Good idea. I'll see if we can do it for 1.14.1. > > > > What do folks think about a totuple() method ? even before this I?ve > > wanted that. But in this case, it seems particularly useful. > > > > -CHB > > Two thoughts: > > 1. `totuple` makes most sense for 2d arrays. But what should it do for > 1d or 3+d arrays? I suppose it could make the last dimension a tuple, so > 1d arrays would give a list of tuples of size 1. > > 2. structured array's .tolist() already returns a list of tuples. If we > have a 2d structured array, would it add one more layer of tuples? That > would raise an exception if read back in by `np.array` with the same dtype. > > These points make me think that instead of a `.totuple` method, this > might be more suitable as a new function in np.lib.recfunctions. If the > goal is to help manipulate structured arrays, that submodule is > appropriate since it already has other functions do manipulate fields in > similar ways. What about calling it `pack_last_axis`? > > def pack_last_axis(arr, names=None): > if arr.names: > return arr > names = names or ['f{}'.format(i) for i in range(arr.shape[-1])] > return arr.view([(n, arr.dtype) for n in names]).squeeze(-1) > > Then you could do: > > >>> pack_last_axis(uv).tolist() > > to get a list of tuples. > > Allan > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wieser.eric+numpy at gmail.com Fri Jan 26 14:19:14 2018 From: wieser.eric+numpy at gmail.com (Eric Wieser) Date: Fri, 26 Jan 2018 19:19:14 +0000 Subject: [Numpy-discussion] Setting custom dtypes and 1.14 In-Reply-To: References: <66031cd1-b872-203b-bdcc-a4416efbf946@gmail.com> <2c7a541a-dcc0-32f1-b18b-3ca35a34bb71@gmail.com> Message-ID: Apologies, it seems that I skipped to the end of @ahaldane's remark - we're on the same page. On Fri, 26 Jan 2018 at 11:17 Eric Wieser wrote: > Why is the list of tuples a useful thing to have in the first place? If > the goal is to convert an array into a structured array, you can do that > far more efficiently with: > > def make_tup_dtype(arr): > """ > Attempt to make a type capable of viewing the last axis of an array, even if it is non-contiguous. > Unfortunately `.view` doesn't allow us to use this dtype in that case, which needs a patch... 
> """ > n_fields = arr.shape[-1] > step = arr.strides[-1] > descr = dict(names=[], formats=[], offsets=[], itemsize=step * n_fields) > for i in range(n_fields): > descr['names'].append('f{}'.format(i)) > descr['offsets'].append(step * i) > descr['formats'].append(arr.dtype) > return np.dtype(descr) > > Used as: > > >>> arr = np.arange(6).reshape(3, 2)>>> arr.view(make_tup_dtype(arr)).squeeze(axis=-1) > array([(0, 1), (2, 3), (4, 5)], > dtype=[('f0', ' > Perhaps this should be provided by recfunctions (or maybe it already is, > in a less rigid form?) > > Eric > ? > > On Fri, 26 Jan 2018 at 10:48 Allan Haldane wrote: > >> On 01/25/2018 08:53 PM, Chris Barker - NOAA Federal wrote: >> >> On Jan 25, 2018, at 4:06 PM, Allan Haldane >> wrote: >> > >> >>> 1) This is a known change with good reason? >> > >> >> . The >> >> change occurred because the old assignment behavior was dangerous, and >> >> was not doing what you thought. >> > >> > OK, that?s a good reason! >> > >> >>> A) improve the error message. >> >> >> >> Good idea. I'll see if we can do it for 1.14.1. >> > >> > What do folks think about a totuple() method ? even before this I?ve >> > wanted that. But in this case, it seems particularly useful. >> > >> > -CHB >> >> Two thoughts: >> >> 1. `totuple` makes most sense for 2d arrays. But what should it do for >> 1d or 3+d arrays? I suppose it could make the last dimension a tuple, so >> 1d arrays would give a list of tuples of size 1. >> >> 2. structured array's .tolist() already returns a list of tuples. If we >> have a 2d structured array, would it add one more layer of tuples? That >> would raise an exception if read back in by `np.array` with the same >> dtype. >> >> These points make me think that instead of a `.totuple` method, this >> might be more suitable as a new function in np.lib.recfunctions. If the >> goal is to help manipulate structured arrays, that submodule is >> appropriate since it already has other functions do manipulate fields in >> similar ways. What about calling it `pack_last_axis`? >> >> def pack_last_axis(arr, names=None): >> if arr.names: >> return arr >> names = names or ['f{}'.format(i) for i in range(arr.shape[-1])] >> return arr.view([(n, arr.dtype) for n in names]).squeeze(-1) >> >> Then you could do: >> >> >>> pack_last_axis(uv).tolist() >> >> to get a list of tuples. >> >> Allan >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Fri Jan 26 15:38:06 2018 From: chris.barker at noaa.gov (Chris Barker) Date: Fri, 26 Jan 2018 12:38:06 -0800 Subject: [Numpy-discussion] Setting custom dtypes and 1.14 In-Reply-To: <2c7a541a-dcc0-32f1-b18b-3ca35a34bb71@gmail.com> References: <66031cd1-b872-203b-bdcc-a4416efbf946@gmail.com> <2c7a541a-dcc0-32f1-b18b-3ca35a34bb71@gmail.com> Message-ID: On Fri, Jan 26, 2018 at 10:48 AM, Allan Haldane wrote: > > What do folks think about a totuple() method ? even before this I?ve > > wanted that. But in this case, it seems particularly useful. > > Two thoughts: > > 1. `totuple` makes most sense for 2d arrays. But what should it do for > 1d or 3+d arrays? I suppose it could make the last dimension a tuple, so > 1d arrays would give a list of tuples of size 1. > I was thinking it would be exactly like .tolist() but with tuples -- so you'd get tuples all the way down (or is that turtles?) 
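A rough sketch of what such a helper might look like (this is not an existing ndarray method, and since it round-trips through tolist() it would not be fast):

    import numpy as np

    def totuple(a):
        # convert an array (or the nested lists from tolist()) into nested tuples
        if isinstance(a, np.ndarray):
            a = a.tolist()
        if isinstance(a, list):
            return tuple(totuple(x) for x in a)
        return a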
IN this use case, it would have saved me the generator expression: (tuple(r) for r in arr) not a huge deal, but it would be nice to not have to write that, and to have the looping be in C with no intermediate array generation. 2. structured array's .tolist() already returns a list of tuples. If we > have a 2d structured array, would it add one more layer of tuples? no -- why? it would return a tuple of tuples instead. > That > would raise an exception if read back in by `np.array` with the same dtype. > Hmm -- indeed, if the top-level structure is a tuple, the array constructor gets confused: This works fine -- as it should: In [*84*]: new_full = np.array(full.tolist(), full.dtype) But this does not: In [*85*]: new_full = np.array(tuple(full.tolist()), full.dtype) --------------------------------------------------------------------------- ValueError Traceback (most recent call last) in () ----> 1 new_full = np.array(tuple(full.tolist()), full.dtype) ValueError: could not assign tuple of length 4 to structure with 2 fields. I was hoping it would dig down to the inner structures looking for a match to the dtype, rather than looking at the type of the top level. Oh well. So yeah, not sure where you would go from tuple to list -- probably at the bottom level, but that may not always be unambiguous. These points make me think that instead of a `.totuple` method, this > might be more suitable as a new function in np.lib.recfunctions. I don't seem to have that module -- and I'm running 1.14.0 -- is this a new idea? > If the > goal is to help manipulate structured arrays, that submodule is > appropriate since it already has other functions do manipulate fields in > similar ways. What about calling it `pack_last_axis`? > > def pack_last_axis(arr, names=None): > if arr.names: > return arr > names = names or ['f{}'.format(i) for i in range(arr.shape[-1])] > return arr.view([(n, arr.dtype) for n in names]).squeeze(-1) > > Then you could do: > > >>> pack_last_axis(uv).tolist() > > to get a list of tuples. > not sure what idea is here -- in my example, I had a regular 2-d array, so no names: In [*90*]: pack_last_axis(uv) --------------------------------------------------------------------------- AttributeError Traceback (most recent call last) in () ----> 1 pack_last_axis(uv) in pack_last_axis(arr, names) * 1* def pack_last_axis(arr, names=None): ----> 2 if arr.names: * 3* return arr * 4* names = names or ['f{}'.format(i) for i in range(arr.shape[-1 ])] * 5* return arr.view([(n, arr.dtype) for n in names]).squeeze(-1) AttributeError: 'numpy.ndarray' object has no attribute 'names' So maybe you meants something like: In [*95*]: *def* pack_last_axis(arr, names=None): ...: *try*: ...: arr.names ...: *return* arr ...: *except* *AttributeError*: ...: names = names *or* ['f{}'.format(i) *for* i *in* range (arr.shape[-1])] ...: *return* arr.view([(n, arr.dtype) *for* n *in* names]).squeeze(-1) which does work, but seems like a convoluted way to get tuples! However, I didn't actually need tuples, I needed something I could pack into a stuctarray, and this does work, without the tolist: full = np.array(zip(time, pack_last_axis(uv)), dtype=dt) So maybe that is the way to go. I'm not sure I'd have thought to look for this function, but what can you do? Thanks for your attention to this, -CHB -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From wieser.eric+numpy at gmail.com Fri Jan 26 16:42:24 2018 From: wieser.eric+numpy at gmail.com (Eric Wieser) Date: Fri, 26 Jan 2018 21:42:24 +0000 Subject: [Numpy-discussion] Setting custom dtypes and 1.14 In-Reply-To: References: <66031cd1-b872-203b-bdcc-a4416efbf946@gmail.com> <2c7a541a-dcc0-32f1-b18b-3ca35a34bb71@gmail.com> Message-ID: arr.names should have been arr.dtype.names in that pack_last_axis function Eric ? On Fri, 26 Jan 2018 at 12:45 Chris Barker wrote: > On Fri, Jan 26, 2018 at 10:48 AM, Allan Haldane > wrote: > >> > What do folks think about a totuple() method ? even before this I?ve >> > wanted that. But in this case, it seems particularly useful. >> > > >> Two thoughts: >> > >> 1. `totuple` makes most sense for 2d arrays. But what should it do for >> 1d or 3+d arrays? I suppose it could make the last dimension a tuple, so >> 1d arrays would give a list of tuples of size 1. >> > > I was thinking it would be exactly like .tolist() but with tuples -- so > you'd get tuples all the way down (or is that turtles?) > > IN this use case, it would have saved me the generator expression: > > (tuple(r) for r in arr) > > not a huge deal, but it would be nice to not have to write that, and to > have the looping be in C with no intermediate array generation. > > 2. structured array's .tolist() already returns a list of tuples. If we >> have a 2d structured array, would it add one more layer of tuples? > > > no -- why? it would return a tuple of tuples instead. > > >> That >> would raise an exception if read back in by `np.array` with the same >> dtype. >> > > Hmm -- indeed, if the top-level structure is a tuple, the array > constructor gets confused: > > This works fine -- as it should: > > > In [*84*]: new_full = np.array(full.tolist(), full.dtype) > > > But this does not: > > > In [*85*]: new_full = np.array(tuple(full.tolist()), full.dtype) > > --------------------------------------------------------------------------- > > ValueError Traceback (most recent call > last) > > in () > > ----> 1 new_full = np.array(tuple(full.tolist()), full.dtype) > > > ValueError: could not assign tuple of length 4 to structure with 2 fields. > > I was hoping it would dig down to the inner structures looking for a match > to the dtype, rather than looking at the type of the top level. Oh well. > > So yeah, not sure where you would go from tuple to list -- probably at the > bottom level, but that may not always be unambiguous. > > These points make me think that instead of a `.totuple` method, this >> might be more suitable as a new function in np.lib.recfunctions. > > > I don't seem to have that module -- and I'm running 1.14.0 -- is this a > new idea? > > >> If the >> goal is to help manipulate structured arrays, that submodule is >> appropriate since it already has other functions do manipulate fields in >> similar ways. What about calling it `pack_last_axis`? >> >> def pack_last_axis(arr, names=None): >> if arr.names: >> return arr >> names = names or ['f{}'.format(i) for i in range(arr.shape[-1])] >> return arr.view([(n, arr.dtype) for n in names]).squeeze(-1) >> >> Then you could do: >> >> >>> pack_last_axis(uv).tolist() >> >> to get a list of tuples. 
>> > > not sure what idea is here -- in my example, I had a regular 2-d array, so > no names: > > In [*90*]: pack_last_axis(uv) > > --------------------------------------------------------------------------- > > AttributeError Traceback (most recent call > last) > > in () > > ----> 1 pack_last_axis(uv) > > > in pack_last_axis(arr, names) > > * 1* def pack_last_axis(arr, names=None): > > ----> 2 if arr.names: > > * 3* return arr > > * 4* names = names or ['f{}'.format(i) for i in range(arr.shape[- > 1])] > > * 5* return arr.view([(n, arr.dtype) for n in names]).squeeze(-1) > > > AttributeError: 'numpy.ndarray' object has no attribute 'names' > > > So maybe you meants something like: > > > In [*95*]: *def* pack_last_axis(arr, names=None): > > ...: *try*: > > ...: arr.names > > ...: *return* arr > > ...: *except* *AttributeError*: > > ...: names = names *or* ['f{}'.format(i) *for* i *in* range > (arr.shape[-1])] > > ...: *return* arr.view([(n, arr.dtype) *for* n *in* > names]).squeeze(-1) > > which does work, but seems like a convoluted way to get tuples! > > However, I didn't actually need tuples, I needed something I could pack > into a stuctarray, and this does work, without the tolist: > > full = np.array(zip(time, pack_last_axis(uv)), dtype=dt) > > > So maybe that is the way to go. > > I'm not sure I'd have thought to look for this function, but what can you > do? > > Thanks for your attention to this, > > -CHB > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From allanhaldane at gmail.com Fri Jan 26 17:35:00 2018 From: allanhaldane at gmail.com (Allan Haldane) Date: Fri, 26 Jan 2018 17:35:00 -0500 Subject: [Numpy-discussion] Setting custom dtypes and 1.14 In-Reply-To: References: <66031cd1-b872-203b-bdcc-a4416efbf946@gmail.com> <2c7a541a-dcc0-32f1-b18b-3ca35a34bb71@gmail.com> Message-ID: <89c7a41b-12c2-77b6-a12a-25d64f4cc6b9@gmail.com> On 01/26/2018 03:38 PM, Chris Barker wrote: > I was hoping it would dig down to the inner structures looking for a > match to the dtype, rather than looking at the type of the top level. Oh > well. > > So yeah, not sure where you would go from tuple to list -- probably at > the bottom level, but that may not always be unambiguous. As I remember, numpy has some fairly convoluted code for array creation which tries to make sense of various nested lists/tuples/ndarray combinations. It makes a difference for structured arrays and object arrays. I don't remember the details right now, but I know in some cases the rule is "If it's a Python list, recurse, otherwise assume it is an object array". While numpy does try to be lenient, I think we should guide the user to assume that if they want to specify a structured element, they should only use a tuple or a structured scalar, and if they want to specify a new dimension of elements, they should use a list. I expect less headaches that way. > These points make me think that instead of a `.totuple` method, this > might be more suitable as a new function in np.lib.recfunctions. > > I don't seem to have that module -- and I'm running 1.14.0 -- is this a > new idea? 
Sorry, I didn't specify it correctly. It is "numpy.lib.recfunctions". It is actually quite old, but has never been officially documented. I think that is because it has been considered "provisional" for a long time. See https://github.com/numpy/numpy/issues/5008 https://github.com/numpy/numpy/issues/2805 I still hesitate to make it more official now, since I'm not sure that structured arrays are yet bug-free enough to encourage more complex uses. Also, the functions in that module encourage "pandas-like" use of structured arrays, but I'm not sure they should be used that way. I've been thinking they should be primarily used for binary interfaces with/to numpy, eg to talk to C programs or to read complicated binary files. > However, I didn't actually need tuples, I needed something I could pack > into a stuctarray, and this does work, without the tolist: > > full = np.array(zip(time, pack_last_axis(uv)), dtype=dt) > > > So maybe that is the way to go. Right, that was my feeling: That we didn't really need `.totuple`, what we actually wanted is a special function for packing a nonstructured-array as a structured-array. > I'm not sure I'd have thought to look for this function, but what can > you do? > > Thanks for your attention to this, > > -CHB > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R ? ? ? ? ? ?(206) 526-6959?? voice > 7600 Sand Point Way NE ??(206) 526-6329?? fax > Seattle, WA ?98115 ? ? ??(206) 526-6317?? main reception > > Chris.Barker at noaa.gov > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > From chris.barker at noaa.gov Fri Jan 26 17:48:25 2018 From: chris.barker at noaa.gov (Chris Barker) Date: Fri, 26 Jan 2018 14:48:25 -0800 Subject: [Numpy-discussion] Setting custom dtypes and 1.14 In-Reply-To: <89c7a41b-12c2-77b6-a12a-25d64f4cc6b9@gmail.com> References: <66031cd1-b872-203b-bdcc-a4416efbf946@gmail.com> <2c7a541a-dcc0-32f1-b18b-3ca35a34bb71@gmail.com> <89c7a41b-12c2-77b6-a12a-25d64f4cc6b9@gmail.com> Message-ID: On Fri, Jan 26, 2018 at 2:35 PM, Allan Haldane wrote: > As I remember, numpy has some fairly convoluted code for array creation > which tries to make sense of various nested lists/tuples/ndarray > combinations. It makes a difference for structured arrays and object > arrays. I don't remember the details right now, but I know in some cases > the rule is "If it's a Python list, recurse, otherwise assume it is an > object array". > that's at least explainable, and the "try to figure out what the user means" array cratinon is pretty much an impossible problem, so what we've got is probably about as good as it can get. > > These points make me think that instead of a `.totuple` method, this > > might be more suitable as a new function in np.lib.recfunctions. > > > > I don't seem to have that module -- and I'm running 1.14.0 -- is this a > > new idea? > > Sorry, I didn't specify it correctly. It is "numpy.lib.recfunctions". > thanks -- found it. > Also, the functions in that module encourage "pandas-like" use of > structured arrays, but I'm not sure they should be used that way. I've > been thinking they should be primarily used for binary interfaces > with/to numpy, eg to talk to C programs or to read complicated binary > files. > that's my use-case. And I agree -- if you really want to do that kind of thing, pandas is the way to go. 
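As a concrete illustration of that data-exchange use (a sketch only: the struct, field names, and file name are invented for the example), a structured dtype with align=True can mirror the member padding a typical C compiler would add:

    import numpy as np

    # dtype mirroring a hypothetical C struct:
    #     struct rec { uint8_t flag; double x, y; };
    # align=True adds the 7 pad bytes after 'flag' that a typical compiler
    # would, so the offsets become 0, 8, 16 and the itemsize 24
    rec_dt = np.dtype([('flag', 'u1'), ('x', 'f8'), ('y', 'f8')], align=True)

    recs = np.fromfile('records.bin', dtype=rec_dt)   # hypothetical file

As noted earlier in the thread, whether align=True reproduces a particular third-party library's layout still has to be checked case by case.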
I thought recarrays were pretty cool back in the day, but pandas is a much better option. So I pretty much only use structured arrays for data exchange with C code.... -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Fri Jan 26 18:01:52 2018 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 26 Jan 2018 18:01:52 -0500 Subject: [Numpy-discussion] Setting custom dtypes and 1.14 In-Reply-To: References: <66031cd1-b872-203b-bdcc-a4416efbf946@gmail.com> <2c7a541a-dcc0-32f1-b18b-3ca35a34bb71@gmail.com> <89c7a41b-12c2-77b6-a12a-25d64f4cc6b9@gmail.com> Message-ID: On Fri, Jan 26, 2018 at 5:48 PM, Chris Barker wrote: > On Fri, Jan 26, 2018 at 2:35 PM, Allan Haldane > wrote: > >> As I remember, numpy has some fairly convoluted code for array creation >> which tries to make sense of various nested lists/tuples/ndarray >> combinations. It makes a difference for structured arrays and object >> arrays. I don't remember the details right now, but I know in some cases >> the rule is "If it's a Python list, recurse, otherwise assume it is an >> object array". >> > > that's at least explainable, and the "try to figure out what the user > means" array cratinon is pretty much an impossible problem, so what we've > got is probably about as good as it can get. > >> > These points make me think that instead of a `.totuple` method, this >> > might be more suitable as a new function in np.lib.recfunctions. >> > >> > I don't seem to have that module -- and I'm running 1.14.0 -- is this a >> > new idea? >> >> Sorry, I didn't specify it correctly. It is "numpy.lib.recfunctions". >> > > thanks -- found it. > > >> Also, the functions in that module encourage "pandas-like" use of >> structured arrays, but I'm not sure they should be used that way. I've >> been thinking they should be primarily used for binary interfaces >> with/to numpy, eg to talk to C programs or to read complicated binary >> files. >> > > that's my use-case. And I agree -- if you really want to do that kind of > thing, pandas is the way to go. > > I thought recarrays were pretty cool back in the day, but pandas is a much > better option. > > So I pretty much only use structured arrays for data exchange with C > code.... > My impression is that this turns into a deprecate recarrays and supporting recfunction issue. recfunctions and the associated function from matplotlib.mlab where explicitly designed for using structured dtypes as dataframe_like. (old question: does numpy have a sort_rows function now without detouring to structured dtype views?) Josef > > -CHB > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... 
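On the sort-rows question: there is still no dedicated sort_rows function, but np.lexsort covers the common case of lexicographic row sorting without any structured-dtype detour. A small sketch:

    import numpy as np

    a = np.array([[3, 1],
                  [1, 2],
                  [1, 0]])

    # lexsort treats the *last* key as the primary one, so pass the columns
    # in reverse order to sort by column 0 first, then column 1, and so on
    order = np.lexsort(a.T[::-1])
    sorted_rows = a[order]        # [[1, 0], [1, 2], [3, 1]]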
URL: From robert.kern at gmail.com Fri Jan 26 19:28:54 2018 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 27 Jan 2018 09:28:54 +0900 Subject: [Numpy-discussion] Moving NumPy's PRNG Forward In-Reply-To: References: Message-ID: On Sat, Jan 27, 2018 at 1:14 AM, Kevin Sheppard wrote: > > I am a firm believer that the current situation is not sustainable. There are a lot of improvements that can practically be incorporated. While many of these are performance related, there are also improvements in accuracy over some ranges of parameters that cannot be incorporated. I also think that perfect stream reproducibility is a bit of a myth across versions since this really would require identical OS, compiler and possibly CPU for some of the generators that produce floats. > > I believe there is a case for separating the random generator from core NumPy. Some points that favor becoming a subproject: > > 1. It is a pure consumer of NumPy API. Other parts of the API do no depend on random. > 2. A stand alone package could be installed along side many different version of core NumPy which would reduce the pressure on freezing the stream. Removing numpy.random (or freezing it as deprecated legacy while all PRNG development moves elsewhere) is probably a non-starter. It's too used for us not to provide something. That said, we can (and ought to) make it much easier for external packages to provide PRNG capabilities (core PRNGs and distributions) that interoperate with the core functionality that numpy provides. I'm also happy to place a high barrier on adding more distributions to numpy.random once that is in place. Specifically, core uniform PRNGs should have a small common C API that distribution functions can use. This might just be a struct with an opaque `void*` state pointer and then 2 function pointers for drawing a uint64 (whole range) and a double in [0,1) from the state. It's important to expose our core uniform PRNGs as a C API because there has been a desire to interoperate at that level, using the same PRNG state inside C or Fortran or GPU code. If that's in place, then people can write new efficient distribution functions in C that use this small C API agnostic to the core PRNG algorithm. It also makes it easy to implement new core PRNGs that the distribution functions provided by numpy.random can use. > In terms of what is needed, I think that the underlying PRNG should be swappable. The will provide a simple mechanism to allow certain types of advancement while easily providing backward compat. In the current design this is very hard and requires compiling many nearly identical copies of RandomState. In pseudocode something like > > standard_normal(prng) > > where prng is a basic class that retains the PRNG state and has a small set of core random number generators that belong to the underlying PRNG -- probably something like int32, int64, double, and possibly int53. I am not advocating explicitly passing the PRNG as an argument, but having generators which can take any suitable PRNG would add a lot of flexibility in terms of taking advantage of improvements in the underlying PRNGs (see, e.g., xoroshiro128/xorshift1024). The "small" core PRNG would have responsibility over state and streams. The remainder of the module would transform the underlying PRNG into the required distributions. 
(edit: after writing the following verbiage, I realize it can be summed up with more respect to your suggestion: yes, we should do this design, but we don't need to and shouldn't give up on a class with distribution methods.) Once the core PRNG C API is in place, I don't think we necessarily need to move away from a class structure per se, though it becomes an option. We just separate the core PRNG object from the distribution-providing class. We don't need to make copies of the distribution-providing class just to use a new core PRNG. I'm coming around to Nathaniel's suggestion for the constructor API (though not the distribution-versioning, for reasons I can get into later). We have a couple of core uniform PRNG classes like `MT19937` and `PCG128`. Those have a tiny API, and probably don't have a lot of unnecessary code clones between them. Their constructors can be different depending on the different ways they can be instantiated, depending on the PRNG's features. I'm not sure that they'll have any common methods besides `__getstate__/__setstate__` and probably a `copy()`. They will expose their C API as a Python-opaque attribute. They can have whatever algorithm-dependent methods they need (e.g. to support jumpahead). I might not even expose to Python the uint64 and U(0,1) double sampling methods, but maybe so. Then we have a single `Distributions` class that provides all of the distributions that we want to support in numpy.random (i.e. what we currently have on `RandomState` and whatever passes our higher bar in the future). It takes one of the core PRNG instances as an argument to the constructor (nominally, at least; we can design factory functions to make this more convenient). prng = Distributions(PCG128(seed)) x = prng.normal(mean, std) If someone wants to write a WELL512 core PRNG, they can just implement that object and pass it to `Distributions()`. The `Distributions` code doesn't need to be copied, nor do we need to much around with `__new__` tricks in Cython. Why do this instead of distribution functions? Well, I have a damn lot of code that is expecting an object with at least the broad outlines of the `RandomState` interface. I'm not going to update that code to use functions instead. And if I'm not, no one is. There isn't a good transition path since the PRNG object needs to thread through all of the code, whether it's my code that I'm writing greenfield or library code that I don't control. That said, people writing new distributions outside of numpy.random would be writing functions, not trying to add to the `Distributions` class, but that's fine. It also allows the possibility for the `Distributions` class to be stateful if we want to do things like caching the next Box-Muller variate and not force that onto the core PRNG state like I currently do. Though I'd rather just drop Box-Muller, and that's not a common pattern outside of Box-Muller. But it's a possibility. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... 
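A pure-Python sketch of the split being described, just to make the shape of the API concrete (the method names next_uint64/next_double are illustrative stand-ins; the real core PRNGs would live in C/Cython and expose the small C API discussed above):

    import math
    import random

    class MT19937:
        # core PRNG sketch: owns the state, exposes only a tiny sampling API
        def __init__(self, seed=None):
            self._state = random.Random(seed)   # stand-in for the real state

        def next_uint64(self):
            return self._state.getrandbits(64)

        def next_double(self):
            # uniform double in [0, 1) from the top 53 bits of a uint64
            return (self.next_uint64() >> 11) * (2.0 ** -53)

    class Distributions:
        # distribution methods written against the small core API only, so any
        # core PRNG object providing the same two methods can be plugged in
        def __init__(self, core_prng):
            self._core = core_prng

        def random_sample(self):
            return self._core.next_double()

        def normal(self, loc=0.0, scale=1.0):
            # Box-Muller, purely as a placeholder for whatever method is chosen
            u1 = max(self._core.next_double(), 1e-300)   # avoid log(0)
            u2 = self._core.next_double()
            return loc + scale * math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)

    prng = Distributions(MT19937(12345))
    x = prng.normal(0.0, 1.0)

Swapping in a different core generator then only means passing a different object to Distributions(), which is the point of the design.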
URL: From solarjoe at posteo.org Sat Jan 27 04:30:47 2018 From: solarjoe at posteo.org (Joe) Date: Sat, 27 Jan 2018 10:30:47 +0100 Subject: [Numpy-discussion] Using np.frombuffer and cffi.buffer on array of C structs (problem with struct member padding) In-Reply-To: <88dbefc1-ab01-8efe-420c-102e13e3aaed@gmail.com> References: <87po6e234f.fsf@gmail.com> <878td2ny6m.fsf@otaria.sebmel.org> <8bd9f09ce2eb55140bc65775201ed47c@posteo.de> <88dbefc1-ab01-8efe-420c-102e13e3aaed@gmail.com> Message-ID: Thanks for your help on this! This solved my issue. Am 25.01.2018 um 19:01 schrieb Allan Haldane: > There is a new section discussing alignment in the numpy 1.14 structured > array docs, which has some hints about interfacing with C structs. > > These new 1.14 docs are not online yet on scipy.org, but in the meantime > you can view them here: > https://ahaldane.github.io/user/basics.rec.html#automatic-byte-offsets-and-alignment > > (That links specifically to the discussion of alignments and padding). > > Allan > > On 01/25/2018 11:33 AM, Chris Barker - NOAA Federal wrote: >> >>> >>> The numpy dtype constructor takes an ?align? keyword that will pad it >>> for you. >> >> https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.dtype.html >> >> -CHB >> >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > From josef.pktd at gmail.com Sat Jan 27 22:33:55 2018 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 27 Jan 2018 22:33:55 -0500 Subject: [Numpy-discussion] runtime warning in linalg.det Message-ID: I'm trying to figure out some warnings in the statsmodels test suite Why do the following raise RuntimeWarnings? np.linalg.det(np.ones((3,3))) C:\...\python-3.4.4.amd64\lib\site-packages\numpy\linalg\linalg.py:1776: RuntimeWarning: invalid value encountered in det r = _umath_linalg.det(a, signature=signature) Out[21]: 0.0 np.linalg.det(np.zeros((3,3))) C:\...\python-3.4.4.amd64\lib\site-packages\numpy\linalg\linalg.py:1776: RuntimeWarning: invalid value encountered in det r = _umath_linalg.det(a, signature=signature) Out[22]: 0.0 np.__version__ Out[23]: '1.11.2' and is there a way to distinguish those from user problems a = np.ones((3,3)) a[1,1] = np.nan np.linalg.det(a) C:\...\WinPython-64bit-3.4.4.5Qt5\python-3.4.4.amd64\lib\site-packages\numpy\linalg\linalg.py:1776: RuntimeWarning: invalid value encountered in det r = _umath_linalg.det(a, signature=signature) Out[26]: nan Josef -------------- next part -------------- An HTML attachment was scrubbed... 
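One way to tell the two cases apart (a sketch, not an existing numpy helper): check the input for non-finite values yourself, and silence the spurious floating-point warning that det emits for exactly singular input with np.errstate:

    import numpy as np

    def safe_det(a):
        a = np.asarray(a)
        if not np.isfinite(a).all():
            # the "user problem" case: nan or inf in the input
            raise ValueError("input contains nan or inf")
        # det on an exactly singular matrix (like np.ones((3, 3)) above) can
        # trip the 'invalid value' warning even though the 0.0 result is fine,
        # so ignore just that warning around the call
        with np.errstate(invalid='ignore'):
            return np.linalg.det(a)

    safe_det(np.ones((3, 3)))    # 0.0, without the RuntimeWarning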
URL: From allanhaldane at gmail.com Sat Jan 27 23:48:34 2018 From: allanhaldane at gmail.com (Allan Haldane) Date: Sat, 27 Jan 2018 23:48:34 -0500 Subject: [Numpy-discussion] Multiple-field indexing: view vs copy in 1.14+ In-Reply-To: References: <196d4e06-5d46-aa58-cb8f-7a033c30aaed@gmail.com> <20180125181631.l5cxygykjshpp44f@fastmail.com> Message-ID: On 01/25/2018 03:56 PM, josef.pktd at gmail.com wrote: > > > On Thu, Jan 25, 2018 at 1:49 PM, Marten van Kerkwijk > > wrote: > > On Thu, Jan 25, 2018 at 1:16 PM, Stefan van der Walt > > wrote: > > On Mon, 22 Jan 2018 10:11:08 -0500, Marten van Kerkwijk wrote: > >> > >> I think on the consistency argument is perhaps the most important: > >> views are very powerful and in many ways one *counts* on them > >> happening, especially in working with large arrays. > > > > > > I had the same gut feeling, but the fancy indexing example made me > > pause: > > > > In [9]: x = np.arange(12, dtype=float).reshape((3, 4)) > > > > In [10]: p = x[[0, 1]]? # copy of data > > > > Then: > > > > In [11]: x = np.array([(0, 1), (2, 3)], dtype=[('a', int), ('b', int)]) > > > > In [12]: p = x[['a', 'b']]? # copy of data, but proposal will change that > > > What does this do? > p = x[['a', 'b']].copy() In 1.14.0 this creates an exact copy of what was returned by `x[['a', 'b']]`, including any padding bytes. > My impression is that the problems with the view are because the padded > view doesn't behave like a "standard" dtype or array, i.e. the follow-up > behavior is the problematic part. I think the padded view is a "standard array" in the sense that you can easily create structured arrays with padding bytes, for example by using the `align=True` options. >>> np.zeros(3, dtype=np.dtype('u1,f4', align=True)) array([(0, 0.), (0, 0.), (0, 0.)], dtype={'names':['f0','f1'], 'formats':['u1','>> np.zeros(3, dtype='u1,u1,u1,u1,f4')[['f0', 'f4']] array([(0, 0.), (0, 0.), (0, 0.)], dtype={'names':['f0','f4'], 'formats':['u1',' Josef > > > > > We're not doing the same kind of indexing here exactly (in one case we > > grab elements, in the other parts of elements), but the view behavior > > may still break the "mental expectation". > > A bit off-topic, but maybe this is another argument to just allow > `x['a', 'b']` -- I never understood why a tuple was not the > appropriate iterable for getting multiple items from a record. > > -- Marten > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > From allanhaldane at gmail.com Sat Jan 27 23:50:58 2018 From: allanhaldane at gmail.com (Allan Haldane) Date: Sat, 27 Jan 2018 23:50:58 -0500 Subject: [Numpy-discussion] Setting custom dtypes and 1.14 In-Reply-To: References: <66031cd1-b872-203b-bdcc-a4416efbf946@gmail.com> <2c7a541a-dcc0-32f1-b18b-3ca35a34bb71@gmail.com> <89c7a41b-12c2-77b6-a12a-25d64f4cc6b9@gmail.com> Message-ID: <388385e2-163b-ad2e-6bf6-1ac68a709278@gmail.com> On 01/26/2018 06:01 PM, josef.pktd at gmail.com wrote: > I thought recarrays were pretty cool back in the day, but pandas is > a much better option. > > So I pretty much only use structured arrays for data exchange with C > code.... > > My impression is that this turns into a deprecate recarrays and > supporting recfunction issue. 
> > recfunctions and the associated function from matplotlib.mlab where > explicitly designed for using structured dtypes as dataframe_like > > (old question: does numpy have a sort_rows function now without > detouring to structured dtype views?) No, that's still the way to do it. *should* we have any dataframe-like functionality in numpy? We get requests every once in a while about how to sort rows, or about adding a "groupby" function. I myself have used recarrays in a dataframe-like way, when I wanted a quick multiple-array object that supported numpy indexing. So there is some demand to have minimal "dataframe-like" behavior in numpy itself. recarrays play part of this role currently, though imperfectly due to padding and cache issues. I think I'm comfortable with supporting some minor use of structured/recarrays as dataframe-like, with a warning in docs that the user should really look at pandas/xarray, and that structured arrays are primarily for data exchange. (If we want to dream, maybe one day we should make a minimal multiple-array container class. I imagine it would look pretty similar to recarray, but stored as a set of arrays instead of a structured array. But maybe recarrays are good enough, and let's not reimplement pandas either.) Allan > Josef > > > > -CHB > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 ?? voice > 7600 Sand Point Way NE (206) 526-6329 ?? fax > Seattle, WA ?98115 (206) 526-6317 ?? main > reception > > Chris.Barker at noaa.gov From mathoscope at netcourrier.com Sun Jan 28 01:07:30 2018 From: mathoscope at netcourrier.com (Vincent Douce Mathoscope) Date: Sun, 28 Jan 2018 07:07:30 +0100 Subject: [Numpy-discussion] linspaces Message-ID: hi i have 2 questions - i have a parametric plot : T=np.linespace(things) X=f(T)*np.cos(T) Y=f(T)*np.sin(T) i would like to get the first and the last point of this polar curve in the specs of "linspace" (https://docs.scipy.org/doc/numpy/reference/generated/numpy.linspace.html ) i dont see how to do that automatically is there something simple like "first(X)" and "last(X)" - i have a function : rotation(vect,theta) which takes vect=[a,b] and theta-rotates it by a matrix how to use this function or such a function to rotate a tuple (X,Y) with X,Y as described above in this message ? ?????????????????????????? Vincent Douce :=: Mathoscope :=: http://mathoscope.xyz 06?13?11?07?26 Bagn?res de Bigorre 65200 -------------- next part -------------- An HTML attachment was scrubbed... URL: From pmhobson at gmail.com Sun Jan 28 01:28:06 2018 From: pmhobson at gmail.com (Paul Hobson) Date: Sat, 27 Jan 2018 22:28:06 -0800 Subject: [Numpy-discussion] linspaces In-Reply-To: References: Message-ID: I'm not sure I understand the question, but in python, you can get the first and elements of most types of sequences with e.g., X[0] and Y[-1], respectively. Is the second part of your question just a dot product? 
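To make the reply above concrete, a small sketch covering both questions, with f(T) taken as 1 and an arbitrary rotation angle purely for illustration:

    import numpy as np

    T = np.linspace(0, 2 * np.pi, 100)
    X = np.cos(T)                  # stands in for f(T) * np.cos(T)
    Y = np.sin(T)                  # stands in for f(T) * np.sin(T)

    first_point = (X[0], Y[0])     # first point of the curve
    last_point = (X[-1], Y[-1])    # last point of the curve

    # rotating every (x, y) pair by theta is a single matrix product
    theta = np.pi / 4
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    Xr, Yr = np.dot(R, np.vstack((X, Y)))
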
On Sat, Jan 27, 2018 at 10:07 PM, Vincent Douce Mathoscope < mathoscope at netcourrier.com> wrote: > hi > i have 2 questions > > - i have a parametric plot : > T=np.linespace(things) > X=f(T)*np.cos(T) > Y=f(T)*np.sin(T) > i would like to get the first and the last point of this polar curve > in the specs of "linspace" (https://docs.scipy.org/doc/ > numpy/reference/generated/numpy.linspace.html) i dont see how to do that > automatically > is there something simple like "first(X)" and "last(X)" > > - i have a function : > rotation(vect,theta) which takes vect=[a,b] and theta-rotates it by a > matrix > how to use this function or such a function to rotate a tuple (X,Y) with > X,Y as described above in this message ? > > ?????????????????????????? > Vincent Douce > :=: Mathoscope :=: > http://mathoscope.xyz > 06?13?11?07?26 > Bagn?res de Bigorre 65200 > > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Mon Jan 29 12:38:02 2018 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 29 Jan 2018 09:38:02 -0800 Subject: [Numpy-discussion] Setting custom dtypes and 1.14 In-Reply-To: <388385e2-163b-ad2e-6bf6-1ac68a709278@gmail.com> References: <66031cd1-b872-203b-bdcc-a4416efbf946@gmail.com> <2c7a541a-dcc0-32f1-b18b-3ca35a34bb71@gmail.com> <89c7a41b-12c2-77b6-a12a-25d64f4cc6b9@gmail.com> <388385e2-163b-ad2e-6bf6-1ac68a709278@gmail.com> Message-ID: On Sat, Jan 27, 2018 at 8:50 PM, Allan Haldane wrote: > On 01/26/2018 06:01 PM, josef.pktd at gmail.com wrote: > >> I thought recarrays were pretty cool back in the day, but pandas is >> a much better option. >> >> So I pretty much only use structured arrays for data exchange with C >> code.... >> >> My impression is that this turns into a deprecate recarrays and >> supporting recfunction issue. >> >> > *should* we have any dataframe-like functionality in numpy? > > We get requests every once in a while about how to sort rows, or about > adding a "groupby" function. I myself have used recarrays in a > dataframe-like way, when I wanted a quick multiple-array object that > supported numpy indexing. So there is some demand to have minimal > "dataframe-like" behavior in numpy itself. > > recarrays play part of this role currently, though imperfectly due to > padding and cache issues. I think I'm comfortable with supporting some > minor use of structured/recarrays as dataframe-like, with a warning in docs > that the user should really look at pandas/xarray, and that structured > arrays are primarily for data exchange. > Well, I think we should either: deprecate recarrays -- i.e. explicitly not support DataFrame-like functionality in numpy, keeping only the data-exchange functionality as maintained. or Properly support it -- which doesn't mean re-implementing Pandas or xarray, but it would mean addressing any bug-like issues like not dealing properly with padding. Personally, I don't need/want it enough to contribute, but if someone does, great. This reminds me a bit of the old numpy.Matrix issue -- it was ALMOST there, but not quite, with issues, and there was essentially no overlap between the people that wanted it and the people that had the time and skills to really make it work. (If we want to dream, maybe one day we should make a minimal multiple-array > container class. 
I imagine it would look pretty similar to recarray, but > stored as a set of arrays instead of a structured array. But maybe > recarrays are good enough, and let's not reimplement pandas either.) > Exactly -- we really don't need to re-implement Pandas.... (except it's CSV reading capability :-) ) -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From wieser.eric+numpy at gmail.com Mon Jan 29 13:22:12 2018 From: wieser.eric+numpy at gmail.com (Eric Wieser) Date: Mon, 29 Jan 2018 18:22:12 +0000 Subject: [Numpy-discussion] Setting custom dtypes and 1.14 In-Reply-To: References: <66031cd1-b872-203b-bdcc-a4416efbf946@gmail.com> <2c7a541a-dcc0-32f1-b18b-3ca35a34bb71@gmail.com> <89c7a41b-12c2-77b6-a12a-25d64f4cc6b9@gmail.com> <388385e2-163b-ad2e-6bf6-1ac68a709278@gmail.com> Message-ID: I think that there's a lot of confusion going around about recarrays vs structured arrays. [`recarray`]( https://github.com/numpy/numpy/blob/v1.13.0/numpy/core/records.py) are a wrapper around structured arrays that provide: * Attribute access to fields as `arr.field` in addition to the normal `arr['field']` * Automatic datatype-guessing for nested lists of tuples (which needs a little work, but seems like a justifiable feature) * An undocumented `field` method that behaves like the 1.14 indexing behavior (!) Meanwhile, `recfunctions` is a collection of functions that work on normal structured arrays - so is misleadingly named. The only link to recarrays is that most of the functions have a `asrecarray` parameter which applies `.view(recarray)` to the result. > deprecate recarrays Given how thin an abstraction they are over structured arrays, I don't think you mean this. Are you advocating for deprecating structured arrays entirely, or just deprecating recfunctions? Eric On Mon, 29 Jan 2018 at 09:39 Chris Barker wrote: > On Sat, Jan 27, 2018 at 8:50 PM, Allan Haldane > wrote: > >> On 01/26/2018 06:01 PM, josef.pktd at gmail.com wrote: >> >>> I thought recarrays were pretty cool back in the day, but pandas is >>> a much better option. >>> >>> So I pretty much only use structured arrays for data exchange with C >>> code.... >>> >>> My impression is that this turns into a deprecate recarrays and >>> supporting recfunction issue. >>> >>> > >> *should* we have any dataframe-like functionality in numpy? > > >> >> We get requests every once in a while about how to sort rows, or about >> adding a "groupby" function. I myself have used recarrays in a >> dataframe-like way, when I wanted a quick multiple-array object that >> supported numpy indexing. So there is some demand to have minimal >> "dataframe-like" behavior in numpy itself. >> >> recarrays play part of this role currently, though imperfectly due to >> padding and cache issues. I think I'm comfortable with supporting some >> minor use of structured/recarrays as dataframe-like, with a warning in docs >> that the user should really look at pandas/xarray, and that structured >> arrays are primarily for data exchange. >> > > Well, I think we should either: > > deprecate recarrays -- i.e. explicitly not support DataFrame-like > functionality in numpy, keeping only the data-exchange functionality as > maintained. 
> > or > > Properly support it -- which doesn't mean re-implementing Pandas or > xarray, but it would mean addressing any bug-like issues like not dealing > properly with padding. > > Personally, I don't need/want it enough to contribute, but if someone > does, great. > > This reminds me a bit of the old numpy.Matrix issue -- it was ALMOST > there, but not quite, with issues, and there was essentially no overlap > between the people that wanted it and the people that had the time and > skills to really make it work. > > (If we want to dream, maybe one day we should make a minimal >> multiple-array container class. I imagine it would look pretty similar to >> recarray, but stored as a set of arrays instead of a structured array. But >> maybe recarrays are good enough, and let's not reimplement pandas either.) >> > > Exactly -- we really don't need to re-implement Pandas.... > > (except it's CSV reading capability :-) ) > > -CHB > > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon Jan 29 14:10:56 2018 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 29 Jan 2018 14:10:56 -0500 Subject: [Numpy-discussion] Setting custom dtypes and 1.14 In-Reply-To: References: <66031cd1-b872-203b-bdcc-a4416efbf946@gmail.com> <2c7a541a-dcc0-32f1-b18b-3ca35a34bb71@gmail.com> <89c7a41b-12c2-77b6-a12a-25d64f4cc6b9@gmail.com> <388385e2-163b-ad2e-6bf6-1ac68a709278@gmail.com> Message-ID: On Mon, Jan 29, 2018 at 1:22 PM, Eric Wieser wrote: > I think that there's a lot of confusion going around about recarrays vs > structured arrays. > > [`recarray`](https://github.com/numpy/numpy/blob/v1.13.0/ > numpy/core/records.py) are a wrapper around structured arrays that > provide: > * Attribute access to fields as `arr.field` in addition to the normal > `arr['field']` > * Automatic datatype-guessing for nested lists of tuples (which needs a > little work, but seems like a justifiable feature) > * An undocumented `field` method that behaves like the 1.14 indexing > behavior (!) > > Meanwhile, `recfunctions` is a collection of functions that work on normal > structured arrays - so is misleadingly named. > The only link to recarrays is that most of the functions have a > `asrecarray` parameter which applies `.view(recarray)` to the result. > > > deprecate recarrays > > Given how thin an abstraction they are over structured arrays, I don't > think you mean this. > Are you advocating for deprecating structured arrays entirely, or just > deprecating recfunctions? > First, statsmodels is in the pandas camp for dataframes, so I don't have any invested interest in recarrays/structured dtypes anymore. What I meant was that structured dtypes with implicit (hidden?) padding becomes unintuitive for the recarray/dataframe usecase. (At least I won't try to update my intuition about having extra things in there that are not specified by the main structured dtype.) Also the dataframe_like usage of structured dtypes doesn't seem to be much under consideration anymore. 
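To illustrate the kind of hidden padding in question, a tiny sketch that reproduces the 1.14 multi-field indexing behaviour described earlier in the thread (field names are made up):

    import numpy as np

    a = np.zeros(3, dtype=[('a', 'u1'), ('b', 'u1'), ('c', 'f8')])
    a.dtype.itemsize      # 10, packed

    sub = a[['a', 'c']]   # multi-field index, a view in 1.14
    sub.dtype.itemsize    # still 10: the bytes of 'b' are carried along as padding
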
So, my **impression** is that the recent changes make the recarray/dataframe usecase for structured dtypes more difficult. Given that there is pandas, xarray, dask and more, numpy could as well drop any pretense of supporting dataframe_likes. Or, adjust the recfunctions so we can still work dataframe_like with structured dtypes/recarrays/recfunctions. Josef > > Eric > > On Mon, 29 Jan 2018 at 09:39 Chris Barker wrote: > >> On Sat, Jan 27, 2018 at 8:50 PM, Allan Haldane >> wrote: >> >>> On 01/26/2018 06:01 PM, josef.pktd at gmail.com wrote: >>> >>>> I thought recarrays were pretty cool back in the day, but pandas is >>>> a much better option. >>>> >>>> So I pretty much only use structured arrays for data exchange with C >>>> code.... >>>> >>>> My impression is that this turns into a deprecate recarrays and >>>> supporting recfunction issue. >>>> >>>> >> >>> *should* we have any dataframe-like functionality in numpy? >> >> >>> >>> We get requests every once in a while about how to sort rows, or about >>> adding a "groupby" function. I myself have used recarrays in a >>> dataframe-like way, when I wanted a quick multiple-array object that >>> supported numpy indexing. So there is some demand to have minimal >>> "dataframe-like" behavior in numpy itself. >>> >>> recarrays play part of this role currently, though imperfectly due to >>> padding and cache issues. I think I'm comfortable with supporting some >>> minor use of structured/recarrays as dataframe-like, with a warning in docs >>> that the user should really look at pandas/xarray, and that structured >>> arrays are primarily for data exchange. >>> >> >> Well, I think we should either: >> >> deprecate recarrays -- i.e. explicitly not support DataFrame-like >> functionality in numpy, keeping only the data-exchange functionality as >> maintained. >> >> or >> >> Properly support it -- which doesn't mean re-implementing Pandas or >> xarray, but it would mean addressing any bug-like issues like not dealing >> properly with padding. >> >> Personally, I don't need/want it enough to contribute, but if someone >> does, great. >> >> This reminds me a bit of the old numpy.Matrix issue -- it was ALMOST >> there, but not quite, with issues, and there was essentially no overlap >> between the people that wanted it and the people that had the time and >> skills to really make it work. >> >> (If we want to dream, maybe one day we should make a minimal >>> multiple-array container class. I imagine it would look pretty similar to >>> recarray, but stored as a set of arrays instead of a structured array. But >>> maybe recarrays are good enough, and let's not reimplement pandas either.) >>> >> >> Exactly -- we really don't need to re-implement Pandas.... >> >> (except it's CSV reading capability :-) ) >> >> -CHB >> >> >> -- >> >> Christopher Barker, Ph.D. >> Oceanographer >> >> Emergency Response Division >> NOAA/NOS/OR&R (206) 526-6959 voice >> 7600 Sand Point Way NE (206) 526-6329 fax >> Seattle, WA 98115 (206) 526-6317 main reception >> >> Chris.Barker at noaa.gov >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... 
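For reference, the recfunctions route mentioned above already covers some of the basic dataframe-like operations; a minimal sketch with made-up fields:

    import numpy as np
    from numpy.lib import recfunctions as rfn

    a = np.array([(1, 2.0), (3, 4.0)], dtype=[('x', 'i8'), ('y', 'f8')])

    b = rfn.append_fields(a, 'z', a['y'] * 2, usemask=False)  # add a "column"
    c = rfn.drop_fields(b, 'x', usemask=False)                # drop a "column"
    c = np.sort(c, order='y')                                 # sort rows by a field
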
URL: From stefanv at berkeley.edu Mon Jan 29 14:55:48 2018 From: stefanv at berkeley.edu (Stefan van der Walt) Date: Mon, 29 Jan 2018 11:55:48 -0800 Subject: [Numpy-discussion] Setting custom dtypes and 1.14 In-Reply-To: References: <2c7a541a-dcc0-32f1-b18b-3ca35a34bb71@gmail.com> <89c7a41b-12c2-77b6-a12a-25d64f4cc6b9@gmail.com> <388385e2-163b-ad2e-6bf6-1ac68a709278@gmail.com> Message-ID: <20180129195548.dtopwmanutp6ly4u@fastmail.com> On Mon, 29 Jan 2018 14:10:56 -0500, josef.pktd at gmail.com wrote: >Given that there is pandas, xarray, dask and more, numpy could as well drop >any pretense of supporting dataframe_likes. Or, adjust the recfunctions so >we can still work dataframe_like with structured >dtypes/recarrays/recfunctions. I haven't been following the duckarray discussion carefully, but could this be an opportunity for a dataframe protocol, so that we can have libraries ingest structured arrays, record arrays, pandas dataframes, etc. without too much specialized code? St?fan From josef.pktd at gmail.com Mon Jan 29 15:24:43 2018 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 29 Jan 2018 15:24:43 -0500 Subject: [Numpy-discussion] Setting custom dtypes and 1.14 In-Reply-To: <20180129195548.dtopwmanutp6ly4u@fastmail.com> References: <2c7a541a-dcc0-32f1-b18b-3ca35a34bb71@gmail.com> <89c7a41b-12c2-77b6-a12a-25d64f4cc6b9@gmail.com> <388385e2-163b-ad2e-6bf6-1ac68a709278@gmail.com> <20180129195548.dtopwmanutp6ly4u@fastmail.com> Message-ID: On Mon, Jan 29, 2018 at 2:55 PM, Stefan van der Walt wrote: > On Mon, 29 Jan 2018 14:10:56 -0500, josef.pktd at gmail.com wrote: > >> Given that there is pandas, xarray, dask and more, numpy could as well >> drop >> any pretense of supporting dataframe_likes. Or, adjust the recfunctions so >> we can still work dataframe_like with structured >> dtypes/recarrays/recfunctions. >> > > I haven't been following the duckarray discussion carefully, but could > this be an opportunity for a dataframe protocol, so that we can have > libraries ingest structured arrays, record arrays, pandas dataframes, > etc. without too much specialized code? > AFAIU while not being in the data handling area, pandas defines the interface and other libraries provide pandas compatible interfaces or implementations. statsmodels currently still has recarray support and usage. In some interfaces we support pandas, recarrays and plain arrays, or anything where asarray works correctly. But recarrays became messy to support, one rewrite of some functions last year converts recarrays to pandas, does the manipulation and then converts back to recarrays. Also we need to adjust our recarray usage with new numpy versions. But there is no real benefit because I doubt that statsmodels still has any recarray/structured dtype users. So, we only have to remove our own uses in the datasets and unit tests. Josef > > St?fan > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
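A bare-bones sketch of the round trip josef describes above (field names and values are made up):

    import numpy as np
    import pandas as pd

    rec = np.rec.array([(1, 2.0), (3, 4.0)],
                       dtype=[('a', 'i8'), ('b', 'f8')])

    df = pd.DataFrame.from_records(rec)     # recarray -> DataFrame
    df['b'] = df['b'] * 2                   # do the manipulation in pandas
    rec_back = df.to_records(index=False)   # DataFrame -> record array
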
URL: From pierre.debuyl at chem.kuleuven.be Mon Jan 29 15:39:15 2018 From: pierre.debuyl at chem.kuleuven.be (Pierre de Buyl) Date: Mon, 29 Jan 2018 21:39:15 +0100 Subject: [Numpy-discussion] Moving NumPy's PRNG Forward In-Reply-To: References: Message-ID: <20180129203915.GJ1201@pi-x230> Hello, On Sat, Jan 27, 2018 at 09:28:54AM +0900, Robert Kern wrote: > On Sat, Jan 27, 2018 at 1:14 AM, Kevin Sheppard > wrote: > > > > In terms of what is needed, I think that the underlying PRNG should > be swappable. The will provide a simple mechanism to allow certain > types of advancement while easily providing backward compat. In the > current design this is very hard and requires compiling many nearly > identical copies of RandomState. In pseudocode something like > > > > standard_normal(prng) > > > > where prng is a basic class that retains the PRNG state and has a > small set of core random number generators that belong to the > underlying PRNG -- probably something like int32, int64, double, and > possibly int53. I am not advocating explicitly passing the PRNG as an > argument, but having generators which can take any suitable PRNG would > add a lot of flexibility in terms of taking advantage of improvements > in the underlying PRNGs (see, e.g., xoroshiro128/xorshift1024). The > "small" core PRNG would have responsibility over state and streams. > The remainder of the module would transform the underlying PRNG into > the required distributions. > (edit: after writing the following verbiage, I realize it can be summed > up with more respect to your suggestion: yes, we should do this design, > but we don't need to and shouldn't give up on a class with distribution > methods.) > Once the core PRNG C API is in place, I don't think we necessarily need > to move away from a class structure per se, though it becomes an > option. (Sorry for cutting so much, I have a short question) My typical use case for the C API of NumPy's random features is that I start coding in pure Python and then switch to Cython. I have at least twice in the past resorted to include "randomkit.h" and use that directly. My last work actually implements a Python/Cython interface for rngs, see http://threefry.readthedocs.io/using_from_cython.html The goal is to use exactly the same backend in Python and Cython, with a cimport and a few cdefs the only changes needed for a first port to Cython. Is this type of feature in discussion or in project for the future of numpy.random? Regards, Pierre From ben.v.root at gmail.com Mon Jan 29 15:44:56 2018 From: ben.v.root at gmail.com (Benjamin Root) Date: Mon, 29 Jan 2018 15:44:56 -0500 Subject: [Numpy-discussion] Setting custom dtypes and 1.14 In-Reply-To: References: <2c7a541a-dcc0-32f1-b18b-3ca35a34bb71@gmail.com> <89c7a41b-12c2-77b6-a12a-25d64f4cc6b9@gmail.com> <388385e2-163b-ad2e-6bf6-1ac68a709278@gmail.com> <20180129195548.dtopwmanutp6ly4u@fastmail.com> Message-ID: I <3 structured arrays. I love the fact that I can access data by row and then by fieldname, or vice versa. There are times when I need to pass just a column into a function, and there are times when I need to process things row by row. Yes, pandas is nice if you want the specialized indexing features, but it becomes a bear to deal with if all you want is normal indexing, or even the ability to easily loop over the dataset. Cheers! 
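A quick illustration of the dual row/field access described above (field names are made up):

    import numpy as np

    data = np.array([(1, 2.5, 'a'), (2, 3.5, 'b')],
                    dtype=[('id', 'i8'), ('x', 'f8'), ('tag', 'U1')])

    data['x']         # a whole column, as a plain float array
    data[0]           # a whole row, as a single record
    data[0]['x']      # field of a row ...
    data['x'][0]      # ... or row of a field, same value

    for row in data:  # row-by-row looping works as expected
        print(row['id'], row['x'])
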
Ben Root On Mon, Jan 29, 2018 at 3:24 PM, wrote: > > > On Mon, Jan 29, 2018 at 2:55 PM, Stefan van der Walt > wrote: > >> On Mon, 29 Jan 2018 14:10:56 -0500, josef.pktd at gmail.com wrote: >> >>> Given that there is pandas, xarray, dask and more, numpy could as well >>> drop >>> any pretense of supporting dataframe_likes. Or, adjust the recfunctions >>> so >>> we can still work dataframe_like with structured >>> dtypes/recarrays/recfunctions. >>> >> >> I haven't been following the duckarray discussion carefully, but could >> this be an opportunity for a dataframe protocol, so that we can have >> libraries ingest structured arrays, record arrays, pandas dataframes, >> etc. without too much specialized code? >> > > AFAIU while not being in the data handling area, pandas defines the > interface and other libraries provide pandas compatible interfaces or > implementations. > > statsmodels currently still has recarray support and usage. In some > interfaces we support pandas, recarrays and plain arrays, or anything where > asarray works correctly. > > But recarrays became messy to support, one rewrite of some functions last > year converts recarrays to pandas, does the manipulation and then converts > back to recarrays. > Also we need to adjust our recarray usage with new numpy versions. But > there is no real benefit because I doubt that statsmodels still has any > recarray/structured dtype users. So, we only have to remove our own uses in > the datasets and unit tests. > > Josef > > > >> >> St?fan >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Jeremy.Solbrig at colostate.edu Mon Jan 29 13:16:02 2018 From: Jeremy.Solbrig at colostate.edu (Solbrig,Jeremy) Date: Mon, 29 Jan 2018 18:16:02 +0000 Subject: [Numpy-discussion] f2py bug in numpy v1.12.0 and above? Message-ID: I have a suite of fortran code that I compile with f2py and use as a plugin to a python package. I am using Python v2.7 from Anaconda. When compiled using numpy v1.11.3 or lower, everything works fine, but if I upgrade to any more recent version I begin running into a runtime error. Presumably I am just not linking something that needs to be linked, but I'm not sure what library that would be. The error: ImportError: .../libstddev_kernel.so: undefined symbol: _gfortran_stop_numeric_f08 Where libestddev_kernel.so is built using this command: f2py --fcompiler=gnu95 --quiet --opt="-O3" -L. -L/usr/lib -I. -I.../include -I/usr/include -m libstddev_kernel -c stddev_kernel/stddev_kernel.f90 config.o Is there something additional I should be linking to within numpy or my anaconda installation? Thank you, Jeremy Jeremy Solbrig Research Associate Cooperative Institute for Research in the Atmosphere Colorado State University jeremy.solbrig at colostate.edu (970) 491-8805 -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Mon Jan 29 15:56:12 2018 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 29 Jan 2018 12:56:12 -0800 Subject: [Numpy-discussion] f2py bug in numpy v1.12.0 and above? 
In-Reply-To: References: Message-ID: Hi, On Mon, Jan 29, 2018 at 10:16 AM, Solbrig,Jeremy wrote: > I have a suite of fortran code that I compile with f2py and use as a plugin > to a python package. I am using Python v2.7 from Anaconda. When compiled > using numpy v1.11.3 or lower, everything works fine, but if I upgrade to any > more recent version I begin running into a runtime error. Presumably I am > just not linking something that needs to be linked, but I'm not sure what > library that would be. > > > The error: > > ImportError: .../libstddev_kernel.so: undefined symbol: > _gfortran_stop_numeric_f08 > > Where libestddev_kernel.so is built using this command: > f2py --fcompiler=gnu95 --quiet --opt="-O3" -L. -L/usr/lib -I. -I.../include > -I/usr/include -m libstddev_kernel -c stddev_kernel/stddev_kernel.f90 > config.o > > > Is there something additional I should be linking to within numpy or my > anaconda installation? Do you get the same problem with a pip install of numpy? If not, this may be an Anaconda packaging bug rather than a numpy bug ... Cheers, Matthew From andyfaff at gmail.com Mon Jan 29 16:02:02 2018 From: andyfaff at gmail.com (Andrew Nelson) Date: Tue, 30 Jan 2018 08:02:02 +1100 Subject: [Numpy-discussion] f2py bug in numpy v1.12.0 and above? In-Reply-To: References: Message-ID: Something similar was mentioned at https://github.com/scipy/scipy/issues/8325. -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon Jan 29 16:02:39 2018 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 29 Jan 2018 16:02:39 -0500 Subject: [Numpy-discussion] Setting custom dtypes and 1.14 In-Reply-To: References: <2c7a541a-dcc0-32f1-b18b-3ca35a34bb71@gmail.com> <89c7a41b-12c2-77b6-a12a-25d64f4cc6b9@gmail.com> <388385e2-163b-ad2e-6bf6-1ac68a709278@gmail.com> <20180129195548.dtopwmanutp6ly4u@fastmail.com> Message-ID: On Mon, Jan 29, 2018 at 3:44 PM, Benjamin Root wrote: > I <3 structured arrays. I love the fact that I can access data by row and > then by fieldname, or vice versa. There are times when I need to pass just > a column into a function, and there are times when I need to process things > row by row. Yes, pandas is nice if you want the specialized indexing > features, but it becomes a bear to deal with if all you want is normal > indexing, or even the ability to easily loop over the dataset. > I don't think there is a doubt that structured arrays, arrays with structured dtypes, are a useful container. The question is whether they should be more or the foundation for more. For example, computing a mean, or reduce operation, over numeric element ("columns"). Before padded views it was possible to index by selecting the relevant "columns" and view them as standard array. With padded views that breaks and AFAICS, there is no way in numpy 1.14.0 to compute a mean of some "columns". (I don't have numpy 1.14 to try or find a workaround, like maybe looping over all relevant columns.) Josef > > Cheers! > Ben Root > > On Mon, Jan 29, 2018 at 3:24 PM, wrote: > >> >> >> On Mon, Jan 29, 2018 at 2:55 PM, Stefan van der Walt < >> stefanv at berkeley.edu> wrote: >> >>> On Mon, 29 Jan 2018 14:10:56 -0500, josef.pktd at gmail.com wrote: >>> >>>> Given that there is pandas, xarray, dask and more, numpy could as well >>>> drop >>>> any pretense of supporting dataframe_likes. Or, adjust the recfunctions >>>> so >>>> we can still work dataframe_like with structured >>>> dtypes/recarrays/recfunctions. 
>>>> >>> >>> I haven't been following the duckarray discussion carefully, but could >>> this be an opportunity for a dataframe protocol, so that we can have >>> libraries ingest structured arrays, record arrays, pandas dataframes, >>> etc. without too much specialized code? >>> >> >> AFAIU while not being in the data handling area, pandas defines the >> interface and other libraries provide pandas compatible interfaces or >> implementations. >> >> statsmodels currently still has recarray support and usage. In some >> interfaces we support pandas, recarrays and plain arrays, or anything where >> asarray works correctly. >> >> But recarrays became messy to support, one rewrite of some functions last >> year converts recarrays to pandas, does the manipulation and then converts >> back to recarrays. >> Also we need to adjust our recarray usage with new numpy versions. But >> there is no real benefit because I doubt that statsmodels still has any >> recarray/structured dtype users. So, we only have to remove our own uses in >> the datasets and unit tests. >> >> Josef >> >> >> >>> >>> St?fan >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at python.org >>> https://mail.python.org/mailman/listinfo/numpy-discussion >>> >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From allanhaldane at gmail.com Mon Jan 29 16:11:18 2018 From: allanhaldane at gmail.com (Allan Haldane) Date: Mon, 29 Jan 2018 16:11:18 -0500 Subject: [Numpy-discussion] Setting custom dtypes and 1.14 In-Reply-To: References: <2c7a541a-dcc0-32f1-b18b-3ca35a34bb71@gmail.com> <89c7a41b-12c2-77b6-a12a-25d64f4cc6b9@gmail.com> <388385e2-163b-ad2e-6bf6-1ac68a709278@gmail.com> <20180129195548.dtopwmanutp6ly4u@fastmail.com> Message-ID: On 01/29/2018 04:02 PM, josef.pktd at gmail.com wrote: > > > On Mon, Jan 29, 2018 at 3:44 PM, Benjamin Root > wrote: > > I <3 structured arrays. I love the fact that I can access data by > row and then by fieldname, or vice versa. There are times when I > need to pass just a column into a function, and there are times when > I need to process things row by row. Yes, pandas is nice if you want > the specialized indexing features, but it becomes a bear to deal > with if all you want is normal indexing, or even the ability to > easily loop over the dataset. > > > I don't think there is a doubt that structured arrays, arrays with > structured dtypes, are a useful container. The question is whether they > should be more or the foundation for more. > > For example, computing a mean, or reduce operation, over numeric element > ("columns"). Before padded views it was possible to index by selecting > the relevant "columns" and view them as standard array. With padded > views that breaks and AFAICS, there is no way in numpy 1.14.0 to compute > a mean of some "columns". (I don't have numpy 1.14 to try or find a > workaround, like maybe looping over all relevant columns.) > > Josef Just to clarify, structured types have always had padding bytes, that isn't new. 
What *is* new (which we are pushing to 1.15, I think) is that it may be somewhat more common to end up with padding than before, and only if you are specifically using multi-field indexing, which is a fairly specialized case. I think recfunctions already account properly for padding bytes. Except for the bug in #8100, which we will fix, padding-bytes in recarrays are more or less invisible to a non-expert who only cares about dataframe-like behavior. In other words, padding is no obstacle at all to computing a mean over a column, and single-field indexes in 1.15 behave identically as before. The only thing that will change in 1.15 is multi-field indexing, and it has never been possible to compute a mean (or any binary operation) on multiple fields. Allan > > Cheers! > Ben Root > > On Mon, Jan 29, 2018 at 3:24 PM, > wrote: > > > > On Mon, Jan 29, 2018 at 2:55 PM, Stefan van der Walt > > wrote: > > On Mon, 29 Jan 2018 14:10:56 -0500, josef.pktd at gmail.com > wrote: > > Given that there is pandas, xarray, dask and more, numpy > could as well drop > any pretense of supporting dataframe_likes. Or, adjust > the recfunctions so > we can still work dataframe_like with structured > dtypes/recarrays/recfunctions. > > > I haven't been following the duckarray discussion carefully, > but could > this be an opportunity for a dataframe protocol, so that we > can have > libraries ingest structured arrays, record arrays, pandas > dataframes, > etc. without too much specialized code? > > > AFAIU while not being in the data handling area, pandas defines > the interface and other libraries provide pandas compatible > interfaces or implementations. > > statsmodels currently still has recarray support and usage. In > some interfaces we support pandas, recarrays and plain arrays, > or anything where asarray works correctly. > > But recarrays became messy to support, one rewrite of some > functions last year converts recarrays to pandas, does the > manipulation and then converts back to recarrays. > Also we need to adjust our recarray usage with new numpy > versions. But there is no real benefit because I doubt that > statsmodels still has any recarray/structured dtype users. So, > we only have to remove our own uses in the datasets and unit tests. > > Josef > > ? 
> > > St?fan > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > From robert.kern at gmail.com Mon Jan 29 16:16:06 2018 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 30 Jan 2018 06:16:06 +0900 Subject: [Numpy-discussion] Moving NumPy's PRNG Forward In-Reply-To: <20180129203915.GJ1201@pi-x230> References: <20180129203915.GJ1201@pi-x230> Message-ID: On Tue, Jan 30, 2018 at 5:39 AM, Pierre de Buyl < pierre.debuyl at chem.kuleuven.be> wrote: > > Hello, > > On Sat, Jan 27, 2018 at 09:28:54AM +0900, Robert Kern wrote: > > On Sat, Jan 27, 2018 at 1:14 AM, Kevin Sheppard > > wrote: > > > > > > In terms of what is needed, I think that the underlying PRNG should > > be swappable. The will provide a simple mechanism to allow certain > > types of advancement while easily providing backward compat. In the > > current design this is very hard and requires compiling many nearly > > identical copies of RandomState. In pseudocode something like > > > > > > standard_normal(prng) > > > > > > where prng is a basic class that retains the PRNG state and has a > > small set of core random number generators that belong to the > > underlying PRNG -- probably something like int32, int64, double, and > > possibly int53. I am not advocating explicitly passing the PRNG as an > > argument, but having generators which can take any suitable PRNG would > > add a lot of flexibility in terms of taking advantage of improvements > > in the underlying PRNGs (see, e.g., xoroshiro128/xorshift1024). The > > "small" core PRNG would have responsibility over state and streams. > > The remainder of the module would transform the underlying PRNG into > > the required distributions. > > (edit: after writing the following verbiage, I realize it can be summed > > up with more respect to your suggestion: yes, we should do this design, > > but we don't need to and shouldn't give up on a class with distribution > > methods.) > > Once the core PRNG C API is in place, I don't think we necessarily need > > to move away from a class structure per se, though it becomes an > > option. > > (Sorry for cutting so much, I have a short question) > > My typical use case for the C API of NumPy's random features is that I start > coding in pure Python and then switch to Cython. I have at least twice in the > past resorted to include "randomkit.h" and use that directly. My last work > actually implements a Python/Cython interface for rngs, see > http://threefry.readthedocs.io/using_from_cython.html > > The goal is to use exactly the same backend in Python and Cython, with a cimport > and a few cdefs the only changes needed for a first port to Cython. > > Is this type of feature in discussion or in project for the future of > numpy.random? Sort of, but not really. 
For sure, once we've made the decisions that let us move forward to a new design, we'll have Cython implementations that can be used natively from Cython as well as Python without code changes. *But* it's not going to be an automatic speedup like your threefry library allows. You designed that API such that each of the methods returns a single scalar, so all you need to do is declare your functions `cpdef` and provide a `.pxd`. Our methods return either a scalar or an array depending on the arguments, so the methods will be declared to return `object`, and you will pay the overhead costs for checking the arguments and such. We're not going to change that Python API; we're only considering dropping stream-compatibility, not source-compatibility. I would like to make sure that we do expose a C/Cython API to the distribution functions (i.e. that only draw a single variate and return a scalar), but it's not likely to look exactly like the Python API. There might be clever tricks that we can do to minimize the amount of changes that one needs to do, though, if you are only drawing single variates at a time (e.g. an agent-based simulation) and you want to make it go faster by moving to Cython. For example, maybe we collect all of the single-variate C-implemented methods into a single object sitting as an attribute on the `Distributions` object. cdef class DistributionsCAPI: cdef double normal(double loc, double scale) cdef double uniform(double low, double high) cdef class Distributions: cdef DistributionsCAPI capi cpdef object normal(loc, scale, size=None): if size is None and np.isscalar(loc) and np.isscalar(scale): return self.capi.normal(loc, scale) else: # ... Make an array prng = Distributions(...) # From Python: x = prng.normal(mean, std) # From Cython: cdef double x = prng.capi.normal(mean, std) But we need to make some higher-level decisions first before we can get down to this level of design. Please do jump in and remind us of this use case once we do get down to actual work on the new API design. Thanks! -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Mon Jan 29 16:16:26 2018 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 29 Jan 2018 13:16:26 -0800 Subject: [Numpy-discussion] f2py bug in numpy v1.12.0 and above? In-Reply-To: References: Message-ID: Hi, On Mon, Jan 29, 2018 at 1:02 PM, Andrew Nelson wrote: > Something similar was mentioned at > https://github.com/scipy/scipy/issues/8325. Yes, that one was also for Anaconda... Cheers, Matthew From kevin.k.sheppard at gmail.com Mon Jan 29 16:39:32 2018 From: kevin.k.sheppard at gmail.com (Kevin Sheppard) Date: Mon, 29 Jan 2018 21:39:32 +0000 Subject: [Numpy-discussion] Moving NumPy's PRNG Forward In-Reply-To: References: Message-ID: I agree with pretty much everything you wrote Robert. I didn't have quote the right frame but the generic class that takes a low-level core PRNG sounds like the right design, and this should make user-generated distributions easier to develop. I was thinking along these lines inspired by the SpiPy changes that use a LowLevelCallable, e.g., https://ilovesymposia.com/2017/03/12/scipys-new-lowlevelcallable-is-a-game-changer/ This might also allow users to extend the core PRNGs using something like Numba JIT classes as an alternative. Another area that needs though is how to correctly spawn in Multiprocess application. 
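As a sketch of what the current, admittedly arbitrary practice looks like, each worker builds its own RandomState from a hand-derived seed; the seed scheme here is made up and is exactly the kind of thing such a guide would need to pin down:

    import numpy as np
    from multiprocessing import Pool

    def worker(seed):
        prng = np.random.RandomState(seed)   # one independent state per process
        return prng.standard_normal(5).sum()

    if __name__ == '__main__':
        seeds = [12345 + i for i in range(4)]   # ad hoc per-process seeds
        with Pool(4) as pool:
            results = pool.map(worker, seeds)
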
This might be most easily addressed by providing a guide on a good way rather than the arbitrary way used now. Kevin On Sat, Jan 27, 2018 at 5:03 PM wrote: > Send NumPy-Discussion mailing list submissions to > numpy-discussion at python.org > > To subscribe or unsubscribe via the World Wide Web, visit > https://mail.python.org/mailman/listinfo/numpy-discussion > or, via email, send a message with subject or body 'help' to > numpy-discussion-request at python.org > > You can reach the person managing the list at > numpy-discussion-owner at python.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of NumPy-Discussion digest..." > > > Today's Topics: > > 1. Re: Moving NumPy's PRNG Forward (Robert Kern) > 2. Re: Using np.frombuffer and cffi.buffer on array of C structs > (problem with struct member padding) (Joe) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Sat, 27 Jan 2018 09:28:54 +0900 > From: Robert Kern > To: Discussion of Numerical Python > Subject: Re: [Numpy-discussion] Moving NumPy's PRNG Forward > Message-ID: > w at mail.gmail.com> > Content-Type: text/plain; charset="utf-8" > > On Sat, Jan 27, 2018 at 1:14 AM, Kevin Sheppard < > kevin.k.sheppard at gmail.com> > wrote: > > > > I am a firm believer that the current situation is not sustainable. > There are a lot of improvements that can practically be incorporated. > While many of these are performance related, there are also improvements in > accuracy over some ranges of parameters that cannot be incorporated. I also > think that perfect stream reproducibility is a bit of a myth across > versions since this really would require identical OS, compiler and > possibly CPU for some of the generators that produce floats. > > > > I believe there is a case for separating the random generator from core > NumPy. Some points that favor becoming a subproject: > > > > 1. It is a pure consumer of NumPy API. Other parts of the API do no > depend on random. > > 2. A stand alone package could be installed along side many different > version of core NumPy which would reduce the pressure on freezing the > stream. > > Removing numpy.random (or freezing it as deprecated legacy while all PRNG > development moves elsewhere) is probably a non-starter. It's too used for > us not to provide something. That said, we can (and ought to) make it much > easier for external packages to provide PRNG capabilities (core PRNGs and > distributions) that interoperate with the core functionality that numpy > provides. I'm also happy to place a high barrier on adding more > distributions to numpy.random once that is in place. > > Specifically, core uniform PRNGs should have a small common C API that > distribution functions can use. This might just be a struct with an opaque > `void*` state pointer and then 2 function pointers for drawing a uint64 > (whole range) and a double in [0,1) from the state. It's important to > expose our core uniform PRNGs as a C API because there has been a desire to > interoperate at that level, using the same PRNG state inside C or Fortran > or GPU code. If that's in place, then people can write new efficient > distribution functions in C that use this small C API agnostic to the core > PRNG algorithm. It also makes it easy to implement new core PRNGs that the > distribution functions provided by numpy.random can use. > > > In terms of what is needed, I think that the underlying PRNG should be > swappable. 
The will provide a simple mechanism to allow certain types of > advancement while easily providing backward compat. In the current design > this is very hard and requires compiling many nearly identical copies of > RandomState. In pseudocode something like > > > > standard_normal(prng) > > > > where prng is a basic class that retains the PRNG state and has a small > set of core random number generators that belong to the underlying PRNG -- > probably something like int32, int64, double, and possibly int53. I am not > advocating explicitly passing the PRNG as an argument, but having > generators which can take any suitable PRNG would add a lot of flexibility > in terms of taking advantage of improvements in the underlying PRNGs (see, > e.g., xoroshiro128/xorshift1024). The "small" core PRNG would have > responsibility over state and streams. The remainder of the module would > transform the underlying PRNG into the required distributions. > > (edit: after writing the following verbiage, I realize it can be summed up > with more respect to your suggestion: yes, we should do this design, but we > don't need to and shouldn't give up on a class with distribution methods.) > > Once the core PRNG C API is in place, I don't think we necessarily need to > move away from a class structure per se, though it becomes an option. We > just separate the core PRNG object from the distribution-providing class. > We don't need to make copies of the distribution-providing class just to > use a new core PRNG. I'm coming around to Nathaniel's suggestion for the > constructor API (though not the distribution-versioning, for reasons I can > get into later). We have a couple of core uniform PRNG classes like > `MT19937` and `PCG128`. Those have a tiny API, and probably don't have a > lot of unnecessary code clones between them. Their constructors can be > different depending on the different ways they can be instantiated, > depending on the PRNG's features. I'm not sure that they'll have any common > methods besides `__getstate__/__setstate__` and probably a `copy()`. They > will expose their C API as a Python-opaque attribute. They can have > whatever algorithm-dependent methods they need (e.g. to support jumpahead). > I might not even expose to Python the uint64 and U(0,1) double sampling > methods, but maybe so. > > Then we have a single `Distributions` class that provides all of the > distributions that we want to support in numpy.random (i.e. what we > currently have on `RandomState` and whatever passes our higher bar in the > future). It takes one of the core PRNG instances as an argument to the > constructor (nominally, at least; we can design factory functions to make > this more convenient). > > prng = Distributions(PCG128(seed)) > x = prng.normal(mean, std) > > If someone wants to write a WELL512 core PRNG, they can just implement that > object and pass it to `Distributions()`. The `Distributions` code doesn't > need to be copied, nor do we need to much around with `__new__` tricks in > Cython. > > Why do this instead of distribution functions? Well, I have a damn lot of > code that is expecting an object with at least the broad outlines of the > `RandomState` interface. I'm not going to update that code to use functions > instead. And if I'm not, no one is. There isn't a good transition path > since the PRNG object needs to thread through all of the code, whether it's > my code that I'm writing greenfield or library code that I don't control. 
> That said, people writing new distributions outside of numpy.random would > be writing functions, not trying to add to the `Distributions` class, but > that's fine. > > It also allows the possibility for the `Distributions` class to be stateful > if we want to do things like caching the next Box-Muller variate and not > force that onto the core PRNG state like I currently do. Though I'd rather > just drop Box-Muller, and that's not a common pattern outside of > Box-Muller. But it's a possibility. > > -- > Robert Kern > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: < > http://mail.python.org/pipermail/numpy-discussion/attachments/20180127/c76673e3/attachment-0001.html > > > > ------------------------------ > > Message: 2 > Date: Sat, 27 Jan 2018 10:30:47 +0100 > From: Joe > To: numpy-discussion at python.org > Subject: Re: [Numpy-discussion] Using np.frombuffer and cffi.buffer on > array of C structs (problem with struct member padding) > Message-ID: > Content-Type: text/plain; charset=utf-8; format=flowed > > Thanks for your help on this! This solved my issue. > > > Am 25.01.2018 um 19:01 schrieb Allan Haldane: > > There is a new section discussing alignment in the numpy 1.14 structured > > array docs, which has some hints about interfacing with C structs. > > > > These new 1.14 docs are not online yet on scipy.org, but in the meantime > > you can view them here: > > > https://ahaldane.github.io/user/basics.rec.html#automatic-byte-offsets-and-alignment > > > > (That links specifically to the discussion of alignments and padding). > > > > Allan > > > > On 01/25/2018 11:33 AM, Chris Barker - NOAA Federal wrote: > >> > >>> > >>> The numpy dtype constructor takes an ?align? keyword that will pad it > >>> for you. > >> > >> > https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.dtype.html > >> > >> -CHB > >> > >> > >> > >> _______________________________________________ > >> NumPy-Discussion mailing list > >> NumPy-Discussion at python.org > >> https://mail.python.org/mailman/listinfo/numpy-discussion > >> > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > ------------------------------ > > Subject: Digest Footer > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > > ------------------------------ > > End of NumPy-Discussion Digest, Vol 136, Issue 37 > ************************************************* > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon Jan 29 17:50:20 2018 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 29 Jan 2018 17:50:20 -0500 Subject: [Numpy-discussion] Setting custom dtypes and 1.14 In-Reply-To: References: <2c7a541a-dcc0-32f1-b18b-3ca35a34bb71@gmail.com> <89c7a41b-12c2-77b6-a12a-25d64f4cc6b9@gmail.com> <388385e2-163b-ad2e-6bf6-1ac68a709278@gmail.com> <20180129195548.dtopwmanutp6ly4u@fastmail.com> Message-ID: On Mon, Jan 29, 2018 at 4:11 PM, Allan Haldane wrote: > On 01/29/2018 04:02 PM, josef.pktd at gmail.com wrote: > > > > > > On Mon, Jan 29, 2018 at 3:44 PM, Benjamin Root > > wrote: > > > > I <3 structured arrays. I love the fact that I can access data by > > row and then by fieldname, or vice versa. 
There are times when I > > need to pass just a column into a function, and there are times when > > I need to process things row by row. Yes, pandas is nice if you want > > the specialized indexing features, but it becomes a bear to deal > > with if all you want is normal indexing, or even the ability to > > easily loop over the dataset. > > > > > > I don't think there is a doubt that structured arrays, arrays with > > structured dtypes, are a useful container. The question is whether they > > should be more or the foundation for more. > > > > For example, computing a mean, or reduce operation, over numeric element > > ("columns"). Before padded views it was possible to index by selecting > > the relevant "columns" and view them as standard array. With padded > > views that breaks and AFAICS, there is no way in numpy 1.14.0 to compute > > a mean of some "columns". (I don't have numpy 1.14 to try or find a > > workaround, like maybe looping over all relevant columns.) > > > > Josef > > Just to clarify, structured types have always had padding bytes, that > isn't new. > > What *is* new (which we are pushing to 1.15, I think) is that it may be > somewhat more common to end up with padding than before, and only if you > are specifically using multi-field indexing, which is a fairly > specialized case. > > I think recfunctions already account properly for padding bytes. Except > for the bug in #8100, which we will fix, padding-bytes in recarrays are > more or less invisible to a non-expert who only cares about > dataframe-like behavior. > > In other words, padding is no obstacle at all to computing a mean over a > column, and single-field indexes in 1.15 behave identically as before. > The only thing that will change in 1.15 is multi-field indexing, and it > has never been possible to compute a mean (or any binary operation) on > multiple fields. > from the example in the other thread a[['b', 'c']].view(('f8', 2)).mean(0) (from the statsmodels usecase: read csv with genfromtext to get recarray or structured array select/index the numeric columns view them as standard array do whatever we can do with standard numpy arrays ) Josef > > Allan > > > > > Cheers! > > Ben Root > > > > On Mon, Jan 29, 2018 at 3:24 PM, > > wrote: > > > > > > > > On Mon, Jan 29, 2018 at 2:55 PM, Stefan van der Walt > > > wrote: > > > > On Mon, 29 Jan 2018 14:10:56 -0500, josef.pktd at gmail.com > > wrote: > > > > Given that there is pandas, xarray, dask and more, numpy > > could as well drop > > any pretense of supporting dataframe_likes. Or, adjust > > the recfunctions so > > we can still work dataframe_like with structured > > dtypes/recarrays/recfunctions. > > > > > > I haven't been following the duckarray discussion carefully, > > but could > > this be an opportunity for a dataframe protocol, so that we > > can have > > libraries ingest structured arrays, record arrays, pandas > > dataframes, > > etc. without too much specialized code? > > > > > > AFAIU while not being in the data handling area, pandas defines > > the interface and other libraries provide pandas compatible > > interfaces or implementations. > > > > statsmodels currently still has recarray support and usage. In > > some interfaces we support pandas, recarrays and plain arrays, > > or anything where asarray works correctly. > > > > But recarrays became messy to support, one rewrite of some > > functions last year converts recarrays to pandas, does the > > manipulation and then converts back to recarrays. 
> > Also we need to adjust our recarray usage with new numpy > > versions. But there is no real benefit because I doubt that > > statsmodels still has any recarray/structured dtype users. So, > > we only have to remove our own uses in the datasets and unit > tests. > > > > Josef > > > > > > > > > > St?fan > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org python.org> > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon Jan 29 17:59:29 2018 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 29 Jan 2018 17:59:29 -0500 Subject: [Numpy-discussion] Setting custom dtypes and 1.14 In-Reply-To: References: <2c7a541a-dcc0-32f1-b18b-3ca35a34bb71@gmail.com> <89c7a41b-12c2-77b6-a12a-25d64f4cc6b9@gmail.com> <388385e2-163b-ad2e-6bf6-1ac68a709278@gmail.com> <20180129195548.dtopwmanutp6ly4u@fastmail.com> Message-ID: On Mon, Jan 29, 2018 at 5:50 PM, wrote: > > > On Mon, Jan 29, 2018 at 4:11 PM, Allan Haldane > wrote: > >> On 01/29/2018 04:02 PM, josef.pktd at gmail.com wrote: >> > >> > >> > On Mon, Jan 29, 2018 at 3:44 PM, Benjamin Root > > > wrote: >> > >> > I <3 structured arrays. I love the fact that I can access data by >> > row and then by fieldname, or vice versa. There are times when I >> > need to pass just a column into a function, and there are times when >> > I need to process things row by row. Yes, pandas is nice if you want >> > the specialized indexing features, but it becomes a bear to deal >> > with if all you want is normal indexing, or even the ability to >> > easily loop over the dataset. >> > >> > >> > I don't think there is a doubt that structured arrays, arrays with >> > structured dtypes, are a useful container. The question is whether they >> > should be more or the foundation for more. >> > >> > For example, computing a mean, or reduce operation, over numeric element >> > ("columns"). Before padded views it was possible to index by selecting >> > the relevant "columns" and view them as standard array. With padded >> > views that breaks and AFAICS, there is no way in numpy 1.14.0 to compute >> > a mean of some "columns". (I don't have numpy 1.14 to try or find a >> > workaround, like maybe looping over all relevant columns.) >> > >> > Josef >> >> Just to clarify, structured types have always had padding bytes, that >> isn't new. >> >> What *is* new (which we are pushing to 1.15, I think) is that it may be >> somewhat more common to end up with padding than before, and only if you >> are specifically using multi-field indexing, which is a fairly >> specialized case. 
>> >> I think recfunctions already account properly for padding bytes. Except >> for the bug in #8100, which we will fix, padding-bytes in recarrays are >> more or less invisible to a non-expert who only cares about >> dataframe-like behavior. >> >> In other words, padding is no obstacle at all to computing a mean over a >> column, and single-field indexes in 1.15 behave identically as before. >> The only thing that will change in 1.15 is multi-field indexing, and it >> has never been possible to compute a mean (or any binary operation) on >> multiple fields. >> > > from the example in the other thread > a[['b', 'c']].view(('f8', 2)).mean(0) > > > (from the statsmodels usecase: > read csv with genfromtext to get recarray or structured array > select/index the numeric columns > view them as standard array > do whatever we can do with standard numpy arrays > ) > Or, to phrase it as a question: How do we get a standard array with homogeneous dtype from the corresponding elements of a structured dtype in numpy 1.14.0? Josef > > Josef > > >> >> Allan >> >> > >> > Cheers! >> > Ben Root >> > >> > On Mon, Jan 29, 2018 at 3:24 PM, > > > wrote: >> > >> > >> > >> > On Mon, Jan 29, 2018 at 2:55 PM, Stefan van der Walt >> > > wrote: >> > >> > On Mon, 29 Jan 2018 14:10:56 -0500, josef.pktd at gmail.com >> > wrote: >> > >> > Given that there is pandas, xarray, dask and more, numpy >> > could as well drop >> > any pretense of supporting dataframe_likes. Or, adjust >> > the recfunctions so >> > we can still work dataframe_like with structured >> > dtypes/recarrays/recfunctions. >> > >> > >> > I haven't been following the duckarray discussion carefully, >> > but could >> > this be an opportunity for a dataframe protocol, so that we >> > can have >> > libraries ingest structured arrays, record arrays, pandas >> > dataframes, >> > etc. without too much specialized code? >> > >> > >> > AFAIU while not being in the data handling area, pandas defines >> > the interface and other libraries provide pandas compatible >> > interfaces or implementations. >> > >> > statsmodels currently still has recarray support and usage. In >> > some interfaces we support pandas, recarrays and plain arrays, >> > or anything where asarray works correctly. >> > >> > But recarrays became messy to support, one rewrite of some >> > functions last year converts recarrays to pandas, does the >> > manipulation and then converts back to recarrays. >> > Also we need to adjust our recarray usage with new numpy >> > versions. But there is no real benefit because I doubt that >> > statsmodels still has any recarray/structured dtype users. So, >> > we only have to remove our own uses in the datasets and unit >> tests. 
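One padding-proof way to get at what the question above asks for is to give up on views and copy the selected fields into a plain array; the dtype and field names below are only an example, and this is a workaround, not an official numpy API:

import numpy as np

# a structured dtype with an alignment gap after the int32 field
a = np.zeros(3, dtype=np.dtype([('b', 'i4'), ('c', 'f8')], align=True))
a['b'] = [1, 2, 3]
a['c'] = [1.5, 2.5, 3.5]

# copying field by field does not care about offsets or padding bytes
cols = np.column_stack([a[name].astype('f8') for name in ('b', 'c')])
print(cols.mean(axis=0))        # mean over the selected "columns"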
>> > >> > Josef >> > >> > >> > >> > >> > St?fan >> > >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at python.org > n.org> >> > https://mail.python.org/mailman/listinfo/numpy-discussion >> > >> > >> > >> > >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at python.org > > >> > https://mail.python.org/mailman/listinfo/numpy-discussion >> > >> > >> > >> > >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at python.org >> > https://mail.python.org/mailman/listinfo/numpy-discussion >> > >> > >> > >> > >> > >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at python.org >> > https://mail.python.org/mailman/listinfo/numpy-discussion >> > >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Jeremy.Solbrig at colostate.edu Mon Jan 29 16:09:41 2018 From: Jeremy.Solbrig at colostate.edu (Solbrig,Jeremy) Date: Mon, 29 Jan 2018 21:09:41 +0000 Subject: [Numpy-discussion] f2py bug in numpy v1.12.0 and above? In-Reply-To: References: , Message-ID: I think that I actually may have this solved now. It seems that I can resolve the issue by running "conda install gcc" then linking against the anaconda gfortran libraries rather than my system install. The only problem is that installing gcc through anaconda causes anaconda's version of gcc to supersede the system version... Anyway, not a numpy issue it seems, just an Anaconda packaging issue as you both said. ________________________________ From: NumPy-Discussion on behalf of Andrew Nelson Sent: Monday, January 29, 2018 2:02:02 PM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] f2py bug in numpy v1.12.0 and above? Something similar was mentioned at https://github.com/scipy/scipy/issues/8325. -------------- next part -------------- An HTML attachment was scrubbed... URL: From allanhaldane at gmail.com Mon Jan 29 22:44:11 2018 From: allanhaldane at gmail.com (Allan Haldane) Date: Mon, 29 Jan 2018 22:44:11 -0500 Subject: [Numpy-discussion] Setting custom dtypes and 1.14 In-Reply-To: References: <89c7a41b-12c2-77b6-a12a-25d64f4cc6b9@gmail.com> <388385e2-163b-ad2e-6bf6-1ac68a709278@gmail.com> <20180129195548.dtopwmanutp6ly4u@fastmail.com> Message-ID: <32dfc3bf-9197-2bf1-9767-25529e2d10fd@gmail.com> On 01/29/2018 05:59 PM, josef.pktd at gmail.com wrote: > > > On Mon, Jan 29, 2018 at 5:50 PM, > wrote: > > > > On Mon, Jan 29, 2018 at 4:11 PM, Allan Haldane > > wrote: > > On 01/29/2018 04:02 PM, josef.pktd at gmail.com > wrote: > > > > > > On Mon, Jan 29, 2018 at 3:44 PM, Benjamin Root > > >> wrote: > > > >? ? ?I <3 structured arrays. I love the fact that I can access data by > >? ? ?row and then by fieldname, or vice versa. There are times when I > >? ? ?need to pass just a column into a function, and there are times when > >? ? ?I need to process things row by row. Yes, pandas is nice if you want > >? ? ?the specialized indexing features, but it becomes a bear to deal > >? ? ?with if all you want is normal indexing, or even the ability to > >? ? ?easily loop over the dataset. 
> > > > > > I don't think there is a doubt that structured arrays, arrays with > > structured dtypes, are a useful container. The question is whether they > > should be more or the foundation for more. > > > > For example, computing a mean, or reduce operation, over numeric element > > ("columns"). Before padded views it was possible to index by selecting > > the relevant "columns" and view them as standard array. With padded > > views that breaks and AFAICS, there is no way in numpy 1.14.0 to compute > > a mean of some "columns". (I don't have numpy 1.14 to try or find a > > workaround, like maybe looping over all relevant columns.) > > > > Josef > > Just to clarify, structured types have always had padding bytes, > that > isn't new. > > What *is* new (which we are pushing to 1.15, I think) is that it > may be > somewhat more common to end up with padding than before, and > only if you > are specifically using multi-field indexing, which is a fairly > specialized case. > > I think recfunctions already account properly for padding bytes. > Except > for the bug in #8100, which we will fix, padding-bytes in > recarrays are > more or less invisible to a non-expert who only cares about > dataframe-like behavior. > > In other words, padding is no obstacle at all to computing a > mean over a > column, and single-field indexes in 1.15 behave identically as > before. > The only thing that will change in 1.15 is multi-field indexing, > and it > has never been possible to compute a mean (or any binary > operation) on > multiple fields. > > > from the example in the other thread > a[['b', 'c']].view(('f8', 2)).mean(0) > > > (from the statsmodels usecase: > read csv with genfromtext to get recarray or structured array > select/index the numeric columns > view them as standard array > do whatever we can do with standard numpy? arrays > ) Oh ok, I misunderstood. I see your point: a mean over fields is more difficult than before. > Or, to phrase it as a question: > > How do we get a standard array with homogeneous dtype from the > corresponding elements of a structured dtype in numpy 1.14.0? > > Josef The answer may be that "numpy has never had a way to that", even if in a few special cases you might hack a workaround using views. That's what your example seems like to me. It uses an explicit view, which is an "expert" feature since views depend on the exact memory layout and binary representation of the array. Your example only works if the two fields have exactly the same dtype as each other and as the final dtype, and evidently breaks if there is byte padding for any reason. Pandas can do row means without these problems: >>> pd.DataFrame(np.ones(10, dtype='i8,f8')).mean(axis=0) Numpy is missing this functionality, so you or whoever wrote that example figured out a fragile workaround using views. I suggest that if we want to allow either means over fields, or conversion of a n-D structured array to an n+1-D regular ndarray, we should add a dedicated function to do so in numpy.lib.recfunctions which does not depend on the binary representation of the array. Allan > Josef > > > Allan > > > > >? ? ?Cheers! > >? ? ?Ben Root > > > >? ? ?On Mon, Jan 29, 2018 at 3:24 PM, > >? ? ?>> wrote: > > > > > > > >? ? ? ? ?On Mon, Jan 29, 2018 at 2:55 PM, Stefan van der Walt > >? ? ? ? ? > >> wrote: > > > >? ? ? ? ? ? ?On Mon, 29 Jan 2018 14:10:56 -0500, josef.pktd at gmail.com > >? ? ? ? ? ? ? > wrote: > > > >? ? ? ? ? ? ? ? ?Given that there is pandas, xarray, dask and > more, numpy > >? ? ? ? ? ? ? ? 
?could as well drop > >? ? ? ? ? ? ? ? ?any pretense of supporting dataframe_likes. > Or, adjust > >? ? ? ? ? ? ? ? ?the recfunctions so > >? ? ? ? ? ? ? ? ?we can still work dataframe_like with structured > >? ? ? ? ? ? ? ? ?dtypes/recarrays/recfunctions. > > > > > >? ? ? ? ? ? ?I haven't been following the duckarray discussion > carefully, > >? ? ? ? ? ? ?but could > >? ? ? ? ? ? ?this be an opportunity for a dataframe protocol, > so that we > >? ? ? ? ? ? ?can have > >? ? ? ? ? ? ?libraries ingest structured arrays, record > arrays, pandas > >? ? ? ? ? ? ?dataframes, > >? ? ? ? ? ? ?etc. without too much specialized code? > > > > > >? ? ? ? ?AFAIU while not being in the data handling area, > pandas defines > >? ? ? ? ?the interface and other libraries provide pandas > compatible > >? ? ? ? ?interfaces or implementations. > > > >? ? ? ? ?statsmodels currently still has recarray support and > usage. In > >? ? ? ? ?some interfaces we support pandas, recarrays and > plain arrays, > >? ? ? ? ?or anything where asarray works correctly. > > > >? ? ? ? ?But recarrays became messy to support, one rewrite of > some > >? ? ? ? ?functions last year converts recarrays to pandas, > does the > >? ? ? ? ?manipulation and then converts back to recarrays. > >? ? ? ? ?Also we need to adjust our recarray usage with new numpy > >? ? ? ? ?versions. But there is no real benefit because I > doubt that > >? ? ? ? ?statsmodels still has any recarray/structured dtype > users. So, > >? ? ? ? ?we only have to remove our own uses in the datasets > and unit tests. > > > >? ? ? ? ?Josef > > > > > > > > > >? ? ? ? ? ? ?St?fan > > > >? ? ? ? ? ? ?_______________________________________________ > >? ? ? ? ? ? ?NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > >? ? ? ? ? ? ? > > > > > > > > >? ? ? ? ?_______________________________________________ > >? ? ? ? ?NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > >? ? ? ? ? > > > > > > > > >? ? ?_______________________________________________ > >? ? ?NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > ? 
> > > > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > From josef.pktd at gmail.com Mon Jan 29 23:50:46 2018 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 29 Jan 2018 23:50:46 -0500 Subject: [Numpy-discussion] Setting custom dtypes and 1.14 In-Reply-To: <32dfc3bf-9197-2bf1-9767-25529e2d10fd@gmail.com> References: <89c7a41b-12c2-77b6-a12a-25d64f4cc6b9@gmail.com> <388385e2-163b-ad2e-6bf6-1ac68a709278@gmail.com> <20180129195548.dtopwmanutp6ly4u@fastmail.com> <32dfc3bf-9197-2bf1-9767-25529e2d10fd@gmail.com> Message-ID: On Mon, Jan 29, 2018 at 10:44 PM, Allan Haldane wrote: > On 01/29/2018 05:59 PM, josef.pktd at gmail.com wrote: > >> >> >> On Mon, Jan 29, 2018 at 5:50 PM, > josef.pktd at gmail.com>> wrote: >> >> >> >> On Mon, Jan 29, 2018 at 4:11 PM, Allan Haldane >> > wrote: >> >> On 01/29/2018 04:02 PM, josef.pktd at gmail.com >> wrote: >> > >> > >> > On Mon, Jan 29, 2018 at 3:44 PM, Benjamin Root < >> ben.v.root at gmail.com >> > >> >> wrote: >> > >> > I <3 structured arrays. I love the fact that I can access >> data by >> > row and then by fieldname, or vice versa. There are times >> when I >> > need to pass just a column into a function, and there are >> times when >> > I need to process things row by row. Yes, pandas is nice if >> you want >> > the specialized indexing features, but it becomes a bear to >> deal >> > with if all you want is normal indexing, or even the >> ability to >> > easily loop over the dataset. >> > >> > >> > I don't think there is a doubt that structured arrays, arrays >> with >> > structured dtypes, are a useful container. The question is >> whether they >> > should be more or the foundation for more. >> > >> > For example, computing a mean, or reduce operation, over >> numeric element >> > ("columns"). Before padded views it was possible to index by >> selecting >> > the relevant "columns" and view them as standard array. With >> padded >> > views that breaks and AFAICS, there is no way in numpy 1.14.0 >> to compute >> > a mean of some "columns". (I don't have numpy 1.14 to try or >> find a >> > workaround, like maybe looping over all relevant columns.) >> > >> > Josef >> >> Just to clarify, structured types have always had padding bytes, >> that >> isn't new. >> >> What *is* new (which we are pushing to 1.15, I think) is that it >> may be >> somewhat more common to end up with padding than before, and >> only if you >> are specifically using multi-field indexing, which is a fairly >> specialized case. >> >> I think recfunctions already account properly for padding bytes. >> Except >> for the bug in #8100, which we will fix, padding-bytes in >> recarrays are >> more or less invisible to a non-expert who only cares about >> dataframe-like behavior. >> >> In other words, padding is no obstacle at all to computing a >> mean over a >> column, and single-field indexes in 1.15 behave identically as >> before. 
>> The only thing that will change in 1.15 is multi-field indexing, >> and it >> has never been possible to compute a mean (or any binary >> operation) on >> multiple fields. >> >> >> from the example in the other thread >> a[['b', 'c']].view(('f8', 2)).mean(0) >> >> >> (from the statsmodels usecase: >> read csv with genfromtext to get recarray or structured array >> select/index the numeric columns >> view them as standard array >> do whatever we can do with standard numpy arrays >> ) >> > > Oh ok, I misunderstood. I see your point: a mean over fields is more > difficult than before. > > Or, to phrase it as a question: >> >> How do we get a standard array with homogeneous dtype from the >> corresponding elements of a structured dtype in numpy 1.14.0? >> >> Josef >> > > The answer may be that "numpy has never had a way to that", > even if in a few special cases you might hack a workaround using views. > > That's what your example seems like to me. It uses an explicit view, which > is an "expert" feature since views depend on the exact memory layout and > binary representation of the array. Your example only works if the two > fields have exactly the same dtype as each other and as the final dtype, > and evidently breaks if there is byte padding for any reason. > > Pandas can do row means without these problems: > > >>> pd.DataFrame(np.ones(10, dtype='i8,f8')).mean(axis=0) > > Numpy is missing this functionality, so you or whoever wrote that example > figured out a fragile workaround using views. > Once upon a time (*) this wasn't fragile but the only and recommended way. Because dtypes were low level with clear memory layout and stayed that way, it was easy to check item size or whatever and get different views on it. e.g. https://mail.scipy.org/pipermail/numpy-discussion/2008-December/039340.html (*) pre-pandas, pre-stackoverflow on the mailing lists which was for me roughly 2008 to 2012 but a late thread https://mail.scipy.org/pipermail/numpy-discussion/2015-October/074014.html "What is now the recommended way of converting structured dtypes/recarrays to ndarrays?" > I suggest that if we want to allow either means over fields, or conversion > of a n-D structured array to an n+1-D regular ndarray, we should add a > dedicated function to do so in numpy.lib.recfunctions > which does not depend on the binary representation of the array. > > I don't really want to defend an obsolete (?) usecase of structured dtypes. However, I think there should be a decision about the future plans for whether dataframe like usages of structure dtypes or through higher level classes or functions are still supported, instead of removing slowly and silently (*) the foundation for this use case, either support this usage or say you will be dropping it. (*) I didn't read the details of the release notes And another footnote about obsolete: Given that I'm the only one arguing about the dataframe_like usecase of recarrays and structured dtypes, I think they are dead for this specific usecase and only my inertia and conservativeness kept them alive in statsmodels. Josef > Allan > > > Josef >> >> >> Allan >> >> > >> > Cheers! 
>> > Ben Root >> > >> > On Mon, Jan 29, 2018 at 3:24 PM, > >> > >> >> wrote: >> > >> > >> > >> > On Mon, Jan 29, 2018 at 2:55 PM, Stefan van der Walt >> > >> >> >> wrote: >> > >> > On Mon, 29 Jan 2018 14:10:56 -0500, >> josef.pktd at gmail.com >> > > >> > wrote: >> > >> > Given that there is pandas, xarray, dask and >> more, numpy >> > could as well drop >> > any pretense of supporting dataframe_likes. >> Or, adjust >> > the recfunctions so >> > we can still work dataframe_like with >> structured >> > dtypes/recarrays/recfunctions. >> > >> > >> > I haven't been following the duckarray discussion >> carefully, >> > but could >> > this be an opportunity for a dataframe protocol, >> so that we >> > can have >> > libraries ingest structured arrays, record >> arrays, pandas >> > dataframes, >> > etc. without too much specialized code? >> > >> > >> > AFAIU while not being in the data handling area, >> pandas defines >> > the interface and other libraries provide pandas >> compatible >> > interfaces or implementations. >> > >> > statsmodels currently still has recarray support and >> usage. In >> > some interfaces we support pandas, recarrays and >> plain arrays, >> > or anything where asarray works correctly. >> > >> > But recarrays became messy to support, one rewrite of >> some >> > functions last year converts recarrays to pandas, >> does the >> > manipulation and then converts back to recarrays. >> > Also we need to adjust our recarray usage with new >> numpy >> > versions. But there is no real benefit because I >> doubt that >> > statsmodels still has any recarray/structured dtype >> users. So, >> > we only have to remove our own uses in the datasets >> and unit tests. >> > >> > Josef >> > >> > >> > >> > >> > St?fan >> > >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at python.org >> >> > > >> > https://mail.python.org/mailman/listinfo/numpy-discussion >> >> > > man/listinfo/numpy-discussion >> > >> > >> > >> > >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at python.org >> >> > > >> > https://mail.python.org/mailman/listinfo/numpy-discussion >> >> > > man/listinfo/numpy-discussion >> > >> > >> > >> > >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at python.org >> >> > > >> > https://mail.python.org/mailman/listinfo/numpy-discussion >> >> > > man/listinfo/numpy-discussion >> > >> > >> > >> > >> > >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at python.org > n.org> >> > https://mail.python.org/mailman/listinfo/numpy-discussion >> >> > >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> >> >> >> >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
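A rough sketch of what such a dedicated numpy.lib.recfunctions helper could look like; the name fields_to_array is invented here and nothing with this signature existed in numpy 1.14, so this is only an illustration of the idea:

import numpy as np

def fields_to_array(a, names=None, dtype='f8'):
    """Copy selected fields of a structured array into a regular ndarray.

    A plain copy, so it does not depend on field offsets or padding bytes.
    """
    if names is None:
        names = a.dtype.names
    out = np.empty(a.shape + (len(names),), dtype=dtype)
    for i, name in enumerate(names):
        out[..., i] = a[name]
    return out

a = np.zeros(4, dtype=np.dtype([('b', 'i4'), ('c', 'f8')], align=True))
a['c'] = np.arange(4)
print(fields_to_array(a, ['b', 'c']).mean(axis=0))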
URL: From andyfaff at gmail.com Tue Jan 30 01:22:22 2018 From: andyfaff at gmail.com (Andrew Nelson) Date: Tue, 30 Jan 2018 17:22:22 +1100 Subject: [Numpy-discussion] Optimisation of matrix multiplication Message-ID: Hi all, I have a matrix multiplication that I'd like to optimize. I have a matrix `a` (dtype=complex) with shape (N, M, 2, 2). I'd like to do the following multiplication: a[:, 0] @ a[:, 1] @ ... @ a[:, M-1] where the first dimension, N, is element wise (and hopefully vectorisable) and M>=2. So for each row do M-1 matrix multiplications of 2x2 matrices. The output should have shape (N, 2, 2). What would be the best way of going about this? -- _____________________________________ Dr. Andrew Nelson _____________________________________ -------------- next part -------------- An HTML attachment was scrubbed... URL: From andyfaff at gmail.com Tue Jan 30 01:27:26 2018 From: andyfaff at gmail.com (Andrew Nelson) Date: Tue, 30 Jan 2018 17:27:26 +1100 Subject: [Numpy-discussion] Optimisation of matrix multiplication In-Reply-To: References: Message-ID: On 30 January 2018 at 17:22, Andrew Nelson wrote: > Hi all, > I have a matrix multiplication that I'd like to optimize. > > I have a matrix `a` (dtype=complex) with shape (N, M, 2, 2). I'd like to > do the following multiplication: > > a[:, 0] @ a[:, 1] @ ... @ a[:, M-1] > > where the first dimension, N, is element wise (and hopefully vectorisable) > and M>=2. So for each row do M-1 matrix multiplications of 2x2 matrices. > The output should have shape (N, 2, 2). > > What would be the best way of going about this? > I should add that at the moment I have 4 (N, M) arrays and am doing the matrix multiplication in a for loop over `range(0, M)`, in an unrolled fashion for each of the 4 elements. -------------- next part -------------- An HTML attachment was scrubbed... URL: From wieser.eric+numpy at gmail.com Tue Jan 30 03:24:48 2018 From: wieser.eric+numpy at gmail.com (Eric Wieser) Date: Tue, 30 Jan 2018 08:24:48 +0000 Subject: [Numpy-discussion] Setting custom dtypes and 1.14 In-Reply-To: References: <89c7a41b-12c2-77b6-a12a-25d64f4cc6b9@gmail.com> <388385e2-163b-ad2e-6bf6-1ac68a709278@gmail.com> <20180129195548.dtopwmanutp6ly4u@fastmail.com> <32dfc3bf-9197-2bf1-9767-25529e2d10fd@gmail.com> Message-ID: Because dtypes were low level with clear memory layout and stayed that way Dtypes have supported padded and out-of-order-fields since at least 2005 (v0.8.4) , and I would guess that the memory layout has not changed since. The house has always been made out of glass, it just didn?t look fragile until we showed people where the stones were. ? On Mon, 29 Jan 2018 at 20:51 wrote: > On Mon, Jan 29, 2018 at 10:44 PM, Allan Haldane > wrote: > >> On 01/29/2018 05:59 PM, josef.pktd at gmail.com wrote: >> >>> >>> >>> On Mon, Jan 29, 2018 at 5:50 PM, >> josef.pktd at gmail.com>> wrote: >>> >>> >>> >>> On Mon, Jan 29, 2018 at 4:11 PM, Allan Haldane >>> > wrote: >>> >>> On 01/29/2018 04:02 PM, josef.pktd at gmail.com >>> wrote: >>> > >>> > >>> > On Mon, Jan 29, 2018 at 3:44 PM, Benjamin Root < >>> ben.v.root at gmail.com >>> > >> >>> wrote: >>> > >>> > I <3 structured arrays. I love the fact that I can access >>> data by >>> > row and then by fieldname, or vice versa. There are times >>> when I >>> > need to pass just a column into a function, and there are >>> times when >>> > I need to process things row by row. 
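Coming back to the matrix-multiplication question above: one way to shrink the Python-level loop from M-1 steps to roughly log2(M) batched calls is to multiply neighbouring pairs of matrices in each pass. The helper below is only a sketch of that idea (it pads odd-length chains with identity matrices), not an answer given in the thread:

import numpy as np

def chain_matmul(a):
    """Reduce an (N, M, 2, 2) stack to a[:, 0] @ a[:, 1] @ ... @ a[:, M-1].

    Multiplies adjacent pairs in each pass, so roughly log2(M) batched
    matmuls instead of M-1 sequential ones.
    """
    while a.shape[1] > 1:
        if a.shape[1] % 2:  # pad odd-length chains with an identity on the right
            eye = np.broadcast_to(np.eye(2, dtype=a.dtype),
                                  (a.shape[0], 1, 2, 2))
            a = np.concatenate([a, eye], axis=1)
        a = a[:, 0::2] @ a[:, 1::2]
    return a[:, 0]

rng = np.random.RandomState(0)
a = rng.normal(size=(5, 7, 2, 2)) + 1j * rng.normal(size=(5, 7, 2, 2))

# compare against the straightforward sequential product
check = a[:, 0]
for i in range(1, a.shape[1]):
    check = check @ a[:, i]
print(np.allclose(chain_matmul(a), check))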
Yes, pandas is nice >>> if you want >>> > the specialized indexing features, but it becomes a bear >>> to deal >>> > with if all you want is normal indexing, or even the >>> ability to >>> > easily loop over the dataset. >>> > >>> > >>> > I don't think there is a doubt that structured arrays, arrays >>> with >>> > structured dtypes, are a useful container. The question is >>> whether they >>> > should be more or the foundation for more. >>> > >>> > For example, computing a mean, or reduce operation, over >>> numeric element >>> > ("columns"). Before padded views it was possible to index by >>> selecting >>> > the relevant "columns" and view them as standard array. With >>> padded >>> > views that breaks and AFAICS, there is no way in numpy 1.14.0 >>> to compute >>> > a mean of some "columns". (I don't have numpy 1.14 to try or >>> find a >>> > workaround, like maybe looping over all relevant columns.) >>> > >>> > Josef >>> >>> Just to clarify, structured types have always had padding bytes, >>> that >>> isn't new. >>> >>> What *is* new (which we are pushing to 1.15, I think) is that it >>> may be >>> somewhat more common to end up with padding than before, and >>> only if you >>> are specifically using multi-field indexing, which is a fairly >>> specialized case. >>> >>> I think recfunctions already account properly for padding bytes. >>> Except >>> for the bug in #8100, which we will fix, padding-bytes in >>> recarrays are >>> more or less invisible to a non-expert who only cares about >>> dataframe-like behavior. >>> >>> In other words, padding is no obstacle at all to computing a >>> mean over a >>> column, and single-field indexes in 1.15 behave identically as >>> before. >>> The only thing that will change in 1.15 is multi-field indexing, >>> and it >>> has never been possible to compute a mean (or any binary >>> operation) on >>> multiple fields. >>> >>> >>> from the example in the other thread >>> a[['b', 'c']].view(('f8', 2)).mean(0) >>> >>> >>> (from the statsmodels usecase: >>> read csv with genfromtext to get recarray or structured array >>> select/index the numeric columns >>> view them as standard array >>> do whatever we can do with standard numpy arrays >>> ) >>> >> >> Oh ok, I misunderstood. I see your point: a mean over fields is more >> difficult than before. >> >> Or, to phrase it as a question: >>> >>> How do we get a standard array with homogeneous dtype from the >>> corresponding elements of a structured dtype in numpy 1.14.0? >>> >>> Josef >>> >> >> The answer may be that "numpy has never had a way to that", >> even if in a few special cases you might hack a workaround using views. >> >> That's what your example seems like to me. It uses an explicit view, >> which is an "expert" feature since views depend on the exact memory layout >> and binary representation of the array. Your example only works if the two >> fields have exactly the same dtype as each other and as the final dtype, >> and evidently breaks if there is byte padding for any reason. >> >> Pandas can do row means without these problems: >> >> >>> pd.DataFrame(np.ones(10, dtype='i8,f8')).mean(axis=0) >> >> Numpy is missing this functionality, so you or whoever wrote that example >> figured out a fragile workaround using views. >> > > Once upon a time (*) this wasn't fragile but the only and recommended way. > Because dtypes were low level with clear memory layout and stayed that way, > it was easy to check item size or whatever and get different views on it. > e.g. 
> https://mail.scipy.org/pipermail/numpy-discussion/2008-December/039340.html > > (*) pre-pandas, pre-stackoverflow on the mailing lists which was for me > roughly 2008 to 2012 > but a late thread > https://mail.scipy.org/pipermail/numpy-discussion/2015-October/074014.html > "What is now the recommended way of converting structured dtypes/recarrays > to ndarrays?" > > > > >> I suggest that if we want to allow either means over fields, or >> conversion of a n-D structured array to an n+1-D regular ndarray, we should >> add a dedicated function to do so in numpy.lib.recfunctions >> which does not depend on the binary representation of the array. >> >> > I don't really want to defend an obsolete (?) usecase of structured dtypes. > > However, I think there should be a decision about the future plans for > whether dataframe like usages of structure dtypes or through higher level > classes or functions are still supported, instead of removing slowly and > silently (*) the foundation for this use case, either support this usage or > say you will be dropping it. > > (*) I didn't read the details of the release notes > > > And another footnote about obsolete: > Given that I'm the only one arguing about the dataframe_like usecase of > recarrays and structured dtypes, I think they are dead for this specific > usecase and only my inertia and conservativeness kept them alive in > statsmodels. > > > Josef > > > > >> Allan >> >> >> Josef >>> >>> >>> Allan >>> >>> > >>> > Cheers! >>> > Ben Root >>> > >>> > On Mon, Jan 29, 2018 at 3:24 PM, >> >>> > >> >>> wrote: >>> > >>> > >>> > >>> > On Mon, Jan 29, 2018 at 2:55 PM, Stefan van der Walt >>> > >>> >> >>> wrote: >>> > >>> > On Mon, 29 Jan 2018 14:10:56 -0500, >>> josef.pktd at gmail.com >>> > >> >>> > wrote: >>> > >>> > Given that there is pandas, xarray, dask and >>> more, numpy >>> > could as well drop >>> > any pretense of supporting dataframe_likes. >>> Or, adjust >>> > the recfunctions so >>> > we can still work dataframe_like with >>> structured >>> > dtypes/recarrays/recfunctions. >>> > >>> > >>> > I haven't been following the duckarray discussion >>> carefully, >>> > but could >>> > this be an opportunity for a dataframe protocol, >>> so that we >>> > can have >>> > libraries ingest structured arrays, record >>> arrays, pandas >>> > dataframes, >>> > etc. without too much specialized code? >>> > >>> > >>> > AFAIU while not being in the data handling area, >>> pandas defines >>> > the interface and other libraries provide pandas >>> compatible >>> > interfaces or implementations. >>> > >>> > statsmodels currently still has recarray support and >>> usage. In >>> > some interfaces we support pandas, recarrays and >>> plain arrays, >>> > or anything where asarray works correctly. >>> > >>> > But recarrays became messy to support, one rewrite of >>> some >>> > functions last year converts recarrays to pandas, >>> does the >>> > manipulation and then converts back to recarrays. >>> > Also we need to adjust our recarray usage with new >>> numpy >>> > versions. But there is no real benefit because I >>> doubt that >>> > statsmodels still has any recarray/structured dtype >>> users. So, >>> > we only have to remove our own uses in the datasets >>> and unit tests. 
>>> > >>> > Josef >>> > >>> > >>> > >>> > >>> > St?fan >>> > >>> > _______________________________________________ >>> > NumPy-Discussion mailing list >>> > NumPy-Discussion at python.org >>> >>> >> > >>> > https://mail.python.org/mailman/listinfo/numpy-discussion >>> >>> > < >>> https://mail.python.org/mailman/listinfo/numpy-discussion >>> > >>> > >>> > >>> > >>> > _______________________________________________ >>> > NumPy-Discussion mailing list >>> > NumPy-Discussion at python.org >>> >>> >> > >>> > https://mail.python.org/mailman/listinfo/numpy-discussion >>> >>> > < >>> https://mail.python.org/mailman/listinfo/numpy-discussion >>> > >>> > >>> > >>> > >>> > _______________________________________________ >>> > NumPy-Discussion mailing list >>> > NumPy-Discussion at python.org >>> >>> >> > >>> > https://mail.python.org/mailman/listinfo/numpy-discussion >>> >>> > < >>> https://mail.python.org/mailman/listinfo/numpy-discussion >>> > >>> > >>> > >>> > >>> > >>> > _______________________________________________ >>> > NumPy-Discussion mailing list >>> > NumPy-Discussion at python.org >> NumPy-Discussion at python.org> >>> > https://mail.python.org/mailman/listinfo/numpy-discussion >>> >>> > >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at python.org >>> https://mail.python.org/mailman/listinfo/numpy-discussion >>> >>> >>> >>> >>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at python.org >>> https://mail.python.org/mailman/listinfo/numpy-discussion >>> >>> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Tue Jan 30 07:49:04 2018 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 30 Jan 2018 07:49:04 -0500 Subject: [Numpy-discussion] Setting custom dtypes and 1.14 In-Reply-To: References: <89c7a41b-12c2-77b6-a12a-25d64f4cc6b9@gmail.com> <388385e2-163b-ad2e-6bf6-1ac68a709278@gmail.com> <20180129195548.dtopwmanutp6ly4u@fastmail.com> <32dfc3bf-9197-2bf1-9767-25529e2d10fd@gmail.com> Message-ID: On Tue, Jan 30, 2018 at 3:24 AM, Eric Wieser wrote: > Because dtypes were low level with clear memory layout and stayed that way > > Dtypes have supported padded and out-of-order-fields since at least 2005 > (v0.8.4) > , > and I would guess that the memory layout has not changed since. > > The house has always been made out of glass, it just didn?t look fragile > until we showed people where the stones were. > Even so, I don't remember any problems with it. There might have been stones on the side streets and alleys, but 1.14.0 puts a big padded stone right in the front of the drive way. (Maybe only the solarium was made out of glass, now it's also the billiard room.) (I never had to learn about padding and I don't remember having any related problems getting statsmodels through Debian testing on various machine types.) Josef > ? 
> > On Mon, 29 Jan 2018 at 20:51 wrote: > >> On Mon, Jan 29, 2018 at 10:44 PM, Allan Haldane >> wrote: >> >>> On 01/29/2018 05:59 PM, josef.pktd at gmail.com wrote: >>> >>>> >>>> >>>> On Mon, Jan 29, 2018 at 5:50 PM, >>> josef.pktd at gmail.com>> wrote: >>>> >>>> >>>> >>>> On Mon, Jan 29, 2018 at 4:11 PM, Allan Haldane >>>> > wrote: >>>> >>>> On 01/29/2018 04:02 PM, josef.pktd at gmail.com >>>> wrote: >>>> > >>>> > >>>> > On Mon, Jan 29, 2018 at 3:44 PM, Benjamin Root < >>>> ben.v.root at gmail.com >>>> > >> >>>> wrote: >>>> > >>>> > I <3 structured arrays. I love the fact that I can access >>>> data by >>>> > row and then by fieldname, or vice versa. There are times >>>> when I >>>> > need to pass just a column into a function, and there are >>>> times when >>>> > I need to process things row by row. Yes, pandas is nice >>>> if you want >>>> > the specialized indexing features, but it becomes a bear >>>> to deal >>>> > with if all you want is normal indexing, or even the >>>> ability to >>>> > easily loop over the dataset. >>>> > >>>> > >>>> > I don't think there is a doubt that structured arrays, arrays >>>> with >>>> > structured dtypes, are a useful container. The question is >>>> whether they >>>> > should be more or the foundation for more. >>>> > >>>> > For example, computing a mean, or reduce operation, over >>>> numeric element >>>> > ("columns"). Before padded views it was possible to index by >>>> selecting >>>> > the relevant "columns" and view them as standard array. With >>>> padded >>>> > views that breaks and AFAICS, there is no way in numpy 1.14.0 >>>> to compute >>>> > a mean of some "columns". (I don't have numpy 1.14 to try or >>>> find a >>>> > workaround, like maybe looping over all relevant columns.) >>>> > >>>> > Josef >>>> >>>> Just to clarify, structured types have always had padding bytes, >>>> that >>>> isn't new. >>>> >>>> What *is* new (which we are pushing to 1.15, I think) is that it >>>> may be >>>> somewhat more common to end up with padding than before, and >>>> only if you >>>> are specifically using multi-field indexing, which is a fairly >>>> specialized case. >>>> >>>> I think recfunctions already account properly for padding bytes. >>>> Except >>>> for the bug in #8100, which we will fix, padding-bytes in >>>> recarrays are >>>> more or less invisible to a non-expert who only cares about >>>> dataframe-like behavior. >>>> >>>> In other words, padding is no obstacle at all to computing a >>>> mean over a >>>> column, and single-field indexes in 1.15 behave identically as >>>> before. >>>> The only thing that will change in 1.15 is multi-field indexing, >>>> and it >>>> has never been possible to compute a mean (or any binary >>>> operation) on >>>> multiple fields. >>>> >>>> >>>> from the example in the other thread >>>> a[['b', 'c']].view(('f8', 2)).mean(0) >>>> >>>> >>>> (from the statsmodels usecase: >>>> read csv with genfromtext to get recarray or structured array >>>> select/index the numeric columns >>>> view them as standard array >>>> do whatever we can do with standard numpy arrays >>>> ) >>>> >>> >>> Oh ok, I misunderstood. I see your point: a mean over fields is more >>> difficult than before. >>> >>> Or, to phrase it as a question: >>>> >>>> How do we get a standard array with homogeneous dtype from the >>>> corresponding elements of a structured dtype in numpy 1.14.0? >>>> >>>> Josef >>>> >>> >>> The answer may be that "numpy has never had a way to that", >>> even if in a few special cases you might hack a workaround using views. 
>>> >>> That's what your example seems like to me. It uses an explicit view, >>> which is an "expert" feature since views depend on the exact memory layout >>> and binary representation of the array. Your example only works if the two >>> fields have exactly the same dtype as each other and as the final dtype, >>> and evidently breaks if there is byte padding for any reason. >>> >>> Pandas can do row means without these problems: >>> >>> >>> pd.DataFrame(np.ones(10, dtype='i8,f8')).mean(axis=0) >>> >>> Numpy is missing this functionality, so you or whoever wrote that >>> example figured out a fragile workaround using views. >>> >> >> Once upon a time (*) this wasn't fragile but the only and recommended >> way. Because dtypes were low level with clear memory layout and stayed that >> way, it was easy to check item size or whatever and get different views on >> it. >> e.g. https://mail.scipy.org/pipermail/numpy-discussion/ >> 2008-December/039340.html >> >> (*) pre-pandas, pre-stackoverflow on the mailing lists which was for me >> roughly 2008 to 2012 >> but a late thread https://mail.scipy.org/pipermail/numpy-discussion/ >> 2015-October/074014.html >> "What is now the recommended way of converting structured >> dtypes/recarrays to ndarrays?" >> >> >> >> >>> I suggest that if we want to allow either means over fields, or >>> conversion of a n-D structured array to an n+1-D regular ndarray, we should >>> add a dedicated function to do so in numpy.lib.recfunctions >>> which does not depend on the binary representation of the array. >>> >>> >> I don't really want to defend an obsolete (?) usecase of structured >> dtypes. >> >> However, I think there should be a decision about the future plans for >> whether dataframe like usages of structure dtypes or through higher level >> classes or functions are still supported, instead of removing slowly and >> silently (*) the foundation for this use case, either support this usage or >> say you will be dropping it. >> >> (*) I didn't read the details of the release notes >> >> >> And another footnote about obsolete: >> Given that I'm the only one arguing about the dataframe_like usecase of >> recarrays and structured dtypes, I think they are dead for this specific >> usecase and only my inertia and conservativeness kept them alive in >> statsmodels. >> >> >> Josef >> >> >> >> >>> Allan >>> >>> >>> Josef >>>> >>>> >>>> Allan >>>> >>>> > >>>> > Cheers! >>>> > Ben Root >>>> > >>>> > On Mon, Jan 29, 2018 at 3:24 PM, >>> >>>> > >>> >>> wrote: >>>> > >>>> > >>>> > >>>> > On Mon, Jan 29, 2018 at 2:55 PM, Stefan van der Walt >>>> > >>>> >> >>>> wrote: >>>> > >>>> > On Mon, 29 Jan 2018 14:10:56 -0500, >>>> josef.pktd at gmail.com >>>> > >>> >>>> > wrote: >>>> > >>>> > Given that there is pandas, xarray, dask and >>>> more, numpy >>>> > could as well drop >>>> > any pretense of supporting dataframe_likes. >>>> Or, adjust >>>> > the recfunctions so >>>> > we can still work dataframe_like with >>>> structured >>>> > dtypes/recarrays/recfunctions. >>>> > >>>> > >>>> > I haven't been following the duckarray discussion >>>> carefully, >>>> > but could >>>> > this be an opportunity for a dataframe protocol, >>>> so that we >>>> > can have >>>> > libraries ingest structured arrays, record >>>> arrays, pandas >>>> > dataframes, >>>> > etc. without too much specialized code? >>>> > >>>> > >>>> > AFAIU while not being in the data handling area, >>>> pandas defines >>>> > the interface and other libraries provide pandas >>>> compatible >>>> > interfaces or implementations. 
>>>> > >>>> > statsmodels currently still has recarray support and >>>> usage. In >>>> > some interfaces we support pandas, recarrays and >>>> plain arrays, >>>> > or anything where asarray works correctly. >>>> > >>>> > But recarrays became messy to support, one rewrite of >>>> some >>>> > functions last year converts recarrays to pandas, >>>> does the >>>> > manipulation and then converts back to recarrays. >>>> > Also we need to adjust our recarray usage with new >>>> numpy >>>> > versions. But there is no real benefit because I >>>> doubt that >>>> > statsmodels still has any recarray/structured dtype >>>> users. So, >>>> > we only have to remove our own uses in the datasets >>>> and unit tests. >>>> > >>>> > Josef >>>> > >>>> > >>>> > >>>> > >>>> > St?fan >>>> > >>>> > _______________________________________________ >>>> > NumPy-Discussion mailing list >>>> > NumPy-Discussion at python.org >>>> >>>> >>> > >>>> > https://mail.python.org/mailman/listinfo/numpy-discussion >>>> >>>> > >>> discussion >>>> > >>>> > >>>> > >>>> > >>>> > _______________________________________________ >>>> > NumPy-Discussion mailing list >>>> > NumPy-Discussion at python.org >>>> >>>> >>> > >>>> > https://mail.python.org/mailman/listinfo/numpy-discussion >>>> >>>> > >>> discussion >>>> > >>>> > >>>> > >>>> > >>>> > _______________________________________________ >>>> > NumPy-Discussion mailing list >>>> > NumPy-Discussion at python.org >>>> >>>> >>> > >>>> > https://mail.python.org/mailman/listinfo/numpy-discussion >>>> >>>> > >>> discussion >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > _______________________________________________ >>>> > NumPy-Discussion mailing list >>>> > NumPy-Discussion at python.org >>> python.org> >>>> > https://mail.python.org/mailman/listinfo/numpy-discussion >>>> >>>> > >>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at python.org >>> > >>>> https://mail.python.org/mailman/listinfo/numpy-discussion >>>> >>>> >>>> >>>> >>>> >>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at python.org >>>> https://mail.python.org/mailman/listinfo/numpy-discussion >>>> >>>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at python.org >>> https://mail.python.org/mailman/listinfo/numpy-discussion >>> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From chris.barker at noaa.gov Tue Jan 30 11:55:23 2018 From: chris.barker at noaa.gov (Chris Barker) Date: Tue, 30 Jan 2018 08:55:23 -0800 Subject: [Numpy-discussion] Setting custom dtypes and 1.14 In-Reply-To: <32dfc3bf-9197-2bf1-9767-25529e2d10fd@gmail.com> References: <89c7a41b-12c2-77b6-a12a-25d64f4cc6b9@gmail.com> <388385e2-163b-ad2e-6bf6-1ac68a709278@gmail.com> <20180129195548.dtopwmanutp6ly4u@fastmail.com> <32dfc3bf-9197-2bf1-9767-25529e2d10fd@gmail.com> Message-ID: On Mon, Jan 29, 2018 at 7:44 PM, Allan Haldane wrote: > I suggest that if we want to allow either means over fields, or conversion > of a n-D structured array to an n+1-D regular ndarray, we should add a > dedicated function to do so in numpy.lib.recfunctions > which does not depend on the binary representation of the array. > IIUC, the core use-case of structured dtypes is binary compatibility with external systems (arrays of C structs, mostly) -- at least that's how I use them :-) In which case, "conversion of a n-D structured array to an n+1-D regular ndarray" is an important feature -- actually even more important if you don't use recarrays So yes, let's have a utility to make that easy. as for recarrays -- are we that far from having them be robust and useful? in which case, why not keep them around, fix the few issues, but explicitly not try to extend them into more dataframe-like domains -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From allanhaldane at gmail.com Tue Jan 30 12:28:52 2018 From: allanhaldane at gmail.com (Allan Haldane) Date: Tue, 30 Jan 2018 12:28:52 -0500 Subject: [Numpy-discussion] Setting custom dtypes and 1.14 In-Reply-To: References: <388385e2-163b-ad2e-6bf6-1ac68a709278@gmail.com> <20180129195548.dtopwmanutp6ly4u@fastmail.com> <32dfc3bf-9197-2bf1-9767-25529e2d10fd@gmail.com> Message-ID: On 01/29/2018 11:50 PM, josef.pktd at gmail.com wrote: > > > On Mon, Jan 29, 2018 at 10:44 PM, Allan Haldane > wrote: > > On 01/29/2018 05:59 PM, josef.pktd at gmail.com > wrote: > > > > On Mon, Jan 29, 2018 at 5:50 PM, >> wrote: > > > > ? ? On Mon, Jan 29, 2018 at 4:11 PM, Allan Haldane > ? ? > >> > wrote: > > ? ? ? ? On 01/29/2018 04:02 PM, josef.pktd at gmail.com > > ? ? ? ? > wrote: > ? ? ? ? > > ? ? ? ? > > ? ? ? ? > On Mon, Jan 29, 2018 at 3:44 PM, Benjamin Root > > > > ? ? ? ? > >>> wrote: > ? ? ? ? > > ? ? ? ? >? ? ?I <3 structured arrays. I love the fact that I > can access data by > ? ? ? ? >? ? ?row and then by fieldname, or vice versa. There > are times when I > ? ? ? ? >? ? ?need to pass just a column into a function, and > there are times when > ? ? ? ? >? ? ?I need to process things row by row. Yes, pandas > is nice if you want > ? ? ? ? >? ? ?the specialized indexing features, but it becomes > a bear to deal > ? ? ? ? >? ? ?with if all you want is normal indexing, or even > the ability to > ? ? ? ? >? ? ?easily loop over the dataset. > ? ? ? ? > > ? ? ? ? > > ? ? ? ? > I don't think there is a doubt that structured > arrays, arrays with > ? ? ? ? > structured dtypes, are a useful container. The > question is whether they > ? ? ? ? > should be more or the foundation for more. > ? ? ? ? > > ? ? ? ? > For example, computing a mean, or reduce operation, > over numeric element > ? ? ? ? > ("columns"). 
Before padded views it was possible to > index by selecting > ? ? ? ? > the relevant "columns" and view them as standard > array. With padded > ? ? ? ? > views that breaks and AFAICS, there is no way in > numpy 1.14.0 to compute > ? ? ? ? > a mean of some "columns". (I don't have numpy 1.14 to > try or find a > ? ? ? ? > workaround, like maybe looping over all relevant > columns.) > ? ? ? ? > > ? ? ? ? > Josef > > ? ? ? ? Just to clarify, structured types have always had > padding bytes, > ? ? ? ? that > ? ? ? ? isn't new. > > ? ? ? ? What *is* new (which we are pushing to 1.15, I think) > is that it > ? ? ? ? may be > ? ? ? ? somewhat more common to end up with padding than > before, and > ? ? ? ? only if you > ? ? ? ? are specifically using multi-field indexing, which is a > fairly > ? ? ? ? specialized case. > > ? ? ? ? I think recfunctions already account properly for > padding bytes. > ? ? ? ? Except > ? ? ? ? for the bug in #8100, which we will fix, padding-bytes in > ? ? ? ? recarrays are > ? ? ? ? more or less invisible to a non-expert who only cares about > ? ? ? ? dataframe-like behavior. > > ? ? ? ? In other words, padding is no obstacle at all to > computing a > ? ? ? ? mean over a > ? ? ? ? column, and single-field indexes in 1.15 behave > identically as > ? ? ? ? before. > ? ? ? ? The only thing that will change in 1.15 is multi-field > indexing, > ? ? ? ? and it > ? ? ? ? has never been possible to compute a mean (or any binary > ? ? ? ? operation) on > ? ? ? ? multiple fields. > > > ? ? from the example in the other thread > ? ? a[['b', 'c']].view(('f8', 2)).mean(0) > > > ? ? (from the statsmodels usecase: > ? ? read csv with genfromtext to get recarray or structured array > ? ? select/index the numeric columns > ? ? view them as standard array > ? ? do whatever we can do with standard numpy? arrays > ? ? ) > > > Oh ok, I misunderstood. I see your point: a mean over fields is more > difficult than before. > > Or, to phrase it as a question: > > How do we get a standard array with homogeneous dtype from the > corresponding elements of a structured dtype in numpy 1.14.0? > > Josef > > > The answer may be that "numpy has never had a way to that", > even if in a few special cases you might hack a workaround using views. > > That's what your example seems like to me. It uses an explicit view, > which is an "expert" feature since views depend on the exact memory > layout and binary representation of the array. Your example only > works if the two fields have exactly the same dtype as each other > and as the final dtype, and evidently breaks if there is byte > padding for any reason. > > Pandas can do row means without these problems: > > ? ? >>> pd.DataFrame(np.ones(10, dtype='i8,f8')).mean(axis=0) > > Numpy is missing this functionality, so you or whoever wrote that > example figured out a fragile workaround using views. > > > Once upon a time (*) this wasn't fragile but the only and recommended > way. Because dtypes were low level with clear memory layout and stayed > that way, it was easy to check item size or whatever and get different > views on it. > e.g. > https://mail.scipy.org/pipermail/numpy-discussion/2008-December/039340.html > > (*) pre-pandas, pre-stackoverflow on the mailing lists which was for me > roughly 2008 to 2012 > but a late thread > https://mail.scipy.org/pipermail/numpy-discussion/2015-October/074014.html > "What is now the recommended way of converting structured > dtypes/recarrays to ndarrays?" 
> > > > > I suggest that if we want to allow either means over fields, or > conversion of a n-D structured array to an n+1-D regular ndarray, we > should add a dedicated function to do so in numpy.lib.recfunctions > which does not depend on the binary representation of the array. > > > I don't really want to defend an obsolete (?) usecase of structured dtypes. > > However, I think there should be a decision about the future plans for > whether dataframe like usages of structure dtypes or through higher > level classes or functions are still supported, instead of removing > slowly and silently (*) the foundation for this use case, either support > this usage or say you will be dropping it. > > (*) I didn't read the details of the release notes > > > And another footnote about obsolete: > Given that I'm the only one arguing about the dataframe_like usecase of > recarrays and structured dtypes, I think they are dead for this specific > usecase?and only my inertia and conservativeness kept them alive in > statsmodels. > > > Josef It's a bit of a stretch to say that we are "silently" dropping support for dataframe-like use of structured arrays. First, we still allow pretty much all dataframe-like use we have supported since numpy 1.7, limited as it may be. We are really only dropping one very specialized, expert use involving an explicit view, which I still have doubts was ever more than a hack. That 2008 mailing list message didn't involve multi-field indexing, which didn't exist then (only introduced in 2009), and we have wanted to make them views (not copies) since their inception. Second, I don't think we are doing so silently: We have warned about this in release notes since numpy 1.7 in 2012/2013, and it gets mention in most releases since then. We have also raised FutureWarnings about it since 1.7. Unfortunately we missed warning in your specific case for a while, but we corrected this in 1.12 so you should have seen FutureWarnings since then. I don't feel the need to officially declare that we are dropping support for dataframe-like use of structured arrays. It's unclear where that use ends and other uses of structured arrays begin. I think updating the docs to warn that pandas/dask may be a better choice is enough, as I've been doing, and then users can decide for themselves. There is still the question about whether we should make numpy.lib.recfunctions more official. I don't have a strong opinion. I suppose it would be good to add a section to the structured array docs which lists those methods and says something like "the submodule numpy.lib.recfunctions provides minimal functionality to split, combine, and manipulate structured datatypes and arrays. In most cases, we strongly recommend users use a dedicated module such as pandas/xarray/dask instead of these methods, but they are provided for occasional convenience." Allan > Allan > > > ? ? Josef > > > ? ? ? ? Allan > > ? ? ? ? > > ? ? ? ? >? ? ?Cheers! > ? ? ? ? >? ? ?Ben Root > ? ? ? ? > > ? ? ? ? >? ? ?On Mon, Jan 29, 2018 at 3:24 PM, > > > > ? ? ? ? >? ? ? >>> wrote: > ? ? ? ? > > ? ? ? ? > > ? ? ? ? > > ? ? ? ? >? ? ? ? ?On Mon, Jan 29, 2018 at 2:55 PM, Stefan van > der Walt > ? ? ? ? >? ? ? ? ? > > ? ? ? ? >>> wrote: > ? ? ? ? > > ? ? ? ? >? ? ? ? ? ? ?On Mon, 29 Jan 2018 14:10:56 -0500, > josef.pktd at gmail.com > > > ? ? ? ? ?>? ? ? ? ? ? ? > > ? ? ? ? >> wrote: > ? ? ? ? ?> > ? ? ? ? ?>? ? ? ? ? ? ? ? ?Given that there is pandas, xarray, > dask and > ? ? ? ? more, numpy > ? ? ? ? ?>? ? ? ? ? ? ? ? ?could as well drop > ? ? 
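For readers who have not used the submodule, a small illustration of the kind of split/combine manipulation numpy.lib.recfunctions already provides; the field names are invented for the example:

import numpy as np
import numpy.lib.recfunctions as rfn

a = np.array([(1, 2.0), (3, 4.0)], dtype=[('x', 'i4'), ('y', 'f8')])

# add a derived field and drop one we no longer need
b = rfn.append_fields(a, 'z', a['x'] + a['y'], usemask=False)
c = rfn.drop_fields(b, 'x')
print(c.dtype.names)   # ('y', 'z')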
_______________________________________________
NumPy-Discussion mailing list > NumPy-Discussion at python.org > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > From josef.pktd at gmail.com Tue Jan 30 13:33:01 2018 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 30 Jan 2018 13:33:01 -0500 Subject: [Numpy-discussion] Setting custom dtypes and 1.14 In-Reply-To: References: <388385e2-163b-ad2e-6bf6-1ac68a709278@gmail.com> <20180129195548.dtopwmanutp6ly4u@fastmail.com> <32dfc3bf-9197-2bf1-9767-25529e2d10fd@gmail.com> Message-ID: On Tue, Jan 30, 2018 at 12:28 PM, Allan Haldane wrote: > On 01/29/2018 11:50 PM, josef.pktd at gmail.com wrote: > >> >> >> On Mon, Jan 29, 2018 at 10:44 PM, Allan Haldane > > wrote: >> >> On 01/29/2018 05:59 PM, josef.pktd at gmail.com >> wrote: >> >> >> >> On Mon, Jan 29, 2018 at 5:50 PM, > > >> wrote: >> >> >> >> On Mon, Jan 29, 2018 at 4:11 PM, Allan Haldane >> >> >> >> wrote: >> >> On 01/29/2018 04:02 PM, josef.pktd at gmail.com >> >> > > wrote: >> > >> > >> > On Mon, Jan 29, 2018 at 3:44 PM, Benjamin Root >> >> > >> > > > >>> wrote: >> > >> > I <3 structured arrays. I love the fact that I >> can access data by >> > row and then by fieldname, or vice versa. There >> are times when I >> > need to pass just a column into a function, and >> there are times when >> > I need to process things row by row. Yes, pandas >> is nice if you want >> > the specialized indexing features, but it becomes >> a bear to deal >> > with if all you want is normal indexing, or even >> the ability to >> > easily loop over the dataset. >> > >> > >> > I don't think there is a doubt that structured >> arrays, arrays with >> > structured dtypes, are a useful container. The >> question is whether they >> > should be more or the foundation for more. >> > >> > For example, computing a mean, or reduce operation, >> over numeric element >> > ("columns"). Before padded views it was possible to >> index by selecting >> > the relevant "columns" and view them as standard >> array. With padded >> > views that breaks and AFAICS, there is no way in >> numpy 1.14.0 to compute >> > a mean of some "columns". (I don't have numpy 1.14 to >> try or find a >> > workaround, like maybe looping over all relevant >> columns.) >> > >> > Josef >> >> Just to clarify, structured types have always had >> padding bytes, >> that >> isn't new. >> >> What *is* new (which we are pushing to 1.15, I think) >> is that it >> may be >> somewhat more common to end up with padding than >> before, and >> only if you >> are specifically using multi-field indexing, which is a >> fairly >> specialized case. >> >> I think recfunctions already account properly for >> padding bytes. >> Except >> for the bug in #8100, which we will fix, padding-bytes in >> recarrays are >> more or less invisible to a non-expert who only cares >> about >> dataframe-like behavior. 
>> >> In other words, padding is no obstacle at all to >> computing a >> mean over a >> column, and single-field indexes in 1.15 behave >> identically as >> before. >> The only thing that will change in 1.15 is multi-field >> indexing, >> and it >> has never been possible to compute a mean (or any binary >> operation) on >> multiple fields. >> >> >> from the example in the other thread >> a[['b', 'c']].view(('f8', 2)).mean(0) >> >> >> (from the statsmodels usecase: >> read csv with genfromtext to get recarray or structured array >> select/index the numeric columns >> view them as standard array >> do whatever we can do with standard numpy arrays >> ) >> >> >> Oh ok, I misunderstood. I see your point: a mean over fields is more >> difficult than before. >> >> Or, to phrase it as a question: >> >> How do we get a standard array with homogeneous dtype from the >> corresponding elements of a structured dtype in numpy 1.14.0? >> >> Josef >> >> >> The answer may be that "numpy has never had a way to that", >> even if in a few special cases you might hack a workaround using >> views. >> >> That's what your example seems like to me. It uses an explicit view, >> which is an "expert" feature since views depend on the exact memory >> layout and binary representation of the array. Your example only >> works if the two fields have exactly the same dtype as each other >> and as the final dtype, and evidently breaks if there is byte >> padding for any reason. >> >> Pandas can do row means without these problems: >> >> >>> pd.DataFrame(np.ones(10, dtype='i8,f8')).mean(axis=0) >> >> Numpy is missing this functionality, so you or whoever wrote that >> example figured out a fragile workaround using views. >> >> >> Once upon a time (*) this wasn't fragile but the only and recommended >> way. Because dtypes were low level with clear memory layout and stayed that >> way, it was easy to check item size or whatever and get different views on >> it. >> e.g. https://mail.scipy.org/pipermail/numpy-discussion/2008- >> December/039340.html >> >> (*) pre-pandas, pre-stackoverflow on the mailing lists which was for me >> roughly 2008 to 2012 >> but a late thread https://mail.scipy.org/pipermail/numpy-discussion/2015- >> October/074014.html >> "What is now the recommended way of converting structured >> dtypes/recarrays to ndarrays?" >> >> >> >> >> I suggest that if we want to allow either means over fields, or >> conversion of a n-D structured array to an n+1-D regular ndarray, we >> should add a dedicated function to do so in numpy.lib.recfunctions >> which does not depend on the binary representation of the array. >> >> >> I don't really want to defend an obsolete (?) usecase of structured >> dtypes. >> >> However, I think there should be a decision about the future plans for >> whether dataframe like usages of structure dtypes or through higher level >> classes or functions are still supported, instead of removing slowly and >> silently (*) the foundation for this use case, either support this usage or >> say you will be dropping it. >> >> (*) I didn't read the details of the release notes >> >> >> And another footnote about obsolete: >> Given that I'm the only one arguing about the dataframe_like usecase of >> recarrays and structured dtypes, I think they are dead for this specific >> usecase and only my inertia and conservativeness kept them alive in >> statsmodels. >> >> >> Josef >> > > It's a bit of a stretch to say that we are "silently" dropping support for > dataframe-like use of structured arrays. 
> First, we still allow pretty much all dataframe-like use we have supported
> since numpy 1.7, limited as it may be. We are really only dropping one very
> specialized, expert use involving an explicit view, which I still have
> doubts was ever more than a hack. That 2008 mailing list message didn't
> involve multi-field indexing, which didn't exist then (only introduced in
> 2009), and we have wanted to make them views (not copies) since their
> inception.

The 2008 mailing list thread introduced me to working with views on
structured arrays as the ONLY way to switch between structured and
homogeneous dtypes (when the underlying item size was homogeneous).
The new stats.models started in 2009.

> Second, I don't think we are doing so silently: We have warned about this
> in release notes since numpy 1.7 in 2012/2013, and it gets mention in most
> releases since then. We have also raised FutureWarnings about it since 1.7.
> Unfortunately we missed warning in your specific case for a while, but we
> corrected this in 1.12 so you should have seen FutureWarnings since then.

If I see warnings in the test suite about getting a view instead of a
copy from numpy, then the only/main consequence I think about is whether
I need to watch out for in-place modification. I didn't expect that the
follow-up computation would change, or that the result is a padded view
rather than a view on just the selected memory. However, I just checked,
and padding is mentioned in the 1.12 release notes (which I had never
read before).

AFAICS, one problem is that the padded view didn't come with matching
downstream usage support: the pack function mentioned above, an
alternative way to convert to a standard ndarray, a copy that gets rid
of the padding, and so on.

e.g. another mailing list thread I just found with the same problem
http://numpy-discussion.10968.n7.nabble.com/view-of-recarray-issue-td32001.html

quoting Ralf:
Question: is that really the recommended way to get an (N, 2) size float
array from two columns of a larger record array? If so, why isn't there a
better way? If you'd want to write to that (N, 2) array you have to append
a copy, making it even uglier. Also, then there really should be tests for
views in test_records.py.

This "better way" never showed up, AFAIK. And it looks like we came back
to this problem every few years.

Josef

> I don't feel the need to officially declare that we are dropping support
> for dataframe-like use of structured arrays. It's unclear where that use
> ends and other uses of structured arrays begin. I think updating the docs
> to warn that pandas/dask may be a better choice is enough, as I've been
> doing, and then users can decide for themselves.
>
> There is still the question about whether we should make
> numpy.lib.recfunctions more official. I don't have a strong opinion. I
> suppose it would be good to add a section to the structured array docs
> which lists those methods and says something like
>
> "the submodule numpy.lib.recfunctions provides minimal functionality to
> split, combine, and manipulate structured datatypes and arrays. In most
> cases, we strongly recommend users use a dedicated module such as
> pandas/xarray/dask instead of these methods, but they are provided for
> occasional convenience."
>
> Allan
>
>> Josef
>>
>> Allan
>>
>> > Cheers!
>> > Ben Root >> > >> > On Mon, Jan 29, 2018 at 3:24 PM, >> >> > >> > > > >>> wrote: >> > >> > >> > >> > On Mon, Jan 29, 2018 at 2:55 PM, Stefan van >> der Walt >> > > > > >> > > >>> wrote: >> > >> > On Mon, 29 Jan 2018 14:10:56 -0500, >> josef.pktd at gmail.com >> > >> > > >> >> > >> wrote: >> > >> > Given that there is pandas, xarray, >> dask and >> more, numpy >> > could as well drop >> > any pretense of supporting >> dataframe_likes. >> Or, adjust >> > the recfunctions so >> > we can still work dataframe_like >> with structured >> > dtypes/recarrays/recfunctions. >> > >> > >> > I haven't been following the duckarray >> discussion >> carefully, >> > but could >> > this be an opportunity for a dataframe >> protocol, >> so that we >> > can have >> > libraries ingest structured arrays, record >> arrays, pandas >> > dataframes, >> > etc. without too much specialized code? >> > >> > >> > AFAIU while not being in the data handling >> area, >> pandas defines >> > the interface and other libraries provide >> pandas >> compatible >> > interfaces or implementations. >> > >> > statsmodels currently still has recarray >> support and >> usage. In >> > some interfaces we support pandas, recarrays >> and >> plain arrays, >> > or anything where asarray works correctly. >> > >> > But recarrays became messy to support, one >> rewrite of >> some >> > functions last year converts recarrays to >> pandas, >> does the >> > manipulation and then converts back to >> recarrays. >> > Also we need to adjust our recarray usage >> with new numpy >> > versions. But there is no real benefit >> because I >> doubt that >> > statsmodels still has any >> recarray/structured dtype >> users. So, >> > we only have to remove our own uses in the >> datasets >> and unit tests. >> > >> > Josef >> > >> > >> > >> > >> > St?fan >> > >> > _____________________________ >> __________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at python.org >> >> > > >> > >> > >> >> > >> https://mail.python.org/mailman/listinfo/numpy-discussion >> >> > an/listinfo/numpy-discussion >> > >> > > man/listinfo/numpy-discussion >> >> > an/listinfo/numpy-discussion >> >> >> > >> > >> > >> > _____________________________ >> __________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at python.org >> >> > > >> > >> > >> >> > >> https://mail.python.org/mailman/listinfo/numpy-discussion >> >> > an/listinfo/numpy-discussion >> > >> > > man/listinfo/numpy-discussion >> >> > an/listinfo/numpy-discussion >> >> >> > >> > >> > >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at python.org >> >> > > >> > >> > >> >> > >> https://mail.python.org/mailman/listinfo/numpy-discussion >> >> > an/listinfo/numpy-discussion >> > >> > > man/listinfo/numpy-discussion >> >> > an/listinfo/numpy-discussion >> >> >> > >> > >> > >> > >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at python.org >> >> > > >> > >> https://mail.python.org/mailman/listinfo/numpy-discussion >> >> > an/listinfo/numpy-discussion >> > >> > >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> > > >> https://mail.python.org/mailman/listinfo/numpy-discussion >> >> > an/listinfo/numpy-discussion >> > >> >> >> >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> 
https://mail.python.org/mailman/listinfo/numpy-discussion >> >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> >> >> >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Tue Jan 30 14:42:41 2018 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 30 Jan 2018 14:42:41 -0500 Subject: [Numpy-discussion] Setting custom dtypes and 1.14 In-Reply-To: References: <388385e2-163b-ad2e-6bf6-1ac68a709278@gmail.com> <20180129195548.dtopwmanutp6ly4u@fastmail.com> <32dfc3bf-9197-2bf1-9767-25529e2d10fd@gmail.com> Message-ID: On Tue, Jan 30, 2018 at 1:33 PM, wrote: > > > On Tue, Jan 30, 2018 at 12:28 PM, Allan Haldane > wrote: > >> On 01/29/2018 11:50 PM, josef.pktd at gmail.com wrote: >> >>> >>> >>> On Mon, Jan 29, 2018 at 10:44 PM, Allan Haldane >> > wrote: >>> >>> On 01/29/2018 05:59 PM, josef.pktd at gmail.com >>> wrote: >>> >>> >>> >>> On Mon, Jan 29, 2018 at 5:50 PM, >> >> >> wrote: >>> >>> >>> >>> On Mon, Jan 29, 2018 at 4:11 PM, Allan Haldane >>> >>> >> >>> wrote: >>> >>> On 01/29/2018 04:02 PM, josef.pktd at gmail.com >>> >>> >> > wrote: >>> > >>> > >>> > On Mon, Jan 29, 2018 at 3:44 PM, Benjamin Root >>> >>> > >>> > >> >> >>> wrote: >>> > >>> > I <3 structured arrays. I love the fact that I >>> can access data by >>> > row and then by fieldname, or vice versa. There >>> are times when I >>> > need to pass just a column into a function, and >>> there are times when >>> > I need to process things row by row. Yes, pandas >>> is nice if you want >>> > the specialized indexing features, but it becomes >>> a bear to deal >>> > with if all you want is normal indexing, or even >>> the ability to >>> > easily loop over the dataset. >>> > >>> > >>> > I don't think there is a doubt that structured >>> arrays, arrays with >>> > structured dtypes, are a useful container. The >>> question is whether they >>> > should be more or the foundation for more. >>> > >>> > For example, computing a mean, or reduce operation, >>> over numeric element >>> > ("columns"). Before padded views it was possible to >>> index by selecting >>> > the relevant "columns" and view them as standard >>> array. With padded >>> > views that breaks and AFAICS, there is no way in >>> numpy 1.14.0 to compute >>> > a mean of some "columns". (I don't have numpy 1.14 to >>> try or find a >>> > workaround, like maybe looping over all relevant >>> columns.) >>> > >>> > Josef >>> >>> Just to clarify, structured types have always had >>> padding bytes, >>> that >>> isn't new. >>> >>> What *is* new (which we are pushing to 1.15, I think) >>> is that it >>> may be >>> somewhat more common to end up with padding than >>> before, and >>> only if you >>> are specifically using multi-field indexing, which is a >>> fairly >>> specialized case. >>> >>> I think recfunctions already account properly for >>> padding bytes. 
>>> Except >>> for the bug in #8100, which we will fix, padding-bytes >>> in >>> recarrays are >>> more or less invisible to a non-expert who only cares >>> about >>> dataframe-like behavior. >>> >>> In other words, padding is no obstacle at all to >>> computing a >>> mean over a >>> column, and single-field indexes in 1.15 behave >>> identically as >>> before. >>> The only thing that will change in 1.15 is multi-field >>> indexing, >>> and it >>> has never been possible to compute a mean (or any binary >>> operation) on >>> multiple fields. >>> >>> >>> from the example in the other thread >>> a[['b', 'c']].view(('f8', 2)).mean(0) >>> >>> >>> (from the statsmodels usecase: >>> read csv with genfromtext to get recarray or structured >>> array >>> select/index the numeric columns >>> view them as standard array >>> do whatever we can do with standard numpy arrays >>> ) >>> >>> >>> Oh ok, I misunderstood. I see your point: a mean over fields is more >>> difficult than before. >>> >>> Or, to phrase it as a question: >>> >>> How do we get a standard array with homogeneous dtype from the >>> corresponding elements of a structured dtype in numpy 1.14.0? >>> >>> Josef >>> >>> >>> The answer may be that "numpy has never had a way to that", >>> even if in a few special cases you might hack a workaround using >>> views. >>> >>> That's what your example seems like to me. It uses an explicit view, >>> which is an "expert" feature since views depend on the exact memory >>> layout and binary representation of the array. Your example only >>> works if the two fields have exactly the same dtype as each other >>> and as the final dtype, and evidently breaks if there is byte >>> padding for any reason. >>> >>> Pandas can do row means without these problems: >>> >>> >>> pd.DataFrame(np.ones(10, dtype='i8,f8')).mean(axis=0) >>> >>> Numpy is missing this functionality, so you or whoever wrote that >>> example figured out a fragile workaround using views. >>> >>> >>> Once upon a time (*) this wasn't fragile but the only and recommended >>> way. Because dtypes were low level with clear memory layout and stayed that >>> way, it was easy to check item size or whatever and get different views on >>> it. >>> e.g. https://mail.scipy.org/pipermail/numpy-discussion/2008-Decem >>> ber/039340.html >>> >>> (*) pre-pandas, pre-stackoverflow on the mailing lists which was for me >>> roughly 2008 to 2012 >>> but a late thread https://mail.scipy.org/piperma >>> il/numpy-discussion/2015-October/074014.html >>> "What is now the recommended way of converting structured >>> dtypes/recarrays to ndarrays?" >>> >>> on final historical note (once upon a time users relied on cookbooks) http://scipy-cookbook.readthedocs.io/items/Recarray. html#Converting-to-regular-arrays-and-reshaping 2010-03-09 (last modified), 2008-06-27 (created) which I assume is broken in numpy 1.4.0 > >>> >>> >>> I suggest that if we want to allow either means over fields, or >>> conversion of a n-D structured array to an n+1-D regular ndarray, we >>> should add a dedicated function to do so in numpy.lib.recfunctions >>> which does not depend on the binary representation of the array. >>> >>> >>> I don't really want to defend an obsolete (?) usecase of structured >>> dtypes. 
>>> >>> However, I think there should be a decision about the future plans for >>> whether dataframe like usages of structure dtypes or through higher level >>> classes or functions are still supported, instead of removing slowly and >>> silently (*) the foundation for this use case, either support this usage or >>> say you will be dropping it. >>> >>> (*) I didn't read the details of the release notes >>> >>> >>> And another footnote about obsolete: >>> Given that I'm the only one arguing about the dataframe_like usecase of >>> recarrays and structured dtypes, I think they are dead for this specific >>> usecase and only my inertia and conservativeness kept them alive in >>> statsmodels. >>> >>> >>> Josef >>> >> >> It's a bit of a stretch to say that we are "silently" dropping support >> for dataframe-like use of structured arrays. >> >> First, we still allow pretty much all dataframe-like use we have >> supported since numpy 1.7, limited as it may be. We are really only >> dropping one very specialized, expert use involving an explicit view, which >> I still have doubts was ever more than a hack. That 2008 mailing list >> message didn't involve multi-field indexing, which didn't exist then (only >> introduced in 2009), and we have wanted to make them views (not copies) >> since their inception. >> > > The 2008 mailing list thread introduced me to the working with views on > structured arrays as the ONLY way to switch between structured and > homogenous dtypes (if the underlying item size was homogeneous). > The new stats.models started in 2009. > > >> >> Second, I don't think we are doing so silently: We have warned about this >> in release notes since numpy 1.7 in 2012/2013, and it gets mention in most >> releases since then. We have also raised FutureWarnings about it since 1.7. >> Unfortunately we missed warning in your specific case for a while, but we >> corrected this in 1.12 so you should have seen FutureWarnings since then. >> > > If I see warnings in the test suite about getting a view instead copy from > numpy, then the only/main consequence I think about is whether I need to > watch out for inline modification. > I didn't expect that the followup computation would change, and that it's > a padded view and not a view on the selected memory. However, I just > checked and padding is mentioned in the 1.12 release notes (which I never > read before, ). > > AFAICS, one problem is that the padded view didn't come with the matching > down stream usage support, the pack function as mentioned, an alternative > way to convert to a standard ndarray, copy doesn't get rid of the padding > and so on. > > eg. another mailing list thread I just found with the same problem > http://numpy-discussion.10968.n7.nabble.com/view-of-recarray > -issue-td32001.html > > quoting Ralf: > Question: is that really the recommended way to get an (N, 2) size float > array from two columns of a larger record array? If so, why isn't there a > better way? If you'd want to write to that (N, 2) array you have to append > a copy, making it even uglier. Also, then there really should be tests for > views in test_records.py. > > > This "better way" never showed up, AFAIK. And it looks like we came back > to this problem every few years. > > Josef > > >> >> I don't feel the need to officially declare that we are dropping support >> for dataframe-like use of structured arrays. It's unclear where that use >> ends and other uses of structured arrays begin. 
I think updating the docs >> to warn that pandas/dask may be a better choice is enough, as I've been >> doing, and then users can decide for themselves. > > >> There is still the question about whether we should make >> numpy.lib.recfunctions more official. I don't have a strong opinion. I >> suppose it would be good to add a section to the structured array docs >> which lists those methods and says something like >> >> "the submodule numpy.lib.recfunctions provides minimal functionality to >> split, combine, and manipulate structured datatypes and arrays. In most >> cases, we strongly recommend users use a dedicated module such as >> pandas/xarray/dask instead of these methods, but they are provided for >> occasional convenience." >> >> Allan >> >> >> >> Allan >>> >>> >>> Josef >>> >>> >>> Allan >>> >>> > >>> > Cheers! >>> > Ben Root >>> > >>> > On Mon, Jan 29, 2018 at 3:24 PM, >>> >>> > >>> > >> >> >>> wrote: >>> > >>> > >>> > >>> > On Mon, Jan 29, 2018 at 2:55 PM, Stefan van >>> der Walt >>> > >> >> > >>> >> >> >>> wrote: >>> > >>> > On Mon, 29 Jan 2018 14:10:56 -0500, >>> josef.pktd at gmail.com >>> > >>> > >> >>> >>> >> >> wrote: >>> > >>> > Given that there is pandas, xarray, >>> dask and >>> more, numpy >>> > could as well drop >>> > any pretense of supporting >>> dataframe_likes. >>> Or, adjust >>> > the recfunctions so >>> > we can still work dataframe_like >>> with structured >>> > dtypes/recarrays/recfunctions. >>> > >>> > >>> > I haven't been following the duckarray >>> discussion >>> carefully, >>> > but could >>> > this be an opportunity for a dataframe >>> protocol, >>> so that we >>> > can have >>> > libraries ingest structured arrays, >>> record >>> arrays, pandas >>> > dataframes, >>> > etc. without too much specialized code? >>> > >>> > >>> > AFAIU while not being in the data handling >>> area, >>> pandas defines >>> > the interface and other libraries provide >>> pandas >>> compatible >>> > interfaces or implementations. >>> > >>> > statsmodels currently still has recarray >>> support and >>> usage. In >>> > some interfaces we support pandas, recarrays >>> and >>> plain arrays, >>> > or anything where asarray works correctly. >>> > >>> > But recarrays became messy to support, one >>> rewrite of >>> some >>> > functions last year converts recarrays to >>> pandas, >>> does the >>> > manipulation and then converts back to >>> recarrays. >>> > Also we need to adjust our recarray usage >>> with new numpy >>> > versions. But there is no real benefit >>> because I >>> doubt that >>> > statsmodels still has any >>> recarray/structured dtype >>> users. So, >>> > we only have to remove our own uses in the >>> datasets >>> and unit tests. 
>>> > >>> > Josef >>> > >>> > >>> > >>> > >>> > St?fan >>> > >>> > _____________________________ >>> __________________ >>> > NumPy-Discussion mailing list >>> > NumPy-Discussion at python.org >>> >>> >> > >>> >> >>> >> >> >>> > >>> https://mail.python.org/mailman/listinfo/numpy-discussion >>> >>> >> an/listinfo/numpy-discussion >>> > >>> > >> man/listinfo/numpy-discussion >>> >>> >> an/listinfo/numpy-discussion >>> >> >>> > >>> > >>> > >>> > _____________________________ >>> __________________ >>> > NumPy-Discussion mailing list >>> > NumPy-Discussion at python.org >>> >>> >> > >>> >> >>> >> >> >>> > >>> https://mail.python.org/mailman/listinfo/numpy-discussion >>> >>> >> an/listinfo/numpy-discussion >>> > >>> > >> man/listinfo/numpy-discussion >>> >>> >> an/listinfo/numpy-discussion >>> >> >>> > >>> > >>> > >>> > _______________________________________________ >>> > NumPy-Discussion mailing list >>> > NumPy-Discussion at python.org >>> >>> >> > >>> >> >>> >> >> >>> > >>> https://mail.python.org/mailman/listinfo/numpy-discussion >>> >>> >> an/listinfo/numpy-discussion >>> > >>> > >> man/listinfo/numpy-discussion >>> >>> >> an/listinfo/numpy-discussion >>> >> >>> > >>> > >>> > >>> > >>> > _______________________________________________ >>> > NumPy-Discussion mailing list >>> > NumPy-Discussion at python.org >>> >>> >> > >>> > >>> https://mail.python.org/mailman/listinfo/numpy-discussion >>> >>> >> an/listinfo/numpy-discussion >>> > >>> > >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at python.org >>> >> > >>> https://mail.python.org/mailman/listinfo/numpy-discussion >>> >>> >> an/listinfo/numpy-discussion >>> > >>> >>> >>> >>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at python.org >>> https://mail.python.org/mailman/listinfo/numpy-discussion >>> >>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at python.org >>> https://mail.python.org/mailman/listinfo/numpy-discussion >>> >>> >>> >>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at python.org >>> https://mail.python.org/mailman/listinfo/numpy-discussion >>> >>> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From josef.pktd at gmail.com Tue Jan 30 14:49:10 2018 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 30 Jan 2018 14:49:10 -0500 Subject: [Numpy-discussion] Setting custom dtypes and 1.14 In-Reply-To: References: <388385e2-163b-ad2e-6bf6-1ac68a709278@gmail.com> <20180129195548.dtopwmanutp6ly4u@fastmail.com> <32dfc3bf-9197-2bf1-9767-25529e2d10fd@gmail.com> Message-ID: On Tue, Jan 30, 2018 at 2:42 PM, wrote: > > > On Tue, Jan 30, 2018 at 1:33 PM, wrote: > >> >> >> On Tue, Jan 30, 2018 at 12:28 PM, Allan Haldane >> wrote: >> >>> On 01/29/2018 11:50 PM, josef.pktd at gmail.com wrote: >>> >>>> >>>> >>>> On Mon, Jan 29, 2018 at 10:44 PM, Allan Haldane >>> > wrote: >>>> >>>> On 01/29/2018 05:59 PM, josef.pktd at gmail.com >>>> wrote: >>>> >>>> >>>> >>>> On Mon, Jan 29, 2018 at 5:50 PM, >>> >>> >> wrote: >>>> >>>> >>>> >>>> On Mon, Jan 29, 2018 at 4:11 PM, Allan Haldane >>>> >>>> >>> >>> >>>> wrote: >>>> >>>> On 01/29/2018 04:02 PM, josef.pktd at gmail.com >>>> >>>> >>> > wrote: >>>> > >>>> > >>>> > On Mon, Jan 29, 2018 at 3:44 PM, Benjamin Root >>>> >>>> > >>>> > >>> >>> >>> wrote: >>>> > >>>> > I <3 structured arrays. I love the fact that I >>>> can access data by >>>> > row and then by fieldname, or vice versa. There >>>> are times when I >>>> > need to pass just a column into a function, and >>>> there are times when >>>> > I need to process things row by row. Yes, pandas >>>> is nice if you want >>>> > the specialized indexing features, but it becomes >>>> a bear to deal >>>> > with if all you want is normal indexing, or even >>>> the ability to >>>> > easily loop over the dataset. >>>> > >>>> > >>>> > I don't think there is a doubt that structured >>>> arrays, arrays with >>>> > structured dtypes, are a useful container. The >>>> question is whether they >>>> > should be more or the foundation for more. >>>> > >>>> > For example, computing a mean, or reduce operation, >>>> over numeric element >>>> > ("columns"). Before padded views it was possible to >>>> index by selecting >>>> > the relevant "columns" and view them as standard >>>> array. With padded >>>> > views that breaks and AFAICS, there is no way in >>>> numpy 1.14.0 to compute >>>> > a mean of some "columns". (I don't have numpy 1.14 to >>>> try or find a >>>> > workaround, like maybe looping over all relevant >>>> columns.) >>>> > >>>> > Josef >>>> >>>> Just to clarify, structured types have always had >>>> padding bytes, >>>> that >>>> isn't new. >>>> >>>> What *is* new (which we are pushing to 1.15, I think) >>>> is that it >>>> may be >>>> somewhat more common to end up with padding than >>>> before, and >>>> only if you >>>> are specifically using multi-field indexing, which is a >>>> fairly >>>> specialized case. >>>> >>>> I think recfunctions already account properly for >>>> padding bytes. >>>> Except >>>> for the bug in #8100, which we will fix, padding-bytes >>>> in >>>> recarrays are >>>> more or less invisible to a non-expert who only cares >>>> about >>>> dataframe-like behavior. >>>> >>>> In other words, padding is no obstacle at all to >>>> computing a >>>> mean over a >>>> column, and single-field indexes in 1.15 behave >>>> identically as >>>> before. >>>> The only thing that will change in 1.15 is multi-field >>>> indexing, >>>> and it >>>> has never been possible to compute a mean (or any >>>> binary >>>> operation) on >>>> multiple fields. 
>>>> >>>> >>>> from the example in the other thread >>>> a[['b', 'c']].view(('f8', 2)).mean(0) >>>> >>>> >>>> (from the statsmodels usecase: >>>> read csv with genfromtext to get recarray or structured >>>> array >>>> select/index the numeric columns >>>> view them as standard array >>>> do whatever we can do with standard numpy arrays >>>> ) >>>> >>>> >>>> Oh ok, I misunderstood. I see your point: a mean over fields is more >>>> difficult than before. >>>> >>>> Or, to phrase it as a question: >>>> >>>> How do we get a standard array with homogeneous dtype from the >>>> corresponding elements of a structured dtype in numpy 1.14.0? >>>> >>>> Josef >>>> >>>> >>>> The answer may be that "numpy has never had a way to that", >>>> even if in a few special cases you might hack a workaround using >>>> views. >>>> >>>> That's what your example seems like to me. It uses an explicit view, >>>> which is an "expert" feature since views depend on the exact memory >>>> layout and binary representation of the array. Your example only >>>> works if the two fields have exactly the same dtype as each other >>>> and as the final dtype, and evidently breaks if there is byte >>>> padding for any reason. >>>> >>>> Pandas can do row means without these problems: >>>> >>>> >>> pd.DataFrame(np.ones(10, dtype='i8,f8')).mean(axis=0) >>>> >>>> Numpy is missing this functionality, so you or whoever wrote that >>>> example figured out a fragile workaround using views. >>>> >>>> >>>> Once upon a time (*) this wasn't fragile but the only and recommended >>>> way. Because dtypes were low level with clear memory layout and stayed that >>>> way, it was easy to check item size or whatever and get different views on >>>> it. >>>> e.g. https://mail.scipy.org/pipermail/numpy-discussion/2008-Decem >>>> ber/039340.html >>>> >>>> (*) pre-pandas, pre-stackoverflow on the mailing lists which was for me >>>> roughly 2008 to 2012 >>>> but a late thread https://mail.scipy.org/piperma >>>> il/numpy-discussion/2015-October/074014.html >>>> "What is now the recommended way of converting structured >>>> dtypes/recarrays to ndarrays?" >>>> >>>> > on final historical note (once upon a time users relied on cookbooks) > http://scipy-cookbook.readthedocs.io/items/Recarray.html# > Converting-to-regular-arrays-and-reshaping > 2010-03-09 (last modified), 2008-06-27 (created) > which I assume is broken in numpy 1.4.0 > and a final grumpy note https://docs.scipy.org/doc/numpy-1.14.0/release.html#multiple-field-indexing-assignment-of-structured-arrays " which will affect code such as" = "which will break your code without offering an alternative" Josef > > > >> >>>> >>>> >>>> I suggest that if we want to allow either means over fields, or >>>> conversion of a n-D structured array to an n+1-D regular ndarray, we >>>> should add a dedicated function to do so in numpy.lib.recfunctions >>>> which does not depend on the binary representation of the array. >>>> >>>> >>>> I don't really want to defend an obsolete (?) usecase of structured >>>> dtypes. >>>> >>>> However, I think there should be a decision about the future plans for >>>> whether dataframe like usages of structure dtypes or through higher level >>>> classes or functions are still supported, instead of removing slowly and >>>> silently (*) the foundation for this use case, either support this usage or >>>> say you will be dropping it. 
>>>> >>>> (*) I didn't read the details of the release notes >>>> >>>> >>>> And another footnote about obsolete: >>>> Given that I'm the only one arguing about the dataframe_like usecase of >>>> recarrays and structured dtypes, I think they are dead for this specific >>>> usecase and only my inertia and conservativeness kept them alive in >>>> statsmodels. >>>> >>>> >>>> Josef >>>> >>> >>> It's a bit of a stretch to say that we are "silently" dropping support >>> for dataframe-like use of structured arrays. >>> >>> First, we still allow pretty much all dataframe-like use we have >>> supported since numpy 1.7, limited as it may be. We are really only >>> dropping one very specialized, expert use involving an explicit view, which >>> I still have doubts was ever more than a hack. That 2008 mailing list >>> message didn't involve multi-field indexing, which didn't exist then (only >>> introduced in 2009), and we have wanted to make them views (not copies) >>> since their inception. >>> >> >> The 2008 mailing list thread introduced me to the working with views on >> structured arrays as the ONLY way to switch between structured and >> homogenous dtypes (if the underlying item size was homogeneous). >> The new stats.models started in 2009. >> >> >>> >>> Second, I don't think we are doing so silently: We have warned about >>> this in release notes since numpy 1.7 in 2012/2013, and it gets mention in >>> most releases since then. We have also raised FutureWarnings about it since >>> 1.7. Unfortunately we missed warning in your specific case for a while, but >>> we corrected this in 1.12 so you should have seen FutureWarnings since then. >>> >> >> If I see warnings in the test suite about getting a view instead copy >> from numpy, then the only/main consequence I think about is whether I need >> to watch out for inline modification. >> I didn't expect that the followup computation would change, and that it's >> a padded view and not a view on the selected memory. However, I just >> checked and padding is mentioned in the 1.12 release notes (which I never >> read before, ). >> >> AFAICS, one problem is that the padded view didn't come with the matching >> down stream usage support, the pack function as mentioned, an alternative >> way to convert to a standard ndarray, copy doesn't get rid of the padding >> and so on. >> >> eg. another mailing list thread I just found with the same problem >> http://numpy-discussion.10968.n7.nabble.com/view-of-recarray >> -issue-td32001.html >> >> quoting Ralf: >> Question: is that really the recommended way to get an (N, 2) size float >> array from two columns of a larger record array? If so, why isn't there a >> better way? If you'd want to write to that (N, 2) array you have to append >> a copy, making it even uglier. Also, then there really should be tests for >> views in test_records.py. >> >> >> This "better way" never showed up, AFAIK. And it looks like we came back >> to this problem every few years. >> >> Josef >> >> >>> >>> I don't feel the need to officially declare that we are dropping support >>> for dataframe-like use of structured arrays. It's unclear where that use >>> ends and other uses of structured arrays begin. I think updating the docs >>> to warn that pandas/dask may be a better choice is enough, as I've been >>> doing, and then users can decide for themselves. >> >> >>> There is still the question about whether we should make >>> numpy.lib.recfunctions more official. I don't have a strong opinion. 
I >>> suppose it would be good to add a section to the structured array docs >>> which lists those methods and says something like >>> >>> "the submodule numpy.lib.recfunctions provides minimal functionality to >>> split, combine, and manipulate structured datatypes and arrays. In most >>> cases, we strongly recommend users use a dedicated module such as >>> pandas/xarray/dask instead of these methods, but they are provided for >>> occasional convenience." >>> >>> Allan >>> >>> >>> >>> Allan >>>> >>>> >>>> Josef >>>> >>>> >>>> Allan >>>> >>>> > >>>> > Cheers! >>>> > Ben Root >>>> > >>>> > On Mon, Jan 29, 2018 at 3:24 PM, >>>> >>>> > >>>> > >>> >>> >>> wrote: >>>> > >>>> > >>>> > >>>> > On Mon, Jan 29, 2018 at 2:55 PM, Stefan van >>>> der Walt >>>> > >>> >>> > >>>> >>> >>> >>> wrote: >>>> > >>>> > On Mon, 29 Jan 2018 14:10:56 -0500, >>>> josef.pktd at gmail.com >>>> > >>>> > >>> >>>> >>>> >>> >> wrote: >>>> > >>>> > Given that there is pandas, xarray, >>>> dask and >>>> more, numpy >>>> > could as well drop >>>> > any pretense of supporting >>>> dataframe_likes. >>>> Or, adjust >>>> > the recfunctions so >>>> > we can still work dataframe_like >>>> with structured >>>> > dtypes/recarrays/recfunctions. >>>> > >>>> > >>>> > I haven't been following the duckarray >>>> discussion >>>> carefully, >>>> > but could >>>> > this be an opportunity for a dataframe >>>> protocol, >>>> so that we >>>> > can have >>>> > libraries ingest structured arrays, >>>> record >>>> arrays, pandas >>>> > dataframes, >>>> > etc. without too much specialized code? >>>> > >>>> > >>>> > AFAIU while not being in the data handling >>>> area, >>>> pandas defines >>>> > the interface and other libraries provide >>>> pandas >>>> compatible >>>> > interfaces or implementations. >>>> > >>>> > statsmodels currently still has recarray >>>> support and >>>> usage. In >>>> > some interfaces we support pandas, >>>> recarrays and >>>> plain arrays, >>>> > or anything where asarray works correctly. >>>> > >>>> > But recarrays became messy to support, one >>>> rewrite of >>>> some >>>> > functions last year converts recarrays to >>>> pandas, >>>> does the >>>> > manipulation and then converts back to >>>> recarrays. >>>> > Also we need to adjust our recarray usage >>>> with new numpy >>>> > versions. But there is no real benefit >>>> because I >>>> doubt that >>>> > statsmodels still has any >>>> recarray/structured dtype >>>> users. So, >>>> > we only have to remove our own uses in the >>>> datasets >>>> and unit tests. 
>>>> > >>>> > Josef >>>> > >>>> > >>>> > >>>> > >>>> > St?fan >>>> > >>>> > _____________________________ >>>> __________________ >>>> > NumPy-Discussion mailing list >>>> > NumPy-Discussion at python.org >>>> >>>> >>> > >>>> >>> >>>> >>> >> >>>> > >>>> https://mail.python.org/mailman/listinfo/numpy-discussion >>>> >>>> >>> an/listinfo/numpy-discussion >>>> > >>>> > >>> man/listinfo/numpy-discussion >>>> >>>> >>> an/listinfo/numpy-discussion >>>> >> >>>> > >>>> > >>>> > >>>> > _____________________________ >>>> __________________ >>>> > NumPy-Discussion mailing list >>>> > NumPy-Discussion at python.org >>>> >>>> >>> > >>>> >>> >>>> >>> >> >>>> > >>>> https://mail.python.org/mailman/listinfo/numpy-discussion >>>> >>>> >>> an/listinfo/numpy-discussion >>>> > >>>> > >>> man/listinfo/numpy-discussion >>>> >>>> >>> an/listinfo/numpy-discussion >>>> >> >>>> > >>>> > >>>> > >>>> > _______________________________________________ >>>> > NumPy-Discussion mailing list >>>> > NumPy-Discussion at python.org >>>> >>>> >>> > >>>> >>> >>>> >>> >> >>>> > >>>> https://mail.python.org/mailman/listinfo/numpy-discussion >>>> >>>> >>> an/listinfo/numpy-discussion >>>> > >>>> > >>> man/listinfo/numpy-discussion >>>> >>>> >>> an/listinfo/numpy-discussion >>>> >> >>>> > >>>> > >>>> > >>>> > >>>> > _______________________________________________ >>>> > NumPy-Discussion mailing list >>>> > NumPy-Discussion at python.org >>>> >>>> >>> > >>>> > >>>> https://mail.python.org/mailman/listinfo/numpy-discussion >>>> >>>> >>> an/listinfo/numpy-discussion >>>> > >>>> > >>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at python.org >>> > >>>> >>> > >>>> https://mail.python.org/mailman/listinfo/numpy-discussion >>>> >>>> >>> an/listinfo/numpy-discussion >>>> > >>>> >>>> >>>> >>>> >>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at python.org >>> > >>>> https://mail.python.org/mailman/listinfo/numpy-discussion >>>> >>>> >>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at python.org >>>> https://mail.python.org/mailman/listinfo/numpy-discussion >>>> >>>> >>>> >>>> >>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at python.org >>>> https://mail.python.org/mailman/listinfo/numpy-discussion >>>> >>>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at python.org >>> https://mail.python.org/mailman/listinfo/numpy-discussion >>> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From allanhaldane at gmail.com Tue Jan 30 15:21:20 2018 From: allanhaldane at gmail.com (Allan Haldane) Date: Tue, 30 Jan 2018 15:21:20 -0500 Subject: [Numpy-discussion] Setting custom dtypes and 1.14 In-Reply-To: References: <20180129195548.dtopwmanutp6ly4u@fastmail.com> <32dfc3bf-9197-2bf1-9767-25529e2d10fd@gmail.com> Message-ID: <1cd22704-e889-cea1-6aa8-a2126a9d2525@gmail.com> On 01/30/2018 01:33 PM, josef.pktd at gmail.com wrote: > AFAICS, one problem is that the padded view didn't come with the > matching down stream usage support, the pack function as mentioned, an > alternative way to convert to a standard ndarray, copy doesn't get rid > of the padding and so on. > > eg. 
another mailing list thread I just found with the same problem > http://numpy-discussion.10968.n7.nabble.com/view-of-recarray-issue-td32001.html > > quoting Ralf: > Question: is that really the recommended way to get an (N, 2) size float > array from two columns of a larger record array? If so, why isn't there > a better way? If you'd want to write to that (N, 2) array you have to > append a copy, making it even uglier. Also, then there really should be > tests for views in test_records.py. > > > This "better way" never showed up, AFAIK. And it looks like we came back > to this problem every few years. > > Josef Since we are at least pushing off this change to a later release (1.15?), we have some time to prepare/catch up. What can we add to numpy.lib.recfunctions to make the multi-field copy->view change smoother? We have discussed at least two functions: * repack_fields - rearrange the memory layout of a structured array to add/remove padding between fields * structured_to_unstructured - turns a n-D structured array into an (n+1)-D unstructured ndarray, whose dtype is the highest common type of all the fields. May want the inverse function too. We might also consider * apply_along_fields(arr, method) - applies the method along the "field" axis, equivalent to something like method(struct_to_unstructured(arr), axis=-1) I think these are pretty minimal and shouldn't be too hard to implement. Allan From josef.pktd at gmail.com Tue Jan 30 16:54:23 2018 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 30 Jan 2018 16:54:23 -0500 Subject: [Numpy-discussion] Setting custom dtypes and 1.14 In-Reply-To: <1cd22704-e889-cea1-6aa8-a2126a9d2525@gmail.com> References: <20180129195548.dtopwmanutp6ly4u@fastmail.com> <32dfc3bf-9197-2bf1-9767-25529e2d10fd@gmail.com> <1cd22704-e889-cea1-6aa8-a2126a9d2525@gmail.com> Message-ID: On Tue, Jan 30, 2018 at 3:21 PM, Allan Haldane wrote: > On 01/30/2018 01:33 PM, josef.pktd at gmail.com wrote: > > AFAICS, one problem is that the padded view didn't come with the > > matching down stream usage support, the pack function as mentioned, an > > alternative way to convert to a standard ndarray, copy doesn't get rid > > of the padding and so on. > > > > eg. another mailing list thread I just found with the same problem > > http://numpy-discussion.10968.n7.nabble.com/view-of- > recarray-issue-td32001.html > > > > quoting Ralf: > > Question: is that really the recommended way to get an (N, 2) size float > > array from two columns of a larger record array? If so, why isn't there > > a better way? If you'd want to write to that (N, 2) array you have to > > append a copy, making it even uglier. Also, then there really should be > > tests for views in test_records.py. > > > > > > This "better way" never showed up, AFAIK. And it looks like we came back > > to this problem every few years. > > > > Josef > > Since we are at least pushing off this change to a later release > (1.15?), we have some time to prepare/catch up. > > What can we add to numpy.lib.recfunctions to make the multi-field > copy->view change smoother? We have discussed at least two functions: > > * repack_fields - rearrange the memory layout of a structured array to > add/remove padding between fields > > * structured_to_unstructured - turns a n-D structured array into an > (n+1)-D unstructured ndarray, whose dtype is the highest common type of > all the fields. May want the inverse function too. > The only sticky point with statsmodels is to have an equivalent of a[['b', 'c']].view(('f8', 2)). 
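(To make that sticky point concrete, here is a minimal sketch of the idiom;
the array and the field names are made up for illustration, and the
before/after behaviour is as I understand it from this thread:)

import numpy as np

# stand-in for what genfromtxt gives us: one integer field, two float fields
a = np.array([(1, 2.0, 3.0), (4, 5.0, 6.0)],
             dtype=[('a', 'i8'), ('b', 'f8'), ('c', 'f8')])

# the old idiom: select the float fields and reinterpret them as a plain (N, 2) array
cols = a[['b', 'c']].view(('f8', 2))
cols.mean(0)   # array([ 3.5,  4.5])

# with the padded result of multi-field indexing (1.14, and the planned view in
# 1.15), the selection keeps the original itemsize and field offsets, so the same
# .view() call errors out instead of returning an (N, 2) float array; something
# like the proposed structured_to_unstructured(a[['b', 'c']]) would replace it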
Highest common dtype might be object. The main use case for this is to
select some elements of a specific dtype and then use them as a
standard, homogeneous ndarray. In our case, and other cases that I have
seen, it is mainly to select a subset of the floating point numbers.
Another case might be to combine two strings into one, a[['b',
'c']].view(('S8')) if b is S5 and c is S3, but I don't think I have used
this in serious code.

For the inverse function: I guess it is still possible to view any
standard homogeneous ndarray with a structured dtype, as long as the
itemsize matches.

Browsing through old mailing list threads, I saw that adding multiple
fields, or concatenating two arrays with structured dtypes into an array
with a single combined dtype, was missing and I guess still is. (IIRC
this is the use case where we now take the pandas detour in statsmodels.)

> We might also consider
>
>  * apply_along_fields(arr, method) - applies the method along the
> "field" axis, equivalent to something like
> method(struct_to_unstructured(arr), axis=-1)

If this works on a padded view of an existing array, then this would be
an improvement over the current situation of having to extract and copy
the relevant fields of an existing structured dtype, or loop over the
different numeric dtypes, ints, floats.

In general there will need to be a way to apply `method` only to
selected columns, or to columns of a matching dtype. (e.g. we don't want
the sum or mean of a string; e.g. we use ptp() on numeric fields to
check whether there is already a constant column in the array or
dataframe)

> I think these are pretty minimal and shouldn't be too hard to implement.

AFAICS, it would cover the statsmodels usage.

Josef

> Allan
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From renato.fabbri at gmail.com  Tue Jan 30 17:32:22 2018
From: renato.fabbri at gmail.com (Renato Fabbri)
Date: Tue, 30 Jan 2018 20:32:22 -0200
Subject: [Numpy-discussion] doc? music through mathematical relations between LPCM samples and musical elements/characteristics
Message-ID: 

the half-shape suite: archive.org/details/ShapeSuite
was completely synthesized using psychophysical relations for each of the
resulting 16-bit 44kHz samples: https://arxiv.org/abs/1412.6853

I am thinking about the ways to make the documentation for at least:
* mass (music and audio in sample sequences): the framework. It might be
considered a toolbox, but it is not a package: https://github.com/ttm/mass/
* music: a package for making music, including the routines used to make
the half-shape suite: https://github.com/ttm/music

==
It seems reasonable at first to:
* upload the package to PyPI (mass is not a package); music is there (beta,
but there).
* Make available some summary of the file tree, including some automated
code documentation. As I follow numpy's conventions (or try to), and the
package and framework are particularly useful for proficient numpy users,
I used Sphinx. The very preliminary version is: https://pythonmusic.github.io/
* Make a nice readthedocs documentation.
"music" project name was taken so I made:
http://music-documentation.readthedocs.io/en/latest/
"mass" project name was also taken so I made:
http://musicmass.readthedocs.io/en/latest/
And I found these URLs clumsy. Is there a standard or would you go with
musicpackage.read..
and massframework.read... ? More importantly: * would you use readthedocs for the sphinx/doxigen output? * would you use readthedocs.org or a gitbook would be better or...? Should I contact the scipy community to make available a scikit or integrate it to numpy in any way beyond using and citing it appropriately?. BTW. I am using vimwiki with some constant attempt to organize and track my I/O: https://arxiv.org/abs/1712.06933 So maybe a unified solution for all these documentation instances using a wiki structure seems reasonable at the moment. Maybe upload the generated html from Sphinx and from Vimwiki to readthedocs...? Best, R. -- Renato Fabbri GNU/Linux User #479299 labmacambira.sourceforge.net -------------- next part -------------- An HTML attachment was scrubbed... URL: From allanhaldane at gmail.com Tue Jan 30 19:33:50 2018 From: allanhaldane at gmail.com (Allan Haldane) Date: Tue, 30 Jan 2018 19:33:50 -0500 Subject: [Numpy-discussion] Setting custom dtypes and 1.14 In-Reply-To: References: <20180129195548.dtopwmanutp6ly4u@fastmail.com> <32dfc3bf-9197-2bf1-9767-25529e2d10fd@gmail.com> <1cd22704-e889-cea1-6aa8-a2126a9d2525@gmail.com> Message-ID: On 01/30/2018 04:54 PM, josef.pktd at gmail.com wrote: > > > On Tue, Jan 30, 2018 at 3:21 PM, Allan Haldane > wrote: > > On 01/30/2018 01:33 PM, josef.pktd at gmail.com > wrote: > > AFAICS, one problem is that the padded view didn't come with the > > matching down stream usage support, the pack function as mentioned, an > > alternative way to convert to a standard ndarray, copy doesn't get rid > > of the padding and so on. > > > > eg. another mailing list thread I just found with the same problem > > http://numpy-discussion.10968.n7.nabble.com/view-of-recarray-issue-td32001.html > > > > > quoting Ralf: > > Question: is that really the recommended way to get an (N, 2) size float > > array from two columns of a larger record array? If so, why isn't there > > a better way? If you'd want to write to that (N, 2) array you have to > > append a copy, making it even uglier. Also, then there really should be > > tests for views in test_records.py. > > > > > > This "better way" never showed up, AFAIK. And it looks like we came back > > to this problem every few years. > > > > Josef > > Since we are at least pushing off this change to a later release > (1.15?), we have some time to prepare/catch up. > > What can we add to numpy.lib.recfunctions to make the multi-field > copy->view change smoother? We have discussed at least two functions: > > ?* repack_fields - rearrange the memory layout of a structured array to > add/remove padding between fields > > ?* structured_to_unstructured - turns a n-D structured array into an > (n+1)-D unstructured ndarray, whose dtype is the highest common type of > all the fields. May want the inverse function too. > > > The only sticky point with statsmodels is to have an equivalent of > a[['b', 'c']].view(('f8', 2)). > > Highest common dtype might be object, the main usecase for this is to > select some elements of a specific dtype and then use them as > standard,homogeneous ndarray. In our case and other cases that I have > seen it is mainly to select a subset of the floating point numbers. > Another case of this might be to combine two strings into one? a[['b', > 'c']].view(('S8'))? ? if b is s5 and c is S3, but I don't think I used > this in serious code. 
I implemented and put up a draft of these functions in https://github.com/numpy/numpy/pull/10411 I think they satisfy all your cases: code like >>> a = np.ones(3, dtype=[('a', 'f8'), ('b', 'f8'), ('c', 'f8')]) >>> a[['b', 'c']].view(('f8', 2))` becomes: >>> import numpy.lib.recfunctions as rf >>> rf.structured_to_unstructured(a[['b', 'c']]) array([[1., 1.], [1., 1.], [1., 1.]]) The highest common dtype is usually not "Object", since I use `np.result_type` to determine the output type. So two fields of 'S5' and 'S3' result in an 'S5' array. > > for inverse function: I guess it is still possible to view any standard > homogenous ndarray with a structured dtype as long as the itemsize matches. The inverse is implemented too. And it even supports varied field dtypes, nested fields, and subarrays, as you can see in the docstring examples. > Browsing through old mailing list threads, I saw that adding multiple > fields or concatenating two arrays with structured dtypes into an array > with a single combined dtype was missing and I guess still is. (IIRC > this is the usecase where we go now the pandas detour in statsmodels.) > > We might also consider > > ?* apply_along_fields(arr, method) - applies the method along the > "field" axis, equivalent to something like > method(struct_to_unstructured(arr), axis=-1) > > > If this works on a padded view of an existing array, then this would be > an improvement over the current version of having to extract and copy > the relevant fields of an existing structured dtype or loop over > different numeric dtypes, ints, floats. > > In general there will need to be a way to apply `method` only to > selected columns, or columns of a matching dtype. (e.g. We don't want > the sum or mean of a string.) > (e.g. we use ptp() on numeric fields to check if there is already a > constant column in the array or dataframe) Means over selected columns are accounted for using multi-field indexing. For example: >>> b = np.array([(1, 2, 5), (4, 5, 7), (7, 8 ,11), (10, 11, 12)], ... dtype=[('x', 'i4'), ('y', 'f4'), ('z', 'f8')]) >>> rf.apply_along_fields(np.mean, b) array([ 2.66666667, 5.33333333, 8.66666667, 11. ]) >>> rf.apply_along_fields(np.mean, b[['x', 'z']]) array([ 3. , 5.5, 9. , 11. ]) This is unaffected by the 1.14 to 1.15 changes. Allan > > ? > > > > I think these are pretty minimal and shouldn't be too hard to implement. > > > AFAICS, it would cover the statsmodels usage. > > > Josef > > ? 
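For readers following the thread, a minimal round-trip sketch of the two directions might look like the following; the name and exact signature of the inverse function are not spelled out in this message and are assumed here from the draft PR (unstructured_to_structured), so they may change before merging:

    import numpy as np
    import numpy.lib.recfunctions as rf

    a = np.ones(3, dtype=[('a', 'f8'), ('b', 'f8'), ('c', 'f8')])

    # structured -> unstructured: a plain (3, 2) float array, no padding
    u = rf.structured_to_unstructured(a[['b', 'c']])

    # unstructured -> structured: back to a structured array with the given
    # dtype (function name assumed from the draft PR)
    s = rf.unstructured_to_structured(u, dtype=np.dtype([('b', 'f8'), ('c', 'f8')]))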
> > > Allan > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > From josef.pktd at gmail.com Tue Jan 30 22:09:11 2018 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 30 Jan 2018 22:09:11 -0500 Subject: [Numpy-discussion] Setting custom dtypes and 1.14 In-Reply-To: References: <20180129195548.dtopwmanutp6ly4u@fastmail.com> <32dfc3bf-9197-2bf1-9767-25529e2d10fd@gmail.com> <1cd22704-e889-cea1-6aa8-a2126a9d2525@gmail.com> Message-ID: On Tue, Jan 30, 2018 at 7:33 PM, Allan Haldane wrote: > On 01/30/2018 04:54 PM, josef.pktd at gmail.com wrote: > > > > > > On Tue, Jan 30, 2018 at 3:21 PM, Allan Haldane > > wrote: > > > > On 01/30/2018 01:33 PM, josef.pktd at gmail.com > > wrote: > > > AFAICS, one problem is that the padded view didn't come with the > > > matching down stream usage support, the pack function as > mentioned, an > > > alternative way to convert to a standard ndarray, copy doesn't get > rid > > > of the padding and so on. > > > > > > eg. another mailing list thread I just found with the same problem > > > http://numpy-discussion.10968.n7.nabble.com/view-of- > recarray-issue-td32001.html > > recarray-issue-td32001.html> > > > > > > quoting Ralf: > > > Question: is that really the recommended way to get an (N, 2) size > float > > > array from two columns of a larger record array? If so, why isn't > there > > > a better way? If you'd want to write to that (N, 2) array you have > to > > > append a copy, making it even uglier. Also, then there really > should be > > > tests for views in test_records.py. > > > > > > > > > This "better way" never showed up, AFAIK. And it looks like we > came back > > > to this problem every few years. > > > > > > Josef > > > > Since we are at least pushing off this change to a later release > > (1.15?), we have some time to prepare/catch up. > > > > What can we add to numpy.lib.recfunctions to make the multi-field > > copy->view change smoother? We have discussed at least two functions: > > > > * repack_fields - rearrange the memory layout of a structured array > to > > add/remove padding between fields > > > > * structured_to_unstructured - turns a n-D structured array into an > > (n+1)-D unstructured ndarray, whose dtype is the highest common type > of > > all the fields. May want the inverse function too. > > > > > > The only sticky point with statsmodels is to have an equivalent of > > a[['b', 'c']].view(('f8', 2)). > > > > Highest common dtype might be object, the main usecase for this is to > > select some elements of a specific dtype and then use them as > > standard,homogeneous ndarray. In our case and other cases that I have > > seen it is mainly to select a subset of the floating point numbers. > > Another case of this might be to combine two strings into one a[['b', > > 'c']].view(('S8')) if b is s5 and c is S3, but I don't think I used > > this in serious code. 
> > I implemented and put up a draft of these functions in > https://github.com/numpy/numpy/pull/10411 Comments based on reading the last commit > > > I think they satisfy all your cases: code like > > >>> a = np.ones(3, dtype=[('a', 'f8'), ('b', 'f8'), ('c', 'f8')]) > >>> a[['b', 'c']].view(('f8', 2))` > > becomes: > > >>> import numpy.lib.recfunctions as rf > >>> rf.structured_to_unstructured(a[['b', 'c']]) > array([[1., 1.], > [1., 1.], > [1., 1.]]) > > The highest common dtype is usually not "Object", since I use > `np.result_type` to determine the output type. So two fields of 'S5' and > 'S3' result in an 'S5' array. > > structured_to_unstructured looks good to me > > > > > for inverse function: I guess it is still possible to view any standard > > homogenous ndarray with a structured dtype as long as the itemsize > matches. > > The inverse is implemented too. And it even supports varied field > dtypes, nested fields, and subarrays, as you can see in the docstring > examples. > > > > Browsing through old mailing list threads, I saw that adding multiple > > fields or concatenating two arrays with structured dtypes into an array > > with a single combined dtype was missing and I guess still is. (IIRC > > this is the usecase where we go now the pandas detour in statsmodels.) > > > > We might also consider > > > > * apply_along_fields(arr, method) - applies the method along the > > "field" axis, equivalent to something like > > method(struct_to_unstructured(arr), axis=-1) > > > > > > If this works on a padded view of an existing array, then this would be > > an improvement over the current version of having to extract and copy > > the relevant fields of an existing structured dtype or loop over > > different numeric dtypes, ints, floats. > > > > In general there will need to be a way to apply `method` only to > > selected columns, or columns of a matching dtype. (e.g. We don't want > > the sum or mean of a string.) > > (e.g. we use ptp() on numeric fields to check if there is already a > > constant column in the array or dataframe) > > Means over selected columns are accounted for using multi-field > indexing. For example: > > >>> b = np.array([(1, 2, 5), (4, 5, 7), (7, 8 ,11), (10, 11, 12)], > ... dtype=[('x', 'i4'), ('y', 'f4'), ('z', 'f8')]) > > >>> rf.apply_along_fields(np.mean, b) > array([ 2.66666667, 5.33333333, 8.66666667, 11. ]) > > >>> rf.apply_along_fields(np.mean, b[['x', 'z']]) > array([ 3. , 5.5, 9. , 11. ]) > actually, I would have expected apply_along_columns, i.e. reduce over all observations each field. This might need an axis argument. However, in the current form it is less practical than doing it ourselves with structured_to_unstructured because it makes a copy each time of all elements. e.g. rf.apply_along_fields(np.mean, b[['x', 'z']]) rf.apply_along_fields(np.std, b[['x', 'z']]) would do the same structured_to_unstructured copy of all array elements twice. Josef > > > This is unaffected by the 1.14 to 1.15 changes. > > Allan > > > > > > > > > > > > > I think these are pretty minimal and shouldn't be too hard to > implement. > > > > > > AFAICS, it would cover the statsmodels usage. 
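To make the double-copy concern concrete, a rough sketch of the workaround described above (convert the selected fields once, then reduce as many times as needed over the same unstructured copy) might look like this, assuming the draft recfunctions API from the PR:

    import numpy as np
    import numpy.lib.recfunctions as rf

    b = np.array([(1, 2, 5), (4, 5, 7), (7, 8, 11), (10, 11, 12)],
                 dtype=[('x', 'i4'), ('y', 'f4'), ('z', 'f8')])

    # one conversion/copy of the selected numeric fields ...
    u = rf.structured_to_unstructured(b[['x', 'z']])

    # ... then any number of reductions, without re-copying each time
    means = u.mean(axis=-1)            # reduce over fields, per observation
    stds = u.std(axis=-1)
    is_constant = u.ptp(axis=0) == 0   # reduce over observations: flags constant columns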
> > > > > > Josef > > > > > > > > > > Allan > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jalnliu at lbl.gov Wed Jan 31 01:25:03 2018 From: jalnliu at lbl.gov (Jialin Liu) Date: Tue, 30 Jan 2018 22:25:03 -0800 Subject: [Numpy-discussion] Extending C with Python Message-ID: Hello, I'm extending C with python (which is opposite way of what people usually do, extending python with C), I'm currently stuck in passing a C array to python layer, could anyone plz advise? I have a C buffer in my C code and want to pass it to a python function. In the C code, I have: npy_intp dims [2]; > dims[0] = 10; > dims[1] = 20; > import_array(); > npy_intp m=2; > PyObject * py_dims = PyArray_SimpleNewFromData(1, &m, NPY_INT16 ,(void > *)dims ); // I also tried NPY_INT > PyObject_CallMethod(pInstance, method_name, "O", py_dims); In the Python code, I want to just print that array: def f(self, dims): print ("np array:%d,%d"%(dims[0],dims[1])) But it only prints the first number correctly, i.e., dims[0]. The second number is always 0. Best, Jialin LBNL/NERSC -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Wed Jan 31 01:52:00 2018 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 31 Jan 2018 15:52:00 +0900 Subject: [Numpy-discussion] Extending C with Python In-Reply-To: References: Message-ID: On Wed, Jan 31, 2018 at 3:25 PM, Jialin Liu wrote: > Hello, > I'm extending C with python (which is opposite way of what people usually > do, extending python with C), I'm currently stuck in passing a C array to > python layer, could anyone plz advise? > > I have a C buffer in my C code and want to pass it to a python function. > In the C code, I have: > > npy_intp dims [2]; >> dims[0] = 10; >> dims[1] = 20; >> import_array(); >> npy_intp m=2; >> PyObject * py_dims = PyArray_SimpleNewFromData(1, &m, NPY_INT16 ,(void >> *)dims ); // I also tried NPY_INT >> PyObject_CallMethod(pInstance, method_name, "O", py_dims); > > > In the Python code, I want to just print that array: > > def f(self, dims): > > print ("np array:%d,%d"%(dims[0],dims[1])) > > > > But it only prints the first number correctly, i.e., dims[0]. The second > number is always 0. > The correct typecode would be NPY_INTP. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From jalnliu at lbl.gov Wed Jan 31 01:57:12 2018 From: jalnliu at lbl.gov (Jialin Liu) Date: Tue, 30 Jan 2018 22:57:12 -0800 Subject: [Numpy-discussion] Extending C with Python In-Reply-To: References: Message-ID: Amazing! It works! Thank you Robert. I've been stuck with this many days. 
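As a side note for readers hitting the same symptom: seeing the first value correctly and the second as 0 is exactly what reading a buffer of pointer-sized integers with a 16-bit typecode produces. A small NumPy sketch, assuming a 64-bit little-endian platform, shows why:

    import numpy as np

    dims = np.array([10, 20], dtype=np.intp)   # npy_intp: 8 bytes per element here

    # A 16-bit view picks up the low 2 bytes of each 8-byte element, so the
    # second original value only shows up at index 4 of the reinterpreted buffer.
    print(dims.view(np.int16))   # [10  0  0  0 20  0  0  0]
    print(dims.view(np.intp))    # [10 20]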
Best, Jialin LBNL/NERSC On Tue, Jan 30, 2018 at 10:52 PM, Robert Kern wrote: > On Wed, Jan 31, 2018 at 3:25 PM, Jialin Liu wrote: > >> Hello, >> I'm extending C with python (which is opposite way of what people usually >> do, extending python with C), I'm currently stuck in passing a C array to >> python layer, could anyone plz advise? >> >> I have a C buffer in my C code and want to pass it to a python function. >> In the C code, I have: >> >> npy_intp dims [2]; >>> dims[0] = 10; >>> dims[1] = 20; >>> import_array(); >>> npy_intp m=2; >>> PyObject * py_dims = PyArray_SimpleNewFromData(1, &m, NPY_INT16 ,(void >>> *)dims ); // I also tried NPY_INT >>> PyObject_CallMethod(pInstance, method_name, "O", py_dims); >> >> >> In the Python code, I want to just print that array: >> >> def f(self, dims): >> >> print ("np array:%d,%d"%(dims[0],dims[1])) >> >> >> >> But it only prints the first number correctly, i.e., dims[0]. The second >> number is always 0. >> > > The correct typecode would be NPY_INTP. > > -- > Robert Kern > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From solarjoe at posteo.org Wed Jan 31 02:06:39 2018 From: solarjoe at posteo.org (Joe) Date: Wed, 31 Jan 2018 08:06:39 +0100 Subject: [Numpy-discussion] Using np.frombuffer and cffi.buffer on array of C structs (problem with struct member padding) In-Reply-To: References: <87po6e234f.fsf@gmail.com> <878td2ny6m.fsf@otaria.sebmel.org> <8bd9f09ce2eb55140bc65775201ed47c@posteo.de> <88dbefc1-ab01-8efe-420c-102e13e3aaed@gmail.com> Message-ID: Does someone know of a function or a convenient way to automatically derive a dtype object from a C typedef struct string or a cffi.typeof()? Am 27.01.2018 10:30 schrieb Joe: > Thanks for your help on this! This solved my issue. > > > Am 25.01.2018 um 19:01 schrieb Allan Haldane: >> There is a new section discussing alignment in the numpy 1.14 >> structured >> array docs, which has some hints about interfacing with C structs. >> >> These new 1.14 docs are not online yet on scipy.org, but in the >> meantime >> you can view them here: >> https://ahaldane.github.io/user/basics.rec.html#automatic-byte-offsets-and-alignment >> >> (That links specifically to the discussion of alignments and padding). >> >> Allan >> >> On 01/25/2018 11:33 AM, Chris Barker - NOAA Federal wrote: >>> >>>> >>>> The numpy dtype constructor takes an ?align? keyword that will pad >>>> it >>>> for you. 
>>> >>> https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.dtype.html >>> >>> -CHB >>> >>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at python.org >>> https://mail.python.org/mailman/listinfo/numpy-discussion >>> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion From allanhaldane at gmail.com Wed Jan 31 10:41:32 2018 From: allanhaldane at gmail.com (Allan Haldane) Date: Wed, 31 Jan 2018 10:41:32 -0500 Subject: [Numpy-discussion] Using np.frombuffer and cffi.buffer on array of C structs (problem with struct member padding) In-Reply-To: References: <87po6e234f.fsf@gmail.com> <878td2ny6m.fsf@otaria.sebmel.org> <8bd9f09ce2eb55140bc65775201ed47c@posteo.de> <88dbefc1-ab01-8efe-420c-102e13e3aaed@gmail.com> Message-ID: <28c401cd-715b-6763-36a6-d5c0da052091@gmail.com> On 01/31/2018 02:06 AM, Joe wrote: > Does someone know of a function or a convenient way to automatically > derive a dtype object from a C typedef struct string or a cffi.typeof()? I remember when I wrote those docs over a year ago, I searched for an established way to do this but didn't find one. If you find/come up with one, let us know and I might add a pointer or description of it in the docs. Searching just now, I came across this recent gist: https://gist.github.com/inactivist/4ef7058c2132fa16759d So it looks like we can do something like: # typemap from C type to numpy type may need some work... typemap = {'char': 'byte', 'short': 'short', 'int': 'intc', 'long long': 'longlong', 'size_t': 'intp', 'float': 'float', 'double': 'double'} def extract_field_info(tp): fields = [] for field, fieldtype in tp.fields: if fieldtype.type.kind == 'primitive': format = typemap[fieldtype.type.cname] else: format = extract_field_info(fieldtype.type) fields.append((field, format, fieldtype.offset)) return fields from cffi import FFI ffi = FFI() ffi.cdef("struct foo { int a; char b; double d;};") tp = ffi.typeof('struct foo') names, formats, offsets = zip(*extract_field_info(tp)) np.dtype({'names': names, 'formats': formats, 'offsets': offsets}) # compare to: np.dtype('i4,i1,f8', align=True) I just made that up now and it may be buggy and have missing corner cases. But if you agree it works, maybe we can mention it in the structured array docs instead of the vague sentence there now, that "some work may be needed to obtain correspondence with the C struct". Allan > Am 27.01.2018 10:30 schrieb Joe: >> Thanks for your help on this! This solved my issue. >> >> >> Am 25.01.2018 um 19:01 schrieb Allan Haldane: >>> There is a new section discussing alignment in the numpy 1.14 structured >>> array docs, which has some hints about interfacing with C structs. >>> >>> These new 1.14 docs are not online yet on scipy.org, but in the meantime >>> ? you can view them here: >>> https://ahaldane.github.io/user/basics.rec.html#automatic-byte-offsets-and-alignment >>> >>> >>> (That links specifically to the discussion of alignments and padding). >>> >>> Allan >>> >>> On 01/25/2018 11:33 AM, Chris Barker - NOAA Federal wrote: >>>> >>>>> >>>>> The numpy dtype constructor takes an ?align? keyword that will pad it >>>>> for you. 
>>>> >>>> https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.dtype.html >>>> >>>> >>>> -CHB >>>> >>>> >>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at python.org >>>> https://mail.python.org/mailman/listinfo/numpy-discussion >>>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at python.org >>> https://mail.python.org/mailman/listinfo/numpy-discussion >>> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion From renato.fabbri at gmail.com Wed Jan 31 17:38:22 2018 From: renato.fabbri at gmail.com (Renato Fabbri) Date: Wed, 31 Jan 2018 20:38:22 -0200 Subject: [Numpy-discussion] documentation, sound and music [was: doc? music through mathematical relations between LPCM samples and musical elements/characteristics] Message-ID: Dear Scipy-ers, If you think I should split the message so that things get more clear... But the things are: 1) How to document numpy-heavy projects? 2) How to better make these contributions available to the numpy/scipy community? Directions will be greatly appreciated. I suspect that this info is all gathered somewhere I did not find. PS. I apologize for the cross-post, this is a message sent to numpy list but without any replies, so I found fit to send the message to both lists. Thanks in advance, Best, R. ---------- Forwarded message ---------- From: Renato Fabbri Date: Tue, Jan 30, 2018 at 8:32 PM Subject: doc? music through mathematical relations between LPCM samples and musical elements/characteristics To: Discussion of Numerical Python the half-shape suite: archive.org/details/ShapeSuite was completely synthesized using psychophysical relations for each resulting 16bit 44kHz samples: https://arxiv.org/abs/1412.6853 I am thinking about the ways in which to make the documentation at least to: * mass (music and audio in sample sequences): the framework. It might be considered a toolbox, but it is not a package: https://github.com/ttm/mass/ * music: a package for making music, including the routines used to make the half-shape suite: https://github.com/ttm/music == It seems reasonable at first to: * upload the package to PyPI (mass is not a package), music is there (beta but there). * Make available some summary of the file tree, including some automated code documentation. As I follow numpy's convention (or try to), and the package and framework are particularly useful for proficient numpy users, I used Sphinx. The very preliminary version is: https://pythonmusic.github.io/ * Make a nice readthedocs documentation. "music" project name was taken so I made: http://music-documentation.readthedocs.io/en/latest/ "mass" project name was also taken so I made: http://musicmass.readthedocs.io/en/latest/ And I found these URLs clumsy. Is there a standard or would you go with musicpackage.read.. and massframework.read... ? More importantly: * would you use readthedocs for the sphinx/doxigen output? * would you use readthedocs.org or a gitbook would be better or...? Should I contact the scipy community to make available a scikit or integrate it to numpy in any way beyond using and citing it appropriately?. BTW. 
I am using vimwiki with some constant attempt to organize and track my I/O: https://arxiv.org/abs/1712.06933 So maybe a unified solution for all these documentation instances using a wiki structure seems reasonable at the moment. Maybe upload the generated html from Sphinx and from Vimwiki to readthedocs...? Best, R. -- Renato Fabbri GNU/Linux User #479299 labmacambira.sourceforge.net -- Renato Fabbri GNU/Linux User #479299 labmacambira.sourceforge.net -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Wed Jan 31 17:58:34 2018 From: chris.barker at noaa.gov (Chris Barker) Date: Wed, 31 Jan 2018 14:58:34 -0800 Subject: [Numpy-discussion] Extending C with Python In-Reply-To: References: Message-ID: I'm guessing you could use Cython to make this easier. It's usually used for calling C from Python, but can do the sandwich in both directions... Just a thought -- it will help with some of that boilerplate code... -CHB On Tue, Jan 30, 2018 at 10:57 PM, Jialin Liu wrote: > Amazing! It works! Thank you Robert. > > I've been stuck with this many days. > > Best, > Jialin > LBNL/NERSC > > On Tue, Jan 30, 2018 at 10:52 PM, Robert Kern > wrote: > >> On Wed, Jan 31, 2018 at 3:25 PM, Jialin Liu wrote: >> >>> Hello, >>> I'm extending C with python (which is opposite way of what people >>> usually do, extending python with C), I'm currently stuck in passing a C >>> array to python layer, could anyone plz advise? >>> >>> I have a C buffer in my C code and want to pass it to a python function. >>> In the C code, I have: >>> >>> npy_intp dims [2]; >>>> dims[0] = 10; >>>> dims[1] = 20; >>>> import_array(); >>>> npy_intp m=2; >>>> PyObject * py_dims = PyArray_SimpleNewFromData(1, &m, NPY_INT16 ,(void >>>> *)dims ); // I also tried NPY_INT >>>> PyObject_CallMethod(pInstance, method_name, "O", py_dims); >>> >>> >>> In the Python code, I want to just print that array: >>> >>> def f(self, dims): >>> >>> print ("np array:%d,%d"%(dims[0],dims[1])) >>> >>> >>> >>> But it only prints the first number correctly, i.e., dims[0]. The second >>> number is always 0. >>> >> >> The correct typecode would be NPY_INTP. >> >> -- >> Robert Kern >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at seefeld.name Wed Jan 31 18:11:52 2018 From: stefan at seefeld.name (Stefan Seefeld) Date: Wed, 31 Jan 2018 18:11:52 -0500 Subject: [Numpy-discussion] Extending C with Python In-Reply-To: References: Message-ID: On 31.01.2018 17:58, Chris Barker wrote: > I'm guessing you could use Cython to make this easier. ... or Boost.Python (http://boostorg.github.io/python), which has built-in support for NumPy (http://boostorg.github.io/python/doc/html/numpy/index.html), and supports both directions: extending Python with C++, as well as embedding Python into C++ applications. Stefan -- ...ich hab' noch einen Koffer in Berlin... 
-------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.png Type: image/png Size: 1478 bytes Desc: not available URL:
From robert.kern at gmail.com Wed Jan 31 18:20:48 2018 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 1 Feb 2018 08:20:48 +0900 Subject: [Numpy-discussion] documentation, sound and music [was: doc? music through mathematical relations between LPCM samples and musical elements/characteristics] In-Reply-To: References: Message-ID:
On Thu, Feb 1, 2018 at 7:38 AM, Renato Fabbri wrote: > > Dear Scipy-ers, > > If you think I should split the message so that > things get more clear... > > But the things are: > 1) How to document numpy-heavy projects?

There is nothing particularly special about numpy-heavy projects with respect to documentation. The tools used for numpy and scipy's documentation may (or may not) be useful to you. For example, you may want to embed matplotlib plots into your documentation.

https://pypi.python.org/pypi/numpydoc

But if not, don't worry about it. Sphinx is all most numpy-heavy projects need, like most Python projects. Hosting the documentation on either readthedocs or Github is fine. Do whatever's convenient for you. Either is just as convenient for us.

> 2) How to better make these contributions available > to the numpy/scipy community?

Having it up on Github and PyPI is all you need to do. Participate in relevant mailing list conversations. While we don't forbid release announcement emails on either of the lists, I wouldn't do much more than announce the initial release and maybe a really major update (i.e. not for every bugfix release).

> Directions will be greatly appreciated. > I suspect that this info is all gathered somewhere > I did not find.

Sorry this isn't gathered anywhere, but truly, the answer is "there is not much to it". You're doing everything right. :-)

-- Robert Kern

-------------- next part -------------- An HTML attachment was scrubbed... URL:
From renato.fabbri at gmail.com Wed Jan 31 19:29:32 2018 From: renato.fabbri at gmail.com (Renato Fabbri) Date: Wed, 31 Jan 2018 22:29:32 -0200 Subject: [Numpy-discussion] documentation, sound and music [was: doc? music through mathematical relations between LPCM samples and musical elements/characteristics] In-Reply-To: References: Message-ID:
I think you gave me the answers:

0) It seems reasonable to just use Sphinx, with no Jekyll or anything more unless a need for it is found. I might use some HTML generated by vimwiki for pure convenience.
1) Make the main documentation with Sphinx. That means joining: and
2) github.io, gitbook, readthedocs: all are just as fine. Maybe github.io makes it easier and has more features, while readthedocs is maybe more traditional for these kinds of python-related docs.
3) Just release on PyPI, make the documentation fine, and tell the list. If any major release deserves a message, it is ok to send one.
3.1) A scikit to make psychoacoustic experiments and synthesize audio and music is not absurd, nor deeply compelling.
4) Am I missing something? Being mindful of this question is always good, but that is it, we are on track.
(the python and latex files are on github, so there is an issue tracker, a wiki etc etc there if anything comes up) tx++ On Wed, Jan 31, 2018 at 9:20 PM, Robert Kern wrote: > On Thu, Feb 1, 2018 at 7:38 AM, Renato Fabbri > wrote: > > > > Dear Scipy-ers, > > > > If you think I should split the message so that > > things get more clear... > > > > But the things are: > > 1) How to document numpy-heavy projects? > > There is nothing particularly special about numpy-heavy projects with > respect to documentation. The tools used for numpy and scipy's > documentation may (or may not) be useful to you. For example, you may want > to embed matplotlib plots into your documentation. > > https://pypi.python.org/pypi/numpydoc > > But if not, don't worry about it. Sphinx is all most numpy-heavy projects > need, like most Python projects. Hosting the documentation on either > readthedocs or Github is fine. Do whatever's convenient for you. Either is > just as convenient for us. > > > 2) How to better make these contributions available > > to the numpy/scipy community? > > Having it up on Github and PyPI is all you need to do. Participate in > relevant mailing list conversations. While we don't forbid release > announcement emails on either of the lists, I wouldn't do much more than > announce the initial release and maybe a really major update (i.e. not for > every bugfix release). > > > Directions will be greatly appreciated. > > I suspect that this info is all gathered somewhere > > I did not find. > > Sorry this isn't gathered anywhere, but truly, the answer is "there is not > much to it". You're doing everything right. :-) > > -- > Robert Kern > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > -- Renato Fabbri GNU/Linux User #479299 labmacambira.sourceforge.net -------------- next part -------------- An HTML attachment was scrubbed... URL: