From robert.kern at gmail.com Sat Jul 1 00:08:05 2017
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 30 Jun 2017 21:08:05 -0700
Subject: [Numpy-discussion] proposed changes to array printing in 1.14
In-Reply-To: <0ac942e0-c7e9-41f5-8dc3-3993f9ec34c4@Spark>
References: <5bc52c06-fbe7-2c34-8695-6cc40de38610@gmail.com> <20170630131720.GE4115401@phare.normalesup.org> <0c381b38-0182-0305-edbb-3cacfd2dc997@gmail.com> <20170630234721.GK4115401@phare.normalesup.org> <0ac942e0-c7e9-41f5-8dc3-3993f9ec34c4@Spark>
Message-ID:

On Fri, Jun 30, 2017 at 7:23 PM, Juan Nunez-Iglesias wrote:

> I do have sympathy for Ralf's argument that "exact repr's are not part of the NumPy (or Python for that matter) backwards compatibility guarantees". But it is such a foundational project in Scientific Python that I think extreme care is warranted, beyond any official guarantees. (Hence this thread, yes. Thank you!)

I would also like to make another distinction here: I don't think anyone's actual *code* has broken because of this change. To my knowledge, it is only downstream projects' *doctests* that break. This might deserve *some* care on our part (beyond notification and keeping it out of a 1.x.y bugfix release), but "extreme care" is just not warranted.

> Anyway, all this is (mostly) moot if the next NumPy ships with this doctest++ thingy. That would be an enormously valuable contribution to the whole ecosystem.

I'd recommend just making an independent project on Github and posting it as its own project to PyPI when you think it's ready. We'll link to it in our documentation. I don't think that it ought to be part of numpy and stuck on our release cadence.

-- Robert Kern

From ralf.gommers at gmail.com Sat Jul 1 07:12:56 2017
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Sat, 1 Jul 2017 23:12:56 +1200
Subject: [Numpy-discussion] Scipy 2017 NumPy sprint
In-Reply-To: <7576d1af-b1f2-341c-de6f-69b664f88b79@iki.fi>
References: <1498760145.3918433.1025609912.4766E01C@webmail.messagingengine.com> <7576d1af-b1f2-341c-de6f-69b664f88b79@iki.fi>
Message-ID:

On Fri, Jun 30, 2017 at 6:50 AM, Pauli Virtanen wrote:

> Charles R Harris wrote on 29.06.2017 at 20:45:
> > Here's a random idea: how about building a NumPy gallery?
> > scikit-{image,learn} has it, and while those projects may have more
> > visual datasets, I can imagine something along the lines of Nicolas
> > Rougier's beautiful book:
> >
> > http://www.labri.fr/perso/nrougier/from-python-to-numpy/
> >
> > So that would be added in the numpy/numpy.org repo?
>
> Or https://scipy-cookbook.readthedocs.io/ ?
> (maybe minus bitrot and images added :)

I'd like the numpy.org one. numpy.org is now incredibly sparse and ugly, a gallery would make it look a lot better.

Another idea, from the "deprecate np.matrix" discussion: add numpy documentation describing the preferred way to handle matrices, extolling the virtues of @, and move np.matrix documentation to a deprecated section.

Ralf
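To make that last idea concrete, the kind of usage such documentation would showcase looks like this (a minimal sketch; the arrays and names here are invented for the example):

    import numpy as np

    a = np.arange(6.0).reshape(2, 3)   # a plain ndarray, no np.matrix needed
    b = np.arange(12.0).reshape(3, 4)

    c = a @ b        # matrix product, shape (2, 4)
    v = np.ones(3)
    w = a @ v        # matrix-vector product, shape (2,)

    # unlike np.matrix, '@' also broadcasts over stacks of matrices:
    stack = np.arange(24.0).reshape(4, 2, 3)
    out = stack @ b  # shape (4, 2, 4)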
From charlesr.harris at gmail.com Sat Jul 1 18:31:47 2017
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sat, 1 Jul 2017 16:31:47 -0600
Subject: [Numpy-discussion] Vector stacks
Message-ID:

Hi All,

The '@' operator works well with stacks of matrices, but not with stacks of vectors. Given the recent addition of '__array_ufunc__', and the intent to make `__matmul__` use a ufunc, I've been wondering if it would make sense to add ndarray subclasses 'rvec' and 'cvec' that would override that operator so as to behave like stacks of row/column vectors. Any other ideas for the solution to stacked vectors are welcome.

Thoughts?

Chuck

From wieser.eric+numpy at gmail.com Sat Jul 1 18:38:56 2017
From: wieser.eric+numpy at gmail.com (Eric Wieser)
Date: Sat, 01 Jul 2017 22:38:56 +0000
Subject: [Numpy-discussion] Vector stacks
In-Reply-To:
References:
Message-ID:

What would these classes offer over these simple functions:

    def rvec(x):
        return x[..., np.newaxis, :]

    def cvec(x):
        return x[..., :, np.newaxis]

That also makes rvec(x) + cvec(y) behave in the least surprising way, with no extra work.

Eric

On Sat, 1 Jul 2017 at 23:32 Charles R Harris wrote:
> The '@' operator works well with stacks of matrices, but not with stacks
> of vectors. [...]

From njs at pobox.com Sat Jul 1 18:53:07 2017
From: njs at pobox.com (Nathaniel Smith)
Date: Sat, 1 Jul 2017 15:53:07 -0700
Subject: [Numpy-discussion] Vector stacks
In-Reply-To:
References:
Message-ID:

On Sat, Jul 1, 2017 at 3:31 PM, Charles R Harris wrote:
> The '@' operator works well with stacks of matrices, but not with stacks of
> vectors. [...] Any other ideas for the solution to stacked vectors are welcome.

I feel like the lesson of np.matrix is that subclassing ndarray to change the meaning of basic operators creates more problems than it solves? Some alternatives include:

- if you specifically want a stack of row vectors or column vectors, insert a new axis at position -1 or -2

- if you want a stack of 1d vectors that automatically act as rows on the left of @ and columns on the right, then we could have vecvec, matvec, vecmat gufuncs that do that -- which isn't quite as terse as @, but not everything can be and at least it'd be explicit what was going on.

-n

-- Nathaniel J. Smith -- https://vorpus.org

From m.h.vankerkwijk at gmail.com Sat Jul 1 19:45:56 2017
From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk)
Date: Sat, 1 Jul 2017 19:45:56 -0400
Subject: [Numpy-discussion] Vector stacks
In-Reply-To:
References:
Message-ID:

I'm not sure there is *that* much against a class that basically just passes through views of itself inside `__matmul__` and `__rmatmul__` or calls new gufuncs, but I think the lower hurdle is to first get those gufuncs implemented.

-- Marten
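To make Eric's suggestion concrete, a small usage sketch of those helpers (illustrative only; the shapes and names below are invented, and the behavior follows from ordinary broadcasting plus '@', not from any new class):

    import numpy as np

    def rvec(x):
        return x[..., np.newaxis, :]   # view as a stack of 1xN row vectors

    def cvec(x):
        return x[..., :, np.newaxis]   # view as a stack of Nx1 column vectors

    vecs = np.random.rand(10, 3)       # ten length-3 vectors
    mats = np.random.rand(10, 3, 3)    # ten 3x3 matrices

    y = mats @ cvec(vecs)              # stacked matrix-vector products, shape (10, 3, 1)
    s = rvec(vecs) @ cvec(vecs)        # stacked inner products, shape (10, 1, 1)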
From jni.soma at gmail.com Sat Jul 1 20:31:33 2017
From: jni.soma at gmail.com (Juan Nunez-Iglesias)
Date: Sun, 2 Jul 2017 10:31:33 +1000
Subject: [Numpy-discussion] Vector stacks
In-Reply-To:
References:
Message-ID: <27036cbd-2c32-4106-a739-771a73e99a5a@Spark>

I'm with Nathaniel on this one. Subclasses make code harder to read and reason about because you now have to be sure of the exact type of things that users are passing you, which are array-like but subtly different.

On 2 Jul 2017, 9:46 AM +1000, Marten van Kerkwijk wrote:
> I'm not sure there is *that* much against a class that basically just
> passes through views of itself inside `__matmul__` and `__rmatmul__`
> or calls new gufuncs, but I think the lower hurdle is to first get
> those gufuncs implemented.
> -- Marten

From gael.varoquaux at normalesup.org Sun Jul 2 06:34:59 2017
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Sun, 2 Jul 2017 12:34:59 +0200
Subject: [Numpy-discussion] Vector stacks
In-Reply-To: <27036cbd-2c32-4106-a739-771a73e99a5a@Spark>
References: <27036cbd-2c32-4106-a739-771a73e99a5a@Spark>
Message-ID: <20170702103459.GT4067506@phare.normalesup.org>

My thoughts exactly.

Gaël

On Sun, Jul 02, 2017 at 10:31:33AM +1000, Juan Nunez-Iglesias wrote:
> I'm with Nathaniel on this one. Subclasses make code harder to read and reason
> about because you now have to be sure of the exact type of things that users
> are passing you,
which are array-like but subtly different. > On 2 Jul 2017, 9:46 AM +1000, Marten van Kerkwijk , > wrote: > I'm not sure there is *that* much against a class that basically just > passes through views of itself inside `__matmul__` and `__rmatmul__` > or calls new gufuncs, but I think the lower hurdle is to first get > those gufuncs implemented. > -- Marten > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -- Gael Varoquaux Researcher, INRIA Parietal NeuroSpin/CEA Saclay , Bat 145, 91191 Gif-sur-Yvette France Phone: ++ 33-1-69-08-79-68 http://gael-varoquaux.info http://twitter.com/GaelVaroquaux From charlesr.harris at gmail.com Sun Jul 2 10:03:31 2017 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 2 Jul 2017 08:03:31 -0600 Subject: [Numpy-discussion] Scipy 2017 NumPy sprint In-Reply-To: References: <1498760145.3918433.1025609912.4766E01C@webmail.messagingengine.com> <7576d1af-b1f2-341c-de6f-69b664f88b79@iki.fi> Message-ID: Updated list below. On Sat, Jul 1, 2017 at 7:08 PM, Benjamin Root wrote: > Just a heads-up. There is now a sphinx-gallery plugin. Matplotlib and a > few other projects have migrated their docs over to use it. > > https://sphinx-gallery.readthedocs.io/en/latest/ > > Cheers! > Ben Root > > > On Sat, Jul 1, 2017 at 7:12 AM, Ralf Gommers > wrote: > >> >> >> On Fri, Jun 30, 2017 at 6:50 AM, Pauli Virtanen wrote: >> >>> Charles R Harris kirjoitti 29.06.2017 klo 20:45: >>> > Here's a random idea: how about building a NumPy gallery? >>> > scikit-{image,learn} has it, and while those projects may have more >>> > visual datasets, I can imagine something along the lines of Nicolas >>> > Rougier's beautiful book: >>> > >>> > http://www.labri.fr/perso/nrougier/from-python-to-numpy/ >>> > >>> > >>> > >>> > So that would be added in the numpy >>> > /numpy.org >>> > repo? >>> >>> Or https://scipy-cookbook.readthedocs.io/ ? >>> (maybe minus bitrot and images added :) >>> _____________________________________ >> >> >> I'd like the numpy.org one. numpy.org is now incredibly sparse and ugly, >> a gallery would make it look a lot better. >> >> Another idea, from the "deprecate np.matrix" discussion: add numpy >> documentation describing the preferred way to handle matrices, extolling >> the virtues of @, and move np.matrix documentation to a deprecated section. >> >> Putting things together with a few new ideas, 1. add gallery to numpy.org, 2. add extended documentation of '@' operator, 3. make Numpy tests Pytest compatible, 4. add matrix multiplication ufunc. Any more ideas? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From allanhaldane at gmail.com Sun Jul 2 10:49:29 2017 From: allanhaldane at gmail.com (Allan Haldane) Date: Sun, 2 Jul 2017 10:49:29 -0400 Subject: [Numpy-discussion] Scipy 2017 NumPy sprint In-Reply-To: References: <1498760145.3918433.1025609912.4766E01C@webmail.messagingengine.com> <7576d1af-b1f2-341c-de6f-69b664f88b79@iki.fi> Message-ID: <49c7f45e-2e4e-b64a-7fc1-ec54e8d78b87@gmail.com> On 07/02/2017 10:03 AM, Charles R Harris wrote: > Updated list below. > > On Sat, Jul 1, 2017 at 7:08 PM, Benjamin Root > wrote: > > Just a heads-up. There is now a sphinx-gallery plugin. 
Matplotlib
> and a few other projects have migrated their docs over to use it.
>
> https://sphinx-gallery.readthedocs.io/en/latest/
>
> Cheers!
> Ben Root
>
> On Sat, Jul 1, 2017 at 7:12 AM, Ralf Gommers wrote:
> > [...]
> > I'd like the numpy.org one. numpy.org is now incredibly sparse and ugly,
> > a gallery would make it look a lot better.
> >
> > Another idea, from the "deprecate np.matrix" discussion: add numpy
> > documentation describing the preferred way to handle matrices, extolling
> > the virtues of @, and move np.matrix documentation to a deprecated section.

Putting things together with a few new ideas,

1. add gallery to numpy.org,
2. add extended documentation of '@' operator,
3. make Numpy tests Pytest compatible,
4. add matrix multiplication ufunc.

Any more ideas?

Chuck

From allanhaldane at gmail.com Sun Jul 2 10:49:29 2017
From: allanhaldane at gmail.com (Allan Haldane)
Date: Sun, 2 Jul 2017 10:49:29 -0400
Subject: [Numpy-discussion] Scipy 2017 NumPy sprint
In-Reply-To:
References: <1498760145.3918433.1025609912.4766E01C@webmail.messagingengine.com> <7576d1af-b1f2-341c-de6f-69b664f88b79@iki.fi>
Message-ID: <49c7f45e-2e4e-b64a-7fc1-ec54e8d78b87@gmail.com>

On 07/02/2017 10:03 AM, Charles R Harris wrote:
> Updated list below.
> [...]
> Putting things together with a few new ideas,
>
> 1. add gallery to numpy.org,
> 2. add extended documentation of '@' operator,
> 3. make Numpy tests Pytest compatible,
> 4. add matrix multiplication ufunc.
>
> Any more ideas?

The new doctest runner suggested in the printing thread? This is to ignore whitespace and precision in ndarray output.

I can see an argument for distributing it in numpy if it is designed to be specially aware of ndarrays or numpy scalars (eg to test equality between 'wants' and 'got')

Allan

From sebastian at sipsolutions.net Sun Jul 2 11:33:19 2017
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Sun, 02 Jul 2017 17:33:19 +0200
Subject: [Numpy-discussion] Scipy 2017 NumPy sprint
In-Reply-To: <49c7f45e-2e4e-b64a-7fc1-ec54e8d78b87@gmail.com>
References: <1498760145.3918433.1025609912.4766E01C@webmail.messagingengine.com> <7576d1af-b1f2-341c-de6f-69b664f88b79@iki.fi> <49c7f45e-2e4e-b64a-7fc1-ec54e8d78b87@gmail.com>
Message-ID: <1499009599.3435.5.camel@sipsolutions.net>

On Sun, 2017-07-02 at 10:49 -0400, Allan Haldane wrote:
> On 07/02/2017 10:03 AM, Charles R Harris wrote:
> > Updated list below.
> > [...]
> > Putting things together with a few new ideas,
> >
> > 1. add gallery to numpy.org,
> > 2. add extended documentation of '@' operator,
> > 3. make Numpy tests Pytest compatible,
> > 4. add matrix multiplication ufunc.
> >
> > Any more ideas?
>
> The new doctest runner suggested in the printing thread? This is to
> ignore whitespace and precision in ndarray output.
>
> I can see an argument for distributing it in numpy if it is designed to
> be specially aware of ndarrays or numpy scalars (eg to test equality
> between 'wants' and 'got')

I don't really feel it is very numpy specific or should be under the numpy umbrella (I mean if there is no other spot, I guess it could live on the numpy github page). Its about as numpy specific, as the gallery sphinx extension is probably matplotlib specific....

That doesn't mean that it might not be a good sprint, though :).

The question to me is a bit what those who actually go there want from it or do a few people who know numpy/scipy already plan to come? Two years ago, we did not have much of a plan, so it was mostly giving three people or so a bit of a tutorial of how numpy worked internally leading to some bug fixes.

One quick idea that might be nice and dives a bit into the C-layer (might be nice if there is no big topic with a few people working on):

* Find places that should have the new memory overlap detection and implement it there.

If someone who does subclasses/array-likes or so (e.g. like Stefan Hoyer ;)) and is interested, and also we do some teleconferencing/chatting (and I have time).... I might be interested in discussing and possibly trying to develop the new indexer ideas, which I feel are pretty far, but I got stuck on how to get subclasses right.

- Sebastian

> Allan
> [...]

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 801 bytes Desc: This is a digitally signed message part URL: From charlesr.harris at gmail.com Sun Jul 2 15:01:16 2017 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 2 Jul 2017 13:01:16 -0600 Subject: [Numpy-discussion] Scipy 2017 NumPy sprint In-Reply-To: <1499009599.3435.5.camel@sipsolutions.net> References: <1498760145.3918433.1025609912.4766E01C@webmail.messagingengine.com> <7576d1af-b1f2-341c-de6f-69b664f88b79@iki.fi> <49c7f45e-2e4e-b64a-7fc1-ec54e8d78b87@gmail.com> <1499009599.3435.5.camel@sipsolutions.net> Message-ID: On Sun, Jul 2, 2017 at 9:33 AM, Sebastian Berg wrote: > On Sun, 2017-07-02 at 10:49 -0400, Allan Haldane wrote: > > On 07/02/2017 10:03 AM, Charles R Harris wrote: > > > Updated list below. > > > > > > On Sat, Jul 1, 2017 at 7:08 PM, Benjamin Root > > > > > > wrote: > > > > > > Just a heads-up. There is now a sphinx-gallery plugin. > > > Matplotlib > > > and a few other projects have migrated their docs over to use > > > it. > > > > > > https://sphinx-gallery.readthedocs.io/en/latest/ > > > > > > > > > Cheers! > > > Ben Root > > > > > > > > > On Sat, Jul 1, 2017 at 7:12 AM, Ralf Gommers > > l.com > > > > wrote: > > > > > > > > > > > > On Fri, Jun 30, 2017 at 6:50 AM, Pauli Virtanen > > > wrote: > > > > > > Charles R Harris kirjoitti 29.06.2017 klo 20:45: > > > > Here's a random idea: how about building a NumPy > > > gallery? > > > > scikit-{image,learn} has it, and while those > > > projects may have more > > > > visual datasets, I can imagine something along > > > the lines of Nicolas > > > > Rougier's beautiful book: > > > > > > > > http://www.labri.fr/perso/nrougier/from-python-to > > > -numpy/ > > > > > y/> > > > > > > o-numpy/ > > > > > y/>> > > > > > > > > > > > > So that would be added in the numpy > > > > /numpy.org > > > > > > > > > > repo? > > > > > > Or https://scipy-cookbook.readthedocs.io/ > > > ? > > > (maybe minus bitrot and images added :) > > > _____________________________________ > > > > > > > > > I'd like the numpy.org one. numpy.org > > > is now incredibly sparse and ugly, a > > > gallery > > > would make it look a lot better. > > > > > > Another idea, from the "deprecate np.matrix" discussion: > > > add > > > numpy documentation describing the preferred way to handle > > > matrices, extolling the virtues of @, and move np.matrix > > > documentation to a deprecated section. > > > > > > > > > Putting things together with a few new ideas, > > > > > > 1. add gallery to numpy.org , > > > 2. add extended documentation of '@' operator, > > > 3. make Numpy tests Pytest compatible, > > > 4. add matrix multiplication ufunc. > > > > > > Any more ideas? > > > > The new doctest runner suggested in the printing thread? This is to > > ignore whitespace and precision in ndarray output. > > > > I can see an argument for distributing it in numpy if it is designed > > to > > be specially aware of ndarrays or numpy scalars (eg to test equality > > between 'wants' and 'got') > > > > I don't really feel it is very numpy specific or should be under the > numpy umbrella (I mean if there is no other spot, I guess it could live > on the numpy github page). Its about as numpy specific, as the gallery > sphinx extension is probably matplotlib specific.... > > That doesn't mean that it might not be a good sprint, though :). > > The question to me is a bit what those who actually go there want from > it or do a few people who know numpy/scipy already plan to come? 
Two
> years ago, we did not have much of a plan, so it was mostly giving
> three people or so a bit of a tutorial of how numpy worked internally
> leading to some bug fixes.
> [...]
>
> - Sebastian

I've opened an issue for Pytests and given it a "Scipy2017 Sprint" label. I'd be much obliged if the folks with suggestions here would open other issues and also label them with "Scipy2017 Sprint". Note that these issues are not Scipy 2017 specific, they could be used in other contexts, but I thought it might be useful to collect them in one spot and give them some structure together with suggestions on how to proceed.

Ralf, you have made several previous suggestions on bringing over some of the scipy tests to numpy, to include documentation testing. Were there any other tests we should look into?

Chuck

From shoyer at gmail.com Sun Jul 2 21:28:38 2017
From: shoyer at gmail.com (Stephan Hoyer)
Date: Sun, 2 Jul 2017 18:28:38 -0700
Subject: [Numpy-discussion] Vector stacks
In-Reply-To: <27036cbd-2c32-4106-a739-771a73e99a5a@Spark>
References: <27036cbd-2c32-4106-a739-771a73e99a5a@Spark>
Message-ID:

I would also prefer separate functions. These are much easier to understand than custom operator overloads.

Side note: implementing this class with __array_ufunc__ for ndarray @ cvec actually isn't possible to do currently, until we fix this bug: https://github.com/numpy/numpy/issues/9028

On Sat, Jul 1, 2017 at 5:31 PM, Juan Nunez-Iglesias wrote:
> I'm with Nathaniel on this one. Subclasses make code harder to read and
> reason about because you now have to be sure of the exact type of things
> that users are passing you, which are array-like but subtly different.
> [...]
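Until such gufuncs exist, the stacked matrix-vector case Nathaniel describes can already be spelled out with existing functions (a sketch only; 'mats' and 'vecs' are invented names):

    import numpy as np

    mats = np.random.rand(5, 4, 3)                  # stack of 4x3 matrices
    vecs = np.random.rand(5, 3)                     # stack of length-3 vectors

    y1 = np.einsum('...ij,...j->...i', mats, vecs)  # matvec over the stack, shape (5, 4)
    y2 = (mats @ vecs[..., np.newaxis])[..., 0]     # same result via '@' and a temporary axis

    assert np.allclose(y1, y2)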
From paul.carrico at free.fr Mon Jul 3 15:55:52 2017
From: paul.carrico at free.fr (paul.carrico at free.fr)
Date: Mon, 03 Jul 2017 21:55:52 +0200
Subject: [Numpy-discussion] New python/numpy user
Message-ID:

Dear All,

I have been a Matlab-like tool user (more specifically a Scilab one) for years, and because I have to deal with huge ascii files (with dozens of millions of lines), I decided to have a look at Python and Numpy, including vectorization topics.

Obviously I've been influenced by my experience so far.

I've a basic question concerning the current code: why is it necessary to transpose the column vector (which is still in the right format in my mind)? Does it make sense?

Thanks

Paul

####################################
import numpy as np  ## np = shorthand

## works with a row vector
vect0 = np.random.rand(5); print vect0; print("\n")
mat = np.zeros((5,4), dtype=float)
mat[:,0] = np.transpose(vect0); print mat

## works while the vector is still in column, i.e. in the right format, isn't it?
vect0 = np.random.rand(5,1); print vect0; print("\n")
mat = np.zeros((5,4), dtype=float)
mat[:,0] = np.transpose(vect0); print mat

## does not work
vect0 = np.random.rand(5,1); print vect0; print("\n")
mat = np.zeros((5,4), dtype=float)
mat[:,0] = np(vect0); print mat

From bblais at gmail.com Mon Jul 3 19:04:10 2017
From: bblais at gmail.com (Brian Blais)
Date: Mon, 3 Jul 2017 19:04:10 -0400
Subject: [Numpy-discussion] New python/numpy user
In-Reply-To:
References:
Message-ID: <46dd9396-832d-40eb-a418-386b85d38c30@Spark>

There are a couple of interesting observations here. In your first bit, you have:

> ## works with a row vector
> vect0 = np.random.rand(5)
> mat[:,0] = np.transpose(vect0)

(or I prefer vect0.T). Did you happen to notice that this works too:

> mat[:,0] = vect0

The transpose or the original work as well. Unlike Scilab, python's arrays can be literally 1-dimensional: not 5x1 but just 5, which doesn't have a transpose, because it doesn't have a 2nd dimension.

You can see that in vect0.shape.

So np.random.rand(5) doesn't make a row-vector but a length-5 array, which is different than np.random.rand(5,1) or np.random.rand(1,5). Thus, you have to make sure the shapes all work.

In your second example, with the column vector, you can also slice along the 2nd dimension without transposing, like:

> mat[:,0] = vect0[:,0]

mat[:,0] seems to have shape of (5,), which is just a length-5 array, so setting it equal to 1xN or Nx1 arrays seems to cause some issues.

- Brian

On Jul 3, 2017, 15:57 -0400, paul.carrico at free.fr wrote:
> Dear All,
> [...]

From shoyer at gmail.com Mon Jul 3 19:27:42 2017
From: shoyer at gmail.com (Stephan Hoyer)
Date: Mon, 03 Jul 2017 23:27:42 +0000
Subject: [Numpy-discussion] Scipy 2017 NumPy sprint
In-Reply-To: <1499009599.3435.5.camel@sipsolutions.net>
References: <1498760145.3918433.1025609912.4766E01C@webmail.messagingengine.com> <7576d1af-b1f2-341c-de6f-69b664f88b79@iki.fi> <49c7f45e-2e4e-b64a-7fc1-ec54e8d78b87@gmail.com> <1499009599.3435.5.camel@sipsolutions.net>
Message-ID:

On Sun, Jul 2, 2017 at 8:33 AM Sebastian Berg wrote:
> If someone who does subclasses/array-likes or so (e.g. like Stefan
> Hoyer ;)) and is interested, and also we do some
> teleconferencing/chatting (and I have time).... I might be interested
> in discussing and possibly trying to develop the new indexer ideas,
> which I feel are pretty far, but I got stuck on how to get subclasses
> right.

I am of course very happy to discuss this (online or via teleconference, sadly I won't be at scipy), but to be clear I use array likes, not subclasses. I think Marten van Kerkwijk is the last one who thinks that is still a good idea :).

From matti.picus at gmail.com Tue Jul 4 00:15:24 2017
From: matti.picus at gmail.com (Matti Picus)
Date: Tue, 4 Jul 2017 00:15:24 -0400
Subject: [Numpy-discussion] make nditer also a context manager
Message-ID: <89ccd152-a929-0d16-0763-2d36243178a1@gmail.com>

When an nditer uses certain op_flags[0], like updateifcopy or readwrite or copy, the operands (which are ndarray views into the original data) must use the UPDATEIFCOPY flag. The meaning of this flag is to allocate temporary memory to hold the modified data and make the original data readonly. When the caller is finished with the nditer, the temporary memory is written back to the original data and the original data's readwrite status is restored.

The trigger to resolve the temporary data is currently via the deallocate function of the nditer, thus the call becomes something like

    i = nditer(a, op_flags=...)
    # do something with i
    i = None  # trigger writing data from i to a

This IMO violates the "explicit is better" philosophy, and has the additional disadvantage of relying on refcounting semantics to trigger the data resolution, which does not work on PyPy.

I have a pending pull request[1] to add a private numpy API function PyArray_ResolveUpdateIfCopy, which allows triggering the data resolution in a more explicit way. The new API function is added at the end of functions like take or put with non-contiguous arguments; explicit tests have also been added. The only user-facing effect is to allow using an nditer as a context manager, so the lines above would become

    with nditer(a, op_flags=...) as i:
        # do something with i
    # data is written back when exiting

The pull request passes all tests on CPython. The last commit makes the use of an nditer context manager mandatory on PyPy if the UPDATEIFCOPY semantics are triggered, while allowing existing code to function without a warning on CPython. Note that np.nested_iters[2] is problematic, in that it currently returns a tuple of nditers, which AFAICT cannot be made into a context manager, so __enter__ and __exit__ must be called manually for the nditers.

At some future point we could decide to deprecate the non-context-managed use on CPython as well, following a cycle of first issuing a deprecation warning for a few release versions.

Any thoughts? Does making nditer a context manager make sense? How widespread is the use of nested_iters in live code (it does not appear in the official NumPy documentation)?

Thanks,
Matti

[0] https://docs.scipy.org/doc/numpy/reference/generated/numpy.nditer.html
[1] https://github.com/numpy/numpy/pull/9269
[2] https://github.com/numpy/numpy/blob/master/numpy/core/tests/test_nditer.py#L2344
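A concrete sketch of what the proposed usage might look like, assuming the semantics described in the pull request (the writeback to 'a' is tied to leaving the 'with' block rather than to deallocation; the array here is invented for the example):

    import numpy as np

    a = np.arange(6, dtype=float).reshape(2, 3)

    with np.nditer(a, op_flags=[['readwrite']]) as it:
        for x in it:
            x[...] = 2 * x   # modify through the iterator

    # on exit from the 'with' block any temporary copies are resolved,
    # so 'a' is guaranteed to be updated here, even on PyPy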
From paul.carrico at free.fr Tue Jul 4 03:32:42 2017
From: paul.carrico at free.fr (paul.carrico at free.fr)
Date: Tue, 04 Jul 2017 09:32:42 +0200
Subject: [Numpy-discussion] New python/numpy user
In-Reply-To: <46dd9396-832d-40eb-a418-386b85d38c30@Spark>
References: <46dd9396-832d-40eb-a418-386b85d38c30@Spark>
Message-ID:

Hi Brian,

First of all thanks for the answer, the explanations and the advice; as mentioned, I have to think differently in order to code efficiently.

1. Yes, 'mat[:,0]=vect0' works fine and I understand why :-)

2. More generally, I've read some tutorials and presentations saying that Numpy is faster than native Python; regarding the (huge) size of my matrices, vectorization, Numpy (and others) + optimization of calls are the basics.

Thanks

Paul

On 2017-07-04 01:04, Brian Blais wrote:
> There are a couple of interesting observations here. In your first bit, you have:
> [...]

From charlesr.harris at gmail.com Tue Jul 4 22:27:05 2017
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 4 Jul 2017 20:27:05 -0600
Subject: [Numpy-discussion] New python/numpy user
In-Reply-To:
References: <46dd9396-832d-40eb-a418-386b85d38c30@Spark>
Message-ID:

On Tue, Jul 4, 2017 at 1:32 AM, paul.carrico at free.fr wrote:
> Hi Brian,
>
> First of all thanks for the answer, the explanations and the advice; as
> mentioned, I have to think differently in order to code efficiently.
> [...]

Note that NumPy is not the best for large text files, for instance, pandas is faster. Also note that the custom here is bottom posting. And welcome to Python ;)

Chuck

From ralf.gommers at gmail.com Wed Jul 5 05:43:38 2017
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Wed, 5 Jul 2017 21:43:38 +1200
Subject: [Numpy-discussion] Scipy 2017 NumPy sprint
In-Reply-To:
References: <1498760145.3918433.1025609912.4766E01C@webmail.messagingengine.com> <7576d1af-b1f2-341c-de6f-69b664f88b79@iki.fi> <49c7f45e-2e4e-b64a-7fc1-ec54e8d78b87@gmail.com> <1499009599.3435.5.camel@sipsolutions.net>
Message-ID:

On Mon, Jul 3, 2017 at 7:01 AM, Charles R Harris wrote:
> [...]
> I've opened an issue for Pytests and given it a "Scipy2017 Sprint" label.
> I'd be much obliged if the folks with suggestions here would open other
> issues and also label them with "Scipy2017 Sprint". Note that these issues
> are not Scipy 2017 specific, they could be used in other contexts, but I
> thought it might be useful to collect them in one spot and give them some
> structure together with suggestions on how to proceed.
>
> Ralf, you have made several previous suggestions on bringing over some of
> the scipy tests to numpy, to include documentation testing. Were there any
> other tests we should look into?

Better platform test coverage would be a useful topic if someone is willing to work on that. NumPy needs OS X testing enabled on TravisCI, SciPy needs OS X and a 32-bit test (steal from NumPy). And if someone really feels ambitious: replace ATLAS by OpenBLAS in one of the test matrix entries.

Ralf

From p.j.a.cock at googlemail.com Wed Jul 5 06:14:44 2017
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 5 Jul 2017 11:14:44 +0100
Subject: [Numpy-discussion] Scipy 2017 NumPy sprint
In-Reply-To:
References: <1498760145.3918433.1025609912.4766E01C@webmail.messagingengine.com> <7576d1af-b1f2-341c-de6f-69b664f88b79@iki.fi> <49c7f45e-2e4e-b64a-7fc1-ec54e8d78b87@gmail.com> <1499009599.3435.5.camel@sipsolutions.net>
Message-ID:

Note that TravisCI does not yet have official Python support on Mac OS X,

https://github.com/travis-ci/travis-ci/issues/2312

I believe it is possible to do anyway by faking it under another setting (e.g. pretend to be a generic language build, and use the system Python or install your own specific version of Python as needed), so that may be worth trying during a sprint.

Peter

On Wed, Jul 5, 2017 at 10:43 AM, Ralf Gommers wrote:
> Better platform test coverage would be a useful topic if someone is willing
> to work on that. NumPy needs OS X testing enabled on TravisCI, SciPy needs
> OS X and a 32-bit test (steal from NumPy). And if someone really feels
> ambitious: replace ATLAS by OpenBLAS in one of the test matrix entries.
>
> Ralf

From ralf.gommers at gmail.com Wed Jul 5 06:25:42 2017
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Wed, 5 Jul 2017 22:25:42 +1200
Subject: [Numpy-discussion] Scipy 2017 NumPy sprint
In-Reply-To:
References: <1498760145.3918433.1025609912.4766E01C@webmail.messagingengine.com> <7576d1af-b1f2-341c-de6f-69b664f88b79@iki.fi> <49c7f45e-2e4e-b64a-7fc1-ec54e8d78b87@gmail.com> <1499009599.3435.5.camel@sipsolutions.net>
Message-ID:

On Wed, Jul 5, 2017 at 10:14 PM, Peter Cock wrote:
> Note that TravisCI does not yet have official Python support on Mac OS X,
>
> https://github.com/travis-ci/travis-ci/issues/2312
>
> I believe it is possible to do anyway by faking it under another setting
> (e.g. pretend to be a generic language build, and use the system Python
> or install your own specific version of Python as needed), so that may be
> worth trying during a sprint.

That approach has worked reliably for https://github.com/MacPython/numpy-wheels for a while now, so should be straightforward.

Ralf

-------------- next part --------------
An HTML attachment was scrubbed...
URL: From p.j.a.cock at googlemail.com Wed Jul 5 06:31:05 2017 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 5 Jul 2017 11:31:05 +0100 Subject: [Numpy-discussion] Scipy 2017 NumPy sprint In-Reply-To: References: <1498760145.3918433.1025609912.4766E01C@webmail.messagingengine.com> <7576d1af-b1f2-341c-de6f-69b664f88b79@iki.fi> <49c7f45e-2e4e-b64a-7fc1-ec54e8d78b87@gmail.com> <1499009599.3435.5.camel@sipsolutions.net> Message-ID: On Wed, Jul 5, 2017 at 11:25 AM, Ralf Gommers wrote: > > > On Wed, Jul 5, 2017 at 10:14 PM, Peter Cock > wrote: >> >> Note that TravisCI does not yet have official Python support on Mac OS X, >> >> https://github.com/travis-ci/travis-ci/issues/2312 >> >> I believe it is possible to do anyway by faking it under another setting >> (e.g. pretend to be a generic language build, and use the system Python >> or install your own specific version of Python as needed), so that may be >> worth trying during a sprint. > > > That approach has worked reliably for > https://github.com/MacPython/numpy-wheels for a while now, so should be > straightforward. > > Ralf Thanks for that link - I'm going off topic but the MacPython wiki page goes into more background about how they build wheels for PyPI which I'm very interested to read up on: https://github.com/MacPython/wiki/wiki/Spinning-wheels Peter From matthew.brett at gmail.com Wed Jul 5 06:30:57 2017 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 5 Jul 2017 11:30:57 +0100 Subject: [Numpy-discussion] Scipy 2017 NumPy sprint In-Reply-To: References: <1498760145.3918433.1025609912.4766E01C@webmail.messagingengine.com> <7576d1af-b1f2-341c-de6f-69b664f88b79@iki.fi> <49c7f45e-2e4e-b64a-7fc1-ec54e8d78b87@gmail.com> <1499009599.3435.5.camel@sipsolutions.net> Message-ID: On Wed, Jul 5, 2017 at 11:25 AM, Ralf Gommers wrote: > > > On Wed, Jul 5, 2017 at 10:14 PM, Peter Cock > wrote: >> >> Note that TravisCI does not yet have official Python support on Mac OS X, >> >> https://github.com/travis-ci/travis-ci/issues/2312 >> >> I believe it is possible to do anyway by faking it under another setting >> (e.g. pretend to be a generic language build, and use the system Python >> or install your own specific version of Python as needed), so that may be >> worth trying during a sprint. > > > That approach has worked reliably for > https://github.com/MacPython/numpy-wheels for a while now, so should be > straightforward. And https://travis-ci.org/MacPython/scipy-wheels where we are testing OSX, 64 and 32 bit manylinux builds daily. That didn't catch the recent ndimage error because I'd disabled the 32-bit builds there. Numpy, scipy, and a fairly large number of other projects use https://github.com/matthew-brett/multibuild to set up builds in this way for manylinux, OSX and (with a bit more effort) Windows. 
Cheers,

Matthew

From matthew.brett at gmail.com Wed Jul 5 06:38:49 2017
From: matthew.brett at gmail.com (Matthew Brett)
Date: Wed, 5 Jul 2017 11:38:49 +0100
Subject: [Numpy-discussion] Scipy 2017 NumPy sprint
In-Reply-To:
References: <1498760145.3918433.1025609912.4766E01C@webmail.messagingengine.com> <7576d1af-b1f2-341c-de6f-69b664f88b79@iki.fi> <49c7f45e-2e4e-b64a-7fc1-ec54e8d78b87@gmail.com> <1499009599.3435.5.camel@sipsolutions.net>
Message-ID:

On Wed, Jul 5, 2017 at 11:31 AM, Peter Cock wrote:
> [...]
> Thanks for that link - I'm going off topic but the MacPython wiki page goes
> into more background about how they build wheels for PyPI which I'm
> very interested to read up on:
>
> https://github.com/MacPython/wiki/wiki/Spinning-wheels

Yes, you'll see that the multibuild framework that numpy and scipy use includes utilities to download Python.org Python and build against that, in Spinning-wheels fashion.

Cheers,

Matthew

From paul.carrico at free.fr Wed Jul 5 08:41:49 2017
From: paul.carrico at free.fr (paul.carrico at free.fr)
Date: Wed, 05 Jul 2017 14:41:49 +0200
Subject: [Numpy-discussion] record data previous to Numpy use
Message-ID: <7f45847ca0c1184e86ecde96adddc2f0@free.fr>

Dear all,

I'm sorry if my question is too basic (not fully in relation to Numpy - while it is to build matrices and to work with Numpy afterward), but I'm spending a lot of time and effort to find a way to read data from an ascii file and reassign it into a matrix/array - so far unsuccessfully!

The only way I found is to use the 'append()' instruction, involving dynamic memory allocation. :-(

From my current experience under Scilab (a Matlab-like scientific solver), the approach is well known:

* Step 1: matrix initialization, like 'np.zeros((n,n))'
* Step 2: read the data
* Step 3: write it into the matrix

I'm obviously influenced by my current experience, but I'm interested in moving to Python and its packages.

For huge ascii files (involving dozens of millions of lines), my strategy is to work by 'blocks', as follows:

* Find the line index of the beginning and the end of one block (this implies that the file is read once)
* Read the block
* (process repeated on the different other blocks)

I tried different codes such as the one below, but each time Python tells me I cannot mix iteration and read methods:

#############################################
position = []; j = 0
with open(PATH + file_name, "r") as rough_data:
    for line in rough_data:
        if my_criteria in line:
            position.append(j)  ## huge blocks but limited in number
        j = j + 1

i = 0
blockdata = np.zeros((size_block), dtype=np.float)
with open(PATH + file_name, "r") as f:
    for line in itertools.islice(f, 1, size_block):
        blockdata[i] = float(f.readline())
        i = i + 1
#########################################

Should I work on lists using f.readlines (but this implies loading all the file in memory)?
Additional question: can I use record with vectorization, with 'i = np.arange(0,65406)', if I remain in the previous example?

Thanks for your time and comprehension (I'm obviously interested in doc references covering those specific tasks).

Paul

PS: for Chuck: I'll have a look at the pandas package, but in a code optimization step :-) (nearly 2000 doc pages)
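For the block-reading pattern Paul describes, one way to fill a preallocated array without append() is a sketch like the following (the file name and block size are invented, and each line of the block is assumed to hold a single number):

    import itertools
    import numpy as np

    size_block = 65406                             # hypothetical block length
    blockdata = np.zeros(size_block, dtype=float)  # step 1: preallocate

    with open('results.txt', 'r') as f:            # hypothetical file
        # islice yields the block's lines and enumerate supplies the index,
        # so the file handle is only iterated, never mixed with readline()
        for i, line in enumerate(itertools.islice(f, 1, 1 + size_block)):
            blockdata[i] = float(line)

    # np.genfromtxt(fname, skip_header=1, max_rows=size_block) is a shorter
    # alternative for simple one-column blocks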
>>> >
>>> > I can see an argument for distributing it in numpy if it is designed
>>> > to be specially aware of ndarrays or numpy scalars (eg to test equality
>>> > between 'wants' and 'got')
>>>
>>> I don't really feel it is very numpy specific or should be under the
>>> numpy umbrella (I mean if there is no other spot, I guess it could live
>>> on the numpy github page). It's about as numpy-specific as the gallery
>>> sphinx extension is matplotlib-specific...
>>>
>>> That doesn't mean that it might not be a good sprint, though :).
>>>
>>> The question to me is a bit what those who actually go there want from
>>> it, or do a few people who know numpy/scipy already plan to come? Two
>>> years ago, we did not have much of a plan, so it was mostly giving
>>> three people or so a bit of a tutorial of how numpy worked internally,
>>> leading to some bug fixes.
>>>
>>> One quick idea that might be nice and dives a bit into the C-layer
>>> (might be nice if there is no big topic with a few people working on it):
>>>
>>> * Find places that should have the new memory overlap
>>> detection and implement it there.
>>>
>>> If someone who does subclasses/array-likes or so (e.g. like Stefan
>>> Hoyer ;)) is interested, and also we do some
>>> teleconferencing/chatting (and I have time)... I might be interested
>>> in discussing and possibly trying to develop the new indexer ideas,
>>> which I feel are pretty far along, but I got stuck on how to get
>>> subclasses right.
>>>
>>> - Sebastian
>>
>> I've opened an issue for Pytests and given it a "Scipy2017 Sprint" label.
>> I'd be much obliged if the folks with suggestions here would open other
>> issues and also label them with "Scipy2017 Sprint". Note that these issues
>> are not Scipy 2017 specific, they could be used in other contexts, but I
>> thought it might be useful to collect them in one spot and give them some
>> structure together with suggestions on how to proceed.
>>
>> Ralf, you have made several previous suggestions on bringing over some of
>> the scipy tests to numpy, to include documentation testing. Were there any
>> other tests we should look into?
>
> Better platform test coverage would be a useful topic if someone is
> willing to work on that. NumPy needs OS X testing enabled on TravisCI,
> SciPy needs OS X and a 32-bit test (steal from NumPy). And if someone
> really feels ambitious: replace ATLAS by OpenBLAS in one of the test matrix
> entries.

I can help with that, especially OpenBLAS. Though I would not mind working on something else than packaging :)

David

From charlesr.harris at gmail.com  Wed Jul  5 10:55:31 2017
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 5 Jul 2017 08:55:31 -0600
Subject: [Numpy-discussion] Scipy 2017 NumPy sprint
Message-ID:

Lots of good ideas here. It would help if issues were opened for them and flagged with the sprint label. I'll be doing some myself, but I'm not as intimately familiar with some of the topics as the proposers are.
Chuck

From chris.barker at noaa.gov  Wed Jul  5 13:40:03 2017
From: chris.barker at noaa.gov (Chris Barker)
Date: Wed, 5 Jul 2017 10:40:03 -0700
Subject: [Numpy-discussion] Scipy 2017 NumPy sprint
Message-ID:

On Mon, Jul 3, 2017 at 4:27 PM, Stephan Hoyer wrote:
>> If someone who does subclasses/array-likes or so (e.g. like Stefan
>> Hoyer ;)) is interested, and also we do some
>> teleconferencing/chatting (and I have time)... I might be interested
>> in discussing and possibly trying to develop the new indexer ideas,
>> which I feel are pretty far along, but I got stuck on how to get
>> subclasses right.
>
> I am of course very happy to discuss this (online or via teleconference,
> sadly I won't be at scipy), but to be clear I use array likes, not
> subclasses. I think Marten van Kerkwijk is the last one who thinks that is
> still a good idea :).

Indeed -- I thought the community more or less had decided that duck-typing was THE way to make something that could be plugged in where a numpy array is expected.

Along those lines, there was some discussion of having a set of utilities (or maybe even an ABC?) that would make it easier to create an ndarray-like object.

That is, the boilerplate needed for multi-dimensional indexing and slicing, etc...

That could be a nice little sprint-able project.

-CHB

From shoyer at gmail.com  Wed Jul  5 14:05:40 2017
From: shoyer at gmail.com (Stephan Hoyer)
Date: Wed, 5 Jul 2017 11:05:40 -0700
Subject: [Numpy-discussion] Scipy 2017 NumPy sprint
Message-ID:

On Wed, Jul 5, 2017 at 10:40 AM, Chris Barker wrote:
> Along those lines, there was some discussion of having a set of utilities
> (or maybe even an ABC?) that would make it easier to create an ndarray-like
> object.
>
> That is, the boilerplate needed for multi-dimensional indexing and
> slicing, etc...
>
> That could be a nice little sprint-able project.

Indeed. Let me highlight a few mixins that I wrote for xarray that might be more broadly useful. The challenge here is that there are quite a few different meanings to "ndarray-like", so mixins really need to be mix-and-match-able. But at least defining a base list of methods to implement/override would be useful.

In NumPy, this could go along with NDArrayOperatorsMixin in numpy/lib/mixins.py

From tcaswell at gmail.com  Wed Jul  5 14:19:40 2017
From: tcaswell at gmail.com (Thomas Caswell)
Date: Wed, 05 Jul 2017 18:19:40 +0000
Subject: [Numpy-discussion] record data previous to Numpy use
Message-ID:

Are you tied to ASCII files?
HDF5 (via h5py or pytables) might be a better storage format for what you are describing.

Tom

On Wed, Jul 5, 2017 at 8:42 AM wrote:
> Dear all
>
> [...]
>
> For huge ascii files (involving dozens of millions of lines), my strategy
> is to work by 'blocks' [...]
>
> Should I work on lists using f.readlines (but this implies loading the
> whole file into memory)?
>
> [...]
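Purely to illustrate Tom's suggestion, a minimal h5py round-trip (the file name, dataset names, and block contents below are hypothetical, not taken from the thread):

    import h5py
    import numpy as np

    blocks = [np.arange(6), np.arange(25)]   # stand-ins for parsed sub-blocks
    with h5py.File('blocks.h5', 'w') as f:
        for i, b in enumerate(blocks):
            # one compressed dataset per sub-block
            f.create_dataset('block_%03d' % i, data=b, compression='gzip')

    with h5py.File('blocks.h5', 'r') as f:
        first = f['block_000'][:]            # reads back as a numpy array

Once the data is in HDF5, each block can be loaded independently without re-parsing any text, which is where the post-processing speedup comes from.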
From paul.carrico at free.fr  Wed Jul  5 14:39:49 2017
From: paul.carrico at free.fr (paul.carrico at free.fr)
Date: Wed, 05 Jul 2017 20:39:49 +0200
Subject: [Numpy-discussion] record data previous to Numpy use
Message-ID:

Hi

Thanks for the answer: the ascii file is an input format (and the only one I can deal with); HDF5 might be an export one (it's one of the options) in order to speed up the post-processing stage.

Paul

On 2017-07-05 20:19, Thomas Caswell wrote:
> Are you tied to ASCII files? HDF5 (via h5py or pytables) might be a
> better storage format for what you are describing.
>
> Tom
>
> [...]

From robert.kern at gmail.com  Wed Jul  5 18:21:36 2017
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 5 Jul 2017 15:21:36 -0700
Subject: [Numpy-discussion] record data previous to Numpy use
Message-ID:

On Wed, Jul 5, 2017 at 5:41 AM, wrote:
>
> The only way I found is to use the 'append()' instruction, involving
> dynamic memory allocation. :-(

Are you talking about appending to Python list objects? Or the np.append() function on numpy arrays?
In my experience, it is usually fine to build a list with the `.append()` method while reading a file of unknown size and then convert it to an array afterwards, even for dozens of millions of lines. The list object is quite smart about reallocating memory, so it is not that expensive. You should generally avoid the np.append() function, though; it is not smart.

> From my current experience under Scilab (a Matlab-like scientific
> solver), the workflow is well known:
>
> Step 1: matrix initialization, e.g. 'np.zeros((n,n))'
> Step 2: record the data
> Step 3: write it into the matrix
>
> For huge ascii files (involving dozens of millions of lines), my strategy
> is to work by 'blocks':
>
> Find the line index of the beginning and the end of one block (this
> implies that the file is read once)
> Read the block
> (process repeated on the different other blocks)

Are the blocks intrinsic parts of the file? Or are you just trying to break up the file into fixed-size chunks?

> I tried different codes such as the one below, but each time Python is
> telling me I cannot mix iteration and record method
>
> #############################################
> position = []; j=0
> with open(PATH + file_name, "r") as rough_data:
>     for line in rough_data:
>         if my_criteria in line:
>             position.append(j)  ## huge blocks but limited in number
>         j=j+1
>
> i = 0
> blockdata = np.zeros((size_block), dtype=np.float)
> with open(PATH + file_name, "r") as f:
>     for line in itertools.islice(f, 1, size_block):
>         blockdata[i] = float(f.readline())

For what it's worth, this is the line that is causing the error that you describe. When you iterate over the file with the `for line in itertools.islice(f, ...):` loop, you already have the line text. You don't (and can't) call `f.readline()` to get it again. It would mess up the iteration if you did and cause you to skip lines.

By the way, it helps us help you if you copy-paste the exact code that you are running, as well as the full traceback, instead of paraphrasing the error message.

-- Robert Kern
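As a minimal sketch of the list-then-array idiom Robert describes (assuming, hypothetically, one float per line -- not Paul's exact format):

    import numpy as np

    values = []
    with open('data.txt') as f:          # hypothetical file name
        for line in f:
            values.append(float(line))   # list.append is cheap (amortized O(1))

    arr = np.array(values)               # one conversion to an ndarray at the end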
From derek at astro.physik.uni-goettingen.de  Wed Jul  5 17:53:31 2017
From: derek at astro.physik.uni-goettingen.de (Derek Homeier)
Date: Wed, 5 Jul 2017 23:53:31 +0200
Subject: [Numpy-discussion] record data previous to Numpy use
Message-ID:

Hi Paul,

> ascii file is an input format (and the only one I can deal with)
>
> HDF5 one might be an export one (it's one of the options) in order to
> speed up the post-processing stage
>
> [...]

if you are indeed tied to using ASCII input data, you will of course have to deal with significant performance handicaps, but there are at least some gains to be had by using an input parser that does not do all the conversions at the Python level, but with a compiled (C) reader - either pandas, as Tom already mentioned, or astropy - see e.g.

https://github.com/dhomeier/astropy-notebooks/blob/master/io/ascii/ascii_read_bench.ipynb

for the almost one order of magnitude speed gains you may get.

In your example it is not clear what 'record' method you were trying to use that raised the errors you mention - we would certainly need a full traceback of the error to find out more.

In principle your approach of allocating the numpy matrix first and reading the data in chunks makes sense, as it will avoid the much larger temporary lists created during read-in. But it might be more convenient to just read the block into a list of lines and pass that to a higher-level reader like np.genfromtxt or the faster astropy.io.ascii.read or pandas.read_csv to speed up the parsing of the numbers themselves.

That said, on most systems these readers should still be able to handle files up to a few 10^8 items (expect ~ 25-55 bytes of memory for each input number allocated for temporary lists), so if saving memory is not an absolute priority, directly reading the entire file might still be the best choice (and would also save the first pass reading).

Cheers,
Derek
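A sketch of the chunked reading Derek describes, under the simplifying (hypothetical) assumption that each block is preceded by a line giving its length; np.genfromtxt accepts a list of strings directly:

    import itertools
    import numpy as np

    blocks = []
    with open('data.txt') as f:                   # hypothetical block-structured file
        for header in f:                          # first line of a block: its size
            n = int(header)
            lines = list(itertools.islice(f, n))  # grab exactly n lines, no re-reading
            blocks.append(np.genfromtxt(lines))   # hand the chunk to a higher-level parser

astropy.io.ascii.read (or pandas.read_csv via an io.StringIO wrapper) could be swapped in for np.genfromtxt on the last line for the speed gains Derek mentions.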
From robbmcleod at gmail.com  Wed Jul  5 20:41:00 2017
From: robbmcleod at gmail.com (Robert McLeod)
Date: Wed, 5 Jul 2017 17:41:00 -0700
Subject: [Numpy-discussion] record data previous to Numpy use
Message-ID:

While I'm going to bet that the fastest way to build an ndarray from ascii is with an `io.BytesIO` stream, NumPy does have a function to load from text, `numpy.loadtxt`, that works well enough for most purposes.

https://docs.scipy.org/doc/numpy/reference/generated/numpy.loadtxt.html

It's hard to tell from the original post if the ascii is being continuously generated or not. If it's being produced in an on-going fashion then a stream object is definitely the way to go, as the array chunks can be produced by `numpy.frombuffer()`.

https://docs.python.org/3/library/io.html
https://docs.scipy.org/doc/numpy/reference/generated/numpy.frombuffer.html

Robert

On Wed, Jul 5, 2017 at 3:21 PM, Robert Kern wrote:
> In my experience, it is usually fine to build a list with the `.append()`
> method while reading a file of unknown size and then convert it to an
> array afterwards, even for dozens of millions of lines.
>
> [...]

--
Robert McLeod, Ph.D.
robert.mcleod at unibas.ch
robert.mcleod at bsse.ethz.ch
robbmcleod at gmail.com
From paul.carrico at free.fr  Thu Jul  6 04:49:26 2017
From: paul.carrico at free.fr (paul.carrico at free.fr)
Date: Thu, 06 Jul 2017 10:49:26 +0200
Subject: [Numpy-discussion] record data previous to Numpy use
Message-ID:

Dear All

First of all thanks for the answers and the information (I'll dig into it), and let me try to add comments on what I want to do:

* My ascii file mainly contains data (float and int) in a single column
* (it is not always the case but I can easily manage it - as well I saw I can use the 'split' instruction if necessary)
* Comments/texts indicate the beginning of a block, immediately followed by the number of sub-blocks
* So I need to read/record all the values in order to build a matrix before working on it (using Numpy & vectorization)
* The columns 2 and 3 have been added for further treatments
* The '0' values will be specifically treated afterward

Numpy won't be a problem I guess (I did some basic tests and I'm quite confident) on how to proceed, but I'm really blocked on data recording ... I'm trying to find a way to efficiently read and record data in a matrix:

* avoiding dynamic memory allocation (here using 'append' in the python meaning, not np),
* dealing with huge ascii files: the latest file I got contains more than 60 MILLION LINES

Please find in attachment an extract of the input format ('example_of_input'), and the matrix I'm trying to create and manage with Numpy

Thanks again for your time

Paul

#######################################
##BEGIN   -> line number x in the original file
42        -> indicates the number of sub-blocks
1         -> number of the 1st sub-block
6         -> gives how many values belong to the sub-block
12
47
2
46
3
51
....
13        -> another type of sub-block with 25 values
25
15
88
21
42
22
76
19
89
0
18
80
23
38
24
73
20
81
0
90
0
41
0
39
0
77
...
42        -> another type of sub-block with 2 values
2
115
109

#######################################
THE MATRIX RESULT

1 0 0 6 12 47 2 46 3 51 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
2 0 0 6 3 50 11 70 12 51 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
3 0 0 8 11 50 3 49 4 54 5 57 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
4 0 0 8 12 70 11 66 9 65 10 68 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
5 0 0 8 2 47 12 68 10 44 1 43 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
6 0 0 8 5 56 6 58 7 61 11 57 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
7 0 0 8 11 61 7 60 8 63 9 66 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
8 0 0 19 12 47 2 46 3 51 0 13 97 14 92 15 96 0 72 0 48 0 52 0 0 0 0 0 0
9 0 0 19 13 97 14 92 15 96 0 16 86 17 82 18 85 0 95 0 91 0 90 0 0 0 0 0 0
10 0 0 19 3 50 11 70 12 51 0 15 89 19 94 13 96 0 52 0 71 0 72 0 0 0 0 0 0
11 0 0 19 15 89 19 94 13 96 0 18 81 20 84 16 85 0 90 0 77 0 95 0 0 0 0 0 0
12 0 0 25 3 49 4 54 5 57 11 50 0 15 88 21 42 22 76 19 89 0 52 0 53 0 55 0 71
13 0 0 25 15 88 21 42 22 76 19 89 0 18 80 23 38 24 73 20 81 0 90 0 41 0 39 0 77
14 0 0 25 11 66 9 65 10 68 12 70 0 19 78 25 99 26 98 13 94 0 71 0 67 0 69 0 72
....

#######################################
AN EXAMPLE OF THE CODE I STARTED TO WRITE

# -*- coding: utf-8 -*-
import time, sys, os, re
import itertools
import numpy as np

PATH = str(os.path.abspath(''))
input_file_name = '/example_of_input.txt'

## check if the file exists, then if it's empty or not
if (os.path.isfile(PATH + input_file_name)):
    if (os.stat(PATH + input_file_name).st_size > 0):
        ## go through the file in order to find specific sentences
        ## specific blocks will be defined afterward
        Block_position = []; j=0;
        with open(PATH + input_file_name, "r") as data:
            for line in data:
                if '##BEGIN' in line:
                    Block_position.append(j)
                j=j+1

        ## just tests to get all the values
        # i = 0
        # data = np.zeros( (505), dtype=np.int )
        # with open(PATH + input_file_name, "r") as f:
        #     for i in range(0, 505):
        #         data[i] = int(f.read(Block_position[0]+1+i))
        #         print ("i = ", i)
        #     for line in itertools.islice(f, Block_position[0], 516):
        #         data[i] = f.read(0+i)
        #         i = i+1
    else:
        print "The file %s is empty : post-processing cannot be performed !!!\n" % input_file_name
else:
    print "Error : the file %s does not exist: post-processing stops !!!\n" % input_file_name

From robert.kern at gmail.com  Thu Jul  6 05:51:55 2017
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 6 Jul 2017 02:51:55 -0700
Subject: [Numpy-discussion] record data previous to Numpy use
Message-ID:

On Thu, Jul 6, 2017 at 1:49 AM, wrote:
> Numpy won't be a problem I guess (I did some basic tests and I'm quite
> confident) on how to proceed, but I'm really blocked on data recording ...
> I'm trying to find a way to efficiently read and record data in a matrix:
>
> avoiding dynamic memory allocation (here using 'append' in the python
> meaning, not np),

Although you can avoid some list appending in your case (because the blocks self-describe their length), I would caution you against prematurely avoiding it. It's often the most natural way to write the code in Python, so go ahead and write it that way first. Once you get it working correctly, but it's too slow or memory intensive, then you can puzzle over how to preallocate the numpy arrays later. But quite often, it's fine. In this case, the reading and handling of the text data itself is probably the bottleneck, not appending to the lists. As I said, Python lists are cleverly implemented to make appending fast. Accumulating numbers in a list then converting to an array afterwards is a well-accepted numpy idiom.

> dealing with huge ascii files: the latest file I got contains more than
> 60 million lines
>
> Please find in attachment an extract of the input format
> ('example_of_input'), and the matrix I'm trying to create and manage with
> Numpy
>
> Thanks again for your time

Try something like the attached.
The function will return a list of blocks. Each block will itself be a list of numpy arrays, which are the sub-blocks themselves. I didn't bother adding the first three columns to the sub-blocks or trying to assemble them all into a uniform-width matrix by padding with trailing 0s. Since you say that the trailing 0s are going to be "specially treated afterwards", I suspect that you can more easily work with the lists of arrays instead. I assume floating-point data rather than trying to figure out whether int or float from the data. The code can handle multiple data values on one line (not especially well-tested, but it ought to work), but it assumes that the number of sub-blocks, the index of the sub-block, and the sub-block size are each on their own line. The code gets a little more complicated if that's not the case.

-- Robert Kern

-------------- next part --------------
A non-text attachment was scrubbed...
Name: read_blocks.py
Type: text/x-python-script
Size: 3093 bytes
Desc: not available
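The read_blocks.py attachment itself is not preserved in the archive, so the following is only a guess at its shape -- a sketch of a reader for the self-describing block format shown above (one value per line, '#'-prefixed marker lines), not Robert's actual code:

    import numpy as np

    def read_blocks(path):
        blocks = []
        with open(path) as f:
            # flatten the file into a stream of tokens, skipping marker lines
            tokens = (tok for line in f
                      if not line.startswith('#')
                      for tok in line.split())
            try:
                while True:
                    n_sub = int(next(tokens))        # number of sub-blocks
                    sub_blocks = []
                    for _ in range(n_sub):
                        _index = int(next(tokens))   # sub-block number (unused here)
                        size = int(next(tokens))     # how many values follow
                        sub_blocks.append(np.array(
                            [float(next(tokens)) for _ in range(size)]))
                    blocks.append(sub_blocks)
            except StopIteration:                    # end of file
                pass
        return blocks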
From paul.carrico at free.fr  Thu Jul  6 06:19:16 2017
From: paul.carrico at free.fr (paul.carrico at free.fr)
Date: Thu, 06 Jul 2017 12:19:16 +0200
Subject: [Numpy-discussion] record data previous to Numpy use
Message-ID:

Thanks Robert for your effort - I'll have a look at it ... the goal is to be guided in how to proceed (and to understand), and not to have a "ready-made solution" ... but I appreciate it honestly :-)

Paul

On 2017-07-06 11:51, Robert Kern wrote:
> Although you can avoid some list appending in your case (because the
> blocks self-describe their length), I would caution you against
> prematurely avoiding it. [...] Accumulating numbers in a list then
> converting to an array afterwards is a well-accepted numpy idiom.

From bennyrowland at mac.com  Thu Jul  6 07:42:59 2017
From: bennyrowland at mac.com (Ben Rowland)
Date: Thu, 06 Jul 2017 12:42:59 +0100
Subject: [Numpy-discussion] Scipy 2017 NumPy sprint
Message-ID:

> On 5 Jul 2017, at 19:05, Stephan Hoyer wrote:
>
> Indeed. Let me highlight a few mixins that I wrote for xarray that might
> be more broadly useful. [...] In NumPy, this could go along with
> NDArrayOperatorsMixin in numpy/lib/mixins.py

Slightly off topic, but as someone who has just spent a fair amount of time implementing various subclasses of ndarray, I am interested (and a little concerned) that the consensus is not to use them. Is there anything available which explains why this is the case and what the alternatives are?

Ben

From charlesr.harris at gmail.com  Thu Jul  6 09:10:18 2017
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 6 Jul 2017 07:10:18 -0600
Subject: [Numpy-discussion] Making a 1.13.2 release
Message-ID:

Hi All,

I've delayed the NumPy 1.13.2 release hoping for Python 3.6.2 to show up fixing #29943 so we can close #9272, but the Python release has been delayed to July 11 (expected). The Python problem means that NumPy compiled with Python 3.6.1 will not run in Python 3.6.0.
However, I've also been asked to have a bugfixed version of 1.13 available for Scipy 2017 next week. At this point it looks like the best thing to do is release 1.13.1 compiled with Python 3.6.1 and ask folks to upgrade Python if they have a problem, and then release 1.13.2 as soon as 3.6.2 is released.

Thoughts?

Chuck

From matthew.brett at gmail.com  Thu Jul  6 09:15:41 2017
From: matthew.brett at gmail.com (Matthew Brett)
Date: Thu, 6 Jul 2017 14:15:41 +0100
Subject: [Numpy-discussion] Making a 1.13.2 release
Message-ID:

Hi,

On Thu, Jul 6, 2017 at 2:10 PM, Charles R Harris wrote:
> I've delayed the NumPy 1.13.2 release hoping for Python 3.6.2 to show up
> fixing #29943 so we can close #9272, but the Python release has been
> delayed to July 11 (expected). [...]

I think this problem only applies to Windows. We might be able to downgrade the Appveyor Python 3.6.1 to 3.6.0 for that - I can look into it today if it would help.

While I'm at it - how about switching to OpenBLAS wheels on Windows for this release?

Cheers,

Matthew

From charlesr.harris at gmail.com  Thu Jul  6 10:37:36 2017
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 6 Jul 2017 08:37:36 -0600
Subject: [Numpy-discussion] Making a 1.13.2 release
Message-ID:

On Thu, Jul 6, 2017 at 7:15 AM, Matthew Brett wrote:
> I think this problem only applies to Windows. We might be able to
> downgrade the Appveyor Python 3.6.1 to 3.6.0 for that - I can look
> into it today if it would help.
>
> While I'm at it - how about switching to OpenBLAS wheels on Windows
> for this release?

Haste makes waste ;) I'd rather put off the move to OpenBLAS to 1.14 to allow more time for it to settle, and compiling against Python 3.6.0 seems like more work than it is worth. It should be easy to upgrade to 3.6.1 for those affected once they are aware of the problem, and it should not be too long before Python 3.6.2 is out. I'll call it the Scipy2017 release.

Chuck

From matthew.brett at gmail.com  Thu Jul  6 11:53:27 2017
From: matthew.brett at gmail.com (Matthew Brett)
Date: Thu, 6 Jul 2017 16:53:27 +0100
Subject: [Numpy-discussion] Making a 1.13.2 release
Message-ID:

On Thu, Jul 6, 2017 at 3:37 PM, Charles R Harris wrote:
> Haste makes waste ;) I'd rather put off the move to OpenBLAS to 1.14 to
> allow more time for it to settle,

I'd only say that I don't know of any settling that is likely to happen. I suspect that not many people have tried the experimental wheels. I've automated the build process both for OpenBLAS and the OpenBLAS wheels, and I believe those are solid now.

> and compiling against Python 3.6.0 seems
> like more work than it is worth,

Probably about two hours of futzing on Appveyor - your call - I'm happy not to do it :)

Cheers,

Matthew

From charlesr.harris at gmail.com  Thu Jul  6 12:33:57 2017
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 6 Jul 2017 10:33:57 -0600
Subject: [Numpy-discussion] Making a 1.13.2 release
Message-ID:

On Thu, Jul 6, 2017 at 9:53 AM, Matthew Brett wrote:
> I'd only say that I don't know of any settling that is likely to
> happen. I suspect that not many people have tried the experimental
> wheels. I've automated the build process both for OpenBLAS and the
> OpenBLAS wheels, and I believe those are solid now.

But it does add risk. We can deal with that in a regular release because of the betas and release candidates, but I'm not counting on any of those for 1.13.1 (2).

Chuck

From chris.barker at noaa.gov  Thu Jul  6 12:33:59 2017
From: chris.barker at noaa.gov (Chris Barker)
Date: Thu, 6 Jul 2017 09:33:59 -0700
Subject: [Numpy-discussion] record data previous to Numpy use
Message-ID:

OK, you have two performance "issues":

1) memory use: If you need to read a file to build a numpy array, and don't know how big it is when you start, you need to accumulate the values first and then make an array out of them. And numpy arrays are fixed size, so they cannot efficiently accumulate values.

The usual way to handle this is to read the data into a list with .append() or the like, and then make an array from it. This is quite fast -- lists are fast and efficient at appending. However, you are then storing (at least) a pointer and a python float object for each value, which is a lot more memory than a single float value in a numpy array, and you need to make the array from it, which means you have the full list and all its python floats AND the array in memory at once. Frankly, computers have a lot of memory these days, so this is a non-issue in most cases.

Nonetheless, a while back I wrote an extendable numpy array object to address just this issue. You can find the code on GitHub here:

https://github.com/PythonCHB/NumpyExtras/blob/master/numpy_extras/accumulator.py

I have not tested it with recent numpys but I expect it still works fine. It's also py2, but wouldn't take much to port.

In practice, it uses less memory than the "build a list, then make it into an array" approach, but isn't any faster unless you add (.extend) a bunch of values at once, rather than one at a time. (If you do it one at a time, the python-float-to-numpy-float conversion and the function call overhead take just as long.) But it will generally be as fast or faster than using a list, and use less memory, so a fine basis for a big ascii file reader. However, it looks like while your files may be huge, they hold a number of arrays, so each array may not be large enough to bother with any of this.

2) parsing and converting overhead -- for the most part, python/numpy text file reading code reads the text into a python string, converts it to python number objects, then puts them in a list or converts them to native numbers in an array. This whole process is a bit slow (though reading files is slow anyway, so usually not worth worrying about, which is why the built-in file reading methods do this). To improve this, you need to use code that reads the file and parses it in C, and puts it straight into a numpy array without passing through python. This is what the pandas (and I assume astropy) text file readers do.

But if you don't want those dependencies, there is the "fromfile()" function in numpy -- it is not very robust, but if your files are well-formed, then it is quite fast.
So your code would look something like:

    with open(the_filename) as infile:
        while True:
            line = infile.readline()
            if not line:
                break
            # work with line to figure out the next block
            if ready_to_read_a_block:
                # sep specifies that you are reading text, not binary!
                arr = np.fromfile(infile, dtype=np.int32, count=num_values, sep=' ')
                arr.shape = the_shape_it_should_be

But Robert is right -- get it to work with the "usual" methods -- i.e. put numbers in a list, then make an array out of it -- first, and then worry about making it faster.

-CHB

On Thu, Jul 6, 2017 at 1:49 AM, wrote:
> Dear All
>
> First of all thanks for the answers and the information (I'll dig into
> it), and let me try to add comments on what I want to do:
>
> [...]
>
> I'm trying to find a way to efficiently read and record data in a matrix:
> avoiding dynamic memory allocation, and dealing with huge ascii files: the
> latest file I got contains more than 60 million lines
>
> [...]

--
Christopher Barker, Ph.D.
Oceanographer
Chris.Barker at noaa.gov
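A tiny runnable illustration of the text-mode np.fromfile() call Chris sketches (the temporary file and its six values are made up for the example):

    import os
    import tempfile
    import numpy as np

    with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as f:
        f.write('12 47 2 46 3 51')               # a hypothetical 6-value sub-block
        path = f.name

    arr = np.fromfile(path, dtype=np.int32, count=6, sep=' ')
    print(arr)                                   # -> [12 47  2 46  3 51]
    os.remove(path)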
The Python problem means that NumPy > compiled with Python 3.6.1 will not run in Python 3.6.0. > If it's compiled against 3.6.0 will it work fine with 3.6.1? and probably 3.6.2 as well? If so, it would be nice to do it that way, if Matthew doesn't mind :-) But either way, it'll be good to get it out. Thanks! -CHB > However, I've also been asked to have a bugfixed version of 1.13 available > for Scipy 2017 next week. At this point it looks like the best thing to do > is release 1.13.1 compiled with Python 3.6.1 and ask folks to upgrade > Python if they have a problem, and then release 1.13.2 as soon as 3.6.2 is > released. > > Thoughts? > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Thu Jul 6 12:42:06 2017 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 6 Jul 2017 09:42:06 -0700 Subject: [Numpy-discussion] Scipy 2017 NumPy sprint In-Reply-To: References: <1498760145.3918433.1025609912.4766E01C@webmail.messagingengine.com> <7576d1af-b1f2-341c-de6f-69b664f88b79@iki.fi> <49c7f45e-2e4e-b64a-7fc1-ec54e8d78b87@gmail.com> <1499009599.3435.5.camel@sipsolutions.net> Message-ID: On Wed, Jul 5, 2017 at 11:05 AM, Stephan Hoyer wrote: > That is, the boilerplate needed for multi-dimensional indexing and >> slicing, etc... >> >> That could be a nice little sprint-able project. >> > > Indeed. Let me highlight a few mixins > that > I wrote for xarray that might be more broadly useful. > At a quick glance, that is exactly the kind of ting I had in mind. The challenge here is that there are quite a few different meanings to > "ndarray-like", so mixins really need to be mix-and-match-able. > exactly! > But at least defining a base list of methods to implement/override would > be useful. > With sample implementations, even... at last of parts of it -- I'm thinking things like parsing out the indexes/slices in __getitem__ -- that sort of thing. > In NumPy, this could go along with NDArrayOperatorsMixins in > numpy/lib/mixins.py > > Yes! I had no idea that existed. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Thu Jul 6 13:03:19 2017 From: shoyer at gmail.com (Stephan Hoyer) Date: Thu, 6 Jul 2017 10:03:19 -0700 Subject: [Numpy-discussion] Scipy 2017 NumPy sprint In-Reply-To: References: <1498760145.3918433.1025609912.4766E01C@webmail.messagingengine.com> <7576d1af-b1f2-341c-de6f-69b664f88b79@iki.fi> <49c7f45e-2e4e-b64a-7fc1-ec54e8d78b87@gmail.com> <1499009599.3435.5.camel@sipsolutions.net> Message-ID: On Thu, Jul 6, 2017 at 4:42 AM, Ben Rowland wrote: > Slightly off topic, but as someone who has just spent a fair amount of > time implementing various > subclasses of nd-array, I am interested (and a little concerned), that the > consensus is not to use > them. Is there anything available which explains why this is the case and > what the alternatives > are? 
Writing such docs (especially to explain how to write array-like objects that aren't subclasses) would be another good topic for the sprint ;).

But more seriously: numpy.ndarray subclasses are supported, but inherently error prone, because we don't have a well defined subclassing API. As Marten will attest, this means seemingly harmless internal refactoring in NumPy has a tendency to break downstream subclasses, which often unintentionally end up relying on untested implementation details. This is particularly problematic when subclasses are implemented in a different code-base, as is the case for user subclasses of numpy.ndarray. Due to diligent testing efforts, we often (but not always) catch these issues before making a release, but the process is inherently error prone. Writing NumPy functionality in a manner that is robust to all possible subclassing approaches turns out to be very difficult (nearly impossible). This is actually a classic OOP problem, e.g., see https://en.wikipedia.org/wiki/Composition_over_inheritance
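As a concrete illustration of the composition alternative, here is a minimal duck array built on the (real) numpy.lib.mixins.NDArrayOperatorsMixin added in NumPy 1.13 -- a sketch, not an official recipe; it ignores ufunc kwargs like out= for brevity:

    import numpy as np
    from numpy.lib.mixins import NDArrayOperatorsMixin

    class Wrapped(NDArrayOperatorsMixin):
        """Wraps an ndarray instead of subclassing it."""
        def __init__(self, data):
            self.data = np.asarray(data)

        def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
            # unwrap Wrapped inputs, defer to the ufunc, re-wrap the result
            inputs = tuple(x.data if isinstance(x, Wrapped) else x
                           for x in inputs)
            result = getattr(ufunc, method)(*inputs, **kwargs)
            return Wrapped(result) if isinstance(result, np.ndarray) else result

        def __repr__(self):
            return 'Wrapped(%r)' % (self.data,)

    print(Wrapped([1.0, 2.0]) + 3)   # -> Wrapped(array([ 4.,  5.]))

The mixin supplies all the arithmetic and comparison operators in terms of __array_ufunc__, so only the dispatch above has to be written by hand.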
URL: From paul.carrico at free.fr Thu Jul 6 13:55:22 2017 From: paul.carrico at free.fr (paul.carrico at free.fr) Date: Thu, 06 Jul 2017 19:55:22 +0200 Subject: [Numpy-discussion] record data previous to Numpy use In-Reply-To: References: <7f45847ca0c1184e86ecde96adddc2f0@free.fr> Message-ID: <0d69df8c7c5da400b85af9af7d213a76@free.fr> Thanks all for your advice. Well, many things to look for, but it's obvious now that I first have to work on a (better) strategy than the one I was considering previously (i.e. load all the files and results in one step). It's just a reflection, but for huge files one solution might be to split/write/build first the array in a dedicated file (2 O(n) iterations - one to identify the block sizes - an additional one to get and write), and then to load it in memory and work with numpy - at this stage the dimension is known and some packages will be fast and more adapted (pandas or astropy as suggested). Thanks all for your time and help Paul -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Thu Jul 6 16:01:29 2017 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 6 Jul 2017 13:01:29 -0700 Subject: [Numpy-discussion] record data previous to Numpy use In-Reply-To: References: <7f45847ca0c1184e86ecde96adddc2f0@free.fr> Message-ID: On Thu, Jul 6, 2017 at 3:19 AM, wrote: > > Thanks Robert for your effort - I'll have a look at it > > ... the goal is to be guided in how to proceed (and to understand), and not to have a "ready-made solution" ... but I honestly appreciate it :-) Sometimes it's easier to just write the code than to try to explain in prose what to do. :-) -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Thu Jul 6 19:59:12 2017 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 6 Jul 2017 16:59:12 -0700 Subject: [Numpy-discussion] record data previous to Numpy use In-Reply-To: <0d69df8c7c5da400b85af9af7d213a76@free.fr> References: <7f45847ca0c1184e86ecde96adddc2f0@free.fr> <0d69df8c7c5da400b85af9af7d213a76@free.fr> Message-ID: On Thu, Jul 6, 2017 at 10:55 AM, wrote: > > It's just a reflection, but for huge files one solution might be to > split/write/build first the array in a dedicated file (2 O(n) iterations - > one to identify the block sizes - an additional one to get and write), and > then to load it in memory and work with numpy - > I may have your use case confused, but if you have a huge file with multiple "blocks" in it, there shouldn't be any problem with loading it in one go -- start at the top of the file and load one block at a time (accumulating in a list) -- then you only have the memory overhead issues for one block at a time, should be no problem. at this stage the dimension is known and some packages will be fast and > more adapted (pandas or astropy as suggested). > pandas at least is designed to read variations of CSV files, not sure you could use the optimized part to read an array out of part of an open file from a particular point or not. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed...
URL: From jni.soma at gmail.com Thu Jul 6 20:24:41 2017 From: jni.soma at gmail.com (Juan Nunez-Iglesias) Date: Fri, 7 Jul 2017 10:24:41 +1000 Subject: [Numpy-discussion] Making a 1.13.2 release In-Reply-To: References: Message-ID: <283d9252-0530-45e0-a3bd-311fbc39919d@Spark> Just chiming in with a +1 to releasing 1.13.1 before SciPy. It will certainly save the skimage tutorial a lot of headaches! Not that I'll be there but I look out for my own. =P On 7 Jul 2017, 3:54 AM +1000, Nathaniel Smith , wrote: > It's also possible to work around the 3.6.1 problem with a small preprocessor hack. On my phone but there's a link in the bug report discussion. > > > On Jul 6, 2017 6:10 AM, "Charles R Harris" wrote: > > > Hi All, > > > > > > I've delayed the NumPy 1.13.2 release hoping for Python 3.6.2 to show up fixing #29943 so we can close #9272, but the Python release has been delayed to July 11 (expected). The Python problem means that NumPy compiled with Python 3.6.1 will not run in Python 3.6.0. However, I've also been asked to have a bugfixed version of 1.13 available for Scipy 2017 next week. At this point it looks like the best thing to do is release 1.13.1 compiled with Python 3.6.1 and ask folks to upgrade Python if they have a problem, and then release 1.13.2 as soon as 3.6.2 is released. > > > > > > Thoughts? > > > > > > Chuck > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at python.org > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Jul 6 22:20:18 2017 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 6 Jul 2017 20:20:18 -0600 Subject: [Numpy-discussion] NumPy 1.13.1 released Message-ID: Hi All, On behalf of the NumPy team, I am pleased to announce the release of NumPy 1.13.1. This is a bugfix release for problems found in 1.13.0. The major changes are: - fixes for the new memory overlap detection, - fixes for the new temporary elision capability, - reversion of the removal of the boolean binary ``-`` operator. It is recommended that users of 1.13.0 upgrade to 1.13.1. Wheels can be found on PyPI. Source tarballs, zipfiles, release notes, and the changelog are available on github. Note that the wheels for Python 3.6 are built against 3.6.1, hence will not work when used with 3.6.0 due to Python bug #29943. The plan is to release NumPy 1.13.2, with a fix for that problem, shortly after Python 3.6.2 is out. If you are using 3.6.0, the workaround is to upgrade to 3.6.1 or use an earlier Python version. *Pull requests merged* A total of 19 pull requests were merged for this release. * #9240 DOC: BLD: fix lots of Sphinx warnings/errors. * #9255 Revert "DEP: Raise TypeError for subtract(bool_, bool_)." * #9261 BUG: don't elide into readonly and updateifcopy temporaries for... * #9262 BUG: fix missing keyword rename for common block in numpy.f2py * #9263 BUG: handle resize of 0d array * #9267 DOC: update f2py front page and some doc build metadata. * #9299 BUG: Fix Intel compilation on Unix.
* #9317 BUG: fix wrong ndim used in empty where check * #9319 BUG: Make extensions compilable with MinGW on Py2.7 * #9339 BUG: Prevent crash if ufunc doc string is null * #9340 BUG: umath: un-break ufunc where= when no out= is given * #9371 DOC: Add isnat/positive ufunc to documentation * #9372 BUG: Fix error in fromstring function from numpy.core.records... * #9373 BUG: ')' is printed at the end pointer of the buffer in numpy.f2py. * #9374 DOC: Create NumPy 1.13.1 release notes. * #9376 BUG: Prevent hang traversing ufunc userloop linked list * #9377 DOC: Use x1 and x2 in the heaviside docstring. * #9378 DOC: Add $PARAMS to the isnat docstring * #9379 DOC: Update the 1.13.1 release notes *Contributors* A total of 12 people contributed to this release. People with a "+" by their names contributed a patch for the first time. * Andras Deak + * Bob Eldering + * Charles Harris * Daniel Hrisca + * Eric Wieser * Joshua Leahy + * Julian Taylor * Michael Seifert * Pauli Virtanen * Ralf Gommers * Roland Kaufmann * Warren Weckesser -------------- next part -------------- An HTML attachment was scrubbed... URL: From derek at astro.physik.uni-goettingen.de Fri Jul 7 06:04:57 2017 From: derek at astro.physik.uni-goettingen.de (Derek Homeier) Date: Fri, 7 Jul 2017 12:04:57 +0200 Subject: [Numpy-discussion] record data previous to Numpy use In-Reply-To: References: <7f45847ca0c1184e86ecde96adddc2f0@free.fr> <0d69df8c7c5da400b85af9af7d213a76@free.fr> Message-ID: On 7 Jul 2017, at 1:59 am, Chris Barker wrote: > > On Thu, Jul 6, 2017 at 10:55 AM, wrote: > It's just a reflection, but for huge files one solution might be to split/write/build first the array in a dedicated file (2 O(n) iterations - one to identify the block sizes - an additional one to get and write), and then to load it in memory and work with numpy - > > > I may have your use case confused, but if you have a huge file with multiple "blocks" in it, there shouldn't be any problem with loading it in one go -- start at the top of the file and load one block at a time (accumulating in a list) -- then you only have the memory overhead issues for one block at a time, should be no problem. > > at this stage the dimension is known and some packages will be fast and more adapted (pandas or astropy as suggested). > > pandas at least is designed to read variations of CSV files, not sure you could use the optimized part to read an array out of part of an open file from a particular point or not. > The fragmented structure indeed would probably be the biggest challenge, although astropy, while it cannot read from an open file handle, at least should be able to directly parse a block of input lines, e.g. collected with readline() in a list. Guess pandas could do the same. Alternatively the line positions of the blocks could be directly passed to the data_start and data_end keywords, but that would require opening and at least partially reading the file multiple times. In fact, if the blocks are relatively small, the overhead may be too large to make it worth using the faster parsers - if you look at the timing notebooks I had linked to earlier, it takes at least ~100 input lines before they show any speed gains over genfromtxt, and ~1000 to see roughly linear scaling. In that case writing your own customised reader could be the best option after all.
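For illustration, a minimal sketch of such a customised reader - assuming blocks of whitespace-separated numbers delimited by blank (or otherwise non-numeric) lines, which may well not match the actual file format - could look like:

```
import numpy as np

def read_blocks(filename):
    """Collect numeric rows into blocks; a blank or non-numeric line
    ends the current block."""
    blocks, current = [], []
    with open(filename) as fh:
        for line in fh:
            try:
                row = [float(x) for x in line.split()]
            except ValueError:
                row = []  # treat unparseable lines as block separators
            if row:
                current.append(row)
            elif current:
                blocks.append(np.array(current))
                current = []
    if current:  # the file may not end with a separator
        blocks.append(np.array(current))
    return blocks
```

Only the current block is held as a Python list; each finished block is converted to an ndarray in one go.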
Cheers, Derek From paul.carrico at free.fr Fri Jul 7 10:24:08 2017 From: paul.carrico at free.fr (paul.carrico at free.fr) Date: Fri, 07 Jul 2017 16:24:08 +0200 Subject: [Numpy-discussion] record data previous to Numpy use In-Reply-To: References: <7f45847ca0c1184e86ecde96adddc2f0@free.fr> <0d69df8c7c5da400b85af9af7d213a76@free.fr> Message-ID: <7b317dbaa9e82e7319143caed422a41f@free.fr> Hi (all) Once again I would like to thank the community for the support. I am progressing in moving my code to Python. In my mind some parts remain quite ugly (and burn my eyes), but it works and I'll optimize it in the future; so far I can work with the data in a single reading. I built some blocks in a text file and used Astropy to read it (works fine now - I'll test pandas as a next step). Not finished yet, but significant progress compared to yesterday :-) Have a good WE Paul ps : I'd like to use the following code that is much more familiar to me :-) COMP_list = np.asarray(COMP_list, dtype = np.float64) i = np.arange(1,NumberOfRecords,2) COMP_list = np.delete(COMP_list,i) On 2017-07-07 12:04, Derek Homeier wrote: > On 7 Jul 2017, at 1:59 am, Chris Barker wrote: > >> On Thu, Jul 6, 2017 at 10:55 AM, wrote: >> It's just a reflection, but for huge files one solution might be to split/write/build first the array in a dedicated file (2 O(n) iterations - one to identify the block sizes - an additional one to get and write), and then to load it in memory and work with numpy - >> >> I may have your use case confused, but if you have a huge file with multiple "blocks" in it, there shouldn't be any problem with loading it in one go -- start at the top of the file and load one block at a time (accumulating in a list) -- then you only have the memory overhead issues for one block at a time, should be no problem. >> >> at this stage the dimension is known and some packages will be fast and more adapted (pandas or astropy as suggested). >> >> pandas at least is designed to read variations of CSV files, not sure you could use the optimized part to read an array out of part of an open file from a particular point or not. > The fragmented structure indeed would probably be the biggest challenge, although astropy, > while it cannot read from an open file handle, at least should be able to directly parse a block > of input lines, e.g. collected with readline() in a list. Guess pandas could do the same. > Alternatively the line positions of the blocks could be directly passed to the data_start and > data_end keywords, but that would require opening and at least partially reading the file > multiple times. In fact, if the blocks are relatively small, the overhead may be too large to > make it worth using the faster parsers - if you look at the timing notebooks I had linked to > earlier, it takes at least ~100 input lines before they show any speed gains over genfromtxt, > and ~1000 to see roughly linear scaling. In that case writing your own customised reader > could be the best option after all. > > Cheers, > Derek > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed...
URL: From derek at astro.physik.uni-goettingen.de Fri Jul 7 10:55:35 2017 From: derek at astro.physik.uni-goettingen.de (Derek Homeier) Date: Fri, 7 Jul 2017 16:55:35 +0200 Subject: [Numpy-discussion] record data previous to Numpy use In-Reply-To: <7b317dbaa9e82e7319143caed422a41f@free.fr> References: <7f45847ca0c1184e86ecde96adddc2f0@free.fr> <0d69df8c7c5da400b85af9af7d213a76@free.fr> <7b317dbaa9e82e7319143caed422a41f@free.fr> Message-ID: On 07 Jul 2017, at 4:24 PM, paul.carrico at free.fr wrote: > > ps : I'd like to use the following code that is much more familiar to me :-) > > COMP_list = np.asarray(COMP_list, dtype = np.float64) > i = np.arange(1,NumberOfRecords,2) > COMP_list = np.delete(COMP_list,i) > Not sure about the background of this, but if you want to remove every second entry (if NumberOfRecords is the full length of the list, that is), it would always be preferable to make changes to the list, or even better, extract only the entries you want: COMP_list = np.asarray(COMP_list[::2], dtype = np.float64) Have a good weekend Derek From matthew.brett at gmail.com Fri Jul 7 12:26:56 2017 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 7 Jul 2017 17:26:56 +0100 Subject: [Numpy-discussion] Polynomial silent breakage with 1.13 Message-ID: Hi, Our (nipy's) test suite just failed with the upgrade to numpy 1.13, and the cause boiled down to this: ``` import numpy as np poly = np.poly1d([1]) poly.c[0] *= 2 print(poly.c) ``` Numpy 1.12 gives (to me) expected output: [2] Numpy 1.13 gives (to me) unexpected output: [1] The problem is caused by the fact that the coefficients are now a *copy* of the actual coefficient array - I think in an attempt to stop us modifying the coefficients directly. I can't see any deprecation warnings with `-W always`. The pain point here is that code that used to give the right answer has now (I believe silently) switched to giving the wrong answer. Cheers, Matthew From wieser.eric+numpy at gmail.com Fri Jul 7 13:14:25 2017 From: wieser.eric+numpy at gmail.com (Eric Wieser) Date: Fri, 07 Jul 2017 17:14:25 +0000 Subject: [Numpy-discussion] Polynomial silent breakage with 1.13 In-Reply-To: References: Message-ID: That's a regression, and it's on me, in 8762. That was a side effect of a fix for the weird behaviour here. I think we need to fix this in 1.13.2, so we should file an issue about it. Eric On Fri, 7 Jul 2017 at 18:31 Matthew Brett wrote: > Hi, > > Our (nipy's) test suite just failed with the upgrade to numpy 1.13, > and the cause boiled down to this: > > ``` > import numpy as np > > poly = np.poly1d([1]) > poly.c[0] *= 2 > print(poly.c) > ``` > > Numpy 1.12 gives (to me) expected output: > > [2] > > Numpy 1.13 gives (to me) unexpected output: > > [1] > > The problem is caused by the fact that the coefficients are now a > *copy* of the actual coefficient array - I think in an attempt to stop > us modifying the coefficients directly. > > I can't see any deprecation warnings with `-W always`. > > The pain point here is that code that used to give the right answer > has now (I believe silently) switched to giving the wrong answer. > > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From matthew.brett at gmail.com Fri Jul 7 16:27:21 2017 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 7 Jul 2017 21:27:21 +0100 Subject: [Numpy-discussion] Polynomial silent breakage with 1.13 In-Reply-To: References: Message-ID: Hi, On Fri, Jul 7, 2017 at 6:14 PM, Eric Wieser wrote: > That's a regression, and it's on me, in 8762. > > That was a side effect of a fix for the weird behaviour here. > > I think we need to fix this in 1.13.2, so we should file an issue about it. Thanks for the feedback. Do you want to file an issue, or should I? Cheers, Matthew From wieser.eric+numpy at gmail.com Fri Jul 7 17:44:41 2017 From: wieser.eric+numpy at gmail.com (Eric Wieser) Date: Fri, 07 Jul 2017 21:44:41 +0000 Subject: [Numpy-discussion] Polynomial silent breakage with 1.13 In-Reply-To: References: Message-ID: I've gone ahead and filed one at https://github.com/numpy/numpy/issues/9385, along with links to relevant PRs. On Fri, 7 Jul 2017 at 22:28 Matthew Brett wrote: > Hi, > > On Fri, Jul 7, 2017 at 6:14 PM, Eric Wieser > wrote: > > That's a regression, and it's on me, in 8762. > > > > That was a side effect of a fix for the weird behaviour here. > > > > I think we need to fix this in 1.13.2, so we should file an issue about > it. > > Thanks for the feedback. Do you want to file an issue, or should I? > > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.h.vankerkwijk at gmail.com Fri Jul 7 18:27:18 2017 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Sat, 8 Jul 2017 00:27:18 +0200 Subject: [Numpy-discussion] Scipy 2017 NumPy sprint In-Reply-To: References: <1498760145.3918433.1025609912.4766E01C@webmail.messagingengine.com> <7576d1af-b1f2-341c-de6f-69b664f88b79@iki.fi> <49c7f45e-2e4e-b64a-7fc1-ec54e8d78b87@gmail.com> <1499009599.3435.5.camel@sipsolutions.net> Message-ID: Hi All, I doubt I'm really the last one thinking ndarray subclassing is a good idea, but as that was stated, I feel I should at least pipe in. It seems to me there is both a perceived problem -- with the two subclasses that numpy provides -- `matrix` and `MaskedArray` -- both being problematic in ways that seem to me to have very little to do with subclassing being a bad idea, and a real one following from the fact that numpy was written at a time when python's inheritance system was not as well developed as it is now. Though based on my experience with Quantity, I'd also argue that the more annoying problems are not so much with `ndarray` itself, but rather with the helper functions. Ufuncs were not so bad -- they really just needed a better override mechanism, which __array_ufunc__ now provides -- but for quite a few of the other functions subclassing was clearly an afterthought. Indeed, `MaskedArray` provides a nice example of this, with its many special `np.ma.` routines, providing huge duplication and thus lots of duplicated bugs (which Eric has been patiently fixing...). Indeed, `MaskedArray` is also a much better example than ndarray of a class that is really hard to subclass (even though, conceptually, it should be a far easier one). All that said, duck-type arrays make a lot of sense, and e.g. the slicing and shaping methods are easily emulated, especially if one's underlying data are stored in `ndarray`.
For astropy's version of a relevant mixin, see http://docs.astropy.org/en/stable/api/astropy.utils.misc.ShapedLikeNDArray.html All the best, Marten From rmay31 at gmail.com Fri Jul 7 18:42:46 2017 From: rmay31 at gmail.com (Ryan May) Date: Fri, 7 Jul 2017 16:42:46 -0600 Subject: [Numpy-discussion] Scipy 2017 NumPy sprint In-Reply-To: References: <1498760145.3918433.1025609912.4766E01C@webmail.messagingengine.com> <7576d1af-b1f2-341c-de6f-69b664f88b79@iki.fi> <49c7f45e-2e4e-b64a-7fc1-ec54e8d78b87@gmail.com> <1499009599.3435.5.camel@sipsolutions.net> Message-ID: On Fri, Jul 7, 2017 at 4:27 PM, Marten van Kerkwijk < m.h.vankerkwijk at gmail.com> wrote: > Hi All, > > I doubt I'm really the last one thinking ndarray subclassing is a good > idea, but as that was stated, I feel I should at least pipe in. It > seems to me there is both a perceived problem -- with the two > subclasses that numpy provides -- `matrix` and `MaskedArray` -- both > being problematic in ways that seem to me to have very little to do > with subclassing being a bad idea, and a real one following from the > fact that numpy was written at a time when python's inheritance system > was not as well developed as it is now. > > Though based on my experience with Quantity, I'd also argue that the > more annoying problems are not so much with `ndarray` itself, but > rather with the helper functions. Ufuncs were not so bad -- they > really just needed a better override mechanism, which __array_ufunc__ > now provides -- but for quite a few of the other functions subclassing > was clearly an afterthought. Indeed, `MaskedArray` provides a nice > example of this, with its many special `np.ma.` routines, > providing huge duplication and thus lots of duplicated bugs (which > Eric has been patiently fixing...). Indeed, `MaskedArray` is also a > much better example than ndarray of a class that is really hard to > subclass (even though, conceptually, it should be a far easier one). > > All that said, duck-type arrays make a lot of sense, and e.g. the > slicing and shaping methods are easily emulated, especially if one's > underlying data are stored in `ndarray`. For astropy's version of a > relevant mixin, see > http://docs.astropy.org/en/stable/api/astropy.utils.misc. > ShapedLikeNDArray.html My biggest problem with subclassing as it exists now is that subclasses don't survive the first encounter with np.asarray (or np.array). So much code written to work with numpy uses that as a bandaid (for e.g. handling lists) that in my experience it's 50/50 whether passing a subclass to a function will actually behave as expected--even if there's no good reason it shouldn't. Ryan -- Ryan May -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.h.vankerkwijk at gmail.com Sat Jul 8 03:54:03 2017 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Sat, 8 Jul 2017 09:54:03 +0200 Subject: [Numpy-discussion] Scipy 2017 NumPy sprint In-Reply-To: References: <1498760145.3918433.1025609912.4766E01C@webmail.messagingengine.com> <7576d1af-b1f2-341c-de6f-69b664f88b79@iki.fi> <49c7f45e-2e4e-b64a-7fc1-ec54e8d78b87@gmail.com> <1499009599.3435.5.camel@sipsolutions.net> Message-ID: Hi Ryan, Indeed, the liberal use of `np.asarray` is one of the main reasons the helper routines are relatively annoying. Of course, that is not an argument for using duck-types over subclasses: those wouldn't even survive `asanyarray` (which many numpy routines now have moved to).
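To make the distinction concrete, a minimal illustration (the `MyArray` subclass here is hypothetical, purely for demonstration):

```
import numpy as np

class MyArray(np.ndarray):
    # trivial ndarray subclass, just to show what the coercion functions do
    pass

a = np.arange(3).view(MyArray)

print(type(np.asarray(a)))     # numpy.ndarray -- the subclass is stripped
print(type(np.asanyarray(a)))  # MyArray -- subclasses pass through unchanged
```

A duck-type array that is not an ndarray subclass would be converted to a plain ndarray by both functions.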
All the best, Marten From paul.carrico at free.fr Sat Jul 8 04:20:06 2017 From: paul.carrico at free.fr (paul.carrico at free.fr) Date: Sat, 08 Jul 2017 10:20:06 +0200 Subject: [Numpy-discussion] Numpy arrays and slicing comprehension issue Message-ID: Hi Once again I need your help to understand one topic concerning slicing, or in other words I do not understand how it works in that particular (but common) case; I'm trying to reassign the first 4 values in an array: * If I use [:3] I'm expecting to have 4 values (index 0 to 3 included) * Ditto with [0:3] * If I use [3:] I have 2 values as expected (indexes 3 and 4) Both code and results are presented hereafter, so this way of thinking worked so far in other calculations, and it fails here? Thanks Paul ps : extraction from the doc (https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html) _[... all indices are zero-based ...]_ CODE: x = np.random.rand(5); print("x = ",x); ## test 1 print("partials =\n %s \nor %s \nor %s" %( x[:3], x[0:3], x[3:]) ) print("x[0] : ",x[0]); print("x[1] : ",x[1]); print("x[2] : ",x[2]); print("x[3] : ",x[3]) ## test 2 y = np.ones(4); print("y = ",y) x[0:4] = y print("x final = ",x) PROVIDE: x = [ 0.39921271 0.07097531 0.37044695 0.28078163 0.11590451] partials = [ 0.39921271 0.07097531 0.37044695] or [ 0.39921271 0.07097531 0.37044695] or [ 0.28078163 0.11590451] x[0] : 0.39921271184 x[1] : 0.0709753133926 x[2] : 0.370446946245 x[3] : 0.280781629 y = [ 1. 1. 1. 1.] x final = [ 1. 1. 1. 1. 0.11590451] From josef.pktd at gmail.com Sat Jul 8 05:58:13 2017 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 8 Jul 2017 05:58:13 -0400 Subject: [Numpy-discussion] Scipy 2017 NumPy sprint In-Reply-To: References: <1498760145.3918433.1025609912.4766E01C@webmail.messagingengine.com> <7576d1af-b1f2-341c-de6f-69b664f88b79@iki.fi> <49c7f45e-2e4e-b64a-7fc1-ec54e8d78b87@gmail.com> <1499009599.3435.5.camel@sipsolutions.net> Message-ID: On Fri, Jul 7, 2017 at 6:42 PM, Ryan May wrote: > On Fri, Jul 7, 2017 at 4:27 PM, Marten van Kerkwijk < > m.h.vankerkwijk at gmail.com> wrote: > >> Hi All, >> >> I doubt I'm really the last one thinking ndarray subclassing is a good >> idea, but as that was stated, I feel I should at least pipe in. It >> seems to me there is both a perceived problem -- with the two >> subclasses that numpy provides -- `matrix` and `MaskedArray` -- both >> being problematic in ways that seem to me to have very little to do >> with subclassing being a bad idea, and a real one following from the >> fact that numpy was written at a time when python's inheritance system >> was not as well developed as it is now. >> >> Though based on my experience with Quantity, I'd also argue that the >> more annoying problems are not so much with `ndarray` itself, but >> rather with the helper functions. Ufuncs were not so bad -- they >> really just needed a better override mechanism, which __array_ufunc__ >> now provides -- but for quite a few of the other functions subclassing >> was clearly an afterthought. Indeed, `MaskedArray` provides a nice >> example of this, with its many special `np.ma.` routines, >> providing huge duplication and thus lots of duplicated bugs (which >> Eric has been patiently fixing...). Indeed, `MaskedArray` is also a >> much better example than ndarray of a class that is really hard to >> subclass (even though, conceptually, it should be a far easier one).
>> >> All that said, duck-type arrays make a lot of sense, and e.g. the >> slicing and shaping methods are easily emulated, especially if one's >> underlying data are stored in `ndarray`. For astropy's version of a >> relevant mixin, see >> http://docs.astropy.org/en/stable/api/astropy.utils.misc.Sha >> pedLikeNDArray.html > > My biggest problem with subclassing as it exists now is that subclasses don't > survive the first encounter with np.asarray (or np.array). So much code > written to work with numpy uses that as a bandaid (for e.g. handling lists) > that in my experience it's 50/50 whether passing a subclass to a function > will actually behave as expected--even if there's no good reason it > shouldn't. > as a downstream developer: The problem is that we cannot trust any array subclass or anything that pretends to be like an array. Even asarray is already letting too many things go through. We would need an indication or guarantee for the behavior to quack in the correct way, otherwise it is very difficult to write code that would work for various subclasses. (even in the simplest case, writing code that works for matrix and arrays beyond a few lines is getting difficult.) scipy.stats.mstats is largely not code duplication, it needs to handle the mask (although the nan versions in scipy.stats are catching up). Josef > > > Ryan > > -- > Ryan May > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jaime.frio at gmail.com Sat Jul 8 07:03:58 2017 From: jaime.frio at gmail.com (Jaime Fernández del Río) Date: Sat, 8 Jul 2017 13:03:58 +0200 Subject: [Numpy-discussion] Numpy arrays and slicing comprehension issue In-Reply-To: References: Message-ID: The last index is exclusive: [a:b] means a <= index < b. On Jul 8, 2017 10:20 AM, wrote: > Hi > > Once again I need your help to understand one topic concerning slicing, > or in other words I do not understand how it works in that particular > (but common) case; I'm trying to reassign the first 4 values in an array: > > - If I use [:3] I'm expecting to have 4 values (index 0 to 3 included) > - Ditto with [0:3] > - If I use [3:] I have 2 values as expected (indexes 3 and 4) > > Both code and results are presented hereafter, so this way of thinking > worked so far in other calculations, and it fails here? > > > Thanks > > Paul > > ps : extraction from the doc (https://docs.scipy.org/doc/ > numpy/reference/arrays.indexing.html) > > *[... all indices are zero-based ...]* > > > > *Code*: > > x = np.random.rand(5); print("x = ",x); > > ## test 1 > > print("partials =\n %s \nor %s \nor %s" %( x[:3], x[0:3], x[3:]) ) > > print("x[0] : ",x[0]); print("x[1] : ",x[1]); print("x[2] : ",x[2]); > print("x[3] : ",x[3]) > > > > ## test 2 > > y = np.ones(4); print("y = ",y) > > x[0:4] = y > > print("x final = ",x) > > > *Provide*: > > x = [ 0.39921271 0.07097531 0.37044695 0.28078163 0.11590451] > > partials = > > [ 0.39921271 0.07097531 0.37044695] > > or [ 0.39921271 0.07097531 0.37044695] > > or [ 0.28078163 0.11590451] > > x[0] : 0.39921271184 > > x[1] : 0.0709753133926 > > x[2] : 0.370446946245 > > x[3] : 0.280781629 > > y = [ 1. 1. 1. 1.] > > x final = [ 1. 1. 1. 1.
0.11590451] > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From davidmenhur at gmail.com Sat Jul 8 08:13:23 2017 From: davidmenhur at gmail.com (Daπid) Date: Sat, 8 Jul 2017 14:13:23 +0200 Subject: [Numpy-discussion] Numpy arrays and slicing comprehension issue In-Reply-To: References: Message-ID: On 8 July 2017 at 13:03, Jaime Fernández del Río wrote: > The last index is exclusive: > [a:b] means a <= index < b. > And the consequence is that the length of your array is b - a, so [:3] gives you the first 3 values. -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.e.creasey.00 at googlemail.com Sat Jul 8 18:00:51 2017 From: p.e.creasey.00 at googlemail.com (Peter Creasey) Date: Sat, 8 Jul 2017 15:00:51 -0700 Subject: [Numpy-discussion] Scipy 2017 NumPy sprint Message-ID: > From: Marten van Kerkwijk > > Though based on my experience with Quantity, I'd also argue that the > more annoying problems are not so much with `ndarray` itself, but > rather with the helper functions. Just to give an alternative view - as another astronomer I would say that concerns about the use of subclassing in Astropy are one of the reasons I rarely use it. To take an example, if I'm relying on the Quantity class to keep track of my units, then if I have an (N,3) array-like of positions in parsecs, that's just one step away from going through the sausage-machine of scipy.spatial.cKDTree.query (to find distances to neighbours) and my units are toast. That probably sounds stronger than I meant - keeping track of units systematically is neither easy nor unimportant and I have enormous respect for the developers of Astropy and the community they have built around it. Peter Creasey From m.h.vankerkwijk at gmail.com Sun Jul 9 16:35:12 2017 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Sun, 9 Jul 2017 22:35:12 +0200 Subject: [Numpy-discussion] Scipy 2017 NumPy sprint In-Reply-To: References: Message-ID: Hi Peter, In the context of the discussion here, the fact that Quantity is a subclass and not a duck-type array makes no difference for scipy code - in either case, the code would eat the unit (if it would work at all). My only argument was that sub-classing is not particularly worse than trying to make a duck-type. In the end, it would seem to me for anything reasonably interesting, the code sadly becomes rather complex. Anyway, for me, Quantity is mostly a way to avoid stupid mistakes with units - it has certainly helped save me lots of headaches already. All the best, Marten From mikhailwas at gmail.com Sun Jul 9 17:35:58 2017 From: mikhailwas at gmail.com (Mikhail V) Date: Sun, 9 Jul 2017 23:35:58 +0200 Subject: [Numpy-discussion] Generalized rectangle intersection. (Was: Array blitting) Message-ID: disclaimer: I am not a past contributor to numpy and I don't know much about github, and what a pull request means. So I just put the examples here. So in short, the proposal idea is to add a library function which calculates the intersection area of two rectangles, generalized for any dimensions. The need for this function comes up quite often, so I suppose it would be good to have such a function in the library. Python function prototype: def box_clip(box1, box2, offset): -> (box_intersection, offset1, offset2) Here the rectangle is called "box".
The function takes three equally sized arrays (or tuples) which denote the cartesian parameters of the boxes and their relative position: - box1 - destination box size (only positive values) - box2 - source box size (only positive values) - offset - offset between the boxes box2 and box1 ( any values ) And returns an array (or tuple?) containing 3 arrays : - box_intersection : size of the intersection area - offset1 : offset in box1's coordinate system - offset2 : offset in box2's coordinate system Following is an example of the full function with comments and usage examples, all tested with arrays and tuples as input, and all seem to work correctly. #===== def box_clip(box1, box2, offset): L = len(box1) # amount of dimensions sizes_equal = ( L == len (box2) == len (offset) ) if not sizes_equal: print ("Error: input arrays must have equal size") return R = numpy.zeros((3, L)) # init result array for i in range (0,L): # take the i-th axis d = box1[i] # dest box size along i-th axis s = box2[i] # source box size along i-th axis o = offset[i] # offset along i-th axis left = max(0, o) # startpoint of the clipped area right = min(d, o+s) # endpoint of the clipped area r = right - left # size of the clipped area if r < 0: r = 0 # clamp negative size values R[0,i] = r # return the size of the clipped area R[1,i] = left # return the offset in respect to the destination box R[2,i] = left-o # return the offset in respect to the source box return R #===== Typical use cases: Example 1. Finding the intersection of two rectangles. E.g. for 2D rectangles defined in the tuple format (coordinate, size): rect1 = numpy.array ( ( (0, 5), (10, 20) ) ) rect2 = numpy.array ( ( (1, 5), (10, 20) ) ) R = box_clip( rect1[1], rect2[1], rect2[0] - rect1[0] ) > R [[ 9. 20.] # intersection size [ 1. 0.] # coordinate in rect1's origin [ 0. 0.]] # coordinate in rect2's origin E.g. to construct the rectangle object in the same input format (global coord, size) just sum the local rect1's coordinate (R[1]) and global rect1's coordinate: rect_x = numpy.array ( ( rect1[0] + R[1] , R[0] ) ) > rect_x [[ 1. 5.] [ 9. 20.]] Example 2. Use the function as a helper to find array slices for the array "blit" operation. This will need another intermediate function to convert between two cartesian points and the slice object: def p2slice(startpoint, endpoint): # point to slice conversion intervals = numpy.column_stack((startpoint, endpoint)).astype(int) # cast to int, since slice indices must be integers slices = [slice(*i) for i in intervals] return slices # example of blitting SRC array into the DEST array at a given offset W = 6; H = 6 w = 4; h = 1 DEST = numpy.ones([H,W], dtype = "uint8") SRC = numpy.zeros([h,w], dtype = "uint8") SRC[:]=8 offset = (5,4) R = box_clip(DEST.shape, SRC.shape, offset) DEST_slice = p2slice( R[1], R[1] + R[0] ) SRC_slice = p2slice( R[2], R[2] + R[0] ) DEST[DEST_slice] = SRC[SRC_slice] # blit >> DEST [[1 1 1 1 1 1] [1 1 1 1 1 1] [1 1 1 1 1 1] [1 1 1 1 1 1] [1 1 1 1 1 1] [1 1 1 1 8 8]] Notes: the function should be as general as possible. The input as box sizes and returning the intersection area size and both local offsets seems to be appropriate, since more often the rectangle objects are defined as (coordinate, size) tuples, and in many cases the destination box is itself the origin (i.e. its coordinate is (0,0,...)). But of course there can be various variants for the output format and order.
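As a side note, the per-axis loop above could also be written in vectorized numpy; an equivalent sketch (intended to return the same three rows, but consider it untested):

```
import numpy

def box_clip_vec(box1, box2, offset):
    # vectorized variant of box_clip; inputs are 1-D and equally sized
    box1 = numpy.asarray(box1)
    box2 = numpy.asarray(box2)
    offset = numpy.asarray(offset)
    left = numpy.maximum(0, offset)             # startpoints of the clipped area
    right = numpy.minimum(box1, offset + box2)  # endpoints of the clipped area
    size = numpy.maximum(right - left, 0)       # clamp negative sizes to zero
    return numpy.stack((size, left, left - offset))
```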
Regards, Mikhail V From Jerome.Kieffer at esrf.fr Mon Jul 10 04:34:46 2017 From: Jerome.Kieffer at esrf.fr (Jerome Kieffer) Date: Mon, 10 Jul 2017 10:34:46 +0200 Subject: [Numpy-discussion] Generalized rectangle intersection. (Was: Array blitting) In-Reply-To: References: Message-ID: <20170710103446.5e1e0aa7@lintaillefer.esrf.fr> On Sun, 9 Jul 2017 23:35:58 +0200 Mikhail V wrote: > disclaimer: I am not a past contributor to numpy and I don't know > much about github, and what a pull request means. So I just put the > examples here. > > So in short, the proposal idea is to add a library function which > calculates the intersection > area of two rectangles, generalized for any dimensions. I am using this kind of clipping as well, but in the case you are suggesting the boxes look aligned to the axes, which is limiting to me. The general case is much more complicated (and interesting to me :) Moreover, scikit-image may be more interested in this algorithm. Cheers, -- Jérôme Kieffer tel +33 476 882 445 From paul.carrico at free.fr Mon Jul 10 06:39:42 2017 From: paul.carrico at free.fr (paul.carrico at free.fr) Date: Mon, 10 Jul 2017 12:39:42 +0200 Subject: [Numpy-discussion] Reshape 2D array into 3D Message-ID: Dear All I'm looking for a way to reshape a 2D matrix into a 3D one; in my example I want to *move the columns from the 4th to the 8th in the 2nd plane* (3rd dimension I guess) a = np.random.rand(5,8); print(a) I tried a = p.reshape(d, (2,5,4), ) but it is not what I'm expecting Nota : it looks like the following task (while I want to split it in 2 levels and not in 4), but I've not understood it at all https://stackoverflow.com/questions/31686989/numpy-reshape-and-partition-2d-array-to-3d Thanks for your support Paul -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthieu.brucher at gmail.com Mon Jul 10 06:46:19 2017 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Mon, 10 Jul 2017 11:46:19 +0100 Subject: [Numpy-discussion] Reshape 2D array into 3D In-Reply-To: References: Message-ID: Hi, This works, but reshape doesn't move data around. What happens is that the data is flattened and then reshaped. If your 5 is not supposed to move, you should create a 2,5,4 array and then copy the two slices by hand, or use transpose (make it 5,4,2 and then transpose to 2,5,4). Matthieu On 10 Jul 2017 11:40 AM, wrote: > Dear All > > I'm looking for a way to reshape a 2D matrix into a 3D one; in my example > I want to *move the columns from the 4th to the 8th in the 2nd plane* (3rd > dimension I guess) > > a = np.random.rand(5,8); print(a) > > I tried > > a = p.reshape(d, (2,5,4), ) but it is not what I'm expecting > > > Nota : it looks like the following task (while I want to split it in 2 > levels and not in 4), but I've not understood it at all > > https://stackoverflow.com/questions/31686989/numpy- > reshape-and-partition-2d-array-to-3d > > > Thanks for your support > > > Paul > > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From paul.carrico at free.fr Mon Jul 10 08:20:33 2017 From: paul.carrico at free.fr (paul.carrico at free.fr) Date: Mon, 10 Jul 2017 14:20:33 +0200 Subject: [Numpy-discussion] reshape 2D array into 3D Message-ID: Dear All I'm looking for a way to reshape a 2D matrix into a 3D one; in my example I want to *move the columns from the 4th to the 8th in the 2nd plane* (3rd dimension I guess) a = np.random.rand(5,8); print(a) I tried a = p.reshape(d, (2,5,4), ) but it is not what I'm expecting Nota : it looks like the following task (while I want to split it in 2 levels and not in 4), but I've not understood it at all https://stackoverflow.com/questions/31686989/numpy-reshape-and-partition-2d-array-to-3d Thanks for your support Paul -------------- next part -------------- An HTML attachment was scrubbed... URL: From e.antero.tammi at gmail.com Mon Jul 10 09:16:45 2017 From: e.antero.tammi at gmail.com (eat) Date: Mon, 10 Jul 2017 16:16:45 +0300 Subject: [Numpy-discussion] reshape 2D array into 3D In-Reply-To: References: Message-ID: Hi, On Mon, Jul 10, 2017 at 3:20 PM, wrote: > Dear All > > I'm looking for a way to reshape a 2D matrix into a 3D one; in my example > I want to *move the columns from the 4th to the 8th in the 2nd plane* (3rd > dimension I guess) > > a = np.random.rand(5,8); print(a) > > I tried > > a = p.reshape(d, (2,5,4), ) but it is not what I'm expecting > > > Nota : it looks like the following task (while I want to split it in 2 > levels and not in 4), but I've not understood it at all > > https://stackoverflow.com/questions/31686989/numpy- > reshape-and-partition-2d-array-to-3d > Is this what you are looking for: import numpy as np a= np.arange(40).reshape(5, 8) a Out[]: array([[ 0, 1, 2, 3, 4, 5, 6, 7], [ 8, 9, 10, 11, 12, 13, 14, 15], [16, 17, 18, 19, 20, 21, 22, 23], [24, 25, 26, 27, 28, 29, 30, 31], [32, 33, 34, 35, 36, 37, 38, 39]]) np.lib.stride_tricks.as_strided(a, (2, 5, 4), (16, 32, 4)) Out[]: array([[[ 0, 1, 2, 3], [ 8, 9, 10, 11], [16, 17, 18, 19], [24, 25, 26, 27], [32, 33, 34, 35]], [[ 4, 5, 6, 7], [12, 13, 14, 15], [20, 21, 22, 23], [28, 29, 30, 31], [36, 37, 38, 39]]]) Regards, -eat > > Thanks for your support > > > Paul > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed...
I didn't want to impact the other solvers installed on, so I stopped Paul a = np.arange(40).reshape(5, 8); print(a) print("b =") b = np.lib.stride_tricks.as_strided(a, (2, 5, 4), (16, 32, 4)); print(b) [[ 0 1 2 3 4 5 6 7] [ 8 9 10 11 12 13 14 15] [16 17 18 19 20 21 22 23] [24 25 26 27 28 29 30 31] [32 33 34 35 36 37 38 39]] b = [[[ 0 4294967296 1 8589934592] [ 4 21474836480 5 25769803776] [ 8 38654705664 9 42949672960] [ 12 55834574848 13 60129542144] [ 16 73014444032 17 77309411328]] [[ 2 12884901888 3 17179869184] [ 6 30064771072 7 34359738368] [ 10 47244640256 11 51539607552] [ 14 64424509440 15 68719476736] [ 18 81604378624 19 85899345920]]] Le 2017-07-10 15:16, eat a ?crit : > Hi, > > On Mon, Jul 10, 2017 at 3:20 PM, wrote: > >> Dear All >> >> I'm looking in a way to reshape a 2D matrix into a 3D one ; in my example I want to MOVE THE COLUMNS FROM THE 4TH TO THE 8TH IN THE 2ND PLANE (3rd dimension i guess) >> >> a = np.random.rand(5,8); print(a) >> >> I tried >> >> a = p.reshape(d, (2,5,4), ) but it is not what I'm expecting >> >> Nota : it looks like the following task (while I want to split it in 2 levels and not in 4), but I've not understood at all >> >> https://stackoverflow.com/questions/31686989/numpy-reshape-and-partition-2d-array-to-3d [1] > > Is this what you are looking for: > > import numpy as np > > a= np.arange(40).reshape(5, 8) > > a > Out[]: > array([[ 0, 1, 2, 3, 4, 5, 6, 7], > [ 8, 9, 10, 11, 12, 13, 14, 15], > [16, 17, 18, 19, 20, 21, 22, 23], > [24, 25, 26, 27, 28, 29, 30, 31], > [32, 33, 34, 35, 36, 37, 38, 39]]) > > np.lib.stride_tricks.as_strided(a, (2, 5, 4), (16, 32, 4)) > Out[]: > array([[[ 0, 1, 2, 3], > [ 8, 9, 10, 11], > [16, 17, 18, 19], > [24, 25, 26, 27], > [32, 33, 34, 35]], > > [[ 4, 5, 6, 7], > [12, 13, 14, 15], > [20, 21, 22, 23], > [28, 29, 30, 31], > [36, 37, 38, 39]]]) > > Regards, > -eat > >> Thanks for your support >> >> Paul >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion [2] > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion Links: ------ [1] https://stackoverflow.com/questions/31686989/numpy-reshape-and-partition-2d-array-to-3d [2] https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From yarkot1 at gmail.com Mon Jul 10 09:52:44 2017 From: yarkot1 at gmail.com (Yarko Tymciurak) Date: Mon, 10 Jul 2017 13:52:44 +0000 Subject: [Numpy-discussion] reshape 2D array into 3D In-Reply-To: References: <6ffd24c6fc21cc586839c355282d4c95@free.fr> Message-ID: On Mon, Jul 10, 2017 at 8:46 AM Yarko Tymciurak wrote: > On Mon, Jul 10, 2017 at 8:37 AM wrote: > >> Thanks >> >> >> Nevertheless it does not work for me and I suspect the python/numpy >> releases :-( >> >> The server on which I'm working on is under Contos 7 that uses python 2.7 >> et numpy 1.7 from memory ; I tried to upgrade both of them (plus spyder) >> but it fails. >> > Question: Did you try to control the python & numpy versions by creating a virtualenv, or a conda env? 
I didn't want to impact the other solvers installed on it, so I stopped >> >> Paul >> >> a = np.arange(40).reshape(5, 8); print(a) >> print("b =") >> b = np.lib.stride_tricks.as_strided(a, (2, 5, 4), (16, 32, 4)); print(b) >> >> [[ 0 1 2 3 4 5 6 7] >> [ 8 9 10 11 12 13 14 15] >> [16 17 18 19 20 21 22 23] >> [24 25 26 27 28 29 30 31] >> [32 33 34 35 36 37 38 39]] >> b = >> [[[ 0 4294967296 1 8589934592] >> [ 4 21474836480 5 25769803776] >> [ 8 38654705664 9 42949672960] >> [ 12 55834574848 13 60129542144] >> [ 16 73014444032 17 77309411328]] >> >> [[ 2 12884901888 3 17179869184] >> [ 6 30064771072 7 34359738368] >> [ 10 47244640256 11 51539607552] >> [ 14 64424509440 15 68719476736] >> [ 18 81604378624 19 85899345920]]] >> >> >> >> >> On 2017-07-10 15:16, eat wrote: >> >> Hi, >> >> On Mon, Jul 10, 2017 at 3:20 PM, wrote: >> >>> Dear All >>> >>> I'm looking for a way to reshape a 2D matrix into a 3D one; in my >>> example I want to *move the columns from the 4th to the 8th in the 2nd >>> plane* (3rd dimension I guess) >>> >>> a = np.random.rand(5,8); print(a) >>> >>> I tried >>> >>> a = p.reshape(d, (2,5,4), ) but it is not what I'm expecting >>> >>> >>> Nota : it looks like the following task (while I want to split it in 2 >>> levels and not in 4), but I've not understood it at all >>> >>> >>> https://stackoverflow.com/questions/31686989/numpy-reshape-and-partition-2d-array-to-3d >>> >> Is this what you are looking for: >> import numpy as np >> >> a= np.arange(40).reshape(5, 8) >> >> a >> Out[]: >> array([[ 0, 1, 2, 3, 4, 5, 6, 7], >> [ 8, 9, 10, 11, 12, 13, 14, 15], >> [16, 17, 18, 19, 20, 21, 22, 23], >> [24, 25, 26, 27, 28, 29, 30, 31], >> [32, 33, 34, 35, 36, 37, 38, 39]]) >> >> np.lib.stride_tricks.as_strided(a, (2, 5, 4), (16, 32, 4)) >> Out[]: >> array([[[ 0, 1, 2, 3], >> [ 8, 9, 10, 11], >> [16, 17, 18, 19], >> [24, 25, 26, 27], >> [32, 33, 34, 35]], >> >> [[ 4, 5, 6, 7], >> [12, 13, 14, 15], >> [20, 21, 22, 23], >> [28, 29, 30, 31], >> [36, 37, 38, 39]]]) >> >> Regards, >> -eat >> >>> >>> Thanks for your support >>> >>> >>> Paul >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at python.org >>> https://mail.python.org/mailman/listinfo/numpy-discussion >>> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From paul.carrico at free.fr Mon Jul 10 10:52:31 2017 From: paul.carrico at free.fr (paul.carrico at free.fr) Date: Mon, 10 Jul 2017 16:52:31 +0200 Subject: [Numpy-discussion] reshape 2D array into 3D In-Reply-To: References: <6ffd24c6fc21cc586839c355282d4c95@free.fr> Message-ID: <710c9f905fbf2d1d8b8f5dcb5c667f19@free.fr> "Question: Did you try to control the python & numpy versions by creating a virtualenv, or a conda env?" I've just downloaded (ana)conda, but I have to take care first that it does not replace the current python release the other solvers work with. Thanks for the information and the support Paul -------------- next part -------------- An HTML attachment was scrubbed...
URL: From sebastian at sipsolutions.net Mon Jul 10 11:00:52 2017 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 10 Jul 2017 17:00:52 +0200 Subject: [Numpy-discussion] reshape 2D array into 3D In-Reply-To: References: Message-ID: <1499698852.7024.1.camel@sipsolutions.net> On Mon, 2017-07-10 at 16:16 +0300, eat wrote: > Hi, > > On Mon, Jul 10, 2017 at 3:20 PM, wrote: > > Dear All > > I'm looking for a way to reshape a 2D matrix into a 3D one; in my > > example I want to move the columns from the 4th to the 8th in the > > 2nd plane (3rd dimension I guess) > > a = np.random.rand(5,8); print(a) > > I tried > > a = p.reshape(d, (2,5,4), ) but it is not what I'm expecting > > > > Nota : it looks like the following task (while I want to split it > > in 2 levels and not in 4), but I've not understood it at all > > https://stackoverflow.com/questions/31686989/numpy-reshape-and-part > > ition-2d-array-to-3d > > > > Is this what you are looking for: > import numpy as np > > a= np.arange(40).reshape(5, 8) > > a > Out[]: > array([[ 0, 1, 2, 3, 4, 5, 6, 7], > [ 8, 9, 10, 11, 12, 13, 14, 15], > [16, 17, 18, 19, 20, 21, 22, 23], > [24, 25, 26, 27, 28, 29, 30, 31], > [32, 33, 34, 35, 36, 37, 38, 39]]) > > np.lib.stride_tricks.as_strided(a, (2, 5, 4), (16, 32, 4)) > Out[]: > array([[[ 0, 1, 2, 3], > [ 8, 9, 10, 11], > [16, 17, 18, 19], > [24, 25, 26, 27], > [32, 33, 34, 35]], > > [[ 4, 5, 6, 7], > [12, 13, 14, 15], > [20, 21, 22, 23], > [28, 29, 30, 31], > [36, 37, 38, 39]]]) > While maybe what he wants, I would avoid stride tricks if you can achieve the same thing with a reshape + transpose. Far safer than if you hardcode the strides, and much shorter if you don't, plus usually easier to read. One thing some people might get confused about with reshape is the order: numpy reshape defaults to C-order, while other packages may use Fortran order for reshaping. You can actually change the order you want to use (though it is in general a good idea to prefer C-order in numpy). - Sebastian > Regards, > -eat > > Thanks for your support > > > > Paul > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 801 bytes Desc: This is a digitally signed message part URL: From stefanv at berkeley.edu Mon Jul 10 14:31:15 2017 From: stefanv at berkeley.edu (Stefan van der Walt) Date: Mon, 10 Jul 2017 11:31:15 -0700 Subject: [Numpy-discussion] Generalized rectangle intersection. (Was: Array blitting) In-Reply-To: <20170710103446.5e1e0aa7@lintaillefer.esrf.fr> References: <20170710103446.5e1e0aa7@lintaillefer.esrf.fr> Message-ID: <1499711475.2865536.1036329208.52FA7AE6@webmail.messagingengine.com> On Mon, Jul 10, 2017, at 01:34, Jerome Kieffer wrote: > On Sun, 9 Jul 2017 23:35:58 +0200 > Mikhail V wrote: > > > disclaimer: I am not a past contributor to numpy and I don't know > > much about github, and what a pull request means. So I just put the > > examples here.
> > > > So in short, the proposal idea is to add a library function which > > calculates the intersection > > area of two rectangles, generalized for any dimensions. > > I am using this kind of clipping as well, but in the case you are > suggesting the boxes look aligned to the axes, which is limiting to me. > The general case is much more complicated (and interesting to me :) > > Moreover, scikit-image may be more interested in this algorithm. We use rectangular clipping in scikit-image, but general polygon clipping is easy thanks to Matplotlib's wrapping of the AGG library. Here's how to do it: https://github.com/scikit-image/scikit-image/blob/master/skimage/_shared/_geometry.py#L6 Stéfan From mikhailwas at gmail.com Mon Jul 10 16:43:24 2017 From: mikhailwas at gmail.com (Mikhail V) Date: Mon, 10 Jul 2017 22:43:24 +0200 Subject: [Numpy-discussion] Generalized rectangle intersection. (Was: Array blitting) In-Reply-To: <20170710103446.5e1e0aa7@lintaillefer.esrf.fr> References: <20170710103446.5e1e0aa7@lintaillefer.esrf.fr> Message-ID: On 10 July 2017 at 10:34, Jerome Kieffer wrote: > On Sun, 9 Jul 2017 23:35:58 +0200 > Mikhail V wrote: > >> disclaimer: I am not a past contributor to numpy and I don't know >> much about github, and what a pull request means. So I just put the >> examples here. >> >> So in short, the proposal idea is to add a library function which >> calculates the intersection >> area of two rectangles, generalized for any dimensions. > > I am using this kind of clipping as well, but in the case you are > suggesting the boxes look aligned to the axes, which is limiting to me. > The general case is much more complicated (and interesting to me :) With boxes I mean a bounding box https://en.wikipedia.org/wiki/Minimum_bounding_box And indeed they are cartesian ranges along all axes by definition. And with the general case, did you mean the task of _finding_ this bounding box for a set of arbitrary points (e.g. a polygon)? If so, that is definitely a separate task, merely for a graphical/geometry lib. E.g. if I want to copy-paste an arbitrary polygonal area between two arrays, I think it is unavoidable to use masks, since there are no polygonal arrays. So I use a boolean mask for simple blitting (crisp edges), or a grayscale mask for smooth edges (like full alpha blitting). And this is not contradictory to finding the box intersection. So if my source array needs a mask I use the following approach, here using a circle as a mask for simplicity: import numpy import cv2 W = 10; H = 10 DEST = numpy.ones([H,W], dtype = "uint8") SRC = numpy.zeros_like(DEST) SRC[:]=8 SRC_mask = numpy.zeros_like(SRC) cv2.circle(SRC_mask, (H//2,W//2), 3, 1, -1) # draw mask (integer center, so this also works on Python 3) SRC_mask = (SRC_mask == 1) # colorkey = 1 offset = (5,4) d = DEST.shape s = SRC.shape R = box_clip(d,s,offset) DEST_slice = p2slice( R[1], R[1] + R[0] ) SRC_slice = p2slice( R[2], R[2] + R[0] ) mask_tmp = SRC_mask[SRC_slice] # the mask also needs slicing! DEST[DEST_slice][mask_tmp] = SRC[SRC_slice][mask_tmp] print (DEST) > DEST [[1 1 1 1 1 1 1 1 1 1] [1 1 1 1 1 1 1 1 1 1] [1 1 1 1 1 1 1 1 1 1] [1 1 1 1 1 1 1 1 1 1] [1 1 1 1 1 1 1 1 1 1] [1 1 1 1 1 1 1 1 1 1] [1 1 1 1 1 1 1 1 8 1] [1 1 1 1 1 1 8 8 8 8] [1 1 1 1 1 1 8 8 8 8] [1 1 1 1 1 8 8 8 8 8]] So as you see this is all the same code, only with a mask array added; it copies only the data inside the circle area of the SRC array, which is how, for example, colorkey blitting works.
That is one of the advantages of having "box_clip" as a separate function: it can be combined with masking, for example. So I need the intersection calculation to be able to do this without manual adjustments of the slices for a blit offset. How would you do this without box_clip or a similar function? Just want to note - it is needed not only for imaging software, it can be used for copying any data between arrays. To find the bounding box of an arbitrary polygon - yes, one needs another function, but that is a different task. And if you want to extract the data from SRC within the mask but with an offset, then again you'll need to find the intersection rectangle first. The idea is that if I just want to copy data, I would otherwise need to import one of the graphical libraries just for that one function (or roll my own function).

Mikhail

From charlesr.harris at gmail.com Tue Jul 11 16:49:15 2017
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 11 Jul 2017 14:49:15 -0600
Subject: [Numpy-discussion] pytest and degrees of separation.
Message-ID:

Hi All,

Just looking for opinions and feedback on the need to keep NumPy from having a hard nose/pytest dependency. The options as I see them are:

1. pytest is never imported until the tests are run -- current practice with nose
2. pytest is never imported unless the testfiles are imported -- what I would like
3. pytest is imported together when numpy is -- what we need to avoid.

Currently the approach has been 1), but I think 2) makes more sense and allows more flexibility.

Thoughts?

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From sebastian at sipsolutions.net Tue Jul 11 17:21:54 2017
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Tue, 11 Jul 2017 23:21:54 +0200
Subject: [Numpy-discussion] pytest and degrees of separation.
In-Reply-To: References: Message-ID: <1499808114.30273.6.camel@sipsolutions.net>

On Tue, 2017-07-11 at 14:49 -0600, Charles R Harris wrote:
> Hi All,
>
> Just looking for opinions and feedback on the need to keep NumPy from
> having a hard nose/pytest dependency. The options as I see them are:
>
> pytest is never imported until the tests are run -- current practice
> with nose
> pytest is never imported unless the testfiles are imported -- what I
> would like
> pytest is imported together when numpy is -- what we need to avoid.
> Currently the approach has been 1), but I think 2) makes more sense
> and allows more flexibility.

I am not quite sure about everything here. My guess is we can do whatever we want when it comes to our own tests, and I don't mind just switching everything to pytest (I for one am happy as long as I can run `runtests.py` ;)). When it comes to the utils we provide, those should keep working without nose/pytest if they worked without it before, I think.

My guess is that all your options do that, so I think we should take the one that gives the nicest maintainable code :). Though I can't say I looked into it enough to really make a well educated decision; that probably means your option 2.

- Sebastian

> Thoughts?
> Chuck
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 801 bytes Desc: This is a digitally signed message part URL: From tcaswell at gmail.com Tue Jul 11 18:04:01 2017 From: tcaswell at gmail.com (Thomas Caswell) Date: Tue, 11 Jul 2017 22:04:01 +0000 Subject: [Numpy-discussion] pytest and degrees of separation. In-Reply-To: <1499808114.30273.6.camel@sipsolutions.net> References: <1499808114.30273.6.camel@sipsolutions.net> Message-ID: Going with option 2 is probably the best option so that you can use pytest fixtures and parameterization. Might be worth looking at how Matplotlib re-arranged things on our master branch to maintain back-compatibility with nose-specific tools that were used by down-stream projects. Tom On Tue, Jul 11, 2017 at 4:22 PM Sebastian Berg wrote: > On Tue, 2017-07-11 at 14:49 -0600, Charles R Harris wrote: > > Hi All, > > > > Just looking for opinions and feedback on the need to keep NumPy from > > having a hard nose/pytest dependency. The options as I see them are: > > > > pytest is never imported until the tests are run -- current practice > > with nose > > pytest is never imported unless the testfiles are imported -- what I > > would like > > pytest is imported together when numpy is -- what we need to avoid. > > Currently the approach has been 1), but I think 2) makes more sense > > and allows more flexibility. > > > I am not quite sure about everything here. My guess is we can do > whatever we want when it comes to our own tests, and I don't mind just > switching everything to pytest (I for one am happy as long as I can run > `runtests.py` ;)). > When it comes to the utils we provide, those should keep working > without nose/pytest if they worked before without it I think. > > My guess is that all your options do that, so I think we should take > the one that gives the nicest maintainable code :). Though can't say I > looked enough into it to really make a well educated decision, that > probably means your option 2. > > - Sebastian > > > > > Thoughts? > > Chuck > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Tue Jul 11 19:06:12 2017 From: chris.barker at noaa.gov (Chris Barker) Date: Tue, 11 Jul 2017 18:06:12 -0500 Subject: [Numpy-discussion] pytest and degrees of separation. In-Reply-To: References: <1499808114.30273.6.camel@sipsolutions.net> Message-ID: On Tue, Jul 11, 2017 at 5:04 PM, Thomas Caswell wrote: > Going with option 2 is probably the best option so that you can use pytest > fixtures and parameterization. > I agree -- those are worth a lot! -CHB > Might be worth looking at how Matplotlib re-arranged things on our master > branch to maintain back-compatibility with nose-specific tools that were > used by down-stream projects. > > Tom > > On Tue, Jul 11, 2017 at 4:22 PM Sebastian Berg > wrote: > >> On Tue, 2017-07-11 at 14:49 -0600, Charles R Harris wrote: >> > Hi All, >> > >> > Just looking for opinions and feedback on the need to keep NumPy from >> > having a hard nose/pytest dependency. 
The options as I see them are: >> > >> > pytest is never imported until the tests are run -- current practice >> > with nose >> > pytest is never imported unless the testfiles are imported -- what I >> > would like >> > pytest is imported together when numpy is -- what we need to avoid. >> > Currently the approach has been 1), but I think 2) makes more sense >> > and allows more flexibility. >> >> >> I am not quite sure about everything here. My guess is we can do >> whatever we want when it comes to our own tests, and I don't mind just >> switching everything to pytest (I for one am happy as long as I can run >> `runtests.py` ;)). >> When it comes to the utils we provide, those should keep working >> without nose/pytest if they worked before without it I think. >> >> My guess is that all your options do that, so I think we should take >> the one that gives the nicest maintainable code :). Though can't say I >> looked enough into it to really make a well educated decision, that >> probably means your option 2. >> >> - Sebastian >> >> >> >> > Thoughts? >> > Chuck >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at python.org >> > https://mail.python.org/mailman/listinfo/numpy-discussion >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Wed Jul 12 03:26:53 2017 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Wed, 12 Jul 2017 19:26:53 +1200 Subject: [Numpy-discussion] pytest and degrees of separation. In-Reply-To: References: <1499808114.30273.6.camel@sipsolutions.net> Message-ID: On Wed, Jul 12, 2017 at 11:06 AM, Chris Barker wrote: > > > On Tue, Jul 11, 2017 at 5:04 PM, Thomas Caswell > wrote: > >> Going with option 2 is probably the best option so that you can use >> pytest fixtures and parameterization. >> > > I agree -- those are worth a lot! > Maybe I'm dense, but I don't quite see the difference between 1 and 2. Test files should never be imported unless tests are run, they're not part of any public API nor do they currently have __init__.py files. Ralf > > -CHB > > > >> Might be worth looking at how Matplotlib re-arranged things on our master >> branch to maintain back-compatibility with nose-specific tools that were >> used by down-stream projects. >> >> Tom >> >> On Tue, Jul 11, 2017 at 4:22 PM Sebastian Berg < >> sebastian at sipsolutions.net> wrote: >> >>> On Tue, 2017-07-11 at 14:49 -0600, Charles R Harris wrote: >>> > Hi All, >>> > >>> > Just looking for opinions and feedback on the need to keep NumPy from >>> > having a hard nose/pytest dependency. The options as I see them are: >>> > >>> > pytest is never imported until the tests are run -- current practice >>> > with nose >>> > pytest is never imported unless the testfiles are imported -- what I >>> > would like >>> > pytest is imported together when numpy is -- what we need to avoid. 
>>> > Currently the approach has been 1), but I think 2) makes more sense
>>> > and allows more flexibility.
>>>
>>> I am not quite sure about everything here. My guess is we can do
>>> whatever we want when it comes to our own tests, and I don't mind just
>>> switching everything to pytest (I for one am happy as long as I can run
>>> `runtests.py` ;)).
>>> When it comes to the utils we provide, those should keep working
>>> without nose/pytest if they worked before without it I think.
>>>
>>> My guess is that all your options do that, so I think we should take
>>> the one that gives the nicest maintainable code :). Though can't say I
>>> looked enough into it to really make a well educated decision, that
>>> probably means your option 2.
>>>
>>> - Sebastian
>>>
>>> > Thoughts?
>>> > Chuck
>>> > _______________________________________________
>>> > NumPy-Discussion mailing list
>>> > NumPy-Discussion at python.org
>>> > https://mail.python.org/mailman/listinfo/numpy-discussion
>>> _______________________________________________
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion at python.org
>>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>
> --
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR&R (206) 526-6959 voice
> 7600 Sand Point Way NE (206) 526-6329 fax
> Seattle, WA 98115 (206) 526-6317 main reception
>
> Chris.Barker at noaa.gov
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From charlesr.harris at gmail.com Wed Jul 12 07:53:21 2017
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 12 Jul 2017 05:53:21 -0600
Subject: [Numpy-discussion] pytest and degrees of separation.
In-Reply-To: References: <1499808114.30273.6.camel@sipsolutions.net>
Message-ID:

On Wed, Jul 12, 2017 at 1:26 AM, Ralf Gommers wrote:
>
> On Wed, Jul 12, 2017 at 11:06 AM, Chris Barker wrote:
>
>> On Tue, Jul 11, 2017 at 5:04 PM, Thomas Caswell wrote:
>>
>>> Going with option 2 is probably the best option so that you can use
>>> pytest fixtures and parameterization.
>>
>> I agree -- those are worth a lot!
>
> Maybe I'm dense, but I don't quite see the difference between 1 and 2.
> Test files should never be imported unless tests are run, they're not part
> of any public API nor do they currently have __init__.py files.
>
> Ralf

In practice, that would generally be true, but the nose testing tools followed option 1: all nose imports were buried in functions that ran during testing. Whether or not that was by intent I don't know. But having an explicit consensus on 2, which seems to be the case here, is helpful because it allows better use of pytest fixtures.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From pav at iki.fi Wed Jul 12 16:14:09 2017
From: pav at iki.fi (Pauli Virtanen)
Date: Wed, 12 Jul 2017 22:14:09 +0200
Subject: [Numpy-discussion] pytest and degrees of separation.
In-Reply-To: References: <1499808114.30273.6.camel@sipsolutions.net>
Message-ID:

Charles R Harris kirjoitti 12.07.2017 klo 13:53:
> In practice, that would generally be true, but the nose testing tools
> followed option 1: all nose imports were buried in functions that ran
> during testing. Whether or not that was by intent I don't know. But
> having an explicit consensus on 2, which seems to be the case here, is
> helpful because it allows better use of pytest fixtures.

I guess the question is about shipping new pytest fixtures as a part of the public API of numpy.testing, for use by 3rd party projects.

If the issue is only with Numpy's own tests, they can import stuff from a private submodule that's not imported by "import numpy.testing", so it does not introduce a dependency.

(A similar thing for the public API might also be possible, e.g. "import numpy.testing.pytest_fixtures", but it comes at the cost of a new submodule.)

So I guess a main question actually is: how much of the public API in numpy.testing should be ported to pytest for use by 3rd party projects?

The numerical assert functions are obviously useful.

The warnings suppression (pytest warning stuff IIRC doesn't deal with warning registries nor work around the bugs in warnings.catch_warnings) similarly --- it could make sense to actually upstream it...

But I'm not so clear about the rest.

Pauli
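For readers unfamiliar with the pytest features the thread keeps referring to, fixtures and parametrization look roughly like this. A sketch for illustration only, not actual NumPy test code:

import numpy as np
import pytest

@pytest.fixture
def sample_array():
    # A fixture: pytest builds a fresh array for every test that
    # requests it by naming it as an argument.
    return np.arange(12.0).reshape(3, 4)

@pytest.mark.parametrize("dtype", [np.float32, np.float64])
def test_mean(sample_array, dtype):
    # Parametrization: this test runs once per dtype in the list.
    a = sample_array.astype(dtype)
    assert a.mean() == pytest.approx(5.5)

Both features require the test files to import pytest freely at collection time, which is why option 2 matters for them.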
From ralf.gommers at gmail.com Thu Jul 13 06:41:02 2017
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Thu, 13 Jul 2017 22:41:02 +1200
Subject: [Numpy-discussion] pytest and degrees of separation.
In-Reply-To: References: <1499808114.30273.6.camel@sipsolutions.net>
Message-ID:

On Thu, Jul 13, 2017 at 8:14 AM, Pauli Virtanen wrote:
> Charles R Harris kirjoitti 12.07.2017 klo 13:53:
> > In practice, that would generally be true, but the nose testing tools
> > followed option 1: all nose imports were buried in functions that ran
> > during testing. Whether or not that was by intent I don't know. But
> > having an explicit consensus on 2, which seems to be the case here, is
> > helpful because it allows better use of pytest fixtures.
>
> I guess the question is about shipping new pytest fixtures as a part of
> the public API of numpy.testing, for use by 3rd party projects.
>
Agreed. That's a different question, and I'd prefer to keep things as they are in that respect. Otherwise it's basically a hard dependency of numpy itself on pytest.

> If the issue is only with Numpy's own tests, they can import stuff from
> a private submodule that's not imported by "import numpy.testing", so it
> does not introduce a dependency.
>
> (A similar thing for the public API might also be possible, e.g. "import
> numpy.testing.pytest_fixtures", but it comes at the cost of a new
> submodule.)
>
> So I guess a main question actually is: how much of the public API in
> numpy.testing should be ported to pytest for use by 3rd party projects?
>
> The numerical assert functions are obviously useful.
>
> The warnings suppression (pytest warning stuff IIRC doesn't deal with
> warning registries nor work around the bugs in warnings.catch_warnings)
> similarly --- it could make sense to actually upstream it...
>
> But I'm not so clear about the rest.
>
Agreed, nothing in the decorators that obviously needs a pytest-based implementation. The Tester class may be the one thing.

Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jaime.frio at gmail.com Thu Jul 13 07:34:50 2017
From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=)
Date: Thu, 13 Jul 2017 13:34:50 +0200
Subject: [Numpy-discussion] Adding out to bincount
Message-ID:

There is an ongoing discussion on #9397 about adding an out= keyword argument to np.bincount().
Presently the design seems headed toward:

- the new counts will be added to the contents of out; perhaps we need a better name than out, suggestions welcome,
- if minlength is specified and it doesn't match exactly the size of out, an error will be raised,
- if any of the input indices is outside the bounds of the out array, an error will be raised, and
- if the out array is not of the exact right type, i.e. np.double if weights are specified, np.intp if not, an error will be raised.

If you have an opinion about this functionality please head over there and have your say.

Jaime

--
(\__/)
( O.o)
( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes de dominación mundial.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From cournape at gmail.com Fri Jul 14 14:09:58 2017
From: cournape at gmail.com (David Cournapeau)
Date: Fri, 14 Jul 2017 19:09:58 +0100
Subject: [Numpy-discussion] Remote sprinting for scipy 2017
Message-ID:

Hi,

Is there a numpy-specific communication channel for this year's sprints? I could not see anything on the scipy sprints page.

Thanks,
David
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From Martin.Gfeller at swisscom.com Mon Jul 17 05:13:55 2017
From: Martin.Gfeller at swisscom.com (Martin.Gfeller at swisscom.com)
Date: Mon, 17 Jul 2017 09:13:55 +0000
Subject: [Numpy-discussion] How to compare an array of arrays elementwise to None in Numpy 1.13 (was easy before)?
Message-ID:

Dear all

I have an object array of arrays, which I compare element-wise to None in various places:

>>> a = numpy.array([numpy.arange(5),None,numpy.nan,numpy.arange(6),None],dtype=numpy.object)
>>> a
array([array([0, 1, 2, 3, 4]), None, nan, array([0, 1, 2, 3, 4, 5]), None], dtype=object)
>>> numpy.equal(a,None)
FutureWarning: comparison to `None` will result in an elementwise object comparison in the future.

So far, I always ignored the warning, for lack of an idea how to resolve it.

Now, with Numpy 1.13, I have to resolve the issue, because it fails with:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

It seems that numpy.equal is applied to each inner array, returning a Boolean array for each element, which cannot be coerced to a single Boolean.

The expression

>>> numpy.vectorize(operator.is_)(a,None)

gives the desired result, but feels a bit clumsy.

Is there a cleaner, efficient way to do an element-wise (but shallow) comparison?

Thank you and best regards,
Martin Gfeller, Swisscom

From sebastian at sipsolutions.net Mon Jul 17 05:41:28 2017
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Mon, 17 Jul 2017 11:41:28 +0200
Subject: [Numpy-discussion] How to compare an array of arrays elementwise to None in Numpy 1.13 (was easy before)?
In-Reply-To: References: Message-ID: <1500284488.6357.5.camel@sipsolutions.net>

On Mon, 2017-07-17 at 09:13 +0000, Martin.Gfeller at swisscom.com wrote:
> Dear all
>
> I have an object array of arrays, which I compare element-wise to None
> in various places:
>
> > > > a =
> > > > numpy.array([numpy.arange(5),None,numpy.nan,numpy.arange(6),Non
> > > > e],dtype=numpy.object)
> > > > a
> > array([array([0, 1, 2, 3, 4]), None, nan, array([0, 1, 2, 3, 4, 5]),
> None], dtype=object)
> > > > numpy.equal(a,None)
> > FutureWarning: comparison to `None` will result in an elementwise
> object comparison in the future.
>
> So far, I always ignored the warning, for lack of an idea how to
> resolve it.
> Now, with Numpy 1.13, I have to resolve the issue, because it fails
> with:
>
> ValueError: The truth value of an array with more than one element is
> ambiguous. Use a.any() or a.all()
>
> It seems that numpy.equal is applied to each inner array, returning a
> Boolean array for each element, which cannot be coerced to a single
> Boolean.
>
> The expression
>
> > > > numpy.vectorize(operator.is_)(a,None)
>
> gives the desired result, but feels a bit clumsy.

Yes, I guess one's bug is someone else's feature :(. If it is very bad, we could delay the deprecation probably. For a solution, maybe we could add a ufunc for elementwise `is` on object arrays (dunno about the name, maybe `object_identity`). Just some quick thoughts.

- Sebastian

> Is there a cleaner, efficient way to do an element-wise (but shallow)
> comparison?
>
> Thank you and best regards,
> Martin Gfeller, Swisscom
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 801 bytes
Desc: This is a digitally signed message part
URL:

From robert.kern at gmail.com Mon Jul 17 12:44:47 2017
From: robert.kern at gmail.com (Robert Kern)
Date: Mon, 17 Jul 2017 09:44:47 -0700
Subject: [Numpy-discussion] How to compare an array of arrays elementwise to None in Numpy 1.13 (was easy before)?
In-Reply-To: References: Message-ID:

On Mon, Jul 17, 2017 at 2:13 AM, wrote:
>
> Dear all
>
> I have an object array of arrays, which I compare element-wise to None in various places:
>
> >>> a = numpy.array([numpy.arange(5),None,numpy.nan,numpy.arange(6),None],dtype=numpy.object)
> >>> a
> array([array([0, 1, 2, 3, 4]), None, nan, array([0, 1, 2, 3, 4, 5]), None], dtype=object)
> >>> numpy.equal(a,None)
> FutureWarning: comparison to `None` will result in an elementwise object comparison in the future.
>
> So far, I always ignored the warning, for lack of an idea how to resolve it.
>
> Now, with Numpy 1.13, I have to resolve the issue, because it fails with:
>
> ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
>
> It seems that numpy.equal is applied to each inner array, returning a Boolean array for each element, which cannot be coerced to a single Boolean.
>
> The expression
>
> >>> numpy.vectorize(operator.is_)(a,None)
>
> gives the desired result, but feels a bit clumsy.

Wrap the clumsiness up in a documented, tested utility function with a descriptive name and use that function everywhere instead.

--
Robert Kern
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
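One possible shape for such a utility, using the frompyfunc approach Martin reports as faster later in the thread (the helper name is made up for illustration):

import operator
import numpy as np

# Element-wise `is` over object arrays; frompyfunc returns an
# object-dtype result, so cast it to a proper boolean array.
_elementwise_is = np.frompyfunc(operator.is_, 2, 1)

def is_none(arr):
    """Boolean mask marking which elements of `arr` are None."""
    return _elementwise_is(arr, None).astype(bool)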
From wieser.eric+numpy at gmail.com Mon Jul 17 13:52:30 2017
From: wieser.eric+numpy at gmail.com (Eric Wieser)
Date: Mon, 17 Jul 2017 17:52:30 +0000
Subject: [Numpy-discussion] How to compare an array of arrays elementwise to None in Numpy 1.13 (was easy before)?
In-Reply-To: References: Message-ID:

Here's a hack that lets you keep using ==:

class IsCompare:
    __array_priority__ = 999999  # needed to make it work on either side of `==`
    def __init__(self, val): self._val = val
    def __eq__(self, other): return other is self._val
    def __ne__(self, other): return other is not self._val

a == IsCompare(None)  # a is None
a == np.array(IsCompare(None))  # broadcasted a is None

Eric

On Mon, 17 Jul 2017 at 17:45 Robert Kern wrote:
> On Mon, Jul 17, 2017 at 2:13 AM, wrote:
> >
> > Dear all
> >
> > I have an object array of arrays, which I compare element-wise to None in
> various places:
> >
> > >>> a = numpy.array([numpy.arange(5),None,numpy.nan,numpy.arange(6),None],dtype=numpy.object)
> > >>> a
> > array([array([0, 1, 2, 3, 4]), None, nan, array([0, 1, 2, 3, 4, 5]),
> None], dtype=object)
> > >>> numpy.equal(a,None)
> > FutureWarning: comparison to `None` will result in an elementwise object
> comparison in the future.
> >
> > So far, I always ignored the warning, for lack of an idea how to resolve
> it.
> >
> > Now, with Numpy 1.13, I have to resolve the issue, because it fails with:
> >
> > ValueError: The truth value of an array with more than one element is
> ambiguous. Use a.any() or a.all()
> >
> > It seems that numpy.equal is applied to each inner array, returning a
> Boolean array for each element, which cannot be coerced to a single Boolean.
> >
> > The expression
> >
> > >>> numpy.vectorize(operator.is_)(a,None)
> >
> > gives the desired result, but feels a bit clumsy.
>
> Wrap the clumsiness up in a documented, tested utility function with a
> descriptive name and use that function everywhere instead.
>
> --
> Robert Kern
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From robert.kern at gmail.com Mon Jul 17 13:57:11 2017
From: robert.kern at gmail.com (Robert Kern)
Date: Mon, 17 Jul 2017 10:57:11 -0700
Subject: [Numpy-discussion] How to compare an array of arrays elementwise to None in Numpy 1.13 (was easy before)?
In-Reply-To: References: Message-ID:

On Mon, Jul 17, 2017 at 10:52 AM, Eric Wieser wrote:
> Here's a hack that lets you keep using ==:
>
> class IsCompare:
>     __array_priority__ = 999999  # needed to make it work on either side of `==`
>     def __init__(self, val): self._val = val
>     def __eq__(self, other): return other is self._val
>     def __ne__(self, other): return other is not self._val
>
> a == IsCompare(None)  # a is None
> a == np.array(IsCompare(None))  # broadcasted a is None

Frankly, I'd stick with a well-named utility function. It's much more kind to those who have to read the code (e.g. you in 6 months). :-)

--
Robert Kern
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From wieser.eric+numpy at gmail.com Tue Jul 18 09:37:45 2017
From: wieser.eric+numpy at gmail.com (Eric Wieser)
Date: Tue, 18 Jul 2017 13:37:45 +0000
Subject: [Numpy-discussion] Changing MaskedArray.squeeze() to never return masked
In-Reply-To: References: Message-ID:

When using ndarray.squeeze, a view is returned, which means you can do the following (somewhat-contrived) operation:

>>> def fill_contrived(a):
        a.squeeze()[...] = 2
        return a
>>> fill_contrived(np.array([1]))
array(2)

However, when tried with a masked array, this can fail, breaking Liskov substitution:

>>> fill_contrived(np.ma.array([1], mask=[True]))
MaskError: Cannot alter the masked element.

This fails because squeeze breaks the contract of returning a view, instead deciding sometimes to return masked.

There is a patch that fixes this in gh-9432 - however, by necessity it breaks any existing code that uses m_arr.squeeze() is np.ma.masked.

Is this too breaking a change?

Eric
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
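The check the thread refers to can be sketched as follows; the exact result depends on the numpy version and on whether the gh-9432 patch is applied:

import numpy as np

m = np.ma.array([1], mask=[True])

# As described above, squeeze() here returns the np.ma.masked
# singleton rather than a 0-d masked view, so this prints True before
# the patch and would print False after it:
print(m.squeeze() is np.ma.masked)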
From ben.v.root at gmail.com Tue Jul 18 09:52:08 2017
From: ben.v.root at gmail.com (Benjamin Root)
Date: Tue, 18 Jul 2017 09:52:08 -0400
Subject: [Numpy-discussion] Changing MaskedArray.squeeze() to never return masked
In-Reply-To: References: Message-ID:

This sort of change seems very similar to the np.diag() change a few years ago. Are there lessons we could learn from then that we could apply here? Why would the returned view not be a masked array?

Ben Root

On Tue, Jul 18, 2017 at 9:37 AM, Eric Wieser wrote:
> When using ndarray.squeeze, a view is returned, which means you can do
> the following (somewhat-contrived) operation:
>
> >>> def fill_contrived(a):
>         a.squeeze()[...] = 2
>         return a
> >>> fill_contrived(np.array([1]))
> array(2)
>
> However, when tried with a masked array, this can fail, breaking Liskov
> substitution:
>
> >>> fill_contrived(np.ma.array([1], mask=[True]))
> MaskError: Cannot alter the masked element.
>
> This fails because squeeze breaks the contract of returning a view,
> instead deciding sometimes to return masked.
>
> There is a patch that fixes this in gh-9432
> - however, by necessity it
> breaks any existing code that uses m_arr.squeeze() is np.ma.masked.
>
> Is this too breaking a change?
>
> Eric
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From Martin.Gfeller at swisscom.com Wed Jul 19 04:31:34 2017
From: Martin.Gfeller at swisscom.com (Martin.Gfeller at swisscom.com)
Date: Wed, 19 Jul 2017 08:31:34 +0000
Subject: [Numpy-discussion] How to compare an array of arrays elementwise to None in
Message-ID: <3439a529dd3b4e19bdfb1d8d080aa824@SG001745.corproot.net>

Thank you for your help!

Sebastian, I couldn't agree more with someone's bug being someone else's feature! A fast identity ufunc would be useful, though.

Actually, numpy.frompyfunc(operator.is_,2,1) is much faster than the numpy.vectorize approach - only about 35% slower on a quick measurement than the direct ==, as opposed to 62% slower with vectorize (with otypes hint).

Robert, yes, that's what I already did provisionally.

Eric, that is a nice puzzle - but I agree with Robert about understanding by code maintainers.

Thanks again, and best regards,
Martin

On Mon, 17 Jul 2017 11:41 Sebastian Berg wrote
> Yes, I guess one's bug is someone else's feature :(. If it is very bad,
> we could delay the deprecation probably. For a solution, maybe
> we could add a ufunc for elementwise `is` on object arrays (dunno
> about the name, maybe `object_identity`).
> Just some quick thoughts.
> - Sebastian

On Mon, 17 Jul 2017 at 17:45 Robert Kern wrote:
> Wrap the clumsiness up in a documented, tested utility function with a
> descriptive name and use that function everywhere instead.
> Robert Kern

On Mon, Jul 17, 2017 at 10:52 AM, Eric Wieser wrote:
> Here's a hack that lets you keep using ==:
>
> class IsCompare:
>     __array_priority__ = 999999  # needed to make it work on either side of `==`
>     def __init__(self, val): self._val = val
>     def __eq__(self, other): return other is self._val
>     def __ne__(self, other): return other is not self._val
>
> a == IsCompare(None)  # a is None
> a == np.array(IsCompare(None))  # broadcasted a is None

From sebastian at sipsolutions.net Wed Jul 19 08:55:18 2017
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Wed, 19 Jul 2017 14:55:18 +0200
Subject: [Numpy-discussion] How to compare an array of arrays elementwise to None in
In-Reply-To: <3439a529dd3b4e19bdfb1d8d080aa824@SG001745.corproot.net>
References: <3439a529dd3b4e19bdfb1d8d080aa824@SG001745.corproot.net>
Message-ID: <1500468918.11386.5.camel@sipsolutions.net>

On Wed, 2017-07-19 at 08:31 +0000, Martin.Gfeller at swisscom.com wrote:
> Thank you for your help!
>
> Sebastian, I couldn't agree more with someone's bug being someone
> else's feature! A fast identity ufunc would be useful, though.

An `object_identity` ufunc should be very easy to implement; the bigger work is likely actually deciding on it and the name. We should also probably check back with the PyPy guys to make sure it would also work on PyPy.

- Sebastian

> Actually, numpy.frompyfunc(operator.is_,2,1) is much faster than the
> numpy.vectorize approach - only about 35% slower on a quick
> measurement than the direct ==, as opposed to 62% slower with
> vectorize (with otypes hint).
>
> Robert, yes, that's what I already did provisionally.
>
> Eric, that is a nice puzzle - but I agree with Robert about
> understanding by code maintainers.
>
> Thanks again, and best regards,
> Martin
>
> On Mon, 17 Jul 2017 11:41 Sebastian Berg wrote
> > Yes, I guess one's bug is someone else's feature :(. If it is very
> > bad, we could delay the deprecation probably. For a solution, maybe
> > we could add a ufunc for elementwise `is` on object arrays (dunno
> > about the name, maybe `object_identity`).
> > Just some quick thoughts.
> > - Sebastian
>
> On Mon, 17 Jul 2017 at 17:45 Robert Kern wrote:
> > Wrap the clumsiness up in a documented, tested utility function
> > with a descriptive name and use that function everywhere instead.
> > Robert Kern
>
> On Mon, Jul 17, 2017 at 10:52 AM, Eric Wieser wrote:
> > Here's a hack that lets you keep using ==:
> >
> > class IsCompare:
> >     __array_priority__ = 999999  # needed to make it work on either
> > side of `==`
> >     def __init__(self, val): self._val = val
> >     def __eq__(self, other): return other is self._val
> >     def __ne__(self, other): return other is not self._val
> >
> > a == IsCompare(None)  # a is None
> > a == np.array(IsCompare(None))  # broadcasted a is None
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 801 bytes
Desc: This is a digitally signed message part
URL:

From ralf.gommers at gmail.com Fri Jul 21 02:41:51 2017
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Fri, 21 Jul 2017 18:41:51 +1200
Subject: [Numpy-discussion] Fwd: [NumFOCUS Projects] Important: 2017 NumFOCUS Summit & Sustainability Workshop
In-Reply-To: <8EA01960-BE4A-4B5E-B1BD-6A417243BC1B@numfocus.org>
References: <8EA01960-BE4A-4B5E-B1BD-6A417243BC1B@numfocus.org>
Message-ID:

Hi all,

This is a public service announcement: for the first time NumFOCUS is organising a sustainability workshop, details below. NumPy is able to send two representatives; the NumPy steering committee chose Chuck and Nathaniel. I'll be present as well as NumFOCUS board member.

Last year there was a NumFOCUS summit as well, but no sustainability workshop. It looks like going forward there will be one of these a year. If there's more interest than tickets next year we'll look at rotating the NumPy representatives.

Cheers,
Ralf

---------- Forwarded message ----------
From: Christie Koehler
Date: Thu, Jun 29, 2017 at 10:52 AM
Subject: [NumFOCUS Projects] Important: 2017 NumFOCUS Summit & Sustainability Workshop
To: "projects at numfocus.org"

Dear Project Leads,

Following up from our "save the date" note a few weeks ago, here is our formal invitation to the 2017 NumFOCUS Summit & Sustainability Workshop taking place in *Austin, Texas on 10-11 October, 2017*. All participants planning to attend should register no later than July 19th, 2017.

New at this year's Summit is our first ever Sustainability Workshop. The goal of the Workshop is to guide project leads and core contributors in identifying an appropriate sustainability plan for their project as well as the initial steps required to start implementing that plan. We would like to have at least 1-2 representatives from each NumFOCUS sponsored project attend the workshop. Members of the NumFOCUS Board of Directors, Advisory Council, and Sustainability Advisory Board who would like to attend are welcome to do so.

Project Leads are tasked with selecting which project representative(s) should attend the Summit/Workshop, according to their governance practices. Once you've selected your reps, you may simply forward this email to them and ask them to register. You do not need to otherwise inform NumFOCUS of your choice prior.

------------------------------

N.B. - FYI, NumFOCUS is hoping to arrange an event for the Austin data science community to benefit from the presence of so many core maintainers of the scientific computing stack. This would take place immediately prior to the NumFOCUS Summit. No details are available yet - if we succeed in securing a venue, NumFOCUS staff will be in touch to invite your participation. We just wanted to give you a friendly heads-up as you consider your travel plans. Direct questions about this event to info at numfocus.org.

------------------------------

FAQ (Please review before registering):

What am I committing to if I register for the Summit/Workshop?

Those attending the Summit/Workshop should plan to attend all day on the 10th and 11th. You can view the tentative schedule for the Summit/Workshop here . Additionally, each project sending a representative to the Summit should commit to preparing and delivering a 5-minute lightning talk about the state of their project. NumFOCUS staff will be available to help with these presentations.

Who do you recommend we send to represent our project?
While we recognize that people have multiple roles, we recommend sending 1 person who can represent the technical aspects of your project (such as a person who develops or maintains code) and 1 person who can represent the community/business aspects (such as a person who works with the user/developer community). Representatives can be paid or unpaid contributors, but should be sufficiently invested and involved in your project such that they will both want to and are able to continue the sustainability work started at the Workshop.

Can we send more than 2 reps from our project?

Maybe. Our travel budget is based on two people from each project. That number is also what we are basing our venue, catering, and other logistical arrangements upon. Once we start collecting registrations, however, we may find that not all projects can send two representatives and/or that some projects can cover the cost of travel for one or more of their reps. So, a max of two registrations for each project will be available and thereafter anyone from your project wanting to attend will be given the option of being added to the waitlist. Because registrations are "first-come, first-served" you should make sure your first two choices of representative register before subsequent ones.

Are you covering travel expenses for project reps? For those traveling from outside the United States, too?

Yes. We have a budget for reimbursing participants for travel-related costs, even those needing to travel from outside the United States. To ensure the best use of these funds, we'll ask you when you register if you need NumFOCUS to reimburse you for travel. For all participants, we will handle hotel reservations and payments. We will book hotel accommodations for all participants for three nights: 9-11 October. Please email summit-travel at numfocus.org if you need different arrangements.

Those participants needing to be reimbursed for other travel-related expenses should generally expect to cover those expenses at the time they are incurred and submit them to NumFOCUS for prompt reimbursement after the Summit/Workshop. But if this will present a hardship for you, let us know and we'll make other arrangements. For further details about what we'll reimburse you for, read our travel reimbursement policy (coming soon).

If you are traveling from outside of the United States and might need assistance with visa or other immigration-related issues, please let us know as soon as possible.

What if I have other questions not answered here?

Ask via this list, reply directly to Christie, or email summit-info at numfocus.org.

Cheers,
Christie

--
Christie Koehler
Projects Director, NumFOCUS
christie at numfocus.org
+1 415-317-7603 mobile/signal
Need to meet with me? https://calendly.com/numfocus-christie

--
You received this message because you are subscribed to the Google Groups "Fiscally Sponsored Project Representatives" group.
To unsubscribe from this group and stop receiving emails from it, send an email to projects+unsubscribe at numfocus.org.
To post to this group, send email to projects at numfocus.org.
Visit this group at https://groups.google.com/a/numfocus.org/group/projects/ .
To view this discussion on the web visit https://groups.google.com/a/numfocus.org/d/msgid/projects/8EA01960-BE4A-4B5E-B1BD-6A417243BC1B%40numfocus.org .
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ralf.gommers at gmail.com Fri Jul 21 02:52:31 2017
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Fri, 21 Jul 2017 18:52:31 +1200
Subject: [Numpy-discussion] NumPy steering councils members
Message-ID:

Hi all,

It has been well over a year since we put together the governance structure and steering council (https://docs.scipy.org/doc/numpy-dev/dev/governance/people.html#governance-people). We haven't reviewed the people on the steering council in that time. Based on the criteria for membership I would like to make the following suggestion (note, not discussed with everyone in private beforehand):

Adding the following people to the steering council:
- Eric Wieser
- Marten van Kerkwijk
- Stephan Hoyer
- Allan Haldane

Removing the following people from the steering council due to inactivity:
- Alex Griffing

Note that I've tried to contact Alex directly before, but he has not replied and has 0 activity on GitHub for the last year. I will try once more though.

Thoughts?

Cheers,
Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jaime.frio at gmail.com Fri Jul 21 04:33:12 2017
From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=)
Date: Fri, 21 Jul 2017 10:33:12 +0200
Subject: [Numpy-discussion] NumPy steering councils members
In-Reply-To: References: Message-ID:

On Fri, Jul 21, 2017 at 8:52 AM, Ralf Gommers wrote:
> Hi all,
>
> It has been well over a year since we put together the governance
> structure and steering council
> (https://docs.scipy.org/doc/numpy-dev/dev/governance/people.html#governance-people).
> We haven't reviewed the people on the steering council in that time.
> Based on the criteria for membership I would like to make the following
> suggestion (note, not discussed with everyone in private beforehand):
>
> Adding the following people to the steering council:
> - Eric Wieser
> - Marten van Kerkwijk
> - Stephan Hoyer
> - Allan Haldane

+1. I mean, +4! :-)

Jaime

> Removing the following people from the steering council due to inactivity:
> - Alex Griffing
>
> Note that I've tried to contact Alex directly before, but he has not
> replied and has 0 activity on GitHub for the last year. I will try once
> more though.
>
> Thoughts?
>
> Cheers,
> Ralf
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion

--
(\__/)
( O.o)
( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes de dominación mundial.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jtaylor.debian at googlemail.com Fri Jul 21 10:58:19 2017
From: jtaylor.debian at googlemail.com (Julian Taylor)
Date: Fri, 21 Jul 2017 16:58:19 +0200
Subject: [Numpy-discussion] NumPy steering councils members
In-Reply-To: References: Message-ID: <549d3b08-a9bf-ed6b-ab1e-62d66594f2dc@googlemail.com>

On 21.07.2017 08:52, Ralf Gommers wrote:
> Hi all,
>
> It has been well over a year since we put together the governance
> structure and steering council
> (https://docs.scipy.org/doc/numpy-dev/dev/governance/people.html#governance-people).
> We haven't reviewed the people on the steering council in that time.
I think all of us are in the position where we don't mind giving up this "official" position in favor of more active people (just to note, IIRC in two years now, it was _somewhat_ used a single time when we donate a bit of numpy money to the mingwpy effort). I am not sure if we had it, but we could put in (up to changes of course), a rough number of people we aim to have on it. Just so we don't forget to discuss that there should be a bit flux. And I am all for some flux, because I would think it silly if those who actually make decisions don't end up on it because someone is occasionally throws in a comment. And yes, that person may well be me :). - Sebastian > cheers, > Julian > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 801 bytes Desc: This is a digitally signed message part URL: From njs at pobox.com Fri Jul 21 15:59:37 2017 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 21 Jul 2017 12:59:37 -0700 Subject: [Numpy-discussion] NumPy steering councils members In-Reply-To: <1500654988.5019.1.camel@sipsolutions.net> References: <549d3b08-a9bf-ed6b-ab1e-62d66594f2dc@googlemail.com> <1500654988.5019.1.camel@sipsolutions.net> Message-ID: On Jul 21, 2017 9:36 AM, "Sebastian Berg" wrote: On Fri, 2017-07-21 at 16:58 +0200, Julian Taylor wrote: > On 21.07.2017 08:52, Ralf Gommers wrote: > > Hi all, > > > > It has been well over a year since we put together the governance > > structure and steering council > > (https://docs.scipy.org/doc/numpy-dev/dev/governance/people.html#go > > vernance-people). > > We haven't reviewed the people on the steering council in that > > time. > > Based on the criteria for membership I would like to make the > > following > > suggestion (note, not discussed with everyone in private > > beforehand): > > > > Adding the following people to the steering council: > > - Eric Wieser > > - Marten van Kerkwijk > > - Stephan Hoyer > > - Allan Haldane > > > > > Eric and Marten have only been members with commit rights for 6 > months, > While they have been contributing and very valuable to the project > for > significantly longer, I do think this it is a bit to short time to be > considered for the steering council. > I certainly approve of them becoming members at some point, but I do > want to avoid the steering council to grow to large to quick as long > as > it does not need more members to do its job. > What I do want to avoid is that the steering council becomes like our > committers list, a group that only grows and never shrinks as long as > the occasional heartbeat is heard. > > That said if we think the current steering council is not able to > fulfil > its purpose I do offer my seat for a replacement as I currently have > not > really been contributing much. I doubt that ;). IIRC the rules were "at least one year", so you are probably right that we should delay the official status until then, but I care much personally. Fwiw, the rule to qualify is at least one year of "contributions" that are "sustained" and "substantial". Having a commit bit definitely helps with some kinds of contributions (merging PRs, triaging bugs), but there's no clock that starts ticking when someone gets a commit bit; contributions before that count too. 
""" To become eligible to join the Steering Council, an individual must be a Project Contributor who has produced contributions that are substantial in quality and quantity, and sustained over at least one year. Potential Council Members are nominated by existing Council members, and become members following consensus of the existing Council members, and confirmation that the potential Member is interested and willing to serve in that capacity. [...] When considering potential Members, the Council will look at candidates with a comprehensive view of their contributions. This will include but is not limited to code, code review, infrastructure work, mailing list and chat participation, community help/building, education and outreach, design work, etc. """ Also FWIW, the jupyter steering council is currently 15 people, or 16 including Fernando: https://github.com/jupyter/governance/blob/master/people.md By comparison, Numpy's currently has 8, so Ralf's proposal would bring it to 11: https://docs.scipy.org/doc/numpy-dev/dev/governance/people.html#governance-people Looking at the NumPy council, then with the exception of Alex who I haven't heard from in a while, it looks like a list of people who regularly speak up and have sensible things to say, so I don't personally see any problem with keeping everyone around. It's not like the council is an active working group; it's mainly for occasional oversight and boring logistics. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Fri Jul 21 16:18:23 2017 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Fri, 21 Jul 2017 22:18:23 +0200 Subject: [Numpy-discussion] NumPy steering councils members In-Reply-To: References: <549d3b08-a9bf-ed6b-ab1e-62d66594f2dc@googlemail.com> <1500654988.5019.1.camel@sipsolutions.net> Message-ID: <1500668303.15252.1.camel@sipsolutions.net> On Fri, 2017-07-21 at 12:59 -0700, Nathaniel Smith wrote: > On Jul 21, 2017 9:36 AM, "Sebastian Berg" > wrote: > On Fri, 2017-07-21 at 16:58 +0200, Julian Taylor wrote: > > On 21.07.2017 08:52, Ralf Gommers wrote: > Also FWIW, the jupyter steering council is currently 15 people, or 16 > including Fernando: > ??https://github.com/jupyter/governance/blob/master/people.md > > By comparison, Numpy's currently has 8, so Ralf's proposal would > bring it to 11: > ??https://docs.scipy.org/doc/numpy-dev/dev/governance/people.html#gov > ernance-people > > Looking at the NumPy council, then with the exception of Alex who I > haven't heard from in a while, it looks like a list of people who > regularly speak up and have sensible things to say, so I don't > personally see any problem with keeping everyone around. It's not > like the council is an active working group; it's mainly for > occasional oversight and boring logistics. > For what its worth, I fully agree. Frankly, I thought the lits might be longer ;). And yes, while I can understand that there might be a problem at some point, I am sure we are far from it for a while. Anyway, I think all of those four people Ralf mentioned would be a great addition (and if anyone wants to suggest someone else please speak up). - Sebastian > -n > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 801 bytes Desc: This is a digitally signed message part URL: From chunwei.yuan at gmail.com Fri Jul 21 17:11:22 2017 From: chunwei.yuan at gmail.com (Chun-Wei Yuan) Date: Fri, 21 Jul 2017 14:11:22 -0700 Subject: [Numpy-discussion] quantile() or percentile() Message-ID: There's an ongoing effort to introduce quantile() into numpy. You'd use it just like percentile(), but would input your q value in probability space (0.5 for 50%): https://github.com/numpy/numpy/pull/9213 Since there's a great deal of overlap between these two functions, we'd like to solicit opinions on how to move forward on this. The current thinking is to tolerate the redundancy and keep both, using one as the engine for the other. I'm partial to having quantile because 1.) I prefer probability space, and 2.) I have a PR waiting on quantile(). Best, C -------------- next part -------------- An HTML attachment was scrubbed... URL: From jfoxrabinovitz at gmail.com Fri Jul 21 17:21:44 2017 From: jfoxrabinovitz at gmail.com (Joseph Fox-Rabinovitz) Date: Fri, 21 Jul 2017 17:21:44 -0400 Subject: [Numpy-discussion] quantile() or percentile() In-Reply-To: References: Message-ID: I think that there would be a very good reason to have a separate function if we were to introduce weights to the inputs, similarly to the way that we have mean and average. This would have some (positive) repercussions like making weighted histograms with the Freedman-Diaconis binwidth estimator a possibility. I have had this change on the back-burner for a long time, mainly because I was too lazy to figure out how to include it in the C code. However, I will take a closer look. Regards, -Joe On Fri, Jul 21, 2017 at 5:11 PM, Chun-Wei Yuan wrote: > There's an ongoing effort to introduce quantile() into numpy. You'd use > it just like percentile(), but would input your q value in probability > space (0.5 for 50%): > > https://github.com/numpy/numpy/pull/9213 > > Since there's a great deal of overlap between these two functions, we'd > like to solicit opinions on how to move forward on this. > > The current thinking is to tolerate the redundancy and keep both, using > one as the engine for the other. I'm partial to having quantile because > 1.) I prefer probability space, and 2.) I have a PR waiting on quantile(). > > Best, > > C > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chunwei.yuan at gmail.com Fri Jul 21 17:34:45 2017 From: chunwei.yuan at gmail.com (Chun-Wei Yuan) Date: Fri, 21 Jul 2017 14:34:45 -0700 Subject: [Numpy-discussion] quantile() or percentile() In-Reply-To: References: Message-ID: Just to provide some context, 9213 actually spawned off of this guy: https://github.com/numpy/numpy/pull/9211 which might address the weighted inputs issue Joe brought up. C On Fri, Jul 21, 2017 at 2:21 PM, Joseph Fox-Rabinovitz < jfoxrabinovitz at gmail.com> wrote: > I think that there would be a very good reason to have a separate function > if we were to introduce weights to the inputs, similarly to the way that we > have mean and average. This would have some (positive) repercussions like > making weighted histograms with the Freedman-Diaconis binwidth estimator a > possibility. 
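To make that concrete, a rough pure-Python sketch of one possible
convention (the function name and the interpolation scheme here are purely
illustrative, not from any existing PR):

import numpy as np

def weighted_quantile(x, q, weights):
    # Sort the data, then interpolate q against the weighted empirical CDF.
    x = np.asarray(x, dtype=float)
    w = np.asarray(weights, dtype=float)
    order = np.argsort(x)
    x, w = x[order], w[order]
    cdf = (np.cumsum(w) - 0.5 * w) / np.sum(w)
    return np.interp(q, cdf, x)

With unit weights this essentially reproduces np.percentile(x, 100 * q).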
I have had this change on the back-burner for a long time, mainly because
I was too lazy to figure out how to include it in the C code. However, I
will take a closer look.

Regards,

-Joe

On Fri, Jul 21, 2017 at 5:11 PM, Chun-Wei Yuan wrote:

> There's an ongoing effort to introduce quantile() into numpy. You'd use
> it just like percentile(), but would input your q value in probability
> space (0.5 for 50%):
>
> https://github.com/numpy/numpy/pull/9213
>
> Since there's a great deal of overlap between these two functions, we'd
> like to solicit opinions on how to move forward on this.
>
> The current thinking is to tolerate the redundancy and keep both, using
> one as the engine for the other. I'm partial to having quantile because
> 1.) I prefer probability space, and 2.) I have a PR waiting on quantile().
>
> Best,
>
> C
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From chunwei.yuan at gmail.com  Fri Jul 21 17:34:45 2017
From: chunwei.yuan at gmail.com (Chun-Wei Yuan)
Date: Fri, 21 Jul 2017 14:34:45 -0700
Subject: [Numpy-discussion] quantile() or percentile()
In-Reply-To: References:
Message-ID:

Just to provide some context, 9213 actually spawned off of this guy:

https://github.com/numpy/numpy/pull/9211

which might address the weighted inputs issue Joe brought up.

C

On Fri, Jul 21, 2017 at 2:21 PM, Joseph Fox-Rabinovitz <
jfoxrabinovitz at gmail.com> wrote:

> I think that there would be a very good reason to have a separate
> function if we were to introduce weights to the inputs, similarly to the
> way that we have mean and average. This would have some (positive)
> repercussions like making weighted histograms with the Freedman-Diaconis
> binwidth estimator a possibility. I have had this change on the
> back-burner for a long time, mainly because I was too lazy to figure out
> how to include it in the C code. However, I will take a closer look.
>
> Regards,
>
> -Joe
>
> On Fri, Jul 21, 2017 at 5:11 PM, Chun-Wei Yuan wrote:
>
>> There's an ongoing effort to introduce quantile() into numpy. You'd
>> use it just like percentile(), but would input your q value in
>> probability space (0.5 for 50%):
>>
>> https://github.com/numpy/numpy/pull/9213
>>
>> Since there's a great deal of overlap between these two functions, we'd
>> like to solicit opinions on how to move forward on this.
>>
>> The current thinking is to tolerate the redundancy and keep both, using
>> one as the engine for the other. I'm partial to having quantile because
>> 1.) I prefer probability space, and 2.) I have a PR waiting on quantile().
>>
>> Best,
>>
>> C
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jfoxrabinovitz at gmail.com  Fri Jul 21 18:43:02 2017
From: jfoxrabinovitz at gmail.com (Joseph Fox-Rabinovitz)
Date: Fri, 21 Jul 2017 18:43:02 -0400
Subject: [Numpy-discussion] quantile() or percentile()
In-Reply-To: References:
Message-ID:

While #9211 is a good start, it is pretty inefficient in that it performs
an O(n log n) sort of the array. It is possible to reduce the time to O(n)
by using a partitioning algorithm similar to the one in the C code of
percentile. I will look into it as soon as I can.

-Joe

On Fri, Jul 21, 2017 at 5:34 PM, Chun-Wei Yuan wrote:

> Just to provide some context, 9213 actually spawned off of this guy:
>
> https://github.com/numpy/numpy/pull/9211
>
> which might address the weighted inputs issue Joe brought up.
>
> C
>
> On Fri, Jul 21, 2017 at 2:21 PM, Joseph Fox-Rabinovitz <
> jfoxrabinovitz at gmail.com> wrote:
>
>> I think that there would be a very good reason to have a separate
>> function if we were to introduce weights to the inputs, similarly to the
>> way that we have mean and average. This would have some (positive)
>> repercussions like making weighted histograms with the Freedman-Diaconis
>> binwidth estimator a possibility. I have had this change on the
>> back-burner for a long time, mainly because I was too lazy to figure out
>> how to include it in the C code. However, I will take a closer look.
>>
>> Regards,
>>
>> -Joe
>>
>> On Fri, Jul 21, 2017 at 5:11 PM, Chun-Wei Yuan wrote:
>>
>>> There's an ongoing effort to introduce quantile() into numpy. You'd
>>> use it just like percentile(), but would input your q value in
>>> probability space (0.5 for 50%):
>>>
>>> https://github.com/numpy/numpy/pull/9213
>>>
>>> Since there's a great deal of overlap between these two functions, we'd
>>> like to solicit opinions on how to move forward on this.
>>>
>>> The current thinking is to tolerate the redundancy and keep both, using
>>> one as the engine for the other. I'm partial to having quantile because
>>> 1.) I prefer probability space, and 2.) I have a PR waiting on quantile().
>>> >>> Best, >>> >>> C >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at python.org >>> https://mail.python.org/mailman/listinfo/numpy-discussion >>> >>> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chunwei.yuan at gmail.com Fri Jul 21 19:42:17 2017 From: chunwei.yuan at gmail.com (Chun-Wei Yuan) Date: Fri, 21 Jul 2017 16:42:17 -0700 Subject: [Numpy-discussion] quantile() or percentile() In-Reply-To: References: Message-ID: That would be great. I just used np.argsort because it was familiar to me. Didn't know about the C code. On Fri, Jul 21, 2017 at 3:43 PM, Joseph Fox-Rabinovitz < jfoxrabinovitz at gmail.com> wrote: > While #9211 is a good start, it is pretty inefficient in terms of the fact > that it performs an O(nlogn) sort of the array. It is possible to reduce > the time to O(n) by using a similar partitioning algorithm to the one in > the C code of percentile. I will look into it as soon as I can. > > -Joe > > On Fri, Jul 21, 2017 at 5:34 PM, Chun-Wei Yuan > wrote: > >> Just to provide some context, 9213 actually spawned off of this guy: >> >> https://github.com/numpy/numpy/pull/9211 >> >> which might address the weighted inputs issue Joe brought up. >> >> C >> >> On Fri, Jul 21, 2017 at 2:21 PM, Joseph Fox-Rabinovitz < >> jfoxrabinovitz at gmail.com> wrote: >> >>> I think that there would be a very good reason to have a separate >>> function if we were to introduce weights to the inputs, similarly to the >>> way that we have mean and average. This would have some (positive) >>> repercussions like making weighted histograms with the Freedman-Diaconis >>> binwidth estimator a possibility. I have had this change on the back-burner >>> for a long time, mainly because I was too lazy to figure out how to include >>> it in the C code. However, I will take a closer look. >>> >>> Regards, >>> >>> -Joe >>> >>> >>> >>> On Fri, Jul 21, 2017 at 5:11 PM, Chun-Wei Yuan >>> wrote: >>> >>>> There's an ongoing effort to introduce quantile() into numpy. You'd >>>> use it just like percentile(), but would input your q value in probability >>>> space (0.5 for 50%): >>>> >>>> https://github.com/numpy/numpy/pull/9213 >>>> >>>> Since there's a great deal of overlap between these two functions, we'd >>>> like to solicit opinions on how to move forward on this. >>>> >>>> The current thinking is to tolerate the redundancy and keep both, using >>>> one as the engine for the other. I'm partial to having quantile because >>>> 1.) I prefer probability space, and 2.) I have a PR waiting on quantile(). 
>>>>
>>>> Best,
>>>>
>>>> C
>>>>
>>>> _______________________________________________
>>>> NumPy-Discussion mailing list
>>>> NumPy-Discussion at python.org
>>>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>>
>>> _______________________________________________
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion at python.org
>>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ilhanpolat at gmail.com  Sat Jul 22 06:50:41 2017
From: ilhanpolat at gmail.com (Ilhan Polat)
Date: Sat, 22 Jul 2017 12:50:41 +0200
Subject: [Numpy-discussion] Dropping support for Accelerate
Message-ID:

A few months ago, I had the innocent intention to wrap the LDLt
decomposition routines of LAPACK into SciPy, but then I was made aware
that the minimum required version of LAPACK/BLAS was due to the Accelerate
framework. Since then I've been following the core SciPy team's and
others' discussion on this issue.

We have been exchanging opinions for quite a while now, within various
SciPy issues and PRs, about the ever-increasing Accelerate-related issues,
and I've compiled a brief summary of the ongoing discussions to reduce the
clutter.

First, I would like to kindly invite everyone to contribute to and sharpen
the cases presented here

https://github.com/scipy/scipy/wiki/Dropping-support-for-Accelerate

The reason I specifically wanted to post this also on the NumPy mailing
list is to probe the situation from the NumPy-Accelerate perspective. Is
there any NumPy-specific problem that would indirectly affect SciPy should
support for Accelerate be dropped?
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From njs at pobox.com  Sun Jul 23 05:15:40 2017
From: njs at pobox.com (Nathaniel Smith)
Date: Sun, 23 Jul 2017 02:15:40 -0700
Subject: [Numpy-discussion] Dropping support for Accelerate
In-Reply-To: References:
Message-ID:

I've been wishing we'd stop shipping Accelerate for years, because of
how it breaks multiprocessing -- that doesn't seem to be on your list
yet.

On Sat, Jul 22, 2017 at 3:50 AM, Ilhan Polat wrote:
> A few months ago, I had the innocent intention to wrap the LDLt
> decomposition routines of LAPACK into SciPy, but then I was made aware
> that the minimum required version of LAPACK/BLAS was due to the Accelerate
> framework. Since then I've been following the core SciPy team's and
> others' discussion on this issue.
>
> We have been exchanging opinions for quite a while now, within various
> SciPy issues and PRs, about the ever-increasing Accelerate-related issues,
> and I've compiled a brief summary of the ongoing discussions to reduce the
> clutter.
>
> First, I would like to kindly invite everyone to contribute to and sharpen
> the cases presented here
>
> https://github.com/scipy/scipy/wiki/Dropping-support-for-Accelerate
>
> The reason I specifically wanted to post this also on the NumPy mailing
> list is to probe the situation from the NumPy-Accelerate perspective. Is
> there any NumPy-specific problem that would indirectly affect SciPy should
> support for Accelerate be dropped?
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion

--
Nathaniel J. Smith -- https://vorpus.org

From ilhanpolat at gmail.com  Sun Jul 23 11:16:35 2017
From: ilhanpolat at gmail.com (Ilhan Polat)
Date: Sun, 23 Jul 2017 17:16:35 +0200
Subject: [Numpy-discussion] Dropping support for Accelerate
In-Reply-To: References:
Message-ID:

That's probably because I know nothing about the issue; is there any
reference I can read about?

But in general, please feel free to populate new items in the wiki page.

On Sun, Jul 23, 2017 at 11:15 AM, Nathaniel Smith wrote:
> I've been wishing we'd stop shipping Accelerate for years, because of
> how it breaks multiprocessing -- that doesn't seem to be on your list
> yet.
>
> On Sat, Jul 22, 2017 at 3:50 AM, Ilhan Polat wrote:
> > A few months ago, I had the innocent intention to wrap the LDLt
> > decomposition routines of LAPACK into SciPy, but then I was made aware
> > that the minimum required version of LAPACK/BLAS was due to the
> > Accelerate framework. Since then I've been following the core SciPy
> > team's and others' discussion on this issue.
> >
> > We have been exchanging opinions for quite a while now, within various
> > SciPy issues and PRs, about the ever-increasing Accelerate-related
> > issues, and I've compiled a brief summary of the ongoing discussions to
> > reduce the clutter.
> >
> > First, I would like to kindly invite everyone to contribute to and
> > sharpen the cases presented here
> >
> > https://github.com/scipy/scipy/wiki/Dropping-support-for-Accelerate
> >
> > The reason I specifically wanted to post this also on the NumPy mailing
> > list is to probe the situation from the NumPy-Accelerate perspective.
> > Is there any NumPy-specific problem that would indirectly affect SciPy
> > should support for Accelerate be dropped?
> >
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion at python.org
> > https://mail.python.org/mailman/listinfo/numpy-discussion
>
> --
> Nathaniel J. Smith -- https://vorpus.org
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From nathan12343 at gmail.com  Sun Jul 23 11:22:08 2017
From: nathan12343 at gmail.com (Nathan Goldbaum)
Date: Sun, 23 Jul 2017 10:22:08 -0500
Subject: [Numpy-discussion] Dropping support for Accelerate
In-Reply-To: References:
Message-ID:

See
https://mail.scipy.org/pipermail/numpy-discussion/2012-August/063589.html
and replies in that thread.

Quote from an Apple engineer in that thread:

"For API outside of POSIX, including GCD and technologies like Accelerate,
we do not support usage on both sides of a fork(). For this reason among
others, use of fork() without exec is discouraged in general in processes
that use layers above POSIX."

On Sun, Jul 23, 2017 at 10:16 AM, Ilhan Polat wrote:
> That's probably because I know nothing about the issue; is there any
> reference I can read about?
>
> But in general, please feel free to populate new items in the wiki page.
>
> On Sun, Jul 23, 2017 at 11:15 AM, Nathaniel Smith wrote:
>> I've been wishing we'd stop shipping Accelerate for years, because of
>> how it breaks multiprocessing -- that doesn't seem to be on your list
>> yet.
>>
>> On Sat, Jul 22, 2017 at 3:50 AM, Ilhan Polat wrote:
>> > A few months ago, I had the innocent intention to wrap the LDLt
>> > decomposition routines of LAPACK into SciPy, but then I was made aware
>> > that the minimum required version of LAPACK/BLAS was due to the
>> > Accelerate framework. Since then I've been following the core SciPy
>> > team's and others' discussion on this issue.
>> >
>> > We have been exchanging opinions for quite a while now, within various
>> > SciPy issues and PRs, about the ever-increasing Accelerate-related
>> > issues, and I've compiled a brief summary of the ongoing discussions
>> > to reduce the clutter.
>> >
>> > First, I would like to kindly invite everyone to contribute to and
>> > sharpen the cases presented here
>> >
>> > https://github.com/scipy/scipy/wiki/Dropping-support-for-Accelerate
>> >
>> > The reason I specifically wanted to post this also on the NumPy
>> > mailing list is to probe the situation from the NumPy-Accelerate
>> > perspective. Is there any NumPy-specific problem that would indirectly
>> > affect SciPy should support for Accelerate be dropped?
>> >
>> > _______________________________________________
>> > NumPy-Discussion mailing list
>> > NumPy-Discussion at python.org
>> > https://mail.python.org/mailman/listinfo/numpy-discussion
>>
>> --
>> Nathaniel J. Smith -- https://vorpus.org
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From bobmerhebi at gmail.com  Mon Jul 24 10:37:40 2017
From: bobmerhebi at gmail.com (Bob)
Date: Mon, 24 Jul 2017 16:37:40 +0200
Subject: [Numpy-discussion] Slice nested arrays, How to
Message-ID:

Hello,

I created the following array by converting it from a nested list:

    a = np.array([np.array([ 17.56578416, 16.82712825, 16.57992292,
    15.83534836]),
    np.array([ 17.9002445 , 17.35024876, 16.69733472, 15.78809856]),
    np.array([ 17.90086839, 17.64315136, 17.40653009, 17.26346787,
    16.99901931, 16.87787178, 16.68278558, 16.56006419, 16.43672445]),
    np.array([ 17.91147242, 17.2770623 , 17.0320501 , 16.73729491,
    16.4910479 ])], dtype=object)

I wish to slice the first element of each sub-array so I can perform
basic statistics (mean, sd, etc...).

How can I do that for large data without resorting to loops? Here's the
result I want with a loop:

    s = np.zeros(4)
    for i in np.arange(4):
        s[i] = a[i][0]

    array([ 17.56578416, 17.9002445 , 17.90086839, 17.91147242])

Thank you

From sebastian at sipsolutions.net  Mon Jul 24 10:54:27 2017
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Mon, 24 Jul 2017 16:54:27 +0200
Subject: [Numpy-discussion] Slice nested arrays, How to
In-Reply-To: References:
Message-ID: <1500908067.13334.2.camel@sipsolutions.net>

On Mon, 2017-07-24 at 16:37 +0200, Bob wrote:
> Hello,
>
> I created the following array by converting it from a nested list:
>
>     a = np.array([np.array([ 17.56578416, 16.82712825, 16.57992292,
>     15.83534836]),
>     np.array([ 17.9002445 , 17.35024876, 16.69733472, 15.78809856]),
>     np.array([ 17.90086839, 17.64315136, 17.40653009, 17.26346787,
>     16.99901931, 16.87787178, 16.68278558, 16.56006419, 16.43672445]),
>     np.array([ 17.91147242, 17.2770623 , 17.0320501 , 16.73729491,
>     16.4910479 ])], dtype=object)
>
> I wish to slice the first element of each sub-array so I can perform
> basic statistics (mean, sd, etc...).
>
> How can I do that for large data without resorting to loops? Here's
> the result I want with a loop:

Arrays of arrays are not very nice in these regards; you could use
np.frompyfunc/np.vectorize together with `operator.getitem` to avoid the
loop. It probably will not be much faster though.
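An untested sketch of what I mean, with `a` as in your example:

import operator
import numpy as np

# frompyfunc turns operator.getitem into a ufunc over the object array
first = np.frompyfunc(operator.getitem, 2, 1)(a, 0).astype(float)
# or, equivalently:
first = np.vectorize(operator.getitem)(a, 0)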
- Sebastian

>     s = np.zeros(4)
>     for i in np.arange(4):
>         s[i] = a[i][0]
>
>     array([ 17.56578416, 17.9002445 , 17.90086839, 17.91147242])
>
> Thank you
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 801 bytes
Desc: This is a digitally signed message part
URL:

From wrw at mac.com  Mon Jul 24 13:15:15 2017
From: wrw at mac.com (William Ray Wing)
Date: Mon, 24 Jul 2017 13:15:15 -0400
Subject: [Numpy-discussion] Slice nested arrays, How to
In-Reply-To: References:
Message-ID:

> On Jul 24, 2017, at 10:37 AM, Bob wrote:
>
> Hello,
>
> I created the following array by converting it from a nested list:
>
> a = np.array([np.array([ 17.56578416, 16.82712825, 16.57992292,
> 15.83534836]),
> np.array([ 17.9002445 , 17.35024876, 16.69733472, 15.78809856]),
> np.array([ 17.90086839, 17.64315136, 17.40653009, 17.26346787,
> 16.99901931, 16.87787178, 16.68278558, 16.56006419, 16.43672445]),
> np.array([ 17.91147242, 17.2770623 , 17.0320501 , 16.73729491,
> 16.4910479 ])], dtype=object)
>
> I wish to slice the first element of each sub-array so I can perform
> basic statistics (mean, sd, etc...).
>
Have you considered using Pandas? Assuming I understand what you are
trying to do, that nested list could be read directly into a Pandas 2D
data frame. Extracting the first element of each column (or row) is then
fast and efficient.
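Something along these lines (just a sketch -- note that rows of unequal
length get padded with NaN):

import pandas as pd

df = pd.DataFrame(a.tolist())   # one NaN-padded row per sub-array
first = df[0].values            # first element of each sub-array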
Bill

> How can I do that for large data without resorting to loops? Here's
> the result I want with a loop:
>
>     s = np.zeros(4)
>     for i in np.arange(4):
>         s[i] = a[i][0]
>
>     array([ 17.56578416, 17.9002445 , 17.90086839, 17.91147242])
>
> Thank you
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion

From sebastian at sipsolutions.net  Tue Jul 25 07:13:02 2017
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Tue, 25 Jul 2017 13:13:02 +0200
Subject: [Numpy-discussion] NumPy steering councils members
In-Reply-To: <1500668303.15252.1.camel@sipsolutions.net>
References: <549d3b08-a9bf-ed6b-ab1e-62d66594f2dc@googlemail.com>
	<1500654988.5019.1.camel@sipsolutions.net>
	<1500668303.15252.1.camel@sipsolutions.net>
Message-ID: <1500981182.21305.3.camel@sipsolutions.net>

Hi all,

so I guess this means: unless anyone protests publicly or privately (soon,
though probably at least a week from now), we will invite four new members
to the steering council and, if they accept, they will be added soon [1].
These are:

- Eric Wieser
- Marten van Kerkwijk
- Stephan Hoyer
- Allan Haldane

all of whom have done considerable work for NumPy for a long time. I would
also like to note again that I am happy to hear any additional suggestions.

Alex Griffin will be informed that, depending on his wishes, he may have
to leave soon or within about a year (IIRC that is about what the
governance docs say).

Regards,

Sebastian

[1] Two of whom may be appointed with some delay due to the one-year rule.
We may have to hash out details here.

On Fri, 2017-07-21 at 22:18 +0200, Sebastian Berg wrote:
> On Fri, 2017-07-21 at 12:59 -0700, Nathaniel Smith wrote:
> > On Jul 21, 2017 9:36 AM, "Sebastian Berg" wrote:
> > On Fri, 2017-07-21 at 16:58 +0200, Julian Taylor wrote:
> > > On 21.07.2017 08:52, Ralf Gommers wrote:
> >
> > Also FWIW, the jupyter steering council is currently 15 people, or 16
> > including Fernando:
> > https://github.com/jupyter/governance/blob/master/people.md
> >
> > By comparison, Numpy's currently has 8, so Ralf's proposal would
> > bring it to 11:
> > https://docs.scipy.org/doc/numpy-dev/dev/governance/people.html#governance-people
> >
> > Looking at the NumPy council, with the exception of Alex, whom I
> > haven't heard from in a while, it looks like a list of people who
> > regularly speak up and have sensible things to say, so I don't
> > personally see any problem with keeping everyone around. It's not
> > like the council is an active working group; it's mainly for
> > occasional oversight and boring logistics.
>
> For what it's worth, I fully agree. Frankly, I thought the list might
> be longer ;). And yes, while I can understand that there might be a
> problem at some point, I am sure we are far from it for a while.
>
> Anyway, I think all of those four people Ralf mentioned would be a
> great addition (and if anyone wants to suggest someone else please
> speak up).
>
> - Sebastian
>
> > -n
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion at python.org
> > https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 801 bytes
Desc: This is a digitally signed message part
URL:

From matthew.brett at gmail.com  Tue Jul 25 07:57:46 2017
From: matthew.brett at gmail.com (Matthew Brett)
Date: Tue, 25 Jul 2017 12:57:46 +0100
Subject: [Numpy-discussion] Dropping support for Accelerate
In-Reply-To: References:
Message-ID:

Hi,

On Sun, Jul 23, 2017 at 5:07 PM, Ilhan Polat wrote:
> Ouch, that's from 2012 :( I'll add this thread as a reference to the wiki
> list.
>
> On Sun, Jul 23, 2017 at 5:22 PM, Nathan Goldbaum wrote:
>> See
>> https://mail.scipy.org/pipermail/numpy-discussion/2012-August/063589.html
>> and replies in that thread.
>>
>> Quote from an Apple engineer in that thread:
>>
>> "For API outside of POSIX, including GCD and technologies like
>> Accelerate, we do not support usage on both sides of a fork(). For this
>> reason among others, use of fork() without exec is discouraged in
>> general in processes that use layers above POSIX."
>>
>> On Sun, Jul 23, 2017 at 10:16 AM, Ilhan Polat wrote:
>>> That's probably because I know nothing about the issue; is there any
>>> reference I can read about?
>>>
>>> But in general, please feel free to populate new items in the wiki page.
>>>
>>> On Sun, Jul 23, 2017 at 11:15 AM, Nathaniel Smith wrote:
>>>> I've been wishing we'd stop shipping Accelerate for years, because of
>>>> how it breaks multiprocessing -- that doesn't seem to be on your list
>>>> yet.
>>>>
>>>> On Sat, Jul 22, 2017 at 3:50 AM, Ilhan Polat wrote:
>>>> > A few months ago, I had the innocent intention to wrap the LDLt
>>>> > decomposition routines of LAPACK into SciPy, but then I was made
>>>> > aware that the minimum required version of LAPACK/BLAS was due to
>>>> > the Accelerate framework. Since then I've been following the core
>>>> > SciPy team's and others' discussion on this issue.
>>>> >
>>>> > We have been exchanging opinions for quite a while now, within
>>>> > various SciPy issues and PRs, about the ever-increasing
>>>> > Accelerate-related issues, and I've compiled a brief summary of the
>>>> > ongoing discussions to reduce the clutter.
>>>> >
>>>> > First, I would like to kindly invite everyone to contribute to and
>>>> > sharpen the cases presented here
>>>> >
>>>> > https://github.com/scipy/scipy/wiki/Dropping-support-for-Accelerate
>>>> >
>>>> > The reason I specifically wanted to post this also on the NumPy
>>>> > mailing list is to probe the situation from the NumPy-Accelerate
>>>> > perspective. Is there any NumPy-specific problem that would
>>>> > indirectly affect SciPy should support for Accelerate be dropped?
>>>> >
>>>> > _______________________________________________
>>>> > NumPy-Discussion mailing list
>>>> > NumPy-Discussion at python.org
>>>> > https://mail.python.org/mailman/listinfo/numpy-discussion
>>>>
>>>> --
>>>> Nathaniel J. Smith -- https://vorpus.org
>>>> _______________________________________________
>>>> NumPy-Discussion mailing list
>>>> NumPy-Discussion at python.org
>>>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>>
>>> _______________________________________________
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion at python.org
>>> https://mail.python.org/mailman/listinfo/numpy-discussion

I added some more discussion, and some links to previous discussion on
the mailing list.

I also pointed to this PR :
https://github.com/MacPython/numpy-wheels/pull/1 - which builds
OpenBLAS wheels for numpy. The same kind of thing would work fine for
Scipy.

Cheers,

Matthew

From njs at pobox.com  Tue Jul 25 09:19:58 2017
From: njs at pobox.com (Nathaniel Smith)
Date: Tue, 25 Jul 2017 06:19:58 -0700
Subject: [Numpy-discussion] Dropping support for Accelerate
In-Reply-To: References:
Message-ID:

I updated the bit about OpenBLAS wheel with some more information on
the status of that work. It's not super important, but FYI.

I also want to disagree with this characterization of the
Accelerate/multiprocessing issue: "This problem was due to a bug in
multiprocessing and is fixed in Python 3.4 and later; Accelerate was POSIX
compliant but multiprocessing was not."

In 3.4 it became possible to *work around* this issue, but it requires
configuring the multiprocessing module in a non-default way, which means
that the common end-user experience is still that they try using
multiprocessing, and they get random hangs with no other feedback, and
then spend hours or days debugging before they discover this configuration
option.
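(For reference, on 3.4+ the workaround looks something like this:)

import multiprocessing as mp

if __name__ == "__main__":
    # the non-default configuration: use spawn instead of the default fork
    mp.set_start_method("spawn")
    with mp.Pool(2) as pool:
        print(pool.map(abs, [-1, -2, -3]))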
(And the problem occurs on MacOS only, so you get extra fun when e.g. a
module is developed on Windows or Linux and then you give it to a
less-technical collaborator on MacOS and it breaks on their computer and
you have no idea why.)

And the workaround is suboptimal -- fork()'s memory-sharing semantics are
very powerful. I've had cases where I could easily and efficiently solve a
problem using multiprocessing in fork() mode, but where enabling the
workaround for Accelerate would have made it impossible. (Specifically
this happened because I had a huge read-only data structure that I could
load once in the parent process, and then all the child processes could
share it through fork's virtual memory magic; I didn't have enough memory
to load two copies of it, yet fork let me have 10 or 20 virtual copies.)

Technically, yes, mixing threads and fork can't be done in a
POSIX-compliant manner. But no-one runs their code on an abstract POSIX
machine, and on actual systems it's totally possible to make this work
reliably. OpenBLAS does it. Users don't care if Apple is technically
correct, they just want their stuff to work.

-n

On Tue, Jul 25, 2017 at 4:57 AM, Matthew Brett wrote:
> Hi,
>
> On Sun, Jul 23, 2017 at 5:07 PM, Ilhan Polat wrote:
>> Ouch, that's from 2012 :( I'll add this thread as a reference to the
>> wiki list.
>>
>> On Sun, Jul 23, 2017 at 5:22 PM, Nathan Goldbaum wrote:
>>> See
>>> https://mail.scipy.org/pipermail/numpy-discussion/2012-August/063589.html
>>> and replies in that thread.
>>>
>>> Quote from an Apple engineer in that thread:
>>>
>>> "For API outside of POSIX, including GCD and technologies like
>>> Accelerate, we do not support usage on both sides of a fork(). For
>>> this reason among others, use of fork() without exec is discouraged in
>>> general in processes that use layers above POSIX."
>>>
>>> On Sun, Jul 23, 2017 at 10:16 AM, Ilhan Polat wrote:
>>>> That's probably because I know nothing about the issue; is there any
>>>> reference I can read about?
>>>>
>>>> But in general, please feel free to populate new items in the wiki
>>>> page.
>>>>
>>>> On Sun, Jul 23, 2017 at 11:15 AM, Nathaniel Smith wrote:
>>>>> I've been wishing we'd stop shipping Accelerate for years, because
>>>>> of how it breaks multiprocessing -- that doesn't seem to be on your
>>>>> list yet.
>>>>>
>>>>> On Sat, Jul 22, 2017 at 3:50 AM, Ilhan Polat wrote:
>>>>> > A few months ago, I had the innocent intention to wrap the LDLt
>>>>> > decomposition routines of LAPACK into SciPy, but then I was made
>>>>> > aware that the minimum required version of LAPACK/BLAS was due to
>>>>> > the Accelerate framework. Since then I've been following the core
>>>>> > SciPy team's and others' discussion on this issue.
>>>>> >
>>>>> > We have been exchanging opinions for quite a while now, within
>>>>> > various SciPy issues and PRs, about the ever-increasing
>>>>> > Accelerate-related issues, and I've compiled a brief summary of
>>>>> > the ongoing discussions to reduce the clutter.
>>>>> >
>>>>> > First, I would like to kindly invite everyone to contribute to and
>>>>> > sharpen the cases presented here
>>>>> >
>>>>> > https://github.com/scipy/scipy/wiki/Dropping-support-for-Accelerate
>>>>> >
>>>>> > The reason I specifically wanted to post this also on the NumPy
>>>>> > mailing list is to probe the situation from the NumPy-Accelerate
>>>>> > perspective. Is there any NumPy-specific problem that would
>>>>> > indirectly affect SciPy should support for Accelerate be dropped?
>>>>> >
>>>>> > _______________________________________________
>>>>> > NumPy-Discussion mailing list
>>>>> > NumPy-Discussion at python.org
>>>>> > https://mail.python.org/mailman/listinfo/numpy-discussion
>>>>>
>>>>> --
>>>>> Nathaniel J. Smith -- https://vorpus.org
>
> I added some more discussion, and some links to previous discussion on
> the mailing list.
>
> I also pointed to this PR :
> https://github.com/MacPython/numpy-wheels/pull/1 - which builds
> OpenBLAS wheels for numpy. The same kind of thing would work fine for
> Scipy.
>
> Cheers,
>
> Matthew
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion

--
Nathaniel J. Smith -- https://vorpus.org

From matthew.brett at gmail.com  Tue Jul 25 09:48:52 2017
From: matthew.brett at gmail.com (Matthew Brett)
Date: Tue, 25 Jul 2017 14:48:52 +0100
Subject: [Numpy-discussion] Dropping support for Accelerate
In-Reply-To: References:
Message-ID:

On Tue, Jul 25, 2017 at 2:19 PM, Nathaniel Smith wrote:
> I updated the bit about OpenBLAS wheel with some more information on
> the status of that work. It's not super important, but FYI.

Maybe remove the bit (of my text) that you crossed out, or remove the
strikethrough and qualify? At the moment it's confusing, because I
believe what I wrote is correct, so leaving it in there, crossed out,
looks kinda weird.

Cheers,

Matthew

From njs at pobox.com  Tue Jul 25 10:00:43 2017
From: njs at pobox.com (Nathaniel Smith)
Date: Tue, 25 Jul 2017 07:00:43 -0700
Subject: [Numpy-discussion] Dropping support for Accelerate
In-Reply-To: References:
Message-ID:

On Tue, Jul 25, 2017 at 6:48 AM, Matthew Brett wrote:
> On Tue, Jul 25, 2017 at 2:19 PM, Nathaniel Smith wrote:
>> I updated the bit about OpenBLAS wheel with some more information on
>> the status of that work.
It's not super important, but FYI. >> >> Maybe remove the bit (of my text) that you crossed out, or removed the >> strikethrough and qualify? At the moment it's confusing, because I >> believe what I wrote is correct, so leaving in there and crossed out >> looks kinda weird. > > Eh, it's a little weird because there's no specification needed > really, we can implement it any time we want to. It was stalled for a > long time because I ran into arcane technical problems dealing with > the MacOS linker, but that's solved and now it's just stalled due to > lack of attention. > > I deleted the text but feel free to qualify further if you think it's useful. Are you saying that we should consider this specification approved already? Or that we should go ahead without waiting for approval? I guess the latter. I guess you're saying you think there would be no bad consequences for doing this if the spec subsequently changed before being approved? It might be worth adding something like that to the text, in case there's somebody who wants to do some work on that. Cheers, Matthew From njs at pobox.com Tue Jul 25 10:14:25 2017 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 25 Jul 2017 07:14:25 -0700 Subject: [Numpy-discussion] Dropping support for Accelerate In-Reply-To: References: Message-ID: On Tue, Jul 25, 2017 at 7:05 AM, Matthew Brett wrote: > On Tue, Jul 25, 2017 at 3:00 PM, Nathaniel Smith wrote: >> On Tue, Jul 25, 2017 at 6:48 AM, Matthew Brett wrote: >>> On Tue, Jul 25, 2017 at 2:19 PM, Nathaniel Smith wrote: >>>> I updated the bit about OpenBLAS wheel with some more information on >>>> the status of that work. It's not super important, but FYI. >>> >>> Maybe remove the bit (of my text) that you crossed out, or removed the >>> strikethrough and qualify? At the moment it's confusing, because I >>> believe what I wrote is correct, so leaving in there and crossed out >>> looks kinda weird. >> >> Eh, it's a little weird because there's no specification needed >> really, we can implement it any time we want to. It was stalled for a >> long time because I ran into arcane technical problems dealing with >> the MacOS linker, but that's solved and now it's just stalled due to >> lack of attention. >> >> I deleted the text but feel free to qualify further if you think it's useful. > > Are you saying that we should consider this specification approved > already? Or that we should go ahead without waiting for approval? I > guess the latter. I guess you're saying you think there would be no > bad consequences for doing this if the spec subsequently changed > before being approved? It might be worth adding something like that > to the text, in case there's somebody who wants to do some work on > that. It's not a PEP. It will never be approved because there is no-one to approve it :-). The only reason for writing it as a spec is to potentially help coordinate with others who want to get in on making these kinds of packages themselves, and the main motivator for that will be if one of us starts doing it and proves it works... -n -- Nathaniel J. 
Smith -- https://vorpus.org From matthew.brett at gmail.com Tue Jul 25 10:23:53 2017 From: matthew.brett at gmail.com (Matthew Brett) Date: Tue, 25 Jul 2017 15:23:53 +0100 Subject: [Numpy-discussion] Dropping support for Accelerate In-Reply-To: References: Message-ID: On Tue, Jul 25, 2017 at 3:14 PM, Nathaniel Smith wrote: > On Tue, Jul 25, 2017 at 7:05 AM, Matthew Brett wrote: >> On Tue, Jul 25, 2017 at 3:00 PM, Nathaniel Smith wrote: >>> On Tue, Jul 25, 2017 at 6:48 AM, Matthew Brett wrote: >>>> On Tue, Jul 25, 2017 at 2:19 PM, Nathaniel Smith wrote: >>>>> I updated the bit about OpenBLAS wheel with some more information on >>>>> the status of that work. It's not super important, but FYI. >>>> >>>> Maybe remove the bit (of my text) that you crossed out, or removed the >>>> strikethrough and qualify? At the moment it's confusing, because I >>>> believe what I wrote is correct, so leaving in there and crossed out >>>> looks kinda weird. >>> >>> Eh, it's a little weird because there's no specification needed >>> really, we can implement it any time we want to. It was stalled for a >>> long time because I ran into arcane technical problems dealing with >>> the MacOS linker, but that's solved and now it's just stalled due to >>> lack of attention. >>> >>> I deleted the text but feel free to qualify further if you think it's useful. >> >> Are you saying that we should consider this specification approved >> already? Or that we should go ahead without waiting for approval? I >> guess the latter. I guess you're saying you think there would be no >> bad consequences for doing this if the spec subsequently changed >> before being approved? It might be worth adding something like that >> to the text, in case there's somebody who wants to do some work on >> that. > > It's not a PEP. It will never be approved because there is no-one to > approve it :-). Sure, but it is a pull-request, it hasn't been merged - so I assume that someone is expecting to make or receive more feedback on it. > The only reason for writing it as a spec is to > potentially help coordinate with others who want to get in on making > these kinds of packages themselves, and the main motivator for that > will be if one of us starts doing it and proves it works... If I had to guess, I'd guess that you are saying Yes to "no bad consequences" (above)? Would you mind adding something about that in the text to make it clear? Cheers, Matthew From jfoxrabinovitz at gmail.com Fri Jul 28 18:25:11 2017 From: jfoxrabinovitz at gmail.com (Joseph Fox-Rabinovitz) Date: Fri, 28 Jul 2017 18:25:11 -0400 Subject: [Numpy-discussion] ENH: ratio function to mimic diff Message-ID: I have created PR#9481 to introduce a `ratio` function that behaves very similarly to `diff`, except that it divides successive elements instead of subtracting them. It has some handling built in for zero division, as well as the ability to select between `/` and `//` operators. There is currently no masked version. Perhaps someone could suggest a simple mechanism for hooking np.ma.true_divide and np.ma.floor_divide in as the operators instead of the regular np.* versions. Please let me know your thoughts. Regards, -Joe -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ilhanpolat at gmail.com Sat Jul 29 06:26:19 2017 From: ilhanpolat at gmail.com (Ilhan Polat) Date: Sat, 29 Jul 2017 12:26:19 +0200 Subject: [Numpy-discussion] Dropping support for Accelerate In-Reply-To: References: Message-ID: Yet another twirl to the existing spaghetti https://www.continuum.io/blog/developer-blog/open-sourcing-anaconda-accelerate On Tue, Jul 25, 2017 at 4:23 PM, Matthew Brett wrote: > On Tue, Jul 25, 2017 at 3:14 PM, Nathaniel Smith wrote: > > On Tue, Jul 25, 2017 at 7:05 AM, Matthew Brett > wrote: > >> On Tue, Jul 25, 2017 at 3:00 PM, Nathaniel Smith wrote: > >>> On Tue, Jul 25, 2017 at 6:48 AM, Matthew Brett < > matthew.brett at gmail.com> wrote: > >>>> On Tue, Jul 25, 2017 at 2:19 PM, Nathaniel Smith > wrote: > >>>>> I updated the bit about OpenBLAS wheel with some more information on > >>>>> the status of that work. It's not super important, but FYI. > >>>> > >>>> Maybe remove the bit (of my text) that you crossed out, or removed the > >>>> strikethrough and qualify? At the moment it's confusing, because I > >>>> believe what I wrote is correct, so leaving in there and crossed out > >>>> looks kinda weird. > >>> > >>> Eh, it's a little weird because there's no specification needed > >>> really, we can implement it any time we want to. It was stalled for a > >>> long time because I ran into arcane technical problems dealing with > >>> the MacOS linker, but that's solved and now it's just stalled due to > >>> lack of attention. > >>> > >>> I deleted the text but feel free to qualify further if you think it's > useful. > >> > >> Are you saying that we should consider this specification approved > >> already? Or that we should go ahead without waiting for approval? I > >> guess the latter. I guess you're saying you think there would be no > >> bad consequences for doing this if the spec subsequently changed > >> before being approved? It might be worth adding something like that > >> to the text, in case there's somebody who wants to do some work on > >> that. > > > > It's not a PEP. It will never be approved because there is no-one to > > approve it :-). > > Sure, but it is a pull-request, it hasn't been merged - so I assume > that someone is expecting to make or receive more feedback on it. > > > The only reason for writing it as a spec is to > > potentially help coordinate with others who want to get in on making > > these kinds of packages themselves, and the main motivator for that > > will be if one of us starts doing it and proves it works... > > If I had to guess, I'd guess that you are saying Yes to "no bad > consequences" (above)? Would you mind adding something about that in > the text to make it clear? > > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From matthew.brett at gmail.com  Sat Jul 29 06:34:48 2017
From: matthew.brett at gmail.com (Matthew Brett)
Date: Sat, 29 Jul 2017 11:34:48 +0100
Subject: [Numpy-discussion] Dropping support for Accelerate
In-Reply-To: References:
Message-ID:

Hi,

On Sat, Jul 29, 2017 at 11:26 AM, Ilhan Polat wrote:
> Yet another twirl to the existing spaghetti
>
> https://www.continuum.io/blog/developer-blog/open-sourcing-anaconda-accelerate

Just to avoid some obvious confusion, Anaconda Accelerate has nothing
to do with macOS Accelerate:

"""
Accelerate currently is composed of three different feature sets:

Python wrappers around NVIDIA GPU libraries for linear algebra, FFT,
sparse matrix operations, sorting and searching.
Python wrappers around some of Intel's MKL Vector Math Library functions
A "data profiler" tool based on cProfile and SnakeViz.
"""

Cheers,

Matthew

From matthew.brett at gmail.com  Sat Jul 29 06:37:51 2017
From: matthew.brett at gmail.com (Matthew Brett)
Date: Sat, 29 Jul 2017 11:37:51 +0100
Subject: [Numpy-discussion] Dropping support for Accelerate
In-Reply-To: References:
Message-ID:

On Sat, Jul 29, 2017 at 11:34 AM, Matthew Brett wrote:
> Hi,
>
> On Sat, Jul 29, 2017 at 11:26 AM, Ilhan Polat wrote:
>> Yet another twirl to the existing spaghetti
>>
>> https://www.continuum.io/blog/developer-blog/open-sourcing-anaconda-accelerate
>
> Just to avoid some obvious confusion, Anaconda Accelerate has nothing
> to do with macOS Accelerate:

Sorry - I didn't mean to imply that anyone was confused apart from me :)

Cheers2,

Matthew

From ilhanpolat at gmail.com  Sat Jul 29 07:17:00 2017
From: ilhanpolat at gmail.com (Ilhan Polat)
Date: Sat, 29 Jul 2017 13:17:00 +0200
Subject: [Numpy-discussion] Dropping support for Accelerate
In-Reply-To: References:
Message-ID:

If it can confuse you, imagine what would happen to regular users like me.
That's why I wanted to mention in advance that this might also need some
sort of a "No, this is not related to Anaconda Accelerate" disclaimer
somewhere, if need be.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From shoyer at gmail.com  Sat Jul 29 12:22:25 2017
From: shoyer at gmail.com (Stephan Hoyer)
Date: Sat, 29 Jul 2017 16:22:25 +0000
Subject: [Numpy-discussion] ENH: ratio function to mimic diff
In-Reply-To: References:
Message-ID:

This is an interesting idea, but I don't understand the use cases for this
function. In particular, what would you use n-th order ratios for?

One use case I can think of is estimating the slope of a log-scaled plot.
But here exp(diff(log(x))) is an easy substitute.
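For positive values the two agree exactly, e.g.:

import numpy as np

x = np.array([1.0, 2.0, 8.0, 4.0])

np.exp(np.diff(np.log(x)))   # array([ 2. ,  4. ,  0.5])
x[1:] / x[:-1]               # the same; presumably what ratio(x) would return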
I guess ratio() would work in cases where values are both positive and
negative, but again I don't know when that would be useful. If your signal
crosses zero, ratios are likely to diverge.

On Fri, Jul 28, 2017 at 3:25 PM Joseph Fox-Rabinovitz <
jfoxrabinovitz at gmail.com> wrote:

> I have created PR#9481 to introduce a `ratio` function that behaves very
> similarly to `diff`, except that it divides successive elements instead
> of subtracting them. It has some handling built in for zero division, as
> well as the ability to select between `/` and `//` operators.
>
> There is currently no masked version. Perhaps someone could suggest a
> simple mechanism for hooking np.ma.true_divide and np.ma.floor_divide in
> as the operators instead of the regular np.* versions.
>
> Please let me know your thoughts.
>
> Regards,
>
> -Joe
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From njs at pobox.com  Sat Jul 29 12:54:18 2017
From: njs at pobox.com (Nathaniel Smith)
Date: Sat, 29 Jul 2017 09:54:18 -0700
Subject: [Numpy-discussion] ENH: ratio function to mimic diff
In-Reply-To: References:
Message-ID:

I'd also like to see a more detailed motivation for this. And, if it is
useful, then that would make 3 operations that have special-case pairwise
moving-window variants (subtract, floor_divide, true_divide). 3 is a lot
of special cases. Should there instead be a generic mechanism for doing
this for arbitrary binary operations?

-n

On Jul 28, 2017 3:25 PM, "Joseph Fox-Rabinovitz" wrote:

> I have created PR#9481 to introduce a `ratio` function that behaves very
> similarly to `diff`, except that it divides successive elements instead
> of subtracting them. It has some handling built in for zero division, as
> well as the ability to select between `/` and `//` operators.
>
> There is currently no masked version. Perhaps someone could suggest a
> simple mechanism for hooking np.ma.true_divide and np.ma.floor_divide in
> as the operators instead of the regular np.* versions.
>
> Please let me know your thoughts.
>
> Regards,
>
> -Joe
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jfoxrabinovitz at gmail.com  Sat Jul 29 19:25:32 2017
From: jfoxrabinovitz at gmail.com (Joseph Fox-Rabinovitz)
Date: Sat, 29 Jul 2017 19:25:32 -0400
Subject: [Numpy-discussion] ENH: ratio function to mimic diff
In-Reply-To: References:
Message-ID:

On Jul 29, 2017 12:23, "Stephan Hoyer" wrote:

This is an interesting idea, but I don't understand the use cases for this
function. In particular, what would you use n-th order ratios for?

There is no good use case for the nth order differences that I am aware
of. I just added that to mimic the way diff works.

One use case I can think of is estimating the slope of a log-scaled plot.
But here exp(diff(log(x))) is an easy substitute.

My original motivation was very similar to that. I was looking for the
largest geometric gap in a sorted sequence of numbers. Taking logs and
exponents seemed like a sledgehammer for that task.

I guess ratio() would work in cases where values are both positive and
negative, but again I don't know when that would be useful. If your signal
crosses zero, ratios are likely to diverge.

They would, but looking for sign changes is easy, and I added an argument
to flag actual zeros.

On Fri, Jul 28, 2017 at 3:25 PM Joseph Fox-Rabinovitz <
jfoxrabinovitz at gmail.com> wrote:

> I have created PR#9481 to introduce a `ratio` function that behaves very
> similarly to `diff`, except that it divides successive elements instead
> of subtracting them. It has some handling built in for zero division, as
> well as the ability to select between `/` and `//` operators.
>
> There is currently no masked version. Perhaps someone could suggest a
> simple mechanism for hooking np.ma.true_divide and np.ma.floor_divide in
> as the operators instead of the regular np.* versions.
>
> Please let me know your thoughts.
> > Regards, > > -Joe > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at python.org https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From jfoxrabinovitz at gmail.com Sat Jul 29 19:32:37 2017 From: jfoxrabinovitz at gmail.com (Joseph Fox-Rabinovitz) Date: Sat, 29 Jul 2017 19:32:37 -0400 Subject: [Numpy-discussion] ENH: ratio function to mimic diff In-Reply-To: References: Message-ID: On Jul 29, 2017 12:55, "Nathaniel Smith" wrote: I'd also like to see a more detailed motivation for this. And, if it is useful, then that would make 3 operations that have special case pairwise moving window variants (subtract, floor_divide, true_divide). 3 is a lot of special cases. Should there instead be a generic mechanism for doing this for arbitrary binary operations? Perhaps another method for ufuncs of two arguments? I agree that there should be a generic mechanism since a lack of one is what is preventing me from applying this to masked arrays immediately. It would have to take in some domain filter, like many of the translated masked functions do. A ufunc could provide that transparently. -n On Jul 28, 2017 3:25 PM, "Joseph Fox-Rabinovitz" wrote: > I have created PR#9481 to introduce a `ratio` function that behaves very > similarly to `diff`, except that it divides successive elements instead of > subtracting them. It has some handling built in for zero division, as well > as the ability to select between `/` and `//` operators. > > There is currently no masked version. Perhaps someone could suggest a > simple mechanism for hooking np.ma.true_divide and np.ma.floor_divide in as > the operators instead of the regular np.* versions. > > Please let me know your thoughts. > > Regards, > > -Joe > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at python.org https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: