From chaoyuejoy at gmail.com  Thu Dec  1 02:56:41 2011
From: chaoyuejoy at gmail.com (Chao YUE)
Date: Thu, 1 Dec 2011 08:56:41 +0100
Subject: [Numpy-discussion] what statistical module to use for python?

thanks, I should have done that but I forgot....

chao

2011/12/1 <josef.pktd at gmail.com>
> On Wed, Nov 30, 2011 at 1:16 PM, Chao YUE wrote:
> > Hi all,
> >
> > I just want to ask broadly what statistical packages you are all using. I
> > mean routine statistical functions like linear regression, GLM, ANOVA... etc.
> >
> > I know there are SciKits packages like statsmodels, but are there more
> > general and complete ones?
> >
> > thanks to all,
>
> I forwarded it to the scipy-user mailing list since that is more suitable.
>
> Josef
>
> > Chao

-- 
Chao YUE
Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL)
UMR 1572 CEA-CNRS-UVSQ
Batiment 712 - Pe 119
91191 GIF Sur YVETTE Cedex
Tel: (33) 01 69 08 29 02; Fax: 01.69.08.77.16

From pav at iki.fi  Thu Dec  1 05:20:08 2011
From: pav at iki.fi (Pauli Virtanen)
Date: Thu, 01 Dec 2011 11:20:08 +0100
Subject: [Numpy-discussion] np.dot and array order

On 01.12.2011 03:31, josef.pktd at gmail.com wrote:
[clip]
> I thought np.dot is Lapack based and favors fortran order, but if the
> second array is fortran ordered, then dot takes twice as long.

It uses C-LAPACK, and will make copies if the arrays are not in C-order.

-- 
Pauli Virtanen
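A quick way to see the copy overhead Pauli describes - a minimal sketch; the
shapes are arbitrary and the timings purely illustrative:

import numpy as np
from timeit import timeit

a = np.random.rand(1000, 1000)      # C-ordered
b_c = np.random.rand(1000, 1000)    # C-ordered
b_f = np.asfortranarray(b_c)        # same values, Fortran-ordered

# the second call should be slower, since dot first copies b_f to C order
t_c = timeit(lambda: np.dot(a, b_c), number=10)
t_f = timeit(lambda: np.dot(a, b_f), number=10)
print(t_c)
print(t_f)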
From pierre.haessig at crans.org  Thu Dec  1 06:17:42 2011
From: pierre.haessig at crans.org (Pierre Haessig)
Date: Thu, 01 Dec 2011 12:17:42 +0100
Subject: [Numpy-discussion] Apparently non-deterministic behaviour of complex array multiplication

On 01/12/2011 02:44, Karl Kappler wrote:
> Also note that I have had a similar problem with much smaller arrays,
> say 24 x 3076

Hi Karl,
Could you post a self-contained code with such a "small" array (or even
smaller - the smaller, the better...) so that we can run it and play with it?
-- 
Pierre

From thouis at gmail.com  Thu Dec  1 08:52:03 2011
From: thouis at gmail.com (Thouis (Ray) Jones)
Date: Thu, 1 Dec 2011 14:52:03 +0100
Subject: [Numpy-discussion] numpy.array() of mixed integers and strings can truncate data

Is this expected behavior?

>>> np.array([-345,4,2,'ABC'])
array(['-34', '4', '2', 'ABC'], dtype='|S3')

>>> np.version.full_version
'1.6.1'
>>> np.version.git_revision
'68538b74483009c2c2d1644ef00397014f95a696'

Ray Jones

From pierre.haessig at crans.org  Thu Dec  1 09:47:47 2011
From: pierre.haessig at crans.org (Pierre Haessig)
Date: Thu, 01 Dec 2011 15:47:47 +0100
Subject: [Numpy-discussion] numpy.array() of mixed integers and strings can truncate data

On 01/12/2011 14:52, Thouis (Ray) Jones wrote:
> Is this expected behavior?
>
>>>> np.array([-345,4,2,'ABC'])
> array(['-34', '4', '2', 'ABC'], dtype='|S3')

With my numpy 1.5.1 I indeed get a different result:

In [1]: np.array([-345,4,2,'ABC'])
Out[1]:
array(['-345', '4', '2', 'ABC'],
      dtype='|S8')

The type casting is a bit different, and may actually better match what
you expect, but a cast is still required (i.e. you cannot have a
numpy.array() of mixed integers and strings, because numpy arrays only
store *homogeneous* sets of data).

Now one question remains for me: why use a numpy array to store a few
strings, and not just a regular Python list?

Best,
Pierre

From thouis.jones at curie.fr  Thu Dec  1 09:53:32 2011
From: thouis.jones at curie.fr (Thouis Jones)
Date: Thu, 1 Dec 2011 15:53:32 +0100
Subject: [Numpy-discussion] numpy.array() of mixed integers and strings can truncate data

On Thu, Dec 1, 2011 at 15:47, Pierre Haessig wrote:
> With my numpy 1.5.1 I indeed get a different result:
> [clip]

This is closer to what I would expect.

> (i.e. you cannot have a numpy.array() of mixed integers and strings,
> because numpy arrays only store *homogeneous* sets of data)

Of course, but when converting from a non-homogeneous python list, I
would expect it to do something reasonable (or at least not as bad as
turning -345 into '-34').

> Now one question remains for me: why use a numpy array to store a few
> strings, and not just a regular Python list?

It was a small test case.  The actual data is much larger.

Ray Jones
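If the goal is just to hold mixed values without truncation, an object array
is the usual escape hatch - a small sketch (it trades away fast numeric
operations for fidelity):

import numpy as np

a = np.array([-345, 4, 2, 'ABC'], dtype=object)
# every element keeps its original Python type; nothing is truncated:
# array([-345, 4, 2, 'ABC'], dtype=object)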
From ben.root at ou.edu  Thu Dec  1 10:29:39 2011
From: ben.root at ou.edu (Benjamin Root)
Date: Thu, 1 Dec 2011 09:29:39 -0600
Subject: [Numpy-discussion] numpy.array() of mixed integers and strings can truncate data

On Thursday, December 1, 2011, Thouis Jones wrote:
> [clip]

This is total speculation on my part.  My suspicion is that the loading
process sees numbers and starts casting in that manner, then it sees the
string and realizes that it has to cast everything to a fixed-width
string.  The width is determined as the width of the longest string.
Since -345 was already processed as a number, the length of its string
representation is never considered.

Does the same problem occur if -345 comes after "ABC"?

Ben Root

From thouis.jones at curie.fr  Thu Dec  1 10:43:52 2011
From: thouis.jones at curie.fr (Thouis Jones)
Date: Thu, 1 Dec 2011 16:43:52 +0100
Subject: [Numpy-discussion] numpy.array() of mixed integers and strings can truncate data

On Thu, Dec 1, 2011 at 16:29, Benjamin Root wrote:
> Does the same problem occur if -345 comes after "ABC"?

Yes.

>>> np.array(list(reversed([-345,4,2,'ABC'])))
array(['ABC', '2', '4', '-34'], dtype='|S3')

From charlesr.harris at gmail.com  Thu Dec  1 11:39:51 2011
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 1 Dec 2011 09:39:51 -0700
Subject: [Numpy-discussion] numpy.array() of mixed integers and strings can truncate data

On Thu, Dec 1, 2011 at 6:52 AM, Thouis (Ray) Jones wrote:
> Is this expected behavior?
>
> >>> np.array([-345,4,2,'ABC'])
> array(['-34', '4', '2', 'ABC'], dtype='|S3')

Given that strings should be the result, this looks like a bug. It's a bit
of a corner case that probably slipped through during the recent work on
casting. There need to be tests for these sorts of things, so if you find
more oddities, post them so we can add them.

Chuck
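A sketch of the kind of regression test Chuck is asking for, assuming the
desired behavior is that no element's string form gets truncated:

import numpy as np

def test_mixed_int_string_no_truncation():
    for seq in ([-345, 4, 2, 'ABC'], ['ABC', 2, 4, -345]):
        a = np.array(seq)
        # every original value should round-trip through the string array
        for orig, converted in zip(seq, a):
            assert str(orig) == converted, (orig, converted)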
From jonas.ruebsam at web.de  Thu Dec  1 11:52:35 2011
From: jonas.ruebsam at web.de (jonasr)
Date: Thu, 1 Dec 2011 08:52:35 -0800 (PST)
Subject: [Numpy-discussion] build numpy matrix out of smaller matrix

Hi,
is there any possibility to define a numpy matrix via a smaller given
matrix? I.e. in matlab I can do this like

a=[1 2 ; 3 4 ]

A=[a a ; a a ]

so that I finally get

A=[ [1,2,1,2]
    [3,4,3,4]
    [1,2,1,2]
    [3,4,3,4]]

I tried different things in numpy which didn't work.
Any ideas?

thank you

From derek at astro.physik.uni-goettingen.de  Thu Dec  1 12:15:18 2011
From: derek at astro.physik.uni-goettingen.de (Derek Homeier)
Date: Thu, 1 Dec 2011 18:15:18 +0100
Subject: [Numpy-discussion] numpy.array() of mixed integers and strings can truncate data

On 1 Dec 2011, at 17:39, Charles R Harris wrote:
> Given that strings should be the result, this looks like a bug. [clip]

As it is not dependent on the string appearing before or after the numbers,
numerical values appear to always be processed before any string
transformation, even if you explicitly specify the string format - consider
the following (1.6.1):

>>> np.array((2, 12, 0.1+2j))
array([  2.0+0.j,  12.0+0.j,   0.1+2.j])
>>> np.array((2, 12, 0.001+2j))
array([  2.00000000e+00+0.j,   1.20000000e+01+0.j,   1.00000000e-03+2.j])
>>> np.array((2, 12, 0.001+2j), dtype='|S8')
array(['2', '12', '(0.001+2'], dtype='|S8')

- notice the last value is only truncated because it had first been
converted into a "standard" complex representation, so maybe the problem is
already in the way Python treats the input.

Cheers,
Derek

From ben.root at ou.edu  Thu Dec  1 12:26:43 2011
From: ben.root at ou.edu (Benjamin Root)
Date: Thu, 1 Dec 2011 11:26:43 -0600
Subject: [Numpy-discussion] build numpy matrix out of smaller matrix

On Thu, Dec 1, 2011 at 10:52 AM, jonasr wrote:
> [clip]

numpy.tile() might be what you are looking for.

Cheers!
Ben Root

From e.antero.tammi at gmail.com  Thu Dec  1 12:28:40 2011
From: e.antero.tammi at gmail.com (eat)
Date: Thu, 1 Dec 2011 19:28:40 +0200
Subject: [Numpy-discussion] build numpy matrix out of smaller matrix

Hi,

On Thu, Dec 1, 2011 at 6:52 PM, jonasr wrote:
> [clip]

Perhaps something like this:

In []: a= np.array([[1, 2], [3, 4]])
In []: np.c_[[a, a], [a, a]]
Out[]:
array([[[1, 2, 1, 2],
        [3, 4, 3, 4]],
       [[1, 2, 1, 2],
        [3, 4, 3, 4]]])

Regards,
eat
From josef.pktd at gmail.com  Thu Dec  1 13:16:15 2011
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 1 Dec 2011 13:16:15 -0500
Subject: [Numpy-discussion] build numpy matrix out of smaller matrix

On Thu, Dec 1, 2011 at 12:26 PM, Benjamin Root wrote:
> numpy.tile() might be what you are looking for.

or np.kron, which is my favorite tile-and-repeat replacement:

>>> a = np.array([[1, 2], [3, 4]])
>>> np.kron(np.ones((2,2)), a)
array([[ 1.,  2.,  1.,  2.],
       [ 3.,  4.,  3.,  4.],
       [ 1.,  2.,  1.,  2.],
       [ 3.,  4.,  3.,  4.]])

>>> np.kron(a, np.ones((2,2)))
array([[ 1.,  1.,  2.,  2.],
       [ 1.,  1.,  2.,  2.],
       [ 3.,  3.,  4.,  4.],
       [ 3.,  3.,  4.,  4.]])

Josef

From e.antero.tammi at gmail.com  Thu Dec  1 13:41:16 2011
From: e.antero.tammi at gmail.com (eat)
Date: Thu, 1 Dec 2011 20:41:16 +0200
Subject: [Numpy-discussion] build numpy matrix out of smaller matrix

Oops, a slightly incorrect answer on my part; my intention was more along
the lines of:

In []: a= np.array([[1, 2], [3, 4]])
In []: np.c_[[a, a], [a, a]].reshape(4, 4)
Out[]:
array([[1, 2, 1, 2],
       [3, 4, 3, 4],
       [1, 2, 1, 2],
       [3, 4, 3, 4]])

[clip]
But of course the np.kron approach is way more generic (and preferable) to
utilize.

eat
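For the block-matrix layout jonasr asked about, np.bmat is another option
worth noting (it was not mentioned in the thread): it takes a nested list of
blocks and returns a numpy matrix directly:

import numpy as np

a = np.array([[1, 2], [3, 4]])
A = np.bmat([[a, a], [a, a]])
# matrix([[1, 2, 1, 2],
#         [3, 4, 3, 4],
#         [1, 2, 1, 2],
#         [3, 4, 3, 4]])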
From magnetotellurics at gmail.com  Thu Dec  1 15:17:19 2011
From: magnetotellurics at gmail.com (kneil)
Date: Thu, 1 Dec 2011 12:17:19 -0800 (PST)
Subject: [Numpy-discussion] Apparently non-deterministic behaviour of complex array multiplication

Hi Olivier, indeed that was a typo - I should have used cut and paste.
I was using .transpose().

Olivier Delalleau-2 wrote:
>
> I guess it's just a typo on your part, but just to make sure, you are
> using .transpose(), not .transpose, correct?
>
> -=- Olivier
>
> 2011/11/30 Karl Kappler
>
>> Hello,
>> I am somewhat new to scipy/numpy so please point me in the right
>> direction if I am posting to an incorrect forum.
>>
>> The experience which has prompted my post is the following:
>> I have a numpy array Y where the elements of Y are
>> type(Y[0,0])
>> Out[709]:
>>
>> The absolute values of the real and complex parts do not far exceed,
>> say, 1e-10. The shape of Y is (24, 49218).
>> When I perform the operation C = dot(Y,Y.conj().transpose), i.e. I form
>> the covariance matrix by multiplying Y by its conjugate transpose, I
>> sometimes get NaN in the array C.
>>
>> I can imagine some reasons why this may happen, but what is truly
>> puzzling to me is that I will be working in ipython and will execute,
>> for example, find(isnan(C)) and will be returned a list of elements of
>> C which are NaN - fine - but then I recalculate C, repeat the
>> find(isnan(C)) command, and I get a different answer.
>>
>> I type:
>> find(isnan(dot(Y,Y.conj().transpose)))
>> and an empty array is returned.  Repeated calls of the same command
>> however result in a non-empty array.  In fact, the sequence of arrays
>> returned from several consecutive calls varies.  Sometimes there are
>> tens of NaNs, sometimes none.
>>
>> I have been performing a collection of experiments for some hours and
>> cannot get to the bottom of this.
>> Some things I have tried:
>> 1. Cast the array Y as a matrix X and calculate X*X.H --- in this case
>> I get the same result, in that sometimes I have NaN and sometimes I do not.
>> 2. set A=X.H and calculate X*A --- same results*
>> 3. set b=A.copy() and calc X*b --- same results*.
>> 4. find(isnan(real(X*X.H))) --- same results*
>> 5. find(isnan(real(X)*real(X.H))) - no NaN appear
>>
>> *N.B. "Same results" does not mean that the same indices were going
>> NaN, simply that I was getting back a different result if I ran the
>> command say a dozen times.
>>
>> So it would seem that it has something to do with the complex
>> multiplication.  I am wondering if there is too much dynamic range being
>> used in the calculation?  It absolutely amazes me that I can perform the
>> same complex-arithmetic operation sitting at the command line and obtain
>> different results each time.  In one case I ran a for loop where I
>> performed the multiplication 1000 times and found that 694 trials had no
>> NaN and 306 trials had NaN.
>>
>> Saving X to file and then reloading it in a new ipython interpreter
>> typically resulted in no NaN.
>>
>> For a fixed interpreter and instance of X or Y, the indices which go
>> NaN (when they do) sometimes repeat many times and sometimes they vary
>> apparently at random.
>>
>> Also note that I have had a similar problem with much smaller arrays,
>> say 24 x 3076.
>>
>> I have also tried 'upping' the numpy array to complex256; I have like
>> 12GB of RAM...
>>
>> This happens both in ipython and when I call my function from the
>> command line.
>>
>> Does this sound familiar to anyone? Is my machine possessed?
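A compact way to reproduce Karl's 1000-trial experiment - a sketch in which
the random Y is only a stand-in with the shape and scale he describes:

import numpy as np

# stand-in for Karl's data: (24, 49218) complex values around 1e-10
Y = (np.random.randn(24, 49218) + 1j * np.random.randn(24, 49218)) * 1e-10

# count how many repeated products contain at least one NaN;
# on healthy hardware and BLAS this should always print 0
N = 1000
bad = sum(np.isnan(np.dot(Y, Y.conj().T)).any() for _ in range(N))
print(bad)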
From Chris.Barker at noaa.gov  Thu Dec  1 15:35:00 2011
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Thu, 01 Dec 2011 12:35:00 -0800
Subject: [Numpy-discussion] numpy.array() of mixed integers and strings can truncate data

On 12/1/2011 9:15 AM, Derek Homeier wrote:
>>>> np.array((2, 12, 0.001+2j), dtype='|S8')
> array(['2', '12', '(0.001+2'], dtype='|S8')
>
> - notice the last value is only truncated because it had first been
> converted into a "standard" complex representation, so maybe the problem
> is already in the way Python treats the input.

No - it's truncated because you've specified an 8-character string, and
the string representation of that complex value is longer than that. I
assume numpy is using the object's __str__ or __repr__:

In [13]: str(0.001+2j)
Out[13]: '(0.001+2j)'

In [14]: repr(0.001+2j)
Out[14]: '(0.001+2j)'

I think the only bug we've identified here is that numpy is selecting the
string size based on the longest string input, rather than also checking
how long the string representation of the numeric input is. If there is a
long-enough string in there, it works fine:

In [15]: np.array([-345,4,2,'ABC', 'abcde'])
Out[15]:
array(['-345', '4', '2', 'ABC', 'abcde'],
      dtype='|S5')

An open question is what it should do if you specify the length of the
string dtype but one of the values can't fit into that size. At this point
it truncates, but should it raise an error?

-Chris

-- 
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R
Seattle, WA 98115
Chris.Barker at noaa.gov
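Until that's fixed, one way to guard against the truncation Chris describes
is to size the dtype yourself from the str() of every element - a small
sketch:

import numpy as np

seq = [-345, 4, 2, 'ABC']
width = max(len(str(el)) for el in seq)          # 4, from '-345'
a = np.array([str(el) for el in seq], dtype='S%d' % width)
# array(['-345', '4', '2', 'ABC'], dtype='|S4')  -- nothing truncated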
From magnetotellurics at gmail.com  Thu Dec  1 15:47:11 2011
From: magnetotellurics at gmail.com (kneil)
Date: Thu, 1 Dec 2011 12:47:11 -0800 (PST)
Subject: [Numpy-discussion] Apparently non-deterministic behaviour of complex array multiplication

Hi Pierre,
I was thinking about uploading some examples, but strangely, when I store
the array using for example np.save('Y',Y) and then reload it in a new
workspace, I find that the problem does not reproduce. It would seem
somehow to be associated with the 'overhead' of the workspace I am in...

The context here is that I read in 24 files, totaling about 7GB, and then
form data matrices of size 24 x N, where N varies. I tried for example this
morning to run the same code, but working with only 12 of the files - just
to see if NaNs appeared. No NaN appeared however when the machine was being
less 'taxed'.

Strangely enough, I also seterr(all='raise') in the workspace before
executing this (in the case where I read all 24 files and do get NaN), and
I do not get any messages about the NaN while the calculation is taking
place.

If you want to play with this I would be willing to put the data on a
file-sharing site (it's around 7.1GB of data) together with the code, and
you could play with it from there. The code is not too many lines - under
100 lines - and I am sure I could trim it down from there.

Let me know if you are interested.
cheers,
K

Pierre Haessig-2 wrote:
> [clip]
> Could you post a self-contained code with such a "small" array (or even
> smaller - the smaller, the better...) so that we can run it and play
> with it?
> --
> Pierre
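One possible reason seterr stays silent here: np.seterr governs NumPy's own
floating-point error handling in its ufunc loops, and a BLAS-backed dot()
may not route through that machinery at all - an assumption about this
NumPy build, not something verified here. An explicit post-hoc check is
more direct:

import numpy as np

np.seterr(all='raise')   # will not necessarily trap NaNs coming out of dot()
# stand-in data with the smaller shape Karl mentions (24 x 3076)
Y = (np.random.randn(24, 3076) + 1j * np.random.randn(24, 3076)) * 1e-10
C = np.dot(Y, Y.conj().T)
if not np.isfinite(C).all():             # explicit check, independent of seterr
    raise FloatingPointError("NaN/Inf in covariance matrix")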
From derek at astro.physik.uni-goettingen.de  Thu Dec  1 16:29:15 2011
From: derek at astro.physik.uni-goettingen.de (Derek Homeier)
Date: Thu, 1 Dec 2011 22:29:15 +0100
Subject: [Numpy-discussion] numpy.array() of mixed integers and strings can truncate data

On 1 Dec 2011, at 21:35, Chris Barker wrote:
> No - it's truncated because you've specified an 8-character string, and
> the string representation of that complex value is longer than that.
> [clip]

That's what I meant by the "Python side" of the issue, but you're right,
there is no numerical conversion involved.

> I think the only bug we've identified here is that numpy is selecting
> the string size based on the longest string input [clip]
>
> An open question is what it should do if you specify the length of the
> string dtype but one of the values can't fit into that size. At this
> point it truncates, but should it raise an error?

I would probably raise a warning rather than an error - I think if the
user explicitly specifies a string length, they should be aware that the
data might be truncated (and might even want this behaviour).

Another "issue" could be that the string representation can look quite
different from what has been typed in, like

In [95]: np.array(('abcdefg', 12, 0.00001+2j), dtype='|S12')
Out[95]:
array(['abcdefg', '12', '(1e-05+2j)'], dtype='|S12')

but then I think one has to accept that 0.00001+2j is not a string and
thus cannot be guaranteed to be represented in that exact way - it can
either be understood as a numerical object or not at all (i.e. one should
just type it in as a string - with quotes - if one wants string behaviour).

Cheers,
Derek

From jkington at wisc.edu  Thu Dec  1 17:19:51 2011
From: jkington at wisc.edu (Joe Kington)
Date: Thu, 01 Dec 2011 16:19:51 -0600
Subject: [Numpy-discussion] Apparently non-deterministic behaviour of complex array multiplication

On Thu, Dec 1, 2011 at 2:47 PM, kneil wrote:
> [clip - intermittent NaNs that disappear on a less 'taxed' machine]

Are you using non-ECC RAM, by chance? (Though if you have >4GB of RAM, I
can't imagine that you wouldn't be using ECC...)

Alternately, have you run memtest lately? That sounds suspiciously like
bad RAM...
From millman at berkeley.edu  Thu Dec  1 23:01:02 2011
From: millman at berkeley.edu (Jarrod Millman)
Date: Thu, 1 Dec 2011 20:01:02 -0800
Subject: [Numpy-discussion] scipy.org still says source in some subversion repo -- should be git !?

On Mon, Nov 28, 2011 at 1:19 PM, Matthew Brett wrote:
> Maybe the content could be put in
> http://github.com/scipy/scipy.github.com so we can make pull requests
> there?

The source is here:
  https://github.com/scipy/scipy.org-new

From matthew.brett at gmail.com  Thu Dec  1 23:25:02 2011
From: matthew.brett at gmail.com (Matthew Brett)
Date: Thu, 1 Dec 2011 20:25:02 -0800
Subject: [Numpy-discussion] scipy.org still says source in some subversion repo -- should be git !?

Yo,

On Thu, Dec 1, 2011 at 8:01 PM, Jarrod Millman wrote:
> The source is here:
>   https://github.com/scipy/scipy.org-new

Are you then the person to ask about merging pull requests and uploading
the docs?

See you (literally),
Matthew
From magnetotellurics at gmail.com  Thu Dec  1 23:46:40 2011
From: magnetotellurics at gmail.com (kneil)
Date: Thu, 1 Dec 2011 20:46:40 -0800 (PST)
Subject: [Numpy-discussion] Apparently non-deterministic behaviour of complex array multiplication

Hi Pierre,
I confirmed with the guy who put together the machine that it is non-ECC
RAM. You know, now that I think about it, this machine seems to crash a
fair amount more often than its identical twin which sits on a desk near
me. I researched memtest a bit... downloaded and compiled it, but I do not
quite understand the finer points of using it... it seems that I want to
remove my RAM cards and test them one at a time. Do you know a good
reference for using it?

I think at this point the best thing to do will be to dump my data/code to
a portable HDD and load it on the other computer with the same specs as
this one. If it runs without generating any NaN then I will proceed to a
full memtest.

Thanks for the advice.
-Karl

Joe Kington-2 wrote:
> [clip]
> Are you using non-ECC RAM, by chance? (Though if you have >4GB of RAM, I
> can't imagine that you wouldn't be using ECC...)
>
> Alternately, have you run memtest lately? That sounds suspiciously like
> bad RAM...

From thouis at gmail.com  Fri Dec  2 10:23:46 2011
From: thouis at gmail.com (Thouis (Ray) Jones)
Date: Fri, 2 Dec 2011 16:23:46 +0100
Subject: [Numpy-discussion] numpy.array() of mixed integers and strings can truncate data

On Thu, Dec 1, 2011 at 17:39, Charles R Harris wrote:
> Given that strings should be the result, this looks like a bug. [clip]

I'm happy to add a patch and tests, but could use some guidance...

It looks like discover_itemsize() in core/src/multiarray/ctors.c should
compute the length of the string or unicode representation of the object
based on the eventual type, but looking at UNICODE_setitem() and
STRING_setitem() in core/src/multiarray/arraytypes.c.src, this is not
trivial.

Perhaps the object-to-unicode/string parts of
UNICODE_setitem/STRING_setitem can be extracted into separate functions
that can be called from *_setitem as well as discover_itemsize.
discover_itemsize would also need to know the type it's discovering for
(string or unicode or user-defined). Not sure what to do to handle
user-defined types (error?).

If that is too complicated, maybe discover_itemsize should return -1 (or
warn, but given the danger of truncation that seems a bit weak) if asked
to discover from data that doesn't have a length. This would result in
dtype=object when np.array is handed a mixed int/string list.

I wonder, also, if STRING_setitem and UNICODE_setitem shouldn't emit a
warning if asked to truncate data?

Ray Jones
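The gist of what Ray is proposing for discover_itemsize, sketched in
Python 2 rather than the actual C - the names and structure here are
illustrative, not NumPy's implementation:

import warnings

def discover_itemsize(seq, target):
    """Widest str/unicode form any element of seq needs in the target type."""
    if target not in (str, unicode):
        raise TypeError("user-defined target types not handled")
    return max(len(target(el)) for el in seq)

def setitem_checked(value, itemsize, target):
    """Convert value, warning (rather than silently truncating) on overflow."""
    s = target(value)
    if len(s) > itemsize:
        warnings.warn("truncating %r to %d characters" % (value, itemsize))
    return s[:itemsize]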
From charlesr.harris at gmail.com  Fri Dec  2 12:53:44 2011
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 2 Dec 2011 10:53:44 -0700
Subject: [Numpy-discussion] numpy.array() of mixed integers and strings can truncate data

On Fri, Dec 2, 2011 at 8:23 AM, Thouis (Ray) Jones wrote:
> I'm happy to add a patch and tests, but could use some guidance...
> [clip]

After sleeping on this, I think an object array in this situation would be
the better choice and wouldn't result in lost information. This might
change the behavior of some functions, though, so it would need testing.

> I wonder, also, if STRING_setitem and UNICODE_setitem shouldn't emit a
> warning if asked to truncate data?

I think a warning would be useful. But I don't use strings much, so input
from a user might carry more weight.

Chuck

From magnetotellurics at gmail.com  Fri Dec  2 22:10:51 2011
From: magnetotellurics at gmail.com (kneil)
Date: Fri, 2 Dec 2011 19:10:51 -0800 (PST)
Subject: [Numpy-discussion] Apparently non-deterministic behaviour of complex array multiplication

OK - here is the status update on this problem:

1. To test for bad RAM, I saved the code plus data onto a USB drive and
manually transferred it to a colleague's desktop machine with the same
specs as mine. The NaN values continued to appear at random, so it is
unlikely to be bad RAM - unless it's contagious.

2. Here is how I am temporarily working around the problem: right before
performing the multiplication X*X.H, where X=asmatrix(Y), I save X to file
using np.save('X',X). Then I reload it via X=np.load('X.npy'), cast it as
a matrix, X=asmatrix(X), and carry on: S=X*X.H. I have not seen any NaN
since - I have only run it a few times, but it seems to work.

I'd be very curious to hear any ideas on why this problem exists in the
first place, and why save/load gets me around it.

For the record I am running ubuntu 11.04, have 16GB RAM (not 12) and use
python2.7.
cheers,
Karl

[clip - repost of my earlier message and of Joe's reply]
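If the save/load round-trip really is what makes the NaNs go away, a fresh
in-memory copy may have the same effect without touching the disk - purely a
guess at the mechanism (a new, C-contiguous buffer), not a diagnosis:

import numpy as np

X = np.asmatrix(np.ascontiguousarray(Y))   # fresh C-ordered copy of the data Y
S = X * X.H                                # same product as before

The earlier memtest suggestion still applies either way.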
From njs at pobox.com  Fri Dec  2 22:30:56 2011
From: njs at pobox.com (Nathaniel Smith)
Date: Fri, 2 Dec 2011 22:30:56 -0500
Subject: [Numpy-discussion] Apparently non-deterministic behaviour of complex array multiplication

If save/load actually makes a reliable difference, then it would be useful
to do something like this, and see what you see:

save("X", X)
X2 = load("X.npy")
diff = (X != X2)
# did save/load change anything?
any(diff)
# if so, then what changed?
X[diff]
X2[diff]
# any subtle differences in floating point representation?
X[diff][0].view(np.uint8)
X2[diff][0].view(np.uint8)

(You should still run memtest. It's very easy - just install it with your
package manager, then reboot. Hold down the shift key while booting, and
you'll get a boot menu. Choose memtest, and then leave it to run
overnight.)

- Nathaniel

On Dec 2, 2011 10:10 PM, "kneil" wrote:
> OK - here is the status update on this problem:
> [clip]

From rkraft4 at gmail.com  Sat Dec  3 01:35:03 2011
From: rkraft4 at gmail.com (Robin Kraft)
Date: Sat, 3 Dec 2011 01:35:03 -0500
Subject: [Numpy-discussion] "upsample" or scale an array

I need to take an array - derived from raster GIS data - and upsample or
scale it. That is, I need to repeat each value in each dimension so that,
for example, a 2x2 array becomes a 4x4 array as follows:

[[1, 2],
 [3, 4]]

becomes

[[1,1,2,2],
 [1,1,2,2],
 [3,3,4,4],
 [3,3,4,4]]

It seems like some combination of np.resize or np.repeat and reshape +
rollaxis would do the trick, but I'm at a loss.

Many thanks!

-Robin

From warren.weckesser at enthought.com  Sat Dec  3 01:51:21 2011
From: warren.weckesser at enthought.com (Warren Weckesser)
Date: Sat, 3 Dec 2011 00:51:21 -0600
Subject: [Numpy-discussion] "upsample" or scale an array

On Sat, Dec 3, 2011 at 12:35 AM, Robin Kraft wrote:
> [clip]

Just a day or so ago, Josef Perktold showed one way of accomplishing this
using numpy.kron:

In [14]: a = arange(12).reshape(3,4)

In [15]: a
Out[15]:
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [16]: kron(a, ones((2,2)))
Out[16]:
array([[  0.,   0.,   1.,   1.,   2.,   2.,   3.,   3.],
       [  0.,   0.,   1.,   1.,   2.,   2.,   3.,   3.],
       [  4.,   4.,   5.,   5.,   6.,   6.,   7.,   7.],
       [  4.,   4.,   5.,   5.,   6.,   6.,   7.,   7.],
       [  8.,   8.,   9.,   9.,  10.,  10.,  11.,  11.],
       [  8.,   8.,   9.,   9.,  10.,  10.,  11.,  11.]])

Warren
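One detail visible in Warren's output: np.ones defaults to float64, so the
kron result is upcast to floats. Passing the source dtype keeps integers
intact - a small tweak, not something raised in the thread itself:

import numpy as np

a = np.arange(12).reshape(3, 4)
up = np.kron(a, np.ones((2, 2), dtype=a.dtype))   # stays integer, same layout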
In [3]: a = np.random.randint(0, 255, (2400, 2400)).astype('uint8') In [4]: timeit a.repeat(2, axis=0).repeat(2, axis=1) 10 loops, best of 3: 182 ms per loop In [5]: timeit np.kron(a, np.ones((2,2), dtype='uint8')) 1 loops, best of 3: 513 ms per loop Or for a 43200x4800 array: In [6]: a = np.random.randint(0, 255, (2400*18, 2400*2)).astype('uint8') In [7]: timeit a.repeat(2, axis=0).repeat(2, axis=1) 1 loops, best of 3: 6.92 s per loop In [8]: timeit np.kron(a, np.ones((2, 2), dtype='uint8')) 1 loops, best of 3: 27.8 s per loop In this case repeat() peaked at about 1gb of ram usage while np.kron hit about 1.7gb. Thanks again Warren. I'd tried way too many variations on reshape and rollaxis, and should have come to the Numpy list a lot sooner! -Robin On Dec 3, 2011, at 12:51 AM, Warren Weckesser wrote: > On Sat, Dec 3, 2011 at 12:35 AM, Robin Kraft wrote: > > > I need to take an array - derived from raster GIS data - and upsample or > > scale it. That is, I need to repeat each value in each dimension so that, > > for example, a 2x2 array becomes a 4x4 array as follows: > > > > [[1, 2], > > [3, 4]] > > > > becomes > > > > [[1,1,2,2], > > [1,1,2,2], > > [3,3,4,4] > > [3,3,4,4]] > > > > It seems like some combination of np.resize or np.repeat and reshape + > > rollaxis would do the trick, but I'm at a loss. > > > > Many thanks! > > > > -Robin > > > > > Just a day or so ago, Josef Perktold showed one way of accomplishing this > using numpy.kron: > > In [14]: a = arange(12).reshape(3,4) > > In [15]: a > Out[15]: > array([[ 0, 1, 2, 3], > [ 4, 5, 6, 7], > [ 8, 9, 10, 11]]) > > In [16]: kron(a, ones((2,2))) > Out[16]: > array([[ 0., 0., 1., 1., 2., 2., 3., 3.], > [ 0., 0., 1., 1., 2., 2., 3., 3.], > [ 4., 4., 5., 5., 6., 6., 7., 7.], > [ 4., 4., 5., 5., 6., 6., 7., 7.], > [ 8., 8., 9., 9., 10., 10., 11., 11.], > [ 8., 8., 9., 9., 10., 10., 11., 11.]]) > > > Warren -------------- next part -------------- An HTML attachment was scrubbed... URL: From shish at keba.be Sat Dec 3 11:05:47 2011 From: shish at keba.be (Olivier Delalleau) Date: Sat, 3 Dec 2011 11:05:47 -0500 Subject: [Numpy-discussion] "upsample" or scale an array In-Reply-To: <991EB7C3-97FF-4996-A698-6B38CFBF2BC6@gmail.com> References: <991EB7C3-97FF-4996-A698-6B38CFBF2BC6@gmail.com> Message-ID: You can also use numpy.tile -=- Olivier 2011/12/3 Robin Kraft > Thanks Warren, this is great, and even handles giant arrays just fine if > you've got enough RAM. > > I also just found this StackOverflow post with another solution. > > a.repeat(2, axis=0).repeat(2, axis=1). > http://stackoverflow.com/questions/7525214/how-to-scale-a-numpy-array > > np.kron lets you do more, but for my simple use case the repeat() method > is faster and more ram efficient with large arrays. > > In [3]: a = np.random.randint(0, 255, (2400, 2400)).astype('uint8') > > In [4]: timeit a.repeat(2, axis=0).repeat(2, axis=1) > 10 loops, best of 3: 182 ms per loop > > In [5]: timeit np.kron(a, np.ones((2,2), dtype='uint8')) > 1 loops, best of 3: 513 ms per loop > > > Or for a 43200x4800 array: > > In [6]: a = np.random.randint(0, 255, (2400*18, 2400*2)).astype('uint8') > > In [7]: timeit a.repeat(2, axis=0).repeat(2, axis=1) > 1 loops, best of 3: 6.92 s per loop > > In [8]: timeit np.kron(a, np.ones((2, 2), dtype='uint8')) > 1 loops, best of 3: 27.8 s per loop > > In this case repeat() peaked at about 1gb of ram usage while np.kron hit > about 1.7gb. > > Thanks again Warren. 
I'd tried way too many variations on reshape and > rollaxis, and should have come to the Numpy list a lot sooner! > > -Robin > > > On Dec 3, 2011, at 12:51 AM, Warren Weckesser wrote: > > On Sat, Dec 3, 2011 at 12:35 AM, Robin Kraft wrote: > > >* I need to take an array - derived from raster GIS data - and upsample or*>* scale it. That is, I need to repeat each value in each dimension so that,*>* for example, a 2x2 array becomes a 4x4 array as follows:*>**>* [[1, 2],*>* [3, 4]]*>**>* becomes*>**>* [[1,1,2,2],*>* [1,1,2,2],*>* [3,3,4,4]*>* [3,3,4,4]]*>**>* It seems like some combination of np.resize or np.repeat and reshape +*>* rollaxis would do the trick, but I'm at a loss.*>**>* Many thanks!*>**>* -Robin*>** > > Just a day or so ago, Josef Perktold showed one way of accomplishing this > using numpy.kron: > > In [14]: a = arange(12).reshape(3,4) > > In [15]: a > Out[15]: > array([[ 0, 1, 2, 3], > [ 4, 5, 6, 7], > [ 8, 9, 10, 11]]) > > In [16]: kron(a, ones((2,2))) > Out[16]: > array([[ 0., 0., 1., 1., 2., 2., 3., 3.], > [ 0., 0., 1., 1., 2., 2., 3., 3.], > [ 4., 4., 5., 5., 6., 6., 7., 7.], > [ 4., 4., 5., 5., 6., 6., 7., 7.], > [ 8., 8., 9., 9., 10., 10., 11., 11.], > [ 8., 8., 9., 9., 10., 10., 11., 11.]]) > > > Warren > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rkraft4 at gmail.com Sat Dec 3 12:22:48 2011 From: rkraft4 at gmail.com (Robin Kraft) Date: Sat, 3 Dec 2011 12:22:48 -0500 Subject: [Numpy-discussion] "upsample" or scale an array In-Reply-To: <991EB7C3-97FF-4996-A698-6B38CFBF2BC6@gmail.com> References: <991EB7C3-97FF-4996-A698-6B38CFBF2BC6@gmail.com> Message-ID: That does repeat the elements, but doesn't get them into the desired order. In [4]: print a [[1 2] [3 4]] In [7]: np.tile(a, 4) Out[7]: array([[1, 2, 1, 2, 1, 2, 1, 2], [3, 4, 3, 4, 3, 4, 3, 4]]) In [8]: np.tile(a, 4).reshape(4,4) Out[8]: array([[1, 2, 1, 2], [1, 2, 1, 2], [3, 4, 3, 4], [3, 4, 3, 4]]) It's close, but I want to repeat the elements along the two axes, effectively stretching it by the lower right corner: array([[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]) It would take some more reshaping/axis rolling to get there, but it seems doable. Anyone know what combination of manipulations would work with the result of np.tile? -Robin On Dec 3, 2011, at 11:05 AM, Olivier Delalleau wrote: > You can also use numpy.tile > > -=- Olivier > > 2011/12/3 Robin Kraft >> Thanks Warren, this is great, and even handles giant arrays just fine if you've got enough RAM. >> >> I also just found this StackOverflow post with another solution. >> >> a.repeat(2, axis=0).repeat(2, axis=1). >> http://stackoverflow.com/questions/7525214/how-to-scale-a-numpy-array >> >> np.kron lets you do more, but for my simple use case the repeat() method is faster and more ram efficient with large arrays. 
>> >> In [3]: a = np.random.randint(0, 255, (2400, 2400)).astype('uint8') >> >> In [4]: timeit a.repeat(2, axis=0).repeat(2, axis=1) >> 10 loops, best of 3: 182 ms per loop >> >> In [5]: timeit np.kron(a, np.ones((2,2), dtype='uint8')) >> 1 loops, best of 3: 513 ms per loop >> >> >> Or for a 43200x4800 array: >> >> In [6]: a = np.random.randint(0, 255, (2400*18, 2400*2)).astype('uint8') >> >> In [7]: timeit a.repeat(2, axis=0).repeat(2, axis=1) >> 1 loops, best of 3: 6.92 s per loop >> >> In [8]: timeit np.kron(a, np.ones((2, 2), dtype='uint8')) >> 1 loops, best of 3: 27.8 s per loop >> >> In this case repeat() peaked at about 1gb of ram usage while np.kron hit about 1.7gb. >> >> Thanks again Warren. I'd tried way too many variations on reshape and rollaxis, and should have come to the Numpy list a lot sooner! >> >> -Robin >> >> >> On Dec 3, 2011, at 12:51 AM, Warren Weckesser wrote: >>> On Sat, Dec 3, 2011 at 12:35 AM, Robin Kraft wrote: >>> >>> > I need to take an array - derived from raster GIS data - and upsample or >>> > scale it. That is, I need to repeat each value in each dimension so that, >>> > for example, a 2x2 array becomes a 4x4 array as follows: >>> > >>> > [[1, 2], >>> > [3, 4]] >>> > >>> > becomes >>> > >>> > [[1,1,2,2], >>> > [1,1,2,2], >>> > [3,3,4,4] >>> > [3,3,4,4]] >>> > >>> > It seems like some combination of np.resize or np.repeat and reshape + >>> > rollaxis would do the trick, but I'm at a loss. >>> > >>> > Many thanks! >>> > >>> > -Robin >>> > >>> >>> >>> Just a day or so ago, Josef Perktold showed one way of accomplishing this >>> using numpy.kron: >>> >>> In [14]: a = arange(12).reshape(3,4) >>> >>> In [15]: a >>> Out[15]: >>> array([[ 0, 1, 2, 3], >>> [ 4, 5, 6, 7], >>> [ 8, 9, 10, 11]]) >>> >>> In [16]: kron(a, ones((2,2))) >>> Out[16]: >>> array([[ 0., 0., 1., 1., 2., 2., 3., 3.], >>> [ 0., 0., 1., 1., 2., 2., 3., 3.], >>> [ 4., 4., 5., 5., 6., 6., 7., 7.], >>> [ 4., 4., 5., 5., 6., 6., 7., 7.], >>> [ 8., 8., 9., 9., 10., 10., 11., 11.], >>> [ 8., 8., 9., 9., 10., 10., 11., 11.]]) >>> >>> >>> Warren >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From shish at keba.be Sat Dec 3 12:47:48 2011 From: shish at keba.be (Olivier Delalleau) Date: Sat, 3 Dec 2011 12:47:48 -0500 Subject: [Numpy-discussion] "upsample" or scale an array In-Reply-To: References: <991EB7C3-97FF-4996-A698-6B38CFBF2BC6@gmail.com> Message-ID: Ah sorry, I hadn't read carefully enough what you were trying to achieve. I think the double repeat solution looks like your best option then. -=- Olivier 2011/12/3 Robin Kraft > That does repeat the elements, but doesn't get them into the desired order. > > In [4]: print a > [[1 2] > [3 4]] > > In [7]: np.tile(a, 4) > Out[7]: > array([[1, 2, 1, 2, 1, 2, 1, 2], > [3, 4, 3, 4, 3, 4, 3, 4]]) > > In [8]: np.tile(a, 4).reshape(4,4) > Out[8]: > array([[1, 2, 1, 2], > [1, 2, 1, 2], > [3, 4, 3, 4], > [3, 4, 3, 4]]) > > It's close, but I want to repeat the elements along the two > axes, effectively stretching it by the lower right corner: > > array([[1, 1, 2, 2], > [1, 1, 2, 2], > [3, 3, 4, 4], > [3, 3, 4, 4]]) > > It would take some more reshaping/axis rolling to get there, but it seems > doable. > > Anyone know what combination of manipulations would work with the result > of np.tile? 
> [clip]

From derek at astro.physik.uni-goettingen.de  Sat Dec 3 12:50:18 2011
From: derek at astro.physik.uni-goettingen.de (Derek Homeier)
Date: Sat, 3 Dec 2011 18:50:18 +0100
Subject: [Numpy-discussion] "upsample" or scale an array
In-Reply-To:
References:
Message-ID: <4E3275D6-DD83-47F1-A189-D4DBCACB5425@astro.physik.uni-goettingen.de>

On 03.12.2011, at 6:22PM, Robin Kraft wrote:

> That does repeat the elements, but doesn't get them into the desired order.
> [clip]
> Anyone know what combination of manipulations would work with the result of np.tile?

Rolling was the keyword:

np.rollaxis(np.tile(a, 4).reshape(2,2,-1), 2, 1).reshape(4,4)
[[1 1 2 2]
 [1 1 2 2]
 [3 3 4 4]
 [3 3 4 4]]

I leave the generalisation and timing up to you, but it seems for

a = np.arange(M**2).reshape(M,-1)

np.rollaxis(np.tile(a, N**2).reshape(M,N,-1), 2, 1).reshape(M*N,-1)

should do the trick.

Cheers,
Derek

From derek at astro.physik.uni-goettingen.de  Sat Dec 3 12:57:36 2011
From: derek at astro.physik.uni-goettingen.de (Derek Homeier)
Date: Sat, 3 Dec 2011 18:57:36 +0100
Subject: [Numpy-discussion] "upsample" or scale an array
In-Reply-To:
References:
Message-ID: <22C59B24-1DD8-4285-9C04-74582F5AD95E@astro.physik.uni-goettingen.de>

On 03.12.2011, at 6:47PM, Olivier Delalleau wrote:

> Ah sorry, I hadn't read carefully enough what you were trying to achieve. I think the double repeat solution looks like your best option then.

Considering that it is a lot shorter than fixing the tile() result, you are probably right (I've only now looked closer at the repeat() solution ;-). I'd still be interested in the performance - since I think none of the reshape or rollaxis operations actually move any data in memory (for numpy > 1.6), it might still be faster.

Cheers,
Derek

From rkraft4 at gmail.com  Sat Dec 3 13:06:22 2011
From: rkraft4 at gmail.com (Robin Kraft)
Date: Sat, 3 Dec 2011 13:06:22 -0500
Subject: [Numpy-discussion] "upsample" or scale an array
In-Reply-To:
References:
Message-ID: <70C4E8AC-ED7A-469E-AE96-52277B03074E@gmail.com>

Ha! I knew it had to be possible! Thanks Derek.

So for M = 1200 and N = 2 (now on my laptop):

In [70]: M = 1200

In [69]: N = 2

In [71]: a = np.random.randint(0, 255, (M**2)).reshape(M,-1)

In [76]: timeit np.rollaxis(np.tile(a, N**2).reshape(M,N,-1), 2, 1).reshape(M*N,-1)
10 loops, best of 3: 99.1 ms per loop

In [78]: timeit a.repeat(2, axis=0).repeat(2, axis=1)
10 loops, best of 3: 85.6 ms per loop

In [79]: timeit np.kron(a, np.ones((2,2), 'uint8'))
1 loops, best of 3: 521 ms per loop

It turns out np.kron and repeat are pretty straightforward for multi-dimensional data too - scaling or stretching a stacked array representing pixel data over time, for example. Nothing changes for np.kron - it handles the additional dimensionality by itself. With repeat you just tell it to operate on the last two dimensions.

So to sum up:

1) np.kron is cool for the simplicity of the code and simple scaling to N dimensions. It's also handy if you want to scale the array elements themselves too.
2) repeat() along the last N axes is a bit more intuitive (i.e. less magical) to me and has a better performance profile.
3) Derek's reshape/rolling solution is almost as fast but it gives me a headache trying to visualize what it's actually doing. I don't want to think about adding another dimension ...

Thanks for the help folks.
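In case it helps someone searching the archives later, here's a tiny helper you could wrap around option 2 (my own naming, untested beyond the toy examples in this thread; it assumes repeat() accepts negative axis numbers):

import numpy as np

def upsample(a, n):
    # Repeat each value n times along the last two axes, so a 2x2
    # array becomes a (2n)x(2n) array; any leading axes pass through.
    return a.repeat(n, axis=-2).repeat(n, axis=-1)

upsample(np.array([[1, 2], [3, 4]]), 2)
# array([[1, 1, 2, 2],
#        [1, 1, 2, 2],
#        [3, 3, 4, 4],
#        [3, 3, 4, 4]])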
Here's scaling of a hypothetical time series (i.e. 3 axes), where each sub-array represents a month.

In [26]: print a
[[[1 2]
  [3 4]]

 [[1 2]
  [3 4]]

 [[1 2]
  [3 4]]]

In [27]: np.kron(a, np.ones((2,2), dtype='uint8'))
Out[27]:
array([[[1, 1, 2, 2],
        [1, 1, 2, 2],
        [3, 3, 4, 4],
        [3, 3, 4, 4]],

       [[1, 1, 2, 2],
        [1, 1, 2, 2],
        [3, 3, 4, 4],
        [3, 3, 4, 4]],

       [[1, 1, 2, 2],
        [1, 1, 2, 2],
        [3, 3, 4, 4],
        [3, 3, 4, 4]]])

In [64]: a.repeat(2, axis=1).repeat(2, axis=2)
Out[64]:
array([[[1, 1, 2, 2],
        [1, 1, 2, 2],
        [3, 3, 4, 4],
        [3, 3, 4, 4]],

       [[1, 1, 2, 2],
        [1, 1, 2, 2],
        [3, 3, 4, 4],
        [3, 3, 4, 4]],

       [[1, 1, 2, 2],
        [1, 1, 2, 2],
        [3, 3, 4, 4],
        [3, 3, 4, 4]]])

On Dec 3, 2011, at 12:50PM, Derek Homeier wrote:

> Rolling was the keyword:
> [clip]

From irving at naml.us  Sat Dec 3 19:28:10 2011
From: irving at naml.us (Geoffrey Irving)
Date: Sat, 3 Dec 2011 16:28:10 -0800
Subject: [Numpy-discussion] bug in PyArray_GetCastFunc
Message-ID:

When attempting to cast to a user defined type, PyArray_GetCastFunc looks up the cast function in the dictionary but doesn't check if the entry exists. This causes segfaults. Here's a patch.

Geoffrey

diff --git a/numpy/core/src/multiarray/convert_datatype.c b/numpy/core/src/multiarray/convert_datatype.c
index 818d558..4b8f38b 100644
--- a/numpy/core/src/multiarray/convert_datatype.c
+++ b/numpy/core/src/multiarray/convert_datatype.c
@@ -81,7 +81,7 @@ PyArray_GetCastFunc(PyArray_Descr *descr, int type_num)
             key = PyInt_FromLong(type_num);
             cobj = PyDict_GetItem(obj, key);
             Py_DECREF(key);
-            if (NpyCapsule_Check(cobj)) {
+            if (cobj && NpyCapsule_Check(cobj)) {
                 castfunc = NpyCapsule_AsVoidPtr(cobj);
             }
         }

From teoliphant at gmail.com  Sat Dec 3 21:18:41 2011
From: teoliphant at gmail.com (Travis Oliphant)
Date: Sat, 3 Dec 2011 20:18:41 -0600
Subject: [Numpy-discussion] NumPy Governance
Message-ID:

Hi everyone,

There have been some wonderfully vigorous discussions over the past few months that have made it clear that we need some clarity about how decisions will be made in the NumPy community.

When we were a smaller bunch of people it seemed easier to come to an agreement and things pretty much evolved based on (mostly) consensus and who was available to actually do the work.
There is a need for a clearer structure so that we know how decisions will get made and so that code can move forward while paying attention to the current user-base. There has been a "steering committee" structure for SciPy in the past, and I have certainly been prone to lump both NumPy and SciPy together given that I have a strong interest in and have spent a great amount of time working on both projects. Others have also spent time on both projects.

However, I think it is critical at this stage to clearly separate the projects and define a governing structure that is fair and agreeable for NumPy. SciPy has multiple modules and will probably need structure around each module independently. For now, I wanted to open up a discussion to see what people thought about NumPy's governance.

My initial thoughts:

 * discussions happen as they do now on the mailing list
 * a small group of developers (5-11) constitute the "board" and major decisions are made by vote of that group (not just simple majority --- needs at least 2/3 +1 votes).
 * votes are +1/+0/-0/-1
 * if a topic is difficult to resolve it is moved off the main list and discussed on a separate "board" mailing list --- these should be rare, but parts of the NA discussion would probably qualify
 * This board mailing list is "publicly" viewable but only board members may post.
 * The board is renewed and adjusted each year --- based on nomination and 2/3 vote of the current board until the board is at 11.
 * The chairman of the board is voted by a majority of the board and has veto power unless over-ridden by 3/4 of the board.
 * Petitions to remove people from the board can be made by 50+ independent reverse nominations (hopefully people will just withdraw if they are no longer active).

All of these points are open for discussion. I just thought I would start the conversation. I will be much more active this next year with NumPy and will be very interested in the direction NumPy is taking. I'm hoping to discern by this conversation who else is very interested in the direction of NumPy so that the first board can be formally constituted.

Best regards,

-Travis

From irving at naml.us  Sat Dec 3 22:14:11 2011
From: irving at naml.us (Geoffrey Irving)
Date: Sat, 3 Dec 2011 19:14:11 -0800
Subject: [Numpy-discussion] failure to register ufunc loops for user defined types
Message-ID:

Hello,

I'm trying to add a fixed precision rational number dtype to numpy, and am running into an issue trying to register ufunc loops. The code in question looks like

    int npy_rational = PyArray_RegisterDataType(&rational_descr);
    PyObject* equal = ... // extract the equal object from the imported numpy module
    int types[3] = {npy_rational, npy_rational, NPY_BOOL};
    if (PyUFunc_RegisterLoopForType((PyUFuncObject*)ufunc, npy_rational, rational_ufunc_##name, types, 0) < 0)
        return 0;

In Python 2.6.7 with the latest numpy from git, I get

    >>> from rational import *
    >>> i = array([rational(5,3)])
    >>> i
    array([5/3], dtype=rational)
    >>> equal(i,i)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: ufunc 'equal' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

The same thing happens with (rational,rational)->rational ufuncs like multiply.

The full extension module code is here:

    https://github.com/girving/poker/blob/rational/rational.cpp

I realize this isn't much information to go on, but let me know if anything comes to mind in terms of possible reasons or further tests to run.
Unfortunately it looks like the ufunc ntypes and types properties aren't updated based on user-defined loops, so I'm not yet sure if the problem is in registry or resolution.

It's also possible someone else hit this before: http://projects.scipy.org/numpy/ticket/1913.

Thanks,
Geoffrey

From matthew.brett at gmail.com  Sat Dec 3 22:42:03 2011
From: matthew.brett at gmail.com (Matthew Brett)
Date: Sat, 3 Dec 2011 19:42:03 -0800
Subject: [Numpy-discussion] NumPy Governance
In-Reply-To:
References:
Message-ID:

Hi Travis,

On Sat, Dec 3, 2011 at 6:18 PM, Travis Oliphant wrote:
> Hi everyone,
>
> There have been some wonderfully vigorous discussions over the past few
> months that have made it clear that we need some clarity about how
> decisions will be made in the NumPy community.
> [clip]

Thanks very much for starting this discussion.

You have probably seen that my preference would be for all discussions to be public - in the sense that all can contribute. So, it seems reasonable to me to have a 'board' as you describe, but that the board should vote on the same mailing list as the rest of the discussion. Having a separate mailing list for discussion makes the separation overt between those with a granted voice and those without, and I would hope for a structure which emphasized discussion in an open forum.

Put another way, what advantage would having a separate public mailing list have?

How does this governance compare to that of - say - Linux or Python or Debian?
My worry will be that it will be too tempting to terminate discussions and proceed to resolve by vote, when voting (as Karl Vogel describes) may still do harm.

What will be the position - maybe I mean your position - on consensus as Nathaniel has described it? I feel the masked array discussion would have been more productive (and maybe shorter and more to the point) if there had been some rule-of-thumb that every effort is made to reach consensus before proceeding to implementation - or a vote.

For example, in the masked array discussion, I would have liked to be able to say 'hold on, we have a rule that we try our best to reach consensus; I do not feel we have done that yet'.

See you,

Matthew

From teoliphant at gmail.com  Sat Dec 3 22:58:36 2011
From: teoliphant at gmail.com (Travis Oliphant)
Date: Sat, 3 Dec 2011 21:58:36 -0600
Subject: [Numpy-discussion] NumPy Governance
In-Reply-To:
References:
Message-ID: <81147240-6F99-4CD1-847E-E892CC38E9BC@enthought.com>

I like the idea of trying to reach consensus first. The only point of having a board is to have some way to resolve issues should consensus not be reachable.

Believe me, I'm not that excited about a separate mailing list. It would be great if we could resolve everything on a single list.

-Travis

On Dec 3, 2011, at 9:42 PM, Matthew Brett wrote:

> Hi Travis,
>
> Thanks very much for starting this discussion.
> [clip]
---
Travis Oliphant
Enthought, Inc.
oliphant at enthought.com
1-512-536-1057
http://www.enthought.com

From warren.weckesser at enthought.com  Sun Dec 4 00:11:11 2011
From: warren.weckesser at enthought.com (Warren Weckesser)
Date: Sat, 3 Dec 2011 23:11:11 -0600
Subject: [Numpy-discussion] Convert datetime64 to python datetime.datetime in numpy 1.6.1?
Message-ID:

In numpy 1.6.1, what's the most straightforward way to convert a datetime64 to a python datetime.datetime? E.g. I have

In [1]: d = datetime64("2011-12-03 12:34:56.75")

In [2]: d
Out[2]: 2011-12-03 12:34:56.750000

I want the same time as a datetime.datetime instance. My best hack so far is to parse repr(d) with datetime.datetime.strptime:

In [3]: import datetime

In [4]: dt = datetime.datetime.strptime(repr(d), "%Y-%m-%d %H:%M:%S.%f")

In [5]: dt
Out[5]: datetime.datetime(2011, 12, 3, 12, 34, 56, 750000)

That works--unless there are no microseconds, in which case ".%f" must be removed from the format string--but there must be a better way.
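(For reference, the fallback version of the hack -- just a sketch with a hypothetical helper name, since ".%f" only parses when microseconds are present:

import datetime

def dt64_to_datetime(d):
    # Parse repr(d) with, then without, the fractional-seconds field.
    s = repr(d)
    try:
        return datetime.datetime.strptime(s, "%Y-%m-%d %H:%M:%S.%f")
    except ValueError:
        return datetime.datetime.strptime(s, "%Y-%m-%d %H:%M:%S")

)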
Warren

From charlesr.harris at gmail.com  Sun Dec 4 01:43:43 2011
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sat, 3 Dec 2011 23:43:43 -0700
Subject: [Numpy-discussion] NumPy Governance
In-Reply-To:
References:
Message-ID:

On Sat, Dec 3, 2011 at 7:18 PM, Travis Oliphant wrote:

> Hi everyone,
>
> There have been some wonderfully vigorous discussions over the past few
> months that have made it clear that we need some clarity about how
> decisions will be made in the NumPy community.
> [clip]

If the purpose of the board is to resolve controversies, the 2/3 requirement is going to cause problems. The reason majority votes are usually used and that committees are set up with an odd number of members is that nothing gets resolved otherwise. Doing nothing is not a solution to missing consensus. Furthermore, at the current time, I don't think there are 5 active developers, let alone 11. With hard work you might scrape together two or three. Having 5 or 11 people making decisions for the two or three actually doing the work isn't going to go over well.

I would propose a technical board of one or three people who can step in if an issue looks like it needs outside intervention. And I would suggest at least one of the members be someone from the outside but familiar with the project, say someone like Fernando. The one member model is if we decide to go with a benevolent dictator.

Note that for the smaller boards both the 2/3rds and majority votes would be the same number of people ;)

Chuck
From dpinte at enthought.com  Sun Dec 4 06:30:07 2011
From: dpinte at enthought.com (Didrik Pinte)
Date: Sun, 4 Dec 2011 12:30:07 +0100
Subject: [Numpy-discussion] Convert datetime64 to python datetime.datetime in numpy 1.6.1?
In-Reply-To:
References:
Message-ID:

On Sun, Dec 4, 2011 at 6:11 AM, Warren Weckesser wrote:
> In numpy 1.6.1, what's the most straightforward way to convert a datetime64
> to a python datetime.datetime?
> [clip]

Warren,

You can do that:

In [13]: a = array(["2011-12-03 12:34:56.75"], dtype=datetime64)

In [14]: b = a.astype(object)

In [15]: b[0]
Out[15]: datetime.datetime(2011, 12, 3, 12, 34, 56, 750000)

Not sure how efficient it is but it works fine.

-- Didrik

From warren.weckesser at enthought.com  Sun Dec 4 07:07:03 2011
From: warren.weckesser at enthought.com (Warren Weckesser)
Date: Sun, 4 Dec 2011 06:07:03 -0600
Subject: [Numpy-discussion] Convert datetime64 to python datetime.datetime in numpy 1.6.1?
In-Reply-To:
References:
Message-ID:

On Sun, Dec 4, 2011 at 5:30 AM, Didrik Pinte wrote:
> You can do that:
>
> In [13]: a = array(["2011-12-03 12:34:56.75"], dtype=datetime64)
>
> In [14]: b = a.astype(object)
>
> In [15]: b[0]
> Out[15]: datetime.datetime(2011, 12, 3, 12, 34, 56, 750000)
>
> Not sure how efficient it is but it works fine.

Thanks, Didrik, that's just what I needed.

Warren

From alan.isaac at gmail.com  Sun Dec 4 08:16:03 2011
From: alan.isaac at gmail.com (Alan G Isaac)
Date: Sun, 04 Dec 2011 08:16:03 -0500
Subject: [Numpy-discussion] NumPy Governance
In-Reply-To:
References:
Message-ID: <4EDB7293.90302@gmail.com>

On 12/4/2011 1:43 AM, Charles R Harris wrote:
> I don't think there are 5 active developers, let alone 11.
> With hard work you might scrape together two or three.
> Having 5 or 11 people making decisions for the two or
> three actually doing the work isn't going to go over well.

Very true! But you might consider including on any board a developer or two from important projects that are very NumPy dependent. (E.g., Matplotlib.)
One other thing: how about starting with a "board" of 3 and a rule that says any active developer can request to join, that additions are determined by majority vote of the existing board, and that having the board both small and odd numbered is a priority? (Fixing the board size in advance for a project we all hope will grow substantially seems odd.)

fwiw,
Alan Isaac

From charlesr.harris at gmail.com  Sun Dec 4 10:48:18 2011
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sun, 4 Dec 2011 08:48:18 -0700
Subject: [Numpy-discussion] NumPy Governance
In-Reply-To: <4EDB7293.90302@gmail.com>
References: <4EDB7293.90302@gmail.com>
Message-ID:

On Sun, Dec 4, 2011 at 6:16 AM, Alan G Isaac wrote:

> Very true! But you might consider including on any board
> a developer or two from important projects that are very
> NumPy dependent. (E.g., Matplotlib.)

That's a good idea.

> One other thing: how about starting with a "board" of 3
> and a rule that says any active developer can request to
> join, that additions are determined by majority vote of
> the existing board, and that having the board both small
> and odd numbered is a priority? (Fixing the board size
> in advance for a project we all hope will grow substantially
> seems odd.)

I'm thinking of a board for resolving technical conflicts. Involving more people for larger discussions about direction and coordination with other projects would be the right thing to do. I think that can mostly be left informal at the moment, but having an official get-together at an event might be useful.

Chuck

From charlesr.harris at gmail.com  Sun Dec 4 12:29:45 2011
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sun, 4 Dec 2011 10:29:45 -0700
Subject: [Numpy-discussion] failure to register ufunc loops for user defined types
In-Reply-To:
References:
Message-ID:

On Sat, Dec 3, 2011 at 8:14 PM, Geoffrey Irving wrote:

> I'm trying to add a fixed precision rational number dtype to numpy,
> and am running into an issue trying to register ufunc loops.
> [clip]
I haven't tried adding a new type and can't offer any suggestions. But there was a recent implementation of a quaternion type that might be worth looking at for comparison. You can find it here.

Chuck

From charlesr.harris at gmail.com  Sun Dec 4 13:02:32 2011
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sun, 4 Dec 2011 11:02:32 -0700
Subject: [Numpy-discussion] bug in PyArray_GetCastFunc
In-Reply-To:
References:
Message-ID:

On Sat, Dec 3, 2011 at 5:28 PM, Geoffrey Irving wrote:

> When attempting to cast to a user defined type, PyArray_GetCastFunc looks
> up the cast function in the dictionary but doesn't check if the entry
> exists. This causes segfaults. Here's a patch.
> [clip]

I'm thinking NpyCapsule_Check should catch this. From the documentation it probably should:

    int PyCObject_Check(PyObject *p)
        Return true if its argument is a PyCObject

I don't think NULL is a valid PyCObject ;) However, it should be easy to add the NULL check to the numpy version of the function. I'll do that.

Chuck

From san.programming at gmail.com  Sun Dec 4 13:41:04 2011
From: san.programming at gmail.com (Santhosh R)
Date: Mon, 5 Dec 2011 00:11:04 +0530
Subject: [Numpy-discussion] stacking scalars with column vector
Message-ID:

Hi,

I am trying to learn Python to convert some of my mlab codes to Python. I'm new to Python, so some help here would be appreciated.

I am trying to make a column vector by stacking two scalars --> Xstart, Xend and the second column of an n x 2 2D array --> A.

In matlab the code is b=[Xstart;A(:,1);Xend]

I have made the following code:

import numpy as np
x = np.array([[Xstart]])
arr_temp = np.array([A[:,0]])
x = np.vstack((x, arr_temp.T))
x = np.vstack((x, [[Xend]]))

This works, but is there a more elegant way of doing this? Also, if I define arr_temp as arr_temp=A[:,0], the array arr_temp, which is supposed to be of shape (n,1), shows as (n,). Why is this?

Thanks

Santhosh

From shish at keba.be  Sun Dec 4 14:57:07 2011
From: shish at keba.be (Olivier Delalleau)
Date: Sun, 4 Dec 2011 14:57:07 -0500
Subject: [Numpy-discussion] stacking scalars with column vector
In-Reply-To:
References:
Message-ID:

You can do it in one shot with:

x = np.vstack((Xstart, A[:, 0:1], Xend))

Using A[:, 0:1] instead of A[:, 0] lets you keep it as a 2d matrix (this should answer your last question). Then the scalars Xstart and Xend will automatically be broadcasted to accommodate the shape of A[:, 0:1], so you don't need to write [[Xstart]] and [[Xend]].
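For example, a quick sketch with a made-up 4 x 2 array so the shapes are visible:

In [1]: import numpy as np

In [2]: A = np.arange(8.0).reshape(4, 2)

In [3]: A[:, 0].shape      # plain integer indexing drops the axis
Out[3]: (4,)

In [4]: A[:, 0:1].shape    # slicing keeps the column axis
Out[4]: (4, 1)

In [5]: np.vstack((0.0, A[:, 0:1], 10.0)).shape
Out[5]: (6, 1)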
-=- Olivier

2011/12/4 Santhosh R
> I am trying to make a column vector by stacking two scalars --> Xstart, Xend
> and the second column of an n x 2 2D array --> A.
> [clip]

From irving at naml.us  Sun Dec 4 19:41:11 2011
From: irving at naml.us (Geoffrey Irving)
Date: Sun, 4 Dec 2011 16:41:11 -0800
Subject: [Numpy-discussion] failure to register ufunc loops for user defined types
In-Reply-To:
References:
Message-ID:

This may be the problem. Simple diffs are pleasant. I'm guessing this code doesn't get a lot of testing. Glad it's there, though!

Geoffrey

diff --git a/numpy/core/src/umath/ufunc_type_resolution.c b/numpy/core/src/umath/ufunc_type_resolution.c
index 0d6cf19..a93eda1 100644
--- a/numpy/core/src/umath/ufunc_type_resolution.c
+++ b/numpy/core/src/umath/ufunc_type_resolution.c
@@ -1866,7 +1866,7 @@ linear_search_type_resolver(PyUFuncObject *self,
             case -1:
                 return -1;
             /* A loop was found */
-            case 1:
+            case 0:
                 return 0;
         }
     }

On Sun, Dec 4, 2011 at 9:29 AM, Charles R Harris wrote:
> I haven't tried adding a new type and can't offer any suggestions. But there
> was a recent implementation of a quaternion type that might be worth looking
> at for comparison. You can find it here.
> [clip]
From charlesr.harris at gmail.com  Sun Dec 4 20:18:39 2011
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sun, 4 Dec 2011 18:18:39 -0700
Subject: [Numpy-discussion] failure to register ufunc loops for user defined types
In-Reply-To:
References:
Message-ID:

On Sun, Dec 4, 2011 at 5:41 PM, Geoffrey Irving wrote:

> This may be the problem. Simple diffs are pleasant. I'm guessing
> this code doesn't get a lot of testing. Glad it's there, though!
> [clip]

Heh. Can you verify that this fixes the problem? That function is only called once and its return value is passed up the chain, but the documented return values of that calling function are -1, 0. So the documentation needs to be changed if this is the right thing to do.

Speaking of tests... I was wondering if you could be talked into putting together a simple user type for including in the tests?

Chuck

From irving at naml.us  Sun Dec 4 20:30:48 2011
From: irving at naml.us (Geoffrey Irving)
Date: Sun, 4 Dec 2011 17:30:48 -0800
Subject: [Numpy-discussion] bug in PyArray_GetCastFunc
In-Reply-To:
References:
Message-ID:

On Sun, Dec 4, 2011 at 10:02 AM, Charles R Harris wrote:

> I'm thinking NpyCapsule_Check should catch this. From the documentation it
> probably should:
>
>     int PyCObject_Check(PyObject *p)
>         Return true if its argument is a PyCObject
>
> I don't think NULL is a valid PyCObject ;) However, it should be easy to add
> the NULL check to the numpy version of the function. I'll do that.

That would work, but I think it would match the rest of the Python API better if NpyCapsule_Check required a nonnull argument.
PyCapsule_Check and essentially every other Python API function have documented undefined behavior if you pass in null, so it might be surprising if one numpy macro violates this. Incidentally, every other use of NpyCapsule_Check correctly tests for null.

Geoffrey

From charlesr.harris at gmail.com  Sun Dec 4 20:45:18 2011
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sun, 4 Dec 2011 18:45:18 -0700
Subject: [Numpy-discussion] bug in PyArray_GetCastFunc
In-Reply-To:
References:
Message-ID:

On Sun, Dec 4, 2011 at 6:30 PM, Geoffrey Irving wrote:

> That would work, but I think it would match the rest of the Python API
> better if NpyCapsule_Check required a nonnull argument.
> [clip]

Good points. I may change it back ;)

Chuck

From irving at naml.us  Sun Dec 4 20:59:11 2011
From: irving at naml.us (Geoffrey Irving)
Date: Sun, 4 Dec 2011 17:59:11 -0800
Subject: [Numpy-discussion] failure to register ufunc loops for user defined types
In-Reply-To:
References:
Message-ID:

On Sun, Dec 4, 2011 at 5:18 PM, Charles R Harris wrote:
and its return value is passed up the chain, but the documented > return values of that calling function are -1, 0. So the documentation needs > to be changed if this is the right thing to do. Actually, that patch was wrong, since linear_search_userloop_type_resolver needs to return three values (error, not-found, success). A better patch follows. I can confirm that this gets me further, but I get other failures down the line, so more fixes may follow. I'll push the branch with all my fixes for convenience once I have everything working. > Speaking of tests... I was wondering if you could be talked into putting > together a simple user type for including in the tests? Yep, though likely not for a couple weeks. If there's interest, I could also be convinced to sanitize my entire rational class so you could include that directly. Currently it's both C++ and uses some gcc specific features like __int128_t. Basically it's numerator/denominator, where both are 64 bit integers, and an OverflowError is thrown if anything can't be represented as such (possibly a different exception would be better in cases like (1<<64)/((1<<64)+1)). It would be easy to generalize it to rational32 vs. rational64 as well. If you want tests but not rational, it would be straightforward to strip what I have down to a bare bones test case. As for the patch below, I wouldn't bother looking at it until I get the rest of the bugs out of the way (whether they're in my code or numpy). Geoffrey ----------------------------------------------------- diff --git a/numpy/core/src/umath/ufunc_type_resolution.c b/numpy/core/src/umath/ufunc_type_resolution.c index 0d6cf19..4e81e92 100644 --- a/numpy/core/src/umath/ufunc_type_resolution.c +++ b/numpy/core/src/umath/ufunc_type_resolution.c @@ -1656,7 +1656,7 @@ linear_search_userloop_type_resolver(PyUFuncObject *self, /* Found a match */ case 1: set_ufunc_loop_data_types(self, op, out_dtype, types); - return 0; + return 1; } funcdata = funcdata->next; From san.programming at gmail.com Sun Dec 4 21:40:05 2011 From: san.programming at gmail.com (Santhosh R) Date: Mon, 5 Dec 2011 08:10:05 +0530 Subject: [Numpy-discussion] stacking scalars with column vector In-Reply-To: References: Message-ID: Thanks Oliver. You can do it in one shot with: > > x = np.vstack((Xstart, A[:, 0:1], Xend)) > > Using A[:, 0:1] instead of A[:, 0] lets you keep it as a 2d matrix (this > should answer your last question). > Then the scalars Xstart and Xend will automatically be broadcasted to > accomodate the shape of A[:, 0:1], so you don't need to write [[Xstart]] > and [[Xend]]. > > -=- Olivier > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun Dec 4 21:45:33 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 4 Dec 2011 19:45:33 -0700 Subject: [Numpy-discussion] failure to register ufunc loops for user defined types In-Reply-To: References: Message-ID: On Sun, Dec 4, 2011 at 6:59 PM, Geoffrey Irving wrote: > On Sun, Dec 4, 2011 at 5:18 PM, Charles R Harris > wrote: > > > > > > On Sun, Dec 4, 2011 at 5:41 PM, Geoffrey Irving wrote: > >> > >> This may be the problem. Simple diffs are pleasant. I'm guessing > >> this code doesn't get a lot of testing. Glad it's there, though! 
> >> > >> Geoffrey > >> > >> diff --git a/numpy/core/src/umath/ufunc_type_resolution.c > >> b/numpy/core/src/umath/ufunc_type_resolution.c > >> index 0d6cf19..a93eda1 100644 > >> --- a/numpy/core/src/umath/ufunc_type_resolution.c > >> +++ b/numpy/core/src/umath/ufunc_type_resolution.c > >> @@ -1866,7 +1866,7 @@ linear_search_type_resolver(PyUFuncObject *self, > >> case -1: > >> return -1; > >> /* A loop was found */ > >> - case 1: > >> + case 0: > >> return 0; > >> } > >> } > >> > > > > Heh. Can you verify that this fixes the problem? That function is only > > called once and its return value is passed up the chain, but the > documented > > return values of that calling function are -1, 0. So the documentation > needs > > to be changed if this is the right thing to do. > > Actually, that patch was wrong, since > linear_search_userloop_type_resolver needs to return three values > (error, not-found, success). A better patch follows. I can confirm > that this gets me further, but I get other failures down the line, so > more fixes may follow. I'll push the branch with all my fixes for > convenience once I have everything working. > > > Speaking of tests... I was wondering if you could be talked into putting > > together a simple user type for including in the tests? > > Yep, though likely not for a couple weeks. If there's interest, I > could also be convinced to sanitize my entire rational class so you > could include that directly. Currently it's both C++ and uses some > gcc specific features like __int128_t. Basically it's > numerator/denominator, where both are 64 bit integers, and an > OverflowError is thrown if anything can't be represented as such > (possibly a different exception would be better in cases like > (1<<64)/((1<<64)+1)). It would be easy to generalize it to rational32 > vs. rational64 as well. > > If you want tests but not rational, it would be straightforward to > strip what I have down to a bare bones test case. > > We'll see how much interest there is. If it becomes official you may get more feedback on features. There are some advantages to having some user types in numpy. One is that otherwise they tend to get lost, another is that having a working example or two provides a templates for others to work from, and finally they provide test material. Because official user types aren't assigned anywhere there might also be some conflicts. Maybe something like an extension types module would be a way around that. In any case, I think both rational numbers and quaternions would be useful to have and I hope there is some discussion of how to do that. Rationals may be a bit trickier than quaternions though, as usually they are used to provide exact arithmetic without concern for precision. I don't know how restrictive the 64 bit limitation will be in practice. What are you using them for? > As for the patch below, I wouldn't bother looking at it until I get > the rest of the bugs out of the way (whether they're in my code or > numpy). > > Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From teoliphant at gmail.com Sun Dec 4 23:32:33 2011 From: teoliphant at gmail.com (Travis Oliphant) Date: Sun, 4 Dec 2011 22:32:33 -0600 Subject: [Numpy-discussion] NumPy Governance In-Reply-To: <4EDB7293.90302@gmail.com> References: <4EDB7293.90302@gmail.com> Message-ID: Great points. My initial suggestion of 5-11 was more about current board size rather than trying to fix it. 
I agree that having someone representing the major downstream projects would be a great thing.

-Travis

On Dec 4, 2011, at 7:16 AM, Alan G Isaac wrote:

> Very true! But you might consider including on any board
> a developer or two from important projects that are very
> NumPy dependent. (E.g., Matplotlib.)
> [clip]

---
Travis Oliphant
Enthought, Inc.
oliphant at enthought.com
1-512-536-1057
http://www.enthought.com

From irving at naml.us  Mon Dec 5 02:37:29 2011
From: irving at naml.us (Geoffrey Irving)
Date: Sun, 4 Dec 2011 23:37:29 -0800
Subject: [Numpy-discussion] failure to register ufunc loops for user defined types
In-Reply-To:
References:
Message-ID:

On Sun, Dec 4, 2011 at 6:45 PM, Charles R Harris wrote:
?Currently it's both C++ and uses some >> gcc specific features like __int128_t. ?Basically it's >> numerator/denominator, where both are 64 bit integers, and an >> OverflowError is thrown if anything can't be represented as such >> (possibly a different exception would be better in cases like >> (1<<64)/((1<<64)+1)). ?It would be easy to generalize it to rational32 >> vs. rational64 as well. >> >> If you want tests but not rational, it would be straightforward to >> strip what I have down to a bare bones test case. >> > > We'll see how much interest there is. If it becomes official you may get > more feedback on features. There are some advantages to having some user > types in numpy. One is that otherwise they tend to get lost, another is that > having a working example or two provides a templates for others to work > from, and finally they provide test material. Because official user types > aren't assigned anywhere there might also be some conflicts. Maybe something > like an extension types module would be a way around that. In any case, I > think both rational numbers and quaternions would be useful to have and I > hope there is some discussion of how to do that. Rationals may be a bit > trickier than quaternions though, as usually they are used to provide exact > arithmetic without concern for precision. I don't know how restrictive the > 64 bit limitation will be in practice. What are you using them for? I'm using them for frivolous analysis of poker Nash equilibria. I'll let others decide if it has any non-toy uses. 64 bits seems to be enough for me, though it's possible that I'll run in trouble with other examples. It still exact, though, in the sense that it throws an exception rather than doing anything weird if it overflows. And it has the key advantage of being orders of magnitude faster than object arrays of Fractions. Back to the bugs: here's a branch with all the changes I needed to get rational arithmetic to work: https://github.com/girving/numpy I discovered two more after the last email. One is another simple 0 vs. 1 bug, and another is somewhat optional: commit 730b05a892371d6f18d9317e5ae6dc306c0211b0 Author: Geoffrey Irving Date: Sun Dec 4 20:03:46 2011 -0800 After loops, check for PyErr_Occurred() even if needs_api is 0 For certain types of user defined classes, casting and ufunc loops normally run without the Python API, but occasionally need to throw an error. Currently we assume that !needs_api means no error occur. However, the fastest way to implement such loops is to run without the GIL normally and use PyGILState_Ensure/Release if an error occurs. In order to support this usage pattern, change all post-loop checks from needs_api && PyErr_Occurred() to simply PyErr_Occurred() Geoffrey From thouis.jones at curie.fr Mon Dec 5 05:45:57 2011 From: thouis.jones at curie.fr (Thouis Jones) Date: Mon, 5 Dec 2011 11:45:57 +0100 Subject: [Numpy-discussion] numpy.array() of mixed integers and strings can truncate data In-Reply-To: References: Message-ID: On Fri, Dec 2, 2011 at 18:53, Charles R Harris wrote: > After sleeping on this, I think an object array in this situation would be > the better choice and wouldn't result in lost information. This might change > the behavior of > some functions though, so would need testing. 
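To give a feel for it, the core is just a struct plus overflow-checked
arithmetic. Here's a heavily trimmed sketch (the real class is C++, and
these C names are invented for this email, so don't take them
literally):

#include <stdint.h>

typedef struct {
    int64_t n;   /* numerator */
    int64_t d;   /* denominator, kept > 0 */
} rational;

static int64_t gcd64(int64_t a, int64_t b) {
    while (b) { int64_t t = b; b = a % b; a = t; }
    return a < 0 ? -a : a;   /* ignoring the INT64_MIN corner here */
}

/* Multiply via cross-reduction with a gcc __int128_t intermediate;
   a nonzero return means the result doesn't fit in 64 bits, and the
   Python-facing layer raises OverflowError. */
static int rational_multiply(rational a, rational b, rational *out) {
    int64_t g1 = gcd64(a.n, b.d);   /* >= 1 since denominators are > 0 */
    int64_t g2 = gcd64(b.n, a.d);
    __int128_t n = (__int128_t)(a.n / g1) * (b.n / g2);
    __int128_t d = (__int128_t)(a.d / g2) * (b.d / g1);
    if (n > INT64_MAX || n < INT64_MIN || d > INT64_MAX) {
        return -1;
    }
    out->n = (int64_t)n;
    out->d = (int64_t)d;
    return 0;
}

Addition is similar but messier, and all of it sits under the usual
dtype and ufunc registration boilerplate.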
Back to the bugs: here's a branch with all the changes I needed to get
rational arithmetic to work:

    https://github.com/girving/numpy

I discovered two more after the last email. One is another simple 0
vs. 1 bug, and another is somewhat optional:

commit 730b05a892371d6f18d9317e5ae6dc306c0211b0
Author: Geoffrey Irving
Date: Sun Dec 4 20:03:46 2011 -0800

    After loops, check for PyErr_Occurred() even if needs_api is 0

    For certain types of user defined classes, casting and ufunc loops
    normally run without the Python API, but occasionally need to throw
    an error. Currently we assume that !needs_api means no errors occur.
    However, the fastest way to implement such loops is to run without
    the GIL normally and use PyGILState_Ensure/Release if an error occurs.

    In order to support this usage pattern, change all post-loop checks from

        needs_api && PyErr_Occurred()

    to simply

        PyErr_Occurred()

Geoffrey

From thouis.jones at curie.fr Mon Dec 5 05:45:57 2011
From: thouis.jones at curie.fr (Thouis Jones)
Date: Mon, 5 Dec 2011 11:45:57 +0100
Subject: [Numpy-discussion] numpy.array() of mixed integers and strings can truncate data
In-Reply-To: References: Message-ID: 

On Fri, Dec 2, 2011 at 18:53, Charles R Harris wrote:
> After sleeping on this, I think an object array in this situation would
> be the better choice and wouldn't result in lost information. This
> might change the behavior of some functions though, so would need
> testing.

I tried to come up with a simple patch to achieve this, but I think
this is beyond me, particularly since I think something different has
to happen for these cases:

    np.array([1234, 'ab'])
    np.array([1234]).astype('|S2')

I tried a few things (changing the rules in PyArray_PromoteTypes(),
other places), but I think I'm more likely to break some corner case
than fix this cleanly.

I filed a ticket (#1990) and a pull request to add a test to the 1.6.x
maintenance branch, for someone more knowledgeable than me to address.
I tried to write the test so that either choosing dtype=object or
dtype= would both pass.

Ray Jones

From perry at stsci.edu Mon Dec 5 07:22:34 2011
From: perry at stsci.edu (Perry Greenfield)
Date: Mon, 5 Dec 2011 07:22:34 -0500
Subject: [Numpy-discussion] NumPy Governance
In-Reply-To: References: Message-ID: 

I'm not sure I'm crazy about leaving final decision making to a
board. A board may be a good way of carefully considering the issues,
and it could make its own recommendation (with a sufficient
majority). But in the end I think one person needs to decide (and that
decision may go against the board consensus, presumably only rarely).

Why shouldn't that person be you?

Perry

On Dec 4, 2011, at 11:32 PM, Travis Oliphant wrote:

> Great points. My initial suggestion of 5-11 was more about current
> board size rather than trying to fix it.
>
> I agree that having someone representing major downstream projects
> would be a great thing.
>
> -Travis
>
> [clip]

From charlesr.harris at gmail.com Mon Dec 5 09:59:01 2011
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Mon, 5 Dec 2011 07:59:01 -0700
Subject: [Numpy-discussion] failure to register ufunc loops for user defined types
In-Reply-To: References: Message-ID: 

Hi Geoffrey,

On Mon, Dec 5, 2011 at 12:37 AM, Geoffrey Irving wrote:
> [clip]
>
> I'm using them for frivolous analysis of poker Nash equilibria. I'll
> let others decide if it has any non-toy uses. 64 bits seems to be
> enough for me, though it's possible that I'll run into trouble with
> other examples. It's still exact, though, in the sense that it throws
> an exception rather than doing anything weird if it overflows. And it
> has the key advantage of being orders of magnitude faster than object
> arrays of Fractions.
>
> Back to the bugs: here's a branch with all the changes I needed to get
> rational arithmetic to work:
>
>     https://github.com/girving/numpy
>
> [clip]

Thanks. Could you put this work into a separate branch, say
fixuserloops, and enter a pull request? It's best not to work in master
and a pull request makes things easier to discuss and merge.

Chuck

From irving at naml.us Mon Dec 5 10:55:52 2011
From: irving at naml.us (Geoffrey Irving)
Date: Mon, 5 Dec 2011 07:55:52 -0800
Subject: [Numpy-discussion] failure to register ufunc loops for user defined types
In-Reply-To: References: Message-ID: 

On Mon, Dec 5, 2011 at 6:59 AM, Charles R Harris wrote:
> Hi Geoffrey,
>
> [clip]
>
> Thanks. Could you put this work into a separate branch, say
> fixuserloops, and enter a pull request? It's best not to work in master
> and a pull request makes things easier to discuss and merge.

Done: https://github.com/numpy/numpy/pull/175

Thanks,
Geoffrey

From cournape at gmail.com Mon Dec 5 11:58:29 2011
From: cournape at gmail.com (David Cournapeau)
Date: Mon, 5 Dec 2011 11:58:29 -0500
Subject: [Numpy-discussion] failure to register ufunc loops for user defined types
In-Reply-To: References: Message-ID: 

On Sun, Dec 4, 2011 at 9:45 PM, Charles R Harris wrote:
>
> We'll see how much interest there is. If it becomes official you may get
> more feedback on features. There are some advantages to having some user
> types in numpy. One is that otherwise they tend to get lost, another is
> that having a working example or two provides templates for others to
> work from, and finally they provide test material. Because official user
> types aren't assigned anywhere there might also be some conflicts. Maybe
> something like an extension types module would be a way around that. In
> any case, I think both rational numbers and quaternions would be useful
> to have and I hope there is some discussion of how to do that.

I agree that those will be useful, but I am worried about adding more
stuff in multiarray. User-types should really be separated from
multiarray. Ideally they should be plugins, but being separated from
multiarray would be a good first step.

I realize it is a bit unfair to have this ready for Geoffrey's code
changes, but depending on the timelines for the 2.0.0 milestone, I
think this would be a useful thing to have. Otherwise, if some ABI/API
changes are needed after 2.0, we will be dragged down with this for
years. I am willing to spend time on this. Geoffrey, does this sound
acceptable to you?

David

From mwwiebe at gmail.com Mon Dec 5 12:25:19 2011
From: mwwiebe at gmail.com (Mark Wiebe)
Date: Mon, 5 Dec 2011 09:25:19 -0800
Subject: [Numpy-discussion] failure to register ufunc loops for user defined types
In-Reply-To: References: Message-ID: 

On Sun, Dec 4, 2011 at 11:37 PM, Geoffrey Irving wrote:
> [clip]
>
> commit 730b05a892371d6f18d9317e5ae6dc306c0211b0
> Author: Geoffrey Irving
> Date: Sun Dec 4 20:03:46 2011 -0800
>
>     After loops, check for PyErr_Occurred() even if needs_api is 0
>
>     For certain types of user defined classes, casting and ufunc loops
>     normally run without the Python API, but occasionally need to throw
>     an error. Currently we assume that !needs_api means no errors occur.
>     However, the fastest way to implement such loops is to run without
>     the GIL normally and use PyGILState_Ensure/Release if an error occurs.
>
>     In order to support this usage pattern, change all post-loop checks from
>
>         needs_api && PyErr_Occurred()
>
>     to simply
>
>         PyErr_Occurred()

To support this properly, I think we would need to convert needs_api
into an enum with this hybrid mode as another case. While it isn't done
currently, I was imagining using a thread pool to multithread the
trivially data-parallel operations when needs_api is false, and I
suspect the PyGILState_Ensure/Release would trigger undefined behavior
in a thread created entirely outside of the Python system.
For comparison, I created a special mechanism for simplified
multi-threaded exceptions in the nditer in the 'errmsg' parameter:

http://docs.scipy.org/doc/numpy/reference/c-api.iterator.html#NpyIter_GetIterNext

Worth considering is also the fact that the PyGILState API is
incompatible with multiple embedded interpreters. Maybe that's not
something anyone does with NumPy, though.

-Mark

From bsouthey at gmail.com Mon Dec 5 12:25:24 2011
From: bsouthey at gmail.com (Bruce Southey)
Date: Mon, 05 Dec 2011 11:25:24 -0600
Subject: [Numpy-discussion] NumPy Governance
In-Reply-To: References: Message-ID: <4EDCFE84.9020608@gmail.com>

On 12/05/2011 06:22 AM, Perry Greenfield wrote:
> I'm not sure I'm crazy about leaving final decision making to a
> board. A board may be a good way of carefully considering the issues,
> and it could make its own recommendation (with a sufficient
> majority). But in the end I think one person needs to decide (and that
> decision may go against the board consensus, presumably only rarely).
>
> Why shouldn't that person be you?
>
> Perry
>
I have similar thoughts because I just do not see how a board would
work, especially given that anyone can be a 'core developer' because
the distributed aspect removes that 'entry barrier'.

I also think that there needs to be something formal like the Linux
Kernel Summit (see the excellent coverage by LWN.net;
http://lwn.net/Articles/KernelSummit2011/). I know that people get
together to talk at meetings or via invitation
(http://blog.fperez.org/2011/05/austin-trip-ipython-at-tacc-and.html).
This would provide a good opportunity to hash out concerns, introduce
new features and identify community needs that cannot be adequately
addressed via electronic communication. The datarray is a 'good'
example of how this could work except that it has not been pushed
upstream yet! (It would be an excellent example if it had been pushed
upstream :-) hint, hint.)

I also must disagree with the statement of Travis that "discussions
happen as they do now on the mailing list". This is simply not true
because the mailing lists, tickets and pull requests are not connected,
so these have their own discussion threads. Sure there are some nice
examples; Mark did tell us about this NA branch, but the actual merge
was still a surprise. So I think we need better communication on these,
such as emailing the list with a set 'public comment period' before
requests are merged (longer periods for major changes).

Bruce

> On Dec 4, 2011, at 11:32 PM, Travis Oliphant wrote:
> [clip]

From markflorisson88 at gmail.com Mon Dec 5 12:37:15 2011
From: markflorisson88 at gmail.com (mark florisson)
Date: Mon, 5 Dec 2011 17:37:15 +0000
Subject: [Numpy-discussion] failure to register ufunc loops for user defined types
In-Reply-To: References: Message-ID: 

On 5 December 2011 17:25, Mark Wiebe wrote:
> On Sun, Dec 4, 2011 at 11:37 PM, Geoffrey Irving wrote:
>> [clip]
>
> To support this properly, I think we would need to convert needs_api
> into an enum with this hybrid mode as another case. While it isn't done
> currently, I was imagining using a thread pool to multithread the
> trivially data-parallel operations when needs_api is false, and I
> suspect the PyGILState_Ensure/Release would trigger undefined behavior
> in a thread created entirely outside of the Python system.

PyGILState_Ensure/Release can be safely used by non-python threads
with the only requirement that the GIL has been initialized previously
in the main thread (PyEval_InitThreads).

> For comparison, I created a special mechanism for simplified
> multi-threaded exceptions in the nditer in the 'errmsg' parameter:
>
> http://docs.scipy.org/doc/numpy/reference/c-api.iterator.html#NpyIter_GetIterNext
>
> Worth considering is also the fact that the PyGILState API is
> incompatible with multiple embedded interpreters. Maybe that's not
> something anyone does with NumPy, though.
>
> -Mark

From mwwiebe at gmail.com Mon Dec 5 12:39:44 2011
From: mwwiebe at gmail.com (Mark Wiebe)
Date: Mon, 5 Dec 2011 09:39:44 -0800
Subject: [Numpy-discussion] failure to register ufunc loops for user defined types
In-Reply-To: References: Message-ID: 

On Mon, Dec 5, 2011 at 8:58 AM, David Cournapeau wrote:
> I agree that those will be useful, but I am worried about adding more
> stuff in multiarray. User-types should really be separated from
> multiarray. Ideally they should be plugins, but being separated from
> multiarray would be a good first step.

I think the object and datetime dtypes should also be moved out of the
core multiarray module at some point. The user-type mechanism could be
improved a lot based on Martin's feedback after he did the quaternion
implementation, and needs further expansion to be able to support
object and datetime arrays as currently implemented.

> I realize it is a bit unfair to have this ready for Geoffrey's code
> changes, but depending on the timelines for the 2.0.0 milestone, I
> think this would be a useful thing to have. Otherwise, if some ABI/API
> changes are needed after 2.0, we will be dragged down with this for
> years. I am willing to spend time on this. Geoffrey, does this sound
> acceptable to you?

A rational type could be added without breaking the ABI, in the same
way it was done for datetime and half in 1.6. I think the revamp of the
user-type mechanism needs its own NEP design document, because changing
it will be a very delicate operation in dealing with how it interacts
with the NumPy core, and making it much more programmer-friendly will
take a fair number of design iterations.
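To make the scale of that concrete, here is roughly the boilerplate a
user type goes through today (a sketch only -- error handling, casting
registration, and most slots are omitted, and the rational_* helpers
and PyRational_Type are hypothetical names):

static PyArray_ArrFuncs rational_arrfuncs;
static PyArray_Descr rational_descr;   /* PyObject head set up elsewhere */
static int npy_rational;

static int
register_rational(PyObject *add_ufunc)
{
    PyArray_InitArrFuncs(&rational_arrfuncs);
    rational_arrfuncs.getitem  = rational_getitem;
    rational_arrfuncs.setitem  = rational_setitem;
    rational_arrfuncs.copyswap = rational_copyswap;

    rational_descr.typeobj   = &PyRational_Type;   /* scalar type */
    rational_descr.kind      = 'V';
    rational_descr.type      = 'r';
    rational_descr.byteorder = '=';
    rational_descr.elsize    = sizeof(rational);
    rational_descr.alignment = sizeof(npy_int64);
    rational_descr.f         = &rational_arrfuncs;

    npy_rational = PyArray_RegisterDataType(&rational_descr);
    if (npy_rational < 0) {
        return -1;
    }

    /* and then one of these per ufunc */
    int types[3] = {npy_rational, npy_rational, npy_rational};
    return PyUFunc_RegisterLoopForType((PyUFuncObject *)add_ufunc,
                                       npy_rational, rational_add_loop,
                                       types, NULL);
}

Every piece of that has sharp edges at the moment, which is part of why
I think the design work deserves its own document.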
-Mark

From bsouthey at gmail.com Mon Dec 5 12:46:34 2011
From: bsouthey at gmail.com (Bruce Southey)
Date: Mon, 05 Dec 2011 11:46:34 -0600
Subject: [Numpy-discussion] astype does not work with NA object
Message-ID: <4EDD037A.6030603@gmail.com>

Hi,
I mistakenly filed ticket 1973 "Can not display a masked array
containing np.NA values even if masked"
(http://projects.scipy.org/numpy/ticket/1973) against masked array
because that was where I found it. But the actual error is that the
astype function does not handle the NA object:

$ python
Python 2.7 (r27:82500, Sep 16 2010, 18:02:00)
[GCC 4.5.1 20100907 (Red Hat 4.5.1-3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> np.__version__
'2.0.0.dev-059334c'
>>> np.array([1,2,3,4]).astype(float)
array([ 1.,  2.,  3.,  4.])
>>> np.array([1,2,3,np.NA]).astype(float)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Cannot assign NA to an array which does not support NAs
>>> a=np.array([1,2,3,4], maskna=True)
>>> a[3]=np.NA
>>> a
array([1, 2, 3, NA])
>>> a.astype(float)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Cannot assign NA to an array which does not support NAs
>>> a*1.0
array([ 1.,  2.,  3., NA])

Bruce

From mwwiebe at gmail.com Mon Dec 5 12:48:34 2011
From: mwwiebe at gmail.com (Mark Wiebe)
Date: Mon, 5 Dec 2011 09:48:34 -0800
Subject: [Numpy-discussion] failure to register ufunc loops for user defined types
In-Reply-To: References: Message-ID: 

On Mon, Dec 5, 2011 at 9:37 AM, mark florisson wrote:
> [clip]
>
> PyGILState_Ensure/Release can be safely used by non-python threads
> with the only requirement that the GIL has been initialized previously
> in the main thread (PyEval_InitThreads).

Is there a way this could efficiently be used to propagate any errors
back to the main thread, for example using TBB as the thread pool? The
innermost task code which calls the inner loop can't call
PyErr_Occurred() without first calling PyGILState_Ensure itself, which
would kill utilization.

Maybe this is an ABI problem in NumPy that needs to be fixed, to
mandate that inner loops always return an error code and disallow them
from setting the Python exception state without returning failure.
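As a sketch of what I mean (the _v2 name is made up; this isn't a
worked-out proposal), the inner loop signature would gain a return
code:

/* return 0 on success, -1 on failure, with the error either set under
   the GIL or recorded for the main thread to raise after the loops
   finish */
typedef int (*PyUFuncGenericFunction_v2)(char **args,
                                         npy_intp *dimensions,
                                         npy_intp *steps,
                                         void *innerloopdata);

Then a worker thread only has to check an int, and no Python API is
needed just to find out whether an error happened.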
-Mark

From markflorisson88 at gmail.com Mon Dec 5 12:57:50 2011
From: markflorisson88 at gmail.com (mark florisson)
Date: Mon, 5 Dec 2011 17:57:50 +0000
Subject: [Numpy-discussion] failure to register ufunc loops for user defined types
In-Reply-To: References: Message-ID: 

On 5 December 2011 17:48, Mark Wiebe wrote:
> On Mon, Dec 5, 2011 at 9:37 AM, mark florisson wrote:
>> [clip]
>>
>> PyGILState_Ensure/Release can be safely used by non-python threads
>> with the only requirement that the GIL has been initialized previously
>> in the main thread (PyEval_InitThreads).
>
> Is there a way this could efficiently be used to propagate any errors
> back to the main thread, for example using TBB as the thread pool?
> The innermost task code which calls the inner loop can't call
> PyErr_Occurred() without first calling PyGILState_Ensure itself, which
> would kill utilization.

No, there is no way these things can be efficient, as the GIL is
likely contended anyway (I wasn't making a point for these functions,
just wanted to clarify). There is in fact the additional problem that
PyGILState_Ensure would initialize a threadstate, you set an
exception, and when you call PyGILState_Release the threadstate gets
deleted along with the exception, before you will even have a chance
to check with PyErr_Occurred().

For cython.parallel I worked around this by calling PyGILState_Ensure
(to initialize the thread state), followed immediately by
Py_BEGIN_ALLOW_THREADS before starting any work. You then have to
fetch the exception and restore it in another thread when you want to
propagate it. It's a total mess, it's inefficient, and if you can avoid
it you should.
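In code the dance looks roughly like this (a sketch, not the actual
cython.parallel implementation; do_work is a hypothetical GIL-free
computation, and the caller initializes the three exception slots to
NULL):

#include <Python.h>

/* run in a non-Python thread; *exc_* receive any raised exception */
static void
worker(PyObject **exc_type, PyObject **exc_value, PyObject **exc_tb)
{
    int failed;
    /* give this thread a threadstate up front... */
    PyGILState_STATE outer = PyGILState_Ensure();
    Py_BEGIN_ALLOW_THREADS            /* ...and release the GIL again */

    failed = do_work();               /* GIL-free computation */

    if (failed) {
        /* nested Ensure: the threadstate already exists, so it
           survives the matching Release below */
        PyGILState_STATE inner = PyGILState_Ensure();
        PyErr_SetString(PyExc_RuntimeError, "worker failed");
        /* move the exception out of the threadstate before it dies */
        PyErr_Fetch(exc_type, exc_value, exc_tb);
        PyGILState_Release(inner);
    }

    Py_END_ALLOW_THREADS
    PyGILState_Release(outer);        /* threadstate deleted here */
}

/* whichever thread wants to propagate it later calls
   PyErr_Restore(type, value, tb) with the fetched objects */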
> Maybe this is an ABI problem in NumPy that needs to be fixed, to
> mandate that inner loops always return an error code and disallow them
> from setting the Python exception state without returning failure.

That would likely be the best thing.

> -Mark

From markflorisson88 at gmail.com Mon Dec 5 12:59:49 2011
From: markflorisson88 at gmail.com (mark florisson)
Date: Mon, 5 Dec 2011 17:59:49 +0000
Subject: [Numpy-discussion] failure to register ufunc loops for user defined types
In-Reply-To: References: Message-ID: 

On 5 December 2011 17:57, mark florisson wrote:
> On 5 December 2011 17:48, Mark Wiebe wrote:
>> [clip]
>>
>> Is there a way this could efficiently be used to propagate any errors
>> back to the main thread, for example using TBB as the thread pool? The
>> innermost task code which calls the inner loop can't call
>> PyErr_Occurred() without first calling PyGILState_Ensure itself, which
>> would kill utilization.
>
> No, there is no way these things can be efficient, as the GIL is
> likely contended anyway (I wasn't making a point for these functions,
> just wanted to clarify). There is in fact the additional problem that
> PyGILState_Ensure would initialize a threadstate, you set an
> exception, and when you call PyGILState_Release the threadstate gets
> deleted along with the exception, before you will even have a chance
> to check with PyErr_Occurred().

To clarify, this case will only happen if you're doing this from a
non-Python thread that doesn't have a threadstate to begin with.

> For cython.parallel I worked around this by calling PyGILState_Ensure
> (to initialize the thread state), followed immediately by
> Py_BEGIN_ALLOW_THREADS before starting any work. You then have to
> fetch the exception and restore it in another thread when you want to
> propagate it. It's a total mess, it's inefficient, and if you can avoid
> it you should.
>
>> Maybe this is an ABI problem in NumPy that needs to be fixed, to
>> mandate that inner loops always return an error code and disallow them
>> from setting the Python exception state without returning failure.
>
> That would likely be the best thing.
From mwwiebe at gmail.com Mon Dec 5 13:06:31 2011
From: mwwiebe at gmail.com (Mark Wiebe)
Date: Mon, 5 Dec 2011 10:06:31 -0800
Subject: [Numpy-discussion] NumPy Governance
In-Reply-To: References: Message-ID: 

On Sat, Dec 3, 2011 at 6:18 PM, Travis Oliphant wrote:
>
> Hi everyone,
>
> There have been some wonderfully vigorous discussions over the past few
> months that have made it clear that we need some clarity about how
> decisions will be made in the NumPy community.
>
> When we were a smaller bunch of people it seemed easier to come to an
> agreement and things pretty much evolved based on (mostly) consensus
> and who was available to actually do the work.
>
> There is a need for a more clear structure so that we know how decisions
> will get made and so that code can move forward while paying attention
> to the current user-base. There has been a "steering committee"
> structure for SciPy in the past, and I have certainly been prone to lump
> both NumPy and SciPy together given that I have a strong interest in and
> have spent a great amount of time working on both projects. Others have
> also spent time on both projects.
>
> However, I think it is critical at this stage to clearly separate the
> projects and define a governing structure that is fair and agreeable for
> NumPy. SciPy has multiple modules and will probably need structure
> around each module independently. For now, I wanted to open up a
> discussion to see what people thought about NumPy's governance.
>
> My initial thoughts:
>
>     * discussions happen as they do now on the mailing list
>     * a small group of developers (5-11) constitute the "board" and
> major decisions are made by vote of that group (not just simple majority
> --- needs at least 2/3 +1 votes).
>     * votes are +1/+0/-0/-1
>     * if a topic is difficult to resolve it is moved off the main list
> and discussed on a separate "board" mailing list --- these should be
> rare, but parts of the NA discussion would probably qualify
>     * This board mailing list is publicly viewable but only board
> members may post.
>     * The board is renewed and adjusted each year --- based on
> nomination and 2/3 vote of the current board until board is at 11.
>     * The chairman of the board is voted by a majority of the board and
> has veto power unless overridden by 3/4 of the board.
>     * Petitions to remove people off the board can be made by 50+
> independent reverse nominations (hopefully people will just withdraw if
> they are no longer active).
>
> All of these points are open for discussion. I just thought I would
> start the conversation. I will be much more active this next year with
> NumPy and will be very interested in the direction NumPy is taking.
> I'm hoping to discern by this conversation, who else is very interested
> in the direction of NumPy so that the first board can be formally
> constituted.

I'm definitely in support of something along these lines. My experience
entering NumPy development was that the development process, coding
standards, and other aspects of the process are not very well specified,
and people have many differing ideas about what has already been agreed
upon. I would recommend that fixing this state of affairs be placed high
on the agenda of the board, with the goal of making it easier to attract
new developers.

A few people have proposed the BDFL approach, as in CPython development.
In practice, I believe Guido has done very well in the role because he
only uses the power as a last resort. Even if NumPy adopts a similar
approach, having a board along the lines Travis proposes would still be
a good thing, and having a BDFL would just mean that there's someone who
could override the will of the board and make an entirely different
choice.

It may be worth considering how the governance structure is related to
the different levels of the NumPy codebase. There is a (very) small
group of people who have contributed significant amounts of C code, a
larger group of people who have contributed significant amounts of
Python code, many people who have contributed small C and/or Python
patches, and a large number of people who contribute bug reports, email
list comments, etc. It may be worth designing the board taking into
account these different groups of developers and users.

-Mark

> Best regards,
>
> -Travis

From ben.root at ou.edu Mon Dec 5 14:43:19 2011
From: ben.root at ou.edu (Benjamin Root)
Date: Mon, 5 Dec 2011 13:43:19 -0600
Subject: [Numpy-discussion] NumPy Governance
In-Reply-To: References: Message-ID: 

On Mon, Dec 5, 2011 at 12:06 PM, Mark Wiebe wrote:
> [clip]

Just some thoughts I have from this discussion.

1. I think that we need to encourage and entice more NumPy
developers/contributors. Having a board of only a few core developers
puts us right back in the same boat we were in during the whole NA
discussion, only more codified. Increasing the size of the board with
more core developers would diversify thought and counteract
"group-think". I think that this problem needs to be solved before
anything else.
Travis, I am currently in the process of getting hired by a company that
has really embraced the Python/SciPy software stack, but unfortunately
has not fully embraced OSS community development. I wonder if Enthought
could produce (or maybe already has) teaching material showing other
companies how to incorporate community development into their business
models (in particular development of the SciPy stack)? Maybe we can get
some more commitments of resources from other companies?

2. Nothing against Travis, but I am somewhat wary of declaring Travis
as a BDFL while he is the head of Enthought. I think Travis has done an
excellent job of respecting the hierarchy of development (where EPD is
downstream of the OSS SciPy stack). Having Travis as BDFL might, IMHO,
create possible conflicts of interest in the future. Again, I am not
saying that Travis would act against the interest of the SciPy stack,
but rather, I would like to avoid putting Travis in a position where
such decisions could become tempting.

3. I definitely +1 the idea of including representatives of other
projects on the board. Each project could have their own process for
selecting a member to represent them.

4. I agree with Charles that a separate list would probably be
detrimental. I also agree with Bruce that since the move to github, we
have had things become fragmented, and I think that needs to be
re-unified. matplotlib is experiencing the same problem, and I have to
wonder if some other groups have already come up with a solution to
this. Maybe make a pseudo github account with the mailing list address
as its email address?

Just my 2 cents for now.

Ben Root

From ralf.gommers at googlemail.com Mon Dec 5 14:43:57 2011
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Mon, 5 Dec 2011 20:43:57 +0100
Subject: [Numpy-discussion] numpy 1.7.0 release?
Message-ID: 

Hi all,

It's been a little over 6 months since the release of 1.6.0 and the NA
debate has quieted down, so I'd like to ask your opinion on the timing
of 1.7.0. It looks to me like we have a healthy amount of bug fixes and
small improvements, plus three larger chunks of work:

- datetime
- NA
- Bento support

My impression is that both datetime and NA are releasable, but should
be labeled "tech preview" or something similar, because they may still
see significant changes. Please correct me if I'm wrong.

There's still some maintenance work to do and pull requests to merge,
but a beta release by Christmas should be feasible. What do you all
think?

Cheers,
Ralf

From cournape at gmail.com Mon Dec 5 15:07:09 2011
From: cournape at gmail.com (David Cournapeau)
Date: Mon, 5 Dec 2011 15:07:09 -0500
Subject: [Numpy-discussion] failure to register ufunc loops for user defined types
In-Reply-To: References: Message-ID: 

On Mon, Dec 5, 2011 at 12:39 PM, Mark Wiebe wrote:
> On Mon, Dec 5, 2011 at 8:58 AM, David Cournapeau wrote:
>> On Sun, Dec 4, 2011 at 9:45 PM, Charles R Harris wrote:
>> > [clip]
>> > having a working example or two provides templates for others to work
>> > from, and finally they provide test material.
Because official user >> > types >> > aren't assigned anywhere there might also be some conflicts. Maybe >> > something >> > like an extension types module would be a way around that. In any case, >> > I >> > think both rational numbers and quaternions would be useful to have and >> > I >> > hope there is some discussion of how to do that. >> >> I agree that those will be useful, but I am worried about adding more >> stuff in multiarray. User-types should really be separated from >> multiarray. Ideally, they should be plugins but separated from >> multiarray would be a good first step. > > > I think the object and datetime dtypes should also be moved out of the core > multiarray module at some point. Indeed. > The user-type mechanism could be improved a > lot based on Martin's feedback after he did the quaternion implementation, > and needs further expansion to be able to support object and datetime arrays > as currently implemented. > >> I realize it is a bit unfair to have this ready for Geoffray's code >> changes, but depending on the timelines for the 2.0.0 milestone, I >> think this would be a useful thing to have. Otherwise, if some ABI/API >> changes are needed after 2.0, we will be dragged down with this for >> years. I am willing to spend time on this. Geoffray, does this sound >> acceptable to you ? > > > A rational type could be added without breaking the ABI, in the same way it > was done for datetime and half in 1.6. I think the revamp of the user-type > mechanism needs its own NEP design document, because changing it will be a > very delicate operation in dealing with how it interacts with the NumPy > core, and making it much more programmer-friendly will take a fair number of > design iterations. I am not worried about breaking the ABI when adding it, but rather about the issues once we remove it to put it somewhere else. In that sense, adding it for 1.7 is not much of an issue, but having it in 2.x concerns me more. How difficult do you think it would be to separate it at least at the API level (i.e. it would still be in multiarray.so/ufunc.so, but as clearly separate as possible) ? A few days of work or more ? David From oliphant at enthought.com Mon Dec 5 15:08:58 2011 From: oliphant at enthought.com (Travis Oliphant) Date: Mon, 5 Dec 2011 15:08:58 -0500 Subject: [Numpy-discussion] numpy 1.7.0 release? In-Reply-To: References: Message-ID: <8ED90085-E4EF-4144-A418-37C495DC26BB@enthought.com> I like the idea. Is there resolution to the NA question? -- Travis Oliphant (on a mobile) 512-826-7480 On Dec 5, 2011, at 2:43 PM, Ralf Gommers wrote: > Hi all, > > It's been a little over 6 months since the release of 1.6.0 and the NA debate has quieted down, so I'd like to ask your opinion on the timing of 1.7.0. It looks to me like we have a healthy amount of bug fixes and small improvements, plus three larger chunks of work: > > - datetime > - NA > - Bento support > > My impression is that both datetime and NA are releasable, but should be labeled "tech preview" or something similar, because they may still see significant changes. Please correct me if I'm wrong. > > There's still some maintenance work to do and pull requests to merge, but a beta release by Christmas should be feasible. What do you all think? 
> > Cheers, > Ralf > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From charlesr.harris at gmail.com Mon Dec 5 15:10:38 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 5 Dec 2011 13:10:38 -0700 Subject: [Numpy-discussion] NumPy Governance In-Reply-To: References: Message-ID: On Mon, Dec 5, 2011 at 12:43 PM, Benjamin Root wrote: > On Mon, Dec 5, 2011 at 12:06 PM, Mark Wiebe wrote: > >> On Sat, Dec 3, 2011 at 6:18 PM, Travis Oliphant wrote: >> >>> >>> Hi everyone, >>> >>> There have been some wonderfully vigorous discussions over the past few >>> months that have made it clear that we need some clarity about how >>> decisions will be made in the NumPy community. >>> >>> When we were a smaller bunch of people it seemed easier to come to an >>> agreement and things pretty much evolved based on (mostly) consensus and >>> who was available to actually do the work. >>> >>> There is a need for a more clear structure so that we know how decisions >>> will get made and so that code can move forward while paying attention to >>> the current user-base. There has been a "steering committee" structure >>> for SciPy in the past, and I have certainly been prone to lump both NumPy >>> and SciPy together given that I have a strong interest in and have spent a >>> great amount of time working on both projects. Others have also spent >>> time on both projects. >>> >>> However, I think it is critical at this stage to clearly separate the >>> projects and define a governing structure that is fair and agreeable for >>> NumPy. SciPy has multiple modules and will probably need structure around >>> each module independently. For now, I wanted to open up a discussion to >>> see what people thought about NumPy's governance. >>> >>> My initial thoughts: >>> >>> * discussions happen as they do now on the mailing list >>> * a small group of developers (5-11) constitute the "board" and >>> major decisions are made by vote of that group (not just simple majority >>> --- needs at least 2/3 +1 votes). >>> * votes are +1/+0/-0/-1 >>> * if a topic is difficult to resolve it is moved off the main >>> list and discussed on a separate "board" mailing list --- these should be >>> rare, but parts of the NA discussion would probably qualify >>> * This board mailing list is "publically" viewable but only board >>> members may post. >>> * The board is renewed and adjusted each year --- based on >>> nomination and 2/3 vote of the current board until board is at 11. >>> * The chairman of the board is voted by a majority of the board >>> and has veto power unless over-ridden by 3/4 of the board. >>> * Petitions to remove people off the board can be made by 50+ >>> independent reverse nominations (hopefully people will just withdraw if >>> they are no longer active). >>> >>> All of these points are open for discussion. I just thought I would >>> start the conversation. I will be much more active this next year with >>> NumPy and will be very interested in the direction NumPy is taking. I'm >>> hoping to discern by this conversation, who else is very interested in the >>> direction of NumPy so that the first board can be formally constituted. >>> >> >> I'm definitely in support of something along these lines. 
My experience >> entering NumPy development was that the development process, coding >> standards, and other aspects of the process are not very well specified, >> and people have many differing ideas about what has already been agreed >> upon. I would recommend that fixing this state of affairs be placed high on >> the agenda of the board, with the goal of making it easier to attract new >> developers. >> >> A few people have proposed the BDFL approach, as in CPython development. >> In practice, I believe Guido has done very well in the role because he only >> uses the power as a last resort. Even if NumPy adopts a similar approach, >> having a board along the lines Travis proposes would still be a good thing, >> and having a BDFL would just mean that there's someone who could override >> the will of the board and make an entirely different choice. >> >> It may be worth considering how the governance structure is related to >> the different levels of the NumPy codebase. There is a (very) small group >> of people who have contributed significant amounts of C code, a larger >> group of people who have contributed significant amounts of Python code, >> many people who have contributed small C and/or Python patches, and a large >> number of people who contribute bug reports, email list comments, etc. It >> may be worth designing the board taking into account these different groups >> of developers and users. >> >> -Mark >> >> >>> >>> Best regards, >>> >>> -Travis >>> >>> > Just some thoughts I have from this discussion. > > 1. I think that we need to encourage and entice more NumPy > developers/contributors. Having a board of only a few core developers puts > us right back in the same boat we were in during the whole NA discussion, > only more codified. Increasing the size of the board with more core > developers would diversify thought and counter-act "group-think". I think > that this problem needs to be solved before anything else. > > Well, that's a tough one. Numpy development tends to attract folks with spare time, i.e., students*, and those with an itch to scratch. Itched scratched, degree obtained, they go back to their primary interest or on to jobs and the rest of life. So developers come and go and a board needs to be pretty flexible to reflect that. We can't wave around wads of cash or stock options to attract new people. Mark does a good job of pointing out some of the barriers to entry, and we could try to lower those. Indeed, things are much better than they were but more can be done. Chuck * I expect students think they are overworked, and sometimes they are, but they also tend to be young and energetic with flexible schedules and few outside commitments. -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Dec 5 15:13:39 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 5 Dec 2011 13:13:39 -0700 Subject: [Numpy-discussion] numpy 1.7.0 release? In-Reply-To: <8ED90085-E4EF-4144-A418-37C495DC26BB@enthought.com> References: <8ED90085-E4EF-4144-A418-37C495DC26BB@enthought.com> Message-ID: On Mon, Dec 5, 2011 at 1:08 PM, Travis Oliphant wrote: > I like the idea. Is there resolution to the NA question? > No, people still disagree and are likely to do so for years to come with no end in sight. That's why the preview label. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ralf.gommers at googlemail.com Mon Dec 5 15:39:20 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Mon, 5 Dec 2011 21:39:20 +0100 Subject: [Numpy-discussion] numpy 1.7.0 release? In-Reply-To: References: <8ED90085-E4EF-4144-A418-37C495DC26BB@enthought.com> Message-ID: On Mon, Dec 5, 2011 at 9:13 PM, Charles R Harris wrote: > > > On Mon, Dec 5, 2011 at 1:08 PM, Travis Oliphant wrote: > >> I like the idea. Is there resolution to the NA question? >> > > No, people still disagree and are likely to do so for years to come with > no end in sight. That's why the preview label. > > Agreed that it's not resolved, but I think we at least got to the point where we agreed not to back out the complete missing data additions. So if we clearly say that we keep all options for future API changes open (=preview label), I don't think that the issue should hold up a numpy release indefinitely. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From rogerb at rogerbinns.com Mon Dec 5 16:28:06 2011 From: rogerb at rogerbinns.com (Roger Binns) Date: Mon, 05 Dec 2011 13:28:06 -0800 Subject: [Numpy-discussion] What does fftn take as parameters? Message-ID: <4EDD3766.1010902@rogerbinns.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 (Note I'm a programmer type, not a math type and am doing coding directed by a matlab user.) I'm trying to do an fft on multiple columns of data at once (ultimately feeding into a correlation calculation). I can use fft() to work on one column: data=[23, 43, 53, 54, 0, 10] powtwo=8 # nearest power of two size numpy.fft.fft(data, powtwo) I want to do that but using fftn (the matlab user said it is the right function) but I can't work out from the docs or experimentation how the input data should be formatted. eg is it row major or column major. For example the above could be: data=[ [23, 43, 53, 54, 0, 10] ] or data=[ [23], [43], [53], [54], [0], [10] ] All the examples in the docs use square inputs (ie x and y axes are the same length) so that doesn't help. The documentation shows examples of the output, but not the input. I found code passing in a single int (not a list of int) as the s parameter, but that also gives me an error. Roger -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iEYEARECAAYFAk7dN2YACgkQmOOfHg372QQ4YQCg4sKmtx8UAoEOuosWzUofw/KZ B5AAoKeHzP8HgpvDrXDANj0wqll5L9MO =iRAX -----END PGP SIGNATURE----- From Tim.Burgess at noaa.gov Mon Dec 5 16:42:01 2011 From: Tim.Burgess at noaa.gov (Tim Burgess) Date: Tue, 06 Dec 2011 07:42:01 +1000 Subject: [Numpy-discussion] numpy 1.7.0 release In-Reply-To: References: Message-ID: <5669DEDA-AC9D-4BE1-AC31-4587A04F76D5@noaa.gov> > > On Mon, Dec 5, 2011 at 9:13 PM, Charles R Harris > wrote: > >> >> >> On Mon, Dec 5, 2011 at 1:08 PM, Travis Oliphant wrote: >> >>> I like the idea. Is there resolution to the NA question? >>> >> >> No, people still disagree and are likely to do so for years to come with >> no end in sight. That's why the preview label. >> >> Agreed that it's not resolved, but I think we at least got to the point > where we agreed not to back out the complete missing data additions. So if > we clearly say that we keep all options for future API changes open > (=preview label), I don't think that the issue should hold up a numpy > release indefinitely. > > Ralf I think a release is a good idea. 
In addition to the previous points mentioned, having NA in as a preview in a 1.7.0 release will likely raise its visibility - a lot of people will read release notes of a newer version but won't ever track discussions in a mailing list. Tim Burgess Software Engineer - Coral Reef Watch Satellite Applications and Research - NESDIS National Oceanic and Atmospheric Administration From cournape at gmail.com Mon Dec 5 17:19:28 2011 From: cournape at gmail.com (David Cournapeau) Date: Mon, 5 Dec 2011 17:19:28 -0500 Subject: [Numpy-discussion] What does fftn take as parameters? In-Reply-To: <4EDD3766.1010902@rogerbinns.com> References: <4EDD3766.1010902@rogerbinns.com> Message-ID: On Mon, Dec 5, 2011 at 4:28 PM, Roger Binns wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > (Note I'm a programmer type, not a math type and am doing coding directed > by a matlab user.) > > I'm trying to do an fft on multiple columns of data at once (ultimately > feeding into a correlation calculation). I can use fft() to work on one > column: > > data=[23, 43, 53, 54, 0, 10] > powtwo=8 # nearest power of two size > numpy.fft.fft(data, powtwo) I am not sure I understand what you are trying to do? numpy.fft.fft will compute fft on every *row*, or every column if you pass the axis=0 argument: numpy.fft.fft(data, 8, axis=0) # conceptually equivalent to the following for i in range(data.shape[1]): numpy.fft.fft(data[:, i], 8) # apply fft to each column separately fftn is for multi-dimensional fft, which is something other than doing an fft on every column, but this is true in matlab as well. cheers, David From questions.anon at gmail.com Mon Dec 5 17:29:43 2011 From: questions.anon at gmail.com (questions anon) Date: Tue, 6 Dec 2011 09:29:43 +1100 Subject: [Numpy-discussion] ignore NAN in numpy.true_divide() In-Reply-To: References: Message-ID: Maybe I am asking the wrong question or could go about this another way. I have thousands of numpy arrays to flick through, could I just identify which arrays have NAN's and for now ignore the entire array. Is there a simple way to do this? any feedback will be greatly appreciated. On Thu, Dec 1, 2011 at 12:16 PM, questions anon wrote: > I am trying to calculate the mean across many netcdf files. I cannot use > numpy.mean because there are too many files to concatenate and I end up > with a memory error. I have enabled the below code to do what I need but I > have a few nan values in some of my arrays. Is there a way to ignore these > somewhere in my code. I seem to face this problem often so I would love a > command that ignores blanks in my array before I continue on to the next > processing step. > Any feedback is greatly appreciated. > > > netCDF_list=[] > for dir in glob.glob(MainFolder + '*/01/')+ glob.glob(MainFolder + > '*/02/')+ glob.glob(MainFolder + '*/12/'): > for ncfile in glob.glob(dir + '*.nc'): > netCDF_list.append(ncfile) > > slice_counter=0 > print netCDF_list > > for filename in netCDF_list: > ncfile=netCDF4.Dataset(filename) > TSFC=ncfile.variables['T_SFC'][:] > fillvalue=ncfile.variables['T_SFC']._FillValue > TSFC=MA.masked_values(TSFC, fillvalue) > for i in xrange(0,len(TSFC)-1,1): > slice_counter +=1 > #print slice_counter > try: > running_sum=N.add(running_sum, TSFC[i]) > except NameError: > print "Initiating the running total of my > variable..." 
> running_sum=N.array(TSFC[i]) > > TSFC_avg=N.true_divide(running_sum, slice_counter) > N.set_printoptions(threshold='nan') > print "the TSFC_avg is:", TSFC_avg > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Mon Dec 5 17:45:10 2011 From: cournape at gmail.com (David Cournapeau) Date: Mon, 5 Dec 2011 17:45:10 -0500 Subject: [Numpy-discussion] ignore NAN in numpy.true_divide() In-Reply-To: References: Message-ID: On Mon, Dec 5, 2011 at 5:29 PM, questions anon wrote: > Maybe I am asking the wrong question or could go about this another way. > I have thousands of numpy arrays to flick through, could I just identify > which arrays have NAN's and for now ignore the entire array. is there a > simple way to do this? Doing np.any(np.isnan(a)) for an array a should answer this exact question David From xabart at gmail.com Mon Dec 5 17:50:40 2011 From: xabart at gmail.com (Xavier Barthelemy) Date: Tue, 6 Dec 2011 09:50:40 +1100 Subject: [Numpy-discussion] ignore NAN in numpy.true_divide() In-Reply-To: References: Message-ID: Hi, I don't know if it is the best choice, but this is what I do in my code: for each slice: indexnonNaN=np.isfinite(SliceOf Toto) SliceOf TotoWithoutNan= SliceOf Toto [indexnonNaN] and then perform all operation I want o on the last array. i hope it does answer your question Xavier 2011/12/6 questions anon > Maybe I am asking the wrong question or could go about this another way. > I have thousands of numpy arrays to flick through, could I just identify > which arrays have NAN's and for now ignore the entire array. is there a > simple way to do this? > any feedback will be greatly appreciated. > > On Thu, Dec 1, 2011 at 12:16 PM, questions anon wrote: > >> I am trying to calculate the mean across many netcdf files. I cannot use >> numpy.mean because there are too many files to concatenate and I end up >> with a memory error. I have enabled the below code to do what I need but I >> have a few nan values in some of my arrays. Is there a way to ignore these >> somewhere in my code. I seem to face this problem often so I would love a >> command that ignores blanks in my array before I continue on to the next >> processing step. >> Any feedback is greatly appreciated. >> >> >> netCDF_list=[] >> for dir in glob.glob(MainFolder + '*/01/')+ glob.glob(MainFolder + >> '*/02/')+ glob.glob(MainFolder + '*/12/'): >> for ncfile in glob.glob(dir + '*.nc'): >> netCDF_list.append(ncfile) >> >> slice_counter=0 >> print netCDF_list >> >> for filename in netCDF_list: >> ncfile=netCDF4.Dataset(filename) >> TSFC=ncfile.variables['T_SFC'][:] >> fillvalue=ncfile.variables['T_SFC']._FillValue >> TSFC=MA.masked_values(TSFC, fillvalue) >> for i in xrange(0,len(TSFC)-1,1): >> slice_counter +=1 >> #print slice_counter >> try: >> running_sum=N.add(running_sum, TSFC[i]) >> except NameError: >> print "Initiating the running total of my >> variable..." >> running_sum=N.array(TSFC[i]) >> >> TSFC_avg=N.true_divide(running_sum, slice_counter) >> N.set_printoptions(threshold='nan') >> print "the TSFC_avg is:", TSFC_avg >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- ? Quand le gouvernement viole les droits du peuple, l'insurrection est, pour le peuple et pour chaque portion du peuple, le plus sacr? des droits et le plus indispensable des devoirs ? 
Déclaration des droits de l'homme et du citoyen, article 35, 1793 -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Mon Dec 5 18:17:56 2011 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Mon, 5 Dec 2011 15:17:56 -0800 Subject: [Numpy-discussion] NumPy Governance In-Reply-To: References: Message-ID: On Mon, Dec 5, 2011 at 12:10 PM, Charles R Harris wrote: > Well, that's a tough one. Numpy development tends to attract folks with > spare time, i.e., students*, and those with an itch to scratch. Itched > scratched, degree obtained, they go back to their primary interest or on to > jobs and the rest of life. NumPy does seem to be different in this regard, in that many of the developers stick around (even if they're not active on the code any longer), think about potential issues and new directions, take part in discussions, teach at conferences, organise workshops, write, etc. I agree with Matthew that using a board should be a last resort, and mildly disagree with Perry that it would be better to have a single person make the final call. The advantage of a benevolent dictator is that you have a coherent driving vision, but at the cost of sacrificing community ownership. As for barriers to entry, improving the nature of discourse on the mailing list (when it comes to thorny issues) would be good. Technical barriers are not that hard to breach for our community; setting the right social atmosphere is crucial. Regards Stéfan From questions.anon at gmail.com Mon Dec 5 21:53:51 2011 From: questions.anon at gmail.com (questions anon) Date: Tue, 6 Dec 2011 13:53:51 +1100 Subject: [Numpy-discussion] ignore NAN in numpy.true_divide() In-Reply-To: References: Message-ID: Thanks for responding. I have tried several ways of adding the command, one of which is: for i in TSFC: if N.any(N.isnan(TSFC)): break else: pass but nothing is happening, is there some particular way I need to add this command? I have posted all below: netCDF_list=[] for dir in glob.glob(MainFolder + '*/01/')+ glob.glob(MainFolder + '*/02/')+ glob.glob(MainFolder + '*/12/'): #print dir for ncfile in glob.glob(dir + '*.nc'): netCDF_list.append(ncfile) slice_counter=0 print netCDF_list for filename in netCDF_list: ncfile=netCDF4.Dataset(filename) TSFC=ncfile.variables['T_SFC'][:] fillvalue=ncfile.variables['T_SFC']._FillValue TSFC=MA.masked_values(TSFC, fillvalue) for a in TSFC: if N.any(N.isnan(TSFC)): break else: pass for i in xrange(0,len(TSFC)-1,1): slice_counter +=1 #print slice_counter try: running_sum=N.add(running_sum, TSFC[i]) except NameError: print "Initiating the running total of my variable..." running_sum=N.array(TSFC[i]) TSFC_avg=N.true_divide(running_sum, slice_counter) N.set_printoptions(threshold='nan') print "the TSFC_avg is:", TSFC_avg On Tue, Dec 6, 2011 at 9:45 AM, David Cournapeau wrote: > On Mon, Dec 5, 2011 at 5:29 PM, questions anon > wrote: > > Maybe I am asking the wrong question or could go about this another way. > > I have thousands of numpy arrays to flick through, could I just identify > > which arrays have NAN's and for now ignore the entire array. Is there a > > simple way to do this? > > Doing np.any(np.isnan(a)) for an array a should answer this exact question > > David > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From questions.anon at gmail.com Mon Dec 5 22:06:07 2011 From: questions.anon at gmail.com (questions anon) Date: Tue, 6 Dec 2011 14:06:07 +1100 Subject: [Numpy-discussion] ignore NAN in numpy.true_divide() In-Reply-To: References: Message-ID: I have also tried Xavier's suggestion but only end up with one value as my average (instead of an array). I used: for a in TSFC: indexnonNaN=N.isfinite(a) SliceofTotoWithoutNan=a[indexnonNaN] print SliceofTotoWithoutNan TSFC=SliceofTotoWithoutNan entire script: netCDF_list=[] for dir in glob.glob(MainFolder + '*/01/')+ glob.glob(MainFolder + '*/02/')+ glob.glob(MainFolder + '*/12/'): #print dir for ncfile in glob.glob(dir + '*.nc'): netCDF_list.append(ncfile) slice_counter=0 print netCDF_list for filename in netCDF_list: ncfile=netCDF4.Dataset(filename) TSFC=ncfile.variables['T_SFC'][:] fillvalue=ncfile.variables['T_SFC']._FillValue TSFC=MA.masked_values(TSFC, fillvalue) for a in TSFC: indexnonNaN=N.isfinite(a) SliceofTotoWithoutNan=a[indexnonNaN] print SliceofTotoWithoutNan TSFC=SliceofTotoWithoutNan for i in xrange(0,len(TSFC)-1,1): slice_counter +=1 #print slice_counter try: running_sum=N.add(running_sum, TSFC[i]) except NameError: print "Initiating the running total of my variable..." running_sum=N.array(TSFC[i]) TSFC_avg=N.true_divide(running_sum, slice_counter) N.set_printoptions(threshold='nan') print "the TSFC_avg is:", TSFC_avg On Tue, Dec 6, 2011 at 9:50 AM, Xavier Barthelemy wrote: > Hi, > I don't know if it is the best choice, but this is what I do in my code: > > for each slice: > indexnonNaN=np.isfinite(SliceOf Toto) > SliceOf TotoWithoutNan= SliceOf Toto [indexnonNaN] > > and then perform all operation I want o on the last array. > > i hope it does answer your question > > Xavier > > > 2011/12/6 questions anon > >> Maybe I am asking the wrong question or could go about this another way. >> I have thousands of numpy arrays to flick through, could I just identify >> which arrays have NAN's and for now ignore the entire array. is there a >> simple way to do this? >> any feedback will be greatly appreciated. >> >> On Thu, Dec 1, 2011 at 12:16 PM, questions anon > > wrote: >> >>> I am trying to calculate the mean across many netcdf files. I cannot use >>> numpy.mean because there are too many files to concatenate and I end up >>> with a memory error. I have enabled the below code to do what I need but I >>> have a few nan values in some of my arrays. Is there a way to ignore these >>> somewhere in my code. I seem to face this problem often so I would love a >>> command that ignores blanks in my array before I continue on to the next >>> processing step. >>> Any feedback is greatly appreciated. >>> >>> >>> netCDF_list=[] >>> for dir in glob.glob(MainFolder + '*/01/')+ glob.glob(MainFolder + >>> '*/02/')+ glob.glob(MainFolder + '*/12/'): >>> for ncfile in glob.glob(dir + '*.nc'): >>> netCDF_list.append(ncfile) >>> >>> slice_counter=0 >>> print netCDF_list >>> >>> for filename in netCDF_list: >>> ncfile=netCDF4.Dataset(filename) >>> TSFC=ncfile.variables['T_SFC'][:] >>> fillvalue=ncfile.variables['T_SFC']._FillValue >>> TSFC=MA.masked_values(TSFC, fillvalue) >>> for i in xrange(0,len(TSFC)-1,1): >>> slice_counter +=1 >>> #print slice_counter >>> try: >>> running_sum=N.add(running_sum, TSFC[i]) >>> except NameError: >>> print "Initiating the running total of my >>> variable..." 
>>> running_sum=N.array(TSFC[i]) >>> >>> TSFC_avg=N.true_divide(running_sum, slice_counter) >>> N.set_printoptions(threshold='nan') >>> print "the TSFC_avg is:", TSFC_avg >>> >>> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > > -- > ? Quand le gouvernement viole les droits du peuple, l'insurrection est, > pour le peuple et pour chaque portion du peuple, le plus sacr? des droits > et le plus indispensable des devoirs ? > > D?claration des droits de l'homme et du citoyen, article 35, 1793 > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From xabart at gmail.com Mon Dec 5 22:31:31 2011 From: xabart at gmail.com (Xavier Barthelemy) Date: Tue, 6 Dec 2011 14:31:31 +1100 Subject: [Numpy-discussion] ignore NAN in numpy.true_divide() In-Reply-To: References: Message-ID: Well, I would see solutions: 1- to keep how your code is, withj a python list (you can stack numpy arrays if they have the same dimensions): for filename in netCDF_list: ncfile=netCDF4.Dataset(filename) TSFC=ncfile.variables['T_SFC'][:] fillvalue=ncfile.variables['T_SFC']._FillValue TSFC=MA.masked_values(TSFC, fillvalue) TSFCWithOutNan=[] for a in TSFC: indexnonNaN=N.isfinite(a) SliceofTotoWithoutNan=a[indexnonNaN] print SliceofTotoWithoutNan TSFCWithOutNan .append( SliceofTotoWithoutNan ) for i in xrange(0,len(TSFCWithOutNan )-1,1): slice_counter +=1 #print slice_counter try: running_sum=N.add(running_sum, TSFCWithOutNan [i]) except NameError: print "Initiating the running total of my variable..." running_sum=N.array(TSFCWithOutNan [i]) ... or 2- everything in the same loop: slice_counter =0 for a in TSFC: indexnonNaN=N.isfinite(a) SliceofTotoWithoutNan=a[indexnonNaN] slice_counter +=1 #print slice_counter try: running_sum=N.add(running_sum, SliceofTotoWithoutNan ) except NameError: print "Initiating the running total of my variable..." running_sum=N.array( SliceofTotoWithoutNan ) TSFC_avg=N.true_divide(running_sum, slice_counter) N.set_printoptions(threshold='nan') print "the TSFC_avg is:", TSFC_avg See if it works. it is just a rapid guess Xavier for dir in glob.glob(MainFolder + '*/01/')+ glob.glob(MainFolder + '*/02/')+ glob.glob(MainFolder + '*/12/'): > #print dir > > for ncfile in glob.glob(dir + '*.nc'): > netCDF_list.append(ncfile) > > slice_counter=0 > print netCDF_list > for filename in netCDF_list: > ncfile=netCDF4.Dataset(filename) > TSFC=ncfile.variables['T_SFC'][:] > fillvalue=ncfile.variables['T_SFC']._FillValue > TSFC=MA.masked_values(TSFC, fillvalue) > for a in TSFC: > indexnonNaN=N.isfinite(a) > SliceofTotoWithoutNan=a[indexnonNaN] > print SliceofTotoWithoutNan > TSFC=SliceofTotoWithoutNan > > > for i in xrange(0,len(TSFC)-1,1): > slice_counter +=1 > #print slice_counter > try: > running_sum=N.add(running_sum, TSFC[i]) > except NameError: > print "Initiating the running total of my > variable..." 
> running_sum=N.array(TSFC[i]) > > TSFC_avg=N.true_divide(running_sum, slice_counter) > N.set_printoptions(threshold='nan') > print "the TSFC_avg is:", TSFC_avg > > > > > On Tue, Dec 6, 2011 at 9:50 AM, Xavier Barthelemy wrote: > >> Hi, >> I don't know if it is the best choice, but this is what I do in my code: >> >> for each slice: >> indexnonNaN=np.isfinite(SliceOf Toto) >> SliceOf TotoWithoutNan= SliceOf Toto [indexnonNaN] >> >> and then perform all operation I want o on the last array. >> >> i hope it does answer your question >> >> Xavier >> >> >> 2011/12/6 questions anon >> >>> Maybe I am asking the wrong question or could go about this another way. >>> I have thousands of numpy arrays to flick through, could I just identify >>> which arrays have NAN's and for now ignore the entire array. is there a >>> simple way to do this? >>> any feedback will be greatly appreciated. >>> >>> On Thu, Dec 1, 2011 at 12:16 PM, questions anon < >>> questions.anon at gmail.com> wrote: >>> >>>> I am trying to calculate the mean across many netcdf files. I cannot >>>> use numpy.mean because there are too many files to concatenate and I end up >>>> with a memory error. I have enabled the below code to do what I need but I >>>> have a few nan values in some of my arrays. Is there a way to ignore these >>>> somewhere in my code. I seem to face this problem often so I would love a >>>> command that ignores blanks in my array before I continue on to the next >>>> processing step. >>>> Any feedback is greatly appreciated. >>>> >>>> >>>> netCDF_list=[] >>>> for dir in glob.glob(MainFolder + '*/01/')+ glob.glob(MainFolder + >>>> '*/02/')+ glob.glob(MainFolder + '*/12/'): >>>> for ncfile in glob.glob(dir + '*.nc'): >>>> netCDF_list.append(ncfile) >>>> >>>> slice_counter=0 >>>> print netCDF_list >>>> >>>> for filename in netCDF_list: >>>> ncfile=netCDF4.Dataset(filename) >>>> TSFC=ncfile.variables['T_SFC'][:] >>>> fillvalue=ncfile.variables['T_SFC']._FillValue >>>> TSFC=MA.masked_values(TSFC, fillvalue) >>>> for i in xrange(0,len(TSFC)-1,1): >>>> slice_counter +=1 >>>> #print slice_counter >>>> try: >>>> running_sum=N.add(running_sum, TSFC[i]) >>>> except NameError: >>>> print "Initiating the running total of my >>>> variable..." >>>> running_sum=N.array(TSFC[i]) >>>> >>>> TSFC_avg=N.true_divide(running_sum, slice_counter) >>>> N.set_printoptions(threshold='nan') >>>> print "the TSFC_avg is:", TSFC_avg >>>> >>>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> >> >> -- >> ? Quand le gouvernement viole les droits du peuple, l'insurrection est, >> pour le peuple et pour chaque portion du peuple, le plus sacr? des droits >> et le plus indispensable des devoirs ? >> >> D?claration des droits de l'homme et du citoyen, article 35, 1793 >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- ? Quand le gouvernement viole les droits du peuple, l'insurrection est, pour le peuple et pour chaque portion du peuple, le plus sacr? des droits et le plus indispensable des devoirs ? 
D?claration des droits de l'homme et du citoyen, article 35, 1793 -------------- next part -------------- An HTML attachment was scrubbed... URL: From questions.anon at gmail.com Mon Dec 5 23:27:37 2011 From: questions.anon at gmail.com (questions anon) Date: Tue, 6 Dec 2011 15:27:37 +1100 Subject: [Numpy-discussion] ignore NAN in numpy.true_divide() In-Reply-To: References: Message-ID: thanks again for you response. I must still be doing something wrong!! both options resulted in : the TSFC_avg is: [-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 1st option: slice_counter=0 for filename in netCDF_list: ncfile=netCDF4.Dataset(filename) TSFC=ncfile.variables['T_SFC'][:] fillvalue=ncfile.variables['T_SFC']._FillValue TSFC=MA.masked_values(TSFC, fillvalue) TSFCWithOutNan=[] for a in TSFC: indexnonNaN=N.isfinite(a) SliceofTotoWithoutNan=a[indexnonNaN] print SliceofTotoWithoutNan TSFCWithOutNan.append(SliceofTotoWithoutNan) for i in xrange(0,len(TSFCWithOutNan)-1,1): slice_counter +=1 try: running_sum=N.add(running_sum, TSFCWithOutNan[i]) except NameError: print "Initiating the running total of my variable..." running_sum=N.array(TSFCWithOutNan[i]) TSFC_avg=N.true_divide(running_sum, slice_counter) N.set_printoptions(threshold='nan') print "the TSFC_avg is:", TSFC_avg the 2nd option : for filename in netCDF_list: ncfile=netCDF4.Dataset(filename) TSFC=ncfile.variables['T_SFC'][:] fillvalue=ncfile.variables['T_SFC']._FillValue TSFC=MA.masked_values(TSFC, fillvalue) slice_counter=0 for a in TSFC: indexnonNaN=N.isfinite(a) SliceofTotoWithoutNan=a[indexnonNaN] slice_counter +=1 try: running_sum=N.add(running_sum, SliceofTotoWithoutNan) except NameError: print "Initiating the running total of my variable..." running_sum=N.array(SliceofTotoWithoutNan) TSFC_avg=N.true_divide(running_sum, slice_counter) N.set_printoptions(threshold='nan') print "the TSFC_avg is:", TSFC_avg On Tue, Dec 6, 2011 at 2:31 PM, Xavier Barthelemy wrote: > Well, I would see solutions: > 1- to keep how your code is, withj a python list (you can stack numpy > arrays if they have the same dimensions): > > for filename in netCDF_list: > ncfile=netCDF4.Dataset(filename) > TSFC=ncfile.variables['T_SFC'][:] > fillvalue=ncfile.variables['T_SFC']._FillValue > TSFC=MA.masked_values(TSFC, fillvalue) > TSFCWithOutNan=[] > for a in TSFC: > indexnonNaN=N.isfinite(a) > SliceofTotoWithoutNan=a[indexnonNaN] > print SliceofTotoWithoutNan > TSFCWithOutNan .append( SliceofTotoWithoutNan ) > > > > for i in xrange(0,len(TSFCWithOutNan )-1,1): > > slice_counter +=1 > #print slice_counter > try: > running_sum=N.add(running_sum, > TSFCWithOutNan [i]) > > except NameError: > print "Initiating the running total of my > variable..." > running_sum=N.array(TSFCWithOutNan [i]) > ... > > or 2- everything in the same loop: > > slice_counter =0 > for a in TSFC: > indexnonNaN=N.isfinite(a) > SliceofTotoWithoutNan=a[indexnonNaN] > slice_counter +=1 > #print slice_counter > try: > running_sum=N.add(running_sum, > SliceofTotoWithoutNan ) > > except NameError: > print "Initiating the running total of my > variable..." 
> running_sum=N.array( SliceofTotoWithoutNan > ) > TSFC_avg=N.true_divide(running_sum, slice_counter) > N.set_printoptions(threshold='nan') > print "the TSFC_avg is:", TSFC_avg > > See if it works. it is just a rapid guess > Xavier > > > for dir in glob.glob(MainFolder + '*/01/')+ glob.glob(MainFolder + > '*/02/')+ glob.glob(MainFolder + '*/12/'): > >> #print dir >> >> for ncfile in glob.glob(dir + '*.nc'): >> netCDF_list.append(ncfile) >> >> slice_counter=0 >> print netCDF_list >> for filename in netCDF_list: >> ncfile=netCDF4.Dataset(filename) >> TSFC=ncfile.variables['T_SFC'][:] >> fillvalue=ncfile.variables['T_SFC']._FillValue >> TSFC=MA.masked_values(TSFC, fillvalue) >> for a in TSFC: >> indexnonNaN=N.isfinite(a) >> SliceofTotoWithoutNan=a[indexnonNaN] >> print SliceofTotoWithoutNan >> TSFC=SliceofTotoWithoutNan >> >> >> for i in xrange(0,len(TSFC)-1,1): >> slice_counter +=1 >> #print slice_counter >> try: >> running_sum=N.add(running_sum, TSFC[i]) >> except NameError: >> print "Initiating the running total of my >> variable..." >> running_sum=N.array(TSFC[i]) >> >> TSFC_avg=N.true_divide(running_sum, slice_counter) >> N.set_printoptions(threshold='nan') >> print "the TSFC_avg is:", TSFC_avg >> >> >> >> >> On Tue, Dec 6, 2011 at 9:50 AM, Xavier Barthelemy wrote: >> >>> Hi, >>> I don't know if it is the best choice, but this is what I do in my code: >>> >>> for each slice: >>> indexnonNaN=np.isfinite(SliceOf Toto) >>> SliceOf TotoWithoutNan= SliceOf Toto [indexnonNaN] >>> >>> and then perform all operation I want o on the last array. >>> >>> i hope it does answer your question >>> >>> Xavier >>> >>> >>> 2011/12/6 questions anon >>> >>>> Maybe I am asking the wrong question or could go about this another >>>> way. >>>> I have thousands of numpy arrays to flick through, could I just >>>> identify which arrays have NAN's and for now ignore the entire array. is >>>> there a simple way to do this? >>>> any feedback will be greatly appreciated. >>>> >>>> On Thu, Dec 1, 2011 at 12:16 PM, questions anon < >>>> questions.anon at gmail.com> wrote: >>>> >>>>> I am trying to calculate the mean across many netcdf files. I cannot >>>>> use numpy.mean because there are too many files to concatenate and I end up >>>>> with a memory error. I have enabled the below code to do what I need but I >>>>> have a few nan values in some of my arrays. Is there a way to ignore these >>>>> somewhere in my code. I seem to face this problem often so I would love a >>>>> command that ignores blanks in my array before I continue on to the next >>>>> processing step. >>>>> Any feedback is greatly appreciated. >>>>> >>>>> >>>>> netCDF_list=[] >>>>> for dir in glob.glob(MainFolder + '*/01/')+ glob.glob(MainFolder + >>>>> '*/02/')+ glob.glob(MainFolder + '*/12/'): >>>>> for ncfile in glob.glob(dir + '*.nc'): >>>>> netCDF_list.append(ncfile) >>>>> >>>>> slice_counter=0 >>>>> print netCDF_list >>>>> >>>>> for filename in netCDF_list: >>>>> ncfile=netCDF4.Dataset(filename) >>>>> TSFC=ncfile.variables['T_SFC'][:] >>>>> fillvalue=ncfile.variables['T_SFC']._FillValue >>>>> TSFC=MA.masked_values(TSFC, fillvalue) >>>>> for i in xrange(0,len(TSFC)-1,1): >>>>> slice_counter +=1 >>>>> #print slice_counter >>>>> try: >>>>> running_sum=N.add(running_sum, TSFC[i]) >>>>> except NameError: >>>>> print "Initiating the running total of my >>>>> variable..." 
>>>>> running_sum=N.array(TSFC[i]) >>>>> >>>>> TSFC_avg=N.true_divide(running_sum, slice_counter) >>>>> N.set_printoptions(threshold='nan') >>>>> print "the TSFC_avg is:", TSFC_avg >>>>> >>>>> >>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> >>>> >>> >>> >>> -- >>> « Quand le gouvernement viole les droits du peuple, l'insurrection est, >>> pour le peuple et pour chaque portion du peuple, le plus sacré des droits >>> et le plus indispensable des devoirs » >>> >>> Déclaration des droits de l'homme et du citoyen, article 35, 1793 >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > > -- > « Quand le gouvernement viole les droits du peuple, l'insurrection est, > pour le peuple et pour chaque portion du peuple, le plus sacré des droits > et le plus indispensable des devoirs » > > Déclaration des droits de l'homme et du citoyen, article 35, 1793 > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From magnetotellurics at gmail.com Mon Dec 5 23:50:37 2011 From: magnetotellurics at gmail.com (kneil) Date: Mon, 5 Dec 2011 20:50:37 -0800 (PST) Subject: [Numpy-discussion] Apparently non-deterministic behaviour of complex array multiplication In-Reply-To: References: <4ED76256.4020909@crans.org> <32898383.post@talk.nabble.com> <32900355.post@talk.nabble.com> <32906553.post@talk.nabble.com> Message-ID: <32922174.post@talk.nabble.com> Hi Nathaniel, Thanks for the suggestion. I more or less implemented it: np.save('X',X); X2=np.load('X.npy') X2=np.asmatrix(X2) diffy = (X != X2) if diffy.any(): print X[diffy] print X2[diffy] print X[diffy][0].view(np.uint8) print X2[diffy][0].view(np.uint8) S=X*X.H/k S2=X2*X2.H/k nanElts=find(isnan(S)) if len(nanElts)!=0: print 'WARNING: Nans in S:'+str(find(isnan(S))) print 'WARNING: Nans in S2:'+str(find(isnan(S2))) My output (when I got NaN) mostly indicated that both arrays are numerically identical, and that they evaluated to have the same nan-value entries. For example >>WARNING: Nans in S:[ 6 16] >>WARNING: Nans in S2:[ 6 16] Another time I got as output: >>WARNING: Nans in S:[ 26 36 46 54 64 72 82 92 100 110 128 138 146 156 166 174 184 192 202 212 220 230 240 250 260 268 278 279 296 297 306 314 324 334 335 342 352 360 370 380 388 398 416 426 434 444 454 464 474] >>WARNING: Nans in S2:[ 26 36 46 54 64 72 82 92 100 110 128 138 146 156 166 174 184 192 202 212 220 230 240 250 260 268 278 279 296 297 306 314 324 334 335 342 352 360 370 380 388 398 416 426 434 444 454 464 474] These were different arrays I think. At any rate, those two results appeared from two runs of the exact same code. I do not use any random numbers in the code by the way. Most of the time the code runs without any nan showing up at all, so this is an improvement. *I am pretty sure that one time there were nan in S, but not in S2, yet still no difference was observed in the two matrices X and X2. But, I did not save that output, so I can't prove it to myself, ... but I am pretty sure I saw that. I will try and run memtest tonight. I am going out of town for a week and probably won't be able to test until next week. cheers, Karl I also think it is worth noting: 1. I have many fewer NaN than I used to, but still get NaN in S, but NOT in S2! Nathaniel Smith wrote: > > If save/load actually makes a reliable difference, then it would be useful > to do something like this, and see what you see: > > save("X", X) > X2 = load("X.npy") > diff = (X != X2) > # did save/load change anything? > any(diff) > # if so, then what changed? > X[diff] > X2[diff] > # any subtle differences in floating point representation? > X[diff][0].view(np.uint8) > X2[diff][0].view(np.uint8) > > (You should still run memtest. It's very easy - just install it with your > package manager, then reboot. Hold down the shift key while booting, and > you'll get a boot menu. Choose memtest, and then leave it to run > overnight.) > > - Nathaniel > On Dec 2, 2011 10:10 PM, "kneil" wrote: > > -- View this message in context: http://old.nabble.com/Apparently-non-deterministic-behaviour-of-complex-array-multiplication-tp32893004p32922174.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From matthew.brett at gmail.com Tue Dec 6 00:32:48 2011 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 5 Dec 2011 21:32:48 -0800 Subject: [Numpy-discussion] NumPy Governance In-Reply-To: References: Message-ID: Hi, 2011/12/5 Stéfan van der Walt : > As for barriers to entry, improving the nature of discourse on the > mailing list (when it comes to thorny issues) would be good. > Technical barriers are not that hard to breach for our community; > setting the right social atmosphere is crucial. I'm just about to get on a plane and am going to be out of internet range for a while, so, in the spirit of constructive discussion: In the spirit of use-cases: Would it be fair to say that the two contentious recent discussions have been: The numpy ABI breakage, 2.0 vs 1.5.1 discussion The masked array discussion(s)? What did we do wrong or right in each of these two discussions? What could we have done better? What process would help us to do better? Travis - for your board-only-post mailing list - my feeling is that this is going in the wrong direction. The effect of the board-only mailing list is to explicitly remove non-qualified people from the discussion. This will make it more explicit that the substantial decisions will be made by a few important people. Do you (Travis - or Mark?) think that, if this had happened earlier in the masked array discussion, it would have been less contentious, or had more substantial content? My instinct would be the reverse, and the best solution would have been to pause and commit to beating out the issues and getting agreement. See you, Matthew From xabart at gmail.com Tue Dec 6 00:53:09 2011 From: xabart at gmail.com (Xavier Barthelemy) Date: Tue, 6 Dec 2011 16:53:09 +1100 Subject: [Numpy-discussion] idea of optimisation? Message-ID: Hi everyone I was wondering if there is a more optimal way to write what follows: I am studying waves, so I have an array of wave crests positions, Xcrest and the positions of the ZeroCrossings, Xzeros. The goal is to find between which Xzeros my xcrest are. 
XXX1=XCrest CrestZerosNeighbour=np.zeros([len(XCrest),2], dtype='d') for nn in range(len(Xzeros)-1): X1=Xzeros[nn] X2=Xzeros[nn+1] indexxx1=np.where((X1<=XXX1) & (XXX1 < X2)) try: CrestZerosNeighbour[indexxx1[0]]=np.array([X1,X2]) except: pass Does anyone have an idea? Something in the spirit of (numpy.ma.masked_outside), which does exactly the opposite of what I want: it masks an array outside an interval. I would like to mask everything except the interval that contains my value. I do this operation a large number of times, and a loop is time consuming. thanks Xavier -- « Quand le gouvernement viole les droits du peuple, l'insurrection est, pour le peuple et pour chaque portion du peuple, le plus sacré des droits et le plus indispensable des devoirs » 
> numpy.fft.fft will compute fft on every *row*, or every column if you > say pass axis=0 argument: Note that I am using regular Python lists (they were JSON at one point) and the fft documentation is incomprehensible to someone who hasn't used numpy before and only cares about fft (there are a lot of matches for Google searches about fft and python pointing to numpy). The doc doesn't actually say what axis is and doesn't have an example. Additionally a "shape" attribute is used which is peculiar to whatever numpy uses as its data representation. Roger -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iEYEARECAAYFAk7dyRwACgkQmOOfHg372QToxgCfR7IoUfgGQVZEEiElnjbtx7yx R8EAnRfDg4y7AfFeSA8sQxVCq6ucgRG1 =gg2h -----END PGP SIGNATURE----- From xabart at gmail.com Tue Dec 6 02:51:22 2011 From: xabart at gmail.com (Xavier Barthelemy) Date: Tue, 6 Dec 2011 18:51:22 +1100 Subject: [Numpy-discussion] idea of optimisation? In-Reply-To: <1323155858-sup-8509@david-desktop> References: <1323155858-sup-8509@david-desktop> Message-ID: ok let me be more precise I have an Z array which is the elevation from this I extract a discrete array of Zero Crossing, and another discrete array of Crests. len(crest) is different than len(Xzeros). I have a threshold method to detect my "valid" crests, and sometimes there are 2 crests between two zero-crossing (grouping effect) Crest and Zeros are 2 different arrays, with positions. example: Zeros=[1,2,3,4] Arrays=[1.5,1.7,3.5] and yes arrays can be sorted. not a problm with this. Xavier 2011/12/6 David Froger > Excerpts from Xavier Barthelemy's message of mar. d?c. 06 06:53:09 +0100 > 2011: > > Hi everyone > > > > I was wondering if there is a more optimal way to write what follows: > > I am studying waves, so I have an array of wave crests positions, Xcrest > > and the positions of the ZeroCrossings, Xzeros. > > > > The goal is to find between which Xzeros my xcrest are. > > > > > > XXX1=XCrest > > CrestZerosNeighbour=np.zeros([len(XCrest),2], dtype='d') > > for nn in range(len(Xzeros)-1): > > X1=Xzeros[nn] > > X2=Xzeros[nn+1] > > indexxx1=np.where((X1<=XXX1) & (XXX1 < X2)) > > try: > > CrestZerosNeighbour[indexxx1[0]]=np.array([X1,X2]) > > except: > > pass > > > > Someone has an idea? in the spirit of (numpy.ma.masked_outside) which > does > > exactly the opposite I want: it masks an array outside an interval. I > would > > like to mask everything except the interval that contains my value. > > I do this operation a large number of times , and a loop is time > consuming. > > Hi, > > My first idea would be to write a function in C or Fortran that return > Xzeros > index (instead of values). Algorithms may be optimized according to the > inputs: > if the XCrest are Xzeros sorted, if len(Xcreast) >> len(Xzeros) > using > dichotomy... But I would be interested to see a solution with masked > array too. > > -- > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- ? Quand le gouvernement viole les droits du peuple, l'insurrection est, pour le peuple et pour chaque portion du peuple, le plus sacr? des droits et le plus indispensable des devoirs ? D?claration des droits de l'homme et du citoyen, article 35, 1793 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From david.froger at gmail.com Tue Dec 6 03:54:03 2011 From: david.froger at gmail.com (David Froger) Date: Tue, 06 Dec 2011 09:54:03 +0100 Subject: [Numpy-discussion] idea of optimisation? In-Reply-To: References: <1323155858-sup-8509@david-desktop> Message-ID: <1323161307-sup-7696@david-desktop> Excerpts from Xavier Barthelemy's message of mar. d?c. 06 08:51:22 +0100 2011: > ok let me be more precise > > I have an Z array which is the elevation > from this I extract a discrete array of Zero Crossing, and another discrete > array of Crests. > len(crest) is different than len(Xzeros). I have a threshold method to > detect my "valid" crests, and sometimes there are 2 crests between two > zero-crossing (grouping effect) > > Crest and Zeros are 2 different arrays, with positions. example: > Zeros=[1,2,3,4] Arrays=[1.5,1.7,3.5] Thanks for the precision. My suggestion was to consider the alternative of rewriting the critical time consuming part of the code (the function that take XCrest and Xzeros as input and return CrestZerosNeighbour) in C or Fortran, this function si then wrapped into Python using Swig, or F2py, or Cython, or Weave. I think this is a typical case where this is usefull. Advantage is that in the C or Fortran function, you can code directly the algorithm you want and have an maximal optimization. Drawback is that you loose the simplicity of pure Python, you need to manage 2 languages and a tool to connect them... -- From pav at iki.fi Tue Dec 6 04:58:48 2011 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 06 Dec 2011 10:58:48 +0100 Subject: [Numpy-discussion] What does fftn take as parameters? In-Reply-To: <4EDDC91C.3070307@rogerbinns.com> References: <4EDD3766.1010902@rogerbinns.com> <4EDDC91C.3070307@rogerbinns.com> Message-ID: 06.12.2011 08:49, Roger Binns kirjoitti: > Note that I am using regular Python lists (they were JSON at one point) > and the fft documentation is incomprehensible to someone who hasn't used > numpy before and only cares about fft (there are a lot of matches for > Google searches about fft and python pointing to numpy). > > The doc doesn't actually say what axis is and doesn't have an example. > Additionally a "shape" attribute is used which is peculiar to whatever > numpy uses as its data representation. I think this cannot be helped --- it does not make sense to explain basic Numpy concepts in every docstring, especially `axis` and `shape` are very common. However, an example with the axis keyword could be useful. -- Pauli Virtanen From xabart at gmail.com Tue Dec 6 07:24:40 2011 From: xabart at gmail.com (Xavier Barthelemy) Date: Tue, 6 Dec 2011 23:24:40 +1100 Subject: [Numpy-discussion] idea of optimisation? In-Reply-To: <1323161307-sup-7696@david-desktop> References: <1323155858-sup-8509@david-desktop> <1323161307-sup-7696@david-desktop> Message-ID: Yes I understood what you said. I know these tools, and I am using them. I was just wandering if someone has a more one-liner-pythonic way to do it. I don't think it's worth importing a new fortran module. Thanks anyway :) Xavier 2011/12/6 David Froger > Excerpts from Xavier Barthelemy's message of mar. d?c. 06 08:51:22 +0100 > 2011: > > ok let me be more precise > > > > I have an Z array which is the elevation > > from this I extract a discrete array of Zero Crossing, and another > discrete > > array of Crests. > > len(crest) is different than len(Xzeros). 
From jsseabold at gmail.com  Tue Dec  6 07:45:39 2011
From: jsseabold at gmail.com (Skipper Seabold)
Date: Tue, 6 Dec 2011 07:45:39 -0500
Subject: [Numpy-discussion] adding unsigned int and int
Message-ID:

Hi,

Is this intended?

[~/]
[1]: np.result_type(np.uint, np.int)
[1]: dtype('float64')

[~/]
[2]: np.version.version
[2]: '2.0.0.dev-aded70c'

Skipper

From matthew.brett at gmail.com  Tue Dec  6 07:53:08 2011
From: matthew.brett at gmail.com (Matthew Brett)
Date: Tue, 6 Dec 2011 04:53:08 -0800
Subject: [Numpy-discussion] adding unsigned int and int
In-Reply-To: References: Message-ID:

Hi,

On Tue, Dec 6, 2011 at 4:45 AM, Skipper Seabold wrote:
> Is this intended?
>
> [1]: np.result_type(np.uint, np.int)
> [1]: dtype('float64')

I would guess so - if your system ints are 64 bit. int64 can't contain
the range of uint64, nor can uint64 contain all of int64. If there had
been a larger int type, it would have promoted to that, I believe. At
least on my system:

In [4]: np.result_type(np.int32, np.uint32)
Out[4]: dtype('int64')

Best,
Matthew

From jsseabold at gmail.com  Tue Dec  6 07:55:30 2011
From: jsseabold at gmail.com (Skipper Seabold)
Date: Tue, 6 Dec 2011 07:55:30 -0500
Subject: [Numpy-discussion] adding unsigned int and int
In-Reply-To: References: Message-ID:

On Tue, Dec 6, 2011 at 7:53 AM, Matthew Brett wrote:
> I would guess so - if your system ints are 64 bit. int64 can't contain
> the range of uint64, nor can uint64 contain all of int64.

Makes sense. Thanks,

Skipper
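A minimal sketch of the promotion rule Matthew describes (the second
result assumes a platform where the default int is 64-bit, as in
Skipper's session; exact outputs depend on the platform's integer width):

import numpy as np

# Same-width signed/unsigned pairs promote to the next wider signed
# integer type when one exists...
print(np.result_type(np.int32, np.uint32))   # int64
# ...and to float64 when no wider integer type is available.
print(np.result_type(np.int64, np.uint64))   # float64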
From rogerb at rogerbinns.com  Tue Dec  6 11:32:24 2011
From: rogerb at rogerbinns.com (Roger Binns)
Date: Tue, 06 Dec 2011 08:32:24 -0800
Subject: [Numpy-discussion] What does fftn take as parameters?
In-Reply-To: References: <4EDD3766.1010902@rogerbinns.com> <4EDDC91C.3070307@rogerbinns.com>
Message-ID: <4EDE4398.90208@rogerbinns.com>

On 06/12/11 01:58, Pauli Virtanen wrote:
> I think this cannot be helped --- it does not make sense to explain
> basic Numpy concepts in every docstring, especially `axis` and `shape`
> are very common.

They don't need to be explained on the page, but could instead link to a
page that does explain them. The test is that an experienced Python
programmer should be able to understand what is going on from the fft doc
page and every page it links to. Note that searching doesn't help:

  http://docs.scipy.org/doc/numpy/search.html?q=axis
  http://docs.scipy.org/doc/numpy/search.html?q=shape

> However, an example with the axis keyword could be useful.

And examples not using "square" inputs. With an input that has a long
edge (e.g. time) and a short edge (value(s)) it is a lot easier to
understand what is going on.

Roger

From tsyu80 at gmail.com  Tue Dec  6 12:21:50 2011
From: tsyu80 at gmail.com (Tony Yu)
Date: Tue, 6 Dec 2011 12:21:50 -0500
Subject: [Numpy-discussion] idea of optimisation?
In-Reply-To: References: <1323155858-sup-8509@david-desktop>
Message-ID:

On Tue, Dec 6, 2011 at 2:51 AM, Xavier Barthelemy wrote:

> Crest and Zeros are 2 different arrays, with positions. Example:
> Zeros=[1,2,3,4]  Crests=[1.5,1.7,3.5]
>
> and yes, the arrays can be sorted. Not a problem with this.

I may be oversimplifying this, but does searchsorted do what you want?

In [314]: xzeros=[1,2,3,4]; xcrests=[1.5,1.7,3.5]

In [315]: np.searchsorted(xzeros, xcrests)
Out[315]: array([1, 1, 3])

This returns, for each crest, the index at which it would be inserted
into xzeros, i.e. the index of the first zero crossing to its right.

-Tony
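Building on that, a minimal sketch of the vectorized one-liner Xavier was
after, pairing each crest with its bracketing zero crossings (this sketch
assumes both arrays are sorted and every crest lies strictly between the
first and last zero crossing):

import numpy as np

xzeros = np.array([1.0, 2.0, 3.0, 4.0])
xcrests = np.array([1.5, 1.7, 3.5])

# idx[k] is the index of the first zero crossing to the right of crest k,
# so xzeros[idx - 1] and xzeros[idx] bracket each crest.
idx = np.searchsorted(xzeros, xcrests)
neighbours = np.column_stack((xzeros[idx - 1], xzeros[idx]))
# neighbours is [[1., 2.], [1., 2.], [3., 4.]]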
From kmichael.aye at gmail.com  Tue Dec  6 12:54:05 2011
From: kmichael.aye at gmail.com (K.-Michael Aye)
Date: Tue, 6 Dec 2011 18:54:05 +0100
Subject: [Numpy-discussion] howto store 2D function values and their grid points
Message-ID:

Dear all,

I can't wrap my head around this. Mathematically it's not hard, I just
don't know how to store and access it without many loops.

I have a function f(x,y). I would like to calculate it at
x = arange(20,101,20) and y = arange(2,30,2).

How do I store that in a multi-dimensional array and preserve the grid
points where I did the calculation, so that I could later plot groups of
function plots like this:

for elem in y_values: plot(x_values, f(x, y=elem))

or

for elem in x_values: plot(y_values, f(x=elem, y))

I can smell that the solution cannot be so hard, but I currently don't
see how to keep the points where I did the evaluation of the function.
Maybe it is also imaginable to do something with functional tricks such
as 'map'?

Thanks for any suggestions!

Best regards,
Michael

From Chris.Barker at noaa.gov  Tue Dec  6 12:59:07 2011
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Tue, 06 Dec 2011 09:59:07 -0800
Subject: [Numpy-discussion] What does fftn take as parameters?
In-Reply-To: <4EDE4398.90208@rogerbinns.com>
Message-ID: <4EDE57EB.7020401@noaa.gov>

On 12/6/2011 8:32 AM, Roger Binns wrote:
> They don't need to be explained on the page, but could instead link to
> a page that does explain them.

That would be a lot of links -- I understand your frustration, but fft is
not a stand-alone fft lib -- it is a numpy module, so there is little
choice but to understand a bit of numpy to use it. And "shape" and "axis"
are quite core to numpy.

> The test is that an experienced Python programmer should be able to
> understand what is going on from the fft doc page and every page it
> links to.

That's not a reasonable expectation, sorry. I doubt you'd be able to use
matlab's fft functions with no knowledge of MATLAB, either.

> And examples not using "square" inputs.

Yes, good idea -- when I'm testing things, I always make a point of using
non-square examples, so it's clear which axis is which.

By the way, if you are getting JSON (from a web service?), converting to
liats, converting to numpy arrays, then fft-ing (then maybe converting
all that back?), I doubt that the performance of the ffts will be your
bottleneck -- I'd write it the easiest way you can -- loops will be fine.

-Chris

-- 
Christopher Barker, Ph.D.
Oceanographer
NOAA/NOS/OR&R

From Chris.Barker at noaa.gov  Tue Dec  6 13:02:39 2011
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Tue, 06 Dec 2011 10:02:39 -0800
Subject: [Numpy-discussion] howto store 2D function values and their grid points
In-Reply-To: References: Message-ID: <4EDE58BF.6090502@noaa.gov>

On 12/6/2011 9:54 AM, K.-Michael Aye wrote:
> I have a function f(x,y).
>
> I would like to calculate it at x = arange(20,101,20) and y = arange(2,30,2)
>
> How do I store that in a multi-dimensional array and preserve the grid
> points where I did the calculation

In [5]: X, Y = np.meshgrid(range(3), range(4))

In [6]: X
Out[6]:
array([[0, 1, 2],
       [0, 1, 2],
       [0, 1, 2],
       [0, 1, 2]])

In [7]: Y
Out[7]:
array([[0, 0, 0],
       [1, 1, 1],
       [2, 2, 2],
       [3, 3, 3]])

now do your f(X, Y)

-Chris
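A minimal sketch of that approach applied to the original grids (f here
is just a stand-in for the real function, and the plot calls are
indicative only):

import numpy as np

x = np.arange(20, 101, 20)    # 5 grid points
y = np.arange(2, 30, 2)       # 14 grid points

def f(x, y):                  # placeholder for the actual function
    return x * np.sin(y)

X, Y = np.meshgrid(x, y)      # X and Y both have shape (len(y), len(x))
Z = f(X, Y)                   # all function values, same shape

for i, yi in enumerate(y):    # one curve per y value
    pass                      # plot(x, Z[i])     -- f(x, y=yi) for all x
for j, xj in enumerate(x):    # one curve per x value
    pass                      # plot(y, Z[:, j])  -- f(x=xj, y) for all y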
From rogerb at rogerbinns.com  Tue Dec  6 13:17:52 2011
From: rogerb at rogerbinns.com (Roger Binns)
Date: Tue, 06 Dec 2011 10:17:52 -0800
Subject: [Numpy-discussion] What does fftn take as parameters?
In-Reply-To: <4EDE57EB.7020401@noaa.gov>
Message-ID: <4EDE5C50.6040008@rogerbinns.com>

On 06/12/11 09:59, Chris Barker wrote:
> That would be a lot of links

Huh? The current page defines axis in terms of axis, so if you don't know
what it is you have to look it up. The search is completely useless, as I
showed. Do you want using numpy to be some sort of secret one has to work
hard at? What I was asking for was that the word axis in the doc page
link to a page that explains terms like axis and shape as numpy uses
them.

> By the way, if you are getting JSON (from a web service?), converting
> to liats, converting to numpy arrays, then fft-ing (then maybe
> converting all that back?)

I don't know what a liat is, and I'm not converting to numpy arrays. I
pass in regular Python lists and treat the result as though it was a
regular Python list.

I need to do 16 ffts with columns of data, all the same length. That was
why the interest in fftn (misplaced). Someone else mentioned that fft can
actually do multiple ffts in one go, and the math guy mentioned that it
should be more efficient that way. At the moment I do indeed use a for
loop with 16 iterations.

Roger

From jniehof at lanl.gov  Tue Dec  6 13:41:30 2011
From: jniehof at lanl.gov (Jonathan T. Niehof)
Date: Tue, 06 Dec 2011 11:41:30 -0700
Subject: [Numpy-discussion] NumPy Governance
In-Reply-To: References: Message-ID: <4EDE61DA.2030608@lanl.gov>

Travis Oliphant wrote:
> My initial thoughts:

I don't have a horse in this race, but I do suggest people read Karl
Fogel's book before too much designing of governance structure:
http://producingoss.com/ (alas, it's not short, but it's a fairly easy
read and you can get convenient dead-tree or ebook editions.)

-- 
Jonathan Niehof
ISR-3 Space Data Systems
Los Alamos National Laboratory

From paul.anton.letnes at gmail.com  Tue Dec  6 13:45:23 2011
From: paul.anton.letnes at gmail.com (Paul Anton Letnes)
Date: Tue, 6 Dec 2011 19:45:23 +0100
Subject: [Numpy-discussion] What does fftn take as parameters?
In-Reply-To: <4EDE4398.90208@rogerbinns.com>
Message-ID: <1BB94CB5-01B9-4F95-86B1-4DE8F50D1320@gmail.com>

On 6. des. 2011, at 17:32, Roger Binns wrote:
> They don't need to be explained on the page, but instead link to a page
> that does explain them. The test is that an experienced Python
> programmer should be able to understand what is going on from the fft
> doc page and every page it links to. Note that searching doesn't help.

First google hit, "numpy axis":
http://www.scipy.org/Tentative_NumPy_Tutorial

Under "Basics", third and fourth sentences:
"""
NumPy's main object is the homogeneous multidimensional array. It is a
table of elements (usually numbers), all of the same type, indexed by a
tuple of positive integers. In Numpy dimensions are called axes. The
number of axes is rank.
"""

Searching is useful indeed, just use google instead.

You can do the "several column FFT" easily using the axes keyword. Have a
look at that link.
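A minimal sketch of that multi-column FFT on a deliberately non-square
input (the 16 columns stand in for Roger's 16 data streams; nothing here
is specific to his data):

import numpy as np

# 1024 time samples for each of 16 independent channels, as columns.
data = np.random.rand(1024, 16)

# axis=0 runs the transform down the long (time) axis: one call does
# all 16 FFTs at once.
spectra = np.fft.fft(data, axis=0)      # shape (1024, 16), complex

# Same result as looping over the columns one at a time:
looped = np.column_stack([np.fft.fft(data[:, k]) for k in range(16)])
assert np.allclose(spectra, looped)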
If you are using numpy functions like FFT, you are indeed converting to
arrays. Python lists perform poorly for numerical work and do not map
well onto the C and Fortran based codes that give numpy and scipy their
good performance. Hence, numpy will convert to arrays for you, and if you
are treating the returned array as a list, you are probably (among other
things) indexing it in an inefficient way.

It would seem to me that your obsession over the FFT may be misplaced. If
you are passing multidimensional lists here and there, you must have a
lot of memory overhead. Since the FFT is so efficient, O(n log(n)), I'm
guessing that it's not your bottleneck (as others have suggested). Try
using the cProfile module (python stdlib) to profile your code, and
verify where your bottlenecks are. Come back to this list for more advice
when you have evidence for where your bottleneck actually is.

As a side note: since the built-in search isn't really all that good,
would it be possible to put a customized google search box there instead?

Good luck
Paul
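A minimal sketch of the cProfile workflow Paul suggests (work() is a
placeholder for the fft-plus-post-processing pipeline being measured):

import cProfile
import pstats

def work():
    pass    # stand-in for the real computation

cProfile.run("work()", "profile.out")
pstats.Stats("profile.out").sort_stats("cumulative").print_stats(10)

The same thing can be done without touching the code at all:
python -m cProfile -s cumulative myscript.py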
From rogerb at rogerbinns.com  Tue Dec  6 15:24:19 2011
From: rogerb at rogerbinns.com (Roger Binns)
Date: Tue, 06 Dec 2011 12:24:19 -0800
Subject: [Numpy-discussion] What does fftn take as parameters?
In-Reply-To: <1BB94CB5-01B9-4F95-86B1-4DE8F50D1320@gmail.com>
Message-ID: <4EDE79F3.60607@rogerbinns.com>

On 06/12/11 10:45, Paul Anton Letnes wrote:
> As a side note: since the built-in search isn't really all that good,
> would it be possible to put a customized google search box there
> instead?

It is easy since Sphinx is being used. Copy searchbox.html from the
sphinx basic theme into the templates directory and alter the form to use
Google custom search instead.

> First google hit, "numpy axis":
> http://www.scipy.org/Tentative_NumPy_Tutorial

It is still somewhat confusing, but I guess trial and error will work it
out. It isn't at all obvious from the doc that you can do multiple FFTs
at once, as everything is written in the singular.

> If you are using numpy functions like FFT, you are indeed converting to
> arrays.

I have no issue with that. The data comes in over the network and is
supplied to the fft function as input, so it doesn't really make any
difference whether I convert it into the numpy array type or the fft
function does internally - it is still only done once.

> and if you are treating the returned array as a list, you are probably
> (among other things) indexing it in an inefficient way.

I iterate over it once, feeding the values into a separate calculation.

> If you are passing multidimensional lists here and there, you must have
> a lot of overhead memory use.

The data structures are less than 10MB in Python's internal PyObject
representation. The machines being used have 32GB of memory. There is
only one explicit or implicit conversion between different formats.

> I'm guessing that it's not your bottleneck (as others have suggested).

I'm doing work directed by a matlab/embedded developer and he is obsessed
with performance. I'm trying to interpret his advice to do all the ffts
at once in the context of the numpy fft apis. If you look at the fftw
apis they go on about plans and reusing them, so I assumed there was
something to all this :-) I'm currently rewriting the non-FFT bits in C
because they were far too slow. It is competing against a pure integer C
algorithm that gives the same output but has O(n^2) time.

Roger

From ralf.gommers at googlemail.com  Tue Dec  6 16:11:29 2011
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Tue, 6 Dec 2011 22:11:29 +0100
Subject: [Numpy-discussion] numpy 1.7.0 release?
In-Reply-To: References: Message-ID:

On Mon, Dec 5, 2011 at 8:43 PM, Ralf Gommers wrote:

> It's been a little over 6 months since the release of 1.6.0 and the NA
> debate has quieted down, so I'd like to ask your opinion on the timing
> of 1.7.0. It looks to me like we have a healthy amount of bug fixes and
> small improvements, plus three larger chunks of work:
>
> - datetime
> - NA
> - Bento support
>
> My impression is that both datetime and NA are releasable, but should
> be labeled "tech preview" or something similar, because they may still
> see significant changes. Please correct me if I'm wrong.
>
> There's still some maintenance work to do and pull requests to merge,
> but a beta release by Christmas should be feasible. What do you all
> think?

To be a bit more detailed here, these are the most significant pull
requests / patches that I think can be merged with a limited amount of
work:

meshgrid enhancements: http://projects.scipy.org/numpy/ticket/966
sample_from function: https://github.com/numpy/numpy/pull/151
loadtable function: https://github.com/numpy/numpy/pull/143

Other maintenance things:
- un-deprecate putmask
- clean up causes of "DType strings 'O4' and 'O8' are deprecated..."
- fix failing einsum and polyfit tests
- update release notes

Cheers,
Ralf

From wesmckinn at gmail.com  Tue Dec  6 17:13:43 2011
From: wesmckinn at gmail.com (Wes McKinney)
Date: Tue, 6 Dec 2011 17:13:43 -0500
Subject: [Numpy-discussion] numpy 1.7.0 release?
In-Reply-To: References: Message-ID:

On Tue, Dec 6, 2011 at 4:11 PM, Ralf Gommers wrote:
> To be a bit more detailed here, these are the most significant pull
> requests / patches that I think can be merged with a limited amount of
> work: [...]
> loadtable function: https://github.com/numpy/numpy/pull/143
This isn't the place for this discussion, but we should start talking
about building a *high performance* flat file loading solution with good
column type inference and sensible defaults, etc. It's clear that
loadtable is aiming for highest compatibility -- for example I can read a
2800x30 file in < 50 ms with the read_table / read_csv functions I wrote
myself recently in Cython (compared with loadtable taking > 1s as quoted
in the pull request), but I don't handle European decimal formats and
lots of other sources of unruliness. I personally don't believe in
sacrificing an order of magnitude of performance in the 90% case for the
10% case -- so maybe it makes sense to have two functions around: a
superfast custom CSV reader for well-behaved data, and a slower, but
highly flexible, function like loadtable to fall back on. I think R has
two functions read.csv and read.csv2, where read.csv2 is capable of
dealing with things like European decimal format.

- Wes

From olegmikul at gmail.com  Tue Dec  6 17:31:21 2011
From: olegmikul at gmail.com (Oleg Mikulya)
Date: Tue, 6 Dec 2011 14:31:21 -0800
Subject: [Numpy-discussion] Slow Numpy/MKL vs Matlab/MKL
Message-ID:

Hi,

How can I make Numpy match Matlab in terms of performance? I have tried
different options, using different MKL libraries and ICC versions, and
Numpy is still behind Matlab by ~2x for certain basic tasks. About 5
years ago I was able to get about the same speed, but not anymore. Matlab
is supposed to use the same MKL; what is the reason for such Numpy
slowness (besides one, yet fundamental, task)?

My conditions:

*site.cfg*:
<<<->>>
[DEFAULT]
library_dirs = /opt/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64
include_dirs = /opt/intel/composer_xe_2011_sp1.7.256/mkl/include
mkl_libs = mkl_mc3, mkl_intel_thread, mkl_intel_lp64, mkl_core
blas_libs = mkl_blas95_lp64
lapack_libs = mkl_lapack95_lp64

[lapack_opt]
library_dirs = /opt/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64
include_dirs = /opt/intel/composer_xe_2011_sp1.7.256/mkl/include
libraries = mkl_lapack95_lp64

[blas_opt]
library_dirs = /opt/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64
include_dirs = /opt/intel/composer_xe_2011_sp1.7.256/mkl/include
libraries = mkl_blas95_lp64
<<<->>>

*intelccompiler.py*:
<<<->>>
...
linker_flags = '-O3 -openmp -lpthread -xHOST -fPIC -parallel -m64'
compiler_opt_flags = '-static -xHOST -O3 -fPIC -mkl=parallel -ipo -parallel -m64'
icc_run_string = 'icc ' + compiler_opt_flags
icpc_run_string = 'icpc ' + compiler_opt_flags
linker_run_string = 'icc ' + linker_flags + ' -shared '
...
<<<->>>

*test.py*:
<<<->>>
import numpy
from numpy import random
import time

n = 10000
m = 10000
A = random.rand(n, m)
b = numpy.ones((n, 1))

tic = time.time()
x = numpy.linalg.solve(A, b)
dt = time.time() - tic
print "lin eq %7.1f" % dt

tic = time.time()
x = numpy.linalg.eig(A)
dt = time.time() - tic
print "eig    %7.1f" % dt

tic = time.time()
x = numpy.linalg.svd(A)
dt = time.time() - tic
print "svd    %7.1f" % dt
<<<->>>

*test.m*:
<<<->>>
n=10000; m=10000;
A=rand(n,m);
b=ones(n,1);
disp('lin eq'); tic; x=A\b; toc;
disp('Eig');    tic; x=eig(A); toc;
disp('SVD');    tic; x=svd(A); toc;
<<<->>>

Results (Linux 2.6.32...x86_64, Scientific Linux 6.1, i7, 12 GB RAM,
python 2.7, Numpy 1.6, Matlab R2011b), times in seconds:

Test     Python/Numpy   Matlab
lin eq   14.6           14.9
eig      750.8          312.8
svd      431.9          271.2
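Before chasing compiler flags, it is worth confirming what numpy was
actually linked against and isolating the LAPACK call itself; a minimal
sketch (np.show_config() prints whatever build-time BLAS/LAPACK
information numpy recorded, so its output depends on the build):

import numpy as np
import time

np.show_config()   # should list the MKL libraries if the build picked them up

# Time just the LAPACK-backed solve, the one case above where the two match.
n = 2000
A = np.random.rand(n, n)
b = np.ones((n, 1))
tic = time.time()
x = np.linalg.solve(A, b)
print("lin eq %.2f s" % (time.time() - tic))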
From derek at astro.physik.uni-goettingen.de  Wed Dec  7 00:52:33 2011
From: derek at astro.physik.uni-goettingen.de (Derek Homeier)
Date: Wed, 7 Dec 2011 00:52:33 +0100
Subject: [Numpy-discussion] numpy 1.7.0 release?
In-Reply-To: References: Message-ID: <689185A2-3936-4ABA-B73E-8773268F8580@astro.physik.uni-goettingen.de>

On 06.12.2011, at 11:13PM, Wes McKinney wrote:
> ... so maybe it makes sense to have two functions around: a superfast
> custom CSV reader for well-behaved data, and a slower, but highly
> flexible, function like loadtable to fall back on.

Generally I agree, there's a good case for that, but I have to point out
that the 1s time quoted there was with all auto-detection extravaganza
turned on. Actually, if I remember the discussions right, in default,
single-pass reading mode it comes close to genfromtxt and loadtxt (on my
machine 150-200 ms for 2800 rows x 30 columns real*8). Originally loadtxt
was intended to be that no-frills, fast reader, but in practice it is
rarely faster than genfromtxt, as the conversion from input strings to
Python objects seems to be the bottleneck most of the time. Speeding that
up using Cython would certainly be a big gain (and then there is also the
request to make loadtxt memory-efficient, which I have failed to follow
up on for weeks and weeks...)

Cheers,
Derek

From questions.anon at gmail.com  Tue Dec  6 19:32:26 2011
From: questions.anon at gmail.com (questions anon)
Date: Wed, 7 Dec 2011 11:32:26 +1100
Subject: [Numpy-discussion] loop through values in a array and find maximum as looping
Message-ID:

I would like to produce an array with the maximum values out of many
(10000s) of arrays. I need to loop through many multidimensional arrays
and, if a value is larger (in the same place as in the previous array),
replace it.

e.g.
a=[1,1,2,2
   1,1,2,2
   1,1,2,2]
b=[1,1,3,2
   2,1,0,0
   1,1,2,0]

where b>a, replace with the value in b, so the new a should be:

a=[1,1,3,2
   2,1,2,2
   1,1,2,2]

and then keep looping through many arrays and replace whenever a value is
larger.

I have tried numpy.putmask but that results in
TypeError: putmask() argument 1 must be numpy.ndarray, not list
Any other ideas? Thanks

From shish at keba.be  Tue Dec  6 19:55:43 2011
From: shish at keba.be (Olivier Delalleau)
Date: Tue, 6 Dec 2011 19:55:43 -0500
Subject: [Numpy-discussion] loop through values in a array and find maximum as looping
In-Reply-To: References: Message-ID:

It may not be the most efficient way to do this, but you can do:

mask = b > a
a[mask] = b[mask]

-=- Olivier

From josef.pktd at gmail.com  Tue Dec  6 20:36:10 2011
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Tue, 6 Dec 2011 20:36:10 -0500
Subject: [Numpy-discussion] loop through values in a array and find maximum as looping
In-Reply-To: References: Message-ID:

On Tue, Dec 6, 2011 at 7:55 PM, Olivier Delalleau wrote:
> It may not be the most efficient way to do this, but you can do:
> mask = b > a
> a[mask] = b[mask]
If I understand correctly, it's a maximum.reduce (or minimum.reduce):

numpy:

>>> a = np.concatenate((np.arange(5)[::-1], np.arange(5)))*np.ones((4,3,1))
>>> np.minimum.reduce(a, axis=2)
array([[ 0.,  0.,  0.],
       [ 0.,  0.,  0.],
       [ 0.,  0.,  0.],
       [ 0.,  0.,  0.]])
>>> a.T.shape
(10, 3, 4)

python with an iterable:

>>> reduce(np.maximum, a.T)
array([[ 4.,  4.,  4.,  4.],
       [ 4.,  4.,  4.,  4.],
       [ 4.,  4.,  4.,  4.]])
>>> reduce(np.minimum, a.T)
array([[ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.]])

Josef

From charlesr.harris at gmail.com  Tue Dec  6 20:40:39 2011
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 6 Dec 2011 18:40:39 -0700
Subject: [Numpy-discussion] numpy 1.7.0 release?
In-Reply-To: References: Message-ID:

On Mon, Dec 5, 2011 at 12:43 PM, Ralf Gommers wrote:
> There's still some maintenance work to do and pull requests to merge,
> but a beta release by Christmas should be feasible. What do you all
> think?

Sounds reasonable. I'd also like to add some features to the polynomial
modules.

1. Improved zero finding with Gaussian quadrature. With the improved
zeros and using an expression for Gaussian weights, things work well up
to degree 100.

2. A small change to the evaluation functions adding a "tensor" keyword
and allowing arrays of coefficients. This change allows easy evaluation
of tensor products of the polynomials, either on a grid or on a set of
points, which is handy for working in two or three dimensions.

I think the recent fixes for user types should also go in, but I'd like
Mark to review them first. The R-like sampling function could also go in;
I just had a question about the name 'choice', but if that is an
R-compatible name I'm OK with it. In fact, I think many of the current
pull requests could go in with the addition of a few tests in some cases.
I'll fix the ma.polyfit function test sometime this week. Maybe we should
have a weekend devoted to getting the current requests in.

Chuck

From questions.anon at gmail.com  Tue Dec  6 20:58:12 2011
From: questions.anon at gmail.com (questions anon)
Date: Wed, 7 Dec 2011 12:58:12 +1100
Subject: [Numpy-discussion] loop through values in a array and find maximum as looping
In-Reply-To: References: Message-ID:

Hi Olivier,
No, that does not seem to do anything. Am I missing another step?
Wherever b is greater than a, replace a with b?
Thanks
On Wed, Dec 7, 2011 at 11:55 AM, Olivier Delalleau wrote:
> It may not be the most efficient way to do this, but you can do:
> mask = b > a
> a[mask] = b[mask]

From questions.anon at gmail.com  Tue Dec  6 21:04:25 2011
From: questions.anon at gmail.com (questions anon)
Date: Wed, 7 Dec 2011 13:04:25 +1100
Subject: [Numpy-discussion] loop through values in a array and find maximum as looping
In-Reply-To: References: Message-ID:

Thanks for responding Josef, but that is not really what I am looking
for. I have a multidimensional array, and if the next array has any
values greater than what is in my first array I want to replace them. The
data are contained in netcdf files.
I can achieve what I want if I combine all of my arrays using numpy
concatenate and then use the command numpy.max(myarray, axis=0), but
because I have so many arrays I end up with a memory error, so I need to
find a way to get the maximum while looping.

From shish at keba.be  Tue Dec  6 21:05:03 2011
From: shish at keba.be (Olivier Delalleau)
Date: Tue, 6 Dec 2011 21:05:03 -0500
Subject: [Numpy-discussion] loop through values in a array and find maximum as looping
In-Reply-To: References: Message-ID:

Weird, it worked for me (with a and b two 1d numpy arrays). Anyway,
Josef's solution is probably much more efficient (especially if you can
put all your arrays into a single tensor).

-=- Olivier
From questions.anon at gmail.com  Tue Dec  6 21:06:15 2011
From: questions.anon at gmail.com (questions anon)
Date: Wed, 7 Dec 2011 13:06:15 +1100
Subject: [Numpy-discussion] loop through values in a array and find maximum as looping
In-Reply-To: References: Message-ID:

I have 2d numpy arrays.

From shish at keba.be  Tue Dec  6 21:07:04 2011
From: shish at keba.be (Olivier Delalleau)
Date: Tue, 6 Dec 2011 21:07:04 -0500
Subject: [Numpy-discussion] loop through values in a array and find maximum as looping
In-Reply-To: References: Message-ID:

If you need to do them one after the other, numpy.maximum(a, b) will do
it (it won't work in-place on 'a' though, it'll make a new copy).

-=- Olivier

From njs at pobox.com  Tue Dec  6 21:08:10 2011
From: njs at pobox.com (Nathaniel Smith)
Date: Tue, 6 Dec 2011 21:08:10 -0500
Subject: [Numpy-discussion] loop through values in a array and find maximum as looping
In-Reply-To: References: Message-ID:

I think you want

  np.maximum(a, b, out=a)

- Nathaniel

From shish at keba.be  Tue Dec  6 21:12:47 2011
From: shish at keba.be (Olivier Delalleau)
Date: Tue, 6 Dec 2011 21:12:47 -0500
Subject: [Numpy-discussion] loop through values in a array and find maximum as looping
In-Reply-To: References: Message-ID:

Thanks, I didn't know you could specify the out array :)

(To the OP: my initial suggestion, although probably not very efficient,
seems to work with 2D arrays too, so I have no idea why it didn't work
for you -- but Nathaniel's one seems to be the ideal one anyway.)

-=- Olivier

From questions.anon at gmail.com  Tue Dec  6 21:23:48 2011
From: questions.anon at gmail.com (questions anon)
Date: Wed, 7 Dec 2011 13:23:48 +1100
Subject: [Numpy-discussion] loop through values in a array and find maximum as looping
In-Reply-To: References: Message-ID:

Thanks for all of your help. That does look appropriate, but I am not
sure how to loop it over thousands of files. I need to keep the first
array to compare with, but replace any greater values as I loop through
each array, comparing back to the same array. Does that make sense?
Thanks >>>> >>>> if I understand correctly it's a minimum.reduce >>>> >>>> numpy >>>> >>>> >>> a = np.concatenate((np.arange(5)[::-1], >>>> np.arange(5)))*np.ones((4,3,1)) >>>> >>> np.minimum.reduce(a, axis=2) >>>> array([[ 0., 0., 0.], >>>> [ 0., 0., 0.], >>>> [ 0., 0., 0.], >>>> [ 0., 0., 0.]]) >>>> >>> a.T.shape >>>> (10, 3, 4) >>>> >>>> python with iterable >>>> >>>> >>> reduce(np.maximum, a.T) >>>> array([[ 4., 4., 4., 4.], >>>> [ 4., 4., 4., 4.], >>>> [ 4., 4., 4., 4.]]) >>>> >>> reduce(np.minimum, a.T) >>>> array([[ 0., 0., 0., 0.], >>>> [ 0., 0., 0., 0.], >>>> [ 0., 0., 0., 0.]]) >>>> >>>> Josef >>>> >>>> >> >>>> >> _______________________________________________ >>>> >> NumPy-Discussion mailing list >>>> >> NumPy-Discussion at scipy.org >>>> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> >> >>>> > >>>> > >>>> > _______________________________________________ >>>> > NumPy-Discussion mailing list >>>> > NumPy-Discussion at scipy.org >>>> > http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> > >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> >>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From shish at keba.be Tue Dec 6 21:36:43 2011 From: shish at keba.be (Olivier Delalleau) Date: Tue, 6 Dec 2011 21:36:43 -0500 Subject: [Numpy-discussion] loop through values in a array and find maximum as looping In-Reply-To: References: Message-ID: The "out=a" keyword will ensure your first array will keep being updated. So you can do something like: a = my_list_of_arrays[0] for b in my_list_of_arrays[1:]: numpy.maximum(a, b, out=a) -=- Olivier 2011/12/6 questions anon > thanks for all of your help, that does look appropriate but I am not sure > how to loop it over thousands of files. > I need to keep the first array to compare with but replace any greater > values as I loop through each array comparing back to the same array. does > that make sense? > > > On Wed, Dec 7, 2011 at 1:12 PM, Olivier Delalleau wrote: > >> Thanks, I didn't know you could specify the out array :) >> >> (to the OP: my initial suggestion, although probably not very efficient, >> seems to work with 2D arrays too, so I have no idea why it didn't work for >> you -- but Nathaniel's one seems to be the ideal one anyway). >> >> -=- Olivier >> >> >> 2011/12/6 Nathaniel Smith >> >>> I think you want >>> np.maximum(a, b, out=a) >>> >>> - Nathaniel >>> On Dec 6, 2011 9:04 PM, "questions anon" >>> wrote: >>> >>>> thanks for responding Josef but that is not really what I am looking >>>> for, I have a multidimensional array and if the next array has any values >>>> greater than what is in my first array I want to replace them. The data are >>>> contained in netcdf files. 
From josef.pktd at gmail.com Tue Dec 6 21:44:42 2011
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Tue, 6 Dec 2011 21:44:42 -0500
Subject: [Numpy-discussion] loop through values in a array and find maximum as looping
In-Reply-To: References: Message-ID:

On Tue, Dec 6, 2011 at 9:36 PM, Olivier Delalleau wrote:
> The "out=a" keyword will ensure your first array will keep being updated.
> So you can do something like:
>
> a = my_list_of_arrays[0]
> for b in my_list_of_arrays[1:]:
>     numpy.maximum(a, b, out=a)

I didn't think of the out argument, which makes it more efficient, but
in my example I used Python's reduce, which takes an iterable and not
one huge array.

Josef
[clip]
From questions.anon at gmail.com Tue Dec 6 22:13:42 2011
From: questions.anon at gmail.com (questions anon)
Date: Wed, 7 Dec 2011 14:13:42 +1100
Subject: [Numpy-discussion] loop through values in a array and find maximum as looping
In-Reply-To: References: Message-ID:

thanks again, my only problem though is that the out=a in the loop does not
seem to replace my a= outside the loop, so my final a is whatever I started
with for a.
Not sure what I am doing wrong, whether it is something with the loop or
with the command.

On Wed, Dec 7, 2011 at 1:44 PM, wrote:
> On Tue, Dec 6, 2011 at 9:36 PM, Olivier Delalleau wrote:
> > The "out=a" keyword will ensure your first array will keep being updated.
[clip]
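For what it's worth, a tiny made-up example showing that
np.maximum(..., out=a) really does mutate the outer a in place; the
symptom described above usually means the loop re-binds the name a
(e.g. a=TSFC[0] inside the loop) rather than writing into it:

import numpy as np

a = np.zeros(3)
b = np.array([0., 2., 1.])
np.maximum(a, b, out=a)   # writes the result into a's own buffer
print(a)                  # -> [ 0.  2.  1.], the name outside the loop sees the update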
From shish at keba.be Tue Dec 6 22:34:41 2011
From: shish at keba.be (Olivier Delalleau)
Date: Tue, 6 Dec 2011 22:34:41 -0500
Subject: [Numpy-discussion] loop through values in a array and find maximum as looping
In-Reply-To: References: Message-ID:

Is 'a' a regular numpy array or something fancier?

-=- Olivier

2011/12/6 questions anon
> thanks again, my only problem though is that the out=a in the loop does not
> seem to replace my a= outside the loop, so my final a is whatever I started
> with for a.
[clip]

From questions.anon at gmail.com Tue Dec 6 22:49:52 2011
From: questions.anon at gmail.com (questions anon)
Date: Wed, 7 Dec 2011 14:49:52 +1100
Subject: [Numpy-discussion] loop through values in a array and find maximum as looping
In-Reply-To: References: Message-ID:

Something fancier I think,
I am able to compare the result with my previous method, so I can easily
see I am doing something wrong.
see code below:

all_TSFC=[]
for (path, dirs, files) in os.walk(MainFolder):
    for dir in dirs:
        print dir
    path=path+'/'
    for ncfile in files:
        if ncfile[-3:]=='.nc':
            print "dealing with ncfiles:", ncfile
            ncfile=os.path.join(path,ncfile)
            ncfile=Dataset(ncfile, 'r+', 'NETCDF4')
            TSFC=ncfile.variables['T_SFC'][:]
            fillvalue=ncfile.variables['T_SFC']._FillValue
            TSFC=MA.masked_values(TSFC, fillvalue)
            ncfile.close()
            all_TSFC.append(TSFC)
            a=TSFC[0]
            for b in TSFC[1:]:
                N.maximum(a,b,out=a)

big_array=N.ma.concatenate(all_TSFC)
Max=big_array.max(axis=0)
print "max is", Max,"a is", a

On Wed, Dec 7, 2011 at 2:34 PM, Olivier Delalleau wrote:
> Is 'a' a regular numpy array or something fancier?
[clip]
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion at scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

From shish at keba.be Tue Dec 6 23:07:53 2011
From: shish at keba.be (Olivier Delalleau)
Date: Tue, 6 Dec 2011 23:07:53 -0500
Subject: [Numpy-discussion] loop through values in a array and find maximum as looping
In-Reply-To: References: Message-ID:

I *think* it may work better if you replace the last 3 lines in your loop by:

a=all_TSFC[0]
if len(all_TSFC) > 1:
    N.maximum(a, TSFC, out=a)

Not 100% sure that would work though, as I'm not entirely confident I
understand your code.

-=- Olivier

2011/12/6 questions anon
> Something fancier I think,
> I am able to compare the result with my previous method, so I can easily
> see I am doing something wrong.
[clip]

From derek at astro.physik.uni-goettingen.de Tue Dec 6 23:21:23 2011
From: derek at astro.physik.uni-goettingen.de (Derek Homeier)
Date: Wed, 7 Dec 2011 05:21:23 +0100
Subject: [Numpy-discussion] loop through values in a array and find maximum as looping
In-Reply-To: References: Message-ID: <92C17D98-ECB0-4E4B-967E-77F907C1BB67@astro.physik.uni-goettingen.de>

On 07.12.2011, at 5:07AM, Olivier Delalleau wrote:
> I *think* it may work better if you replace the last 3 lines in your loop by:
>
> a=all_TSFC[0]
> if len(all_TSFC) > 1:
>     N.maximum(a, TSFC, out=a)
[clip]

I also understood TSFC is already the array you want to work on, so above
you'd just take a slice and overwrite the result in the next file iteration
anyway. Iterating over the list all_TSFC should be correct, but I understood
you don't want to load the entire input into memory in your working code.
Then you can simply skip the list, just need to take care of initial
conditions - something like the following should do:

path=path+'/'
a = None
for ncfile in files:
    if ncfile[-3:]=='.nc':
        print "dealing with ncfiles:", ncfile
        ncfile=os.path.join(path,ncfile)
        ncfile=Dataset(ncfile, 'r+', 'NETCDF4')
        TSFC=ncfile.variables['T_SFC'][:]
        fillvalue=ncfile.variables['T_SFC']._FillValue
        TSFC=MA.masked_values(TSFC, fillvalue)
        ncfile.close()
        if not is instance(a,N.ndarray):
            a=TSFC
        else:
            N.maximum(a, TSFC, out=a)

HTH,
                        Derek

From Tim.Burgess at noaa.gov Tue Dec 6 23:42:44 2011
From: Tim.Burgess at noaa.gov (Tim Burgess)
Date: Wed, 07 Dec 2011 14:42:44 +1000
Subject: [Numpy-discussion] loop through values in a array and find maximum as looping
In-Reply-To: References: Message-ID:

On 07/12/2011, at 1:49 PM, questions anon wrote:
> fillvalue=ncfile.variables['T_SFC']._FillValue
> TSFC=MA.masked_values(TSFC, fillvalue)

You can probably also eliminate the above two lines from your code.

> TSFC=ncfile.variables['T_SFC'][:]

If your NetCDF files are properly structured, the above line will give you
a masked array. If you really need to put a fill value in to go to a
non-masked array, better to do this just once after the maximums have been
determined.

Tim

From questions.anon at gmail.com Tue Dec 6 23:54:34 2011
From: questions.anon at gmail.com (questions anon)
Date: Wed, 7 Dec 2011 15:54:34 +1100
Subject: [Numpy-discussion] loop through values in a array and find maximum as looping
In-Reply-To: References: Message-ID:

sorry, the 'all_TSFC' is for my other check of the maximum using concatenate
and N.max; I know that works, so I am comparing it to this method. The only
reason I need another method is for memory error issues.
I like the code I have written so far as it makes sense to me. I can't get
the extra examples I have been given to work, and that is most likely
because I don't understand them; these are the errors I get:

Traceback (most recent call last):
  File "d:\plot_summarystats\test_plot_remove_memoryerror_max.py", line 46, in
    N.maximum(a,TSFC,out=a)
ValueError: non-broadcastable output operand with shape (106,193) doesn't
match the broadcast shape (721,106,193)

and

Traceback (most recent call last):
  File "d:\plot_summarystats\test_plot_remove_memoryerror_max.py", line 45, in
    if not instance(a, N.ndarray):
NameError: name 'instance' is not defined

On Wed, Dec 7, 2011 at 3:07 PM, Olivier Delalleau wrote:
> I *think* it may work better if you replace the last 3 lines in your loop by:
[clip]
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion at scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
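For context, the first traceback above is a pure shape mismatch; a sketch
using the shapes from the error message (zero-filled placeholders stand in
for the real data):

import numpy as np

a = np.zeros((106, 193))          # the 2D running maximum
TSFC = np.zeros((721, 106, 193))  # one file: 721 time steps of 106x193 grids

# np.maximum(a, TSFC, out=a) fails: a cannot hold a (721,106,193) result.
# Either loop over the leading axis...
for b in TSFC:
    np.maximum(a, b, out=a)
# ...or, equivalently, collapse that axis first:
np.maximum(a, TSFC.max(axis=0), out=a)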
From derek at astro.physik.uni-goettingen.de Wed Dec 7 00:11:27 2011
From: derek at astro.physik.uni-goettingen.de (Derek Homeier)
Date: Wed, 7 Dec 2011 06:11:27 +0100
Subject: [Numpy-discussion] loop through values in a array and find maximum as looping
In-Reply-To: References: Message-ID: <9C212F22-2AF9-45B8-99E0-E232899D7A2E@astro.physik.uni-goettingen.de>

On 07.12.2011, at 5:54AM, questions anon wrote:
> sorry, the 'all_TSFC' is for my other check of the maximum using concatenate
> and N.max; I know that works, so I am comparing it to this method. The only
> reason I need another method is for memory error issues.
> I like the code I have written so far as it makes sense to me. I can't get
> the extra examples I have been given to work, and that is most likely
> because I don't understand them; these are the errors I get:
>
> Traceback (most recent call last):
>   File "d:\plot_summarystats\test_plot_remove_memoryerror_max.py", line 46, in
>     N.maximum(a,TSFC,out=a)
> ValueError: non-broadcastable output operand with shape (106,193) doesn't
> match the broadcast shape (721,106,193)

OK, then it seems we did indeed not grasp the entire scope of the problem -
since you have initialised a from the previous array TSFC (not from
TSFC[0]?!), this can only mean the arrays read in come in different shapes?
I don't quite understand how the previous version did not raise an error
then; but if you only want the (106,193)-subarray you have indeed to keep
the loop

for b in TSFC[:]:
    N.maximum(a,b,out=a)

But you would have to find some way to distinguish between ndim=2 and
ndim=3 input, if really both can occur...

> Traceback (most recent call last):
>   File "d:\plot_summarystats\test_plot_remove_memoryerror_max.py", line 45, in
>     if not instance(a, N.ndarray):
> NameError: name 'instance' is not defined

Sorry, typing error (or devious auto-correct?) - this should be 'isinstance()'

Cheers,
                        Derek

From questions.anon at gmail.com Wed Dec 7 00:26:09 2011
From: questions.anon at gmail.com (questions anon)
Date: Wed, 7 Dec 2011 16:26:09 +1100
Subject: [Numpy-discussion] loop through values in a array and find maximum as looping
In-Reply-To: <9C212F22-2AF9-45B8-99E0-E232899D7A2E@astro.physik.uni-goettingen.de> References: <9C212F22-2AF9-45B8-99E0-E232899D7A2E@astro.physik.uni-goettingen.de> Message-ID:

thanks for all your responses. I think I have FINALLY worked it out with
all of your help. I just assigned one array from one ncfile to "a" at the
beginning of my code and then ran the loop, and it worked!! sorry for all
the questions, but I learn so much playing and getting ideas from others.
Thanks again. Code below for anyone else that needs to do the same.
onefile=Dataset("E:/01/IDZ00026_T_SFC.nc", 'r+', 'NETCDF4')
oneTSFC=onefile.variables['T_SFC'][:]
a=oneTSFC[0]
for (path, dirs, files) in os.walk(MainFolder):
    for dir in dirs:
        print dir
    path=path+'/'
    for ncfile in files:
        if ncfile[-3:]=='.nc':
            ncfile=os.path.join(path,ncfile)
            ncfile=Dataset(ncfile, 'r+', 'NETCDF4')
            TSFC=ncfile.variables['T_SFC'][:]
            ncfile.close()
            for b in TSFC[:]:
                N.maximum(a,b, out=a)
print a

On Wed, Dec 7, 2011 at 4:11 PM, Derek Homeier
<derek at astro.physik.uni-goettingen.de> wrote:
> OK, then it seems we did indeed not grasp the entire scope of the problem -
[clip]
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion at scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

From magnetotellurics at gmail.com Wed Dec 7 02:20:44 2011
From: magnetotellurics at gmail.com (kneil)
Date: Tue, 6 Dec 2011 23:20:44 -0800 (PST)
Subject: [Numpy-discussion] Apparently non-deterministic behaviour of complex array multiplication
In-Reply-To: References: <4ED76256.4020909@crans.org> <32898383.post@talk.nabble.com> <32900355.post@talk.nabble.com> <32906553.post@talk.nabble.com> Message-ID: <32927196.post@talk.nabble.com>

Hi Nathaniel,
The result of running memtest was a pass, with no errors.
-Karl

Nathaniel Smith wrote:
>
> (You should still run memtest. It's very easy - just install it with your
> package manager, then reboot. Hold down the shift key while booting, and
> you'll get a boot menu. Choose memtest, and then leave it to run
> overnight.)
>
> - Nathaniel
>

--
View this message in context: http://old.nabble.com/Apparently-non-deterministic-behaviour-of-complex-array-multiplication-tp32893004p32927196.html
Sent from the Numpy-discussion mailing list archive at Nabble.com.
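Stepping back to the running-maximum thread above, here is one possible
condensed, self-contained arrangement of the final approach. MainFolder,
the directory walk and the T_SFC variable name follow the posted scripts;
collapsing each file with max(axis=0) and combining with np.ma.maximum (to
respect the masks, per Tim's note) are this sketch's choices, not the
poster's exact script:

import os
import numpy as np
from netCDF4 import Dataset   # the netCDF4-python package used in the thread

MainFolder = "E:/temp_samples/"   # placeholder path

running_max = None
for (path, dirs, files) in os.walk(MainFolder):
    for name in files:
        if name.endswith('.nc'):
            nc = Dataset(os.path.join(path, name), 'r')
            TSFC = nc.variables['T_SFC'][:]   # a masked array when _FillValue is set
            nc.close()
            frame_max = TSFC.max(axis=0)      # collapse this file's time axis first
            if running_max is None:
                running_max = frame_max
            else:
                running_max = np.ma.maximum(running_max, frame_max)
print(running_max)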
From pav at iki.fi Wed Dec 7 05:02:58 2011
From: pav at iki.fi (Pauli Virtanen)
Date: Wed, 07 Dec 2011 11:02:58 +0100
Subject: [Numpy-discussion] Slow Numpy/MKL vs Matlab/MKL
In-Reply-To: 
References: 
Message-ID: 

06.12.2011 23:31, Oleg Mikulya kirjoitti:
> How to make Numpy match Matlab in terms of performance? I have tried
> different options, using different MKL libraries and ICC versions, and
> still Numpy is below Matlab for certain basic tasks by ~2x. About 5
> years ago I was able to get about the same speed, but not anymore. Matlab
> is supposed to use the same MKL, so what is the reason for such Numpy
> slowness (besides one, yet fundamental, task)?

There should be no reason for a difference. It simply makes the calls to
the external library, and the wrapper code is straightforward.

If Numpy is indeed linked against MKL (check the build log), then one
possible reason could be different threading options passed to MKL.

-- 
Pauli Virtanen

From pierre.haessig at crans.org Wed Dec 7 05:24:26 2011
From: pierre.haessig at crans.org (Pierre Haessig)
Date: Wed, 07 Dec 2011 11:24:26 +0100
Subject: [Numpy-discussion] numpy 1.7.0 release?
In-Reply-To: 
References: 
Message-ID: <4EDF3EDA.2000003@crans.org>

Le 06/12/2011 23:13, Wes McKinney a écrit :
> I think R has two functions read.csv and read.csv2, where read.csv2 is
> capable of dealing with things like European decimal format.

I may be wrong, but from R's help I understand that read.csv, read.csv2, read.delim, ... are just calls to read.table with different default values (for separator, decimal sign, ...). This function read.table is indeed pretty flexible (see signatures below).

Having a dedicated fast function for properly formatted CSV tables may be a good idea. But how to define "properly formatted"... I've seen so many tiny variations that I'm not sure!

Now, for my personal use, I was not so frustrated by loading performance but rather by NA support, so I wrote my own loadCsv function to get a masked array. It was neither beautiful nor very efficient, but it does the job!

Best,
Pierre

read.table & co signatures:

read.table(file, header = FALSE, sep = "", quote = "\"'", dec = ".",
           row.names, col.names, as.is = !stringsAsFactors,
           na.strings = "NA", colClasses = NA, nrows = -1,
           skip = 0, check.names = TRUE, fill = !blank.lines.skip,
           strip.white = FALSE, blank.lines.skip = TRUE,
           comment.char = "#", allowEscapes = FALSE, flush = FALSE,
           stringsAsFactors = default.stringsAsFactors(),
           fileEncoding = "", encoding = "unknown", text)

read.csv(file, header = TRUE, sep = ",", quote="\"", dec=".",
         fill = TRUE, comment.char="", ...)

read.csv2(file, header = TRUE, sep = ";", quote="\"", dec=",",
          fill = TRUE, comment.char="", ...)

---------------------------------------------------------
Copy-paste from my own dirty "csv toolbox":

NA = -9999.

def _NA_conv(s):
    '''convert a string number representation into a float,
    with a special behaviour for "NA" values:
    if s=="" or "NA", it returns the key value NA (set to -9999.)
    '''
    if s=='' or s=='NA':
        return NA
    else:
        return float(s)

def loadCsv(filename, delimiter=',', usecols=None, skiprows=1):
    '''wrapper around numpy.loadtxt to load a properly R-formatted CSV file
    with NA values, of which the first row should be a header row

    Returns
    -------
    (headers, data, dataNAs)
    '''
    # 1) Read the header
    headers = []
    with open(filename) as f:
        line = f.readline().strip()
        headers = line.split(delimiter)
    if usecols:
        headers = [headers[i] for i in usecols]
    # 2) Read the data
    converters = None
    if usecols is not None:
        converters = dict(zip(usecols, [_NA_conv]*len(usecols)))
    data = np.loadtxt(filename, delimiter=delimiter, usecols=usecols,
                      skiprows=skiprows, converters=converters)
    dataNAs = (data == NA)
    # Set NAs to zero
    data[dataNAs] = 0.
    # Transform the array into a "masked array"
    data = np.ma.masked_array(data, dataNAs)
    return (headers, data, dataNAs)

From pgmdevlist at gmail.com Wed Dec 7 06:42:16 2011
From: pgmdevlist at gmail.com (Pierre GM)
Date: Wed, 7 Dec 2011 12:42:16 +0100
Subject: [Numpy-discussion] numpy 1.7.0 release?
In-Reply-To: <4EDF3EDA.2000003@crans.org>
References: <4EDF3EDA.2000003@crans.org>
Message-ID: 

On Dec 07, 2011, at 11:24 , Pierre Haessig wrote:
>
> Now, for my personal use, I was not so frustrated by loading performance
> but rather by NA support, so I wrote my own loadCsv function to get a
> masked array. It was neither beautiful nor very efficient, but it does
> the job!

Ever tried to use genfromtxt?

From pierre.haessig at crans.org Wed Dec 7 07:42:48 2011
From: pierre.haessig at crans.org (Pierre Haessig)
Date: Wed, 07 Dec 2011 13:42:48 +0100
Subject: [Numpy-discussion] numpy 1.7.0 release?
In-Reply-To: 
References: <4EDF3EDA.2000003@crans.org>
Message-ID: <4EDF5F48.8070801@crans.org>

Le 07/12/2011 12:42, Pierre GM a écrit :
> Ever tried to use genfromtxt?
You guessed it, I didn't... next time I will ;-)
Thanks for the tip!

Best,
Pierre

From shish at keba.be Wed Dec 7 08:43:14 2011
From: shish at keba.be (Olivier Delalleau)
Date: Wed, 7 Dec 2011 08:43:14 -0500
Subject: [Numpy-discussion] Apparently non-deterministic behaviour of complex array multiplication
In-Reply-To: <32922174.post@talk.nabble.com>
References: <4ED76256.4020909@crans.org> <32898383.post@talk.nabble.com> <32900355.post@talk.nabble.com> <32906553.post@talk.nabble.com> <32922174.post@talk.nabble.com>
Message-ID: 

I was trying to see if I could reproduce this problem, but your code fails with numpy 1.6.1 with:

AttributeError: 'numpy.ndarray' object has no attribute 'H'

Is X supposed to be a regular ndarray with dtype = 'complex128', or something else?
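In case it helps pinpoint the mismatch, here is the minimal sketch of what I am assuming on my side (the .H attribute only exists on np.matrix, so I wrap the array with np.asmatrix; k=1 is just for the example):

import numpy as np

X = np.asmatrix(np.array([[1+2j, 3-1j], [0+1j, 2+0j]]))
k = 1
S = X * X.H / k                        # works because X is a matrix
Xa = np.asarray(X)
S_arr = np.dot(Xa, Xa.conj().T) / k    # the plain-ndarray spelling of the same product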
-=- Olivier

2011/12/5 kneil 

> Hi Nathaniel,
> Thanks for the suggestion. I more or less implemented it:
>
>     np.save('X',X);
>     X2=np.load('X.npy')
>     X2=np.asmatrix(X2)
>     diffy = (X != X2)
>     if diffy.any():
>         print X[diffy]
>         print X2[diffy]
>         print X[diffy][0].view(np.uint8)
>         print X2[diffy][0].view(np.uint8)
>     S=X*X.H/k
>     S2=X2*X2.H/k
>
>     nanElts=find(isnan(S))
>     if len(nanElts)!=0:
>         print 'WARNING: Nans in S:'+str(find(isnan(S)))
>         print 'WARNING: Nans in S2:'+str(find(isnan(S2)))
>
> My output (when I got NaN) mostly indicated that both arrays are numerically
> identical, and that they evaluated to have the same nan-value entries.
>
> For example:
> >>WARNING: Nans in S:[ 6 16]
> >>WARNING: Nans in S2:[ 6 16]
>
> Another time I got as output:
>
> >>WARNING: Nans in S:[ 26  36  46  54  64  72  82  92 100 110 128 138 146 156 166 174 184 192
>  202 212 220 230 240 250 260 268 278 279 296 297 306 314 324 334 335 342
>  352 360 370 380 388 398 416 426 434 444 454 464 474]
> >>WARNING: Nans in S2:[ 26  36  46  54  64  72  82  92 100 110 128 138 146 156 166 174 184 192
>  202 212 220 230 240 250 260 268 278 279 296 297 306 314 324 334 335 342
>  352 360 370 380 388 398 416 426 434 444 454 464 474]
>
> These were different arrays, I think. At any rate, those two results appeared
> from two runs of the exact same code. I do not use any random numbers in
> the code, by the way. Most of the time the code runs without any nan showing
> up at all, so this is an improvement.
>
> *I am pretty sure that one time there were nan in S, but not in S2, yet
> still no difference was observed in the two matrices X and X2. But I did
> not save that output, so I can't prove it to myself, ... but I am pretty
> sure I saw that.
>
> I will try and run memtest tonight. I am going out of town for a week and
> probably won't be able to test until next week.
> cheers,
> Karl
>
> Beyond that:
> 1. I now get many fewer NaN than I used to, but I still get NaN in S,
> and NOT in S2!
>
> Nathaniel Smith wrote:
> >
> > If save/load actually makes a reliable difference, then it would be useful
> > to do something like this, and see what you see:
> >
> > save("X", X)
> > X2 = load("X.npy")
> > diff = (X != X2)
> > # did save/load change anything?
> > any(diff)
> > # if so, then what changed?
> > X[diff]
> > X2[diff]
> > # any subtle differences in floating point representation?
> > X[diff][0].view(np.uint8)
> > X2[diff][0].view(np.uint8)
> >
> > (You should still run memtest. It's very easy - just install it with your
> > package manager, then reboot. Hold down the shift key while booting, and
> > you'll get a boot menu. Choose memtest, and then leave it to run
> > overnight.)
> >
> > - Nathaniel
> > On Dec 2, 2011 10:10 PM, "kneil" wrote:
>
> --
> View this message in context: http://old.nabble.com/Apparently-non-deterministic-behaviour-of-complex-array-multiplication-tp32893004p32922174.html
> Sent from the Numpy-discussion mailing list archive at Nabble.com.
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

From thouis at gmail.com Wed Dec 7 11:23:14 2011
From: thouis at gmail.com (Thouis (Ray) Jones)
Date: Wed, 7 Dec 2011 17:23:14 +0100
Subject: [Numpy-discussion] numpy 1.7.0 release?
In-Reply-To: 
References: 
Message-ID: 

On Tue, Dec 6, 2011 at 22:11, Ralf Gommers wrote:
> To be a bit more detailed here, these are the most significant pull requests
> / patches that I think can be merged with a limited amount of work:
> meshgrid enhancements: http://projects.scipy.org/numpy/ticket/966
> sample_from function: https://github.com/numpy/numpy/pull/151
> loadtable function: https://github.com/numpy/numpy/pull/143
>
> Other maintenance things:
> - un-deprecate putmask
> - clean up causes of "DType strings 'O4' and 'O8' are deprecated..."
> - fix failing einsum and polyfit tests
> - update release notes

I'd suggest that, if possible, someone with sufficient knowledge to evaluate it look at ticket #1990 (data truncation from arrays of strings and integers), since it's both potentially dangerous and a new bug introduced between 1.5.1 and 1.6.1. It might be straightforward to fix without too much difficulty, and if so, I think it's probably worth it.

My opinion might be unduly influenced by a collaborator having been bitten by this bug recently, and having to throw away and redo a few weeks of calculations and analysis.

Ray Jones

From Chris.Barker at noaa.gov Wed Dec 7 13:50:14 2011
From: Chris.Barker at noaa.gov (Chris.Barker)
Date: Wed, 07 Dec 2011 10:50:14 -0800
Subject: [Numpy-discussion] Fast Reading of ASCII files
Message-ID: <4EDFB566.4000303@noaa.gov>

Hi folks,

This is a continuation of a conversation already started, but I gave it a new, more appropriate, thread and subject.

On 12/6/11 2:13 PM, Wes McKinney wrote:
> we should start talking
> about building a *high performance* flat file loading solution with
> good column type inference and sensible defaults, etc.
...
> I personally don't
> believe in sacrificing an order of magnitude of performance in the 90%
> case for the 10% case-- so maybe it makes sense to have two functions
> around: a superfast custom CSV reader for well-behaved data, and a
> slower, but highly flexible, function like loadtable to fall back on.

I've wanted this for ages, and have done some work towards it, but like others, only had the time for a my-use-case-specific solution. A few thoughts:

* If we have a good, fast ascii (or unicode?) to array reader, hopefully it could be leveraged for use in the more complex cases. So rather than genfromtxt() being written from scratch, it would be a wrapper around the lower-level reader.

* Key to performance is to have the text-to-number-to-numpy-type conversion happening in C -- if you read the text with python, then convert to numbers, then to numpy arrays, it's simply going to be slow.

* I think we want a solution that can be adapted to arbitrary text files -- not just tabular, CSV-style data. I have a lot of those to read - and some thoughts about how.

Efforts I have made so far, and what I've learned from them:

1) fromfile():
fromfile (for text) is nice and fast, but buggy, and a bit too limited. I've posted various notes about this in the past (and, I'm pretty sure, a couple of tickets). The key missing features are:

a) no support for commented lines (this is a lesser need, I think)

b) there can be only one delimiter, and newlines are treated as generic whitespace. What this means is that if you have a whitespace-delimited file, you can read multiple lines, but if it is, for instance, comma-delimited, then you can only read one line at a time, killing performance.

c) there are various bugs if the text is malformed, or doesn't quite match what you're asking for (i.e. reading integers, but the text is float) -- mostly really limited error checking.

I spent some time digging into the code, and found it to be really hard-to-follow C code, and very hard to update. The core idea is pretty nice -- each dtype should know how to read itself from a text file -- but the implementation is painful.

The key issue is that for floats and ints, anyway, it relies on the C atoi and atof functions. However, there have been patches to these that handle NaN better, etc., for numpy, and I think a python patch as well.
So the code calls the numpy atoi, which does some checks, then calls the python atoi, which then calls the C lib atoi (I think that's all of it...). In any case, the core bugs are due to the fact that atoi and friends don't return an error code, so you have to check whether the pointer has been incremented to see if the read was successful -- and this error checking is not propagated through all those levels of calls. It got really ugly to try to fix!

Also, the use of the C atoi() means that locales may only be handled in the default way -- i.e. no way to read european-style floats on a system with a US locale.

My conclusion -- the current code is too much of a mess to try to deal with and fix! I also think it's a mistake to have text file reading as a special case of fromfile(); it really should be a separate issue, though that's a minor API question.

2) FileScanner:
FileScanner is some code I wrote years ago as a C extension - it's limited, but does the job and is pretty fast. It essentially calls fscanf() as many times as it gets a successful scan, skipping all invalid text, then returning a numpy array. You can also specify how many numbers you want read from the file. It only supports floats.

Travis O. asked if it could be included in Scipy way back when, but I suspect none of my code actually made it in. If I had to do it again, I might write something similar in Cython, though I am still using it.

My conclusions:

I think what we need is something similar to MATLAB's fscanf(): what it does is take a C-style format string, and apply it to your file over and over again as many times as it can, and return an array. What's nice about this is that it can be purposed to efficiently read a wide variety of text files fast.

For numpy, I imagine something like:

fromtextfile(f, dtype=np.float64, comment=None, shape=None):
    """
    read data from a text file, returning a numpy array

    f: is a filename or file-like object

    comment: is a string of the comment signifier. Anything on a line
             after this string will be ignored.

    dtype: is a numpy dtype that you want read from the file

    shape: is the shape of the resulting array. If shape==None, the
           file will be read until EOF or until there is a read error.
           By default, if there are newlines in the file, a 2-d array
           will be returned, with the newline signifying a new row in
           the array.
    """

This is actually pretty straightforward. If it supports compound dtypes, then you can read a pretty complex CSV file, once you've determined the dtype for your "record" (row). It is also really simple to use for the simple cases.

But of course, the implementation could be a pain -- I've been thinking that you could get a lot of it by creating a mapping from numpy dtypes to fscanf() format strings, then simply using fscanf for the actual file reading. This would certainly be easy for the easy cases. (maybe you'd want to use sscanf, so you could have the same code scan strings as well as files)

Ideally, each dtype would know how to read itself from a string, but as I said above, the code for that is currently pretty ugly, so it may be easier to keep it separate.

Anyway, I'd be glad to help with this effort.
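To make the easy cases concrete, usage might look roughly like this -- purely hypothetical, of course, since fromtextfile() doesn't exist yet, and the file names and field names are just made up for illustration:

import numpy as np

# whitespace/newline-separated floats -> 2-d array, '#' comments skipped
arr = fromtextfile('plain_table.txt', dtype=np.float64, comment='#')

# one CSV "record" per row, described by a compound dtype:
row = np.dtype([('time', np.float64), ('x', np.float64), ('count', np.int32)])
table = fromtextfile('data.csv', dtype=row)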
-Chris

-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From bsouthey at gmail.com Wed Dec 7 14:45:28 2011
From: bsouthey at gmail.com (Bruce Southey)
Date: Wed, 7 Dec 2011 13:45:28 -0600
Subject: [Numpy-discussion] numpy 1.7.0 release?
In-Reply-To: 
References: 
Message-ID: 

On Tue, Dec 6, 2011 at 4:13 PM, Wes McKinney wrote:
> On Tue, Dec 6, 2011 at 4:11 PM, Ralf Gommers
> wrote:
>>
>> On Mon, Dec 5, 2011 at 8:43 PM, Ralf Gommers
>> wrote:
>>>
>>> Hi all,
>>>
>>> It's been a little over 6 months since the release of 1.6.0 and the NA
>>> debate has quieted down, so I'd like to ask your opinion on the timing of
>>> 1.7.0. It looks to me like we have a healthy amount of bug fixes and small
>>> improvements, plus three larger chucks of work:
>>>
>>> - datetime
>>> - NA
>>> - Bento support
>>>
>>> My impression is that both datetime and NA are releasable, but should be
>>> labeled "tech preview" or something similar, because they may still see
>>> significant changes. Please correct me if I'm wrong.
>>>
>>> There's still some maintenance work to do and pull requests to merge, but
>>> a beta release by Christmas should be feasible.
>>
>> To be a bit more detailed here, these are the most significant pull requests
>> / patches that I think can be merged with a limited amount of work:
>> meshgrid enhancements: http://projects.scipy.org/numpy/ticket/966
>> sample_from function: https://github.com/numpy/numpy/pull/151
>> loadtable function: https://github.com/numpy/numpy/pull/143
>>
>> Other maintenance things:
>> - un-deprecate putmask
>> - clean up causes of "DType strings 'O4' and 'O8' are deprecated..."
>> - fix failing einsum and polyfit tests
>> - update release notes
>>
>> Cheers,
>> Ralf
>>
>>> What do you all think?
>>>
>>> Cheers,
>>> Ralf
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
>
> This isn't the place for this discussion but we should start talking
> about building a *high performance* flat file loading solution with
> good column type inference and sensible defaults, etc. It's clear that
> loadtable is aiming for highest compatibility-- for example I can read
> a 2800x30 file in < 50 ms with the read_table / read_csv functions I
> wrote myself recently in Cython (compared with loadtable taking > 1s as
> quoted in the pull request), but I don't handle European decimal
> formats and lots of other sources of unruliness. I personally don't
> believe in sacrificing an order of magnitude of performance in the 90%
> case for the 10% case-- so maybe it makes sense to have two functions
> around: a superfast custom CSV reader for well-behaved data, and a
> slower, but highly flexible, function like loadtable to fall back on.
> I think R has two functions read.csv and read.csv2, where read.csv2 is
> capable of dealing with things like European decimal format.
>
> - Wes
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

I do not agree with the loadtable request, simply because I do not want to have functions that do virtually the same thing - see the comments on the pull request (and Chris's email on 'Fast Reading of ASCII files').
I would like to see a valid user-space justification for including it, because just using regexes is not a suitable justification (though I agree it is an interesting feature):
If loadtable will be a complete replacement for genfromtxt, then there needs to be a plan towards supporting all the features of genfromtxt, like 'skip_footer', and then genfromtxt needs to be set on the path to be deprecated.
If loadtable is an intermediate between loadtxt and genfromtxt, then loadtable needs to be clear about exactly what loadtable does not do that genfromtxt does (anything that loadtable does and genfromtxt does not do should be filed as a bug against genfromtxt).

Knowing which case it is makes it easier to provide help by directing users to the appropriate function, and to know which function bug reports should be filed against. For example, loadtxt requires 'Each row in the text file must have the same number of values', so one can direct a user to genfromtxt for that case rather than filing a bug report against loadtxt.

I am also somewhat concerned regarding the NA object because of the limited implementation available. For example, numpy.dot is not implemented. Also, there appears to be no plan to increase the implementation across numpy or support it long term. So while I have no problem with it being included, I do think there must be a serious commitment to having it fully supported in the near future, as well as a suitable long-term roadmap. Otherwise it will just be a problematic code dump that will be difficult to support.

Bruce

From lou_boog2000 at yahoo.com Wed Dec 7 15:30:11 2011
From: lou_boog2000 at yahoo.com (Lou Pecora)
Date: Wed, 7 Dec 2011 12:30:11 -0800 (PST)
Subject: [Numpy-discussion] Simple way to launch python processes?
In-Reply-To: <4EDFB566.4000303@noaa.gov>
References: <4EDFB566.4000303@noaa.gov>
Message-ID: <1323289811.40516.YahooMailNeo@web34405.mail.mud.yahoo.com>

I would like to launch python modules or functions (I don't know which is easier to do, modules or functions) in separate Terminal windows so I can see the output from each as they execute. I need to be able to pass each module or function a set of parameters. I would like to do this from a python script already running in a Terminal window. In other words, I'd start up a "master" script and it would launch, say, three processes using another module or a function with different parameter values for each launch, and each would run independently in its own Terminal window so stdout from each process would go to its own respective window. When a process terminated, its window would remain open.

I've begun to look at subprocess modules, etc., but that's pretty confusing. I can do what I say above manually, but it's gotten clumsy as I want to run eventually on 12 cores.

I have a Mac Pro running Mac OS X 10.6.

If there is a better forum to ask this question, please let me know.

Thanks for any advice.

-- Lou Pecora, my views are my own.

From olegmikul at gmail.com Wed Dec 7 15:38:51 2011
From: olegmikul at gmail.com (Oleg Mikulya)
Date: Wed, 7 Dec 2011 12:38:51 -0800
Subject: [Numpy-discussion] Slow Numpy/MKL vs Matlab/MKL
In-Reply-To: 
References: 
Message-ID: 

I agree with your statement. Yes, it is MKL, indeed. For linear equations there is no difference, but there is a difference for other functions. And yes, my suspicion is that it is just the threading options. How do I pass them to MKL from Python? Should I change some compile options or environment options?
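For example, would something as simple as this be expected to work, provided it happens before numpy is imported? This is only my guess, with the variable names taken from the MKL documentation, untested:

import os
# guess: must be set before numpy (and thus MKL) is first imported
os.environ['OMP_NUM_THREADS'] = '8'
os.environ['MKL_NUM_THREADS'] = '8'
import numpy as np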
On Wed, Dec 7, 2011 at 2:02 AM, Pauli Virtanen wrote:

> 06.12.2011 23:31, Oleg Mikulya kirjoitti:
> > How to make Numpy match Matlab in terms of performance? I have tried
> > different options, using different MKL libraries and ICC versions, and
> > still Numpy is below Matlab for certain basic tasks by ~2x. About 5
> > years ago I was able to get about the same speed, but not anymore. Matlab
> > is supposed to use the same MKL, so what is the reason for such Numpy
> > slowness (besides one, yet fundamental, task)?
>
> There should be no reason for a difference. It simply makes the calls to
> the external library, and the wrapper code is straightforward.
>
> If Numpy is indeed linked against MKL (check the build log), then one
> possible reason could be different threading options passed to MKL.
>
> --
> Pauli Virtanen
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

From shish at keba.be Wed Dec 7 15:43:27 2011
From: shish at keba.be (Olivier Delalleau)
Date: Wed, 7 Dec 2011 15:43:27 -0500
Subject: [Numpy-discussion] Simple way to launch python processes?
In-Reply-To: <1323289811.40516.YahooMailNeo@web34405.mail.mud.yahoo.com>
References: <4EDFB566.4000303@noaa.gov> <1323289811.40516.YahooMailNeo@web34405.mail.mud.yahoo.com>
Message-ID: 

Maybe try stackoverflow, since this isn't really a numpy question.
To run a command like "python myscript.py arg1 arg2" in a separate process, you can do:

p = subprocess.Popen("python myscript.py arg1 arg2".split())

You can launch many of these, and if you want to know whether a process p is over, you can call p.poll().
I'm sure there are other (and better) options though.

-=- Olivier

2011/12/7 Lou Pecora 

> I would like to launch python modules or functions (I don't know which is
> easier to do, modules or functions) in separate Terminal windows so I can
> see the output from each as they execute. I need to be able to pass each
> module or function a set of parameters. I would like to do this from a
> python script already running in a Terminal window. In other words, I'd
> start up a "master" script and it would launch, say, three processes using
> another module or a function with different parameter values for each
> launch, and each would run independently in its own Terminal window so
> stdout from each process would go to its own respective window. When a
> process terminated, its window would remain open.
>
> I've begun to look at subprocess modules, etc., but that's pretty
> confusing. I can do what I say above manually, but it's gotten clumsy as I
> want to run eventually on 12 cores.
>
> I have a Mac Pro running Mac OS X 10.6.
>
> If there is a better forum to ask this question, please let me know.
>
> Thanks for any advice.
>
> -- Lou Pecora, my views are my own.
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

From marquett at iap.fr Wed Dec 7 16:23:47 2011
From: marquett at iap.fr (Jean-Baptiste Marquette)
Date: Wed, 7 Dec 2011 16:23:47 -0500
Subject: [Numpy-discussion] Simple way to launch python processes?
In-Reply-To: 
References: <4EDFB566.4000303@noaa.gov> <1323289811.40516.YahooMailNeo@web34405.mail.mud.yahoo.com>
Message-ID: 

You should consider the powerful multiprocessing package. Have a look at this piece of code:

import glob
import os
import multiprocessing as multi
import subprocess as sub
import time

NPROC = 4

Python = '/Library/Frameworks/EPD64.framework/Versions/Current/bin/python'
Xterm = '/usr/X11/bin/xterm '
coord = []
Size = '100x10'
XPos = 810
YPos = 170
XOffset = 0
YOffset = 0

for i in range(NPROC):
    if i % 2 == 0:
        coord.append(Size + '+' + str(YPos) + '+' + str(YOffset))
    else:
        coord.append(Size + '+' + str(XPos) + '+' + str(YOffset))
        YOffset = YOffset + YPos

def CompareColourRef(Champ):
    BaseChamp = os.path.basename(Champ)
    NameProc = int(multi.current_process().name[-1]) - 1
    print 'Processing', BaseChamp, 'on processor', NameProc+1
    os.putenv('ADAM_USER', DirWrk + 'adam_' + str(NameProc+1))
    Command = Xterm + '-geometry ' + '"' + coord[NameProc] + '" -T " Proc' + \
        str(NameProc+1) + ' ' + BaseChamp + ' ' + '" -e " ' + Python + ' ' + DirSrc + \
        'CompareColourRef.py ' + BaseChamp + ' 2>&1 | tee ' + DirLog + BaseChamp + '.log"'
    Process = sub.Popen([Command], shell=True)
    Process.wait()
    print BaseChamp, 'processed on processor', NameProc+1
    return

pool = multi.Pool(processes=NPROC)
Champs = glob.glob(DirImg + '*/*')
results = pool.map_async(CompareColourRef, Champs)
pool.close()
while results._number_left > 0:
    print "Waiting for", results._number_left, 'tasks to complete'
    time.sleep(15)
pool.join()
print 'Process completed'
exit(0)

Cheers
Jean-Baptiste

Le 7 déc. 2011 à 15:43, Olivier Delalleau a écrit :

> Maybe try stackoverflow, since this isn't really a numpy question.
> To run a command like "python myscript.py arg1 arg2" in a separate process, you can do:
> p = subprocess.Popen("python myscript.py arg1 arg2".split())
> You can launch many of these, and if you want to know whether a process p is over, you can call p.poll().
> I'm sure there are other (and better) options though.
>
> -=- Olivier
>
> 2011/12/7 Lou Pecora
> I would like to launch python modules or functions (I don't know which is easier to do, modules or functions) in separate Terminal windows so I can see the output from each as they execute. I need to be able to pass each module or function a set of parameters. I would like to do this from a python script already running in a Terminal window. In other words, I'd start up a "master" script and it would launch, say, three processes using another module or a function with different parameter values for each launch, and each would run independently in its own Terminal window so stdout from each process would go to its own respective window. When a process terminated, its window would remain open.
>
> I've begun to look at subprocess modules, etc., but that's pretty confusing. I can do what I say above manually, but it's gotten clumsy as I want to run eventually on 12 cores.
>
> I have a Mac Pro running Mac OS X 10.6.
>
> If there is a better forum to ask this question, please let me know.
>
> Thanks for any advice.
>
> -- Lou Pecora, my views are my own.
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
From derek at astro.physik.uni-goettingen.de Wed Dec 7 16:50:17 2011
From: derek at astro.physik.uni-goettingen.de (Derek Homeier)
Date: Wed, 7 Dec 2011 22:50:17 +0100
Subject: [Numpy-discussion] Slow Numpy/MKL vs Matlab/MKL
In-Reply-To: 
References: 
Message-ID: 

On 07.12.2011, at 9:38PM, Oleg Mikulya wrote:

> I agree with your statement. Yes, it is MKL, indeed. For linear equations there is no difference, but there is a difference for other functions. And yes, my suspicion is that it is just the threading options. How do I pass them to MKL from Python? Should I change some compile options or environment options?
>

You could check by monitoring the CPU usage while running the tasks - if it remains around 100%, it is probably not using multiple threads.
Generally MKL (if you linked the multi-threaded version, which seems to be the case, as mkl_intel_thread is in the libs) heeds the OMP_NUM_THREADS environment variable like other OpenMP programs. If that's set to your no. of cores before starting Python, it should be inherited; it might also be possible to set it within Python (in any case you can check it with os.getenv()).
I don't know if Matlab sets different defaults so that multiple threads are automatically used; normally I'd also expect Python to use all available cores if OMP_NUM_THREADS is not set at all?
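To illustrate what I mean, something along these lines (a rough sketch, untested on an MKL build; the timing is only there so you can compare against your Matlab numbers while watching the CPU usage):

import os
import time
import numpy as np

print os.getenv('OMP_NUM_THREADS')   # None means the MKL default applies

a = np.random.randn(2000, 2000)
t0 = time.time()
np.dot(a, a)
print time.time() - t0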
Cheers,
Derek

> On Wed, Dec 7, 2011 at 2:02 AM, Pauli Virtanen wrote:
> 06.12.2011 23:31, Oleg Mikulya kirjoitti:
> > How to make Numpy match Matlab in terms of performance? I have tried
> > different options, using different MKL libraries and ICC versions, and
> > still Numpy is below Matlab for certain basic tasks by ~2x. About 5
> > years ago I was able to get about the same speed, but not anymore. Matlab
> > is supposed to use the same MKL, so what is the reason for such Numpy
> > slowness (besides one, yet fundamental, task)?
>
> There should be no reason for a difference. It simply makes the calls to
> the external library, and the wrapper code is straightforward.
>
> If Numpy is indeed linked against MKL (check the build log), then one
> possible reason could be different threading options passed to MKL.
>
> --
> Pauli Virtanen
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

From cournape at gmail.com Wed Dec 7 16:55:22 2011
From: cournape at gmail.com (David Cournapeau)
Date: Wed, 7 Dec 2011 16:55:22 -0500
Subject: [Numpy-discussion] Slow Numpy/MKL vs Matlab/MKL
In-Reply-To: 
References: 
Message-ID: 

On Tue, Dec 6, 2011 at 5:31 PM, Oleg Mikulya wrote:
> Hi,
>
> How to make Numpy match Matlab in terms of performance? I have tried
> different options, using different MKL libraries and ICC versions, and
> still Numpy is below Matlab for certain basic tasks by ~2x. About 5 years
> ago I was able to get about the same speed, but not anymore. Matlab is
> supposed to use the same MKL, so what is the reason for such Numpy
> slowness (besides one, yet fundamental, task)?

Have you checked that the returned values are the same (up to some precision)? It may be that we don't use the same underlying LAPACK function.

cheers,

David

From lou_boog2000 at yahoo.com Wed Dec 7 17:46:49 2011
From: lou_boog2000 at yahoo.com (Lou Pecora)
Date: Wed, 7 Dec 2011 14:46:49 -0800 (PST)
Subject: [Numpy-discussion] Simple way to launch python processes?
In-Reply-To: 
References: <4EDFB566.4000303@noaa.gov> <1323289811.40516.YahooMailNeo@web34405.mail.mud.yahoo.com>
Message-ID: <1323298009.25023.YahooMailNeo@web34401.mail.mud.yahoo.com>

From: Olivier Delalleau 
To: Discussion of Numerical Python 
Sent: Wednesday, December 7, 2011 3:43 PM
Subject: Re: [Numpy-discussion] Simple way to launch python processes?

Maybe try stackoverflow, since this isn't really a numpy question.
To run a command like "python myscript.py arg1 arg2" in a separate process, you can do:

p = subprocess.Popen("python myscript.py arg1 arg2".split())

You can launch many of these, and if you want to know whether a process p is over, you can call p.poll().
I'm sure there are other (and better) options though.

-=- Olivier

Thank you.

-- Lou Pecora, my views are my own.

________________________________

From lou_boog2000 at yahoo.com Wed Dec 7 17:48:08 2011
From: lou_boog2000 at yahoo.com (Lou Pecora)
Date: Wed, 7 Dec 2011 14:48:08 -0800 (PST)
Subject: [Numpy-discussion] Simple way to launch python processes?
In-Reply-To: 
References: <4EDFB566.4000303@noaa.gov> <1323289811.40516.YahooMailNeo@web34405.mail.mud.yahoo.com>
Message-ID: <1323298088.31858.YahooMailNeo@web34401.mail.mud.yahoo.com>

From: Jean-Baptiste Marquette 
To: Discussion of Numerical Python 
Sent: Wednesday, December 7, 2011 4:23 PM
Subject: Re: [Numpy-discussion] Simple way to launch python processes?

You should consider the powerful multiprocessing package. Have a look at this piece of code:

import glob
import os
import multiprocessing as multi
import subprocess as sub
import time

NPROC = 4

Python = '/Library/Frameworks/EPD64.framework/Versions/Current/bin/python'
Xterm = '/usr/X11/bin/xterm '
coord = []
Size = '100x10'
XPos = 810
YPos = 170
XOffset = 0
YOffset = 0

for i in range(NPROC):
    if i % 2 == 0:
        coord.append(Size + '+' + str(YPos) + '+' + str(YOffset))
    else:
        coord.append(Size + '+' + str(XPos) + '+' + str(YOffset))
        YOffset = YOffset + YPos

def CompareColourRef(Champ):
    BaseChamp = os.path.basename(Champ)
    NameProc = int(multi.current_process().name[-1]) - 1
    print 'Processing', BaseChamp, 'on processor', NameProc+1
    os.putenv('ADAM_USER', DirWrk + 'adam_' + str(NameProc+1))
    Command = Xterm + '-geometry ' + '"' + coord[NameProc] + '" -T " Proc' + \
        str(NameProc+1) + ' ' + BaseChamp + ' ' + '" -e " ' + Python + ' ' + DirSrc + \
        'CompareColourRef.py ' + BaseChamp + ' 2>&1 | tee ' + DirLog + BaseChamp + '.log"'
    Process = sub.Popen([Command], shell=True)
    Process.wait()
    print BaseChamp, 'processed on processor', NameProc+1
    return

pool = multi.Pool(processes=NPROC)
Champs = glob.glob(DirImg + '*/*')
results = pool.map_async(CompareColourRef, Champs)
pool.close()
while results._number_left > 0:
    print "Waiting for", results._number_left, 'tasks to complete'
    time.sleep(15)
pool.join()
print 'Process completed'
exit(0)

Cheers
Jean-Baptiste

----------------------------------------------------------------------------------------------------------------------------------

Wow. I will have to digest that, but thank you.

-- Lou Pecora, my views are my own.

________________________________

From xabart at gmail.com Wed Dec 7 17:58:00 2011
From: xabart at gmail.com (Xavier Barthelemy)
Date: Thu, 8 Dec 2011 09:58:00 +1100
Subject: [Numpy-discussion] idea of optimisation?
In-Reply-To: 
References: <1323155858-sup-8509@david-desktop> 
Message-ID: 

Actually this could be a good idea. I didn't think of using the sorting. I'll try it, thanks for your ideas.

Xavier

2011/12/7 Tony Yu 

>
> On Tue, Dec 6, 2011 at 2:51 AM, Xavier Barthelemy wrote:
>
>> ok let me be more precise
>>
>> I have an Z array which is the elevation
>> from this I extract a discrete array of Zero Crossing, and another
>> discrete array of Crests.
>> len(crest) is different than len(Xzeros). I have a threshold method to
>> detect my "valid" crests, and sometimes there are 2 crests between two
>> zero-crossing (grouping effect)
>>
>> Crest and Zeros are 2 different arrays, with positions. example:
>> Zeros=[1,2,3,4] Arrays=[1.5,1.7,3.5]
>>
>> and yes arrays can be sorted. not a problm with this.
>>
>> Xavier
>>
> I may be oversimplifying this, but does searchsorted do what you want?
>
> In [314]: xzeros=[1,2,3,4]; xcrests=[1.5,1.7,3.5]
>
> In [315]: np.searchsorted(xzeros, xcrests)
> Out[315]: array([1, 1, 3])
>
> This returns the indexes of xzeros to the left of xcrests.
>
> -Tony
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

-- 
« Quand le gouvernement viole les droits du peuple, l'insurrection est, pour le peuple et pour chaque portion du peuple, le plus sacré des droits et le plus indispensable des devoirs »

Déclaration des droits de l'homme et du citoyen, article 35, 1793

From josef.pktd at gmail.com Wed Dec 7 20:58:14 2011
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Wed, 7 Dec 2011 20:58:14 -0500
Subject: [Numpy-discussion] type checking, what's recommended?
Message-ID: 

If I want to know whether something that might be an array is really a plain ndarray and not a subclass, is using `type` the safest bet?

All the other forms don't discriminate against subclasses.

>>> type(np.ma.zeros(3)) is np.ndarray
False
>>> type(np.zeros(3)) is np.ndarray
True

>>> isinstance(np.ma.zeros(3), np.ndarray)
True
>>> isinstance(np.zeros(3), np.ndarray)
True

>>> issubclass(np.ma.zeros(3).__class__, np.ndarray)
True
>>> issubclass(np.zeros(3).__class__, np.ndarray)
True

>>> isinstance(np.matrix(np.zeros(3)), np.ndarray)
True
>>> type(np.matrix(np.zeros(3))) is np.ndarray
False

Thanks,

Josef

From shish at keba.be Wed Dec 7 21:01:06 2011
From: shish at keba.be (Olivier Delalleau)
Date: Wed, 7 Dec 2011 21:01:06 -0500
Subject: [Numpy-discussion] type checking, what's recommended?
In-Reply-To: 
References: 
Message-ID: 

We have indeed been using "type(a) is np.ndarray" in Theano to check that. If there's a better way, I'm interested to know as well :)
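In pattern form, this is all we do -- a minimal sketch, nothing cleverer (the helper name is just for illustration):

import numpy as np

def is_plain_ndarray(a):
    # rejects subclasses such as np.matrix and masked arrays on purpose
    return type(a) is np.ndarray

print is_plain_ndarray(np.zeros(3))             # True
print is_plain_ndarray(np.ma.zeros(3))          # False
print is_plain_ndarray(np.matrix(np.zeros(3)))  # False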
-=- Olivier

2011/12/7 

> If I want to know whether something that might be an array is really a
> plain ndarray and not a subclass, is using `type` the safest bet?
>
> All the other forms don't discriminate against subclasses.
>
> >>> type(np.ma.zeros(3)) is np.ndarray
> False
> >>> type(np.zeros(3)) is np.ndarray
> True
>
> >>> isinstance(np.ma.zeros(3), np.ndarray)
> True
> >>> isinstance(np.zeros(3), np.ndarray)
> True
>
> >>> issubclass(np.ma.zeros(3).__class__, np.ndarray)
> True
> >>> issubclass(np.zeros(3).__class__, np.ndarray)
> True
>
> >>> isinstance(np.matrix(np.zeros(3)), np.ndarray)
> True
> >>> type(np.matrix(np.zeros(3))) is np.ndarray
> False
>
> Thanks,
>
> Josef
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

From ralf.gommers at googlemail.com Thu Dec 8 15:30:43 2011
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Thu, 8 Dec 2011 21:30:43 +0100
Subject: [Numpy-discussion] numpy 1.7.0 release?
In-Reply-To: 
References: 
Message-ID: 

On Wed, Dec 7, 2011 at 5:23 PM, Thouis (Ray) Jones wrote:

> On Tue, Dec 6, 2011 at 22:11, Ralf Gommers
> wrote:
> > To be a bit more detailed here, these are the most significant pull
> requests
> > / patches that I think can be merged with a limited amount of work:
> > meshgrid enhancements: http://projects.scipy.org/numpy/ticket/966
> > sample_from function: https://github.com/numpy/numpy/pull/151
> > loadtable function: https://github.com/numpy/numpy/pull/143
> >
> > Other maintenance things:
> > - un-deprecate putmask
> > - clean up causes of "DType strings 'O4' and 'O8' are deprecated..."
> > - fix failing einsum and polyfit tests
> > - update release notes
>
> I'd suggest that, if possible, someone with sufficient knowledge to
> evaluate it look at ticket #1990 (data truncation from arrays of
> strings and integers), since it's both potentially dangerous and a
> new bug introduced between 1.5.1 and 1.6.1. It might be
> straightforward to fix without too much difficulty, and if so, I think
> it's probably worth it.
>
> My opinion might be unduly influenced by a collaborator having been
> bitten by this bug recently, and having to throw away and redo a few
> weeks of calculations and analysis.
>

I wouldn't call that unduly influenced. Regressions should be the nr. 1 priority for a release, so this should certainly be looked at. I've added a 1.7.0 Milestone on Trac and put this ticket under it. If anyone knows of any other regressions, please do the same.

Ralf

From ralf.gommers at googlemail.com Thu Dec 8 15:48:54 2011
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Thu, 8 Dec 2011 21:48:54 +0100
Subject: [Numpy-discussion] numpy 1.7.0 release?
In-Reply-To: 
References: 
Message-ID: 

On Wed, Dec 7, 2011 at 8:45 PM, Bruce Southey wrote:

> On Tue, Dec 6, 2011 at 4:13 PM, Wes McKinney wrote:
> > On Tue, Dec 6, 2011 at 4:11 PM, Ralf Gommers
> > wrote:
> >>
> >> On Mon, Dec 5, 2011 at 8:43 PM, Ralf Gommers
> >> wrote:
> >>>
> >>> Hi all,
> >>>
> >>> It's been a little over 6 months since the release of 1.6.0 and the NA
> >>> debate has quieted down, so I'd like to ask your opinion on the timing
> >>> of 1.7.0.
It looks to me like we have a healthy amount of bug fixes and > small > >>> improvements, plus three larger chucks of work: > >>> > >>> - datetime > >>> - NA > >>> - Bento support > >>> > >>> My impression is that both datetime and NA are releasable, but should > be > >>> labeled "tech preview" or something similar, because they may still see > >>> significant changes. Please correct me if I'm wrong. > >>> > >>> There's still some maintenance work to do and pull requests to merge, > but > >>> a beta release by Christmas should be feasible. > >> > >> > >> To be a bit more detailed here, these are the most significant pull > requests > >> / patches that I think can be merged with a limited amount of work: > >> meshgrid enhancements: http://projects.scipy.org/numpy/ticket/966 > >> sample_from function: https://github.com/numpy/numpy/pull/151 > >> loadtable function: https://github.com/numpy/numpy/pull/143 > >> > >> Other maintenance things: > >> - un-deprecate putmask > >> - clean up causes of "DType strings 'O4' and 'O8' are deprecated..." > >> - fix failing einsum and polyfit tests > >> - update release notes > >> > >> Cheers, > >> Ralf > >> > >> > >>> What do you all think? > >>> > >>> > >>> Cheers, > >>> Ralf > >> > >> > >> > >> _______________________________________________ > >> NumPy-Discussion mailing list > >> NumPy-Discussion at scipy.org > >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > >> > > > > This isn't the place for this discussion but we should start talking > > about building a *high performance* flat file loading solution with > > good column type inference and sensible defaults, etc. It's clear that > > loadtable is aiming for highest compatibility-- for example I can read > > a 2800x30 file in < 50 ms with the read_table / read_csv functions I > > wrote myself recent in Cython (compared with loadtable taking > 1s as > > quoted in the pull request), but I don't handle European decimal > > formats and lots of other sources of unruliness. I personally don't > > believe in sacrificing an order of magnitude of performance in the 90% > > case for the 10% case-- so maybe it makes sense to have two functions > > around: a superfast custom CSV reader for well-behaved data, and a > > slower, but highly flexible, function like loadtable to fall back on. > > I think R has two functions read.csv and read.csv2, where read.csv2 is > > capable of dealing with things like European decimal format. > > > > - Wes > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > I do not agree with loadtable request simply because not wanting to > have functions that do virtually the same thing - as the comments on > the pull request (and Chris's email on 'Fast Reading of ASCII files'). > I would like to see a valid user space justification for including it > because just using regex's is not a suitable justification (but I > agree it is a interesting feature): > There's a number of features listed in the pull request message and Chris' first comment, so I won't repeat those here. That it's close to being ready is just my personal impression. There are seven participants in the pull request including Pierre and Derek, who have both done significant work on loadtxt / genfromtxt, and yourself. So loadtable certainly won't be merged without the questions you raise here being resolved. 
> If loadtable will be a complete replacement for genfromtxt then there > needs a plan towards supporting all the features of genfromtxt like > 'skip_footer' and then genfromtxt needs to be set on the path to be > depreciated. > If loadtable is an intermediate between loadttxt and genfromtxt, then > loadtable needs to be clear exactly what loadtable does not do that > genfromtxt does (anything that loadtable does and genfromtxt does not > do, should be filed as bug against genfromtxt). > > Knowing the case makes it easier to provide help by directing users to > the appropriate function and which function should have bug reports > against. For example, loadtxt requires 'Each row in the text file must > have the same number of values' so one can direct a user to genfromtxt > for that case rather than filing a bug report against loadtxt. > > I am also somewhat concerned regarding the NA object because of the > limited implementation available. For example, numpy.dot is not > implemented. Also there appears to be no plan to increase the > implementation across numpy or support it long term. I have the vague impression that there is such a plan, or at least the intention to support it better over time. But it would be good if someone could spell this out. Ralf > So while I have > no problem with it being included, I do think there must be a serious > commitment to having it fully supporting in the near future as well as > providing a suitable long term roadmap. Otherwise it will just be a > problematic code dump that will be difficult to support. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From yanghatespam at gmail.com Thu Dec 8 18:15:24 2011 From: yanghatespam at gmail.com (Yang Zhang) Date: Thu, 8 Dec 2011 15:15:24 -0800 Subject: [Numpy-discussion] Long-standing issue with using numpy in embedded CPython In-Reply-To: References: Message-ID: On Tue, Oct 4, 2011 at 12:05 PM, Yang Zhang wrote: > On Tue, Oct 4, 2011 at 1:28 AM, Robin wrote: >> On Mon, Oct 3, 2011 at 9:42 PM, Yang Zhang wrote: >>> It turns out that there's a long-standing problem in numpy that >>> prevents it from being used in embedded CPython environments: >> >> Just wanted to make the point for reference that in general Numpy does >> work fine in (non-threaded) embedded CPython situations, see for >> example pymex [1] which embeds Python + Numpy in a Matlab mex file and >> works really well. >> >> This seems to a be a problem specific to Jepp. >> >> Just wanted to mention it in case it puts someone off trying something >> unnecessarily in the future. > > My (second-hand) understanding is that this is a problem with having > multiple CPython interpreters, which both Jepp and numpy utilize, > incompatibly - is that right? ?I.e., if either one were restricted to > using a single CPython interpreter, we wouldn't see this problem? > > I'm curious how to disable threads in numpy (not an ideal solution). > Googling seems to point me to setting NPY_ALLOW_THREADS to > 0....somewhere. Anyone? > >> >> Cheers >> >> Robin >> >> [1] https://github.com/kw/pymex >> >>> >>> http://stackoverflow.com/questions/7592565/when-embedding-cpython-in-java-why-does-this-hang/7630992#7630992 >>> http://mail.scipy.org/pipermail/numpy-discussion/2009-July/044046.html >>> Is there any fix or workaround for this? ?Thanks. 
>>> -- >>> Yang Zhang >>> http://yz.mit.edu/ >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > > -- > Yang Zhang > http://yz.mit.edu/ -- Yang Zhang http://yz.mit.edu/ From mail at enricong.com Thu Dec 8 20:14:48 2011 From: mail at enricong.com (Enrico Ng) Date: Thu, 8 Dec 2011 20:14:48 -0500 Subject: [Numpy-discussion] Help with wrapper ndarray to c array Message-ID: I am trying to pass a multi-dimensional ndarray to C as a multi- dimensional C array for the purposes of passing it to mathematica. They already have a wrapper for a 1-D Python list. where the list is copied to "list". Shown below: static PyObject * mathlink_PutIntegerList(mathlink_Link *self, PyObject *args) { PyObject* seq; PyObject* obj; long i, len, result; int* list; len = PyObject_Length(seq); list = PyMem_New(int, len); for(i = 0; i < len; i++) { obj = PySequence_GetItem(seq, i); list[i] = PyInt_AsLong(obj); } CheckForThreadsAndRunLink(self,result = MLPutIntegerList(self->lp, list, len)); PyMem_Free(list); CHECKNOTEQUAL(result,MLSUCCESS,self); Py_INCREF(Py_None); return Py_None; } I would like to create a similar wrapper which accepts an ndarray and provides the array laid out in memory like a C array declared explicitly as "int a[m][n]...". I also need to pass the length of the array at each level i as dim[i] and the depth. Since this is pretty much the only function I plan to wrap, I'd like to avoid using boost, swig, etc. Any help would be appreciated, looks like I should use PyArray_AsCArray but I'm having trouble finding examples that work for me. -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Fri Dec 9 03:31:04 2011 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 9 Dec 2011 08:31:04 +0000 Subject: [Numpy-discussion] Long-standing issue with using numpy in embedded CPython In-Reply-To: References: Message-ID: On Thu, Dec 8, 2011 at 23:15, Yang Zhang wrote: > On Tue, Oct 4, 2011 at 12:05 PM, Yang Zhang wrote: >> On Tue, Oct 4, 2011 at 1:28 AM, Robin wrote: >>> On Mon, Oct 3, 2011 at 9:42 PM, Yang Zhang wrote: >>>> It turns out that there's a long-standing problem in numpy that >>>> prevents it from being used in embedded CPython environments: >>> >>> Just wanted to make the point for reference that in general Numpy does >>> work fine in (non-threaded) embedded CPython situations, see for >>> example pymex [1] which embeds Python + Numpy in a Matlab mex file and >>> works really well. >>> >>> This seems to a be a problem specific to Jepp. >>> >>> Just wanted to mention it in case it puts someone off trying something >>> unnecessarily in the future. >> >> My (second-hand) understanding is that this is a problem with having >> multiple CPython interpreters, which both Jepp and numpy utilize, >> incompatibly - is that right? ?I.e., if either one were restricted to >> using a single CPython interpreter, we wouldn't see this problem? >> >> I'm curious how to disable threads in numpy (not an ideal solution). >> Googling seems to point me to setting NPY_ALLOW_THREADS to >> 0....somewhere. > > Anyone? numpy does not use multiple interpreters. 
The threading options have nothing to do with multiple interpreters, and will not let you use multiple CPython interpreters in your application. The problem is that Python does not have good isolation between multiple interpreters for extension modules. Many extension modules happen to work in this environment, but numpy is not one of them. We have some global state that we need to keep, and this gets interfered with in a multiple interpreter environment. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From yanghatespam at gmail.com Fri Dec 9 06:00:31 2011 From: yanghatespam at gmail.com (Yang Zhang) Date: Fri, 9 Dec 2011 03:00:31 -0800 Subject: [Numpy-discussion] Long-standing issue with using numpy in embedded CPython In-Reply-To: References: Message-ID: On Fri, Dec 9, 2011 at 12:31 AM, Robert Kern wrote: > On Thu, Dec 8, 2011 at 23:15, Yang Zhang wrote: >> On Tue, Oct 4, 2011 at 12:05 PM, Yang Zhang wrote: >>> On Tue, Oct 4, 2011 at 1:28 AM, Robin wrote: >>>> On Mon, Oct 3, 2011 at 9:42 PM, Yang Zhang wrote: >>>>> It turns out that there's a long-standing problem in numpy that >>>>> prevents it from being used in embedded CPython environments: >>>> >>>> Just wanted to make the point for reference that in general Numpy does >>>> work fine in (non-threaded) embedded CPython situations, see for >>>> example pymex [1] which embeds Python + Numpy in a Matlab mex file and >>>> works really well. >>>> >>>> This seems to a be a problem specific to Jepp. >>>> >>>> Just wanted to mention it in case it puts someone off trying something >>>> unnecessarily in the future. >>> >>> My (second-hand) understanding is that this is a problem with having >>> multiple CPython interpreters, which both Jepp and numpy utilize, >>> incompatibly - is that right? ?I.e., if either one were restricted to >>> using a single CPython interpreter, we wouldn't see this problem? >>> >>> I'm curious how to disable threads in numpy (not an ideal solution). >>> Googling seems to point me to setting NPY_ALLOW_THREADS to >>> 0....somewhere. >> >> Anyone? > > numpy does not use multiple interpreters. The threading options have > nothing to do with multiple interpreters, and will not let you use > multiple CPython interpreters in your application. The problem is that > Python does not have good isolation between multiple interpreters for > extension modules. Many extension modules happen to work in this > environment, but numpy is not one of them. We have some global state > that we need to keep, and this gets interfered with in a multiple > interpreter environment. Thanks for the clarification. Alas. So is there no simple workaround to making numpy work in environments such as Jepp? > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > ? 
> -- Umberto Eco
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

-- 
Yang Zhang
http://yz.mit.edu/

From robert.kern at gmail.com Fri Dec 9 06:05:33 2011
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 9 Dec 2011 11:05:33 +0000
Subject: [Numpy-discussion] Long-standing issue with using numpy in embedded CPython
In-Reply-To: 
References: 
Message-ID: 

On Fri, Dec 9, 2011 at 11:00, Yang Zhang wrote:

> Thanks for the clarification. Alas. So is there no simple workaround
> to making numpy work in environments such as Jepp?

I don't think so, no.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
  -- Umberto Eco

From sole at esrf.fr Fri Dec 9 08:05:30 2011
From: sole at esrf.fr (Vicente Sole)
Date: Fri, 09 Dec 2011 14:05:30 +0100
Subject: [Numpy-discussion] Long-standing issue with using numpy in embedded CPython
In-Reply-To: 
References: 
Message-ID: <20111209140530.de7ffq7s6os4css8@160.103.2.152>

Quoting Robert Kern :

> On Fri, Dec 9, 2011 at 11:00, Yang Zhang wrote:
>
>> Thanks for the clarification. Alas. So is there no simple workaround
>> to making numpy work in environments such as Jepp?
>
> I don't think so, no.
>

It is far from being an optimal solution (in fact I dislike it), but there are a couple of research facilities that like the Python interpreter, and like numpy, but prefer to use Java for all their graphical interfaces. They have rewritten part of numpy in Java in order to use it from Jython.

http://www.opengda.org/documentation/manuals/Diamond_SciSoft_Python_Guide/8.16/scisoftpy.html

Armando

From pierre.haessig at crans.org Fri Dec 9 08:18:04 2011
From: pierre.haessig at crans.org (Pierre Haessig)
Date: Fri, 09 Dec 2011 14:18:04 +0100
Subject: [Numpy-discussion] Long-standing issue with using numpy in embedded CPython
In-Reply-To: 
References: 
Message-ID: <4EE20A8C.4000907@crans.org>

Le 09/12/2011 09:31, Robert Kern a écrit :
> We have some global state
> that we need to keep, and this gets interfered with in a multiple
> interpreter environment.
I recently got interested in multiprocessing computation with numpy, and now I am scared by your statement!
Please don't tell me it is unsafe to launch multiple jobs (for instance with multiprocessing's Pool.map) just doing some ndarray arithmetic!

That's why I'd like to understand better the issue raised by Yang.
For > instance, what does exactly "multiple CPython interpreters" stands for ? Using multiprocessing is fine. That starts up multiple interpreters in *different* processes. Yang is using a non-Python program that embeds the CPython interpreter and starts up multiple copies of it in the same process. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From pierre.haessig at crans.org Fri Dec 9 09:28:35 2011 From: pierre.haessig at crans.org (Pierre Haessig) Date: Fri, 09 Dec 2011 15:28:35 +0100 Subject: [Numpy-discussion] Long-standing issue with using numpy in embedded CPython In-Reply-To: References: <4EE20A8C.4000907@crans.org> Message-ID: <4EE21B13.1010506@crans.org> Le 09/12/2011 15:00, Robert Kern a ?crit : > Using multiprocessing is fine. That starts up multiple interpreters in > *different* processes. Yang is using a non-Python program that embeds > the CPython interpreter and starts up multiple copies of it in the > same process. Ok, now I think I understand. I was not aware it was possible to embed multiple CPython instances into one process. So I guess IPython's multiprocessing infrastructure I once briefly considered is also safe since it runs multiple kernels. I'm relieved to hear that ! Thank you very much for the explanation. Pierre -------------- next part -------------- An HTML attachment was scrubbed... URL: From pcyc.uk at gmail.com Fri Dec 9 10:53:04 2011 From: pcyc.uk at gmail.com (Peter CYC) Date: Fri, 9 Dec 2011 15:53:04 +0000 Subject: [Numpy-discussion] Long-standing issue with using numpy in embedded CPython In-Reply-To: <20111209140530.de7ffq7s6os4css8@160.103.2.152> References: <20111209140530.de7ffq7s6os4css8@160.103.2.152> Message-ID: Hi Armando, No comment on the Java thing ;-) However, http://www.opengda.org/documentation/manuals/Diamond_SciSoft_Python_Guide/8.18/contents.html is more up-to-date and we are on github too: https://github.com/DiamondLightSource Peter On 9 December 2011 13:05, Vicente Sole wrote: > Quoting Robert Kern : > >> On Fri, Dec 9, 2011 at 11:00, Yang Zhang wrote: >> >>> Thanks for the clarification. ?Alas. ?So is there no simple workaround >>> to making numpy work in environments such as Jepp? >> >> I don't think so, no. >> > > It is far from being an optimal solution (in fact I dislike it) but > there is a couple of research facilities that like the python > interpreter, they like numpy, but prefer to use java for all their > graphical interfaces. They have rewritten part of numpy in java in > order to use it from Jython. > > http://www.opengda.org/documentation/manuals/Diamond_SciSoft_Python_Guide/8.16/scisoftpy.html > > > Armando > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From rowen at uw.edu Fri Dec 9 14:02:26 2011 From: rowen at uw.edu (Russell E. Owen) Date: Fri, 09 Dec 2011 11:02:26 -0800 Subject: [Numpy-discussion] trouble building numpy 1.6.1 on Scientific Linux 5 Message-ID: I'm trying to build numpy 1.6.1 on Scientific Linux 5 but the unit tests claim the wrong version of fortran was used. I thought I knew how to avoid that, but it's not working. I don't have atlas (this needs to run on a lot of similar-but-not-identical machines). 
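For reference, a quick way to double-check which fortran runtimes actually end
up linked into the built lapack_lite extension -- I believe this is roughly
what the failing unit test checks (it assumes a Linux box with ldd on the
PATH; everything here is just a sanity-check sketch, not anything official):

import subprocess
import numpy.linalg.lapack_lite as lapack_lite

# list the fortran support libraries lapack_lite is linked against;
# seeing both libg2c (g77) and libgfortran here means mixed runtimes
out = subprocess.Popen(["ldd", lapack_lite.__file__],
                       stdout=subprocess.PIPE).communicate()[0]
print "\n".join(line for line in out.splitlines()
                if "gfortran" in line or "g2c" in line)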
I believe blas and lapack were built against gfortran:

-bash-3.2$ ldd /usr/lib64/libblas.so
        linux-vdso.so.1 =>  (0x00007fff4bffd000)
        libm.so.6 => /lib64/libm.so.6 (0x00002ab26a0c8000)
        libgfortran.so.1 => /usr/lib64/libgfortran.so.1 (0x00002ab26a34c000)
        libc.so.6 => /lib64/libc.so.6 (0x00002ab26a5e3000)
        /lib64/ld-linux-x86-64.so.2 (0x0000003b2ba00000)
-bash-3.2$ ldd /usr/lib64/liblapack.so
        linux-vdso.so.1 =>  (0x00007fffe97fd000)
        libblas.so.3 => /usr/lib64/libblas.so.3 (0x00002b6438d75000)
        libm.so.6 => /lib64/libm.so.6 (0x00002b6438fca000)
        libgfortran.so.1 => /usr/lib64/libgfortran.so.1 (0x00002b643924d000)
        libc.so.6 => /lib64/libc.so.6 (0x00002b64394e4000)
        /lib64/ld-linux-x86-64.so.2 (0x0000003b2ba00000)

The sysadmins have provided a gcc 4.4.0 compiler that I access using
symlinks on my $PATH:

-bash-3.2$ which gcc g++ gfortran
~/local/bin/gcc
~/local/bin/g++
~/local/bin/gfortran
-bash-3.2$ ls -l ~/local/bin
lrwxrwxrwx 1 rowen astro 14 Oct 28  2010 g++ -> /usr/bin/g++44
lrwxrwxrwx 1 rowen astro 14 Oct 28  2010 gcc -> /usr/bin/gcc44
lrwxrwxrwx 1 rowen astro 19 Dec  5 16:40 gfortran -> /usr/bin/gfortran44
-bash-3.2$ gfortran --version
GNU Fortran (GCC) 4.4.0 20090514 (Red Hat 4.4.0-6)
Copyright (C) 2009 Free Software Foundation, Inc.

For this log I used a home-built python 2.6.5 that is widely used.
However, I've tried it with other builds of python that are on our
system as well, with no better success (including a Python 2.7.2).

-bash-3.2$ which python
/astro/apps/pkg/python64/bin/python
-bash-3.2$ python
Python 2.6.5 (r265:79063, Aug  4 2010, 11:27:53)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-46)] on linux2
Type "help", "copyright", "credits" or "license" for more information.

numpy seems to see gfortran when it builds:

-bash-3.2$ python setup.py build --fcompiler=gnu95

Running from numpy source directory. non-existing path in
'numpy/distutils': 'site.cfg'
F2PY Version 2
blas_opt_info:
blas_mkl_info:
...
  NOT AVAILABLE

atlas_blas_threads_info:
...
  NOT AVAILABLE

atlas_blas_info:
...
  NOT AVAILABLE

blas_info:
...
  FOUND:
    libraries = ['blas']
    library_dirs = ['/usr/lib64']
    language = f77

  FOUND:
    libraries = ['blas']
    library_dirs = ['/usr/lib64']
    define_macros = [('NO_ATLAS_INFO', 1)]
    language = f77

lapack_opt_info:
lapack_mkl_info:
mkl_info:
...
  NOT AVAILABLE

  NOT AVAILABLE

atlas_threads_info:
...
  NOT AVAILABLE

atlas_info:
...
  NOT AVAILABLE

/astro/users/rowen/build/numpy-1.6.1/numpy/distutils/system_info.py:1330:
UserWarning:
    Atlas (http://math-atlas.sourceforge.net/) libraries not found.
    Directories to search for the libraries can be specified in the
    numpy/distutils/site.cfg file (section [atlas]) or by setting
    the ATLAS environment variable.
  warnings.warn(AtlasNotFoundError.__doc__)
lapack_info:
  libraries lapack not found in
/astro/apps/lsst_w12_sl5/Linux64/external/python/2.7.2+2/lib
...
  FOUND:
    libraries = ['lapack']
    library_dirs = ['/usr/lib64']
    language = f77

  FOUND:
    libraries = ['lapack', 'blas']
    library_dirs = ['/usr/lib64']
    define_macros = [('NO_ATLAS_INFO', 1)]
    language = f77

running build
running config_cc
unifing config_cc, config, build_clib, build_ext, build commands --compiler options
running config_fc
unifing config_fc, config, build_clib, build_ext, build commands --fcompiler options
running build_src
build_src
building py_modules sources
creating build
creating build/src.linux-x86_64-2.7
creating build/src.linux-x86_64-2.7/numpy
creating build/src.linux-x86_64-2.7/numpy/distutils
building library "npymath" sources
customize Gnu95FCompiler
Found executable /astro/users/rowen/local/bin/gfortran

# I install it in an out-of-the-way location just so I can test it
-bash-3.2$ python setup.py install --home=~/local
...
-bash-3.2$ cd
-bash-3.2$ python
>>> import numpy
>>> numpy.__path__
['/astro/users/rowen/local/lib/python/numpy']
>>> numpy.test()
Running unit tests for numpy
NumPy version 1.6.1
NumPy is installed in /astro/users/rowen/local/lib/python/numpy
Python version 2.6.5 (r265:79063, Aug  4 2010, 11:27:53) [GCC 4.1.2
20080704 (Red Hat 4.1.2-46)]
nose version 0.11.4
....
======================================================================
FAIL: test_lapack (test_build.TestF77Mismatch)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/astro/users/rowen/local/lib/python/numpy/testing/decorators.py",
line 146, in skipper_func
    return f(*args, **kwargs)
  File "/astro/users/rowen/local/lib/python/numpy/linalg/tests/test_build.py",
line 50, in test_lapack
    information.""")
AssertionError: Both g77 and gfortran runtimes linked in lapack_lite !
This is likely to cause random crashes and wrong results. See numpy
INSTALL.txt for more information.

----------------------------------------------------------------------
Ran 3533 tests in 13.400s

Any suggestions on how to fix this?

-- Russell

From enrico.ng at lmco.com  Fri Dec 9 14:25:43 2011
From: enrico.ng at lmco.com (Ng, Enrico)
Date: Fri, 09 Dec 2011 14:25:43 -0500
Subject: [Numpy-discussion] Problem with using PyArray_AsCArray
Message-ID:

I am trying to pass a multi-dimensional ndarray to C as a
multi-dimensional C array for the purposes of passing it to mathematica.
I am using PyArray_AsCArray but getting an error.

######################################################
Python Code:

import Image
from scipy.misc import fromimage
img = Image.open("../APLS/image709_enhanced_2.tif")
nimg = fromimage(img)
...
mathlink_PutIntegerArray(nimg)

######################################################################
C Wrapper Code:

static PyObject *
mathlink_PutIntegerArray(mathlink_Link *self, PyObject *args)
{
    npy_intp dims[3];   /* PyArray_AsCArray is for ndim <= 3 */
    PyObject *o1;
    double *d1;         /* for a 3-d input this should presumably be double*** */
    long result;

    result = PyArray_AsCArray(&o1, (void *)&d1, dims, PyArray_NDIM(o1),
                              PyArray_DescrFromType(PyArray_TYPE(o1)));
    ...
}

Program received signal SIGSEGV, Segmentation fault.
0x00002aaaaefa3d2e in PyArray_AsCArray (op=0x7fffffffdf68, ptr=0x7fffffffdf60,
    dims=0x7fffffffdf40, nd=3, typedescr=<value optimized out>)
    at numpy/core/src/multiarray/multiarraymodule.c:218
218         ptr3[i][j] = ap->data + i*ap->strides[0] + j*ap->strides[1]
#################################################################################
I am able to read i, j, ap->data, ap->strides[0], and ap->strides[1], so the
error seems to be in the assignment to ptr3[i][j]. This happens on the first
instance, i=0 j=0.
#################################################################################
PyArray_AsCArray code is below.

NPY_NO_EXPORT int
PyArray_AsCArray(PyObject **op, void *ptr, npy_intp *dims, int nd,
                 PyArray_Descr* typedescr)
{
    PyArrayObject *ap;
    npy_intp n, m, i, j;
    char **ptr2;
    char ***ptr3;

    if ((nd < 1) || (nd > 3)) {
        PyErr_SetString(PyExc_ValueError,
                        "C arrays of only 1-3 dimensions available");
        Py_XDECREF(typedescr);
        return -1;
    }
    if ((ap = (PyArrayObject*)PyArray_FromAny(*op, typedescr, nd, nd,
                                              CARRAY, NULL)) == NULL) {
        return -1;
    }
    switch(nd) {
    case 1:
        *((char **)ptr) = ap->data;
        break;
    case 2:
        n = ap->dimensions[0];
        ptr2 = (char **)_pya_malloc(n * sizeof(char *));
        if (!ptr2) {
            goto fail;
        }
        for (i = 0; i < n; i++) {
            ptr2[i] = ap->data + i*ap->strides[0];
        }
        *((char ***)ptr) = ptr2;
        break;
    case 3:
        n = ap->dimensions[0];
        m = ap->dimensions[1];
        ptr3 = (char ***)_pya_malloc(n*(m+1) * sizeof(char *));
        if (!ptr3) {
            goto fail;
        }
        for (i = 0; i < n; i++) {
            /* note: this copies the (uninitialized) *value* stored in
               ptr3[n + (m-1)*i]; for i = 0 that is ptr3[n], which is garbage
               fresh from malloc, so the assignment below dereferences a bad
               pointer -- which would explain the segfault at i=0, j=0.
               Presumably the row pointer was meant to be the *address*
               &ptr3[n + m*i]. */
            ptr3[i] = ptr3[n + (m-1)*i];
            for (j = 0; j < m; j++) {
                ptr3[i][j] = ap->data + i*ap->strides[0] + j*ap->strides[1];
            }
        }
        *((char ****)ptr) = ptr3;
    }
    memcpy(dims, ap->dimensions, nd*sizeof(npy_intp));
    *op = (PyObject *)ap;
    return 0;

 fail:
    PyErr_SetString(PyExc_MemoryError, "no memory");
    return -1;
}

From ferreirafm at lim12.fm.usp.br  Fri Dec 9 14:47:50 2011
From: ferreirafm at lim12.fm.usp.br (ferreirafm)
Date: Fri, 9 Dec 2011 11:47:50 -0800 (PST)
Subject: [Numpy-discussion] numpy.mean problems
Message-ID: <32945124.post@talk.nabble.com>

Hi everyone,
I'm quite new to both numpy and python. Could someone, please, tell me
what I'm doing wrong? Here goes my piece of code:

def stats(filename):
    """Utility to perform some basic statistics on columns."""
    tab = get_textab(filename)
    stat_list = [ ]
    for row in sort_tab(tab):
        if row['length'] >= 15:
            stat_list.append(row)
    stat_array = np.array(stat_list)
    print type(sort_tab(tab))
    print type(stat_array)
    #print stat_array.mean(axis=0)
    print np.mean(stat_array, axis=0)

Which results in:

Traceback (most recent call last):
  File "/home/ferreirafm/bin/cross.py", line 213, in <module>
    main()
  File "/home/ferreirafm/bin/cross.py", line 204, in main
    stats(filename)
  File "/home/ferreirafm/bin/cross.py", line 146, in stats
    print np.mean(stat_array, axis=0)
  File "/usr/lib64/python2.7/site-packages/numpy/core/fromnumeric.py",
line 2374, in mean
    return mean(axis, dtype, out)
TypeError: unsupported operand type(s) for +: 'numpy.void' and 'numpy.void'

--
View this message in context:
http://old.nabble.com/numpy.mean-problems-tp32945124p32945124.html
Sent from the Numpy-discussion mailing list archive at Nabble.com.

From Chris.Barker at noaa.gov  Fri Dec 9 15:33:05 2011
From: Chris.Barker at noaa.gov (Chris.Barker)
Date: Fri, 09 Dec 2011 12:33:05 -0800
Subject: [Numpy-discussion] Problem with using PyArray_AsCArray
In-Reply-To:
References:
Message-ID: <4EE27081.4040000@noaa.gov>

On 12/9/11 11:25 AM, Ng, Enrico wrote:
> I am trying to pass a multi-dimensional ndarray to C as a multi-dimensional C array for the purposes of passing it to mathematica.
I am using PyArray_AsCArray but getting an error. I understand that SWIG, Boost, et. al are perhaps too heavyweight for this one use, but you might want to give Cython a try. It makes it really easy to grab a numpy array (and tst it for compliance), and then you can do whatever you want with the data pointer: http://wiki.cython.org/tutorials/numpy http://wiki.cython.org/WrappingNumpy (this one is marked as depricated, but may be what you want in your case, as you do want a raw C array) Some wore googling and browsing of the Cython list will likely yield examples similar to yours. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From enrico.ng at lmco.com Fri Dec 9 16:20:43 2011 From: enrico.ng at lmco.com (Ng, Enrico) Date: Fri, 09 Dec 2011 16:20:43 -0500 Subject: [Numpy-discussion] EXTERNAL: Re: Problem with using PyArray_AsCArray In-Reply-To: <4EE27081.4040000@noaa.gov> References: <4EE27081.4040000@noaa.gov> Message-ID: I actually figured it out. I went one level down in the array and it took it. -----Original Message----- From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Chris.Barker Sent: Friday, December 09, 2011 3:33 PM To: Discussion of Numerical Python Subject: EXTERNAL: Re: [Numpy-discussion] Problem with using PyArray_AsCArray On 12/9/11 11:25 AM, Ng, Enrico wrote: > I am trying to pass a multi-dimensional ndarray to C as a multi-dimensional C array for the purposes of passing it to mathematica. I am using PyArray_AsCArray but getting an error. I understand that SWIG, Boost, et. al are perhaps too heavyweight for this one use, but you might want to give Cython a try. It makes it really easy to grab a numpy array (and tst it for compliance), and then you can do whatever you want with the data pointer: http://wiki.cython.org/tutorials/numpy http://wiki.cython.org/WrappingNumpy (this one is marked as depricated, but may be what you want in your case, as you do want a raw C array) Some wore googling and browsing of the Cython list will likely yield examples similar to yours. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion From madsipsen at gmail.com Fri Dec 9 17:18:12 2011 From: madsipsen at gmail.com (Mads Ipsen) Date: Fri, 09 Dec 2011 23:18:12 +0100 Subject: [Numpy-discussion] Warning related to __multiarray_api.h Message-ID: <4EE28924.9040803@gmail.com> Hi, I don't know if this is of importance, but when I compile code using the numpy C API, I get the warning: site-packages/numpy/core/include/numpy/__multiarray_api.h:1532: warning: 'int _import_array()' defined but not used Might be worth cleaning it up. Best regards, Mads -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From stefan at sun.ac.za Fri Dec 9 19:12:28 2011 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Fri, 9 Dec 2011 16:12:28 -0800 Subject: [Numpy-discussion] numpy.mean problems In-Reply-To: <32945124.post@talk.nabble.com> References: <32945124.post@talk.nabble.com> Message-ID: On Fri, Dec 9, 2011 at 11:47 AM, ferreirafm wrote: > > Hi everyone, > I'm quite new to numpy and python either. Could someone, please, tell me > what I'm doing wrong? > Here goes my peace of code: > > def stats(filename): > ? ?"""Utilility to perform some basic statistics on columns.""" > ? ?tab = get_textab(filename) > ? ?stat_list = [ ] > ? ?for row in sort_tab(tab): > ? ? ? ?if row['length'] >= 15: > ? ? ? ? ? ?stat_list.append(row) > ? ?stat_array = np.array(stat_list) > ? ?print type(sort_tab(tab)) > ? ?print type(stat_array) > ? ?#print stat_array.mean(axis=0) > ? ?print np.mean(stat_array, axis=0) > > Which results in: > > When posting to the mailing list, it's a good idea to have a small, self contained example (otherwise we can't reproduce your problem). In this specific case, I'd like to be able to see what the outputs of "print tab" and "print stat_array" are. Regards St?fan From ferreirafm at lim12.fm.usp.br Sat Dec 10 06:47:21 2011 From: ferreirafm at lim12.fm.usp.br (ferreirafm) Date: Sat, 10 Dec 2011 03:47:21 -0800 (PST) Subject: [Numpy-discussion] numpy.mean problems In-Reply-To: References: <32945124.post@talk.nabble.com> Message-ID: <32951098.post@talk.nabble.com> St?fan van der Walt wrote: > > When posting to the mailing list, it's a good idea to have a small, > self contained example (otherwise we can't reproduce your problem). > In this specific case, I'd like to be able to see what the outputs of > "print tab" and "print stat_array" are. > > Regards > St?fan > Hi St?fan, Thanks for your replay. Have a look in the arrays at: http://ompldr.org/vYm83ZA Regards, Fred -- View this message in context: http://old.nabble.com/numpy.mean-problems-tp32945124p32951098.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From sole at esrf.fr Sat Dec 10 08:43:38 2011 From: sole at esrf.fr (Vicente Sole) Date: Sat, 10 Dec 2011 14:43:38 +0100 Subject: [Numpy-discussion] Long-standing issue with using numpy in embedded CPython In-Reply-To: References: <20111209140530.de7ffq7s6os4css8@160.103.2.152> Message-ID: <20111210144338.a19jbh9khcs44kwc@160.103.2.152> Hi Peter, The obsolete link was not deliberate. It was the first reference I found via google. Best regards, Armando Quoting Peter CYC : > Hi Armando, > > No comment on the Java thing ;-) > > However, > http://www.opengda.org/documentation/manuals/Diamond_SciSoft_Python_Guide/8.18/contents.html > is more up-to-date and we are on github too: > https://github.com/DiamondLightSource > > Peter > > > On 9 December 2011 13:05, Vicente Sole wrote: >> Quoting Robert Kern : >> >>> On Fri, Dec 9, 2011 at 11:00, Yang Zhang wrote: >>> >>>> Thanks for the clarification. ?Alas. ?So is there no simple workaround >>>> to making numpy work in environments such as Jepp? >>> >>> I don't think so, no. >>> >> >> It is far from being an optimal solution (in fact I dislike it) but >> there is a couple of research facilities that like the python >> interpreter, they like numpy, but prefer to use java for all their >> graphical interfaces. They have rewritten part of numpy in java in >> order to use it from Jython. 
>> >> http://www.opengda.org/documentation/manuals/Diamond_SciSoft_Python_Guide/8.16/scisoftpy.html >> >> >> Armando >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From bsouthey at gmail.com Sat Dec 10 10:19:49 2011 From: bsouthey at gmail.com (Bruce Southey) Date: Sat, 10 Dec 2011 09:19:49 -0600 Subject: [Numpy-discussion] NumPy Governance In-Reply-To: References: Message-ID: On Mon, Dec 5, 2011 at 11:32 PM, Matthew Brett wrote: > Hi, > > 2011/12/5 St?fan van der Walt : >> As for barriers to entry, improving the the nature of discourse on the >> mailing list (when it comes to thorny issues) would be good. >> Technical barriers are not that hard to breach for our community; >> setting the right social atmosphere is crucial. > > I'm just about to get on a plane and am going to be out of internet > range for a while, so, in the spirit of constructive discussion: > > In the spirit of use-cases: > > Would it be fair to say that the two contentious recent discussions have been: > > The numpy ABI breakage, 2.0 vs 1.5.1 discussion > The masked array discussion(s) ? > > What did we do wrong or right in each of these two discussions? ?What > could we have done better? ?What process would help us to do better? > > Travis - for your board-only-post mailing list - my feeling is that > this is going in the wrong direction. ?The effect of the board-only > mailing list is to explicitly remove non-qualified people from the > discussion. ? This will make it more explicit that the substantial > decisions will be make by a few important people. ? Do you (Travis - > or Mark?) think that, if this had happened earlier in the masked array > discussion, it would have been less contentious, or had more > substantial content? ?My instinct would be the reverse, and the best > solution would have been to pause and commit to beating out the issues > and getting agreement. > > See you, > > Matthew I would also like to know the long term model for development since some of issues a direct results of that model. At the moment we seem stuck in the old svn model as we have a release that essentially splits from the current development branch where key developers just merge into without any discussion. This 'old svn' view did create some discussion regarding the NA object including the pull request. But we lacked the step about moving it into the current developmental branch. So at times that seemed to add 'insult to injury' (http://idioms.thefreedictionary.com/add+insult+to+injury) which tended to decrease some of the interesting ideas expressed. So perhaps we could find a model that allows real bug fixes (ie have a valid ticket or maximum of 5 lines of changed code) to go the current developmental branch and enhancements come through as some other process that involves community discussions? Bruce From aronne.merrelli at gmail.com Sat Dec 10 12:41:29 2011 From: aronne.merrelli at gmail.com (Aronne Merrelli) Date: Sat, 10 Dec 2011 11:41:29 -0600 Subject: [Numpy-discussion] numpy.mean problems In-Reply-To: <32951098.post@talk.nabble.com> References: <32945124.post@talk.nabble.com> <32951098.post@talk.nabble.com> Message-ID: On Sat, Dec 10, 2011 at 5:47 AM, ferreirafm wrote: > > > Hi St?fan, > Thanks for your replay. 
Have a look at the arrays at:
> http://ompldr.org/vYm83ZA
> Regards,
> Fred
> --

I can recreate this error if tab is a structured ndarray - what is the
dtype of tab?

If so, I think you could fix this by simplifying things. Since tab is
already an ndarray, you should not need to convert it back into a python
list. By converting the ndarray back to a list you are making an extra
level of "wrapping" as a python object, which is ultimately why you get
that error about adding numpy.void.

Unfortunately you cannot directly take a mean of a struct dtype; structs
are generic, so they could have fields with strings, or objects, etc.,
that would be invalid for a mean calculation. However, the following code
fragment should work pretty efficiently. It will make a 1-element array
of the same dtype as tab, and then populate it with the mean value of all
elements where the length is >= 15. Note that dtype.fields.keys() gives
you a nice way to iterate over the fields in the struct dtype:

length_mask = tab['length'] >= 15
tab_means = np.zeros(1, dtype=tab.dtype)
for k in tab.dtype.fields.keys():
    tab_means[k] = np.mean( tab[k][length_mask] )

In general this would not work if tab has a field that is not a simple
numeric type, such as a str, object, ... But it looks like your arrays
are all numeric from your example above.

Hope that helps,
Aronne

From charlesr.harris at gmail.com  Sat Dec 10 14:07:06 2011
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sat, 10 Dec 2011 12:07:06 -0700
Subject: [Numpy-discussion] NumPy Governance
In-Reply-To:
References:
Message-ID:

On Sat, Dec 10, 2011 at 8:19 AM, Bruce Southey wrote:

> On Mon, Dec 5, 2011 at 11:32 PM, Matthew Brett
> wrote:
> > Hi,
> >
> > 2011/12/5 Stéfan van der Walt :
> >> As for barriers to entry, improving the the nature of discourse on the
> >> mailing list (when it comes to thorny issues) would be good.
> >> Technical barriers are not that hard to breach for our community;
> >> setting the right social atmosphere is crucial.
> >
> > I'm just about to get on a plane and am going to be out of internet
> > range for a while, so, in the spirit of constructive discussion:
> >
> > In the spirit of use-cases:
> >
> > Would it be fair to say that the two contentious recent discussions have
> been:
> >
> > The numpy ABI breakage, 2.0 vs 1.5.1 discussion
> > The masked array discussion(s) ?
> >
> > What did we do wrong or right in each of these two discussions?  What
> > could we have done better?  What process would help us to do better?
> >
> > Travis - for your board-only-post mailing list - my feeling is that
> > this is going in the wrong direction.  The effect of the board-only
> > mailing list is to explicitly remove non-qualified people from the
> > discussion.  This will make it more explicit that the substantial
> > decisions will be make by a few important people.  Do you (Travis -
> > or Mark?) think that, if this had happened earlier in the masked array
> > discussion, it would have been less contentious, or had more
> > substantial content?  My instinct would be the reverse, and the best
> > solution would have been to pause and commit to beating out the issues
> > and getting agreement.
> >
> > See you,
> >
> > Matthew
>
> I would also like to know the long term model for development since
> some of issues a direct results of that model.
At the moment we seem > stuck in the old svn model as we have a release that essentially > splits from the current development branch where key developers just > merge into without any discussion. This 'old svn' view did create some > discussion regarding the NA object including the pull request. But we > lacked the step about moving it into the current developmental branch. > So at times that seemed to add 'insult to injury' > (http://idioms.thefreedictionary.com/add+insult+to+injury) which > tended to decrease some of the interesting ideas expressed. > > So perhaps we could find a model that allows real bug fixes (ie have a > valid ticket or maximum of 5 lines of changed code) to go the current > developmental branch and enhancements come through as some other > process that involves community discussions? > > I think the rule should be that *anyone* seriously interested in what is happening in numpy development should be watching the pull requests. The complementary rule is that all commits go in through a pull request. There isn't much traffic on the pull requests and code/functionality review at that point is both useful and desirable. Large topics do tend to turn up on the list, the NA work, for example. But github makes it easy to follow what is going on and folks should take advantage of it. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Sat Dec 10 17:11:24 2011 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Sat, 10 Dec 2011 14:11:24 -0800 Subject: [Numpy-discussion] NumPy Governance In-Reply-To: References: Message-ID: On Sat, Dec 10, 2011 at 11:07 AM, Charles R Harris wrote: > I think the rule should be that *anyone* seriously interested in what is > happening in numpy development should be watching the pull requests. It's good to encourage that, but in the end big changes should always be discussed on the mailing list, to make sure everyone is on board. St?fan From ferreirafm at lim12.fm.usp.br Sun Dec 11 07:49:56 2011 From: ferreirafm at lim12.fm.usp.br (ferreirafm) Date: Sun, 11 Dec 2011 04:49:56 -0800 (PST) Subject: [Numpy-discussion] numpy.mean problems In-Reply-To: References: <32945124.post@talk.nabble.com> <32951098.post@talk.nabble.com> Message-ID: <32955052.post@talk.nabble.com> Aronne Merrelli wrote: > > I can recreate this error if tab is a structured ndarray - what is the > dtype of tab? > > If that is correct, I think you could fix this by simplifying things. > Since > tab is already an ndarray, you should not need to convert it back into a > python list. By converting the ndarray back to a list you are making an > extra level of "wrapping" as a python object, which is ultimately why you > get that error about adding numpy.void. > > Unfortunately you cannot take directly take a mean of a struct dtype; > structs are generic so they could have fields with strings, or objects, > etc, that would be invalid for a mean calculation. However the following > code fragment should work pretty efficiently. It will make a 1-element > array of the same dtype as tab, and then populate it with the mean value > of > all elements where the length is >= 15. 
Note that dtype.fields.keys() > gives > you a nice way to iterate over the fields in the struct dtype: > > length_mask = tab['length'] >= 15 > tab_means = np.zeros(1, dtype=tab.dtype) > for k in tab.dtype.fields.keys(): > tab_means[k] = np.mean( tab[k][mask] ) > > In general this would not work if tab has a field that is not a simple > numeric type, such as a str, object, ... But it looks like your arrays are > all numeric from your example above. > > Hope that helps, > Aronne > HI Aronne, Thanks for your replay. Indeed, tab is a mix of different column types: tab.dtype: [('sgi', ' References: Message-ID: On Fri, Dec 9, 2011 at 8:02 PM, Russell E. Owen wrote: > I'm trying to build numpy 1.6.1 on Scientific Linux 5 but the unit tests > claim the wrong version of fortran was used. I thought I knew how to > avoid that, but it's not working. > > I don't have atlas (this needs to run on a lot of > similar-but-not-identical machines). I believe blas and lapack were > built against gfortran: > -bash-3.2$ ldd /usr/lib64/libblas.so > linux-vdso.so.1 => (0x00007fff4bffd000) > libm.so.6 => /lib64/libm.so.6 (0x00002ab26a0c8000) > libgfortran.so.1 => /usr/lib64/libgfortran.so.1 (0x00002ab26a34c000) > libc.so.6 => /lib64/libc.so.6 (0x00002ab26a5e3000) > /lib64/ld-linux-x86-64.so.2 (0x0000003b2ba00000) > -bash-3.2$ ldd /usr/lib64/liblapack.so > linux-vdso.so.1 => (0x00007fffe97fd000) > libblas.so.3 => /usr/lib64/libblas.so.3 (0x00002b6438d75000) > libm.so.6 => /lib64/libm.so.6 (0x00002b6438fca000) > libgfortran.so.1 => /usr/lib64/libgfortran.so.1 (0x00002b643924d000) > libc.so.6 => /lib64/libc.so.6 (0x00002b64394e4000) > /lib64/ld-linux-x86-64.so.2 (0x0000003b2ba00000) > > The sysadmins have provided a gcc 4.4.0 compiler that I access using > symlinks on my $PATH: > -bash-3.2$ which gcc g++ gfortran > ~/local/bin/gcc > ~/local/bin/g++ > ~/local/bin/gfortran > -bash-3.2$ ls -l ~/local/bin > lrwxrwxrwx 1 rowen astro 14 Oct 28 2010 g++ -> /usr/bin/g++44 > lrwxrwxrwx 1 rowen astro 14 Oct 28 2010 gcc -> /usr/bin/gcc44 > lrwxrwxrwx 1 rowen astro 19 Dec 5 16:40 gfortran -> /usr/bin/gfortran44 > -bash-3.2$ gfortran --version > GNU Fortran (GCC) 4.4.0 20090514 (Red Hat 4.4.0-6) > Copyright (C) 2009 Free Software Foundation, Inc. > > For this log I used a home-bulit python 2.6.5 that is widely used. > However, I've tried it with other builds of python that are on our > system, as well, with no better success (including a Python 2.7.2). > -bash-3.2$ which python > /astro/apps/pkg/python64/bin/python > -bash-3.2$ python > Python 2.6.5 (r265:79063, Aug 4 2010, 11:27:53) > [GCC 4.1.2 20080704 (Red Hat 4.1.2-46)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > > > > numpy seems to see gfortran when it builds: > > -bash-3.2$ python setup.py build --fcompiler=gnu95 > > Running from numpy source directory.non-existing path in > 'numpy/distutils': 'site.cfg' > F2PY Version 2 > blas_opt_info: > blas_mkl_info: > ... NOT AVAILABLE > > atlas_blas_threads_info: > ... NOT AVAILABLE > > atlas_blas_info: > ... NOT AVAILABLE > > blas_info: > ... FOUND: > libraries = ['blas'] > library_dirs = ['/usr/lib64'] > language = f77 > > FOUND: > libraries = ['blas'] > library_dirs = ['/usr/lib64'] > define_macros = [('NO_ATLAS_INFO', 1)] > language = f77 > > lapack_opt_info: > lapack_mkl_info: > mkl_info: > ... NOT AVAILABLE > > NOT AVAILABLE > > atlas_threads_info: > ... NOT AVAILABLE > > atlas_info: > ... 
NOT AVAILABLE > > /astro/users/rowen/build/numpy-1.6.1/numpy/distutils/system_info.py:1330: > UserWarning: > Atlas (http://math-atlas.sourceforge.net/) libraries not found. > Directories to search for the libraries can be specified in the > numpy/distutils/site.cfg file (section [atlas]) or by setting > the ATLAS environment variable. > warnings.warn(AtlasNotFoundError.__doc__) > lapack_info: > libraries lapack not found in > /astro/apps/lsst_w12_sl5/Linux64/external/python/2.7.2+2/lib > ... FOUND: > libraries = ['lapack'] > library_dirs = ['/usr/lib64'] > language = f77 > > FOUND: > libraries = ['lapack', 'blas'] > library_dirs = ['/usr/lib64'] > define_macros = [('NO_ATLAS_INFO', 1)] > language = f77 > > running build > running config_cc > unifing config_cc, config, build_clib, build_ext, build commands > --compiler options > running config_fc > unifing config_fc, config, build_clib, build_ext, build commands > --fcompiler options > running build_src > build_src > building py_modules sources > creating build > creating build/src.linux-x86_64-2.7 > creating build/src.linux-x86_64-2.7/numpy > creating build/src.linux-x86_64-2.7/numpy/distutils > building library "npymath" sources > customize Gnu95FCompiler > Found executable /astro/users/rowen/local/bin/gfortran > > > # I install it in an out-of-the-way location just so I can test it > -bash-3.2$ python setup.py install --home=~/local > ... > -bash-3.2$ cd > -bash-3.2$ python > >>> import numpy > >>> numpy.__path__ > ['/astro/users/rowen/local/lib/python/numpy'] > >>> numpy.test() > Running unit tests for numpy > NumPy version 1.6.1 > NumPy is installed in /astro/users/rowen/local/lib/python/numpy > Python version 2.6.5 (r265:79063, Aug 4 2010, 11:27:53) [GCC 4.1.2 > 20080704 (Red Hat 4.1.2-46)] > nose version 0.11.4 > .... > ====================================================================== > FAIL: test_lapack (test_build.TestF77Mismatch) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File > "/astro/users/rowen/local/lib/python/numpy/testing/decorators.py", line > 146, in skipper_func > return f(*args, **kwargs) > File > "/astro/users/rowen/local/lib/python/numpy/linalg/tests/test_build.py", > line 50, in test_lapack > information.""") > AssertionError: Both g77 and gfortran runtimes linked in lapack_lite ! > This is likely to > cause random crashes and wrong results. See numpy INSTALL.txt for more > information. > > ---------------------------------------------------------------------- > Ran 3533 tests in 13.400s > > > Any suggestions on how to fix this? > I assume you have g77 installed and on your PATH. If so, try moving it off your path. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Sun Dec 11 11:40:02 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sun, 11 Dec 2011 17:40:02 +0100 Subject: [Numpy-discussion] Fast Reading of ASCII files In-Reply-To: <4EDFB566.4000303@noaa.gov> References: <4EDFB566.4000303@noaa.gov> Message-ID: On Wed, Dec 7, 2011 at 7:50 PM, Chris.Barker wrote: > Hi folks, > > This is a continuation of a conversation already started, but i gave it > a new, more appropriate, thread and subject. > > On 12/6/11 2:13 PM, Wes McKinney wrote: > > we should start talking > > about building a *high performance* flat file loading solution with > > good column type inference and sensible defaults, etc. > ... 
> > > I personally don't > > believe in sacrificing an order of magnitude of performance in the 90% > > case for the 10% case-- so maybe it makes sense to have two functions > > around: a superfast custom CSV reader for well-behaved data, and a > > slower, but highly flexible, function like loadtable to fall back on. > > I've wanted this for ages, and have done some work towards it, but like > others, only had the time for a my-use-case-specific solution. A few > thoughts: > > * If we have a good, fast ascii (or unicode?) to array reader, hopefully > it could be leveraged for use in the more complex cases. So that rather > than genfromtxt() being written from scratch, it would be a wrapper > around the lower-level reader. > You seem to be contradicting yourself here. The more complex cases are Wes' 10% and why genfromtxt is so hairy internally. There's always a trade-off between speed and handling complex corner cases. You want both. A very fast reader for well-behave files would be very welcome, but I see it as a separate topic from genfromtxt/loadtable. The question for the loadtable pull request is whether it is different enough from genfromtxt that we need/want both, or whether loadtable should replace genfromtxt. Cheers, Ralf > * key to performance is to have the text to number to numpy type > happening in C -- if you read the text with python, then convert to > numbers, then to numpy arrays, it's simple going to be slow. > > * I think we want a solution that can be adapted to arbitrary text files > -- not just tabular, CSV-style data. I have a lot of those to read - and > some thoughts about how. > > Efforts I have made so far, and what I've learned from them: > > 1) fromfile(): > fromfile (for text) is nice and fast, but buggy, and a bit too > limited. I've posted various notes about this in the past (and, I'm > pretty sure a couple tickets). They key missing features are: > a) no support form commented lines (this is a lessor need, I think) > b) there can be only one delimiter, and newlines are treated as > generic whitespace. What this means is that if you have > whitespace-delimited file, you can read multiple lines, but if it is, > for instance, comma-delimited, then you can only read one line at a > time, killing performance. > c) there are various bugs if the text is malformed, or doesn't quite > match what you're asking for (ie.e reading integers, but the tet is > float) -- mostly really limited error checking. > > I spent some time digging into the code, and found it to be really hard > to track C code. And very hard to update. The core idea is pretty nice > -- each dtype should know how to read itself form a text file, but the > implementation is painful. The key issue is that for floats and ints, > anyway, it relies on the C atoi and atof functions. However, there have > been patches to these that handle NaN better, etc, for numpy, and I > think a python patch as well. So the code calls the numpy atoi, which > does some checks, then calls the python atoi, which then calls the C lib > atoi (I think all that...) In any case, the core bugs are due to the > fact that atoi and friends doesn't return an error code, so you have to > check if the pointer has been incremented to see if the read was > successful -- this error checking is not propagated through all those > levels of calls. It got really ugly to try to fix! Also, the use of the > C atoi() means that locales may only be handled in the default way -- > i.e. 
no way to read european-style floats on a system with a US locale. > > My conclusion -- the current code is too much a mess to try to deal with > and fix! > > I also think it's a mistake to have text file reading a special case of > fromfile(), it really should be a separate issue, though that's a minor > API question. > > 2) FileScanner: > > FileScanner is some code a wrote years ago as a C extension - it's > limited, but does the job and is pretty fast. It essentially calls > fscanf() as many times as it gets a successful scan, skipping all > invalid text, then returning a numpy array. You can also specify how > many numbers you want read from the file. It only supports floats. > Travis O. asked it it could be included in Scipy way back when, but I > suspect none of my code actually made it in. > > If I had to do it again, I might write something similar in Cython, > though I am still using it. > > > My Conclusions: > > I think what we need is something similar to MATLAB's fscanf(): > > what it does is take a C-style format string, and apply it to your file > over an over again as many times as it can, and returns an array. What's > nice about this is that it can be purposed to efficiently read a wide > variety of text files fast. > > For numpy, I imagine something like: > > fromtextfile(f, dtype=np.float64, comment=None, shape=None): > """ > read data from a text file, returning a numpy array > > f: is a filename or file-like object > > comment: is a string of the comment signifier. Anything on a line > after this string will be ignored. > > dytpe: is a numpy dtype that you want read from the file > > shape: is the shape of the resulting array. If shape==None, the > file will be read until EOF or until there is read error. > By default, if there are newlines in the file, a 2-d array > will be returned, with the newline signifying a new row in > the array. > """ > > This is actually pretty straightforward. If it support compound dtypes, > then you can read a pretty complex CSV file, once you've determined the > dtype for your "record" (row). It is also really simple to use for the > simple cases. > > But of course, the implementation could be a pain -- I've been thinking > that you could get a lot of it by creating a mapping from numpy dtypes > to fscanf() format strings, then simply use fscanf for the actual file > reading. This would certainly be easy for the easy cases. (maybe you'd > want to use sscanf, so you could have the same code scan strings as well > as files) > > Ideally, each dtype would know how to read itself from a string, but as > I said above, the code for that is currently pretty ugly, so it may be > easier to keep it separate. > > Anyway, I'd be glad to help with this effort. > > -Chris > > > -- > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From davide.lasagna at polito.it  Mon Dec 12 09:04:56 2011
From: davide.lasagna at polito.it (LASAGNA DAVIDE)
Date: Mon, 12 Dec 2011 15:04:56 +0100
Subject: [Numpy-discussion] polynomial with negative exponents
Message-ID:

Hi,

I have written a class for polynomials with negative exponents like:

p(x) = a0 + a1*x**-1 + ... + an*x**-n

The code is this one:

class NegativeExpPolynomial( object ):
    def __init__ ( self, coeffs ):
        self.coeffs = np.array( coeffs )

    def __call__( self, x ):
        return sum( (c*x**(-i) for i, c in enumerate( self.coeffs ) ) )

where coeffs = [a0, a1, ..., an].

I find that the way i evaluate the polynomial is kind of *slow*,
especially for polynomial with order larger than ~200 and for arrays x
large enough.

Do you have suggestions on how to speed up this code?

Regards,

Davide Lasagna

--
Phd Student
Dipartimento di Ingegneria Aeronautica a Spaziale
Politecnico di Torino, Italy
tel: 011/0906871
e-mail: davide.lasagna at polito.it; lasagnadavide at gmail.com

From josef.pktd at gmail.com  Mon Dec 12 09:35:07 2011
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Mon, 12 Dec 2011 09:35:07 -0500
Subject: [Numpy-discussion] polynomial with negative exponents
In-Reply-To:
References:
Message-ID:

On Mon, Dec 12, 2011 at 9:04 AM, LASAGNA DAVIDE wrote:
> Hi,
>
> I have written a class for polynomials with negative
> exponents like:
>
> p(x) = a0 + a1*x**-1 + ... + an*x**-n
>
> The code is this one:
>
> class NegativeExpPolynomial( object ):
>      def __init__ ( self, coeffs ):
>          self.coeffs = np.array( coeffs )
>
>      def __call__( self, x ):
>          return sum( (c*x**(-i) for i, c in enumerate(
> self.coeffs ) ) )

something like

self.coeffs = np.asarray(self.coeffs)

np.sum(self.coeffs * x**(-np.arange(len(self.coeffs))))

or

np.dot(self.coeffs, x**(-np.arange(len(self.coeffs))))    # check
shapes, or np.inner

Josef

>
> where coeffs = [a0, a1, ..., an].
>
> I find that the way i evaluate the polynomial is kind of
> *slow*, especially for polynomial with order larger than
> ~200 and for arrays x large enough.
>
> Do you have suggestions on how to speed up this code?
>
> Regards,
>
> Davide Lasagna
>
> --
> Phd Student
> Dipartimento di Ingegneria Aeronautica a Spaziale
> Politecnico di Torino, Italy
> tel: 011/0906871
> e-mail: davide.lasagna at polito.it; lasagnadavide at gmail.com
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

From josef.pktd at gmail.com  Mon Dec 12 09:36:54 2011
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Mon, 12 Dec 2011 09:36:54 -0500
Subject: [Numpy-discussion] polynomial with negative exponents
In-Reply-To:
References:
Message-ID:

On Mon, Dec 12, 2011 at 9:35 AM, wrote:
> On Mon, Dec 12, 2011 at 9:04 AM, LASAGNA DAVIDE
> wrote:
>> Hi,
>>
>> I have written a class for polynomials with negative
>> exponents like:
>>
>> p(x) = a0 + a1*x**-1 + ... + an*x**-n
>>
>> The code is this one:
>>
>> class NegativeExpPolynomial( object ):
>>      def __init__ ( self, coeffs ):
>>          self.coeffs = np.array( coeffs )
>>
>>      def __call__( self, x ):
>>          return sum( (c*x**(-i) for i, c in enumerate(
>> self.coeffs ) ) )
>
> something like
>
> self.coeffs = np.asarray(self.coeffs)
>
> np.sum(self.coeffs * x**(-np.arange(len(self.coeffs))))
>
> or
> np.dot(self.coeffs, x**(-np.arange(len(self.coeffs))))    # check
> shapes, or np.inner
>
> Josef
>
>>
>> where coeffs = [a0, a1, ..., an].
>> >> I find that the way i evaluate the polynomial is kind of >> *slow*, especially for polynomial with order larger than >> ~200 and for arrays x large enough. there might be numerical problems with large polynomials if the range of values is large Josef >> >> Do you have suggestions on how to speed up this code? >> >> Regards, >> >> Davide Lasagna >> >> -- >> Phd Student >> Dipartimento di Ingegneria Aeronautica a Spaziale >> Politecnico di Torino, Italy >> tel: 011/0906871 >> e-mail: davide.lasagna at polito.it; lasagnadavide at gmail.com >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion From gregor.thalhammer at gmail.com Mon Dec 12 12:17:35 2011 From: gregor.thalhammer at gmail.com (Gregor Thalhammer) Date: Mon, 12 Dec 2011 18:17:35 +0100 Subject: [Numpy-discussion] polynomial with negative exponents In-Reply-To: References: Message-ID: <1EBDBB3C-4413-403F-8430-69092FBFBFEB@gmail.com> Am 12.12.2011 um 15:04 schrieb LASAGNA DAVIDE: > Hi, > > I have written a class for polynomials with negative > exponents like: > > p(x) = a0 + a1*x**-1 + ... + an*x**-n > > The code is this one: > > class NegativeExpPolynomial( object ): > def __init__ ( self, coeffs ): > self.coeffs = np.array( coeffs ) > > def __call__( self, x ): > return sum( (c*x**(-i) for i, c in enumerate( > self.coeffs ) ) ) > > where coeffs = [a0, a1, ..., an]. > > I find that the way i evaluate the polynomial is kind of > *slow*, especially for polynomial with order larger than > ~200 and for arrays x large enough. I fear that such high orders create a lot of troubles since evaluating them is very sensitive to numerical errors. > Do you have suggestions on how to speed up this code? > Your polynomials with negative exponents are equivalent to a polynomial with positive exponents, but evaluated at 1/x. Therefore you can make use of the efficient polynomial functions of numpy. Try np.polyval(self.coeffs, x) Gregor From chris.barker at noaa.gov Mon Dec 12 12:22:16 2011 From: chris.barker at noaa.gov (Chris.Barker) Date: Mon, 12 Dec 2011 09:22:16 -0800 Subject: [Numpy-discussion] Fast Reading of ASCII files In-Reply-To: References: <4EDFB566.4000303@noaa.gov> Message-ID: <4EE63848.5000002@noaa.gov> On 12/11/11 8:40 AM, Ralf Gommers wrote: > On Wed, Dec 7, 2011 at 7:50 PM, Chris.Barker * If we have a good, fast ascii (or unicode?) to array reader, hopefully > it could be leveraged for use in the more complex cases. So that rather > than genfromtxt() being written from scratch, it would be a wrapper > around the lower-level reader. > > You seem to be contradicting yourself here. The more complex cases are > Wes' 10% and why genfromtxt is so hairy internally. There's always a > trade-off between speed and handling complex corner cases. You want both. I don't think the version in my mind is contradictory (Not quite). What I'm imagining is that a good, fast ascii to numpy array reader could read a whole table in at once (the common, easy, fast, case), but it could also be used to read snippets of a file in at a time, which could be leveraged to handle many of the more complex cases. I suppose there will always be cases where the user needs to write their own converter from string to dtype, and there is simply no way to leverage what I'm imagining to supported that. 
Hmm, maybe there is -- for instance, if a "record" consisted off mostly standard, easy-to-parse, numbers, but one field was some weird text that needed custom parsing, we could read it as a dtype, with a string for that one weird field, and that could be converted in a post-processing step. Maybe that wouldn't be any faster or easier, but it could be done... Anyway, whether you can leverage it for the full-featured version or not, I do think there is call for a good, fast, 90% case text file parser. Would anyone like to join/form a small working group to work on this? Wes, I'd like to see your Cython version -- maybe a starting point? -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From warren.weckesser at enthought.com Mon Dec 12 12:34:02 2011 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Mon, 12 Dec 2011 10:34:02 -0700 Subject: [Numpy-discussion] Fast Reading of ASCII files In-Reply-To: <4EE63848.5000002@noaa.gov> References: <4EDFB566.4000303@noaa.gov> <4EE63848.5000002@noaa.gov> Message-ID: On Mon, Dec 12, 2011 at 10:22 AM, Chris.Barker wrote: > On 12/11/11 8:40 AM, Ralf Gommers wrote: > > On Wed, Dec 7, 2011 at 7:50 PM, Chris.Barker > * If we have a good, fast ascii (or unicode?) to array reader, > hopefully > > it could be leveraged for use in the more complex cases. So that > rather > > than genfromtxt() being written from scratch, it would be a wrapper > > around the lower-level reader. > > > > You seem to be contradicting yourself here. The more complex cases are > > Wes' 10% and why genfromtxt is so hairy internally. There's always a > > trade-off between speed and handling complex corner cases. You want both. > > I don't think the version in my mind is contradictory (Not quite). > > What I'm imagining is that a good, fast ascii to numpy array reader > could read a whole table in at once (the common, easy, fast, case), but > it could also be used to read snippets of a file in at a time, which > could be leveraged to handle many of the more complex cases. > > I suppose there will always be cases where the user needs to write their > own converter from string to dtype, and there is simply no way to > leverage what I'm imagining to supported that. > > Hmm, maybe there is -- for instance, if a "record" consisted off mostly > standard, easy-to-parse, numbers, but one field was some weird text that > needed custom parsing, we could read it as a dtype, with a string for > that one weird field, and that could be converted in a post-processing > step. > > Maybe that wouldn't be any faster or easier, but it could be done... > > Anyway, whether you can leverage it for the full-featured version or > not, I do think there is call for a good, fast, 90% case text file parser. > > > Would anyone like to join/form a small working group to work on this? > > Wes, I'd like to see your Cython version -- maybe a starting point? > > -Chris > I'm also working on a faster text file reader, so count me in. I've been experimenting in both C and Cython. I'll put it on github as soon as I can. Warren > > > -- > Christopher Barker, Ph.D. 
> Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Dec 12 13:54:51 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 12 Dec 2011 11:54:51 -0700 Subject: [Numpy-discussion] polynomial with negative exponents In-Reply-To: <1EBDBB3C-4413-403F-8430-69092FBFBFEB@gmail.com> References: <1EBDBB3C-4413-403F-8430-69092FBFBFEB@gmail.com> Message-ID: On Mon, Dec 12, 2011 at 10:17 AM, Gregor Thalhammer < gregor.thalhammer at gmail.com> wrote: > > Am 12.12.2011 um 15:04 schrieb LASAGNA DAVIDE: > > > Hi, > > > > I have written a class for polynomials with negative > > exponents like: > > > > p(x) = a0 + a1*x**-1 + ... + an*x**-n > > > > The code is this one: > > > > class NegativeExpPolynomial( object ): > > def __init__ ( self, coeffs ): > > self.coeffs = np.array( coeffs ) > > > > def __call__( self, x ): > > return sum( (c*x**(-i) for i, c in enumerate( > > self.coeffs ) ) ) > > > > where coeffs = [a0, a1, ..., an]. > > > > I find that the way i evaluate the polynomial is kind of > > *slow*, especially for polynomial with order larger than > > ~200 and for arrays x large enough. > > I fear that such high orders create a lot of troubles since evaluating > them is very sensitive to numerical errors. > > > Do you have suggestions on how to speed up this code? > > > > Your polynomials with negative exponents are equivalent to a polynomial > with positive exponents, but evaluated at 1/x. Therefore you can make use > of the efficient polynomial functions of numpy. Try > > np.polyval(self.coeffs, x) > > Or numpy.polynomial.polynomial.polyval, which will use the coefficients in the same order. Or you could subclass numpy.polynomial.Polynomial and override the call to use 1/x instead of x. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From rowen at uw.edu Mon Dec 12 14:29:27 2011 From: rowen at uw.edu (Russell E. Owen) Date: Mon, 12 Dec 2011 11:29:27 -0800 Subject: [Numpy-discussion] trouble building numpy 1.6.1 on Scientific Linux 5 References: Message-ID: In article , Ralf Gommers wrote: > On Fri, Dec 9, 2011 at 8:02 PM, Russell E. Owen wrote: > > > I'm trying to build numpy 1.6.1 on Scientific Linux 5 but the unit tests > > claim the wrong version of fortran was used. I thought I knew how to > > avoid that, but it's not working. > > > >...(elided text that suggests numpy is building using g77 even though I asked for gfortran)... > > > > Any suggestions on how to fix this? > > > > I assume you have g77 installed and on your PATH. If so, try moving it off > your path. Yes. I would have tried that if I had known how to do it (though I'm puzzled why it would be wanted since I told the installer to use gfortran). The problem is that g77 is in /usr/bin/ and I don't have root privs on this system. -- Russell From shish at keba.be Mon Dec 12 14:43:18 2011 From: shish at keba.be (Olivier Delalleau) Date: Mon, 12 Dec 2011 20:43:18 +0100 Subject: [Numpy-discussion] trouble building numpy 1.6.1 on Scientific Linux 5 In-Reply-To: References: Message-ID: 2011/12/12 Russell E. 
From rowen at uw.edu Mon Dec 12 14:29:27 2011 From: rowen at uw.edu (Russell E. Owen) Date: Mon, 12 Dec 2011 11:29:27 -0800 Subject: [Numpy-discussion] trouble building numpy 1.6.1 on Scientific Linux 5 References: Message-ID: In article , Ralf Gommers wrote: > On Fri, Dec 9, 2011 at 8:02 PM, Russell E. Owen wrote: > > > I'm trying to build numpy 1.6.1 on Scientific Linux 5 but the unit tests > > claim the wrong version of fortran was used. I thought I knew how to > > avoid that, but it's not working. > > > >...(elided text that suggests numpy is building using g77 even though I asked for gfortran)... > > > > Any suggestions on how to fix this? > > > > I assume you have g77 installed and on your PATH. If so, try moving it off > your path. Yes. I would have tried that if I had known how to do it (though I'm puzzled why it would be wanted since I told the installer to use gfortran). The problem is that g77 is in /usr/bin/ and I don't have root privs on this system. -- Russell From shish at keba.be Mon Dec 12 14:43:18 2011 From: shish at keba.be (Olivier Delalleau) Date: Mon, 12 Dec 2011 20:43:18 +0100 Subject: [Numpy-discussion] trouble building numpy 1.6.1 on Scientific Linux 5 In-Reply-To: References: Message-ID: 2011/12/12 Russell E. Owen > In article , > Ralf Gommers wrote: > > > On Fri, Dec 9, 2011 at 8:02 PM, Russell E. Owen wrote: > > > > > I'm trying to build numpy 1.6.1 on Scientific Linux 5 but the unit > tests > > > claim the wrong version of fortran was used. I thought I knew how to > > > avoid that, but it's not working. > > > > > >...(elided text that suggests numpy is building using g77 even though I > asked for gfortran)... > > > > > > Any suggestions on how to fix this? > > > > > > > I assume you have g77 installed and on your PATH. If so, try moving it > off > > your path. > > Yes. I would have tried that if I had known how to do it (though I'm > puzzled why it would be wanted since I told the installer to use > gfortran). > > The problem is that g77 is in /usr/bin/ and I don't have root privs on > this system. > > -- Russell > You could create a link g77 -> gfortran and make sure this link comes first in your PATH. (That's assuming command lines for g77 and gfortran are compatible -- I don't know if that's the case). -=- Olivier -------------- next part -------------- An HTML attachment was scrubbed... URL: From hoytak at stat.washington.edu Mon Dec 12 23:18:23 2011 From: hoytak at stat.washington.edu (Hoyt Koepke) Date: Mon, 12 Dec 2011 20:18:23 -0800 Subject: [Numpy-discussion] ANN: PyCPX 0.02, a numpy/cython wrapper for CPlex Message-ID: Hello, I'm pleased to announce the second release of the PyCPX wrapper for IBM's CPlex Optimizer Suite. This second release fixes several bugs and vastly optimizes the model construction stage, particularly for large models. PyCPX is a python wrapper for the CPlex optimization suite that focuses on ease of use and seamless integration with numpy. It allows one to naturally specify linear and quadratic problems over real, boolean, and integer variables. For documentation and examples, please see the website: http://www.stat.washington.edu/~hoytak/code/pycpx/index.html. Please send me any comments, questions and suggestions! I am quite open to feedback. Thanks, --Hoyt ++++++++++++++++++++++++++++++++++++++++++++++++ + Hoyt Koepke + University of Washington Department of Statistics + http://www.stat.washington.edu/~hoytak/ + hoytak at gmail.com ++++++++++++++++++++++++++++++++++++++++++ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jiteshg at packtpub.com Tue Dec 13 05:44:11 2011 From: jiteshg at packtpub.com (Jitesh Y. Gawali) Date: Tue, 13 Dec 2011 16:14:11 +0530 Subject: [Numpy-discussion] NumPy book review copies In-Reply-To: <4EE72A02.4010801@packtpub.com> References: <4EE72A02.4010801@packtpub.com> Message-ID: <4EE72C7B.3020600@packtpub.com> Hi, As a part of our reviewing program, we are giving away a limited number of copies (print & electronic) of our recent publication NumPy 1.5 Beginner's Guide to people interested in reviewing the book. You need to publish your review/feedback on either your blog or on websites like Barnes and Noble, Slashdot and so on, as per your choice. We also encourage uploading the review on Amazon since it gives buyers a chance to know the book through your perspective. Please get in touch with me for more information. Thanks, Jitesh -- Jitesh Gawali Marketing Research Executive | Packt Publishing | www.PacktPub.com MSN: jiteshg at packtpub.com Interested in becoming an author? Visit Packt's Author Website for all the information you need about writing for Packt. -------------- next part -------------- An HTML attachment was scrubbed...
URL: From chris.barker at noaa.gov Tue Dec 13 13:08:44 2011 From: chris.barker at noaa.gov (Chris Barker) Date: Tue, 13 Dec 2011 10:08:44 -0800 Subject: [Numpy-discussion] Fast Reading of ASCII files In-Reply-To: <5066cdee-4994-4997-afa5-828625023a8f@4g2000yqu.googlegroups.com> References: <4EDFB566.4000303@noaa.gov> <5066cdee-4994-4997-afa5-828625023a8f@4g2000yqu.googlegroups.com> Message-ID: NOTE: Let's keep this on the list. On Tue, Dec 13, 2011 at 9:19 AM, denis wrote: > Chris, > unified, consistent save / load is a nice goal > > 1) header lines with date, pwd etc.: "where'd this come from ?" > > # (5, 5) svm.py bz/py/ml/svm 2011-12-13 Dec 11:56 -- automatic > # 80.6 % correct -- user info > 245 39 4 5 26 > ... > I'm not sure I understand what you are expecting here: What would be automatic? if it parses a datetime on the header, what would it do with it? But anyway, this seems to me: - very application specific -- this is for the user's code to write - not what we are talking about at this point anyway -- I think this discussion is about a lower-level, does-the-simple-things-fast reader -- that may or may not be able to form the basis of a higher-level fuller featured reader. > 2) read any CSVs: comma or blank-delimited, with/without column names, > a la loadcsv() below > yup -- though the column name reading would be part of a higher-level reader as far as I'm concerned. > 3) sparse or masked arrays ? > > sparse probably not, that seems pretty domain dependent to me -- though hopefully one could build such a thing on top of the lower level reader. Masked support would be good -- once we're convinced what the future of masked arrays is in numpy. I was thinking that the masked array issue would really be a higher-level feature -- it certainly could be if you need to mask "special value" style files (i.e. 9999), but we may have to build it into the lower level reader for cases where the mask is specified by non-numerical values -- i.e. there are some met files that use "MM" or some other text, so you can't put it into a numerical array first. > > Longterm wishes: beyond the scope of one file <-> one array > but essential for larger projects: > 1) dicts / dotdicts: > Dotdict( A=anysizearray, N=scalar ... ) <-> a directory of little > files > is easy, better than np.savez > (Haven't used hdf5, I believe Matlabv7 does.) > > 2) workflows: has anyone there used visTrails ? > outside of the spec of this thread... > > Anyway it seems to me (old grey cynic) that Numpy/scipy developers > prefer to code first, spec and doc later. Too pessimistic ? > > Well, I think many of us believe in a more agile style approach -- incremental development. But really, as an open source project, it's really about scratching an itch -- so there is usually a spec in mind for the itch at hand. In this case, however, that has been a weakness -- clearly a number of us have written small solutions to our particular problem at hand, but we haven't arrived at a more general purpose solution yet. So a bit of spec-ing ahead of time may be called for. On that: I've been thinking from the bottom-up -- imagining what I need for the simple case, and how it might apply to more complex cases -- but maybe we should think about this another way: What we're talking about here is really about core software engineering -- optimization.
It's easy to write a pure-python simple file parser, and reasonable to write a complex one (genfromtxt) -- the issue is performance -- we need some more C (or Cython) code to really speed it up, but none of us wants to write the complex case code in C. So: genfromtxt is really nice for many of the complex cases. So perhaps another approach is to look at genfromtxt, and see what high performance lower-level functionality we could develop that could make it fast -- then we are done. This actually mirrors exactly what we all usually recommend for python development in general -- write it in Python, then, if it's really not fast enough, write the bottle-neck in C. So where are the bottle necks in genfromtxt? Are there self-contained portions that could be re-written in C/Cython? -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From magnetotellurics at gmail.com Tue Dec 13 13:13:31 2011 From: magnetotellurics at gmail.com (kneil) Date: Tue, 13 Dec 2011 10:13:31 -0800 (PST) Subject: [Numpy-discussion] Apparently non-deterministic behaviour of complex array multiplication In-Reply-To: References: <4ED76256.4020909@crans.org> <32898383.post@talk.nabble.com> <32900355.post@talk.nabble.com> <32906553.post@talk.nabble.com> <32922174.post@talk.nabble.com> Message-ID: <32969114.post@talk.nabble.com> Hi Olivier, Sorry for the late reply - I have been on travel. I have encountered the error in two separate cases; when I was using numpy arrays, and when I was using numpy matrices. In the case of a numpy array (Y), the operation is: dot(Y,Y.conj().transpose()) and in the case of a matrix, with X=asmatrix(Y) and then the operation is: X*X.H -Karl Olivier Delalleau-2 wrote: > > I was trying to see if I could reproduce this problem, but your code fails > with numpy 1.6.1 with: > AttributeError: 'numpy.ndarray' object has no attribute 'H' > Is X supposed to be a regular ndarray with dtype = 'complex128', or > something else? > > -=- Olivier > > -- View this message in context: http://old.nabble.com/Apparently-non-deterministic-behaviour-of-complex-array-multiplication-tp32893004p32969114.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From eraldo.pomponi at gmail.com Tue Dec 13 14:04:22 2011 From: eraldo.pomponi at gmail.com (Eraldo Pomponi) Date: Tue, 13 Dec 2011 20:04:22 +0100 Subject: [Numpy-discussion] numpy.mean problems In-Reply-To: <32955052.post@talk.nabble.com> References: <32945124.post@talk.nabble.com> <32951098.post@talk.nabble.com> <32955052.post@talk.nabble.com> Message-ID: Hi Fred, I would suggest you have a look at pandas (http://pandas.sourceforge.net/). It was really helpful for me. It seems well suited for the type of data that you are working with. It has nice "broadcasting" capabilities to apply numpy functions to a set of columns: http://pandas.sourceforge.net/basics.html#descriptive-statistics http://pandas.sourceforge.net/basics.html#function-application Cheers, Eraldo On Sun, Dec 11, 2011 at 1:49 PM, ferreirafm wrote: > > > Aronne Merrelli wrote: > > > > I can recreate this error if tab is a structured ndarray - what is the > > dtype of tab? > > > > If that is correct, I think you could fix this by simplifying things.
> > Since > tab is already an ndarray, you should not need to convert it back into a > python list. By converting the ndarray back to a list you are making an > extra level of "wrapping" as a python object, which is ultimately why you > get that error about adding numpy.void. > > Unfortunately you cannot directly take a mean of a struct dtype; > structs are generic so they could have fields with strings, or objects, > etc, that would be invalid for a mean calculation. However the following > code fragment should work pretty efficiently. It will make a 1-element > array of the same dtype as tab, and then populate it with the mean value > of > all elements where the length is >= 15. Note that dtype.fields.keys() > gives > you a nice way to iterate over the fields in the struct dtype: > > length_mask = tab['length'] >= 15 > tab_means = np.zeros(1, dtype=tab.dtype) > for k in tab.dtype.fields.keys(): > tab_means[k] = np.mean( tab[k][length_mask] ) > > In general this would not work if tab has a field that is not a simple > numeric type, such as a str, object, ... But it looks like your arrays are > all numeric from your example above. > > Hope that helps, > Aronne > Hi Aronne, Thanks for your reply. Indeed, tab is a mix of different column types: tab.dtype: [('sgi', ' ('positive', ' ' ' ' Interestingly, I wasn't able to import some columns of digits as strings, as with R dataframe objects. I'll try to adapt your example to my needs and let you know the results. Regards. -- View this message in context: http://old.nabble.com/numpy.mean-problems-tp32945124p32955052.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From bsouthey at gmail.com Tue Dec 13 14:29:47 2011 From: bsouthey at gmail.com (Bruce Southey) Date: Tue, 13 Dec 2011 13:29:47 -0600 Subject: [Numpy-discussion] Fast Reading of ASCII files In-Reply-To: References: <4EDFB566.4000303@noaa.gov> <5066cdee-4994-4997-afa5-828625023a8f@4g2000yqu.googlegroups.com> Message-ID: <4EE7A7AB.8060201@gmail.com> On 12/13/2011 12:08 PM, Chris Barker wrote: > NOTE: > > Let's keep this on the list. > > On Tue, Dec 13, 2011 at 9:19 AM, denis > > wrote: > > Chris, > unified, consistent save / load is a nice goal > > 1) header lines with date, pwd etc.: "where'd this come from ?" > > # (5, 5) svm.py bz/py/ml/svm 2011-12-13 Dec 11:56 -- automatic > # 80.6 % correct -- user info > 245 39 4 5 26 > ... > > I'm not sure I understand what you are expecting here: What would be > automatic? if it parses a datetime on the header, what would it do with > it? But anyway, this seems to me: > - very application specific -- this is for the user's code to write > - not what we are talking about at this point anyway -- I think this > discussion is about a lower-level, does-the-simple-things-fast reader > -- that may or may not be able to form the basis of a higher-level > fuller featured reader. > > 2) read any CSVs: comma or blank-delimited, with/without column names, > a la loadcsv() below > > > yup -- though the column name reading would be part of a higher-level > reader as far as I'm concerned. > > 3) sparse or masked arrays ?
> > sparse probably not, that seems pretty domain dependent to me -- though > hopefully one could build such a thing on top of the lower level > reader. Masked support would be good -- once we're convinced what the > future of masked arrays is in numpy. I was thinking that the masked > array issue would really be a higher-level feature -- it certainly > could be if you need to mask "special value" style files (i.e. 9999), > but we may have to build it into the lower level reader for cases > where the mask is specified by non-numerical values -- i.e. there are > some met files that use "MM" or some other text, so you can't put it > into a numerical array first. > > > Longterm wishes: beyond the scope of one file <-> one array > but essential for larger projects: > 1) dicts / dotdicts: > Dotdict( A=anysizearray, N=scalar ... ) <-> a directory of little > files > is easy, better than np.savez > (Haven't used hdf5, I believe Matlabv7 does.) > > 2) workflows: has anyone there used visTrails ? > > > outside of the spec of this thread... > > > Anyway it seems to me (old grey cynic) that Numpy/scipy developers > prefer to code first, spec and doc later. Too pessimistic ? > > > Well, I think many of us believe in a more agile style approach -- > incremental development. But really, as an open source project, it's > really about scratching an itch -- so there is usually a spec in mind > for the itch at hand. In this case, however, that has been a weakness > -- clearly a number of us have written small solutions to > our particular problem at hand, but we haven't arrived at a more > general purpose solution yet. So a bit of spec-ing ahead of time may > be called for. > > On that: > > I've been thinking from the bottom-up -- imagining what I need for the > simple case, and how it might apply to more complex cases -- but maybe > we should think about this another way: > > What we're talking about here is really about core software > engineering -- optimization. It's easy to write a pure-python simple > file parser, and reasonable to write a complex one (genfromtxt) -- the > issue is performance -- we need some more C (or Cython) code to really > speed it up, but none of us wants to write the complex case code in C. So: > > genfromtxt is really nice for many of the complex cases. So perhaps > another approach is to look at genfromtxt, and see what > high performance lower-level functionality we could develop that could > make it fast -- then we are done. > > This actually mirrors exactly what we all usually recommend for python > development in general -- write it in Python, then, if it's really not > fast enough, write the bottle-neck in C. > > So where are the bottle necks in genfromtxt? Are there self-contained > portions that could be re-written in C/Cython? > > -Chris > > > Reading data is hard and writing code that suits the diversity in the Numerical Python community is even harder! Both loadtxt and genfromtxt functions (other functions are perhaps less important) perhaps need an upgrade to incorporate the new NA object. I think that adding the NA object will simplify some of the process because invalid data (missing, or a string where a number is expected) can be set to NA without requiring the creation of a new masked array or returning an error. Here I think loadtxt is a better target than genfromtxt because, as I understand it, it assumes the user really knows the data. Whereas genfromtxt can ask the data for the appropriate format.
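As a point of reference, a sketch of what genfromtxt can already be asked to do for the "MM"-style sentinels mentioned above -- today this yields a masked array, and with the NA object one could imagine the same call producing NAs instead (the tiny data string is made up):

    import numpy as np
    from StringIO import StringIO   # io.StringIO on Python 3

    data = StringIO("1,2,MM\n4,MM,6\n7,8,9")
    # missing_values names the sentinel string; usemask requests a
    # masked array instead of an error
    arr = np.genfromtxt(data, delimiter=',', missing_values='MM',
                        usemask=True)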
So I agree that a new 'superfast custom CSV reader for well-behaved data' function would be rather useful, especially as a replacement for loadtxt. By that I mean reading data using a user-specified format that essentially follows the CSV format (http://en.wikipedia.org/wiki/Comma-separated_values) - it needs to allow for the NA object, skipping lines and user-defined delimiters. Bruce -------------- next part -------------- An HTML attachment was scrubbed... URL: From ferreirafm at lim12.fm.usp.br Tue Dec 13 15:01:23 2011 From: ferreirafm at lim12.fm.usp.br (ferreirafm) Date: Tue, 13 Dec 2011 12:01:23 -0800 (PST) Subject: [Numpy-discussion] numpy.mean problems In-Reply-To: References: <32945124.post@talk.nabble.com> <32951098.post@talk.nabble.com> <32955052.post@talk.nabble.com> Message-ID: <32970295.post@talk.nabble.com> Hi Eraldo, Thanks for your suggestion. I was using pytables but gave up after learning that some very useful capabilities are sold as a professional package. However, it is still useful for much printing and data manipulation and, also, it can handle extremely large datasets (which is not my case). Regards, Fred Eraldo Pomponi wrote: > > I would suggest you have a look at pandas > (http://pandas.sourceforge.net/). It was > really helpful for me. It seems well suited for the type of data that you > are working > with. It has nice "broadcasting" capabilities to apply numpy functions to a > set of columns. > http://pandas.sourceforge.net/basics.html#descriptive-statistics > http://pandas.sourceforge.net/basics.html#function-application > > Cheers, > Eraldo > -- View this message in context: http://old.nabble.com/numpy.mean-problems-tp32945124p32970295.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From eraldo.pomponi at gmail.com Tue Dec 13 15:23:18 2011 From: eraldo.pomponi at gmail.com (Eraldo Pomponi) Date: Tue, 13 Dec 2011 21:23:18 +0100 Subject: [Numpy-discussion] numpy.mean problems In-Reply-To: <32970295.post@talk.nabble.com> References: <32945124.post@talk.nabble.com> <32951098.post@talk.nabble.com> <32955052.post@talk.nabble.com> <32970295.post@talk.nabble.com> Message-ID: Hi Fred, Pandas has a nice interface to PyTables if you still need it: http://pandas.sourceforge.net/io.html#hdf5-pytables However, my intention was just to point you to pandas because it is really a powerful tool if you need to deal with tabular heterogeneous data. It is also important to note that there are plans in the numpy community to include/port "part" of this package directly in the codebase. This says a lot about how good it is... Best, Eraldo On Tue, Dec 13, 2011 at 9:01 PM, ferreirafm wrote: > > Hi Eraldo, > Thanks for your suggestion. I was using pytables but gave up after learning > that some very useful capabilities are sold as a professional package. > However, it is still useful for much printing and data manipulation and, also, > it can handle extremely large datasets (which is not my case). > Regards, > Fred > > > Eraldo Pomponi wrote: > > > > I would suggest you have a look at pandas > > (http://pandas.sourceforge.net/). It was > > really helpful for me. It seems well suited for the type of data that you > > are working > > with. It has nice "broadcasting" capabilities to apply numpy functions to a > > set of columns.
> > http://pandas.sourceforge.net/basics.html#descriptive-statistics > > http://pandas.sourceforge.net/basics.html#function-application > > > > Cheers, > > Eraldo > > -- > View this message in context: http://old.nabble.com/numpy.mean-problems-tp32945124p32970295.html > Sent from the Numpy-discussion mailing list archive at Nabble.com. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Tue Dec 13 15:43:42 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Tue, 13 Dec 2011 21:43:42 +0100 Subject: [Numpy-discussion] Moving to gcc 4.* for win32 installers ? In-Reply-To: References: Message-ID: On Sun, Oct 30, 2011 at 12:18 PM, David Cournapeau wrote: > On Thu, Oct 27, 2011 at 5:19 PM, Ralf Gommers > wrote: > > Hi David, > > > > On Thu, Oct 27, 2011 at 3:02 PM, David Cournapeau > > wrote: > >> > >> Hi, > >> > >> I was wondering if we could finally move to a more recent version of > >> compilers for official win32 installers. This would of course concern > >> the next release cycle, not the ones where beta/rc are already in > >> progress. > >> > >> Basically, the pros: > >> - we will have to move at some point > >> - gcc 4.* seems less buggy, especially C++ and fortran. > >> - no need to maintain msvcr90 voodoo > >> The cons: > >> - it will most likely break the ABI > >> - we need to recompile atlas (but I can take care of it) > >> - the biggest: it is difficult to combine gfortran with visual > >> studio (more exactly you cannot link the gfortran runtime to a visual > >> studio executable). The only solution I could think of would be to > >> recompile the gfortran runtime with Visual Studio, which for some > >> reason does not sound very appealing :) > > > > To get the datetime changes to work with MinGW, we already concluded that > > building with 4.x is more or less required (without recognizing some of the > > points you list above). Changes to mingw32ccompiler to fix compilation with > > 4.x went in in https://github.com/numpy/numpy/pull/156. It would be good if > > you could check those. > > I will look into it more carefully, but overall, it seems that > building atlas 3.8.4, numpy and scipy with gcc 4.x works quite well. > The main issue is that gcc 4.* adds some dependencies on mingw dlls. > There are two options: > - adding the dlls in the installers > - statically linking those, which seems to be a bad idea > (generalizing the dll boundaries problem to exceptions and things we > would rather not care about: > http://cygwin.com/ml/cygwin/2007-06/msg00332.html). > > > It probably makes sense to make this move for numpy 1.7. If this breaks the ABI > > then it would be easiest to make numpy 1.7 the minimum required version for > > scipy 0.11. > > My thinking as well. > > Hi David, what is the current status of this issue? I kind of forgot this is a prerequisite for the next release when starting the 1.7.0 release thread. Thanks, Ralf -------------- next part -------------- An HTML attachment was scrubbed...
URL: From konrad.banachewicz at gmail.com Tue Dec 13 15:50:06 2011 From: konrad.banachewicz at gmail.com (Konrad Banachewicz) Date: Tue, 13 Dec 2011 21:50:06 +0100 Subject: [Numpy-discussion] NumPy-Discussion Digest, Vol 63, Issue 43 In-Reply-To: References: Message-ID: U On 12/13/11, numpy-discussion-request at scipy.org wrote: > Send NumPy-Discussion mailing list submissions to > numpy-discussion at scipy.org > > Today's Topics: > > 1. Re: Fast Reading of ASCII files (Chris Barker) > 2. Re: Apparently non-deterministic behaviour of complex array > multiplication (kneil) > 3. Re: numpy.mean problems (Eraldo Pomponi) > 4. Re: Fast Reading of ASCII files (Bruce Southey) > > End of NumPy-Discussion Digest, Vol 63, Issue 43 > ************************************************ -- Sent from my mobile device "Reasonable people adapt themselves to the world. Unreasonable people attempt to adapt the world to themselves. All progress, therefore, depends on unreasonable people." - G.B. Shaw From chris.barker at noaa.gov Tue Dec 13 16:07:56 2011 From: chris.barker at noaa.gov (Chris Barker) Date: Tue, 13 Dec 2011 13:07:56 -0800 Subject: [Numpy-discussion] Fast Reading of ASCII files In-Reply-To: <4EE7A7AB.8060201@gmail.com> References: <4EDFB566.4000303@noaa.gov> <5066cdee-4994-4997-afa5-828625023a8f@4g2000yqu.googlegroups.com> <4EE7A7AB.8060201@gmail.com> Message-ID: On Tue, Dec 13, 2011 at 11:29 AM, Bruce Southey wrote: > ** > Reading data is hard and writing code that suits the diversity in the > Numerical Python community is even harder! yup > Both loadtxt and genfromtxt functions (other functions are perhaps less > important) perhaps need an upgrade to incorporate the new NA object. yes, if we are satisfied that the new NA object is, in fact, the way of the future. > Here I think loadtxt is a better target than genfromtxt because, as I > understand it, it assumes the user really knows the data. Whereas > genfromtxt can ask the data for the appropriate format. > > So I agree that a new 'superfast custom CSV reader for well-behaved data' > function would be rather useful especially as a replacement for loadtxt.
> By that I mean reading data using a user specified format that essentially > follows the CSV format ( > http://en.wikipedia.org/wiki/Comma-separated_values) - it needs to > allow for the NA object, skipping lines and user-defined delimiters. > > I think that ideally, there could be one interface to reading tabular data -- hopefully, it would be easy for the user to specify what they want, and if they don't the code tries to figure it out. Also, under the hood, the "easy" cases are special-cased to high-performing versions. > > genfromtxt sure looks close for an API > This I don't agree with. It has a huge amount of keywords that just confuse or intimidate a beginning user. There should be a dead simple interface, even the loadtxt API is on the heavy side. Ralf > -- it just needs the "high performance special cases" under the hood. It may be that the way it's designed makes it very difficult to do that, though -- I haven't looked closely enough to tell. > > At least that's what I'm thinking at the moment. > > -Chris > > -------------- next part -------------- An HTML attachment was scrubbed...
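To put the two positions side by side, a small illustration -- 'data.csv' is a stand-in file name, and every keyword in the second call is taken from the existing genfromtxt signature, which is rather the point:

    import numpy as np

    # the 90% case really is a one-liner...
    a = np.genfromtxt('data.csv', delimiter=',')

    # ...but this is the kind of call that intimidates a beginning user:
    b = np.genfromtxt('data.csv', delimiter=',', names=True, dtype=None,
                      skip_header=2, usecols=(0, 2, 3),
                      missing_values='NA', filling_values=0.0)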
URL: From wesmckinn at gmail.com Tue Dec 13 16:57:51 2011 From: wesmckinn at gmail.com (Wes McKinney) Date: Tue, 13 Dec 2011 16:57:51 -0500 Subject: [Numpy-discussion] Fast Reading of ASCII files In-Reply-To: References: <4EDFB566.4000303@noaa.gov> <4EE63848.5000002@noaa.gov> Message-ID: On Mon, Dec 12, 2011 at 12:34 PM, Warren Weckesser wrote: > > > On Mon, Dec 12, 2011 at 10:22 AM, Chris.Barker wrote: >> >> On 12/11/11 8:40 AM, Ralf Gommers wrote: >> > On Wed, Dec 7, 2011 at 7:50 PM, Chris.Barker >> > * If we have a good, fast ascii (or unicode?) to array reader, >> > hopefully >> > it could be leveraged for use in the more complex cases. So that >> > rather >> > than genfromtxt() being written from scratch, it would be a wrapper >> > around the lower-level reader. >> > >> > You seem to be contradicting yourself here. The more complex cases are >> > Wes' 10% and why genfromtxt is so hairy internally. There's always a >> > trade-off between speed and handling complex corner cases. You want >> > both. >> >> I don't think the version in my mind is contradictory (Not quite). >> >> What I'm imagining is that a good, fast ascii to numpy array reader >> could read a whole table in at once (the common, easy, fast, case), but >> it could also be used to read snippets of a file in at a time, which >> could be leveraged to handle many of the more complex cases. >> >> I suppose there will always be cases where the user needs to write their >> own converter from string to dtype, and there is simply no way to >> leverage what I'm imagining to support that. >> >> Hmm, maybe there is -- for instance, if a "record" consisted of mostly >> standard, easy-to-parse, numbers, but one field was some weird text that >> needed custom parsing, we could read it as a dtype, with a string for >> that one weird field, and that could be converted in a post-processing >> step. >> >> Maybe that wouldn't be any faster or easier, but it could be done... >> >> Anyway, whether you can leverage it for the full-featured version or >> not, I do think there is call for a good, fast, 90% case text file parser. >> >> >> Would anyone like to join/form a small working group to work on this? >> >> Wes, I'd like to see your Cython version -- maybe a starting point? >> >> -Chris > > > I'm also working on a faster text file reader, so count me in. I've been > experimenting in both C and Cython. I'll put it on github as soon as I > can. > > Warren > >> >> -- >> Christopher Barker, Ph.D. >> Oceanographer >> >> Emergency Response Division >> NOAA/NOS/OR&R (206) 526-6959 voice >> 7600 Sand Point Way NE (206) 526-6329 fax >> Seattle, WA 98115 (206) 526-6317 main reception >> >> Chris.Barker at noaa.gov >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > Cool, Warren, I look forward to seeing it. I'm hopeful we can craft a performant tool that will meet the needs of many projects (NumPy, pandas, etc.)... From kbasye1 at jhu.edu Tue Dec 13 17:11:54 2011 From: kbasye1 at jhu.edu (Ken Basye) Date: Tue, 13 Dec 2011 17:11:54 -0500 Subject: [Numpy-discussion] Array min from argmin along an axis?
Message-ID: <4EE7CDAA.9060400@jhu.edu> Hi folks, I need an efficient way to get both the min and argmin of a 2-d array along one axis. It seemed to me that the way to do this was to get the argmin and then use it to index into the array to get the min, but I can't figure out how to do it. Here's my toy example: >>> x = np.arange(25).reshape((5,5)) >>> x array([[ 0, 1, 2, 3, 4], [ 5, 6, 7, 8, 9], [10, 11, 12, 13, 14], [15, 16, 17, 18, 19], [20, 21, 22, 23, 24]]) >>> y = np.abs(x - x.T) >>> y array([[ 0, 4, 8, 12, 16], [ 4, 0, 4, 8, 12], [ 8, 4, 0, 4, 8], [12, 8, 4, 0, 4], [16, 12, 8, 4, 0]]) >>> np.argmin(y, axis=0) array([0, 1, 2, 3, 4]) >>> np.min(y, axis=0) array([0, 0, 0, 0, 0]) Here it seems like there should be a simple way to get the same array that min() returns using the argmin result, which won't need to 'search' in the array. Thanks very much, Ken From warren.weckesser at enthought.com Tue Dec 13 17:23:12 2011 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Tue, 13 Dec 2011 16:23:12 -0600 Subject: [Numpy-discussion] Array min from argmin along an axis? In-Reply-To: <4EE7CDAA.9060400@jhu.edu> References: <4EE7CDAA.9060400@jhu.edu> Message-ID: On Tue, Dec 13, 2011 at 4:11 PM, Ken Basye wrote: > Hi folks, > I need an efficient way to get both the min and argmin of a 2-d > array along one axis. It seemed to me that the way to do this was to > get the argmin and then use it to index into the array to get the min, > but I can't figure out how to do it. Here's my toy example: > > >>> x = np.arange(25).reshape((5,5)) > >>> x > array([[ 0, 1, 2, 3, 4], > [ 5, 6, 7, 8, 9], > [10, 11, 12, 13, 14], > [15, 16, 17, 18, 19], > [20, 21, 22, 23, 24]]) > >>> y = np.abs(x - x.T) > >>> y > array([[ 0, 4, 8, 12, 16], > [ 4, 0, 4, 8, 12], > [ 8, 4, 0, 4, 8], > [12, 8, 4, 0, 4], > [16, 12, 8, 4, 0]]) > >>> np.argmin(y, axis=0) > array([0, 1, 2, 3, 4]) > >>> np.min(y, axis=0) > array([0, 0, 0, 0, 0]) > > Here it seems like there should be a simple way to get the same array > that min() returns using the argmin result, which won't need to 'search' > in the array. > > You can use the result of argmin to index into y, if you combine it with, say, arange(ncols) in the second dimension: In [53]: y = random.randint(0,10,size=(5,7)) In [54]: y Out[54]: array([[3, 3, 5, 1, 5, 3, 7], [1, 0, 6, 8, 0, 1, 1], [7, 9, 9, 3, 3, 1, 6], [5, 3, 5, 4, 9, 7, 4], [1, 7, 1, 6, 6, 1, 8]]) In [55]: am = np.argmin(y, axis=0) In [56]: am Out[56]: array([1, 1, 4, 0, 1, 1, 1]) In [57]: colmins = y[am, arange(7)] In [58]: colmins Out[58]: array([1, 0, 1, 1, 0, 1, 1]) Warren -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Tue Dec 13 17:27:29 2011 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 13 Dec 2011 22:27:29 +0000 Subject: [Numpy-discussion] Array min from argmin along an axis? In-Reply-To: <4EE7CDAA.9060400@jhu.edu> References: <4EE7CDAA.9060400@jhu.edu> Message-ID: On Tue, Dec 13, 2011 at 22:11, Ken Basye wrote: > Hi folks, > ? ? I need an efficient way to get both the min and argmin of a 2-d > array along one axis. ?It seemed to me that the way to do this was to > get the argmin and then use it to index into the array to get the min, > but I can't figure out how to do it. 
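The same fancy-indexing trick, wrapped up for either axis of a 2-d array; the helper name is our own sketch, not a numpy function:

    import numpy as np

    def min_and_argmin(y, axis):
        # one argmin pass; fancy indexing then picks out the minima
        # without a second search through the array
        idx = np.argmin(y, axis=axis)
        if axis == 0:
            mins = y[idx, np.arange(y.shape[1])]
        else:
            mins = y[np.arange(y.shape[0]), idx]
        return mins, idx

For example, min_and_argmin(y, 0) returns Warren's colmins and am in a single call.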
?Here's my toy example: [~] |1> x = np.arange(25).reshape((5,5)) [~] |2> y = np.abs(x - x.T) [~] |3> y array([[ 0, 4, 8, 12, 16], [ 4, 0, 4, 8, 12], [ 8, 4, 0, 4, 8], [12, 8, 4, 0, 4], [16, 12, 8, 4, 0]]) [~] |4> i = np.argmin(y, axis=0) [~] |5> y[i, np.arange(y.shape[1])] array([0, 0, 0, 0, 0]) [~] |6> y[np.argmin(y, axis=0), np.arange(y.shape[1])] array([0, 0, 0, 0, 0]) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From david.verelst at gmail.com Tue Dec 13 17:46:10 2011 From: david.verelst at gmail.com (David Verelst) Date: Tue, 13 Dec 2011 23:46:10 +0100 Subject: [Numpy-discussion] numpy.mean problems In-Reply-To: <32970295.post@talk.nabble.com> References: <32945124.post@talk.nabble.com> <32951098.post@talk.nabble.com> <32955052.post@talk.nabble.com> <32970295.post@talk.nabble.com> Message-ID: <4EE7D5B2.9070601@gmail.com> Note that the pytables pro you are referring to is no longer behind a pay wall. Recently the project went through some changes and the pro versions disappeared. All pro features where merged into the main project and, are as a consequence, also available for free. Regards, David On 13/12/11 21:01, ferreirafm wrote: > Hi Eraldo, > Thanks for your suggestion. I was using pytables but give up after known > that some very useful capabilities are sold as a professional package. > However, it still useful to many printing and data manipulation and, also, > it can handle extremely large datasets (which is not my case.). > Regards, > Fred > > > Eraldo Pomponi wrote: >> I would suggest you to have a look at pandas >> (http://pandas.sourceforge.net/) >> . It was >> really helpful for me. It seems well suited for the type of data that you >> are working >> with. It has nice "brodcasting" capabilities to apply numpy functions to a >> set column. >> http://pandas.sourceforge.net/basics.html#descriptive-statistics >> http://pandas.sourceforge.net/basics.html#function-application >> >> Cheers, >> Eraldo >> From srean.list at gmail.com Tue Dec 13 17:52:08 2011 From: srean.list at gmail.com (srean) Date: Tue, 13 Dec 2011 16:52:08 -0600 Subject: [Numpy-discussion] ANN: Numexpr 2.0 released In-Reply-To: <201111271400.48560.faltet@pytables.org> References: <201111271400.48560.faltet@pytables.org> Message-ID: This is great news, I hope this gets included in the epd distribution soon. I had mailed a few questions about numexpr sometime ago. I am still curious about those. I have included the relevant parts below. In addition, I have another question. There was a numexpr branch that allows a "out=blah" parameer to build the output in place, has that been merged or its functionality incorporated ? This goes without saying, but, thanks for numexpr. -- from old mail -- What I find somewhat encumbering is that there is no single piece of document that lists all the operators and functions that numexpr can parse. For a new user this will be very useful There is a list in the wiki page entitled "overview" but it seems incomplete (for instance it does not describe the reduction operations available). I do not know enough to know how incomplete it is. Is there any plan to implement the reduction like enhancements that ufuncs provide: namely reduce_at, accumulate, reduce ? It is entirely possible that they are already in there but I could not figure out how to use them. If they aren't it would be great to have them. 
On Sun, Nov 27, 2011 at 7:00 AM, Francesc Alted wrote: > > ======================== > Announcing Numexpr 2.0 > ======================== From chris.barker at noaa.gov Wed Dec 14 02:03:24 2011 From: chris.barker at noaa.gov (Chris Barker) Date: Tue, 13 Dec 2011 23:03:24 -0800 Subject: [Numpy-discussion] Fast Reading of ASCII files In-Reply-To: References: <4EDFB566.4000303@noaa.gov> <5066cdee-4994-4997-afa5-828625023a8f@4g2000yqu.googlegroups.com> <4EE7A7AB.8060201@gmail.com> Message-ID: On Tue, Dec 13, 2011 at 1:21 PM, Ralf Gommers wrote: > > genfromtxt sure looks close for an API >> > > This I don't agree with. It has a huge amount of keywords that just > confuse or intimidate a beginning user. There should be a dead simple > interface, even the loadtxt API is on the heavy side. > well, yes, though it does do a lot -- do you have a simpler one in mind? But anyway, the really simple cases are really simple, even with genfromtxt. I guess it's a matter of debate about what is a better API: a few functions, each adding a layer of sophistication, or one function, with layers of sophistication added with an array of keyword arguments. In either case, though, I wish the multiple functionality were built on the same, well-optimized core code. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Wed Dec 14 09:04:15 2011 From: cournape at gmail.com (David Cournapeau) Date: Wed, 14 Dec 2011 09:04:15 -0500 Subject: [Numpy-discussion] Moving to gcc 4.* for win32 installers ? In-Reply-To: References: Message-ID: On Tue, Dec 13, 2011 at 3:43 PM, Ralf Gommers wrote: > On Sun, Oct 30, 2011 at 12:18 PM, David Cournapeau > wrote: >> >> On Thu, Oct 27, 2011 at 5:19 PM, Ralf Gommers >> wrote: >> > Hi David, >> > >> > On Thu, Oct 27, 2011 at 3:02 PM, David Cournapeau >> > wrote: >> >> >> >> Hi, >> >> >> >> I was wondering if we could finally move to a more recent version of >> >> compilers for official win32 installers. This would of course concern >> >> the next release cycle, not the ones where beta/rc are already in >> >> progress. >> >> >> >> Basically, the pros: >> >> - we will have to move at some point >> >> - gcc 4.* seems less buggy, especially C++ and fortran. >> >> - no need to maintain msvcr90 voodoo >> >> The cons: >> >> - it will most likely break the ABI >> >> - we need to recompile atlas (but I can take care of it) >> >> - the biggest: it is difficult to combine gfortran with visual >> >> studio (more exactly you cannot link the gfortran runtime to a visual >> >> studio executable). The only solution I could think of would be to >> >> recompile the gfortran runtime with Visual Studio, which for some >> >> reason does not sound very appealing :) >> > >> > To get the datetime changes to work with MinGW, we already concluded >> > that >> > building with 4.x is more or less required (without recognizing some of >> > the >> > points you list above). Changes to mingw32ccompiler to fix compilation >> > with >> > 4.x went in in https://github.com/numpy/numpy/pull/156. It would be good >> > if >> > you could check those. >> >> I will look into it more carefully, but overall, it seems that >> building atlas 3.8.4, numpy and scipy with gcc 4.x works quite well. >> The main issue is that gcc 4.* adds some dependencies on mingw dlls.
>> There are two options:
>>  - adding the dlls in the installers
>>  - statically linking those, which seems to be a bad idea
>> (generalizing the dll boundaries problem to exceptions and things we
>> would rather not care about:
>> http://cygwin.com/ml/cygwin/2007-06/msg00332.html).
>>
>> > It probably makes sense to make this move for numpy 1.7. If this breaks
>> > the ABI
>> > then it would be easiest to make numpy 1.7 the minimum required version
>> > for scipy 0.11.
>>
>> My thinking as well.
>>
>
> Hi David, what is the current status of this issue? I kind of forgot this is
> a prerequisite for the next release when starting the 1.7.0 release thread.

The only issue at this point is the distribution of mingw dlls. I have
not found a way to do it nicely (where nicely means something that is
distributed within the numpy package). Given that those dlls are actually
versioned and seem to have a strong versioning policy, maybe we can
just install them inside the python installation?

cheers,

David

From bsouthey at gmail.com  Wed Dec 14 10:11:24 2011
From: bsouthey at gmail.com (Bruce Southey)
Date: Wed, 14 Dec 2011 09:11:24 -0600
Subject: [Numpy-discussion] Fast Reading of ASCII files
In-Reply-To:
References: <4EDFB566.4000303@noaa.gov>
	<5066cdee-4994-4997-afa5-828625023a8f@4g2000yqu.googlegroups.com>
	<4EE7A7AB.8060201@gmail.com>
Message-ID: <4EE8BC9C.6010308@gmail.com>

On 12/14/2011 01:03 AM, Chris Barker wrote:
> On Tue, Dec 13, 2011 at 1:21 PM, Ralf Gommers
> wrote:
>
>>> genfromtxt sure looks close for an API
>>
>> This I don't agree with. It has a huge amount of keywords that
>> just confuse or intimidate a beginning user. There should be a
>> dead simple interface, even the loadtxt API is on the heavy side.
>
> well, yes, though it does do a lot -- do you have a simpler one in mind?
>
> But anyway, the really simple cases are really simple, even with
> genfromtxt.
>
> I guess it's a matter of debate about what is a better API:
>
> a few functions, each adding a layer of sophistication
>
> or
>
> one function, with layers of sophistication added with an array of
> keyword arguments.
>
> In either case, though, I wish the multiple functions were built on the
> same, well-optimized core code.
>
> -Chris

I am not sure that you can even create a simple API here, as even
Python's csv module is rather complex, especially when it just reads
data as strings. It also 'hides' many arguments in the Dialect class,
although these are just the collection of 7 'fmtparam' arguments. It
also provides the Sniffer class that tries to find the correct format,
which can then be passed to the reader function. Then you still have to
convert the data into the required types -- another set of arguments,
as well as yet another pass through the data.

In comparison, genfromtxt can perform sniffing, and both genfromtxt and
loadtxt can read and convert the data. These also add some useful
features like skipping rows (start, end and commented) and columns.
However, it could be possible to create a sniffer function and a single
data-reader function leading to a 'simple' reader function, but that
probably would not change the API of the underlying data-reader
function.

Bruce
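The Sniffer/Dialect dance described above looks roughly like this in
practice (a sketch; "data.txt" is a placeholder file name):

import csv

# let the Sniffer guess the delimiter/quoting conventions from a sample
sample = open("data.txt", "rb").read(1024)
dialect = csv.Sniffer().sniff(sample)

# hand the guessed dialect to the reader; every field still comes back
# as a string, so type conversion is a separate pass
rows = list(csv.reader(open("data.txt", "rb"), dialect))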
From ralf.gommers at googlemail.com  Wed Dec 14 12:50:07 2011
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Wed, 14 Dec 2011 18:50:07 +0100
Subject: [Numpy-discussion] Moving to gcc 4.* for win32 installers ?
In-Reply-To:
References:
Message-ID:

On Wed, Dec 14, 2011 at 3:04 PM, David Cournapeau wrote:

> On Tue, Dec 13, 2011 at 3:43 PM, Ralf Gommers
> wrote:
> > On Sun, Oct 30, 2011 at 12:18 PM, David Cournapeau
> > wrote:
> >>
> >> On Thu, Oct 27, 2011 at 5:19 PM, Ralf Gommers
> >> wrote:
> >> > Hi David,
> >> >
> >> > On Thu, Oct 27, 2011 at 3:02 PM, David Cournapeau
> >> > wrote:
> >> >>
> >> >> Hi,
> >> >>
> >> >> I was wondering if we could finally move to a more recent version of
> >> >> compilers for official win32 installers. This would of course concern
> >> >> the next release cycle, not the ones where beta/rc are already in
> >> >> progress.
> >> >>
> >> >> Basically, the pros:
> >> >>  - we will have to move at some point
> >> >>  - gcc 4.* seems less buggy, especially for C++ and Fortran
> >> >>  - no need to maintain the msvcr90 voodoo
> >> >> The cons:
> >> >>  - it will most likely break the ABI
> >> >>  - we need to recompile ATLAS (but I can take care of it)
> >> >>  - the biggest: it is difficult to combine gfortran with Visual
> >> >> Studio (more exactly, you cannot link the gfortran runtime to a Visual
> >> >> Studio executable). The only solution I could think of would be to
> >> >> recompile the gfortran runtime with Visual Studio, which for some
> >> >> reason does not sound very appealing :)
> >> >
> >> > To get the datetime changes to work with MinGW, we already concluded
> >> > that
> >> > building with 4.x is more or less required (without recognizing some
> >> > of the
> >> > points you list above). Changes to mingw32ccompiler to fix compilation
> >> > with
> >> > 4.x went in in https://github.com/numpy/numpy/pull/156. It would be
> >> > good if
> >> > you could check those.
> >>
> >> I will look into it more carefully, but overall, it seems that
> >> building atlas 3.8.4, numpy and scipy with gcc 4.x works quite well.
> >> The main issue is that gcc 4.* adds some dependencies on mingw dlls.
> >> There are two options:
> >>  - adding the dlls in the installers
> >>  - statically linking those, which seems to be a bad idea
> >> (generalizing the dll boundaries problem to exceptions and things we
> >> would rather not care about:
> >> http://cygwin.com/ml/cygwin/2007-06/msg00332.html).
> >>
> >> > It probably makes sense to make this move for numpy 1.7. If this
> >> > breaks the ABI
> >> > then it would be easiest to make numpy 1.7 the minimum required
> >> > version for scipy 0.11.
> >>
> >> My thinking as well.
> >>
> >
> > Hi David, what is the current status of this issue? I kind of forgot
> > this is a prerequisite for the next release when starting the 1.7.0
> > release thread.
>
> The only issue at this point is the distribution of mingw dlls. I have
> not found a way to do it nicely (where nicely means something that is
> distributed within the numpy package). Given that those dlls are actually
> versioned and seem to have a strong versioning policy, maybe we can
> just install them inside the python installation?

Although not ideal, I don't have a problem with that in principle.
However, wouldn't it break installing without admin rights if Python is
installed by the admin?

Ralf
From ferreirafm at lim12.fm.usp.br  Wed Dec 14 13:00:36 2011
From: ferreirafm at lim12.fm.usp.br (ferreirafm)
Date: Wed, 14 Dec 2011 10:00:36 -0800 (PST)
Subject: [Numpy-discussion] numpy.mean problems
In-Reply-To: <4EE7D5B2.9070601@gmail.com>
References: <32945124.post@talk.nabble.com> <32951098.post@talk.nabble.com>
	<32955052.post@talk.nabble.com> <32970295.post@talk.nabble.com>
	<4EE7D5B2.9070601@gmail.com>
Message-ID: <32975340.post@talk.nabble.com>

Thanks for the correction. Good to know! I got this outdated information
from the pytables mailing list.
Regards,
Fred


David Verelst wrote:
>
> Note that the PyTables Pro you are referring to is no longer behind a
> pay wall. Recently the project went through some changes and the pro
> versions disappeared. All pro features were merged into the main
> project and are, as a consequence, also available for free.
>
> Regards,
> David
>

From ferreirafm at lim12.fm.usp.br  Wed Dec 14 13:09:55 2011
From: ferreirafm at lim12.fm.usp.br (ferreirafm)
Date: Wed, 14 Dec 2011 10:09:55 -0800 (PST)
Subject: [Numpy-discussion] numpy.mean problems
In-Reply-To:
References: <32945124.post@talk.nabble.com> <32951098.post@talk.nabble.com>
	<32955052.post@talk.nabble.com> <32970295.post@talk.nabble.com>
Message-ID: <32975342.post@talk.nabble.com>

Hi Eraldo,
Indeed, pandas is a really really nice module! If it is going to become
part of numpy, that's even better. Thanks for the suggestion.
All the Best,
Fred


Eraldo Pomponi wrote:
>
> Hi Fred,
>
> Pandas has a nice interface to PyTables if you still need it:
>
> http://pandas.sourceforge.net/io.html#hdf5-pytables
>
> However, my intention was just to point you to pandas because it
> is really a powerful tool if you need to deal with tabular heterogeneous
> data. It is also important to notice that there are plans in the numpy
> community to include/port "part" of this package directly in the
> codebase. This says a lot about how good it is...
>
> Best,
> Eraldo
>
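The PyTables interface mentioned in the quoted mail is pandas' HDFStore;
a minimal sketch of how it is used (file name and frame contents are
placeholders):

import pandas as pd

df = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": ["x", "y", "z"]})

store = pd.HDFStore("store.h5")   # backed by PyTables/HDF5
store["df"] = df                  # write a DataFrame, dict-style
df_back = store["df"]             # read it back
store.close()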
From ralf.gommers at googlemail.com  Wed Dec 14 14:22:23 2011
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Wed, 14 Dec 2011 20:22:23 +0100
Subject: [Numpy-discussion] Fast Reading of ASCII files
In-Reply-To: <4EE8BC9C.6010308@gmail.com>
References: <4EDFB566.4000303@noaa.gov>
	<5066cdee-4994-4997-afa5-828625023a8f@4g2000yqu.googlegroups.com>
	<4EE7A7AB.8060201@gmail.com> <4EE8BC9C.6010308@gmail.com>
Message-ID:

On Wed, Dec 14, 2011 at 4:11 PM, Bruce Southey wrote:

> On 12/14/2011 01:03 AM, Chris Barker wrote:
>
> On Tue, Dec 13, 2011 at 1:21 PM, Ralf Gommers wrote:
>
>>
>> genfromtxt sure looks close for an API
>>>
>>
>> This I don't agree with. It has a huge amount of keywords that just
>> confuse or intimidate a beginning user. There should be a dead simple
>> interface, even the loadtxt API is on the heavy side.
>>
>
> well, yes, though it does do a lot -- do you have a simpler one in mind?
>

Just looking at what I normally wouldn't need for simple data files
and/or what a beginning user won't understand at once, the `unpack` and
`ndmin` keywords could certainly be left out. `converters` is also
questionable. That's probably as simple as it can get. Note that I don't
think this should be changed now, that's not worth the trouble.

> But anyway, the really simple cases are really simple, even with
> genfromtxt.
>
> I guess it's a matter of debate about what is a better API:
>
> a few functions, each adding a layer of sophistication
>
> or
>
> one function, with layers of sophistication added with an array of
> keyword arguments.
>

There's always a trade-off, but looking at the docstring for genfromtxt
should make it an easy call in this case.

> In either case, though, I wish the multiple functions were built on the
> same, well-optimized core code.
>

I wish that too, but I'm fairly certain that you can't write that core
code with the ability to handle missing and irregular data and make it
close to the same speed as an optimized reader for regular data.

> I am not sure that you can even create a simple API here, as even
> Python's csv module is rather complex, especially when it just reads data
> as strings. It also 'hides' many arguments in the Dialect class, although
> these are just the collection of 7 'fmtparam' arguments. It also provides
> the Sniffer class that tries to find the correct format, which can then be
> passed to the reader function. Then you still have to convert the data
> into the required types -- another set of arguments, as well as yet
> another pass through the data.
>
> In comparison, genfromtxt can perform sniffing
>

I assume you mean the ``dtype=None`` example in the docstring? That works
to some extent, but you still need to specify the delimiter. I commented
on that on the loadtable PR.

> and both genfromtxt and loadtxt can read and convert the data. These
> also add some useful features like skipping rows (start, end and
> commented) and columns. However, it could be possible to create a sniffer
> function and a single data-reader function leading to a 'simple' reader
> function, but that probably would not change the API of the underlying
> data-reader function.
>

Better auto-detection of things like delimiters would indeed be quite
useful.

Ralf
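The ``dtype=None`` sniffing referred to, in its smallest form -- column
types are inferred, but the delimiter still has to be spelled out:

from StringIO import StringIO
import numpy as np

data = StringIO("1,2.5,abc\n4,1.0,xyz")
arr = np.genfromtxt(data, delimiter=",", dtype=None)
# arr is a structured array with an integer, a float and a string column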
From ben.root at ou.edu  Wed Dec 14 14:36:04 2011
From: ben.root at ou.edu (Benjamin Root)
Date: Wed, 14 Dec 2011 13:36:04 -0600
Subject: [Numpy-discussion] Fast Reading of ASCII files
In-Reply-To:
References: <4EDFB566.4000303@noaa.gov>
	<5066cdee-4994-4997-afa5-828625023a8f@4g2000yqu.googlegroups.com>
	<4EE7A7AB.8060201@gmail.com> <4EE8BC9C.6010308@gmail.com>
Message-ID:

On Wed, Dec 14, 2011 at 1:22 PM, Ralf Gommers wrote:

> On Wed, Dec 14, 2011 at 4:11 PM, Bruce Southey wrote:
>
>> On 12/14/2011 01:03 AM, Chris Barker wrote:
>>
>> On Tue, Dec 13, 2011 at 1:21 PM, Ralf Gommers wrote:
>>
>>> genfromtxt sure looks close for an API
>>>
>>> This I don't agree with. It has a huge amount of keywords that just
>>> confuse or intimidate a beginning user. There should be a dead simple
>>> interface, even the loadtxt API is on the heavy side.
>>
>> well, yes, though it does do a lot -- do you have a simpler one in mind?
>>
> Just looking at what I normally wouldn't need for simple data files
> and/or what a beginning user won't understand at once, the `unpack` and
> `ndmin` keywords could certainly be left out. `converters` is also
> questionable. That's probably as simple as it can get.
>

Just my two cents (and I was one of those who championed its inclusion):
the ndmin feature is designed to prevent unexpected results that users
(particularly beginners) may encounter with their datasets. Now, maybe it
might be difficult to tell a beginner *why* they might need to be aware
of it, but it is very easy to describe *how* to use it. "How many
dimensions is your data? Two? Ok, just set ndmin=2 and you are good to
go!"

Cheers!
Ben Root
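The surprise that ndmin guards against is easy to demonstrate (a quick
sketch; without it, a one-row file silently collapses to 1-D):

from StringIO import StringIO
import numpy as np

one_row = "1.0 2.0 3.0\n"
np.loadtxt(StringIO(one_row)).shape           # (3,)   -- collapsed to 1-D
np.loadtxt(StringIO(one_row), ndmin=2).shape  # (1, 3) -- stays 2-D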
From chris.barker at noaa.gov  Wed Dec 14 15:54:53 2011
From: chris.barker at noaa.gov (Chris Barker)
Date: Wed, 14 Dec 2011 12:54:53 -0800
Subject: [Numpy-discussion] Fast Reading of ASCII files
In-Reply-To:
References: <4EDFB566.4000303@noaa.gov>
	<5066cdee-4994-4997-afa5-828625023a8f@4g2000yqu.googlegroups.com>
	<4EE7A7AB.8060201@gmail.com> <4EE8BC9C.6010308@gmail.com>
Message-ID:

On Wed, Dec 14, 2011 at 11:36 AM, Benjamin Root wrote:
>>> well, yes, though it does do a lot -- do you have a simpler one in mind?
>>>
>> Just looking at what I normally wouldn't need for simple data files
>> and/or what a beginning user won't understand at once, the `unpack` and
>> `ndmin` keywords could certainly be left out. `converters` is also
>> questionable. That's probably as simple as it can get.

this may be a function of a well written doc string -- if it is clear
to the newbie that "all the rest of this you don't need unless you
have a weird data file", then extra keyword arguments don't really
hurt.

A few examples of the basic use-cases go a long way.

And yes, the core reader for the complex cases isn't going to be fast
(it's going to be complex C code...). But we could still have a core
reader that handled most cases.

Anyway, I think it's time to write code, and see if it can be rolled in
somehow...

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From ralf.gommers at googlemail.com  Wed Dec 14 16:11:03 2011
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Wed, 14 Dec 2011 22:11:03 +0100
Subject: [Numpy-discussion] Fast Reading of ASCII files
In-Reply-To:
References: <4EDFB566.4000303@noaa.gov>
	<5066cdee-4994-4997-afa5-828625023a8f@4g2000yqu.googlegroups.com>
	<4EE7A7AB.8060201@gmail.com> <4EE8BC9C.6010308@gmail.com>
Message-ID:

On Wed, Dec 14, 2011 at 9:54 PM, Chris Barker wrote:

> On Wed, Dec 14, 2011 at 11:36 AM, Benjamin Root wrote:
> >>> well, yes, though it does do a lot -- do you have a simpler one in
> >>> mind?
> >>
> >> Just looking at what I normally wouldn't need for simple data files
> >> and/or what a beginning user won't understand at once, the `unpack`
> >> and `ndmin` keywords could certainly be left out. `converters` is also
> >> questionable.
>
> this may be a function of a well written doc string -- if it is clear
> to the newbie that "all the rest of this you don't need unless you
> have a weird data file", then extra keyword arguments don't really
> hurt.
>
> A few examples of the basic use-cases go a long way.
>
> And yes, the core reader for the complex cases isn't going to be fast
> (it's going to be complex C code...). But we could still have a core
> reader that handled most cases.

Okay, now we're on the same page I think.

> Anyway, I think it's time to write code, and see if it can be rolled in
> somehow...

Agreed.

Ralf

From ater1980 at gmail.com  Wed Dec 14 16:38:07 2011
From: ater1980 at gmail.com (Alex Ter-Sarkissov)
Date: Thu, 15 Dec 2011 10:38:07 +1300
Subject: [Numpy-discussion] scipy installation problem
Message-ID:

I'm using Eclipse (PyDev) on MacOS. I downloaded scipy 0.10, installed
it, added the path to the .mpkg file to PYTHONPATH, and added scipy to
the forced builtins. Nothing worked, I keep getting 'module scipy not
found'. I then removed the link to the .mpkg and still nothing works.
Strangely enough, numpy works just fine. What should I do?

From ralf.gommers at googlemail.com  Wed Dec 14 16:52:24 2011
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Wed, 14 Dec 2011 22:52:24 +0100
Subject: [Numpy-discussion] scipy installation problem
In-Reply-To:
References:
Message-ID:

On Wed, Dec 14, 2011 at 10:38 PM, Alex Ter-Sarkissov wrote:

> I'm using Eclipse (PyDev) on MacOS. I downloaded scipy 0.10, installed
> it, added the path to the .mpkg file to PYTHONPATH, and added scipy to
> the forced builtins. Nothing worked, I keep getting 'module scipy not
> found'. I then removed the link to the .mpkg and still nothing works.
> Strangely enough, numpy works just fine. What should I do?

Not sure what you mean by "install" here, but you're supposed to
double-click the mpkg installer to run it, not put it on your PYTHONPATH.
Note that to use the provided dmg installer, you have to also use the
matching python from python.org

Ralf

From ater1980 at gmail.com  Wed Dec 14 16:56:01 2011
From: ater1980 at gmail.com (Alex Ter-Sarkissov)
Date: Thu, 15 Dec 2011 10:56:01 +1300
Subject: [Numpy-discussion] scipy installation problem
Message-ID:

yes, that's exactly what i did. I'm using Python 2.6 both in PyDev and
SciPy.
From ralf.gommers at googlemail.com  Wed Dec 14 16:59:31 2011
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Wed, 14 Dec 2011 22:59:31 +0100
Subject: [Numpy-discussion] scipy installation problem
In-Reply-To:
References:
Message-ID:

On Wed, Dec 14, 2011 at 10:56 PM, Alex Ter-Sarkissov wrote:

> yes, that's exactly what i did. I'm using Python 2.6 both in PyDev and
> SciPy.

Then you don't need to put anything on your PYTHONPATH, since scipy gets
installed to the normal site-packages dir. You'll have to provide more
details about exactly which installers you used (incl. of python itself),
and any possible PyDev oddities, for us to be able to help you here.

Ralf

From ater1980 at gmail.com  Wed Dec 14 17:12:31 2011
From: ater1980 at gmail.com (Alex Ter-Sarkissov)
Date: Thu, 15 Dec 2011 11:12:31 +1300
Subject: [Numpy-discussion] scipy installation problem
Message-ID:

yeah, I've already removed it, still doesn't work.

I'm running Python 2.6 and the SciPy version I'm trying to install is
scipy-0.10.0-py2.6-python.org-macosx10.3; the PyDev version is 2.2.4.

I've had no trouble running numpy or Tkinter, for example. Also, none of
the other modules I'm using have been added to PYTHONPATH.

The only problem I had when installing the interpreter was the error
message that stdlib was either not found or without .py files. I didn't
add these files yet, but everything seems to be running smoothly except
this issue with scipy.

From eraldo.pomponi at gmail.com  Wed Dec 14 17:19:36 2011
From: eraldo.pomponi at gmail.com (Eraldo Pomponi)
Date: Wed, 14 Dec 2011 23:19:36 +0100
Subject: [Numpy-discussion] scipy installation problem
In-Reply-To:
References:
Message-ID:

Dear Alex,

I'm not sure I understood what you mean by "install", like Ralf. However,
I would also suggest, if you are using Eclipse and PyDev, (after
installing new modules) to remove the current python interpreter (from
the Eclipse options) and then re-add it, so that the whole PYTHONPATH
will be re-scanned and you will not see any "red" underline (with the
msg: module not found) in your python editor.

Cheers,
Eraldo

On Wed, Dec 14, 2011 at 10:52 PM, Ralf Gommers wrote:

> On Wed, Dec 14, 2011 at 10:38 PM, Alex Ter-Sarkissov wrote:
>
>> I'm using Eclipse (PyDev) on MacOS. I downloaded scipy 0.10, installed
>> it, added the path to the .mpkg file to PYTHONPATH, and added scipy to
>> the forced builtins. Nothing worked, I keep getting 'module scipy not
>> found'. I then removed the link to the .mpkg and still nothing works.
>> Strangely enough, numpy works just fine. What should I do?
>
> Not sure what you mean by "install" here, but you're supposed to
> double-click the mpkg installer to run it, not put it on your PYTHONPATH.
> Note that to use the provided dmg installer, you have to also use the
> matching python from python.org
>
> Ralf

From ralf.gommers at googlemail.com  Wed Dec 14 17:22:39 2011
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Wed, 14 Dec 2011 23:22:39 +0100
Subject: [Numpy-discussion] scipy installation problem
In-Reply-To:
References:
Message-ID:

On Wed, Dec 14, 2011 at 11:12 PM, Alex Ter-Sarkissov wrote:

> yeah, I've already removed it, still doesn't work.
> I'm running Python 2.6 and the SciPy version I'm trying to install is
> scipy-0.10.0-py2.6-python.org-macosx10.3; the PyDev version is 2.2.4.
>
> I've had no trouble running numpy or Tkinter, for example. Also, none of
> the other modules I'm using have been added to PYTHONPATH.
>
> The only problem I had when installing the interpreter was the error
> message that stdlib was either not found or without .py files. I didn't
> add these files yet, but everything seems to be running smoothly except
> this issue with scipy.

This seems to be a PyDev issue, google turns up
http://stackoverflow.com/questions/5595276/pydev-eclipse-python-interpreters-error-stdlib-not-found

If that doesn't help, you should ask on a PyDev mailing list.

Ralf

From ater1980 at gmail.com  Wed Dec 14 17:44:52 2011
From: ater1980 at gmail.com (Alex Ter-Sarkissov)
Date: Thu, 15 Dec 2011 11:44:52 +1300
Subject: [Numpy-discussion] scipy installation problem
Message-ID:

OK thanks guys, reinstalling the interpreter did the trick. I'm quite
sure I did it before though, without any effect. More interestingly, I
have two interpreters running, one for 2.6 and the other set to auto.
The latter still tells me the module isn't found; the former works just
fine. Mystery?

From silva at lma.cnrs-mrs.fr  Thu Dec 15 11:17:44 2011
From: silva at lma.cnrs-mrs.fr (Fabrice Silva)
Date: Thu, 15 Dec 2011 17:17:44 +0100
Subject: [Numpy-discussion] Owndata flag
Message-ID: <1323965864.27277.11.camel@lma-98.cnrs-mrs.fr>

How can one arbitrarily assume that an ndarray owns its data?

More explicitly, I have some temporary home-made C structure that holds
a pointer to an array. I prepare (using Cython) a numpy.ndarray using
the PyArray_NewFromDescr function. I can delete my temporary C structure
without freeing the memory holding the array, but I want the
numpy.ndarray to become the owner of the data.

How can I do such a thing?

--
Fabrice Silva

From robert.kern at gmail.com  Thu Dec 15 11:36:24 2011
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 15 Dec 2011 16:36:24 +0000
Subject: [Numpy-discussion] Owndata flag
In-Reply-To: <1323965864.27277.11.camel@lma-98.cnrs-mrs.fr>
References: <1323965864.27277.11.camel@lma-98.cnrs-mrs.fr>
Message-ID:

On Thu, Dec 15, 2011 at 16:17, Fabrice Silva wrote:
> How can one arbitrarily assume that an ndarray owns its data?
>
> More explicitly, I have some temporary home-made C structure that holds
> a pointer to an array. I prepare (using Cython) a numpy.ndarray using
> the PyArray_NewFromDescr function. I can delete my temporary C structure
> without freeing the memory holding the array, but I want the
> numpy.ndarray to become the owner of the data.
>
> How can I do such a thing?

You can't, really. numpy-owned arrays will be deallocated with numpy's
deallocator. This may not be the appropriate deallocator for memory
that your library allocated.

If at all possible, I recommend using numpy to create the ndarray and
pass that pointer to your library. Sometimes the library's API gets in
the way of this. Otherwise, copy the data.

Devs, looking into this, I noticed that we use PyDataMem_NEW() and
PyDataMem_FREE() (which is #defined to malloc() and free()) for
handling the data pointer. Why aren't we using the appropriate
PyMem_*() functions (or the PyArray_*() memory functions which default
to using the PyMem_*() implementations)? Using the PyMem_*() functions
lets the Python memory manager have an accurate idea how much memory is
being used, which can be important for the large amounts of memory that
numpy arrays can consume.

I assume this is intentional design. I just want to know the rationale
for it and would like it documented. I can certainly understand if it
causes bad interactions with the garbage collector, say (though hiding
information from the GC seems like a suboptimal approach).

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco
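The pattern recommended above -- let numpy allocate, then hand the
pointer to the library -- looks like this with ctypes (a sketch; the
library name and the fill_array function are hypothetical):

import numpy as np
import ctypes

lib = ctypes.CDLL("./libmine.so")       # hypothetical C library

arr = np.empty(100, dtype=np.float64)   # numpy owns (and will free) this
ptr = arr.ctypes.data_as(ctypes.POINTER(ctypes.c_double))
lib.fill_array(ptr, arr.size)           # hypothetical: C code fills arr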
From tmp50 at ukr.net  Thu Dec 15 11:41:25 2011
From: tmp50 at ukr.net (Dmitrey)
Date: Thu, 15 Dec 2011 18:41:25 +0200
Subject: [Numpy-discussion] Ann: OpenOpt and FuncDesigner 0.37
Message-ID: <68842.1323967285.16181933434562084864@ffe15.ukr.net>

Hi all,
I'm glad to inform you about the new release 0.37 (2011-Dec-15) of our
free software:

OpenOpt (numerical optimization):
- IPOPT initialization time gap (time till first iteration) for
  FuncDesigner models has been decreased
- Some improvements and bugfixes for interalg, especially for the
  "search all SNLE solutions" mode (Systems of Non Linear Equations)
- Eigenvalue problems (EIG) (in both OpenOpt and FuncDesigner)
- Equality constraints for the GLP (global) solver de
- Some changes for the goldenSection ftol stop criterion
- GUI func "manage" - now the button "Enough" works in Python3, but
  "Run/Pause" not yet (probably something with threading; it will be
  fixed in Python instead)

FuncDesigner:
- Major sparse automatic differentiation improvements for
  badly-vectorized or unvectorized problems with lots of constraints
  (except box bounds); some problems now work many times or orders of
  magnitude faster (of course not faster than vectorized problems with
  a sufficient number of variable arrays). It is recommended to retest
  your large-scale problems with useSparse = 'auto' | True | False
- Two new methods for splines to check their quality: plot and residual
- Solving the ODE dy/dt = f(t) with specifiable accuracy by interalg
- Speedup for solving 1-dimensional IP by interalg

SpaceFuncs and DerApproximator:
- Some code cleanup

You may trace OpenOpt development information in our recently created
entries on Twitter and Facebook, see http://openopt.org for details.

See also: FuturePlans, this release announcement in the OpenOpt forum

Regards, D.

From gregor.thalhammer at gmail.com  Thu Dec 15 12:09:14 2011
From: gregor.thalhammer at gmail.com (Gregor Thalhammer)
Date: Thu, 15 Dec 2011 18:09:14 +0100
Subject: [Numpy-discussion] Owndata flag
In-Reply-To: <1323965864.27277.11.camel@lma-98.cnrs-mrs.fr>
References: <1323965864.27277.11.camel@lma-98.cnrs-mrs.fr>
Message-ID: <1A56FF2B-3841-48BE-9A0C-2B6491E6D028@gmail.com>

Am 15.12.2011 um 17:17 schrieb Fabrice Silva:

> How can one arbitrarily assume that an ndarray owns its data?
>
> More explicitly, I have some temporary home-made C structure that holds
> a pointer to an array. I prepare (using Cython) a numpy.ndarray using
> the PyArray_NewFromDescr function. I can delete my temporary C structure
> without freeing the memory holding the array, but I want the
> numpy.ndarray to become the owner of the data.
>
> How can I do such a thing?
There is an excellent blog entry from Travis Oliphant that describes
how to create an ndarray from existing data without a copy:
http://blog.enthought.com/?p=62
The created array does not actually own the data, but its base
attribute points to an object which frees the memory if the numpy
array gets deallocated. I guess this is the behavior you want to
achieve. Here is a Cython implementation (for a uint8 array).

Gregor

"""
see 'NumPy arrays with pre-allocated memory',
http://blog.enthought.com/?p=62
"""
import numpy as np
from numpy cimport import_array, ndarray, npy_intp, set_array_base, \
    PyArray_SimpleNewFromData, NPY_DOUBLE, NPY_INT, NPY_UINT8

cdef extern from "stdlib.h":
    void* malloc(int size)
    void free(void *ptr)

cdef class MemoryReleaser:
    cdef void* memory

    def __cinit__(self):
        self.memory = NULL

    def __dealloc__(self):
        if self.memory:
            # release memory
            free(self.memory)
            print "memory released", hex(<long>self.memory)

cdef MemoryReleaser MemoryReleaserFactory(void* ptr):
    cdef MemoryReleaser mr = MemoryReleaser.__new__(MemoryReleaser)
    mr.memory = ptr
    return mr

cdef ndarray frompointer(void* ptr, int nbytes):
    import_array()
    # cdef int dims[1]
    # dims[0] = nbytes
    cdef npy_intp dims = nbytes
    cdef ndarray arr = PyArray_SimpleNewFromData(1, &dims, NPY_UINT8, ptr)
    # TODO: check for error
    set_array_base(arr, MemoryReleaserFactory(ptr))
    return arr

def test_new_array_from_pointer():
    nbytes = 16
    cdef void* mem = malloc(nbytes)
    print "memory allocated", hex(<long>mem)
    return frompointer(mem, nbytes)

From silva at lma.cnrs-mrs.fr  Fri Dec 16 05:53:16 2011
From: silva at lma.cnrs-mrs.fr (Fabrice Silva)
Date: Fri, 16 Dec 2011 11:53:16 +0100
Subject: [Numpy-discussion] Owndata flag
In-Reply-To: <1A56FF2B-3841-48BE-9A0C-2B6491E6D028@gmail.com>
References: <1323965864.27277.11.camel@lma-98.cnrs-mrs.fr>
	<1A56FF2B-3841-48BE-9A0C-2B6491E6D028@gmail.com>
Message-ID: <1324032796.2246.1.camel@lma-98.cnrs-mrs.fr>

Le jeudi 15 décembre 2011 à 18:09 +0100, Gregor Thalhammer a écrit :

> There is an excellent blog entry from Travis Oliphant that describes
> how to create an ndarray from existing data without a copy:
> http://blog.enthought.com/?p=62
> The created array does not actually own the data, but its base
> attribute points to an object which frees the memory if the numpy
> array gets deallocated. I guess this is the behavior you want to
> achieve.
> Here is a Cython implementation (for a uint8 array).

Even better: the addendum!
http://blog.enthought.com/python/numpy/simplified-creation-of-numpy-arrays-from-pre-allocated-memory/

Within cython:

cimport numpy
numpy.set_array_base(my_ndarray,
                     PyCObject_FromVoidPtr(pointer_to_Cobj, some_destructor))

Seems OK.
Any objections about that?

--
Fabrice Silva
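Whatever object set_array_base() installs, the effect is visible from
Python through the base attribute and the OWNDATA flag -- a quick
illustration with an ordinary view:

import numpy as np

a = np.arange(10)
b = a[2:5]                # a view into a's buffer

a.flags['OWNDATA']        # True:  a owns its memory
b.flags['OWNDATA']        # False: b only borrows it
b.base is a               # True:  b.base keeps a (and the buffer) alive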
From gregor.thalhammer at gmail.com  Fri Dec 16 09:33:51 2011
From: gregor.thalhammer at gmail.com (Gregor Thalhammer)
Date: Fri, 16 Dec 2011 15:33:51 +0100
Subject: [Numpy-discussion] Owndata flag
In-Reply-To: <1324032796.2246.1.camel@lma-98.cnrs-mrs.fr>
References: <1323965864.27277.11.camel@lma-98.cnrs-mrs.fr>
	<1A56FF2B-3841-48BE-9A0C-2B6491E6D028@gmail.com>
	<1324032796.2246.1.camel@lma-98.cnrs-mrs.fr>
Message-ID: <723F3936-C6D5-4F0D-9BDD-8AE07CB6BDDE@gmail.com>

Am 16.12.2011 um 11:53 schrieb Fabrice Silva:

> Le jeudi 15 décembre 2011 à 18:09 +0100, Gregor Thalhammer a écrit :
>
>> There is an excellent blog entry from Travis Oliphant that describes
>> how to create an ndarray from existing data without a copy:
>> http://blog.enthought.com/?p=62
>> The created array does not actually own the data, but its base
>> attribute points to an object which frees the memory if the numpy
>> array gets deallocated. I guess this is the behavior you want to
>> achieve.
>> Here is a Cython implementation (for a uint8 array).
>
> Even better: the addendum!
> http://blog.enthought.com/python/numpy/simplified-creation-of-numpy-arrays-from-pre-allocated-memory/
>
> Within cython:
> cimport numpy
> numpy.set_array_base(my_ndarray, PyCObject_FromVoidPtr(pointer_to_Cobj, some_destructor))
>
> Seems OK.
> Any objections about that?

This is ok, but CObject is deprecated as of Python 3.1, so it's not
portable to Python 3.2.

Gregor

From silva at lma.cnrs-mrs.fr  Fri Dec 16 10:16:25 2011
From: silva at lma.cnrs-mrs.fr (Fabrice Silva)
Date: Fri, 16 Dec 2011 16:16:25 +0100
Subject: [Numpy-discussion] Owndata flag
In-Reply-To: <723F3936-C6D5-4F0D-9BDD-8AE07CB6BDDE@gmail.com>
References: <1323965864.27277.11.camel@lma-98.cnrs-mrs.fr>
	<1A56FF2B-3841-48BE-9A0C-2B6491E6D028@gmail.com>
	<1324032796.2246.1.camel@lma-98.cnrs-mrs.fr>
	<723F3936-C6D5-4F0D-9BDD-8AE07CB6BDDE@gmail.com>
Message-ID: <1324048585.2246.3.camel@lma-98.cnrs-mrs.fr>

Le vendredi 16 décembre 2011 à 15:33 +0100, Gregor Thalhammer a écrit :

>> Even better: the addendum!
>> http://blog.enthought.com/python/numpy/simplified-creation-of-numpy-arrays-from-pre-allocated-memory/
>>
>> Within cython:
>> cimport numpy
>> numpy.set_array_base(my_ndarray, PyCObject_FromVoidPtr(pointer_to_Cobj, some_destructor))
>>
>> Seems OK.
>> Any objections about that?
>
> This is ok, but CObject is deprecated as of Python 3.1, so it's not
> portable to Python 3.2.

My guess is then that the PyCapsule object is the way to go...

--
Fabrice Silva

From d.s.seljebotn at astro.uio.no  Fri Dec 16 12:38:20 2011
From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn)
Date: Fri, 16 Dec 2011 18:38:20 +0100
Subject: [Numpy-discussion] Owndata flag
In-Reply-To: <1324048585.2246.3.camel@lma-98.cnrs-mrs.fr>
References: <1323965864.27277.11.camel@lma-98.cnrs-mrs.fr>
	<1A56FF2B-3841-48BE-9A0C-2B6491E6D028@gmail.com>
	<1324032796.2246.1.camel@lma-98.cnrs-mrs.fr>
	<723F3936-C6D5-4F0D-9BDD-8AE07CB6BDDE@gmail.com>
	<1324048585.2246.3.camel@lma-98.cnrs-mrs.fr>
Message-ID: <4EEB820C.3030802@astro.uio.no>

On 12/16/2011 04:16 PM, Fabrice Silva wrote:
> Le vendredi 16 décembre 2011 à 15:33 +0100, Gregor Thalhammer a écrit :
>> This is ok, but CObject is deprecated as of Python 3.1, so it's not
>> portable to Python 3.2.
>
> My guess is then that the PyCapsule object is the way to go...

Another way: with recent NumPy you should be able to do something like
this in Cython:

cdef class SomeBufferWrapper:
    ...
    def __getbuffer__(self, ...):
        ...
    def __releasebuffer__(self, ...):
        ...

arr = np.asarray(SomeBufferWrapper(buf))

and then __releasebuffer__ will be called when `arr` goes out of use.
See the Cython docs.
Dag

From amcnicol at longroad.ac.uk  Fri Dec 16 18:07:56 2011
From: amcnicol at longroad.ac.uk (McNicol, Adam)
Date: Fri, 16 Dec 2011 23:07:56 -0000
Subject: [Numpy-discussion] Problem installing NumPy with Python
	3.2.2/MacOS X 10.7.2
Message-ID: <4d05ca55227d2cb66ae1d8abe73da4a23358da81@localhost>

Hi There,

I am very new to numpy and have really only started investigating it as
one of my students needs some functionality from matplotlib. I have
managed to install everything under Windows for work in class, but I use
a Mac at home and have been struggling all night to get it to build and
install.

I should mention that I am using Python 3.2.2 both in school and at home,
and it isn't an option to use Python 2.7 as all of the rest of my class
is taught in Python 3. I also have the most recent version of Xcode
installed.

I have installed the correct build of gcc-4.2 with Fortran (gcc-4.2
(Apple build 5666.3) with GNU Fortran 4.2.4 for Mac OS X 10.7 (Lion))
from http://r.research.att.com/tools/

I then followed the install instructions but the build fails with the
following message:

File "numpy/core/setup.py", line 271, in check_types
    "Cannot compile 'Python.h'. Perhaps you need to "\
SystemError: Cannot compile 'Python.h'. Perhaps you need to install
python-dev|python-devel.

I have got no idea what to do with this error message. Any help would be
much appreciated.

Kind Regards,


Adam.

From ben.root at ou.edu  Fri Dec 16 18:43:56 2011
From: ben.root at ou.edu (Benjamin Root)
Date: Fri, 16 Dec 2011 17:43:56 -0600
Subject: [Numpy-discussion] Problem installing NumPy with Python
	3.2.2/MacOS X 10.7.2
In-Reply-To: <4d05ca55227d2cb66ae1d8abe73da4a23358da81@localhost>
References: <4d05ca55227d2cb66ae1d8abe73da4a23358da81@localhost>
Message-ID:

On Friday, December 16, 2011, McNicol, Adam wrote:
> Hi There,
>
> I am very new to numpy and have really only started investigating it as
> one of my students needs some functionality from matplotlib. [...]
>
> Kind Regards,
>
> Adam.

Adam,

Just a quick comment about matplotlib and py3k. We have not yet released
a version for py3k. Support is currently being worked on in the
development branch, but I have no clue about the level of testing it has
received for Windows and Macs.

Just a heads-up.

Ben Root
From charlesr.harris at gmail.com  Fri Dec 16 19:21:49 2011
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 16 Dec 2011 17:21:49 -0700
Subject: [Numpy-discussion] Problem installing NumPy with Python
	3.2.2/MacOS X 10.7.2
In-Reply-To: <4d05ca55227d2cb66ae1d8abe73da4a23358da81@localhost>
References: <4d05ca55227d2cb66ae1d8abe73da4a23358da81@localhost>
Message-ID:

On Fri, Dec 16, 2011 at 4:07 PM, McNicol, Adam wrote:

> Hi There,
>
> I am very new to numpy and have really only started investigating it as
> one of my students needs some functionality from matplotlib. [...]
>
> I then followed the install instructions but the build fails with the
> following message:
>
> File "numpy/core/setup.py", line 271, in check_types
>     "Cannot compile 'Python.h'. Perhaps you need to "\
> SystemError: Cannot compile 'Python.h'. Perhaps you need to install
> python-dev|python-devel.
>
> I have got no idea what to do with this error message. Any help would be
> much appreciated.

Is Python.h present? If so, is it in the search path? How did you install
Python 3.2.2?

Chuck

From sonne at debian.org  Fri Dec 16 19:41:22 2011
From: sonne at debian.org (Soeren Sonnenburg)
Date: Sat, 17 Dec 2011 01:41:22 +0100
Subject: [Numpy-discussion] Testing the python buffer protocol
	(bf_getbuffer / tp_as_buffer)
Message-ID: <1324082482.21330.327.camel@no>

Hi,

I've implemented the buffer protocol
(http://www.python.org/dev/peps/pep-3118/) for some matrix class, and
when I manually call PyObject_GetBuffer on that object I see that I get
the right matrix.

Now I'd like to see numpy use the buffer protocol of my class. Does
anyone know how to test that? What do I need to write, just

x=MyMatrix([1,2,3])
y=numpy.array(x)

(that doesn't call the buffer function though - so it must be something
else)?

Any ideas?
Soeren
--
For the one fact about the future of which we can be certain is that it
will be utterly fantastic. -- Arthur C. Clarke, 1962

From torgil.svensson at gmail.com  Fri Dec 16 21:08:17 2011
From: torgil.svensson at gmail.com (Torgil Svensson)
Date: Sat, 17 Dec 2011 03:08:17 +0100
Subject: [Numpy-discussion] Array min from argmin along an axis?
In-Reply-To:
References: <4EE7CDAA.9060400@jhu.edu>
Message-ID:
?It seemed to me that the way to do this was to >> get the argmin and then use it to index into the array to get the min, >> but I can't figure out how to do it. ?Here's my toy example: > > [~] > |1> x = np.arange(25).reshape((5,5)) > > [~] > |2> y = np.abs(x - x.T) > > [~] > |3> y > array([[ 0, ?4, ?8, 12, 16], > ? ? ? [ 4, ?0, ?4, ?8, 12], > ? ? ? [ 8, ?4, ?0, ?4, ?8], > ? ? ? [12, ?8, ?4, ?0, ?4], > ? ? ? [16, 12, ?8, ?4, ?0]]) > > [~] > |4> i = np.argmin(y, axis=0) > > [~] > |5> y[i, np.arange(y.shape[1])] > array([0, 0, 0, 0, 0]) > > [~] > |6> y[np.argmin(y, axis=0), np.arange(y.shape[1])] > array([0, 0, 0, 0, 0]) > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > ? -- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From torgil.svensson at gmail.com Fri Dec 16 21:20:22 2011 From: torgil.svensson at gmail.com (Torgil Svensson) Date: Sat, 17 Dec 2011 03:20:22 +0100 Subject: [Numpy-discussion] Testing the python buffer protocol (bf_getbuffer / tp_as_buffer) In-Reply-To: <1324082482.21330.327.camel@no> References: <1324082482.21330.327.camel@no> Message-ID: What happens if you use y=numpy.frombuffer(x) ? //Torgil On Sat, Dec 17, 2011 at 1:41 AM, Soeren Sonnenburg wrote: > Hi, > > I've implemented the buffer protocol > (http://www.python.org/dev/peps/pep-3118/) for some matrix class and > when I manually call PyObject_GetBuffer on that object I see that I get > the right matrix. > > Now I'd like to see numpy use the buffer protocol of my class. Does > anyone know how to test that? What do I need to write, just > > x=MyMatrix([1,2,3]) > y=numpy.array(x) > > (that doesn't call the buffer function though - so it must be sth else)? > > Any ideas? > Soeren > -- > For the one fact about the future of which we can be certain is that it > will be utterly fantastic. -- Arthur C. Clarke, 1962 > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From sonne at debian.org Sat Dec 17 03:42:56 2011 From: sonne at debian.org (Soeren Sonnenburg) Date: Sat, 17 Dec 2011 09:42:56 +0100 Subject: [Numpy-discussion] Testing the python buffer protocol (bf_getbuffer / tp_as_buffer) In-Reply-To: References: <1324082482.21330.327.camel@no> Message-ID: <1324111376.21330.347.camel@no> Doesn't work, complaining that the object has no __buffer__ attribute. Digging into the numpy c code it seems numpy doesn't even support the buffer protocol but only the deprecated (old) one http://docs.python.org/c-api/objbuffer.html . At least there is nowhere a PyObject_CheckBuffer() call but frombuffer in the numpy C code checks for (Py_TYPE(buf)->tp_as_buffer->bf_getwritebuffer == NULL && Py_TYPE(buf)->tp_as_buffer->bf_getreadbuffer == NULL) . So it needs bf_read/writebuffer to be set instead of bf_getbuffer and the array buffer protocol :-( Soeren On Sat, 2011-12-17 at 03:20 +0100, Torgil Svensson wrote: > What happens if you use > > y=numpy.frombuffer(x) ? 
> //Torgil
>
> On Sat, Dec 17, 2011 at 1:41 AM, Soeren Sonnenburg wrote:
> > Hi,
> >
> > I've implemented the buffer protocol
> > (http://www.python.org/dev/peps/pep-3118/) for some matrix class, and
> > when I manually call PyObject_GetBuffer on that object I see that I
> > get the right matrix.
> >
> > Now I'd like to see numpy use the buffer protocol of my class. Does
> > anyone know how to test that? What do I need to write, just
> >
> > x=MyMatrix([1,2,3])
> > y=numpy.array(x)
> >
> > (that doesn't call the buffer function though - so it must be
> > something else)?
> >
> > Any ideas?
> > Soeren

--
For the one fact about the future of which we can be certain is that it
will be utterly fantastic. -- Arthur C. Clarke, 1962

From ralf.gommers at googlemail.com  Sat Dec 17 06:04:42 2011
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Sat, 17 Dec 2011 12:04:42 +0100
Subject: [Numpy-discussion] Problem installing NumPy with Python
	3.2.2/MacOS X 10.7.2
In-Reply-To:
References: <4d05ca55227d2cb66ae1d8abe73da4a23358da81@localhost>
Message-ID:

On Sat, Dec 17, 2011 at 1:21 AM, Charles R Harris wrote:

> On Fri, Dec 16, 2011 at 4:07 PM, McNicol, Adam wrote:
>
>> Hi There,
>>
>> I am very new to numpy and have really only started investigating it as
>> one of my students needs some functionality from matplotlib. I have
>> managed to install everything under Windows for work in class, but I use
>> a Mac at home and have been struggling all night to get it to build and
>> install.
>>
>> I should mention that I am using Python 3.2.2 both in school and at
>> home, and it isn't an option to use Python 2.7 as all of the rest of my
>> class is taught in Python 3. I also have the most recent version of
>> Xcode installed.

Did you also install the optional 10.4/5/6 SDKs with Xcode? They may be
needed.

It could also be a bug in distribute. On OS X 10.6 distribute 0.6.10
works for me, but I see they're now at version 0.6.24. No idea if that
works or not.

Ralf

>> I have installed the correct build of gcc-4.2 with Fortran (gcc-4.2
>> (Apple build 5666.3) with GNU Fortran 4.2.4 for Mac OS X 10.7 (Lion))
>> from http://r.research.att.com/tools/
>>
>> I then followed the install instructions but the build fails with the
>> following message:
>>
>> File "numpy/core/setup.py", line 271, in check_types
>>     "Cannot compile 'Python.h'. Perhaps you need to "\
>> SystemError: Cannot compile 'Python.h'. Perhaps you need to install
>> python-dev|python-devel.
>>
>> I have got no idea what to do with this error message. Any help would
>> be much appreciated.
>
> Is Python.h present? If so, is it in the search path? How did you install
> Python 3.2.2?
>
> Chuck

From amcnicol at longroad.ac.uk  Sat Dec 17 06:59:56 2011
From: amcnicol at longroad.ac.uk (McNicol, Adam)
Date: Sat, 17 Dec 2011 11:59:56 -0000
Subject: [Numpy-discussion] Problem installing NumPy with Python
	3.2.2/MacOS X 10.7.2
References:
Message-ID: <10223008b8ce733fdab14ccbb504b118fab78763@localhost>

Hi There,

Thanks for the responses.

At this point I would settle for just being able to install matplotlib.
Even if some of the functionality isn't present currently, that is fine.

I'm afraid my knowledge of Python falls down about here as well. I
installed Python 3.2.2 via the installer from Python.org, so I have no
idea whether Python.h is present, or where indeed I would find it, or
how I would add it to the search path.

Do I have to install from source or something like that?

Thanks again,


Adam.

-----Original Message-----
From: McNicol, Adam
Sent: Fri 12/16/2011 11:07 PM
To: numpy-discussion at scipy.org
Subject: Problem installing NumPy with Python 3.2.2/MacOS X 10.7.2

Hi There,

I am very new to numpy and have really only started investigating it as
one of my students needs some functionality from matplotlib. I have
managed to install everything under Windows for work in class, but I use
a Mac at home and have been struggling all night to get it to build and
install.

I should mention that I am using Python 3.2.2 both in school and at
home, and it isn't an option to use Python 2.7 as all of the rest of my
class is taught in Python 3. I also have the most recent version of
Xcode installed.

I have installed the correct build of gcc-4.2 with Fortran (gcc-4.2
(Apple build 5666.3) with GNU Fortran 4.2.4 for Mac OS X 10.7 (Lion))
from http://r.research.att.com/tools/

I then followed the install instructions but the build fails with the
following message:

File "numpy/core/setup.py", line 271, in check_types
    "Cannot compile 'Python.h'. Perhaps you need to "\
SystemError: Cannot compile 'Python.h'. Perhaps you need to install
python-dev|python-devel.

I have got no idea what to do with this error message. Any help would be
much appreciated.

Kind Regards,


Adam.
> > So it needs bf_read/writebuffer to be set instead of bf_getbuffer and the array buffer protocol :-( > > Soeren > > On Sat, 2011-12-17 at 03:20 +0100, Torgil Svensson wrote: >> What happens if you use >> >> y=numpy.frombuffer(x) ? >> >> //Torgil >> >> >> On Sat, Dec 17, 2011 at 1:41 AM, Soeren Sonnenburg wrote: >> > Hi, >> > >> > I've implemented the buffer protocol >> > (http://www.python.org/dev/peps/pep-3118/) for some matrix class and >> > when I manually call PyObject_GetBuffer on that object I see that I get >> > the right matrix. >> > >> > Now I'd like to see numpy use the buffer protocol of my class. Does >> > anyone know how to test that? What do I need to write, just >> > >> > x=MyMatrix([1,2,3]) >> > y=numpy.array(x) >> > >> > (that doesn't call the buffer function though - so it must be sth else)? >> > >> > Any ideas? >> > Soeren >> > -- >> > For the one fact about the future of which we can be certain is that it >> > will be utterly fantastic. -- Arthur C. Clarke, 1962 >> > >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at scipy.org >> > http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > -- > For the one fact about the future of which we can be certain is that it > will be utterly fantastic. -- Arthur C. Clarke, 1962 > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From pav at iki.fi Sat Dec 17 09:29:37 2011 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 17 Dec 2011 15:29:37 +0100 Subject: [Numpy-discussion] Testing the python buffer protocol (bf_getbuffer / tp_as_buffer) In-Reply-To: <1324111376.21330.347.camel@no> References: <1324082482.21330.327.camel@no> <1324111376.21330.347.camel@no> Message-ID: 17.12.2011 09:42, Soeren Sonnenburg kirjoitti: > Doesn't work, complaining that the object has no __buffer__ attribute. > > Digging into the numpy c code it seems numpy doesn't even support the > buffer protocol but only the deprecated (old) one > http://docs.python.org/c-api/objbuffer.html . [clip] Since Numpy version 1.5, the new buffer protocol is supported. -- Pauli Virtanen From sparrow2867 at yahoo.com Sat Dec 17 11:32:12 2011 From: sparrow2867 at yahoo.com (Alex van Houten) Date: Sat, 17 Dec 2011 08:32:12 -0800 (PST) Subject: [Numpy-discussion] strange conversion integer to float Message-ID: <1324139532.73649.YahooMailNeo@web113603.mail.gq1.yahoo.com> Try this: $ python Python 2.7.1 (r271:86832, Apr 12 2011, 16:15:16) [GCC 4.6.0 20110331 (Red Hat 4.6.0-2)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy as np >>> np.__version__ '1.5.1' >>> time=[] >>> time.append(20091231) >>> time_array=np.array(time,'f') >>> time_array array([ 20091232.], dtype=float32) 20091231--->20091232 Why? Note: >>> float(20091231) 20091231.0 Thanks, Alex. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From matthieu.brucher at gmail.com Sat Dec 17 11:39:12 2011 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Sat, 17 Dec 2011 17:39:12 +0100 Subject: [Numpy-discussion] strange conversion integer to float In-Reply-To: <1324139532.73649.YahooMailNeo@web113603.mail.gq1.yahoo.com> References: <1324139532.73649.YahooMailNeo@web113603.mail.gq1.yahoo.com> Message-ID: Hi, If I remember correctly, Python's float is a double (a double-precision float). The precision is greater in doubles (float64) than in usual floats (float32). And 20091231 cannot be represented in 32-bit floats. Matthieu 2011/12/17 Alex van Houten > Try this: > $ python > Python 2.7.1 (r271:86832, Apr 12 2011, 16:15:16) > [GCC 4.6.0 20110331 (Red Hat 4.6.0-2)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > >>> import numpy as np > >>> np.__version__ > '1.5.1' > >>> time=[] > >>> time.append(20091231) > >>> time_array=np.array(time,'f') > >>> time_array > array([ 20091232.], dtype=float32) > 20091231--->20091232 Why? > Note: > >>> float(20091231) > 20091231.0 > Thanks, > Alex. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Information System Engineer, Ph.D. Blog: http://matt.eifelle.com LinkedIn: http://www.linkedin.com/in/matthieubrucher -------------- next part -------------- An HTML attachment was scrubbed... URL: From sonne at debian.org Sat Dec 17 18:49:56 2011 From: sonne at debian.org (Soeren Sonnenburg) Date: Sun, 18 Dec 2011 00:49:56 +0100 Subject: [Numpy-discussion] Testing the python buffer protocol (bf_getbuffer / tp_as_buffer) In-Reply-To: References: <1324082482.21330.327.camel@no> <1324111376.21330.347.camel@no> Message-ID: <1324165796.24957.6.camel@no> On Sat, 2011-12-17 at 15:29 +0100, Pauli Virtanen wrote: > 17.12.2011 09:42, Soeren Sonnenburg kirjoitti: > > Doesn't work, complaining that the object has no __buffer__ attribute. > > > > Digging into the numpy c code it seems numpy doesn't even support the > > buffer protocol but only the deprecated (old) one > > http://docs.python.org/c-api/objbuffer.html . > [clip] > > Since Numpy version 1.5, the new buffer protocol is supported. I've looked at the source code of numpy 1.6.1 and couldn't find the respective code...
I guess I must be doing something wrong but there > really was no call to PyObject_CheckBuffer() ... Look for PyObject_GetBuffer > The problem is I don't really know what is supposed to happen if the new > buffer protocol is supported by some class named say Foo. Could I then > do > > x=Foo([1,2,3]) > > numpy.array([2,2,2])+x > > and such operations? Yes. You can try it out with Python's builtin memoryview class. -- Pauli Virtanen From ralf.gommers at googlemail.com Sun Dec 18 03:49:00 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sun, 18 Dec 2011 09:49:00 +0100 Subject: [Numpy-discussion] Problem installing NumPy with Python 3.2.2/MacOS X 10.7.2 In-Reply-To: <10223008b8ce733fdab14ccbb504b118fab78763@localhost> References: <10223008b8ce733fdab14ccbb504b118fab78763@localhost> Message-ID: On Sat, Dec 17, 2011 at 12:59 PM, McNicol, Adam wrote: > ** > > Hi There, > > Thanks for the responses. > > At this point I would settle from just being able to install matplotlib. > Even if some of the functionality isn't present currently that is fine. > > I'm afraid my knowledge of Python falls down about here as well. I > installed Python 3.2.2 via the installer from Python.org so I have no idea > whether Python.h is present or where indeed I would find it or how I would > add it to the search path. > > Do I have to install from source or something like that? > No, your Python install should be fine if you just got the dmg installer from python.org. I recommend you install the OS X SDKs and distribute ( http://pypi.python.org/pypi/distribute), as I said before, and try again to compile numpy. Unfortunately you have chosen a difficult combination of OS and Python version, so we don't have binary installers you can use (yet). Ralf > Thanks again, > > > Adam. > > > -----Original Message----- > From: McNicol, Adam > Sent: Fri 12/16/2011 11:07 PM > To: numpy-discussion at scipy.org > Subject: Problem installing NumPy with Python 3.2.2/MacOS X 10.7.2 > > Hi There, > > I am very new to numpy and have really only started investigating it as > one of my students needs some functionality from matplotlib. I have managed > to install everything under Windows for work in class but I use a Mac at > home and have been struggling all night to get it to build and install. > > I should mention that I am using Python 3.2.2 both in school and at home > and it isn't an option to use Python 2.7 as all of the rest of my class is > taught in Python 3. I also have the most recent version of Xcode installed. > > I have installed the correct build of gcc-4.2 with Fortran (gcc-4.2 (Apple > build 5666.3) with GNU Fortran 4.2.4 for Mac OS X 10.7 (Lion)) from > http://r.research.att.com/tools/ > > I then followed the install instructions but the build fails with the > following message: > > File "numpy/core/setup.py", line 271, in check_types > "Cannot compile 'Python.h'. Perhaps you need to "\ > SystemError: Cannot compile 'Python.h'. Perhaps you need to install > python-dev|python-devel. > > I have got no idea what to do with this error message. Any help would be > much appreciated. > > Kind Regards, > > > Adam. > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From amcnicol at longroad.ac.uk Sun Dec 18 13:48:47 2011 From: amcnicol at longroad.ac.uk (McNicol, Adam) Date: Sun, 18 Dec 2011 18:48:47 -0000 Subject: [Numpy-discussion] Problem installing NumPy with Python 3.2.2/MacOS X 10.7.2 References: Message-ID: Hi Ralf, Thanks for the response. I tried reinstalling Xcode 4.2.1 and the GCC/Fortran installer from http://r.research.att.com/tools/ (gcc-42-5666.3-darwin11.pkg) before installing the distribute package that you suggested. I then reran the numpy installer being sure to enter the three export lines as suggested on the numpy installation guide for Lion. Still no success. I guess I'll just have to wait for more official support for my configuration. I have included the output from terminal just in case it is useful as there were a few lines in red that suggest something isn't quite right with something. I have placed ** before the lines that appear in red. I appreciate the suggestions, Thanks again, Adam. running build running config_cc unifing config_cc, config, build_clib, build_ext, build commands --compiler options running config_fc unifing config_fc, config, build_clib, build_ext, build commands --fcompiler options running build_src build_src building py_modules sources creating build creating build/src.macosx-10.6-intel-3.2 creating build/src.macosx-10.6-intel-3.2/numpy creating build/src.macosx-10.6-intel-3.2/numpy/distutils building library "npymath" sources customize NAGFCompiler **Could not locate executable f95 customize AbsoftFCompiler **Could not locate executable f90 **Could not locate executable f77 customize IBMFCompiler **Could not locate executable xlf90 **Could not locate executable xlf customize IntelFCompiler **Could not locate executable fort **Could not locate executable ifc customize GnuFCompiler **Could not locate executable g77 customize Gnu95FCompiler **Could not locate executable gfortran customize G95FCompiler **Could not locate executable g95 customize PGroupFCompiler **Could not locate executable pgf90 **Could not locate executable pgf77 **don't know how to compile Fortran code on platform 'posix' C compiler: gcc-4.2 -fno-strict-aliasing -fno-common -dynamic -DNDEBUG -g -O3 -isysroot /Developer/SDKs/MacOSX10.6.sdk -arch i386 -arch x86_64 -isysroot /Developer/SDKs/MacOSX10.6.sdk compile options: '-Inumpy/core/src/private -Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/include -I/Library/Frameworks/Python.framework/Versions/3.2/include/python3.2m -c' gcc-4.2: _configtest.c gcc-4.2 _configtest.o -o _configtest success! 
removing: _configtest.c _configtest.o _configtest customize NAGFCompiler customize AbsoftFCompiler customize IBMFCompiler customize IntelFCompiler customize GnuFCompiler customize Gnu95FCompiler customize G95FCompiler customize PGroupFCompiler **don't know how to compile Fortran code on platform 'posix' customize NAGFCompiler customize AbsoftFCompiler customize IBMFCompiler customize IntelFCompiler customize GnuFCompiler customize Gnu95FCompiler customize G95FCompiler customize PGroupFCompiler **don't know how to compile Fortran code on platform 'posix' C compiler: gcc-4.2 -fno-strict-aliasing -fno-common -dynamic -DNDEBUG -g -O3 -isysroot /Developer/SDKs/MacOSX10.6.sdk -arch i386 -arch x86_64 -isysroot /Developer/SDKs/MacOSX10.6.sdk compile options: '-Inumpy/core/src/private -Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/include -I/Library/Frameworks/Python.framework/Versions/3.2/include/python3.2m -c' gcc-4.2: _configtest.c _configtest.c:1: warning: conflicting types for built-in function 'exp' _configtest.c:1: warning: conflicting types for built-in function 'exp' gcc-4.2 _configtest.o -o _configtest success! removing: _configtest.c _configtest.o _configtest creating build/src.macosx-10.6-intel-3.2/numpy/core creating build/src.macosx-10.6-intel-3.2/numpy/core/src creating build/src.macosx-10.6-intel-3.2/numpy/core/src/npymath conv_template:> build/src.macosx-10.6-intel-3.2/numpy/core/src/npymath/npy_math.c conv_template:> build/src.macosx-10.6-intel-3.2/numpy/core/src/npymath/ieee754.c conv_template:> build/src.macosx-10.6-intel-3.2/numpy/core/src/npymath/npy_math_complex.c building extension "numpy.core._sort" sources Generating build/src.macosx-10.6-intel-3.2/numpy/core/include/numpy/config.h customize NAGFCompiler customize AbsoftFCompiler customize IBMFCompiler customize IntelFCompiler customize GnuFCompiler customize Gnu95FCompiler customize G95FCompiler customize PGroupFCompiler **don't know how to compile Fortran code on platform 'posix' customize NAGFCompiler customize AbsoftFCompiler customize IBMFCompiler customize IntelFCompiler customize GnuFCompiler customize Gnu95FCompiler customize G95FCompiler customize PGroupFCompiler **don't know how to compile Fortran code on platform 'posix' C compiler: gcc-4.2 -fno-strict-aliasing -fno-common -dynamic -DNDEBUG -g -O3 -isysroot /Developer/SDKs/MacOSX10.6.sdk -arch i386 -arch x86_64 -isysroot /Developer/SDKs/MacOSX10.6.sdk compile options: '-Inumpy/core/src/private -Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/include -I/Library/Frameworks/Python.framework/Versions/3.2/include/python3.2m -c' gcc-4.2: _configtest.c In file included from
/Library/Frameworks/Python.framework/Versions/3.2/include/python3.2m/bytearrayobject.h:9, from /Library/Frameworks/Python.framework/Versions/3.2/include/python3.2m/Python.h:73, from _configtest.c:1: /Developer/SDKs/MacOSX10.6.sdk/usr/include/stdarg.h:4:25: error: stdarg.h: No such file or directory In file included from /Library/Frameworks/Python.framework/Versions/3.2/include/python3.2m/bytearrayobject.h:9, from /Library/Frameworks/Python.framework/Versions/3.2/include/python3.2m/Python.h:73, from _configtest.c:1: /Developer/SDKs/MacOSX10.6.sdk/usr/include/stdarg.h:4:25: error: stdarg.h: No such file or directory lipo: can't figure out the architecture type of: /var/folders/_c/z033hf1s1cgfcxtxfnpg0lsm0000gn/T//ccKv548x.out failure. removing: _configtest.c _configtest.o Running from numpy source directory.Traceback (most recent call last): File "setup.py", line 196, in setup_package() File "setup.py", line 189, in setup_package configuration=configuration ) File "/Users/adammcnicol/Desktop/numpy-1.6.1/build/py3k/numpy/distutils/core.py", line 186, in setup return old_setup(**new_attr) File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/distutils/core.py", line 148, in setup dist.run_commands() File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/distutils/dist.py", line 917, in run_commands self.run_command(cmd) File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/distutils/dist.py", line 936, in run_command cmd_obj.run() File "/Users/adammcnicol/Desktop/numpy-1.6.1/build/py3k/numpy/distutils/command/build.py", line 37, in run old_build.run(self) File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/distutils/command/build.py", line 126, in run self.run_command(cmd_name) File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/distutils/cmd.py", line 313, in run_command self.distribution.run_command(command) File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/distutils/dist.py", line 936, in run_command cmd_obj.run() File "/Users/adammcnicol/Desktop/numpy-1.6.1/build/py3k/numpy/distutils/command/build_src.py", line 152, in run self.build_sources() File "/Users/adammcnicol/Desktop/numpy-1.6.1/build/py3k/numpy/distutils/command/build_src.py", line 169, in build_sources self.build_extension_sources(ext) File "/Users/adammcnicol/Desktop/numpy-1.6.1/build/py3k/numpy/distutils/command/build_src.py", line 328, in build_extension_sources sources = self.generate_sources(sources, ext) File "/Users/adammcnicol/Desktop/numpy-1.6.1/build/py3k/numpy/distutils/command/build_src.py", line 385, in generate_sources source = func(extension, build_dir) File "numpy/core/setup.py", line 410, in generate_config_h moredefs, ignored = cocache.check_types(config_cmd, ext, build_dir) File "numpy/core/setup.py", line 41, in check_types out = check_types(*a, **kw) File "numpy/core/setup.py", line 271, in check_types "Cannot compile 'Python.h'. Perhaps you need to "\ SystemError: Cannot compile 'Python.h'. Perhaps you need to install python-dev|python-devel. Message: 3 Date: Sun, 18 Dec 2011 09:49:00 +0100 From: Ralf Gommers Subject: Re: [Numpy-discussion] Problem installing NumPy with Python 3.2.2/MacOS X 10.7.2 To: Discussion of Numerical Python Message-ID: Content-Type: text/plain; charset="iso-8859-1" On Sat, Dec 17, 2011 at 12:59 PM, McNicol, Adam wrote: > ** > > Hi There, > > Thanks for the responses. > > At this point I would settle from just being able to install matplotlib. 
> Even if some of the functionality isn't present currently that is fine. > > I'm afraid my knowledge of Python falls down about here as well. I > installed Python 3.2.2 via the installer from Python.org so I have no idea > whether Python.h is present or where indeed I would find it or how I would > add it to the search path. > > Do I have to install from source or something like that? > No, your Python install should be fine if you just got the dmg installer from python.org. I recommend you install the OS X SDKs and distribute ( http://pypi.python.org/pypi/distribute), as I said before, and try again to compile numpy. Unfortunately you have chosen a difficult combination of OS and Python version, so we don't have binary installers you can use (yet). Ralf > Thanks again, > > > Adam. > > > -----Original Message----- > From: McNicol, Adam > Sent: Fri 12/16/2011 11:07 PM > To: numpy-discussion at scipy.org > Subject: Problem installing NumPy with Python 3.2.2/MacOS X 10.7.2 > > Hi There, > > I am very new to numpy and have really only started investigating it as > one of my students needs some functionality from matplotlib. I have managed > to install everything under Windows for work in class but I use a Mac at > home and have been struggling all night to get it to build and install. > > I should mention that I am using Python 3.2.2 both in school and at home > and it isn't an option to use Python 2.7 as all of the rest of my class is > taught in Python 3. I also have the most recent version of Xcode installed. > > I have installed the correct build of gcc-4.2 with Fortran (gcc-4.2 (Apple > build 5666.3) with GNU Fortran 4.2.4 for Mac OS X 10.7 (Lion)) from > http://r.research.att.com/tools/ > > I then followed the install instructions but the build fails with the > following message: > > File "numpy/core/setup.py", line 271, in check_types > "Cannot compile 'Python.h'. Perhaps you need to "\ > SystemError: Cannot compile 'Python.h'. Perhaps you need to install > python-dev|python-devel. > > I have got no idea what to do with this error message. Any help would be > much appreciated. > > Kind Regards, > > > Adam. > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.scipy.org/pipermail/numpy-discussion/attachments/20111218/0747fb12/attachment-0001.html ------------------------------ _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion End of NumPy-Discussion Digest, Vol 63, Issue 54 ************************************************ -------------- next part -------------- A non-text attachment was scrubbed... Name: winmail.dat Type: application/ms-tnef Size: 7394 bytes Desc: not available URL: From ralf.gommers at googlemail.com Sun Dec 18 16:53:35 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sun, 18 Dec 2011 22:53:35 +0100 Subject: [Numpy-discussion] Problem installing NumPy with Python 3.2.2/MacOS X 10.7.2 In-Reply-To: References: Message-ID: On Sun, Dec 18, 2011 at 7:48 PM, McNicol, Adam wrote: > Hi Ralf, > > Thanks for the response. I tried reinstalling Xcode 4.2.1 and the > GCC/Fortran installer from http://r.research.att.com/tools/(gcc-42-5666.3-darwin11.pkg) before installing the distribute package that > you suggested. 
> > I then reran the numpy installer being sure to enter the three export > lines as suggested on the numpy installation guide for Lion. > > Still no success. I guess I'll just have to wait for more official support > for my configuration. > > I have included the output from terminal just in case it is useful as > there were a few lines in red that suggest something isn't quite right with > something. I have placed ** before the lines that appear in red. > > Your compile flags have "-isysroot /Developer/SDKs/MacOSX10.6.sdk" in it twice. Can you confirm you have installed this SDK? If so, I think the problem is that it appears twice. Not sure what's causing it though. Ralf I appreciate the suggestions, > > Thanks again, > > > Adam. > > running build > running config_cc > unifing config_cc, config, build_clib, build_ext, build commands > --compiler options > running config_fc > unifing config_fc, config, build_clib, build_ext, build commands > --fcompiler options > running build_src > build_src > building py_modules sources > creating build > creating build/src.macosx-10.6-intel-3.2 > creating build/src.macosx-10.6-intel-3.2/numpy > creating build/src.macosx-10.6-intel-3.2/numpy/distutils > building library "npymath" sources > customize NAGFCompiler > **Could not locate executable f95 > customize AbsoftFCompiler > **Could not locate executable f90 > **Could not locate executable f77 > customize IBMFCompiler > **Could not locate executable xlf90 > **Could not locate executable xlf > customize IntelFCompiler > **Could not locate executable fort > **Could not locate executable ifc > customize GnuFCompiler > **Could not locate executable g77 > customize Gnu95FCompiler > **Could not locate executable gfortran > customize G95FCompiler > **Could not locate executable g95 > customize PGroupFCompiler > **Could not locate executable pgf90 > **Could not locate executable pgf77 > **don't know how to compile Fortran code on platform 'posix' > C compiler: gcc-4.2 -fno-strict-aliasing -fno-common -dynamic -DNDEBUG -g > -O3 -isysroot /Developer/SDKs/MacOSX10.6.sdk -arch i386 -arch x86_64 > -isysroot /Developer/SDKs/MacOSX10.6.sdk > > compile options: '-Inumpy/core/src/private -Inumpy/core/src -Inumpy/core > -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath > -Inumpy/core/include > -I/Library/Frameworks/Python.framework/Versions/3.2/include/python3.2m -c' > gcc-4.2: _configtest.c > gcc-4.2 _configtest.o -o _configtest > success! 
> removing: _configtest.c _configtest.o _configtest > customize NAGFCompiler > customize AbsoftFCompiler > customize IBMFCompiler > customize IntelFCompiler > customize GnuFCompiler > customize Gnu95FCompiler > customize G95FCompiler > customize PGroupFCompiler > **don't know how to compile Fortran code on platform 'posix' > customize NAGFCompiler > customize AbsoftFCompiler > customize IBMFCompiler > customize IntelFCompiler > customize GnuFCompiler > customize Gnu95FCompiler > customize G95FCompiler > customize PGroupFCompiler > **don't know how to compile Fortran code on platform 'posix' > C compiler: gcc-4.2 -fno-strict-aliasing -fno-common -dynamic -DNDEBUG -g > -O3 -isysroot /Developer/SDKs/MacOSX10.6.sdk -arch i386 -arch x86_64 > -isysroot /Developer/SDKs/MacOSX10.6.sdk > > compile options: '-Inumpy/core/src/private -Inumpy/core/src -Inumpy/core > -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath > -Inumpy/core/include > -I/Library/Frameworks/Python.framework/Versions/3.2/include/python3.2m -c' > gcc-4.2: _configtest.c > _configtest.c:1: warning: conflicting types for built-in function ?exp? > _configtest.c:1: warning: conflicting types for built-in function ?exp? > gcc-4.2 _configtest.o -o _configtest > success! > removing: _configtest.c _configtest.o _configtest > creating build/src.macosx-10.6-intel-3.2/numpy/core > creating build/src.macosx-10.6-intel-3.2/numpy/core/src > creating build/src.macosx-10.6-intel-3.2/numpy/core/src/npymath > conv_template:> > build/src.macosx-10.6-intel-3.2/numpy/core/src/npymath/npy_math.c > conv_template:> > build/src.macosx-10.6-intel-3.2/numpy/core/src/npymath/ieee754.c > conv_template:> > build/src.macosx-10.6-intel-3.2/numpy/core/src/npymath/npy_math_complex.c > building extension "numpy.core._sort" sources > Generating > build/src.macosx-10.6-intel-3.2/numpy/core/include/numpy/config.h > customize NAGFCompiler > customize AbsoftFCompiler > customize IBMFCompiler > customize IntelFCompiler > customize GnuFCompiler > customize Gnu95FCompiler > customize G95FCompiler > customize PGroupFCompiler > **don't know how to compile Fortran code on platform 'posix' > customize NAGFCompiler > customize AbsoftFCompiler > customize IBMFCompiler > customize IntelFCompiler > customize GnuFCompiler > customize Gnu95FCompiler > customize G95FCompiler > customize PGroupFCompiler > **don't know how to compile Fortran code on platform 'posix' > C compiler: gcc-4.2 -fno-strict-aliasing -fno-common -dynamic -DNDEBUG -g > -O3 -isysroot /Developer/SDKs/MacOSX10.6.sdk -arch i386 -arch x86_64 > -isysroot /Developer/SDKs/MacOSX10.6.sdk > > compile options: '-Inumpy/core/src/private -Inumpy/core/src -Inumpy/core > -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath > -Inumpy/core/include > -I/Library/Frameworks/Python.framework/Versions/3.2/include/python3.2m -c' > gcc-4.2: _configtest.c > In file included from > /Library/Frameworks/Python.framework/Versions/3.2/include/python3.2m/bytearrayobject.h:9, > from > /Library/Frameworks/Python.framework/Versions/3.2/include/python3.2m/Python.h:73, > from _configtest.c:1: > /Developer/SDKs/MacOSX10.6.sdk/usr/include/stdarg.h:4:25: error: stdarg.h: > No such file or directory > In file included from > /Library/Frameworks/Python.framework/Versions/3.2/include/python3.2m/bytearrayobject.h:9, > from > /Library/Frameworks/Python.framework/Versions/3.2/include/python3.2m/Python.h:73, > from _configtest.c:1: > /Developer/SDKs/MacOSX10.6.sdk/usr/include/stdarg.h:4:25: error: stdarg.h: > No 
such file or directory > lipo: can't figure out the architecture type of: > /var/folders/_c/z033hf1s1cgfcxtxfnpg0lsm0000gn/T//ccKv548x.out > In file included from > /Library/Frameworks/Python.framework/Versions/3.2/include/python3.2m/bytearrayobject.h:9, > from > /Library/Frameworks/Python.framework/Versions/3.2/include/python3.2m/Python.h:73, > from _configtest.c:1: > /Developer/SDKs/MacOSX10.6.sdk/usr/include/stdarg.h:4:25: error: stdarg.h: > No such file or directory > In file included from > /Library/Frameworks/Python.framework/Versions/3.2/include/python3.2m/bytearrayobject.h:9, > from > /Library/Frameworks/Python.framework/Versions/3.2/include/python3.2m/Python.h:73, > from _configtest.c:1: > /Developer/SDKs/MacOSX10.6.sdk/usr/include/stdarg.h:4:25: error: stdarg.h: > No such file or directory > lipo: can't figure out the architecture type of: > /var/folders/_c/z033hf1s1cgfcxtxfnpg0lsm0000gn/T//ccKv548x.out > failure. > removing: _configtest.c _configtest.o > Running from numpy source directory.Traceback (most recent call last): > File "setup.py", line 196, in > setup_package() > File "setup.py", line 189, in setup_package > configuration=configuration ) > File > "/Users/adammcnicol/Desktop/numpy-1.6.1/build/py3k/numpy/distutils/core.py", > line 186, in setup > return old_setup(**new_attr) > File > "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/distutils/core.py", > line 148, in setup > dist.run_commands() > File > "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/distutils/dist.py", > line 917, in run_commands > self.run_command(cmd) > File > "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/distutils/dist.py", > line 936, in run_command > cmd_obj.run() > File > "/Users/adammcnicol/Desktop/numpy-1.6.1/build/py3k/numpy/distutils/command/build.py", > line 37, in run > old_build.run(self) > File > "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/distutils/command/build.py", > line 126, in run > self.run_command(cmd_name) > File > "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/distutils/cmd.py", > line 313, in run_command > self.distribution.run_command(command) > File > "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/distutils/dist.py", > line 936, in run_command > cmd_obj.run() > File > "/Users/adammcnicol/Desktop/numpy-1.6.1/build/py3k/numpy/distutils/command/build_src.py", > line 152, in run > self.build_sources() > File > "/Users/adammcnicol/Desktop/numpy-1.6.1/build/py3k/numpy/distutils/command/build_src.py", > line 169, in build_sources > self.build_extension_sources(ext) > File > "/Users/adammcnicol/Desktop/numpy-1.6.1/build/py3k/numpy/distutils/command/build_src.py", > line 328, in build_extension_sources > sources = self.generate_sources(sources, ext) > File > "/Users/adammcnicol/Desktop/numpy-1.6.1/build/py3k/numpy/distutils/command/build_src.py", > line 385, in generate_sources > source = func(extension, build_dir) > File "numpy/core/setup.py", line 410, in generate_config_h > moredefs, ignored = cocache.check_types(config_cmd, ext, build_dir) > File "numpy/core/setup.py", line 41, in check_types > out = check_types(*a, **kw) > File "numpy/core/setup.py", line 271, in check_types > "Cannot compile 'Python.h'. Perhaps you need to "\ > SystemError: Cannot compile 'Python.h'. Perhaps you need to install > python-dev|python-devel. 
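[The SystemError at the end is distutils' generic wrapper around any failed test compile, so its python-dev hint can mislead; the stdarg.h errors above point at the SDK headers rather than at missing Python headers. Two quick checks, sketched below as my own snippet rather than anything from the thread, show whether the interpreter's own headers exist and which flags distutils has baked in; a doubled or stale -isysroot would show up in CFLAGS:

    import os
    import sysconfig

    # Where this interpreter expects its C headers (Python.h) to live:
    inc = sysconfig.get_paths()['include']
    print(inc, '->', os.path.exists(os.path.join(inc, 'Python.h')))

    # Flags recorded when Python itself was built; -isysroot problems
    # (e.g. pointing at an SDK that is not installed) surface here:
    for var in ('CC', 'CFLAGS', 'LDFLAGS'):
        print(var, '=', sysconfig.get_config_var(var))
]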
> > > > Message: 3 > Date: Sun, 18 Dec 2011 09:49:00 +0100 > From: Ralf Gommers > Subject: Re: [Numpy-discussion] Problem installing NumPy with Python > 3.2.2/MacOS X 10.7.2 > To: Discussion of Numerical Python > Message-ID: > > > Content-Type: text/plain; charset="iso-8859-1" > > On Sat, Dec 17, 2011 at 12:59 PM, McNicol, Adam >wrote: > > > ** > > > > Hi There, > > > > Thanks for the responses. > > > > At this point I would settle from just being able to install matplotlib. > > Even if some of the functionality isn't present currently that is fine. > > > > I'm afraid my knowledge of Python falls down about here as well. I > > installed Python 3.2.2 via the installer from Python.org so I have no > idea > > whether Python.h is present or where indeed I would find it or how I > would > > add it to the search path. > > > > Do I have to install from source or something like that? > > > > No, your Python install should be fine if you just got the dmg installer > from python.org. I recommend you install the OS X SDKs and distribute ( > http://pypi.python.org/pypi/distribute), as I said before, and try again > to > compile numpy. > > Unfortunately you have chosen a difficult combination of OS and Python > version, so we don't have binary installers you can use (yet). > > Ralf > > > > Thanks again, > > > > > > Adam. > > > > > > -----Original Message----- > > From: McNicol, Adam > > Sent: Fri 12/16/2011 11:07 PM > > To: numpy-discussion at scipy.org > > Subject: Problem installing NumPy with Python 3.2.2/MacOS X 10.7.2 > > > > Hi There, > > > > I am very new to numpy and have really only started investigating it as > > one of my students needs some functionality from matplotlib. I have > managed > > to install everything under Windows for work in class but I use a Mac at > > home and have been struggling all night to get it to build and install. > > > > I should mention that I am using Python 3.2.2 both in school and at home > > and it isn't an option to use Python 2.7 as all of the rest of my class > is > > taught in Python 3. I also have the most recent version of Xcode > installed. > > > > I have installed the correct build of gcc-4.2 with Fortran (gcc-4.2 > (Apple > > build 5666.3) with GNU Fortran 4.2.4 for Mac OS X 10.7 (Lion)) from > > http://r.research.att.com/tools/ > > > > I then followed the install instructions but the build fails with the > > following message: > > > > File "numpy/core/setup.py", line 271, in check_types > > "Cannot compile 'Python.h'. Perhaps you need to "\ > > SystemError: Cannot compile 'Python.h'. Perhaps you need to install > > python-dev|python-devel. > > > > I have got no idea what to do with this error message. Any help would be > > much appreciated. > > > > Kind Regards, > > > > > > Adam. > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > -------------- next part -------------- > An HTML attachment was scrubbed... 
> URL: >
> http://mail.scipy.org/pipermail/numpy-discussion/attachments/20111218/0747fb12/attachment-0001.html
>
> ------------------------------
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
> End of NumPy-Discussion Digest, Vol 63, Issue 54
> ************************************************
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
> -------------- next part --------------
An HTML attachment was scrubbed...
URL:

From amcnicol at longroad.ac.uk  Sun Dec 18 17:13:48 2011
From: amcnicol at longroad.ac.uk (McNicol, Adam)
Date: Sun, 18 Dec 2011 22:13:48 -0000
Subject: [Numpy-discussion] Problem installing NumPy with Python 3.2.2/MacOS X 10.7.2
References:
Message-ID: <2128d5dc6c318f2d07c027b5c6c4c0ef5b01bf31@localhost>

Hi,

Definitely have the sdk installed. In the Developer/SDKs directory I have
one for 10.6 and another for 10.7 - no idea where a second 10.6 would be
coming from =(

Adam.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: winmail.dat Type: application/ms-tnef Size: 13714 bytes Desc: not available URL: From teoliphant at gmail.com Sun Dec 18 21:03:01 2011 From: teoliphant at gmail.com (Travis Oliphant) Date: Sun, 18 Dec 2011 20:03:01 -0600 Subject: [Numpy-discussion] Owndata flag In-Reply-To: References: <1323965864.27277.11.camel@lma-98.cnrs-mrs.fr> Message-ID: <78F9272C-77AF-4E27-80C2-282FDCAB5C06@enthought.com> [snip] > > > > Devs, looking into this, I noticed that we use PyDataMem_NEW() and > PyDataMem_FREE() (which is #defined to malloc() and free()) for > handling the data pointer. Why aren't we using the appropriate > PyMem_*() functions (or the PyArray_*() memory functions which default > to using the PyMem_*() implementations)? Using the PyMem_*() functions > lets the Python memory manager have an accurate idea how much memory > is being used, which can be important for the large amounts of memory > that numpy arrays can consume. > > I assume this is intentional design. I just want to know the rationale > for it and would like it documented. I can certainly understand if it > causes bad interactions with the garbage collector, say (though hiding > information from the GC seems like a suboptimal approach). The macros were created so that the allocator could be switched when we understood better the benefits and trade-offs of using the Python memory manager versus the system memory manager (or one specialized for NumPy). So, the only intentional design was to use the macros (the decision to make them point to malloc and free was more because that's what was being done before than explicit decision. -Travis > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion --- Travis Oliphant Enthought, Inc. oliphant at enthought.com 1-512-536-1057 http://www.enthought.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From fperez.net at gmail.com Mon Dec 19 04:49:07 2011 From: fperez.net at gmail.com (Fernando Perez) Date: Mon, 19 Dec 2011 01:49:07 -0800 Subject: [Numpy-discussion] [ANN] IPython 0.12 is out! Message-ID: Hi all, on behalf of the IPython development team, I'm thrilled to announce, after an intense 4 1/2 months of work, the official release of IPython 0.12. This is a very important release for IPython, for several reasons. First and foremost, we have a major new feature, our interactive web-based notebook, that has been in our sights for a very long time. We tried to build one years ago (with WX) as a Google SoC project in 2005, had other prototypes later on, but things never quite worked. Finally the refactoring effort started two years ago, the communications architecture we built in 2010, and the advances of modern browsers, gave us all the necessary pieces. With this foundation in place, while part of the team worked on the 0.11 release, Brian Granger had already started quietly building the web notebook, which we demoed in early-alpha mode at the SciPy 2011 conference (http://www.archive.org/details/Wednesday-203-6-IpythonANewArchitectureForInteractiveAndParallel). 
By the EuroScipy conference in August we had merged Brian's amazing effort into our master branch, and after that multiple people (old and new) jumped in to make all kinds of improvements, leaving us today with something that is an excellent foundation. It's still the first release of the notebook, and as such we know it has a number of rough edges, but several of us have been using it as a daily research tool for the last few months. Do not hesitate to file issues for any problems you encounter with it, and we even have an 'open issue' for general discussion of ideas and features for the notebook at: https://github.com/ipython/ipython/issues/977. Furthermore, it is clear that our big refactoring work, combined with the amazing facilities at Github, are paying off. The 0.11 series was a major amount of work, with 511 issues closed over almost two years. But that pales in comparison to this cycle: in only 4 1/2 months we closed 515 issues, with 50% being Pull Requests. And very importantly, our list of contributors includes many new faces (see the credits section in our release notes for full details), which is the best thing that can happen to an open source project. We hope you will find the new features (the notebook isn't the only one! see below) compelling, and that many more will not only use IPython but will join the project; there's plenty to do and now there are tasks for many different skill sets (web, javascript, gui work, low-level networking, parallel machinery, console apps, etc). *Downloads* Download links and instructions are at: http://ipython.org/download.html And IPython is also on PyPI: http://pypi.python.org/pypi/ipython Those contain a built version of the HTML docs; if you want pure source downloads with no docs, those are available on github: Tarball: https://github.com/ipython/ipython/tarball/rel-0.12 Zipball: https://github.com/ipython/ipython/zipball/rel-0.12 * Features * Here is a quick listing of the major new features: - An interactive browser-based Notebook with rich media support - Two-process terminal console - Tabbed QtConsole - Full Python 3 compatibility - Standalone Kernel - PyPy support And many more... We closed over 500 tickets, merged over 200 pull requests, and more than 45 people contributed commits for the final release. Please see our release notes for the full details on everything about this release: http://ipython.org/ipython-doc/stable/whatsnew/version0.12.html * IPython tutorial at PyCon 2012 * Those of you attending (or planning on it) PyCon 2012 in Santa Clara, CA, may be interested in attending a hands-on tutorial we will be presenting on the many faces of IPython. See https://us.pycon.org/2012/schedule/presentation/121/ for full details. * Errata * This was caught by Matthias Bussionnier's (one of our great new contributors) sharp eyes while I was writing these release notes: In the example notebook called display_protocol, the first cell starts with: from IPython.lib.pylabtools import print_figure which should instead be: from IPython.core.pylabtools import print_figure This has already been fixed on master, but since the final 0.12 files have been uploaded to github and PyPI, we'll let them be. As usual, if you find any other problem, please file a ticket --or even better, a pull request fixing it-- on our github issues site (https://github.com/ipython/ipython/issues/). Many thanks to all who contributed! Fernando, on behalf of the IPython development team. 
http://ipython.org From e.antero.tammi at gmail.com Mon Dec 19 18:27:19 2011 From: e.antero.tammi at gmail.com (eat) Date: Tue, 20 Dec 2011 01:27:19 +0200 Subject: [Numpy-discussion] Would it be possible to enhance np.unique(.) with a keyword kind Message-ID: Hi, Especially when the keyword return_index of np.unique(.) is specified to be True, would it in general also be reasonable to be able to specify the sorting algorithm of the underlying np.argsort(.)? The rationale is that (for at least some of my cases) higher-level algorithms seem to be overly complicated unless I'm able to request a stable sorting order from np.unique(.) (like np.unique(., return_index=True, kind='mergesort')). (FWIW, I apparently do have a working local hack for this kind of functionality, but without extensive testing of 'all' corner cases). Thanks, eat -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon Dec 19 19:33:30 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 19 Dec 2011 19:33:30 -0500 Subject: [Numpy-discussion] Would it be possible to enhance np.unique(.) with a keyword kind In-Reply-To: References: Message-ID: On Mon, Dec 19, 2011 at 6:27 PM, eat wrote: > Hi, > > Especially when the keyword return_index of np.unique(.) is specified to be > True, would it in general also be reasonable to be able to specify the > sorting algorithm of the underlying np.argsort(.)? > > The rationale is that (for at least some of my cases) higher-level > algorithms seem to be overly complicated unless I'm able to request a > stable sorting order from np.unique(.) (like np.unique(., return_index= > True, kind='mergesort')). > > (FWIW, I apparently do have a working local hack for this kind of > functionality, but without extensive testing of 'all' corner cases). Just to understand this: Is the return_index currently always the first occurrence or random? I haven't found a use for return_index yet (but use return_inverse a lot), but if we are guaranteed to get the first instance, then this could be very useful. Josef > > > Thanks, > eat > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From e.antero.tammi at gmail.com Mon Dec 19 20:16:31 2011 From: e.antero.tammi at gmail.com (eat) Date: Tue, 20 Dec 2011 03:16:31 +0200 Subject: [Numpy-discussion] Would it be possible to enhance np.unique(.) with a keyword kind In-Reply-To: References: Message-ID: Hi, On Tue, Dec 20, 2011 at 2:33 AM, wrote: > On Mon, Dec 19, 2011 at 6:27 PM, eat wrote: > > Hi, > > > > Especially when the keyword return_index of np.unique(.) is specified to > be > > True, would it in general also be reasonable to be able to specify the > > sorting algorithm of the underlying np.argsort(.)? > > > > The rationale is that (for at least some of my cases) higher-level > > algorithms seem to be overly complicated unless I'm able to > request a > > stable sorting order from np.unique(.) (like np.unique(., return_index= > > True, kind='mergesort')). > > > > (FWIW, I apparently do have a working local hack for this kind of > > functionality, but without extensive testing of 'all' corner cases). > > Just to understand this: > > Is the return_index currently always the first occurrence or random? > No, with the current implementation it's not always the first occurrence that is returned.
AFAIK, the only stable algorithm to provide this is 'mergesort', and that's why I'd like to have a keyword 'kind' to propagate down to the internals. > > I haven't found a use for return_index yet (but use return_inverse a > lot), but if we are guaranteed to get the first instance, then this > could be very useful. > I think that 'return_inverse' will suffer from the same nondeterministic behavior as well. Thanks, eat > > Josef > > > > > > Thanks, > > eat > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon Dec 19 20:41:43 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 19 Dec 2011 20:41:43 -0500 Subject: [Numpy-discussion] Would it be possible to enhance np.unique(.) with a keyword kind In-Reply-To: References: Message-ID: On Mon, Dec 19, 2011 at 8:16 PM, eat wrote: > Hi, > > On Tue, Dec 20, 2011 at 2:33 AM, wrote: >> >> On Mon, Dec 19, 2011 at 6:27 PM, eat wrote: >> > Hi, >> > >> > Especially when the keyword return_index of np.unique(.) is specified to >> > be >> > True, would it in general also be reasonable to be able to specify the >> > sorting algorithm of the underlying np.argsort(.)? >> > >> > The rationale is that (for at least some of my cases) higher-level >> > algorithms seem to be overly complicated unless I'm able to >> > request a >> > stable sorting order from np.unique(.) (like np.unique(., return_index= >> > True, kind='mergesort')). >> > >> > (FWIW, I apparently do have a working local hack for this kind of >> > functionality, but without extensive testing of 'all' corner cases). >> >> Just to understand this: >> >> Is the return_index currently always the first occurrence or random? > > No, with the current implementation it's not always the first occurrence > that is returned. AFAIK, the only stable algorithm to provide this is 'mergesort', > and that's why I'd like to have a keyword 'kind' to propagate down to the > internals. Thanks, then I'm all in favor of mergesort. >> >> >> I haven't found a use for return_index yet (but use return_inverse a >> lot), but if we are guaranteed to get the first instance, then this >> could be very useful. > > I think that 'return_inverse' will suffer from the same > nondeterministic behavior as well. I don't think so, because there is a unique mapping from observations to unique items.
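For instance, a minimal check of that mapping (an illustrative sketch only; the array is just an arbitrary example):

import numpy as np

a = np.array([2, 1, 2, 3, 1, 2])
vals, inv = np.unique(a, return_inverse=True)
# every observation maps to exactly one unique item, so this round
# trip reconstructs the input regardless of how ties were ordered
assert (vals[inv] == a).all()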
Josef > > Thanks, > eat >> >> >> Josef >> >> >> > >> > >> > Thanks, >> > eat >> > >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at scipy.org >> > http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From e.antero.tammi at gmail.com Mon Dec 19 21:01:01 2011 From: e.antero.tammi at gmail.com (eat) Date: Tue, 20 Dec 2011 04:01:01 +0200 Subject: [Numpy-discussion] Would it be possible to enhance np.unique(.) with a keyword kind In-Reply-To: References: Message-ID: Hi, On Tue, Dec 20, 2011 at 3:41 AM, wrote: > On Mon, Dec 19, 2011 at 8:16 PM, eat wrote: > > Hi, > > > > On Tue, Dec 20, 2011 at 2:33 AM, wrote: > >> > >> On Mon, Dec 19, 2011 at 6:27 PM, eat wrote: > >> > Hi, > >> > > >> > Especially when the keyword return_index of np.unique(.) is specified > to > >> > be > >> > True, would it in general also be reasonable to be able to specify the > >> > sorting algorithm of the underlying np.argsort(.)? > >> > > >> > The rationale is that (for at least some of my cases) higher-level > >> > algorithms seem to be overly complicated unless I'm able to > >> > request a > >> > stable sorting order from np.unique(.) > (like np.unique(., return_index= > >> > True, kind='mergesort')). > >> > > >> > (FWIW, I apparently do have a working local hack for this kind of > >> > functionality, but without extensive testing of 'all' corner cases). > >> > >> Just to understand this: > >> > >> Is the return_index currently always the first occurrence or random? > > > > No, with the current implementation it's not always the first occurrence > > that is returned. AFAIK, the only stable algorithm to provide this is > 'mergesort' > > and that's why I'd like to have a keyword 'kind' to propagate down to > the > > internals. > > Thanks, then I'm all in favor of mergesort. > > >> > >> > >> I haven't found a use for return_index yet (but use return_inverse a > >> lot), but if we are guaranteed to get the first instance, then this > >> could be very useful. > > > > I think that 'return_inverse' will suffer from the same > > nondeterministic behavior as well. > > I don't think so, because there is a unique mapping from observations > to unique items. > But the source code (of 1.6.1) indicates that the keywords 'return_inverse' and 'return_index' are related, indeed.
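For concreteness, a minimal sketch of the first-occurrence behavior a stable sort would buy (an illustration only, assuming a 1-d array; np.unique itself exposes no 'kind' keyword here):

import numpy as np

a = np.array([2, 1, 2, 3, 1, 2])
# a stable sort keeps equal elements in their input order, so the
# first element of each run of duplicates is the first occurrence
order = a.argsort(kind='mergesort')
s = a[order]
flag = np.concatenate(([True], s[1:] != s[:-1]))
vals = s[flag]         # array([1, 2, 3])
first = order[flag]    # array([1, 0, 3]): always the first occurrences
assert (a[first] == vals).all()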
Just my 2 cents eat > > Josef > > > > > Thanks, > > eat > >> > >> > >> Josef > >> > >> > >> > > >> > > >> > Thanks, > >> > eat > >> > > >> > _______________________________________________ > >> > NumPy-Discussion mailing list > >> > NumPy-Discussion at scipy.org > >> > http://mail.scipy.org/mailman/listinfo/numpy-discussion > >> > > >> _______________________________________________ > >> NumPy-Discussion mailing list > >> NumPy-Discussion at scipy.org > >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Marc.Poinot at onera.fr Tue Dec 20 05:31:52 2011 From: Marc.Poinot at onera.fr (Marc POINOT) Date: Tue, 20 Dec 2011 11:31:52 +0100 Subject: [Numpy-discussion] import_array weird behavior Message-ID: <4EF06418.4080200@onera.fr> Hi all, I've just changed to cython and old numpy module with a raw C API. The C module init is removed, and I've put the import_array in the 'pure-cython' part of the module init. Usual tutorial examples have these lines: import numpy as NPY cimport numpy as NPY NPY.import_array() But this fails (first numpy API call core dump) in my module, if I put back the import_array() in the C part of the module everything turns ok again. Now if I remove again this C API import_array and I write: import numpy as NPY cimport numpy as CNPY CNPY.import_array() all is ok. Do I miss something? It sounds good to me to have to separate scopes but I wonder if it's a common practice with potential side-effects or if it is the right way to use numpy with cython. And then, what about my core dump? cython 1.15.1 python 2.7.2 numpy 1.6.1 scons 2.1.0 linux x86 64 with gcc 4.3.4 -MP- ----------------------------------------------------------------------- Marc POINOT [ONERA/DSNA] Tel:+33.1.46.73.42.84 Fax:+33.1.46.73.41.66 Avertissement/disclaimer http://www.onera.fr/onera-en/emails-terms From charlesr.harris at gmail.com Tue Dec 20 09:18:31 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 20 Dec 2011 07:18:31 -0700 Subject: [Numpy-discussion] numpy 1.7.0 release? In-Reply-To: References: Message-ID: Hi Ralf, On Mon, Dec 5, 2011 at 12:43 PM, Ralf Gommers wrote: > Hi all, > > It's been a little over 6 months since the release of 1.6.0 and the NA > debate has quieted down, so I'd like to ask your opinion on the timing of > 1.7.0. It looks to me like we have a healthy amount of bug fixes and small > improvements, plus three larger chucks of work: > > - datetime > - NA > - Bento support > > My impression is that both datetime and NA are releasable, but should be > labeled "tech preview" or something similar, because they may still see > significant changes. Please correct me if I'm wrong. > > There's still some maintenance work to do and pull requests to merge, but > a beta release by Christmas should be feasible. What do you all think? > > I'm now thinking that is too optimistic. There are a fair number of tickets that need to be looked at, including some for einsum and the iterator, and I think the number of pull requests needs to be reduced. How about sometime in the beginning of January? 
Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Tue Dec 20 12:02:09 2011 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 20 Dec 2011 17:02:09 +0000 Subject: [Numpy-discussion] import_array weird behavior In-Reply-To: <4EF06418.4080200@onera.fr> References: <4EF06418.4080200@onera.fr> Message-ID: On Tue, Dec 20, 2011 at 10:31, Marc POINOT wrote: > > Hi all, > > I've just changed to cython an old numpy module with a raw C API. > The C module init is removed, and I've put the import_array in the 'pure-cython' > part of the module init. Usual tutorial examples have these lines: > > import numpy as NPY > cimport numpy as NPY > > NPY.import_array() > > But this fails (first numpy API call core dump) in my module, if I put back the > import_array() in the C part of the module everything turns ok again. > Now if I remove again this C API import_array and I write: > > import numpy as NPY > cimport numpy as CNPY > > CNPY.import_array() > > all is ok. > Do I miss something? It sounds good to me to have two separate scopes but > I wonder if it's a common practice with potential side-effects or if it is > the right way to use numpy with cython. Yes, it is the right way to do it. Cython can *mostly* tell from context whether to resolve NPY. from either the C side or the Python side, but sometimes it's ambiguous. It's more often ambiguous to the human reader, too, so I try to be explicit about it. I don't really know why the tutorials do it the confusing way. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From dtustudy68 at hotmail.com Tue Dec 20 14:21:39 2011 From: dtustudy68 at hotmail.com (Jack Bryan) Date: Tue, 20 Dec 2011 12:21:39 -0700 Subject: [Numpy-discussion] numpy1.6.1 install fortran compiler error Message-ID: Hi, In order to install scipy, I am trying to install numpy 1.6.1 on GNU/linux redhat 2.6.18. But, I got an error about the Fortran compiler. I have gfortran. I do not have f77/f90/g77/g90. I run: python setup.py build --fcompiler=gfortran It works well and tells me that

customize Gnu95FCompiler
Found executable /usr/bin/gfortran

But, when I run it, I get:

building library "npymath" sources
customize GnuFCompiler
Could not locate executable g77
Could not locate executable f77
customize IntelFCompiler
Could not locate executable ifort
Could not locate executable ifc
customize LaheyFCompiler
Could not locate executable lf95
customize PGroupFCompiler
Could not locate executable pgf90
Could not locate executable pgf77
customize AbsoftFCompiler
Could not locate executable f90
customize NAGFCompiler
Found executable /usr/bin/f95
customize Gnu95FCompiler
customize Gnu95FCompiler using config
C compiler: gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC

Do I have to install f77/f90/g77/g90 ? thanks -------------- next part -------------- An HTML attachment was scrubbed... URL: From derek at astro.physik.uni-goettingen.de Tue Dec 20 14:43:53 2011 From: derek at astro.physik.uni-goettingen.de (Derek Homeier) Date: Tue, 20 Dec 2011 20:43:53 +0100 Subject: [Numpy-discussion] numpy1.6.1 install fortran compiler error In-Reply-To: References: Message-ID: <2E61F130-F107-45A6-B3F0-E61F9535C27A@astro.physik.uni-goettingen.de> Hi Jack, > In order to install scipy, I am trying to install numpy 1.6.1 on GNU/linux redhat 2.6.18.
> > But, I got an error about the Fortran compiler.
> >
> > I have gfortran. I do not have f77/f90/g77/g90.
> >
> that's good!
> > I run:
> > python setup.py build --fcompiler=gfortran
> >
> > It works well and tells me that
> >
> > customize Gnu95FCompiler
> > Found executable /usr/bin/gfortran
> >
> > But, when I run it, I get:
> >
> > building library "npymath" sources
> > customize GnuFCompiler
> > Could not locate executable g77
> > Could not locate executable f77
> > customize IntelFCompiler
> > Could not locate executable ifort
> > Could not locate executable ifc
> > customize LaheyFCompiler
> > Could not locate executable lf95
> > customize PGroupFCompiler
> > Could not locate executable pgf90
> > Could not locate executable pgf77
> > customize AbsoftFCompiler
> > Could not locate executable f90
> > customize NAGFCompiler
> > Found executable /usr/bin/f95
> > customize Gnu95FCompiler
> > customize Gnu95FCompiler using config
> > C compiler: gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC
> >
> > Do I have to install f77/f90/g77/g90 ?
> >
> You did not send any actual error message here, so it's difficult to tell
> where exactly your install failed. But gfortran is preferred over f77 etc.
> and should in fact be automatically selected (without the '--fcompiler=gfortran'),
> it is apparently also found in the right place.
> Could you send us the last lines of output with the error itself, or possibly
> everything following a line starting with "Traceback:.." ;
> and also the output of `gfortran -v`?
>
> Cheers,
> Derek

From dtustudy68 at hotmail.com Tue Dec 20 15:01:52 2011 From: dtustudy68 at hotmail.com (Jack Bryan) Date: Tue, 20 Dec 2011 13:01:52 -0700 Subject: [Numpy-discussion] numpy1.6.1 install fortran compiler error In-Reply-To: <2E61F130-F107-45A6-B3F0-E61F9535C27A@astro.physik.uni-goettingen.de> References: , <2E61F130-F107-45A6-B3F0-E61F9535C27A@astro.physik.uni-goettingen.de> Message-ID: Hi, I run: python setup.py build and got:

building library "npymath" sources
customize GnuFCompiler
Could not locate executable g77
Could not locate executable f77
customize IntelFCompiler
Could not locate executable ifort
Could not locate executable ifc
customize LaheyFCompiler
Could not locate executable lf95
customize PGroupFCompiler
Could not locate executable pgf90
Could not locate executable pgf77
customize AbsoftFCompiler
Could not locate executable f90
customize NAGFCompiler
Found executable /usr/bin/f95
customize VastFCompiler
customize CompaqFCompiler
Could not locate executable fort
customize IntelItaniumFCompiler
Could not locate executable efort
Could not locate executable efc
customize IntelEM64TFCompiler
customize Gnu95FCompiler
Found executable /usr/bin/gfortran
customize Gnu95FCompiler
customize Gnu95FCompiler using config
C compiler: gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC

.........

building extension "numpy.random.mtrand" sources
C compiler: gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC

compile options: '-Inumpy/core/src/private -Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/include -I/python272/include/python2.7 -c'
gcc: _configtest.c
gcc -pthread _configtest.o -o _configtest
_configtest
failure.
then:

python setup.py install --prefix=/mypath/numpy

I got:

Running from numpy source directory.
non-existing path in 'numpy/distutils': 'site.cfg'
F2PY Version 2
blas_opt_info:
blas_mkl_info:
libraries mkl,vml,guide not found in /mypath/python272/lib
libraries mkl,vml,guide not found in /usr/local/lib64
libraries mkl,vml,guide not found in /usr/local/lib
libraries mkl,vml,guide not found in /usr/lib64
libraries mkl,vml,guide not found in /usr/lib
NOT AVAILABLE
atlas_blas_threads_info:
Setting PTATLAS=ATLAS
libraries ptf77blas,ptcblas,atlas not found in /mypath/python272/lib
libraries ptf77blas,ptcblas,atlas not found in /usr/local/lib64
libraries ptf77blas,ptcblas,atlas not found in /usr/local/lib
libraries ptf77blas,ptcblas,atlas not found in /usr/lib64/sse2
libraries ptf77blas,ptcblas,atlas not found in /usr/lib64
libraries ptf77blas,ptcblas,atlas not found in /usr/lib/sse2
libraries ptf77blas,ptcblas,atlas not found in /usr/lib
NOT AVAILABLE
atlas_blas_info:
libraries f77blas,cblas,atlas not found in /mypath/python272/lib
libraries f77blas,cblas,atlas not found in /usr/local/lib64
libraries f77blas,cblas,atlas not found in /usr/local/lib
libraries f77blas,cblas,atlas not found in /usr/lib64/sse2
libraries f77blas,cblas,atlas not found in /usr/lib64
libraries f77blas,cblas,atlas not found in /usr/lib/sse2
libraries f77blas,cblas,atlas not found in /usr/lib
NOT AVAILABLE
UserWarning: Lapack (http://www.netlib.org/lapack/) libraries not found.
Directories to search for the libraries can be specified in the
numpy/distutils/site.cfg file (section [lapack]) or by setting
the LAPACK environment variable.
warnings.warn(LapackNotFoundError.__doc__)
lapack_src_info:
NOT AVAILABLE
Lapack (http://www.netlib.org/lapack/) sources not found.
Directories to search for the sources can be specified in the
numpy/distutils/site.cfg file (section [lapack_src]) or by setting
the LAPACK_SRC environment variable.
warnings.warn(LapackSrcNotFoundError.__doc__)
NOT AVAILABLE
running install
running build
running config_cc
unifing config_cc, config, build_clib, build_ext, build commands --compiler options
running config_fc
unifing config_fc, config, build_clib, build_ext, build commands --fcompiler options
running build_src
build_src
building py_modules sources
building library "npymath" sources
customize GnuFCompiler
Could not locate executable g77
Could not locate executable f77
customize IntelFCompiler
Could not locate executable ifort
Could not locate executable ifc
customize LaheyFCompiler
Could not locate executable lf95
customize PGroupFCompiler
Could not locate executable pgf90
Could not locate executable pgf77
customize AbsoftFCompiler
Could not locate executable f90
customize NAGFCompiler
Found executable /usr/bin/f95
customize VastFCompiler
customize CompaqFCompiler
Could not locate executable fort
customize IntelItaniumFCompiler
Could not locate executable efort
Could not locate executable efc
customize IntelEM64TFCompiler
customize Gnu95FCompiler
Found executable /usr/bin/gfortran
customize Gnu95FCompiler
customize Gnu95FCompiler using config
C compiler: gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC

compile options: '-Inumpy/core/src/private -Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/include -I/remote/dcnl/Ding/backup_20100716/python272/include/python2.7 -c'
gcc: _configtest.c
gcc -pthread _configtest.o -o _configtest
success!
removing: _configtest.c _configtest.o _configtest
C compiler: gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC

........................

compile options: '-Inumpy/core/src/private -Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/include -I/mypath/python272/include/python2.7 -c'
gcc: _configtest.c
gcc -pthread _configtest.o -o _configtest
_configtest
failure.

at the end:

running install_egg_info
Removing /mypath/numpy/lib/python2.7/site-packages/numpy-1.6.1-py2.7.egg-info
Writing /mypath/numpy/lib/python2.7/site-packages/numpy-1.6.1-py2.7.egg-info
running install_clib

Then I got:

python
Python 2.7.2 (default, Dec 20 2011, 12:32:10)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-51)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named numpy

I have updated PATH for bin and lib of numpy. I have got f2py and lib files:

add_newdocs.py distutils __init__.py numarray testing
add_newdocs.pyc doc __init__.pyc oldnumeric tests
compat dual.py lib polynomial version.py
__config__.py dual.pyc linalg random version.pyc
__config__.pyc f2py ma setup.py
core fft matlib.py setup.pyc
ctypeslib.py _import_tools.py matlib.pyc setupscons.py
ctypeslib.pyc _import_tools.pyc matrixlib setupscons.pyc

Any help is really appreciated. thanks

> From: derek at astro.physik.uni-goettingen.de
> Date: Tue, 20 Dec 2011 20:43:53 +0100
> To: numpy-discussion at scipy.org
> Subject: Re: [Numpy-discussion] numpy1.6.1 install fortran compiler error
>
> Hi Jack,
>
> > In order to install scipy, I am trying to install numpy 1.6.1 on GNU/linux redhat 2.6.18.
> >
> > But, I got an error about the Fortran compiler.
> >
> > I have gfortran. I do not have f77/f90/g77/g90.
> >
> that's good!
> > > I run : > > python setup.py build --fcompiler=gfortran > > > > It woks well and tells me that > > > > customize Gnu95FCompiler > > Found executable /usr/bin/gfortran > > > > But, i run: > > > > building library "npymath" sources > > customize GnuFCompiler > > Could not locate executable g77 > > Could not locate executable f77 > > customize IntelFCompiler > > Could not locate executable ifort > > Could not locate executable ifc > > customize LaheyFCompiler > > Could not locate executable lf95 > > customize PGroupFCompiler > > Could not locate executable pgf90 > > Could not locate executable pgf77 > > customize AbsoftFCompiler > > Could not locate executable f90 > > customize NAGFCompiler > > Found executable /usr/bin/f95 > > customize Gnu95FCompiler > > customize Gnu95FCompiler using config > > C compiler: gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC > > > > Do I have to install f77/f90/g77/g90 ? > > > You did not send any actual error message here, so it's difficult to tell > where exactly your install failed. But gfortran is preferred over f77 etc. > and should in fact be automatically selected (without the '--fcompiler=gfortran'), > it is apparently also found in the right place. > Could you send us the last lines of output with the error itself, or possibly > everything following a line starting with "Traceback:.." ; > and also the output of `gfortran -v`? > > Cheers, > Derek > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From derek at astro.physik.uni-goettingen.de Tue Dec 20 15:23:24 2011 From: derek at astro.physik.uni-goettingen.de (Derek Homeier) Date: Tue, 20 Dec 2011 21:23:24 +0100 Subject: [Numpy-discussion] numpy1.6.1 install fortran compiler error In-Reply-To: References: , <2E61F130-F107-45A6-B3F0-E61F9535C27A@astro.physik.uni-goettingen.de> Message-ID: <5D923135-1B24-4917-832A-0F3CCC850CC9@astro.physik.uni-goettingen.de> On 20.12.2011, at 9:01PM, Jack Bryan wrote: > customize Gnu95FCompiler using config > C compiler: gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC > > compile options: '-Inumpy/core/src/private -Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/include -I/remote/dcnl/Ding/backup_20100716/python272/include/python2.7 -c' > gcc: _configtest.c > gcc -pthread _configtest.o -o _configtest > success! > removing: _configtest.c _configtest.o _configtest > C compiler: gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC > > > ........................ > > compile options: '-Inumpy/core/src/private -Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/include -I/mypath/python272/include/python2.7 -c' > gcc: _configtest.c > gcc -pthread _configtest.o -o _configtest > _configtest > failure. > The blas failures further up are non-fatal, but I am not sure about the _configtest.c, or why it once succeeds, then fails again - anyway the installation appears to have finished. 
> at the end:
>
> running install_egg_info
> Removing /mypath/numpy/lib/python2.7/site-packages/numpy-1.6.1-py2.7.egg-info
> Writing /mypath/numpy/lib/python2.7/site-packages/numpy-1.6.1-py2.7.egg-info
> running install_clib
>
> Then
>
> I got:
>
> python
> Python 2.7.2 (default, Dec 20 2011, 12:32:10)
> [GCC 4.1.2 20080704 (Red Hat 4.1.2-51)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
>
> >>> import numpy
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ImportError: No module named numpy
>
> I have updated PATH for bin and lib of numpy.
>
You will need '/mypath/numpy/lib/python2.7/site-packages' in your PYTHONPATH - have you done that, and does it show up with

>>> import sys
>>> sys.path

in the Python shell?

Cheers,
Derek

From robert.kern at gmail.com Tue Dec 20 15:27:56 2011 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 20 Dec 2011 20:27:56 +0000 Subject: [Numpy-discussion] numpy1.6.1 install fortran compiler error In-Reply-To: <5D923135-1B24-4917-832A-0F3CCC850CC9@astro.physik.uni-goettingen.de> References: <2E61F130-F107-45A6-B3F0-E61F9535C27A@astro.physik.uni-goettingen.de> <5D923135-1B24-4917-832A-0F3CCC850CC9@astro.physik.uni-goettingen.de> Message-ID: On Tue, Dec 20, 2011 at 20:23, Derek Homeier wrote: > On 20.12.2011, at 9:01PM, Jack Bryan wrote: > >> customize Gnu95FCompiler using config >> C compiler: gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC >> >> compile options: '-Inumpy/core/src/private -Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/include -I/remote/dcnl/Ding/backup_20100716/python272/include/python2.7 -c' >> gcc: _configtest.c >> gcc -pthread _configtest.o -o _configtest >> success! >> removing: _configtest.c _configtest.o _configtest >> C compiler: gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC >> >> ........................ >> >> compile options: '-Inumpy/core/src/private -Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/include -I/mypath/python272/include/python2.7 -c' >> gcc: _configtest.c >> gcc -pthread _configtest.o -o _configtest >> _configtest >> failure. >> > The blas failures further up are non-fatal, but I am not sure about the _configtest.c, > or why it once succeeds, then fails again - anyway the installation appears to have > finished. They are different files. In order to determine your configuration, the setup.py will try to compile many different small C programs. All of them are named _configtest. You can ignore all of these unless you think the discovered configuration is incorrect somehow. A "failure" here only means that your system does not provide what is being tested for. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From ralf.gommers at googlemail.com Tue Dec 20 15:28:55 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Tue, 20 Dec 2011 21:28:55 +0100 Subject: [Numpy-discussion] numpy 1.7.0 release?
In-Reply-To: References: Message-ID: On Tue, Dec 20, 2011 at 3:18 PM, Charles R Harris wrote: > Hi Ralf, > > On Mon, Dec 5, 2011 at 12:43 PM, Ralf Gommers > wrote: > >> Hi all, >> >> It's been a little over 6 months since the release of 1.6.0 and the NA >> debate has quieted down, so I'd like to ask your opinion on the timing of >> 1.7.0. It looks to me like we have a healthy amount of bug fixes and small >> improvements, plus three larger chucks of work: >> >> - datetime >> - NA >> - Bento support >> >> My impression is that both datetime and NA are releasable, but should be >> labeled "tech preview" or something similar, because they may still see >> significant changes. Please correct me if I'm wrong. >> >> There's still some maintenance work to do and pull requests to merge, but >> a beta release by Christmas should be feasible. What do you all think? >> >> > I'm now thinking that is too optimistic. There are a fair number of > tickets that need to be looked at, including some for einsum and the > iterator, and I think the number of pull requests needs to be reduced. How > about sometime in the beginning of January? > > Yes, it certainly was. Besides the tickets and pull requests, we also need the support for MinGW 4.x that David is looking at. If that goes smoothly then the first week of January may be feasible, otherwise it'll have to be February (I'm traveling for most of Jan). Or someone else has to volunteer to be the release manager for this release. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From rowen at uw.edu Tue Dec 20 16:32:46 2011 From: rowen at uw.edu (Russell E. Owen) Date: Tue, 20 Dec 2011 13:32:46 -0800 Subject: [Numpy-discussion] trouble building numpy 1.6.1 on Scientific Linux 5 References: Message-ID: In article , Olivier Delalleau wrote: > 2011/12/12 Russell E. Owen > > > In article > > , > > Ralf Gommers wrote: > > > > > On Fri, Dec 9, 2011 at 8:02 PM, Russell E. Owen wrote: > > > > > > > I'm trying to build numpy 1.6.1 on Scientific Linux 5 but the unit > > tests > > > > claim the wrong version of fortran was used. I thought I knew how to > > > > avoid that, but it's not working. > > > > > > > >...(elided text that suggests numpy is building using g77 even though I > > asked for gfortran)... > > > > > > > > Any suggestions on how to fix this? > > > > > > > > > > I assume you have g77 installed and on your PATH. If so, try moving it > > off > > > your path. > > > > Yes. I would have tried that if I had known how to do it (though I'm > > puzzled why it would be wanted since I told the installer to use > > gfortran). > > > > The problem is that g77 is in /usr/bin/ and I don't have root privs on > > this system. > > > > -- Russell > > > > You could create a link g77 -> gfortran and make sure this link comes first > in your PATH. > (That's assuming command lines for g77 and gfortran are compatible -- I > don't know if that's the case). Interesting idea. I gave it a try (see P.S.), but it didn't help. I get the same error in the unit test. -- Russell P.S. -bash-3.2$ which g77 ~/local/bin/g77 -bash-3.2$ ls -l ~/local/bin/g77 lrwxrwxrwx 1 rowen astro 19 Dec 20 10:59 /astro/users/rowen/local/bin/g77 -> /usr/bin/gfortran44 -bash-3.2$ g77 --version GNU Fortran (GCC) 4.4.0 20090514 (Red Hat 4.4.0-6) Copyright (C) 2009 Free Software Foundation, Inc. GNU Fortran comes with NO WARRANTY, to the extent permitted by law. You may redistribute copies of GNU Fortran under the terms of the GNU General Public License. 
For more information about these matters, see the file named COPYING From rowen at uw.edu Tue Dec 20 16:52:10 2011 From: rowen at uw.edu (Russell E. Owen) Date: Tue, 20 Dec 2011 13:52:10 -0800 Subject: [Numpy-discussion] trouble building numpy 1.6.1 on Scientific Linux 5 References: Message-ID: In article , "Russell E. Owen" wrote: > In article > , > Ralf Gommers wrote: > > > On Fri, Dec 9, 2011 at 8:02 PM, Russell E. Owen wrote: > > > > > I'm trying to build numpy 1.6.1 on Scientific Linux 5 but the unit tests > > > claim the wrong version of fortran was used. I thought I knew how to > > > avoid that, but it's not working. > > > > > >...(elided text that suggests numpy is building using g77 even though I > > >asked for gfortran)... > > > > > > Any suggestions on how to fix this? > > > > > > > I assume you have g77 installed and on your PATH. If so, try moving it off > > your path. > > Yes. I would have tried that if I had known how to do it (though I'm > puzzled why it would be wanted since I told the installer to use > gfortran). > > The problem is that g77 is in /usr/bin/ and I don't have root privs on > this system. I'm starting to suspect this is a bug in the unit test, not the building of numpy. The unit test complains: Traceback (most recent call last): File "/astro/users/rowen/local/lib/python/numpy/testing/decorators.py", line 146, in skipper_func return f(*args, **kwargs) File "/astro/users/rowen/local/lib/python/numpy/linalg/tests/test_build.py", line 50, in test_lapack information.""") AssertionError: Both g77 and gfortran runtimes linked in lapack_lite ! This is likely to but when I run ldd on numpy/linalg/lapack_lite.so I get: -bash-3.2$ ldd /astro/users/rowen/local/lib/python/numpy/linalg/lapack_lite.so linux-vdso.so.1 => (0x00007fff0cff0000) liblapack.so.3 => /usr/lib64/liblapack.so.3 (0x00002acadd738000) libblas.so.3 => /usr/lib64/libblas.so.3 (0x00002acadde42000) libgfortran.so.3 => /usr/lib64/libgfortran.so.3 (0x00002acade096000) libm.so.6 => /lib64/libm.so.6 (0x00002acade380000) libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00002acade604000) libc.so.6 => /lib64/libc.so.6 (0x00002acade812000) libgfortran.so.1 => /usr/lib64/libgfortran.so.1 (0x00002acadeb6a000) /lib64/ld-linux-x86-64.so.2 (0x0000003b2ba00000) The build instructions say (sic): One relatively simple and reliable way to check for the compiler used to build a library is to use ldd on the library. If libg2c.so is a dependency, this means that g77 has been used. If libgfortran.so is a a dependency, gfortran has been used. If both are dependencies, this means both have been used, which is almost always a very bad idea. I don't see any sign of libg2c.so. Is there some other evidence that numpy/linalg/lapack_lite.so is build against both g77 and gfortran, or is the unit test result wrong or...? -- Russell From ralf.gommers at googlemail.com Tue Dec 20 17:04:08 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Tue, 20 Dec 2011 23:04:08 +0100 Subject: [Numpy-discussion] trouble building numpy 1.6.1 on Scientific Linux 5 In-Reply-To: References: Message-ID: On Tue, Dec 20, 2011 at 10:52 PM, Russell E. Owen wrote: > In article , > "Russell E. Owen" wrote: > > > In article > > , > > Ralf Gommers wrote: > > > > > On Fri, Dec 9, 2011 at 8:02 PM, Russell E. Owen wrote: > > > > > > > I'm trying to build numpy 1.6.1 on Scientific Linux 5 but the unit > tests > > > > claim the wrong version of fortran was used. I thought I knew how to > > > > avoid that, but it's not working. 
> > > > > > > >...(elided text that suggests numpy is building using g77 even though > I > > > >asked for gfortran)... > > > > > > > > Any suggestions on how to fix this? > > > > > > > > > > I assume you have g77 installed and on your PATH. If so, try moving it > off > > > your path. > > > > Yes. I would have tried that if I had known how to do it (though I'm > > puzzled why it would be wanted since I told the installer to use > > gfortran). > > > > The problem is that g77 is in /usr/bin/ and I don't have root privs on > > this system. > > The explanation of why g77 is still picked up, and a possible solution: http://thread.gmane.org/gmane.comp.python.numeric.general/13820/focus=13826 Ralf > I'm starting to suspect this is a bug in the unit test, not the building > of numpy. The unit test complains: > Traceback (most recent call last): > File > "/astro/users/rowen/local/lib/python/numpy/testing/decorators.py", line > 146, in skipper_func > return f(*args, **kwargs) > File > "/astro/users/rowen/local/lib/python/numpy/linalg/tests/test_build.py", > line 50, in test_lapack > information.""") > AssertionError: Both g77 and gfortran runtimes linked in lapack_lite ! > This is likely to > > but when I run ldd on numpy/linalg/lapack_lite.so I get: > -bash-3.2$ ldd > /astro/users/rowen/local/lib/python/numpy/linalg/lapack_lite.so > linux-vdso.so.1 => (0x00007fff0cff0000) > liblapack.so.3 => /usr/lib64/liblapack.so.3 (0x00002acadd738000) > libblas.so.3 => /usr/lib64/libblas.so.3 (0x00002acadde42000) > libgfortran.so.3 => /usr/lib64/libgfortran.so.3 (0x00002acade096000) > libm.so.6 => /lib64/libm.so.6 (0x00002acade380000) > libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00002acade604000) > libc.so.6 => /lib64/libc.so.6 (0x00002acade812000) > libgfortran.so.1 => /usr/lib64/libgfortran.so.1 (0x00002acadeb6a000) > /lib64/ld-linux-x86-64.so.2 (0x0000003b2ba00000) > > The build instructions say (sic): > One relatively simple and reliable way to check for the compiler used to > build a library is to use ldd on the library. If libg2c.so is a > dependency, this means that g77 has been used. If libgfortran.so is a a > dependency, gfortran has been used. If both are dependencies, this means > both have been used, which is almost always a very bad idea. > > I don't see any sign of libg2c.so. > > Is there some other evidence that numpy/linalg/lapack_lite.so is build > against both g77 and gfortran, or is the unit test result wrong or...? > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From dtustudy68 at hotmail.com Tue Dec 20 17:16:03 2011 From: dtustudy68 at hotmail.com (Jack Bryan) Date: Tue, 20 Dec 2011 15:16:03 -0700 Subject: [Numpy-discussion] numpy1.6.1 install fortran compiler error In-Reply-To: <5D923135-1B24-4917-832A-0F3CCC850CC9@astro.physik.uni-goettingen.de> References: , , <2E61F130-F107-45A6-B3F0-E61F9535C27A@astro.physik.uni-goettingen.de>, , <5D923135-1B24-4917-832A-0F3CCC850CC9@astro.physik.uni-goettingen.de> Message-ID: Hi, I have set up PYTHONPATH:

>>> sys.path
['', '/mypath/numpy/lib/python2.7/site-packages', '/mypath/python272/lib/python27.zip', '/mypath/python272/lib/python2.7', '/mypath/python272/lib/python2.7/plat-linux2', '/mypath/python272/lib/python2.7/lib-tk', '/mypath/python272/lib/python2.7/lib-old', '/mypath/python272/lib/python2.7/lib-dynload', '/mypath/python272/lib/python2.7/site-packages']

But still errors:

>>> import numpy
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "numpy/__init__.py", line 153, in <module>
    import polynomial
  File "numpy/polynomial/__init__.py", line 18, in <module>
    from polynomial import Polynomial
  File "numpy/polynomial/polynomial.py", line 60, in <module>
    from polytemplate import polytemplate
ImportError: cannot import name polytemplate

Any help is appreciated. Thanks

> From: derek at astro.physik.uni-goettingen.de
> Date: Tue, 20 Dec 2011 21:23:24 +0100
> To: numpy-discussion at scipy.org
> Subject: Re: [Numpy-discussion] numpy1.6.1 install fortran compiler error
>
> On 20.12.2011, at 9:01PM, Jack Bryan wrote:
>
> > customize Gnu95FCompiler using config
> > C compiler: gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC
> >
> > compile options: '-Inumpy/core/src/private -Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/include -I/remote/dcnl/Ding/backup_20100716/python272/include/python2.7 -c'
> > gcc: _configtest.c
> > gcc -pthread _configtest.o -o _configtest
> > success!
> > removing: _configtest.c _configtest.o _configtest
> > C compiler: gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC
> >
> > ........................
> >
> > compile options: '-Inumpy/core/src/private -Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/include -I/mypath/python272/include/python2.7 -c'
> > gcc: _configtest.c
> > gcc -pthread _configtest.o -o _configtest
> > _configtest
> > failure.
> >
> The blas failures further up are non-fatal, but I am not sure about the _configtest.c,
> or why it once succeeds, then fails again - anyway the installation appears to have
> finished.
>
> > at the end:
> >
> > running install_egg_info
> > Removing /mypath/numpy/lib/python2.7/site-packages/numpy-1.6.1-py2.7.egg-info
> > Writing /mypath/numpy/lib/python2.7/site-packages/numpy-1.6.1-py2.7.egg-info
> > running install_clib
> >
> > Then
> >
> > I got:
> >
> > python
> > Python 2.7.2 (default, Dec 20 2011, 12:32:10)
> > [GCC 4.1.2 20080704 (Red Hat 4.1.2-51)] on linux2
> > Type "help", "copyright", "credits" or "license" for more information.
> >
> > >>> import numpy
> > Traceback (most recent call last):
> >   File "<stdin>", line 1, in <module>
> > ImportError: No module named numpy
> >
> > I have updated PATH for bin and lib of numpy.
> >
> You will need '/mypath/numpy/lib/python2.7/site-packages' in your
> PYTHONPATH - have you done that, and does it show up with
>
> >>> import sys
> >>> sys.path
>
> in the Python shell?
> > Cheers, > Derek > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From mangabasi at gmail.com Tue Dec 20 17:21:44 2011 From: mangabasi at gmail.com (=?UTF-8?Q?Fahredd=C4=B1n_Basegmez?=) Date: Tue, 20 Dec 2011 17:21:44 -0500 Subject: [Numpy-discussion] Getting non-normalized eigenvectors from generalized eigenvalue solution? Message-ID: Howdy, Is it possible to get non-normalized eigenvectors from scipy.linalg.eig(a, b)? Preferably just by using numpy. BTW, Matlab/Octave provides this with its eig(a, b) function but I would like to use numpy for obvious reasons. Regards, Fahri -------------- next part -------------- An HTML attachment was scrubbed... URL: From shish at keba.be Tue Dec 20 20:10:56 2011 From: shish at keba.be (Olivier Delalleau) Date: Tue, 20 Dec 2011 20:10:56 -0500 Subject: [Numpy-discussion] Getting non-normalized eigenvectors from generalized eigenvalue solution? In-Reply-To: References: Message-ID: I'm probably missing something, but... Why would you want non-normalized eigenvectors? -=- Olivier 2011/12/20 Fahredd?n Basegmez > Howdy, > > Is it possible to get non-normalized eigenvectors from scipy.linalg.eig(a, > b)? Preferably just by using numpy. > > BTW, Matlab/Octave provides this with its eig(a, b) function but I would > like to use numpy for obvious reasons. > > Regards, > > Fahri > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mangabasi at gmail.com Tue Dec 20 20:29:17 2011 From: mangabasi at gmail.com (=?UTF-8?Q?Fahredd=C4=B1n_Basegmez?=) Date: Tue, 20 Dec 2011 20:29:17 -0500 Subject: [Numpy-discussion] Getting non-normalized eigenvectors from generalized eigenvalue solution? In-Reply-To: References: Message-ID: I am computing normal-mode frequency response of a mass-spring system. The algorithm I am using requires it. On Tue, Dec 20, 2011 at 8:10 PM, Olivier Delalleau wrote: > I'm probably missing something, but... Why would you want non-normalized > eigenvectors? > > -=- Olivier > > > 2011/12/20 Fahredd?n Basegmez > >> Howdy, >> >> Is it possible to get non-normalized eigenvectors from >> scipy.linalg.eig(a, b)? Preferably just by using numpy. >> >> BTW, Matlab/Octave provides this with its eig(a, b) function but I would >> like to use numpy for obvious reasons. >> >> Regards, >> >> Fahri >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From shish at keba.be Tue Dec 20 20:40:16 2011 From: shish at keba.be (Olivier Delalleau) Date: Tue, 20 Dec 2011 20:40:16 -0500 Subject: [Numpy-discussion] Getting non-normalized eigenvectors from generalized eigenvalue solution? In-Reply-To: References: Message-ID: Hmm... ok ;) (sorry, I can't follow you there) Anyway, what kind of non-normalization are you after? I looked at the doc for Matlab and it just says eigenvectors are not normalized, without additional details... so it looks like it could be anything. -=- Olivier 2011/12/20 Fahredd?n Basegmez > I am computing normal-mode frequency response of a mass-spring system. > The algorithm I am using requires it. > > On Tue, Dec 20, 2011 at 8:10 PM, Olivier Delalleau wrote: > >> I'm probably missing something, but... 
From shish at keba.be Tue Dec 20 20:40:16 2011
From: shish at keba.be (Olivier Delalleau)
Date: Tue, 20 Dec 2011 20:40:16 -0500
Subject: [Numpy-discussion] Getting non-normalized eigenvectors from generalized eigenvalue solution?
In-Reply-To:
References:
Message-ID:

Hmm... ok ;) (sorry, I can't follow you there)

Anyway, what kind of non-normalization are you after? I looked at the doc
for Matlab and it just says eigenvectors are not normalized, without
additional details... so it looks like it could be anything.

-=- Olivier

2011/12/20 Fahreddın Basegmez
> [clip]

From mangabasi at gmail.com Tue Dec 20 21:14:11 2011
From: mangabasi at gmail.com (Fahreddın Basegmez)
Date: Tue, 20 Dec 2011 21:14:11 -0500
Subject: [Numpy-discussion] Getting non-normalized eigenvectors from generalized eigenvalue solution?
In-Reply-To:
References:
Message-ID:

If I can get the same response as Matlab I would be all set.

Octave results:

    >> STIFM
    STIFM =

    Diagonal Matrix

       1020      0      0       0       0       0
          0   1020      0       0       0       0
          0      0   1020       0       0       0
          0      0      0  102000       0       0
          0      0      0       0  102000       0
          0      0      0       0       0  204000

    >> MASSM
    MASSM =

    Diagonal Matrix

       0.25907        0        0         0         0         0
             0  0.25907        0         0         0         0
             0        0  0.25907         0         0         0
             0        0        0  26.00000         0         0
             0        0        0         0  26.00000         0
             0        0        0         0         0  26.00000

    >> [a, b] = eig(STIFM, MASSM)
    a =

       0.00000   0.00000   0.00000   1.96468   0.00000   0.00000
       0.00000   0.00000   0.00000   0.00000   1.96468   0.00000
       0.00000   0.00000   1.96468   0.00000   0.00000   0.00000
       0.19612   0.00000   0.00000   0.00000   0.00000   0.00000
       0.00000   0.19612   0.00000   0.00000   0.00000   0.00000
       0.00000   0.00000   0.00000   0.00000   0.00000   0.19612

    b =

    Diagonal Matrix

       3923.1        0        0        0        0        0
            0   3923.1        0        0        0        0
            0        0   3937.2        0        0        0
            0        0        0   3937.2        0        0
            0        0        0        0   3937.2        0
            0        0        0        0        0   7846.2

Numpy results:

    >>> STIFM
    array([[   1020.,       0.,       0.,       0.,       0.,       0.],
           [      0.,    1020.,       0.,       0.,       0.,       0.],
           [      0.,       0.,    1020.,       0.,       0.,       0.],
           [      0.,       0.,       0.,  102000.,       0.,       0.],
           [      0.,       0.,       0.,       0.,  102000.,       0.],
           [      0.,       0.,       0.,       0.,       0.,  204000.]])

    >>> MASSM
    array([[  0.25907,   0.     ,   0.     ,   0.     ,   0.     ,   0.     ],
           [  0.     ,   0.25907,   0.     ,   0.     ,   0.     ,   0.     ],
           [  0.     ,   0.     ,   0.25907,   0.     ,   0.     ,   0.     ],
           [  0.     ,   0.     ,   0.     ,  26.     ,   0.     ,   0.     ],
           [  0.     ,   0.     ,   0.     ,   0.     ,  26.     ,   0.     ],
           [  0.     ,   0.     ,   0.     ,   0.     ,   0.     ,  26.     ]])

    >>> a, b = linalg.eig(dot(linalg.pinv(MASSM), STIFM))

    >>> a
    array([ 3937.15984097,  3937.15984097,  3937.15984097,  3923.07692308,
            3923.07692308,  7846.15384615])

    >>> b
    array([[ 1.,  0.,  0.,  0.,  0.,  0.],
           [ 0.,  1.,  0.,  0.,  0.,  0.],
           [ 0.,  0.,  1.,  0.,  0.,  0.],
           [ 0.,  0.,  0.,  1.,  0.,  0.],
           [ 0.,  0.,  0.,  0.,  1.,  0.],
           [ 0.,  0.,  0.,  0.,  0.,  1.]])

On Tue, Dec 20, 2011 at 8:40 PM, Olivier Delalleau wrote:
> [clip]
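A note on the two outputs above: the difference is only one of scaling. Each eigenvector column in the Octave result appears to satisfy x' * MASSM * x = 1, i.e. the vectors look mass-normalized. This can be checked directly from the printed digits:

    print 1.96468**2 * 0.25907    # -> ~1.0000018
    print 0.19612**2 * 26.0       # -> ~1.0000394

Both products are 1 to within the five digits Octave prints, so the scaling is recoverable from the normalized numpy output (see the sketch after the scipy results below).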
From mangabasi at gmail.com Tue Dec 20 21:17:09 2011
From: mangabasi at gmail.com (Fahreddın Basegmez)
Date: Tue, 20 Dec 2011 21:17:09 -0500
Subject: [Numpy-discussion] Getting non-normalized eigenvectors from generalized eigenvalue solution?
In-Reply-To:
References:
Message-ID:

I should include the scipy response too, I guess.

    >>> scipy.linalg.eig(STIFM, MASSM)
    (array([ 3937.15984097+0.j,  3937.15984097+0.j,  3937.15984097+0.j,
            3923.07692308+0.j,  3923.07692308+0.j,  7846.15384615+0.j]),
     array([[ 1.,  0.,  0.,  0.,  0.,  0.],
            [ 0.,  1.,  0.,  0.,  0.,  0.],
            [ 0.,  0.,  1.,  0.,  0.,  0.],
            [ 0.,  0.,  0.,  1.,  0.,  0.],
            [ 0.,  0.,  0.,  0.,  1.,  0.],
            [ 0.,  0.,  0.,  0.,  0.,  1.]]))

On Tue, Dec 20, 2011 at 9:14 PM, Fahreddın Basegmez wrote:
> [clip]
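Assuming the mass normalization checked above is indeed the scaling Octave/Matlab apply for a symmetric-definite pencil (that is an inference from the printed numbers, not from their documentation), the same vectors can be recovered from the unit-scaled numpy/scipy output by rescaling each column, and scipy.linalg.eigh produces that scaling directly. A sketch, with the matrix values copied from the thread:

    import numpy as np
    from scipy import linalg

    K = np.diag([1020., 1020., 1020., 102000., 102000., 204000.])  # STIFM
    M = np.diag([0.25907, 0.25907, 0.25907, 26., 26., 26.])        # MASSM

    # Rescale the eigenvectors (whatever their initial scaling) so that
    # each column x satisfies dot(x, dot(M, x)) == 1.
    w, v = linalg.eig(K, M)
    v_mass = v / np.sqrt(np.diag(np.dot(v.T, np.dot(M, v))))

    # For symmetric K and positive-definite M, eigh solves the generalized
    # problem and returns M-orthonormal eigenvectors directly, i.e.
    # dot(v2.T, dot(M, v2)) is (approximately) the identity.
    w2, v2 = linalg.eigh(K, M)
    print w2    # eigenvalues (squared mode frequencies), ascending
    print v2    # columns scaled like the Octave output above

Using eigh also sidesteps the pinv-based workaround shown earlier, which turns the problem into an ordinary eigenproblem for inv(M)*K and discards the symmetric structure.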
From charlesr.harris at gmail.com Tue Dec 20 21:19:23 2011
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 20 Dec 2011 19:19:23 -0700
Subject: [Numpy-discussion] numpy1.6.1 install fortran compiler error
In-Reply-To:
References: <2E61F130-F107-45A6-B3F0-E61F9535C27A@astro.physik.uni-goettingen.de>
	<5D923135-1B24-4917-832A-0F3CCC850CC9@astro.physik.uni-goettingen.de>
Message-ID:

On Tue, Dec 20, 2011 at 3:16 PM, Jack Bryan wrote:
> [clip]

Strange. Is polytemplate.py present in /lib/python2.7/site-packages/numpy/polynomial/? You might try deleting the current numpy installation and build directory, then rebuilding and reinstalling. But I suspect something strange is going on with the paths.

BTW, it is usual on the list to bottom post rather than top post, that is, your reply should be located under the previous replies.

Chuck

From irving at naml.us Tue Dec 20 21:24:44 2011
From: irving at naml.us (Geoffrey Irving)
Date: Tue, 20 Dec 2011 17:24:44 -0900
Subject: [Numpy-discussion] test code for user defined types in numpy
Message-ID:

Hello,

As a followup to the prior thread on bugs in user-defined types in numpy, I converted my rational number class from C++ to C and switched to 32 bits to remove the need for unportable 128-bit numbers. It should be usable as a fairly thorough test case for user-defined types now. It does rather more than a minimal test case would need to do, but that isn't a problem unless you're concerned about code size. Let me know if any further changes are needed before it's suitable for inclusion in numpy as a test case. The repository is here:

    https://github.com/girving/rational

The tests run under either py.test or nose.

For completeness, my branch fixing all but one of the bugs I found in numpy user-defined types is here:

    https://github.com/girving/numpy/tree/fixuserloops

The remaining bug is that numpy incorrectly releases the GIL during casts even though NPY_NEEDS_API is set. The resulting crash goes away if the line defining ACQUIRE_GIL is uncommented. With the necessary locks in place, all my tests pass with my branch of numpy. I haven't tracked this one down and fixed it yet, but it shouldn't be hard to do so.

Geoffrey
From charlesr.harris at gmail.com Tue Dec 20 21:30:44 2011
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 20 Dec 2011 19:30:44 -0700
Subject: [Numpy-discussion] test code for user defined types in numpy
In-Reply-To:
References:
Message-ID:

On Tue, Dec 20, 2011 at 7:24 PM, Geoffrey Irving wrote:
> [clip]

Hey, that's great! Thanks for getting this all together.

Chuck

From shish at keba.be Tue Dec 20 21:45:59 2011
From: shish at keba.be (Olivier Delalleau)
Date: Tue, 20 Dec 2011 21:45:59 -0500
Subject: [Numpy-discussion] Getting non-normalized eigenvectors from generalized eigenvalue solution?
In-Reply-To:
References:
Message-ID:

Hmm, sorry, I don't see any obvious logic that would explain how Octave
obtains this result, although of course there is probably some logic...

Anyway, since you seem to know what you want, can't you obtain the same
result by doing whatever un-normalizing operation you are after?

-=- Olivier

2011/12/20 Fahreddın Basegmez
> [clip]
From mangabasi at gmail.com Tue Dec 20 22:05:06 2011
From: mangabasi at gmail.com (Fahreddın Basegmez)
Date: Tue, 20 Dec 2011 22:05:06 -0500
Subject: [Numpy-discussion] Getting non-normalized eigenvectors from generalized eigenvalue solution?
In-Reply-To:
References:
Message-ID:

I don't think I can do that. I can go to the normalized results but not
the other way.

On Tue, Dec 20, 2011 at 9:45 PM, Olivier Delalleau wrote:
> [clip]
From questions.anon at gmail.com Tue Dec 20 22:08:41 2011
From: questions.anon at gmail.com (questions anon)
Date: Wed, 21 Dec 2011 14:08:41 +1100
Subject: [Numpy-discussion] find location of maximum values
Message-ID:

I have a netcdf file that contains hourly temperature data for a whole month. I would like to find the maximum temperature within that file, along with the corresponding latitude, longitude and time, and then plot this. Below is the code I have so far. I think everything is working except for identifying the corresponding latitude and longitude. The latitude and longitude always seem to be in the same area (from file to file), and they are not where the maximum values are occurring.
Any feedback will be greatly appreciated.
    from netCDF4 import Dataset
    import matplotlib.pyplot as plt
    import numpy as N
    from mpl_toolkits.basemap import Basemap
    from netcdftime import utime
    from datetime import datetime
    from numpy import ma as MA
    import os

    TSFCall = []

    for (path, dirs, files) in os.walk(MainFolder):
        for dir in dirs:
            print dir
        path = path + '/'
        for ncfile in files:
            if ncfile[-3:] == '.nc':
                print "dealing with ncfiles:", path + ncfile
                ncfile = os.path.join(path, ncfile)
                ncfile = Dataset(ncfile, 'r+', 'NETCDF4')
                TSFC = ncfile.variables['T_SFC'][:]
                TIME = ncfile.variables['time'][:]
                LAT = ncfile.variables['latitude'][:]
                LON = ncfile.variables['longitude'][:]
                ncfile.close()
                TSFCall.append(TSFC)

    big_array = N.ma.concatenate(TSFCall)
    tmax = MA.max(TSFC)
    print tmax
    t = TIME[tmax]
    indexTIME = N.where(TIME == t)
    TSFCmax = TSFC[tmax]
    print t, indexTIME
    print LAT[tmax], LON[tmax], TSFCmax

    cdftime = utime('seconds since 1970-01-01 00:00:00')
    ncfiletime = cdftime.num2date(t)
    print ncfiletime
    timestr = str(ncfiletime)
    d = datetime.strptime(timestr, '%Y-%m-%d %H:%M:%S')
    date_string = d.strftime('%Y%m%d_%H%M')

    map = Basemap(projection='merc', llcrnrlat=-40, urcrnrlat=-33,
                  llcrnrlon=139.0, urcrnrlon=151.0, lat_ts=0, resolution='i')
    x, y = map(*N.meshgrid(LON, LAT))
    map.tissot(LON[tmax], LAT[tmax], 0.15, 100, facecolor='none',
               edgecolor='black', linewidth=2)
    map.readshapefile(shapefile1, 'DSE_REGIONS')
    map.drawcoastlines(linewidth=0.5)
    map.drawstates()
    plt.title('test' + ' %s' % date_string)
    ticks = [-5, 0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50]
    CS = map.contourf(x, y, TSFCmax, 15, cmap=plt.cm.jet)
    l, b, w, h = 0.1, 0.1, 0.8, 0.8
    cax = plt.axes([l + w + 0.025, b, 0.025, h], )
    cbar = plt.colorbar(CS, cax=cax, drawedges=True)

    plt.show()
    plt.close()

From shish at keba.be Tue Dec 20 22:15:51 2011
From: shish at keba.be (Olivier Delalleau)
Date: Tue, 20 Dec 2011 22:15:51 -0500
Subject: [Numpy-discussion] Getting non-normalized eigenvectors from generalized eigenvalue solution?
In-Reply-To:
References:
Message-ID:

What I don't get is that "un-normalized" eigenvectors can be pretty much
anything. If you care about the specific output of Matlab / Octave, it
means you understand the particular "un-normalization" that these programs
use. In that case you should be able to recover it from the normalized
output from numpy.

-=- Olivier

2011/12/20 Fahreddın Basegmez
> [clip]
From shish at keba.be Tue Dec 20 22:20:36 2011
From: shish at keba.be (Olivier Delalleau)
Date: Tue, 20 Dec 2011 22:20:36 -0500
Subject: [Numpy-discussion] find location of maximum values
In-Reply-To:
References:
Message-ID:

I'm sorry I don't have time to look closely at your code and this may not
be helpful, but just in case... I find it suspicious that you *seem* (by
quickly glancing at the code) to be taking TIME[max(temperature)] instead
of TIME[argmax(temperature)].

-=- Olivier

2011/12/20 questions anon
> [clip]
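In other words, MA.max(TSFC) returns the maximum temperature *value*, and using that value as an index into TIME/LAT/LON is what sends the lookup to an unrelated (or out-of-range) grid point. A sketch of the argmax-based lookup, assuming TSFC has shape (time, latitude, longitude) and TIME/LAT/LON are the matching 1-D coordinate arrays (that ordering is an assumption about the file, not something stated in the thread):

    import numpy as N

    # argmax gives the position of the maximum in the flattened array;
    # unravel_index turns it back into (time, lat, lon) indices.
    flatindex = N.ma.argmax(TSFC)
    tindex, latindex, lonindex = N.unravel_index(flatindex, TSFC.shape)

    tmax = TSFC[tindex, latindex, lonindex]     # the maximum temperature itself
    t = TIME[tindex]                            # when it occurred
    lat, lon = LAT[latindex], LON[lonindex]     # and where
    print t, lat, lon, tmax

If LAT and LON were 2-D fields of the same shape as one time slice, the last line would instead be LAT[latindex, lonindex] and LON[latindex, lonindex].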
From mangabasi at gmail.com Tue Dec 20 22:30:46 2011
From: mangabasi at gmail.com (Fahreddın Basegmez)
Date: Tue, 20 Dec 2011 22:30:46 -0500
Subject: [Numpy-discussion] Getting non-normalized eigenvectors from generalized eigenvalue solution?
In-Reply-To:
References:
Message-ID:

I think I am interested in the non-normalized eigenvectors, not the
un-normalized ones. Once the eig function computes the generalized
eigenvectors, I would like to use them as they are.

I would think this would be a common request, since the normal-mode
frequency response is used in many different fields, like chemical and
biomolecular sciences as well as engineering and physics. Mathematically
there may be no difference between the normalized and non-normalized
eigenvectors, but physically there is. In my case those values represent
deflections. An advantage of the normal modes is that you can apply
damping in each direction independently of the others. The amount of
damping we apply may depend on those deflections, so I would need to use
the non-normalized results.

On Tue, Dec 20, 2011 at 10:15 PM, Olivier Delalleau wrote:
> [clip]
From shish at keba.be Tue Dec 20 23:01:52 2011
From: shish at keba.be (Olivier Delalleau)
Date: Tue, 20 Dec 2011 23:01:52 -0500
Subject: [Numpy-discussion] Getting non-normalized eigenvectors from generalized eigenvalue solution?
In-Reply-To:
References:
Message-ID:

Ok well, I'm sorry, I have no idea what the difference would be between
"non-normalized" and "un-normalized".

In PCA you may decide to scale your eigenvectors by the inverse of the
square root of their corresponding eigenvalue, so that your projected data
has unit variance, but it doesn't seem to be what you're after.

Can you point to a link that explains what the "non-normalized
eigenvectors" are in your application?

-=- Olivier

2011/12/20 Fahreddın Basegmez
> [clip]
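For reference, the PCA-style scaling described here ("whitening") takes only a few lines with numpy alone; a sketch on toy data, unrelated to the mass-spring problem:

    import numpy as np

    X = np.random.randn(1000, 3) * [1., 3., 10.]   # toy data, rows = samples
    Xc = X - X.mean(axis=0)
    w, v = np.linalg.eigh(np.cov(Xc, rowvar=0))    # eigendecomposition of covariance
    Y = np.dot(Xc, v / np.sqrt(w))                 # scale each eigenvector by 1/sqrt(eigenvalue)
    print np.var(Y, axis=0)                        # -> approximately [1., 1., 1.]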
Amount of damping we apply may > be dependent on those deflections so I would need to use the non-normalized > results. > > > On Tue, Dec 20, 2011 at 10:15 PM, Olivier Delalleau wrote: > >> What I don't get is that "un-normalized" eigenvectors can be pretty much >> anything. If you care about the specific output of Matlab / Octave, it >> means you understand the particular "un-normalization" that these programs >> use. In that case you should be able to recover it from the normalized >> output from numpy. >> >> >> -=- Olivier >> >> 2011/12/20 Fahredd?n Basegmez >> >>> I don't think I can do that. I can go to the normalized results but not >>> the other way. >>> >>> >>> On Tue, Dec 20, 2011 at 9:45 PM, Olivier Delalleau wrote: >>> >>>> Hmm, sorry, I don't see any obvious logic that would explain how Octave >>>> obtains this result, although of course there is probably some logic... >>>> >>>> Anyway, since you seem to know what you want, can't you obtain the same >>>> result by doing whatever un-normalizing operation you are after? >>>> >>>> >>>> -=- Olivier >>>> >>>> 2011/12/20 Fahredd?n Basegmez >>>> >>>>> I should include the scipy response too I guess. >>>>> >>>>> >>>>> scipy.linalg.eig(STIFM, MASSM) >>>>> (array([ 3937.15984097+0.j, 3937.15984097+0.j, 3937.15984097+0.j, >>>>> 3923.07692308+0.j, 3923.07692308+0.j, 7846.15384615+0.j]), >>>>> array([[ 1., 0., 0., 0., 0., 0.], >>>>> [ 0., 1., 0., 0., 0., 0.], >>>>> [ 0., 0., 1., 0., 0., 0.], >>>>> [ 0., 0., 0., 1., 0., 0.], >>>>> [ 0., 0., 0., 0., 1., 0.], >>>>> [ 0., 0., 0., 0., 0., 1.]])) >>>>> >>>>> On Tue, Dec 20, 2011 at 9:14 PM, Fahredd?n Basegmez < >>>>> mangabasi at gmail.com> wrote: >>>>> >>>>>> If I can get the same response as Matlab I would be all set. >>>>>> >>>>>> >>>>>> Octave results >>>>>> >>>>>> >> STIFM >>>>>> STIFM = >>>>>> >>>>>> Diagonal Matrix >>>>>> >>>>>> 1020 0 0 0 0 0 >>>>>> 0 1020 0 0 0 0 >>>>>> 0 0 1020 0 0 0 >>>>>> 0 0 0 102000 0 0 >>>>>> 0 0 0 0 102000 0 >>>>>> 0 0 0 0 0 204000 >>>>>> >>>>>> >> MASSM >>>>>> MASSM = >>>>>> >>>>>> Diagonal Matrix >>>>>> >>>>>> 0.25907 0 0 0 0 0 >>>>>> 0 0.25907 0 0 0 0 >>>>>> 0 0 0.25907 0 0 0 >>>>>> 0 0 0 26.00000 0 0 >>>>>> 0 0 0 0 26.00000 0 >>>>>> 0 0 0 0 0 26.00000 >>>>>> >>>>>> >> [a, b] = eig(STIFM, MASSM) >>>>>> a = >>>>>> >>>>>> 0.00000 0.00000 0.00000 1.96468 0.00000 0.00000 >>>>>> 0.00000 0.00000 0.00000 0.00000 1.96468 0.00000 >>>>>> 0.00000 0.00000 1.96468 0.00000 0.00000 0.00000 >>>>>> 0.19612 0.00000 0.00000 0.00000 0.00000 0.00000 >>>>>> 0.00000 0.19612 0.00000 0.00000 0.00000 0.00000 >>>>>> 0.00000 0.00000 0.00000 0.00000 0.00000 0.19612 >>>>>> >>>>>> b = >>>>>> >>>>>> Diagonal Matrix >>>>>> >>>>>> 3923.1 0 0 0 0 0 >>>>>> 0 3923.1 0 0 0 0 >>>>>> 0 0 3937.2 0 0 0 >>>>>> 0 0 0 3937.2 0 0 >>>>>> 0 0 0 0 3937.2 0 >>>>>> 0 0 0 0 0 7846.2 >>>>>> >>>>>> >>>>>> Numpy Results >>>>>> >>>>>> >>> STIFM >>>>>> array([[ 1020., 0., 0., 0., 0., 0.], >>>>>> [ 0., 1020., 0., 0., 0., 0.], >>>>>> [ 0., 0., 1020., 0., 0., 0.], >>>>>> [ 0., 0., 0., 102000., 0., 0.], >>>>>> [ 0., 0., 0., 0., 102000., 0.], >>>>>> [ 0., 0., 0., 0., 0., 204000.]]) >>>>>> >>>>>> >>> MASSM >>>>>> >>>>>> array([[ 0.25907, 0. , 0. , 0. , 0. , 0. >>>>>> ], >>>>>> [ 0. , 0.25907, 0. , 0. , 0. , 0. >>>>>> ], >>>>>> [ 0. , 0. , 0.25907, 0. , 0. , 0. >>>>>> ], >>>>>> [ 0. , 0. , 0. , 26. , 0. , 0. >>>>>> ], >>>>>> [ 0. , 0. , 0. , 0. , 26. , 0. >>>>>> ], >>>>>> [ 0. , 0. , 0. , 0. , 0. , 26. 
From questions.anon at gmail.com Tue Dec 20 23:01:58 2011
From: questions.anon at gmail.com (questions anon)
Date: Wed, 21 Dec 2011 15:01:58 +1100
Subject: [Numpy-discussion] find location of maximum values
In-Reply-To:
References:
Message-ID:

ok thanks, a quick try at using it resulted in:

    IndexError: index out of bounds

but I may need to do a bit more investigating to understand how it works.
thanks

On Wed, Dec 21, 2011 at 2:20 PM, Olivier Delalleau wrote:
> [clip]
From oliphant at enthought.com Tue Dec 20 23:02:03 2011
From: oliphant at enthought.com (Travis Oliphant)
Date: Tue, 20 Dec 2011 23:02:03 -0500
Subject: [Numpy-discussion] test code for user defined types in numpy
In-Reply-To:
References:
Message-ID:

This is really excellent. I would like to take a stab at getting this
pulled into the code base --- and fixing the GIL issue --- if someone
hasn't beaten me to it.

Travis

--
Travis Oliphant (on a mobile)
512-826-7480
On Dec 20, 2011, at 9:24 PM, Geoffrey Irving wrote:
> [clip]

From mangabasi at gmail.com Tue Dec 20 23:14:48 2011
From: mangabasi at gmail.com (Fahreddın Basegmez)
Date: Tue, 20 Dec 2011 23:14:48 -0500
Subject: [Numpy-discussion] Getting non-normalized eigenvectors from generalized eigenvalue solution?
In-Reply-To:
References:
Message-ID:

Sorry about that. I don't think that terminology is commonly used. This is
what I mean: suppose I solve the equations and compute the eigenvalues and
eigenvectors for the given two matrices. I call these results
"non-normalized". They can then be normalized. Once they are normalized,
if I multiply them by any scalar they become "un-normalized"; they would
still be eigenvectors, but not necessarily the "non-normalized" ones. I
think there is a specific algorithm that Matlab uses to solve them, but I
do not know what that algorithm is.

On Tue, Dec 20, 2011 at 11:01 PM, Olivier Delalleau wrote:
> [clip]
, 0. , 0. >>>>>>> ], >>>>>>> [ 0. , 0. , 0. , 26. , 0. , 0. >>>>>>> ], >>>>>>> [ 0. , 0. , 0. , 0. , 26. , 0. >>>>>>> ], >>>>>>> [ 0. , 0. , 0. , 0. , 0. , 26. >>>>>>> ]]) >>>>>>> >>>>>>> >>> a, b = linalg.eig(dot( linalg.pinv(MASSM), STIFM)) >>>>>>> >>>>>>> >>> a >>>>>>> >>>>>>> array([ 3937.15984097, 3937.15984097, 3937.15984097, >>>>>>> 3923.07692308, >>>>>>> 3923.07692308, 7846.15384615]) >>>>>>> >>>>>>> >>> b >>>>>>> >>>>>>> array([[ 1., 0., 0., 0., 0., 0.], >>>>>>> [ 0., 1., 0., 0., 0., 0.], >>>>>>> [ 0., 0., 1., 0., 0., 0.], >>>>>>> [ 0., 0., 0., 1., 0., 0.], >>>>>>> [ 0., 0., 0., 0., 1., 0.], >>>>>>> [ 0., 0., 0., 0., 0., 1.]]) >>>>>>> >>>>>>> On Tue, Dec 20, 2011 at 8:40 PM, Olivier Delalleau wrote: >>>>>>> >>>>>>>> Hmm... ok ;) (sorry, I can't follow you there) >>>>>>>> >>>>>>>> Anyway, what kind of non-normalization are you after? I looked at >>>>>>>> the doc for Matlab and it just says eigenvectors are not normalized, >>>>>>>> without additional details... so it looks like it could be anything. >>>>>>>> >>>>>>>> >>>>>>>> -=- Olivier >>>>>>>> >>>>>>>> 2011/12/20 Fahredd?n Basegmez >>>>>>>> >>>>>>>>> I am computing normal-mode frequency response of a mass-spring >>>>>>>>> system. The algorithm I am using requires it. >>>>>>>>> >>>>>>>>> On Tue, Dec 20, 2011 at 8:10 PM, Olivier Delalleau wrote: >>>>>>>>> >>>>>>>>>> I'm probably missing something, but... Why would you want >>>>>>>>>> non-normalized eigenvectors? >>>>>>>>>> >>>>>>>>>> -=- Olivier >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> 2011/12/20 Fahredd?n Basegmez >>>>>>>>>> >>>>>>>>>>> Howdy, >>>>>>>>>>> >>>>>>>>>>> Is it possible to get non-normalized eigenvectors from >>>>>>>>>>> scipy.linalg.eig(a, b)? Preferably just by using numpy. >>>>>>>>>>> >>>>>>>>>>> BTW, Matlab/Octave provides this with its eig(a, b) function but >>>>>>>>>>> I would like to use numpy for obvious reasons. >>>>>>>>>>> >>>>>>>>>>> Regards, >>>>>>>>>>> >>>>>>>>>>> Fahri >>>>>>>>>>> >>>>>>>>>> >>>>>>>> _______________________________________________ >>>>>>>> NumPy-Discussion mailing list >>>>>>>> NumPy-Discussion at scipy.org >>>>>>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>>> _______________________________________________ >>>>>> NumPy-Discussion mailing list >>>>>> NumPy-Discussion at scipy.org >>>>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>>>> >>>>>> >>>>> >>>>> _______________________________________________ >>>>> NumPy-Discussion mailing list >>>>> NumPy-Discussion at scipy.org >>>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>>> >>>>> >>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> >>>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ben.root at ou.edu Tue Dec 20 23:38:57 2011 From: ben.root at ou.edu (Benjamin Root) Date: Tue, 20 Dec 2011 22:38:57 -0600 Subject: [Numpy-discussion] find location of maximum values In-Reply-To: References: Message-ID: On Tuesday, December 20, 2011, questions anon wrote: > ok thanks, a quick try at using it resulted in: > IndexError: index out of bounds > but I may need to do abit more investigating to understand how it works. > thanks The assumption is that these arrays are all the same shape. If not, then extra work is needed to figure out how to map indices of the temperature array to the indices of the lat and Lon arrays. Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From mwwiebe at gmail.com Wed Dec 21 00:10:56 2011 From: mwwiebe at gmail.com (Mark Wiebe) Date: Tue, 20 Dec 2011 21:10:56 -0800 Subject: [Numpy-discussion] test code for user defined types in numpy In-Reply-To: References: Message-ID: On Tue, Dec 20, 2011 at 6:24 PM, Geoffrey Irving wrote: > Hello, > > As a followup to the prior thread on bugs in user defined types in > numpy, I converted my rational number class from C++ to C and switched > to 32 bits to remove the need for unportable 128 bit numbers. It > should be usable as a fairly thorough test case for user defined types > now. It does rather more than a minimal test case would need to do, > but that isn't a problem unless you're concerned about code size. Let > me know if any further changes are needed before it's suitable for > inclusion in numpy as a test case. The repository is here: > > https://github.com/girving/rational > > The tests run under either py.test or nose. > > For completeness, my branch fixing all but one of the bugs I found in > numpy user defined types is here: > > https://github.com/girving/numpy/tree/fixuserloops > > The remaining bug is that numpy incorrectly releases the GIL during > casts even though NPY_NEEDS_API is set. The resulting crash goes away > if the line defining ACQUIRE_GIL is uncommented. With the necessary > locks in place, all my tests pass with my branch of numpy. I haven't > tracked this one down and fixed it yet, but it shouldn't be hard to do > so. > Looks great. I've added some comments to the pull request for the fixuserloops branch, which is here: https://github.com/numpy/numpy/pull/175 I would advise anyone with an interest in the low-level aspects of how NumPy's handling of the GIL and multi-threading/concurrency should evolve to take a look. Prior to anything I contributed, NumPy hardcoded whether to release the GIL during ufuncs or not. I added a needs_api flag in a few places to indicate whether the inner loop functions call the CPython API or not. Note that for ABI compatibility reasons, this flag is not 100% correctly integrated throughout NumPy. What Geoffrey is proposing here conflicts with the way I imagined the flag would be used, but supporting both of our ways of calling the inner loop seems useful to me. Take a look at the pull request for more details. Cheers, Mark > > Geoffrey > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From cjordan1 at uw.edu Wed Dec 21 01:48:08 2011 From: cjordan1 at uw.edu (Christopher Jordan-Squire) Date: Tue, 20 Dec 2011 22:48:08 -0800 Subject: [Numpy-discussion] test code for user defined types in numpy In-Reply-To: References: Message-ID: On Tue, Dec 20, 2011 at 9:10 PM, Mark Wiebe wrote: > On Tue, Dec 20, 2011 at 6:24 PM, Geoffrey Irving wrote: >> >> Hello, >> >> As a followup to the prior thread on bugs in user defined types in >> numpy, I converted my rational number class from C++ to C and switched >> to 32 bits to remove the need for unportable 128 bit numbers. ?It >> should be usable as a fairly thorough test case for user defined types >> now. ?It does rather more than a minimal test case would need to do, >> but that isn't a problem unless you're concerned about code size. ?Let >> me know if any further changes are needed before it's suitable for >> inclusion in numpy as a test case. ?The repository is here: >> >> ? ?https://github.com/girving/rational >> >> The tests run under either py.test or nose. >> >> For completeness, my branch fixing all but one of the bugs I found in >> numpy user defined types is here: >> >> ? ?https://github.com/girving/numpy/tree/fixuserloops >> >> The remaining bug is that numpy incorrectly releases the GIL during >> casts even though NPY_NEEDS_API is set. ?The resulting crash goes away >> if the line defining ACQUIRE_GIL is uncommented. ?With the necessary >> locks in place, all my tests pass with my branch of numpy. ?I haven't >> tracked this one down and fixed it yet, but it shouldn't be hard to do >> so. > > > Looks great. I've added some comments to the pull request for the > fixuserloops branch, which is here: > > https://github.com/numpy/numpy/pull/175 > > I would advise anyone with an interest in the low-level aspects of how > NumPy's handling of the GIL and multi-threading/concurrency should evolve to > take a look. Prior to anything I contributed, NumPy hardcoded whether to > release the GIL during ufuncs or not. I added a needs_api flag in a few So releasing the GIL wasn't something the user could get at? (I'm curious if this is something that should be mentioned in the ufunc tutorial on the numpy docs.) > places to indicate whether the inner loop functions call the CPython API or > not. Note that for ABI compatibility reasons, this flag is not 100% > correctly integrated throughout NumPy. Could you expand a little bit on this ABI compatibility issue? > > What Geoffrey is proposing here conflicts with the way I imagined the flag > would be used, but supporting both of our ways of calling the inner loop > seems useful to me. Take a look at the pull request for more details. > > Cheers, > Mark > >> >> >> Geoffrey >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From pge08aqw at studserv.uni-leipzig.de Wed Dec 21 04:11:08 2011 From: pge08aqw at studserv.uni-leipzig.de (Lennart Fricke) Date: Wed, 21 Dec 2011 10:11:08 +0100 Subject: [Numpy-discussion] Getting non-normalized eigenvectors from generalized eigenvalue solution? References: Message-ID: <8762haffwz.fsf@tee.ostfriesland> Dear Fahredd?n, I think, the norm of the eigenvectors corresponds to some generic amplitude. 
But that is something you cannot extract from the solution of
the eigenvalue problem but it depends on the initial deflection or
velocities.

So I think you should be able to use the normalized values just as well
as the non-, un- or not normalized ones.

Octave seems to normalize in such a way that transpose(Z).B.Z=I, where Z is
the matrix of eigenvectors, B is matrix B of the generalized eigenvalue
problem and I is the identity. It uses lapack functions. But that's only
true if A,B are symmetric. If not, it normalizes the magnitude of the
largest element of each eigenvector to 1.

I believe you can get it like that. If A is a matrix of normalization
factors, it is diagonal and Z.A contains the normalized column vectors.
Then it is:

transpose(Z.A).B.Z.A
=transpose(A).transpose(Z).B.Z.A
=A.transpose(Z).B.Z.A=I

and thus invert(A).invert(A)=transpose(Z).B.Z
As A is diagonal, invert(A) has the reciprocal elements on the diagonal,
so you can easily extract them:

A=diag(1/sqrt(diag(transpose(Z).B.Z)))

I hope that's correct.

Best Regards
Lennart

From shish at keba.be Wed Dec 21 07:01:46 2011
From: shish at keba.be (Olivier Delalleau)
Date: Wed, 21 Dec 2011 07:01:46 -0500
Subject: [Numpy-discussion] Getting non-normalized eigenvectors from generalized eigenvalue solution?
In-Reply-To: <8762haffwz.fsf@tee.ostfriesland>
References: <8762haffwz.fsf@tee.ostfriesland>
Message-ID:

Aaah, thanks a lot Lennart, I knew there had to be some logic to
Octave's output, but I couldn't see it...

-=- Olivier

2011/12/21 Lennart Fricke

> Dear Fahreddın,
> I think, the norm of the eigenvectors corresponds to some generic
> amplitude. But that is something you cannot extract from the solution of
> the eigenvalue problem but it depends on the initial deflection or
> velocities.
>
> So I think you should be able to use the normalized values just as well
> as the non-, un- or not normalized ones.
>
> Octave seems to normalize in such a way that transpose(Z).B.Z=I, where Z is
> the matrix of eigenvectors, B is matrix B of the generalized eigenvalue
> problem and I is the identity. It uses lapack functions. But that's only
> true if A,B are symmetric. If not, it normalizes the magnitude of the
> largest element of each eigenvector to 1.
>
> I believe you can get it like that. If A is a matrix of normalization
> factors, it is diagonal and Z.A contains the normalized column vectors.
> Then it is:
>
> transpose(Z.A).B.Z.A
> =transpose(A).transpose(Z).B.Z.A
> =A.transpose(Z).B.Z.A=I
>
> and thus invert(A).invert(A)=transpose(Z).B.Z
> As A is diagonal, invert(A) has the reciprocal elements on the diagonal,
> so you can easily extract them:
>
> A=diag(1/sqrt(diag(transpose(Z).B.Z)))
>
> I hope that's correct.
>
> Best Regards
> Lennart
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
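In NumPy/SciPy terms, Lennart's recipe can be sketched like this, reusing the
STIFM/MASSM matrices quoted earlier in the thread (K and M below are stand-ins
for them, and the symmetric case is assumed, as Lennart notes):

import numpy as np
from scipy import linalg

K = np.diag([1020.0, 1020.0, 1020.0, 102000.0, 102000.0, 204000.0])  # STIFM
M = np.diag([0.25907, 0.25907, 0.25907, 26.0, 26.0, 26.0])           # MASSM

# unit-length eigenvectors, as numpy.linalg.eig returns them
w, Z = np.linalg.eig(np.linalg.solve(M, K))

# Lennart's rescaling A = diag(1/sqrt(diag(transpose(Z).B.Z))), with B = M
scale = 1.0 / np.sqrt(np.diag(np.dot(np.dot(Z.T, M), Z)))
Zm = Z * scale   # columns of Zm now satisfy transpose(Zm).M.Zm = I

# for symmetric problems, scipy returns this normalization directly
w2, Zm2 = linalg.eigh(K, M)

Up to the ordering of the eigenvalues, the nonzero entries of Zm come out as
1/sqrt(0.25907) ~ 1.96468 and 1/sqrt(26) ~ 0.19612, which is exactly the
scaling in the Octave output quoted earlier in the thread.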
From a.h.jaffe at gmail.com Wed Dec 21 07:31:14 2011
From: a.h.jaffe at gmail.com (Andrew Jaffe)
Date: Wed, 21 Dec 2011 07:31:14 -0500
Subject: [Numpy-discussion] Getting non-normalized eigenvectors from generalized eigenvalue solution?
In-Reply-To:
References: <8762haffwz.fsf@tee.ostfriesland>
Message-ID: <4EF1D192.8010705@gmail.com>

Just to be completely clear, there is no such thing as a
"non-normalized" eigenvector. An eigenvector is only determined *up to
a scalar normalization*, which is obvious from the eigenvalue equation:

A v = l v

where A is the matrix, l is the eigenvalue, and v is the eigenvector.
Obviously v is only determined up to a constant factor. A given eigen
routine can return anything at all, but there is no native
"non-normalized" version.

Traditionally, you can decide to return "normalized" eigenvectors with
the scalar factor determined by norm(v)=1 for some suitable norm. (I
could imagine that an algorithm could depend on that.)

Andrew

On 21/12/2011 07:01, Olivier Delalleau wrote:
> Aaah, thanks a lot Lennart, I knew there had to be some logic to
> Octave's output, but I couldn't see it...
>
> -=- Olivier
>
> 2011/12/21 Lennart Fricke
>
> Dear Fahreddın,
> I think, the norm of the eigenvectors corresponds to some generic
> amplitude. But that is something you cannot extract from the solution of
> the eigenvalue problem but it depends on the initial deflection or
> velocities.
>
> So I think you should be able to use the normalized values just as well
> as the non-, un- or not normalized ones.
>
> Octave seems to normalize in such a way that transpose(Z).B.Z=I, where Z is
> the matrix of eigenvectors, B is matrix B of the generalized eigenvalue
> problem and I is the identity. It uses lapack functions. But that's only
> true if A,B are symmetric. If not, it normalizes the magnitude of the
> largest element of each eigenvector to 1.
>
> I believe you can get it like that. If A is a matrix of normalization
> factors, it is diagonal and Z.A contains the normalized column vectors.
> Then it is:
>
> transpose(Z.A).B.Z.A
> =transpose(A).transpose(Z).B.Z.A
> =A.transpose(Z).B.Z.A=I
>
> and thus invert(A).invert(A)=transpose(Z).B.Z
> As A is diagonal, invert(A) has the reciprocal elements on the diagonal,
> so you can easily extract them:
>
> A=diag(1/sqrt(diag(transpose(Z).B.Z)))
>
> I hope that's correct.
>
> Best Regards
> Lennart
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

From mangabasi at gmail.com Wed Dec 21 07:49:31 2011
From: mangabasi at gmail.com (=?UTF-8?Q?Fahredd=C4=B1n_Basegmez?=)
Date: Wed, 21 Dec 2011 07:49:31 -0500
Subject: [Numpy-discussion] Getting non-normalized eigenvectors from generalized eigenvalue solution?
In-Reply-To: <4EF1D192.8010705@gmail.com>
References: <8762haffwz.fsf@tee.ostfriesland> <4EF1D192.8010705@gmail.com>
Message-ID:

According to this page, eigenvectors are normalized with respect to the
second matrix. Do you guys have any idea how that's done?

http://www.kxcad.net/Altair/HyperWorks/oshelp/frequency_response_analysis.htm

"If the eigenvectors are normalized with respect to the mass matrix, the
modal mass matrix is the unity matrix and the modal stiffness matrix is a
diagonal matrix holding the eigenvalues of the system. This way, the system
equation is reduced to a set of uncoupled equations for the components of d
that can be solved easily."

On Wed, Dec 21, 2011 at 7:31 AM, Andrew Jaffe wrote:

> Just to be completely clear, there is no such thing as a
> "non-normalized" eigenvector. An eigenvector is only determined *up to a
> scalar normalization*, which is obvious from the eigenvalue equation:
>
> A v = l v
>
> where A is the matrix, l is the eigenvalue, and v is the eigenvector.
> Obviously v is only determined up to a constant factor. A given eigen
> routine can return anything at all, but there is no native
> "non-normalized" version.
>
> Traditionally, you can decide to return "normalized" eigenvectors with
> the scalar factor determined by norm(v)=1 for some suitable norm. (I
> could imagine that an algorithm could depend on that.)
>
> Andrew
>
>
> On 21/12/2011 07:01, Olivier Delalleau wrote:
> > Aaah, thanks a lot Lennart, I knew there had to be some logic to
> > Octave's output, but I couldn't see it...
> > > > -=- Olivier > > > > 2011/12/21 Lennart Fricke > > > > > > Dear Fahredd?n, > > I think, the norm of the eigenvectors corresponds to some generic > > amplitude. But that is something you cannot extract from the > solution of > > the eigenvalue problem but it depends on the initial deflection or > > velocities. > > > > So I think you should be able to use the normalized values just > as well > > as the non-, un- or not normalized ones. > > > > Octave seems to normalize that way that, transpose(Z).B.Z=I, > where Z is > > the matrix of eigenvectors, B is matrix B of the generalized > eigenvalue > > problem and I is the identity. It uses lapack functions. But > that's only > > true if A,B are symmetric. If not it normalizes the magnitude of > largest > > element of each eigenvector to 1. > > > > I believe you can get it like that. If U is a Matrix with > normalization > > factors it is diagonal and Z.A contains the normalized column > vectors. > > then it is: > > > > transpose(Z.A).B.Z.A > > =transpose(A).transpose(Z).B.Z.A > > =A.transpose(Z).B.Z.A=I > > > > and thus invert(A).invert(A)=transpose(Z).B.Z > > As A is diagonal invert(A) has the reciprocal elements on the > diagonal. > > So you can easily extract them > > > > A=diag(1/sqrt(diag(transpose(Z).B.Z))) > > > > I hope that's correct. > > > > Best Regards > > Lennart > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed Dec 21 07:56:36 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 21 Dec 2011 05:56:36 -0700 Subject: [Numpy-discussion] test code for user defined types in numpy In-Reply-To: References: Message-ID: Hi Geoffrey, On Tue, Dec 20, 2011 at 7:24 PM, Geoffrey Irving wrote: > Hello, > > As a followup to the prior thread on bugs in user defined types in > numpy, I converted my rational number class from C++ to C and switched > to 32 bits to remove the need for unportable 128 bit numbers. It > should be usable as a fairly thorough test case for user defined types > now. It does rather more than a minimal test case would need to do, > but that isn't a problem unless you're concerned about code size. Let > me know if any further changes are needed before it's suitable for > inclusion in numpy as a test case. The repository is here: > > https://github.com/girving/rational > > The tests run under either py.test or nose. > > For completeness, my branch fixing all but one of the bugs I found in > numpy user defined types is here: > > https://github.com/girving/numpy/tree/fixuserloops > > The remaining bug is that numpy incorrectly releases the GIL during > casts even though NPY_NEEDS_API is set. The resulting crash goes away > if the line defining ACQUIRE_GIL is uncommented. With the necessary > locks in place, all my tests pass with my branch of numpy. I haven't > tracked this one down and fixed it yet, but it shouldn't be hard to do > so. 
> > A few preliminary comments on the C code (since I can't comment directly on github) 1) The C++ style comments aren't portable. 2) The trailing comments would (IMHO) look better on the line above. 3) The inline keyword isn't portable, use NPY_INLINE instead. 4) We've mostly used the int foo(void) { } style of function definition. 5) And for if statements if (is_toohot) { change_seats(); } else if (is_toocold) { change_seats(); } else { eat_cereal(); } 6) Because Python assert disappears in release code, the tests need to use assert_(...) imported from numpy.testing Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From pge08aqw at studserv.uni-leipzig.de Wed Dec 21 08:51:45 2011 From: pge08aqw at studserv.uni-leipzig.de (Lennart Fricke) Date: Wed, 21 Dec 2011 14:51:45 +0100 Subject: [Numpy-discussion] Getting non-normalized eigenvectors from generalized eigenvalue solution? References: <8762haffwz.fsf@tee.ostfriesland> <4EF1D192.8010705@gmail.com> Message-ID: <87wr9qdocu.fsf@tee.ostfriesland> I read it like that: (**T is the transpose) Let's call M the mass matrix and N the modal mass matrix. Then X**T*M*X=N. If X (matrix of eigenvectors) is normalized with respect to M, N is I (unity) so it just mean that X**T*M*X=I. That is what octave and matlab give you. For this to be true. x**T*M*x=1 must be true for each column x of X. Thus if y is the not normalized eigenvector. a*y**T*M*a*y=1 a**2 * y**T*M*y=1 a**2 = 1/(y**T*M*y) For me the question, why X diagonalizes M and K at the same time, remains. Another hint. If the matrizes are hermitian (symmetric if real) you can use scipy.linalg.eigh, which gives you result with correct normalization. Best regards Lennart From mwwiebe at gmail.com Wed Dec 21 11:31:00 2011 From: mwwiebe at gmail.com (Mark Wiebe) Date: Wed, 21 Dec 2011 08:31:00 -0800 Subject: [Numpy-discussion] test code for user defined types in numpy In-Reply-To: References: Message-ID: On Tue, Dec 20, 2011 at 10:48 PM, Christopher Jordan-Squire wrote: > On Tue, Dec 20, 2011 at 9:10 PM, Mark Wiebe wrote: > > On Tue, Dec 20, 2011 at 6:24 PM, Geoffrey Irving wrote: > >> > >> Hello, > >> > >> As a followup to the prior thread on bugs in user defined types in > >> numpy, I converted my rational number class from C++ to C and switched > >> to 32 bits to remove the need for unportable 128 bit numbers. It > >> should be usable as a fairly thorough test case for user defined types > >> now. It does rather more than a minimal test case would need to do, > >> but that isn't a problem unless you're concerned about code size. Let > >> me know if any further changes are needed before it's suitable for > >> inclusion in numpy as a test case. The repository is here: > >> > >> https://github.com/girving/rational > >> > >> The tests run under either py.test or nose. > >> > >> For completeness, my branch fixing all but one of the bugs I found in > >> numpy user defined types is here: > >> > >> https://github.com/girving/numpy/tree/fixuserloops > >> > >> The remaining bug is that numpy incorrectly releases the GIL during > >> casts even though NPY_NEEDS_API is set. The resulting crash goes away > >> if the line defining ACQUIRE_GIL is uncommented. With the necessary > >> locks in place, all my tests pass with my branch of numpy. I haven't > >> tracked this one down and fixed it yet, but it shouldn't be hard to do > >> so. > > > > > > Looks great. 
I've added some comments to the pull request for the > > fixuserloops branch, which is here: > > > > https://github.com/numpy/numpy/pull/175 > > > > I would advise anyone with an interest in the low-level aspects of how > > NumPy's handling of the GIL and multi-threading/concurrency should > evolve to > > take a look. Prior to anything I contributed, NumPy hardcoded whether to > > release the GIL during ufuncs or not. I added a needs_api flag in a few > > So releasing the GIL wasn't something the user could get at? (I'm > curious if this is something that should be mentioned in the ufunc > tutorial on the numpy docs.) > User C/C++ code can release the GIL with some macros provided by NumPy, and it is accessible in ufuncs. The cases I'm talking about are where a custom data type, such as this rational type, supplies low level inner loop functions to do casting and various primitive calculations. These APIs don't in general provide ways to tell the NumPy outer loop code whether it's safe to release the GIL before calling the functions. > places to indicate whether the inner loop functions call the CPython API > or > > not. Note that for ABI compatibility reasons, this flag is not 100% > > correctly integrated throughout NumPy. > > Could you expand a little bit on this ABI compatibility issue? > This boils down to the main reason that any significant changes to NumPy are currently unnecessarily difficult, that the ABI exposes the binary layout of key objects like ndarray and dtype. This means that when ABI compatibility is required, any changes have to be very careful not to disturb the binary layout of these objects. Changing something like how the needs_api flag works would change the nditer C API, which is part of the ABI as well. When the memory layouts are hidden from the ABI, it will be possible to make drastic changes to them while emulating the previous behavior in existing API calls, making it much easier to attempt changes that would improve performance or functionality. -Mark > > > > > What Geoffrey is proposing here conflicts with the way I imagined the > flag > > would be used, but supporting both of our ways of calling the inner loop > > seems useful to me. Take a look at the pull request for more details. > > > > Cheers, > > Mark > > > >> > >> > >> Geoffrey > >> _______________________________________________ > >> NumPy-Discussion mailing list > >> NumPy-Discussion at scipy.org > >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From irving at naml.us Wed Dec 21 15:37:15 2011 From: irving at naml.us (Geoffrey Irving) Date: Wed, 21 Dec 2011 11:37:15 -0900 Subject: [Numpy-discussion] test code for user defined types in numpy In-Reply-To: References: Message-ID: On Wed, Dec 21, 2011 at 3:56 AM, Charles R Harris wrote: > Hi Geoffrey, > > On Tue, Dec 20, 2011 at 7:24 PM, Geoffrey Irving wrote: >> >> Hello, >> >> As a followup to the prior thread on bugs in user defined types in >> numpy, I converted my rational number class from C++ to C and switched >> to 32 bits to remove the need for unportable 128 bit numbers. 
?It >> should be usable as a fairly thorough test case for user defined types >> now. ?It does rather more than a minimal test case would need to do, >> but that isn't a problem unless you're concerned about code size. ?Let >> me know if any further changes are needed before it's suitable for >> inclusion in numpy as a test case. ?The repository is here: >> >> ? ?https://github.com/girving/rational >> >> The tests run under either py.test or nose. >> >> For completeness, my branch fixing all but one of the bugs I found in >> numpy user defined types is here: >> >> ? ?https://github.com/girving/numpy/tree/fixuserloops >> >> The remaining bug is that numpy incorrectly releases the GIL during >> casts even though NPY_NEEDS_API is set. ?The resulting crash goes away >> if the line defining ACQUIRE_GIL is uncommented. ?With the necessary >> locks in place, all my tests pass with my branch of numpy. ?I haven't >> tracked this one down and fixed it yet, but it shouldn't be hard to do >> so. >> > > A few preliminary comments on the C code (since I can't comment directly on > github) > > 1) The C++ style comments aren't portable. > > 2) The trailing comments would (IMHO) look better on the line above. > > 3) The inline keyword isn't portable, use NPY_INLINE instead. > > 4) We've mostly used the > > ??? int > ??? foo(void) > ??? { > ??? } > > style of function definition. > > 5) And for if statements > > ??? if (is_toohot) { > ??????? change_seats(); > ??? } > ??? else if (is_toocold) { > ??????? change_seats(); > ??? } > ??? else { > ??????? eat_cereal(); > ??? } > > 6) Because Python assert disappears in release code, the tests need to use > assert_(...) imported from numpy.testing All fixed. Geoffrey From rowen at uw.edu Wed Dec 21 15:54:33 2011 From: rowen at uw.edu (Russell E. Owen) Date: Wed, 21 Dec 2011 12:54:33 -0800 Subject: [Numpy-discussion] trouble building numpy 1.6.1 on Scientific Linux 5 References: Message-ID: In article , Ralf Gommers wrote: > On Tue, Dec 20, 2011 at 10:52 PM, Russell E. Owen wrote: > > > In article , > > "Russell E. Owen" wrote: > > > > > In article > > > , > > > Ralf Gommers wrote: > > > > > > > On Fri, Dec 9, 2011 at 8:02 PM, Russell E. Owen wrote: > > > > > > > > > I'm trying to build numpy 1.6.1 on Scientific Linux 5 but the unit > > tests > > > > > claim the wrong version of fortran was used. I thought I knew how to > > > > > avoid that, but it's not working. > > > > > > > > > >...(elided text that suggests numpy is building using g77 even though > > I > > > > >asked for gfortran)... > > > > > > > > > > Any suggestions on how to fix this? > > > > > > > > > > > > > I assume you have g77 installed and on your PATH. If so, try moving it > > off > > > > your path. > > > > > > Yes. I would have tried that if I had known how to do it (though I'm > > > puzzled why it would be wanted since I told the installer to use > > > gfortran). > > > > > > The problem is that g77 is in /usr/bin/ and I don't have root privs on > > > this system. > > > > The explanation of why g77 is still picked up, and a possible solution: > http://thread.gmane.org/gmane.comp.python.numeric.general/13820/focus=13826 Thank you. I assume you are referring to this answer: > You mean g77? Anyways, I think I know why you are having problems. Passing > --fcompiler to the config command only affects the Fortran compiler that is > used during configuration phase (where we compile small C programs to determine > what your platform supports, like isnan() and the like). 
It does not propagate to > the > rest of the build_ext phase where you want it. Use config_fc to set up your > Fortran compiler for all of the phases: > > $ python setup.py config_fc --fcompiler=gnu95 build Fascinating. However, there are two things I don't understand: 1) Is my build actually broken? The ldd output for lapack_lite has no sign of libg2c.so (see quote from build instructions below). If it's just a false report from the unit then I don't need to rebuild (and there are a lot of packages built against it -- a rebuild will take much of a day). 2) This advice seems to contradict the build documentation (see below). Does this indicate a bug in the docs? In setup.py? Some other issue? I don't remember ever having this problem building numpy before. Quote from the build docs: Choosing the fortran compiler To build with g77: python setup.py build --fcompiler=gnu To build with gfortran: python setup.py build --fcompiler=gnu95 For more information see: python setup.py build --help-fcompiler How to check the ABI of blas/lapack/atlas One relatively simple and reliable way to check for the compiler used to build a library is to use ldd on the library. If libg2c.so is a dependency, this means that g77 has been used. If libgfortran.so is a a dependency, gfortran has been used. If both are dependencies, this means both have been used, which is almost always a very bad idea. ----- ldd on my numpy/linalg/lapack_lite.so (I don't see libg2c.so): -bash-3.2$ ldd /astro/users/rowen/local/lib/python/numpy/linalg/lapack_lite.so linux-vdso.so.1 => (0x00007fff0cff0000) liblapack.so.3 => /usr/lib64/liblapack.so.3 (0x00002acadd738000) libblas.so.3 => /usr/lib64/libblas.so.3 (0x00002acadde42000) libgfortran.so.3 => /usr/lib64/libgfortran.so.3 (0x00002acade096000) libm.so.6 => /lib64/libm.so.6 (0x00002acade380000) libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00002acade604000) libc.so.6 => /lib64/libc.so.6 (0x00002acade812000) libgfortran.so.1 => /usr/lib64/libgfortran.so.1 (0x00002acadeb6a000) /lib64/ld-linux-x86-64.so.2 (0x0000003b2ba00000) From rowen at uw.edu Wed Dec 21 16:05:38 2011 From: rowen at uw.edu (Russell E. Owen) Date: Wed, 21 Dec 2011 13:05:38 -0800 Subject: [Numpy-discussion] trouble building numpy 1.6.1 on Scientific Linux 5 References: Message-ID: In article , Ralf Gommers wrote: > On Tue, Dec 20, 2011 at 10:52 PM, Russell E. Owen wrote: > > > In article , > > "Russell E. Owen" wrote: > > > > > In article > > > , > > > Ralf Gommers wrote: > > > > > > > On Fri, Dec 9, 2011 at 8:02 PM, Russell E. Owen wrote: > > > > > > > > > I'm trying to build numpy 1.6.1 on Scientific Linux 5 but the unit > > tests > > > > > claim the wrong version of fortran was used. I thought I knew how to > > > > > avoid that, but it's not working. > > > > > > > > > >...(elided text that suggests numpy is building using g77 even though > > I > > > > >asked for gfortran)... > > > > > > > > > > Any suggestions on how to fix this? > > > > > > > > > > > > > I assume you have g77 installed and on your PATH. If so, try moving it > > off > > > > your path. > > > > > > Yes. I would have tried that if I had known how to do it (though I'm > > > puzzled why it would be wanted since I told the installer to use > > > gfortran). > > > > > > The problem is that g77 is in /usr/bin/ and I don't have root privs on > > > this system. > > > > The explanation of why g77 is still picked up, and a possible solution: > http://thread.gmane.org/gmane.comp.python.numeric.general/13820/focus=13826 OK. 
I tried this: - clear out old numpy from ~/local - unpack fresh numpy 1.6.1 in a build directory and cd into it $ python setup.py config_fc --fcompiler=gnu95 build $ python setup.py install --home=~/local $ cd $ python $ import numpy $ numpy.__file__ # to make sure it picked up the newly build version $ numpy.test() Again the unit test fails with: FAIL: test_lapack (test_build.TestF77Mismatch) ---------------------------------------------------------------------- Traceback (most recent call last): File "/astro/users/rowen/local/lib/python/numpy/testing/decorators.py", line 146, in skipper_func return f(*args, **kwargs) File "/astro/users/rowen/local/lib/python/numpy/linalg/tests/test_build.py", line 50, in test_lapack information.""") AssertionError: Both g77 and gfortran runtimes linked in lapack_lite ! This is likely to cause random crashes and wrong results. See numpy INSTALL.txt for more information. -- Russell P.S. I'm using nose 0.11.4 because the current version requires distrib. Surely that won't affect this? From charlesr.harris at gmail.com Wed Dec 21 18:55:38 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 21 Dec 2011 16:55:38 -0700 Subject: [Numpy-discussion] test code for user defined types in numpy In-Reply-To: References: Message-ID: On Wed, Dec 21, 2011 at 1:37 PM, Geoffrey Irving wrote: > On Wed, Dec 21, 2011 at 3:56 AM, Charles R Harris > wrote: > > Hi Geoffrey, > > > > On Tue, Dec 20, 2011 at 7:24 PM, Geoffrey Irving wrote: > >> > >> Hello, > >> > >> As a followup to the prior thread on bugs in user defined types in > >> numpy, I converted my rational number class from C++ to C and switched > >> to 32 bits to remove the need for unportable 128 bit numbers. It > >> should be usable as a fairly thorough test case for user defined types > >> now. It does rather more than a minimal test case would need to do, > >> but that isn't a problem unless you're concerned about code size. Let > >> me know if any further changes are needed before it's suitable for > >> inclusion in numpy as a test case. The repository is here: > >> > >> https://github.com/girving/rational > >> > >> The tests run under either py.test or nose. > >> > >> For completeness, my branch fixing all but one of the bugs I found in > >> numpy user defined types is here: > >> > >> https://github.com/girving/numpy/tree/fixuserloops > >> > >> The remaining bug is that numpy incorrectly releases the GIL during > >> casts even though NPY_NEEDS_API is set. The resulting crash goes away > >> if the line defining ACQUIRE_GIL is uncommented. With the necessary > >> locks in place, all my tests pass with my branch of numpy. I haven't > >> tracked this one down and fixed it yet, but it shouldn't be hard to do > >> so. > >> > > > > A few preliminary comments on the C code (since I can't comment directly > on > > github) > > > > 1) The C++ style comments aren't portable. > > > > 2) The trailing comments would (IMHO) look better on the line above. > > > > 3) The inline keyword isn't portable, use NPY_INLINE instead. > > > > 4) We've mostly used the > > > > int > > foo(void) > > { > > } > > > > style of function definition. > > > > 5) And for if statements > > > > if (is_toohot) { > > change_seats(); > > } > > else if (is_toocold) { > > change_seats(); > > } > > else { > > eat_cereal(); > > } > > > > 6) Because Python assert disappears in release code, the tests need to > use > > assert_(...) imported from numpy.testing > > All fixed. > > Mark, I'm ready to merge pull-175, is there any reason not to? 
Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncreati at inogs.it Thu Dec 22 04:13:56 2011 From: ncreati at inogs.it (Nicola Creati) Date: Thu, 22 Dec 2011 10:13:56 +0100 Subject: [Numpy-discussion] Binning Message-ID: <4EF2F4D4.2000605@inogs.it> Hello, I have a cloud on sparse points that can be described by a Nx3 array (N is the number of points). Each point is defined by an x, y and z coordinate: x0 y0 z0 x1 y1 z1 . . . . . . . . . xn yn zn I need to bin the cloud to a regular 2D array according to a desired bin size assigning to each cell (bin) the minimum z of all points that fall in that cell(bin). Moreover I need indexes of points that fall in each cell(bin). Is there any way to accomplish this task in numpy? Thanks. Nicola Creati -- Nicola Creati Istituto Nazionale di Oceanografia e di Geofisica Sperimentale - OGS www.inogs.it Dipartimento di Geofisica della Litosfera Geophysics of Lithosphere Department CARS (Cartography and Remote Sensing) Research Group http://www.inogs.it/Cars/ Borgo Grotta Gigante 42/c 34010 Sgonico - Trieste - ITALY ncreati at ogs.trieste.it off. +39 040 2140 213 fax. +39 040 327307 _____________________________________________________________________ This communication, that may contain confidential and/or legally privileged information, is intended solely for the use of the intended addressees. Opinions, conclusions and other information contained in this message, that do not relate to the official business of OGS, shall be considered as not given or endorsed by it. Every opinion or advice contained in this communication is subject to the terms and conditions provided by the agreement governing the engagement with such a client. Any use, disclosure, copying or distribution of the contents of this communication by a not-intended recipient or in violation of the purposes of this communication is strictly prohibited and may be unlawful. For Italy only: Ai sensi del D.Lgs.196/2003 - "T.U. sulla Privacy" si precisa che le informazioni contenute in questo messaggio sono riservate ed a uso esclusivo del destinatario. _____________________________________________________________________ From adnothing at gmail.com Thu Dec 22 06:27:47 2011 From: adnothing at gmail.com (Adrien Gaidon) Date: Thu, 22 Dec 2011 12:27:47 +0100 Subject: [Numpy-discussion] Binning In-Reply-To: <4EF2F4D4.2000605@inogs.it> References: <4EF2F4D4.2000605@inogs.it> Message-ID: Hello Nicola, I am not aware of a magical "one function" numpy solution (is there one numpy gurus?). I don't know if it's optimal, but here's how I usually do similar things. I wrote a simple function that assigns points (any number of dimensions) to a regular multi-dimensional grid. It is here: https://gist.github.com/1509853 It is short, commented and should be straightforward to use. Once you have the assignments, you can: - get the non-empty cell indexes with `np.unique(assignments)` - retrieve the points assigned to a cell with `points[assignments == cell_index]` - iterate over assignments to select the points you want for each cell. Hope this helps, Adrien PS: This is one of the first times I post an answer on this list, so if I did anything wrong, let me know. Numpy is such a wonderful thing and you guys do such an amazing work, that I though it is time to give back at least epsilon of what I got from you :-) 2011/12/22 Nicola Creati > Hello, > > I have a cloud on sparse points that can be described by a Nx3 array (N > is the number of points). 
Each point is defined by an x, y and z > coordinate: > > x0 y0 z0 > x1 y1 z1 > . . . > . . . > . . . > xn yn zn > > > I need to bin the cloud to a regular 2D array according to a desired bin > size assigning to each cell (bin) the minimum z of all points that fall > in that cell(bin). Moreover I need indexes of points that fall in each > cell(bin). > > Is there any way to accomplish this task in numpy? > > Thanks. > > Nicola Creati > > > > > -- > Nicola Creati > Istituto Nazionale di Oceanografia e di Geofisica Sperimentale - OGS > www.inogs.it Dipartimento di Geofisica della Litosfera Geophysics of > Lithosphere Department CARS (Cartography and Remote Sensing) Research Group > http://www.inogs.it/Cars/ Borgo Grotta Gigante 42/c 34010 Sgonico - > Trieste - ITALY ncreati at ogs.trieste.it > off. +39 040 2140 213 > fax. +39 040 327307 > > _____________________________________________________________________ > This communication, that may contain confidential and/or legally > privileged information, is intended solely for the use of the intended > addressees. Opinions, conclusions and other information contained in this > message, that do not relate to the official business of OGS, shall be > considered as not given or endorsed by it. Every opinion or advice > contained in this communication is subject to the terms and conditions > provided by the agreement governing the engagement with such a client. Any > use, disclosure, copying or distribution of the contents of this > communication by a not-intended recipient or in violation of the purposes > of this communication is strictly prohibited and may be unlawful. For Italy > only: Ai sensi del D.Lgs.196/2003 - "T.U. sulla Privacy" si precisa che le > informazioni contenute in questo messaggio sono riservate ed a uso > esclusivo del destinatario. > _____________________________________________________________________ > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Thu Dec 22 11:17:53 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 22 Dec 2011 11:17:53 -0500 Subject: [Numpy-discussion] Binning In-Reply-To: References: <4EF2F4D4.2000605@inogs.it> Message-ID: On Thu, Dec 22, 2011 at 6:27 AM, Adrien Gaidon wrote: > Hello Nicola, > > I am not aware of a magical "one function" numpy solution (is there one > numpy gurus?). > > I don't know if it's optimal, but here's how I usually do similar things. > > I wrote a simple function that assigns points (any number of dimensions) to > a regular multi-dimensional grid. It is > here:?https://gist.github.com/1509853?It is short, commented and should be > straightforward to use. > > Once you have the assignments, you can: > - get the non-empty cell indexes with `np.unique(assignments)` > - retrieve the points assigned to a cell with `points[assignments == > cell_index]` > - iterate over assignments to select the points you want for each cell. looks nice, reading through it. line 71 looks like a nice trick BSD licensed, so we can keep it? as far as I know numpy doesn't have anything like a digitize_nd Thanks, Josef > > Hope this helps, > > Adrien > > PS: This is one of the first times I post an answer on this list, so if I > did anything wrong, let me know. 
Numpy is such a wonderful thing and you > guys do such an amazing work, that I though it is time to give back at least > epsilon of what I got from you :-) > > > 2011/12/22 Nicola Creati >> >> Hello, >> >> I have a cloud on sparse points that can be described by a Nx3 array (N >> is the number of points). Each point is defined by an x, y and z >> coordinate: >> >> x0 y0 z0 >> x1 y1 z1 >> ? . ? ?. ? ?. >> ? . ? ?. ? ?. >> ? . ? ?. ? ?. >> xn yn zn >> >> >> I need to bin the cloud to a regular 2D array according to a desired bin >> size assigning to each cell (bin) the minimum z of all points that fall >> in that cell(bin). Moreover I need indexes of points that fall in each >> cell(bin). >> >> Is there any way to accomplish this task in numpy? >> >> Thanks. >> >> Nicola Creati >> >> >> >> >> -- >> Nicola Creati >> Istituto Nazionale di Oceanografia e di Geofisica Sperimentale - OGS >> www.inogs.it Dipartimento di Geofisica della Litosfera Geophysics of >> Lithosphere Department CARS (Cartography and Remote Sensing) Research Group >> http://www.inogs.it/Cars/ Borgo Grotta Gigante 42/c 34010 Sgonico - Trieste >> - ITALY ncreati at ogs.trieste.it >> off. ? +39 040 2140 213 >> fax. ? +39 040 327307 >> >> _____________________________________________________________________ >> This communication, that may contain confidential and/or legally >> privileged information, is intended solely for the use of the intended >> addressees. Opinions, conclusions and other information contained in this >> message, that do not relate to the official business of OGS, shall be >> considered as not given or endorsed by it. Every opinion or advice contained >> in this communication is subject to the terms and conditions provided by the >> agreement governing the engagement with such a client. Any use, disclosure, >> copying or distribution of the contents of this communication by a >> not-intended recipient or in violation of the purposes of this communication >> is strictly prohibited and may be unlawful. For Italy only: Ai sensi del >> D.Lgs.196/2003 - "T.U. sulla Privacy" si precisa che le informazioni >> contenute in questo messaggio sono riservate ed a uso esclusivo del >> destinatario. >> _____________________________________________________________________ >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From josef.pktd at gmail.com Thu Dec 22 11:19:34 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 22 Dec 2011 11:19:34 -0500 Subject: [Numpy-discussion] Binning In-Reply-To: References: <4EF2F4D4.2000605@inogs.it> Message-ID: On Thu, Dec 22, 2011 at 11:17 AM, wrote: > On Thu, Dec 22, 2011 at 6:27 AM, Adrien Gaidon wrote: >> Hello Nicola, >> >> I am not aware of a magical "one function" numpy solution (is there one >> numpy gurus?). >> >> I don't know if it's optimal, but here's how I usually do similar things. >> >> I wrote a simple function that assigns points (any number of dimensions) to >> a regular multi-dimensional grid. It is >> here:?https://gist.github.com/1509853?It is short, commented and should be >> straightforward to use. 
>> >> Once you have the assignments, you can: >> - get the non-empty cell indexes with `np.unique(assignments)` >> - retrieve the points assigned to a cell with `points[assignments == >> cell_index]` >> - iterate over assignments to select the points you want for each cell. > > looks nice, reading through it. > line 71 looks like a nice trick forgot the qualifier: if I understand it correctly, just quickly reading it. Josef > > BSD licensed, so we can keep it? > > as far as I know numpy doesn't have anything like a digitize_nd > > Thanks, > > Josef > >> >> Hope this helps, >> >> Adrien >> >> PS: This is one of the first times I post an answer on this list, so if I >> did anything wrong, let me know. Numpy is such a wonderful thing and you >> guys do such an amazing work, that I though it is time to give back at least >> epsilon of what I got from you :-) >> >> >> 2011/12/22 Nicola Creati >>> >>> Hello, >>> >>> I have a cloud on sparse points that can be described by a Nx3 array (N >>> is the number of points). Each point is defined by an x, y and z >>> coordinate: >>> >>> x0 y0 z0 >>> x1 y1 z1 >>> ? . ? ?. ? ?. >>> ? . ? ?. ? ?. >>> ? . ? ?. ? ?. >>> xn yn zn >>> >>> >>> I need to bin the cloud to a regular 2D array according to a desired bin >>> size assigning to each cell (bin) the minimum z of all points that fall >>> in that cell(bin). Moreover I need indexes of points that fall in each >>> cell(bin). >>> >>> Is there any way to accomplish this task in numpy? >>> >>> Thanks. >>> >>> Nicola Creati >>> >>> >>> >>> >>> -- >>> Nicola Creati >>> Istituto Nazionale di Oceanografia e di Geofisica Sperimentale - OGS >>> www.inogs.it Dipartimento di Geofisica della Litosfera Geophysics of >>> Lithosphere Department CARS (Cartography and Remote Sensing) Research Group >>> http://www.inogs.it/Cars/ Borgo Grotta Gigante 42/c 34010 Sgonico - Trieste >>> - ITALY ncreati at ogs.trieste.it >>> off. ? +39 040 2140 213 >>> fax. ? +39 040 327307 >>> >>> _____________________________________________________________________ >>> This communication, that may contain confidential and/or legally >>> privileged information, is intended solely for the use of the intended >>> addressees. Opinions, conclusions and other information contained in this >>> message, that do not relate to the official business of OGS, shall be >>> considered as not given or endorsed by it. Every opinion or advice contained >>> in this communication is subject to the terms and conditions provided by the >>> agreement governing the engagement with such a client. Any use, disclosure, >>> copying or distribution of the contents of this communication by a >>> not-intended recipient or in violation of the purposes of this communication >>> is strictly prohibited and may be unlawful. For Italy only: Ai sensi del >>> D.Lgs.196/2003 - "T.U. sulla Privacy" si precisa che le informazioni >>> contenute in questo messaggio sono riservate ed a uso esclusivo del >>> destinatario. 
>>> _____________________________________________________________________ >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> From chaoyuejoy at gmail.com Thu Dec 22 11:27:11 2011 From: chaoyuejoy at gmail.com (Chao YUE) Date: Thu, 22 Dec 2011 17:27:11 +0100 Subject: [Numpy-discussion] output different columns of ndarray in different formats Message-ID: Dear all, Just a small question, how can I output different columns of ndarray in different formats, the manual says, "A single format (%10.5f), a sequence of formats, or a multi-format string" but I use np.savetxt('new.csv',data,fmt=['%i4','%f6.3']) or np.savetxt('new.csv',data,fmt=('%i4','%f6.3')) give strange results. In [33]: data.shape Out[33]: (6506, 2) I want the first column integer and second column float. cheers, Chao -- *********************************************************************************** Chao YUE Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL) UMR 1572 CEA-CNRS-UVSQ Batiment 712 - Pe 119 91191 GIF Sur YVETTE Cedex Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16 ************************************************************************************ -------------- next part -------------- An HTML attachment was scrubbed... URL: From aronne.merrelli at gmail.com Thu Dec 22 11:36:35 2011 From: aronne.merrelli at gmail.com (Aronne Merrelli) Date: Thu, 22 Dec 2011 10:36:35 -0600 Subject: [Numpy-discussion] output different columns of ndarray in different formats In-Reply-To: References: Message-ID: On Thu, Dec 22, 2011 at 10:27 AM, Chao YUE wrote: > Dear all, > > Just a small question, how can I output different columns of ndarray in > different formats, > the manual says, "A single format (%10.5f), a sequence of formats, or a > multi-format string" > but I use > > np.savetxt('new.csv',data,fmt=['%i4','%f6.3']) > > or > > np.savetxt('new.csv',data,fmt=('%i4','%f6.3')) > > give strange results. > I think you've flipped the format codes; try: fmt = ('%4i', '%6.3f') -------------- next part -------------- An HTML attachment was scrubbed... URL: From adnothing at gmail.com Thu Dec 22 11:39:57 2011 From: adnothing at gmail.com (Adrien) Date: Thu, 22 Dec 2011 17:39:57 +0100 Subject: [Numpy-discussion] Binning In-Reply-To: References: <4EF2F4D4.2000605@inogs.it> Message-ID: <4EF35D5D.1050101@gmail.com> Le 22/12/2011 17:17, josef.pktd at gmail.com a ?crit : > On Thu, Dec 22, 2011 at 6:27 AM, Adrien Gaidon wrote: >> Hello Nicola, >> >> I am not aware of a magical "one function" numpy solution (is there one >> numpy gurus?). >> >> I don't know if it's optimal, but here's how I usually do similar things. >> >> I wrote a simple function that assigns points (any number of dimensions) to >> a regular multi-dimensional grid. It is >> here: https://gist.github.com/1509853 It is short, commented and should be >> straightforward to use. >> >> Once you have the assignments, you can: >> - get the non-empty cell indexes with `np.unique(assignments)` >> - retrieve the points assigned to a cell with `points[assignments == >> cell_index]` >> - iterate over assignments to select the points you want for each cell. > looks nice, reading through it. > line 71 looks like a nice trick > > BSD licensed, so we can keep it? 
Off course! :-) Cheers, Adrien > as far as I know numpy doesn't have anything like a digitize_nd > > Thanks, > > Josef > >> Hope this helps, >> >> Adrien >> >> PS: This is one of the first times I post an answer on this list, so if I >> did anything wrong, let me know. Numpy is such a wonderful thing and you >> guys do such an amazing work, that I though it is time to give back at least >> epsilon of what I got from you :-) >> >> >> 2011/12/22 Nicola Creati >>> Hello, >>> >>> I have a cloud on sparse points that can be described by a Nx3 array (N >>> is the number of points). Each point is defined by an x, y and z >>> coordinate: >>> >>> x0 y0 z0 >>> x1 y1 z1 >>> . . . >>> . . . >>> . . . >>> xn yn zn >>> >>> >>> I need to bin the cloud to a regular 2D array according to a desired bin >>> size assigning to each cell (bin) the minimum z of all points that fall >>> in that cell(bin). Moreover I need indexes of points that fall in each >>> cell(bin). >>> >>> Is there any way to accomplish this task in numpy? >>> >>> Thanks. >>> >>> Nicola Creati >>> >>> >>> >>> >>> -- >>> Nicola Creati >>> Istituto Nazionale di Oceanografia e di Geofisica Sperimentale - OGS >>> www.inogs.it Dipartimento di Geofisica della Litosfera Geophysics of >>> Lithosphere Department CARS (Cartography and Remote Sensing) Research Group >>> http://www.inogs.it/Cars/ Borgo Grotta Gigante 42/c 34010 Sgonico - Trieste >>> - ITALY ncreati at ogs.trieste.it >>> off. +39 040 2140 213 >>> fax. +39 040 327307 >>> >>> _____________________________________________________________________ >>> This communication, that may contain confidential and/or legally >>> privileged information, is intended solely for the use of the intended >>> addressees. Opinions, conclusions and other information contained in this >>> message, that do not relate to the official business of OGS, shall be >>> considered as not given or endorsed by it. Every opinion or advice contained >>> in this communication is subject to the terms and conditions provided by the >>> agreement governing the engagement with such a client. Any use, disclosure, >>> copying or distribution of the contents of this communication by a >>> not-intended recipient or in violation of the purposes of this communication >>> is strictly prohibited and may be unlawful. For Italy only: Ai sensi del >>> D.Lgs.196/2003 - "T.U. sulla Privacy" si precisa che le informazioni >>> contenute in questo messaggio sono riservate ed a uso esclusivo del >>> destinatario. >>> _____________________________________________________________________ >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From chaoyuejoy at gmail.com Thu Dec 22 11:45:04 2011 From: chaoyuejoy at gmail.com (Chao YUE) Date: Thu, 22 Dec 2011 17:45:04 +0100 Subject: [Numpy-discussion] output different columns of ndarray in different formats In-Reply-To: References: Message-ID: O.... Yes.... You're right... It's fine now. Merry Christmas to all! 
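For the record, the corrected call from Aronne's suggestion (my data has
shape (6506, 2), integer first column, float second column):

np.savetxt('new.csv', data, fmt=('%4i', '%6.3f'))

The width and precision go between the '%' and the conversion letter, so
'%i4' was writing each integer followed by a literal '4', which explains
the strange output.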
Chao 2011/12/22 Aronne Merrelli > > > On Thu, Dec 22, 2011 at 10:27 AM, Chao YUE wrote: > >> Dear all, >> >> Just a small question, how can I output different columns of ndarray in >> different formats, >> the manual says, "A single format (%10.5f), a sequence of formats, or a >> multi-format string" >> but I use >> >> np.savetxt('new.csv',data,fmt=['%i4','%f6.3']) >> >> or >> >> np.savetxt('new.csv',data,fmt=('%i4','%f6.3')) >> >> give strange results. >> > > > I think you've flipped the format codes; try: > > fmt = ('%4i', '%6.3f') > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- *********************************************************************************** Chao YUE Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL) UMR 1572 CEA-CNRS-UVSQ Batiment 712 - Pe 119 91191 GIF Sur YVETTE Cedex Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16 ************************************************************************************ -------------- next part -------------- An HTML attachment was scrubbed... URL: From aronne.merrelli at gmail.com Thu Dec 22 13:44:43 2011 From: aronne.merrelli at gmail.com (Aronne Merrelli) Date: Thu, 22 Dec 2011 12:44:43 -0600 Subject: [Numpy-discussion] unexpected behavior of __array_wrap__ in matrix subclass Message-ID: Hello NumPy list, While experimenting with a subclass of numpy.matrix, I discovered cases where __array_wrap__ is not called during multiplication. I'm not sure whether this is a bug or my own misunderstanding of np.matrix & __array_wrap__; if nothing else I thought it would be helpful to describe this in case other people run into the same problem. If the matrix is created from an array of integers, then __array_wrap__ is called if the matrix is multiplied by an integer. It appears that in all other cases, __array_wrap__ is not called for multiplication (int times float scalar, float times float scalar, float matrix times float matrix, etc). For addition, __array_wrap__ is called for all cases that I checked. I did find a possible workaround. If you define a __mul__ method in the matrix subclass, and then just call np.multiply, then __array_wrap__ is called in all cases I expect it to be called. I uploaded a example script here: https://gist.github.com/1511354 Hopefully it is not too confusing. I'm basically abusing the python exception handler to tell whether or not __array_wrap__ is called for any particular case. The MatSubClass shows the problem, and the MatSubClassFixed as the __mul__ method defined. Here are the results I see in my working environment (ipython in EPD 7.1): In [1]: np.__version__ Out[1]: '1.6.0' In [2]: execfile('matrix_array_wrap_test.py') In [3]: run_test() array_wrap called for o2 = o * 2 after o=MatSubClass([1,1]) array_wrap NOT called for o2 = o * 2.0 after o=MatSubClass([1,1]) array_wrap NOT called for o2 = o * 2 after o=MatSubClass([1.0, 1.0]) array_wrap NOT called for o2 = o * 2.0 after o=MatSubClass([1.0, 1.0]) array_wrap called for o2 = o * 2 after o=MatSubClassFixed([1,1]) array_wrap called for o2 = o * 2.0 after o=MatSubClassFixed([1,1]) array_wrap called for o2 = o * 2 after o=MatSubClassFixed([1.0, 1.0]) array_wrap called for o2 = o * 2.0 after o=MatSubClassFixed([1.0, 1.0]) Thanks, Aronne -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From josef.pktd at gmail.com Thu Dec 22 16:45:24 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 22 Dec 2011 16:45:24 -0500 Subject: [Numpy-discussion] Binning In-Reply-To: <4EF35D5D.1050101@gmail.com> References: <4EF2F4D4.2000605@inogs.it> <4EF35D5D.1050101@gmail.com> Message-ID: On Thu, Dec 22, 2011 at 11:39 AM, Adrien wrote: > Le 22/12/2011 17:17, josef.pktd at gmail.com a ?crit : >> On Thu, Dec 22, 2011 at 6:27 AM, Adrien Gaidon ?wrote: >>> Hello Nicola, >>> >>> I am not aware of a magical "one function" numpy solution (is there one >>> numpy gurus?). >>> >>> I don't know if it's optimal, but here's how I usually do similar things. >>> >>> I wrote a simple function that assigns points (any number of dimensions) to >>> a regular multi-dimensional grid. It is >>> here: https://gist.github.com/1509853 It is short, commented and should be >>> straightforward to use. >>> >>> Once you have the assignments, you can: >>> - get the non-empty cell indexes with `np.unique(assignments)` >>> - retrieve the points assigned to a cell with `points[assignments == >>> cell_index]` >>> - iterate over assignments to select the points you want for each cell. >> looks nice, reading through it. >> line 71 looks like a nice trick >> >> BSD licensed, so we can keep it? > > Off course! :-) with numpy 1.5 compatibility, if I did it right (reverse engineering ravel_multi_index which I never used) https://gist.github.com/1511969/222e3316048bce5763b1004331af898088ffcd9e Josef > Cheers, > > Adrien > >> as far as I know numpy doesn't have anything like a digitize_nd >> >> Thanks, >> >> Josef >> >>> Hope this helps, >>> >>> Adrien >>> >>> PS: This is one of the first times I post an answer on this list, so if I >>> did anything wrong, let me know. Numpy is such a wonderful thing and you >>> guys do such an amazing work, that I though it is time to give back at least >>> epsilon of what I got from you :-) >>> >>> >>> 2011/12/22 Nicola Creati >>>> Hello, >>>> >>>> I have a cloud on sparse points that can be described by a Nx3 array (N >>>> is the number of points). Each point is defined by an x, y and z >>>> coordinate: >>>> >>>> x0 y0 z0 >>>> x1 y1 z1 >>>> ? ?. ? ?. ? ?. >>>> ? ?. ? ?. ? ?. >>>> ? ?. ? ?. ? ?. >>>> xn yn zn >>>> >>>> >>>> I need to bin the cloud to a regular 2D array according to a desired bin >>>> size assigning to each cell (bin) the minimum z of all points that fall >>>> in that cell(bin). Moreover I need indexes of points that fall in each >>>> cell(bin). >>>> >>>> Is there any way to accomplish this task in numpy? >>>> >>>> Thanks. >>>> >>>> Nicola Creati >>>> >>>> >>>> >>>> >>>> -- >>>> Nicola Creati >>>> Istituto Nazionale di Oceanografia e di Geofisica Sperimentale - OGS >>>> www.inogs.it Dipartimento di Geofisica della Litosfera Geophysics of >>>> Lithosphere Department CARS (Cartography and Remote Sensing) Research Group >>>> http://www.inogs.it/Cars/ Borgo Grotta Gigante 42/c 34010 Sgonico - Trieste >>>> - ITALY ncreati at ogs.trieste.it >>>> off. ? +39 040 2140 213 >>>> fax. ? +39 040 327307 >>>> >>>> _____________________________________________________________________ >>>> This communication, that may contain confidential and/or legally >>>> privileged information, is intended solely for the use of the intended >>>> addressees. Opinions, conclusions and other information contained in this >>>> message, that do not relate to the official business of OGS, shall be >>>> considered as not given or endorsed by it. 
Every opinion or advice contained >>>> in this communication is subject to the terms and conditions provided by the >>>> agreement governing the engagement with such a client. Any use, disclosure, >>>> copying or distribution of the contents of this communication by a >>>> not-intended recipient or in violation of the purposes of this communication >>>> is strictly prohibited and may be unlawful. For Italy only: Ai sensi del >>>> D.Lgs.196/2003 - "T.U. sulla Privacy" si precisa che le informazioni >>>> contenute in questo messaggio sono riservate ed a uso esclusivo del >>>> destinatario. >>>> _____________________________________________________________________ >>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From hajons at gmail.com Thu Dec 22 17:21:28 2011 From: hajons at gmail.com (=?utf-8?b?SMOla2Fu?= Jonsson) Date: Thu, 22 Dec 2011 22:21:28 +0000 (UTC) Subject: [Numpy-discussion] numpy errors on MacOS but works on Ubuntu Message-ID: Hi, I am running the same code on Ubuntu 10.04 and MacOS 10.7 with different results. On Ubuntu, it reads a dataset from a .mat Matlab file, and then tries to access specific values. On MacOS I get: Done loading matlab data. Traceback (most recent call last): File "parse_proximity.py", line 79, in get_events(matlab_obj) File "parse_proximity.py", line 18, in get_events subject_mac = int(subject_object.my_mac[0][0][0], 16) AttributeError: 'numpy.void' object has no attribute 'my_mac' The code looks like this: import scipy.io from datetime import datetime, timedelta import time import sys, os start_date = 1095984000 end_date = 1096004000 def get_events(matlab_obj): subjects = matlab_obj["s"][0] for subject_object in subjects: valid = True try: subject_mac = int(subject_object.my_mac[0][0][0], 16) ... if __name__ == "__main__": matlab_filename = sys.argv[1] matlab_obj = scipy.io.loadmat(matlab_filename) print "Done loading matlab data." get_events(matlab_obj) I have tried two different versions of python 2.6.1 and 2.7.1 with same results. I don't have any other problems with using python on MacOS. Any suggestions? Regards, H?kan From gael.varoquaux at normalesup.org Fri Dec 23 04:01:36 2011 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Fri, 23 Dec 2011 10:01:36 +0100 Subject: [Numpy-discussion] Owndata flag In-Reply-To: References: <1323965864.27277.11.camel@lma-98.cnrs-mrs.fr> Message-ID: <20111223090136.GB6464@phare.normalesup.org> On Thu, Dec 15, 2011 at 04:36:24PM +0000, Robert Kern wrote: > > More explicitly, I have some temporary home-made C structure that holds > > a pointer to an array. I prepare (using Cython) an numpy.ndarray using > > the PyArray_NewFromDescr function. I can delete my temporary C structure > > without freeing the memory holding array, but I wish the numpy.ndarray > > becomes the owner of the data. > > How can do I do such thing ? 
> You can't, really. numpy-owned arrays will be deallocated with numpy's
> deallocator. This may not be the appropriate deallocator for memory
> that your library allocated.

Coming late to the battle, but I recently followed the same route, and
came to similar conclusions: using the owndata flag is not suited, and
you will need your own deallocator.

I implemented a demo code showing all the steps to implement this
strategy to bind an existing C library with Cython in
https://gist.github.com/1249305
in particular, the deallocator is in
https://gist.github.com/1249305#file_cython_wrapper.pyx

I hope that this code sample is useful.

Gael

From xantares09 at hotmail.com Fri Dec 23 04:37:14 2011
From: xantares09 at hotmail.com (xantares 09)
Date: Fri, 23 Dec 2011 09:37:14 +0000
Subject: [Numpy-discussion] PyInt and Numpy's int64 conversion
Message-ID:

Hi,

I'm using Numpy from the C Python API side while tweaking my SWIG interface
to work with numpy array types.
I want to convert a numpy array of integers (whose elements are numpy's
'int64').
The problem is that this int64 type is not compatible with the standard
python integer type:
I cannot use PyInt_Check and PyInt_AsUnsignedLongMask to check and convert
from int64: basically PyInt_Check returns false.
I checked the numpy config header and npy_int64 does have a size of 8 bytes,
which should be the same as long on my x86_64.
What is the correct way to do that?
I checked for an Int64_Check function and didn't find any in numpy headers.

Regards,

x.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ischnell at enthought.com Fri Dec 23 11:55:48 2011
From: ischnell at enthought.com (Ilan Schnell)
Date: Fri, 23 Dec 2011 10:55:48 -0600
Subject: [Numpy-discussion] EPD 7.2 released
Message-ID:

Hello,

I am pleased to announce the release of Enthought Python Distribution,
EPD version 7.2, along with its "EPD Free" counterpart. The highlights
of this release are: the addition of GDAL and updates to over 30
packages, including SciPy, matplotlib and IPython. The new IPython 0.12
includes the HTML notebook, which caused the Tornado web-server also to
be added to EPD.

To see which libraries are included in the free vs. full version,
please see: http://www.enthought.com/products/epdlibraries.php

The complete list of additions, updates and fixes is in the change log:
http://www.enthought.com/products/changelog.php

About EPD
---------
The Enthought Python Distribution (EPD) is a "kitchen-sink-included"
distribution of the Python programming language, including over 90
additional tools and libraries. The EPD bundle includes NumPy, SciPy,
IPython, 2D and 3D visualization tools, and many other tools.

EPD is currently available as a single-click installer for Windows XP,
Vista and 7, MacOS (10.5 and 10.6), RedHat 3, 4, 5 and 6, as well as
Solaris 10 (x86 and x86_64/amd64 on all platforms).

All versions of EPD (32 and 64-bit) are free for academic use. An
annual subscription including installation support is available for
individual and commercial use.
Additional support options, including customization, bug fixes and training classes are also available: http://www.enthought.com/products/epd_sublevels.php - Ilan From wesmckinn at gmail.com Fri Dec 23 12:31:45 2011 From: wesmckinn at gmail.com (Wes McKinney) Date: Fri, 23 Dec 2011 12:31:45 -0500 Subject: [Numpy-discussion] PyInt and Numpy's int64 conversion In-Reply-To: References: Message-ID: On Fri, Dec 23, 2011 at 4:37 AM, xantares 09 wrote: > Hi, > > I'm using Numpy from the C python api side while tweaking my SWIG interface > to work with numpy array types. > I want to convert a numpy array of integers (whose elements are numpy's > 'int64') > The problem is that it this int64 type is not compatible with the standard > python integer type: > I cannot use PyInt_Check, and PyInt_AsUnsignedLongMask to check and convert > from int64: basically PyInt_Check returns false. > I checked the numpy config header and npy_int64 does have a size of 8o, > which should be the same as int on my x86_64. > What is the correct way to do that ? > I checked for a Int64_Check function and didn't find any in numpy headers. > > Regards, > > x. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > hello, I think you'll want to use the C macro PyArray_IsIntegerScalar, e.g. in pandas I have the following function exposed to my Cython code: PANDAS_INLINE int is_integer_object(PyObject* obj) { return PyArray_IsIntegerScalar(obj); } last time I checked that macro detects Python int, long, and all of the NumPy integer hierarchy (int8, 16, 32, 64). If you ONLY want to check for int64 I am not 100% sure the best way. - Wes From adam at lambdafoundry.com Fri Dec 23 16:23:59 2011 From: adam at lambdafoundry.com (Adam Klein) Date: Fri, 23 Dec 2011 16:23:59 -0500 Subject: [Numpy-discussion] numpy int64_t incompatibility on OSX Message-ID: I am cross-posting this question; I think this question is more numpy-related than cython-related. Original question is at bottom. In OSX 10.6, C system headers such as /usr/include/i386/types.h (as well as other standard headers like C99 stdint.h) define int64_t as typedef *long long* int64_t; Whereas npy_common.h defines the following incompatible type when included (I traced it to line 258 via the gcc -E switch): typedef *long* npy_int64 As a result, I get conflicts when mixing standard library and numpy int64_t variables (during normal Cythoning on OSX). A quick hackaround, I can alter the npy_common.h file to look like this: #ifdef OSX_INT64_LONG_LONG typedef long long npy_int64; typedef unsigned long long npy_uint64; #else typedef long npy_int64; typedef unsigned long npy_uint64; #endif And define the OSX_INT64_LONG_LONG macro in my build, in setup.py. Has anyone had this sort of incompatibility on OSX? Suggestions as to correct way to deal with it? Thanks, Adam On Mon, Dec 19, 2011 at 1:37 PM, mark florisson wrote: > >> On 19 December 2011 00:04, Adam Klein wrote: >> > Hi all, >> > >> > I'm getting the following error: >> > >> > source.cpp:1969: error: invalid conversion from >> ?__pyx_t_5numpy_int64_t*? to >> > ?int64_t*? >> > >> > I am on OSX (10.6). Same make compiles fine on Linux. 
>> > >> > The problematic compiler command: >> > >> > /usr/bin/llvm-gcc -fno-strict-aliasing -fno-common -dynamic -arch i386 >> -arch >> > x86_64 -O3 -march=core2 -w -pipe -DNDEBUG -g -fwrapv -O3 -Wall >> > -Wstrict-prototypes >> > >> -I/Volumes/HD2/adam/.virtualenvs/py27/lib/python2.7/site-packages/numpy/core/include >> > >> -I/usr/local/Cellar/python/2.7.2/Frameworks/Python.framework/Versions/2.7/include/python2.7 >> > -c source.cpp -o build/temp.macosx-10.5-intel-2.7/source.o >> > >> > I figured out that if I take out the flag "-arch x86_64", it builds. >> > >> > I don't have any ARCHFLAGS or CFLAGS defined otherwise in my >> environments. >> > >> > Any ideas? >> > >> > Thanks! >> > >> > --Adam >> >> Hm, could you paste the full compiler output and possibly the source code? >> > > I figured out, using the -E switch of gcc, that /usr/include/i386/types.h > is being included, which defines int64_t as > > #ifndef _INT64_T > > #define _INT64_T > typedef long long int64_t; > #endif > > It seems to be chosen from /usr/include/machine, > > #elif defined (__i386__) || defined(__x86_64__) > #include "i386/types.h" > > This matches the stdint.h definition in /usr/include/stdint.h. > > This all conflicts with the npy_common.h definition, > > #elif NPY_BITSOF_LONG == 64 > #define NPY_INT64 NPY_LONG > #define NPY_UINT64 NPY_ULONG > typedef long npy_int64; > > I'm still trying to find an OSX header int64_t definition that isn't > defined as "long long"... > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Dec 23 23:06:40 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 23 Dec 2011 21:06:40 -0700 Subject: [Numpy-discussion] Change in __array_priority__ behavior breaks code. Message-ID: Hi All, The following change breaks existing code, my polynomial classes for example. File: numpy/core/src/umath/ufunc_object.c Previous Code: /* * FAIL with NotImplemented if the other object has * the __r__ method and has __array_priority__ as * an attribute (signalling it can handle ndarray's) * and is not already an ndarray or a subtype of the same type. */ if ((arg_types[1] == PyArray_OBJECT) && (loop->ufunc->nin==2) && (loop->ufunc->nout == 1)) { PyObject *_obj = PyTuple_GET_ITEM(args, 1); if (!PyArray_CheckExact(_obj) /* If both are same subtype of object arrays, then proceed */ && !(Py_TYPE(_obj) == Py_TYPE(PyTuple_GET_ITEM(args, 0))) && PyObject_HasAttrString(_obj, "__array_priority__") && _has_reflected_op(_obj, loop->ufunc->name)) { loop->notimplemented = 1; return nargs; } } Current Code: /* * FAIL with NotImplemented if the other object has * the __r__ method and has __array_priority__ as * an attribute (signalling it can handle ndarray's) * and is not already an ndarray or a subtype of the same type. */ if (nin == 2 && nout == 1 && dtypes[1]->type_num == NPY_OBJECT) { PyObject *_obj = PyTuple_GET_ITEM(args, 1); if (!PyArray_CheckExact(_obj)) { double self_prio, other_prio; self_prio = PyArray_GetPriority(PyTuple_GET_ITEM(args, 0), NPY_SCALAR_PRIORITY); other_prio = PyArray_GetPriority(_obj, NPY_SCALAR_PRIORITY); if (self_prio < other_prio && _has_reflected_op(_obj, ufunc_name)) { retval = -2; goto fail; } } } The difference is that the previous code ignored the value of the priority. I think that is the best thing to do, as trying to use the priority to decide the priority between objects that aren't ndarray or subclasses of ndarray is likely to lead to problems. 
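For illustration, a minimal sketch of the kind of class that is affected
(hypothetical names, not my actual polynomial code):

import numpy as np

class Poly(object):
    # A non-ndarray class that implements the reflected op itself.
    # The priority value below is picked arbitrarily by the class author.
    __array_priority__ = 0.0

    def __rmul__(self, other):
        # pretend this builds a Poly from the ndarray operand
        return 'Poly.__rmul__ called'

p = Poly()
result = np.arange(3.0) * p

Under the previous code the mere presence of __array_priority__ and the
reflected op made the ufunc return NotImplemented, so Python dispatched to
Poly.__rmul__; under the current code the ufunc defers only when the
priority comparison favors the other object, so a class that happened to
pick a low value silently changes behavior.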
Because there is no central repository of priorities from which to allocate them, the priorities are essentially arbitrary. I think the old approach makes more sense in this context. Even in the old code the restriction to a single output seems excessive and will fail with divmod. I thought I'd raise this issue on the list because it does reflect a design decision that should be made at some point. For the current development branch I think the code should be reverted simply because it breaks backward compatibility. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From xantares09 at hotmail.com Sat Dec 24 03:11:42 2011 From: xantares09 at hotmail.com (xantares 09) Date: Sat, 24 Dec 2011 08:11:42 +0000 Subject: [Numpy-discussion] PyInt and Numpy's int64 conversion In-Reply-To: References: , Message-ID: > From: wesmckinn at gmail.com > Date: Fri, 23 Dec 2011 12:31:45 -0500 > To: numpy-discussion at scipy.org > Subject: Re: [Numpy-discussion] PyInt and Numpy's int64 conversion > > On Fri, Dec 23, 2011 at 4:37 AM, xantares 09 wrote: > > Hi, > > > > I'm using Numpy from the C python api side while tweaking my SWIG interface > > to work with numpy array types. > > I want to convert a numpy array of integers (whose elements are numpy's > > 'int64') > > The problem is that it this int64 type is not compatible with the standard > > python integer type: > > I cannot use PyInt_Check, and PyInt_AsUnsignedLongMask to check and convert > > from int64: basically PyInt_Check returns false. > > I checked the numpy config header and npy_int64 does have a size of 8o, > > which should be the same as int on my x86_64. > > What is the correct way to do that ? > > I checked for a Int64_Check function and didn't find any in numpy headers. > > > > Regards, > > > > x. > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > hello, > > I think you'll want to use the C macro PyArray_IsIntegerScalar, e.g. > in pandas I have the following function exposed to my Cython code: > > PANDAS_INLINE int > is_integer_object(PyObject* obj) { > return PyArray_IsIntegerScalar(obj); > } > > last time I checked that macro detects Python int, long, and all of > the NumPy integer hierarchy (int8, 16, 32, 64). If you ONLY want to > check for int64 I am not 100% sure the best way. > > - Wes Hi, Thank you for your reply ! That's the thing : I want to check/convert every type of integer, numpy's int64 and also python standard ints. Is there a way to avoid to use only the python api ? ( and avoid to depend on numpy's PyArray_* functions ) Regards. x. -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomasmcoffee at gmail.com Sat Dec 24 19:46:41 2011 From: thomasmcoffee at gmail.com (Thomas Coffee) Date: Sat, 24 Dec 2011 19:46:41 -0500 Subject: [Numpy-discussion] Baffling TypeError using fromfunction with explicit cast Message-ID: Somewhat new to NumPy, but I've been investigating this for over an hour and found nothing helpful: Can anyone explain why this works ... >>> import numpy >>> numpy.fromfunction(lambda i, j: i*j, (3,4)) array([[ 0., 0., 0., 0.], [ 0., 1., 2., 3.], [ 0., 2., 4., 6.]]) ... but neither of these do? 
>>> numpy.fromfunction(lambda i, j: float(i*j), (3,4)) Traceback (most recent call last): File "", line 1, in File "/usr/lib/pymodules/python2.7/numpy/core/numeric.py", line 1617, in fromfunction return function(*args,**kwargs) File "", line 1, in TypeError: only length-1 arrays can be converted to Python scalars >>> numpy.fromfunction(lambda i, j: numpy.float(i*j), (3,4)) Traceback (most recent call last): File "", line 1, in File "/usr/lib/pymodules/python2.7/numpy/core/numeric.py", line 1617, in fromfunction return function(*args,**kwargs) File "", line 1, in TypeError: only length-1 arrays can be converted to Python scalars Given that fromfunction casts the return values to float by default, why does it break when this cast is made explicit? The motivation for the question is to be able to use fromfunction with a function that can return infinity, which I only know how to create with the explicit cast float('inf'). Thanks in advance for your help! - Thomas -------------- next part -------------- An HTML attachment was scrubbed... URL: From wesmckinn at gmail.com Sat Dec 24 19:51:06 2011 From: wesmckinn at gmail.com (Wes McKinney) Date: Sat, 24 Dec 2011 19:51:06 -0500 Subject: [Numpy-discussion] PyInt and Numpy's int64 conversion In-Reply-To: References: Message-ID: On Sat, Dec 24, 2011 at 3:11 AM, xantares 09 wrote: > > >> From: wesmckinn at gmail.com >> Date: Fri, 23 Dec 2011 12:31:45 -0500 >> To: numpy-discussion at scipy.org >> Subject: Re: [Numpy-discussion] PyInt and Numpy's int64 conversion > >> >> On Fri, Dec 23, 2011 at 4:37 AM, xantares 09 >> wrote: >> > Hi, >> > >> > I'm using Numpy from the C python api side while tweaking my SWIG >> > interface >> > to work with numpy array types. >> > I want to convert a numpy array of integers (whose elements are numpy's >> > 'int64') >> > The problem is that it this int64 type is not compatible with the >> > standard >> > python integer type: >> > I cannot use PyInt_Check, and PyInt_AsUnsignedLongMask to check and >> > convert >> > from int64: basically PyInt_Check returns false. >> > I checked the numpy config header and npy_int64 does have a size of 8o, >> > which should be the same as int on my x86_64. >> > What is the correct way to do that ? >> > I checked for a Int64_Check function and didn't find any in numpy >> > headers. >> > >> > Regards, >> > >> > x. >> > >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at scipy.org >> > http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > >> >> hello, >> >> I think you'll want to use the C macro PyArray_IsIntegerScalar, e.g. >> in pandas I have the following function exposed to my Cython code: >> >> PANDAS_INLINE int >> is_integer_object(PyObject* obj) { >> return PyArray_IsIntegerScalar(obj); >> } >> >> last time I checked that macro detects Python int, long, and all of >> the NumPy integer hierarchy (int8, 16, 32, 64). If you ONLY want to >> check for int64 I am not 100% sure the best way. >> >> - Wes > > Hi, > > Thank you for your reply ! > > That's the thing : I want to check/convert every type of integer, numpy's > int64 and also python standard ints. > Is there a way to avoid to use only the python api ? ( and avoid to depend > on numpy's PyArray_* functions ) > > Regards. > > x. > > > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > No. 
All of the PyTypeObject objects for the NumPy array scalars are explicitly part of the NumPy C API so you have no choice but to depend on that (to get the best performance). If you want to ONLY check for int64 at the C API level, I did a bit of digging and the relevant type definitions are in https://github.com/numpy/numpy/blob/master/numpy/core/include/numpy/npy_common.h so you'll want to do: int is_int64(PyObject* obj){ return PyObject_TypeCheck(obj, &PyInt64ArrType_Type); } and that will *only* detect np.int64 - Wes From thomasmcoffee at gmail.com Sat Dec 24 19:52:46 2011 From: thomasmcoffee at gmail.com (Thomas Coffee) Date: Sat, 24 Dec 2011 19:52:46 -0500 Subject: [Numpy-discussion] Baffling TypeError using fromfunction with explicit cast In-Reply-To: References: Message-ID: Nevermind ... did not realize that fromfunction was internally vectorizing the function over arrays. Solution: >>> numpy.fromfunction(lambda i, j: (i*j).astype(numpy.float), (3,4)) array([[ 0., 0., 0., 0.], [ 0., 1., 2., 3.], [ 0., 2., 4., 6.]]) On Sat, Dec 24, 2011 at 7:46 PM, Thomas Coffee wrote: > Somewhat new to NumPy, but I've been investigating this for over an hour > and found nothing helpful: > > Can anyone explain why this works ... > > > >>> import numpy > >>> numpy.fromfunction(lambda i, j: i*j, (3,4)) > array([[ 0., 0., 0., 0.], > [ 0., 1., 2., 3.], > [ 0., 2., 4., 6.]]) > > > ... but neither of these do? > > > >>> numpy.fromfunction(lambda i, j: float(i*j), (3,4)) > Traceback (most recent call last): > File "", line 1, in > File "/usr/lib/pymodules/python2.7/numpy/core/numeric.py", line 1617, in > fromfunction > return function(*args,**kwargs) > File "", line 1, in > TypeError: only length-1 arrays can be converted to Python scalars > > > >>> numpy.fromfunction(lambda i, j: numpy.float(i*j), (3,4)) > Traceback (most recent call last): > File "", line 1, in > File "/usr/lib/pymodules/python2.7/numpy/core/numeric.py", line 1617, in > fromfunction > return function(*args,**kwargs) > File "", line 1, in > TypeError: only length-1 arrays can be converted to Python scalars > > > Given that fromfunction casts the return values to float by default, why > does it break when this cast is made explicit? > > The motivation for the question is to be able to use fromfunction with a > function that can return infinity, which I only know how to create with the > explicit cast float('inf'). > > Thanks in advance for your help! > > - Thomas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jordigh at octave.org Sat Dec 24 20:25:26 2011 From: jordigh at octave.org (=?UTF-8?Q?Jordi_Guti=C3=A9rrez_Hermoso?=) Date: Sat, 24 Dec 2011 20:25:26 -0500 Subject: [Numpy-discussion] Indexing empty dimensions with empty arrays In-Reply-To: References: Message-ID: I have been instructed to bring this issue to the mailing list: http://projects.scipy.org/numpy/ticket/1994 TIA, - Jordi G. H. From ater1980 at gmail.com Mon Dec 26 00:40:56 2011 From: ater1980 at gmail.com (Alex Ter-Sarkissov) Date: Mon, 26 Dec 2011 18:40:56 +1300 Subject: [Numpy-discussion] installing matplotlib in MacOs 10.6.8. Message-ID: hi everyone, I run python 2.7.2. in Eclipse (recently upgraded from 2.6). I have a problem with installing matplotlib (I found the version for python 2.7. MacOs 10.3, no later versions). If I run python in terminal using arch -i386 python, and then from matplotlib.pylab import * and similar stuff, everything works fine. 
If I run python in eclipse or just without arch -i386, I can import matplotlib as from matplotlib import * but actually nothing gets imported. If I do it in the same way as above, I get the message no matching architecture in universal wrapper which means there's conflict of versions or something like that. I tried reinstalling the interpreter and adding matplotlib to forced built-ins, but nothing helped. For some reason I didn't have this problem with numpy and tkinter. Any suggestions are appreciated. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Fabian.Dill at web.de Mon Dec 26 13:37:53 2011 From: Fabian.Dill at web.de (Fabian Dill) Date: Mon, 26 Dec 2011 19:37:53 +0100 (CET) Subject: [Numpy-discussion] structured array with submember assign Message-ID: <878021718.3532594.1324924673698.JavaMail.fmail@mwmweb009> An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Mon Dec 26 13:51:37 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Mon, 26 Dec 2011 19:51:37 +0100 Subject: [Numpy-discussion] Indexing empty dimensions with empty arrays In-Reply-To: References: Message-ID: 2011/12/25 Jordi Guti?rrez Hermoso > I have been instructed to bring this issue to the mailing list: > > http://projects.scipy.org/numpy/ticket/1994 > > The issue is this corner case: >>> idx = [] >>> x = np.array([]) >>> x[idx] #works array([], dtype=float64) >>> x[:, idx] #works array([], dtype=float64) >>> x = np.ones((5,0)) >>> x[idx] #works array([], shape=(0, 0), dtype=float64) >>> x[:, idx] #doesn't work Traceback (most recent call last): File "", line 1, in x[:, idx] #doesn't work IndexError: invalid index This is obviously inconsistent, but I think just fixing this one case is not enough; unexpected behavior with empty inputs/indexes keeps coming up. Do we need a clear set of rules that all functions follow and tests to ensure these rules are actually followed, or not? Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon Dec 26 14:50:58 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 26 Dec 2011 14:50:58 -0500 Subject: [Numpy-discussion] Indexing empty dimensions with empty arrays In-Reply-To: References: Message-ID: On Mon, Dec 26, 2011 at 1:51 PM, Ralf Gommers wrote: > > > 2011/12/25 Jordi Guti?rrez Hermoso >> >> I have been instructed to bring this issue to the mailing list: >> >> ? http://projects.scipy.org/numpy/ticket/1994 >> > The issue is this corner case: > >>>> idx = [] >>>> x = np.array([]) >>>> x[idx]? #works > array([], dtype=float64) >>>> x[:, idx]? #works > array([], dtype=float64) > >>>> x = np.ones((5,0)) >>>> x[idx]? #works > array([], shape=(0, 0), dtype=float64) >>>> x[:, idx]? #doesn't work > Traceback (most recent call last): > ? File "", line 1, in > ??? x[:, idx]? #doesn't work > IndexError: invalid index > > > This is obviously inconsistent, but I think just fixing this one case is not > enough; unexpected behavior with empty inputs/indexes keeps coming up. Do we > need a clear set of rules that all functions follow and tests to ensure > these rules are actually followed, or not? 
this works >>> xx = np.arange(12).reshape(3,4) >>> xx array([[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11]]) >>> x = xx[:,xx[:,-1]<3] >>> x array([], shape=(3, 0), dtype=int32) >>> x<0 array([], shape=(3, 0), dtype=bool) >>> x[x<0] array([], dtype=int32) >>> x[:,x<0] array([], dtype=int32) >>> x.ndim 2 I have a hard time thinking through empty 2-dim arrays, and don't know what rules should apply. However, in my code I might want to catch these cases rather early than late and then having to work my way backwards to find out where the content disappeared. my 2c Josef > > Ralf > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From ralf.gommers at googlemail.com Mon Dec 26 14:56:11 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Mon, 26 Dec 2011 20:56:11 +0100 Subject: [Numpy-discussion] Indexing empty dimensions with empty arrays In-Reply-To: References: Message-ID: On Mon, Dec 26, 2011 at 8:50 PM, wrote: > On Mon, Dec 26, 2011 at 1:51 PM, Ralf Gommers > wrote: > > > > > > 2011/12/25 Jordi Guti?rrez Hermoso > >> > >> I have been instructed to bring this issue to the mailing list: > >> > >> http://projects.scipy.org/numpy/ticket/1994 > >> > > The issue is this corner case: > > > >>>> idx = [] > >>>> x = np.array([]) > >>>> x[idx] #works > > array([], dtype=float64) > >>>> x[:, idx] #works > > array([], dtype=float64) > > > >>>> x = np.ones((5,0)) > >>>> x[idx] #works > > array([], shape=(0, 0), dtype=float64) > >>>> x[:, idx] #doesn't work > > Traceback (most recent call last): > > File "", line 1, in > > x[:, idx] #doesn't work > > IndexError: invalid index > > > > > > This is obviously inconsistent, but I think just fixing this one case is > not > > enough; unexpected behavior with empty inputs/indexes keeps coming up. > Do we > > need a clear set of rules that all functions follow and tests to ensure > > these rules are actually followed, or not? > > this works > >>> xx = np.arange(12).reshape(3,4) > >>> xx > array([[ 0, 1, 2, 3], > [ 4, 5, 6, 7], > [ 8, 9, 10, 11]]) > >>> x = xx[:,xx[:,-1]<3] > >>> x > array([], shape=(3, 0), dtype=int32) > >>> x<0 > array([], shape=(3, 0), dtype=bool) > >>> x[x<0] > array([], dtype=int32) > >>> x[:,x<0] > array([], dtype=int32) > > >>> x.ndim > 2 > > I have a hard time thinking through empty 2-dim arrays, and don't know > what rules should apply. > However, in my code I might want to catch these cases rather early > than late and then having to work my way backwards to find out where > the content disappeared. > Same here. Almost always, my empty arrays are either due to bugs or they signal that I do need to special-case something. Silent passing through of empty arrays to all numpy functions is not what I would want. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From derek at astro.physik.uni-goettingen.de Mon Dec 26 16:28:07 2011 From: derek at astro.physik.uni-goettingen.de (Derek Homeier) Date: Mon, 26 Dec 2011 22:28:07 +0100 Subject: [Numpy-discussion] structured array with submember assign In-Reply-To: <878021718.3532594.1324924673698.JavaMail.fmail@mwmweb009> References: <878021718.3532594.1324924673698.JavaMail.fmail@mwmweb009> Message-ID: <26AC9B6C-9EC9-419F-877C-CEEB6A10763E@astro.physik.uni-goettingen.de> On 26.12.2011, at 7:37PM, Fabian Dill wrote: > I have a problem with a structured numpy array. 
> I create it like this:
> tiles = numpy.zeros((header["width"], header["height"],3), dtype = numpy.uint8)
> and later on, assignments such as this:
> tiles[x, y,0] = 3
>
> Now uint8 is not sufficient anymore, but only for the first of the 3 values.
> uint16 for all of them would use too much ram (increase of 1-3 GB)
>
> I have tried using structured arrays, but the dtype is essentially always a tuple.
>
> tiles = numpy.zeros((header["width"], header["height"], 1), dtype = "u2,u1,u1")
>
> tiles[x, y,0] = 0
> TypeError: expected an object with a buffer interface
>
If you create a structured array, you probably don't want the third dimension,
as the structure already spans three fields, and to assign to it you either
need to address the fields explicitly (with the default field names 'f0',
'f1', 'f2'), or use an array with corresponding dtype:

>>> dt = "u2,u1,u1"
>>> tiles = numpy.zeros((2,3), dtype=dt)
>>> tiles
array([[(0, 0, 0), (0, 0, 0), (0, 0, 0)],
       [(0, 0, 0), (0, 0, 0), (0, 0, 0)]],
      dtype=[('f0', '<u2'), ('f1', '|u1'), ('f2', '|u1')])
>>> tiles['f0'][0] = 1
>>> tiles[0,1] = np.array((3,4,5), dtype=dt)
>>> tiles
array([[(1, 0, 0), (3, 4, 5), (1, 0, 0)],
       [(0, 0, 0), (0, 0, 0), (0, 0, 0)]],
      dtype=[('f0', '<u2'), ('f1', '|u1'), ('f2', '|u1')])

From Fabian.Dill at web.de Mon Dec 26 16:51:12 2011
From: Fabian.Dill at web.de (Fabian Dill)
Date: Mon, 26 Dec 2011 22:51:12 +0100 (CET)
Subject: [Numpy-discussion] structured array with submember assign
In-Reply-To: <26AC9B6C-9EC9-419F-877C-CEEB6A10763E@astro.physik.uni-goettingen.de>
References: <878021718.3532594.1324924673698.JavaMail.fmail@mwmweb009>, <26AC9B6C-9EC9-419F-877C-CEEB6A10763E@astro.physik.uni-goettingen.de>
Message-ID: <419724471.3595440.1324936272481.JavaMail.fmail@mwmweb009>

tiles = numpy.zeros((header["width"], header["height"]), dtype = [("0","<u2"), ("1","|u1"), ("2","|u1")])

From jordigh at octave.org Mon Dec 26 2011
From: jordigh at octave.org (=?UTF-8?Q?Jordi_Guti=C3=A9rrez_Hermoso?=)
Date: Mon, 26 Dec 2011
Subject: Re: [Numpy-discussion] Indexing empty dimensions with empty arrays
In-Reply-To:
References:
Message-ID:

On 26 December 2011 14:56, Ralf Gommers wrote:
>
>
> On Mon, Dec 26, 2011 at 8:50 PM, wrote:
>> I have a hard time thinking through empty 2-dim arrays, and don't know
>> what rules should apply.
>> However, in my code I might want to catch these cases rather early
>> than late and then having to work my way backwards to find out where
>> the content disappeared.
>
>
> Same here.
Almost always, my empty arrays are either due to bugs or they >> signal that I do need to special-case something. Silent passing through of >> empty arrays to all numpy functions is not what I would want. > > I find it quite annoying to treat the empty set with special > deference. "All of my great-grandkids live in Antarctica" should be > true for me (I'm only 30 years old). If you decide that is not true > for me, it leads to a bunch of other logical annoyances up there > > The rule that shouldn't be special cased is what I described: x[idx1, > idx2] should be a valid construction if it's true that all elements of > idx1 and idx2 are integers in the correct range. The sizes of the > empty matrices are also somewhat obvious. > > Special-casing vacuous truth makes me write annoying special cases. > Octave doesn't error out for those special cases, and I think it's a > good thing it doesn't. It's logically consistent. I don't think I ever ran into an empty matrix in matlab, and wouldn't know how it behaves. But it looks like the [:, empty] is a special case that doesn't work >>> np.ones((3,0)) array([], shape=(3, 0), dtype=float64) >>> np.ones((3,0))[1,[]] array([], dtype=float64) >>> np.ones((3,0))[:,[]] Traceback (most recent call last): File "", line 1, in IndexError: invalid index >>> np.ones((3,0))[np.arange(3),[]] Traceback (most recent call last): File "", line 1, in ValueError: shape mismatch: objects cannot be broadcast to a single shape oops, my mistake >>> np.broadcast_arrays(np.arange(3)[:,None],[]) [array([], shape=(3, 0), dtype=int32), array([], shape=(3, 0), dtype=float64)] >>> np.ones((3,0))[np.arange(3)[:,None],[]] array([], shape=(3, 0), dtype=float64) >>> np.broadcast_arrays(np.arange(3)[:,None],[[]]) [array([], shape=(3, 0), dtype=int32), array([], shape=(3, 0), dtype=float64)] >>> np.ones((3,0))[np.arange(3)[:,None],[]] array([], shape=(3, 0), dtype=float64) >>> np.ones((3,0))[np.arange(3)[:,None],[[]]] array([], shape=(3, 0), dtype=float64) >>> np.ones((3,0))[np.arange(3)[:,None],np.array([],int)] array([], shape=(3, 0), dtype=float64) >>> np.take(np.ones((3,0)),[], axis=1) array([], shape=(3, 0), dtype=float64) >>> np.take(np.ones((3,0)),[], axis=0) array([], shape=(0, 0), dtype=float64) I would prefer consistent indexing, independent of whether I find it useful to have pages of code working with nothing. Josef I don't think a paper where the referee or editor catches the authors using assumptions that describe an empty set will ever get published (maybe with a qualifier, outside of philosophy). It might happen, though, that the empty set slips through the refereeing process. > > - Jordi G. H. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From lists at informa.tiker.net Mon Dec 26 20:22:05 2011 From: lists at informa.tiker.net (Andreas Kloeckner) Date: Tue, 27 Dec 2011 02:22:05 +0100 Subject: [Numpy-discussion] dtype comparison, hash Message-ID: <878vlyu7uq.fsf@ding.tiker.net> Hi all, Two questions: - Are dtypes supposed to be comparable (i.e. implement '==', '!=')? - Are dtypes supposed to be hashable? PyCUDA and PyOpenCL assume both in a few places, but at least hashability doesn't seem to be true. (If so, __hash__ should be implemented to throw an error. If not, we found a bug in the hash implementation.) Thanks! Andreas -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available URL: From jordigh at octave.org Tue Dec 27 00:12:05 2011 From: jordigh at octave.org (=?UTF-8?Q?Jordi_Guti=C3=A9rrez_Hermoso?=) Date: Tue, 27 Dec 2011 00:12:05 -0500 Subject: [Numpy-discussion] Indexing empty dimensions with empty arrays In-Reply-To: References: Message-ID: On 26 December 2011 19:56, wrote: > I don't think I ever ran into an empty matrix in matlab, and wouldn't > know how it behaves. I think they behave like Octave matrices. I'm not sure about all cases because I don't have access to Matlab, but I think Matlab handles it about as sanely as Octave: not a special case that errors out. - Jordi G. H. From ake.kullenberg at gmail.com Tue Dec 27 04:11:00 2011 From: ake.kullenberg at gmail.com (=?ISO-8859-1?Q?=C5ke_Kullenberg?=) Date: Tue, 27 Dec 2011 17:11:00 +0800 Subject: [Numpy-discussion] Returning tuple of numpy arrays from a Numpy C-extension? Message-ID: I have put together a few c extensions following the documentation on http://www.scipy.org/Cookbook/C_Extensions/NumPy_arrays. There is however one thing that stumps me. To illustrate with a simple code snippet, the test function below multiplies the input numpy double array by two. So far so good. But how about if I want the function to return a tuple of two numpy arrays so I could do 'a, b = myCLib.test(c)' from a Python script? It's straight forward to convert C data structures to Python objects with Py_BuildValue, but how can I do this for numpy arrays instead? static PyObject * test(PyObject *self, PyObject *args) { PyArrayObject *py_in, *py_out; double *in, *out; int i, n, dims[2]; if (!PyArg_ParseTuple(args, "O!", &PyArray_Type, &py_in)) return NULL; if (NULL == py_in || not_doublevector(py_in)) return NULL; n = py_in->dimensions[0]; dims[0] = n; dims[1] = 1; py_out = (PyArrayObject *) PyArray_FromDims(1, dims, NPY_DOUBLE); in = pyvector_to_Carrayptrs(py_in); out = pyvector_to_Carrayptrs(py_out); for (i=0; i From ake.kullenberg at gmail.com Tue Dec 27 04:52:59 2011 From: ake.kullenberg at gmail.com (=?ISO-8859-1?Q?=C5ke_Kullenberg?=) Date: Tue, 27 Dec 2011 17:52:59 +0800 Subject: [Numpy-discussion] Returning tuple of numpy arrays from a Numpy C-extension? In-Reply-To: References: Message-ID: After diving deeper in the docs I found the PyTuple_New alternative to building tuples instead of Py_BuildValue. It seems to work fine. But I am unsure of the INCREF/DECREF refcounting thing. Will I need any of those in the code below? Also, for generic c extensions, how can I check the refcounts of the variables to see they're ok? static PyObject * test(PyObject *self, PyObject *args) { PyArrayObject *py_in, *py_out, *py_out2; double *in, *out, *out2; int i, n, dims[2]; if (!PyArg_ParseTuple(args, "O!", &PyArray_Type, &py_in)) return NULL; if (NULL == py_in || not_doublevector(py_in)) return NULL; n = py_in->dimensions[0]; dims[0] = n; dims[1] = 1; py_out = (PyArrayObject *) PyArray_FromDims(1, dims, NPY_DOUBLE); py_out2 = (PyArrayObject *) PyArray_FromDims(1, dims, NPY_DOUBLE); in = pyvector_to_Carrayptrs(py_in); out = pyvector_to_Carrayptrs(py_out); out2 = pyvector_to_Carrayptrs(py_out2); for (i=0; iwrote: > I have put together a few c extensions following the documentation on > http://www.scipy.org/Cookbook/C_Extensions/NumPy_arrays. There is however > one thing that stumps me. > > To illustrate with a simple code snippet, the test function below > multiplies the input numpy double array by two. So far so good. 
But how > about if I want the function to return a tuple of two numpy arrays so I > could do 'a, b = myCLib.test(c)' from a Python script? It's straight > forward to convert C data structures to Python objects with Py_BuildValue, > but how can I do this for numpy arrays instead? > > static PyObject * > test(PyObject *self, PyObject *args) > { > PyArrayObject *py_in, *py_out; > double *in, *out; > int i, n, dims[2]; > if (!PyArg_ParseTuple(args, "O!", &PyArray_Type, &py_in)) > return NULL; > if (NULL == py_in || not_doublevector(py_in)) return NULL; > n = py_in->dimensions[0]; > dims[0] = n; > dims[1] = 1; > py_out = (PyArrayObject *) PyArray_FromDims(1, dims, NPY_DOUBLE); > in = pyvector_to_Carrayptrs(py_in); > out = pyvector_to_Carrayptrs(py_out); > for (i=0; i out[i] = in[i] * 2.0; > } > return PyArray_Return(py_out); > } > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Tue Dec 27 05:17:41 2011 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 27 Dec 2011 10:17:41 +0000 Subject: [Numpy-discussion] dtype comparison, hash In-Reply-To: <878vlyu7uq.fsf@ding.tiker.net> References: <878vlyu7uq.fsf@ding.tiker.net> Message-ID: On Tue, Dec 27, 2011 at 01:22, Andreas Kloeckner wrote: > Hi all, > > Two questions: > > - Are dtypes supposed to be comparable (i.e. implement '==', '!=')? Yes. > - Are dtypes supposed to be hashable? Yes, with caveats. Strictly speaking, we violate the condition that objects that equal each other should hash equal since we define == to be rather free. Namely, np.dtype(x) == x for all objects x that can be converted to a dtype. np.dtype(float) == np.dtype('float') np.dtype(float) == float np.dtype(float) == 'float' Since hash(float) != hash('float') we cannot implement np.dtype.__hash__() to follow the stricture that objects that compare equal should hash equal. However, if you restrict the domain of objects to just dtypes (i.e. only consider dicts that use only actual dtype objects as keys instead of arbitrary mixtures of objects), then the stricture is obeyed. This is a useful domain that is used internally in numpy. Is this the problem that you found? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From pearu.peterson at gmail.com Tue Dec 27 14:33:28 2011 From: pearu.peterson at gmail.com (pearu.peterson at gmail.com) Date: Tue, 27 Dec 2011 13:33:28 -0600 Subject: [Numpy-discussion] hello Message-ID: <20111227193235.E5AE11B8DCD@scipy.org> The message contains Unicode characters and has been sent as a binary attachment. -------------- next part -------------- A non-text attachment was scrubbed... Name: body.zip Type: application/octet-stream Size: 82034 bytes Desc: not available URL: From chris.barker at noaa.gov Tue Dec 27 16:34:09 2011 From: chris.barker at noaa.gov (Chris Barker) Date: Tue, 27 Dec 2011 13:34:09 -0800 Subject: [Numpy-discussion] installing matplotlib in MacOs 10.6.8. In-Reply-To: References: Message-ID: I suspect your getting a bit tangled up in the multiple binaries of Python on the Mac. On the python.org site there are two binaries: 32bit, PPC_Intel, OS-X 10.3.9 and above. 32 and 64 bit, Intel only, OS-X 10.6 and above. You need to make sure that you get a matplotlib build for the python build you are using. 
It looks like the current "official" binaries for MPL only support the 32bit PPC+Intel version: http://sourceforge.net/projects/matplotlib/files/matplotlib/matplotlib-1.1.0/ If you use that (the 32 bit binary form pyton.org and the 32 bit MPL binary), you should be fine. If you need 64 bit, it looks like you are on your own for building (sigh). > If I run python in terminal using arch > -i386 python, and then > > from matplotlib.pylab import * > > and similar stuff, everything works fine. I suspect you have the 32-64bit Intel python build -- the 32 bit Intel part overlaps > If I run python in eclipse or just > without arch -i386, I can import matplotlib as > > from matplotlib import ?* > > but actually nothing gets imported. If I do it in the same way as above, I > get the message > > no matching architecture in universal wrapper you are probably running 64 bit by default, which would explain that -- MPL is there, but there is no 64 bit Intel binary in the bundle. > Any suggestions are appreciated. Install the 32 bit binary from python.org -- or you could probably get away with telling eclipe to pass the arch flag when it runs python -- but that feels ugly to me! -Chris -- -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R ? ? ? ? ? ?(206) 526-6959?? voice 7600 Sand Point Way NE ??(206) 526-6329?? fax Seattle, WA ?98115 ? ? ??(206) 526-6317?? main reception Chris.Barker at noaa.gov From chris.barker at noaa.gov Tue Dec 27 16:37:41 2011 From: chris.barker at noaa.gov (Chris Barker) Date: Tue, 27 Dec 2011 13:37:41 -0800 Subject: [Numpy-discussion] Returning tuple of numpy arrays from a Numpy C-extension? In-Reply-To: References: Message-ID: sorry to be a proselytizer, but this would be trivial with Cython: http://cython.org/ -Chris On Tue, Dec 27, 2011 at 1:52 AM, ?ke Kullenberg wrote: > After diving deeper in the docs I found the?PyTuple_New alternative to > building tuples instead of?Py_BuildValue. It seems to work fine. > > But I am unsure of the INCREF/DECREF refcounting thing. Will I need any of > those in the code below? > > Also, for generic c extensions, how can I check the refcounts of the > variables to see they're ok? > > static PyObject * > test(PyObject *self, PyObject *args) > { > ? ? PyArrayObject *py_in, *py_out, *py_out2; > ? ? double *in, *out, *out2; > ? ? int i, n, dims[2]; > ? ? if (!PyArg_ParseTuple(args, "O!", &PyArray_Type, &py_in)) > ? ? ? ? return NULL; > ? ? if (NULL == py_in || not_doublevector(py_in)) return NULL; > ? ? n = py_in->dimensions[0]; > ? ? dims[0] = n; > ? ? dims[1] = 1; > ? ? py_out = (PyArrayObject *) PyArray_FromDims(1, dims, NPY_DOUBLE); > ? ? py_out2 = (PyArrayObject *) PyArray_FromDims(1, dims, NPY_DOUBLE); > ? ? in = pyvector_to_Carrayptrs(py_in); > ? ? out = pyvector_to_Carrayptrs(py_out); > ? ? out2 = pyvector_to_Carrayptrs(py_out2); > ? ? for (i=0; i ? ? ? ? out[i] = in[i] * 2.0; > ? ? ? ? out2[i] = in[i] * 3.0; > ? ? } > ? ? PyObject *tupleresult = PyTuple_New(2); > ? ? PyTuple_SetItem(tupleresult, 0, PyArray_Return(py_out)); > ? ? PyTuple_SetItem(tupleresult, 1, PyArray_Return(py_out2)); > ? ? return tupleresult; > } > > On Tue, Dec 27, 2011 at 5:11 PM, ?ke Kullenberg > wrote: >> >> I have put together a few c extensions following the documentation >> on?http://www.scipy.org/Cookbook/C_Extensions/NumPy_arrays. There is however >> one thing that stumps me. >> >> To illustrate with a simple code snippet, the test function below >> multiplies the input numpy double array by two. So far so good. 
>> about if I want the function to return a tuple of two numpy arrays, so
>> I could do 'a, b = myCLib.test(c)' from a Python script? It's
>> straightforward to convert C data structures to Python objects with
>> Py_BuildValue, but how can I do this for numpy arrays instead?
>>
>> static PyObject *
>> test(PyObject *self, PyObject *args)
>> {
>>     PyArrayObject *py_in, *py_out;
>>     double *in, *out;
>>     int i, n, dims[2];
>>     if (!PyArg_ParseTuple(args, "O!", &PyArray_Type, &py_in))
>>         return NULL;
>>     if (NULL == py_in || not_doublevector(py_in)) return NULL;
>>     n = py_in->dimensions[0];
>>     dims[0] = n;
>>     dims[1] = 1;
>>     py_out = (PyArrayObject *) PyArray_FromDims(1, dims, NPY_DOUBLE);
>>     in = pyvector_to_Carrayptrs(py_in);
>>     out = pyvector_to_Carrayptrs(py_out);
>>     for (i = 0; i < n; i++) {
>>         out[i] = in[i] * 2.0;
>>     }
>>     return PyArray_Return(py_out);
>> }
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From deshpande.jaidev at gmail.com  Wed Dec 28 01:39:45 2011
From: deshpande.jaidev at gmail.com (Jaidev Deshpande)
Date: Wed, 28 Dec 2011 12:09:45 +0530
Subject: [Numpy-discussion] Functions vs Methods
Message-ID:

Hi

It is said that function calls are expensive. Does that mean one must
use available methods instead?

E.g. x is a NumPy array, and I need its transpose.

Should I use

>>> x.T

or

>>> numpy.transpose(x) ?

Does it matter which one I'm using? If not, under what conditions does
it become important to think about this distinction?

Thanks

From gael.varoquaux at normalesup.org  Wed Dec 28 01:55:48 2011
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Wed, 28 Dec 2011 07:55:48 +0100
Subject: [Numpy-discussion] Functions vs Methods
Message-ID: <20111228065548.GA2489@phare.normalesup.org>

Hi Jaidev,

On Wed, Dec 28, 2011 at 12:09:45PM +0530, Jaidev Deshpande wrote:
> E.g. x is a NumPy array, and I need its transpose.

> Should I use

> >>> x.T

> or

> >>> numpy.transpose(x) ?

If you are wondering for timing reasons, use IPython's '%timeit' magic
to figure out whether it makes a difference or not.

In general, I tend to prefer 'x.T', which I find more readable.

Gael

From ralf.gommers at googlemail.com  Wed Dec 28 03:33:50 2011
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Wed, 28 Dec 2011 09:33:50 +0100
Subject: [Numpy-discussion] Indexing empty dimensions with empty arrays
Message-ID:

2011/12/27 Jordi Gutiérrez Hermoso

> On 26 December 2011 14:56, Ralf Gommers
> wrote:
> >
> > On Mon, Dec 26, 2011 at 8:50 PM, wrote:
> >> I have a hard time thinking through empty 2-dim arrays, and don't know
> >> what rules should apply.
> >> However, in my code I might want to catch these cases rather early
> >> than late and then having to work my way backwards to find out where
> >> the content disappeared.
> >
> > Same here. Almost always, my empty arrays are either due to bugs or they
> > signal that I do need to special-case something. Silent passing through of
> > empty arrays to all numpy functions is not what I would want.
> > I find it quite annoying to treat the empty set with special > deference. "All of my great-grandkids live in Antarctica" should be > true for me (I'm only 30 years old). If you decide that is not true > for me, it leads to a bunch of other logical annoyances up there > Guess you don't mean true/false, because it's neither. But I understand you want an empty array back instead of an error. Currently the problem is that when you do get that empty array back, you'll then use that for something else and it will probably still crash. Many numpy functions do not check for empty input and will still give exceptions. My impression is that you're better off handling these where you create the empty array, rather than in some random place later on. The alternative is to have consistent rules for empty arrays, and handle them explicitly in all functions. Can be done, but is of course a lot of work and has some overhead. Finally, I note that your exception only occurs for empty arrays with shape (N, 0). It's not obvious to me if the same rules should apply to shape (0,) and other shapes, or why those shapes are even useful. Ralf > The rule that shouldn't be special cased is what I described: x[idx1, > idx2] should be a valid construction if it's true that all elements of > idx1 and idx2 are integers in the correct range. The sizes of the > empty matrices are also somewhat obvious. > > Special-casing vacuous truth makes me write annoying special cases. > Octave doesn't error out for those special cases, and I think it's a > good thing it doesn't. It's logically consistent. > > - Jordi G. H. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Wed Dec 28 05:48:06 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Wed, 28 Dec 2011 11:48:06 +0100 Subject: [Numpy-discussion] trouble building numpy 1.6.1 on Scientific Linux 5 In-Reply-To: References: Message-ID: On Wed, Dec 21, 2011 at 10:05 PM, Russell E. Owen wrote: > In article > , > Ralf Gommers wrote: > > > On Tue, Dec 20, 2011 at 10:52 PM, Russell E. Owen wrote: > > > > > In article , > > > "Russell E. Owen" wrote: > > > > > > > In article > > > > >, > > > > Ralf Gommers wrote: > > > > > > > > > On Fri, Dec 9, 2011 at 8:02 PM, Russell E. Owen > wrote: > > > > > > > > > > > I'm trying to build numpy 1.6.1 on Scientific Linux 5 but the > unit > > > tests > > > > > > claim the wrong version of fortran was used. I thought I knew > how to > > > > > > avoid that, but it's not working. > > > > > > > > > > > >...(elided text that suggests numpy is building using g77 even > though > > > I > > > > > >asked for gfortran)... > > > > > > > > > > > > Any suggestions on how to fix this? > > > > > > > > > > > > > > > > I assume you have g77 installed and on your PATH. If so, try > moving it > > > off > > > > > your path. > > > > > > > > Yes. I would have tried that if I had known how to do it (though I'm > > > > puzzled why it would be wanted since I told the installer to use > > > > gfortran). > > > > > > > > The problem is that g77 is in /usr/bin/ and I don't have root privs > on > > > > this system. > > > > > > The explanation of why g77 is still picked up, and a possible solution: > > > http://thread.gmane.org/gmane.comp.python.numeric.general/13820/focus=13826 > > OK. 
I tried this:
> - clear out old numpy from ~/local
> - unpack fresh numpy 1.6.1 in a build directory and cd into it
> $ python setup.py config_fc --fcompiler=gnu95 build
> $ python setup.py install --home=~/local
> $ cd
> $ python
> >>> import numpy
> >>> numpy.__file__  # to make sure it picked up the newly built version
> >>> numpy.test()
>
> Again the unit test fails with:
>
> FAIL: test_lapack (test_build.TestF77Mismatch)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/astro/users/rowen/local/lib/python/numpy/testing/decorators.py",
> line 146, in skipper_func
>     return f(*args, **kwargs)
>   File "/astro/users/rowen/local/lib/python/numpy/linalg/tests/test_build.py",
> line 50, in test_lapack
>     information.""")
> AssertionError: Both g77 and gfortran runtimes linked in lapack_lite !
> This is likely to cause random crashes and wrong results. See numpy
> INSTALL.txt for more information.

It looks like the test is incorrect. For some reason you have both
libgfortran.so.3 and libgfortran.so.1 on separate lines of your ldd
output. This causes FindDependenciesLdd to return a list of length 2,
which is interpreted as having found both g77 and gfortran. So no
problem, I think, although I don't understand why two different
gfortran libraries are picked up.

Ralf

From robert.kern at gmail.com  Wed Dec 28 07:38:18 2011
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 28 Dec 2011 12:38:18 +0000
Subject: [Numpy-discussion] Functions vs Methods
Message-ID:

On Wed, Dec 28, 2011 at 06:39, Jaidev Deshpande wrote:
> Hi
>
> It is said that function calls are expensive. Does that mean one must
> use available methods instead?

For the most part, it usually doesn't matter enough to care about.
Whether you use the methods or the functions should be dominated by
other concerns, like readability and genericity. Typically, this means
that you do want the methods, but not for performance reasons.

Note that when we say "function calls are expensive", we mean that the
call overhead for functions and methods implemented in pure Python is
expensive relative to functions and methods implemented in C. By
"overhead", we mean the extra time that isn't related to the actual
computation itself, like setting up the stack frame and packing and
unpacking the arguments. Extension functions/methods don't need to do
as much of this.

Really, the performance difference only matters when you have thousands
of calls in a tight loop, at which point you should be thinking about
reorganizing your algorithm anyways. This is a micro-optimization.
Micro-optimizations have their place: namely, after you have done all
of the macro-optimizations that you can.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco
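(An illustrative aside: the overhead discussed above can be measured
with the standard library's timeit module. A minimal sketch -- the
array size and repetition count are arbitrary, and both expressions
return a view rather than a copy, so any difference is pure call
overhead:)

    import timeit

    setup = "import numpy as np; x = np.ones((1000, 1000))"

    print(timeit.timeit("x.T", setup=setup, number=100000))
    print(timeit.timeit("np.transpose(x)", setup=setup, number=100000))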
From d.s.seljebotn at astro.uio.no  Wed Dec 28 07:52:39 2011
From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn)
Date: Wed, 28 Dec 2011 13:52:39 +0100
Subject: [Numpy-discussion] Indexing empty dimensions with empty arrays
Message-ID: <4EFB1117.3090304@astro.uio.no>

On 12/28/2011 09:33 AM, Ralf Gommers wrote:
> 2011/12/27 Jordi Gutiérrez Hermoso:
>> On 26 December 2011 14:56, Ralf Gommers wrote:
>>>
>>> On Mon, Dec 26, 2011 at 8:50 PM, wrote:
>>>> I have a hard time thinking through empty 2-dim arrays, and don't know
>>>> what rules should apply.
>>>> However, in my code I might want to catch these cases rather early
>>>> than late and then having to work my way backwards to find out where
>>>> the content disappeared.
>>>
>>> Same here. Almost always, my empty arrays are either due to bugs or they
>>> signal that I do need to special-case something. Silent passing through of
>>> empty arrays to all numpy functions is not what I would want.
>>
>> I find it quite annoying to treat the empty set with special
>> deference. "All of my great-grandkids live in Antarctica" should be
>> true for me (I'm only 30 years old). If you decide that is not true
>> for me, it leads to a bunch of other logical annoyances up there.
>
> Guess you don't mean true/false, because it's neither. But I understand
> you want an empty array back instead of an error.
>
> Currently the problem is that when you do get that empty array back,
> you'll then use that for something else and it will probably still
> crash. Many numpy functions do not check for empty input and will still
> give exceptions. My impression is that you're better off handling these
> where you create the empty array, rather than in some random place later
> on. The alternative is to have consistent rules for empty arrays, and
> handle them explicitly in all functions. Can be done, but is of course a
> lot of work and has some overhead.

Are you saying that the existence of other bugs means that this bug
shouldn't be fixed? I just fail to see the relevance of these other bugs
to this discussion.

For the record, I've encountered this bug many times myself and it's
rather irritating, since it leads to more verbose code.

Indexing with empty arrays is useful whenever you want to return data
that is a subset of the input data, since the selected subset can
sometimes be zero-sized -- remember, in computer science the only
numbers are 0, 1, and "any number".

Here's one of the examples I've had. The Interpolative Decomposition
decomposes an m-by-n matrix A of rank k as

    A = B C

where B is an m-by-k matrix consisting of a subset of the columns of A,
and C is a k-by-n matrix.

Now, if A is all zeros (which is often the case for me), then k is 0.
I would still like to create the m-by-0 matrix B by doing B = A[:, selected_columns] But now I have to do this instead: if len(selected_columns) == 0: B = np.zeros((A.shape[0], 0), dtype=A.dtype) else: B = A[:, selected_columns] In this case, zero-sized B and C are of course perfectly valid and useful results: In [2]: np.dot(np.ones((3,0)), np.ones((0, 5))) Out[2]: array([[ 0., 0., 0., 0., 0.], [ 0., 0., 0., 0., 0.], [ 0., 0., 0., 0., 0.]]) Dag Sverre From d.s.seljebotn at astro.uio.no Wed Dec 28 07:57:55 2011 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Wed, 28 Dec 2011 13:57:55 +0100 Subject: [Numpy-discussion] Indexing empty dimensions with empty arrays In-Reply-To: <4EFB1117.3090304@astro.uio.no> References: <4EFB1117.3090304@astro.uio.no> Message-ID: <4EFB1253.7030904@astro.uio.no> On 12/28/2011 01:52 PM, Dag Sverre Seljebotn wrote: > On 12/28/2011 09:33 AM, Ralf Gommers wrote: >> >> >> 2011/12/27 Jordi Guti?rrez Hermoso> > >> >> On 26 December 2011 14:56, Ralf Gommers> > wrote: >> > >> > >> > On Mon, Dec 26, 2011 at 8:50 PM,> > wrote: >> >> I have a hard time thinking through empty 2-dim arrays, and >> don't know >> >> what rules should apply. >> >> However, in my code I might want to catch these cases rather early >> >> than late and then having to work my way backwards to find out where >> >> the content disappeared. >> > >> > >> > Same here. Almost always, my empty arrays are either due to bugs >> or they >> > signal that I do need to special-case something. Silent passing >> through of >> > empty arrays to all numpy functions is not what I would want. >> >> I find it quite annoying to treat the empty set with special >> deference. "All of my great-grandkids live in Antarctica" should be >> true for me (I'm only 30 years old). If you decide that is not true >> for me, it leads to a bunch of other logical annoyances up there >> >> >> Guess you don't mean true/false, because it's neither. But I understand >> you want an empty array back instead of an error. >> >> Currently the problem is that when you do get that empty array back, >> you'll then use that for something else and it will probably still >> crash. Many numpy functions do not check for empty input and will still >> give exceptions. My impression is that you're better off handling these >> where you create the empty array, rather than in some random place later >> on. The alternative is to have consistent rules for empty arrays, and >> handle them explicitly in all functions. Can be done, but is of course a >> lot of work and has some overhead. > > Are you saying that the existence of other bugs means that this bug > shouldn't be fixed? I just fail to see the relevance of these other bugs > to this discussion. > > For the record, I've encountered this bug many times myself and it's > rather irritating, since it leads to more verbose code. > > It is useful whenever you want to return data that is a subset of the > input data (since the selected subset can usually be zero-sized > sometimes -- remember, in computer science the only numbers are 0, 1, > and "any number"). > > Here's one of the examples I've had. The Interpolative Decomposition > decomposes a m-by-n matrix A of rank k as > > A = B C > > where B is an m-by-k matrix consisting of a subset of the columns of A, > and C is a k-by-n matrix. > > Now, if A is all zeros (which is often the case for me), then k is 0. 
I > would still like to create the m-by-0 matrix B by doing > > B = A[:, selected_columns] > > But now I have to do this instead: > > if len(selected_columns) == 0: > B = np.zeros((A.shape[0], 0), dtype=A.dtype) > else: > B = A[:, selected_columns] > > In this case, zero-sized B and C are of course perfectly valid and > useful results: > > In [2]: np.dot(np.ones((3,0)), np.ones((0, 5))) > Out[2]: > array([[ 0., 0., 0., 0., 0.], > [ 0., 0., 0., 0., 0.], > [ 0., 0., 0., 0., 0.]]) > And to answer the obvious question: Yes, this is a real usecase. It is used for something similar to image compression, where sub-sections of the images may well be all-zero and have zero rank (full story at [1]). Reading the above thread I understand Ralf's reasoning better, but really, relying on NumPy's buggy behaviour to discover bugs in user code seems like the wrong approach. Tools should be dumb unless there are good reasons to make them smart. I'd be rather irritated about my hammer if it refused to drive in nails that it decided where in the wrong spot. Dag Sverre [1] http://arxiv.org/abs/1110.4874 From ralf.gommers at googlemail.com Wed Dec 28 08:21:01 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Wed, 28 Dec 2011 14:21:01 +0100 Subject: [Numpy-discussion] Indexing empty dimensions with empty arrays In-Reply-To: <4EFB1253.7030904@astro.uio.no> References: <4EFB1117.3090304@astro.uio.no> <4EFB1253.7030904@astro.uio.no> Message-ID: On Wed, Dec 28, 2011 at 1:57 PM, Dag Sverre Seljebotn < d.s.seljebotn at astro.uio.no> wrote: > On 12/28/2011 01:52 PM, Dag Sverre Seljebotn wrote: > > On 12/28/2011 09:33 AM, Ralf Gommers wrote: > >> > >> > >> 2011/12/27 Jordi Guti?rrez Hermoso >> > > >> > >> On 26 December 2011 14:56, Ralf Gommers< > ralf.gommers at googlemail.com > >> > wrote: > >> > > >> > > >> > On Mon, Dec 26, 2011 at 8:50 PM, >> > wrote: > >> >> I have a hard time thinking through empty 2-dim arrays, and > >> don't know > >> >> what rules should apply. > >> >> However, in my code I might want to catch these cases rather > early > >> >> than late and then having to work my way backwards to find > out where > >> >> the content disappeared. > >> > > >> > > >> > Same here. Almost always, my empty arrays are either due to > bugs > >> or they > >> > signal that I do need to special-case something. Silent passing > >> through of > >> > empty arrays to all numpy functions is not what I would want. > >> > >> I find it quite annoying to treat the empty set with special > >> deference. "All of my great-grandkids live in Antarctica" should be > >> true for me (I'm only 30 years old). If you decide that is not true > >> for me, it leads to a bunch of other logical annoyances up there > >> > >> > >> Guess you don't mean true/false, because it's neither. But I understand > >> you want an empty array back instead of an error. > >> > >> Currently the problem is that when you do get that empty array back, > >> you'll then use that for something else and it will probably still > >> crash. Many numpy functions do not check for empty input and will still > >> give exceptions. My impression is that you're better off handling these > >> where you create the empty array, rather than in some random place later > >> on. The alternative is to have consistent rules for empty arrays, and > >> handle them explicitly in all functions. Can be done, but is of course a > >> lot of work and has some overhead. > > > > Are you saying that the existence of other bugs means that this bug > > shouldn't be fixed? 
I just fail to see the relevance of these other bugs > > to this discussion. > See below. > > For the record, I've encountered this bug many times myself and it's > > rather irritating, since it leads to more verbose code. > > > > It is useful whenever you want to return data that is a subset of the > > input data (since the selected subset can usually be zero-sized > > sometimes -- remember, in computer science the only numbers are 0, 1, > > and "any number"). > > > > Here's one of the examples I've had. The Interpolative Decomposition > > decomposes a m-by-n matrix A of rank k as > > > > A = B C > > > > where B is an m-by-k matrix consisting of a subset of the columns of A, > > and C is a k-by-n matrix. > > > > Now, if A is all zeros (which is often the case for me), then k is 0. I > > would still like to create the m-by-0 matrix B by doing > > > > B = A[:, selected_columns] > > > > But now I have to do this instead: > > > > if len(selected_columns) == 0: > > B = np.zeros((A.shape[0], 0), dtype=A.dtype) > > else: > > B = A[:, selected_columns] > > > > In this case, zero-sized B and C are of course perfectly valid and > > useful results: > > > > In [2]: np.dot(np.ones((3,0)), np.ones((0, 5))) > > Out[2]: > > array([[ 0., 0., 0., 0., 0.], > > [ 0., 0., 0., 0., 0.], > > [ 0., 0., 0., 0., 0.]]) > > > > And to answer the obvious question: Yes, this is a real usecase. It is > used for something similar to image compression, where sub-sections of > the images may well be all-zero and have zero rank (full story at [1]). > > Thanks for the example. I was a little surprised that dot works. Then I read what wikipedia had to say about empty arrays. It mentions dot like you do, and that the determinant of the 0-by-0 matrix is 1. So I try: In [1]: a = np.zeros((0,0)) In [2]: a Out[2]: array([], shape=(0, 0), dtype=float64) In [3]: np.linalg.det(a) Parameter 4 to routine DGETRF was incorrect Reading the above thread I understand Ralf's reasoning better, but > really, relying on NumPy's buggy behaviour to discover bugs in user code > seems like the wrong approach. Tools should be dumb unless there are > good reasons to make them smart. I'd be rather irritated about my hammer > if it refused to drive in nails that it decided where in the wrong spot. > The point is not that we shouldn't fix it, but that it's a waste of time to fix it in only one place. I remember fixing several functions to explicitly check for empty arrays and then returning an empty array or giving a sensible error. So can you answer my question: do you think it's worth the time and computational overhead to handle empty arrays in all functions? Ralf -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From d.s.seljebotn at astro.uio.no Wed Dec 28 08:45:29 2011 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Wed, 28 Dec 2011 14:45:29 +0100 Subject: [Numpy-discussion] Indexing empty dimensions with empty arrays In-Reply-To: References: <4EFB1117.3090304@astro.uio.no> <4EFB1253.7030904@astro.uio.no> Message-ID: <4EFB1D79.8000309@astro.uio.no> On 12/28/2011 02:21 PM, Ralf Gommers wrote: > > > On Wed, Dec 28, 2011 at 1:57 PM, Dag Sverre Seljebotn > > wrote: > > On 12/28/2011 01:52 PM, Dag Sverre Seljebotn wrote: > > On 12/28/2011 09:33 AM, Ralf Gommers wrote: > >> > >> > >> 2011/12/27 Jordi Guti?rrez Hermoso > >> >> > >> > >> On 26 December 2011 14:56, Ralf > Gommers > >> >> wrote: > >> > > >> > > >> > On Mon, Dec 26, 2011 at 8:50 PM, > >> >> wrote: > >> >> I have a hard time thinking through empty 2-dim arrays, and > >> don't know > >> >> what rules should apply. > >> >> However, in my code I might want to catch these cases rather > early > >> >> than late and then having to work my way backwards to find > out where > >> >> the content disappeared. > >> > > >> > > >> > Same here. Almost always, my empty arrays are either due to bugs > >> or they > >> > signal that I do need to special-case something. Silent passing > >> through of > >> > empty arrays to all numpy functions is not what I would want. > >> > >> I find it quite annoying to treat the empty set with special > >> deference. "All of my great-grandkids live in Antarctica" > should be > >> true for me (I'm only 30 years old). If you decide that is > not true > >> for me, it leads to a bunch of other logical annoyances up > there > >> > >> > >> Guess you don't mean true/false, because it's neither. But I > understand > >> you want an empty array back instead of an error. > >> > >> Currently the problem is that when you do get that empty array back, > >> you'll then use that for something else and it will probably still > >> crash. Many numpy functions do not check for empty input and > will still > >> give exceptions. My impression is that you're better off > handling these > >> where you create the empty array, rather than in some random > place later > >> on. The alternative is to have consistent rules for empty > arrays, and > >> handle them explicitly in all functions. Can be done, but is of > course a > >> lot of work and has some overhead. > > > > Are you saying that the existence of other bugs means that this bug > > shouldn't be fixed? I just fail to see the relevance of these > other bugs > > to this discussion. > > > See below. > > > For the record, I've encountered this bug many times myself and it's > > rather irritating, since it leads to more verbose code. > > > > It is useful whenever you want to return data that is a subset of the > > input data (since the selected subset can usually be zero-sized > > sometimes -- remember, in computer science the only numbers are 0, 1, > > and "any number"). > > > > Here's one of the examples I've had. The Interpolative Decomposition > > decomposes a m-by-n matrix A of rank k as > > > > A = B C > > > > where B is an m-by-k matrix consisting of a subset of the columns > of A, > > and C is a k-by-n matrix. > > > > Now, if A is all zeros (which is often the case for me), then k > is 0. 
I > > would still like to create the m-by-0 matrix B by doing > > > > B = A[:, selected_columns] > > > > But now I have to do this instead: > > > > if len(selected_columns) == 0: > > B = np.zeros((A.shape[0], 0), dtype=A.dtype) > > else: > > B = A[:, selected_columns] > > > > In this case, zero-sized B and C are of course perfectly valid and > > useful results: > > > > In [2]: np.dot(np.ones((3,0)), np.ones((0, 5))) > > Out[2]: > > array([[ 0., 0., 0., 0., 0.], > > [ 0., 0., 0., 0., 0.], > > [ 0., 0., 0., 0., 0.]]) > > > > And to answer the obvious question: Yes, this is a real usecase. It is > used for something similar to image compression, where sub-sections of > the images may well be all-zero and have zero rank (full story at [1]). > > Thanks for the example. I was a little surprised that dot works. Then I > read what wikipedia had to say about empty arrays. It mentions dot like > you do, and that the determinant of the 0-by-0 matrix is 1. So I try: > > In [1]: a = np.zeros((0,0)) > > In [2]: a > Out[2]: array([], shape=(0, 0), dtype=float64) > > In [3]: np.linalg.det(a) > Parameter 4 to routine DGETRF was incorrect > :-) Well, a segfault is most certainly a bug, so this must be fixed one way or the other way anyway, and returning 1 seems at least as good a solution as raising an exception. Both solutions require an extra if-test. > > Reading the above thread I understand Ralf's reasoning better, but > really, relying on NumPy's buggy behaviour to discover bugs in user code > seems like the wrong approach. Tools should be dumb unless there are > good reasons to make them smart. I'd be rather irritated about my hammer > if it refused to drive in nails that it decided where in the wrong spot. > > > The point is not that we shouldn't fix it, but that it's a waste of time > to fix it in only one place. I remember fixing several functions to > explicitly check for empty arrays and then returning an empty array or > giving a sensible error. > > So can you answer my question: do you think it's worth the time and > computational overhead to handle empty arrays in all functions? I'd hope the computational overhead is negligible? I do believe that handling this correctly everywhere is the right thing to do and would improve overall code quality (as witnessed by the segfault found above). Of course, likely nobody is ready to actually perform all that work. So the right thing to do seems to be to state that places where NumPy does not handle zero-size arrays is a bug, but not do anything about it until somebody actually submits a patch. That means, ending this email discussion by verifying that this is indeed a bug on Trac, and then wait and see if anybody bothers to submit a patch. Dag Sverre From ralf.gommers at googlemail.com Wed Dec 28 08:58:21 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Wed, 28 Dec 2011 14:58:21 +0100 Subject: [Numpy-discussion] dtype related deprecations Message-ID: Hi, I'm having some trouble cleaning up tests to deal with these two deprecations: DeprecationWarning: Setting NumPy dtype names is deprecated, the dtype will become immutable in a future version DeprecationWarning: DType strings 'O4' and 'O8' are deprecated because they are platform specific. Use 'O' instead They seem fairly invasive, judged by the test noise in both numpy and scipy. Record arrays rely on setting dtype names. There are tests for picking and the buffer protocol that generate warnings for O4/O8. 
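(For concreteness, an illustrative sketch of the kind of statement that
triggers each warning -- my own example, and the exact behaviour depends
on the platform and on the development version under discussion; 'O8'
is only meaningful on 64-bit builds:)

    import numpy as np

    dt = np.dtype([('a', float), ('b', int)])
    dt.names = ('x', 'y')   # setting dtype names: first DeprecationWarning

    np.dtype('O8')          # platform-specific object dtype string: second one
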
Can anyone comment on the necessity of these deprecations and how to deal with them? Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Wed Dec 28 09:24:02 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Wed, 28 Dec 2011 15:24:02 +0100 Subject: [Numpy-discussion] Indexing empty dimensions with empty arrays In-Reply-To: <4EFB1D79.8000309@astro.uio.no> References: <4EFB1117.3090304@astro.uio.no> <4EFB1253.7030904@astro.uio.no> <4EFB1D79.8000309@astro.uio.no> Message-ID: On Wed, Dec 28, 2011 at 2:45 PM, Dag Sverre Seljebotn < d.s.seljebotn at astro.uio.no> wrote: > On 12/28/2011 02:21 PM, Ralf Gommers wrote: > > > > > > On Wed, Dec 28, 2011 at 1:57 PM, Dag Sverre Seljebotn > > > wrote: > > > > On 12/28/2011 01:52 PM, Dag Sverre Seljebotn wrote: > > > On 12/28/2011 09:33 AM, Ralf Gommers wrote: > > >> > > >> > > >> 2011/12/27 Jordi Guti?rrez Hermoso > > > >> >> > > >> > > >> On 26 December 2011 14:56, Ralf > > Gommers ralf.gommers at googlemail.com> > > >> > >> wrote: > > >> > > > >> > > > >> > On Mon, Dec 26, 2011 at 8:50 PM, > > > >> >> > wrote: > > >> >> I have a hard time thinking through empty 2-dim arrays, and > > >> don't know > > >> >> what rules should apply. > > >> >> However, in my code I might want to catch these cases rather > > early > > >> >> than late and then having to work my way backwards to find > > out where > > >> >> the content disappeared. > > >> > > > >> > > > >> > Same here. Almost always, my empty arrays are either due to > bugs > > >> or they > > >> > signal that I do need to special-case something. Silent > passing > > >> through of > > >> > empty arrays to all numpy functions is not what I would want. > > >> > > >> I find it quite annoying to treat the empty set with special > > >> deference. "All of my great-grandkids live in Antarctica" > > should be > > >> true for me (I'm only 30 years old). If you decide that is > > not true > > >> for me, it leads to a bunch of other logical annoyances up > > there > > >> > > >> > > >> Guess you don't mean true/false, because it's neither. But I > > understand > > >> you want an empty array back instead of an error. > > >> > > >> Currently the problem is that when you do get that empty array > back, > > >> you'll then use that for something else and it will probably > still > > >> crash. Many numpy functions do not check for empty input and > > will still > > >> give exceptions. My impression is that you're better off > > handling these > > >> where you create the empty array, rather than in some random > > place later > > >> on. The alternative is to have consistent rules for empty > > arrays, and > > >> handle them explicitly in all functions. Can be done, but is of > > course a > > >> lot of work and has some overhead. > > > > > > Are you saying that the existence of other bugs means that this > bug > > > shouldn't be fixed? I just fail to see the relevance of these > > other bugs > > > to this discussion. > > > > > > See below. > > > > > For the record, I've encountered this bug many times myself and > it's > > > rather irritating, since it leads to more verbose code. > > > > > > It is useful whenever you want to return data that is a subset of > the > > > input data (since the selected subset can usually be zero-sized > > > sometimes -- remember, in computer science the only numbers are > 0, 1, > > > and "any number"). > > > > > > Here's one of the examples I've had. 
The Interpolative > Decomposition > > > decomposes a m-by-n matrix A of rank k as > > > > > > A = B C > > > > > > where B is an m-by-k matrix consisting of a subset of the columns > > of A, > > > and C is a k-by-n matrix. > > > > > > Now, if A is all zeros (which is often the case for me), then k > > is 0. I > > > would still like to create the m-by-0 matrix B by doing > > > > > > B = A[:, selected_columns] > > > > > > But now I have to do this instead: > > > > > > if len(selected_columns) == 0: > > > B = np.zeros((A.shape[0], 0), dtype=A.dtype) > > > else: > > > B = A[:, selected_columns] > > > > > > In this case, zero-sized B and C are of course perfectly valid and > > > useful results: > > > > > > In [2]: np.dot(np.ones((3,0)), np.ones((0, 5))) > > > Out[2]: > > > array([[ 0., 0., 0., 0., 0.], > > > [ 0., 0., 0., 0., 0.], > > > [ 0., 0., 0., 0., 0.]]) > > > > > > > And to answer the obvious question: Yes, this is a real usecase. It > is > > used for something similar to image compression, where sub-sections > of > > the images may well be all-zero and have zero rank (full story at > [1]). > > > > Thanks for the example. I was a little surprised that dot works. Then I > > read what wikipedia had to say about empty arrays. It mentions dot like > > you do, and that the determinant of the 0-by-0 matrix is 1. So I try: > > > > In [1]: a = np.zeros((0,0)) > > > > In [2]: a > > Out[2]: array([], shape=(0, 0), dtype=float64) > > > > In [3]: np.linalg.det(a) > > Parameter 4 to routine DGETRF was incorrect > > > > :-) > > Well, a segfault is most certainly a bug, so this must be fixed one way > or the other way anyway, and returning 1 seems at least as good a > solution as raising an exception. Both solutions require an extra if-test. > > > > > Reading the above thread I understand Ralf's reasoning better, but > > really, relying on NumPy's buggy behaviour to discover bugs in user > code > > seems like the wrong approach. Tools should be dumb unless there are > > good reasons to make them smart. I'd be rather irritated about my > hammer > > if it refused to drive in nails that it decided where in the wrong > spot. > > > > > > The point is not that we shouldn't fix it, but that it's a waste of time > > to fix it in only one place. I remember fixing several functions to > > explicitly check for empty arrays and then returning an empty array or > > giving a sensible error. > > > > So can you answer my question: do you think it's worth the time and > > computational overhead to handle empty arrays in all functions? > > I'd hope the computational overhead is negligible? > If you have to check all array_like inputs in all functions, I wouldn't think so. > I do believe that handling this correctly everywhere is the right thing > to do and would improve overall code quality (as witnessed by the > segfault found above). > > Of course, likely nobody is ready to actually perform all that work. So > the right thing to do seems to be to state that places where NumPy does > not handle zero-size arrays is a bug, but not do anything about it until > somebody actually submits a patch. That means, ending this email > discussion by verifying that this is indeed a bug on Trac, and then wait > and see if anybody bothers to submit a patch. > Agreed. I've created http://projects.scipy.org/numpy/ticket/2007 Ralf -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From travis at continuum.io Wed Dec 28 09:32:12 2011 From: travis at continuum.io (Travis Oliphant) Date: Wed, 28 Dec 2011 08:32:12 -0600 Subject: [Numpy-discussion] Indexing empty dimensions with empty arrays In-Reply-To: <4EFB1D79.8000309@astro.uio.no> References: <4EFB1117.3090304@astro.uio.no> <4EFB1253.7030904@astro.uio.no> <4EFB1D79.8000309@astro.uio.no> Message-ID: I agree with Dag, NumPy should provide consistent handling of empty arrays. It does require some work, but it should be at least declared a bug when it doesn't. Travis -- Travis Oliphant (on a mobile) 512-826-7480 On Dec 28, 2011, at 7:45 AM, Dag Sverre Seljebotn wrote: > On 12/28/2011 02:21 PM, Ralf Gommers wrote: >> >> >> On Wed, Dec 28, 2011 at 1:57 PM, Dag Sverre Seljebotn >> > wrote: >> >> On 12/28/2011 01:52 PM, Dag Sverre Seljebotn wrote: >>> On 12/28/2011 09:33 AM, Ralf Gommers wrote: >>>> >>>> >>>> 2011/12/27 Jordi Guti?rrez Hermoso> >>>> >> >>>> >>>> On 26 December 2011 14:56, Ralf >> Gommers >>>> > >> wrote: >>>>> >>>>> >>>>> On Mon, Dec 26, 2011 at 8:50 PM,> >>>> >> wrote: >>>>>> I have a hard time thinking through empty 2-dim arrays, and >>>> don't know >>>>>> what rules should apply. >>>>>> However, in my code I might want to catch these cases rather >> early >>>>>> than late and then having to work my way backwards to find >> out where >>>>>> the content disappeared. >>>>> >>>>> >>>>> Same here. Almost always, my empty arrays are either due to bugs >>>> or they >>>>> signal that I do need to special-case something. Silent passing >>>> through of >>>>> empty arrays to all numpy functions is not what I would want. >>>> >>>> I find it quite annoying to treat the empty set with special >>>> deference. "All of my great-grandkids live in Antarctica" >> should be >>>> true for me (I'm only 30 years old). If you decide that is >> not true >>>> for me, it leads to a bunch of other logical annoyances up >> there >>>> >>>> >>>> Guess you don't mean true/false, because it's neither. But I >> understand >>>> you want an empty array back instead of an error. >>>> >>>> Currently the problem is that when you do get that empty array back, >>>> you'll then use that for something else and it will probably still >>>> crash. Many numpy functions do not check for empty input and >> will still >>>> give exceptions. My impression is that you're better off >> handling these >>>> where you create the empty array, rather than in some random >> place later >>>> on. The alternative is to have consistent rules for empty >> arrays, and >>>> handle them explicitly in all functions. Can be done, but is of >> course a >>>> lot of work and has some overhead. >>> >>> Are you saying that the existence of other bugs means that this bug >>> shouldn't be fixed? I just fail to see the relevance of these >> other bugs >>> to this discussion. >> >> >> See below. >> >>> For the record, I've encountered this bug many times myself and it's >>> rather irritating, since it leads to more verbose code. >>> >>> It is useful whenever you want to return data that is a subset of the >>> input data (since the selected subset can usually be zero-sized >>> sometimes -- remember, in computer science the only numbers are 0, 1, >>> and "any number"). >>> >>> Here's one of the examples I've had. The Interpolative Decomposition >>> decomposes a m-by-n matrix A of rank k as >>> >>> A = B C >>> >>> where B is an m-by-k matrix consisting of a subset of the columns >> of A, >>> and C is a k-by-n matrix. 
>>> >>> Now, if A is all zeros (which is often the case for me), then k >> is 0. I >>> would still like to create the m-by-0 matrix B by doing >>> >>> B = A[:, selected_columns] >>> >>> But now I have to do this instead: >>> >>> if len(selected_columns) == 0: >>> B = np.zeros((A.shape[0], 0), dtype=A.dtype) >>> else: >>> B = A[:, selected_columns] >>> >>> In this case, zero-sized B and C are of course perfectly valid and >>> useful results: >>> >>> In [2]: np.dot(np.ones((3,0)), np.ones((0, 5))) >>> Out[2]: >>> array([[ 0., 0., 0., 0., 0.], >>> [ 0., 0., 0., 0., 0.], >>> [ 0., 0., 0., 0., 0.]]) >>> >> >> And to answer the obvious question: Yes, this is a real usecase. It is >> used for something similar to image compression, where sub-sections of >> the images may well be all-zero and have zero rank (full story at [1]). >> >> Thanks for the example. I was a little surprised that dot works. Then I >> read what wikipedia had to say about empty arrays. It mentions dot like >> you do, and that the determinant of the 0-by-0 matrix is 1. So I try: >> >> In [1]: a = np.zeros((0,0)) >> >> In [2]: a >> Out[2]: array([], shape=(0, 0), dtype=float64) >> >> In [3]: np.linalg.det(a) >> Parameter 4 to routine DGETRF was incorrect >> > > :-) > > Well, a segfault is most certainly a bug, so this must be fixed one way > or the other way anyway, and returning 1 seems at least as good a > solution as raising an exception. Both solutions require an extra if-test. > >> >> Reading the above thread I understand Ralf's reasoning better, but >> really, relying on NumPy's buggy behaviour to discover bugs in user code >> seems like the wrong approach. Tools should be dumb unless there are >> good reasons to make them smart. I'd be rather irritated about my hammer >> if it refused to drive in nails that it decided where in the wrong spot. >> >> >> The point is not that we shouldn't fix it, but that it's a waste of time >> to fix it in only one place. I remember fixing several functions to >> explicitly check for empty arrays and then returning an empty array or >> giving a sensible error. >> >> So can you answer my question: do you think it's worth the time and >> computational overhead to handle empty arrays in all functions? > > I'd hope the computational overhead is negligible? > > I do believe that handling this correctly everywhere is the right thing > to do and would improve overall code quality (as witnessed by the > segfault found above). > > Of course, likely nobody is ready to actually perform all that work. So > the right thing to do seems to be to state that places where NumPy does > not handle zero-size arrays is a bug, but not do anything about it until > somebody actually submits a patch. That means, ending this email > discussion by verifying that this is indeed a bug on Trac, and then wait > and see if anybody bothers to submit a patch. 
> > Dag Sverre > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From jordigh at octave.org Wed Dec 28 09:46:27 2011 From: jordigh at octave.org (=?UTF-8?Q?Jordi_Guti=C3=A9rrez_Hermoso?=) Date: Wed, 28 Dec 2011 09:46:27 -0500 Subject: [Numpy-discussion] Indexing empty dimensions with empty arrays In-Reply-To: References: Message-ID: On 28 December 2011 03:33, Ralf Gommers wrote: > > > 2011/12/27 Jordi Guti?rrez Hermoso >> >> On 26 December 2011 14:56, Ralf Gommers >> wrote: >> > >> > >> > On Mon, Dec 26, 2011 at 8:50 PM, wrote: >> >> I have a hard time thinking through empty 2-dim arrays, and don't know >> >> what rules should apply. >> >> However, in my code I might want to catch these cases rather early >> >> than late and then having to work my way backwards to find out where >> >> the content disappeared. >> > >> > >> > Same here. Almost always, my empty arrays are either due to bugs or they >> > signal that I do need to special-case something. Silent passing through >> > of >> > empty arrays to all numpy functions is not what I would want. >> >> I find it quite annoying to treat the empty set with special >> deference. "All of my great-grandkids live in Antarctica" should be >> true for me (I'm only 30 years old). If you decide that is not true >> for me, it leads to a bunch of other logical annoyances up there > > > Guess you don't mean true/false, because it's neither. But I understand you > want an empty array back instead of an error. It should be true. This is a case of vacuous truth: http://en.wikipedia.org/wiki/Vacuous_truth - Jordi G. H. From jordigh at octave.org Wed Dec 28 10:08:13 2011 From: jordigh at octave.org (=?UTF-8?Q?Jordi_Guti=C3=A9rrez_Hermoso?=) Date: Wed, 28 Dec 2011 10:08:13 -0500 Subject: [Numpy-discussion] How's our broadcasting? Message-ID: Just FYI, the next stable release of Octave (3.6) will have broadcasting. I used Numpy as an inspiration. Here is the WIP manual for it: http://jordi.platinum.linux.pl/octave.html/Broadcasting.html#Broadcasting I want to thank Numpy both for the inspiration and for any comments you may have on this. I'm pretty excited that Octave will now have this feature. Thanks, - Jordi G. H. GNU Octave developer From ralf.gommers at googlemail.com Wed Dec 28 13:41:27 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Wed, 28 Dec 2011 19:41:27 +0100 Subject: [Numpy-discussion] How's our broadcasting? In-Reply-To: References: Message-ID: 2011/12/28 Jordi Guti?rrez Hermoso > Just FYI, the next stable release of Octave (3.6) will have > broadcasting. I used Numpy as an inspiration. > > Here is the WIP manual for it: > > > http://jordi.platinum.linux.pl/octave.html/Broadcasting.html#Broadcasting > That looks good. Should be much nicer to use than bsxfun, so also a good argument to prefer Octave over Matlab. Is this a departure of maintaining full compatibility with Matlab? Ralf I want to thank Numpy both for the inspiration and for any comments > you may have on this. I'm pretty excited that Octave will now have > this feature. > > Thanks, > - Jordi G. H. > GNU Octave developer > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jordigh at octave.org Wed Dec 28 14:44:01 2011 From: jordigh at octave.org (=?UTF-8?Q?Jordi_Guti=C3=A9rrez_Hermoso?=) Date: Wed, 28 Dec 2011 14:44:01 -0500 Subject: [Numpy-discussion] How's our broadcasting? In-Reply-To: References: Message-ID: On 28 December 2011 13:41, Ralf Gommers wrote: > > > 2011/12/28 Jordi Guti?rrez Hermoso >> >> Just FYI, the next stable release of Octave (3.6) will have >> broadcasting. I used Numpy as an inspiration. >> >> Here is the WIP manual for it: >> >> >> ?http://jordi.platinum.linux.pl/octave.html/Broadcasting.html#Broadcasting > > Is this a departure of maintaining full compatibility with Matlab? No, Matlab code will still work in Octave, except in very rare cases, but it's a matter of flipping a switch to make those weird cases work as well. Octave isn't a Matlab clone. The goal is to be source compatible to Matlab: code that runs in Matlab should run in Octave, but we don't try to limit ourselves to whatever Matlab does, nor do we copy every bug unless there's a very good reason to copy its bugs. I think I may be missing a few broadcasting behaviours regarding assignment. Is something like that possible, to broadcast a vector across a matrix during assignment? - Jordi G. H. From shish at keba.be Wed Dec 28 14:53:46 2011 From: shish at keba.be (Olivier Delalleau) Date: Wed, 28 Dec 2011 14:53:46 -0500 Subject: [Numpy-discussion] How's our broadcasting? In-Reply-To: References: Message-ID: 2011/12/28 Jordi Guti?rrez Hermoso > On 28 December 2011 13:41, Ralf Gommers > wrote: > > > > > > 2011/12/28 Jordi Guti?rrez Hermoso > >> > >> Just FYI, the next stable release of Octave (3.6) will have > >> broadcasting. I used Numpy as an inspiration. > >> > >> Here is the WIP manual for it: > >> > >> > >> > http://jordi.platinum.linux.pl/octave.html/Broadcasting.html#Broadcasting > > > > Is this a departure of maintaining full compatibility with Matlab? > > No, Matlab code will still work in Octave, except in very rare cases, > but it's a matter of flipping a switch to make those weird cases work > as well. > > Octave isn't a Matlab clone. The goal is to be source compatible to > Matlab: code that runs in Matlab should run in Octave, but we don't > try to limit ourselves to whatever Matlab does, nor do we copy every > bug unless there's a very good reason to copy its bugs. > > I think I may be missing a few broadcasting behaviours regarding > assignment. Is something like that possible, to broadcast a vector > across a matrix during assignment? > > - Jordi G. H. > Numpy does it: In [12]: x = numpy.zeros((3, 3)) In [15]: x[:] = numpy.arange(3) In [16]: x Out[16]: array([[ 0., 1., 2.], [ 0., 1., 2.], [ 0., 1., 2.]]) -=- Olivier -------------- next part -------------- An HTML attachment was scrubbed... URL: From burlen.loring at gmail.com Wed Dec 28 16:05:36 2011 From: burlen.loring at gmail.com (Burlen Loring) Date: Wed, 28 Dec 2011 13:05:36 -0800 Subject: [Numpy-discussion] fft help Message-ID: <4EFB84A0.9080301@gmail.com> Hi I have an image I need to do an fft on, I tried numpy.fft but results are not what I expected, and differ from matlab. My input image is a weird size, 5118x1279, I think numpy fft is not liking it. In numpy the fft appears to be computed multiple times and tiled across the output image. In other words the pattern I see in matlab fft is tiled repeatedly over numpy fft output. Any idea on what I'm doing wrong? you can see repeated pattern in the top panel of this image which also has the input in the bottom panel. 
http://old.nabble.com/file/p33047057/fft_uex.png fft_uex.png tx From ondrej at certik.cz Thu Dec 29 01:51:45 2011 From: ondrej at certik.cz (Ondrej Certik) Date: Wed, 28 Dec 2011 22:51:45 -0800 Subject: [Numpy-discussion] NumPy Governance In-Reply-To: References: <4EDB7293.90302@gmail.com> Message-ID: On Mon, Dec 5, 2011 at 4:22 AM, Perry Greenfield wrote: > I'm not sure I'm crazy about leaving final decision making for a > board. A board may be a good way of carefully considering the issues, > and it could make it's own recommendation (with a sufficient > majority). But in the end I think one person needs to decide (and that > decision may go against the board consensus, presumably only rarely). > > Why shouldn't that person be you? I haven't contributed to NumPy directly. But I can offer my experience with SymPy. I agree with Perry. Having one person being in charge as the last call (project leader) works excellent in my experience. For SymPy, that person has been me, up until a year ago (when I realized that I am too busy to do a good job as a project leader), when I passed it to Aaron Meurer. We always try to reach consensus, and the project leader's main job is to encourage such discussion. When consensus cannot be reached, he needs to make the decision (that happened maybe once or twice in the last 5 years and it is very rare). There seems to be quite strong "community ownership" in SymPy (that was Stefan's objection). I think the reason being that in fact we probably have something like a "board of members", except that it is informal and it simply consists of people whose opinions the project leader highly values. And I think that it is very easy for anybody who gets involved with SymPy development to become trusted and thus his or her opinion will count. As such, for NumPy I think by default the project leader is Travis, who created it. He became busy in the last few years and so he could appoint a person, who will be the project leader. The list of possible people seems quite simple, I would choose somebody who is involved a lot with NumPy in the last 1 year (let's say): $ git shortlog -ns --since="1 year ago" | head 651 Mark Wiebe 137 Charles Harris 72 David Cournapeau 61 Ralf Gommers 52 rgommers 29 Pearu Peterson 17 Pauli Virtanen 11 Chris Jordan-Squire 11 Matthew Brett 10 Christopher L. Farrow So anybody from the top 5 or 10 people seems ok. This has to be a personal decision, and I don't know what the actual contribution and involvement (and personal ability to be a project leader) is of the above people, so that's why it should be done by Travis (possibly consulting with somebody who he trusts and who is involved). For SymPy, here is the list from the "1 year ago" when I passed the project leadership: $ git shortlog -ns --since="January 2010" --until "January 2011" | head 317 ?yvind Jensen 150 Mateusz Paprocki 93 Aaron Meurer 81 Addison Cugini 79 Brian E. Granger 64 Ronan Lamy 61 Matt Curry 58 Ond?ej ?ert?k 36 Chris Smith 34 Christian Muise It's not exactly accurate, as some of the branches from 2010 were merged in 2011, but it gives you a picture. The above list doesn't tell you who the best person should be. I knew that Aaron would be the best choice, and I consulted it with several "core developers" to see what the "community" thinks, and everybody told me, that if I need to pass it on, Aaron would be the choice. Since this was the first time for me doing this, I simply stated, that Aaron is the project leader from now on. 
And in couple months we clarified it a little bit, that I am the "owner", in a sense that I own the domain and some servers and other things and I am ultimately responsible for the project (and I still have a say in non-coding related issues, like Google Summer of Code and such). For anything code related, Aaron has the last word, and I will not override it. The precise email is here: https://groups.google.com/d/topic/sympy/i-XD15syvqs/discussion You can compare it to today's list: $ git shortlog -ns --since="1 year ago" | head 805 Chris Smith 583 Mateusz Paprocki 508 Aaron Meurer 183 Ronan Lamy 150 Saptarshi Mandal 112 Tom Bachmann 101 Vladimir Peri? 93 Gilbert Gede 91 Ond?ej ?ert?k 89 Brian E. Granger So the activity has gone up after I stopped being the bottleneck, and after there was again a clear person, who is in charge and has time for it. Anyway, I just wanted to offer some experience that I gained with SymPy with this regard. As I said, I am not a NumPy developer and as such, this decision should be made by NumPy developers and Travis as the original project leader. I could see a familiar pattern here --- Travis has spent enormous time to develop NumPy and to build a community, and later became busy. This is exactly what happened to me with SymPy (when I was back in Prague, I spent months, every evening, many hours with sympy....). In fact, Travis once said at some lecture, that opensource is addictive. And not only that, also, if you develop (start) something, it really feels like it's yours. And then when I didn't have time and I knew I am not doing good job with SymPy, it was probably the hardest decision I had to make to pass the leadership on. Now, from retrospect, I should have done it much earlier and it is now obvious, that it was the right thing to do. But at that time, it was not obvious and I was very unsure what is going to happen. So anyway, good luck with any decision that you make. :) Ondrej From travis at continuum.io Thu Dec 29 02:01:57 2011 From: travis at continuum.io (Travis Oliphant) Date: Thu, 29 Dec 2011 01:01:57 -0600 Subject: [Numpy-discussion] NumPy Governance In-Reply-To: References: <4EDB7293.90302@gmail.com> Message-ID: <950D1FD6-071D-4C49-9678-FC9B8742694D@continuum.io> That was an extremely helpful and useful post. Thank you Ondrej for sharing it and taking the time to provide that insight. Travis -- Travis Oliphant (on a mobile) 512-826-7480 On Dec 29, 2011, at 12:51 AM, Ondrej Certik wrote: > On Mon, Dec 5, 2011 at 4:22 AM, Perry Greenfield wrote: >> I'm not sure I'm crazy about leaving final decision making for a >> board. A board may be a good way of carefully considering the issues, >> and it could make it's own recommendation (with a sufficient >> majority). But in the end I think one person needs to decide (and that >> decision may go against the board consensus, presumably only rarely). >> >> Why shouldn't that person be you? > > I haven't contributed to NumPy directly. But I can offer my experience > with SymPy. > > I agree with Perry. Having one person being in charge as the last > call (project leader) works excellent in my experience. For SymPy, > that person has been me, > up until a year ago (when I realized that I am too busy to do a good > job as a project leader), when I passed it to Aaron Meurer. > We always try to reach > consensus, and the project leader's main job is to encourage such discussion. 
> When consensus cannot be reached, he needs to make the decision (that > happened maybe once or twice in the last 5 years and it is very rare). > > There seems to be quite strong "community ownership" in SymPy (that > was Stefan's objection). I think the reason being that in fact we > probably have something like a "board of members", except that > it is informal and it simply consists of people whose opinions > the project leader highly values. And I think that it is very easy > for anybody who gets involved with SymPy development to > become trusted and thus his or her opinion will count. > > As such, for NumPy I think by default the project leader is Travis, who > created it. He became busy in the last few years and so he could > appoint a person, who will be the project leader. > > The list of possible people seems quite simple, I would choose > somebody who is involved a lot with NumPy in the last 1 year > (let's say): > > $ git shortlog -ns --since="1 year ago" | head > 651 Mark Wiebe > 137 Charles Harris > 72 David Cournapeau > 61 Ralf Gommers > 52 rgommers > 29 Pearu Peterson > 17 Pauli Virtanen > 11 Chris Jordan-Squire > 11 Matthew Brett > 10 Christopher L. Farrow > > So anybody from the top 5 or 10 people seems ok. This has to be a personal > decision, and I don't know what the actual contribution and involvement (and > personal ability to be a project leader) is of the above people, so that's > why it should be done by Travis (possibly consulting with somebody who > he trusts and who is involved). > > For SymPy, here is the list from the "1 year ago" when I passed the > project leadership: > > $ git shortlog -ns --since="January 2010" --until "January 2011" | head > 317 ?yvind Jensen > 150 Mateusz Paprocki > 93 Aaron Meurer > 81 Addison Cugini > 79 Brian E. Granger > 64 Ronan Lamy > 61 Matt Curry > 58 Ond?ej ?ert?k > 36 Chris Smith > 34 Christian Muise > > It's not exactly accurate, as some of the branches from 2010 were > merged in 2011, but it gives you a picture. The above > list doesn't tell you who the best person should be. I knew that Aaron > would be the best choice, and I consulted it > with several "core developers" to see what the "community" thinks, and > everybody told me, that if I need to pass it > on, Aaron would be the choice. > > Since this was the first time for me doing this, I simply stated, that Aaron > is the project leader from now on. And in couple months we clarified > it a little bit, that I am the "owner", > in a sense that I own the domain and some servers and other things and > I am ultimately responsible for the project (and I still have a say in > non-coding related issues, like Google Summer of Code and such). For > anything code related, Aaron has the last word, > and I will not override it. The precise email is here: > > https://groups.google.com/d/topic/sympy/i-XD15syvqs/discussion > > You can compare it to today's list: > > $ git shortlog -ns --since="1 year ago" | head 805 Chris Smith > 583 Mateusz Paprocki > 508 Aaron Meurer > 183 Ronan Lamy > 150 Saptarshi Mandal > 112 Tom Bachmann > 101 Vladimir Peri? > 93 Gilbert Gede > 91 Ond?ej ?ert?k > 89 Brian E. Granger > > > So the activity has gone up after I stopped being the bottleneck, and > after there was again a clear person, who is in charge and has time > for it. > > > Anyway, I just wanted to offer some experience that I gained with > SymPy with this regard. 
> As I said, I am not a NumPy developer, and as such this decision should
> be made by NumPy developers and Travis as the original project leader.
>
> I could see a familiar pattern here --- Travis spent an enormous amount
> of time developing NumPy and building a community, and later became busy.
> This is exactly what happened to me with SymPy (when I was back in Prague,
> I spent months, every evening, many hours with sympy....). In fact,
> Travis once said at some lecture that open source is addictive. And
> not only that: if you develop (start) something, it really feels
> like it's yours. And then, when I didn't have time and I knew I was not
> doing a good job with SymPy, it was probably the hardest decision I had
> to make to pass the leadership on.
> Now, in retrospect, I should have done it much earlier, and it is now
> obvious that it was the right thing to do. But at that time it was not
> obvious, and I was very unsure what was going to happen.
>
> So anyway, good luck with any decision that you make. :)
>
> Ondrej

From torgil.svensson at gmail.com Thu Dec 29 10:21:46 2011
From: torgil.svensson at gmail.com (Torgil Svensson)
Date: Thu, 29 Dec 2011 16:21:46 +0100
Subject: [Numpy-discussion] fft help
In-Reply-To: <4EFB84A0.9080301@gmail.com>
References: <4EFB84A0.9080301@gmail.com>
Message-ID:

This is because fft computes one-dimensional transforms (on each row).
Try fft2 instead.

//Torgil

fft(a, n=None, axis=-1)
    Compute the one-dimensional discrete Fourier Transform.

fft2(a, s=None, axes=(-2, -1))
    Compute the 2-dimensional discrete Fourier Transform

fftn(a, s=None, axes=None)
    Compute the N-dimensional discrete Fourier Transform.

On Wed, Dec 28, 2011 at 10:05 PM, Burlen Loring wrote:
>
> Hi
>
> I have an image I need to do an fft on, I tried numpy.fft but results are
> not what I expected, and differ from matlab.
>
> My input image is a weird size, 5118x1279, I think numpy fft is not liking it. In
> numpy the fft appears to be computed multiple times and tiled across the
> output image. In other words the pattern I see in matlab fft is tiled
> repeatedly over numpy fft output. Any idea on what I'm doing wrong?
>
> you can see repeated pattern in the top panel of this image which also has
> the input in the bottom panel.
> http://old.nabble.com/file/p33047057/fft_uex.png fft_uex.png
>
> tx

From burlen.loring at gmail.com Thu Dec 29 12:32:55 2011
From: burlen.loring at gmail.com (Burlen Loring)
Date: Thu, 29 Dec 2011 09:32:55 -0800
Subject: [Numpy-discussion] fft help
In-Reply-To:
References: <4EFB84A0.9080301@gmail.com>
Message-ID: <4EFCA447.4020001@gmail.com>

hmmph, I used both fftn and fft2, they both produce the same result. Is
there a restriction on the dimension of the input? power of 2 or some such?

On 12/29/2011 07:21 AM, Torgil Svensson wrote:
> This is because fft computes one-dimensional transforms (on each row).
> Try fft2 instead.
>
> //Torgil
>
> fft(a, n=None, axis=-1)
>     Compute the one-dimensional discrete Fourier Transform.
>
> fft2(a, s=None, axes=(-2, -1))
>     Compute the 2-dimensional discrete Fourier Transform
>
> fftn(a, s=None, axes=None)
>     Compute the N-dimensional discrete Fourier Transform.
>
> On Wed, Dec 28, 2011 at 10:05 PM, Burlen Loring wrote:
>> Hi
>>
>> I have an image I need to do an fft on, I tried numpy.fft but results are
>> not what I expected, and differ from matlab.
>>
>> My input image is a weird size, 5118x1279, I think numpy fft is not liking it. In
>> numpy the fft appears to be computed multiple times and tiled across the
>> output image. In other words the pattern I see in matlab fft is tiled
>> repeatedly over numpy fft output. Any idea on what I'm doing wrong?
>>
>> you can see repeated pattern in the top panel of this image which also has
>> the input in the bottom panel.
>> http://old.nabble.com/file/p33047057/fft_uex.png fft_uex.png
>>
>> tx

From torgil.svensson at gmail.com Thu Dec 29 12:43:25 2011
From: torgil.svensson at gmail.com (Torgil Svensson)
Date: Thu, 29 Dec 2011 18:43:25 +0100
Subject: [Numpy-discussion] fft help
In-Reply-To: <4EFCA447.4020001@gmail.com>
References: <4EFB84A0.9080301@gmail.com> <4EFCA447.4020001@gmail.com>
Message-ID:

Sorry, I should have looked at your image. A few tests you can do:

1) does ifft2 give you back the original image? (allclose returned
True for a little test I did here)
2) does scipy.fftpack.fft2 yield the same result?

//Torgil

On Thu, Dec 29, 2011 at 6:32 PM, Burlen Loring wrote:
> hmmph, I used both fftn and fft2, they both produce the same result. Is
> there a restriction on the dimension of the input? power of 2 or some such?
>
> On 12/29/2011 07:21 AM, Torgil Svensson wrote:
>> This is because fft computes one-dimensional transforms (on each row).
>> Try fft2 instead.
>>
>> //Torgil
>>
>> fft(a, n=None, axis=-1)
>>     Compute the one-dimensional discrete Fourier Transform.
>>
>> fft2(a, s=None, axes=(-2, -1))
>>     Compute the 2-dimensional discrete Fourier Transform
>>
>> fftn(a, s=None, axes=None)
>>     Compute the N-dimensional discrete Fourier Transform.
>>
>> On Wed, Dec 28, 2011 at 10:05 PM, Burlen Loring wrote:
>>> Hi
>>>
>>> I have an image I need to do an fft on, I tried numpy.fft but results are
>>> not what I expected, and differ from matlab.
>>>
>>> My input image is a weird size, 5118x1279, I think numpy fft is not liking it. In
>>> numpy the fft appears to be computed multiple times and tiled across the
>>> output image. In other words the pattern I see in matlab fft is tiled
>>> repeatedly over numpy fft output. Any idea on what I'm doing wrong?
>>>
>>> you can see repeated pattern in the top panel of this image which also has
>>> the input in the bottom panel.
>>> http://old.nabble.com/file/p33047057/fft_uex.png fft_uex.png
>>>
>>> tx
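A minimal sketch of the distinction described in this thread -- np.fft.fft
transforms along the last axis only (so on a 2-D array it transforms each
row independently, which can produce a tiled-looking result), while
np.fft.fft2 transforms both axes -- together with the ifft2 round-trip
check suggested above. The array shape here is made up; nothing beyond
numpy itself is assumed:

    import numpy as np

    # a small stand-in for the 5118x1279 image from the thread
    img = np.random.rand(64, 48)

    row_wise = np.fft.fft(img)   # 1-D transform of each row only
    full_2d = np.fft.fft2(img)   # true 2-D transform over both axes

    # fft alone is not a 2-D transform:
    print(np.allclose(row_wise, full_2d))                 # False in general

    # round-trip check: ifft2 should recover the input to within
    # floating-point error
    print(np.allclose(np.fft.ifft2(full_2d).real, img))   # True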
From jh at physics.ucf.edu Thu Dec 29 15:09:57 2011
From: jh at physics.ucf.edu (Joe Harrington)
Date: Thu, 29 Dec 2011 15:09:57 -0500
Subject: [Numpy-discussion] NumPy Governance
In-Reply-To: (numpy-discussion-request@scipy.org)
Message-ID:

I had intended to stay quiet in this discussion since I am not a core
developer and also no longer even lead the doc project. However, I've
watched two organizations go very wrong very fast recently. Both were
similar in structure to this one. I've done some study as a result, and
there are some lessons to learn.

The committee-of-doers-and-executive format is very effective when it
works. It's worked well for numpy and many other OSS projects.
Benevolence and commitment of the leader are made likely by choosing
someone who is already doing a lot and whom everyone knows. Yet it can
still go wrong. By studying the failures, as well as the successes, we
can build in some safeguards.

So, I'll tell the story of Pack 608. (It's the next four paragraphs if
you want to skip it.) This is a Cub Scout Pack, which for those who don't
already know is a group of 20-100 boys, ages 6-10, who participate in a
variety of indoor and outdoor activities that are designed to build good
character (whether they do is a different discussion). Our group has
about 80 boys. It is run by parents who volunteer their time, and it is
led by a Pack Committee, with a Chair and a Chartered Organization
Representative. The latter is an overlord who can overrule anything but
by design usually observes from the sidelines and only steps in when
things go wrong. He represents the organization (usually a church or
community service organization) that owns the pack. There is also a
Cubmaster, who is the face of the organization to the boys. He reports
to the Committee but is a really big dog there, often as big as the
Chair. Sometimes bigger.

Running a group of 80 boys doing a dozen activities a year is a huge job,
way too big for any individual with a job, so finding and developing
parent volunteers is hard. In our group, one married couple, extremely
committed, got involved and began doing a huge number of tasks. Whenever
anyone slacked off, they took over that job. They did those jobs
extremely well. As our top leadership moved on (their boys turned 11 and
became Boy Scouts), eventually we looked around and the clear choices for
Chair and Cubmaster were these two. We were not excited about installing
a married couple in two of the three key posts, but we had few options.
We didn't want to do it, but nobody else stepped up.

It was the biggest mistake we ever made. Even though everything this
couple did was, in their eyes, in the best interest of the boys and the
group, their style was so overbearing and their opinions so inflexible
that there were serious verbal conflicts, and other leaders began to
quit. Some non-leaders complained, the Chartered Org. Rep. called a
meeting, and these two were forced to step down. But, by this time, they
were doing so many tasks that between them and the other departed
leaders, we had lost EIGHT critical leadership roles (for those who know
Scouts, these were Committee Chair, Cubmaster, Outdoors Chair, Events
Chair, Advancement Chair, Fundraising Chair, and two Den Leaders). The
pack nearly folded, but we managed to pull in a few parents (including
me) who were neither eager to get close to this group nor really ready
to spend a lot of time on it. We aren't yet up to where we were a year
ago, but we're surviving. It's been brutal on our personal and work
time, however. It would have been much easier had we stepped up before
the fiasco.

While such near-collapses are infrequent in small organizations, they
happen often enough that many readers have probably experienced one.
They can happen out of poor choice of leaders, feelings of
possessiveness, differences of vision among stakeholders, and a host of
other causes. The point here is to build into any organization plan not
just the structure that will let it succeed, but also the safeguards
that will reduce the chances of failure by catching and correcting
organizational health problems early.

With OSS, we have an ultimate safeguard, which is that anyone can fork
the source and start a new org. But, as we've seen in this project,
forks are *very* damaging and can take years to recover from, so it
makes sense to think about safeguards in this organization, too. There
are organizations of organizations that provide advice on this topic.
Among the key suggestions:

- Have uninvolved oversight. In Pack 608's case this was the Chartered
Org. Rep. For us it could be an external board with some community reps.
This oversight itself can be abused, so it must be limited to removing
problem leaders and calling an election, or even just to calling an
election. It should probably NOT extend to changing decisions on the
code, other than forcing a vote of reconsideration.

- Rotate key posts. When people do a job forever, they can become
possessive of it. Not everyone does, but it's common. The job becomes
tailored to the person, others don't know how it's done, key practices
go undocumented, and if that person leaves or is temporarily unavailable
at a key time, or if a change needs to be made and that person disagrees,
there's trouble. Feeling part ownership is good. Feeling full possession
is not. The lack of central ownership is a big piece of why we can trust
OSS in a way we can't trust commercial software. So, rotate key jobs.

- Ensure that critical authorities (keys, passwords, certificates, title
to important property like a web domain, signatory authority on a bank
account) are never held by just one or even two people. Make sure the
oversight entity has ultimate access.

- Have multiple sources of key resources, like labor and money, so that
the threat of taking them away is not debilitating. This means not
letting one person hold too many jobs, too.

- Have a well-defined voting procedure and rules of order. You don't have
to use them all the time when consensus is clear. They can be slow and
cumbersome. But, when there is conflict, they ensure fairness.

- Have a no-confidence vote procedure that removes a problem Chair. The
rules for this need to be carefully thought out.

- Have the consent group, and not the outgoing Chair, choose the next Chair.
There are many reasons a Chair moves on, including lack of time, becoming
tired of the job, and differences of opinion with the consent group.
Especially in the latter case, you don't want the former Chair to
influence the future.

- Have open meetings with posted minutes. Keep no secrets (except
passwords, etc.).

As Ondrej pointed out, identifying the potential leaders isn't hard, in
our case, but I think those in the consent group should include more than
just coders. There has been a lot of work on the docs, for example,
though not as much recently. Leaders of key client packages should be
involved. Representatives of key user classes (astronomers,
neuroscientists, statisticians, numerical programming teachers, newbies,
etc.) would be important. Perhaps it's ok to let most coding questions go
to the coders, but significant decisions that could change the direction
of the effort should be referred to the wider group of stakeholders. That
could be the mailing list, but then voting becomes vague. I would suggest
including some number of core coders and an equal number of community
representatives. Or perhaps the community reps become the oversight
board, which also includes a small minority of coding reps.

One method that works well for continuity is to elect a Vice-Chair who
shadows the Chair and is involved in decisions for a period, then becomes
Chair. The Past Chair can be a formal position, too, whose role it is to
provide advice. This also eases the transition out of the top job. The
Past Chair should be a resource only, and have no special authority.

There are lots of sample organizational bylaws available online that can
serve as templates. If we join an existing organization, e.g., for
fundraising, they likely have requirements that our bylaws need to meet.

To sum it up, the BDFL system is great if you have a truly BD. You're
most likely to find one in the founder of an effort. Once they step down,
be wary of centralizing too much authority. If one out of four leaders is
less than fully committed, competent, and benevolent, it's trouble. We
should at least have oversight and procedures that can remove a problem
leader. NumPy is at a difficult stage, where we like the freedom of not
having written rules and procedures, but we're at risk if we don't have
them. Much larger and it's clear we need them, and vice versa. The time
we'll wish we had them is when we have a leadership crisis. The chances
of having one are small, but not vanishingly so.

--jh--

From charlesr.harris at gmail.com Thu Dec 29 15:50:17 2011
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 29 Dec 2011 13:50:17 -0700
Subject: [Numpy-discussion] GSOC
Message-ID:

Hi All,

I thought I'd raise this topic just to get some ideas out there. At the
moment I see two areas that I'd like to see addressed.

1. Documentation editor. This would involve looking at the generated
documentation and its organization/coverage, as well as such things as
style, and maybe reviewing stuff on the documentation site. This would be
more technical writing than coding.

2. Test coverage. There are a lot of areas of numpy that are not well
tested, as well as some tests that are still doc tests and should probably
be updated. This is a substantial amount of work and would require some
familiarity with numpy as well as a willingness to ping developers for
clarification of some topics.

Thoughts?

Chuck
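A hypothetical illustration of the doc-test conversion mentioned in item
2; the function under test (np.atleast_1d) is only an example chosen here,
not one of the poorly tested areas being referred to:

    import numpy as np
    from numpy.testing import assert_array_equal

    # Old style: a doctest embedded in a docstring.
    #     >>> np.atleast_1d(1.0)
    #     array([ 1.])

    # New style: an explicit test function the test suite can run directly.
    def test_atleast_1d_scalar():
        result = np.atleast_1d(1.0)
        assert_array_equal(result, np.array([1.0]))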
From burlen.loring at gmail.com Thu Dec 29 16:23:32 2011
From: burlen.loring at gmail.com (Burlen Loring)
Date: Thu, 29 Dec 2011 13:23:32 -0800
Subject: [Numpy-discussion] fft help
In-Reply-To:
References: <4EFB84A0.9080301@gmail.com> <4EFCA447.4020001@gmail.com>
Message-ID: <4EFCDA54.3020605@gmail.com>

there seems to be some undocumented restriction on dimensions, as when I
work with 512x512 data things work as expected.

On 12/29/2011 09:43 AM, Torgil Svensson wrote:
> Sorry, I should have looked at your image. A few tests you can do:
>
> 1) does ifft2 give you back the original image? (allclose returned
> True for a little test I did here)
> 2) does scipy.fftpack.fft2 yield the same result?
>
> //Torgil
>
> On Thu, Dec 29, 2011 at 6:32 PM, Burlen Loring wrote:
>> hmmph, I used both fftn and fft2, they both produce the same result. Is
>> there a restriction on the dimension of the input? power of 2 or some such?
>>
>> On 12/29/2011 07:21 AM, Torgil Svensson wrote:
>>> This is because fft computes one-dimensional transforms (on each row).
>>> Try fft2 instead.
>>>
>>> //Torgil
>>>
>>> fft(a, n=None, axis=-1)
>>>     Compute the one-dimensional discrete Fourier Transform.
>>>
>>> fft2(a, s=None, axes=(-2, -1))
>>>     Compute the 2-dimensional discrete Fourier Transform
>>>
>>> fftn(a, s=None, axes=None)
>>>     Compute the N-dimensional discrete Fourier Transform.
>>>
>>> On Wed, Dec 28, 2011 at 10:05 PM, Burlen Loring wrote:
>>>> Hi
>>>>
>>>> I have an image I need to do an fft on, I tried numpy.fft but results are
>>>> not what I expected, and differ from matlab.
>>>>
>>>> My input image is a weird size, 5118x1279, I think numpy fft is not liking it. In
>>>> numpy the fft appears to be computed multiple times and tiled across the
>>>> output image. In other words the pattern I see in matlab fft is tiled
>>>> repeatedly over numpy fft output. Any idea on what I'm doing wrong?
>>>>
>>>> you can see repeated pattern in the top panel of this image which also has
>>>> the input in the bottom panel.
>>>> http://old.nabble.com/file/p33047057/fft_uex.png fft_uex.png
>>>>
>>>> tx

From ralf.gommers at googlemail.com Thu Dec 29 16:36:30 2011
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Thu, 29 Dec 2011 22:36:30 +0100
Subject: [Numpy-discussion] GSOC
In-Reply-To:
References:
Message-ID:

On Thu, Dec 29, 2011 at 9:50 PM, Charles R Harris wrote:

> Hi All,
>
> I thought I'd raise this topic just to get some ideas out there. At the
> moment I see two areas that I'd like to see addressed.
>
> 1. Documentation editor. This would involve looking at the generated
> documentation and its organization/coverage, as well as such things as
> style, and maybe reviewing stuff on the documentation site. This would be
> more technical writing than coding.
>
> 2. Test coverage.
> There are a lot of areas of numpy that are not well
> tested, as well as some tests that are still doc tests and should probably
> be updated. This is a substantial amount of work and would require some
> familiarity with numpy as well as a willingness to ping developers for
> clarification of some topics.
>
> Thoughts?

First thought: very useful, but probably not GSOC topics by themselves.

For a very good student, I'd think topics like implementing NA bit masks
or improved user-defined dtypes would be interesting. In SciPy there's
also a lot to do, and that's probably a better project for students who
prefer to work in Python.

Thanks for bringing this up. Last year we missed the boat; it would be
good to get one or more slots this year.

Ralf

From wesmckinn at gmail.com Thu Dec 29 17:15:17 2011
From: wesmckinn at gmail.com (Wes McKinney)
Date: Thu, 29 Dec 2011 17:15:17 -0500
Subject: [Numpy-discussion] GSOC
In-Reply-To:
References:
Message-ID:

On Thu, Dec 29, 2011 at 4:36 PM, Ralf Gommers wrote:
>
> On Thu, Dec 29, 2011 at 9:50 PM, Charles R Harris wrote:
>>
>> Hi All,
>>
>> I thought I'd raise this topic just to get some ideas out there. At the
>> moment I see two areas that I'd like to see addressed.
>>
>> 1. Documentation editor. This would involve looking at the generated
>> documentation and its organization/coverage, as well as such things as
>> style, and maybe reviewing stuff on the documentation site. This would be
>> more technical writing than coding.
>> 2. Test coverage. There are a lot of areas of numpy that are not well
>> tested, as well as some tests that are still doc tests and should probably
>> be updated. This is a substantial amount of work and would require some
>> familiarity with numpy as well as a willingness to ping developers for
>> clarification of some topics.
>>
>> Thoughts?
>
> First thought: very useful, but probably not GSOC topics by themselves.
>
> For a very good student, I'd think topics like implementing NA bit masks
> or improved user-defined dtypes would be interesting. In SciPy there's
> also a lot to do, and that's probably a better project for students who
> prefer to work in Python.
>
> Thanks for bringing this up. Last year we missed the boat; it would be
> good to get one or more slots this year.
>
> Ralf

Along with test coverage, have any of you considered any systematic
monitoring of NumPy performance? With all of the extensive refactoring /
work on the internals, it would be useful to keep an eye on things in
case of any performance regressions. I mention this because I started a
little prototype project (http://github.com/wesm/vbench) for doing
exactly that for my own development purposes-- it's already proved
extremely useful.

Anyway, just a thought. I'm sure a motivated student could spend a whole
summer writing unit tests for NumPy and nothing else.

- Wes

From jsalvati at u.washington.edu Thu Dec 29 17:16:46 2011
From: jsalvati at u.washington.edu (John Salvatier)
Date: Thu, 29 Dec 2011 14:16:46 -0800
Subject: [Numpy-discussion] Problem with changes to f2py
Message-ID:

Hi Numpy users!

I maintain the boundary value problem solver package scikits.bvp_solver.
It's had problems with f2py for a while, and I am not sure where they are
coming from.
I made this stackoverflow post some time ago, but I didn't get any
solutions. Here are the details:

I am trying to update my package scikits.bvp_solver (source here) and I
have run into some problems with f2py generated files. The files
'bvp_solverf-f2pywrappers2.f90' and 'bvp_solverfmodule.c', which were
generated in 2009, allow the package to be built in place with
"python setup.py build_ext --inplace", but if I delete them and try to
rebuild I get the error

scikits/bvp_solver/lib/bvp_solverf-f2pywrappers2.f90:218.48:

use guess_3_wrap__user__routines
1
Fatal Error: Can't open module file
'guess_3_wrap__user__routines.mod' for reading at (1): No such file or
directory

scikits/bvp_solver/lib/bvp_solverf-f2pywrappers2.f90:11.19:

The file bvp_interface.pyf specifies a python module
guess_3_wrap__user__routines, but I do not see this show up in the f2py
generated modules. The part that adds this use statement does not appear
in the old version of the file.

I am having difficulty figuring out how to fix this issue. Can anyone
offer advice? What are the major changes to f2py in the last two years?

I appreciate all clues on this issue.

Thank you in advance,
John

From deshpande.jaidev at gmail.com Thu Dec 29 23:37:13 2011
From: deshpande.jaidev at gmail.com (Jaidev Deshpande)
Date: Fri, 30 Dec 2011 10:07:13 +0530
Subject: [Numpy-discussion] GSOC
In-Reply-To:
References:
Message-ID:

Hi!

> Along with test coverage, have any of you considered any systematic
> monitoring of NumPy performance?

I'm mildly obsessed with performance and benchmarking of NumPy. I used
to use a lot of MATLAB until a year back, and I tend to compare Python
performance with it all the time. I generally don't feel happy until
I'm convinced that I've extracted the last bit of speed out of my
Python code.

I think the generalization of this idea is more or less equivalent to
performance benchmarking. Of course, I know there's a lot more than
'MATLAB vs Python' to it. I'd be more than happy to be involved, GSoC
or otherwise.

Where do I start?

Thanks

From jason-sage at creativetrax.com Thu Dec 29 23:45:46 2011
From: jason-sage at creativetrax.com (jason-sage at creativetrax.com)
Date: Thu, 29 Dec 2011 22:45:46 -0600
Subject: [Numpy-discussion] Numpy performance testing
In-Reply-To:
References:
Message-ID: <4EFD41FA.7010303@creativetrax.com>

On 12/29/11 10:37 PM, Jaidev Deshpande wrote:
> Hi!
>
>> Along with test coverage, have any of you considered any systematic
>> monitoring of NumPy performance?
>
> I'm mildly obsessed with performance and benchmarking of NumPy. I used
> to use a lot of MATLAB until a year back, and I tend to compare Python
> performance with it all the time. I generally don't feel happy until
> I'm convinced that I've extracted the last bit of speed out of my
> Python code.
>
> I think the generalization of this idea is more or less equivalent to
> performance benchmarking. Of course, I know there's a lot more than
> 'MATLAB vs Python' to it. I'd be more than happy to be involved, GSoC
> or otherwise.
>
> Where do I start?

We've recently had a discussion about more intelligent timeit commands
and timing objects in Python/Sage. People here might find the
discussion interesting, and it might also be interesting to collaborate
on code. The basic idea was a much smarter timeit command that uses
more intelligent statistics and presents a much more comprehensive look
at the timing information.
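For illustration, a rough stdlib-only sketch of that idea -- repeated
timings summarized with several statistics instead of just the minimum.
The function name and the statistics chosen here are made up and are not
what the Sage ticket below implements:

    import timeit

    def smart_timeit(stmt, setup='pass', repeat=30, number=1000):
        # collect `repeat` independent timings of `number` executions each
        raw = timeit.repeat(stmt, setup=setup, repeat=repeat, number=number)
        per_loop = sorted(t / number for t in raw)
        n = len(per_loop)
        mean = sum(per_loop) / n
        std = (sum((t - mean) ** 2 for t in per_loop) / (n - 1)) ** 0.5
        return {'min': per_loop[0], 'median': per_loop[n // 2],
                'mean': mean, 'std': std, 'max': per_loop[-1]}

    print(smart_timeit('sum(range(100))'))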
Here is the discussion:
https://groups.google.com/forum/#!topic/sage-devel/8lq3twm9Olc

Here is our ticket tracking the issue:
http://trac.sagemath.org/sage_trac/ticket/12168

Here are some examples of the analysis: http://sagenb.org/home/pub/3857/

I've CCd the sage-devel list as well, which is where our discussion
happened.

Thanks,

Jason

From alan.isaac at gmail.com Fri Dec 30 10:13:05 2011
From: alan.isaac at gmail.com (Alan G Isaac)
Date: Fri, 30 Dec 2011 10:13:05 -0500
Subject: [Numpy-discussion] Mersenne Twister: Python vs. NumPy
Message-ID: <4EFDD501.2040103@gmail.com>

If I seed NumPy's random number generator, I get the expected sequence.
If I use the same seed for Python's random number generator, I get a
different sequence.

1. Why does the Python sequence differ from others?
2. Can I somehow put both MTs in a common state?

Thank you,
Alan Isaac

From robert.kern at gmail.com Fri Dec 30 10:36:19 2011
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 30 Dec 2011 15:36:19 +0000
Subject: [Numpy-discussion] Mersenne Twister: Python vs. NumPy
In-Reply-To: <4EFDD501.2040103@gmail.com>
References: <4EFDD501.2040103@gmail.com>
Message-ID:

On Fri, Dec 30, 2011 at 15:13, Alan G Isaac wrote:
> If I seed NumPy's random number generator, I get the
> expected sequence.

What do you mean by "expected"? Where are these expectations coming
from? Other implementations of the Mersenne Twister?

> If I use the same seed for Python's
> random number generator, I get a different sequence.
>
> 1. Why does the Python sequence differ from others?

The initialization algorithm that takes the input seed (usually just an
integer) to a 624-word uint32 array that is the Mersenne Twister's state
vector. There are two initialization modes that are fairly standard
(since they were distributed with the original published MT sources).
One takes a 32-bit int, and the other takes an array of 32-bit ints.

For numpy, I made the choice that if the integer seed fits into a uint32,
then we would just use the initialization function for that. Python just
treats any integer input as if it were a Python long object, breaks it
up into 32-bit words and uses the array initialization function.

> 2. Can I somehow put both MTs in a common state?

With a little bit of work, yes. Look at random.Random.getstate() and
np.random.RandomState.get_state() and their associated setter functions.
You really just need to reformat the information to be acceptable to
the other.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From alan.isaac at gmail.com Fri Dec 30 11:04:59 2011
From: alan.isaac at gmail.com (Alan G Isaac)
Date: Fri, 30 Dec 2011 11:04:59 -0500
Subject: [Numpy-discussion] Mersenne Twister: Python vs. NumPy
In-Reply-To:
References: <4EFDD501.2040103@gmail.com>
Message-ID: <4EFDE12B.1060306@gmail.com>

> On Fri, Dec 30, 2011 at 15:13, Alan wrote:
>>> If I seed NumPy's random number generator, I get the
>>> expected sequence.

On 12/30/2011 10:36 AM, Robert Kern wrote:
> What do you mean by "expected"? Where are these expectations coming
> from? Other implementations of the Mersenne Twister?

Right. (GSL and Matlab.)

Thanks for the helpful information.

Alan
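As a concrete sketch of the state hand-off Robert describes above (the
exact state-tuple layouts and the VERSION constant 3 are assumptions about
current CPython and NumPy, so worth double-checking on your versions):

    import random
    import numpy as np

    np_rng = np.random.RandomState(12345)
    py_rng = random.Random()

    # numpy: ('MT19937', 624-word key array, pos, has_gauss, cached_gaussian)
    algo, key, pos, has_gauss, cached = np_rng.get_state()

    # CPython: (version, 625-tuple of the 624 key words plus the position,
    # gauss_next); reformat numpy's state into that shape
    py_rng.setstate((3, tuple(int(k) for k in key) + (pos,), None))

    # The underlying 32-bit word streams now coincide; both libraries also
    # appear to use the standard 53-bit conversion for doubles, so these
    # should match -- but verify on your installation.
    print(py_rng.random(), np_rng.random_sample())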
From lists at informa.tiker.net Fri Dec 30 13:57:39 2011
From: lists at informa.tiker.net (Andreas Kloeckner)
Date: Fri, 30 Dec 2011 19:57:39 +0100
Subject: [Numpy-discussion] dtype comparison, hash
In-Reply-To:
References: <878vlyu7uq.fsf@ding.tiker.net>
Message-ID: <87wr9drios.fsf@ding.tiker.net>

Hi Robert,

On Tue, 27 Dec 2011 10:17:41 +0000, Robert Kern wrote:
> On Tue, Dec 27, 2011 at 01:22, Andreas Kloeckner wrote:
>> Hi all,
>>
>> Two questions:
>>
>> - Are dtypes supposed to be comparable (i.e. implement '==', '!=')?
>
> Yes.
>
>> - Are dtypes supposed to be hashable?
>
> Yes, with caveats. Strictly speaking, we violate the condition that
> objects that equal each other should hash equal, since we define == to
> be rather free. Namely,
>
>     np.dtype(x) == x
>
> for all objects x that can be converted to a dtype.
>
>     np.dtype(float) == np.dtype('float')
>     np.dtype(float) == float
>     np.dtype(float) == 'float'
>
> Since hash(float) != hash('float'), we cannot implement
> np.dtype.__hash__() to follow the stricture that objects that compare
> equal should hash equal.
>
> However, if you restrict the domain of objects to just dtypes (i.e.
> only consider dicts that use only actual dtype objects as keys instead
> of arbitrary mixtures of objects), then the stricture is obeyed. This
> is a useful domain that is used internally in numpy.
>
> Is this the problem that you found?

Thanks for the reply.

It doesn't seem like this is our issue--instead, we're encountering two
different dtype objects that claim to be float64, compare as equal, but
don't hash to the same value.

I've asked the user who encountered the issue to investigate, and I'll
be back with more detail in a bit.

Andreas
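A quick demonstration of the caveat quoted above, matching the behavior
described in the reply (plain numpy; nothing else assumed):

    import numpy as np

    # == is deliberately permissive: anything convertible to a dtype
    # compares equal to it
    print(np.dtype(float) == np.dtype('float'))   # True
    print(np.dtype(float) == float)               # True
    print(np.dtype(float) == 'float')             # True

    # ...but the Python objects float and 'float' hash differently, so
    # "equal implies equal hash" can only hold on the restricted domain
    # of actual dtype objects
    print(hash(float) == hash('float'))                       # False in practice
    print(hash(np.dtype(float)) == hash(np.dtype('float')))   # True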
From robert.kern at gmail.com Fri Dec 30 15:05:14 2011
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 30 Dec 2011 20:05:14 +0000
Subject: [Numpy-discussion] dtype comparison, hash
In-Reply-To: <87wr9drios.fsf@ding.tiker.net>
References: <878vlyu7uq.fsf@ding.tiker.net> <87wr9drios.fsf@ding.tiker.net>
Message-ID:

On Fri, Dec 30, 2011 at 18:57, Andreas Kloeckner wrote:
> Hi Robert,
>
> On Tue, 27 Dec 2011 10:17:41 +0000, Robert Kern wrote:
>> On Tue, Dec 27, 2011 at 01:22, Andreas Kloeckner wrote:
>>> Hi all,
>>>
>>> Two questions:
>>>
>>> - Are dtypes supposed to be comparable (i.e. implement '==', '!=')?
>>
>> Yes.
>>
>>> - Are dtypes supposed to be hashable?
>>
>> Yes, with caveats. Strictly speaking, we violate the condition that
>> objects that equal each other should hash equal, since we define == to
>> be rather free. Namely,
>>
>>     np.dtype(x) == x
>>
>> for all objects x that can be converted to a dtype.
>>
>>     np.dtype(float) == np.dtype('float')
>>     np.dtype(float) == float
>>     np.dtype(float) == 'float'
>>
>> Since hash(float) != hash('float'), we cannot implement
>> np.dtype.__hash__() to follow the stricture that objects that compare
>> equal should hash equal.
>>
>> However, if you restrict the domain of objects to just dtypes (i.e.
>> only consider dicts that use only actual dtype objects as keys instead
>> of arbitrary mixtures of objects), then the stricture is obeyed. This
>> is a useful domain that is used internally in numpy.
>>
>> Is this the problem that you found?
>
> Thanks for the reply.
>
> It doesn't seem like this is our issue--instead, we're encountering two
> different dtype objects that claim to be float64, compare as equal, but
> don't hash to the same value.
>
> I've asked the user who encountered the issue to investigate, and I'll
> be back with more detail in a bit.

I think we've run into this before and tried to fix it. Try to find the
version of numpy the user has and a minimal example, if you can.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From chris.barker at noaa.gov Fri Dec 30 23:41:02 2011
From: chris.barker at noaa.gov (Chris Barker)
Date: Fri, 30 Dec 2011 20:41:02 -0800
Subject: [Numpy-discussion] GSOC
In-Reply-To:
References:
Message-ID:

On Thu, Dec 29, 2011 at 1:36 PM, Ralf Gommers wrote:
> First thought: very useful, but probably not GSOC topics by themselves.

Documentation is specifically excluded from GSoC (at least it was a
couple of years ago when I was last involved). Not sure about testing,
but I'd guess it can't be a project by itself.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From deshpande.jaidev at gmail.com Sat Dec 31 00:43:55 2011
From: deshpande.jaidev at gmail.com (Jaidev Deshpande)
Date: Sat, 31 Dec 2011 11:13:55 +0530
Subject: [Numpy-discussion] GSOC
In-Reply-To:
References:
Message-ID:

Hi Chris

> Documentation is specifically excluded from GSoC (at least it was a
> couple of years ago when I was last involved)

Documentation wasn't excluded from GSoC last year; there were quite a
few projects that required a lot of documentation. But yes, there was
no "documentation only" project.

Anyhow, it seems reasonable that testing alone can't be a project.

What about benchmarking and the related statistics? Does that qualify
as a worthwhile project (again, GSoC or otherwise)?

Thanks

From kalatsky at gmail.com Sat Dec 31 01:48:10 2011
From: kalatsky at gmail.com (Val Kalatsky)
Date: Sat, 31 Dec 2011 00:48:10 -0600
Subject: [Numpy-discussion] Ufuncs and flexible types, CAPI
Message-ID:

Hi folks,

First post, may not follow the standards, please bear with me.

I need to define a ufunc that takes care of various types. Fixed - no
problem, userdef - no problem, flexible - problem. It appears that the
standard ufunc loop does not provide means to deliver the size of
variable-size items.

Questions and suggestions:

1) Please no laughing: I have to code for NumPy 1.3.0. Perhaps this issue
has been resolved; then the discussion becomes moot. If so, please direct
me to the right link.

2) A reasonable approach here would be to use callbacks and to give the
user (read: programmer) a chance to intervene at least twice: OnInit and
OnFail (OnFinish may not be unreasonable as well).
OnInit: before starting the type resolution, the user is given a chance
to do something (e.g. check for that pesky type and take control, then
return a flag indicating a stop) before the resolution starts.
OnFail: the resolution took place and did not succeed; the user is given
a chance to fix it.
In most of the cases these callbacks are NULLs.

I could patch numpy with a generic method that does it, but it's a shame
not to use the good ufunc machinery.

Thanks for tips and suggestions.
Val Kalatsky

From ralf.gommers at googlemail.com Sat Dec 31 05:55:01 2011
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Sat, 31 Dec 2011 11:55:01 +0100
Subject: [Numpy-discussion] Numpy performance testing
In-Reply-To: <4EFD41FA.7010303@creativetrax.com>
References: <4EFD41FA.7010303@creativetrax.com>
Message-ID:

On Fri, Dec 30, 2011 at 5:45 AM, wrote:

> On 12/29/11 10:37 PM, Jaidev Deshpande wrote:
>> Hi!
>>
>>> Along with test coverage, have any of you considered any systematic
>>> monitoring of NumPy performance?
>>
>> I'm mildly obsessed with performance and benchmarking of NumPy. I used
>> to use a lot of MATLAB until a year back, and I tend to compare Python
>> performance with it all the time. I generally don't feel happy until
>> I'm convinced that I've extracted the last bit of speed out of my
>> Python code.
>>
>> I think the generalization of this idea is more or less equivalent to
>> performance benchmarking. Of course, I know there's a lot more than
>> 'MATLAB vs Python' to it. I'd be more than happy to be involved, GSoC
>> or otherwise.
>>
>> Where do I start?
>
> We've recently had a discussion about more intelligent timeit commands
> and timing objects in Python/Sage. People here might find the
> discussion interesting, and it might also be interesting to collaborate
> on code. The basic idea was a much smarter timeit command that uses
> more intelligent statistics and presents a much more comprehensive look
> at the timing information.
>
> Here is the discussion:
> https://groups.google.com/forum/#!topic/sage-devel/8lq3twm9Olc
>
> Here is our ticket tracking the issue:
> http://trac.sagemath.org/sage_trac/ticket/12168
>
> Here are some examples of the analysis: http://sagenb.org/home/pub/3857/

Nice. It would be cool to have this available as a separate ipython magic
command. For performance monitoring it's probably unnecessary; regular
%timeit should be OK for that.

Performance monitoring does require quite a bit of infrastructure (like
Wes' vbench project) though, which could be a good (GSOC) project. There
are other VCSs to support, maybe a buildbot plugin, many options there.

Ralf

From charlesr.harris at gmail.com Sat Dec 31 10:43:17 2011
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sat, 31 Dec 2011 08:43:17 -0700
Subject: [Numpy-discussion] GSOC
In-Reply-To:
References:
Message-ID:

On Thu, Dec 29, 2011 at 2:36 PM, Ralf Gommers wrote:

> On Thu, Dec 29, 2011 at 9:50 PM, Charles R Harris wrote:
>
>> Hi All,
>>
>> I thought I'd raise this topic just to get some ideas out there. At the
>> moment I see two areas that I'd like to see addressed.
>>
>> 1. Documentation editor. This would involve looking at the generated
>> documentation and its organization/coverage, as well as such things as
>> style, and maybe reviewing stuff on the documentation site. This would be
>> more technical writing than coding.
>>
>> 2. Test coverage. There are a lot of areas of numpy that are not well
>> tested, as well as some tests that are still doc tests and should probably
>> be updated. This is a substantial amount of work and would require some
>> familiarity with numpy as well as a willingness to ping developers for
>> clarification of some topics.
>>
>> Thoughts?
>
> First thought: very useful, but probably not GSOC topics by themselves.
>
> For a very good student, I'd think topics like implementing NA bit masks
> or improved user-defined dtypes would be interesting. In SciPy there's
> also a lot to do, and that's probably a better project for students who
> prefer to work in Python.

Good points. There is actually a fair bit of work that could go into NA.
The low-level infrastructure seems to me somewhat independent of the
arguments about the API. I see four areas there:

1) Size - that requires bit masks and a decision that masks only take
   two values.
2) Speed - that requires support in the ufunc loops.
3) Functions - isna needs some help, like isanyna(a, axis=1).
4) More support in current functions.

Chuck
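In the spirit of the isanyna(a, axis=1) example above, and assuming the
np.isna() elementwise predicate from the NA work, the convenience being
asked for is essentially a reduction over the NA mask. This is a sketch
against the proposed API, not a released function:

    import numpy as np

    def isanyna(a, axis=None):
        # True where any element along `axis` is NA; np.isna is assumed
        # to exist (it is part of the NA development branch, not a
        # released numpy API)
        return np.isna(a).any(axis=axis)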