From chan_dhf at yahoo.de Wed Aug 1 02:55:10 2007 From: chan_dhf at yahoo.de (Danny Chan) Date: Wed, 1 Aug 2007 08:55:10 +0200 (CEST) Subject: [Numpy-discussion] reading 10 bit raw data into an array In-Reply-To: <46AE17C2.80301@ieee.org> Message-ID: <51633.82348.qm@web26214.mail.ukl.yahoo.com> Hi Travis! I guess I will still have to pad my data to full bytes before reading it, correct? Travis Oliphant schrieb: Danny Chan wrote: > Hi all! > I'm trying to read a data file that contains a raw image file. Every > pixel is assigned a value from 0 to 1023, and all pixels are stored from > top left to bottom right pixel in binary format in this file. I know the > width and the height of the image, so all that would be required is to > read 10 bits at a time and store it these as an integer. I played around > with the fromstring and fromfile function, and I read the documentation > for dtype objects, but I'm still confused. It seems simple enough to > read data in a format with a standard bitwidth, but how can I read data > in a non-standard format. Can anyone help? > This kind of bit-manipulation must be done using bit operations on standard size data types even in C. The file reading and writing libraries use bytes as their common denominator. I would read in the entire image into a numpy array of unsigned bytes and then use slicing, masking, and bit-shifting to take 5 bytes at a time and convert them to 4 values of a 16-bit unsigned image. Basically, you would do something like # read in entire image into 1-d unsigned byte array # create 16-bit array of the correct 2-D size # use flat indexing to store into the new array # new.flat[::4] = old[::5] + bitwise_or(old[1::5], MASK1b) << SHIFT1b # new.flat[1::4] = bitwise_or(old[1::5], MASK2a) << SHIFT2a + bitwise_or(old[2::5], MASK2b) << SHIFT2b # etc. The exact MASKS and shifts to use is left as an exercise for the reader :-) -Travis _______________________________________________ Numpy-discussion mailing list Numpy-discussion at scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion --------------------------------- Jetzt Mails schnell in einem Vorschaufenster ?berfliegen. Dies und viel mehr bietet das neue Yahoo! Mail. -------------- next part -------------- An HTML attachment was scrubbed... URL: From haase at msg.ucsf.edu Wed Aug 1 05:03:27 2007 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Wed, 1 Aug 2007 11:03:27 +0200 Subject: [Numpy-discussion] reading 10 bit raw data into an array In-Reply-To: <51633.82348.qm@web26214.mail.ukl.yahoo.com> References: <46AE17C2.80301@ieee.org> <51633.82348.qm@web26214.mail.ukl.yahoo.com> Message-ID: On 8/1/07, Danny Chan wrote: > Hi Travis! > I guess I will still have to pad my data to full bytes before reading it, > correct? > > Travis Oliphant schrieb: > Danny Chan wrote: > > Hi all! > > I'm trying to read a data file that contains a raw image file. Every > > pixel is assigned a value from 0 to 1023, and all pixels are stored from > > top left to bottom right pixel in binary format in this file. I know the > > width and the height of the image, so all that would be required is to > > read 10 bits at a time and store it these as an integer. I played around > > with the fromstring and fromfile function, and I read the documentation > > for dtype objects, but I'm still confused. It seems simple enough to > > read data in a format with a standard bitwidth, but how can I read data > > in a non-standard format. Can anyone help? 
> > > > This kind of bit-manipulation must be done using bit operations on > standard size data types even in C. The file reading and writing > libraries use bytes as their common denominator. > > I would read in the entire image into a numpy array of unsigned bytes > and then use slicing, masking, and bit-shifting to take 5 bytes at a > time and convert them to 4 values of a 16-bit unsigned image. > > Basically, you would do something like > > # read in entire image into 1-d unsigned byte array > # create 16-bit array of the correct 2-D size > # use flat indexing to store into the new array > # new.flat[::4] = old[::5] + bitwise_or(old[1::5], MASK1b) << SHIFT1b > # new.flat[1::4] = bitwise_or(old[1::5], MASK2a) << SHIFT2a > + bitwise_or(old[2::5], MASK2b) << SHIFT2b > > # etc. > > > The exact MASKS and shifts to use is left as an exercise for the reader :-) Quick comment : are you really sure your camera produces the 12 bit data in a "12 bit stream" --- all I have ever seen is that cameras would just use 16 bit for each pixel. (All you had to know if it uses the left or the right part of those. In other words, you might have to divide (or use bit shifting) the data by 16.) Wasteful yes, but much simpler to handel !? -Sebastian Haase From lfriedri at imtek.de Wed Aug 1 10:09:37 2007 From: lfriedri at imtek.de (Lars Friedrich) Date: Wed, 01 Aug 2007 16:09:37 +0200 Subject: [Numpy-discussion] fourier with single precision Message-ID: <46B09421.3020807@imtek.de> Hello, is there a way to tell numpy.fft.fft2 to use complex64 instead of complex128 as output dtype to speed the up transformation? Thanks Lars From vincent.nijs at gmail.com Wed Aug 1 11:30:21 2007 From: vincent.nijs at gmail.com (Vincent) Date: Wed, 01 Aug 2007 15:30:21 -0000 Subject: [Numpy-discussion] How to implement a 'pivot table?' In-Reply-To: References: Message-ID: <1185982221.939389.53900@g12g2000prg.googlegroups.com> I do a lot of this kind of things in SAS. In don't like SAS that much so it would be great to have functionality like this for numpy recarray's. To transplant the approach that SAS takes to a numpy setting you'd have something like the following 4 steps: 1. Sort the data by date and region 2. Determine the indices for the blocks (e.g., East, 1/1) 3. calculate the summary stats per block SAS is very efficient at these types of operations i believe. Since it assumes that the data is sorted, and throws and error if the data is not sorted appropriately, i assume the indexing can be more efficient. However, given the earlier comments i am wonder if this approach would enhance performance. I would be very interested to see what you come up with so please post some of the code and/or timing tests to the list if possible. Best, Vincent From chan_dhf at yahoo.de Wed Aug 1 13:38:44 2007 From: chan_dhf at yahoo.de (Danny Chan) Date: Wed, 1 Aug 2007 19:38:44 +0200 Subject: [Numpy-discussion] reading 10 bit raw data into an array In-Reply-To: References: <46AE17C2.80301@ieee.org> <51633.82348.qm@web26214.mail.ukl.yahoo.com> Message-ID: <200708011938.45196.chan_dhf@yahoo.de> > > Quick comment : are you really sure your camera produces the 12 bit > data in a "12 bit stream" --- all I have ever seen is that cameras > would just use 16 bit for each pixel. (All you had to know if it uses > the left or the right part of those. In other words, you might have > to divide (or use bit shifting) the data by 16.) > Wasteful yes, but much simpler to handel !? > 10 bit, but yes, I am sure. 
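(For reference, a minimal sketch of the slicing/shifting Travis described, assuming the pixels are packed MSB-first so that every 5 bytes hold 4 ten-bit values -- the exact masks and shifts depend on how the sensor actually packs the bits, and the file name, height and width below are placeholders:

import numpy

raw = numpy.fromfile('frame.raw', dtype=numpy.uint8)   # length must be a multiple of 5
raw = raw.astype(numpy.uint16)                         # widen before shifting
out = numpy.empty(len(raw) // 5 * 4, dtype=numpy.uint16)
out[0::4] = (raw[0::5] << 2) | (raw[1::5] >> 6)
out[1::4] = ((raw[1::5] & 0x3F) << 4) | (raw[2::5] >> 4)
out[2::4] = ((raw[2::5] & 0x0F) << 6) | (raw[3::5] >> 2)
out[3::4] = ((raw[3::5] & 0x03) << 8) | raw[4::5]
image = out.reshape(height, width)                     # height and width are known

so no padding to full bytes should be needed, as long as the 5-byte groups line up.)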
It is an embedded camera system, in fact, I get the data stream even before it is handled to any ISP for further processing of the picture. In the end, the ISP will convert the data stream to another format, but I have to simulate some of the algorithms that will be implemented in hardware. From bsouthey at gmail.com Wed Aug 1 15:02:24 2007 From: bsouthey at gmail.com (Bruce Southey) Date: Wed, 1 Aug 2007 14:02:24 -0500 Subject: [Numpy-discussion] How to implement a 'pivot table?' In-Reply-To: References: Message-ID: Hi, The hard part is knowing what aggregate function that you want. So a hard way, even after cheating, to take the data provided is given below. (The Numpy Example List was very useful especially on the where function)! I tried to be a little generic so you can replace the sum by any suitable function and probably the array type as well. Of course it is not complete because you still need to know the levels of the 'rows' and 'columns' and also is not efficient as it has loops. Bruce from numpy import * A=array([[1,1,10], [1,1,20], [1,2,30], [2,1,40], [2,2,50], [2,2,60] ]) C = zeros((2,2)) for i in range(2): crit1 = (A[:,0]==1+i) subA=A[crit1,1:] for j in range(2): crit2 = (subA[:,0]==1+j) subB=subA[crit2,1:] C[i,j]=subB.sum() print C On 7/30/07, Geoffrey Zhu wrote: > Hi Everyone, > > I am wondering what is the best (and fast) way to build a pivot table > aside from the 'brute force way?' > > I want to transform an numpy array into a pivot table. For example, if > I have a numpy array like below: > > Region Date # of Units > ---------- ---------- -------------- > East 1/1 10 > East 1/1 20 > East 1/2 30 > West 1/1 40 > West 1/2 50 > West 1/2 60 > > I want to transform this into the following table, where f() is a > given aggregate function: > > Date > Region 1/1 1/2 > ---------- > East f(10,20) f(30) > West f(40) f(50,60) > > > I can regroup them into 'sets' and do it the brute force way, but that > is kind of slow to execute. Does anyone know a better way? > > > Thanks, > Geoffrey > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From rhdireen at gmail.com Wed Aug 1 19:24:46 2007 From: rhdireen at gmail.com (Randy Direen) Date: Wed, 1 Aug 2007 17:24:46 -0600 Subject: [Numpy-discussion] f2py self documenting not working Message-ID: Im using f2py under numpy. I've written several simple examples and f2py has not generated any documentations for the routines I have made. Any help would be great, I am very new to f2py and I would like to use the tool to wrap a rather large program written in Fortran90. Thanks! Randy Direen NIST -------------- next part -------------- An HTML attachment was scrubbed... URL: From travis at enthought.com Thu Aug 2 00:22:31 2007 From: travis at enthought.com (Travis Vaught) Date: Wed, 1 Aug 2007 23:22:31 -0500 Subject: [Numpy-discussion] How to implement a 'pivot table?' In-Reply-To: References: Message-ID: <57273045-FDD5-4454-A1DE-9CB12387DA88@enthought.com> Greetings, Speaking of brute force... I've attached a rather ugly module that let's you do things with a pretty simple interface (session shown below). I haven't fully tested the performance, but a million records with 5 fields takes about 11 seconds on my Mac to do a 'mean'. I'm not sure what your performance considerations are, but this may be useful. Record arrays are really nice if they make sense for your data. 
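To give a feel for the interface without opening the attachment, the core of such a pivot helper could look roughly like this (a hypothetical sketch along the same lines -- the attached testpivot.py may differ in its details):

from numpy import unique, zeros, sum as psum, mean as pmean

def pivot(rec, rowname, colname, valname, aggfunc):
    # aggregate rec[valname] over the cross of two record-array fields
    rows = unique(rec[rowname])
    cols = unique(rec[colname])
    table = zeros((len(rows), len(cols)))
    for i, r in enumerate(rows):
        for j, c in enumerate(cols):
            mask = (rec[rowname] == r) & (rec[colname] == c)
            table[i, j] = aggfunc(rec[valname][mask])
    return rows, cols, table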
Travis (from an ipython command prompt) In [1]: import testpivot as p In [2]: a = p.sample_data() In [3]: a Out[3]: recarray([('ACorp', 'Region 1', 'Q1', 20000.0), ('ACorp', 'Region 1', 'Q2', 22000.0), ('ACorp', 'Region 1', 'Q3', 21000.0), ('ACorp', 'Region 1', 'Q4', 26000.0), ('ACorp', 'Region 2', 'Q1', 23000.0), ('ACorp', 'Region 2', 'Q2', 20000.0), ('ACorp', 'Region 2', 'Q3', 22000.0), ('ACorp', 'Region 2', 'Q4', 21000.0), ('ACorp', 'Region 3', 'Q1', 26000.0), ('ACorp', 'Region 3', 'Q2', 23000.0), ('ACorp', 'Region 3', 'Q3', 29000.0), ('ACorp', 'Region 3', 'Q4', 27000.0), ('BCorp', 'Region 1', 'Q1', 20000.0), ('BCorp', 'Region 1', 'Q2', 20000.0), ('BCorp', 'Region 1', 'Q3', 24000.0), ('BCorp', 'Region 1', 'Q4', 24000.0), ('BCorp', 'Region 2', 'Q1', 21000.0), ('BCorp', 'Region 2', 'Q2', 21000.0), ('BCorp', 'Region 2', 'Q3', 22000.0), ('BCorp', 'Region 2', 'Q4', 29000.0), ('BCorp', 'Region 3', 'Q1', 28000.0), ('BCorp', 'Region 3', 'Q2', 25000.0), ('BCorp', 'Region 3', 'Q3', 22000.0), ('BCorp', 'Region 3', 'Q4', 21000.0)], dtype=[('company', '|S5'), ('region', '|S8'), ('quarter', '| S2'), ('income', ' -------------- next part -------------- On Aug 1, 2007, at 2:02 PM, Bruce Southey wrote: > Hi, > The hard part is knowing what aggregate function that you want. So a > hard way, even after cheating, to take the data provided is given > below. (The Numpy Example List was very useful especially on the where > function)! > > I tried to be a little generic so you can replace the sum by any > suitable function and probably the array type as well. Of course it is > not complete because you still need to know the levels of the 'rows' > and 'columns' and also is not efficient as it has loops. > > Bruce > > from numpy import * > A=array([[1,1,10], > [1,1,20], > [1,2,30], > [2,1,40], > [2,2,50], > [2,2,60] ]) > C = zeros((2,2)) > > for i in range(2): > crit1 = (A[:,0]==1+i) > subA=A[crit1,1:] > for j in range(2): > crit2 = (subA[:,0]==1+j) > subB=subA[crit2,1:] > C[i,j]=subB.sum() > > > print C > > On 7/30/07, Geoffrey Zhu wrote: >> Hi Everyone, >> >> I am wondering what is the best (and fast) way to build a pivot table >> aside from the 'brute force way?' >> >> I want to transform an numpy array into a pivot table. For >> example, if >> I have a numpy array like below: >> >> Region Date # of Units >> ---------- ---------- -------------- >> East 1/1 10 >> East 1/1 20 >> East 1/2 30 >> West 1/1 40 >> West 1/2 50 >> West 1/2 60 >> >> I want to transform this into the following table, where f() is a >> given aggregate function: >> >> Date >> Region 1/1 1/2 >> ---------- >> East f(10,20) f(30) >> West f(40) f(50,60) >> >> >> I can regroup them into 'sets' and do it the brute force way, but >> that >> is kind of slow to execute. Does anyone know a better way? 
>> >> >> Thanks, >> Geoffrey >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://projects.scipy.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From david at ar.media.kyoto-u.ac.jp Thu Aug 2 00:59:47 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 02 Aug 2007 13:59:47 +0900 Subject: [Numpy-discussion] fourier with single precision In-Reply-To: <46B09421.3020807@imtek.de> References: <46B09421.3020807@imtek.de> Message-ID: <46B164C3.6040504@ar.media.kyoto-u.ac.jp> Lars Friedrich wrote: > Hello, > > is there a way to tell numpy.fft.fft2 to use complex64 instead of > complex128 as output dtype to speed the up transformation? > As far as I can read from the fft code in numpy, only double is supported at the moment, unfortunately. Note that you can get some speed by using scipy.fftpack methods instead, if scipy is an option for you. David From goddard at cgl.ucsf.edu Thu Aug 2 01:43:01 2007 From: goddard at cgl.ucsf.edu (Tom Goddard) Date: Wed, 01 Aug 2007 22:43:01 -0700 Subject: [Numpy-discussion] Memory efficient equality test for arrays Message-ID: <46B16EE5.5010900@cgl.ucsf.edu> Is there a numpy call to test if two large arrays (say 1 Gbyte each) are equal (same shape and elements) without creating another large array of booleans as happens with "a == b", numpy.equal(a,b), or numpy.array_equal(a,b)? I want a memory efficient and fast comparison. Tom From haase at msg.ucsf.edu Thu Aug 2 06:20:32 2007 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Thu, 2 Aug 2007 12:20:32 +0200 Subject: [Numpy-discussion] rant against from numpy import * / from pylab import * In-Reply-To: <45FA4377.6010201@hawaii.edu> References: <45FA4377.6010201@hawaii.edu> Message-ID: Hi all, Here a quick update: I'm trying to have a concise / sparse module with containing only pylab-specific names and not all names I already have in numpy. To easy typing I want to call numpy "N" and my pylab "P". I'm now using this code: import matplotlib, new matplotlib.use('WXAgg') from matplotlib import pylab P = new.module("pylab_sparse","""pylab module minus stuff alreay in numpy""") for k,v in pylab.__dict__.iteritems(): try: if v is N.__dict__[k]: continue except KeyError: pass P.__dict__[k] = v P.ion() del matplotlib, new, pylab The result is "some" reduction in the number of non-pylab-specific names in my "P"-module. However there seem to be still many extra names left, like e.g.: alltrue, amax, array, ... look at this: # 20070802 # >>> len(dir(pylab)) # 441 # >>> len(dir(P)) # 346 # >>> P.nx.numpy.__version__ # '1.0.1' # >>> N.__version__ # '1.0.1' # >>> N.alltrue # # >>> P.alltrue # # >>> N.alltrue.__doc__ # 'Perform a logical_and over the given axis.' # >>> P.alltrue.__doc__ # >>> #N.alltrue(x, axis=None, out=None) # >>> #P.alltrue(x, axis=0) I'm using matplotlib with __version__ = '0.90.0' __revision__ = '$Revision: 3003 $' __date__ = '$Date: 2007-02-06 22:24:06 -0500 (Tue, 06 Feb 2007) $' Any hint how to further reduce the number of names in "P" ? My ideal would be that the "P" module (short for pylab) would only contain the stuff described in the __doc__ strings of `pylab.py` and `__init__.py`(in matplotlib) (+ plus some extra, undocumented, yet pylab specific things) Thanks -Sebastian On 3/16/07, Eric Firing wrote: > Sebastian Haase wrote: > > Hi! 
> > I use the wxPython PyShell. > > I like especially the feature that when typing a module and then the > > dot "." I get a popup list of all available functions (names) inside > > that module. > > > > Secondly, I think it really makes code clearer when one can see where > > a function comes from. > > > > I have a default > > import numpy as N > > executed before my shell even starts. > > In fact I have a bunch of my "standard" modules imported as > single capital letter>. > > > > This - I think - is a good compromise to the commonly used "extra > > typing" and "unreadable" argument. > > > > a = sin(b) * arange(10,50, .1) * cos(d) > > vs. > > a = N.sin(b) * N.arange(10,50, .1) * N.cos(d) > > I generally do the latter, but really, all those "N." bits are still > visual noise when it comes to reading the code--that is, seeing the > algorithm rather than where the functions come from. I don't think > there is anything wrong with explicitly importing commonly-used names, > especially things like sin and cos. > > > > > I would like to hear some comments by others. > > > > > > On a different note: I just started using pylab, so I did added an > > automatic "from matplotlib import pylab as P" -- but now P contains > > everything that I already have in N. It makes it really hard to > > *find* (as in *see* n the popup-list) the pylab-only functions. -- > > what can I do about this ? > > A quick and dirty solution would be to comment out most of the imports > in pylab.py; they are not needed for the pylab functions and are there > only to give people lots of functionality in a single namespace. > > I am cross-posting this to matplotlib-users because it involves mpl, and > an alternative solution would be for us to add an rcParam entry to allow > one to turn off all of the namespace consolidation. A danger is that if > someone is using "from pylab import *" in a script, then whether it > would run would depend on the matplotlibrc file. To get around that, > another possibility would be to break pylab.py into two parts, with > pylab.py continuing to do the namespace consolidation and importing the > second part, which would contain the actual pylab functions. Then if > you don't want the namespace consolidation, you could simply import the > second part instead of pylab. There may be devils in the details, but > it seems to me that this last alternative--splitting pylab.py--might > make a number of people happier while having no adverse effects on > everyone else. > > Eric > > > > > > Thanks, > > Sebastian From lfriedri at imtek.de Thu Aug 2 12:40:13 2007 From: lfriedri at imtek.de (Lars Friedrich) Date: Thu, 02 Aug 2007 18:40:13 +0200 Subject: [Numpy-discussion] fourier with single precision Message-ID: <46B208ED.1010403@imtek.de> Hello, David Cournapeau wrote: > As far as I can read from the fft code in numpy, only double is > supported at the moment, unfortunately. Note that you can get some speed > by using scipy.fftpack methods instead, if scipy is an option for you. What I understood is that numpy uses FFTPACK's algorithms. From www.netlib.org/fftpack (is this the right address?) I took that there is a single-precision and double-precision-version of the algorithms. How hard would it be (for example for me...) to add the single-precision versions to numpy? I am not a decent C-hacker, but if someone tells me, that this task is not *too* hard, I would start looking more closely at the code... 
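(For what it is worth, one simple workaround for now is to let the transform run in double precision and cast the result back down, e.g.

import numpy
a = numpy.random.rand(512, 512).astype(numpy.float32)   # stand-in for the real data
spec = numpy.fft.fft2(a).astype(numpy.complex64)

which saves memory for everything downstream, but of course not during the transform itself.)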
Would it make sense, that if one passes an array of dtype = numpy.float32 to the fft function, a complex64 is returned, and if one passes an array of dtype = numpy.float64, a complex128 is returned? Lars From rmay at ou.edu Thu Aug 2 15:18:49 2007 From: rmay at ou.edu (Ryan May) Date: Thu, 02 Aug 2007 14:18:49 -0500 Subject: [Numpy-discussion] 16bit Integer Array/Scalar Inconsistency Message-ID: <46B22E19.2090707@ou.edu> Hi, I ran into this while debugging a script today: In [1]: import numpy as N In [2]: N.__version__ Out[2]: '1.0.3' In [3]: d = N.array([32767], dtype=N.int16) In [4]: d + 32767 Out[4]: array([-2], dtype=int16) In [5]: d[0] + 32767 Out[5]: 65534 In [6]: type(d[0] + 32767) Out[6]: In [7]: type(d[0]) Out[7]: It seems that numpy will automatically promote the scalar to avoid overflow, but not in the array case. Is this inconsistency a bug, just a (known) gotcha? I myself don't have any problems with the array not being promoted automatically, but the inconsistency with scalar operation made debugging my problem more difficult. Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma From dalcinl at gmail.com Thu Aug 2 15:22:07 2007 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 2 Aug 2007 16:22:07 -0300 Subject: [Numpy-discussion] reference leacks in numpy.asarray Message-ID: using numpy-1.0.3, I believe there are a reference leak somewhere. Using a debug build of Python 2.5.1 (--with-pydebug), I get the following import sys, gc import numpy def testleaks(func, args=(), kargs={}, repeats=5): for i in xrange(repeats): r1 = sys.gettotalrefcount() func(*args,**kargs) r2 = sys.gettotalrefcount() rd = r2-r1 print 'before: %d, after: %d, diff: [%d]' % (r1, r2, rd) def npy_asarray_1(): a = numpy.zeros(5, dtype=int) b = numpy.asarray(a, dtype=float) del a, b def npy_asarray_2(): a = numpy.zeros(5, dtype=float) b = numpy.asarray(a, dtype=float) del a, b if __name__ == '__main__': testleaks(npy_asarray_1) testleaks(npy_asarray_2) $ python npyleaktest.py before: 84531, after: 84532, diff: [1] before: 84534, after: 84534, diff: [0] before: 84534, after: 84534, diff: [0] before: 84534, after: 84534, diff: [0] before: 84534, after: 84534, diff: [0] before: 84531, after: 84533, diff: [2] before: 84535, after: 84536, diff: [1] before: 84536, after: 84537, diff: [1] before: 84537, after: 84538, diff: [1] before: 84538, after: 84539, diff: [1] It seems npy_asarray_2() is leaking a reference. I am missing something here?. The same problem is found in C, using PyArray_FROM_OTF (no time to go inside to see what's going on, sorry) If this is know and solved in SVN, please forget me. 
Regards, -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From robert.kern at gmail.com Thu Aug 2 15:31:07 2007 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 02 Aug 2007 14:31:07 -0500 Subject: [Numpy-discussion] 16bit Integer Array/Scalar Inconsistency In-Reply-To: <46B22E19.2090707@ou.edu> References: <46B22E19.2090707@ou.edu> Message-ID: <46B230FB.5090201@gmail.com> Ryan May wrote: > Hi, > > I ran into this while debugging a script today: > > In [1]: import numpy as N > > In [2]: N.__version__ > Out[2]: '1.0.3' > > In [3]: d = N.array([32767], dtype=N.int16) > > In [4]: d + 32767 > Out[4]: array([-2], dtype=int16) > > In [5]: d[0] + 32767 > Out[5]: 65534 > > In [6]: type(d[0] + 32767) > Out[6]: > > In [7]: type(d[0]) > Out[7]: > > It seems that numpy will automatically promote the scalar to avoid > overflow, but not in the array case. Is this inconsistency a bug, just > a (known) gotcha? Known feature. When arrays and scalars are mixed and the types are within the same kind (e.g. both are integer types just at different precisions), the type of the scalar is ignored. This solves one of the usability issues with trying to use lower precisions; you still want to be able to divide by 2.0, for example, without automatically up-casting your very large float32 array. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From focke at slac.stanford.edu Thu Aug 2 15:51:57 2007 From: focke at slac.stanford.edu (Warren Focke) Date: Thu, 2 Aug 2007 12:51:57 -0700 (PDT) Subject: [Numpy-discussion] fourier with single precision In-Reply-To: <46B208ED.1010403@imtek.de> References: <46B208ED.1010403@imtek.de> Message-ID: On Thu, 2 Aug 2007, Lars Friedrich wrote: > What I understood is that numpy uses FFTPACK's algorithms. Sort of. It appears to be a hand translation from F77 to C. > From www.netlib.org/fftpack (is this the right address?) I took that > there is a single-precision and double-precision-version of the > algorithms. How hard would it be (for example for me...) to add the > single-precision versions to numpy? I am not a decent C-hacker, but if > someone tells me, that this task is not *too* hard, I would start > looking more closely at the code... It shouldn't be hard. fftpack.c will make a single-precision version if DOUBLE is not defined at compile time. > Would it make sense, that if one passes an array of dtype = > numpy.float32 to the fft function, a complex64 is returned, and if one > passes an array of dtype = numpy.float64, a complex128 is returned? Sounds like reasonable default behavior. Might be useful if the caller could overrride it. w From tim.hochberg at ieee.org Thu Aug 2 17:15:03 2007 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Thu, 2 Aug 2007 14:15:03 -0700 Subject: [Numpy-discussion] reference leacks in numpy.asarray In-Reply-To: References: Message-ID: On 8/2/07, Lisandro Dalcin wrote: > > using numpy-1.0.3, I believe there are a reference leak somewhere. 
> Using a debug build of Python 2.5.1 (--with-pydebug), I get the > following > > import sys, gc > import numpy > > def testleaks(func, args=(), kargs={}, repeats=5): > for i in xrange(repeats): > r1 = sys.gettotalrefcount() > func(*args,**kargs) > r2 = sys.gettotalrefcount() > rd = r2-r1 > print 'before: %d, after: %d, diff: [%d]' % (r1, r2, rd) > > def npy_asarray_1(): > a = numpy.zeros(5, dtype=int) > b = numpy.asarray(a, dtype=float) > del a, b > > def npy_asarray_2(): > a = numpy.zeros(5, dtype=float) > b = numpy.asarray(a, dtype=float) > del a, b > > if __name__ == '__main__': > testleaks(npy_asarray_1) > testleaks(npy_asarray_2) > > > $ python npyleaktest.py > before: 84531, after: 84532, diff: [1] > before: 84534, after: 84534, diff: [0] > before: 84534, after: 84534, diff: [0] > before: 84534, after: 84534, diff: [0] > before: 84534, after: 84534, diff: [0] > before: 84531, after: 84533, diff: [2] > before: 84535, after: 84536, diff: [1] > before: 84536, after: 84537, diff: [1] > before: 84537, after: 84538, diff: [1] > before: 84538, after: 84539, diff: [1] > > It seems npy_asarray_2() is leaking a reference. I am missing > something here?. The same problem is found in C, using > PyArray_FROM_OTF (no time to go inside to see what's going on, sorry) > > If this is know and solved in SVN, please forget me. I don't have a debug build handy to test this on, but this might not be a reference leak. Since you are checking the count before and after each cycle, it could be that there are cycles being created that are subsequently cleaned up by the garbage collector. Can you try instead to look at the difference between the reference count at the end of each cycle with the reference count before the first cycle? If that goes up indefinitely, then it's probably a leak. If it bounces around or levels off, then probably not. You'd probably want to run a bunch of repeats just to be sure. regards, -tim Regards, > > > -- > Lisandro Dalc?n > --------------- > Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) > Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) > Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) > PTLC - G?emes 3450, (3000) Santa Fe, Argentina > Tel/Fax: +54-(0)342-451.1594 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From dalcinl at gmail.com Thu Aug 2 18:03:19 2007 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 2 Aug 2007 19:03:19 -0300 Subject: [Numpy-discussion] reference leacks in numpy.asarray In-Reply-To: References: Message-ID: Ups, I forgot to mention I was using gc.collect(), I accidentally cleaned it my mail Anyway, the following import sys, gc import numpy def test(): a = numpy.zeros(5, dtype=float) while 1: gc.collect() b = numpy.asarray(a, dtype=float); del b gc.collect() print sys.gettotalrefcount() test() shows in mi box alway 1 more totalrefcount in each pass, so always increasing. IMHO, I still think there is a leak somewere. And now, I am not sure if PyArray_FromAny is the source of the problem. On 8/2/07, Timothy Hochberg wrote: > > > > On 8/2/07, Lisandro Dalcin wrote: > > using numpy-1.0.3, I believe there are a reference leak somewhere. 
> > Using a debug build of Python 2.5.1 (--with-pydebug), I get the > > following > > > > import sys, gc > > import numpy > > > > def testleaks(func, args=(), kargs={}, repeats=5): > > for i in xrange(repeats): > > r1 = sys.gettotalrefcount() > > func(*args,**kargs) > > r2 = sys.gettotalrefcount() > > rd = r2-r1 > > print 'before: %d, after: %d, diff: [%d]' % (r1, r2, rd) > > > > def npy_asarray_1(): > > a = numpy.zeros(5, dtype=int) > > b = numpy.asarray(a, dtype=float) > > del a, b > > > > def npy_asarray_2(): > > a = numpy.zeros(5, dtype=float) > > b = numpy.asarray(a, dtype=float) > > del a, b > > > > if __name__ == '__main__': > > testleaks(npy_asarray_1) > > testleaks(npy_asarray_2) > > > > > > $ python npyleaktest.py > > before: 84531, after: 84532, diff: [1] > > before: 84534, after: 84534, diff: [0] > > before: 84534, after: 84534, diff: [0] > > before: 84534, after: 84534, diff: [0] > > before: 84534, after: 84534, diff: [0] > > before: 84531, after: 84533, diff: [2] > > before: 84535, after: 84536, diff: [1] > > before: 84536, after: 84537, diff: [1] > > before: 84537, after: 84538, diff: [1] > > before: 84538, after: 84539, diff: [1] > > > > It seems npy_asarray_2() is leaking a reference. I am missing > > something here?. The same problem is found in C, using > > PyArray_FROM_OTF (no time to go inside to see what's going on, sorry) > > > > If this is know and solved in SVN, please forget me. > > I don't have a debug build handy to test this on, but this might not be a > reference leak. Since you are checking the count before and after each > cycle, it could be that there are cycles being created that are subsequently > cleaned up by the garbage collector. > > Can you try instead to look at the difference between the reference count at > the end of each cycle with the reference count before the first cycle? If > that goes up indefinitely, then it's probably a leak. If it bounces around > or levels off, then probably not. You'd probably want to run a bunch of > repeats just to be sure. > regards, > -tim > > > Regards, > > > > > > -- > > Lisandro Dalc?n > > --------------- > > Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) > > Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) > > Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) > > PTLC - G?emes 3450, (3000) Santa Fe, Argentina > > Tel/Fax: +54-(0)342-451.1594 > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at scipy.org > > > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > > > -- > . __ > . |-\ > . > . 
tim.hochberg at ieee.org > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From dalcinl at gmail.com Thu Aug 2 18:20:22 2007 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 2 Aug 2007 19:20:22 -0300 Subject: [Numpy-discussion] reference leacks in numpy.asarray In-Reply-To: References: Message-ID: I think the problem is in _array_fromobject (seen as numpy.array in Python) This function parses its arguments by using the convertor PyArray_DescrConverter2. which RETURNS A NEW REFERENCE!!! This reference is never DECREF'ed. BTW, A lesson I've learned of the pattern if (!PyArg_ParseXXX(....)) return NULL is that convertor functions should NEVER return new references to PyObject*'s, because if the conversion fails (because of latter wrong argument), you leak a reference to the 'converted' object. If this pattern is used everywhere in numpy, well, there are big chances of leaking references in the case of bad args to C functions. Regards, On 8/2/07, Timothy Hochberg wrote: > > > > On 8/2/07, Lisandro Dalcin wrote: > > using numpy-1.0.3, I believe there are a reference leak somewhere. > > Using a debug build of Python 2.5.1 (--with-pydebug), I get the > > following > > > > import sys, gc > > import numpy > > > > def testleaks(func, args=(), kargs={}, repeats=5): > > for i in xrange(repeats): > > r1 = sys.gettotalrefcount() > > func(*args,**kargs) > > r2 = sys.gettotalrefcount() > > rd = r2-r1 > > print 'before: %d, after: %d, diff: [%d]' % (r1, r2, rd) > > > > def npy_asarray_1(): > > a = numpy.zeros(5, dtype=int) > > b = numpy.asarray(a, dtype=float) > > del a, b > > > > def npy_asarray_2(): > > a = numpy.zeros(5, dtype=float) > > b = numpy.asarray(a, dtype=float) > > del a, b > > > > if __name__ == '__main__': > > testleaks(npy_asarray_1) > > testleaks(npy_asarray_2) > > > > > > $ python npyleaktest.py > > before: 84531, after: 84532, diff: [1] > > before: 84534, after: 84534, diff: [0] > > before: 84534, after: 84534, diff: [0] > > before: 84534, after: 84534, diff: [0] > > before: 84534, after: 84534, diff: [0] > > before: 84531, after: 84533, diff: [2] > > before: 84535, after: 84536, diff: [1] > > before: 84536, after: 84537, diff: [1] > > before: 84537, after: 84538, diff: [1] > > before: 84538, after: 84539, diff: [1] > > > > It seems npy_asarray_2() is leaking a reference. I am missing > > something here?. The same problem is found in C, using > > PyArray_FROM_OTF (no time to go inside to see what's going on, sorry) > > > > If this is know and solved in SVN, please forget me. > > I don't have a debug build handy to test this on, but this might not be a > reference leak. Since you are checking the count before and after each > cycle, it could be that there are cycles being created that are subsequently > cleaned up by the garbage collector. > > Can you try instead to look at the difference between the reference count at > the end of each cycle with the reference count before the first cycle? If > that goes up indefinitely, then it's probably a leak. If it bounces around > or levels off, then probably not. 
You'd probably want to run a bunch of > repeats just to be sure. > regards, > -tim > > > Regards, > > > > > > -- > > Lisandro Dalc?n > > --------------- > > Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) > > Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) > > Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) > > PTLC - G?emes 3450, (3000) Santa Fe, Argentina > > Tel/Fax: +54-(0)342-451.1594 > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at scipy.org > > > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > > > -- > . __ > . |-\ > . > . tim.hochberg at ieee.org > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From charlesr.harris at gmail.com Thu Aug 2 19:01:55 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 2 Aug 2007 17:01:55 -0600 Subject: [Numpy-discussion] fourier with single precision In-Reply-To: References: <46B208ED.1010403@imtek.de> Message-ID: On 8/2/07, Warren Focke wrote: > > > > On Thu, 2 Aug 2007, Lars Friedrich wrote: > > > What I understood is that numpy uses FFTPACK's algorithms. > > Sort of. It appears to be a hand translation from F77 to C. > > > From www.netlib.org/fftpack (is this the right address?) I took that > > there is a single-precision and double-precision-version of the > > algorithms. How hard would it be (for example for me...) to add the > > single-precision versions to numpy? I am not a decent C-hacker, but if > > someone tells me, that this task is not *too* hard, I would start > > looking more closely at the code... > > It shouldn't be hard. fftpack.c will make a single-precision version if > DOUBLE is not defined at compile time. > > > Would it make sense, that if one passes an array of dtype = > > numpy.float32 to the fft function, a complex64 is returned, and if one > > passes an array of dtype = numpy.float64, a complex128 is returned? > > Sounds like reasonable default behavior. Might be useful if the caller > could overrride it. On X86 machines the main virtue would be smaller and more cache friendly arrays because double precision arithmetic is about the same speed as single precision, sometimes even a bit faster. The PPC architecture does have faster single than double precision, so there it could make a difference. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From dalcinl at gmail.com Thu Aug 2 19:42:07 2007 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 2 Aug 2007 20:42:07 -0300 Subject: [Numpy-discussion] reference leacks in numpy.asarray In-Reply-To: References: Message-ID: This patch corrected the problem for me, numpy test pass... 
On 8/2/07, Lisandro Dalcin wrote: > I think the problem is in _array_fromobject (seen as numpy.array in Python) -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 -------------- next part -------------- A non-text attachment was scrubbed... Name: array.patch Type: application/octet-stream Size: 700 bytes Desc: not available URL: From dalcinl at gmail.com Thu Aug 2 20:01:43 2007 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 2 Aug 2007 21:01:43 -0300 Subject: [Numpy-discussion] reference leaks in array() and arange() Message-ID: As PyArray_DescrConverter return new references, I think there could be many places were PyArray_Descr* objects get its reference count incremented. Here, I send a patch correcting this for array() and arange(), but not sure if this is the more general solution. BTW, please see my previous comments in previous mail on using convertor functions (returning new refs) and the (very common) idiom "if(!PyArg_PaseXXX(,,,) return NULL", as this seems to be used in almost all places in numpy C sources and is a potential source of ref leaks. Regards, -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 -------------- next part -------------- A non-text attachment was scrubbed... Name: array-arange.patch Type: application/octet-stream Size: 1229 bytes Desc: not available URL: From focke at slac.stanford.edu Thu Aug 2 22:11:36 2007 From: focke at slac.stanford.edu (Warren Focke) Date: Thu, 2 Aug 2007 19:11:36 -0700 (PDT) Subject: [Numpy-discussion] fourier with single precision In-Reply-To: References: <46B208ED.1010403@imtek.de> Message-ID: On Thu, 2 Aug 2007, Charles R Harris wrote: > On X86 machines the main virtue would be smaller and more cache friendly > arrays because double precision arithmetic is about the same speed as single > precision, sometimes even a bit faster. The PPC architecture does have > faster single than double precision, so there it could make a difference. Yeah, I was wondering if I should mention that. I think SSE has real single precision, if you can convince the compiler to do it that way. Even better if it could be vectorized with SSE. w From vincent.nijs at gmail.com Fri Aug 3 01:57:06 2007 From: vincent.nijs at gmail.com (Vincent) Date: Fri, 03 Aug 2007 05:57:06 -0000 Subject: [Numpy-discussion] How to implement a 'pivot table?' In-Reply-To: <57273045-FDD5-4454-A1DE-9CB12387DA88@enthought.com> References: <57273045-FDD5-4454-A1DE-9CB12387DA88@enthought.com> Message-ID: <1186120626.433696.242920@i38g2000prf.googlegroups.com> What is ugly about the module? I like it! What do you mean about recarray's? Do you think they are they not appropriate for this type of thing? When i get some time i'll run some tests versus SAS for the same operations and do a speed comparison. Question: Would there be an easy way to merge the summary stats back into the recarray? Best, Vincent On Aug 1, 11:22 pm, Travis Vaught wrote: > Greetings, > > Speaking of brute force... 
I've attached a rather ugly module that > let's you do things with a pretty simple interface (session shown > below). I haven't fully tested the performance, but a million > records with 5 fields takes about 11 seconds on my Mac to do a > 'mean'. I'm not sure what your performance considerations are, but > this may be useful. Record arrays are really nice if they make sense > for your data. > > Travis > > (from an ipython command prompt) > > In [1]: import testpivot as p > > In [2]: a = p.sample_data() > > In [3]: a > Out[3]: > recarray([('ACorp', 'Region 1', 'Q1', 20000.0), > ('ACorp', 'Region 1', 'Q2', 22000.0), > ('ACorp', 'Region 1', 'Q3', 21000.0), > ('ACorp', 'Region 1', 'Q4', 26000.0), > ('ACorp', 'Region 2', 'Q1', 23000.0), > ('ACorp', 'Region 2', 'Q2', 20000.0), > ('ACorp', 'Region 2', 'Q3', 22000.0), > ('ACorp', 'Region 2', 'Q4', 21000.0), > ('ACorp', 'Region 3', 'Q1', 26000.0), > ('ACorp', 'Region 3', 'Q2', 23000.0), > ('ACorp', 'Region 3', 'Q3', 29000.0), > ('ACorp', 'Region 3', 'Q4', 27000.0), > ('BCorp', 'Region 1', 'Q1', 20000.0), > ('BCorp', 'Region 1', 'Q2', 20000.0), > ('BCorp', 'Region 1', 'Q3', 24000.0), > ('BCorp', 'Region 1', 'Q4', 24000.0), > ('BCorp', 'Region 2', 'Q1', 21000.0), > ('BCorp', 'Region 2', 'Q2', 21000.0), > ('BCorp', 'Region 2', 'Q3', 22000.0), > ('BCorp', 'Region 2', 'Q4', 29000.0), > ('BCorp', 'Region 3', 'Q1', 28000.0), > ('BCorp', 'Region 3', 'Q2', 25000.0), > ('BCorp', 'Region 3', 'Q3', 22000.0), > ('BCorp', 'Region 3', 'Q4', 21000.0)], > dtype=[('company', '|S5'), ('region', '|S8'), ('quarter', '| > S2'), ('income', ' > In [4]: p.pivot(a, 'company', 'region', 'income', p.psum) > ######## Summary by company and region ########## > cols:['ACorp' 'BCorp'] > rows:['Region 1' 'Region 2' 'Region 3'] > [[ 89000. 88000.] > [ 86000. 93000.] > [ 105000. 96000.]] > > In [5]: p.pivot(a, 'company', 'quarter', 'income', p.psum) > ######## Summary by company and quarter ########## > cols:['ACorp' 'BCorp'] > rows:['Q1' 'Q2' 'Q3' 'Q4'] > [[ 69000. 69000.] > [ 65000. 66000.] > [ 72000. 68000.] > [ 74000. 74000.]] > > In [6]: p.pivot(a, 'company', 'quarter', 'income', p.pmean) > ######## Summary by company and quarter ########## > cols:['ACorp' 'BCorp'] > rows:['Q1' 'Q2' 'Q3' 'Q4'] > [[ 23000. 23000. ] > [ 21666.66666667 22000. ] > [ 24000. 22666.66666667] > [ 24666.66666667 24666.66666667]] > > testpivot.py > 3KDownload > > > > On Aug 1, 2007, at 2:02 PM, Bruce Southey wrote: > > > Hi, > > The hard part is knowing what aggregate function that you want. So a > > hard way, even after cheating, to take the data provided is given > > below. (The Numpy Example List was very useful especially on the where > > function)! > > > I tried to be a little generic so you can replace the sum by any > > suitable function and probably the array type as well. Of course it is > > not complete because you still need to know the levels of the 'rows' > > and 'columns' and also is not efficient as it has loops. > > > Bruce > > > from numpy import * > > A=array([[1,1,10], > > [1,1,20], > > [1,2,30], > > [2,1,40], > > [2,2,50], > > [2,2,60] ]) > > C = zeros((2,2)) > > > for i in range(2): > > crit1 = (A[:,0]==1+i) > > subA=A[crit1,1:] > > for j in range(2): > > crit2 = (subA[:,0]==1+j) > > subB=subA[crit2,1:] > > C[i,j]=subB.sum() > > > print C > > > On 7/30/07, Geoffrey Zhu wrote: > >> Hi Everyone, > > >> I am wondering what is the best (and fast) way to build a pivot table > >> aside from the 'brute force way?' > > >> I want to transform an numpy array into a pivot table. 
For > >> example, if > >> I have a numpy array like below: > > >> Region Date # of Units > >> ---------- ---------- -------------- > >> East 1/1 10 > >> East 1/1 20 > >> East 1/2 30 > >> West 1/1 40 > >> West 1/2 50 > >> West 1/2 60 > > >> I want to transform this into the following table, where f() is a > >> given aggregate function: > > >> Date > >> Region 1/1 1/2 > >> ---------- > >> East f(10,20) f(30) > >> West f(40) f(50,60) > > >> I can regroup them into 'sets' and do it the brute force way, but > >> that > >> is kind of slow to execute. Does anyone know a better way? > > >> Thanks, > >> Geoffrey > >> _______________________________________________ > >> Numpy-discussion mailing list > >> Numpy-discuss... at scipy.org > >>http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discuss... at scipy.org > >http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discuss... at scipy.orghttp://projects.scipy.org/mailman/listinfo/numpy-discussion From david at ar.media.kyoto-u.ac.jp Fri Aug 3 02:06:58 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Fri, 03 Aug 2007 15:06:58 +0900 Subject: [Numpy-discussion] numpy arrays, data allocation and SIMD alignement Message-ID: <46B2C602.7010205@ar.media.kyoto-u.ac.jp> Hi, Following an ongoing discussion with S. Johnson, one of the developer of fftw3, I would be interested in what people think about adding infrastructure in numpy related to SIMD alignement (that is 16 bytes alignement for SSE/ALTIVEC, I don't know anything about other archs). The problem is that right now, it is difficult to get information for alignement in numpy (by alignement here, I mean something different than what is normally meant in numpy context; whether, in my understanding, NPY_ALIGNED refers to a pointer which is aligned wrt his type, here, I am talking about arbitrary alignement). For example, for fftw3, we need to know whether a given data buffer is 16 bytes aligned to get optimal performances; generally, SSE needs 16 byte alignement for optimal performances, as well as altivec. I think it would be nice to get some infrastructure to help developers to get those kind of information, and maybe to be able to request 16 aligned buffers. Here is what I can think of: - adding an API to know whether a given PyArrayObject has its data buffer 16 bytes aligned, and requesting a 16 bytes aligned PyArrayObject. Something like NPY_ALIGNED, basically. - forcing data allocation to be 16 bytes aligned in numpy (eg define PyDataMem_Mem to a 16 bytes aligned allocator instead of malloc). This would mean that many arrays would be "naturally" 16 bytes aligned without effort. Point 2 is really easy to implement I think: actually, on some platforms (Mac OS X and FreeBSD), malloc returning 16 bytes aligned buffers anyway, so I don't think the wasted space is a real problem. Linux with glibc is 8 bytes aligned, I don't know about windows. Implementing our own 16 bytes aligned memory allocator for cross platform compatibility should be relatively easy. I don't see any drawback, but I guess other people will. Point 1 is more tricky, as this requires much more changes in the code. Do main developers of numpy have an opinion on this ? 
cheers, David From matthew.brett at gmail.com Fri Aug 3 09:38:36 2007 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 3 Aug 2007 14:38:36 +0100 Subject: [Numpy-discussion] numpy arrays, data allocation and SIMD alignement In-Reply-To: <46B2C602.7010205@ar.media.kyoto-u.ac.jp> References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp> Message-ID: <1e2af89e0708030638l1c91885bt7769c1c45bdf341d@mail.gmail.com> Hi, > Following an ongoing discussion with S. Johnson, one of the developer > of fftw3, I would be interested in what people think about adding > infrastructure in numpy related to SIMD alignement (that is 16 bytes > alignement for SSE/ALTIVEC, I don't know anything about other archs). > The problem is that right now, it is difficult to get information for > alignement in numpy (by alignement here, I mean something different than > what is normally meant in numpy context; whether, in my understanding, > NPY_ALIGNED refers to a pointer which is aligned wrt his type, here, I > am talking about arbitrary alignement). Excellent idea if practical... Matthew From strawman at astraw.com Fri Aug 3 11:12:44 2007 From: strawman at astraw.com (Andrew Straw) Date: Fri, 03 Aug 2007 08:12:44 -0700 Subject: [Numpy-discussion] numpy arrays, data allocation and SIMD alignement In-Reply-To: <46B2C602.7010205@ar.media.kyoto-u.ac.jp> References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp> Message-ID: <46B345EC.6090503@astraw.com> Dear David, Both ideas, particularly the 2nd, would be excellent additions to numpy. I often use the Intel IPP (Integrated Performance Primitives) Library together with numpy, but I have to do all my memory allocation with the IPP to ensure fastest operation. I then create numpy views of the data. All this works brilliantly, but it would be really nice if I could allocate the memory directly in numpy. IPP allocates, and says it wants, 32 byte aligned memory (see, e.g. http://www.intel.com/support/performancetools/sb/CS-021418.htm ). Given that fftw3 apparently wants 16 byte aligned memory, my feeling is that, if the effort is made, the alignment width should be specified at run-time, rather than hard-coded. In terms of implementation of your 1st point, I'm not aware of how much effort your idea would take (and it does sound nice), but some benefit would be had just from a simple function numpy.is_mem_aligned( ndarray, width=16 ) which returns a bool. Cheers! Andrew David Cournapeau wrote: > Hi, > > Following an ongoing discussion with S. Johnson, one of the developer > of fftw3, I would be interested in what people think about adding > infrastructure in numpy related to SIMD alignement (that is 16 bytes > alignement for SSE/ALTIVEC, I don't know anything about other archs). > The problem is that right now, it is difficult to get information for > alignement in numpy (by alignement here, I mean something different than > what is normally meant in numpy context; whether, in my understanding, > NPY_ALIGNED refers to a pointer which is aligned wrt his type, here, I > am talking about arbitrary alignement). > For example, for fftw3, we need to know whether a given data buffer is > 16 bytes aligned to get optimal performances; generally, SSE needs 16 > byte alignement for optimal performances, as well as altivec. I think it > would be nice to get some infrastructure to help developers to get those > kind of information, and maybe to be able to request 16 aligned buffers. 
> Here is what I can think of: > - adding an API to know whether a given PyArrayObject has its data > buffer 16 bytes aligned, and requesting a 16 bytes aligned > PyArrayObject. Something like NPY_ALIGNED, basically. > - forcing data allocation to be 16 bytes aligned in numpy (eg > define PyDataMem_Mem to a 16 bytes aligned allocator instead of malloc). > This would mean that many arrays would be "naturally" 16 bytes aligned > without effort. > > Point 2 is really easy to implement I think: actually, on some platforms > (Mac OS X and FreeBSD), malloc returning 16 bytes aligned buffers > anyway, so I don't think the wasted space is a real problem. Linux with > glibc is 8 bytes aligned, I don't know about windows. Implementing our > own 16 bytes aligned memory allocator for cross platform compatibility > should be relatively easy. I don't see any drawback, but I guess other > people will. > > Point 1 is more tricky, as this requires much more changes in the code. > > Do main developers of numpy have an opinion on this ? > > cheers, > > David > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion From focke at slac.stanford.edu Fri Aug 3 11:17:30 2007 From: focke at slac.stanford.edu (Warren Focke) Date: Fri, 3 Aug 2007 08:17:30 -0700 (PDT) Subject: [Numpy-discussion] fourier with single precision In-Reply-To: References: <46B208ED.1010403@imtek.de> Message-ID: On Thu, 2 Aug 2007, Warren Focke wrote: > > > On Thu, 2 Aug 2007, Lars Friedrich wrote: > >> versions to numpy? I am not a decent C-hacker, but if someone tells me, >> that this task is not *too* hard, I would start looking more closely at the >> code... > > It shouldn't be hard. fftpack.c will make a single-precision version if > DOUBLE is not defined at compile time. Of course, it's even less hard to use FFTW or MKL. w From david at ar.media.kyoto-u.ac.jp Fri Aug 3 23:28:34 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sat, 04 Aug 2007 12:28:34 +0900 Subject: [Numpy-discussion] numpy arrays, data allocation and SIMD alignement In-Reply-To: <46B345EC.6090503@astraw.com> References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp> <46B345EC.6090503@astraw.com> Message-ID: <46B3F262.80000@ar.media.kyoto-u.ac.jp> Andrew Straw wrote: > Dear David, > > Both ideas, particularly the 2nd, would be excellent additions to numpy. > I often use the Intel IPP (Integrated Performance Primitives) Library > together with numpy, but I have to do all my memory allocation with the > IPP to ensure fastest operation. I then create numpy views of the data. > All this works brilliantly, but it would be really nice if I could > allocate the memory directly in numpy. > > IPP allocates, and says it wants, 32 byte aligned memory (see, e.g. > http://www.intel.com/support/performancetools/sb/CS-021418.htm ). Given > that fftw3 apparently wants 16 byte aligned memory, my feeling is that, > if the effort is made, the alignment width should be specified at > run-time, rather than hard-coded. I think that doing it at runtime would be overkill, no ? I was thinking about making it a compile option. Generally, at the ASM level, you need 16 bytes alignment (for instructions like movaps, which takes 16 bytes in memory and put it in the SSE registers), this is not just fftw. Maybe the 32 bytes alignment is useful for cache reasons, I don't know. 
I don't think it would be difficult to implement and validate; what I don't know at all is the implication of this at the binary level, if any. cheers, David From charlesr.harris at gmail.com Sat Aug 4 01:30:46 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 3 Aug 2007 23:30:46 -0600 Subject: [Numpy-discussion] numpy arrays, data allocation and SIMD alignement In-Reply-To: <46B3F262.80000@ar.media.kyoto-u.ac.jp> References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp> <46B345EC.6090503@astraw.com> <46B3F262.80000@ar.media.kyoto-u.ac.jp> Message-ID: On 8/3/07, David Cournapeau wrote: > > Andrew Straw wrote: > > Dear David, > > > > Both ideas, particularly the 2nd, would be excellent additions to numpy. > > I often use the Intel IPP (Integrated Performance Primitives) Library > > together with numpy, but I have to do all my memory allocation with the > > IPP to ensure fastest operation. I then create numpy views of the data. > > All this works brilliantly, but it would be really nice if I could > > allocate the memory directly in numpy. > > > > IPP allocates, and says it wants, 32 byte aligned memory (see, e.g. > > http://www.intel.com/support/performancetools/sb/CS-021418.htm ). Given > > that fftw3 apparently wants 16 byte aligned memory, my feeling is that, > > if the effort is made, the alignment width should be specified at > > run-time, rather than hard-coded. > I think that doing it at runtime would be overkill, no ? I was thinking > about making it a compile option. Generally, at the ASM level, you need > 16 bytes alignment (for instructions like movaps, which takes 16 bytes > in memory and put it in the SSE registers), this is not just fftw. Maybe > the 32 bytes alignment is useful for cache reasons, I don't know. > > I don't think it would be difficult to implement and validate; what I > don't know at all is the implication of this at the binary level, if any. Here's a hack that google turned up: (1) Use static variables instead of dynamic (stack) variables (2) Use in-line assembly code that explicitly aligns data (3) In C code, use "*malloc*" to explicitly allocate variables Here is Intel's example of (2): ; procedure prologue push ebp mov esp, ebp and ebp, -8 sub esp, 12 ; procedure epilogue add esp, 12 pop ebp ret Intel's example of (3), slightly modified: double *p, *newp; p = (double*)*malloc* ((sizeof(double)*NPTS)+4); newp = (p+4) & (~7); This assures that newp is 8-*byte* aligned even if p is not. However, *malloc*() may already follow Intel's recommendation that a *32*-*byte* or greater data structures be aligned on a *32* *byte* boundary. In that case, increasing the requested memory by 4 bytes and computing newp are superfluous. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Aug 4 02:06:15 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 4 Aug 2007 00:06:15 -0600 Subject: [Numpy-discussion] numpy arrays, data allocation and SIMD alignement In-Reply-To: References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp> <46B345EC.6090503@astraw.com> <46B3F262.80000@ar.media.kyoto-u.ac.jp> Message-ID: On 8/3/07, Charles R Harris wrote: > > > > On 8/3/07, David Cournapeau wrote: > > > > Andrew Straw wrote: > > > Dear David, > > > > > > Both ideas, particularly the 2nd, would be excellent additions to > > numpy. 
> > > I often use the Intel IPP (Integrated Performance Primitives) Library > > > together with numpy, but I have to do all my memory allocation with > > the > > > IPP to ensure fastest operation. I then create numpy views of the > > data. > > > All this works brilliantly, but it would be really nice if I could > > > allocate the memory directly in numpy. > > > > > > IPP allocates, and says it wants, 32 byte aligned memory (see, e.g. > > > http://www.intel.com/support/performancetools/sb/CS-021418.htm ). > > Given > > > that fftw3 apparently wants 16 byte aligned memory, my feeling is > > that, > > > if the effort is made, the alignment width should be specified at > > > run-time, rather than hard-coded. > > I think that doing it at runtime would be overkill, no ? I was thinking > > about making it a compile option. Generally, at the ASM level, you need > > 16 bytes alignment (for instructions like movaps, which takes 16 bytes > > in memory and put it in the SSE registers), this is not just fftw. Maybe > > the 32 bytes alignment is useful for cache reasons, I don't know. > > > > I don't think it would be difficult to implement and validate; what I > > don't know at all is the implication of this at the binary level, if > > any. > > > > Here's a hack that google turned up: > > (1) Use static variables instead of dynamic (stack) variables > (2) Use in-line assembly code that explicitly aligns data > (3) In C code, use "*malloc*" to explicitly allocate variables > > Here is Intel's example of (2): > > ; procedure prologue > push ebp > mov esp, ebp > and ebp, -8 > sub esp, 12 > > ; procedure epilogue > add esp, 12 > pop ebp > ret > > Intel's example of (3), slightly modified: > > double *p, *newp; > p = (double*)*malloc* ((sizeof(double)*NPTS)+4); > newp = (p+4) & (~7); > > This assures that newp is 8-*byte* aligned even if p is not. However, > *malloc*() may already follow Intel's recommendation that a *32*-* byte*or > greater data structures be aligned on a * 32* *byte* boundary. In that > case, > increasing the requested memory by 4 bytes and computing newp are > superfluous. > I think that for numpy arrays it should be possible to define the offset so that the result is 32 byte aligned. However, this might break some peoples' code if they haven't payed attention to the offset. Another possibility is to allocate an oversized array, check the pointer, and take a range out of it. For instance: In [32]: a = zeros(10) In [33]: a.ctypes.data % 32 Out[33]: 16 The array alignment is 16 bytes, consequently In [34]: a[2:].ctypes.data % 32 Out[34]: 0 Voila, 32 byte alignment. I think a short python routine could do this, which ought to serve well for 1D fft's. Multidimensional arrays will be trickier if you want the rows to be aligned. Aligning the columns just isn't going to work. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From david at ar.media.kyoto-u.ac.jp Sat Aug 4 02:25:38 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sat, 04 Aug 2007 15:25:38 +0900 Subject: [Numpy-discussion] numpy arrays, data allocation and SIMD alignement In-Reply-To: References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp> <46B345EC.6090503@astraw.com> <46B3F262.80000@ar.media.kyoto-u.ac.jp> Message-ID: <46B41BE2.7020003@ar.media.kyoto-u.ac.jp> > > > Here's a hack that google turned up: > > (1) Use static variables instead of dynamic (stack) variables > (2) Use in-line assembly code that explicitly aligns data > (3) In C code, use "*malloc*" to explicitly allocate variables > > Here is Intel's example of (2): > > ; procedure prologue > push ebp > mov esp, ebp > and ebp, -8 > sub esp, 12 > > ; procedure epilogue > add esp, 12 > pop ebp > ret > > Intel's example of (3), slightly modified: > > double *p, *newp; > p = (double*)*malloc* ((sizeof(double)*NPTS)+4); > newp = (p+4) & (~7); > > This assures that newp is 8-*byte* aligned even if p is not. However, > *malloc*() may already follow Intel's recommendation that a *32*-* > byte* or > greater data structures be aligned on a *32* *byte* boundary. In > that case, > increasing the requested memory by 4 bytes and computing newp are > superfluous. > > > I think that for numpy arrays it should be possible to define the > offset so that the result is 32 byte aligned. However, this might > break some peoples' code if they haven't payed attention to the offset. Why ? I really don't see how it can break anything at the source code level. You don't have to care about things you didn't care before: the best proof of that if that numpy runs on different platforms where the malloc has different alignment guarantees (mac OS X already aligned to 16 bytes, for the very reason of making optimizing with SIMD easier, whereas glibc malloc only aligns to 8 bytes, at least on Linux). > Another possibility is to allocate an oversized array, check the > pointer, and take a range out of it. For instance: > > In [32]: a = zeros(10) > > In [33]: a.ctypes.data % 32 > Out[33]: 16 > > The array alignment is 16 bytes, consequently > > In [34]: a[2:].ctypes.data % 32 > Out[34]: 0 > > Voila, 32 byte alignment. I think a short python routine could do > this, which ought to serve well for 1D fft's. Multidimensional arrays > will be trickier if you want the rows to be aligned. Aligning the > columns just isn't going to work. I am not suggesting realigning existing arrays. What I would like numpy to support are the following cases: - Check whether a given a numpy array is simd aligned: /* Simple case: if aligned, use optimized func, use non optimized otherwise */ int simd_func(double* in, size_t n); int nosimd_func(double* in, size_t n); if (PyArray_ISALIGNED_SIMD(a)) { simd_func((double *)a->data, a->size); } else { nosimd_func((double *)a->data, a->size); } - Request explicitely an aligned arrays from any PyArray_* functions which create a ndarray, eg: ar = PyArray_FROM_OF(a, NPY_SIMD_ALIGNED); Allocating a buffer aligned to a given alignment is not the problem: there is a posix functions to do it, and we can implement easily a function for the OS who do not support it. This would be done in C, not in python. 
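For the platforms which do not have posix_memalign, the fallback could look something like this (completely untested sketch, the function names are made up, and alignment is assumed to be a power of two):

#include <stdlib.h>
#include <stdint.h>

/* over-allocate with malloc and stash the original pointer just before
   the aligned block, so that it can be recovered when freeing */
static void *npy_aligned_malloc(size_t size, size_t alignment)
{
    void *base;
    uintptr_t raw, aligned;

    base = malloc(size + alignment + sizeof(void *));
    if (base == NULL) {
        return NULL;
    }
    raw = (uintptr_t)base + sizeof(void *);
    aligned = (raw + alignment - 1) & ~((uintptr_t)alignment - 1);
    ((void **)aligned)[-1] = base;  /* remember the real buffer */
    return (void *)aligned;
}

static void npy_aligned_free(void *ptr)
{
    if (ptr != NULL) {
        free(((void **)ptr)[-1]);
    }
}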
cheers, David From peridot.faceted at gmail.com Sat Aug 4 03:24:55 2007 From: peridot.faceted at gmail.com (Anne Archibald) Date: Sat, 4 Aug 2007 03:24:55 -0400 Subject: [Numpy-discussion] numpy arrays, data allocation and SIMD alignement In-Reply-To: <46B41BE2.7020003@ar.media.kyoto-u.ac.jp> References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp> <46B345EC.6090503@astraw.com> <46B3F262.80000@ar.media.kyoto-u.ac.jp> <46B41BE2.7020003@ar.media.kyoto-u.ac.jp> Message-ID: On 04/08/07, David Cournapeau wrote: > > Here's a hack that google turned up: I'd avoid hacks in favour of posix_memalign (which allows arbitrary degrees of alignment. For one thing, freeing becomes a headache (you can't free a pointer you've jiggered!). > - Check whether a given a numpy array is simd aligned: > > /* Simple case: if aligned, use optimized func, use non optimized > otherwise */ > int simd_func(double* in, size_t n); > int nosimd_func(double* in, size_t n); > > if (PyArray_ISALIGNED_SIMD(a)) { > simd_func((double *)a->data, a->size); > } else { > nosimd_func((double *)a->data, a->size); > } > - Request explicitely an aligned arrays from any PyArray_* functions > which create a ndarray, eg: ar = PyArray_FROM_OF(a, NPY_SIMD_ALIGNED); > > Allocating a buffer aligned to a given alignment is not the problem: > there is a posix functions to do it, and we can implement easily a > function for the OS who do not support it. This would be done in C, not > in python. I'd just like to point out that PyArray_ISALIGNED_SIMD(a) can be a macro which aligns to something like "!((a->datapointer)&0xf)"; this avoids any change to the array objects and allows checking for arbitrary degrees of alignment - somebody mentioned the Intel Performance Primitives need 32-byte aligned data? One might also want page-aligned data or data aligned in some way with cache lines. It seems to me two things are needed: * A mechanism for requesting numpy arrays with buffers aligned to an arbitrary power-of-two size (basically just using posix_memalign or some horrible hack on platforms that don't have it). * A macro (in C, and some way to get the same information from python, perhaps just "a.ctypes.data % 16") to test for common alignment cases; SIMD alignment and arbitrary power-of-two alignment are probably sufficient. Does this fail to cover any important cases? Anne From adam.powell at ucl.ac.uk Fri Aug 3 07:50:04 2007 From: adam.powell at ucl.ac.uk (adam.powell at ucl.ac.uk) Date: Fri, 03 Aug 2007 12:50:04 +0100 Subject: [Numpy-discussion] multinomial error? Message-ID: <20070803125004.3sauiyr34g444048@www.webmail.ucl.ac.uk> Hi, I appear to be having a problem with the random.multinomial function. For some reason if i attempt to loop over a large number of single-trial multinomial picks then the function begins to ignore some non-zero entries in my 1-D array of multinomial probabilities... Is seems that there is no upper limit on the size of the probability array for a one off multinomial pick, but if looping over the process multiple times the function can't handle the whole array and seems to truncate it arbitrarily before performing the trial with only the remaining probabilities. There is a reason why i need to loop over a large number of single-trial events, rather than just replacing the loop with a large number of trials in one single multinomial pick (annoying, as that's so much quicker!). Thanks for any help, Adam From stevenj at alum.mit.edu Sat Aug 4 23:20:31 2007 From: stevenj at alum.mit.edu (Steven G. 
Johnson) Date: Sat, 04 Aug 2007 20:20:31 -0700 Subject: [Numpy-discussion] numpy arrays, data allocation and SIMD alignement In-Reply-To: References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp> <46B345EC.6090503@astraw.com> <46B3F262.80000@ar.media.kyoto-u.ac.jp> <46B41BE2.7020003@ar.media.kyoto-u.ac.jp> Message-ID: <1186284031.873276.35610@19g2000hsx.googlegroups.com> On Aug 4, 3:24 am, "Anne Archibald" wrote: > It seems to me two things are needed: > > * A mechanism for requesting numpy arrays with buffers aligned to an > arbitrary power-of-two size (basically just using posix_memalign or > some horrible hack on platforms that don't have it). Right, you might as well allow the alignment (to a power-of-two size) to be specified at runtime, as there is really no cost to implementing an arbitrary alignment once you have any alignment. Although you should definitely use posix_memalign (or the old memalign) where it is available, unfortunately it's not implemented on all systems. e.g. MacOS X and FreeBSD don't have it, last I checked (although in both cases their malloc is 16-byte aligned). Microsoft VC ++ has a function called _aligned_malloc which is equivalent. However, since MinGW (www.mingw.org) didn't have an _aligned_malloc function, I wrote one for them a few years ago and put it in the public domain (I use MinGW to cross-compile to Windows from Linux and need the alignment). You are free to use it as a fallback on systems that don't have a memalign function if you want. It should work on any system where sizeof(void*) is a power of two (i.e. every extant architecture, that I know of). You can download it and its test program from: ab-initio.mit.edu/~stevenj/align.c ab-initio.mit.edu/~stevenj/tstalign.c It just uses malloc with a little extra padding as needed to align the data, plus a copy of the original pointer so that you can still free and realloc (using _aligned_free and _aligned_realloc). It could be made a bit more efficient, but it probably doesn't matter. > * A macro (in C, and some way to get the same information from python, > perhaps just "a.ctypes.data % 16") to test for common alignment cases; > SIMD alignment and arbitrary power-of-two alignment are probably > sufficient. In C this is easy, just ((uintptr_t) pointer) % 16 == 0. You might also consider a way to set the default alignment of numpy arrays at runtime, rather than requesting aligned arrays individually. e.g. so that someone could come along at a later date to a large program and just add one function call to make all the arrays 16-byte aligned to improve performance using SIMD libraries. Regards, Steven G. Johnson From aisaac at american.edu Sun Aug 5 10:42:17 2007 From: aisaac at american.edu (Alan G Isaac) Date: Sun, 5 Aug 2007 10:42:17 -0400 Subject: [Numpy-discussion] multinomial error? In-Reply-To: <20070803125004.3sauiyr34g444048@www.webmail.ucl.ac.uk> References: <20070803125004.3sauiyr34g444048@www.webmail.ucl.ac.uk> Message-ID: On Fri, 03 Aug 2007, adam.powell at ucl.ac.uk apparently wrote: > I appear to be having a problem with the random.multinomial function. For some > reason if i attempt to loop over a large number of single-trial multinomial > picks then the function begins to ignore some non-zero entries in my 1-D array > of multinomial probabilities... 
Is seems that there is no upper limit on the > size of the probability array for a one off multinomial pick, but if looping > over the process multiple times the function can't handle the whole array and > seems to truncate it arbitrarily before performing the trial with only the > remaining probabilities. Minimal example? Cheers, Alan Isaac From lfriedri at imtek.de Mon Aug 6 02:53:55 2007 From: lfriedri at imtek.de (Lars Friedrich) Date: Mon, 06 Aug 2007 08:53:55 +0200 Subject: [Numpy-discussion] fourier with single precision Message-ID: <46B6C583.7080701@imtek.de> Hello, thanks for your comments. If I got you right, I should look for a FFT-code that uses SSE (what does this actually stand for?), which means that it vectorizes 32bit-single-operations into larger chunks that make efficient use of recent CPUs. You mentioned FFTW and MKL. Is this www.fftw.org and the 'intel math kernel library'? If I would like to use one of them, is numpy the right place to put it in? Does anyone know, if it is possible to switch on SSE support (at compile time) in the fftpack.c that numpy uses? Thanks Lars From nwagner at iam.uni-stuttgart.de Mon Aug 6 03:09:54 2007 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Mon, 06 Aug 2007 09:09:54 +0200 Subject: [Numpy-discussion] fourier with single precision In-Reply-To: <46B6C583.7080701@imtek.de> References: <46B6C583.7080701@imtek.de> Message-ID: <46B6C942.50106@iam.uni-stuttgart.de> Lars Friedrich wrote: > Hello, > > thanks for your comments. If I got you right, I should look for a > FFT-code that uses SSE (what does this actually stand for?), which means > that it vectorizes 32bit-single-operations into larger chunks that make > efficient use of recent CPUs. > > http://en.wikipedia.org/wiki/Streaming_SIMD_Extensions Nils From david at ar.media.kyoto-u.ac.jp Mon Aug 6 03:43:13 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 06 Aug 2007 16:43:13 +0900 Subject: [Numpy-discussion] fourier with single precision In-Reply-To: <46B6C583.7080701@imtek.de> References: <46B6C583.7080701@imtek.de> Message-ID: <46B6D111.7090106@ar.media.kyoto-u.ac.jp> Lars Friedrich wrote: > Hello, > > thanks for your comments. If I got you right, I should look for a > FFT-code that uses SSE (what does this actually stand for?), which means > that it vectorizes 32bit-single-operations into larger chunks that make > efficient use of recent CPUs. > > You mentioned FFTW and MKL. Is this www.fftw.org and the 'intel math > kernel library'? If I would like to use one of them, is numpy the right > place to put it in? > > Does anyone know, if it is possible to switch on SSE support (at compile > time) in the fftpack.c that numpy uses? > MKL is from Intel (free as in beer on Linux and for academic purpose I think, but of course, you should check whether this applies to you). FFTW is GPL, and AFAIK is considered to be the fastest general purpose open source FFT. Here are your options as far as I understand: - if you care about speed (that is, faster than numpy), then use scipy.fftpack with fftw3: there are wrappers in scipy for it. There is no float support (yet), but it is planned. Even with double, it will be faster (how much is really platform dependent). There is also MKL support, which may be faster (never used it). - if you care also about memory, then maybe you will have no choice but using your own routines for float support. FFTW support both single and double precision, but only double is available in scipy. 
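For the first option, the change on your side is basically just the import, something like this (assuming your scipy was built against fftw3):

import numpy as np
from scipy import fftpack

img = np.random.rand(512, 512)
spec = fftpack.fft2(img)   # double precision, but fftw3-backed if scipy was built with it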
cheers, David > Thanks > > Lars > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > From david at ar.media.kyoto-u.ac.jp Mon Aug 6 03:51:29 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 06 Aug 2007 16:51:29 +0900 Subject: [Numpy-discussion] fourier with single precision In-Reply-To: <46B6C583.7080701@imtek.de> References: <46B6C583.7080701@imtek.de> Message-ID: <46B6D301.3080208@ar.media.kyoto-u.ac.jp> Lars Friedrich wrote: > Hello, > > thanks for your comments. If I got you right, I should look for a > FFT-code that uses SSE (what does this actually stand for?), which means > that it vectorizes 32bit-single-operations into larger chunks that make > efficient use of recent CPUs. > > You mentioned FFTW and MKL. Is this www.fftw.org and the 'intel math > kernel library'? If I would like to use one of them, is numpy the right > place to put it in? > > Does anyone know, if it is possible to switch on SSE support (at compile > time) in the fftpack.c that numpy uses? > I don't think it will have much impact, because to use SSE efficiently, you need some constraints wrt memory allocation which cannot be met easily now in numpy arrays, AND good compiler support (intel compiler, basically) for automatic vectorization. Even then, FFT may have specific patterns which mean that only hand tuned routines can get most of the CPU horsepower: both mkl and fftw use SIMD instructions to get their maximum efficiency. David From matthieu.brucher at gmail.com Mon Aug 6 04:04:18 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Mon, 6 Aug 2007 10:04:18 +0200 Subject: [Numpy-discussion] fourier with single precision In-Reply-To: <46B6D111.7090106@ar.media.kyoto-u.ac.jp> References: <46B6C583.7080701@imtek.de> <46B6D111.7090106@ar.media.kyoto-u.ac.jp> Message-ID: > > MKL is from Intel (free as in beer on Linux and for academic purpose I > think, but of course, you should check whether this applies to you). AFAIK, the MKL is free for non-commercial purposes under Linux only, and there is a special license for academics. Matthieu -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.hochberg at ieee.org Mon Aug 6 09:54:12 2007 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Mon, 6 Aug 2007 06:54:12 -0700 Subject: [Numpy-discussion] How to implement a 'pivot table?' In-Reply-To: <57273045-FDD5-4454-A1DE-9CB12387DA88@enthought.com> References: <57273045-FDD5-4454-A1DE-9CB12387DA88@enthought.com> Message-ID: Nicely done Travis. Working code is always better than theory. I copied your interface and used the brute-force, non-numpy approach to construct the pivot table. On the one hand, it doesn't preserve the order that the entires are discovered in as the original does. On the other hand, it's about 40% faster for large files on my machine (see pivot2). Probably because you don't have to loop through the data so many times. You can get further improvements if you know the operation in advance as shown in pivotsum, although this won't work on median ASAIK. regards, -tim On 8/1/07, Travis Vaught wrote: > > Greetings, > > Speaking of brute force... I've attached a rather ugly module that > let's you do things with a pretty simple interface (session shown > below). I haven't fully tested the performance, but a million > records with 5 fields takes about 11 seconds on my Mac to do a > 'mean'. 
I'm not sure what your performance considerations are, but > this may be useful. Record arrays are really nice if they make sense > for your data. > > Travis > > > (from an ipython command prompt) > > In [1]: import testpivot as p > > In [2]: a = p.sample_data() > > In [3]: a > Out[3]: > recarray([('ACorp', 'Region 1', 'Q1', 20000.0), > ('ACorp', 'Region 1', 'Q2', 22000.0), > ('ACorp', 'Region 1', 'Q3', 21000.0), > ('ACorp', 'Region 1', 'Q4', 26000.0 ), > ('ACorp', 'Region 2', 'Q1', 23000.0), > ('ACorp', 'Region 2', 'Q2', 20000.0), > ('ACorp', 'Region 2', 'Q3', 22000.0), > ('ACorp', 'Region 2', 'Q4', 21000.0), > ('ACorp', 'Region 3', 'Q1', 26000.0), > ('ACorp', 'Region 3', 'Q2', 23000.0), > ('ACorp', 'Region 3', 'Q3', 29000.0), > ('ACorp', 'Region 3', 'Q4', 27000.0), > ('BCorp', 'Region 1', 'Q1', 20000.0), > ('BCorp', 'Region 1', 'Q2', 20000.0), > ('BCorp', 'Region 1', 'Q3', 24000.0), > ('BCorp', 'Region 1', 'Q4', 24000.0), > ('BCorp', 'Region 2', 'Q1', 21000.0 ), > ('BCorp', 'Region 2', 'Q2', 21000.0), > ('BCorp', 'Region 2', 'Q3', 22000.0), > ('BCorp', 'Region 2', 'Q4', 29000.0), > ('BCorp', 'Region 3', 'Q1', 28000.0), > ('BCorp', 'Region 3', 'Q2', 25000.0), > ('BCorp', 'Region 3', 'Q3', 22000.0), > ('BCorp', 'Region 3', 'Q4', 21000.0)], > dtype=[('company', '|S5'), ('region', '|S8'), ('quarter', '| > S2'), ('income', ' > In [4]: p.pivot(a, 'company', 'region', 'income', p.psum) > ######## Summary by company and region ########## > cols:['ACorp' 'BCorp'] > rows:['Region 1' 'Region 2' 'Region 3'] > [[ 89000. 88000.] > [ 86000. 93000.] > [ 105000. 96000.]] > > In [5]: p.pivot(a, 'company', 'quarter', 'income', p.psum) > ######## Summary by company and quarter ########## > cols:['ACorp' 'BCorp'] > rows:['Q1' 'Q2' 'Q3' 'Q4'] > [[ 69000. 69000.] > [ 65000. 66000.] > [ 72000. 68000.] > [ 74000. 74000.]] > > In [6]: p.pivot(a, 'company', 'quarter', 'income', p.pmean) > ######## Summary by company and quarter ########## > cols:['ACorp' 'BCorp'] > rows:['Q1' 'Q2' 'Q3' 'Q4'] > [[ 23000. 23000. ] > [ 21666.66666667 22000. ] > [ 24000. 22666.66666667] > [ 24666.66666667 24666.66666667]] > > > > > On Aug 1, 2007, at 2:02 PM, Bruce Southey wrote: > > > Hi, > > The hard part is knowing what aggregate function that you want. So a > > hard way, even after cheating, to take the data provided is given > > below. (The Numpy Example List was very useful especially on the where > > function)! > > > > I tried to be a little generic so you can replace the sum by any > > suitable function and probably the array type as well. Of course it is > > not complete because you still need to know the levels of the 'rows' > > and 'columns' and also is not efficient as it has loops. > > > > Bruce > > > > from numpy import * > > A=array([[1,1,10], > > [1,1,20], > > [1,2,30], > > [2,1,40], > > [2,2,50], > > [2,2,60] ]) > > C = zeros((2,2)) > > > > for i in range(2): > > crit1 = (A[:,0]==1+i) > > subA=A[crit1,1:] > > for j in range(2): > > crit2 = (subA[:,0]==1+j) > > subB=subA[crit2,1:] > > C[i,j]=subB.sum() > > > > > > print C > > > > On 7/30/07, Geoffrey Zhu wrote: > >> Hi Everyone, > >> > >> I am wondering what is the best (and fast) way to build a pivot table > >> aside from the 'brute force way?' > >> > >> I want to transform an numpy array into a pivot table. 
For > >> example, if > >> I have a numpy array like below: > >> > >> Region Date # of Units > >> ---------- ---------- -------------- > >> East 1/1 10 > >> East 1/1 20 > >> East 1/2 30 > >> West 1/1 40 > >> West 1/2 50 > >> West 1/2 60 > >> > >> I want to transform this into the following table, where f() is a > >> given aggregate function: > >> > >> Date > >> Region 1/1 1/2 > >> ---------- > >> East f(10,20) f(30) > >> West f(40) f(50,60) > >> > >> > >> I can regroup them into 'sets' and do it the brute force way, but > >> that > >> is kind of slow to execute. Does anyone know a better way? > >> > >> > >> Thanks, > >> Geoffrey > >> _______________________________________________ > >> Numpy-discussion mailing list > >> Numpy-discussion at scipy.org > >> http://projects.scipy.org/mailman/listinfo/numpy-discussion > >> > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at scipy.org > > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: testpivot.py Type: text/x-python Size: 6604 bytes Desc: not available URL: From dalcinl at gmail.com Mon Aug 6 16:08:50 2007 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Mon, 6 Aug 2007 17:08:50 -0300 Subject: [Numpy-discussion] numpy arrays, data allocation and SIMD alignement In-Reply-To: <46B2C602.7010205@ar.media.kyoto-u.ac.jp> References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp> Message-ID: On 8/3/07, David Cournapeau wrote: > Here is what I can think of: > - adding an API to know whether a given PyArrayObject has its data > buffer 16 bytes aligned, and requesting a 16 bytes aligned > PyArrayObject. Something like NPY_ALIGNED, basically. > - forcing data allocation to be 16 bytes aligned in numpy (eg > define PyDataMem_Mem to a 16 bytes aligned allocator instead of malloc). All this sounds pretty similar to sdt::allocator we can found in C++ STL (http://www.sgi.com/tech/stl/Allocators.html). Perhaps a NumPy array could be associated with an instance of an 'allocator' object (surely written in C, perhaps subclassable in Python) providing appropriate methos for alloc/dealloc(/realloc?/initialize(memset)?/copy(memcpy)?) memory. This would be really nice, as it is extensible (you could even write a custom allocator, perhaps making use of a preallocated,static pool; use of C++ new/delete; use of any C++ std::allocator, shared memory, etc. etc.). I think this is the direction to go but no idea how much difficult it could be to implement. 
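Just to make the idea concrete, the C-level interface could be as small as something like this (purely hypothetical sketch, nothing like it exists in numpy today):

#include <stddef.h>

typedef struct {
    void *(*alloc)(size_t size, void *ctx);
    void  (*dealloc)(void *ptr, void *ctx);
    void  *ctx;   /* e.g. a memory pool, a shared memory segment, ... */
} NpyAllocator;

/* each ndarray would keep a pointer to the NpyAllocator that created its
   data buffer and call dealloc through it when the buffer is released */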
-- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From david at ar.media.kyoto-u.ac.jp Mon Aug 6 23:41:03 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 07 Aug 2007 12:41:03 +0900 Subject: [Numpy-discussion] numpy arrays, data allocation and SIMD alignement In-Reply-To: References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp> Message-ID: <46B7E9CF.1080005@ar.media.kyoto-u.ac.jp> Lisandro Dalcin wrote: > On 8/3/07, David Cournapeau wrote: >> Here is what I can think of: >> - adding an API to know whether a given PyArrayObject has its data >> buffer 16 bytes aligned, and requesting a 16 bytes aligned >> PyArrayObject. Something like NPY_ALIGNED, basically. >> - forcing data allocation to be 16 bytes aligned in numpy (eg >> define PyDataMem_Mem to a 16 bytes aligned allocator instead of malloc). > > All this sounds pretty similar to sdt::allocator we can found in C++ > STL (http://www.sgi.com/tech/stl/Allocators.html). Perhaps a NumPy > array could be associated with an instance of an 'allocator' object > (surely written in C, perhaps subclassable in Python) providing > appropriate methos for > alloc/dealloc(/realloc?/initialize(memset)?/copy(memcpy)?) memory. > > This would be really nice, as it is extensible (you could even write a > custom allocator, perhaps making use of a preallocated,static pool; > use of C++ new/delete; use of any C++ std::allocator, shared memory, > etc. etc.). I think this is the direction to go but no idea how much > difficult it could be to implement. > Well, when I proposed the SIMD extension, I was willing to implement the proposal, and this was for a simple goal: enabling better integration with many numeric libraries which need SIMD alignment. As nice as a custom allocator might be, I will certainly not implement it myself. For SIMD, I think the weight adding complexity / benefit worth it (since there is not much change to the API and implementation), and I know more or less how to do it; for custom allocator, that's an entirely different story. That's really more complex; static pools may be useful in some cases (but that's not obvious, since only the data are allocated with this buffer, everything else being allocated through the python memory allocator, and numpy arrays have pretty simple memory allocation patterns). David From peridot.faceted at gmail.com Tue Aug 7 00:49:09 2007 From: peridot.faceted at gmail.com (Anne Archibald) Date: Tue, 7 Aug 2007 00:49:09 -0400 Subject: [Numpy-discussion] numpy arrays, data allocation and SIMD alignement In-Reply-To: <46B7E9CF.1080005@ar.media.kyoto-u.ac.jp> References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp> <46B7E9CF.1080005@ar.media.kyoto-u.ac.jp> Message-ID: On 06/08/07, David Cournapeau wrote: > Well, when I proposed the SIMD extension, I was willing to implement the > proposal, and this was for a simple goal: enabling better integration > with many numeric libraries which need SIMD alignment. > > As nice as a custom allocator might be, I will certainly not implement > it myself. For SIMD, I think the weight adding complexity / benefit > worth it (since there is not much change to the API and implementation), > and I know more or less how to do it; for custom allocator, that's an > entirely different story. 
That's really more complex; static pools may > be useful in some cases (but that's not obvious, since only the data are > allocated with this buffer, everything else being allocated through the > python memory allocator, and numpy arrays have pretty simple memory > allocation patterns). I have to agree. I can hardly volunteer David for anything, and I don't have time to implement this myself, but I think a custom allocator is a rather special-purpose tool; if one were to implement one, I think the way to go would be to implement a subclass of ndarray (or just a constructor) that allocated the memory. This could be done from python, since you can make an ndarray from scratch using a given memory array. Of course, making temporaries be allocated with the correct allocator will be very complicated, since it's unclear which allocator should be used. Adding SIMD alignment should be a very small modification; it can be done as simply as using ctypes to wrap posix_memalign (or a portable version, possibly written in python) and writing a simple python function that checks the beginning data address. There's really no need to make it complicated. Anne From david at ar.media.kyoto-u.ac.jp Tue Aug 7 01:00:20 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 07 Aug 2007 14:00:20 +0900 Subject: [Numpy-discussion] numpy arrays, data allocation and SIMD alignement In-Reply-To: References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp> <46B7E9CF.1080005@ar.media.kyoto-u.ac.jp> Message-ID: <46B7FC64.7000109@ar.media.kyoto-u.ac.jp> Anne Archibald wrote: > > I have to agree. I can hardly volunteer David for anything, and I > don't have time to implement this myself, but I think a custom > allocator is a rather special-purpose tool; if one were to implement > one, I think the way to go would be to implement a subclass of ndarray > (or just a constructor) that allocated the memory. This could be done > from python, since you can make an ndarray from scratch using a given > memory array. Of course, making temporaries be allocated with the > correct allocator will be very complicated, since it's unclear which > allocator should be used. > > Adding SIMD alignment should be a very small modification; it can be > done as simply as using ctypes to wrap posix_memalign (or a portable > version, possibly written in python) and writing a simple python > function that checks the beginning data address. There's really no > need to make it complicated. > Anne, you said previously that it was easy to allocate buffers for a given alignment at runtime. Could you point me to a document which explains how ? For platforms without posix_memalign, I don't see how to implement a memory allocator with an arbitrary alignment (more precisely, I don't see how to free it if I cannot assume a fixed alignement: how do I know where the "real" pointer is ?). David From peridot.faceted at gmail.com Tue Aug 7 01:33:24 2007 From: peridot.faceted at gmail.com (Anne Archibald) Date: Tue, 7 Aug 2007 01:33:24 -0400 Subject: [Numpy-discussion] numpy arrays, data allocation and SIMD alignement In-Reply-To: <46B7FC64.7000109@ar.media.kyoto-u.ac.jp> References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp> <46B7E9CF.1080005@ar.media.kyoto-u.ac.jp> <46B7FC64.7000109@ar.media.kyoto-u.ac.jp> Message-ID: On 07/08/07, David Cournapeau wrote: > Anne, you said previously that it was easy to allocate buffers for a > given alignment at runtime. Could you point me to a document which > explains how ? 
For platforms without posix_memalign, I don't see how to > implement a memory allocator with an arbitrary alignment (more > precisely, I don't see how to free it if I cannot assume a fixed > alignement: how do I know where the "real" pointer is ?). Well, it can be done in Python: just allocate a too-big ndarray and take a slice that's the right shape and has the right alignment. But this sucks. Stephen G. Johnson posted code earlier in this thread that provides a portable aligned-memory allocator - it handles the freeing by (always) storing enough information to recover the original pointer in the padding space. (This means you always need to pad, which is a pain, but there's not much you can do about that.) His implementation stores the original pointer just before the beginning of the aligned data, so _aligned_free is free(((void**)ptr)[-1]). If you were worried about space (rather than time) you could store a single byte just before the pointer whose value indicated how much padding was done, or whatever. These schemes all waste space, but unless malloc's internal structures are the size of the alignment block, it's almost unavoidable to waste some space; the only way around it I can see is if the program also allocates lots of small, odd-shaped, unaligned blocks of memory that can be used to fill the gaps (and even then I doubt any sensible malloc implementation fills in little gaps like this, since it seems likely to lead to memory fragmentation). A posix_memalign that is built into malloc can do better than any implementation that isn't, though, with the possible exception of a specialized pool allocator built with aligned allocation in mind. Anne From matthieu.brucher at gmail.com Tue Aug 7 02:23:15 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Tue, 7 Aug 2007 08:23:15 +0200 Subject: [Numpy-discussion] numpy arrays, data allocation and SIMD alignement In-Reply-To: <46B7FC64.7000109@ar.media.kyoto-u.ac.jp> References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp> <46B7E9CF.1080005@ar.media.kyoto-u.ac.jp> <46B7FC64.7000109@ar.media.kyoto-u.ac.jp> Message-ID: > > For platforms without posix_memalign, I don't see how to > implement a memory allocator with an arbitrary alignment (more > precisely, I don't see how to free it if I cannot assume a fixed > alignement: how do I know where the "real" pointer is ?). Visual Studio seems to offer a counter part (also note that malloc is supposed to return a pointer on a 16bits boundary) which is called _aligned_malloc ( http://msdn2.microsoft.com/en-us/library/8z34s9c6(VS.80).aspx). It should be what you need, at least for Windows/MSVC. Matthieu -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Tue Aug 7 02:11:52 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 07 Aug 2007 15:11:52 +0900 Subject: [Numpy-discussion] numpy arrays, data allocation and SIMD alignement In-Reply-To: References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp> <46B7E9CF.1080005@ar.media.kyoto-u.ac.jp> <46B7FC64.7000109@ar.media.kyoto-u.ac.jp> Message-ID: <46B80D28.8060005@ar.media.kyoto-u.ac.jp> Anne Archibald wrote: > Well, it can be done in Python: just allocate a too-big ndarray and > take a slice that's the right shape and has the right alignment. But > this sucks. Stephen G. 
Johnson posted code earlier in this thread that > provides a portable aligned-memory allocator - it handles the freeing > by (always) storing enough information to recover the original pointer > in the padding space. (This means you always need to pad, which is a > pain, but there's not much you can do about that.) This is indeed no rocket science, I feel a bit ashamed :) I don't see the problem with padding (except wasted time) ? > His implementation > stores the original pointer just before the beginning of the aligned > data, so _aligned_free is free(((void**)ptr)[-1]). If you were worried > about space (rather than time) you could store a single byte just > before the pointer whose value indicated how much padding was done, or > whatever. I really don't see how space would be a problem in our situation: it is not like we will pad more than a few bytes; in the case it is, I don't see how python would be the right choice anymore anyway. I will try to prepare a patch the next few days, then. cheers, David From gerard.vermeulen at grenoble.cnrs.fr Tue Aug 7 04:54:28 2007 From: gerard.vermeulen at grenoble.cnrs.fr (Gerard Vermeulen) Date: Tue, 7 Aug 2007 10:54:28 +0200 Subject: [Numpy-discussion] ANN: PyQwt3D-0.1.5 released Message-ID: <20070807105428.3aeb7f19@zombie.grenoble.cnrs.fr> What is PyQwt3D ( http://pyqwt3d.sourceforge.net) ? - it is a set of Python bindings for the QwtPlot3D C++ class library which extends the Qt framework with widgets for 3D data visualization. PyQwt3D inherits the snappy feel from QwtPlot3D. The examples at http://pyqwt.sourceforge.net/pyqwt3d-examples.html show how easy it is to make a 3D plot and how to save a 3D plot to an image or an (E)PS/PDF/PGF/SVG file. - it requires and extends PyQt, a set of Python bindings for Qt. - it supports the use of PyQt, Qt, QwtPlot3D, and NumPy or SciPy in a GUI Python application or in an interactive Python session. - it runs on POSIX, Mac OS X and Windows platforms (practically any platform supported by Qt and Python). The home page of PyQwt3D is http://pyqwt.sourceforge.net. New features and bugfixes in PyQwt3D-0.1.5: - Added support for QwtPlot3D-0.2.7 - Added support for SIP-4.7, PyQt-4.3 and PyQt-3.17.3. - Added support for SVG and PGF vector output. - Added Qwt3D.save() to facilitate saving plots to a file. - Added Qwt3D.plot() to facilitate function plotting with nicely scaled axes. - Fixed the type of the result of IO.outputHandler(format). - Fixed saving to pixmap formats in qt4examples/Grab.py. PyQwt3D-0.1.5 supports: 1. Python-2.5, or -2.4. 2. PyQt-4.3, -4.2, -4.1, or -3.17. 3. SIP-4.7, -4.6, or -4.5. 4. Qt-4.3, -4.2, Qt-3.3, or -3.2. 5. QwtPlot3D-0.2.7. Enjoy -- Gerard Vermeulen From lfriedri at imtek.de Tue Aug 7 05:22:24 2007 From: lfriedri at imtek.de (Lars Friedrich) Date: Tue, 07 Aug 2007 11:22:24 +0200 Subject: [Numpy-discussion] fourier with single precision Message-ID: <46B839D0.9020905@imtek.de> Thank you for your comments! I will try this fftw3-scipy approach and see how much faster I can get. Maybe this is enough for me...? Lars From nwagner at iam.uni-stuttgart.de Tue Aug 7 08:02:16 2007 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Tue, 07 Aug 2007 14:02:16 +0200 Subject: [Numpy-discussion] Count the occurrence of a certain integer in a list of integers Message-ID: <46B85F48.3000700@iam.uni-stuttgart.de> Hi all, I have a list of integer numbers. The entries can vary between 0 and 19. How can I count the occurrence of any number. 
Consider >>> data [9, 6, 9, 6, 7, 9, 9, 10, 7, 9, 9, 6, 7, 9, 8, 8, 11, 9, 6, 7, 10, 9, 7, 9, 7, 8, 9, 8, 7, 9] Is there a better way than using, e.g. >>> shape(where(array(data)==10))[1] 2 to compute the occurrence of 10 in the list which is 2 in this case ? Nils From matthieu.brucher at gmail.com Tue Aug 7 08:13:49 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Tue, 7 Aug 2007 14:13:49 +0200 Subject: [Numpy-discussion] Count the occurrence of a certain integer in a list of integers In-Reply-To: <46B85F48.3000700@iam.uni-stuttgart.de> References: <46B85F48.3000700@iam.uni-stuttgart.de> Message-ID: You can try using hist() with the correct range and number of bins. Matthieu 2007/8/7, Nils Wagner : > > Hi all, > > I have a list of integer numbers. The entries can vary between 0 and 19. > How can I count the occurrence of any number. Consider > > >>> data > [9, 6, 9, 6, 7, 9, 9, 10, 7, 9, 9, 6, 7, 9, 8, 8, 11, 9, 6, 7, 10, 9, 7, > 9, 7, 8, 9, 8, 7, 9] > > > Is there a better way than using, e.g. > > >>> shape(where(array(data)==10))[1] > 2 > > > to compute the occurrence of 10 in the list which is 2 in this case ? > > Nils > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Tue Aug 7 08:19:47 2007 From: kwgoodman at gmail.com (Keith Goodman) Date: Tue, 7 Aug 2007 14:19:47 +0200 Subject: [Numpy-discussion] Count the occurrence of a certain integer in a list of integers In-Reply-To: <46B85F48.3000700@iam.uni-stuttgart.de> References: <46B85F48.3000700@iam.uni-stuttgart.de> Message-ID: On 8/7/07, Nils Wagner wrote: > I have a list of integer numbers. The entries can vary between 0 and 19. > How can I count the occurrence of any number. Consider > > >>> data > [9, 6, 9, 6, 7, 9, 9, 10, 7, 9, 9, 6, 7, 9, 8, 8, 11, 9, 6, 7, 10, 9, 7, 9, 7, 8, 9, 8, 7, 9] > > > Is there a better way than using, e.g. > > >>> shape(where(array(data)==10))[1] > 2 > > > to compute the occurrence of 10 in the list which is 2 in this case ? Would list comprehension work? len([z for z in data if z == 10]) From kwgoodman at gmail.com Tue Aug 7 08:24:13 2007 From: kwgoodman at gmail.com (Keith Goodman) Date: Tue, 7 Aug 2007 14:24:13 +0200 Subject: [Numpy-discussion] Count the occurrence of a certain integer in a list of integers In-Reply-To: References: <46B85F48.3000700@iam.uni-stuttgart.de> Message-ID: On 8/7/07, Keith Goodman wrote: > On 8/7/07, Nils Wagner wrote: > > I have a list of integer numbers. The entries can vary between 0 and 19. > > How can I count the occurrence of any number. Consider > > > > >>> data > > [9, 6, 9, 6, 7, 9, 9, 10, 7, 9, 9, 6, 7, 9, 8, 8, 11, 9, 6, 7, 10, 9, 7, 9, 7, 8, 9, 8, 7, 9] > > > > > > Is there a better way than using, e.g. > > > > >>> shape(where(array(data)==10))[1] > > 2 > > > > > > to compute the occurrence of 10 in the list which is 2 in this case ? > > Would list comprehension work? > > len([z for z in data if z == 10]) Or is this faster? 
(array(x)==10).sum() From cimrman3 at ntc.zcu.cz Tue Aug 7 08:24:22 2007 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Tue, 07 Aug 2007 14:24:22 +0200 Subject: [Numpy-discussion] Count the occurrence of a certain integer in a list of integers In-Reply-To: <46B85F48.3000700@iam.uni-stuttgart.de> References: <46B85F48.3000700@iam.uni-stuttgart.de> Message-ID: <46B86476.3080009@ntc.zcu.cz> Nils Wagner wrote: > Hi all, > > I have a list of integer numbers. The entries can vary between 0 and 19. > How can I count the occurrence of any number. Consider > > >>> data > [9, 6, 9, 6, 7, 9, 9, 10, 7, 9, 9, 6, 7, 9, 8, 8, 11, 9, 6, 7, 10, 9, 7, 9, 7, 8, 9, 8, 7, 9] > > > Is there a better way than using, e.g. > >>>> shape(where(array(data)==10))[1] > 2 > > > to compute the occurrence of 10 in the list which is 2 in this case ? Your way is ok if you want to count just a few numbers. If you want all, you may sort the array and use searchorted: b = sort( a ) count = searchsorted( b, 7, side = 'right' ) - searchsorted( b, 7, side = 'left' ) r. From lorrmann at physik.uni-wuerzburg.de Tue Aug 7 08:51:53 2007 From: lorrmann at physik.uni-wuerzburg.de (volker) Date: Tue, 7 Aug 2007 12:51:53 +0000 (UTC) Subject: [Numpy-discussion] =?utf-8?q?Count_the_occurrence_of_a_certain_in?= =?utf-8?q?teger_in=09a_list_of_integers?= References: <46B85F48.3000700@iam.uni-stuttgart.de> Message-ID: Keith Goodman gmail.com> writes: > > On 8/7/07, Keith Goodman gmail.com> wrote: > > On 8/7/07, Nils Wagner iam.uni-stuttgart.de> wrote: > > > I have a list of integer numbers. The entries can vary between 0 and 19. > > > How can I count the occurrence of any number. Consider > > > > > > >>> data > > > [9, 6, 9, 6, 7, 9, 9, 10, 7, 9, 9, 6, 7, 9, 8, 8, 11, 9, 6, 7, 10, 9, 7, 9, 7, 8, 9, 8, 7, 9] > > > > > > > > > Is there a better way than using, e.g. > > > > > > >>> shape(where(array(data)==10))[1] > > > 2 > > > > > > > > > to compute the occurrence of 10 in the list which is 2 in this case ? > > > > Would list comprehension work? > > > > len([z for z in data if z == 10]) > > Or is this faster? > > (array(x)==10).sum() > Lets test ;) In [34]: data = array(data).repeat(1e6) In [35]: %time shape(where(array(data)==10))[1] CPU times: user 1.27 s, sys: 0.16 s, total: 1.44 s Wall time: 1.65 In [36]: %time ([z for z in data if z == 10]) CPU times: user 18.06 s, sys: 0.52 s, total: 18.58 s Wall time: 18.59 In [37]: %time (array(data)==10).sum() CPU times: user 0.68 s, sys: 0.20 s, total: 0.88 s Wall time: 1.36 From aisaac at american.edu Tue Aug 7 09:11:34 2007 From: aisaac at american.edu (Alan G Isaac) Date: Tue, 7 Aug 2007 09:11:34 -0400 Subject: [Numpy-discussion] Count the occurrence of a certain integer in a list of integers In-Reply-To: <46B85F48.3000700@iam.uni-stuttgart.de> References: <46B85F48.3000700@iam.uni-stuttgart.de> Message-ID: On Tue, 07 Aug 2007, Nils Wagner apparently wrote: > I have a list of integer numbers. The entries can vary between 0 and 19. > How can I count the occurrence of any number. Consider > >>> data > [9, 6, 9, 6, 7, 9, 9, 10, 7, 9, 9, 6, 7, 9, 8, 8, 11, 9, 6, 7, 10, 9, 7, 9, 7, 8, 9, 8, 7, 9] > Is there a better way than using, e.g. >>>> shape(where(array(data)==10))[1] > 2 You did not say why data.count(10) is unsatisfactory ... 
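And if you want the counts of all the values at once, numpy's bincount does it in one call (quick sketch, not timed; the entries must be non-negative integers, which they are here):

from numpy import bincount
counts = bincount(data)   # counts[k] is the number of times k occurs
print counts[10]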
Cheers, Alan Isaac From aisaac at american.edu Tue Aug 7 09:19:13 2007 From: aisaac at american.edu (Alan G Isaac) Date: Tue, 7 Aug 2007 09:19:13 -0400 Subject: [Numpy-discussion] Count the occurrence of a certain integer in a list of integers In-Reply-To: References: <46B85F48.3000700@iam.uni-stuttgart.de> Message-ID: By the way, you can get all the frequencies pretty fast using a defaultdict: http://docs.python.org/lib/defaultdict-examples.html Cheers, Alan Isaac From nwagner at iam.uni-stuttgart.de Tue Aug 7 09:22:37 2007 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Tue, 07 Aug 2007 15:22:37 +0200 Subject: [Numpy-discussion] Count the occurrence of a certain integer in a list of integers In-Reply-To: References: <46B85F48.3000700@iam.uni-stuttgart.de> Message-ID: <46B8721D.8000709@iam.uni-stuttgart.de> Alan G Isaac wrote: > On Tue, 07 Aug 2007, Nils Wagner apparently wrote: > >> I have a list of integer numbers. The entries can vary between 0 and 19. >> How can I count the occurrence of any number. Consider >> >>> data >> [9, 6, 9, 6, 7, 9, 9, 10, 7, 9, 9, 6, 7, 9, 8, 8, 11, 9, 6, 7, 10, 9, 7, 9, 7, 8, 9, 8, 7, 9] >> Is there a better way than using, e.g. >> >>>>> shape(where(array(data)==10))[1] >>>>> >> 2 >> > > > You did not say why data.count(10) is unsatisfactory ... > > Cheers, > Alan Isaac > > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > Thank you for all your input. To be honest I was not aware of all these possibilities to solve my problem. If you distribute a task among different people you will obtain different methods of resolution. Nils From charlesr.harris at gmail.com Tue Aug 7 16:26:27 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 7 Aug 2007 14:26:27 -0600 Subject: [Numpy-discussion] numpy arrays, data allocation and SIMD alignement In-Reply-To: References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp> <46B7E9CF.1080005@ar.media.kyoto-u.ac.jp> Message-ID: On 8/6/07, Anne Archibald wrote: > > On 06/08/07, David Cournapeau wrote: > > > Well, when I proposed the SIMD extension, I was willing to implement the > > proposal, and this was for a simple goal: enabling better integration > > with many numeric libraries which need SIMD alignment. > > > > As nice as a custom allocator might be, I will certainly not implement > > it myself. For SIMD, I think the weight adding complexity / benefit > > worth it (since there is not much change to the API and implementation), > > and I know more or less how to do it; for custom allocator, that's an > > entirely different story. That's really more complex; static pools may > > be useful in some cases (but that's not obvious, since only the data are > > allocated with this buffer, everything else being allocated through the > > python memory allocator, and numpy arrays have pretty simple memory > > allocation patterns). > > I have to agree. I can hardly volunteer David for anything, and I > don't have time to implement this myself, but I think a custom > allocator is a rather special-purpose tool; if one were to implement > one, I think the way to go would be to implement a subclass of ndarray > (or just a constructor) that allocated the memory. This could be done > from python, since you can make an ndarray from scratch using a given > memory array. 
Of course, making temporaries be allocated with the > correct allocator will be very complicated, since it's unclear which > allocator should be used. Maybe I'm missing something, but handling the temporaries is automatic. Just return the appropriate slice from an array created in a subroutine. The original array gets its reference count decremented when the routine exits but the slice will still hold one. When the slice is deleted all the allocated memory will get garbage collected. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From john at saponara.net Tue Aug 7 22:50:05 2007 From: john at saponara.net (john saponara) Date: Tue, 07 Aug 2007 22:50:05 -0400 Subject: [Numpy-discussion] spurious IndexError? Message-ID: <46B92F5D.1010600@saponara.net> Using numpy-1.0.2/python-2.5/winxp pro sp2: in the following, the only array is 'a', and I'm not using it as an index, so why do I get the IndexError below? --- start python session --- >>> a=array([[1,3],[2,4]]) >>> a array([[1, 3], [2, 4]]) >>> f=lambda i,j: a[i,j] >>> f(1,1) 4 >>> fromfunction(f,(2,2)) Traceback (most recent call last): File "", line 1, in File "C:\Python25\Lib\site-packages\numpy\core\numeric.py", line 514, in fromfunction return function(*args,**kwargs) File "", line 1, in IndexError: arrays used as indices must be of integer (or boolean) type --- end python session --- The upstream maple is written in 'fromfunction' style, and I have no control over that but want to port it to python in the most natural way possible. The session suggests that lambda has no trouble with an array, so the problem seems to be related to the way 'fromfunction' works. What am I missing? Thanks! From robert.kern at gmail.com Tue Aug 7 23:04:12 2007 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 07 Aug 2007 22:04:12 -0500 Subject: [Numpy-discussion] spurious IndexError? In-Reply-To: <46B92F5D.1010600@saponara.net> References: <46B92F5D.1010600@saponara.net> Message-ID: <46B932AC.80206@gmail.com> john saponara wrote: > Using numpy-1.0.2/python-2.5/winxp pro sp2: in the following, the only > array is 'a', and I'm not using it as an index, so why do I get the > IndexError below? > > --- start python session --- > >>> a=array([[1,3],[2,4]]) > >>> a > array([[1, 3], > [2, 4]]) > >>> f=lambda i,j: a[i,j] > >>> f(1,1) > 4 > >>> fromfunction(f,(2,2)) > Traceback (most recent call last): > File "", line 1, in > File "C:\Python25\Lib\site-packages\numpy\core\numeric.py", line 514, > in fromfunction > return function(*args,**kwargs) > File "", line 1, in > IndexError: arrays used as indices must be of integer (or boolean) type > --- end python session --- > > The upstream maple is written in 'fromfunction' style, and I have no > control over that but want to port it to python in the most natural way > possible. > > The session suggests that lambda has no trouble with an array, so the > problem seems to be related to the way 'fromfunction' works. What am I > missing? fromfunction() takes the (2, 2) and forms arrays of indices. It then calls your function with those arrays as arguments. It does not loop. The default dtype of these arrays is float, not int. You must use "dtype=int" in your call to fromfunction(). def fromfunction(function, shape, **kwargs): """Returns an array constructed by calling a function on a tuple of number grids. The function should accept as many arguments as the length of shape and work on array inputs. 
The shape argument is a sequence of numbers indicating the length of the desired output for each axis. The function can also accept keyword arguments (except dtype), which will be passed through fromfunction to the function itself. The dtype argument (default float) determines the data-type of the index grid passed to the function. """ -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From gerard.vermeulen at grenoble.cnrs.fr Wed Aug 8 01:36:44 2007 From: gerard.vermeulen at grenoble.cnrs.fr (Gerard Vermeulen) Date: Wed, 8 Aug 2007 07:36:44 +0200 Subject: [Numpy-discussion] ANN: PyQwt3D-0.1.6 released Message-ID: <20070808073644.265d3e2d@zombie.grenoble.cnrs.fr> What is PyQwt3D ( http://pyqwt3d.sourceforge.net) ? - it is a set of Python bindings for the QwtPlot3D C++ class library which extends the Qt framework with widgets for 3D data visualization. PyQwt3D inherits the snappy feel from QwtPlot3D. The examples at http://pyqwt.sourceforge.net/pyqwt3d-examples.html show how easy it is to make a 3D plot and how to save a 3D plot to an image or an (E)PS/PDF/PGF/SVG file. - it requires and extends PyQt, a set of Python bindings for Qt. - it supports the use of PyQt, Qt, QwtPlot3D, and NumPy or SciPy in a GUI Python application or in an interactive Python session. - it runs on POSIX, Mac OS X and Windows platforms (practically any platform supported by Qt and Python). - it is licensed under the GPL with an exception to allow dynamic linking with non-free releases of Qt and PyQt. The home page of PyQwt3D is http://pyqwt.sourceforge.net. PyQwt3D-0.1.6 is a bug fix release: - Improved text display on screen and in pixmaps with Qt-4 and X (requires the use of the patched QwtPlot3D-0.2.7 library included in PyQwt3D). PyQwt3D-0.1.6 supports: 1. Python-2.5, or -2.4. 2. PyQt-4.3, -4.2, -4.1, or -3.17. 3. SIP-4.7, -4.6, or -4.5. 4. Qt-4.3, -4.2, Qt-3.3, or -3.2. 5. QwtPlot3D-0.2.7. Enjoy -- Gerard Vermeulen From lbolla at gmail.com Wed Aug 8 03:35:32 2007 From: lbolla at gmail.com (lorenzo bolla) Date: Wed, 8 Aug 2007 09:35:32 +0200 Subject: [Numpy-discussion] numpy installation problem In-Reply-To: <7fd38bfa0707301818kd280ce2vdb1e1cb0b0a23111@mail.gmail.com> References: <7fd38bfa0707301818kd280ce2vdb1e1cb0b0a23111@mail.gmail.com> Message-ID: <80c99e790708080035m57c2e186s760d07d8adf24c45@mail.gmail.com> sorry for the silly question: have you done "python setup.py install" from the numpy src directory, after untarring? then cd out from the src directory and try to import numpy from python. L. On 7/31/07, kingshuk ghosh wrote: > > Hi, > I downloaded numpy1.0.3-2.tar and unzipped and untared. > However somehow new numpy does not work. It invokes > the old numpy 0.9.6 when i import numpy from python > and type in numpy.version.version . > I tried to change path and once I do that and when I do > import numpy it says > "running from source directory" and then if I try > numpy.version.version it gives some error. > > Is there something obvious I am missing after unzipping > and untaring the numpy source file ? For example do I need > to do something to install the new numpy1.0.3 ? > > Or do I also need to download full python package ? > I am trying to run this on Red Hat Linux 3.2.2-5 which > has a gcc 3.2.2 and the version of python is 2.4 . > > Any help will be greatly appreciated. 
> > Cheers > Kings > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Wed Aug 8 05:53:30 2007 From: stefan at sun.ac.za (Stefan van der Walt) Date: Wed, 8 Aug 2007 11:53:30 +0200 Subject: [Numpy-discussion] numpy arrays, data allocation and SIMD alignement In-Reply-To: References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp> <46B7E9CF.1080005@ar.media.kyoto-u.ac.jp> <46B7FC64.7000109@ar.media.kyoto-u.ac.jp> Message-ID: <20070808095330.GO30988@mentat.za.net> On Tue, Aug 07, 2007 at 01:33:24AM -0400, Anne Archibald wrote: > Well, it can be done in Python: just allocate a too-big ndarray and > take a slice that's the right shape and has the right alignment. But > this sucks. Could you explain to me why is this such a bad idea? St?fan From markbak at gmail.com Wed Aug 8 06:44:56 2007 From: markbak at gmail.com (mark) Date: Wed, 08 Aug 2007 10:44:56 -0000 Subject: [Numpy-discussion] simple slicing question Message-ID: <1186569896.722070.91690@g4g2000hsf.googlegroups.com> Consider the array d: d = linspace( -10, 10, 10 ) If I want to multiply every value above -5 by 100 I can do d[ d>-5 ] *= 100 But what if I want to multiply every value between -5 and +5 by 100. This does NOT work: d[ d>-5 and d<5 ] *= 100 Any ideas? Thanks, Mark From kwgoodman at gmail.com Wed Aug 8 06:53:07 2007 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 8 Aug 2007 12:53:07 +0200 Subject: [Numpy-discussion] simple slicing question In-Reply-To: <1186569896.722070.91690@g4g2000hsf.googlegroups.com> References: <1186569896.722070.91690@g4g2000hsf.googlegroups.com> Message-ID: On 8/8/07, mark wrote: > But what if I want to multiply every value between -5 and +5 by 100. > This does NOT work: > > d[ d>-5 and d<5 ] *= 100 d[(d>-5) & (d<5)] *= 100 From markbak at gmail.com Wed Aug 8 07:07:15 2007 From: markbak at gmail.com (mark) Date: Wed, 08 Aug 2007 11:07:15 -0000 Subject: [Numpy-discussion] simple slicing question In-Reply-To: References: <1186569896.722070.91690@g4g2000hsf.googlegroups.com> Message-ID: <1186571235.266377.136330@22g2000hsm.googlegroups.com> Life is so simple. Thanks Keith, Mark On Aug 8, 12:53 pm, "Keith Goodman" wrote: > On 8/8/07, mark wrote: > > > But what if I want to multiply every value between -5 and +5 by 100. > > This does NOT work: > > > d[ d>-5 and d<5 ] *= 100 > > d[(d>-5) & (d<5)] *= 100 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discuss... at scipy.orghttp://projects.scipy.org/mailman/listinfo/numpy-discussion From peridot.faceted at gmail.com Wed Aug 8 11:29:55 2007 From: peridot.faceted at gmail.com (Anne Archibald) Date: Wed, 8 Aug 2007 11:29:55 -0400 Subject: [Numpy-discussion] numpy arrays, data allocation and SIMD alignement In-Reply-To: <20070808095330.GO30988@mentat.za.net> References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp> <46B7E9CF.1080005@ar.media.kyoto-u.ac.jp> <46B7FC64.7000109@ar.media.kyoto-u.ac.jp> <20070808095330.GO30988@mentat.za.net> Message-ID: On 08/08/2007, Stefan van der Walt wrote: > On Tue, Aug 07, 2007 at 01:33:24AM -0400, Anne Archibald wrote: > > Well, it can be done in Python: just allocate a too-big ndarray and > > take a slice that's the right shape and has the right alignment. But > > this sucks. > > Could you explain to me why is this such a bad idea? Oh. 
Well, it's not *terrible*; it gets you an aligned array. But you have to allocate the original array as a 1D byte array (to allow for arbitrary realignments) and then align it, reshape it, and reinterpret it as a new type. Plus you're allocating an extra ndarray structure, which will live as long as the new array does; this not only wastes even more memory than the portable alignment solutions, it clogs up python's garbage collector. It's not outrageous, if you need aligned arrays *now*, on a released version of numpy, but numpy itself should do better. Anne From markbak at gmail.com Wed Aug 8 11:37:09 2007 From: markbak at gmail.com (mark) Date: Wed, 08 Aug 2007 15:37:09 -0000 Subject: [Numpy-discussion] vectorized function inside a class Message-ID: <1186587429.134603.120980@q75g2000hsh.googlegroups.com> I am trying to figure out a way to define a vectorized function inside a class. This is what I tried: class test: def __init__(self): self.x = 3.0 def func(self,y): rv = self.x if y > self.x: rv = y return rv f = vectorize(func) >>> m = test() >>> m.f( m, [-20,4,6] ) array([ 3., 4., 6.]) But as you can see, I can only call the m.f function when I also pass it the instance m again. I really want to call it as m.f( [-20,4,6] ) But then I get an error ValueError: mismatch between python function inputs and received arguments Any ideas how to do this better? Mark From stefan at sun.ac.za Wed Aug 8 11:50:00 2007 From: stefan at sun.ac.za (Stefan van der Walt) Date: Wed, 8 Aug 2007 17:50:00 +0200 Subject: [Numpy-discussion] vectorized function inside a class In-Reply-To: <1186587429.134603.120980@q75g2000hsh.googlegroups.com> References: <1186587429.134603.120980@q75g2000hsh.googlegroups.com> Message-ID: <20070808155000.GC29100@mentat.za.net> Hi Mark On Wed, Aug 08, 2007 at 03:37:09PM -0000, mark wrote: > I am trying to figure out a way to define a vectorized function inside > a class. > This is what I tried: > > class test: > def __init__(self): > self.x = 3.0 > def func(self,y): > rv = self.x > if y > self.x: rv = y > return rv > f = vectorize(func) > > > >>> m = test() > >>> m.f( m, [-20,4,6] ) > array([ 3., 4., 6.]) Maybe you don't need to use vectorize. How about def func(self,y): y = y.copy() y[y <= self.x] = self.x return y Cheers St?fan From tim.hochberg at ieee.org Wed Aug 8 11:54:18 2007 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Wed, 8 Aug 2007 08:54:18 -0700 Subject: [Numpy-discussion] vectorized function inside a class In-Reply-To: <1186587429.134603.120980@q75g2000hsh.googlegroups.com> References: <1186587429.134603.120980@q75g2000hsh.googlegroups.com> Message-ID: On 8/8/07, mark wrote: > > I am trying to figure out a way to define a vectorized function inside > a class. > This is what I tried: > > class test: > def __init__(self): > self.x = 3.0 > def func(self,y): > rv = self.x > if y > self.x: rv = y > return rv > f = vectorize(func) > > > >>> m = test() > >>> m.f( m, [-20,4,6] ) > array([ 3., 4., 6.]) > > But as you can see, I can only call the m.f function when I also pass > it the instance m again. > I really want to call it as > m.f( [-20,4,6] ) > But then I get an error > ValueError: mismatch between python function inputs and received > arguments > > Any ideas how to do this better? Don't use vectorize? 
Something like: def f(self,y): return np.where(y > self.x, y, self.x) You could also use vectorize by wrapping the result in a real method like this: _f = vectorize(func) def f(self, y): return self._f(self, y) That seems kind of silly in this instance though. -tim -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed Aug 8 12:04:27 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 8 Aug 2007 10:04:27 -0600 Subject: [Numpy-discussion] numpy arrays, data allocation and SIMD alignement In-Reply-To: References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp> <46B7E9CF.1080005@ar.media.kyoto-u.ac.jp> <46B7FC64.7000109@ar.media.kyoto-u.ac.jp> <20070808095330.GO30988@mentat.za.net> Message-ID: On 8/8/07, Anne Archibald wrote: > > On 08/08/2007, Stefan van der Walt wrote: > > On Tue, Aug 07, 2007 at 01:33:24AM -0400, Anne Archibald wrote: > > > Well, it can be done in Python: just allocate a too-big ndarray and > > > take a slice that's the right shape and has the right alignment. But > > > this sucks. > > > > Could you explain to me why is this such a bad idea? > > Oh. Well, it's not *terrible*; it gets you an aligned array. But you > have to allocate the original array as a 1D byte array (to allow for > arbitrary realignments) and then align it, reshape it, and reinterpret > it as a new type. Plus you're allocating an extra ndarray structure, > which will live as long as the new array does; this not only wastes > even more memory than the portable alignment solutions, it clogs up > python's garbage collector. The ndarray structure doesn't take up much memory, it is the data that is large and the data is shared between the original array and the slice. Nor does the data type of the slice need changing, one simply uses the desired type to begin with, or at least a type of the right size so that a view will do the job without copies. Nor do I see how the garbage collector will get clogged up, slices are a common feature of using numpy. The slice method also has the advantage of being compiler and operating system independent, there is a reason Intel used that approach. Aligning multidimensional arrays might indeed be complicated, but I suspect those complications will be easier to handle in Python than in C. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Wed Aug 8 12:08:19 2007 From: stefan at sun.ac.za (Stefan van der Walt) Date: Wed, 8 Aug 2007 18:08:19 +0200 Subject: [Numpy-discussion] vectorized function inside a class In-Reply-To: References: <1186587429.134603.120980@q75g2000hsh.googlegroups.com> Message-ID: <20070808160819.GD29100@mentat.za.net> On Wed, Aug 08, 2007 at 08:54:18AM -0700, Timothy Hochberg wrote: > Don't use vectorize? Something like: > > def f(self,y): > return np.where(y > self.x, y, self.x) A one-liner, cool. 
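One more array-native variant for this particular clamp-below operation, not proposed in the thread itself but equivalent to the where() one-liner: numpy's elementwise maximum broadcasts the scalar threshold against the array, so no vectorize() wrapper or explicit loop is needed. A minimal sketch:

import numpy as np

class Test(object):
    def __init__(self):
        self.x = 3.0
    def f(self, y):
        # elementwise max of the array and the scalar threshold
        return np.maximum(y, self.x)

m = Test()
print(m.f(np.array([-20.0, 4.0, 6.0])))   # [ 3.  4.  6.]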
Benchmarks on some other methods: Method 1: N.where 100 loops, best of 3: 9.32 ms per loop Method 2: N.clip 10000000 loops, best of 3: 112 ns per loop 100 loops, best of 3: 3.33 ms per loop Method 3: N.putmask 100 loops, best of 3: 5.95 ms per loop Method 4: fancy indexing 100 loops, best of 3: 5.09 ms per loop Cheers St?fan From mpmusu at cc.usu.edu Wed Aug 8 12:26:24 2007 From: mpmusu at cc.usu.edu (Mark.Miller) Date: Wed, 08 Aug 2007 10:26:24 -0600 Subject: [Numpy-discussion] Count the occurrence of a certain integer in a list of integers Message-ID: <46B9EEB0.40105@cc.usu.edu> A late entry, but here's something that gets you an array of counts for each unique integer: >>> data = numpy.array([9, 6, 9, 6, 7, 9, 9, 10, 7, 9, 9, 6, 7, 9, 8, 8, 11, 9, 6, 7, 10, 9, 7, 9, 7, 8, 9, 8, 7, 9]) >>> unique=numpy.unique(data) >>> unique array([ 6, 7, 8, 9, 10, 11]) >>> histo=numpy.histogram(data,unique) >>> histo (array([ 4, 7, 4, 12, 2, 1]), array([ 6, 7, 8, 9, 10, 11])) >>> So histo[0] includes the counts of each integer in data. -Mark 2007/8/7, Nils Wagner : > > Hi all, > > I have a list of integer numbers. The entries can vary between 0 and 19. > How can I count the occurrence of any number. Consider > > >>> data > [9, 6, 9, 6, 7, 9, 9, 10, 7, 9, 9, 6, 7, 9, 8, 8, 11, 9, 6, 7, 10, 9, 7, > 9, 7, 8, 9, 8, 7, 9] > > > Is there a better way than using, e.g. > > >>> shape(where(array(data)==10))[1] > 2 > > > to compute the occurrence of 10 in the list which is 2 in this case ? > > Nils From peridot.faceted at gmail.com Wed Aug 8 14:23:44 2007 From: peridot.faceted at gmail.com (Anne Archibald) Date: Wed, 8 Aug 2007 14:23:44 -0400 Subject: [Numpy-discussion] numpy arrays, data allocation and SIMD alignement In-Reply-To: References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp> <46B7E9CF.1080005@ar.media.kyoto-u.ac.jp> <46B7FC64.7000109@ar.media.kyoto-u.ac.jp> <20070808095330.GO30988@mentat.za.net> Message-ID: On 08/08/2007, Charles R Harris wrote: > > > On 8/8/07, Anne Archibald wrote: > > Oh. Well, it's not *terrible*; it gets you an aligned array. But you > > have to allocate the original array as a 1D byte array (to allow for > > arbitrary realignments) and then align it, reshape it, and reinterpret > > it as a new type. Plus you're allocating an extra ndarray structure, > > which will live as long as the new array does; this not only wastes > > even more memory than the portable alignment solutions, it clogs up > > python's garbage collector. > > The ndarray structure doesn't take up much memory, it is the data that is > large and the data is shared between the original array and the slice. Nor > does the data type of the slice need changing, one simply uses the desired > type to begin with, or at least a type of the right size so that a view will > do the job without copies. Nor do I see how the garbage collector will get > clogged up, slices are a common feature of using numpy. The slice method > also has the advantage of being compiler and operating system independent, > there is a reason Intel used that approach. > > Aligning multidimensional arrays might indeed be complicated, but I suspect > those complications will be easier to handle in Python than in C. Can we assume that numpy arrays allocated to contain (say) complex64s are aligned to a 16-byte boundary? I don't think they will necessarily, so the shift we need may not be an integer number of complex64s. float96s pose even more problems. 
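An aside on the integer-counting thread just above: since the values there are small non-negative integers, numpy.bincount gives the same per-value counts in a single call (counts[k] is the number of occurrences of k):

>>> import numpy as np
>>> data = [9, 6, 9, 6, 7, 9, 9, 10, 7, 9, 9, 6, 7, 9, 8, 8,
...         11, 9, 6, 7, 10, 9, 7, 9, 7, 8, 9, 8, 7, 9]
>>> counts = np.bincount(data)
>>> counts[10]
2
>>> counts[6:12]
array([ 4,  7,  4, 12,  2,  1])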
So to ensure alignment, we do need to do type conversion; if we're doing it anyway, byte arrays require the least trust in malloc(). The ndarray object isn't too big, probably some twenty or thirty bytes, so I'm not talking about a huge waste. But it is a python object, and the garbage collector needs to walk the whole tree of accessible python objects every time it runs, so this is one more object on the list. As an aside: numpy's handling of ndarray objects is actually not ideal; if you want to exhaust memory on your system, do: a = arange(5) while True: a = a[::-1] Each ndarray object keeps alive the ndarray object it is a slice of, so this operation creates an ever-growing linked list of ndarray objects. Seems to me it would be better to keep a pointer only to the original object that holds the address of the buffer (so it can be freed). Aligning multidimensional arrays is an interesting question. To first order, aligning the first element should be enough. If the dimensions of the array are not divisible by the alignment, though, this means that lower-dimensional complete slices may not be aligned: A = aligned_empty((7,5),dtype=float,alignment=16) Then A is aligned, as is A[0,:], but A[1,:] is not. So in this case we might want to actually allocate an 8-by-5 array and take a slice. This does mean it won't be contiguous in memory, so that flattening it requires a copy (which may not wind up aligned). This is something we might want to do - that is, make available as an option - in python. Anne From charlesr.harris at gmail.com Wed Aug 8 14:58:13 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 8 Aug 2007 12:58:13 -0600 Subject: [Numpy-discussion] numpy arrays, data allocation and SIMD alignement In-Reply-To: References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp> <46B7E9CF.1080005@ar.media.kyoto-u.ac.jp> <46B7FC64.7000109@ar.media.kyoto-u.ac.jp> <20070808095330.GO30988@mentat.za.net> Message-ID: Anne, On 8/8/07, Anne Archibald wrote: > > On 08/08/2007, Charles R Harris wrote: > > > > > > On 8/8/07, Anne Archibald wrote: > > > Oh. Well, it's not *terrible*; it gets you an aligned array. But you > > > have to allocate the original array as a 1D byte array (to allow for > > > arbitrary realignments) and then align it, reshape it, and reinterpret > > > it as a new type. Plus you're allocating an extra ndarray structure, > > > which will live as long as the new array does; this not only wastes > > > even more memory than the portable alignment solutions, it clogs up > > > python's garbage collector. > > > > The ndarray structure doesn't take up much memory, it is the data that > is > > large and the data is shared between the original array and the slice. > Nor > > does the data type of the slice need changing, one simply uses the > desired > > type to begin with, or at least a type of the right size so that a view > will > > do the job without copies. Nor do I see how the garbage collector will > get > > clogged up, slices are a common feature of using numpy. The slice method > > also has the advantage of being compiler and operating system > independent, > > there is a reason Intel used that approach. > > > > Aligning multidimensional arrays might indeed be complicated, but I > suspect > > those complications will be easier to handle in Python than in C. > > Can we assume that numpy arrays allocated to contain (say) complex64s > are aligned to a 16-byte boundary? I don't think they will > necessarily, so the shift we need may not be an integer number of > complex64s. 
float96s pose even more problems. So to ensure alignment, > we do need to do type conversion; if we're doing it anyway, byte > arrays require the least trust in malloc(). I think that is a safe assumption, it is probably almost as safe as assuming binary and two's complement, likely more safe than assuming ieee 784. I expect almost all 32 bit OS's to align on 4 byte boundaries at worst, 64 bit machines to align on 8 byte boundaries. Even C structures are typically filled out with blanks to preserve some sort of alignment. That is because of addressing efficiency, or even the impossibility of odd addressing -- depends on the architecture. Sometimes even byte addressing is easier to get by putting a larger integer on the bus and extracting the relevant part. In addition, I expect the heap implementation to make some alignment decisions for efficiency. My 64 bit linux on Intel aligns arrays, whatever the data type, on 16 byte boundaries. It might be interesting to see what happens with the Intel and MSVC comipilers, but I expect similar results. PPC's, Sun and SGI need to be checked, but I don't expect problems. I think that will cover almost all architectures numpy is likely to run on. > The ndarray object isn't too big, probably some twenty or thirty > bytes, so I'm not talking about a huge waste. But it is a python > object, and the garbage collector needs to walk the whole tree of > accessible python objects every time it runs, so this is one more > object on the list. > > As an aside: numpy's handling of ndarray objects is actually not > ideal; if you want to exhaust memory on your system, do: > > a = arange(5) > while True: > a = a[::-1] Well, that's a pathological case present in numpy. Fixing it doesn't seem to be a high priority although there is a ticket somewhere. Each ndarray object keeps alive the ndarray object it is a slice of, > so this operation creates an ever-growing linked list of ndarray > objects. Seems to me it would be better to keep a pointer only to the > original object that holds the address of the buffer (so it can be > freed). > > Aligning multidimensional arrays is an interesting question. To first > order, aligning the first element should be enough. If the dimensions > of the array are not divisible by the alignment, though, this means > that lower-dimensional complete slices may not be aligned: > > A = aligned_empty((7,5),dtype=float,alignment=16) > > Then A is aligned, as is A[0,:], but A[1,:] is not. > > So in this case we might want to actually allocate an 8-by-5 array and > take a slice. This does mean it won't be contiguous in memory, so that > flattening it requires a copy (which may not wind up aligned). This is > something we might want to do - that is, make available as an option - > in python. I think that is better viewed as need based. I suspect that if you really need such alignment it is better to start with array dimensions that will naturally align the rows. It will be impossible to naturally align all the columnes unless the data type is the correct size. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From matthieu.brucher at gmail.com Wed Aug 8 15:01:59 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Wed, 8 Aug 2007 21:01:59 +0200 Subject: [Numpy-discussion] numpy arrays, data allocation and SIMD alignement In-Reply-To: References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp> <46B7E9CF.1080005@ar.media.kyoto-u.ac.jp> <46B7FC64.7000109@ar.media.kyoto-u.ac.jp> <20070808095330.GO30988@mentat.za.net> Message-ID: > > My 64 bit linux on Intel aligns arrays, whatever the data type, on 16 byte > boundaries. It might be interesting to see what happens with the Intel and > MSVC comipilers, but I expect similar results. > According to the doc on the msdn, the data should be 16-bits aligned. Matthieu -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed Aug 8 15:16:41 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 8 Aug 2007 13:16:41 -0600 Subject: [Numpy-discussion] numpy arrays, data allocation and SIMD alignement In-Reply-To: References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp> <46B7FC64.7000109@ar.media.kyoto-u.ac.jp> <20070808095330.GO30988@mentat.za.net> Message-ID: On 8/8/07, Matthieu Brucher wrote: > > My 64 bit linux on Intel aligns arrays, whatever the data type, on 16 byte > > boundaries. It might be interesting to see what happens with the Intel and > > MSVC comipilers, but I expect similar results. > > > > According to the doc on the msdn, the data should be 16-bits aligned. Shades of DOS and 16 bit machines. Have you checked what actually happens on modern hardware? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From markbak at gmail.com Wed Aug 8 16:38:32 2007 From: markbak at gmail.com (mark) Date: Wed, 08 Aug 2007 20:38:32 -0000 Subject: [Numpy-discussion] vectorized function inside a class In-Reply-To: References: <1186587429.134603.120980@q75g2000hsh.googlegroups.com> Message-ID: <1186605512.397478.24150@22g2000hsm.googlegroups.com> Thanks for the ideas to circumvent vectorization. But the real function I need to vectorize is quite a bit more complicated. So I would really like to use vectorize. Are there any reasons against vectorization? Is it slow? The way Tim suggests I expect to be slow as there are two functions calls. Thanks, Mark On Aug 8, 5:54 pm, "Timothy Hochberg" wrote: > On 8/8/07, mark wrote: > > > > > > > I am trying to figure out a way to define a vectorized function inside > > a class. > > This is what I tried: > > > class test: > > def __init__(self): > > self.x = 3.0 > > def func(self,y): > > rv = self.x > > if y > self.x: rv = y > > return rv > > f = vectorize(func) > > > >>> m = test() > > >>> m.f( m, [-20,4,6] ) > > array([ 3., 4., 6.]) > > > But as you can see, I can only call the m.f function when I also pass > > it the instance m again. > > I really want to call it as > > m.f( [-20,4,6] ) > > But then I get an error > > ValueError: mismatch between python function inputs and received > > arguments > > > Any ideas how to do this better? > > Don't use vectorize? Something like: > > def f(self,y): > return np.where(y > self.x, y, self.x) > > You could also use vectorize by wrapping the result in a real method like > this: > > _f = vectorize(func) > def f(self, y): > return self._f(self, y) > > That seems kind of silly in this instance though. > > -tim > > -- > . __ > . |-\ > . > . tim.hochb... at ieee.org > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discuss... 
at scipy.orghttp://projects.scipy.org/mailman/listinfo/numpy-discussion From peridot.faceted at gmail.com Wed Aug 8 16:53:28 2007 From: peridot.faceted at gmail.com (Anne Archibald) Date: Wed, 8 Aug 2007 16:53:28 -0400 Subject: [Numpy-discussion] vectorized function inside a class In-Reply-To: <1186605512.397478.24150@22g2000hsm.googlegroups.com> References: <1186587429.134603.120980@q75g2000hsh.googlegroups.com> <1186605512.397478.24150@22g2000hsm.googlegroups.com> Message-ID: On 08/08/2007, mark wrote: > Thanks for the ideas to circumvent vectorization. > But the real function I need to vectorize is quite a bit more > complicated. > So I would really like to use vectorize. > Are there any reasons against vectorization? Is it slow? > The way Tim suggests I expect to be slow as there are two functions > calls. vectorize() is just shorthand for a for loop, basically; it won't win you anything on speed over looping yourself. It does win on convenience, but if you can write your function to act on arrays it will run much faster. Anne From john at saponara.net Wed Aug 8 18:13:35 2007 From: john at saponara.net (john saponara) Date: Wed, 08 Aug 2007 18:13:35 -0400 Subject: [Numpy-discussion] fromfunction question Message-ID: <46BA400F.3070505@saponara.net> Thinking I could use fromfunction to generate the x,y,z coordinates of a 3D surface, I instead got separate arrays of x, y, and z coordinates (as I should have expected) and needed to use a nested listcomp to produce the unified array of 3D points: x,y,z=fromfunction( lambda i,j: (xfun(i,j),yfun(i,j),zfun(i,j)), (1,3), dtype=int ) result=[[(a,b,c) for a,b,c in zip(p,q,r)] for p,q,r in zip(x,y,z)] Is it possible to compute the unified array of 3-tuples in a single step? Below is working code. Thanks. --- start python session --- r=array([[0,1],[1,0],[2,1]]) c=array([[0,1],[1,0],[2,1]]) p=array([[-1,0]]) rLen=len(r) cLen=len(c) # functions to compute x,y,z coordinates of 3d points (the exact expressions are not important) def xfun(i,j): return r[j,0] def yfun(i,j): return r[j,1]*p[i,0]+c[i+1,0] def zfun(i,j): return r[j,1]*p[i,1]+c[i+1,1] # the fromfunction and an extra step to arrange coordinates into 3-tuples x,y,z=fromfunction( lambda i,j: (xfun(i,j),yfun(i,j),zfun(i,j)), (1,3), dtype=int ) result=[[(a,b,c) for a,b,c in zip(p,q,r)] for p,q,r in zip(x,y,z)] print result # prints [[(0, 0, 0), (1, 1, 0), (2, 0, 0)]] --- end python session --- From david at ar.media.kyoto-u.ac.jp Wed Aug 8 23:08:45 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 09 Aug 2007 12:08:45 +0900 Subject: [Numpy-discussion] numpy arrays, data allocation and SIMD alignement In-Reply-To: References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp> <46B7E9CF.1080005@ar.media.kyoto-u.ac.jp> <46B7FC64.7000109@ar.media.kyoto-u.ac.jp> <20070808095330.GO30988@mentat.za.net> Message-ID: <46BA853D.90901@ar.media.kyoto-u.ac.jp> Charles R Harris wrote: > Anne, > > On 8/8/07, *Anne Archibald* > wrote: > > On 08/08/2007, Charles R Harris > wrote: > > > > > > On 8/8/07, Anne Archibald > wrote: > > > Oh. Well, it's not *terrible*; it gets you an aligned array. > But you > > > have to allocate the original array as a 1D byte array (to > allow for > > > arbitrary realignments) and then align it, reshape it, and > reinterpret > > > it as a new type. 
Plus you're allocating an extra ndarray > structure, > > > which will live as long as the new array does; this not only > wastes > > > even more memory than the portable alignment solutions, it > clogs up > > > python's garbage collector. > > > > The ndarray structure doesn't take up much memory, it is the > data that is > > large and the data is shared between the original array and the > slice. Nor > > does the data type of the slice need changing, one simply uses > the desired > > type to begin with, or at least a type of the right size so that > a view will > > do the job without copies. Nor do I see how the garbage > collector will get > > clogged up, slices are a common feature of using numpy. The > slice method > > also has the advantage of being compiler and operating system > independent, > > there is a reason Intel used that approach. > I am not sure to understand which approach to which problem you are talking about here ? IMHO, the discussion is becoming a bit carried away. What I was suggesting is - being able to check whether a given data buffer is aligned to a given alignment (easy) - being able to request an aligned data buffer: requires aligned memory allocators, and some additions to the API for creating arrays. This all boils down to the following case: I have a C function which requires N bytes aligned data, I want the numpy API to provide this capability. I don't understand the discussion on doing it in python: first, this means you cannot request a data buffer at the C level, and I don't understand the whole discussion on slice, multi dimension and so on either: at the C level, different libraries may need different arrays formats, and in the case of fftw, all it cares about is the alignment of the data pointer. For contiguous, C order arrays, as long as the data pointer is aligned, I don't think we need more; are some people familiar with the MKL, who could tell whether we need more ? cheers, David From charlesr.harris at gmail.com Thu Aug 9 03:17:28 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 9 Aug 2007 01:17:28 -0600 Subject: [Numpy-discussion] numpy arrays, data allocation and SIMD alignement In-Reply-To: <46BA853D.90901@ar.media.kyoto-u.ac.jp> References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp> <46B7FC64.7000109@ar.media.kyoto-u.ac.jp> <20070808095330.GO30988@mentat.za.net> <46BA853D.90901@ar.media.kyoto-u.ac.jp> Message-ID: On 8/8/07, David Cournapeau wrote: > > Charles R Harris wrote: > > Anne, > > > > On 8/8/07, *Anne Archibald* > > wrote: > > > > On 08/08/2007, Charles R Harris > > wrote: > > > > > > > > > On 8/8/07, Anne Archibald > > wrote: > > > > Oh. Well, it's not *terrible*; it gets you an aligned array. > > But you > > > > have to allocate the original array as a 1D byte array (to > > allow for > > > > arbitrary realignments) and then align it, reshape it, and > > reinterpret > > > > it as a new type. Plus you're allocating an extra ndarray > > structure, > > > > which will live as long as the new array does; this not only > > wastes > > > > even more memory than the portable alignment solutions, it > > clogs up > > > > python's garbage collector. > > > > > > The ndarray structure doesn't take up much memory, it is the > > data that is > > > large and the data is shared between the original array and the > > slice. Nor > > > does the data type of the slice need changing, one simply uses > > the desired > > > type to begin with, or at least a type of the right size so that > > a view will > > > do the job without copies. 
Nor do I see how the garbage > > collector will get > > > clogged up, slices are a common feature of using numpy. The > > slice method > > > also has the advantage of being compiler and operating system > > independent, > > > there is a reason Intel used that approach. > > > I am not sure to understand which approach to which problem you are > talking about here ? > > IMHO, the discussion is becoming a bit carried away. What I was > suggesting is > - being able to check whether a given data buffer is aligned to a > given alignment (easy) > - being able to request an aligned data buffer: requires aligned > memory allocators, and some additions to the API for creating arrays. > > This all boils down to the following case: I have a C function which > requires N bytes aligned data, I want the numpy API to provide this > capability. I don't understand the discussion on doing it in python: Well, what you want might be very easy to do in python, we just need to check the default alignments for doubles and floats for some of the other compilers, architectures, and OS's out there. On the other hand, you might not be able to request a c malloc that is aligned in a portable way without resorting to the same tricks as you do in python. So why not use python and get the reference counting and garbage collection along with it? What we want are doubles 8 byte aligned and floats 4 byte aligned. That seems to be the case with gcc, linux, and the Intel architecture. The idea is to create a slightly oversize array, then use a slice of the proper size that is 16 byte aligned. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Thu Aug 9 03:52:38 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 09 Aug 2007 16:52:38 +0900 Subject: [Numpy-discussion] numpy arrays, data allocation and SIMD alignement In-Reply-To: References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp> <46B7FC64.7000109@ar.media.kyoto-u.ac.jp> <20070808095330.GO30988@mentat.za.net> <46BA853D.90901@ar.media.kyoto-u.ac.jp> Message-ID: <46BAC7C6.8030004@ar.media.kyoto-u.ac.jp> Charles R Harris wrote: > > Well, what you want might be very easy to do in python, we just need > to check the default alignments for doubles and floats for some of the > other compilers, architectures, and OS's out there. On the other hand, > you might not be able to request a c malloc that is aligned in a > portable way without resorting to the same tricks as you do in python. > So why not use python and get the reference counting and garbage > collection along with it? First, doing it in python means that I cannot use the facility from C easily. But this is exactly where I need it, and where I would guess most people need it. People want to interface numpy with the mkl ? They will do it in C, right ? And maybe I am just too dumb to see the problem, but I don't see the need for garbage collection and so on :) Again, what is needed is: - aligned allocator -> we can use the one from Steven Johnson, used in fftw, which support more or less the same archs than numpy - Refactor the array creation functions in C such as the implementation takes one additional alignement argument, and the original functions are kept identical to before - Add a few utilities function to check whether it is SSE aligned, arbitrary aligned, etc... The only non trivial point is 2 . 
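A rough Python-side sketch of the first utility on that list, i.e. checking whether an existing data buffer meets a given alignment. The helper name is hypothetical; the facility being discussed would live in C and test the data pointer directly, but the same check is visible from Python through the ctypes attribute:

import numpy as np

def is_aligned(arr, alignment=16):
    # True when the array's data buffer starts on an `alignment`-byte
    # boundary (16 bytes is what SSE loads and stores want)
    return arr.ctypes.data % alignment == 0

a = np.zeros(64, dtype=np.float64)
print(is_aligned(a), is_aligned(a[1:]))   # the 8-byte-offset view is typically unaligned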
Actually, when I first thought about it, I thought about fixing alignement at compile time, which would have made it totally avoidable: it would have been a simple change of the definition of PyDataMem_New to an aligned malloc with a constant. I have already the code for this, and besides aligned malloc code, it is like a 5 lines change of numpy code, nothing terrible, really. David From charlesr.harris at gmail.com Thu Aug 9 03:55:50 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 9 Aug 2007 01:55:50 -0600 Subject: [Numpy-discussion] numpy arrays, data allocation and SIMD alignement In-Reply-To: References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp> <46B7FC64.7000109@ar.media.kyoto-u.ac.jp> <20070808095330.GO30988@mentat.za.net> <46BA853D.90901@ar.media.kyoto-u.ac.jp> Message-ID: On 8/9/07, Charles R Harris wrote: > > > > On 8/8/07, David Cournapeau wrote: > > > > Charles R Harris wrote: > > > Anne, > > > > > > On 8/8/07, *Anne Archibald* > > > wrote: > > > > > > On 08/08/2007, Charles R Harris > > > wrote: > > > > > > > > > > > > On 8/8/07, Anne Archibald > > > wrote: > > > > > Oh. Well, it's not *terrible*; it gets you an aligned array. > > > But you > > > > > have to allocate the original array as a 1D byte array (to > > > allow for > > > > > arbitrary realignments) and then align it, reshape it, and > > > reinterpret > > > > > it as a new type. Plus you're allocating an extra ndarray > > > structure, > > > > > which will live as long as the new array does; this not only > > > wastes > > > > > even more memory than the portable alignment solutions, it > > > clogs up > > > > > python's garbage collector. > > > > > > > > The ndarray structure doesn't take up much memory, it is the > > > data that is > > > > large and the data is shared between the original array and the > > > slice. Nor > > > > does the data type of the slice need changing, one simply uses > > > the desired > > > > type to begin with, or at least a type of the right size so that > > > a view will > > > > do the job without copies. Nor do I see how the garbage > > > collector will get > > > > clogged up, slices are a common feature of using numpy. The > > > slice method > > > > also has the advantage of being compiler and operating system > > > independent, > > > > there is a reason Intel used that approach. > > > > > I am not sure to understand which approach to which problem you are > > talking about here ? > > > > IMHO, the discussion is becoming a bit carried away. What I was > > suggesting is > > - being able to check whether a given data buffer is aligned to a > > given alignment (easy) > > - being able to request an aligned data buffer: requires aligned > > memory allocators, and some additions to the API for creating arrays. > > > > This all boils down to the following case: I have a C function which > > requires N bytes aligned data, I want the numpy API to provide this > > capability. I don't understand the discussion on doing it in python: > > > Well, what you want might be very easy to do in python, we just need to > check the default alignments for doubles and floats for some of the other > compilers, architectures, and OS's out there. On the other hand, you might > not be able to request a c malloc that is aligned in a portable way without > resorting to the same tricks as you do in python. So why not use python and > get the reference counting and garbage collection along with it? What we > want are doubles 8 byte aligned and floats 4 byte aligned. 
That seems to be > the case with gcc, linux, and the Intel architecture. The idea is to create > a slightly oversize array, then use a slice of the proper size that is 16 > byte aligned. > > Chuck > For instance, in the case of linux-x86 and linux-x86_64, the following should work: In [68]: def align16(n,dtype=float64) : ....: size = dtype().dtype.itemsize ....: over = 16/size ....: data = empty(n + over, dtype=dtype) ....: skip = (- data.ctypes.data % 16)/size ....: return data[skip:skip + n] Of course, now you need to fill in the data. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Aug 9 04:05:03 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 9 Aug 2007 02:05:03 -0600 Subject: [Numpy-discussion] numpy arrays, data allocation and SIMD alignement In-Reply-To: <46BAC7C6.8030004@ar.media.kyoto-u.ac.jp> References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp> <20070808095330.GO30988@mentat.za.net> <46BA853D.90901@ar.media.kyoto-u.ac.jp> <46BAC7C6.8030004@ar.media.kyoto-u.ac.jp> Message-ID: On 8/9/07, David Cournapeau wrote: > > Charles R Harris wrote: > > > > Well, what you want might be very easy to do in python, we just need > > to check the default alignments for doubles and floats for some of the > > other compilers, architectures, and OS's out there. On the other hand, > > you might not be able to request a c malloc that is aligned in a > > portable way without resorting to the same tricks as you do in python. > > So why not use python and get the reference counting and garbage > > collection along with it? > First, doing it in python means that I cannot use the facility from C > easily. But this is exactly where I need it, and where I would guess > most people need it. People want to interface numpy with the mkl ? They > will do it in C, right ? And maybe I am just too dumb to see the > problem, but I don't see the need for garbage collection and so on :) > Again, what is needed is: > - aligned allocator -> we can use the one from Steven Johnson, used > in fftw, which support more or less the same archs than numpy > - Refactor the array creation functions in C such as the > implementation takes one additional alignement argument, and the > original functions are kept identical to before > - Add a few utilities function to check whether it is SSE aligned, > arbitrary aligned, etc... > > The only non trivial point is 2 . Actually, when I first thought about > it, I thought about fixing alignement at compile time, which would have > made it totally avoidable: it would have been a simple change of the > definition of PyDataMem_New to an aligned malloc with a constant. I have > already the code for this, and besides aligned malloc code, it is like a > 5 lines change of numpy code, nothing terrible, really. Ah, you want it in C. Well, I think it would not be too difficult to change PyDataMem_New, however, the function signature would change and all the code that used it would break. That is pretty drastic. Better to define PyDataMem_New_Aligned, then redefine PyDataMem_New to use the new function. That way nothing breaks and you get the function you need. I don't think Travis would get upset if you added such a function and documented it. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From david at ar.media.kyoto-u.ac.jp Thu Aug 9 04:26:12 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 09 Aug 2007 17:26:12 +0900 Subject: [Numpy-discussion] numpy arrays, data allocation and SIMD alignement In-Reply-To: References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp> <20070808095330.GO30988@mentat.za.net> <46BA853D.90901@ar.media.kyoto-u.ac.jp> <46BAC7C6.8030004@ar.media.kyoto-u.ac.jp> Message-ID: <46BACFA4.5010707@ar.media.kyoto-u.ac.jp> Charles R Harris wrote: > > Ah, you want it in C. What would be the use to get SIMD aligned arrays in python ? David From charlesr.harris at gmail.com Thu Aug 9 04:58:24 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 9 Aug 2007 02:58:24 -0600 Subject: [Numpy-discussion] numpy arrays, data allocation and SIMD alignement In-Reply-To: <46BACFA4.5010707@ar.media.kyoto-u.ac.jp> References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp> <46BA853D.90901@ar.media.kyoto-u.ac.jp> <46BAC7C6.8030004@ar.media.kyoto-u.ac.jp> <46BACFA4.5010707@ar.media.kyoto-u.ac.jp> Message-ID: On 8/9/07, David Cournapeau wrote: > > Charles R Harris wrote: > > > > Ah, you want it in C. > What would be the use to get SIMD aligned arrays in python ? If I wanted a fairly specialized routine and didn't want to touch the guts of numpy, I would pass the aligned array to a C function and use the data pointer. The python code would be just a high level wrapper. You might even be able to use ctypes to pass the pointer into a library function. It's not necessary to code everything in C using the python C API. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Thu Aug 9 05:40:23 2007 From: stefan at sun.ac.za (Stefan van der Walt) Date: Thu, 9 Aug 2007 11:40:23 +0200 Subject: [Numpy-discussion] numpy arrays, data allocation and SIMD alignement In-Reply-To: <46BAC7C6.8030004@ar.media.kyoto-u.ac.jp> References: <46B7FC64.7000109@ar.media.kyoto-u.ac.jp> <20070808095330.GO30988@mentat.za.net> <46BA853D.90901@ar.media.kyoto-u.ac.jp> <46BAC7C6.8030004@ar.media.kyoto-u.ac.jp> Message-ID: <20070809094023.GI9452@mentat.za.net> On Thu, Aug 09, 2007 at 04:52:38PM +0900, David Cournapeau wrote: > Charles R Harris wrote: > > > > Well, what you want might be very easy to do in python, we just need > > to check the default alignments for doubles and floats for some of the > > other compilers, architectures, and OS's out there. On the other hand, > > you might not be able to request a c malloc that is aligned in a > > portable way without resorting to the same tricks as you do in python. > > So why not use python and get the reference counting and garbage > > collection along with it? > First, doing it in python means that I cannot use the facility from C > easily. But this is exactly where I need it, and where I would guess > most people need it. People want to interface numpy with the mkl ? They > will do it in C, right ? It doesn't really matter where the memory allocation occurs, does it? As far as I understand, the underlying fftw function has some flag to indicate when the data is aligned. If so, we could expose that flag in Python, and do something like x = align16(data) _fft(x, is_aligned=True) I am not intimately familiar with the fft wrappers, so maybe I'm missing something more fundamental. 
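Pulling the pieces of this thread together, here is a self-contained sketch of the over-allocate-and-slice helper: essentially the align16 function shown earlier, with the item size handled explicitly and Anne's aligned_empty name reused. It assumes, as the thread does, that the buffer numpy hands back is already aligned to at least the item size, so the shift works out to a whole number of items:

import numpy as np

def aligned_empty(n, dtype=np.float64, alignment=16):
    # allocate a few spare items, then return the view whose first
    # element falls on an `alignment`-byte boundary; the view keeps
    # the oversized buffer alive for as long as it is in use
    itemsize = np.dtype(dtype).itemsize
    extra = alignment // itemsize
    buf = np.empty(n + extra, dtype=dtype)
    skip = (-buf.ctypes.data % alignment) // itemsize
    return buf[skip:skip + n]

a = aligned_empty(1000)
print(a.ctypes.data % 16)   # 0, given the assumption above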
Cheers St?fan From millman at berkeley.edu Thu Aug 9 07:29:29 2007 From: millman at berkeley.edu (Jarrod Millman) Date: Thu, 9 Aug 2007 04:29:29 -0700 Subject: [Numpy-discussion] I am volunteering to be the release manager for NumPy 1.0.3.1 and SciPy 0.5.2 Message-ID: I volunteer to be the release manager for NumPy 1.0.3.1 and SciPy 0.5.3. In order to actually get them both released I will obviously need some help. But given the amount of work required and the number of people who have offered to help, I believe this will be doable. Given the extensive discussion about what is needed for these releases, I am fairly confident that I know what needs to be done. I will try to be very specific about what I will do and what I will need help with. Basically, I am just rewriting the plan described by Robert Kern last month. Please let me know if you have any suggestions/comments/problems with this plan and please let me know if you can commit to helping in any way. [[NOTE: I just (on Monday) hired 2 full-time programmers to work on the neuroimaging in python (NIPY) project, so they will be able to help out with bug fixing as well as testing the pre-releases on different platforms.]] Releasing NumPy 1.0.3.1 =================== On July 24th, Robert suggested making a numpy 1.0.3.1 point release. He was concerned that there were some changes in numpy.distutils that needed to cook a little longer. So I am offering to make a 1.0.3.1 release. If Travis or one of the other core NumPy developers want to make a 1.0.4 release in the next week or so, then there won't be a need for a 1.0.3.1 release. First, I will branch from the 1.0.3 tag: svn cp http://svn.scipy.org/svn/numpy/tags/1.0.3 http://svn.scipy.org/svn/numpy/branches/1.0.3 Second, I will apply all the patches necessary to build scipy from svn, but nothing else. Then I will just follow the NumPy release instructions: http://projects.scipy.org/scipy/numpy/wiki/MakingReleases I will make the tarball and source rpm; but will need help with everything else. Things will go faster if someone else can build the Windows binaries. If not, my new programmers and I will make the binaries. Finally, one of the sourceforge admins will need upload those files once we are done. (I am happy to be made an admin and upload the files myself, if it would be more convenient.) Releasing SciPy 0.5.3 ================= I will make a 0.5.3 scipy branch: svn cp http://svn.scipy.org/svn/scipy/trunk http://svn.scipy.org/svn/scipy/branches/0.5.3 >From then on normal development will continue on the trunk, but only bug fixes will be allowed on the branch. I will ask everyone to test the branch for at least 1 week depending on whether we get any bug reports. Once we are able to get the most serious bugs fixed, I will start working with everyone to build as many binaries as possible. I will rely on David Cournapeau and Andrew Straw to provide RPMs and DEBs. Again, things will go faster if someone else can build the Windows binaries. But if not, my new programmers and I will figure out how to make the binaries for Windows. We can also make the OS X binaries especially if Robert Kern is stilling willing to help. I will also draft a release announcement and give everyone time to comment on it. I will either need to get access to the sourceforge site and the PyPi records or someone will have to update them for me. Timeline ======= If this is agreeable to everyone, I will make the NumPy branch on Friday and apply the relevant patches. 
Then if I can get someone else to make the Windows executables and upload the files, we should be able to have a new NumPy release before the beginning of the SciPy conference. As for the 0.5.3 SciPy branch, we can discuss this in some detail if everyone is OK with the basic plan. In general, I hope that I will be able to have a 1.0.3.1 NumPy release before August 20th. Perhaps we could even make the 0.5.3 branch by the 20th. Fortunately, as David said earlier the main issue is getting a new release of NumPy out. Resources ======== As I mentioned I just hired 2 full-time programmers to work on NIPY who will be able to help me get the binaries built and tested for the different platforms. All 3 of us will be at the SciPy conference next week. So we will hopefully be able to solve whatever problems we run into very quickly given that it will be so easy to get help. Additionally, David Cournapeau has said that he is willing to help get a new release of SciPy out. He has already been busy at work squashing bugs. Sincerely, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From steve at shrogers.com Thu Aug 9 08:34:47 2007 From: steve at shrogers.com (Steven H. Rogers) Date: Thu, 09 Aug 2007 06:34:47 -0600 Subject: [Numpy-discussion] APL2007 Update Message-ID: <46BB09E7.7070208@shrogers.com> Attached is an updated announcement for APL2007: Arrays and Objects. 21-23 October 2007 Montreal, Canada APL = Array Programming Languages -------------- next part -------------- A non-text attachment was scrubbed... Name: APL2007Ann-2-1.pdf Type: application/pdf Size: 18084 bytes Desc: not available URL: From nwagner at iam.uni-stuttgart.de Thu Aug 9 09:47:04 2007 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Thu, 09 Aug 2007 15:47:04 +0200 Subject: [Numpy-discussion] Working with lists Message-ID: <46BB1AD8.2090009@iam.uni-stuttgart.de> Hi all, I have a list e.g. >>> bounds [(1950.0, 2100.0), (1800.0, 1850.0), (1600.0, 1630.0), (1400.0, 1420.0), (1200.0, 1210.0), (990, 1018.0), (10, 12), (12.0, 14.0), (14.0, 16.0), (16.0, 18.0), (18.0, 20)] How can I extract the first value of each pair given in parenthesis i.e. 1950,1800,1600,1400,... ? Nils From kwgoodman at gmail.com Thu Aug 9 09:53:00 2007 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu, 9 Aug 2007 15:53:00 +0200 Subject: [Numpy-discussion] Working with lists In-Reply-To: <46BB1AD8.2090009@iam.uni-stuttgart.de> References: <46BB1AD8.2090009@iam.uni-stuttgart.de> Message-ID: On 8/9/07, Nils Wagner wrote: > [(1950.0, 2100.0), (1800.0, 1850.0), (1600.0, 1630.0), (1400.0, 1420.0), > (1200.0, 1210.0), (990, 1018.0), (10, 12), (12.0, 14.0), (14.0, 16.0), > (16.0, 18.0), (18.0, 20)] > > How can I extract the first value of each pair given in parenthesis i.e. > 1950,1800,1600,1400,... ? Here's one way: [z[0] for z in bounds] From lorrmann at physik.uni-wuerzburg.de Thu Aug 9 09:54:21 2007 From: lorrmann at physik.uni-wuerzburg.de (volker) Date: Thu, 9 Aug 2007 13:54:21 +0000 (UTC) Subject: [Numpy-discussion] Working with lists References: <46BB1AD8.2090009@iam.uni-stuttgart.de> Message-ID: Nils Wagner iam.uni-stuttgart.de> writes: > > Hi all, > > I have a list e.g. > >>> bounds > [(1950.0, 2100.0), (1800.0, 1850.0), (1600.0, 1630.0), (1400.0, 1420.0), > (1200.0, 1210.0), (990, 1018.0), (10, 12), (12.0, 14.0), (14.0, 16.0), > (16.0, 18.0), (18.0, 20)] > > How can I extract the first value of each pair given in parenthesis i.e. > 1950,1800,1600,1400,... 
? > > Nils > Its easy i think: bounds_0 = (array(bounds)[:,0]).tolist() volker From gruben at bigpond.net.au Thu Aug 9 10:06:49 2007 From: gruben at bigpond.net.au (Gary Ruben) Date: Fri, 10 Aug 2007 00:06:49 +1000 Subject: [Numpy-discussion] Working with lists In-Reply-To: References: <46BB1AD8.2090009@iam.uni-stuttgart.de> Message-ID: <46BB1F79.2040204@bigpond.net.au> FWIW, The list comprehension is faster than using map() In [7]: %timeit map(lambda x:x[0],bounds) 10000 loops, best of 3: 49.6 -?s per loop In [8]: %timeit [x[0] for x in bounds] 10000 loops, best of 3: 20.8 -?s per loop Gary R. Keith Goodman wrote: > On 8/9/07, Nils Wagner wrote: >> [(1950.0, 2100.0), (1800.0, 1850.0), (1600.0, 1630.0), (1400.0, 1420.0), >> (1200.0, 1210.0), (990, 1018.0), (10, 12), (12.0, 14.0), (14.0, 16.0), >> (16.0, 18.0), (18.0, 20)] >> >> How can I extract the first value of each pair given in parenthesis i.e. >> 1950,1800,1600,1400,... ? > > Here's one way: > > [z[0] for z in bounds] From cournape at gmail.com Thu Aug 9 11:00:21 2007 From: cournape at gmail.com (David Cournapeau) Date: Fri, 10 Aug 2007 00:00:21 +0900 Subject: [Numpy-discussion] Where to put misc C function in numpy ? Message-ID: <5b8d13220708090800y1993abd4j74ddc64618b17873@mail.gmail.com> Hi, Following the thread on facilities for SIMD friendly allocations, I have a basic private branch ready for review, but I have one problem: where to put the allocation functions ? The problem is the following: data buffers are allocated/deallocated with functions defined in ndarrayobject,h PyMemData_NEW(ptr) malloc(ptr) ... Which I would replace with PyMemData_NEW(ptr) npy_aligned_alloc(ptr, DEF_ALIGNMENT) Where to define npy_aligned_alloc ? As PyMemData_NEW is used outside numpy.core, the function needs to be available somewhat "publically", but as far as I understand numpy code structure, there is no such facility available (eg a pure C library, totally unaware of python, which would contain some useful tools for numpy), right ? David From cournape at gmail.com Thu Aug 9 11:03:31 2007 From: cournape at gmail.com (David Cournapeau) Date: Fri, 10 Aug 2007 00:03:31 +0900 Subject: [Numpy-discussion] numpy arrays, data allocation and SIMD alignement In-Reply-To: References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp> <46BA853D.90901@ar.media.kyoto-u.ac.jp> <46BAC7C6.8030004@ar.media.kyoto-u.ac.jp> <46BACFA4.5010707@ar.media.kyoto-u.ac.jp> Message-ID: <5b8d13220708090803g7c062b67he6735aa9f2d43d89@mail.gmail.com> On 8/9/07, Charles R Harris wrote: > > > On 8/9/07, David Cournapeau wrote: > > Charles R Harris wrote: > > > > > > Ah, you want it in C. > > What would be the use to get SIMD aligned arrays in python ? > > If I wanted a fairly specialized routine and didn't want to touch the guts > of numpy, I would pass the aligned array to a C function and use the data > pointer. The python code would be just a high level wrapper. You might even > be able to use ctypes to pass the pointer into a library function. It's not > necessary to code everything in C using the python C API. I certainly do not argue on this point. But if it was specialized, there would be no point putting in in numpy in the first place. What I hope is that at some point, the aligned allocators can be used inside core numpy to optimize things internally (ufunc, etc...). Those facilities would be really useful for many optimized libraries, which are all C: as such, doing it in C makes sense, no ? 
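To collect the suggestions from the "Working with lists" thread in one runnable snippet (list comprehension, zip-unpacking and numpy indexing all extract the first element of each pair; the bounds list below is a shortened copy of the original):

import numpy as np

bounds = [(1950.0, 2100.0), (1800.0, 1850.0), (1600.0, 1630.0),
          (1400.0, 1420.0), (1200.0, 1210.0), (990, 1018.0)]

firsts_listcomp = [b[0] for b in bounds]    # list of floats
firsts_zip = list(zip(*bounds))[0]          # tuple of first elements
firsts_array = np.array(bounds)[:, 0]       # 1-d float array

print(firsts_listcomp[:4])   # [1950.0, 1800.0, 1600.0, 1400.0]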
David From cournape at gmail.com Thu Aug 9 11:14:05 2007 From: cournape at gmail.com (David Cournapeau) Date: Fri, 10 Aug 2007 00:14:05 +0900 Subject: [Numpy-discussion] numpy arrays, data allocation and SIMD alignement In-Reply-To: <20070809094023.GI9452@mentat.za.net> References: <46B7FC64.7000109@ar.media.kyoto-u.ac.jp> <20070808095330.GO30988@mentat.za.net> <46BA853D.90901@ar.media.kyoto-u.ac.jp> <46BAC7C6.8030004@ar.media.kyoto-u.ac.jp> <20070809094023.GI9452@mentat.za.net> Message-ID: <5b8d13220708090814k74a47c09h867b68d9b4fab19f@mail.gmail.com> On 8/9/07, Stefan van der Walt wrote: > > It doesn't really matter where the memory allocation occurs, does it? > As far as I understand, the underlying fftw function has some flag to > indicate when the data is aligned. If so, we could expose that flag > in Python, and do something like > > x = align16(data) > _fft(x, is_aligned=True) > > I am not intimately familiar with the fft wrappers, so maybe I'm > missing something more fundamental. You can do that, but this is only a special case of what I have in mind. For example, what if you want to call functions which are relatively cheap, but called many times, and want an aligned array ? Going back and forth would be a huge waste. Also, having aligned buffers internally (in C_, even for non array data, can be useful (eg filters, and maybe even core numpy functionalities like ufunc, etc...). Another point I forgot to mention before is that we can define a default alignment which would already be SIMD friendly (as done on Mac OS X or FreeBSD by default malloc) for *all* numpy arrays at 0 cost: for fft, this means that most arrays would already by as wanted, meaning a huge boost of performances for free. Basically, the functionalities would be more usable in C, without too much constraint, because frankly, the implementation is not difficult: I have something almost ready, and the patch is 7kb, including code to detect platform dependent aligned allocator. The C code can be tested really easily (since it is independent of python). David From kwgoodman at gmail.com Thu Aug 9 12:28:42 2007 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu, 9 Aug 2007 18:28:42 +0200 Subject: [Numpy-discussion] Working with lists In-Reply-To: <46BB1F79.2040204@bigpond.net.au> References: <46BB1AD8.2090009@iam.uni-stuttgart.de> <46BB1F79.2040204@bigpond.net.au> Message-ID: On 8/9/07, Gary Ruben wrote: > FWIW, > The list comprehension is faster than using map() > > In [7]: %timeit map(lambda x:x[0],bounds) > 10000 loops, best of 3: 49.6 -?s per loop > > In [8]: %timeit [x[0] for x in bounds] > 10000 loops, best of 3: 20.8 -?s per loop zip is even faster on my computer: >> timeit map(lambda x:x[0], bounds) 100000 loops, best of 3: 5.48 ?s per loop >> timeit [x[0] for x in bounds] 100000 loops, best of 3: 2.69 ?s per loop >> timeit a, b = zip(*bounds) 100000 loops, best of 3: 2.57 ?s per loop From Chris.Barker at noaa.gov Thu Aug 9 12:58:26 2007 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Thu, 09 Aug 2007 09:58:26 -0700 Subject: [Numpy-discussion] I am volunteering to be the release manager for NumPy 1.0.3.1 and SciPy 0.5.2 In-Reply-To: References: Message-ID: <46BB47B2.20601@noaa.gov> Jarrod Millman wrote: > I volunteer to be the release manager for NumPy 1.0.3.1 and SciPy > 0.5.3. Wonderful! Thanks. > Releasing SciPy 0.5.3 > We can also make the OS X > binaries especially if Robert Kern is stilling willing to help. What form will these take? 
It would be great if we could have Universal binaries, with no dependencies (other than Python and numpy, of course) -- I think that is now possible, Robert would certainly know. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From paul at rudin.co.uk Thu Aug 9 13:14:41 2007 From: paul at rudin.co.uk (Paul Rudin) Date: Thu, 09 Aug 2007 18:14:41 +0100 Subject: [Numpy-discussion] Working with lists References: <46BB1AD8.2090009@iam.uni-stuttgart.de> <46BB1F79.2040204@bigpond.net.au> Message-ID: <87sl6s8wam.fsf@rudin.co.uk> "Keith Goodman" writes: > On 8/9/07, Gary Ruben wrote: >> FWIW, >> The list comprehension is faster than using map() >> >> In [7]: %timeit map(lambda x:x[0],bounds) >> 10000 loops, best of 3: 49.6 -?s per loop >> >> In [8]: %timeit [x[0] for x in bounds] >> 10000 loops, best of 3: 20.8 -?s per loop > > zip is even faster on my computer: > >>> timeit map(lambda x:x[0], bounds) > 100000 loops, best of 3: 5.48 ?s per loop >>> timeit [x[0] for x in bounds] > 100000 loops, best of 3: 2.69 ?s per loop >>> timeit a, b = zip(*bounds) > 100000 loops, best of 3: 2.57 ?s per loop itertools.izip is faster yet on mine. From robert.kern at gmail.com Thu Aug 9 14:07:36 2007 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 09 Aug 2007 13:07:36 -0500 Subject: [Numpy-discussion] I am volunteering to be the release manager for NumPy 1.0.3.1 and SciPy 0.5.2 In-Reply-To: <46BB47B2.20601@noaa.gov> References: <46BB47B2.20601@noaa.gov> Message-ID: <46BB57E8.7060100@gmail.com> Christopher Barker wrote: > > Jarrod Millman wrote: >> I volunteer to be the release manager for NumPy 1.0.3.1 and SciPy >> 0.5.3. > > Wonderful! Thanks. > >> Releasing SciPy 0.5.3 >> We can also make the OS X >> binaries especially if Robert Kern is stilling willing to help. > > What form will these take? It would be great if we could have Universal > binaries, with no dependencies (other than Python and numpy, of course) > -- I think that is now possible, Robert would certainly know. Yes. http://mail.python.org/pipermail/pythonmac-sig/2007-June/018986.html -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From efiring at hawaii.edu Thu Aug 9 14:12:00 2007 From: efiring at hawaii.edu (Eric Firing) Date: Thu, 09 Aug 2007 08:12:00 -1000 Subject: [Numpy-discussion] rant against from numpy import * / from pylab import * In-Reply-To: References: <45FA4377.6010201@hawaii.edu> Message-ID: <46BB58F0.60507@hawaii.edu> Sebastian, I am trying to move things in the direction of simpler and cleaner namespaces, but I think that to do it well requires a systematic approach to the continuing numpification of mpl, so I have been working on mlab.py before tackling pylab. I hope everything can be done via reorganization, without requiring any import tricks, but that remains to be seen. I'm sorry this is taking a long time, but it is in the works. Eric Sebastian Haase wrote: > Hi all, > Here a quick update: > I'm trying to have a concise / sparse module with containing only > pylab-specific names and not all names I already have in numpy. > To easy typing I want to call numpy "N" and my pylab "P". 
> > I'm now using this code: > > import matplotlib, new > matplotlib.use('WXAgg') > from matplotlib import pylab > P = new.module("pylab_sparse","""pylab module minus stuff alreay > in numpy""") > for k,v in pylab.__dict__.iteritems(): > try: > if v is N.__dict__[k]: > continue > except KeyError: > pass > P.__dict__[k] = v > > P.ion() > del matplotlib, new, pylab > > > The result is "some" reduction in the number of non-pylab-specific > names in my "P"-module. However there seem to be still many extra > names left, like e.g.: > alltrue, amax, array, ... > look at this: > # 20070802 > # >>> len(dir(pylab)) > # 441 > # >>> len(dir(P)) > # 346 > # >>> P.nx.numpy.__version__ > # '1.0.1' > # >>> N.__version__ > # '1.0.1' > # >>> N.alltrue > # > # >>> P.alltrue > # > # >>> N.alltrue.__doc__ > # 'Perform a logical_and over the given axis.' > # >>> P.alltrue.__doc__ > # >>> #N.alltrue(x, axis=None, out=None) > # >>> #P.alltrue(x, axis=0) > > I'm using matplotlib with > __version__ = '0.90.0' > __revision__ = '$Revision: 3003 $' > __date__ = '$Date: 2007-02-06 22:24:06 -0500 (Tue, 06 Feb 2007) $' > > > Any hint how to further reduce the number of names in "P" ? > My ideal would be that the "P" module (short for pylab) would only > contain the stuff described in the __doc__ strings of `pylab.py` and > `__init__.py`(in matplotlib) (+ plus some extra, undocumented, yet > pylab specific things) > > Thanks > -Sebastian > > > On 3/16/07, Eric Firing wrote: >> Sebastian Haase wrote: >>> Hi! >>> I use the wxPython PyShell. >>> I like especially the feature that when typing a module and then the >>> dot "." I get a popup list of all available functions (names) inside >>> that module. >>> >>> Secondly, I think it really makes code clearer when one can see where >>> a function comes from. >>> >>> I have a default >>> import numpy as N >>> executed before my shell even starts. >>> In fact I have a bunch of my "standard" modules imported as >> single capital letter>. >>> >>> This - I think - is a good compromise to the commonly used "extra >>> typing" and "unreadable" argument. >>> >>> a = sin(b) * arange(10,50, .1) * cos(d) >>> vs. >>> a = N.sin(b) * N.arange(10,50, .1) * N.cos(d) >> I generally do the latter, but really, all those "N." bits are still >> visual noise when it comes to reading the code--that is, seeing the >> algorithm rather than where the functions come from. I don't think >> there is anything wrong with explicitly importing commonly-used names, >> especially things like sin and cos. >> >>> I would like to hear some comments by others. >>> >>> >>> On a different note: I just started using pylab, so I did added an >>> automatic "from matplotlib import pylab as P" -- but now P contains >>> everything that I already have in N. It makes it really hard to >>> *find* (as in *see* n the popup-list) the pylab-only functions. -- >>> what can I do about this ? >> A quick and dirty solution would be to comment out most of the imports >> in pylab.py; they are not needed for the pylab functions and are there >> only to give people lots of functionality in a single namespace. >> >> I am cross-posting this to matplotlib-users because it involves mpl, and >> an alternative solution would be for us to add an rcParam entry to allow >> one to turn off all of the namespace consolidation. A danger is that if >> someone is using "from pylab import *" in a script, then whether it >> would run would depend on the matplotlibrc file. 
To get around that, >> another possibility would be to break pylab.py into two parts, with >> pylab.py continuing to do the namespace consolidation and importing the >> second part, which would contain the actual pylab functions. Then if >> you don't want the namespace consolidation, you could simply import the >> second part instead of pylab. There may be devils in the details, but >> it seems to me that this last alternative--splitting pylab.py--might >> make a number of people happier while having no adverse effects on >> everyone else. >> >> Eric >>> >>> Thanks, >>> Sebastian > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion From robert.kern at gmail.com Thu Aug 9 17:07:02 2007 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 09 Aug 2007 16:07:02 -0500 Subject: [Numpy-discussion] I am volunteering to be the release manager for NumPy 1.0.3.1 and SciPy 0.5.2 In-Reply-To: <46BB47B2.20601@noaa.gov> References: <46BB47B2.20601@noaa.gov> Message-ID: <46BB81F6.2060100@gmail.com> Christopher Barker wrote: > > Jarrod Millman wrote: >> I volunteer to be the release manager for NumPy 1.0.3.1 and SciPy >> 0.5.3. > > Wonderful! Thanks. > >> Releasing SciPy 0.5.3 >> We can also make the OS X >> binaries especially if Robert Kern is stilling willing to help. > > What form will these take? It would be great if we could have Universal > binaries, with no dependencies (other than Python and numpy, of course) > -- I think that is now possible, Robert would certainly know. Whoops, the email I referenced is missing an important bit: use this gfortran binary instead of the one from hpc.sf.net. http://r.research.att.com/tools/ -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From david at ar.media.kyoto-u.ac.jp Thu Aug 9 22:42:05 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Fri, 10 Aug 2007 11:42:05 +0900 Subject: [Numpy-discussion] I am volunteering to be the release manager for NumPy 1.0.3.1 and SciPy 0.5.2 In-Reply-To: References: Message-ID: <46BBD07D.4000502@ar.media.kyoto-u.ac.jp> Jarrod Millman wrote: > I volunteer to be the release manager for NumPy 1.0.3.1 and SciPy > 0.5.3. In order to actually get them both released I will obviously > need some help. But given the amount of work required and the number > of people who have offered to help, I believe this will be doable. > > Given the extensive discussion about what is needed for these > releases, I am fairly confident that I know what needs to be done. I > will try to be very specific about what I will do and what I will need > help with. Basically, I am just rewriting the plan described by > Robert Kern last month. Please let me know if you have any > suggestions/comments/problems with this plan and please let me know if > you can commit to helping in any way. > > [[NOTE: I just (on Monday) hired 2 full-time programmers to work on > the neuroimaging in python (NIPY) project, so they will be able to > help out with bug fixing as well as testing the pre-releases on > different platforms.]] > > Releasing NumPy 1.0.3.1 > =================== > On July 24th, Robert suggested making a numpy 1.0.3.1 point release. > He was concerned that there were some changes in numpy.distutils that > needed to cook a little longer. 
So I am offering to make a 1.0.3.1 > release. If Travis or one of the other core NumPy developers want to > make a 1.0.4 release in the next week or so, then there won't be a > need for a 1.0.3.1 release. > > First, I will branch from the 1.0.3 tag: > svn cp http://svn.scipy.org/svn/numpy/tags/1.0.3 > http://svn.scipy.org/svn/numpy/branches/1.0.3 > > Second, I will apply all the patches necessary to build scipy from > svn, but nothing else. Then I will just follow the NumPy release > instructions: http://projects.scipy.org/scipy/numpy/wiki/MakingReleases > I will make the tarball and source rpm; but will need help with > everything else. Things will go faster if someone else can build the > Windows binaries. For windows, I understand the main problem is ATLAS, right ? I have discussed a bit the issue with Clint Whaley (the main developer of ATLAS), and I think I got a way to build ATLAS without using SSE (which caused trouble for some "old" ATHLON last time, AFAIK). I can provide the informations to you; I would just need someone to test the binaries on a non SSE machine, since I don't have any myself. cheers, David From javier.maria.torres at ericsson.com Fri Aug 10 04:01:27 2007 From: javier.maria.torres at ericsson.com (Javier Maria Torres (MI/EEM)) Date: Fri, 10 Aug 2007 10:01:27 +0200 Subject: [Numpy-discussion] Fail to compile Numpy on Cygwin Message-ID: <262621D9AF021741B219F859256CDB7D012D19A2@eesmdmw020.eemea.ericsson.se> Hi, I get the following output when trying to compile the latest Numpy SVN snapshot on Cygwin (gcc 3.4.4) and Python (Cygwin-installed, 2.5.1; I also have the Windows version installed, this might cause problems?). I also include the (meager) site.cfg used. I would appreciate any comment. Thanks a lot, and greetings, Javier Torres -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: output.txt URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: site.cfg Type: application/octet-stream Size: 84 bytes Desc: site.cfg URL: From pearu at cens.ioc.ee Fri Aug 10 04:23:11 2007 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Fri, 10 Aug 2007 11:23:11 +0300 (EEST) Subject: [Numpy-discussion] Fail to compile Numpy on Cygwin In-Reply-To: <262621D9AF021741B219F859256CDB7D012D19A2@eesmdmw020.eemea.ericsson.se > References: <262621D9AF021741B219F859256CDB7D012D19A2@eesmdmw020.eemea.ericsson.se> Message-ID: <57874.129.240.228.53.1186734191.squirrel@cens.ioc.ee> On Fri, August 10, 2007 11:01 am, Javier Maria Torres (MI/EEM) wrote: > Hi, > > I get the following output when trying to compile the latest Numpy SVN > snapshot on Cygwin (gcc 3.4.4) and Python (Cygwin-installed, 2.5.1; I > also have the Windows version installed, this might cause problems?). I > also include the (meager) site.cfg used. I would appreciate any comment. In your build command python setup.py config --compiler=mingw32 build --compiler=mingw32 install you are not using cygwin gcc compiler but mingw32. I think you cannot do this - don't ask why, some time ago I failed to determine the cause. Anyway, under cygwin just try python setup.py build then it should just pick up cygwin compiler. Or, execute python setup.py config --compiler=mingw32 build --compiler=mingw32 install from Windows cmd line. 
HTH, Pearu From javier.maria.torres at ericsson.com Fri Aug 10 04:27:59 2007 From: javier.maria.torres at ericsson.com (Javier Maria Torres (MI/EEM)) Date: Fri, 10 Aug 2007 10:27:59 +0200 Subject: [Numpy-discussion] Fail to compile Numpy on Cygwin In-Reply-To: <57874.129.240.228.53.1186734191.squirrel@cens.ioc.ee> References: <262621D9AF021741B219F859256CDB7D012D19A2@eesmdmw020.eemea.ericsson.se> <57874.129.240.228.53.1186734191.squirrel@cens.ioc.ee> Message-ID: <262621D9AF021741B219F859256CDB7D012D19A3@eesmdmw020.eemea.ericsson.se> Hi Pearu, Using just "python setup.py build" I get the following change in the same error: ... compile options: '-I/usr/local/include/python2.5 -Inumpy/core/src -Inumpy/core/include -I/usr/local/include/python2.5 -c' gcc: _configtest.c gcc _configtest.o -L/usr/local/lib -L/usr/lib -o _configtest.exe /usr/lib/gcc/i686-pc-cygwin/3.4.4/../../../../i686-pc-cygwin/bin/ld: crt0.o: No such file: No such file or directory collect2: ld returned 1 exit status /usr/lib/gcc/i686-pc-cygwin/3.4.4/../../../../i686-pc-cygwin/bin/ld: crt0.o: No such file: No such file or directory collect2: ld returned 1 exit status failure. removing: _configtest.c _configtest.o Traceback (most recent call last): ... I completely remove the build directory between builds, just in case this helps. Thanks a lot, and greetings, Javier -----Original Message----- From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Pearu Peterson Sent: viernes, 10 de agosto de 2007 10:23 To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] Fail to compile Numpy on Cygwin On Fri, August 10, 2007 11:01 am, Javier Maria Torres (MI/EEM) wrote: > Hi, > > I get the following output when trying to compile the latest Numpy SVN > snapshot on Cygwin (gcc 3.4.4) and Python (Cygwin-installed, 2.5.1; I > also have the Windows version installed, this might cause problems?). > I also include the (meager) site.cfg used. I would appreciate any comment. In your build command python setup.py config --compiler=mingw32 build --compiler=mingw32 install you are not using cygwin gcc compiler but mingw32. I think you cannot do this - don't ask why, some time ago I failed to determine the cause. Anyway, under cygwin just try python setup.py build then it should just pick up cygwin compiler. Or, execute python setup.py config --compiler=mingw32 build --compiler=mingw32 install from Windows cmd line. HTH, Pearu _______________________________________________ Numpy-discussion mailing list Numpy-discussion at scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion From pearu at cens.ioc.ee Fri Aug 10 04:40:24 2007 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Fri, 10 Aug 2007 11:40:24 +0300 (EEST) Subject: [Numpy-discussion] Fail to compile Numpy on Cygwin In-Reply-To: <262621D9AF021741B219F859256CDB7D012D19A3@eesmdmw020.eemea.ericsson.se > References: <262621D9AF021741B219F859256CDB7D012D19A2@eesmdmw020.eemea.ericsson.se> <57874.129.240.228.53.1186734191.squirrel@cens.ioc.ee> <262621D9AF021741B219F859256CDB7D012D19A3@eesmdmw020.eemea.ericsson.se> Message-ID: <51065.129.240.228.53.1186735224.squirrel@cens.ioc.ee> On Fri, August 10, 2007 11:27 am, Javier Maria Torres (MI/EEM) wrote: > Hi Pearu, > > Using just "python setup.py build" I get the following change in the > same error: > > ... 
> compile options: '-I/usr/local/include/python2.5 -Inumpy/core/src > -Inumpy/core/include -I/usr/local/include/python2.5 -c' > gcc: _configtest.c > gcc _configtest.o -L/usr/local/lib -L/usr/lib -o _configtest.exe > /usr/lib/gcc/i686-pc-cygwin/3.4.4/../../../../i686-pc-cygwin/bin/ld: > crt0.o: No such file: No such file or directory > collect2: ld returned 1 exit status > /usr/lib/gcc/i686-pc-cygwin/3.4.4/../../../../i686-pc-cygwin/bin/ld: > crt0.o: No such file: No such file or directory > collect2: ld returned 1 exit status > failure. Check were is the crt0.o file in your system, if it exists then your cygwin environment is not set up properly, I guess. If not, you might need to install compiler development libraries to cygwin system. Pearu From millman at berkeley.edu Fri Aug 10 04:45:07 2007 From: millman at berkeley.edu (Jarrod Millman) Date: Fri, 10 Aug 2007 01:45:07 -0700 Subject: [Numpy-discussion] NumPy-1.0.3.x Message-ID: Hello everyone, I made a mumpy-1.0.3.x branch from the 1.0.3 tag and tried to get everything working (see changesets 3957-3961). I added back get_path to numpy/distutils/misc_util.py, which is used by Lib/odr/setup.py in scipy 0.5.2. I also tried to clean up a few issues by doing the same thing that was done to the trunk in: http://projects.scipy.org/scipy/numpy/changeset/3845 http://projects.scipy.org/scipy/numpy/changeset/3848 I am still seeing 2 problems: 1) http://projects.scipy.org/scipy/numpy/ticket/535 2) when I run scipy.test(1,10), I get: check_cosine_weighted_infinite (scipy.integrate.tests.test_quadpack.test_quad)Illegal instruction If anyone has any ideas as to what is wrong, please let me know. Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From ryanlists at gmail.com Fri Aug 10 09:47:56 2007 From: ryanlists at gmail.com (Ryan Krauss) Date: Fri, 10 Aug 2007 08:47:56 -0500 Subject: [Numpy-discussion] I am volunteering to be the release manager for NumPy 1.0.3.1 and SciPy 0.5.2 In-Reply-To: <46BBD07D.4000502@ar.media.kyoto-u.ac.jp> References: <46BBD07D.4000502@ar.media.kyoto-u.ac.jp> Message-ID: I have access to one non-SSE (or at least non-SSE2) machine that I can test on. I sort of championed this cause the last time this came up out of fear that my students would have these problems. No one did. So, I don't know how many non-SSE machines are really out there. This may not be a big problem. If we can support non-SSE machines without too much trouble or create one windows binary that works for "everyone" without performance loss, great. I am still willing to test. Ryan On 8/9/07, David Cournapeau wrote: > Jarrod Millman wrote: > > I volunteer to be the release manager for NumPy 1.0.3.1 and SciPy > > 0.5.3. In order to actually get them both released I will obviously > > need some help. But given the amount of work required and the number > > of people who have offered to help, I believe this will be doable. > > > > Given the extensive discussion about what is needed for these > > releases, I am fairly confident that I know what needs to be done. I > > will try to be very specific about what I will do and what I will need > > help with. Basically, I am just rewriting the plan described by > > Robert Kern last month. Please let me know if you have any > > suggestions/comments/problems with this plan and please let me know if > > you can commit to helping in any way. 
> > > > [[NOTE: I just (on Monday) hired 2 full-time programmers to work on > > the neuroimaging in python (NIPY) project, so they will be able to > > help out with bug fixing as well as testing the pre-releases on > > different platforms.]] > > > > Releasing NumPy 1.0.3.1 > > =================== > > On July 24th, Robert suggested making a numpy 1.0.3.1 point release. > > He was concerned that there were some changes in numpy.distutils that > > needed to cook a little longer. So I am offering to make a 1.0.3.1 > > release. If Travis or one of the other core NumPy developers want to > > make a 1.0.4 release in the next week or so, then there won't be a > > need for a 1.0.3.1 release. > > > > First, I will branch from the 1.0.3 tag: > > svn cp http://svn.scipy.org/svn/numpy/tags/1.0.3 > > http://svn.scipy.org/svn/numpy/branches/1.0.3 > > > > Second, I will apply all the patches necessary to build scipy from > > svn, but nothing else. Then I will just follow the NumPy release > > instructions: http://projects.scipy.org/scipy/numpy/wiki/MakingReleases > > I will make the tarball and source rpm; but will need help with > > everything else. Things will go faster if someone else can build the > > Windows binaries. > For windows, I understand the main problem is ATLAS, right ? I have > discussed a bit the issue with Clint Whaley (the main developer of > ATLAS), and I think I got a way to build ATLAS without using SSE (which > caused trouble for some "old" ATHLON last time, AFAIK). I can provide > the informations to you; I would just need someone to test the binaries > on a non SSE machine, since I don't have any myself. > > cheers, > > David > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From Glen.Mabey at swri.org Fri Aug 10 12:20:16 2007 From: Glen.Mabey at swri.org (Glen W. Mabey) Date: Fri, 10 Aug 2007 11:20:16 -0500 Subject: [Numpy-discussion] .transpose() of memmap array fails to close() In-Reply-To: <20070607214620.GM6116@bams.ccf.swri.edu> References: <20070607214620.GM6116@bams.ccf.swri.edu> Message-ID: <20070810162016.GA12992@bams.ccf.swri.edu> Hello, I posted this a while back and didn't get any replies. I'm running in to this issue again from a different aspect, and today I've been trying to figure out which method of ndarray needs to be overloaded for memmap so that the the ._mmap attribute gets handled appropriately. But, I have not been able to figure out what methods of ndarray are getting used in code such as this: >>> import numpy >>> amemmap = numpy.memmap( '/tmp/afile', dtype=numpy.float32, >>> shape=(4,5), mode='w+' ) >>> b = amemmap[2:3] >>> b >>> Exception exceptions.AttributeError: "'memmap' object has no attribute '_mmap'" in ignored memmap([[ 0., 0., 0., 0., 0.]], dtype=float32) Furthermore, can anyone enlighten me as to why an AttributeError exception would be ignored? Am I using numpy.memmap instances appropriately? Thank you, Glen Mabey On Thu, Jun 07, 2007 at 04:46:20PM -0500, Glen W. 
Mabey wrote: > Hello, > > When assigning a variable that is the transpose() of a memmap array, the > ._mmap member doesn't get copied, I guess: > > In [1]:import numpy > > In [2]:amemmap = numpy.memmap( '/tmp/afile', dtype=numpy.float32, shape=(4,5), mode='w+' ) > > In [3]:bmemmap = amemmap.transpose() > > In [4]:bmemmap.close() > --------------------------------------------------------------------------- > Traceback (most recent call last) > > /home/gmabey/src/R9619_dev_acqlibweb/Projects/R9619_NChannelDetection/NED/ in () > > /usr/local/stow/numpy-20070605_svn-py2.5/lib/python2.5/site-packages/numpy/core/memmap.py > in close(self) > 86 > 87 def close(self): > ---> 88 self._mmap.close() > 89 > 90 def __del__(self): > > : 'NoneType' object has no attribute 'close' > > /usr/local/stow/numpy-20070605_svn-py2.5/lib/python2.5/site-packages/numpy/core/memmap.py(88)close() > 87 def close(self): > ---> 88 self._mmap.close() > 89 > > > > > This is an issue when the data is accessed in an order that is different > from how it is stored on disk, as: > > bmemmap = numpy.memmap( '/tmp/afile', dtype=numpy.float32, shape=(4,5), mode='w+' ).transpose() > > So the object that was originally produced not accessible. I imagine > there is some better way to indicate order of dimensions, but > regardless, doing > > In [4]:bmemmap._mmap = amemmap._mmap > > is a hack workaround. > > Best regards, > Glen Mabey > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion From Glen.Mabey at swri.org Fri Aug 10 12:30:05 2007 From: Glen.Mabey at swri.org (Glen W. Mabey) Date: Fri, 10 Aug 2007 11:30:05 -0500 Subject: [Numpy-discussion] .transpose() of memmap array fails to close() In-Reply-To: <20070810162016.GA12992@bams.ccf.swri.edu> References: <20070607214620.GM6116@bams.ccf.swri.edu> <20070810162016.GA12992@bams.ccf.swri.edu> Message-ID: <20070810163005.GB13557@bams.ccf.swri.edu> On Fri, Aug 10, 2007 at 11:20:16AM -0500, Glen W. Mabey wrote: > I posted this a while back and didn't get any replies. I'm running in > to this issue again from a different aspect, and today I've been trying > to figure out which method of ndarray needs to be overloaded for memmap > so that the the ._mmap attribute gets handled appropriately. Oh, and Python 2.5.1 numpy svn as of yesterday ... AMD opteron, Linux/Debian Glen From robert.kern at gmail.com Fri Aug 10 14:26:10 2007 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 10 Aug 2007 13:26:10 -0500 Subject: [Numpy-discussion] NumPy-1.0.3.x In-Reply-To: References: Message-ID: <46BCADC2.30104@gmail.com> Jarrod Millman wrote: > 2) when I run scipy.test(1,10), I get: > check_cosine_weighted_infinite > (scipy.integrate.tests.test_quadpack.test_quad)Illegal instruction > > If anyone has any ideas as to what is wrong, please let me know. What platform are you on and what underlying libraries (ATLAS, etc.) did you compile with? "Illegal instruction" usually comes from using an ATLAS library that was compiled with a higher level of SSE than your CPU supports. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From Glen.Mabey at swri.org Fri Aug 10 17:14:38 2007 From: Glen.Mabey at swri.org (Glen W. 
Mabey) Date: Fri, 10 Aug 2007 16:14:38 -0500 Subject: [Numpy-discussion] .transpose() of memmap array fails to close() In-Reply-To: <20070810162016.GA12992@bams.ccf.swri.edu> References: <20070607214620.GM6116@bams.ccf.swri.edu> <20070810162016.GA12992@bams.ccf.swri.edu> Message-ID: <20070810211438.GF13557@bams.ccf.swri.edu> [I keep posting hoping that someone knowledgeable in these things will take notice ...] Just a couple of more notes regarding this numpy.memmap issue. It seems that any slice of a numpy.memmap that is greater than 1-d has a similar problem. In [1]:import numpy In [2]:amemmap = numpy.memmap( '/tmp/afile', dtype=numpy.float32, shape=(4,5), mode='w+' ) In [3]:amemmap[1,3:4] Out[3]:memmap([ 0.], dtype=float32) In [4]:amemmap[0:1,3:4] Exception exceptions.AttributeError: "'memmap' object has no attribute '_mmap'" in ignored Out[4]:memmap([[ 0.]], dtype=float32) A very naive hack-fix of overloading the __getitem__ method of the numpy.memmap class such that the result of ndarray.__getitem__ gets the ._mmap attribute added didn't work ... I tried to follow the program flow into the bowels of multiarraymodule.c, but that was beyond me ... This problem started showing up when I changed to python 2.5 and persists in 2.5.1. I've considered switching back to 2.4 but I really need 64-bit array indexing ... Best Regards, Glen Mabey On Fri, Aug 10, 2007 at 11:20:16AM -0500, Glen W. Mabey wrote: > Hello, > > I posted this a while back and didn't get any replies. I'm running in > to this issue again from a different aspect, and today I've been trying > to figure out which method of ndarray needs to be overloaded for memmap > so that the the ._mmap attribute gets handled appropriately. > > But, I have not been able to figure out what methods of ndarray are > getting used in code such as this: > > >>> import numpy > >>> amemmap = numpy.memmap( '/tmp/afile', dtype=numpy.float32, > >>> shape=(4,5), mode='w+' ) > >>> b = amemmap[2:3] > >>> b > >>> Exception exceptions.AttributeError: "'memmap' object has no attribute '_mmap'" in ignored memmap([[ 0., 0., 0., 0., 0.]], dtype=float32) > > > Furthermore, can anyone enlighten me as to why an AttributeError > exception would be ignored? > > Am I using numpy.memmap instances appropriately? > > Thank you, > Glen Mabey > > > > > On Thu, Jun 07, 2007 at 04:46:20PM -0500, Glen W. 
Mabey wrote: > > Hello, > > > > When assigning a variable that is the transpose() of a memmap array, the > > ._mmap member doesn't get copied, I guess: > > > > In [1]:import numpy > > > > In [2]:amemmap = numpy.memmap( '/tmp/afile', dtype=numpy.float32, shape=(4,5), mode='w+' ) > > > > In [3]:bmemmap = amemmap.transpose() > > > > In [4]:bmemmap.close() > > --------------------------------------------------------------------------- > > Traceback (most recent call last) > > > > /home/gmabey/src/R9619_dev_acqlibweb/Projects/R9619_NChannelDetection/NED/ in () > > > > /usr/local/stow/numpy-20070605_svn-py2.5/lib/python2.5/site-packages/numpy/core/memmap.py > > in close(self) > > 86 > > 87 def close(self): > > ---> 88 self._mmap.close() > > 89 > > 90 def __del__(self): > > > > : 'NoneType' object has no attribute 'close' > > > /usr/local/stow/numpy-20070605_svn-py2.5/lib/python2.5/site-packages/numpy/core/memmap.py(88)close() > > 87 def close(self): > > ---> 88 self._mmap.close() > > 89 > > > > > > > > > > This is an issue when the data is accessed in an order that is different > > from how it is stored on disk, as: > > > > bmemmap = numpy.memmap( '/tmp/afile', dtype=numpy.float32, shape=(4,5), mode='w+' ).transpose() > > > > So the object that was originally produced not accessible. I imagine > > there is some better way to indicate order of dimensions, but > > regardless, doing > > > > In [4]:bmemmap._mmap = amemmap._mmap > > > > is a hack workaround. > > > > Best regards, > > Glen Mabey > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at scipy.org > > http://projects.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion From david at ar.media.kyoto-u.ac.jp Sat Aug 11 03:06:23 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sat, 11 Aug 2007 16:06:23 +0900 Subject: [Numpy-discussion] SIMD friendly allocators: first patch Message-ID: <46BD5FEF.7060602@ar.media.kyoto-u.ac.jp> Hi, I put a first version of aligned allocators for numpy here: http://projects.scipy.org/scipy/numpy/ticket/568 (sorry for the duplicate, but I had problems connecting with the trac server). I have tested it on linux only for now, but if problem arise, it should not be too difficult to solve (will test it soon on windows and mac os X). It does not provide yet high level interface (eg requestion python arrays with given alignment), but if people agree with the current design, those should not be too difficult to implement. cheers, David From david at ar.media.kyoto-u.ac.jp Mon Aug 13 01:43:52 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 13 Aug 2007 14:43:52 +0900 Subject: [Numpy-discussion] [f2py] Adding custom code in module initialization code Message-ID: <46BFEF98.2020201@ar.media.kyoto-u.ac.jp> Hi, I would like to know if it is possible to tell f2py to call some functions inside the initialization function of a module ? I found a mention to add some function to the module function list, but nothing about the initialization function. 
cheers, David From pearu at cens.ioc.ee Mon Aug 13 01:57:59 2007 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Mon, 13 Aug 2007 08:57:59 +0300 (EEST) Subject: [Numpy-discussion] [f2py] Adding custom code in module initialization code In-Reply-To: <46BFEF98.2020201@ar.media.kyoto-u.ac.jp> References: <46BFEF98.2020201@ar.media.kyoto-u.ac.jp> Message-ID: <60543.85.166.31.64.1186984679.squirrel@cens.ioc.ee> On Mon, August 13, 2007 8:43 am, David Cournapeau wrote: > Hi, > > I would like to know if it is possible to tell f2py to call some > functions inside the initialization function of a module ? I found a > mention to add some function to the module function list, but nothing > about the initialization function. Yes, it is possible. Look for `usercode` statement in http://cens.ioc.ee/projects/f2py2e/usersguide/index.html In particular, see the `Extended F2PY usage` section for an example. HTH, Pearu From david at ar.media.kyoto-u.ac.jp Mon Aug 13 05:24:28 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 13 Aug 2007 18:24:28 +0900 Subject: [Numpy-discussion] [f2py] Adding custom code in module initialization code In-Reply-To: <60543.85.166.31.64.1186984679.squirrel@cens.ioc.ee> References: <46BFEF98.2020201@ar.media.kyoto-u.ac.jp> <60543.85.166.31.64.1186984679.squirrel@cens.ioc.ee> Message-ID: <46C0234C.4090802@ar.media.kyoto-u.ac.jp> Pearu Peterson wrote: > On Mon, August 13, 2007 8:43 am, David Cournapeau wrote: > >> Hi, >> >> I would like to know if it is possible to tell f2py to call some >> functions inside the initialization function of a module ? I found a >> mention to add some function to the module function list, but nothing >> about the initialization function. >> > > Yes, it is possible. Look for `usercode` statement in > > http://cens.ioc.ee/projects/f2py2e/usersguide/index.html > > In particular, see the `Extended F2PY usage` section for an example. > I see how to use usercode to add C function to the module, but how to tell f2py to call a given C function in the init* function ? For example, let's say I have a module foo, and I want the function init_foo to call the function bar() ? David From david at ar.media.kyoto-u.ac.jp Mon Aug 13 05:47:47 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 13 Aug 2007 18:47:47 +0900 Subject: [Numpy-discussion] [f2py] Adding custom code in module initialization code In-Reply-To: <46C0234C.4090802@ar.media.kyoto-u.ac.jp> References: <46BFEF98.2020201@ar.media.kyoto-u.ac.jp> <60543.85.166.31.64.1186984679.squirrel@cens.ioc.ee> <46C0234C.4090802@ar.media.kyoto-u.ac.jp> Message-ID: <46C028C3.3040502@ar.media.kyoto-u.ac.jp> David Cournapeau wrote: > Pearu Peterson wrote: > >> On Mon, August 13, 2007 8:43 am, David Cournapeau wrote: >> >> >>> Hi, >>> >>> I would like to know if it is possible to tell f2py to call some >>> functions inside the initialization function of a module ? I found a >>> mention to add some function to the module function list, but nothing >>> about the initialization function. >>> >>> >> Yes, it is possible. Look for `usercode` statement in >> >> http://cens.ioc.ee/projects/f2py2e/usersguide/index.html >> >> In particular, see the `Extended F2PY usage` section for an example. >> >> > I see how to use usercode to add C function to the module, but how to > tell f2py to call a given C function in the init* function ? For > example, let's say I have a module foo, and I want the function init_foo > to call the function bar() ? 
> Sorry, it was in front of my eyes for sometimes, and I didn't see it. David From listservs at mac.com Mon Aug 13 09:15:29 2007 From: listservs at mac.com (Chris Fonnesbeck) Date: Mon, 13 Aug 2007 13:15:29 +0000 (UTC) Subject: [Numpy-discussion] Vectorize leaks Message-ID: I have narrowed a memory leak in PyMC down to the vectorize() function in numpy. I have a simple inverse logit transformation function: invlogit = lambda x: 1.0 / (1.0 + exp(-1.0 * x)) which runs without leaking when used iteratively during simulations. However, when I try to vectorize it, the process' rsize grows each iteration of the simulation. Using a recent (<2 days old) svn build of numpy on OS X 10.4. C. From mpmusu at cc.usu.edu Mon Aug 13 11:04:32 2007 From: mpmusu at cc.usu.edu (Mark.Miller) Date: Mon, 13 Aug 2007 09:04:32 -0600 Subject: [Numpy-discussion] f2py and string arrays Message-ID: <46C07300.3060902@cc.usu.edu> Quick question... I have a Fortran function for f2py declared as follows (just an example). module test1 contains subroutine manip(length, array, a, b) integer :: length, a,b character(length) :: array(0:a-1,0:b-1) !f2py intent(in) length,a,b !f2py intent(inout) array array(0,0) = "1111" end subroutine manip end module test1 It compiles using f2py without issue. The f2py-generated docstring correctly lists the required arguments: manip - Function signature: manip(length,array,[a,b]) Required arguments: length : input int array : in/output rank-2 array('S') with bounds (a,b) However, I'm getting some type conversion errors when using the function in python: >>> from mymodule import test1 >>> import numpy >>> a=numpy.empty((3,3,'S4',order='F') >>> a[:,:]='2222' >>> test1.manip(4,a,3,3) Traceback (most recent call last): File "(stdin)", line 1, in (module) ValueError: failed to initialize intent(inout) array --expected elsize = 1 but got 4 -- input 'S' not compatible to 'c' Does f2py not work with numpy string arrays? I have some excellent alternate implementations for this type of thing, but would prefer an approach similar to that outlined above. Thanks, -Mark From aisaac at american.edu Mon Aug 13 11:51:49 2007 From: aisaac at american.edu (Alan G Isaac) Date: Mon, 13 Aug 2007 11:51:49 -0400 Subject: [Numpy-discussion] .transpose() of memmap array fails to close() In-Reply-To: <20070810211438.GF13557@bams.ccf.swri.edu> References: <20070607214620.GM6116@bams.ccf.swri.edu><20070810162016.GA12992@bams.ccf.swri.edu><20070810211438.GF13557@bams.ccf.swri.edu> Message-ID: On Fri, 10 Aug 2007, "Glen W. Mabey" apparently wrote: > It seems that any slice of a numpy.memmap that is greater than 1-d has > a similar problem. > In [1]:import numpy > In [2]:amemmap = numpy.memmap( '/tmp/afile', dtype=numpy.float32, shape=(4,5), mode='w+' ) > In [3]:amemmap[1,3:4] > Out[3]:memmap([ 0.], dtype=float32) > In [4]:amemmap[0:1,3:4] > Exception exceptions.AttributeError: "'memmap' object has no attribute '_mmap'" in ignored > Out[4]:memmap([[ 0.]], dtype=float32) You have not heard from anyone on this yet, right? Please continue to post your findings. Cheers, Alan Isaac From Glen.Mabey at swri.org Mon Aug 13 12:19:45 2007 From: Glen.Mabey at swri.org (Glen W. 
Mabey) Date: Mon, 13 Aug 2007 11:19:45 -0500 Subject: [Numpy-discussion] .transpose() of memmap array fails to close() In-Reply-To: References: <20070607214620.GM6116@bams.ccf.swri.edu> <20070810162016.GA12992@bams.ccf.swri.edu> <20070810211438.GF13557@bams.ccf.swri.edu> Message-ID: <20070813161945.GA12360@bams.ccf.swri.edu> On Mon, Aug 13, 2007 at 11:51:49AM -0400, Alan G Isaac wrote: > You have not heard from anyone on this yet, right? Nope, but I'm glad to hear even this response. > Please continue to post your findings. At this point, I'm guessing that the __getitem__() method of ndarray returns a numpy.memmap instance instead of a ndarray instance, but that numpy.memmap.__new__() is not getting executed, resulting in ._mmap not getting initialized, so that when numpy.memmap.__del__() gets called, it chokes because ._mmap doesn't exist. For my purposes, I am mostly opening these files read-only, so I don't need to have flush() called. For the returned valued of __getitem__, it is not appropriate to have ._mmap.close() called (the other operation in numpy.memmap.__del__(). So, I just commented out the __del__() overloaded function. When I do open memmap'ed files read-write, I can manually perform a flush() operation before I'm done, and things seem to work out okay even though .close() isn't called. As I have tried to think through what should be the appropriate behavior for the returned value of __getitem__, I have not been able to see an appropriate solution (let alone know how to implement it) to this issue. Thank you, Glen Mabey From david.huard at gmail.com Mon Aug 13 20:14:12 2007 From: david.huard at gmail.com (David Huard) Date: Mon, 13 Aug 2007 20:14:12 -0400 Subject: [Numpy-discussion] Vectorize leaks In-Reply-To: References: Message-ID: <91cf711d0708131714w7b143bc6r12435687d4b7e711@mail.gmail.com> Hi Chris, Same problem for ubuntu linux. Darn, I spent an hour tracking this bug and now I see you found it before... 2007/8/13, Chris Fonnesbeck : > > I have narrowed a memory leak in PyMC down to the vectorize() function > in numpy. I have a simple inverse logit transformation function: > > invlogit = lambda x: 1.0 / (1.0 + exp(-1.0 * x)) > > which runs without leaking when used iteratively during simulations. > However, when I try to vectorize it, the process' rsize grows each > iteration of the simulation. > > Using a recent (<2 days old) svn build of numpy on OS X 10.4. > > C. > > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From peridot.faceted at gmail.com Tue Aug 14 00:23:26 2007 From: peridot.faceted at gmail.com (Anne Archibald) Date: Tue, 14 Aug 2007 00:23:26 -0400 Subject: [Numpy-discussion] .transpose() of memmap array fails to close() In-Reply-To: <20070813161945.GA12360@bams.ccf.swri.edu> References: <20070607214620.GM6116@bams.ccf.swri.edu> <20070810162016.GA12992@bams.ccf.swri.edu> <20070810211438.GF13557@bams.ccf.swri.edu> <20070813161945.GA12360@bams.ccf.swri.edu> Message-ID: On 13/08/07, Glen W. Mabey wrote: > As I have tried to think through what should be the appropriate > behavior for the returned value of __getitem__, I have not been able to > see an appropriate solution (let alone know how to implement it) to this > issue. Is the problem one of finalization? That is, making sure the memory map gets (flushed and) closed exactly once? 
In this case the numpythonic solution is to have only the original mmap object do any finalization; any slices contain a reference to it anyway, so they cannot be kept after it is collected. If the problem is that you want to do an explicit close/flush on a slice object, you could just always apply the close/flush to the base object of the slice if it has one or the slice itself if it doesn't. I'm afraid I don't really understand the problem but it seems like nobody who just knows the answer is about to speak up... Anne From josh.p.marshall at gmail.com Tue Aug 14 08:01:34 2007 From: josh.p.marshall at gmail.com (Josh Marshall) Date: Tue, 14 Aug 2007 22:01:34 +1000 Subject: [Numpy-discussion] OS X universal SciPy: success! In-Reply-To: References: Message-ID: <5D925F6F-6090-417D-8845-60459601777B@gmail.com> I've been trying to easily build an OS X universal SciPy for some time now. [1] My prior efforts consisted of lipo-ing together the PPC and x86 'Superpacks' put together by Chris Fonnesbeck [2], which worked for distributing my image processing app locally. However, this isn't really a good way to do it, specially not for putting up for general use. I came across a successful Universal build of gfortran [4,5], and then shortly found this message [5] claiming it could possibly be used to build a universal SciPy. So, I gave it a shot and it works! (with some tricks...) The build has both ppc and i386 architectures in every .so in the scipy install. The tests run fine on my G4, but I haven't yet had the chance to try it on an Intel Mac. If anyone is keen to do so, please let me know. There will need to be some modifications to numpy distutils, since it presumes that there isn't a universal Fortran. The patch below is just a hack to get it working, and will break any non-universal gfortran. Essentially, all that needs to happen is to have '-arch ppc -arch i386' added to any call to gfortran (both compile and link) , and the '-march' flags removed. What's the best way to add this functionality to numpy distutils? I couldn't think of any way to test for a universal compiler, other than trying to compile a test file and seeing if it dies with the multiple arch flags. Regards, Josh Marshall [1] http://mail.python.org/pipermail/pythonmac-sig/2006-December/ 018556.html [2] http://trichech.us/?page_id=5 [3] http://r.research.att.com/tools/ [4] http://r.research.att.com/gfortran-4.2.1.dmg [5] http://mail.python.org/pipermail/pythonmac-sig/2007-May/018975.html isengard:~/Development/Python/numpy-svn/numpy/distutils/fcompiler Josh $ svn diff gnu.py Index: gnu.py =================================================================== --- gnu.py (revision 3964) +++ gnu.py (working copy) @@ -102,6 +102,7 @@ minor) opt.extend(['-undefined', 'dynamic_lookup', '-bundle']) + opt.extend(['-arch ppc -arch i386']) else: opt.append("-shared") if sys.platform.startswith('sunos'): @@ -183,12 +184,13 @@ # Since Apple doesn't distribute a GNU Fortran compiler, we # can't add -arch ppc or -arch i386, as only their version # of the GNU compilers accepts those. 
- for a in '601 602 603 603e 604 604e 620 630 740 7400 7450 750'\ - '403 505 801 821 823 860'.split(): - if getattr(cpu,'is_ppc%s'%a)(): - opt.append('-mcpu='+a) - opt.append('-mtune='+a) - break + #for a in '601 602 603 603e 604 604e 620 630 740 7400 7450 750'\ + # '403 505 801 821 823 860'.split(): + # if getattr(cpu,'is_ppc%s'%a)(): + # opt.append('-mcpu='+a) + # opt.append('-mtune='+a) + # break + opt.append('-arch ppc -arch i386') return opt From markbak at gmail.com Wed Aug 15 05:07:26 2007 From: markbak at gmail.com (mark) Date: Wed, 15 Aug 2007 02:07:26 -0700 Subject: [Numpy-discussion] deleting value from array Message-ID: <1187168846.711948.138530@w3g2000hsg.googlegroups.com> I am trying to delete a value from an array This seems to work as follows >>> a = array([1,2,3,4]) >>> a = delete( a, 1 ) >>> a array([1, 3, 4]) But wouldn't it make more sense to have a function like a.delete(1) ? I now get the feeling the delete command needs to copy the entire array with exception of the deleted item. I guess this is a hard thing to do efficiently? Thanks, Mark From matthieu.brucher at gmail.com Wed Aug 15 05:19:35 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Wed, 15 Aug 2007 11:19:35 +0200 Subject: [Numpy-discussion] deleting value from array In-Reply-To: <1187168846.711948.138530@w3g2000hsg.googlegroups.com> References: <1187168846.711948.138530@w3g2000hsg.googlegroups.com> Message-ID: > > I now get the feeling the delete command needs to copy the entire > array with exception of the deleted item. I guess this is a hard thing > to do efficiently? > Well, if you don't copy the array, the value will always remain present. Matthieu -------------- next part -------------- An HTML attachment was scrubbed... URL: From Andy.cheesman at bristol.ac.uk Wed Aug 15 05:53:05 2007 From: Andy.cheesman at bristol.ac.uk (Andy Cheesman) Date: Wed, 15 Aug 2007 10:53:05 +0100 Subject: [Numpy-discussion] Finding a row match within a numpy array Message-ID: <46C2CD01.5030307@bristol.ac.uk> Dear nice people I'm trying to match a row (b) within a large numpy array (a). My most successful attempt is below hit = equal(b, a) total_hits = add.reduce(hit, 1) max_hit = argmax(total_hits, 0) answer = a[max_hit] where ... a = array([[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11], [12, 13, 14, 15]]) b = array([8, 9, 10, 11]) I was wondering if people could suggest a possible more efficient route as there seems to be numerous steps. Thanks Andy From markbak at gmail.com Wed Aug 15 06:09:16 2007 From: markbak at gmail.com (mark) Date: Wed, 15 Aug 2007 03:09:16 -0700 Subject: [Numpy-discussion] Finding a row match within a numpy array In-Reply-To: <46C2CD01.5030307@bristol.ac.uk> References: <46C2CD01.5030307@bristol.ac.uk> Message-ID: <1187172556.613122.207400@r29g2000hsg.googlegroups.com> I think you can create an array with a true value in the right spot as folows: row = all( equal(a,b), 1 ) Then you can either find the row (but you already knew that one, as it is b) a[row] or the row index find(row==True) Mark On Aug 15, 11:53 am, Andy Cheesman wrote: > Dear nice people > > I'm trying to match a row (b) within a large numpy array (a). My most > successful attempt is below > > hit = equal(b, a) > total_hits = add.reduce(hit, 1) > max_hit = argmax(total_hits, 0) > answer = a[max_hit] > > where ... 
> a = array([[ 0, 1, 2, 3], > [ 4, 5, 6, 7], > [ 8, 9, 10, 11], > [12, 13, 14, 15]]) > > b = array([8, 9, 10, 11]) > > I was wondering if people could suggest a possible more efficient route > as there seems to be numerous steps. > > Thanks > Andy > _______________________________________________ > Numpy-discussion mailing list > Numpy-discuss... at scipy.orghttp://projects.scipy.org/mailman/listinfo/numpy-discussion From markbak at gmail.com Wed Aug 15 06:11:59 2007 From: markbak at gmail.com (mark) Date: Wed, 15 Aug 2007 03:11:59 -0700 Subject: [Numpy-discussion] deleting value from array In-Reply-To: References: <1187168846.711948.138530@w3g2000hsg.googlegroups.com> Message-ID: <1187172719.626721.255050@r34g2000hsd.googlegroups.com> Yeah, I can see the copying is essential. I just think the syntax a = delete(a,1) confusing, as I would expect the deleted value back, rather than the updated array. As in the 'pop' function for lists. No 'pop' in numpy? (I presume this may have been debated extensively in the past). I find the syntax a.delete(1) more logical. Mark On Aug 15, 11:19 am, "Matthieu Brucher" wrote: > > I now get the feeling the delete command needs to copy the entire > > array with exception of the deleted item. I guess this is a hard thing > > to do efficiently? > > Well, if you don't copy the array, the value will always remain present. > > Matthieu > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discuss... at scipy.orghttp://projects.scipy.org/mailman/listinfo/numpy-discussion From Andy.cheesman at bristol.ac.uk Wed Aug 15 06:26:15 2007 From: Andy.cheesman at bristol.ac.uk (Andy Cheesman) Date: Wed, 15 Aug 2007 11:26:15 +0100 Subject: [Numpy-discussion] Finding a row match within a numpy array In-Reply-To: <1187172556.613122.207400@r29g2000hsg.googlegroups.com> References: <46C2CD01.5030307@bristol.ac.uk> <1187172556.613122.207400@r29g2000hsg.googlegroups.com> Message-ID: <46C2D4C7.4010305@bristol.ac.uk> Thanks for the speedy response but where can I locate the find function as it isn't in numpy. Andy mark wrote: > I think you can create an array with a true value in the right spot as > folows: > > row = all( equal(a,b), 1 ) > > Then you can either find the row (but you already knew that one, as it > is b) > > a[row] > > or the row index > > find(row==True) > > Mark > > On Aug 15, 11:53 am, Andy Cheesman > wrote: >> Dear nice people >> >> I'm trying to match a row (b) within a large numpy array (a). My most >> successful attempt is below >> >> hit = equal(b, a) >> total_hits = add.reduce(hit, 1) >> max_hit = argmax(total_hits, 0) >> answer = a[max_hit] >> >> where ... >> a = array([[ 0, 1, 2, 3], >> [ 4, 5, 6, 7], >> [ 8, 9, 10, 11], >> [12, 13, 14, 15]]) >> >> b = array([8, 9, 10, 11]) >> >> I was wondering if people could suggest a possible more efficient route >> as there seems to be numerous steps. >> >> Thanks >> Andy >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discuss... 
at scipy.orghttp://projects.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From markbak at gmail.com Wed Aug 15 06:59:48 2007 From: markbak at gmail.com (mark) Date: Wed, 15 Aug 2007 03:59:48 -0700 Subject: [Numpy-discussion] Finding a row match within a numpy array In-Reply-To: <46C2D4C7.4010305@bristol.ac.uk> References: <46C2CD01.5030307@bristol.ac.uk> <1187172556.613122.207400@r29g2000hsg.googlegroups.com> <46C2D4C7.4010305@bristol.ac.uk> Message-ID: <1187175588.007436.125720@w3g2000hsg.googlegroups.com> Oops, 'find' is in pylab (matplotlib). I guess in numpy you have to use 'where', which does almost the same, but it returns a Tuple. Is there a function that is more like the find in matplotlib? Mark On Aug 15, 12:26 pm, Andy Cheesman wrote: > Thanks for the speedy response but where can I locate the find function > as it isn't in numpy. > > Andy > > > > mark wrote: > > I think you can create an array with a true value in the right spot as > > folows: > > > row = all( equal(a,b), 1 ) > > > Then you can either find the row (but you already knew that one, as it > > is b) > > > a[row] > > > or the row index > > > find(row==True) > > > Mark > > > On Aug 15, 11:53 am, Andy Cheesman > > wrote: > >> Dear nice people > > >> I'm trying to match a row (b) within a large numpy array (a). My most > >> successful attempt is below > > >> hit = equal(b, a) > >> total_hits = add.reduce(hit, 1) > >> max_hit = argmax(total_hits, 0) > >> answer = a[max_hit] > > >> where ... > >> a = array([[ 0, 1, 2, 3], > >> [ 4, 5, 6, 7], > >> [ 8, 9, 10, 11], > >> [12, 13, 14, 15]]) > > >> b = array([8, 9, 10, 11]) > > >> I was wondering if people could suggest a possible more efficient route > >> as there seems to be numerous steps. > > >> Thanks > >> Andy > >> _______________________________________________ > >> Numpy-discussion mailing list > >> Numpy-discuss... at scipy.orghttp://projects.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discuss... at scipy.org > >http://projects.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discuss... at scipy.orghttp://projects.scipy.org/mailman/listinfo/numpy-discussion From matthieu.brucher at gmail.com Wed Aug 15 07:38:14 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Wed, 15 Aug 2007 13:38:14 +0200 Subject: [Numpy-discussion] Finding a row match within a numpy array In-Reply-To: <1187175588.007436.125720@w3g2000hsg.googlegroups.com> References: <46C2CD01.5030307@bristol.ac.uk> <1187172556.613122.207400@r29g2000hsg.googlegroups.com> <46C2D4C7.4010305@bristol.ac.uk> <1187175588.007436.125720@w3g2000hsg.googlegroups.com> Message-ID: The where function ? Matthieu 2007/8/15, mark : > > Oops, 'find' is in pylab (matplotlib). > I guess in numpy you have to use 'where', which does almost the same, > but it returns a Tuple. > Is there a function that is more like the find in matplotlib? > Mark > > > On Aug 15, 12:26 pm, Andy Cheesman > wrote: > > Thanks for the speedy response but where can I locate the find function > > as it isn't in numpy. 
> > > > Andy > > > > > > > > mark wrote: > > > I think you can create an array with a true value in the right spot as > > > folows: > > > > > row = all( equal(a,b), 1 ) > > > > > Then you can either find the row (but you already knew that one, as it > > > is b) > > > > > a[row] > > > > > or the row index > > > > > find(row==True) > > > > > Mark > > > > > On Aug 15, 11:53 am, Andy Cheesman > > > wrote: > > >> Dear nice people > > > > >> I'm trying to match a row (b) within a large numpy array (a). My most > > >> successful attempt is below > > > > >> hit = equal(b, a) > > >> total_hits = add.reduce(hit, 1) > > >> max_hit = argmax(total_hits, 0) > > >> answer = a[max_hit] > > > > >> where ... > > >> a = array([[ 0, 1, 2, 3], > > >> [ 4, 5, 6, 7], > > >> [ 8, 9, 10, 11], > > >> [12, 13, 14, 15]]) > > > > >> b = array([8, 9, 10, 11]) > > > > >> I was wondering if people could suggest a possible more efficient > route > > >> as there seems to be numerous steps. > > > > >> Thanks > > >> Andy > > >> _______________________________________________ > > >> Numpy-discussion mailing list > > >> Numpy-discuss... at scipy > .orghttp://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > > > Numpy-discussion mailing list > > > Numpy-discuss... at scipy.org > > >http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discuss... at scipy > .orghttp://projects.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gael.varoquaux at normalesup.org Wed Aug 15 09:30:53 2007 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Wed, 15 Aug 2007 15:30:53 +0200 Subject: [Numpy-discussion] deleting value from array In-Reply-To: <1187172719.626721.255050@r34g2000hsd.googlegroups.com> References: <1187168846.711948.138530@w3g2000hsg.googlegroups.com> <1187172719.626721.255050@r34g2000hsd.googlegroups.com> Message-ID: <20070815133053.GA15510@clipper.ens.fr> On Wed, Aug 15, 2007 at 03:11:59AM -0700, mark wrote: > Yeah, I can see the copying is essential. > I just think the syntax > a = delete(a,1) > confusing, as I would expect the deleted value back, rather than the > updated array. > As in the 'pop' function for lists. > No 'pop' in numpy? (I presume this may have been debated extensively > in the past). > I find the syntax > a.delete(1) more logical. It is often considered in OO language that foo.method() modifies the foo object, while function(foo) returns a new object, not modifying foo. This is not always true in Python. Sometimes (eg strings) this is not true because the object is immutable, sometimes there isn't this good reason. I would be happy if we sticked to this convention. I find it makes the language easier to guess. Ga?l From Shawn.Gong at drdc-rddc.gc.ca Wed Aug 15 10:47:08 2007 From: Shawn.Gong at drdc-rddc.gc.ca (Gong, Shawn (Contractor)) Date: Wed, 15 Aug 2007 10:47:08 -0400 Subject: [Numpy-discussion] memory error caused by astype() Message-ID: <2E58C246F17003499C141D334794D049027683D8@ottawaex02.Ottawa.drdc-rddc.gc.ca> Hi list, When I do large array manipulations, I get out-of-memory errors. If the array size is 5000 by 6000, the following codes use nearly 1G. 
Then my PC displays a Python error box. The try/except won't catch it if the memory error happens in "astype" instead of "array1* array2" try: if ( array1.typecode() in cplx_types ): array1 = abs(array1.astype(Numeric.Complex32)) else: array1 = array1.astype(Numeric.Float32) if ( array2.typecode() in cplx_types ): array2 = abs(array2.astype(Numeric.Complex32)) else: array2 = array2.astype(Numeric.Float32) array1 = Numeric.sqrt(array1) * Numeric.sqrt(array2) return array1 except: gvutils.error("Memory error occurred\nPlease select a smaller array") return None My questions are: 1) Is there a more memory efficient way of doing this? 2) How do I deal with exception if astype is the only way to go 3) Is there a way in Python that detects the available RAM and limits the array size before he/she can go ahead with the array multiplications? i.e. detects the available RAM, say 800K Assume worst case - Complex32 Figure out how many temp_arrays used by numpy Calculate array size limit = ?? 4) If there is no 3) Is there something in Python that monitors memory and warns the user. I have these "astype" at a number functions. Do I have to put try/except at each location? Thanks, Shaw Gong -------------- next part -------------- An HTML attachment was scrubbed... URL: From markbak at gmail.com Wed Aug 15 11:01:11 2007 From: markbak at gmail.com (mark) Date: Wed, 15 Aug 2007 15:01:11 -0000 Subject: [Numpy-discussion] Finding a row match within a numpy array In-Reply-To: References: <46C2CD01.5030307@bristol.ac.uk> <1187172556.613122.207400@r29g2000hsg.googlegroups.com> <46C2D4C7.4010305@bristol.ac.uk> <1187175588.007436.125720@w3g2000hsg.googlegroups.com> Message-ID: <1187190071.384881.240470@w3g2000hsg.googlegroups.com> Maybe this is not the intended use of where, but it seems to work: >>> from numpy import * # No complaining now >>> a = arange(12) >>> a.shape = (4,3) >>> a array([[ 0, 1, 2], [ 3, 4, 5], [ 6, 7, 8], [ 9, 10, 11]]) >>> b = array([6,7,8]) >>> row = all( equal(a,b), 1 ) >>> where(row==True) (array([2]),) On Aug 15, 1:38 pm, "Matthieu Brucher" wrote: > The where function ? > > Matthieu > > 2007/8/15, mark : > > > > > Oops, 'find' is in pylab (matplotlib). > > I guess in numpy you have to use 'where', which does almost the same, > > but it returns a Tuple. > > Is there a function that is more like the find in matplotlib? > > Mark > > > On Aug 15, 12:26 pm, Andy Cheesman > > wrote: > > > Thanks for the speedy response but where can I locate the find function > > > as it isn't in numpy. > > > > Andy > > > > mark wrote: > > > > I think you can create an array with a true value in the right spot as > > > > folows: > > > > > row = all( equal(a,b), 1 ) > > > > > Then you can either find the row (but you already knew that one, as it > > > > is b) > > > > > a[row] > > > > > or the row index > > > > > find(row==True) > > > > > Mark > > > > > On Aug 15, 11:53 am, Andy Cheesman > > > > wrote: > > > >> Dear nice people > > > > >> I'm trying to match a row (b) within a large numpy array (a). My most > > > >> successful attempt is below > > > > >> hit = equal(b, a) > > > >> total_hits = add.reduce(hit, 1) > > > >> max_hit = argmax(total_hits, 0) > > > >> answer = a[max_hit] > > > > >> where ... > > > >> a = array([[ 0, 1, 2, 3], > > > >> [ 4, 5, 6, 7], > > > >> [ 8, 9, 10, 11], > > > >> [12, 13, 14, 15]]) > > > > >> b = array([8, 9, 10, 11]) > > > > >> I was wondering if people could suggest a possible more efficient > > route > > > >> as there seems to be numerous steps. 
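The separate equal/reduce/argmax steps can be collapsed into a single boolean-indexing expression; a minimal sketch, using the same a and b as in the question:

import numpy as N
a = N.arange(16).reshape(4, 4)
b = N.array([8, 9, 10, 11])
match = (a == b).all(axis=1)   # True for rows of a that equal b
print a[match]                 # the matching row(s)
print N.nonzero(match)[0]      # or just the row index

Unlike the argmax approach, this returns an empty array when nothing matches, rather than silently picking the row with the most matching elements.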
> > > > >> Thanks > > > >> Andy > > > >> _______________________________________________ > > > >> Numpy-discussion mailing list > > > >> Numpy-discuss... at scipy > > .orghttp://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > > > > Numpy-discussion mailing list > > > > Numpy-discuss... at scipy.org > > > >http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > > > Numpy-discussion mailing list > > > Numpy-discuss... at scipy > > .orghttp://projects.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discuss... at scipy.org > >http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discuss... at scipy.orghttp://projects.scipy.org/mailman/listinfo/numpy-discussion From millman at berkeley.edu Wed Aug 15 12:22:38 2007 From: millman at berkeley.edu (Jarrod Millman) Date: Wed, 15 Aug 2007 09:22:38 -0700 Subject: [Numpy-discussion] NumPy 1.0.3.x and SciPy 0.5.2.x Message-ID: Hello, I am hoping to release NumPy 1.0.3.1 and SciPy 0.5.2.1 this weekend. These releases will work with each other and get rid of the annoying deprecation warning about SciPyTest. They are both basically ready to release. If you have some time, please build and install the stable branches and let me know if you have any errors. You can check out the code here: svn co http://svn.scipy.org/svn/numpy/branches/1.0.3.x svn co http://svn.scipy.org/svn/scipy/branches/0.5.2.x Below is a list of the changes I have made. NumPy 1.0.3.1 ============ * Adds back get_path to numpy.distutils.misc_utils SciPy 0.5.2.1 ========== * Replaces ScipyTest with NumpyTest * Fixes mio5.py as per revision 2893 * Adds missing test definition in scipy.cluster as per revision 2941 * Synchs odr module with trunk since odr is broken in 0.5.2 * Updates for SWIG > 1.3.29 and fixes memory leak of type 'void *' Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From Glen.Mabey at swri.org Wed Aug 15 12:36:12 2007 From: Glen.Mabey at swri.org (Glen W. Mabey) Date: Wed, 15 Aug 2007 11:36:12 -0500 Subject: [Numpy-discussion] .transpose() of memmap array fails to close() In-Reply-To: References: <20070607214620.GM6116@bams.ccf.swri.edu> <20070810162016.GA12992@bams.ccf.swri.edu> <20070810211438.GF13557@bams.ccf.swri.edu> <20070813161945.GA12360@bams.ccf.swri.edu> Message-ID: <20070815163612.GB23855@bams.ccf.swri.edu> On Tue, Aug 14, 2007 at 12:23:26AM -0400, Anne Archibald wrote: > On 13/08/07, Glen W. Mabey wrote: > > > As I have tried to think through what should be the appropriate > > behavior for the returned value of __getitem__, I have not been able to > > see an appropriate solution (let alone know how to implement it) to this > > issue. > > Is the problem one of finalization? That is, making sure the memory > map gets (flushed and) closed exactly once? In this case the > numpythonic solution is to have only the original mmap object do any > finalization; any slices contain a reference to it anyway, so they > cannot be kept after it is collected. 
If the problem is that you want > to do an explicit close/flush on a slice object, you could just always > apply the close/flush to the base object of the slice if it has one or > the slice itself if it doesn't. The immediate problem is that when a numpy.memmap instance is created as another view of the original array, then __del__ on that new view fails. flush()ing and closing aren't an issue for me, but they can't be performed at all on derived views right now. It seems to me that any derived view ought to be able to flush(), and ideally in my mind, close() would be called [automatically] only just before the reference count gets decremented to zero. That doesn't seem to match the numpythonic philosophy you described, Anne, but seems logical to me, while still allowing for both manual flush() and close() operations. Thanks for your response. Glen From matthew.brett at gmail.com Wed Aug 15 15:02:31 2007 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 15 Aug 2007 20:02:31 +0100 Subject: [Numpy-discussion] .transpose() of memmap array fails to close() In-Reply-To: <20070815163612.GB23855@bams.ccf.swri.edu> References: <20070607214620.GM6116@bams.ccf.swri.edu> <20070810162016.GA12992@bams.ccf.swri.edu> <20070810211438.GF13557@bams.ccf.swri.edu> <20070813161945.GA12360@bams.ccf.swri.edu> <20070815163612.GB23855@bams.ccf.swri.edu> Message-ID: <1e2af89e0708151202q68e2973co1fe407df07af2df2@mail.gmail.com> Hi, Thanks for looking into this because we (neuroimaging.scipy.org) use mmaps a lot. I am very away from my desk at the moment but please do keep us all informed, and we'll try and pitch in if we can... Matthew On 8/15/07, Glen W. Mabey wrote: > On Tue, Aug 14, 2007 at 12:23:26AM -0400, Anne Archibald wrote: > > On 13/08/07, Glen W. Mabey wrote: > > > > > As I have tried to think through what should be the appropriate > > > behavior for the returned value of __getitem__, I have not been able to > > > see an appropriate solution (let alone know how to implement it) to this > > > issue. > > > > Is the problem one of finalization? That is, making sure the memory > > map gets (flushed and) closed exactly once? In this case the > > numpythonic solution is to have only the original mmap object do any > > finalization; any slices contain a reference to it anyway, so they > > cannot be kept after it is collected. If the problem is that you want > > to do an explicit close/flush on a slice object, you could just always > > apply the close/flush to the base object of the slice if it has one or > > the slice itself if it doesn't. > > The immediate problem is that when a numpy.memmap instance is created as > another view of the original array, then __del__ on that new view fails. > > flush()ing and closing aren't an issue for me, but they can't be > performed at all on derived views right now. It seems to me that any > derived view ought to be able to flush(), and ideally in my mind, > close() would be called [automatically] only just before the reference > count gets decremented to zero. > > That doesn't seem to match the numpythonic philosophy you described, > Anne, but seems logical to me, while still allowing for both manual > flush() and close() operations. > > Thanks for your response. 
> > Glen > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From peridot.faceted at gmail.com Wed Aug 15 20:50:28 2007 From: peridot.faceted at gmail.com (Anne Archibald) Date: Wed, 15 Aug 2007 20:50:28 -0400 Subject: [Numpy-discussion] .transpose() of memmap array fails to close() In-Reply-To: <20070815163612.GB23855@bams.ccf.swri.edu> References: <20070607214620.GM6116@bams.ccf.swri.edu> <20070810162016.GA12992@bams.ccf.swri.edu> <20070810211438.GF13557@bams.ccf.swri.edu> <20070813161945.GA12360@bams.ccf.swri.edu> <20070815163612.GB23855@bams.ccf.swri.edu> Message-ID: On 15/08/07, Glen W. Mabey wrote: > On Tue, Aug 14, 2007 at 12:23:26AM -0400, Anne Archibald wrote: > > On 13/08/07, Glen W. Mabey wrote: > > > > > As I have tried to think through what should be the appropriate > > > behavior for the returned value of __getitem__, I have not been able to > > > see an appropriate solution (let alone know how to implement it) to this > > > issue. > > > > Is the problem one of finalization? That is, making sure the memory > > map gets (flushed and) closed exactly once? In this case the > > numpythonic solution is to have only the original mmap object do any > > finalization; any slices contain a reference to it anyway, so they > > cannot be kept after it is collected. If the problem is that you want > > to do an explicit close/flush on a slice object, you could just always > > apply the close/flush to the base object of the slice if it has one or > > the slice itself if it doesn't. > > The immediate problem is that when a numpy.memmap instance is created as > another view of the original array, then __del__ on that new view fails. Yes, this is definitely broken. > flush()ing and closing aren't an issue for me, but they can't be > performed at all on derived views right now. It seems to me that any > derived view ought to be able to flush(), and ideally in my mind, > close() would be called [automatically] only just before the reference > count gets decremented to zero. > > That doesn't seem to match the numpythonic philosophy you described, > Anne, but seems logical to me, while still allowing for both manual > flush() and close() operations. You have to be a bit careful, because a view really is just a view into the array - the original is still around. So you can't really delete the array contents when the view is deleted. Really, if you do: B = A[::2] del B nothing at all should happen to A. But to be pythonic, or numpythonic, when the original A is garbage-collected, the garbage collection should certainly close the mmap. Being able to apply flush() or whatever to slices is not necessarily unpythonic, but it's probably a lot simpler to reliably implement slices of mmap()s as simple slices of ordinary arrays. It means you need to keep the original mmap object around (or traverse up the tree of bases: T = A while T.base is not None: T = T.base T.flush() ) (Note that this would be simpler if when you did A = arange(100) B = A[::2] C = B[::2] you found that C.base were A rather than B.) 
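A minimal sketch of that base-walking idea as a helper, assuming the object at the root of the chain is a numpy.memmap that exposes flush():

def flush_root(arr):
    # walk up the chain of views to the object that owns the memory
    root = arr
    while getattr(root, 'base', None) is not None:
        root = root.base
    if hasattr(root, 'flush'):
        root.flush()

Calling flush_root() on any slice or transpose of a memmap then flushes the underlying map, without needing to keep a separate reference to the original object.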
Anne From nadavh at visionsense.com Thu Aug 16 06:04:35 2007 From: nadavh at visionsense.com (Nadav Horesh) Date: Thu, 16 Aug 2007 13:04:35 +0300 Subject: [Numpy-discussion] deleting value from array In-Reply-To: <1187168846.711948.138530@w3g2000hsg.googlegroups.com> References: <1187168846.711948.138530@w3g2000hsg.googlegroups.com> Message-ID: <1187258675.27158.1.camel@nadav.envision.co.il> The closest I can think of is: a = a[range(len(a)) != 1] Nadav. On Wed, 2007-08-15 at 02:07 -0700, mark wrote: > I am trying to delete a value from an array > This seems to work as follows > > >>> a = array([1,2,3,4]) > >>> a = delete( a, 1 ) > >>> a > array([1, 3, 4]) > > But wouldn't it make more sense to have a function like > > a.delete(1) ? > > I now get the feeling the delete command needs to copy the entire > array with exception of the deleted item. I guess this is a hard thing > to do efficiently? > > Thanks, > > Mark > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From Glen.Mabey at swri.org Thu Aug 16 10:17:10 2007 From: Glen.Mabey at swri.org (Glen W. Mabey) Date: Thu, 16 Aug 2007 09:17:10 -0500 Subject: [Numpy-discussion] .transpose() of memmap array fails to close() In-Reply-To: References: <20070607214620.GM6116@bams.ccf.swri.edu> <20070810162016.GA12992@bams.ccf.swri.edu> <20070810211438.GF13557@bams.ccf.swri.edu> <20070813161945.GA12360@bams.ccf.swri.edu> <20070815163612.GB23855@bams.ccf.swri.edu> Message-ID: <20070816141710.GA3154@bams.ccf.swri.edu> On Wed, Aug 15, 2007 at 08:50:28PM -0400, Anne Archibald wrote: > You have to be a bit careful, because a view really is just a view > into the array - the original is still around. So you can't really > delete the array contents when the view is deleted. Really, if you do: > B = A[::2] > del B > nothing at all should happen to A. Okay, right. I was muddling those two concepts. > But to be pythonic, or numpythonic, when the original A is > garbage-collected, the garbage collection should certainly close the > mmap. Humm, this would be less than ideal for my use case, when the data on disk is organized in a different dimensional order than I want to refer to it in my code. For example: p_data = numpy.memmap( datafilename, shape=( 10, 1024, 20 ), dtype=numpy.float32, mode='r') u_data = p_data.transpose( [ 2, 0, 1 ] ) and I don't want to have to keep track of p_data because its only u_data that I care about and want to use. And I promise, this is not a contrived example. I have data that I really do want to be ordered in a certain way on disk, for I/O efficiency reasons, yet when I logically index into it in my code, I want the dimensions to be in a different order. > Being able to apply flush() or whatever to slices is not necessarily > unpythonic, but it's probably a lot simpler to reliably implement > slices of mmap()s as simple slices of ordinary arrays. I considered this approach, but what happens if you want to instantiate a slice that is very large, e.g., larger than the size of your physical RAM? In that case, you can't afford to make simple slices be ordinary arrays, besides the case where you want to change values. Making them functionally memmap-arrays, but without .sync() and .close() doesn't seem right either. 
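For the transposed-view use case above, a minimal sketch of getting by without an explicit reference to p_data, reusing datafilename from the example, and assuming a writable map and that the view keeps its parent in .base:

import numpy
p_data = numpy.memmap(datafilename, shape=(10, 1024, 20),
                      dtype=numpy.float32, mode='r+')
u_data = p_data.transpose([2, 0, 1])
del p_data                    # the map stays alive: u_data.base still refers to it
# ... work with u_data ...
if hasattr(u_data.base, 'flush'):
    u_data.base.flush()       # reach the underlying memmap through the view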
> It means you > need to keep the original mmap object around (or traverse up the tree > of bases: > T = A > while T.base is not None: T = T.base > T.flush() > ) > > (Note that this would be simpler if when you did > A = arange(100) > B = A[::2] > C = B[::2] > you found that C.base were A rather than B.) Okay, this would make it so that I didn't have to explicitly keep track of p_data, in my example. Not bad, although I'd never noticed a .base member before ... Thank you, Glen Mabey From Andy.cheesman at bristol.ac.uk Tue Aug 14 06:53:03 2007 From: Andy.cheesman at bristol.ac.uk (Andy Cheesman) Date: Tue, 14 Aug 2007 11:53:03 +0100 Subject: [Numpy-discussion] Finding a row match within a numpy array Message-ID: <46C1898F.6020107@bristol.ac.uk> Dear nice people I'm trying to match a row (b) within a large numpy array (a). My most successful attempt is below hit = equal(b, a) total_hits = add.reduce(hit, 1) max_hit = argmax(total_hits, 0) answer = a[max_hit] where ... a = array([[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11], [12, 13, 14, 15]]) b = array([8, 9, 10, 11]) I was wondering if people could suggest a possible more efficient route as there seems to be numerous steps. Thanks Andy From efiring at hawaii.edu Thu Aug 16 15:20:41 2007 From: efiring at hawaii.edu (Eric Firing) Date: Thu, 16 Aug 2007 09:20:41 -1000 Subject: [Numpy-discussion] fast putmask implementation Message-ID: <46C4A389.5020108@hawaii.edu> In looking at maskedarray performance, I found that the filled() function or method is a bottleneck. I think it can be sped up by using putmask instead of indexed assignment, but I found that putmask itself is slower than it needs to be. So I followed David Cournapeau's example of fastclip and made a similar fastputmask. The diff relative to current svn (3967) is attached. The faster version makes a factor-of-ten or larger improvement in putmask speed. numpy.test() still passes. With 10000-element integer arrays the new version reduces the times from 136 to 15 usec for 1000 masked elements, and 445 to 18 usec for 5000 masked elements, with a scalar value argument. It is only slightly slower with an array value argument. (Times are for Intel Core2, 2 GH, linux.) I hope someone will take a look and either tell me what I need to fix or commit it as-is. Thanks. Eric -------------- next part -------------- A non-text attachment was scrubbed... Name: numpy_putmask.diff Type: text/x-patch Size: 32047 bytes Desc: not available URL: From zyzhu2000 at gmail.com Thu Aug 16 21:26:34 2007 From: zyzhu2000 at gmail.com (Geoffrey Zhu) Date: Thu, 16 Aug 2007 20:26:34 -0500 Subject: [Numpy-discussion] numpy.array does not take generators Message-ID: Hi All, I want to construct a numpy array based on Python objects. In the below code, opts is a list of tuples. For example, opts=[ ('C', 100, 3, 'A'), ('K', 200, 5.4, 'B')] If I use a generator like the following: K=numpy.array(o[2]/1000.0 for o in opts) It does not work. I have to use: numpy.array([o[2]/1000.0 for o in opts]) Is this behavior intended? By the way, it is quite inefficient to create numpy array this way, because I have to create a regular python first, and then construct a numpy array. But I do not want to store everything in vector form initially, as it is more natural to store them in objects, and easier to use when organizing the data. Does anyone know any better way? 
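One possibility is to keep the tuples together in a single structured array and pull columns out of that; a minimal sketch, with hypothetical field names standing in for whatever the tuple entries actually mean:

import numpy
opts = [('C', 100, 3, 'A'), ('K', 200, 5.4, 'B')]
opt_dtype = numpy.dtype([('code', 'S1'), ('qty', numpy.int32),
                         ('level', numpy.float64), ('tag', 'S1')])
rec = numpy.array(opts, dtype=opt_dtype)
K = rec['level'] / 1000.0      # a plain float array, no explicit Python loop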
Thanks, Geoffrey From aisaac at american.edu Thu Aug 16 22:04:11 2007 From: aisaac at american.edu (Alan G Isaac) Date: Thu, 16 Aug 2007 22:04:11 -0400 Subject: [Numpy-discussion] numpy.array does not take generators In-Reply-To: References: Message-ID: On Thu, 16 Aug 2007, Geoffrey Zhu apparently wrote: > K=numpy.array(o[2]/1000.0 for o in opts) > It does not work. K=numpy.fromiter((o[2]/1000.0 for o in opts),'float') hth, Alan Isaac From cournape at gmail.com Thu Aug 16 22:05:15 2007 From: cournape at gmail.com (David Cournapeau) Date: Fri, 17 Aug 2007 11:05:15 +0900 Subject: [Numpy-discussion] fast putmask implementation In-Reply-To: <46C4A389.5020108@hawaii.edu> References: <46C4A389.5020108@hawaii.edu> Message-ID: <5b8d13220708161905l76f2ca74p5eedb8ca5ded1d18@mail.gmail.com> On 8/17/07, Eric Firing wrote: > In looking at maskedarray performance, I found that the filled() > function or method is a bottleneck. I think it can be sped up by using > putmask instead of indexed assignment, but I found that putmask itself > is slower than it needs to be. So I followed David Cournapeau's example > of fastclip and made a similar fastputmask. The diff relative to > current svn (3967) is attached. Great ! putmask was actually the function I wanted to improve after clip, because it is the second bottleneck for matplotlib imagesc :) I would not be suprised if now imagesc has descent speed compared to matlab. > > I hope someone will take a look and either tell me what I need to fix or > commit it as-is. It looks like there are a lot of spurious diff in you patch (space vs tab, or endline problems ?). Could you regenerate a patch without them, since half of the patch is "garbage" ? It would be much easier to see the changes you actually made. cheers, David From efiring at hawaii.edu Thu Aug 16 22:39:02 2007 From: efiring at hawaii.edu (Eric Firing) Date: Thu, 16 Aug 2007 16:39:02 -1000 Subject: [Numpy-discussion] fast putmask implementation In-Reply-To: <5b8d13220708161905l76f2ca74p5eedb8ca5ded1d18@mail.gmail.com> References: <46C4A389.5020108@hawaii.edu> <5b8d13220708161905l76f2ca74p5eedb8ca5ded1d18@mail.gmail.com> Message-ID: <46C50A46.1070208@hawaii.edu> David Cournapeau wrote: > On 8/17/07, Eric Firing wrote: >> In looking at maskedarray performance, I found that the filled() >> function or method is a bottleneck. I think it can be sped up by using >> putmask instead of indexed assignment, but I found that putmask itself >> is slower than it needs to be. So I followed David Cournapeau's example >> of fastclip and made a similar fastputmask. The diff relative to >> current svn (3967) is attached. > > Great ! putmask was actually the function I wanted to improve after > clip, because it is the second bottleneck for matplotlib imagesc :) I > would not be suprised if now imagesc has descent speed compared to > matlab. > >> I hope someone will take a look and either tell me what I need to fix or >> commit it as-is. > > It looks like there are a lot of spurious diff in you patch (space vs > tab, or endline problems ?). Could you regenerate a patch without > them, since half of the patch is "garbage" ? It would be much easier > to see the changes you actually made. Agreed. This is because my editor deletes spurious whitespace that was already in the file. If I ruled the world, the spurious whitespace and hard tabs would never be there in the first place. (If I were younger I might use smileys in places like this, but they just don't come naturally to me.) 
As far as I can see there is no way of using svn diff to deal with this automatically, so in the attached revision I have manually removed chunks resulting solely from whitespace. Some of the remaining chunks unfortunately have a mixture of whitespace and substantive differences. And manually removing chunks is risky. Is there a better way to handle this problem? A better way to make diffs? Or any possibility of routinely cleaning the junk out of the svn source files? (Yes, I know--what is junk to me probably results from what others consider good behavior of the editor.) Eric > > cheers, > > David > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: numpy_putmask.diff Type: text/x-patch Size: 8716 bytes Desc: not available URL: From cookedm at physics.mcmaster.ca Thu Aug 16 23:07:19 2007 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Thu, 16 Aug 2007 23:07:19 -0400 Subject: [Numpy-discussion] fast putmask implementation In-Reply-To: <46C50A46.1070208@hawaii.edu> References: <46C4A389.5020108@hawaii.edu> <5b8d13220708161905l76f2ca74p5eedb8ca5ded1d18@mail.gmail.com> <46C50A46.1070208@hawaii.edu> Message-ID: <20070817030719.GA5542@arbutus.physics.mcmaster.ca> On Thu, Aug 16, 2007 at 04:39:02PM -1000, Eric Firing wrote: > As far as I can see there is no way of using svn diff to deal with > this automatically, so in the attached revision I have manually removed > chunks resulting solely from whitespace. > > Is there a better way to handle this problem? A better way to make diffs? > Or any possibility of routinely cleaning the junk out of the svn source > files? (Yes, I know--what is junk to me probably results from what others > consider good behavior of the editor.) 'svn diff -x -b' might work better (-b gets passed to diff, which makes it ignore space changes). Or svn diff -x -w to ignore all whitespace. Me, I hate trailing ws too (I've got Emacs set up so that gets highlighted as red, which makes me angry :). The hard tabs in C code is keeping with the style used in the C Python sources (Emacs even has a 'python' C style -- do "C-c . python"). -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From efiring at hawaii.edu Fri Aug 17 01:32:51 2007 From: efiring at hawaii.edu (Eric Firing) Date: Thu, 16 Aug 2007 19:32:51 -1000 Subject: [Numpy-discussion] fast putmask implementation In-Reply-To: <20070817030719.GA5542@arbutus.physics.mcmaster.ca> References: <46C4A389.5020108@hawaii.edu> <5b8d13220708161905l76f2ca74p5eedb8ca5ded1d18@mail.gmail.com> <46C50A46.1070208@hawaii.edu> <20070817030719.GA5542@arbutus.physics.mcmaster.ca> Message-ID: <46C53303.4000806@hawaii.edu> David M. Cooke wrote: > On Thu, Aug 16, 2007 at 04:39:02PM -1000, Eric Firing wrote: >> As far as I can see there is no way of using svn diff to deal with >> this automatically, so in the attached revision I have manually removed >> chunks resulting solely from whitespace. >> >> Is there a better way to handle this problem? A better way to make diffs? >> Or any possibility of routinely cleaning the junk out of the svn source >> files? (Yes, I know--what is junk to me probably results from what others >> consider good behavior of the editor.) 
> > 'svn diff -x -b' might work better (-b gets passed to diff, which makes > it ignore space changes). Or svn diff -x -w to ignore all whitespace. > > Me, I hate trailing ws too (I've got Emacs set up so that gets > highlighted as red, which makes me angry :). The hard tabs in C code is > keeping with the style used in the C Python sources (Emacs even has a > 'python' C style -- do "C-c . python"). > David, Thank you. I had tried something like that a while ago without success, and now I know why: the '-w' has to be quoted to keep it out of the clutches of the shell, so it is "svn diff -x '-w'". The result is attached. Much better. As for hard tabs in C Python sources--it is still a bad idea even if the BDFL himself does it--very bad for Python, not quite as bad for C, but still bad. Too fragile, too dependent on editor configuration, and in numpy, not done consistently--it's a complete mishmash of tabs and spaces. OK, enough of that. Eric -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: putmask.diff_w URL: From robert.kern at gmail.com Fri Aug 17 02:19:53 2007 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 16 Aug 2007 23:19:53 -0700 Subject: [Numpy-discussion] numpy.array does not take generators In-Reply-To: References: Message-ID: <46C53E09.9020306@gmail.com> Geoffrey Zhu wrote: > Hi All, > > I want to construct a numpy array based on Python objects. In the > below code, opts is a list of tuples. > > For example, > > opts=[ ('C', 100, 3, 'A'), ('K', 200, 5.4, 'B')] > > If I use a generator like the following: > > K=numpy.array(o[2]/1000.0 for o in opts) > > It does not work. > > I have to use: > > numpy.array([o[2]/1000.0 for o in opts]) > > Is this behavior intended? Yes. With arbitrary generators, there is no good way to do the kind of mind-reading that numpy.array() usually does with sequences. It would have to unroll the whole generator anyways. fromiter() works for this, but you are restricted to 1-D arrays which is a lot easier to implement the mind-reading for. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From efiring at hawaii.edu Fri Aug 17 02:40:01 2007 From: efiring at hawaii.edu (Eric Firing) Date: Thu, 16 Aug 2007 20:40:01 -1000 Subject: [Numpy-discussion] fast putmask implementation In-Reply-To: <20070817030719.GA5542@arbutus.physics.mcmaster.ca> References: <46C4A389.5020108@hawaii.edu> <5b8d13220708161905l76f2ca74p5eedb8ca5ded1d18@mail.gmail.com> <46C50A46.1070208@hawaii.edu> <20070817030719.GA5542@arbutus.physics.mcmaster.ca> Message-ID: <46C542C1.8060907@hawaii.edu> David M. Cooke wrote: > On Thu, Aug 16, 2007 at 04:39:02PM -1000, Eric Firing wrote: >> As far as I can see there is no way of using svn diff to deal with >> this automatically, so in the attached revision I have manually removed >> chunks resulting solely from whitespace. >> >> Is there a better way to handle this problem? A better way to make diffs? >> Or any possibility of routinely cleaning the junk out of the svn source >> files? (Yes, I know--what is junk to me probably results from what others >> consider good behavior of the editor.) > > 'svn diff -x -b' might work better (-b gets passed to diff, which makes > it ignore space changes). Or svn diff -x -w to ignore all whitespace. 
> > Me, I hate trailing ws too (I've got Emacs set up so that gets > highlighted as red, which makes me angry :). The hard tabs in C code is > keeping with the style used in the C Python sources (Emacs even has a > 'python' C style -- do "C-c . python"). > Not any more! See the revised PEP 007, http://www.python.org/dev/peps/pep-0007/ In Python 3000 (and in the 2.x series, in new source files), we'll switch to a different indentation style: 4 spaces per indent, all spaces (no tabs in any file). The rest will remain the same. I would love to see this as the standard in numpy as well. Then files obey WYSIWYG regardless of editor. (Except for unicode woes, but that is another topic.) Eric From peridot.faceted at gmail.com Fri Aug 17 04:11:12 2007 From: peridot.faceted at gmail.com (Anne Archibald) Date: Fri, 17 Aug 2007 04:11:12 -0400 Subject: [Numpy-discussion] .transpose() of memmap array fails to close() In-Reply-To: <20070816141710.GA3154@bams.ccf.swri.edu> References: <20070607214620.GM6116@bams.ccf.swri.edu> <20070810162016.GA12992@bams.ccf.swri.edu> <20070810211438.GF13557@bams.ccf.swri.edu> <20070813161945.GA12360@bams.ccf.swri.edu> <20070815163612.GB23855@bams.ccf.swri.edu> <20070816141710.GA3154@bams.ccf.swri.edu> Message-ID: On 16/08/07, Glen W. Mabey wrote: > On Wed, Aug 15, 2007 at 08:50:28PM -0400, Anne Archibald wrote: > > But to be pythonic, or numpythonic, when the original A is > > garbage-collected, the garbage collection should certainly close the > > mmap. > > Humm, this would be less than ideal for my use case, when the data on > disk is organized in a different dimensional order than I want to refer > to it in my code. For example: > > p_data = numpy.memmap( datafilename, shape=( 10, 1024, 20 ), dtype=numpy.float32, mode='r') > u_data = p_data.transpose( [ 2, 0, 1 ] ) > > and I don't want to have to keep track of p_data because its only u_data > that I care about and want to use. And I promise, this is not a > contrived example. I have data that I really do want to be ordered in a > certain way on disk, for I/O efficiency reasons, yet when I logically > index into it in my code, I want the dimensions to be in a different > order. Perfectly reasonable. Note that p_data cannot be collected until u_data goes away too, so the mmap is safe. And transpose()ing doesn't copy any data, so even if you get an ndarray, you haven't lost the ability to modify things on disk. > > Being able to apply flush() or whatever to slices is not necessarily > > unpythonic, but it's probably a lot simpler to reliably implement > > slices of mmap()s as simple slices of ordinary arrays. > > I considered this approach, but what happens if you want to instantiate > a slice that is very large, e.g., larger than the size of your physical > RAM? In that case, you can't afford to make simple slices be ordinary > arrays, besides the case where you want to change values. Making them > functionally memmap-arrays, but without .sync() and .close() doesn't > seem right either. I was a bit ambiguous. An ordinary numpy array is an ndarray object, which contains some housekeeping data (dimension, shape, stride lengths, some flags, what have you) and a pointer to a hunk of memory. That hunk of memory can be pretty much any directly-addressable memory, for example a contiguous block of malloc()ed RAM, the beginning of a (possibly strided) subblock of an existing piece of malloc()ed RAM, a pointer to an array statically allocated in some C or Fortran library... or a piece of memory in an mmap()ed region. 
Numpy doesn't care at all about the difference. In fact this is the beauty of numpy: because all it cares about is where the elements start, what they look like, how many there are, and how far apart they are, it can cheaply create subarrays without copying any data. So naively, one might implement mmap()ed arrays with a factory function that called mmap(), got back a pointer to the place in virtual memory where the file's contents appear to live, and whipped up a perfectly ordinary ndarray to point to the contents. It would work, thanks to the magic of the OS's mmap() call. The only problem is you would have to figure out when it was safe to close the mmap() (invalidating the array's memory!) and you would have no convenient way to flush() the mmap() out to disk. So the mmap() objects exist. All they are is ndarrays that keep track of how the mmap() was done and provide flush() and close() methods; they also (I hope!) make sure close() gets called when they get garbage-collected. Note that the safety-scissors way to do this would be to *not* provide a close() method, since a close() leaves the object's data unusable, just waiting for an unwise attempt to index into the object. It's probably better not to ever close() an mmap() object. What should happen when you take a slice of an mmap() object? (this includes transposes and other non-copying ways to get at its contents). You get a fresh new ndarray object that does all the numpy magic. But should it also do the mmap() magic? It doesn't need the mmap() creation magic, since the mmap() already exists. flush() would be sort of nice, since that's meaningful (though it might take a long time, if it flushes the whole mmap). close() is just asking to shoot yourself in the foot, since it not only invalidates the slice you took but the whole mmap()! It seems to me - remember, I don't use mmap or develop numpy, so give this opinion the corresponding weight - that the Right Answer for mmap() is to provide flush(), but not to provide close() except on finalization (you can ensure finalization happens by deleting all references to the array). Finally, if you take a slice of an mmap(), I think you should get a simple ndarray. This ensures you don't have to thread type-duplication code into everywhere that might make a slice. But if you do make slices themselves mmap()s, providing flush() to slices too, great. Just don't provide close(), and particularly *don't* invoke it on finalization of slices, or things will die horribly. Anne From matthew.brett at gmail.com Fri Aug 17 08:11:41 2007 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 17 Aug 2007 13:11:41 +0100 Subject: [Numpy-discussion] Error in allclose with inf values Message-ID: <1e2af89e0708170511v26495ecbge022b8efd51a9d9c@mail.gmail.com> Hi, I noticed that allclose does not always behave correctly for arrays with infs. I've attached a test script for allclose, and here's an alternative implementation that I believe behaves correctly. Obviously the test script could be a test case in core/tests/test_numeric.py I wonder if we should allow nans in the test arrays - possibly with an optional keyword arg like allownan. After all inf-inf is nan - but we allow that in the test. Best, Matthew -------------- next part -------------- A non-text attachment was scrubbed... Name: my_allclose.py Type: text/x-python-script Size: 758 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: test_allclose.py Type: text/x-python-script Size: 1463 bytes Desc: not available URL: From zyzhu2000 at gmail.com Fri Aug 17 18:26:55 2007 From: zyzhu2000 at gmail.com (Geoffrey Zhu) Date: Fri, 17 Aug 2007 17:26:55 -0500 Subject: [Numpy-discussion] numpy.array does not take generators In-Reply-To: <46C53E09.9020306@gmail.com> References: <46C53E09.9020306@gmail.com> Message-ID: On 8/17/07, Robert Kern wrote: > Geoffrey Zhu wrote: > > Hi All, > > > > I want to construct a numpy array based on Python objects. In the > > below code, opts is a list of tuples. > > > > For example, > > > > opts=[ ('C', 100, 3, 'A'), ('K', 200, 5.4, 'B')] > > > > If I use a generator like the following: > > > > K=numpy.array(o[2]/1000.0 for o in opts) > > > > It does not work. > > > > I have to use: > > > > numpy.array([o[2]/1000.0 for o in opts]) > > > > Is this behavior intended? > > Yes. With arbitrary generators, there is no good way to do the kind of > mind-reading that numpy.array() usually does with sequences. It would have to > unroll the whole generator anyways. fromiter() works for this, but you are > restricted to 1-D arrays which is a lot easier to implement the mind-reading for. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless enigma > that is made terrible by our own mad attempt to interpret it as though it had > an underlying truth." > -- Umberto Eco > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > I see. Thanks for explaining. From barrywark at gmail.com Fri Aug 17 19:00:58 2007 From: barrywark at gmail.com (Barry Wark) Date: Fri, 17 Aug 2007 16:00:58 -0700 Subject: [Numpy-discussion] numpy.array does not take generators In-Reply-To: <46C53E09.9020306@gmail.com> References: <46C53E09.9020306@gmail.com> Message-ID: Is there a reason not to add an argument to fromiter that specifies the final size of the n-d array? Reading this discussion, I realized that there are several places in my code where I create 2-D arrays like this: arr = N.array([d.data() for d in list_of_data_containers]), where d.data() returns a buffer object. I would guess that this paradigm causes lots of memory copying. The more efficient solution, I think, would be to preallocate the array and then assign each row in a loop. It's so much clearer this way, however, that I've kept it as is in the code. So, what if I could do something like arr = N.fromiter(d.data() for d in list_of_data_containers, shape=(x,y)), with the contract that fromiter will throw an exception if any of the d.data() are not of size y or if there are more than x elements in list_of_data_containers? Just a thought for discussion. barry On 8/16/07, Robert Kern wrote: > Geoffrey Zhu wrote: > > Hi All, > > > > I want to construct a numpy array based on Python objects. In the > > below code, opts is a list of tuples. > > > > For example, > > > > opts=[ ('C', 100, 3, 'A'), ('K', 200, 5.4, 'B')] > > > > If I use a generator like the following: > > > > K=numpy.array(o[2]/1000.0 for o in opts) > > > > It does not work. > > > > I have to use: > > > > numpy.array([o[2]/1000.0 for o in opts]) > > > > Is this behavior intended? > > Yes. With arbitrary generators, there is no good way to do the kind of > mind-reading that numpy.array() usually does with sequences. It would have to > unroll the whole generator anyways. 
fromiter() works for this, but you are > restricted to 1-D arrays which is a lot easier to implement the mind-reading for. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless enigma > that is made terrible by our own mad attempt to interpret it as though it had > an underlying truth." > -- Umberto Eco > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From tim.hochberg at ieee.org Fri Aug 17 20:00:24 2007 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Fri, 17 Aug 2007 17:00:24 -0700 Subject: [Numpy-discussion] numpy.array does not take generators In-Reply-To: References: <46C53E09.9020306@gmail.com> Message-ID: On 8/17/07, Barry Wark wrote: > > Is there a reason not to add an argument to fromiter that specifies > the final size of the n-d array? Reading this discussion, I realized > that there are several places in my code where I create 2-D arrays > like this: > > arr = N.array([d.data() for d in list_of_data_containers]), > > where d.data() returns a buffer object. > > I would guess that this paradigm causes lots of memory copying. The > more efficient solution, I think, would be to preallocate the array > and then assign each row in a loop. It's so much clearer this way, > however, that I've kept it as is in the code. > > So, what if I could do something like > > arr = N.fromiter(d.data() for d in list_of_data_containers, shape=(x,y)), I don't know that there's any theoretical problem in terms of doing something like this. There are a couple of practical issues though. One is that it would significantly increase the implementation complexity of fromiter, which right now is about as simple as it can reasonably be. Someone would need to step forward and write and test the code. The second issue is with the interface. The interface that you propose isn't really right. The current interface is: fromiter(iterable, dtype, count=-1) where count indicates how many items to extract from the iterable (-1 iterates until it is empty). 'shape' as you propose would couple to this in an unnatural way. Adding another keyword argument that indicates just the shape of the elements would make more sense, but it starts to seem a bit clunky. fromiter(iterable, dtype, count-1, itemshape=()) For this particular application, there doesn't seem to be any problem simply defining yourself a little utility function to do this for you. def from_shaped_iter(iterable, dtype, shape): a = numpy.empty(shape, dtype) for i, x in enumerate(iterable): a[i] = x return a I expect this would have decent performance if y dimension is reasonably large. regards, -tim with the contract that fromiter will throw an exception if any of the > d.data() are not of size y or if there are more than x elements in > list_of_data_containers? > > Just a thought for discussion. > > barry > > On 8/16/07, Robert Kern wrote: > > Geoffrey Zhu wrote: > > > Hi All, > > > > > > I want to construct a numpy array based on Python objects. In the > > > below code, opts is a list of tuples. > > > > > > For example, > > > > > > opts=[ ('C', 100, 3, 'A'), ('K', 200, 5.4, 'B')] > > > > > > If I use a generator like the following: > > > > > > K=numpy.array(o[2]/1000.0 for o in opts) > > > > > > It does not work. > > > > > > I have to use: > > > > > > numpy.array([o[2]/1000.0 for o in opts]) > > > > > > Is this behavior intended? > > > > Yes. 
With arbitrary generators, there is no good way to do the kind of > > mind-reading that numpy.array() usually does with sequences. It would > have to > > unroll the whole generator anyways. fromiter() works for this, but you > are > > restricted to 1-D arrays which is a lot easier to implement the > mind-reading for. > > > > -- > > Robert Kern > > > > "I have come to believe that the whole world is an enigma, a harmless > enigma > > that is made terrible by our own mad attempt to interpret it as though > it had > > an underlying truth." > > -- Umberto Eco > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at scipy.org > > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From oliphant.travis at ieee.org Sat Aug 18 03:51:50 2007 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Sat, 18 Aug 2007 01:51:50 -0600 Subject: [Numpy-discussion] fast putmask implementation In-Reply-To: <46C542C1.8060907@hawaii.edu> References: <46C4A389.5020108@hawaii.edu> <5b8d13220708161905l76f2ca74p5eedb8ca5ded1d18@mail.gmail.com> <46C50A46.1070208@hawaii.edu> <20070817030719.GA5542@arbutus.physics.mcmaster.ca> <46C542C1.8060907@hawaii.edu> Message-ID: <46C6A516.4000607@ieee.org> > Not any more! See the revised PEP 007, > http://www.python.org/dev/peps/pep-0007/ > > In Python 3000 (and in the 2.x series, in new source files), > we'll switch to a different indentation style: 4 spaces per indent, > all spaces (no tabs in any file). The rest will remain the same. > > I would love to see this as the standard in numpy as well. Then files > obey WYSIWYG regardless of editor. (Except for unicode woes, but that > is another topic.) > I'm fine with this. Some information on how to make sure emacs (and other editors) does this would be helpful. -Travis From jensj at fysik.dtu.dk Sat Aug 18 05:00:56 2007 From: jensj at fysik.dtu.dk (Jens =?ISO-8859-1?Q?J=F8rgen?= Mortensen) Date: Sat, 18 Aug 2007 11:00:56 +0200 Subject: [Numpy-discussion] Non-contiguous array from newaxis indexing Message-ID: <1187427656.8294.11.camel@b307-242.fysik.dtu.dk> I would like all these arrays to be contiguous: >>> import numpy as npy >>> npy.__version__ '1.0.4.dev3967' >>> x = npy.arange(4) >>> y = x[npy.newaxis, :] >>> z = x.reshape((1, 4)) >>> for a in [x, y, z]: ... print a.shape, a.strides, a.flags.contiguous ... (4,) (4,) True (1, 4) (0, 4) False (1, 4) (16, 4) True But y is not contiguous according to y.flags.contiguous - why not and why does y and z not have the same strides? I found this comment just before the _IsContiguous function in arrayobject.c: /* 0-strided arrays are not contiguous (even if dimension == 1) */ Is this correct? Jens J?rgen Mortensen From stefan at sun.ac.za Sat Aug 18 06:21:07 2007 From: stefan at sun.ac.za (Stefan van der Walt) Date: Sat, 18 Aug 2007 12:21:07 +0200 Subject: [Numpy-discussion] Finding a row match within a numpy array In-Reply-To: <46C1898F.6020107@bristol.ac.uk> References: <46C1898F.6020107@bristol.ac.uk> Message-ID: <20070818102107.GL2977@mentat.za.net> On Tue, Aug 14, 2007 at 11:53:03AM +0100, Andy Cheesman wrote: > Dear nice people > > I'm trying to match a row (b) within a large numpy array (a). 
My most > successful attempt is below > > hit = equal(b, a) > total_hits = add.reduce(hit, 1) > max_hit = argmax(total_hits, 0) > answer = a[max_hit] > > where ... > a = array([[ 0, 1, 2, 3], > [ 4, 5, 6, 7], > [ 8, 9, 10, 11], > [12, 13, 14, 15]]) > > b = array([8, 9, 10, 11]) > > > > I was wondering if people could suggest a possible more efficient route > as there seems to be numerous steps. Another way to do it: a[N.apply_along_axis(N.all,1,a==b)] Cheers St?fan From stefan at sun.ac.za Sat Aug 18 06:25:21 2007 From: stefan at sun.ac.za (Stefan van der Walt) Date: Sat, 18 Aug 2007 12:25:21 +0200 Subject: [Numpy-discussion] Finding a row match within a numpy array In-Reply-To: <46C1898F.6020107@bristol.ac.uk> References: <46C1898F.6020107@bristol.ac.uk> Message-ID: <20070818102521.GM2977@mentat.za.net> On Tue, Aug 14, 2007 at 11:53:03AM +0100, Andy Cheesman wrote: > Dear nice people > > I'm trying to match a row (b) within a large numpy array (a). My most > successful attempt is below > > hit = equal(b, a) > total_hits = add.reduce(hit, 1) > max_hit = argmax(total_hits, 0) > answer = a[max_hit] > > where ... > a = array([[ 0, 1, 2, 3], > [ 4, 5, 6, 7], > [ 8, 9, 10, 11], > [12, 13, 14, 15]]) > > b = array([8, 9, 10, 11]) > > > > I was wondering if people could suggest a possible more efficient route > as there seems to be numerous steps. For large arrays, you may not want to calculate a == b, so you could also do [row for row in a if N.all(row == b)] or find the indices using [r for r,row in enumerate(a) if N.all(row == b)] Cheers St?fan From oliphant.travis at ieee.org Sat Aug 18 06:51:53 2007 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Sat, 18 Aug 2007 04:51:53 -0600 Subject: [Numpy-discussion] Non-contiguous array from newaxis indexing In-Reply-To: <1187427656.8294.11.camel@b307-242.fysik.dtu.dk> References: <1187427656.8294.11.camel@b307-242.fysik.dtu.dk> Message-ID: <46C6CF49.6020202@ieee.org> Jens J?rgen Mortensen wrote: > I would like all these arrays to be contiguous: > >>>> import numpy as npy >>>> npy.__version__ > '1.0.4.dev3967' >>>> x = npy.arange(4) >>>> y = x[npy.newaxis, :] >>>> z = x.reshape((1, 4)) >>>> for a in [x, y, z]: > ... print a.shape, a.strides, a.flags.contiguous > ... > (4,) (4,) True > (1, 4) (0, 4) False > (1, 4) (16, 4) True > > But y is not contiguous according to y.flags.contiguous - why not and > why does y and z not have the same strides? > > I f We've tried a few times to let them be contiguous, but it breaks code in various ways because NumPy takes advantage of 0-striding to accomplish broadcasting. In theory, it might be able to be fixed, but the fact that simple fixes don't work makes me wonder. ound this comment just before the _IsContiguous function in > arrayobject.c: > > /* 0-strided arrays are not contiguous (even if dimension == 1) */ > > Is this correct? Yes. -Travis From gael.varoquaux at normalesup.org Sat Aug 18 11:11:49 2007 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sat, 18 Aug 2007 17:11:49 +0200 Subject: [Numpy-discussion] fast putmask implementation In-Reply-To: <46C6A516.4000607@ieee.org> References: <46C4A389.5020108@hawaii.edu> <5b8d13220708161905l76f2ca74p5eedb8ca5ded1d18@mail.gmail.com> <46C50A46.1070208@hawaii.edu> <20070817030719.GA5542@arbutus.physics.mcmaster.ca> <46C542C1.8060907@hawaii.edu> <46C6A516.4000607@ieee.org> Message-ID: <20070818151149.GB16053@clipper.ens.fr> On Sat, Aug 18, 2007 at 01:51:50AM -0600, Travis Oliphant wrote: > I'm fine with this. 
Some information on how to make sure emacs (and > other editors) does this would be helpful. Under vim, put in your .vimrc: autocmd FileType python set autoindent tabstop=4 shiftwidth=4 smarttab expandtab Ga?l From faltet at carabos.com Sat Aug 18 16:04:39 2007 From: faltet at carabos.com (Francesc Altet) Date: Sat, 18 Aug 2007 22:04:39 +0200 Subject: [Numpy-discussion] Fwd: Request for Use Cases - h5import and text data Message-ID: <200708182204.39674.faltet@carabos.com> Hi, This has been sent to the hdf-forum at hdfgroup.org list, but it should of interest to NumPy/SciPy lists too. Remember that you can access most of the HDF5 files from Python by using PyTables. Cheers, -- >0,0< Francesc Altet http://www.carabos.com/ V V C?rabos Coop. V. Enjoy Data "-" ----------- Original message ------------------------------------- Request for Use Cases - h5import and text data h5import is an HDF5 tool that converts floating point or integer data stored in ASCII or binary files into the HDF5 format. Currently h5import only processes numeric data. The HDF Group plans to add support for importing text data into HDF5 using h5import. We are now soliciting use cases that will guide the design of the text to dataset import feature in h5import. Please consider text you might want to import and how you would want to access that text once it is in the HDF5 file, and send your use cases to help at hdfgroup.org before September 17, 2007. Thank you for your help as we strive to improve our tools. ------------------------------------------------------------------------- This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ _______________________________________________ Pytables-users mailing list Pytables-users at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/pytables-users ------------------------------------------------------- -- >0,0< Francesc Altet ? ? http://www.carabos.com/ V V C?rabos Coop. V. ??Enjoy Data "-" From matthew.brett at gmail.com Sun Aug 19 07:57:50 2007 From: matthew.brett at gmail.com (Matthew Brett) Date: Sun, 19 Aug 2007 12:57:50 +0100 Subject: [Numpy-discussion] Error in allclose with inf values In-Reply-To: <1e2af89e0708170511v26495ecbge022b8efd51a9d9c@mail.gmail.com> References: <1e2af89e0708170511v26495ecbge022b8efd51a9d9c@mail.gmail.com> Message-ID: <1e2af89e0708190457o7e0269b9ta2b8b0aff787e029@mail.gmail.com> Hi again, > I noticed that allclose does not always behave correctly for arrays with infs. 
Sorry, perhaps I should have been more specific; this is the behavior of allclose that I was referring to (documented in the tests I attached): In [6]:N.allclose([N.inf, 1, 2], [10, 10, N.inf]) Out[6]:array([ True], dtype=bool) In [7]:N.allclose([N.inf, N.inf, N.inf], [10, 10, N.inf]) Warning: invalid value encountered in subtract Out[7]:True In [9]:N.allclose([N.inf, N.inf], [10, 10]) --------------------------------------------------------------------------- exceptions.AttributeError Traceback (most recent call last) /home/mb312/ /home/mb312/lib/python2.4/site-packages/numpy/core/numeric.py in allclose(a, b, rtol, atol) 843 d3 = (x[xinf] == y[yinf]) 844 d4 = (~xinf & ~yinf) --> 845 if d3.size < 2: 846 if d3.size==0: 847 return False AttributeError: 'bool' object has no attribute 'size' Matthew From fperez.net at gmail.com Sun Aug 19 20:51:11 2007 From: fperez.net at gmail.com (Fernando Perez) Date: Sun, 19 Aug 2007 18:51:11 -0600 Subject: [Numpy-discussion] [SciPy-dev] NumPy 1.0.3.x and SciPy 0.5.2.x In-Reply-To: References: Message-ID: Hey, On 8/15/07, Jarrod Millman wrote: > Hello, > > I am hoping to release NumPy 1.0.3.1 and SciPy 0.5.2.1 this weekend. > These releases will work with each other and get rid of the annoying > deprecation warning about SciPyTest. I just wanted to give you a public, huge thank you for tackling this most thankless but important problem. Many people at the just finished SciPy'07 conference mentioned better deployment/installation support as their main issue with scipy. Our tools are maturing, but we won't get very far if they don't actually get in the hands of users. Regards, f From markbak at gmail.com Mon Aug 20 09:05:54 2007 From: markbak at gmail.com (mark) Date: Mon, 20 Aug 2007 13:05:54 -0000 Subject: [Numpy-discussion] selecting part of an array like a[ a<5 ] Message-ID: <1187615154.941614.314430@50g2000hsm.googlegroups.com> Hello - I am wondering what the better way is to select part of an array. Say I have an array a: a = arange(10) Now I want to select the values larger than 5 a[ a>5 ] and later I need the values smaller or equal to 5 a[ a<=5 ] It seems that doing the comparison twice is extra work (especially if the array is large). So I thought I store the comparison b = a>5 Now I can do a[b] But how do I get the others? a[not b] or a[!b] don't work. So it's gotta be something different. Besides, is it a good idea to store b like I suggest? Thanks for the help, Mark From kwgoodman at gmail.com Mon Aug 20 09:19:34 2007 From: kwgoodman at gmail.com (Keith Goodman) Date: Mon, 20 Aug 2007 15:19:34 +0200 Subject: [Numpy-discussion] selecting part of an array like a[ a<5 ] In-Reply-To: <1187615154.941614.314430@50g2000hsm.googlegroups.com> References: <1187615154.941614.314430@50g2000hsm.googlegroups.com> Message-ID: On 8/20/07, mark wrote: > b = a>5 > > a[not b] or a[!b] don't work. So it's gotta be something different. a[~b] From sameerslists at gmail.com Mon Aug 20 09:34:53 2007 From: sameerslists at gmail.com (Sameer DCosta) Date: Mon, 20 Aug 2007 08:34:53 -0500 Subject: [Numpy-discussion] Setting numpy record array elements Message-ID: <8fb8cc060708200634n66d58be1ibbaaff069dd45ab3@mail.gmail.com> Hi, In the example below I have a record array *a* that has a column *col1". I am trying to set the first element of a.col1 to zero in two different ways. 1. a[0].col1 = 0 (This fails silently) 2. a.col1[0] = 0 (This works fine) I am using the latest svn version of numpy. Is this a bug? or is the first method supposed to fail? 
If it is supposed to fail should it fail silently? Thanks in advance for any replies. Cheers, Sameer %%%% Example Code %%% import numpy print numpy.__version__ a = numpy.rec.fromrecords( [ (1,2,3), (4,5,6)], dtype=[("col1", " References: <46C4A389.5020108@hawaii.edu> <5b8d13220708161905l76f2ca74p5eedb8ca5ded1d18@mail.gmail.com> <46C50A46.1070208@hawaii.edu> <20070817030719.GA5542@arbutus.physics.mcmaster.ca> <46C542C1.8060907@hawaii.edu> <46C6A516.4000607@ieee.org> Message-ID: <20070820142619.GD7531@mentat.za.net> On Sat, Aug 18, 2007 at 01:51:50AM -0600, Travis Oliphant wrote: > > Not any more! See the revised PEP 007, > > http://www.python.org/dev/peps/pep-0007/ > > > > In Python 3000 (and in the 2.x series, in new source files), > > we'll switch to a different indentation style: 4 spaces per indent, > > all spaces (no tabs in any file). The rest will remain the same. > > > > I would love to see this as the standard in numpy as well. Then files > > obey WYSIWYG regardless of editor. (Except for unicode woes, but that > > is another topic.) > > > > I'm fine with this. Some information on how to make sure emacs (and > other editors) does this would be helpful. Here are some of my .emacs snippets. I assume that .el files are placed in ~/elisp and that the following line is in your emacs configuration: (add-to-list 'load-path "~/elisp") Highlight unnecessary whitespace ================================ Download: http://www.emacswiki.org/cgi-bin/wiki/show-wspace.el ; Show whitespace (require 'show-wspace) (add-hook 'python-mode-hook 'highlight-tabs) (add-hook 'font-lock-mode-hook 'highlight-trailing-whitespace) Clean up tabs and trailing spaces ================================= M-x untabify and M-x whitespace-cleanup Tell emacs never to use tabs ============================ (setq-default indent-tabs-mode nil) Highlight all text after column 80 ================================== Download: http://www.emacswiki.org/cgi-bin/wiki/column-marker.el (require 'column-marker) (add-hook 'font-lock-mode-hook (lambda () (interactive) (column-marker-1 80))) Show a ruler with the current column position ============================================= (require 'ruler-mode) (add-hook 'font-lock-mode-hook 'ruler-mode) Enable restructured text (ReST) editing ======================================= (require 'rst) (add-hook 'text-mode-hook 'rst-text-mode-bindings) Fix outline-mode to work with Python ==================================== (add-hook 'python-mode-hook 'my-python-hook) (defun py-outline-level () "This is so that `current-column` DTRT in otherwise-hidden text" ;; from ada-mode.el (let (buffer-invisibility-spec) (save-excursion (skip-chars-forward "\t ") (current-column)))) ; this fragment originally came from the web somewhere, but the outline-regexp ; was horribly broken and is broken in all instances of this code floating ; around. Finally fixed by Charl P. Botha ; <http://cpbotha.net/> (defun my-python-hook () (setq outline-regexp "[^ \t\n]\\|[ \t]*\\(def[ \t]+\\|class[ \t]+\\)") ; enable our level computation (setq outline-level 'py-outline-level) ; do not use their \C-c@ prefix, too hard to type. 
Note this overides ;some python mode bindings ;(setq outline-minor-mode-prefix "\C-c") ; turn on outline mode (outline-minor-mode t) ; initially hide all but the headers ; (hide-body) (show-paren-mode 1) ) Cheers St?fan From stefan at sun.ac.za Mon Aug 20 10:54:55 2007 From: stefan at sun.ac.za (Stefan van der Walt) Date: Mon, 20 Aug 2007 16:54:55 +0200 Subject: [Numpy-discussion] Error in allclose with inf values In-Reply-To: <1e2af89e0708170511v26495ecbge022b8efd51a9d9c@mail.gmail.com> References: <1e2af89e0708170511v26495ecbge022b8efd51a9d9c@mail.gmail.com> Message-ID: <20070820145455.GF7531@mentat.za.net> Hi Matthew On Fri, Aug 17, 2007 at 01:11:41PM +0100, Matthew Brett wrote: > I noticed that allclose does not always behave correctly for arrays with infs. > > I've attached a test script for allclose, and here's an alternative > implementation that I believe behaves correctly. Thanks for the patch -- I applied it in r3977 with the appropriate tests. > I wonder if we should allow nans in the test arrays - possibly with an > optional keyword arg like allownan. After all inf-inf is nan - but we > allow that in the test. I'm happy with both allclose([Inf],[-Inf]) and allclose([anything],[NaN]) returning False. Cheers St?fan From stefan at sun.ac.za Mon Aug 20 12:10:25 2007 From: stefan at sun.ac.za (Stefan van der Walt) Date: Mon, 20 Aug 2007 18:10:25 +0200 Subject: [Numpy-discussion] Setting numpy record array elements In-Reply-To: <8fb8cc060708200634n66d58be1ibbaaff069dd45ab3@mail.gmail.com> References: <8fb8cc060708200634n66d58be1ibbaaff069dd45ab3@mail.gmail.com> Message-ID: <20070820161025.GH7531@mentat.za.net> On Mon, Aug 20, 2007 at 08:34:53AM -0500, Sameer DCosta wrote: > In the example below I have a record array *a* that has a column > *col1". I am trying to set the first element of a.col1 to zero in two > different ways. > > 1. a[0].col1 = 0 (This fails silently) > 2. a.col1[0] = 0 (This works fine) > > I am using the latest svn version of numpy. Is this a bug? or is the > first method supposed to fail? If it is supposed to fail should it > fail silently? This looks like a bug, since a[0][0] = 0 works fine. I'll take a closer look and make sure. Cheers St?fan From zyzhu2000 at gmail.com Mon Aug 20 17:36:52 2007 From: zyzhu2000 at gmail.com (Geoffrey Zhu) Date: Mon, 20 Aug 2007 16:36:52 -0500 Subject: [Numpy-discussion] "Extended" Outer Product Message-ID: Hi Everyone, I am wondering if there is an "extended" outer product. Take the example in "Guide to Numpy." Instead of doing an multiplication, I want to call a custom function for each pair. >>> print outer([1,2,3],[10,100,1000]) [[ 10 100 1000] [ 20 200 2000] [ 30 300 3000]] So I want: [ [f(1,10), f(1,100), f(1,1000)], [f(2,10), f(2, 100), f(2, 1000)], [f(3,10), f(3, 100), f(3,1000)] ] Does anyone know how to do this without using a double loop? Thanks, Geoffrey From robert.kern at gmail.com Mon Aug 20 18:37:01 2007 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 20 Aug 2007 17:37:01 -0500 Subject: [Numpy-discussion] "Extended" Outer Product In-Reply-To: References: Message-ID: <46CA178D.5020901@gmail.com> Geoffrey Zhu wrote: > Hi Everyone, > > I am wondering if there is an "extended" outer product. Take the > example in "Guide to Numpy." Instead of doing an multiplication, I > want to call a custom function for each pair. 
> >>>> print outer([1,2,3],[10,100,1000]) > > [[ 10 100 1000] > [ 20 200 2000] > [ 30 300 3000]] > > > So I want: > > [ > [f(1,10), f(1,100), f(1,1000)], > [f(2,10), f(2, 100), f(2, 1000)], > [f(3,10), f(3, 100), f(3,1000)] > ] > > Does anyone know how to do this without using a double loop? If you can code your function such that it only uses operations that broadcast (i.e. operators and ufuncs) and avoids things like branching or loops, then you can just use numpy.newaxis on the first array. from numpy import array, newaxis x = array([1, 2, 3]) y = array([10, 100, 1000]) f(x[:,newaxis], y) Otherwise, you can use numpy.vectorize() to turn your function into one that will do that broadcasting for you. from numpy import array, newaxis, vectorize x = array([1, 2, 3]) y = array([10, 100, 1000]) f = vectorize(f) f(x[:,newaxis], y) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From sameerslists at gmail.com Mon Aug 20 19:26:30 2007 From: sameerslists at gmail.com (Sameer DCosta) Date: Mon, 20 Aug 2007 18:26:30 -0500 Subject: [Numpy-discussion] Setting numpy record array elements In-Reply-To: <20070820161025.GH7531@mentat.za.net> References: <8fb8cc060708200634n66d58be1ibbaaff069dd45ab3@mail.gmail.com> <20070820161025.GH7531@mentat.za.net> Message-ID: <8fb8cc060708201626v6c256480k6dd8a0828c8d6a74@mail.gmail.com> On 8/20/07, Stefan van der Walt wrote: > > This looks like a bug, since > > a[0][0] = 0 > > works fine. I'll take a closer look and make sure. > Thanks Stefan for offering to take a closer look. I have attached a patch against the latest svn which fixes this problem. Both this patched version and the current subversion source do not throw an AttributeError exception if you do something like a[0].non_existent_column = 10 That is a different problem that probably should be fixed. Cheers, Sameer -------------- next part -------------- A non-text attachment was scrubbed... Name: numpy_patch.svn.diff Type: application/octet-stream Size: 988 bytes Desc: not available URL: From Chris.Barker at noaa.gov Mon Aug 20 19:51:55 2007 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 20 Aug 2007 16:51:55 -0700 Subject: [Numpy-discussion] "Extended" Outer Product In-Reply-To: <46CA178D.5020901@gmail.com> References: <46CA178D.5020901@gmail.com> Message-ID: <46CA291B.30800@noaa.gov> Robert Kern wrote: > If you can code your function such that it only uses operations that broadcast > (i.e. operators and ufuncs) and avoids things like branching or loops, then you > can just use numpy.newaxis on the first array. > > from numpy import array, newaxis > x = array([1, 2, 3]) > y = array([10, 100, 1000]) > f(x[:,newaxis], y) in fact, it may make sense to just have your x be column vector anyway: >>> x array([1, 2, 3]) >>> y array([10, 11, 12]) >>> x.shape = (-1,1) >>> x array([[1], [2], [3]]) >>> x * y array([[10, 11, 12], [20, 22, 24], [30, 33, 36]]) Broadcasting is VERY cool! -Chris -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From charlesr.harris at gmail.com Tue Aug 21 01:14:52 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 20 Aug 2007 23:14:52 -0600 Subject: [Numpy-discussion] "Extended" Outer Product In-Reply-To: References: Message-ID: On 8/20/07, Geoffrey Zhu wrote: > > Hi Everyone, > > I am wondering if there is an "extended" outer product. Take the > example in "Guide to Numpy." Instead of doing an multiplication, I > want to call a custom function for each pair. > > >>> print outer([1,2,3],[10,100,1000]) > > [[ 10 100 1000] > [ 20 200 2000] > [ 30 300 3000]] > > > So I want: > > [ > [f(1,10), f(1,100), f(1,1000)], > [f(2,10), f(2, 100), f(2, 1000)], > [f(3,10), f(3, 100), f(3,1000)] > ] You could make two matrices like so: In [46]: a = arange(3) In [47]: b = a.reshape(1,3).repeat(3,0) In [48]: c = a.reshape(3,1).repeat(3,1) In [49]: b Out[49]: array([[0, 1, 2], [0, 1, 2], [0, 1, 2]]) In [50]: c Out[50]: array([[0, 0, 0], [1, 1, 1], [2, 2, 2]]) which will give you all pairs. You can then make a function of these in various ways, for example In [52]: c**b Out[52]: array([[1, 0, 0], [1, 1, 1], [1, 2, 4]]) That is a bit clumsy, though. I don't know how to do what you want in a direct way. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From millman at berkeley.edu Tue Aug 21 01:50:21 2007 From: millman at berkeley.edu (Jarrod Millman) Date: Mon, 20 Aug 2007 22:50:21 -0700 Subject: [Numpy-discussion] NumPy 1.0.3.1 released Message-ID: I'm pleased to announce the release of NumPy 1.0.3.1 This a minor bug fix release, which enables the latest release of SciPy to build. Bug-fixes =============== * Add back get_path to numpy.distutils.misc_utils * Fix 64-bit zgeqrf * Add parenthesis around GETPTR macros Thank you to everybody who contributed to the recent release. Best regards, NumPy Developers http://numpy.scipy.org From stefan at sun.ac.za Tue Aug 21 02:40:16 2007 From: stefan at sun.ac.za (Stefan van der Walt) Date: Tue, 21 Aug 2007 08:40:16 +0200 Subject: [Numpy-discussion] Setting numpy record array elements In-Reply-To: <8fb8cc060708201626v6c256480k6dd8a0828c8d6a74@mail.gmail.com> References: <8fb8cc060708200634n66d58be1ibbaaff069dd45ab3@mail.gmail.com> <20070820161025.GH7531@mentat.za.net> <8fb8cc060708201626v6c256480k6dd8a0828c8d6a74@mail.gmail.com> Message-ID: <20070821064015.GB14999@mentat.za.net> Hi Sameer On Mon, Aug 20, 2007 at 06:26:30PM -0500, Sameer DCosta wrote: > On 8/20/07, Stefan van der Walt wrote: > Thanks Stefan for offering to take a closer look. I have attached a > patch against the latest svn which fixes this problem. Yup, right on the money. The __setattr__ call sets the value of a[0].col1, but a[0].col1 is in fact a pointer to a[0][0]. It is therefore necessary to use the setfields method. I cannot think of any situation where you would need to call __setattr__ on another member of "void", so I'm going to apply the patch unless anyone objects. 
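A minimal sketch, for readers skimming the thread, of the three assignment spellings discussed above, using the same record layout as Sameer's example; the works/fails notes simply restate what was reported in this thread for the svn numpy of the time:

import numpy
a = numpy.rec.fromrecords([(1, 2, 3), (4, 5, 6)],
                          names='col1,col2,col3')
a.col1[0] = 0    # works: take the column view first, then index it
a[0][0] = 0      # works: plain item assignment on the record
a[0].col1 = 0    # the form reported above to fail silently before the patch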
Cheers St?fan From mpmusu at cc.usu.edu Tue Aug 21 10:23:30 2007 From: mpmusu at cc.usu.edu (Mark.Miller) Date: Tue, 21 Aug 2007 08:23:30 -0600 Subject: [Numpy-discussion] Finding a row match within a numpy array In-Reply-To: <1187190071.384881.240470@w3g2000hsg.googlegroups.com> References: <46C2CD01.5030307@bristol.ac.uk> <1187172556.613122.207400@r29g2000hsg.googlegroups.com> <46C2D4C7.4010305@bristol.ac.uk> <1187175588.007436.125720@w3g2000hsg.googlegroups.com> <1187190071.384881.240470@w3g2000hsg.googlegroups.com> Message-ID: <46CAF562.9060009@cc.usu.edu> A slightly related question on this topic... Is there a good loopless way to identify all of the unique rows in an array? Something like numpy.unique() is ideal, but capable of extracting unique subarrays along an axis. Thanks, -Mark mark wrote: > Maybe this is not the intended use of where, but it seems to work: >>>> from numpy import * # No complaining now >>>> a = arange(12) >>>> a.shape = (4,3) >>>> a > array([[ 0, 1, 2], > [ 3, 4, 5], > [ 6, 7, 8], > [ 9, 10, 11]]) >>>> b = array([6,7,8]) >>>> row = all( equal(a,b), 1 ) >>>> where(row==True) > (array([2]),) > From charlesr.harris at gmail.com Tue Aug 21 12:44:09 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 21 Aug 2007 10:44:09 -0600 Subject: [Numpy-discussion] "Extended" Outer Product In-Reply-To: References: Message-ID: On 8/20/07, Geoffrey Zhu wrote: > > Hi Everyone, > > I am wondering if there is an "extended" outer product. Take the > example in "Guide to Numpy." Instead of doing an multiplication, I > want to call a custom function for each pair. > > >>> print outer([1,2,3],[10,100,1000]) > > [[ 10 100 1000] > [ 20 200 2000] > [ 30 300 3000]] > > > So I want: > > [ > [f(1,10), f(1,100), f(1,1000)], > [f(2,10), f(2, 100), f(2, 1000)], > [f(3,10), f(3, 100), f(3,1000)] > ] Maybe something like In [15]: f = lambda x,y : x*sin(y) In [16]: a = array([[f(i,j) for i in range(3)] for j in range(3)]) In [17]: a Out[17]: array([[ 0. , 0. , 0. ], [ 0. , 0.84147098, 1.68294197], [ 0. , 0.90929743, 1.81859485]]) I don't know if nested list comprehensions are faster than two nested loops, but at least they avoid array indexing. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.hochberg at ieee.org Tue Aug 21 13:59:56 2007 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Tue, 21 Aug 2007 10:59:56 -0700 Subject: [Numpy-discussion] "Extended" Outer Product In-Reply-To: References: Message-ID: On 8/21/07, Charles R Harris wrote: > > > > On 8/20/07, Geoffrey Zhu wrote: > > > > Hi Everyone, > > > > I am wondering if there is an "extended" outer product. Take the > > example in "Guide to Numpy." Instead of doing an multiplication, I > > want to call a custom function for each pair. > > > > >>> print outer([1,2,3],[10,100,1000]) > > > > [[ 10 100 1000] > > [ 20 200 2000] > > [ 30 300 3000]] > > > > > > So I want: > > > > [ > > [f(1,10), f(1,100), f(1,1000)], > > [f(2,10), f(2, 100), f(2, 1000)], > > [f(3,10), f(3, 100), f(3,1000)] > > ] > > > Maybe something like > > In [15]: f = lambda x,y : x*sin(y) > > In [16]: a = array([[f(i,j) for i in range(3)] for j in range(3)]) > > In [17]: a > Out[17]: > array([[ 0. , 0. , 0. ], > [ 0. , 0.84147098, 1.68294197], > [ 0. , 0.90929743, 1.81859485]]) > > I don't know if nested list comprehensions are faster than two nested > loops, but at least they avoid array indexing. 
> This is just a general comment on recent threads of this type and not directed specifically at Chuck or anyone else. IMO, the emphasis on avoiding FOR loops at all costs is misplaced. It is often more memory friendly and thus faster to vectorize only the inner loop and leave outer loops alone. Everything varies with the specific case of course, but trying to avoid FOR loops on principle is not a good strategy. -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From zyzhu2000 at gmail.com Tue Aug 21 15:46:28 2007 From: zyzhu2000 at gmail.com (Geoffrey Zhu) Date: Tue, 21 Aug 2007 14:46:28 -0500 Subject: [Numpy-discussion] "Extended" Outer Product In-Reply-To: References: Message-ID: On 8/21/07, Timothy Hochberg wrote: > > > > On 8/21/07, Charles R Harris wrote: > > > > > > > > On 8/20/07, Geoffrey Zhu < zyzhu2000 at gmail.com> wrote: > > > Hi Everyone, > > > > > > I am wondering if there is an "extended" outer product. Take the > > > example in "Guide to Numpy." Instead of doing an multiplication, I > > > want to call a custom function for each pair. > > > > > > >>> print outer([1,2,3],[10,100,1000]) > > > > > > [[ 10 100 1000] > > > [ 20 200 2000] > > > [ 30 300 3000]] > > > > > > > > > So I want: > > > > > > [ > > > [f(1,10), f(1,100), f(1,1000)], > > > [f(2,10), f(2, 100), f(2, 1000)], > > > [f(3,10), f(3, 100), f(3,1000)] > > > ] > > > > > > Maybe something like > > > > In [15]: f = lambda x,y : x*sin(y) > > > > In [16]: a = array([[f(i,j) for i in range(3)] for j in range(3)]) > > > > In [17]: a > > Out[17]: > > array([[ 0. , 0. , 0. ], > > [ 0. , 0.84147098, 1.68294197], > > [ 0. , 0.90929743, 1.81859485]]) > > > > I don't know if nested list comprehensions are faster than two nested > loops, but at least they avoid array indexing. > > This is just a general comment on recent threads of this type and not > directed specifically at Chuck or anyone else. > > IMO, the emphasis on avoiding FOR loops at all costs is misplaced. It is > often more memory friendly and thus faster to vectorize only the inner loop > and leave outer loops alone. Everything varies with the specific case of > course, but trying to avoid FOR loops on principle is not a good strategy. > I agree. My original post asked for solutions without using two nested for loops because I already know the two for loop solution. Besides, I was hoping that some version of 'outer' will take in a function reference and call the function instead of doing multiplifcation. From tim.hochberg at ieee.org Tue Aug 21 15:56:03 2007 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Tue, 21 Aug 2007 12:56:03 -0700 Subject: [Numpy-discussion] "Extended" Outer Product In-Reply-To: References: Message-ID: On 8/21/07, Geoffrey Zhu wrote: > > On 8/21/07, Timothy Hochberg wrote: > > > > > > > > On 8/21/07, Charles R Harris wrote: > > > > > > > > > > > > On 8/20/07, Geoffrey Zhu < zyzhu2000 at gmail.com> wrote: > > > > Hi Everyone, > > > > > > > > I am wondering if there is an "extended" outer product. Take the > > > > example in "Guide to Numpy." Instead of doing an multiplication, I > > > > want to call a custom function for each pair. 
> > > > > > > > >>> print outer([1,2,3],[10,100,1000]) > > > > > > > > [[ 10 100 1000] > > > > [ 20 200 2000] > > > > [ 30 300 3000]] > > > > > > > > > > > > So I want: > > > > > > > > [ > > > > [f(1,10), f(1,100), f(1,1000)], > > > > [f(2,10), f(2, 100), f(2, 1000)], > > > > [f(3,10), f(3, 100), f(3,1000)] > > > > ] > > > > > > > > > Maybe something like > > > > > > In [15]: f = lambda x,y : x*sin(y) > > > > > > In [16]: a = array([[f(i,j) for i in range(3)] for j in range(3)]) > > > > > > In [17]: a > > > Out[17]: > > > array([[ 0. , 0. , 0. ], > > > [ 0. , 0.84147098, 1.68294197], > > > [ 0. , 0.90929743, 1.81859485]]) > > > > > > I don't know if nested list comprehensions are faster than two nested > > loops, but at least they avoid array indexing. > > > > This is just a general comment on recent threads of this type and not > > directed specifically at Chuck or anyone else. > > > > IMO, the emphasis on avoiding FOR loops at all costs is misplaced. It is > > often more memory friendly and thus faster to vectorize only the inner > loop > > and leave outer loops alone. Everything varies with the specific case of > > course, but trying to avoid FOR loops on principle is not a good > strategy. > > > > I agree. My original post asked for solutions without using two nested > for loops because I already know the two for loop solution. Besides, I > was hoping that some version of 'outer' will take in a function > reference and call the function instead of doing multiplifcation. A specific example would help here. There are ways to deal with certain subclasses of problems that won't necessarily generalize. For example, are you aware of the outer methods on ufuncs (add.outer, substract.outer, etc)? Typical dimensions also matter, since some approaches work well for certain shapes, but are pretty miserable for others. FWIW, I often have very good luck with removing the inner for-loop in favor of vector operations. This tends to be simpler than trying to vectorize everything and often has better performance since it's often more memory friendly. However, it all depends on specifics of the problem. Regards, -tim -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From peridot.faceted at gmail.com Tue Aug 21 16:32:49 2007 From: peridot.faceted at gmail.com (Anne Archibald) Date: Tue, 21 Aug 2007 16:32:49 -0400 Subject: [Numpy-discussion] "Extended" Outer Product In-Reply-To: References: Message-ID: On 21/08/07, Timothy Hochberg wrote: > This is just a general comment on recent threads of this type and not > directed specifically at Chuck or anyone else. > > IMO, the emphasis on avoiding FOR loops at all costs is misplaced. It is > often more memory friendly and thus faster to vectorize only the inner loop > and leave outer loops alone. Everything varies with the specific case of > course, but trying to avoid FOR loops on principle is not a good strategy. Yes and no. From a performance point of view, you are certainly right; vectorizing is definitely not always a speedup. But for me, the main advantage of vectorized operations is generally clarity: C = A*B is clearer and simpler than C = [a*b for (a,b) in zip(A,B)]. When it's not clearer and simpler, I feel no compunction about falling back to list comprehensions and for loops. That said, it would often be nice to have something like map(f,arange(10)) for arrays; the best I've found is vectorize(f)(arange(10)). 
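(A small sketch of that usage, together with the "extended outer product" case from the start of this thread done via vectorize plus newaxis, as Robert suggested earlier; the function g below is only a placeholder:)

import numpy as N

def g(x, y):
    # placeholder scalar function of two scalars
    return x * N.sin(y)

gv = N.vectorize(g)
x = N.array([1, 2, 3])
y = N.array([10, 100, 1000])

print gv(x, y)                    # elementwise, like map(g, x, y)
print gv(x[:, N.newaxis], y)      # 3x3 table of g over all pairs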
vectorize, of course, is a good example of my point above: it really just loops, in python IIRC, but conceptually it's extremely handy for doing exactly what the OP wanted. Unfortunately vectorize() does not yield a sufficiently ufunc-like object to support .outer(), as that would be extremely tidy. Anne From tim.hochberg at ieee.org Tue Aug 21 17:14:00 2007 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Tue, 21 Aug 2007 14:14:00 -0700 Subject: [Numpy-discussion] "Extended" Outer Product In-Reply-To: References: Message-ID: On 8/21/07, Anne Archibald wrote: > > On 21/08/07, Timothy Hochberg wrote: > > > This is just a general comment on recent threads of this type and not > > directed specifically at Chuck or anyone else. > > > > IMO, the emphasis on avoiding FOR loops at all costs is misplaced. It is > > often more memory friendly and thus faster to vectorize only the inner > loop > > and leave outer loops alone. Everything varies with the specific case of > > course, but trying to avoid FOR loops on principle is not a good > strategy. > > Yes and no. From a performance point of view, you are certainly right; > vectorizing is definitely not always a speedup. But for me, the main > advantage of vectorized operations is generally clarity: C = A*B is > clearer and simpler than C = [a*b for (a,b) in zip(A,B)]. When it's > not clearer and simpler, I feel no compunction about falling back to > list comprehensions and for loops. I always assume that in these cases performance is a driver of the question. It would be straightforward to code an outer equivalent in Python to hide this for anyone who cares. Since no one who asks these questions ever does, I assume they must be primarily motivated by performance. That said, it would often be nice to have something like > map(f,arange(10)) for arrays; the best I've found is > vectorize(f)(arange(10)). > > vectorize, of course, is a good example of my point above: it really > just loops, in python IIRC, I used to think that too, but then I looked at it and I believe it actually grabs the code object out of the function and loops in C. You still have to run the code object at each point though so it's not that fast. It's been a while since I did that looking so I may be totally wrong. but conceptually it's extremely handy for > doing exactly what the OP wanted. Unfortunately vectorize() does not > yield a sufficiently ufunc-like object to support .outer(), as that > would be extremely tidy. I suppose someone should fix that someday. However, I still think vectorize is an attractive nuisance in the sense that someone has a function that they want to apply to an array and they get sucked into throwing vectorize at the problem. More often than not, vectorize makes things slower than they need to be. If you don't care about performance, that's fine, but I live in fear of code like: def f(a, b): return sin(a*b + a**2) f = vectorize(f) The original function f is a perfectly acceptable vectorized function (assuming one uses numpy.sin), but now it's been replaced by a slower version by passing it through vectorize. To be sure, this isn't always the case; in cases where you have to make choices, things get messier. Still, I'm not convinced that vectorize doesn't hurt more than it helps. -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From robert.kern at gmail.com Tue Aug 21 18:00:27 2007 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 21 Aug 2007 17:00:27 -0500 Subject: [Numpy-discussion] "Extended" Outer Product In-Reply-To: References: Message-ID: <46CB607B.4050400@gmail.com> Timothy Hochberg wrote: > On 8/21/07, *Anne Archibald* > wrote: > but conceptually it's extremely handy for > doing exactly what the OP wanted. Unfortunately vectorize() does not > yield a sufficiently ufunc-like object to support .outer(), as that > would be extremely tidy. > > I suppose someone should fix that someday. Not much to fix. There is already frompyfunc() which does make a real ufunc. However, (and it's a big "however"), those ufuncs only output object arrays. That's why I didn't mention it earlier. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From gael.varoquaux at normalesup.org Wed Aug 22 03:45:46 2007 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Wed, 22 Aug 2007 09:45:46 +0200 Subject: [Numpy-discussion] "Extended" Outer Product In-Reply-To: References: Message-ID: <20070822074546.GA18548@clipper.ens.fr> On Tue, Aug 21, 2007 at 02:14:00PM -0700, Timothy Hochberg wrote: > I suppose someone should fix that someday. However, I still think > vectorize is an attractive nuisance in the sense that someone has a > function that they want to apply to an array and they get sucked into > throwing vectorize at the problem. More often than not, vectorize makes > things slower than they need to be. If you don't care about performance, > that's fine, but I live in fear of code like: > def f(a, b): > return sin(a*b + a**2) > f = vectorize(f) > The original function f is a perfectly acceptable vectorized function > (assuming one uses numpy.sin), but now it's been replaced by a slower > version by passing it through vectorize. To be sure, this isn't always the > case; in cases where you have to make choices, things get messier. Still, > I'm not convinced that vectorize doesn't hurt more than it helps. I often have code where I am going to loop over a large amount of nested loops, some thing like: # A function to return the optical field in each point: def optical_field( (x, y, z) ): loop over an array of laser wave-vector return optical field # Evaluate the optical field on a grid to plot it : x, y z = mgrid[-10:10, -10:10, -10:10] field = optical_field( (x, y, z) ) In such a code every single operation could be vectorized, but the problem is that each function assumes the input array to be of a certain dimension: I may be using some code like: r = c_[x, y, z] cross(r, r_o) So implementing loops with arrays is not that convenient, because I have to add dimensions to my arrays, and to make sure that my inner functions are robust to these extra dimensions. Looking at some of my code where I had this kind of problems, I see functions similar to: def delta(r, v, k): return dot(r, transpose(k)) + Gaussian_beam(r) + dot(v, transpose(k)) I am starting to realize that the real problem is that there is no info of what the expected size for the input and output arguments should be. Given such info, the function could resize its input and output arguments. 
Maybe some clever decorators could be written to address this issue, something like: @inputsize( (3, -1), (3, -1), (3, -1) ) which would reshape every input positional argument to the shape given in the list of shapes, and reshape the output argument to the shape of the first input argument. As I worked around these problems in my code I cannot say whether these decorators would get rid of them (I had not had the idea at the time), I like the idea, and I will try next time I run into these problems. I just wanted to point out that replacing for loops with arrays was not always that simple and that using "vectorize" sometimes was a quick and a dirty way to get things done. Ga?l From faltet at carabos.com Wed Aug 22 05:11:16 2007 From: faltet at carabos.com (Francesc Altet) Date: Wed, 22 Aug 2007 11:11:16 +0200 Subject: [Numpy-discussion] Finding unique rows in an array [Was: Finding a row match within a numpy array] In-Reply-To: <46CAF562.9060009@cc.usu.edu> References: <46C2CD01.5030307@bristol.ac.uk> <1187190071.384881.240470@w3g2000hsg.googlegroups.com> <46CAF562.9060009@cc.usu.edu> Message-ID: <200708221111.17141.faltet@carabos.com> A Tuesday 21 August 2007, Mark.Miller escrigu?: > A slightly related question on this topic... > > Is there a good loopless way to identify all of the unique rows in an > array? Something like numpy.unique() is ideal, but capable of > extracting unique subarrays along an axis. You can always do a view of the rows as strings and then use unique(). Here is an example: In [1]: import numpy In [2]: a=numpy.arange(12).reshape(4,3) In [3]: a[2]=(3,4,5) In [4]: a Out[4]: array([[ 0, 1, 2], [ 3, 4, 5], [ 3, 4, 5], [ 9, 10, 11]]) now, create the view and select the unique rows: In [5]: b=numpy.unique(a.view('S%d'%a.itemsize*a.shape[0])).view('i4') and finally restore the shape: In [6]: b.reshape((len(b)/a.shape[1], a.shape[1])) Out[6]: array([[ 0, 1, 2], [ 3, 4, 5], [ 9, 10, 11]]) If you want to find unique columns instead of rows, do a tranpose first on the initial array. Cheers, -- >0,0< Francesc Altet ? ? http://www.carabos.com/ V V C?rabos Coop. V. ??Enjoy Data "-" From jensj at fysik.dtu.dk Wed Aug 22 05:36:21 2007 From: jensj at fysik.dtu.dk (Jens =?iso-8859-1?Q?J=F8rgen_Mortensen?=) Date: Wed, 22 Aug 2007 11:36:21 +0200 (CEST) Subject: [Numpy-discussion] Non-contiguous array from newaxis indexing In-Reply-To: <46C6CF49.6020202@ieee.org> References: <1187427656.8294.11.camel@b307-242.fysik.dtu.dk> <46C6CF49.6020202@ieee.org> Message-ID: <10306.85.81.43.249.1187775381.squirrel@webmail.fysik.dtu.dk> > Jens J?rgen Mortensen wrote: >> I would like all these arrays to be contiguous: >> >>>>> import numpy as npy >>>>> npy.__version__ >> '1.0.4.dev3967' >>>>> x = npy.arange(4) >>>>> y = x[npy.newaxis, :] >>>>> z = x.reshape((1, 4)) >>>>> for a in [x, y, z]: >> ... print a.shape, a.strides, a.flags.contiguous >> ... >> (4,) (4,) True >> (1, 4) (0, 4) False >> (1, 4) (16, 4) True >> >> But y is not contiguous according to y.flags.contiguous - why not and >> why does y and z not have the same strides? >> >> I f > > We've tried a few times to let them be contiguous, but it breaks code in > various ways because NumPy takes advantage of 0-striding to accomplish > broadcasting. In theory, it might be able to be fixed, but the fact > that simple fixes don't work makes me wonder. OK, then how about giving y the strides (16, 4) like z? Then _IsContiguous() will say thay y is contiguous. Will that break any code? 
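(Whichever way the strides question is resolved, one workaround that already exists, assuming an extra copy is acceptable, is to force a contiguous copy; a minimal sketch:)

import numpy as npy

x = npy.arange(4)
y = x[npy.newaxis, :]            # the zero-strided, non-contiguous view from above
z = npy.ascontiguousarray(y)     # same values, copied into contiguous memory
print z.flags.contiguous         # True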
I can take a look at how to fix the strides for newaxis-indexed arrays if this is way to go. Jens J?rgen > ound this comment just before the _IsContiguous function in >> arrayobject.c: >> >> /* 0-strided arrays are not contiguous (even if dimension == 1) */ >> >> Is this correct? > > Yes. > > -Travis > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From Shawn.Gong at drdc-rddc.gc.ca Wed Aug 22 12:36:09 2007 From: Shawn.Gong at drdc-rddc.gc.ca (Gong, Shawn (Contractor)) Date: Wed, 22 Aug 2007 12:36:09 -0400 Subject: [Numpy-discussion] memory error caused by astype() In-Reply-To: <2E58C246F17003499C141D334794D049027683D8@ottawaex02.Ottawa.drdc-rddc.gc.ca> References: <2E58C246F17003499C141D334794D049027683D8@ottawaex02.Ottawa.drdc-rddc.gc.ca> Message-ID: <2E58C246F17003499C141D334794D049027683FA@ottawaex02.Ottawa.drdc-rddc.gc.ca> Hi list, When I do large array manipulations, I get out-of-memory errors. For instance if the array size is 5000 by 6000, the following codes use nearly 1G of RAM. Then my PC displays a Python error box. The try/except won't even catch it if the error happens in "astype" instead of "array1* array2" try: if ( array1.typecode() in cplx_types ): array1 = abs(array1.astype(Numeric.Complex32)) else: array1 = array1.astype(Numeric.Float32) if ( array2.typecode() in cplx_types ): array2 = abs(array2.astype(Numeric.Complex32)) else: array2 = array2.astype(Numeric.Float32) array1 = Numeric.sqrt(array1) * Numeric.sqrt(array2) return array1 except: gvutils.error("Memory error occurred\nPlease select a smaller array") return None My questions are: 1) Is there a more memory efficient way instead of using astype? 2) If not, then how do I catch error during astype? 3) Is there a way in Python that detects the available RAM and limits the array size before he/she can go ahead with the array multiplications? i.e. detects the available RAM, say 1G Assume the worst case - Complex32 Figure out the array size limit and warn user Thanks, Shaw Gong -------------- next part -------------- An HTML attachment was scrubbed... URL: From chanley at stsci.edu Wed Aug 22 13:25:26 2007 From: chanley at stsci.edu (Christopher Hanley) Date: Wed, 22 Aug 2007 13:25:26 -0400 Subject: [Numpy-discussion] latest svn version fails on Solaris Message-ID: <46CC7186.7010109@stsci.edu> Hi, The latest version of numpy has a unit test failure on big endian machines. 
====================================================================== FAIL: test_record_array (numpy.core.tests.test_multiarray.test_putmask) ---------------------------------------------------------------------- Traceback (most recent call last): File "/data/basil5/site-packages/lib/python/numpy/core/tests/test_multiarray.py", line 450, in test_record_array assert_array_equal(rec['x'],[10,5]) File "/data/basil5/site-packages/lib/python/numpy/testing/utils.py", line 223, in assert_array_equal verbose=verbose, header='Arrays are not equal') File "/data/basil5/site-packages/lib/python/numpy/testing/utils.py", line 215, in assert_array_compare assert cond, msg AssertionError: Arrays are not equal (mismatch 50.0%) x: array([ 4.58492919e-320, 5.00000000e+000]) y: array([10, 5]) ---------------------------------------------------------------------- Ran 670 tests in 47.182s Chris From millman at berkeley.edu Wed Aug 22 17:35:24 2007 From: millman at berkeley.edu (Jarrod Millman) Date: Wed, 22 Aug 2007 14:35:24 -0700 Subject: [Numpy-discussion] Branch and Tag Maintenance Message-ID: Hello, I deleted any old (2+ years since modified) branches and tags. Nothing is actually deleted so if you need to access the old code simply use the relevant revision number with svn checkout, svn switch, or svn list. It is also very easy to restore if you are planning to continue working on some of the code. For example if you need to restore the numarray branch just use: svn copy -r 3988 http://svn.scipy.org/svn/numpy/branches/numarray \ http://svn.scipy.org/svn/numpy/branches/numarray I have attached a text file with the deletions I made and what revision they were committed on. You can also view the changes using the trac site: http://projects.scipy.org/scipy/numpy/timeline Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ -------------- next part -------------- NumPy Branch Maintenance ======================== svn delete http://svn.scipy.org/svn/numpy/branches/build_src -m "Removing old branch" Committed revision 3985. svn delete http://svn.scipy.org/svn/numpy/branches/kiva_window_branch -m "Removing old branch" Committed revision 3986. svn delete http://svn.scipy.org/svn/numpy/branches/newunicode -m "Removing old branch" Committed revision 3987. svn delete http://svn.scipy.org/svn/numpy/branches/numarray -m "Removing old branch" Committed revision 3988. svn delete http://svn.scipy.org/svn/numpy/branches/oldcore -m "Removing old branch" Committed revision 3989. svn delete http://svn.scipy.org/svn/numpy/branches/v0_3_2 -m "Removing old branch" Committed revision 3990. NumPy Tag Maintenance ======================== svn delete http://svn.scipy.org/svn/numpy/tags/beta-0.4.2 -m "Removing old tag" Committed revision 3991. svn delete http://svn.scipy.org/svn/numpy/tags/Daily_Snapshot_01-11-2002 -m "Removing old tag" Committed revision 3992. svn delete http://svn.scipy.org/svn/numpy/tags/kiva_window -m "Removing old tag" Committed revision 3993. svn delete http://svn.scipy.org/svn/numpy/tags/post_numarray_merge -m "Removing old tag" Committed revision 3994. svn delete http://svn.scipy.org/svn/numpy/tags/pre_classify_conversion -m "Removing old tag" Committed revision 3995. svn delete http://svn.scipy.org/svn/numpy/tags/pre_compiler_removal -m "Removing old tag" Committed revision 3996. svn delete http://svn.scipy.org/svn/numpy/tags/pre_numarray -m "Removing old tag" Committed revision 3997. 
svn delete http://svn.scipy.org/svn/numpy/tags/pre_numarray_merge -m "Removing old tag" Committed revision 3998. svn delete http://svn.scipy.org/svn/numpy/tags/pre_org -m "Removing old tag" Committed revision 3999. svn delete http://svn.scipy.org/svn/numpy/tags/release_0_2_0 -m "Removing old tag" Committed revision 4000. svn delete http://svn.scipy.org/svn/numpy/tags/v0_2_0 -m "Removing old tag" Committed revision 4001. svn delete http://svn.scipy.org/svn/numpy/tags/v0_2_2 -m "Removing old tag" Committed revision 4002. svn delete http://svn.scipy.org/svn/numpy/tags/v0_3_0 -m "Removing old tag" Committed revision 4003. From stefan at sun.ac.za Wed Aug 22 18:36:48 2007 From: stefan at sun.ac.za (Stefan van der Walt) Date: Thu, 23 Aug 2007 00:36:48 +0200 Subject: [Numpy-discussion] latest svn version fails on Solaris In-Reply-To: <46CC7186.7010109@stsci.edu> References: <46CC7186.7010109@stsci.edu> Message-ID: <20070822223648.GA8884@mentat.za.net> Hi Chris Do you have a Solaris machine that we can use as a client for the buildbot (this can be a desktop machine)? I didn't see this problem earlier, since all the other platforms built without problems. I also noticed that not all platforms execute the same number of tests, which is worrisome. Cheers St?fan On Wed, Aug 22, 2007 at 01:25:26PM -0400, Christopher Hanley wrote: > Hi, > > The latest version of numpy has a unit test failure on big endian machines. > > ====================================================================== > FAIL: test_record_array (numpy.core.tests.test_multiarray.test_putmask) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "/data/basil5/site-packages/lib/python/numpy/core/tests/test_multiarray.py", line 450, in test_record_array > assert_array_equal(rec['x'],[10,5]) > File "/data/basil5/site-packages/lib/python/numpy/testing/utils.py", line 223, in assert_array_equal > verbose=verbose, header='Arrays are not equal') > File "/data/basil5/site-packages/lib/python/numpy/testing/utils.py", line 215, in assert_array_compare > assert cond, msg > AssertionError: > Arrays are not equal > > (mismatch 50.0%) > x: array([ 4.58492919e-320, 5.00000000e+000]) > y: array([10, 5]) > > ---------------------------------------------------------------------- > Ran 670 tests in 47.182s From Chris.Barker at noaa.gov Thu Aug 23 20:34:20 2007 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Thu, 23 Aug 2007 17:34:20 -0700 Subject: [Numpy-discussion] comparing arrays with NaN in them. Message-ID: <46CE278C.6010804@noaa.gov> Hi all, I was just trying to write a unit test for something where I was expecting to get some NaN's in the array. However, since NaN == NaN returns false, the simple test: assert(alltrue(a == b)) >>> a = N.array((1,2,3,N.nan)) >>> b = N.array((1,2,3,N.nan)) >>> a == b array([ True, True, True, False], dtype=bool) >>> assert(N.alltrue(a == b)) Traceback (most recent call last): File "", line 1, in AssertionError >>> So is there any way to test is two arrays are the same, when there may be a NaN or two mixed in??? With a bit of thought -- this works: >>> N.alltrue(a[~N.isnan(a)] == b[~N.isnan(b)]) True but that feels like a kludge. maybe some sort of "TheseArrays are binary equal" would be useful. -CHB -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From focke at slac.stanford.edu Thu Aug 23 22:51:50 2007 From: focke at slac.stanford.edu (Warren Focke) Date: Thu, 23 Aug 2007 19:51:50 -0700 (PDT) Subject: [Numpy-discussion] comparing arrays with NaN in them. In-Reply-To: <46CE278C.6010804@noaa.gov> References: <46CE278C.6010804@noaa.gov> Message-ID: On Thu, 23 Aug 2007, Christopher Barker wrote: > but that feels like a kludge. maybe some sort of "TheseArrays are binary > equal" would be useful. But there are multiple possible NaNs, so you couldn't rely on the bits comparing. Maybe something with masked arrays? w From matthieu.brucher at gmail.com Fri Aug 24 04:41:18 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Fri, 24 Aug 2007 10:41:18 +0200 Subject: [Numpy-discussion] Error code of NumpyTest() Message-ID: Hi, I wondered if there was a way of returning another error code than 0 when executing the test suite so that a parent process can immediately know if all the tests passed or not. The numpy buildbot seems to have the same behaviour BTW. I don't know if it is possible, but it would be great. Matthieu -------------- next part -------------- An HTML attachment was scrubbed... URL: From hoeffken at ipk-gatersleben.de Fri Aug 24 04:46:05 2007 From: hoeffken at ipk-gatersleben.de (=?UTF-8?B?TWF0dGhpYXMgSMO2ZmZrZW4=?=) Date: Fri, 24 Aug 2007 10:46:05 +0200 Subject: [Numpy-discussion] Reference counter of builtin descriptor objects Message-ID: <46CE9ACD.3080607@ipk-gatersleben.de> Greetings, I struggling with the numpy C-API (version 1.0.3). Now I have obscurities concerning the reference counter of builtin descriptor objects. In some situation, when running my own code, the reference counter fall to zero an I get warning messages. In some other samples the reference counter increases more and more while the program is running and the average number of used object is keeping constant. Now I would like to know when I have to take care about the reference counter of builtin descriptor objects. Especially when using "PyArray_SimpleNew", "PyArray_SimpleNewFromData", "PyArray_NewFromDescr" and "PyArray_SimpleNewFromDescr". Up to now I never touched the counters in my code after using this functions resulting in the described problems. Another case concerns parsing the arguments of functions. I often use such kind of expressions: PyArg_ParseTupleAndKeywords(args, kwds, "O!O!", kwlist, &PyArray_Type, &array1, &PyArray_Type, &array1)) Normally I would expect, that no reference counter is changed. Is that really true? Many thanks in advance! Matthias -------------- next part -------------- A non-text attachment was scrubbed... Name: hoeffken.vcf Type: text/x-vcard Size: 315 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 252 bytes Desc: OpenPGP digital signature URL: From haase at msg.ucsf.edu Fri Aug 24 05:22:53 2007 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Fri, 24 Aug 2007 11:22:53 +0200 Subject: [Numpy-discussion] pyOpenGL with numpy support Message-ID: Hi, The latest release notes of pyOpenGL (Feb 15, 2007) say that "Numarray support [was] reenabled". The current version is 3.0.0a6. Does anyone here know the status of the new (ctypes based) pyOpenGL ? How is the binding to ("modern") numpy ? 
I'm especially interested in fast memory access. So far I have to SWIG my own call to glVertexPointer to reduce the execution from about 160ms to a few ms. I think without the numpy support arrays are accessed through a very slow list protocol. (I'm just guessing in the dark.) I use pyOpenGL with great pleasure to display medical/microscopy images on mutli-color, color-maps using 2d-textures. It works very fast. Thanks, Sebastian From markbak at gmail.com Fri Aug 24 11:06:35 2007 From: markbak at gmail.com (mark) Date: Fri, 24 Aug 2007 15:06:35 -0000 Subject: [Numpy-discussion] comparing arrays with NaN in them. In-Reply-To: References: <46CE278C.6010804@noaa.gov> Message-ID: <1187967995.784494.4110@x35g2000prf.googlegroups.com> There may be multiple nan-s, but what Chris did is simply create one with the same nan's >>> a = N.array((1,2,3,N.nan)) >>> b = N.array((1,2,3,N.nan)) I think these should be the same. Can anybody give me a good reason why they shouldn't, because it could confuse a lot of people? Thanks, Mark ps. I have to admit though, that matlab does the same thing. nan==nan is false. On Aug 24, 4:51 am, Warren Focke wrote: > On Thu, 23 Aug 2007, Christopher Barker wrote: > > but that feels like a kludge. maybe some sort of "TheseArrays are binary > > equal" would be useful. > > But there are multiple possible NaNs, so you couldn't rely on the bits > comparing. > > Maybe something with masked arrays? > > w > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discuss... at scipy.orghttp://projects.scipy.org/mailman/listinfo/numpy-discussion From matthieu.brucher at gmail.com Fri Aug 24 11:25:43 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Fri, 24 Aug 2007 17:25:43 +0200 Subject: [Numpy-discussion] comparing arrays with NaN in them. In-Reply-To: <1187967995.784494.4110@x35g2000prf.googlegroups.com> References: <46CE278C.6010804@noaa.gov> <1187967995.784494.4110@x35g2000prf.googlegroups.com> Message-ID: 2007/8/24, mark : > > There may be multiple nan-s, but what Chris did is simply create one > with the same nan's > > >>> a = N.array((1,2,3,N.nan)) > >>> b = N.array((1,2,3,N.nan)) > > I think these should be the same. > Can anybody give me a good reason why they shouldn't, because it could > confuse a lot of people? > > Thanks, Mark > It's the IEEE norm for flotting point numbers. You can have sevaral different NaN, although in this case, they are the same kind. Even if they are the same kind, the norm tells that NaN != NaN. Matthieu -------------- next part -------------- An HTML attachment was scrubbed... URL: From Glen.Mabey at swri.org Fri Aug 24 11:46:39 2007 From: Glen.Mabey at swri.org (Glen W. Mabey) Date: Fri, 24 Aug 2007 10:46:39 -0500 Subject: [Numpy-discussion] comparing arrays with NaN in them. In-Reply-To: References: <46CE278C.6010804@noaa.gov> <1187967995.784494.4110@x35g2000prf.googlegroups.com> Message-ID: <20070824154639.GA21230@bams.ccf.swri.edu> On Fri, Aug 24, 2007 at 05:25:43PM +0200, Matthieu Brucher wrote: > It's the IEEE norm for flotting point numbers. You can have sevaral > different NaN, although in this case, they are the same kind. > Even if they are the same kind, the norm tells that NaN != NaN. Someone mentioned using masked arrays. There is one "standard" mask that comes with numpy.ma (dunno about maskedarray -- is that still in the scipy sandbox?). Anyway, there could be another standard mask for NaN, which would serve to simplify the answer to those who encounter this in the future ... 
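(Something close to that can already be spelled by hand today; a minimal sketch that builds the mask from isnan using the numpy.ma that ships with numpy:)

import numpy as N

a = N.array((1.0, 2.0, 3.0, N.nan))
ma_a = N.ma.array(a, mask=N.isnan(a))   # the NaN slot becomes a masked element
print ma_a.filled(0.0)                  # masked slot replaced by 0.0 here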
Glen From cournape at gmail.com Fri Aug 24 12:04:38 2007 From: cournape at gmail.com (David Cournapeau) Date: Sat, 25 Aug 2007 01:04:38 +0900 Subject: [Numpy-discussion] comparing arrays with NaN in them. In-Reply-To: References: <46CE278C.6010804@noaa.gov> <1187967995.784494.4110@x35g2000prf.googlegroups.com> Message-ID: <5b8d13220708240904x3ea17c08u5f5fbb2130b39bd8@mail.gmail.com> On 8/25/07, Matthieu Brucher wrote: > > > 2007/8/24, mark : > > There may be multiple nan-s, but what Chris did is simply create one > > with the same nan's > > > > >>> a = N.array((1,2,3,N.nan)) > > >>> b = N.array((1,2,3,N.nan)) > > > > I think these should be the same. > > Can anybody give me a good reason why they shouldn't, because it could > > confuse a lot of people? > > > > Thanks, Mark > > > > It's the IEEE norm for flotting point numbers. You can have sevaral > different NaN, although in this case, they are the same kind. > Even if they are the same kind, the norm tells that NaN != NaN. > AFAIK, this is the definition of Nan, eg on a system which FPU is IEEE compatible, a number is x is NAN iff x != x. A Nan is defined at the binary level as having the exponent to 1 everywhere, and any non zero value in the mantissa: http://en.wikipedia.org/wiki/NaN Personaly, I would simply compare the non Nan numbers if Nan is a possible outcome of the operation. Checking at the binary level may make sense, but it really depends on the cases. David From Chris.Barker at noaa.gov Fri Aug 24 12:08:05 2007 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Fri, 24 Aug 2007 09:08:05 -0700 Subject: [Numpy-discussion] comparing arrays with NaN in them. In-Reply-To: References: <46CE278C.6010804@noaa.gov> <1187967995.784494.4110@x35g2000prf.googlegroups.com> Message-ID: <46CF0265.7040005@noaa.gov> Matthieu Brucher wrote: > 2007/8/24, mark >: > There may be multiple nan-s, but what Chris did is simply create one > with the same nan's > > >>> a = N.array((1,2,3,N.nan)) > >>> b = N.array((1,2,3,N.nan)) > > I think these should be the same. I'm the OP, but It depends what you mean by "the same". Yes, these two arrays are the same, and that's what I want to test for in this case. However, in the mathematical sense, I do understand what NaN == NaN should be false -- if you're doing math, those NaN's could have been arrived at by very different calculations, so you really wouldn't want them to compare equal, so the IEEE standard that NaN does not compare equal to anything makes sense to me. However, what I'm doing is testing to make sure I got the result I expected, so I want to know if two arrays are the same, including NaN's in the same places. If I wasn't working with an array package, I guess I'd be testing for NaN specifically where I expect it, so the solution I came up with before makes the most sense: N.alltrue(a[~N.isnan(a)] == b[~N.isnan(b)]) However, it's not likely, but that could give a true result if the NaN's were in different places, but there were the same number and everything happened to work out right. So maybe there is a need for a: nanequal, to go with: nanargmax nanargmin nanmax nanmin nansum > You can have several different NaN, You can? I thought NaN was defined by IEEE 754 as a particular bit pattern (one for each precision, anyway). Warren Focke wrote: > Maybe something with masked arrays? In this case, I'm using NaN to mean: "no valid data", so masked arrays are probably a better solution anyway. However, I like the simplicity of storing a non-value in the same binary array. 
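(A minimal sketch of such a NaN-aware comparison, written so that NaNs in different places make it fail, which closes the small hole in the kludge above; the name nan_equal is just illustrative, it is not an existing numpy function:)

import numpy as N

def nan_equal(a, b):
    # True when a and b have the same shape, equal non-NaN values,
    # and NaNs in exactly the same positions.
    a = N.asarray(a)
    b = N.asarray(b)
    if a.shape != b.shape:
        return False
    a_nan = N.isnan(a)
    b_nan = N.isnan(b)
    if not N.alltrue(a_nan == b_nan):
        return False
    return bool(N.alltrue(a[~a_nan] == b[~b_nan]))

print nan_equal(N.array((1, 2, 3, N.nan)), N.array((1, 2, 3, N.nan)))   # True
print nan_equal(N.array((1, N.nan, 3)), N.array((N.nan, 1, 3)))         # False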
However, if I do go with masked arrays: What's the status of the two masked array implementations? Which should I use? Unless there are huge feature differences (which I don't think there are), then I want to use the one that's going to get maintained into the future -- do we know yet which that will be? -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From tim.hochberg at ieee.org Fri Aug 24 12:15:09 2007 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Fri, 24 Aug 2007 09:15:09 -0700 Subject: [Numpy-discussion] comparing arrays with NaN in them. In-Reply-To: <46CF0265.7040005@noaa.gov> References: <46CE278C.6010804@noaa.gov> <1187967995.784494.4110@x35g2000prf.googlegroups.com> <46CF0265.7040005@noaa.gov> Message-ID: On 8/24/07, Christopher Barker wrote: [SNIP] > You can have several different NaN, > > You can? I thought NaN was defined by IEEE 754 as a particular bit > pattern (one for each precision, anyway). There's more than one way to spell NaN in binary and they tend to mean different things IIRC. Signalling NaNs and quiet NaNs and all of that. (Can you tell how superficial my knowledge is here, good). However, if you are inserting the NaNs yourself as placeholders, then they should all be the same kind and a binary comparison should be fine. [SNIP] -Chris > > > > > -- > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From David.L.Goldsmith at noaa.gov Fri Aug 24 12:23:23 2007 From: David.L.Goldsmith at noaa.gov (David Goldsmith) Date: Fri, 24 Aug 2007 09:23:23 -0700 Subject: [Numpy-discussion] comparing arrays with NaN in them. In-Reply-To: <1187967995.784494.4110@x35g2000prf.googlegroups.com> References: <46CE278C.6010804@noaa.gov> <1187967995.784494.4110@x35g2000prf.googlegroups.com> Message-ID: <46CF05FB.4040700@noaa.gov> What is meant by "multiple nan-s"? DG mark wrote: > There may be multiple nan-s, but what Chris did is simply create one > with the same nan's > > >>>> a = N.array((1,2,3,N.nan)) >>>> b = N.array((1,2,3,N.nan)) >>>> > > I think these should be the same. > Can anybody give me a good reason why they shouldn't, because it could > confuse a lot of people? > > Thanks, Mark > > ps. I have to admit though, that matlab does the same thing. nan==nan > is false. > > On Aug 24, 4:51 am, Warren Focke wrote: > >> On Thu, 23 Aug 2007, Christopher Barker wrote: >> >>> but that feels like a kludge. maybe some sort of "TheseArrays are binary >>> equal" would be useful. >>> >> But there are multiple possible NaNs, so you couldn't rely on the bits >> comparing. >> >> Maybe something with masked arrays? >> >> w >> >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discuss... 
at scipy.orghttp://projects.scipy.org/mailman/listinfo/numpy-discussion >> > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- ERD/ORR/NOS/NOAA From David.L.Goldsmith at noaa.gov Fri Aug 24 12:33:04 2007 From: David.L.Goldsmith at noaa.gov (David Goldsmith) Date: Fri, 24 Aug 2007 09:33:04 -0700 Subject: [Numpy-discussion] comparing arrays with NaN in them. In-Reply-To: <46CF05FB.4040700@noaa.gov> References: <46CE278C.6010804@noaa.gov> <1187967995.784494.4110@x35g2000prf.googlegroups.com> <46CF05FB.4040700@noaa.gov> Message-ID: <46CF0840.400@noaa.gov> Never mind. (Posted that before finishing the thread, sorry). DG David Goldsmith wrote: > What is meant by "multiple nan-s"? > > DG > > mark wrote: > >> There may be multiple nan-s, but what Chris did is simply create one >> with the same nan's >> >> >> >>>>> a = N.array((1,2,3,N.nan)) >>>>> b = N.array((1,2,3,N.nan)) >>>>> >>>>> >> I think these should be the same. >> Can anybody give me a good reason why they shouldn't, because it could >> confuse a lot of people? >> >> Thanks, Mark >> >> ps. I have to admit though, that matlab does the same thing. nan==nan >> is false. >> >> On Aug 24, 4:51 am, Warren Focke wrote: >> >> >>> On Thu, 23 Aug 2007, Christopher Barker wrote: >>> >>> >>>> but that feels like a kludge. maybe some sort of "TheseArrays are binary >>>> equal" would be useful. >>>> >>>> >>> But there are multiple possible NaNs, so you couldn't rely on the bits >>> comparing. >>> >>> Maybe something with masked arrays? >>> >>> w >>> >>> _______________________________________________ >>> Numpy-discussion mailing list >>> Numpy-discuss... at scipy.orghttp://projects.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://projects.scipy.org/mailman/listinfo/numpy-discussion >> >> > > -- ERD/ORR/NOS/NOAA From tim.hochberg at ieee.org Fri Aug 24 12:40:51 2007 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Fri, 24 Aug 2007 09:40:51 -0700 Subject: [Numpy-discussion] comparing arrays with NaN in them. In-Reply-To: References: <46CE278C.6010804@noaa.gov> <1187967995.784494.4110@x35g2000prf.googlegroups.com> <46CF0265.7040005@noaa.gov> Message-ID: On 8/24/07, Timothy Hochberg wrote: > > > > On 8/24/07, Christopher Barker wrote: > [SNIP] > > > > You can have several different NaN, > > > > You can? I thought NaN was defined by IEEE 754 as a particular bit > > pattern (one for each precision, anyway). > > > There's more than one way to spell NaN in binary and they tend to mean > different things IIRC. Signalling NaNs and quiet NaNs and all of that. (Can > you tell how superficial my knowledge is here, good). > > However, if you are inserting the NaNs yourself as placeholders, then they > should all be the same kind and a binary comparison should be fine. > To beat this horse a little more: IEEE 754 NaNs are represented with the exponential field filled with ones and some non-zero number in the mantissa. A bit-wise example of a IEEE floating-point standardsingle precision NaN: x11111111axxxxxxxxxxxxxxxxxxxxxx. x = undefined. If a = 1, it is a *quiet NaN*, otherwise it is a *signalling NaN*. That's from http://en.wikipedia.org/wiki/NaN#NaN_encodings. So there a bunch of undefined bits that could be set for the private use of whoever is producing the NaNs for their own purposes. 
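(One way to actually look at those bits from numpy, reusing the view-as-another-dtype trick Francesc showed earlier in this digest; the exact integer you get back depends on how the NaN was produced, which is exactly the point being made here:)

import numpy as N

x = N.array([N.nan], dtype=N.float64)
print x.view('u8')      # the raw 64-bit pattern behind this particular NaN
print (-x).view('u8')   # typically differs only in the sign bit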
I don't know how often those bits vary in practice, but in principle it's not safe to rely on NaNs being bitwise equal. -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chris.Barker at noaa.gov Fri Aug 24 12:53:51 2007 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Fri, 24 Aug 2007 09:53:51 -0700 Subject: [Numpy-discussion] comparing arrays with NaN in them. In-Reply-To: References: <46CE278C.6010804@noaa.gov> <1187967995.784494.4110@x35g2000prf.googlegroups.com> <46CF0265.7040005@noaa.gov> Message-ID: <46CF0D1F.8030503@noaa.gov> Timothy Hochberg wrote: > in principle it's not safe to rely on NaNs being bitwise equal. Thanks Tim, I always learn a lot on this list. Anyway, I think my suggestion of "binary equal" wasn't really what I want. What I want is essentially a NaN-safe comparison, much like the NaN-safe functions like nanmax, nanmin, etc. I guess what that would involve is looping through the arrays, checking for "==", then checking if both are NaN if it returns false. (or checking for the NaN's first). I'm not sure NaNifying the other comparisons would make any sense: NaN > NaN (and all the other comparison's would have to return False anyway. So, would a nanequal function be useful? would it be hard to write? -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From pgmdevlist at gmail.com Fri Aug 24 13:03:40 2007 From: pgmdevlist at gmail.com (Pierre GM) Date: Fri, 24 Aug 2007 13:03:40 -0400 Subject: [Numpy-discussion] comparing arrays with NaN in them. In-Reply-To: <46CF0265.7040005@noaa.gov> References: <46CE278C.6010804@noaa.gov> <46CF0265.7040005@noaa.gov> Message-ID: <200708241303.41843.pgmdevlist@gmail.com> All, Using the maskedarray package: >>>import maskedarray as ma >>>x = numpy.array([1,numpy.nan,3]) >>>y = numpy.array([1,numpy.nan,3]) >>>ma.allclose(ma.array(x,mask=numpy.isnan(x)),ma.array(y,mask=numpy.isnan(y)) ) True or even simpler: >>> maskedarray.testutils.assert_equal(x,y) #........................................ > What's the status of the two masked array implementations? One is official but no longer really supported (numpy.ma), one is still unofficial but fully functional (maskedarray), and supported (by me at least). My understanding is that maskedarray will stay in the sandbox as long as we don't have enough feedback from users. > Which should > I use? Unless there are huge feature differences (which I don't think > there are), Actually there is at least one big difference: the masked arrays you get from numpy.ma are NOT ndarrays. Therefore, a code like: >>>numpy.asanyarray(numpy.ma.array([1,2,3],mask=[0,1,0])) array([1, 2, 3]) loses your mask. On the other side, the maskedarray package (still in the sandbox) implements masked arrays as a subclass of ndarrays, so: >>>numpy.asanyarray(maskedarray.array([1,2,3],mask=[0,1,0])) masked_array(data = [1 -- 3], mask = [False True False], fill_value=999999) Apart from that, maskedarray implements more functions and methods than are available in numpy.ma. > then I want to use the one that's going to get maintained > into the future -- do we know yet which that will be? I've already committed myself to the support of maskedarray for the time being. 
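The nanequal helper Chris describes is only a few lines in plain numpy. A possible sketch (nan_equal is just an illustrative name, not an existing numpy function):

import numpy as np

def nan_equal(a, b):
    # Elementwise equality that treats positions where *both* entries are NaN as equal.
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return np.where(np.isnan(a) & np.isnan(b), True, a == b)

x = np.array([1.0, 2.0, np.nan])
y = np.array([1.0, 2.0, np.nan])
print(nan_equal(x, y).all())    # True, unlike (x == y).all()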
Eric Firing and I have been in contact over the last few weeks about how to optimize maskedarray, for example by porting part of the code to C. There are still a couple of conceptual issues we need to address first, as presented in another thread From robert.kern at gmail.com Fri Aug 24 15:46:16 2007 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 24 Aug 2007 14:46:16 -0500 Subject: [Numpy-discussion] pyOpenGL with numpy support In-Reply-To: References: Message-ID: <46CF3588.8020206@gmail.com> Sebastian Haase wrote: > Hi, > The latest release notes of pyOpenGL (Feb 15, 2007) say that "Numarray > support [was] reenabled". > The current version is 3.0.0a6. > > Does anyone here know the status of the new (ctypes based) pyOpenGL ? > How is the binding to ("modern") numpy ? numpy is the primary array type in PyOpenGL 3.0. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From seandavi at gmail.com Fri Aug 24 17:03:20 2007 From: seandavi at gmail.com (Sean Davis) Date: Fri, 24 Aug 2007 17:03:20 -0400 Subject: [Numpy-discussion] Dict of lists to numpy recarray Message-ID: <264855a00708241403w3783703dg37f832bee66c2507@mail.gmail.com> I have a simple question (I assume), but I can't quite get a handle on the answer. I have a dict with each member a list having a long (>5M elements). I would like to convert that into a numpy recarray. So far, my only thought is to loop over the length of the lists and convert to a list of tuples--this is SLOW. What I really need to be able to do is to supply columns of data to create a recarray, but I haven't found an example of how to do that. Thanks, Sean -------------- next part -------------- An HTML attachment was scrubbed... URL: From seandavi at gmail.com Fri Aug 24 17:45:18 2007 From: seandavi at gmail.com (Sean Davis) Date: Fri, 24 Aug 2007 17:45:18 -0400 Subject: [Numpy-discussion] Dict of lists to numpy recarray In-Reply-To: <264855a00708241403w3783703dg37f832bee66c2507@mail.gmail.com> References: <264855a00708241403w3783703dg37f832bee66c2507@mail.gmail.com> Message-ID: <264855a00708241445w6b10e0faga2761f63927e0ac1@mail.gmail.com> On 8/24/07, Sean Davis wrote: > > I have a simple question (I assume), but I can't quite get a handle on the > answer. I have a dict with each member a list having a long (>5M > elements). I would like to convert that into a numpy recarray. So far, my > only thought is to loop over the length of the lists and convert to a list > of tuples--this is SLOW. What I really need to be able to do is to supply > columns of data to create a recarray, but I haven't found an example of how > to do that. Sorry for the noise. Found it. newrecarray = numpy.rec.fromarrays ([x1,x2,x3],names='x1,x2,x3',formats='f8,i8,i8') Sean -------------- next part -------------- An HTML attachment was scrubbed... URL: From aisaac at american.edu Fri Aug 24 17:50:12 2007 From: aisaac at american.edu (Alan Isaac) Date: Fri, 24 Aug 2007 17:50:12 -0400 Subject: [Numpy-discussion] additional thanks Message-ID: I know thanks have already been offered, but I hope one more on the list will be acceptable. I start classes next week, in Economics. It is easy to discourage some of my students, if the "getting started" part of new software is rough. The new compatible NumPy and SciPy binaries are VERY HELPFUL!!! Thanks! 
Alan Isaac PS Just a warning to others in my position: students using VISTA are reporting install difficulties for Python 2.5.1. It sounds like a fix is to proceed as at but as I do not have access to a VISTA machine I have not been able to test this. From ryanlists at gmail.com Fri Aug 24 17:55:46 2007 From: ryanlists at gmail.com (Ryan Krauss) Date: Fri, 24 Aug 2007 16:55:46 -0500 Subject: [Numpy-discussion] additional thanks In-Reply-To: References: Message-ID: I helped a coulpe of my students install on Vista. It was enough to right click on the exe and choose "Run as Administrator". A pop-up window then comes up asking you if you trust the file or something and you have to chose an option that is something like, "yes, let it proceed". On 8/24/07, Alan Isaac wrote: > I know thanks have already been offered, > but I hope one more on the list will be acceptable. > > I start classes next week, in Economics. > It is easy to discourage some of my students, > if the "getting started" part of new software is rough. > The new compatible NumPy and SciPy binaries are VERY HELPFUL!!! > > Thanks! > Alan Isaac > > PS Just a warning to others in my position: > students using VISTA are reporting install difficulties > for Python 2.5.1. It sounds like a fix is to proceed as at > > but as I do not have access to a VISTA machine I have not been able to test this. > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From tom.denniston at alum.dartmouth.org Fri Aug 24 17:59:53 2007 From: tom.denniston at alum.dartmouth.org (Tom Denniston) Date: Fri, 24 Aug 2007 16:59:53 -0500 Subject: [Numpy-discussion] Dict of lists to numpy recarray In-Reply-To: <264855a00708241445w6b10e0faga2761f63927e0ac1@mail.gmail.com> References: <264855a00708241403w3783703dg37f832bee66c2507@mail.gmail.com> <264855a00708241445w6b10e0faga2761f63927e0ac1@mail.gmail.com> Message-ID: Try itertools.izipping the lists and then use numpy.fromiter. On 8/24/07, Sean Davis wrote: > > > On 8/24/07, Sean Davis wrote: > > I have a simple question (I assume), but I can't quite get a handle on the > answer. I have a dict with each member a list having a long (>5M elements). > I would like to convert that into a numpy recarray. So far, my only > thought is to loop over the length of the lists and convert to a list of > tuples--this is SLOW. What I really need to be able to do is to supply > columns of data to create a recarray, but I haven't found an example of how > to do that. > > Sorry for the noise. Found it. > > newrecarray = > numpy.rec.fromarrays([x1,x2,x3],names='x1,x2,x3',formats='f8,i8,i8') > > Sean > > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > From pgmdevlist at gmail.com Fri Aug 24 20:27:41 2007 From: pgmdevlist at gmail.com (Pierre GM) Date: Fri, 24 Aug 2007 20:27:41 -0400 Subject: [Numpy-discussion] Maskedarray implementations Message-ID: <200708242027.41896.pgmdevlist@gmail.com> All, As you might be aware, there are currently two concurrent implementations of masked arrays in numpy: * numpy.ma is the official implementation, but it is unclear whether it is still actively maintained. * maskedarray is the alternative I've been developing initially for my own purpose from numpy.ma. 
It is available in the scipy svn sandbox, but is already fully functional The main difference between numpy.ma and maskedarray is that the objects created by numpy.ma are NOT ndarrays, while maskedarray.MaskedArray is a full subclass of ndarrays. For example: >>>import numpy, maskedarray >>>x = numpy.ma.array([1,2], mask=[0,1]) >>>isinstance(x, numpy.ndarray) False >>>numpy.asanyarray(x) array([1,2]) Note that we just lost the mask... >>>x = maskedarray.array([1,2], mask=[0,1]) >>>isinstance(x, numpy.ndarray) True >>>numpy.asanyarray(x) masked_array(data = [1 --], ? ? ? mask = [False ?True], ? ? ? fill_value=999999) Note that the mask is conserved. Having the masked array be a subclass of ndarray makes masked arrays easier to mix with other ndarray types and to subclass. An example of application is the TimeSeries package, where the main TimeSeries class is a subclass of maskedarray.MaskedArray. * Does anyone see any *disadvantages* to this aspect of maskedarray relative to numpy.ma? * What would be the requisites to move maskedarray out of the sandbox ? We hope to be able in the short term to either replace or at least merge the two implementations, once a couple of issues are addressed (but we can talk about that later...) Thanks a lot in advance for your feedback Pierre From aisaac at american.edu Fri Aug 24 20:34:48 2007 From: aisaac at american.edu (Alan G Isaac) Date: Fri, 24 Aug 2007 20:34:48 -0400 Subject: [Numpy-discussion] additional thanks In-Reply-To: References: Message-ID: On Fri, 24 Aug 2007, Ryan Krauss apparently wrote: > I helped a couple of my students install on Vista. It was > enough to right click on the exe and choose "Run as > Administrator". A pop-up window then comes up asking you > if you trust the file or something and you have to chose > an option that is something like, "yes, let it proceed". OK. I was not present for the installs. (Our classes start next week.) I did of course check that they were installing as Administrator, and they claimed "yes". I'll know more next week. Thanks, Alan Isaac From ryanlists at gmail.com Fri Aug 24 20:44:15 2007 From: ryanlists at gmail.com (Ryan Krauss) Date: Fri, 24 Aug 2007 19:44:15 -0500 Subject: [Numpy-discussion] additional thanks In-Reply-To: References: Message-ID: I think in Vista this is different from being logged into an administrator account (which I think is different from XP). I don't actually have a Vista machine to test on, but did help my students do it on theirs. No idea how their user accounts were set up. So, I think you have to right click and choose "Run as Administrator" regardless of what account you are logged into (I think, based on my limited Vista experience). Ryan On 8/24/07, Alan G Isaac wrote: > On Fri, 24 Aug 2007, Ryan Krauss apparently wrote: > > I helped a couple of my students install on Vista. It was > > enough to right click on the exe and choose "Run as > > Administrator". A pop-up window then comes up asking you > > if you trust the file or something and you have to chose > > an option that is something like, "yes, let it proceed". > > OK. I was not present for the installs. (Our classes start > next week.) I did of course check that they were installing > as Administrator, and they claimed "yes". I'll know more > next week. 
> > Thanks, > Alan Isaac > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From David.L.Goldsmith at noaa.gov Fri Aug 24 20:49:17 2007 From: David.L.Goldsmith at noaa.gov (David Goldsmith) Date: Fri, 24 Aug 2007 17:49:17 -0700 Subject: [Numpy-discussion] Maskedarray implementations In-Reply-To: <200708242027.41896.pgmdevlist@gmail.com> References: <200708242027.41896.pgmdevlist@gmail.com> Message-ID: <46CF7C8D.90005@noaa.gov> Pierre GM wrote: > * Does anyone see any *disadvantages* to this aspect of maskedarray relative > to numpy.ma? > What *is* numpy.ma derived from? DG -- ERD/ORR/NOS/NOAA From pgmdevlist at gmail.com Fri Aug 24 21:08:48 2007 From: pgmdevlist at gmail.com (Pierre GM) Date: Fri, 24 Aug 2007 21:08:48 -0400 Subject: [Numpy-discussion] Maskedarray implementations In-Reply-To: <46CF7C8D.90005@noaa.gov> References: <200708242027.41896.pgmdevlist@gmail.com> <46CF7C8D.90005@noaa.gov> Message-ID: <200708242108.51833.pgmdevlist@gmail.com> On Friday 24 August 2007 20:49:17 David Goldsmith wrote: > Pierre GM wrote: > > * Does anyone see any *disadvantages* to this aspect of maskedarray > > relative to numpy.ma? > > What *is* numpy.ma derived from? If you're talking about numpy.ma arrays: A numpy.ma.MaskedArray is an independent object consisting of two ndarrays (one for the data, one for the mask). A maskedarray.MaskedArray is a ndarray with another ndarray as attribute (the mask). Therefore, it inherits the methods of a ndarray. >>>import numpy, maskedarray >>>x = numpy.ma.array([1,2,3],mask=[1,0,0]) >>>type(x._data),type(x._mask) (, ) >>>x.view(numpy.ndarray) NotImplementedError: not yet implemented for numpy.ma arrays >>>x = maskedarray.array([1,2,3],mask=[1,0,0]) (, ) >>>x.view(numpy.ndarray) array([1, 2, 3]) If you're talking about the package itself: numpy.ma derives from the corresponding Numeric module, written by Paul Dubois. The maskedarray implementation relies quite heavily on Paul's work, I can't thank him enough. From jks at iki.fi Fri Aug 24 23:08:33 2007 From: jks at iki.fi (=?iso-8859-1?Q?Jouni_K=2E_Sepp=E4nen?=) Date: Sat, 25 Aug 2007 06:08:33 +0300 Subject: [Numpy-discussion] Finding unique rows in an array References: <46C2CD01.5030307@bristol.ac.uk> <1187190071.384881.240470@w3g2000hsg.googlegroups.com> <46CAF562.9060009@cc.usu.edu> <200708221111.17141.faltet@carabos.com> Message-ID: Francesc Altet writes: > A Tuesday 21 August 2007, Mark.Miller escrigu?: >> Is there a good loopless way to identify all of the unique rows in an >> array? Something like numpy.unique() is ideal, but capable of >> extracting unique subarrays along an axis. > > You can always do a view of the rows as strings and then use unique(). For large arrays it probably makes sense to hash the rows by taking a dot product with a random vector. Then sort the hash values and identify blocks of equal values (allowing for rounding errors). Rows with different hash values are guaranteed to be different; for blocks of rows with the same hash value, you'll have to check, but this will probably be much less work than checking every row, and (I hope) BLAS makes the dot-product phase go fast. -- Jouni K. 
Sepp?nen http://www.iki.fi/jks From lxander.m at gmail.com Sat Aug 25 11:43:32 2007 From: lxander.m at gmail.com (Alexander Michael) Date: Sat, 25 Aug 2007 11:43:32 -0400 Subject: [Numpy-discussion] Maskedarray implementations In-Reply-To: <200708242027.41896.pgmdevlist@gmail.com> References: <200708242027.41896.pgmdevlist@gmail.com> Message-ID: <525f23e80708250843x722ecfa2h9a61337c457c0923@mail.gmail.com> Is there any documentation available for your maskedarray? I would like to get a feel for the basics, like how do I take the dot product, do elementwise multiplication, etc, with your implementation. Thanks, Alex From v.tini at tu-bs.de Tue Aug 21 08:11:37 2007 From: v.tini at tu-bs.de (Vivian Tini) Date: Tue, 21 Aug 2007 14:11:37 +0200 Subject: [Numpy-discussion] Installation problem NumPy-1.0.3 on Linux x86_64 Python 2.4.2 Message-ID: <1187698297.46cad67939e01@webmail.tu-bs.de> Dear All, I am trying to install the package NumPy-1.0.3 on Linux x86_64 with Python version 2.4.2 and after using the standard installation command : python setup.py install I received the following error message: error: could not create '/usr/local/lib64/python2.4/site-packages/numpy': Permission denied What is the cause of this problem? What kind of permission do I need? The python executable is located in /usr/bin however the numpy directory from where I tried to install is located in /home/../numpy-1.0.3. I hope someone could help me to figure out how shall I proceed. Thank you very much in advance. Regards Vivian Tini -------------- next part -------------- An embedded message was scrubbed... From: unknown sender Subject: no subject Date: no date Size: 3163 URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: unnamed URL: From v.tini at tu-bs.de Fri Aug 24 11:46:55 2007 From: v.tini at tu-bs.de (Vivian Tini) Date: Fri, 24 Aug 2007 17:46:55 +0200 Subject: [Numpy-discussion] problem on testing numpy Message-ID: <1187970415.46cefd6f7805f@webmail.tu-bs.de> Dear All, I have just installed NumPy and I am excited to test it. Since I have no access as root then I installed Numpy in my home directory. The following messages appears as I tried some commands: >>> import numpy Running from numpy source directory >>> from numpy import * >>> a = array([1,2,3]) Traceback (most recent call last): File "", line 1, in ? NameError: name 'array' is not defined >>> import Numeric >>> from Numeric import * >>> a = array([1,2,3]) # this works fine >>> b = array([4,5,6]) >>> c = inner(a,b) Traceback (most recent call last): File "", line 1, in ? NameError: name 'inner' is not defined How should I proceed to make either the numpy or Numeric works? Is it the problem from the installation? Thanks a lot in advance. Regards, Vivian Tini From ondrej at certik.cz Sun Aug 19 23:30:19 2007 From: ondrej at certik.cz (Ondrej Certik) Date: Sun, 19 Aug 2007 20:30:19 -0700 Subject: [Numpy-discussion] [SciPy-dev] NumPy 1.0.3.x and SciPy 0.5.2.x In-Reply-To: References: Message-ID: <85b5c3130708192030t6947d623oa00a710a229cbd5d@mail.gmail.com> > I just wanted to give you a public, huge thank you for tackling this > most thankless but important problem. Many people at the just > finished SciPy'07 conference mentioned better deployment/installation > support as their main issue with scipy. Our tools are maturing, but > we won't get very far if they don't actually get in the hands of > users. 
I think all of the developers should make sure, that scipy and numpy installs natively in their own favourite distribution. So for example I am using Debian, so I'll try to keep an eye on it and help the maintainer of the deb package. This way it should cover the most distributions. Ondrej P.S. I don't know what the native way of installing packages on Mac OS X is, but I know of the fink project, that basically allows to use debian packages: http://finkproject.org/ From oliphant at enthought.com Fri Aug 24 21:43:23 2007 From: oliphant at enthought.com (Travis Oliphant) Date: Fri, 24 Aug 2007 19:43:23 -0600 Subject: [Numpy-discussion] Maskedarray implementations In-Reply-To: <200708242027.41896.pgmdevlist@gmail.com> References: <200708242027.41896.pgmdevlist@gmail.com> Message-ID: <46CF893B.8080007@enthought.com> Pierre GM wrote: > All, > > > > * Does anyone see any *disadvantages* to this aspect of maskedarray relative > to numpy.ma? > > * What would be the requisites to move maskedarray out of the sandbox ? We > hope to be able in the short term to either replace or at least merge the two > implementations, once a couple of issues are addressed (but we can talk about > that later...) > I like the direction of this work. For me, the biggest issue is whether or not matplotlib (and other code depending on numpy.ma) works with it. I'm pretty sure this can be handled and so, I'd personally like to see it. Best, -Travis From matthieu.brucher at gmail.com Sat Aug 25 12:31:08 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Sat, 25 Aug 2007 18:31:08 +0200 Subject: [Numpy-discussion] Installation problem NumPy-1.0.3 on Linux x86_64 Python 2.4.2 In-Reply-To: <1187698297.46cad67939e01@webmail.tu-bs.de> References: <1187698297.46cad67939e01@webmail.tu-bs.de> Message-ID: 2007/8/21, Vivian Tini : > > Dear All, > > I am trying to install the package NumPy-1.0.3 on Linux x86_64 with Python > version 2.4.2 and after using the standard installation command : > > python setup.py install > > I received the following error message: > error: could not create '/usr/local/lib64/python2.4/site-packages/numpy': > Permission denied > > What is the cause of this problem? What kind of permission do I need? Root permission. If you want to install it in a local folder, try --prefix=/home/something/local and set PYTHONPATH to /home/something/local/lib/python2.4/site-packages Matthieu -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthieu.brucher at gmail.com Sat Aug 25 12:31:45 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Sat, 25 Aug 2007 18:31:45 +0200 Subject: [Numpy-discussion] problem on testing numpy In-Reply-To: <1187970415.46cefd6f7805f@webmail.tu-bs.de> References: <1187970415.46cefd6f7805f@webmail.tu-bs.de> Message-ID: Where did you launch Python from ? Matthieu 2007/8/24, Vivian Tini : > > Dear All, > > I have just installed NumPy and I am excited to test it. > Since I have no access as root then I installed Numpy in my home > directory. > The following messages appears as I tried some commands: > > >>> import numpy > Running from numpy source directory > > >>> from numpy import * > >>> a = array([1,2,3]) > Traceback (most recent call last): > File "", line 1, in ? > NameError: name 'array' is not defined > > >>> import Numeric > >>> from Numeric import * > >>> a = array([1,2,3]) # this works fine > >>> b = array([4,5,6]) > >>> c = inner(a,b) > Traceback (most recent call last): > File "", line 1, in ? 
> NameError: name 'inner' is not defined > > How should I proceed to make either the numpy or Numeric works? Is it the > problem from the installation? > > Thanks a lot in advance. > > Regards, > > Vivian Tini > > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From efiring at hawaii.edu Sat Aug 25 12:50:38 2007 From: efiring at hawaii.edu (Eric Firing) Date: Sat, 25 Aug 2007 06:50:38 -1000 Subject: [Numpy-discussion] Maskedarray implementations In-Reply-To: <525f23e80708250843x722ecfa2h9a61337c457c0923@mail.gmail.com> References: <200708242027.41896.pgmdevlist@gmail.com> <525f23e80708250843x722ecfa2h9a61337c457c0923@mail.gmail.com> Message-ID: <46D05DDE.5060006@hawaii.edu> Alexander Michael wrote: > Is there any documentation available for your maskedarray? I would > like to get a feel for the basics, like how do I take the dot product, > do elementwise multiplication, etc, with your implementation. > > Thanks, > Alex > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion Alex, Pierre wrote some notes about maskedarray here: http://projects.scipy.org/scipy/numpy/wiki/MaskedArray starting half-way down the page. For normal use, do whatever you would do with numpy.ma; the maskedarray implementation is highly compatible, so the same functions and methods are available with the same signatures. Eric From jdh2358 at gmail.com Sat Aug 25 14:06:50 2007 From: jdh2358 at gmail.com (John Hunter) Date: Sat, 25 Aug 2007 13:06:50 -0500 Subject: [Numpy-discussion] Maskedarray implementations In-Reply-To: <46CF893B.8080007@enthought.com> References: <200708242027.41896.pgmdevlist@gmail.com> <46CF893B.8080007@enthought.com> Message-ID: <88e473830708251106m10322565g7dcdc1c200ecb6f9@mail.gmail.com> On 8/24/07, Travis Oliphant wrote: > I like the direction of this work. For me, the biggest issue is whether > or not matplotlib (and other code depending on numpy.ma) works with it. > I'm pretty sure this can be handled and so, I'd personally like to see it. mpl already supports it (both ma and masked array via a config setting) and we would be very happy to just maskedarray so we don't have to support both. Eric Firing added support for this a couple of months back... Things like having support for masked record arrays are a big incentive to use maskedarray for me.... JDH From pgmdevlist at gmail.com Sat Aug 25 15:06:06 2007 From: pgmdevlist at gmail.com (Pierre GM) Date: Sat, 25 Aug 2007 15:06:06 -0400 Subject: [Numpy-discussion] Maskedarray implementations In-Reply-To: <46D05DDE.5060006@hawaii.edu> References: <200708242027.41896.pgmdevlist@gmail.com> <525f23e80708250843x722ecfa2h9a61337c457c0923@mail.gmail.com> <46D05DDE.5060006@hawaii.edu> Message-ID: <200708251506.08568.pgmdevlist@gmail.com> On Saturday 25 August 2007 12:50:38 Eric Firing wrote: > Alexander Michael wrote: > > Is there any documentation available for your maskedarray? > > Pierre wrote some notes about maskedarray here: > http://projects.scipy.org/scipy/numpy/wiki/MaskedArray > starting half-way down the page. Please note that I should probably edit the page, as it starts to be a bit old. We could also start another wiki page for masked arrays... 
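For a first taste, basic use is close to numpy.ma. A rough sketch (this assumes the sandbox maskedarray package is importable and that its dot function accepts the strict keyword described below):

import maskedarray

x = maskedarray.array([1., 2., 3.], mask=[0, 1, 0])
y = maskedarray.array([4., 5., 6.])

print(x * y)                                 # elementwise; masked entries stay masked
print(maskedarray.dot(x, y, strict=False))   # masked values treated as 0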
In addition, there are a lot of unittest available, which can give you a first taste. The 'dot' function in maskedarray takes an additional parameter, strict. If strict is True, the masked values are propagated: if a masked value appears in a row or column, the whole row or column is considered masked. That's basically what you would have if masked values were nan (on a float array). If strict is False, masked values are considered as 0. > For normal use, do whatever you would do with numpy.ma; the maskedarray > implementation is highly compatible, so the same functions and methods > are available with the same signatures. Please don't hesitate to let me know where the doc is lacking, I'll fix that. As noted by Eric and John, mpl is fully compatible w/ maskedarray. Until recently, one had to edit the matplotlib.numerix.ma manually. Thanks to Eric, rcParams now accept a parameter that sets whether numpy.ma or maskedarray should be used. John, masked records are still experimental. I wrote the basis for the code (the mrecords module), tweaked it here and there according to the feedback I received (not so much so far, I want to thank Matt Knox (with whom we wrote TimeSeries) for starting to use mrecords on a regular basis), I'd be of course more than happy to fix any problem we may/will run into. An interesting feature of masked records is that individual fields can be masked (instead of masking full records). From efiring at hawaii.edu Sat Aug 25 15:48:00 2007 From: efiring at hawaii.edu (Eric Firing) Date: Sat, 25 Aug 2007 09:48:00 -1000 Subject: [Numpy-discussion] Maskedarray implementations In-Reply-To: <200708251506.08568.pgmdevlist@gmail.com> References: <200708242027.41896.pgmdevlist@gmail.com> <525f23e80708250843x722ecfa2h9a61337c457c0923@mail.gmail.com> <46D05DDE.5060006@hawaii.edu> <200708251506.08568.pgmdevlist@gmail.com> Message-ID: <46D08770.7050007@hawaii.edu> Pierre GM wrote: > On Saturday 25 August 2007 12:50:38 Eric Firing wrote: >> Alexander Michael wrote: >>> Is there any documentation available for your maskedarray? >> Pierre wrote some notes about maskedarray here: >> http://projects.scipy.org/scipy/numpy/wiki/MaskedArray >> starting half-way down the page. > > Please note that I should probably edit the page, as it starts to be a bit > old. We could also start another wiki page for masked arrays... I've made a couple of small "emergency" edits, but a separate page would make things much more visible and less confusing. Eric From mattknox_ca at hotmail.com Sat Aug 25 17:13:44 2007 From: mattknox_ca at hotmail.com (Matt Knox) Date: Sat, 25 Aug 2007 21:13:44 +0000 (UTC) Subject: [Numpy-discussion] Maskedarray implementations References: <200708242027.41896.pgmdevlist@gmail.com> <46CF893B.8080007@enthought.com> <88e473830708251106m10322565g7dcdc1c200ecb6f9@mail.gmail.com> Message-ID: I think it's reasonably safe to say at this point that most people are in favor of the new maskedarray implementation becoming the default numpy.ma at some point in the future. So the question is, when/how will the migration process be done? - just swap the whole thing as is and hope for the best? - start a numpy 1.1 branch and put it in there as a replacement for numpy.ma? - put it in numpy as a separate module from numpy.ma initially? (eg. "numpy.ma_new" ?) - some other approach? The first option would be perfectly fine by me since I don't use the standard numpy.ma anyway, but I suspect some people might have a problem with that. So what is the best way to do this? 
- Matt From pgmdevlist at gmail.com Sat Aug 25 20:21:26 2007 From: pgmdevlist at gmail.com (Pierre GM) Date: Sat, 25 Aug 2007 20:21:26 -0400 Subject: [Numpy-discussion] Maskedarray implementations : new developer zone wiki page In-Reply-To: <46D08770.7050007@hawaii.edu> References: <200708242027.41896.pgmdevlist@gmail.com> <200708251506.08568.pgmdevlist@gmail.com> <46D08770.7050007@hawaii.edu> Message-ID: <200708252021.26965.pgmdevlist@gmail.com> On Saturday 25 August 2007 15:48:00 Eric Firing wrote: > I've made a couple of small "emergency" edits, but a separate page would > make things much more visible and less confusing. So here it is: http://projects.scipy.org/scipy/numpy/wiki/MaskedArrayAlternative Please note the section : Optimizing maskedarray. You'll find the quick description of a test case (three implementations of divide) that emerged from on off-list discussion with Eric Firing. The problem can be formulated as "do we need to fill masked arrays before processing or not ?". Eric is in favor of the second solution (prefilling according to the domain mask), while the more it goes, the more I'm leaning towards the third one "bah, let numpy take care of that." I would be very grateful if you could post your comments/ideas/suggestions about the three implementations on that list. This is an issue I'd like to solve ASAP. Thanks a lot in advance Pierre From stefan at sun.ac.za Sun Aug 26 08:04:19 2007 From: stefan at sun.ac.za (Stefan van der Walt) Date: Sun, 26 Aug 2007 14:04:19 +0200 Subject: [Numpy-discussion] problem on testing numpy In-Reply-To: <1187970415.46cefd6f7805f@webmail.tu-bs.de> References: <1187970415.46cefd6f7805f@webmail.tu-bs.de> Message-ID: <20070826120419.GB20731@mentat.za.net> On Fri, Aug 24, 2007 at 05:46:55PM +0200, Vivian Tini wrote: > Dear All, > > I have just installed NumPy and I am excited to test it. > Since I have no access as root then I installed Numpy in my home directory. > The following messages appears as I tried some commands: > > >>> import numpy > Running from numpy source directory ^^^ You shouldn't be running from the source directory. Change to another directory and try again -- it should work. Cheers St?fan From mnandris at btinternet.com Sun Aug 26 08:45:55 2007 From: mnandris at btinternet.com (Michael Nandris) Date: Sun, 26 Aug 2007 13:45:55 +0100 (BST) Subject: [Numpy-discussion] numpy.random.multinomial() cannot handle zero's Message-ID: <442992.24353.qm@web86509.mail.ird.yahoo.com> Hi, Is there an easy way around this problem, that does not involve fixing the API (like using NaN instead of 0.0)? >>> from numpy.random import multinomial >>> multinomial(100,[ 0.2, 0.4, 0.1, 0.3 ]) array([19, 45, 10, 26]) >>> multinomial( 100, [0.2, 0.0, 0.8, 0.0] ) Traceback (most recent call last): File "", line 1, in File "mtrand.pyx", line 1173, in mtrand.RandomState.multinomial TypeError: exceptions must be strings, classes, or instances, not exceptions.ValueError I found a similar problem in scipy.stats.rv_discrete() which was fixed by adding sort to a dictionary handler: """ def reverse_dict(dict): newdict = {} for key in dict.keys(): # DUFF newdict[dict[key]] = key return newdict """ def reverse_dict(dict): newdict = {} for key in dict.keys(): sorted_keys = copy(dict.keys()) sorted_keys.sort() for key in sorted_keys[::-1]: # NEW newdict[dict[key]] = key return newdict Obviously this cannot be done with numpy since it runs in C or something which I don't understand. Can anyone help? Numpy is great and the simulation I want to code requires speed. 
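One simple workaround for cases like the one above is to draw only over the nonzero probabilities and scatter the counts back into place (a rough sketch, not a fix of the underlying sum check):

import numpy as np

p = np.array([0.2, 0.0, 0.8, 0.0])
nz = p > 0
counts = np.zeros(p.shape, dtype=int)
counts[nz] = np.random.multinomial(100, p[nz])
print(counts)     # e.g. [19  0 81  0]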
Thanks for any advice given Michael -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Sun Aug 26 19:36:27 2007 From: stefan at sun.ac.za (Stefan van der Walt) Date: Mon, 27 Aug 2007 01:36:27 +0200 Subject: [Numpy-discussion] numpy.random.multinomial() cannot handle zero's In-Reply-To: <442992.24353.qm@web86509.mail.ird.yahoo.com> References: <442992.24353.qm@web86509.mail.ird.yahoo.com> Message-ID: <20070826233627.GF14395@mentat.za.net> Hi Michael On Sun, Aug 26, 2007 at 01:45:55PM +0100, Michael Nandris wrote: > Is there an easy way around this problem, that does not involve fixing the API > (like using NaN instead of 0.0)? > > >>> from numpy.random import multinomial > >>> multinomial(100,[ 0.2, 0.4, 0.1, 0.3 ]) > array([19, 45, 10, 26]) > >>> multinomial( 100, [0.2, 0.0, 0.8, 0.0] ) > Traceback (most recent call last): > File "", line 1, in > File "mtrand.pyx", line 1173, in mtrand.RandomState.multinomial > TypeError: exceptions must be strings, classes, or instances, not > exceptions.ValueError For some reason, the kahan_sum of [0.2,0.0,0.8,0.0] is ever so slightly larger than 1.0 (in the order of 1e-16), but I'm not sure why, yet (this isn't specific to kahan summation -- normal summation shows the same behaviour). As a quick workaround, you can subtract 1e-16 from all your probabilities. Regards St?fan From martin.wiechert at gmx.de Mon Aug 27 08:57:28 2007 From: martin.wiechert at gmx.de (Martin Wiechert) Date: Mon, 27 Aug 2007 14:57:28 +0200 Subject: [Numpy-discussion] possibly ctypes related segfault Message-ID: <200708271457.28679.martin.wiechert@gmx.de> Hi list, I'm suffering from a strange segfault and would appreciate your help. I'm calling a small C function using ctypes / numpy.ctypeslib. The function works in the sense that it returns correct results. After calling the function however I can reliably evoke a segfault by using readline tab completion. I'm not very experienced, but this smells like a memory management bug to me, which is strange, because I'm not doing any mallocing/freeing at all in the C code. I could not reproduce the bug in a debug build of python (--without-pymalloc) or on another machine. The crashing machine is an eight-way opteron. Maybe I should mention that the C function calls two lapack fortran functions. Can this cause problems? Anyway, I'm at a loss. Please help! I've attached the files for reference. Thanks Martin P.S.: Here is what valgrind finds in the debug build: -bash-3.1$ valgrind ~/local/debug/bin/python ==16266== Memcheck, a memory error detector. ==16266== Copyright (C) 2002-2006, and GNU GPL'd, by Julian Seward et al. ==16266== Using LibVEX rev 1658, a library for dynamic binary translation. ==16266== Copyright (C) 2004-2006, and GNU GPL'd, by OpenWorks LLP. ==16266== Using valgrind-3.2.1, a dynamic binary instrumentation framework. ==16266== Copyright (C) 2000-2006, and GNU GPL'd, by Julian Seward et al. ==16266== For more details, rerun with: -v ==16266== Python 2.5.1 (r251:54863, Aug 24 2007, 16:13:26) [GCC 4.1.1 20070105 (Red Hat 4.1.1-51)] on linux2 Type "help", "copyright", "credits" or "license" for more information. 
>>> execfile ('recttest.py') --16266-- DWARF2 CFI reader: unhandled CFI instruction 0:10 --16266-- DWARF2 CFI reader: unhandled CFI instruction 0:10 78.6006829739 4 78.6006829739 [92353 refs] >>> ==16266== Conditional jump or move depends on uninitialised value(s) ==16266== at 0x41361F: parsetok (parsetok.c:189) ==16266== by 0x4131B6: PyParser_ParseFileFlags (parsetok.c:89) ==16266== by 0x4E01D2: PyParser_ASTFromFile (pythonrun.c:1381) ==16266== by 0x4DE15C: PyRun_InteractiveOneFlags (pythonrun.c:770) ==16266== by 0x4DDF15: PyRun_InteractiveLoopFlags (pythonrun.c:721) ==16266== by 0x4DDD70: PyRun_AnyFileExFlags (pythonrun.c:690) ==16266== by 0x412E05: Py_Main (main.c:523) ==16266== by 0x411D62: main (python.c:23) [92353 refs] [37870 refs] ==16266== ==16266== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 98 from 1) ==16266== malloc/free: in use at exit: 2,791,249 bytes in 17,090 blocks. ==16266== malloc/free: 174,713 allocs, 157,623 frees, 376,495,377 bytes allocated. ==16266== For counts of detected errors, rerun with: -v ==16266== searching for pointers to 17,090 not-freed blocks. ==16266== checked 5,156,624 bytes. ==16266== ==16266== LEAK SUMMARY: ==16266== definitely lost: 0 bytes in 0 blocks. ==16266== possibly lost: 35,400 bytes in 103 blocks. ==16266== still reachable: 2,755,849 bytes in 16,987 blocks. ==16266== suppressed: 0 bytes in 0 blocks. ==16266== Reachable blocks (those to which a pointer was found) are not shown. ==16266== To see them, rerun with: --show-reachable=yes -------------- next part -------------- A non-text attachment was scrubbed... Name: solver.c Type: text/x-csrc Size: 4892 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: cMontecarlo.py Type: application/x-python Size: 2657 bytes Desc: not available URL: From seandavi at gmail.com Mon Aug 27 13:23:34 2007 From: seandavi at gmail.com (Sean Davis) Date: Mon, 27 Aug 2007 13:23:34 -0400 Subject: [Numpy-discussion] Issue with converting from numpy record to list/tuple Message-ID: <264855a00708271023n9c0076bl1957ebead751ca70@mail.gmail.com> I have a numpy recarray that I want to load into a database using insert statements. To do so, I need to convert each record to a tuple. 
Here is what I get (using psycopg2) In [1]: a[1] Out[1]: ('5151_0023_0001', 'FORWARD', 'interval rank', 'target_tm:76.00 ;probe_tm:70.90;freq:27.93;count:01;rules:0000;score:0658', 'chr3:1-199501827', 'AAAGGAATTCCATTCATCTCTGGATATTTTGAAATCATTAGGGCAAACAATAAATAA', 0L, 171449529L, 171449529L, 1L, 23L, 'experimental', 'CHR03P006149104', 6149104L, 5151L, 23L, 1L) In [2]: type(a[1]) Out[2]: In [3]: sqlcommand Out[3]: 'insert into nbl_tmp values (%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s);' In [4]: cur.execute(sqlcommand,tuple(a[1])) --------------------------------------------------------------------------- Traceback (most recent call last) /sherlock/sdavis/Documents/workspace/svn/watson/Sean/PythonCode/ in () : can't adapt In [5]: b=('5151_0023_0001', 'FORWARD', 'interval rank', 'target_tm:76.00 ;probe_tm:70.90;freq:27.93;count:01;rules:0000;score:0658', 'chr3:1-199501827', 'AAAGGAATTCCATTCATCTCTGGATATTTTGAAATCATTAGGGCAAACAATAAATAA', 0L, 171449529L, 171449529L, 1L, 23L, 'experimental', 'CHR03P006149104', 6149104L, 5151L, 23L, 1L) In [6]: cur.execute(sqlcommand,b) In [7]: a[1].dtype Out[7]: dtype([('PROBE_DESIGN_ID', '|S40'), ('CONTAINER', '|S40'), ('DESIGN_NOTE', '|S80'), ('SELECTION_CRITERIA', '|S80'), ('SEQ_ID', '|S40'), ('PROBE_SEQUENCE', '|S100'), ('MISMATCH', ' From Chris.Barker at noaa.gov Mon Aug 27 13:31:38 2007 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 27 Aug 2007 10:31:38 -0700 Subject: [Numpy-discussion] Maskedarray implementations In-Reply-To: <200708242027.41896.pgmdevlist@gmail.com> References: <200708242027.41896.pgmdevlist@gmail.com> Message-ID: <46D30A7A.7080403@noaa.gov> Pierre GM wrote: > * Does anyone see any *disadvantages* to this aspect of maskedarray relative > to numpy.ma? Nope, but I sure do see the advantages! -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Chris.Barker at noaa.gov Mon Aug 27 14:07:00 2007 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 27 Aug 2007 11:07:00 -0700 Subject: [Numpy-discussion] numpy.random.multinomial() cannot handle zero's In-Reply-To: <20070826233627.GF14395@mentat.za.net> References: <442992.24353.qm@web86509.mail.ird.yahoo.com> <20070826233627.GF14395@mentat.za.net> Message-ID: <46D312C4.4080504@noaa.gov> Stefan van der Walt wrote: > For some reason, the kahan_sum of [0.2,0.0,0.8,0.0] is ever so > slightly larger than 1.0 (in the order of 1e-16), but I'm not sure > why, yet (this isn't specific to kahan summation -- normal summation > shows the same behavior). Just to make sure -- is the khan_sum "compensated summation"? Is the kahan_sum closer? -- it should be, though compensated summation is really for adding LOTS of numbers, for 4, it's pointless at best. Anyway, binary floating point has its errors, and compensated summation can help, but it's still not exact for numbers that can't be exactly represented by binary. i.e. if your result is within 15 decimal digits of the exact result, that's as good as it gets. -Chris -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Chris.Barker at noaa.gov Mon Aug 27 14:09:33 2007 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 27 Aug 2007 11:09:33 -0700 Subject: [Numpy-discussion] Maskedarray implementations In-Reply-To: References: <200708242027.41896.pgmdevlist@gmail.com> <46CF893B.8080007@enthought.com> <88e473830708251106m10322565g7dcdc1c200ecb6f9@mail.gmail.com> Message-ID: <46D3135D.6000109@noaa.gov> Matt Knox wrote: > - put it in numpy as a separate module from numpy.ma initially? > (eg. "numpy.ma_new" ?) This is the best bet, or we could call the new one ma, and the old one ma_old. In any case, the old one needs to stick around until the new one has been fully tested for compatibility (and otherwise). -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From l.mastrodomenico at gmail.com Mon Aug 27 14:21:43 2007 From: l.mastrodomenico at gmail.com (Lino Mastrodomenico) Date: Mon, 27 Aug 2007 20:21:43 +0200 Subject: [Numpy-discussion] possibly ctypes related segfault In-Reply-To: <200708271457.28679.martin.wiechert@gmx.de> References: <200708271457.28679.martin.wiechert@gmx.de> Message-ID: Hi Martin, 2007/8/27, Martin Wiechert : > I could not reproduce the bug in a debug build of python (--without-pymalloc) > or on another machine. The crashing machine is an eight-way opteron. Not sure if it's related to your problem, but on 64-bit architectures sizeof(ssize_t) is 8. -- Lino Mastrodomenico E-mail: l.mastrodomenico at gmail.com From seandavi at gmail.com Mon Aug 27 16:07:33 2007 From: seandavi at gmail.com (Sean Davis) Date: Mon, 27 Aug 2007 16:07:33 -0400 Subject: [Numpy-discussion] Issue with converting from numpy record to list/tuple In-Reply-To: <264855a00708271023n9c0076bl1957ebead751ca70@mail.gmail.com> References: <264855a00708271023n9c0076bl1957ebead751ca70@mail.gmail.com> Message-ID: <264855a00708271307r75c4b57fufcb19a43316b446d@mail.gmail.com> On 8/27/07, Sean Davis wrote: > > I have a numpy recarray that I want to load into a database using insert > statements. To do so, I need to convert each record to a tuple. 
Here is > what I get (using psycopg2) > > In [1]: a[1] > Out[1]: ('5151_0023_0001', 'FORWARD', 'interval rank', 'target_tm: 76.00 > ;probe_tm:70.90;freq:27.93;count:01;rules:0000;score:0658', > 'chr3:1-199501827', > 'AAAGGAATTCCATTCATCTCTGGATATTTTGAAATCATTAGGGCAAACAATAAATAA', 0L, 171449529L, > 171449529L, 1L, 23L, 'experimental', 'CHR03P006149104', 6149104L, 5151L, > 23L, 1L) > > In [2]: type(a[1]) > Out[2]: > > In [3]: sqlcommand > Out[3]: 'insert into nbl_tmp values > (%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s);' > > In [4]: cur.execute(sqlcommand,tuple(a[1])) > > --------------------------------------------------------------------------- > Traceback (most recent call > last) > > /sherlock/sdavis/Documents/workspace/svn/watson/Sean/PythonCode/ console> in () > > : can't adapt > > In [5]: b=('5151_0023_0001', 'FORWARD', 'interval rank', 'target_tm: 76.00 > ;probe_tm:70.90;freq:27.93;count:01;rules:0000;score:0658', > 'chr3:1-199501827', > 'AAAGGAATTCCATTCATCTCTGGATATTTTGAAATCATTAGGGCAAACAATAAATAA', 0L, 171449529L, > 171449529L, 1L, 23L, 'experimental', 'CHR03P006149104', 6149104L, 5151L, > 23L, 1L) > > In [6]: cur.execute(sqlcommand,b) > > In [7]: a[1].dtype > Out[7]: dtype([('PROBE_DESIGN_ID', '|S40'), ('CONTAINER', '|S40'), > ('DESIGN_NOTE', '|S80'), ('SELECTION_CRITERIA', '|S80'), ('SEQ_ID', '|S40'), > ('PROBE_SEQUENCE', '|S100'), ('MISMATCH', ' ('FEATURE_ID', ' ('PROBE_CLASS', '|S40'), ('PROBE_ID', '|S40'), ('POSITION', ' ('DESIGN_ID', ' > Why does the casting using tuple() not work while cut-and-paste of the > a[1] record into a new variable works just fine? I answered part of the question myself. In the coercion back to tuple from a record, the datatypes remain numpy datatypes. Is there a way to convert back from numpy datatypes to standard python types (string, int, float, etc.) without needing to check every numpy type and determine the appropriate python type? In other words, is there a single function that I can feed a numpy type to (or a variable that has a numpy type) and have the standard python type (or an appropriately-coerced variable)? Thanks, Sean -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Mon Aug 27 19:22:53 2007 From: stefan at sun.ac.za (Stefan van der Walt) Date: Tue, 28 Aug 2007 01:22:53 +0200 Subject: [Numpy-discussion] possibly ctypes related segfault In-Reply-To: References: <200708271457.28679.martin.wiechert@gmx.de> Message-ID: <20070827232253.GU14395@mentat.za.net> On Mon, Aug 27, 2007 at 08:21:43PM +0200, Lino Mastrodomenico wrote: > Hi Martin, > > 2007/8/27, Martin Wiechert : > > I could not reproduce the bug in a debug build of python (--without-pymalloc) > > or on another machine. The crashing machine is an eight-way opteron. > > Not sure if it's related to your problem, but on 64-bit architectures > sizeof(ssize_t) is 8. You should be able to circumvent this problem by referring to ctypes.c_size_t or ctypes.int instead of specifying the width explicitly. 
Regards St?fan From stefan at sun.ac.za Mon Aug 27 19:29:56 2007 From: stefan at sun.ac.za (Stefan van der Walt) Date: Tue, 28 Aug 2007 01:29:56 +0200 Subject: [Numpy-discussion] numpy.random.multinomial() cannot handle zero's In-Reply-To: <46D312C4.4080504@noaa.gov> References: <442992.24353.qm@web86509.mail.ird.yahoo.com> <20070826233627.GF14395@mentat.za.net> <46D312C4.4080504@noaa.gov> Message-ID: <20070827232956.GV14395@mentat.za.net> Hi Chris On Mon, Aug 27, 2007 at 11:07:00AM -0700, Christopher Barker wrote: > Is the kahan_sum closer? -- it should be, though compensated summation > is really for adding LOTS of numbers, for 4, it's pointless at best. > Anyway, binary floating point has its errors, and compensated summation > can help, but it's still not exact for numbers that can't be exactly > represented by binary. > > i.e. if your result is within 15 decimal digits of the exact result, > that's as good as it gets. I find this behaviour odd for addition. Under python: In [7]: 0.8+0.2 > 1.0 Out[7]: False but using the Pyrex module, it yields true. You can find the code at http://mentat.za.net/html/refer/somesumbug.tar.bz2 and compile it using pyrexc sum.pyx ; python setup.py build_ext -i When you run the test, it illustrates the problem: Sum: 1.00000000000000000000000000000000000000000000000000 Is greater than 1.0? True Cheers St?fan From Chris.Barker at noaa.gov Mon Aug 27 19:46:45 2007 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 27 Aug 2007 16:46:45 -0700 Subject: [Numpy-discussion] numpy.random.multinomial() cannot handle zero's In-Reply-To: <20070827232956.GV14395@mentat.za.net> References: <442992.24353.qm@web86509.mail.ird.yahoo.com> <20070826233627.GF14395@mentat.za.net> <46D312C4.4080504@noaa.gov> <20070827232956.GV14395@mentat.za.net> Message-ID: <46D36265.8010208@noaa.gov> Stefan van der Walt wrote: > I find this behaviour odd for addition. Under python: > > In [7]: 0.8+0.2 > 1.0 > Out[7]: False > > but using the Pyrex module, it yields true. odd. I wonder if one is using extended floating point in the FPU, and the other not? What hardware/OS/compiler are you using? I'm no numerical analyst, I just know enough not to expect floating point calculations to be accurate to the last couple digits. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Chris.Barker at noaa.gov Mon Aug 27 19:54:21 2007 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 27 Aug 2007 16:54:21 -0700 Subject: [Numpy-discussion] numpy.random.multinomial() cannot handle zero's In-Reply-To: <20070827232956.GV14395@mentat.za.net> References: <442992.24353.qm@web86509.mail.ird.yahoo.com> <20070826233627.GF14395@mentat.za.net> <46D312C4.4080504@noaa.gov> <20070827232956.GV14395@mentat.za.net> Message-ID: <46D3642D.6000100@noaa.gov> Stefan van der Walt wrote: > but using the Pyrex module, it yields true. You can find the code at > > http://mentat.za.net/html/refer/somesumbug.tar.bz2 That link appears to be broken. -Chris -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From l.mastrodomenico at gmail.com Mon Aug 27 20:02:53 2007 From: l.mastrodomenico at gmail.com (Lino Mastrodomenico) Date: Tue, 28 Aug 2007 02:02:53 +0200 Subject: [Numpy-discussion] numpy.random.multinomial() cannot handle zero's In-Reply-To: <46D3642D.6000100@noaa.gov> References: <442992.24353.qm@web86509.mail.ird.yahoo.com> <20070826233627.GF14395@mentat.za.net> <46D312C4.4080504@noaa.gov> <20070827232956.GV14395@mentat.za.net> <46D3642D.6000100@noaa.gov> Message-ID: 2007/8/28, Christopher Barker : > Stefan van der Walt wrote: > > but using the Pyrex module, it yields true. You can find the code at > > > > http://mentat.za.net/html/refer/somesumbug.tar.bz2 > > That link appears to be broken. The correct one is probably: http://mentat.za.net/refer/somesumbug.tar.bz2 -- Lino Mastrodomenico E-mail: l.mastrodomenico at gmail.com From l.mastrodomenico at gmail.com Mon Aug 27 20:39:51 2007 From: l.mastrodomenico at gmail.com (Lino Mastrodomenico) Date: Tue, 28 Aug 2007 02:39:51 +0200 Subject: [Numpy-discussion] numpy.random.multinomial() cannot handle zero's In-Reply-To: <20070827232956.GV14395@mentat.za.net> References: <442992.24353.qm@web86509.mail.ird.yahoo.com> <20070826233627.GF14395@mentat.za.net> <46D312C4.4080504@noaa.gov> <20070827232956.GV14395@mentat.za.net> Message-ID: 2007/8/28, Stefan van der Walt : > I find this behaviour odd for addition. Under python: > > In [7]: 0.8+0.2 > 1.0 > Out[7]: False Keep in mind that both 0.2 and 0.8 cannot be represented exactly as floating-point numbers (unless you use decimal floating points, like the "decimal" module), so the starting point isn't what it appears to be. > Sum: 1.00000000000000000000000000000000000000000000000000 > Is greater than 1.0? True I get True on a x86, gcc-3.3.1, numpy-1.0b5, GNU/Linux box, and False on x86_64, gcc-4.1.1, numpy-1.0.3.1, GNU/Linux. YMMV ;-) -- Lino Mastrodomenico E-mail: l.mastrodomenico at gmail.com From stefan at sun.ac.za Tue Aug 28 03:07:59 2007 From: stefan at sun.ac.za (Stefan van der Walt) Date: Tue, 28 Aug 2007 09:07:59 +0200 Subject: [Numpy-discussion] numpy.random.multinomial() cannot handle zero's In-Reply-To: <46D3642D.6000100@noaa.gov> References: <442992.24353.qm@web86509.mail.ird.yahoo.com> <20070826233627.GF14395@mentat.za.net> <46D312C4.4080504@noaa.gov> <20070827232956.GV14395@mentat.za.net> <46D3642D.6000100@noaa.gov> Message-ID: <20070828070759.GW14395@mentat.za.net> On Mon, Aug 27, 2007 at 04:54:21PM -0700, Christopher Barker wrote: > Stefan van der Walt wrote: > > but using the Pyrex module, it yields true. You can find the code at > > > > http://mentat.za.net/html/refer/somesumbug.tar.bz2 > > That link appears to be broken. Sorry, http://mentat.za.net/refer/somesumbug.tar.bz2 Cheers St?fan From martin.wiechert at gmx.de Tue Aug 28 04:58:06 2007 From: martin.wiechert at gmx.de (Martin Wiechert) Date: Tue, 28 Aug 2007 10:58:06 +0200 Subject: [Numpy-discussion] possibly ctypes related segfault In-Reply-To: <20070827232253.GU14395@mentat.za.net> References: <200708271457.28679.martin.wiechert@gmx.de> <20070827232253.GU14395@mentat.za.net> Message-ID: <200708281058.06578.martin.wiechert@gmx.de> Lino and Stefan, thanks for your suggestion. However, I doubt this is the problem because as far as I know numpy.intp is actually ssize_t. 
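That is easy to check on the machine in question; on x86_64 both numbers below should be 8 (a small sketch):

import ctypes
import numpy

print(ctypes.sizeof(ctypes.c_size_t))     # platform size_t width in bytes
print(numpy.dtype(numpy.intp).itemsize)   # width of numpy.intp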
Thanks, Martin On Tuesday 28 August 2007 01:22, Stefan van der Walt wrote: > On Mon, Aug 27, 2007 at 08:21:43PM +0200, Lino Mastrodomenico wrote: > > Hi Martin, > > > > 2007/8/27, Martin Wiechert : > > > I could not reproduce the bug in a debug build of python > > > (--without-pymalloc) or on another machine. The crashing machine is an > > > eight-way opteron. > > > > Not sure if it's related to your problem, but on 64-bit architectures > > sizeof(ssize_t) is 8. > > You should be able to circumvent this problem by referring to > ctypes.c_size_t or ctypes.int instead of specifying the width > explicitly. > > Regards > St?fan > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion From stefan at sun.ac.za Tue Aug 28 07:11:38 2007 From: stefan at sun.ac.za (Stefan van der Walt) Date: Tue, 28 Aug 2007 13:11:38 +0200 Subject: [Numpy-discussion] possibly ctypes related segfault In-Reply-To: <200708271457.28679.martin.wiechert@gmx.de> References: <200708271457.28679.martin.wiechert@gmx.de> Message-ID: <20070828111138.GB14395@mentat.za.net> Hi Martin On Mon, Aug 27, 2007 at 02:57:28PM +0200, Martin Wiechert wrote: > I'm suffering from a strange segfault and would appreciate your help. > > I'm calling a small C function using ctypes / numpy.ctypeslib. The function > works in the sense that it returns correct results. After calling the > function however I can reliably evoke a segfault by using readline tab > completion. > > I'm not very experienced, but this smells like a memory management bug to me, > which is strange, because I'm not doing any mallocing/freeing at all in the C > code. > > I could not reproduce the bug in a debug build of python (--without-pymalloc) > or on another machine. The crashing machine is an eight-way opteron. I had to #include in solver, and modify cMonteCarlo not to depend on GV. Then, I used gcc -o solver.os -c -O2 -ggdb -Wall -ansi -pedantic -fPIC solver.c gcc -o librectify.so -shared solver.os -llapack to compile. Please send me the script that excercises the solver, then I will test on my machines here. > --16266-- DWARF2 CFI reader: unhandled CFI instruction 0:10 > --16266-- DWARF2 CFI reader: unhandled CFI instruction 0:10 This could be a valgrind issue. Cheers St?fan From faltet at carabos.com Tue Aug 28 06:14:22 2007 From: faltet at carabos.com (Francesc Altet) Date: Tue, 28 Aug 2007 12:14:22 +0200 Subject: [Numpy-discussion] Issue with converting from numpy record to list/tuple In-Reply-To: <264855a00708271307r75c4b57fufcb19a43316b446d@mail.gmail.com> References: <264855a00708271023n9c0076bl1957ebead751ca70@mail.gmail.com> <264855a00708271307r75c4b57fufcb19a43316b446d@mail.gmail.com> Message-ID: <200708281214.23333.faltet@carabos.com> A Monday 27 August 2007, Sean Davis escrigu?: > On 8/27/07, Sean Davis wrote: > > I have a numpy recarray that I want to load into a database using > > insert statements. To do so, I need to convert each record to a > > tuple. 
Here is what I get (using psycopg2) > > > > In [1]: a[1] > > Out[1]: ('5151_0023_0001', 'FORWARD', 'interval rank', 'target_tm: > > 76.00 ;probe_tm:70.90;freq:27.93;count:01;rules:0000;score:0658', > > 'chr3:1-199501827', > > 'AAAGGAATTCCATTCATCTCTGGATATTTTGAAATCATTAGGGCAAACAATAAATAA', 0L, > > 171449529L, 171449529L, 1L, 23L, 'experimental', 'CHR03P006149104', > > 6149104L, 5151L, 23L, 1L) > > > > In [2]: type(a[1]) > > Out[2]: > > > > In [3]: sqlcommand > > Out[3]: 'insert into nbl_tmp values > > (%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s);' > > > > In [4]: cur.execute(sqlcommand,tuple(a[1])) > > > > ------------------------------------------------------------------- > >-------- Traceback (most > > recent call last) > > > > /sherlock/sdavis/Documents/workspace/svn/watson/Sean/PythonCode/ >ython console> in () > > > > : can't adapt > > > > In [5]: b=('5151_0023_0001', 'FORWARD', 'interval rank', > > 'target_tm: 76.00 > > ;probe_tm:70.90;freq:27.93;count:01;rules:0000;score:0658', > > 'chr3:1-199501827', > > 'AAAGGAATTCCATTCATCTCTGGATATTTTGAAATCATTAGGGCAAACAATAAATAA', 0L, > > 171449529L, 171449529L, 1L, 23L, 'experimental', 'CHR03P006149104', > > 6149104L, 5151L, 23L, 1L) > > > > In [6]: cur.execute(sqlcommand,b) > > > > In [7]: a[1].dtype > > Out[7]: dtype([('PROBE_DESIGN_ID', '|S40'), ('CONTAINER', '|S40'), > > ('DESIGN_NOTE', '|S80'), ('SELECTION_CRITERIA', '|S80'), ('SEQ_ID', > > '|S40'), ('PROBE_SEQUENCE', '|S100'), ('MISMATCH', ' > ('MATCH_INDEX', ' > ('COL_NUM', ' > ('POSITION', ' > ' > > > Why does the casting using tuple() not work while cut-and-paste of > > the a[1] record into a new variable works just fine? > > I answered part of the question myself. In the coercion back to > tuple from a record, the datatypes remain numpy datatypes. Is there > a way to convert back from numpy datatypes to standard python types > (string, int, float, etc.) without needing to check every numpy type > and determine the appropriate python type? In other words, is there > a single function that I can feed a numpy type to (or a variable that > has a numpy type) and have the standard python type (or an > appropriately-coerced variable)? Use .tolist() method. Here is an example: In [92]: r=numpy.empty(5, 'f8,i4,f8') In [93]: type(tuple(r[0])[0]) Out[93]: In [94]: type(r[0].tolist()[0]) Out[94]: HTH, -- >0,0< Francesc Altet ? ? http://www.carabos.com/ V V C?rabos Coop. V. ??Enjoy Data "-" From seandavi at gmail.com Tue Aug 28 07:38:29 2007 From: seandavi at gmail.com (Sean Davis) Date: Tue, 28 Aug 2007 07:38:29 -0400 Subject: [Numpy-discussion] Issue with converting from numpy record to list/tuple In-Reply-To: <200708281214.23333.faltet@carabos.com> References: <264855a00708271023n9c0076bl1957ebead751ca70@mail.gmail.com> <264855a00708271307r75c4b57fufcb19a43316b446d@mail.gmail.com> <200708281214.23333.faltet@carabos.com> Message-ID: <264855a00708280438q3555dc4aq30e43de028554d8@mail.gmail.com> On 8/28/07, Francesc Altet wrote: > > A Monday 27 August 2007, Sean Davis escrigu?: > > On 8/27/07, Sean Davis wrote: > > > I have a numpy recarray that I want to load into a database using > > > insert statements. To do so, I need to convert each record to a > > > tuple. 
Here is what I get (using psycopg2) > > > > > > In [1]: a[1] > > > Out[1]: ('5151_0023_0001', 'FORWARD', 'interval rank', 'target_tm: > > > 76.00 ;probe_tm:70.90;freq:27.93;count:01;rules:0000;score:0658', > > > 'chr3:1-199501827', > > > 'AAAGGAATTCCATTCATCTCTGGATATTTTGAAATCATTAGGGCAAACAATAAATAA', 0L, > > > 171449529L, 171449529L, 1L, 23L, 'experimental', 'CHR03P006149104', > > > 6149104L, 5151L, 23L, 1L) > > > > > > In [2]: type(a[1]) > > > Out[2]: > > > > > > In [3]: sqlcommand > > > Out[3]: 'insert into nbl_tmp values > > > (%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s);' > > > > > > In [4]: cur.execute(sqlcommand,tuple(a[1])) > > > > > > ------------------------------------------------------------------- > > >-------- Traceback (most > > > recent call last) > > > > > > /sherlock/sdavis/Documents/workspace/svn/watson/Sean/PythonCode/ > >ython console> in () > > > > > > : can't adapt > > > > > > In [5]: b=('5151_0023_0001', 'FORWARD', 'interval rank', > > > 'target_tm: 76.00 > > > ;probe_tm:70.90;freq:27.93;count:01;rules:0000;score:0658', > > > 'chr3:1-199501827', > > > 'AAAGGAATTCCATTCATCTCTGGATATTTTGAAATCATTAGGGCAAACAATAAATAA', 0L, > > > 171449529L, 171449529L, 1L, 23L, 'experimental', 'CHR03P006149104', > > > 6149104L, 5151L, 23L, 1L) > > > > > > In [6]: cur.execute(sqlcommand,b) > > > > > > In [7]: a[1].dtype > > > Out[7]: dtype([('PROBE_DESIGN_ID', '|S40'), ('CONTAINER', '|S40'), > > > ('DESIGN_NOTE', '|S80'), ('SELECTION_CRITERIA', '|S80'), ('SEQ_ID', > > > '|S40'), ('PROBE_SEQUENCE', '|S100'), ('MISMATCH', ' > > ('MATCH_INDEX', ' > > ('COL_NUM', ' > > ('POSITION', ' > > ' > > > > > Why does the casting using tuple() not work while cut-and-paste of > > > the a[1] record into a new variable works just fine? > > > > I answered part of the question myself. In the coercion back to > > tuple from a record, the datatypes remain numpy datatypes. Is there > > a way to convert back from numpy datatypes to standard python types > > (string, int, float, etc.) without needing to check every numpy type > > and determine the appropriate python type? In other words, is there > > a single function that I can feed a numpy type to (or a variable that > > has a numpy type) and have the standard python type (or an > > appropriately-coerced variable)? > > Use .tolist() method. Here is an example: > > In [92]: r=numpy.empty(5, 'f8,i4,f8') > > In [93]: type(tuple(r[0])[0]) > Out[93]: > > In [94]: type(r[0].tolist()[0]) > Out[94]: > > HTH, That will do it. Thanks, Sean -------------- next part -------------- An HTML attachment was scrubbed... URL: From martin.wiechert at gmx.de Tue Aug 28 08:03:52 2007 From: martin.wiechert at gmx.de (Martin Wiechert) Date: Tue, 28 Aug 2007 14:03:52 +0200 Subject: [Numpy-discussion] possibly ctypes related segfault In-Reply-To: <20070828111138.GB14395@mentat.za.net> References: <200708271457.28679.martin.wiechert@gmx.de> <20070828111138.GB14395@mentat.za.net> Message-ID: <200708281403.52444.martin.wiechert@gmx.de> Wow, thanks a lot for putting so much efffort! Here's the test script. I'm using it via execfile from an interactive session, so I can inspect (and crash with readline) afterwards. Here's how I compiled: gcc solver.c -fPIC -ggdb -shared -llapack -lf77blas -lcblas -latlas -lgfortran -o librectify.so Thanks, Martin On Tuesday 28 August 2007 13:11, Stefan van der Walt wrote: > Hi Martin > > On Mon, Aug 27, 2007 at 02:57:28PM +0200, Martin Wiechert wrote: > > I'm suffering from a strange segfault and would appreciate your help. 
> > > > I'm calling a small C function using ctypes / numpy.ctypeslib. The > > function works in the sense that it returns correct results. After > > calling the function however I can reliably evoke a segfault by using > > readline tab completion. > > > > I'm not very experienced, but this smells like a memory management bug to > > me, which is strange, because I'm not doing any mallocing/freeing at all > > in the C code. > > > > I could not reproduce the bug in a debug build of python > > (--without-pymalloc) or on another machine. The crashing machine is an > > eight-way opteron. > > I had to #include in solver, and modify cMonteCarlo not to > depend on GV. Then, I used > > gcc -o solver.os -c -O2 -ggdb -Wall -ansi -pedantic -fPIC solver.c > gcc -o librectify.so -shared solver.os -llapack > > to compile. > > Please send me the script that excercises the solver, then I will test > on my machines here. > > > --16266-- DWARF2 CFI reader: unhandled CFI instruction 0:10 > > --16266-- DWARF2 CFI reader: unhandled CFI instruction 0:10 > > This could be a valgrind issue. > > Cheers > St?fan > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: recttest.py Type: application/x-python Size: 389 bytes Desc: not available URL: From pgmdevlist at gmail.com Tue Aug 28 10:25:46 2007 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 28 Aug 2007 10:25:46 -0400 Subject: [Numpy-discussion] Maskedarray implementations In-Reply-To: <46D3135D.6000109@noaa.gov> References: <200708242027.41896.pgmdevlist@gmail.com> <46D3135D.6000109@noaa.gov> Message-ID: <200708281025.46269.pgmdevlist@gmail.com> On Monday 27 August 2007 14:09:33 Christopher Barker wrote: > This is the best bet, or we could call the new one ma, and the old one > ma_old. In any case, the old one needs to stick around until the new one > has been fully tested for compatibility (and otherwise). That shouldn't be a pb, the tests I've performed so far w/ the two implementations seem to run, but sure, that's the wisest. However, maskedarray is spread on several files (core, extras, mrecords, mstats). What would be the best structure for numpy, then ? From pgmdevlist at gmail.com Tue Aug 28 10:32:07 2007 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 28 Aug 2007 10:32:07 -0400 Subject: [Numpy-discussion] maskedarray : new developer zone wiki page In-Reply-To: <46D08770.7050007@hawaii.edu> References: <200708242027.41896.pgmdevlist@gmail.com> <200708251506.08568.pgmdevlist@gmail.com> <46D08770.7050007@hawaii.edu> Message-ID: <200708281032.07372.pgmdevlist@gmail.com> On Saturday 25 August 2007 15:48:00 Eric Firing wrote: > I've made a couple of small "emergency" edits, but a separate page would > make things much more visible and less confusing. So here it is: http://projects.scipy.org/scipy/numpy/wiki/MaskedArrayAlternative Please note the section : Optimizing maskedarray. You'll find the quick description of a test case (three implementations of divide) that emerged from on off-list discussion with Eric Firing. The problem can be formulated as "do we need to fill masked arrays before processing or not ?". Eric is in favor of the second solution (prefilling according to the domain mask), while the more it goes, the more I'm leaning towards the third one "bah, let numpy take care of that." 
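To make the trade-off a bit more concrete, here is a rough sketch in plain numpy of what the two options amount to for division. This is only an illustration, not the actual maskedarray code, and the variable names are made up:

import numpy

x = numpy.array([1., 2., 3.])
y = numpy.array([2., 0., 4.])
bad = (y == 0)                        # the domain mask

# "prefill" style: fix the offending denominators before dividing
z_prefill = x / numpy.where(bad, 1., y)

# "let numpy take care of that" style: divide first, mask afterwards
old = numpy.seterr(divide='ignore', invalid='ignore')
z_raw = x / y
numpy.seterr(**old)

# either way, the entries flagged by `bad` end up masked in the result.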
I would be very grateful if you could post your comments/ideas/suggestions about the three implementations on that list. This is an issue I'd like to solve ASAP. Thanks a lot in advance Pierre PS: Sorry if I bumped this thread, I'm not sure on what list I sent it. Cross-posting is bad... From stefan at sun.ac.za Tue Aug 28 12:45:27 2007 From: stefan at sun.ac.za (Stefan van der Walt) Date: Tue, 28 Aug 2007 18:45:27 +0200 Subject: [Numpy-discussion] possibly ctypes related segfault In-Reply-To: <200708281403.52444.martin.wiechert@gmx.de> References: <200708271457.28679.martin.wiechert@gmx.de> <20070828111138.GB14395@mentat.za.net> <200708281403.52444.martin.wiechert@gmx.de> Message-ID: <20070828164527.GA19381@mentat.za.net> On Tue, Aug 28, 2007 at 02:03:52PM +0200, Martin Wiechert wrote: > Here's the test script. I'm using it via execfile from an interactive session, > so I can inspect (and crash with readline) afterwards. > > Here's how I compiled: > gcc > solver.c -fPIC -ggdb -shared -llapack -lf77blas -lcblas -latlas -lgfortran -o > librectify.so It works perfectly on the two Linux machines I tried (32-bit and 64-bit). Maybe your lapack isn't healthy? Cheers St?fan From martin.wiechert at gmx.de Wed Aug 29 05:03:26 2007 From: martin.wiechert at gmx.de (Martin Wiechert) Date: Wed, 29 Aug 2007 11:03:26 +0200 Subject: [Numpy-discussion] possibly ctypes related segfault In-Reply-To: <20070828164527.GA19381@mentat.za.net> References: <200708271457.28679.martin.wiechert@gmx.de> <200708281403.52444.martin.wiechert@gmx.de> <20070828164527.GA19381@mentat.za.net> Message-ID: <200708291103.26636.martin.wiechert@gmx.de> Hmpf. Anyway, thanks again, Stefan! Cheers, Martin On Tuesday 28 August 2007 18:45, Stefan van der Walt wrote: > On Tue, Aug 28, 2007 at 02:03:52PM +0200, Martin Wiechert wrote: > > Here's the test script. I'm using it via execfile from an interactive > > session, so I can inspect (and crash with readline) afterwards. > > > > Here's how I compiled: > > gcc > > solver.c -fPIC -ggdb -shared -llapack -lf77blas -lcblas -latlas > > -lgfortran -o librectify.so > > It works perfectly on the two Linux machines I tried (32-bit and > 64-bit). Maybe your lapack isn't healthy? > > Cheers > St?fan From numpy-discussion at maubp.freeserve.co.uk Wed Aug 29 07:44:01 2007 From: numpy-discussion at maubp.freeserve.co.uk (Peter) Date: Wed, 29 Aug 2007 12:44:01 +0100 Subject: [Numpy-discussion] Citing Numeric and numpy Message-ID: <46D55C01.6050604@maubp.freeserve.co.uk> Dear Travis and the Numerical Python community, I would like to know if there is a preferred form for citing the old "Numeric" library and more recent "numpy" libraries in a publication. I have checked the mailing list archives, but didn't find an answer. I am not aware of any publication for the original Numeric library, leaving just the project webpage. Is something like this acceptable?: David Ascher et al. (2001) Numerical Python, http://www.numpy.org For NumPy, is it best to cite http://www.numpy.org or the book? Would this suffice for the NumPy book citation: Travis E. Oliphant (2006) Guide to NumPy, Trelgol Publishing, USA. It would be nice to have the full citation reference details (e.g. 
publisher's address and explicit year of publication) on the book's webpage: http://www.tramy.us Thanks, Peter From ryanlists at gmail.com Wed Aug 29 09:06:42 2007 From: ryanlists at gmail.com (Ryan Krauss) Date: Wed, 29 Aug 2007 08:06:42 -0500 Subject: [Numpy-discussion] Citing Numeric and numpy In-Reply-To: <46D55C01.6050604@maubp.freeserve.co.uk> References: <46D55C01.6050604@maubp.freeserve.co.uk> Message-ID: Obviously this is mainly Travis' question to answer and it depends on the nature of the reference, but I would like to see Travis's article in the recent special issue on Python for scientific use in CiSE cited as well because I think it does a great job of presenting why Python should be taken seriously as a language for scientific computing. FWIW, Ryan On 8/29/07, Peter wrote: > Dear Travis and the Numerical Python community, > > I would like to know if there is a preferred form for citing the old > "Numeric" library and more recent "numpy" libraries in a publication. I > have checked the mailing list archives, but didn't find an answer. > > I am not aware of any publication for the original Numeric library, > leaving just the project webpage. Is something like this acceptable?: > > David Ascher et al. (2001) Numerical Python, http://www.numpy.org > > For NumPy, is it best to cite http://www.numpy.org or the book? Would > this suffice for the NumPy book citation: > > Travis E. Oliphant (2006) Guide to NumPy, Trelgol Publishing, USA. > > It would be nice to have the full citation reference details (e.g. > publisher's address and explicit year of publication) on the book's > webpage: http://www.tramy.us > > Thanks, > > Peter > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From ryanlists at gmail.com Wed Aug 29 09:06:42 2007 From: ryanlists at gmail.com (Ryan Krauss) Date: Wed, 29 Aug 2007 08:06:42 -0500 Subject: [Numpy-discussion] Citing Numeric and numpy In-Reply-To: <46D55C01.6050604@maubp.freeserve.co.uk> References: <46D55C01.6050604@maubp.freeserve.co.uk> Message-ID: Obviously this is mainly Travis' question to answer and it depends on the nature of the reference, but I would like to see Travis's article in the recent special issue on Python for scientific use in CiSE cited as well because I think it does a great job of presenting why Python should be taken seriously as a language for scientific computing. FWIW, Ryan On 8/29/07, Peter wrote: > Dear Travis and the Numerical Python community, > > I would like to know if there is a preferred form for citing the old > "Numeric" library and more recent "numpy" libraries in a publication. I > have checked the mailing list archives, but didn't find an answer. > > I am not aware of any publication for the original Numeric library, > leaving just the project webpage. Is something like this acceptable?: > > David Ascher et al. (2001) Numerical Python, http://www.numpy.org > > For NumPy, is it best to cite http://www.numpy.org or the book? Would > this suffice for the NumPy book citation: > > Travis E. Oliphant (2006) Guide to NumPy, Trelgol Publishing, USA. > > It would be nice to have the full citation reference details (e.g. 
> publisher's address and explicit year of publication) on the book's > webpage: http://www.tramy.us > > Thanks, > > Peter > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From ryanlists at gmail.com Wed Aug 29 09:06:42 2007 From: ryanlists at gmail.com (Ryan Krauss) Date: Wed, 29 Aug 2007 08:06:42 -0500 Subject: [Numpy-discussion] Citing Numeric and numpy In-Reply-To: <46D55C01.6050604@maubp.freeserve.co.uk> References: <46D55C01.6050604@maubp.freeserve.co.uk> Message-ID: Obviously this is mainly Travis' question to answer and it depends on the nature of the reference, but I would like to see Travis's article in the recent special issue on Python for scientific use in CiSE cited as well because I think it does a great job of presenting why Python should be taken seriously as a language for scientific computing. FWIW, Ryan On 8/29/07, Peter wrote: > Dear Travis and the Numerical Python community, > > I would like to know if there is a preferred form for citing the old > "Numeric" library and more recent "numpy" libraries in a publication. I > have checked the mailing list archives, but didn't find an answer. > > I am not aware of any publication for the original Numeric library, > leaving just the project webpage. Is something like this acceptable?: > > David Ascher et al. (2001) Numerical Python, http://www.numpy.org > > For NumPy, is it best to cite http://www.numpy.org or the book? Would > this suffice for the NumPy book citation: > > Travis E. Oliphant (2006) Guide to NumPy, Trelgol Publishing, USA. > > It would be nice to have the full citation reference details (e.g. > publisher's address and explicit year of publication) on the book's > webpage: http://www.tramy.us > > Thanks, > > Peter > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From aisaac at american.edu Wed Aug 29 09:17:21 2007 From: aisaac at american.edu (Alan G Isaac) Date: Wed, 29 Aug 2007 09:17:21 -0400 Subject: [Numpy-discussion] Citing Numeric and numpy In-Reply-To: <46D55C01.6050604@maubp.freeserve.co.uk> References: <46D55C01.6050604@maubp.freeserve.co.uk> Message-ID: On Wed, 29 Aug 2007, Peter apparently wrote: > I would like to know if there is a preferred form for > citing the old > "Numeric" library I'll attach text from the first two pages of *Numerical Python* below. Cheers, Alan Isaac ------------------------------------------------------------- An Open Source Project Numerical Python David Ascher Paul F. Dubois Konrad Hinsen Jim Hugunin Travis Oliphant with contributions from the Numerical Python community. September 7, 2001 Lawrence Livermore National Laboratory, Livermore, CA 94566 UCRL?MA?128569 ii Legal Notice Please see file Legal.html in the source distribution. This open source project has been contributed to by many people, including personnel of the Lawrence Liver? more National Laboratory. The following notice covers those contributions including this manual. Copyright (c) 1999, 2000, 2001. The Regents of the University of California. All rights reserved. Permission to use, copy, modify, and distribute this software for any purpose without fee is hereby granted, provided that this entire notice is included in all copies of any software which is or includes a copy or modifi? 
cation of this software and in all copies of the supporting documentation for such software. This work was produced at the University of California, Lawrence Livermore National Laboratory under con? tract no. W?7405?ENG?48 between the U.S. Department of Energy and The Regents of the University of Cali? fornia for the operation of UC LLNL. From Joris.DeRidder at ster.kuleuven.be Wed Aug 29 10:56:02 2007 From: Joris.DeRidder at ster.kuleuven.be (Joris De Ridder) Date: Wed, 29 Aug 2007 16:56:02 +0200 Subject: [Numpy-discussion] Trac ticket Message-ID: <1F561FA8-53D7-483D-9772-4F1E60E37510@ster.kuleuven.be> Hi, Perhaps a stupid question, but I don't seem to find any info about it on the web. I would like to take up a (simple) Numpy Trac ticket, and fix it in the Numpy trunk. How can I assign the ticket to myself? After logging in, I don't see any obvious way of doing this. Secondly, committing a fix back to the SVN repository seems to require a specific login/pw, how to get one (assuming my fix is welcome)? Cheers, Joris Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm From lists.steve at arachnedesign.net Wed Aug 29 11:05:14 2007 From: lists.steve at arachnedesign.net (Steve Lianoglou) Date: Wed, 29 Aug 2007 11:05:14 -0400 Subject: [Numpy-discussion] Trac ticket In-Reply-To: <1F561FA8-53D7-483D-9772-4F1E60E37510@ster.kuleuven.be> References: <1F561FA8-53D7-483D-9772-4F1E60E37510@ster.kuleuven.be> Message-ID: > Perhaps a stupid question, but I don't seem to find any info about it > on the web. > I would like to take up a (simple) Numpy Trac ticket, and fix it in > the Numpy trunk. How can I assign the ticket to myself? I'm not sure how the trac system is setup @ numpy, but you may not have the perms to do that yourself. Perhaps you can add a comment to the ticket saying that you are working on it (an expected completion date may be helpful) > After logging > in, I don't see any obvious way of doing this. Secondly, committing a > fix back to the SVN repository seems to require a specific login/pw, > how to get one (assuming my fix is welcome)? You should most likely just attach a patch against the latest trunk to the ticket itself for review. -steve From charlesr.harris at gmail.com Wed Aug 29 11:42:50 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 29 Aug 2007 09:42:50 -0600 Subject: [Numpy-discussion] Bug in resize method? Message-ID: Hi all, This looks like a bug to me. >>> a = arange(6).reshape(2,3) >>> a.resize((3,3)) Traceback (most recent call last): File "", line 1, in ValueError: cannot resize this array: it does not own its data Is there any reason resize should fail in this case? Resize should be returning an new array, no? There are several other things that look like bugs in this method, for instance: >>> a = arange(6).resize((2,3)) >>> a `a` has no value and no error is raised. The resize function works as expected >>> resize(a,(3,3)) array([[0, 1, 2], [3, 4, 5], [0, 1, 2]]) Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Wed Aug 29 11:58:57 2007 From: stefan at sun.ac.za (Stefan van der Walt) Date: Wed, 29 Aug 2007 17:58:57 +0200 Subject: [Numpy-discussion] Bug in resize method? In-Reply-To: References: Message-ID: <20070829155856.GV14395@mentat.za.net> Hi Charles On Wed, Aug 29, 2007 at 09:42:50AM -0600, Charles R Harris wrote: > Hi all, > > This looks like a bug to me. 
> > >>> a = arange(6).reshape(2,3) > >>> a.resize((3,3)) > Traceback (most recent call last): > File "", line 1, in > ValueError: cannot resize this array: it does not own its data >From the docstring of a.resize: Change size and shape of self inplace. Array must own its own memory and not be referenced by other arrays. Returns None. The reshaped array is a view on the original data, hence it doesn't own it: In [15]: a = N.arange(6).reshape(2,3) In [16]: a.flags Out[16]: C_CONTIGUOUS : True F_CONTIGUOUS : False OWNDATA : False WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False > >>> a = arange(6).resize((2,3)) > >>> a > > `a` has no value and no error is raised. It is because `a` is now None. Cheers St?fan From charlesr.harris at gmail.com Wed Aug 29 12:28:21 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 29 Aug 2007 10:28:21 -0600 Subject: [Numpy-discussion] Bug in resize method? In-Reply-To: <20070829155856.GV14395@mentat.za.net> References: <20070829155856.GV14395@mentat.za.net> Message-ID: On 8/29/07, Stefan van der Walt wrote: > > Hi Charles > > On Wed, Aug 29, 2007 at 09:42:50AM -0600, Charles R Harris wrote: > > Hi all, > > > > This looks like a bug to me. > > > > >>> a = arange(6).reshape(2,3) > > >>> a.resize((3,3)) > > Traceback (most recent call last): > > File "", line 1, in > > ValueError: cannot resize this array: it does not own its data > > >From the docstring of a.resize: > > Change size and shape of self inplace. Array must own its own memory > and > not be referenced by other arrays. Returns None. The documentation is bogus: >>> a = arange(6).reshape(2,3) >>> a array([[0, 1, 2], [3, 4, 5]]) >>> a.flags C_CONTIGUOUS : True F_CONTIGUOUS : False OWNDATA : False WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False >>> a.resize((3,2)) >>> a array([[0, 1], [2, 3], [4, 5]]) The reshaped array is a view on the original data, hence it doesn't > own it: > > In [15]: a = N.arange(6).reshape(2,3) > > In [16]: a.flags > Out[16]: > C_CONTIGUOUS : True > F_CONTIGUOUS : False > OWNDATA : False > WRITEABLE : True > ALIGNED : True > UPDATEIFCOPY : False > > > >>> a = arange(6).resize((2,3)) > > >>> a > > > > `a` has no value and no error is raised. > > It is because `a` is now None. This behaviour doesn't match documentation elsewhere, which is why I am raising the question. What *should* the resize method do? It looks like it is equivalent to assigning a shape tuple to a.shape, so why do we need it? Apart from that, the reshape method looks like it would serve for most cases. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From frankb.mail at gmail.com Wed Aug 29 12:41:29 2007 From: frankb.mail at gmail.com (F Bitonti) Date: Wed, 29 Aug 2007 12:41:29 -0400 Subject: [Numpy-discussion] linear algebra error? Message-ID: <22887860708290941h1acf2085g85193fd09ed3d5e8@mail.gmail.com> I am trying to install the linear algebra package from the NumPy package and i keep getting this error. I have the most recent version of numpy 1.0.3.1. 
this is the error Traceback (most recent call last): File "C:\Python25\lib\site-packages\numpy\linalg\setup.py", line 31, in setup(configuration=configuration) File "C:\Python25\Lib\site-packages\numpy\distutils\core.py", line 113, in setup return setup(**attr) File "C:\Python25\Lib\site-packages\numpy\distutils\core.py", line 173, in setup return old_setup(**new_attr) File "C:\Python25\lib\distutils\core.py", line 168, in setup raise SystemExit, "error: " + str(msg) SystemExit: error: Python was built with Visual Studio 2003; extensions must be built with a compiler than can generate compatible binaries. Visual Studio 2003 was not found on this system. If you have Cygwin installed, you can try compiling with MingW32, by passing "-c mingw32" to setup.py. However, I was told that the reason I am getting this error is because the linear algebra module is already installed yet it dosn't seem to be becaues when I exectue the following commands i get these error messages. Am I doing someting wrong I have only been using python two days. >>> from numpy import * >>> from linalg import * Traceback (most recent call last): File "", line 1, in from linalg import * ImportError: No module named linalg >>> a = reshape(arange(25.0), (5,5)) + identity(5) >>> print a [[ 1. 1. 2. 3. 4.] [ 5. 7. 7. 8. 9.] [ 10. 11. 13. 13. 14.] [ 15. 16. 17. 19. 19.] [ 20. 21. 22. 23. 25.]] >>> inv_a = inverse(a) Traceback (most recent call last): File "", line 1, in inv_a = inverse(a) NameError: name 'inverse' is not defined >>> Thank you for any help you can provide. -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthieu.brucher at gmail.com Wed Aug 29 12:43:42 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Wed, 29 Aug 2007 18:43:42 +0200 Subject: [Numpy-discussion] linear algebra error? In-Reply-To: <22887860708290941h1acf2085g85193fd09ed3d5e8@mail.gmail.com> References: <22887860708290941h1acf2085g85193fd09ed3d5e8@mail.gmail.com> Message-ID: > > > >>> from numpy import * > >>> from numpy.linalg import * > linalg is in the numpy module -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chris.Barker at noaa.gov Wed Aug 29 12:52:15 2007 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed, 29 Aug 2007 09:52:15 -0700 Subject: [Numpy-discussion] Bug in resize method? In-Reply-To: References: <20070829155856.GV14395@mentat.za.net> Message-ID: <46D5A43F.5080702@noaa.gov> Charles R Harris wrote: > What *should* the resize method do? It looks like > it is equivalent to assigning a shape tuple to a.shape, No, that's what reshape does. > so why do we need it? resize() will change the SIZE of the array (number of elements), where reshape() will only change the shape, but not the number of elements. The fact that the size is changing is why it won't work if if doesn't own the data. >>> a = N.array((1,2,3)) >>> a.reshape((6,)) Traceback (most recent call last): File "", line 1, in ValueError: total size of new array must be unchanged can't reshape to a shape that is a different size. >>> b = a.resize((6,)) >>> repr(b) 'None' resize changes the array in place, so it returns None, but a has been changed: >>> a array([1, 2, 3, 0, 0, 0]) Perhaps you want the function, rather than the method: >>> b = N.resize(a, (12,)) >>> b array([1, 2, 3, 0, 0, 0, 1, 2, 3, 0, 0, 0]) >>> a array([1, 2, 3, 0, 0, 0]) a hasn't been changed, b is a brand new array. -CHB Apart from that, the reshape method looks like it would serve > for most cases. 
> > Chuck > > > ------------------------------------------------------------------------ > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From charlesr.harris at gmail.com Wed Aug 29 12:59:22 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 29 Aug 2007 10:59:22 -0600 Subject: [Numpy-discussion] Trace returns int32 type for int8 array. Message-ID: Hi all, The documentation of trace says it returns the same type as the array. Yet: >>> trace(eye(2, dtype=int8)).dtype dtype('int32') For float types this promotion does not occur >>> trace(eye(2, dtype=float32)).dtype dtype('float32') Trace operates the same way as sum. What should be the case here? And if type promotion is the default, shouldn't float32 be promoted to double? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Wed Aug 29 13:03:51 2007 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 29 Aug 2007 12:03:51 -0500 Subject: [Numpy-discussion] linear algebra error? In-Reply-To: <22887860708290941h1acf2085g85193fd09ed3d5e8@mail.gmail.com> References: <22887860708290941h1acf2085g85193fd09ed3d5e8@mail.gmail.com> Message-ID: <46D5A6F7.3060507@gmail.com> F Bitonti wrote: > However, I was told that the reason I am getting this error is because > the linear algebra module is already installed The more proximate reasonThat's not the reason why you are getting the error, it's just that you don't need to and shouldn't try to execute that setup.py since it's already installed. that you are getting that particular traceback is because you don't have a compiler installed. However, that's not relevant here since numpy.linalg is already installed. > yet it dosn't seem to be > becaues when I exectue the following commands i get these error > messages. Am I doing someting wrong I have only been using python two days. > >>>> from numpy import * >>>> from linalg import * from numpy.linalg import * inv(...) Or preferably: from numpy import linalg linalg.inv(...) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Wed Aug 29 13:14:38 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 29 Aug 2007 11:14:38 -0600 Subject: [Numpy-discussion] Bug in resize method? In-Reply-To: <46D5A43F.5080702@noaa.gov> References: <20070829155856.GV14395@mentat.za.net> <46D5A43F.5080702@noaa.gov> Message-ID: On 8/29/07, Christopher Barker wrote: > > Charles R Harris wrote: > > What *should* the resize method do? It looks like > > it is equivalent to assigning a shape tuple to a.shape, > > No, that's what reshape does. No, reshape returns a view and the view doesn't own its data. Totally different behavior in this context. > so why do we need it? > > resize() will change the SIZE of the array (number of elements), where > reshape() will only change the shape, but not the number of elements. > The fact that the size is changing is why it won't work if if doesn't > own the data. 
According to the documentation, the resize method changes the array inplace. How can it be inplace if the number of elements changes? Admittedly, it *will* change the size, but that is not consistent with the documentation. I suspect it reallocates memory and (hopefully) frees the old, but then that is what the documentation should say because it explains why the data must be owned -- a condition violated in some cases as demonstrated above. I am working on documentation and that is why I am raising these questions. There seem to be some inconsistencies that need clarification and/or fixing. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed Aug 29 13:25:25 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 29 Aug 2007 11:25:25 -0600 Subject: [Numpy-discussion] svn down Message-ID: Hi all, The svn server seems to be down, I am getting error messages from the buildbots: svn: PROPFIND request failed on '/svn/numpy/trunk' svn: PROPFIND of '/svn/numpy/trunk': could not connect to server (http://scipy.org) program finished with exit code 1 It might be reasonable to check this case before sending posts. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.hochberg at ieee.org Wed Aug 29 13:30:33 2007 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Wed, 29 Aug 2007 10:30:33 -0700 Subject: [Numpy-discussion] Bug in resize method? In-Reply-To: References: <20070829155856.GV14395@mentat.za.net> <46D5A43F.5080702@noaa.gov> Message-ID: On 8/29/07, Charles R Harris wrote: > > > > On 8/29/07, Christopher Barker wrote: > > > > Charles R Harris wrote: > > > What *should* the resize method do? It looks like > > > it is equivalent to assigning a shape tuple to a.shape, > > > > No, that's what reshape does. > > > No, reshape returns a view and the view doesn't own its data. Totally > different behavior in this context. > > > so why do we need it? > > > > resize() will change the SIZE of the array (number of elements), where > > reshape() will only change the shape, but not the number of elements. > > The fact that the size is changing is why it won't work if if doesn't > > own the data. > > > According to the documentation, the resize method changes the array > inplace. How can it be inplace if the number of elements changes? > It sounds like you and Chris are talking past each other on a matter of terminology. At a C-level, it's obviously not (necessarily) in place, since the array may get realloced as you surmise below. However, at the Python level, the change is in fact in place, in the same sense that appending to a Python list operates in-place, even though under the covers memory may get realloced there as well. > Admittedly, it *will* change the size, but that is not consistent with the > documentation. I suspect it reallocates memory and (hopefully) frees the > old, but then that is what the documentation should say because it explains > why the data must be owned -- a condition violated in some cases as > demonstrated above. I am working on documentation and that is why I am > raising these questions. There seem to be some inconsistencies that need > clarification and/or fixing. > The main inconsistency I see above is that resize appears to only require ownership of the data if in fact the number of items changes. I don't think that's actually a bug, but I don't like it much; I would prefer that resize be strict and always require ownership. 
However, I'm fairly certain that there are people that prefer "friendliness" over consistency, so I wouldn't be surprised to get some pushback on changing that. -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From chanley at stsci.edu Wed Aug 29 13:37:10 2007 From: chanley at stsci.edu (Christopher Hanley) Date: Wed, 29 Aug 2007 13:37:10 -0400 Subject: [Numpy-discussion] svn down In-Reply-To: References: Message-ID: <46D5AEC6.5010505@stsci.edu> This could be a problem with the buildbots. I was just able to update from svn. Chris Charles R Harris wrote: > Hi all, > > The svn server seems to be down, I am getting error messages from the > buildbots: > > svn: PROPFIND request failed on '/svn/numpy/trunk' > svn: PROPFIND of '/svn/numpy/trunk': could not connect to server ( > http://scipy.org) > program finished with exit code 1 > > It might be reasonable to check this case before sending posts. > > Chuck > > > ------------------------------------------------------------------------ > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion -- Christopher Hanley Systems Software Engineer Space Telescope Science Institute 3700 San Martin Drive Baltimore MD, 21218 (410) 338-4338 From mpmusu at cc.usu.edu Wed Aug 29 14:05:09 2007 From: mpmusu at cc.usu.edu (Mark.Miller) Date: Wed, 29 Aug 2007 12:05:09 -0600 Subject: [Numpy-discussion] Finding unique rows in an array [Was: Finding a row match within a numpy array] In-Reply-To: <200708221111.17141.faltet@carabos.com> References: <46C2CD01.5030307@bristol.ac.uk> <1187190071.384881.240470@w3g2000hsg.googlegroups.com> <46CAF562.9060009@cc.usu.edu> <200708221111.17141.faltet@carabos.com> Message-ID: <46D5B555.2070904@cc.usu.edu> A belated thanks...but yes. That does the trick. I've not worked with views explicitly, so I appreciate the input. I definitely foresee additional applications of these types of things in the future. Thanks again, -Mark Francesc Altet wrote: > > You can always do a view of the rows as strings and then use unique(). > Here is an example: > > In [1]: import numpy > In [2]: a=numpy.arange(12).reshape(4,3) > In [3]: a[2]=(3,4,5) > In [4]: a > Out[4]: > array([[ 0, 1, 2], > [ 3, 4, 5], > [ 3, 4, 5], > [ 9, 10, 11]]) > > now, create the view and select the unique rows: > > In [5]: b=numpy.unique(a.view('S%d'%a.itemsize*a.shape[0])).view('i4') > > and finally restore the shape: > > In [6]: b.reshape((len(b)/a.shape[1], a.shape[1])) > Out[6]: > array([[ 0, 1, 2], > [ 3, 4, 5], > [ 9, 10, 11]]) > > If you want to find unique columns instead of rows, do a tranpose first > on the initial array. > > Cheers, > From peridot.faceted at gmail.com Wed Aug 29 14:19:22 2007 From: peridot.faceted at gmail.com (Anne Archibald) Date: Wed, 29 Aug 2007 14:19:22 -0400 Subject: [Numpy-discussion] Bug in resize method? In-Reply-To: References: <20070829155856.GV14395@mentat.za.net> <46D5A43F.5080702@noaa.gov> Message-ID: On 29/08/2007, Timothy Hochberg wrote: > The main inconsistency I see above is that resize appears to only require > ownership of the data if in fact the number of items changes. I don't think > that's actually a bug, but I don't like it much; I would prefer that resize > be strict and always require ownership. 
However, I'm fairly certain that > there are people that prefer "friendliness" over consistency, so I wouldn't > be surprised to get some pushback on changing that. It seems to me like inplace resize is a problem, no matter how you implement it --- is there any way to verify that no view exists of a given array? (refcounts won't do it since there are other, non-view ways to increase the refcount of an array.) If there's a view of an array, you resize() it in place, and realloc() moves the data, the views now point to bogus memory: you can cause the python interpreter to segfault by addressing their contents. I really can't see any way around this; why not remove inplace resize() (or make it raise exceptions if the size has to change) and allow only the function resize()? Anne From charlesr.harris at gmail.com Wed Aug 29 14:29:22 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 29 Aug 2007 12:29:22 -0600 Subject: [Numpy-discussion] Bug in resize method? In-Reply-To: References: <20070829155856.GV14395@mentat.za.net> <46D5A43F.5080702@noaa.gov> Message-ID: On 8/29/07, Timothy Hochberg wrote: > > > > On 8/29/07, Charles R Harris wrote: > > > > > > > > On 8/29/07, Christopher Barker < Chris.Barker at noaa.gov> wrote: > > > > > > Charles R Harris wrote: > > > > What *should* the resize method do? It looks like > > > > it is equivalent to assigning a shape tuple to a.shape, > > > > > > No, that's what reshape does. > > > > > > No, reshape returns a view and the view doesn't own its data. Totally > > different behavior in this context. > > > > > so why do we need it? > > > > > > resize() will change the SIZE of the array (number of elements), where > > > > > > reshape() will only change the shape, but not the number of elements. > > > The fact that the size is changing is why it won't work if if doesn't > > > own the data. > > > > > > According to the documentation, the resize method changes the array > > inplace. How can it be inplace if the number of elements changes? > > > > It sounds like you and Chris are talking past each other on a matter of > terminology. At a C-level, it's obviously not (necessarily) in place, since > the array may get realloced as you surmise below. However, at the Python > level, the change is in fact in place, in the same sense that appending to a > Python list operates in-place, even though under the covers memory may get > realloced there as well. > > > > Admittedly, it *will* change the size, but that is not consistent with > > the documentation. I suspect it reallocates memory and (hopefully) frees the > > old, but then that is what the documentation should say because it explains > > why the data must be owned -- a condition violated in some cases as > > demonstrated above. I am working on documentation and that is why I am > > raising these questions. There seem to be some inconsistencies that need > > clarification and/or fixing. > > > > The main inconsistency I see above is that resize appears to only require > ownership of the data if in fact the number of items changes. I don't think > that's actually a bug, but I don't like it much; I would prefer that resize > be strict and always require ownership. However, I'm fairly certain that > there are people that prefer "friendliness" over consistency, so I wouldn't > be surprised to get some pushback on changing that. > I still don't see why the method is needed at all. 
Given the conditions on the array, the only thing it buys you over the resize function or a reshape is the automatic deletion of the old memory if new memory is allocated. And the latter is easily done as a = reshape(a, new_shape). I know there was a push to make most things methods, but it is possible to overdo it. Is this a Numarray compatibility issue? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.hochberg at ieee.org Wed Aug 29 14:31:12 2007 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Wed, 29 Aug 2007 11:31:12 -0700 Subject: [Numpy-discussion] Bug in resize method? In-Reply-To: References: <20070829155856.GV14395@mentat.za.net> <46D5A43F.5080702@noaa.gov> Message-ID: On 8/29/07, Anne Archibald wrote: > > On 29/08/2007, Timothy Hochberg wrote: > > > The main inconsistency I see above is that resize appears to only > require > > ownership of the data if in fact the number of items changes. I don't > think > > that's actually a bug, but I don't like it much; I would prefer that > resize > > be strict and always require ownership. However, I'm fairly certain that > > there are people that prefer "friendliness" over consistency, so I > wouldn't > > be surprised to get some pushback on changing that. > > It seems to me like inplace resize is a problem, no matter how you > implement it --- is there any way to verify that no view exists of a > given array? (refcounts won't do it since there are other, non-view > ways to increase the refcount of an array.) I think that may be overstating the problem a bit; refcounts should work in the sense that they would prevent segfaults. They'll just be too conservative in many cases, preventing resizes in cases where they would otherwise work. > If there's a view of an > array, you resize() it in place, and realloc() moves the data, the > views now point to bogus memory: you can cause the python interpreter > to segfault by addressing their contents. I really can't see any way > around this; why not remove inplace resize() (or make it raise > exceptions if the size has to change) and allow only the function > resize()? Probably because in a few cases, it's vastly more efficient to realloc the data than to copy it. FWIW, I don't use either the resize function or the resize method, but if I was going to get rid of one, personally I'd axe the function. Resizing is a confusing operation and the function doesn't have the possibility of better efficiency to justify it's existence. -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Wed Aug 29 14:34:28 2007 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 29 Aug 2007 13:34:28 -0500 Subject: [Numpy-discussion] Bug in resize method? In-Reply-To: References: <20070829155856.GV14395@mentat.za.net> <46D5A43F.5080702@noaa.gov> Message-ID: <46D5BC34.7080400@gmail.com> Anne Archibald wrote: > On 29/08/2007, Timothy Hochberg wrote: > >> The main inconsistency I see above is that resize appears to only require >> ownership of the data if in fact the number of items changes. I don't think >> that's actually a bug, but I don't like it much; I would prefer that resize >> be strict and always require ownership. However, I'm fairly certain that >> there are people that prefer "friendliness" over consistency, so I wouldn't >> be surprised to get some pushback on changing that. 
> > It seems to me like inplace resize is a problem, no matter how you > implement it --- is there any way to verify that no view exists of a > given array? (refcounts won't do it since there are other, non-view > ways to increase the refcount of an array.) Yes, as long as every view is created using the C API correctly. That's why Chuck saw the exception he did, because he tried to resize() an array that had a view stuck of it (or rather, he was trying to resize() the view, which didn't have ownership of the data). In [8]: from numpy import * In [9]: a = zeros(10) In [10]: a.resize(15) In [11]: b = a[:] In [12]: a.resize(20) --------------------------------------------------------------------------- ValueError Traceback (most recent call last) /Users/rkern/src/VTK-5.0.2/ in () ValueError: cannot resize an array that has been referenced or is referencing another array in this way. Use the resize function Of course, if you muck around with the raw data pointer using ctypes, you might have problems, but that's ctypes for you. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From gael.varoquaux at normalesup.org Wed Aug 29 14:37:57 2007 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Wed, 29 Aug 2007 20:37:57 +0200 Subject: [Numpy-discussion] Bug in resize method? In-Reply-To: References: <20070829155856.GV14395@mentat.za.net> <46D5A43F.5080702@noaa.gov> Message-ID: <20070829183757.GB10641@clipper.ens.fr> On Wed, Aug 29, 2007 at 11:31:12AM -0700, Timothy Hochberg wrote: > FWIW, I don't use either the resize function or the resize method, but if > I was going to get rid of one, personally I'd axe the function. Resizing > is a confusing operation and the function doesn't have the possibility of > better efficiency to justify it's existence. My understand of OOP is that I expect a method to modify an object in place, and a function to return a new object (or a view). Now this is not true with Python, as some objects are imutable and this is not possible, but at least there seems to be some logic that a method returns a new object only if the object is imutable. With numpy I often fail to see the logic, but I'd love to see one. Ga?l From tim.hochberg at ieee.org Wed Aug 29 14:38:54 2007 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Wed, 29 Aug 2007 11:38:54 -0700 Subject: [Numpy-discussion] Bug in resize method? In-Reply-To: References: <20070829155856.GV14395@mentat.za.net> <46D5A43F.5080702@noaa.gov> Message-ID: On 8/29/07, Charles R Harris wrote: > > I still don't see why the method is needed at all. Given the conditions on > the array, the only thing it buys you over the resize function or a reshape > is the automatic deletion of the old memory if new memory is allocated. > Can you explain this more? Both you and Anne seem to share the opinion that the resize method is useless, while the resize function is useful. So, now I'm worried I'm missing something since as far as I can tell the function is useless and the method is only mostly useless. > And the latter is easily done as a = reshape(a, new_shape). I know there > was a push to make most things methods, > In general I think methods are easy to overdo, but I'm not on board for this particular case. but it is possible to overdo it. Is this a Numarray compatibility issue? > Dunno about that. -- . __ . |-\ . . 
tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed Aug 29 15:14:54 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 29 Aug 2007 13:14:54 -0600 Subject: [Numpy-discussion] Bug in resize method? In-Reply-To: References: <20070829155856.GV14395@mentat.za.net> <46D5A43F.5080702@noaa.gov> Message-ID: On 8/29/07, Timothy Hochberg wrote: > > > > On 8/29/07, Charles R Harris wrote: > > > > > I still don't see why the method is needed at all. Given the conditions > > on the array, the only thing it buys you over the resize function or a > > reshape is the automatic deletion of the old memory if new memory is > > allocated. > > > > Can you explain this more? Both you and Anne seem to share the opinion > that the resize method is useless, while the resize function is useful. So, > now I'm worried I'm missing something since as far as I can tell the > function is useless and the method is only mostly useless. > Heh. I might dump both. The resize function is a concatenation followed by reshape. It differs from the resize method in that it always returns a new array and repeats the data instead of filling with zeros. The inconsistency in the way the array is filled bothers me a bit, I would have just named the method realloc. I really don't see the need for either except for backward compatibility. Maybe someone can make a case. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From efiring at hawaii.edu Wed Aug 29 15:25:59 2007 From: efiring at hawaii.edu (Eric Firing) Date: Wed, 29 Aug 2007 09:25:59 -1000 Subject: [Numpy-discussion] Bug in resize method? In-Reply-To: References: <20070829155856.GV14395@mentat.za.net> <46D5A43F.5080702@noaa.gov> Message-ID: <46D5C847.8080208@hawaii.edu> Timothy Hochberg wrote: > > > On 8/29/07, *Charles R Harris* > wrote: > > > I still don't see why the method is needed at all. Given the > conditions on the array, the only thing it buys you over the resize > function or a reshape is the automatic deletion of the old memory if > new memory is allocated. > > > Can you explain this more? Both you and Anne seem to share the opinion > that the resize method is useless, while the resize function is useful. > So, now I'm worried I'm missing something since as far as I can tell the > function is useless and the method is only mostly useless. The resize function docstring makes the following distinction: Definition: numpy.resize(a, new_shape) Docstring: Return a new array with the specified shape. The original array's total size can be any size. The new array is filled with repeated copies of a. Note that a.resize(new_shape) will fill the array with 0's beyond current definition of a. So the method and the function are subtly different. As far as I can see, the method is causing more trouble than it is worth. Under what circumstances, in real code, can it provide enough benefit to override the penalty it is now exacting in confusion? 
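About the only real-code benefit I can think of is growing a buffer incrementally without forcing a copy at every step, something like the following toy sketch (assuming the array owns its data and nothing else references it):

import numpy as N
a = N.zeros(4)
for block in range(3):
    n = a.size
    a.resize((n + 4,))    # realloc in place; the new tail is zero-filled

Whether that justifies the confusion is another matter.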
Eric From ipan at freeshell.org Wed Aug 29 15:34:24 2007 From: ipan at freeshell.org (Ivan Pan) Date: Wed, 29 Aug 2007 14:34:24 -0500 Subject: [Numpy-discussion] [SciPy-dev] NumPy 1.0.3.x and SciPy 0.5.2.x In-Reply-To: <85b5c3130708192030t6947d623oa00a710a229cbd5d@mail.gmail.com> References: <85b5c3130708192030t6947d623oa00a710a229cbd5d@mail.gmail.com> Message-ID: On 8/19/07, Ondrej Certik wrote: > I don't know what the native way of installing packages on Mac OS > X is, but I know of the fink project, that basically allows to use > debian packages: > > http://finkproject.org/ Besides fink, there is also MacPort . It is similar to BSD Portage. They have fairly recent SciPy (0.5.2), NumPy (1.0.3), IPython (0.8.1) and many more ... Chris Fonnesbeck provides a Mac OS X installer for SciPy (0.5.3), NumPy (1.0.4), Matplotlib (0.90.1), IPython (0.8.2) with readline, and PyMC (1.3). He provdies binaries for both Intel and PPC version. It is fairly up-to-date. He releases weekly or bi-monthly. ip From myeates at jpl.nasa.gov Wed Aug 29 15:59:18 2007 From: myeates at jpl.nasa.gov (Mathew Yeates) Date: Wed, 29 Aug 2007 12:59:18 -0700 Subject: [Numpy-discussion] help! not using lapack Message-ID: <46D5D016.2070000@jpl.nasa.gov> Hi When I try import numpy id(numpy.dot) == id(numpy.core.multiarray.dot) I get True. But I have liblapck.a installed in ~/lib and I put the lines [DEFAULT] library_dirs = /home/myeates/lib include_dirs = /home/myeates/include in site.cfg In fact, when I build and run a sytem trace I see that liblapack.a is being accessed. Any ideas? Mathew From robert.kern at gmail.com Wed Aug 29 16:12:30 2007 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 29 Aug 2007 15:12:30 -0500 Subject: [Numpy-discussion] help! not using lapack In-Reply-To: <46D5D016.2070000@jpl.nasa.gov> References: <46D5D016.2070000@jpl.nasa.gov> Message-ID: <46D5D32E.4060507@gmail.com> Mathew Yeates wrote: > Hi > When I try > import numpy > id(numpy.dot) == id(numpy.core.multiarray.dot) > > I get True. But I have liblapck.a installed in ~/lib and I put the lines > [DEFAULT] > library_dirs = /home/myeates/lib > include_dirs = /home/myeates/include > > in site.cfg > In fact, when I build and run a sytem trace I see that liblapack.a is > being accessed. > > Any ideas? It is possible that you have a linking problem with _dotblas.so. On some systems, such a problem will only manifest itself at run-time, not build-time. At runtime, you will get an ImportError, which we catch because that's also the error one gets if the _dotblas is legitimately absent. Try importing _dotblas by itself to see the error message. In [8]: from numpy.core import _dotblas Most likely you are missing the appropriate libblas, too, since you don't mention it. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From myeates at jpl.nasa.gov Wed Aug 29 16:15:39 2007 From: myeates at jpl.nasa.gov (Mathew Yeates) Date: Wed, 29 Aug 2007 13:15:39 -0700 Subject: [Numpy-discussion] help! not using lapack In-Reply-To: <46D5D32E.4060507@gmail.com> References: <46D5D016.2070000@jpl.nasa.gov> <46D5D32E.4060507@gmail.com> Message-ID: <46D5D3EB.4070608@jpl.nasa.gov> yes, I get from numpy.core import _dotblas ImportError: No module named multiarray Now what? 
uname -a Linux 2.6.9-55.0.2.EL #1 Tue Jun 12 17:47:10 EDT 2007 i686 athlon i386 GNU/Linux Robert Kern wrote: > Mathew Yeates wrote: > >> Hi >> When I try >> import numpy >> id(numpy.dot) == id(numpy.core.multiarray.dot) >> >> I get True. But I have liblapck.a installed in ~/lib and I put the lines >> [DEFAULT] >> library_dirs = /home/myeates/lib >> include_dirs = /home/myeates/include >> >> in site.cfg >> In fact, when I build and run a sytem trace I see that liblapack.a is >> being accessed. >> >> Any ideas? >> > > It is possible that you have a linking problem with _dotblas.so. On some > systems, such a problem will only manifest itself at run-time, not build-time. > At runtime, you will get an ImportError, which we catch because that's also the > error one gets if the _dotblas is legitimately absent. > > Try importing _dotblas by itself to see the error message. > > > In [8]: from numpy.core import _dotblas > > > Most likely you are missing the appropriate libblas, too, since you don't > mention it. > > From robert.kern at gmail.com Wed Aug 29 16:18:56 2007 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 29 Aug 2007 15:18:56 -0500 Subject: [Numpy-discussion] help! not using lapack In-Reply-To: <46D5D3EB.4070608@jpl.nasa.gov> References: <46D5D016.2070000@jpl.nasa.gov> <46D5D32E.4060507@gmail.com> <46D5D3EB.4070608@jpl.nasa.gov> Message-ID: <46D5D4B0.6040705@gmail.com> Mathew Yeates wrote: > yes, I get > from numpy.core import _dotblas > ImportError: No module named multiarray That's just weird. Can you import numpy.core.multiarray by itself? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From myeates at jpl.nasa.gov Wed Aug 29 16:20:13 2007 From: myeates at jpl.nasa.gov (Mathew Yeates) Date: Wed, 29 Aug 2007 13:20:13 -0700 Subject: [Numpy-discussion] help! not using lapack In-Reply-To: <46D5D4B0.6040705@gmail.com> References: <46D5D016.2070000@jpl.nasa.gov> <46D5D32E.4060507@gmail.com> <46D5D3EB.4070608@jpl.nasa.gov> <46D5D4B0.6040705@gmail.com> Message-ID: <46D5D4FD.3060706@jpl.nasa.gov> yes Robert Kern wrote: > Mathew Yeates wrote: > >> yes, I get >> from numpy.core import _dotblas >> ImportError: No module named multiarray >> > > That's just weird. Can you import numpy.core.multiarray by itself? > > From myeates at jpl.nasa.gov Wed Aug 29 16:22:23 2007 From: myeates at jpl.nasa.gov (Mathew Yeates) Date: Wed, 29 Aug 2007 13:22:23 -0700 Subject: [Numpy-discussion] help! not using lapack In-Reply-To: <46D5D4B0.6040705@gmail.com> References: <46D5D016.2070000@jpl.nasa.gov> <46D5D32E.4060507@gmail.com> <46D5D3EB.4070608@jpl.nasa.gov> <46D5D4B0.6040705@gmail.com> Message-ID: <46D5D57F.5080701@jpl.nasa.gov> oops. sorry from numpy.core import _dotblas ImportError: /home/myeates/lib/python2.5/site-packages/numpy/core/_dotblas.so: undefined symbol: cblas_zaxpy Robert Kern wrote: > Mathew Yeates wrote: > >> yes, I get >> from numpy.core import _dotblas >> ImportError: No module named multiarray >> > > That's just weird. Can you import numpy.core.multiarray by itself? > > From robert.kern at gmail.com Wed Aug 29 16:26:31 2007 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 29 Aug 2007 15:26:31 -0500 Subject: [Numpy-discussion] help! 
not using lapack In-Reply-To: <46D5D57F.5080701@jpl.nasa.gov> References: <46D5D016.2070000@jpl.nasa.gov> <46D5D32E.4060507@gmail.com> <46D5D3EB.4070608@jpl.nasa.gov> <46D5D4B0.6040705@gmail.com> <46D5D57F.5080701@jpl.nasa.gov> Message-ID: <46D5D677.1070408@gmail.com> Mathew Yeates wrote: > oops. sorry > from numpy.core import _dotblas > ImportError: > /home/myeates/lib/python2.5/site-packages/numpy/core/_dotblas.so: > undefined symbol: cblas_zaxpy Okay, yes, that's the problem. liblapack depends on libblas. Make sure that you specify one to use. Follow the directions in site.cfg.example. If you need more help, please tell us what libraries you are using, your full site.cfg and the output of $ python setup.py config -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From myeates at jpl.nasa.gov Wed Aug 29 16:29:46 2007 From: myeates at jpl.nasa.gov (Mathew Yeates) Date: Wed, 29 Aug 2007 13:29:46 -0700 Subject: [Numpy-discussion] help! not using lapack In-Reply-To: <46D5D677.1070408@gmail.com> References: <46D5D016.2070000@jpl.nasa.gov> <46D5D32E.4060507@gmail.com> <46D5D3EB.4070608@jpl.nasa.gov> <46D5D4B0.6040705@gmail.com> <46D5D57F.5080701@jpl.nasa.gov> <46D5D677.1070408@gmail.com> Message-ID: <46D5D73A.4080500@jpl.nasa.gov> my site,cfg just is [DEFAULT] library_dirs = /home/myeates/lib include_dirs = /home/myeates/include python setup.py config gives F2PY Version 2_3979 blas_opt_info: blas_mkl_info: libraries mkl,vml,guide not found in /home/myeates/lib NOT AVAILABLE atlas_blas_threads_info: Setting PTATLAS=ATLAS libraries ptf77blas,ptcblas,atlas not found in /home/myeates/lib NOT AVAILABLE atlas_blas_info: libraries f77blas,cblas,atlas not found in /home/myeates/lib NOT AVAILABLE blas_info: FOUND: libraries = ['blas'] library_dirs = ['/home/myeates/lib'] language = f77 FOUND: libraries = ['blas'] library_dirs = ['/home/myeates/lib'] define_macros = [('NO_ATLAS_INFO', 1)] language = f77 lapack_opt_info: lapack_mkl_info: mkl_info: libraries mkl,vml,guide not found in /home/myeates/lib NOT AVAILABLE NOT AVAILABLE atlas_threads_info: Setting PTATLAS=ATLAS libraries ptf77blas,ptcblas,atlas not found in /home/myeates/lib libraries lapack_atlas not found in /home/myeates/lib numpy.distutils.system_info.atlas_threads_info NOT AVAILABLE atlas_info: libraries f77blas,cblas,atlas not found in /home/myeates/lib libraries lapack_atlas not found in /home/myeates/lib numpy.distutils.system_info.atlas_info NOT AVAILABLE lapack_info: FOUND: libraries = ['lapack'] library_dirs = ['/home/myeates/lib'] language = f77 FOUND: libraries = ['lapack', 'blas'] library_dirs = ['/home/myeates/lib'] define_macros = [('NO_ATLAS_INFO', 1)] language = f77 running config Robert Kern wrote: > Mathew Yeates wrote: > >> oops. sorry >> from numpy.core import _dotblas >> ImportError: >> /home/myeates/lib/python2.5/site-packages/numpy/core/_dotblas.so: >> undefined symbol: cblas_zaxpy >> > > Okay, yes, that's the problem. liblapack depends on libblas. Make sure that you > specify one to use. Follow the directions in site.cfg.example. If you need more > help, please tell us what libraries you are using, your full site.cfg and the > output of > > $ python setup.py config > > From myeates at jpl.nasa.gov Wed Aug 29 16:35:36 2007 From: myeates at jpl.nasa.gov (Mathew Yeates) Date: Wed, 29 Aug 2007 13:35:36 -0700 Subject: [Numpy-discussion] help! 
not using lapack In-Reply-To: <46D5D73A.4080500@jpl.nasa.gov> References: <46D5D016.2070000@jpl.nasa.gov> <46D5D32E.4060507@gmail.com> <46D5D3EB.4070608@jpl.nasa.gov> <46D5D4B0.6040705@gmail.com> <46D5D57F.5080701@jpl.nasa.gov> <46D5D677.1070408@gmail.com> <46D5D73A.4080500@jpl.nasa.gov> Message-ID: <46D5D898.3060809@jpl.nasa.gov> more info. My blas library has zaxpy defined but not cblas_zaxpy Mathew Yeates wrote: > my site,cfg just is > [DEFAULT] > library_dirs = /home/myeates/lib > include_dirs = /home/myeates/include > > python setup.py config gives > F2PY Version 2_3979 > blas_opt_info: > blas_mkl_info: > libraries mkl,vml,guide not found in /home/myeates/lib > NOT AVAILABLE > > atlas_blas_threads_info: > Setting PTATLAS=ATLAS > libraries ptf77blas,ptcblas,atlas not found in /home/myeates/lib > NOT AVAILABLE > > atlas_blas_info: > libraries f77blas,cblas,atlas not found in /home/myeates/lib > NOT AVAILABLE > > blas_info: > FOUND: > libraries = ['blas'] > library_dirs = ['/home/myeates/lib'] > language = f77 > > FOUND: > libraries = ['blas'] > library_dirs = ['/home/myeates/lib'] > define_macros = [('NO_ATLAS_INFO', 1)] > language = f77 > > lapack_opt_info: > lapack_mkl_info: > mkl_info: > libraries mkl,vml,guide not found in /home/myeates/lib > NOT AVAILABLE > > NOT AVAILABLE > > atlas_threads_info: > Setting PTATLAS=ATLAS > libraries ptf77blas,ptcblas,atlas not found in /home/myeates/lib > libraries lapack_atlas not found in /home/myeates/lib > numpy.distutils.system_info.atlas_threads_info > NOT AVAILABLE > > atlas_info: > libraries f77blas,cblas,atlas not found in /home/myeates/lib > libraries lapack_atlas not found in /home/myeates/lib > numpy.distutils.system_info.atlas_info > NOT AVAILABLE > > lapack_info: > FOUND: > libraries = ['lapack'] > library_dirs = ['/home/myeates/lib'] > language = f77 > > FOUND: > libraries = ['lapack', 'blas'] > library_dirs = ['/home/myeates/lib'] > define_macros = [('NO_ATLAS_INFO', 1)] > language = f77 > > running config > > > Robert Kern wrote: > >> Mathew Yeates wrote: >> >> >>> oops. sorry >>> from numpy.core import _dotblas >>> ImportError: >>> /home/myeates/lib/python2.5/site-packages/numpy/core/_dotblas.so: >>> undefined symbol: cblas_zaxpy >>> >>> >> Okay, yes, that's the problem. liblapack depends on libblas. Make sure that you >> specify one to use. Follow the directions in site.cfg.example. If you need more >> help, please tell us what libraries you are using, your full site.cfg and the >> output of >> >> $ python setup.py config >> >> >> > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > From robert.kern at gmail.com Wed Aug 29 16:35:50 2007 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 29 Aug 2007 15:35:50 -0500 Subject: [Numpy-discussion] help! not using lapack In-Reply-To: <46D5D73A.4080500@jpl.nasa.gov> References: <46D5D016.2070000@jpl.nasa.gov> <46D5D32E.4060507@gmail.com> <46D5D3EB.4070608@jpl.nasa.gov> <46D5D4B0.6040705@gmail.com> <46D5D57F.5080701@jpl.nasa.gov> <46D5D677.1070408@gmail.com> <46D5D73A.4080500@jpl.nasa.gov> Message-ID: <46D5D8A6.7010502@gmail.com> If your BLAS just the reference BLAS, don't bother with _dotblas. It won't be any faster than the default implementation in numpy. You only get a win if you are using an accelerated BLAS with the CBLAS interface for C-style row-major matrices. Your libblas does not seem to be such an accelerated BLAS. 
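One quick way to confirm which dot you actually ended up with is the same identity check used at the top of this thread; numpy.show_config() is an extra, optional cross-check and may not exist in every release:

import numpy

# False means the CBLAS-accelerated _dotblas.dot was swapped in at import
# time; True means numpy fell back to the default multiarray implementation.
print(numpy.dot is numpy.core.multiarray.dot)

# Lists the BLAS/LAPACK libraries that distutils picked up at build time.
numpy.show_config()
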
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From myeates at jpl.nasa.gov Wed Aug 29 16:39:26 2007 From: myeates at jpl.nasa.gov (Mathew Yeates) Date: Wed, 29 Aug 2007 13:39:26 -0700 Subject: [Numpy-discussion] help! not using lapack In-Reply-To: <46D5D8A6.7010502@gmail.com> References: <46D5D016.2070000@jpl.nasa.gov> <46D5D32E.4060507@gmail.com> <46D5D3EB.4070608@jpl.nasa.gov> <46D5D4B0.6040705@gmail.com> <46D5D57F.5080701@jpl.nasa.gov> <46D5D677.1070408@gmail.com> <46D5D73A.4080500@jpl.nasa.gov> <46D5D8A6.7010502@gmail.com> Message-ID: <46D5D97E.9040401@jpl.nasa.gov> I'm the one who created libblas.a so I must have done something wrong. This is lapack-3.1.1. Robert Kern wrote: > If your BLAS just the reference BLAS, don't bother with _dotblas. It won't be > any faster than the default implementation in numpy. You only get a win if you > are using an accelerated BLAS with the CBLAS interface for C-style row-major > matrices. Your libblas does not seem to be such an accelerated BLAS. > > From robert.kern at gmail.com Wed Aug 29 16:46:17 2007 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 29 Aug 2007 15:46:17 -0500 Subject: [Numpy-discussion] help! not using lapack In-Reply-To: <46D5D97E.9040401@jpl.nasa.gov> References: <46D5D016.2070000@jpl.nasa.gov> <46D5D32E.4060507@gmail.com> <46D5D3EB.4070608@jpl.nasa.gov> <46D5D4B0.6040705@gmail.com> <46D5D57F.5080701@jpl.nasa.gov> <46D5D677.1070408@gmail.com> <46D5D73A.4080500@jpl.nasa.gov> <46D5D8A6.7010502@gmail.com> <46D5D97E.9040401@jpl.nasa.gov> Message-ID: <46D5DB19.6090808@gmail.com> Mathew Yeates wrote: > I'm the one who created libblas.a so I must have done something wrong. > This is lapack-3.1.1. No, you didn't do anything wrong, per se, you just built the reference F77 BLAS. It's not an accelerated BLAS, so there's no point in using it with numpy. There's not way you *can* build it to be an accelerated BLAS. If you want an accelerated BLAS, try to use ATLAS: http://math-atlas.sourceforge.net/ It is possible that your Linux distribution, whatever it is, already has a build of it for you. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From myeates at jpl.nasa.gov Wed Aug 29 16:52:59 2007 From: myeates at jpl.nasa.gov (Mathew Yeates) Date: Wed, 29 Aug 2007 13:52:59 -0700 Subject: [Numpy-discussion] help! not using lapack In-Reply-To: <46D5DB19.6090808@gmail.com> References: <46D5D016.2070000@jpl.nasa.gov> <46D5D32E.4060507@gmail.com> <46D5D3EB.4070608@jpl.nasa.gov> <46D5D4B0.6040705@gmail.com> <46D5D57F.5080701@jpl.nasa.gov> <46D5D677.1070408@gmail.com> <46D5D73A.4080500@jpl.nasa.gov> <46D5D8A6.7010502@gmail.com> <46D5D97E.9040401@jpl.nasa.gov> <46D5DB19.6090808@gmail.com> Message-ID: <46D5DCAB.6000304@jpl.nasa.gov> Thanks Robert I have a deadline and don't have time to install ATLAS. Instead I'm installing clapack. Is this the corrrect thing to do? Mathew Robert Kern wrote: > Mathew Yeates wrote: > >> I'm the one who created libblas.a so I must have done something wrong. >> This is lapack-3.1.1. >> > > No, you didn't do anything wrong, per se, you just built the reference F77 BLAS. > It's not an accelerated BLAS, so there's no point in using it with numpy. 
> There's not way you *can* build it to be an accelerated BLAS. > > If you want an accelerated BLAS, try to use ATLAS: > > http://math-atlas.sourceforge.net/ > > It is possible that your Linux distribution, whatever it is, already has a build > of it for you. > > From Joris.DeRidder at ster.kuleuven.be Wed Aug 29 16:53:25 2007 From: Joris.DeRidder at ster.kuleuven.be (Joris De Ridder) Date: Wed, 29 Aug 2007 22:53:25 +0200 Subject: [Numpy-discussion] Trac ticket In-Reply-To: References: <1F561FA8-53D7-483D-9772-4F1E60E37510@ster.kuleuven.be> Message-ID: <42F6611A-C621-4D31-AA66-784E1F597639@ster.kuleuven.be> > You should most likely just attach a patch against the latest trunk > to the ticket itself for review. Done. The patch adds an 'axis' keyword to median(). J. Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm From robert.kern at gmail.com Wed Aug 29 16:55:05 2007 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 29 Aug 2007 15:55:05 -0500 Subject: [Numpy-discussion] help! not using lapack In-Reply-To: <46D5DCAB.6000304@jpl.nasa.gov> References: <46D5D016.2070000@jpl.nasa.gov> <46D5D32E.4060507@gmail.com> <46D5D3EB.4070608@jpl.nasa.gov> <46D5D4B0.6040705@gmail.com> <46D5D57F.5080701@jpl.nasa.gov> <46D5D677.1070408@gmail.com> <46D5D73A.4080500@jpl.nasa.gov> <46D5D8A6.7010502@gmail.com> <46D5D97E.9040401@jpl.nasa.gov> <46D5DB19.6090808@gmail.com> <46D5DCAB.6000304@jpl.nasa.gov> Message-ID: <46D5DD29.2040405@gmail.com> Mathew Yeates wrote: > Thanks Robert > I have a deadline and don't have time to install ATLAS. Instead I'm > installing clapack. Is this the corrrect thing to do? No. Just leave things alone if you don't have an accelerated BLAS at hand. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From Chris.Barker at noaa.gov Wed Aug 29 17:53:12 2007 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed, 29 Aug 2007 14:53:12 -0700 Subject: [Numpy-discussion] [SciPy-dev] NumPy 1.0.3.x and SciPy 0.5.2.x In-Reply-To: References: <85b5c3130708192030t6947d623oa00a710a229cbd5d@mail.gmail.com> Message-ID: <46D5EAC8.8000109@noaa.gov> Ivan Pan wrote: > On 8/19/07, Ondrej Certik wrote: >> I don't know what the native way of installing packages on Mac OS >> X is Boy, I wish this weren't such a mess. Quite some time ago, a bunch of us on the pythonmac list tried to establish the idea of a "one 'standard' python for OS-X", and a set of pre-built packages for it. It is the one you find here: http://www.pythonmac.org/packages/py25-fat/ That Python is the same as the one you find at python.org too. It is the closest one comes to a "native" set of packages for OS-X. It would be really nice if the scipy/numpy projects would provide binaries (or at least have setup.py ready to go) for that repository. I really like being able to tell folks ONE place to go to get python packages. There are a number of Mac folks that help build the packages there. For a while, SciPy had a key problem -- no one knew how to build Universal(Intel and PPC) packages from Fortran code, and that repository really should have Universal binaries, so that folks don't have to think about what hardware they are running, and can bundle up apps with Py2App that will work on any Mac (with a new enough OS). I understand that the Universal problem has been solved now. 
I hope that if the SciPy project "officially" releases binaries for OS-X, they will be Universal binaries compatible with that Python. About fink/macport. They are fine systems that have some real use. However, they really should be thought of as different platforms (or at least different distributions), much like CygWin, or Ubuntu vs. Fedora. If you make a fink package, you are making a fink package, NOT an OS-X one. Anyway, sorry to be so pedantic when I don't think I will get the time to do the building myself, but I wanted to lay out a goal anyway. I do offer to do some testing, etc if needed. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From myeates at jpl.nasa.gov Wed Aug 29 17:53:42 2007 From: myeates at jpl.nasa.gov (Mathew Yeates) Date: Wed, 29 Aug 2007 14:53:42 -0700 Subject: [Numpy-discussion] gesdd hangs Message-ID: <46D5EAE6.6070809@jpl.nasa.gov> I guess I can't blame lapack. My system has atlas so I recompiled numpy pointing to atlas. Now id(numpy.dot) == id(numpy.core.multiarray.dot) is False However when I run decomp.svd on a 25 by 25 identity matrix, it hangs when gesdd is called (line 501 of linalag/decomp.py) Anybody else seeing this? Mathew From peridot.faceted at gmail.com Wed Aug 29 18:31:55 2007 From: peridot.faceted at gmail.com (Anne Archibald) Date: Wed, 29 Aug 2007 18:31:55 -0400 Subject: [Numpy-discussion] Bug or surprising undocumented behaviour in irfft Message-ID: Hi, numpy's Fourier transforms have the handy feature of being able to upsample and downsample signals; for example the documentation cites irfft(rfft(A),16*len(A)) as a way to get a Fourier interpolation of A. However, there is a peculiarity with the way numpy handles the highest-frequency coefficient. First of all, the normalization: In [65]: rfft(cos(2*pi*arange(8)/8.)) Out[65]: array([ -3.44505240e-16 +0.00000000e+00j, 4.00000000e+00 -1.34392280e-15j, 1.22460635e-16 -0.00000000e+00j, -1.16443313e-16 -8.54080261e-16j, 9.95839695e-17 +0.00000000e+00j]) In [66]: rfft(cos(2*4*pi*arange(8)/8.)) Out[66]: array([ 0.+0.j, 0.+0.j, 0.-0.j, 0.+0.j, 8.+0.j]) So a cosine signal gives 0.5*N if its frequency F is 0 References: <46D5EAE6.6070809@jpl.nasa.gov> Message-ID: On 8/29/07, Mathew Yeates wrote: > > I guess I can't blame lapack. My system has atlas so I recompiled numpy > pointing to atlas. Now > > id(numpy.dot) == id(numpy.core.multiarray.dot) is False > > However when I run decomp.svd on a 25 by 25 identity matrix, it hangs when > gesdd is called (line 501 of linalag/decomp.py) > > Anybody else seeing this? What do you mean by hang? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From myeates at jpl.nasa.gov Wed Aug 29 18:35:55 2007 From: myeates at jpl.nasa.gov (Mathew Yeates) Date: Wed, 29 Aug 2007 15:35:55 -0700 Subject: [Numpy-discussion] gesdd hangs In-Reply-To: References: <46D5EAE6.6070809@jpl.nasa.gov> Message-ID: <46D5F4CB.4020702@jpl.nasa.gov> never returns Charles R Harris wrote: > > > On 8/29/07, *Mathew Yeates* > wrote: > > I guess I can't blame lapack. My system has atlas so I recompiled > numpy > pointing to atlas. Now > > id(numpy.dot) == id(numpy.core.multiarray.dot) is False > > However when I run decomp.svd on a 25 by 25 identity matrix, it > hangs when gesdd is called (line 501 of linalag/decomp.py) > > Anybody else seeing this? > > > What do you mean by hang? 
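For anyone wanting to reproduce this, a minimal sketch of the call in question (scipy.linalg.svd is the public name for the svd in decomp.py; on a working LAPACK/ATLAS build it returns immediately):

import numpy as np
from scipy import linalg

a = np.eye(25)
# On a broken BLAS/LAPACK combination this call never returns; otherwise
# it finishes instantly and every singular value of the identity is 1.
u, s, vt = linalg.svd(a)
print(s[:5])
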
> > Chuck > > > ------------------------------------------------------------------------ > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From charlesr.harris at gmail.com Wed Aug 29 19:08:08 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 29 Aug 2007 17:08:08 -0600 Subject: [Numpy-discussion] Bug or surprising undocumented behaviour in irfft In-Reply-To: References: Message-ID: Anne, On 8/29/07, Anne Archibald wrote: > > Hi, > > numpy's Fourier transforms have the handy feature of being able to > upsample and downsample signals; for example the documentation cites > irfft(rfft(A),16*len(A)) as a way to get a Fourier interpolation of A. > However, there is a peculiarity with the way numpy handles the > highest-frequency coefficient. The upshot is, if I correctly understand what is going on, that the > last coefficient needs to be treated somewhat differently from the > others; when one pads with zeros in order to upsample the signal, one > should multiply the last coefficient by 0.5. Should this be done in > numpy's upsampling code? Should it at least be documented? What is going on is that the coefficient at the Nyquist frequency appears once in the unextended array, but twice when the array is extended with zeros because of the Hermitean symmetry. That should probably be fixed in the upsampling code. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed Aug 29 19:44:09 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 29 Aug 2007 17:44:09 -0600 Subject: [Numpy-discussion] Bug or surprising undocumented behaviour in irfft In-Reply-To: References: Message-ID: On 8/29/07, Charles R Harris wrote: > > Anne, > > On 8/29/07, Anne Archibald wrote: > > > > Hi, > > > > numpy's Fourier transforms have the handy feature of being able to > > upsample and downsample signals; for example the documentation cites > > irfft(rfft(A),16*len(A)) as a way to get a Fourier interpolation of A. > > However, there is a peculiarity with the way numpy handles the > > highest-frequency coefficient. > > > > > The upshot is, if I correctly understand what is going on, that the > > last coefficient needs to be treated somewhat differently from the > > others; when one pads with zeros in order to upsample the signal, one > > should multiply the last coefficient by 0.5. Should this be done in > > numpy's upsampling code? Should it at least be documented? > > > What is going on is that the coefficient at the Nyquist frequency appears > once in the unextended array, but twice when the array is extended with > zeros because of the Hermitean symmetry. That should probably be fixed in > the upsampling code. > The inverse irfft also scales by dividing by the new transform size instead of the original size, so the result needs to be scaled up for the interpolation to work. It is easy to go wrong with fft's when the correct sampling/frequency scales aren't carried with the data. I always do that myself so that the results are independent of transform size/interpolation and expressed in some standard units. In [9]: a = array([1, 0, 0, 0], dtype=double) In [10]: b = rfft(a) In [11]: b[2] *= .5 In [12]: irfft(b,8) Out[12]: array([ 0.5 , 0.3017767, 0. , -0.0517767, 0. , -0.0517767, 0. , 0.3017767]) In [13]: 2*irfft(b,8) Out[13]: array([ 1. , 0.60355339, 0. , -0.10355339, 0. , -0.10355339, 0. 
, 0.60355339]) I don't know where that should be fixed. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed Aug 29 20:14:30 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 29 Aug 2007 18:14:30 -0600 Subject: [Numpy-discussion] gesdd hangs In-Reply-To: <46D5F4CB.4020702@jpl.nasa.gov> References: <46D5EAE6.6070809@jpl.nasa.gov> <46D5F4CB.4020702@jpl.nasa.gov> Message-ID: On 8/29/07, Mathew Yeates wrote: > > never returns Where is decomp coming from? linalg.svd(eye(25)) works fine here. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Wed Aug 29 20:18:14 2007 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 29 Aug 2007 19:18:14 -0500 Subject: [Numpy-discussion] gesdd hangs In-Reply-To: References: <46D5EAE6.6070809@jpl.nasa.gov> <46D5F4CB.4020702@jpl.nasa.gov> Message-ID: <46D60CC6.4040007@gmail.com> Charles R Harris wrote: > > On 8/29/07, *Mathew Yeates* > wrote: > > never returns > > Where is decomp coming from? linalg.svd(eye(25)) works fine here. scipy, most likely. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From peridot.faceted at gmail.com Wed Aug 29 20:49:08 2007 From: peridot.faceted at gmail.com (Anne Archibald) Date: Wed, 29 Aug 2007 20:49:08 -0400 Subject: [Numpy-discussion] Bug or surprising undocumented behaviour in irfft In-Reply-To: References: Message-ID: On 29/08/2007, Charles R Harris wrote: > > > What is going on is that the coefficient at the Nyquist frequency appears > once in the unextended array, but twice when the array is extended with > zeros because of the Hermitean symmetry. That should probably be fixed in > the upsampling code. Is this also appropriate for the other FFTs? (inverse real, complex, hermitian, what have you) I have written a quick hack (attached) that should do just that rescaling, but I don't know that it's a good idea, as implemented. Really, for a complex IFFT it's extremely peculiar to add the padding where we do (between frequency -1 and frequency zero); it would make more sense to pad at the high frequencies (which are in the middle of the array). Forward FFTs, though, can reasonably be padded at the end, and it doesn't make much sense to rescale the last data point. > The inverse irfft also scales by dividing by the new transform size instead > of the original size, so the result needs to be scaled up for the > interpolation to work. It is easy to go wrong with fft's when the correct > sampling/frequency scales aren't carried with the data. I always do that > myself so that the results are independent of transform size/interpolation > and expressed in some standard units. The scaling of the FFT is a pain everywhere. I always just try it a few times until I get the coefficients right. I sort of like FFTW's convention of never normalizing anything - it means the transforms have nice simple formulas, though unfortunately it also means that ifft(fft(A))!=A. In any case the normalization of numpy's FFTs is not something that can reasonably be changed, even in the special case of the zero-padding inverse (and forward) FFTs. Anne -------------- next part -------------- A non-text attachment was scrubbed... 
Name: fftfix Type: application/octet-stream Size: 1242 bytes Desc: not available URL: From charlesr.harris at gmail.com Wed Aug 29 21:46:32 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 29 Aug 2007 19:46:32 -0600 Subject: [Numpy-discussion] Bug or surprising undocumented behaviour in irfft In-Reply-To: References: Message-ID: Hi Anne, On 8/29/07, Anne Archibald wrote: > > On 29/08/2007, Charles R Harris wrote: > > > > > What is going on is that the coefficient at the Nyquist frequency > appears > > once in the unextended array, but twice when the array is extended with > > zeros because of the Hermitean symmetry. That should probably be fixed > in > > the upsampling code. > > Is this also appropriate for the other FFTs? (inverse real, complex, > hermitian, what have you) I have written a quick hack (attached) that > should do just that rescaling, but I don't know that it's a good idea, > as implemented. Really, for a complex IFFT it's extremely peculiar to > add the padding where we do (between frequency -1 and frequency zero); > it would make more sense to pad at the high frequencies (which are in > the middle of the array). Forward FFTs, though, can reasonably be > padded at the end, and it doesn't make much sense to rescale the last > data point. It all depends on the data and what you intend. Much of my experience is with Michaelson interferometers and in that case the interferogram is essentially an autocorrelation, so it is desirable to keep its center at sample zero and let the left side wrap around, so ideally you fill in the middle as you suggest. You can also pad at the end if you don't put the center at zero, but then you need to phase shift the spectrum in a way that corresponds to rotating the center to index zero and padding in the middle. I expect you would want to do the same thing for complex transforms if they are of real data and do the nyquist divided by two thingy. If the high frequencies in a complex transform are actually high frequencies and not aliases of negative frequencies, then you will want to just append zeros. That case also occurs, I have designed decimating complex filters that produce output like that, they were like single sideband in the radio world. > The inverse irfft also scales by dividing by the new transform size > instead > > of the original size, so the result needs to be scaled up for the > > interpolation to work. It is easy to go wrong with fft's when the > correct > > sampling/frequency scales aren't carried with the data. I always do that > > myself so that the results are independent of transform > size/interpolation > > and expressed in some standard units. > > The scaling of the FFT is a pain everywhere. I always just try it a > few times until I get the coefficients right. I sort of like FFTW's > convention of never normalizing anything - it means the transforms > have nice simple formulas, though unfortunately it also means that > ifft(fft(A))!=A. In any case the normalization of numpy's FFTs is not > something that can reasonably be changed, even in the special case of > the zero-padding inverse (and forward) FFTs. I usually multiply the forward transform by the sample interval, in secs or cm, and the unscaled inverse transform by the frequency sample interval, in Hz or cm^-1. That treats both the forward and inverse fft like approximations to the integral transforms and makes the units those of spectral density. 
If you think trapezoidal rule, then you will also see factors of .5 at the ends, but that is a sort of apodization that is consistent with how Fourier series converge at discontinuities. In the normal case where no interpolation is done the product of the sample intervals is 1/N, so it reduces to the usual convention. Note that in your example the sampling interval decreases when you do the interpolation, so if you did another forward transform it would be scaled down to account for the extra points in the data. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From peridot.faceted at gmail.com Wed Aug 29 22:24:50 2007 From: peridot.faceted at gmail.com (Anne Archibald) Date: Wed, 29 Aug 2007 22:24:50 -0400 Subject: [Numpy-discussion] Bug or surprising undocumented behaviour in irfft In-Reply-To: References: Message-ID: On 29/08/2007, Charles R Harris wrote: > > Is this also appropriate for the other FFTs? (inverse real, complex, > > hermitian, what have you) I have written a quick hack (attached) that > > should do just that rescaling, but I don't know that it's a good idea, > > as implemented. Really, for a complex IFFT it's extremely peculiar to > > add the padding where we do (between frequency -1 and frequency zero); > > it would make more sense to pad at the high frequencies (which are in > > the middle of the array). Forward FFTs, though, can reasonably be > > padded at the end, and it doesn't make much sense to rescale the last > > data point. > > It all depends on the data and what you intend. Much of my experience is > with Michaelson interferometers and in that case the interferogram is > essentially an autocorrelation, so it is desirable to keep its center at > sample zero and let the left side wrap around, so ideally you fill in the > middle as you suggest. You can also pad at the end if you don't put the > center at zero, but then you need to phase shift the spectrum in a way that > corresponds to rotating the center to index zero and padding in the middle. > I expect you would want to do the same thing for complex transforms if they > are of real data and do the nyquist divided by two thingy. If the high > frequencies in a complex transform are actually high frequencies and not > aliases of negative frequencies, then you will want to just append zeros. > That case also occurs, I have designed decimating complex filters that > produce output like that, they were like single sideband in the radi o > world. So is it a fair summary to say that for irfft, it is fairly clear that one should adjust the Nyquist coefficient, but for the other varieties of FFT, the padding done by numpy is just one of many possible choices? Should numpy be modified so that irfft adjusts the Nyquist coefficient? Should this happen only for irfft? > I usually multiply the forward transform by the sample interval, in secs or > cm, and the unscaled inverse transform by the frequency sample interval, in > Hz or cm^-1. That treats both the forward and inverse fft like > approximations to the integral transforms and makes the units those of > spectral density. If you think trapezoidal rule, then you will also see > factors of .5 at the ends, but that is a sort of apodization that is > consistent with how Fourier series converge at discontinuities. In the > normal case where no interpolation is done the product of the sample > intervals is 1/N, so it reduces to the usual convention. 
Note that in your > example the sampling interval decreases when you do the interpolation, so if > you did another forward transform it would be scaled down to account for the > extra points in the data. That's a convenient normalization. Do you know if there's a current package to associate units with numpy arrays? For my purposes it would usually be sufficient to have arrays of quantities with uniform units. Conversions need only be multiplicative (I don't care about Celsius-to-Fahrenheit style conversions) and need not even be automatic, though of course that would be convenient. Right now I use Frink for that sort of thing, but it would have saved me from making a number of minor mistakes in several pieces of python code I've written. Thanks, Anne From charlesr.harris at gmail.com Wed Aug 29 23:25:55 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 29 Aug 2007 21:25:55 -0600 Subject: [Numpy-discussion] Bug or surprising undocumented behaviour in irfft In-Reply-To: References: Message-ID: On 8/29/07, Anne Archibald wrote: > > On 29/08/2007, Charles R Harris wrote: > > > > Is this also appropriate for the other FFTs? (inverse real, complex, > > > hermitian, what have you) I have written a quick hack (attached) that > > > should do just that rescaling, but I don't know that it's a good idea, > > > as implemented. Really, for a complex IFFT it's extremely peculiar to > > > add the padding where we do (between frequency -1 and frequency zero); > > > it would make more sense to pad at the high frequencies (which are in > > > the middle of the array). Forward FFTs, though, can reasonably be > > > padded at the end, and it doesn't make much sense to rescale the last > > > data point. > > > > It all depends on the data and what you intend. Much of my experience is > > with Michaelson interferometers and in that case the interferogram is > > essentially an autocorrelation, so it is desirable to keep its center at > > sample zero and let the left side wrap around, so ideally you fill in > the > > middle as you suggest. You can also pad at the end if you don't put the > > center at zero, but then you need to phase shift the spectrum in a way > that > > corresponds to rotating the center to index zero and padding in the > middle. > > I expect you would want to do the same thing for complex transforms if > they > > are of real data and do the nyquist divided by two thingy. If the high > > frequencies in a complex transform are actually high frequencies and not > > aliases of negative frequencies, then you will want to just append > zeros. > > That case also occurs, I have designed decimating complex filters that > > produce output like that, they were like single sideband in the radi o > > world. > > So is it a fair summary to say that for irfft, it is fairly clear that > one should adjust the Nyquist coefficient, but for the other varieties > of FFT, the padding done by numpy is just one of many possible > choices? > > Should numpy be modified so that irfft adjusts the Nyquist > coefficient? Should this happen only for irfft? Yes, I think that should be the case. If the complex transforms pad in the middle, then they are treating the high frequencies as aliases, but unless they explicitly duplicate the Nyquist coefficient scaling isn't needed. Hmm, actually, I think that is wrong. The original data points will be reproduced, but what happens in between points? In between there is a difference between positive and negative frequences. 
So in a complex transform of real data one would want to split the Nyquist coefficient between high and low frequencies. I don't think it is possible to make a general statement about the complex case. Just hope the middle frequency is zero so you can ignore the problem ;) What happens in the real case is that the irfft algorithm uses the Hermitean symmetry of the spectrum, so the coefficient is implicitly duplicated. > I usually multiply the forward transform by the sample interval, in secs > or > > cm, and the unscaled inverse transform by the frequency sample interval, > in > > Hz or cm^-1. That treats both the forward and inverse fft like > > approximations to the integral transforms and makes the units those of > > spectral density. If you think trapezoidal rule, then you will also see > > factors of .5 at the ends, but that is a sort of apodization that is > > consistent with how Fourier series converge at discontinuities. In the > > normal case where no interpolation is done the product of the sample > > intervals is 1/N, so it reduces to the usual convention. Note that in > your > > example the sampling interval decreases when you do the interpolation, > so if > > you did another forward transform it would be scaled down to account for > the > > extra points in the data. > > That's a convenient normalization. > > Do you know if there's a current package to associate units with numpy > arrays? For my purposes it would usually be sufficient to have arrays > of quantities with uniform units. Conversions need only be > multiplicative (I don't care about Celsius-to-Fahrenheit style > conversions) and need not even be automatic, though of course that > would be convenient. Right now I use Frink for that sort of thing, but > it would have saved me from making a number of minor mistakes in > several pieces of python code I've written. There was a presentation by some fellow from CalTech at SciPy 2005 (4?) about such a system, but ISTR it looked pretty complex. C++ template programming does it with traits and maybe the Enthought folks have something useful along those lines. Otherwise, I don't know of any such system for general use. Maybe ndarray could be subclassed? It can be convenient to multiply and divide units, so maybe some sort of string with something to gather the same units together with a power could be a useful way to track them and wouldn't tie one down to any particular choice. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From numpy-discussion at maubp.freeserve.co.uk Thu Aug 30 05:28:09 2007 From: numpy-discussion at maubp.freeserve.co.uk (Peter) Date: Thu, 30 Aug 2007 10:28:09 +0100 Subject: [Numpy-discussion] Citing Numeric and numpy In-Reply-To: References: <46D55C01.6050604@maubp.freeserve.co.uk> Message-ID: <46D68DA9.8090606@maubp.freeserve.co.uk> Thank you Ryan & Alan for the feedback - the three references are summarized here for anyone searching for the citations in future. The recent overview was: Travis E. Oliphant, "Python for Scientific Computing," Computing in Science & Engineering, vol. 9, no. 3, May/June 2007, pp. 10-20. Numerical Python citation, available online at: http://numpy.scipy.org/numpydoc/numpy.html D. Ascher et al., Numerical Python, tech. report UCRL-MA-128569, Lawrence Livermore National Laboratory, 2001; http://numpy.scipy.org. NumPy book citation, see also http://www.tramy.us for details: Travis E. Oliphant (2006) Guide to NumPy, Trelgol Publishing, USA; http://numpy.scipy.org. 
Cheers, Peter From pearu at cens.ioc.ee Thu Aug 30 05:48:44 2007 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Thu, 30 Aug 2007 12:48:44 +0300 (EEST) Subject: [Numpy-discussion] Error code of NumpyTest() In-Reply-To: References: Message-ID: <59808.129.240.228.53.1188467324.squirrel@cens.ioc.ee> On Fri, August 24, 2007 11:41 am, Matthieu Brucher wrote: > Hi, > > I wondered if there was a way of returning another error code than 0 when > executing the test suite so that a parent process can immediately know if > all the tests passed or not. > The numpy buildbot seems to have the same behaviour BTW. > I don't know if it is possible, but it would be great. The svn version of test() function now returns TestResult object. So, test() calls in buildbot should read: import numpy,sys; sys.exit(not numpy.test(verbosity=9999,level=9999).wasSuccessful()) Hopefully buildbot admins can update the test commands accordingly. Pearu From matthieu.brucher at gmail.com Thu Aug 30 05:59:13 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Thu, 30 Aug 2007 11:59:13 +0200 Subject: [Numpy-discussion] Error code of NumpyTest() In-Reply-To: <59808.129.240.228.53.1188467324.squirrel@cens.ioc.ee> References: <59808.129.240.228.53.1188467324.squirrel@cens.ioc.ee> Message-ID: Thank you for the answer The svn version of test() function now returns TestResult object. Numpy 1.3.x does not provide this ? I can't upgrade the numpy packages on the Linux boxes (on the Windows box, I suppose that I could use an Enthought egg). So, test() calls in buildbot should read: > > import numpy,sys; sys.exit(not > numpy.test(verbosity=9999,level=9999).wasSuccessful()) > > Hopefully buildbot admins can update the test commands accordingly. I'll be able to do this as the tests are located on the repository. Matthieu -------------- next part -------------- An HTML attachment was scrubbed... URL: From bryanv at enthought.com Thu Aug 30 11:17:00 2007 From: bryanv at enthought.com (Bryan Van de Ven) Date: Thu, 30 Aug 2007 10:17:00 -0500 Subject: [Numpy-discussion] Units; was Bug or surprising undocumented behaviour in irfft In-Reply-To: References: Message-ID: <46D6DF6C.7010805@enthought.com> > Do you know if there's a current package to associate units with numpy > arrays? For my purposes it would usually be sufficient to have arrays > of quantities with uniform units. Conversions need only be > multiplicative (I don't care about Celsius-to-Fahrenheit style > conversions) and need not even be automatic, though of course that > would be convenient. Right now I use Frink for that sort of thing, but > it would have saved me from making a number of minor mistakes in > several pieces of python code I've written. Anne, We have an enthought.units package in ETS, and for unit-ed numpy arrays we have (fairly new) UnitArray and UnitScalar in enthought.numerical_modeling.units.api Automatic conversions on arithmetic expressions are not performed; however, we do have a "@has_units" function decorator that will perform unit conversions on function inputs automatically (and will label--but not convert--the outputs of a function) If you are interested in checking it out I can get you more information/examples. 
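Separately from the ETS package, the bare-bones ndarray-subclass idea floated earlier in the thread can be sketched roughly as follows (the class name and the plain string unit attribute are made up for illustration; no conversions are attempted):

import numpy as np

class UnitArray(np.ndarray):
    """Illustration only: an ndarray that carries a unit label along."""
    def __new__(cls, data, unit=''):
        obj = np.asarray(data).view(cls)
        obj.unit = unit
        return obj
    def __array_finalize__(self, obj):
        # Propagate the label through views, slices and copies.
        self.unit = getattr(obj, 'unit', '')

a = UnitArray([1.0, 2.0, 3.0], unit='Hz')
print(a[1:].unit)    # 'Hz' -- the label follows views and slices
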
Bryan From donovan at mirsl.ecs.umass.edu Thu Aug 30 11:25:44 2007 From: donovan at mirsl.ecs.umass.edu (Brian Donovan) Date: Thu, 30 Aug 2007 11:25:44 -0400 Subject: [Numpy-discussion] Accessing a numpy array in a mmap fashion Message-ID: <29e003270708300825w3fc10f2diaef3f4b09690882b@mail.gmail.com> Hello all, I'm wondering if there is a way to use a numpy array that uses disk as a memory store rather than ram. I'm looking for something like mmap but which can be used like a numpy array. The general idea is this. I'm simulating a system which produces a large dataset over a few hours of processing time. Rather than store the numpy array in memory during processing I'd like to write the data directly to disk but still be able to treat the array as a numpy array. Is this possible? Any ideas? Thanks, Brian -- Brian Donovan Research Assistant Microwave Remote Sensing Lab UMass Amherst -------------- next part -------------- An HTML attachment was scrubbed... URL: From broman at spawar.navy.mil Thu Aug 30 11:24:27 2007 From: broman at spawar.navy.mil (Vincent Broman) Date: Thu, 30 Aug 2007 08:24:27 -0700 Subject: [Numpy-discussion] numpy build fails on powerpc ydl In-Reply-To: References: Message-ID: <200708300824.27221@b00d61a8cecf8b2266f81358fd170621.navy.mil> My build of numpy fails under Yellow Dog Linux 2.1, running on a powerpc multiprocessor board from Curtiss-Wright. Its kernel is 2.4.19-Asmp tailored by the vendor. The gcc compiler is configured as ppc-yellowdog-linux with version number 2.95.3 20010111. The python I'm using is Python 2.5.1 (r251:54863) installed as python2. Plain /usr/bin/python is 1.5.x . The numpy version I'm trying to build is r4003 for v1.0.4 . The setup fails compiling build/src.linux-ppc-2.5/numpy/core/src/umathmodule.c with a long list of error messages of the following two kinds. warning: conflicting types for built-in function `sinl' repeated for `cosl', `fabsl', and `sqrtl', triggered by line 442. inconsistent operand constraints in an ?`asm', triggered by lines 1100, 1124, 1150, 1755, 1785, and 1834. I cannot see on those source lines what causes such a message; I suspect there is some long complicated cpp macro or asm statement in some include file which I don't find. Has anyone tried building numpy on Yellow Dog Linux or on a PowerPC with gcc? Vincent Broman broman at spawar.navy.mil From rmay at ou.edu Thu Aug 30 11:33:29 2007 From: rmay at ou.edu (Ryan May) Date: Thu, 30 Aug 2007 10:33:29 -0500 Subject: [Numpy-discussion] Accessing a numpy array in a mmap fashion In-Reply-To: <29e003270708300825w3fc10f2diaef3f4b09690882b@mail.gmail.com> References: <29e003270708300825w3fc10f2diaef3f4b09690882b@mail.gmail.com> Message-ID: <46D6E349.4010109@ou.edu> Brian Donovan wrote: > Hello all, > > I'm wondering if there is a way to use a numpy array that uses disk as > a memory store rather than ram. I'm looking for something like mmap but > which can be used like a numpy array. The general idea is this. I'm > simulating a system which produces a large dataset over a few hours of > processing time. Rather than store the numpy array in memory during > processing I'd like to write the data directly to disk but still be able > to treat the array as a numpy array. Is this possible? Any ideas? What you're looking for is numpy.memmap, though the documentation is eluding me at the moment. 
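A minimal sketch of that approach (the file name, dtype and shape below are placeholders for whatever the simulation really produces):

import numpy as np

# mode='w+' creates (or overwrites) a file of the right size on disk;
# the returned object behaves like an ordinary ndarray.
data = np.memmap('sim_output.dat', dtype=np.float64, mode='w+',
                 shape=(100000, 512))

data[0, :] = 1.0        # writes go to the mapped file, not to RAM
data.flush()            # push any dirty pages out to disk

del data                # drop the map; reopen later with mode='r' to read
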
Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma From peridot.faceted at gmail.com Thu Aug 30 11:34:11 2007 From: peridot.faceted at gmail.com (Anne Archibald) Date: Thu, 30 Aug 2007 11:34:11 -0400 Subject: [Numpy-discussion] Accessing a numpy array in a mmap fashion In-Reply-To: <29e003270708300825w3fc10f2diaef3f4b09690882b@mail.gmail.com> References: <29e003270708300825w3fc10f2diaef3f4b09690882b@mail.gmail.com> Message-ID: On 30/08/2007, Brian Donovan wrote: > Hello all, > > I'm wondering if there is a way to use a numpy array that uses disk as a > memory store rather than ram. I'm looking for something like mmap but which > can be used like a numpy array. The general idea is this. I'm simulating a > system which produces a large dataset over a few hours of processing time. > Rather than store the numpy array in memory during processing I'd like to > write the data directly to disk but still be able to treat the array as a > numpy array. Is this possible? Any ideas? You want numpy.memmap: http://mail.python.org/pipermail/python-list/2007-May/443036.html This will do exactly what you want (though you may have problems with arrays bigger than a few gigabytes, particularly on 32-bit systems) and there may be a few rough edges. You will probably need to create the file first. Keep in mind that if the array is actually temporary, the virtual memory system will push unused parts out to disk as memory fills up, so there's no need to use memmap explicitly. If you want the array permanently on disk, though, memmap is probably the most convenient way to do it - though if your access patterns are not local it may involve a lot of thrashing. Sequential disk writes have the advantage (?) of forcing you to write code that accesses disks in a local fashion. Anne From stefan at sun.ac.za Thu Aug 30 18:04:51 2007 From: stefan at sun.ac.za (Stefan van der Walt) Date: Fri, 31 Aug 2007 00:04:51 +0200 Subject: [Numpy-discussion] Error code of NumpyTest() In-Reply-To: <59808.129.240.228.53.1188467324.squirrel@cens.ioc.ee> References: <59808.129.240.228.53.1188467324.squirrel@cens.ioc.ee> Message-ID: <20070830220451.GG14395@mentat.za.net> On Thu, Aug 30, 2007 at 12:48:44PM +0300, Pearu Peterson wrote: > The svn version of test() function now returns TestResult object. > > So, test() calls in buildbot should read: > > import numpy,sys; sys.exit(not > numpy.test(verbosity=9999,level=9999).wasSuccessful()) > > Hopefully buildbot admins can update the test commands accordingly. Thanks, Pearu. I forwarded your instructions to the relevant parties. Cheers St?fan From oliphant at enthought.com Fri Aug 31 15:22:06 2007 From: oliphant at enthought.com (Travis E. Oliphant) Date: Fri, 31 Aug 2007 14:22:06 -0500 Subject: [Numpy-discussion] numpy build fails on powerpc ydl In-Reply-To: <200708300824.27221@b00d61a8cecf8b2266f81358fd170621.navy.mil> References: <200708300824.27221@b00d61a8cecf8b2266f81358fd170621.navy.mil> Message-ID: <46D86A5E.7010809@enthought.com> Vincent Broman wrote: > My build of numpy fails under Yellow Dog Linux 2.1, > running on a powerpc multiprocessor board from Curtiss-Wright. > > Its kernel is 2.4.19-Asmp tailored by the vendor. > The gcc compiler is configured as ppc-yellowdog-linux with > version number 2.95.3 20010111. > The python I'm using is Python 2.5.1 (r251:54863) installed as python2. > Plain /usr/bin/python is 1.5.x . > The numpy version I'm trying to build is r4003 for v1.0.4 . 
> > The setup fails compiling build/src.linux-ppc-2.5/numpy/core/src/umathmodule.c > with a long list of error messages of the following two kinds. > > warning: conflicting types for built-in function `sinl' > repeated for `cosl', `fabsl', and `sqrtl', triggered by line 442. > You may be the first one to build on this platform. What needs to happen is that the correct config.h file needs to be set up for that platform. The long-float versions of certain functions are being incorrectly identified. Would you be willing to help get the config.h file set up correctly? -Travis From charlesr.harris at gmail.com Fri Aug 31 16:35:57 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 31 Aug 2007 14:35:57 -0600 Subject: [Numpy-discussion] numpy build fails on powerpc ydl In-Reply-To: <200708300824.27221@b00d61a8cecf8b2266f81358fd170621.navy.mil> References: <200708300824.27221@b00d61a8cecf8b2266f81358fd170621.navy.mil> Message-ID: On 8/30/07, Vincent Broman wrote: > > My build of numpy fails under Yellow Dog Linux 2.1, > running on a powerpc multiprocessor board from Curtiss-Wright. > > Its kernel is 2.4.19-Asmp tailored by the vendor. Which vendor? The gcc compiler is configured as ppc-yellowdog-linux with > version number 2.95.3 20010111. That compiler is really, I mean really, ancient. And the API changed in newer gcc (> 3.x.x), so code compiled with later versions isn't binary compatible. Hmmm. Curtiss-Wright now supports Linux and kernel 2.6.16 on some of their newer hardware, you might want to check with them or install a more current distro from Fedora or someone else who supports the PPC. The python I'm using is Python 2.5.1 (r251:54863) installed as python2. > Plain /usr/bin/python is 1.5.x . > The numpy version I'm trying to build is r4003 for v1.0.4 . > > The setup fails compiling build/src.linux-ppc-2.5 > /numpy/core/src/umathmodule.c > with a long list of error messages of the following two kinds. > > warning: conflicting types for built-in function `sinl' > repeated for `cosl', `fabsl', and `sqrtl', triggered by line 442. Any more detail on these? What causes the conflict. I've got to wonder about the the libc/libm versions also. Does the include file math.h say anything about the prototypes for these functions? I expect cosl et.al. to be potential problems on the PPC anyway due to the way long doubles were implemented. inconsistent operand constraints in an `asm', > triggered by lines 1100, 1124, 1150, 1755, 1785, and 1834. > > I cannot see on those source lines what causes such a > message; I suspect there is some long complicated > cpp macro or asm statement in some include file which > I don't find. What is the PPC model number? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: