From ndbecker2 at gmail.com Fri May 1 14:28:48 2009 From: ndbecker2 at gmail.com (Neal Becker) Date: Fri, 01 May 2009 14:28:48 -0400 Subject: [Numpy-discussion] apply table lookup to each element Message-ID: Suggestion for efficient way to apply a table lookup to each element of an integer array? import numpy as np _cos = np.empty ((2**rom_in_bits,), dtype=int) _sin = np.empty ((2**rom_in_bits,), dtype=int) for address in xrange (2**12): _cos[address] = nint ((2.0**(rom_out_bits-1)-1) * cos (2 * pi * address * (2.0**-rom_in_bits))) _sin[address] = nint ((2.0**(rom_out_bits-1)-1) * sin (2 * pi * address * (2.0**-rom_in_bits))) Now _cos, _sin are arrays of integers (quantized sin, cos lookup tables) How to apply _cos lookup to each element of an integer array: phase = np.array (..., dtype =int) cos_out = lookup (phase, _cos) ??? From ndbecker2 at gmail.com Fri May 1 15:02:32 2009 From: ndbecker2 at gmail.com (Neal Becker) Date: Fri, 01 May 2009 15:02:32 -0400 Subject: [Numpy-discussion] Really strange result Message-ID: In [16]: (np.linspace (0, len (x)-1, len(x)).astype (np.uint64)*2).dtype Out[16]: dtype('uint64') In [17]: (np.linspace (0, len (x)-1, len(x)).astype (np.uint64)*n).dtype Out[17]: dtype('float64') In [18]: type(n) Out[18]: Now that's just strange. What's going on? From charlesr.harris at gmail.com Fri May 1 18:58:42 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 1 May 2009 16:58:42 -0600 Subject: [Numpy-discussion] Really strange result In-Reply-To: References: Message-ID: On Fri, May 1, 2009 at 1:02 PM, Neal Becker wrote: > In [16]: (np.linspace (0, len (x)-1, len(x)).astype (np.uint64)*2).dtype > Out[16]: dtype('uint64') > > In [17]: (np.linspace (0, len (x)-1, len(x)).astype (np.uint64)*n).dtype > Out[17]: dtype('float64') > > In [18]: type(n) > Out[18]: > > Now that's just strange. What's going on? > > The n is signed, uint64 is unsigned. So a signed type that can hold uint64 is needed. There ain't no such integer, so float64 is used. I think the logic here is a bit goofy myself since float64 doesn't have the needed 64 bit precision and the conversion from int kind to float kind is confusing. I think it would be better to raise a NotAvailable error or some such. Lest you think this is an isolated oddity, sometimes numeric arrays can be converted to object arrays. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndbecker2 at gmail.com Fri May 1 21:24:19 2009 From: ndbecker2 at gmail.com (Neal Becker) Date: Fri, 01 May 2009 21:24:19 -0400 Subject: [Numpy-discussion] Really strange result References: Message-ID: Charles R Harris wrote: > On Fri, May 1, 2009 at 1:02 PM, Neal Becker wrote: > >> In [16]: (np.linspace (0, len (x)-1, len(x)).astype (np.uint64)*2).dtype >> Out[16]: dtype('uint64') >> >> In [17]: (np.linspace (0, len (x)-1, len(x)).astype (np.uint64)*n).dtype >> Out[17]: dtype('float64') >> >> In [18]: type(n) >> Out[18]: >> >> Now that's just strange. What's going on? >> >> > The n is signed, uint64 is unsigned. So a signed type that can hold > uint64 is needed. There ain't no such integer, so float64 is used. I think > the logic here is a bit goofy myself since float64 doesn't have the needed > 64 bit precision and the conversion from int kind to float kind is > confusing. I think it would be better to raise a NotAvailable error or > some such. Lest you think this is an isolated oddity, sometimes numeric > arrays can be converted to object arrays. 
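For the table-lookup question at the top of this digest: as the follow-up later in the digest notes, plain fancy indexing (table[int_array]) applies the lookup to every element. A self-contained sketch, where rom_in_bits = 12 and rom_out_bits = 10 are made-up illustration values, not anything from the original post:

import numpy as np

rom_in_bits = 12          # assumed table size (4096 entries), illustration only
rom_out_bits = 10         # assumed output width, illustration only
address = np.arange(2**rom_in_bits)
scale = 2.0**(rom_out_bits - 1) - 1
_cos = np.round(scale * np.cos(2 * np.pi * address * 2.0**-rom_in_bits)).astype(int)
_sin = np.round(scale * np.sin(2 * np.pi * address * 2.0**-rom_in_bits)).astype(int)

phase = np.array([0, 1024, 2048, 3072])   # any integer array of table addresses
cos_out = _cos[phase]                     # fancy indexing: one lookup per element
sin_out = _sin[phase]
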
> > Chuck I don't think that any type of integer arithmetic should ever be automatically promoted to float. Besides that, what about the first example? There, I used '2' rather than 'n'. Is not '2' also an int? From charlesr.harris at gmail.com Fri May 1 21:39:32 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 1 May 2009 19:39:32 -0600 Subject: [Numpy-discussion] Really strange result In-Reply-To: References: Message-ID: On Fri, May 1, 2009 at 7:24 PM, Neal Becker wrote: > Charles R Harris wrote: > > > On Fri, May 1, 2009 at 1:02 PM, Neal Becker wrote: > > > >> In [16]: (np.linspace (0, len (x)-1, len(x)).astype (np.uint64)*2).dtype > >> Out[16]: dtype('uint64') > >> > >> In [17]: (np.linspace (0, len (x)-1, len(x)).astype (np.uint64)*n).dtype > >> Out[17]: dtype('float64') > >> > >> In [18]: type(n) > >> Out[18]: > >> > >> Now that's just strange. What's going on? > >> > >> > > The n is signed, uint64 is unsigned. So a signed type that can hold > > uint64 is needed. There ain't no such integer, so float64 is used. I > think > > the logic here is a bit goofy myself since float64 doesn't have the > needed > > 64 bit precision and the conversion from int kind to float kind is > > confusing. I think it would be better to raise a NotAvailable error or > > some such. Lest you think this is an isolated oddity, sometimes numeric > > arrays can be converted to object arrays. > > > > Chuck > > I don't think that any type of integer arithmetic should ever be > automatically promoted to float. > > Besides that, what about the first example? There, I used '2' rather than > 'n'. Is not '2' also an int? What version of numpy are you using? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri May 1 21:40:21 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 1 May 2009 19:40:21 -0600 Subject: [Numpy-discussion] Really strange result In-Reply-To: References: Message-ID: On Fri, May 1, 2009 at 7:39 PM, Charles R Harris wrote: > > > On Fri, May 1, 2009 at 7:24 PM, Neal Becker wrote: > >> Charles R Harris wrote: >> >> > On Fri, May 1, 2009 at 1:02 PM, Neal Becker >> wrote: >> > >> >> In [16]: (np.linspace (0, len (x)-1, len(x)).astype >> (np.uint64)*2).dtype >> >> Out[16]: dtype('uint64') >> >> >> >> In [17]: (np.linspace (0, len (x)-1, len(x)).astype >> (np.uint64)*n).dtype >> >> Out[17]: dtype('float64') >> >> >> >> In [18]: type(n) >> >> Out[18]: >> >> >> >> Now that's just strange. What's going on? >> >> >> >> >> > The n is signed, uint64 is unsigned. So a signed type that can hold >> > uint64 is needed. There ain't no such integer, so float64 is used. I >> think >> > the logic here is a bit goofy myself since float64 doesn't have the >> needed >> > 64 bit precision and the conversion from int kind to float kind is >> > confusing. I think it would be better to raise a NotAvailable error or >> > some such. Lest you think this is an isolated oddity, sometimes numeric >> > arrays can be converted to object arrays. >> > >> > Chuck >> >> I don't think that any type of integer arithmetic should ever be >> automatically promoted to float. >> >> Besides that, what about the first example? There, I used '2' rather than >> 'n'. Is not '2' also an int? > > > What version of numpy are you using? > And what is the value of n? Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ndbecker2 at gmail.com Sat May 2 07:38:04 2009 From: ndbecker2 at gmail.com (Neal Becker) Date: Sat, 02 May 2009 07:38:04 -0400 Subject: [Numpy-discussion] Really strange result References: Message-ID: Charles R Harris wrote: > On Fri, May 1, 2009 at 7:39 PM, Charles R Harris > wrote: > >> >> >> On Fri, May 1, 2009 at 7:24 PM, Neal Becker wrote: >> >>> Charles R Harris wrote: >>> >>> > On Fri, May 1, 2009 at 1:02 PM, Neal Becker >>> wrote: >>> > >>> >> In [16]: (np.linspace (0, len (x)-1, len(x)).astype >>> (np.uint64)*2).dtype >>> >> Out[16]: dtype('uint64') >>> >> >>> >> In [17]: (np.linspace (0, len (x)-1, len(x)).astype >>> (np.uint64)*n).dtype >>> >> Out[17]: dtype('float64') >>> >> >>> >> In [18]: type(n) >>> >> Out[18]: >>> >> >>> >> Now that's just strange. What's going on? >>> >> >>> >> >>> > The n is signed, uint64 is unsigned. So a signed type that can hold >>> > uint64 is needed. There ain't no such integer, so float64 is used. I >>> think >>> > the logic here is a bit goofy myself since float64 doesn't have the >>> needed >>> > 64 bit precision and the conversion from int kind to float kind is >>> > confusing. I think it would be better to raise a NotAvailable error or >>> > some such. Lest you think this is an isolated oddity, sometimes >>> > numeric arrays can be converted to object arrays. >>> > >>> > Chuck >>> >>> I don't think that any type of integer arithmetic should ever be >>> automatically promoted to float. >>> >>> Besides that, what about the first example? There, I used '2' rather >>> than >>> 'n'. Is not '2' also an int? >> >> >> What version of numpy are you using? >> > > And what is the value of n? > > Chuck np.version.version Out[5]: '1.3.0' (I think the previous test was on 1.2.0 and did the same thing) (np.linspace (0, 1023,1024).astype(np.uint64)*2).dtype Out[2]: dtype('uint64') In [3]: n=-7 In [4]: (np.linspace (0, 1023,1024).astype(np.uint64)*n).dtype Out[4]: dtype('float64') From matthew.brett at gmail.com Sun May 3 00:54:44 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Sat, 2 May 2009 21:54:44 -0700 Subject: [Numpy-discussion] Structured array with no fields - possible? Message-ID: <1e2af89e0905022154o375b48b7u1f27da260f7286eb@mail.gmail.com> Hello, I'm trying to fix a bug in the scipy matlab loading routines, and this requires me to somehow represent an empty structured array. In matlab this is: >> a = struct() In numpy, you can do this: In [1]: dt = np.dtype([]) but then you can't use it: In [2]: np.zeros((),dtype=dt) --------------------------------------------------------------------------- ValueError Traceback (most recent call last) /home/mb312/tmp/ in () ValueError: Empty data-type Is there any way of representing a structured / record array, but with no fields? Thanks for any thoughts, Matthew From david at ar.media.kyoto-u.ac.jp Sun May 3 02:22:02 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sun, 03 May 2009 15:22:02 +0900 Subject: [Numpy-discussion] Structured array with no fields - possible? In-Reply-To: <1e2af89e0905022154o375b48b7u1f27da260f7286eb@mail.gmail.com> References: <1e2af89e0905022154o375b48b7u1f27da260f7286eb@mail.gmail.com> Message-ID: <49FD380A.9080600@ar.media.kyoto-u.ac.jp> Hi Matthew, Matthew Brett wrote: > Hello, > > I'm trying to fix a bug in the scipy matlab loading routines, and this > requires me to somehow represent an empty structured array. > Do you need the struct to be empty (size is 0) or to have no fields ? 
What would you expect np.zeros((), dtype=np.dtype([])) to return, for example ? cheers, David From cournape at gmail.com Sun May 3 02:49:34 2009 From: cournape at gmail.com (David Cournapeau) Date: Sun, 3 May 2009 15:49:34 +0900 Subject: [Numpy-discussion] Porting strategy for py3k In-Reply-To: <49F16F17.9000303@student.matnat.uio.no> References: <5b8d13220904230638i553d0e7bx2f3d93572861940c@mail.gmail.com> <5b8d13220904230752u3c8dd7fyb14f937d358472fd@mail.gmail.com> <49F095F0.90604@noaa.gov> <49F13EEF.6060205@ar.media.kyoto-u.ac.jp> <49F16F17.9000303@student.matnat.uio.no> Message-ID: <5b8d13220905022349v2f103cf5r3793cf8b54cdc405@mail.gmail.com> On Fri, Apr 24, 2009 at 4:49 PM, Dag Sverre Seljebotn wrote: > David Cournapeau wrote: >> Christopher Barker wrote: >>> Though I'm a bit surprised that that's not how the print function is >>> written in the first place (maybe it is in py3k -- I'm testing on 2.5) >>> >> >> That's actually how it works as far as I can tell. The thing with >> removing those print is that we can do it without too much trouble. As >> long as we cannot actually test any py3k code, warnings from python 2.6 >> is all we can get. >> >> I think we should aim at getting "something" which builds and runs (even >> if does not go further than import stage), so we can gradually port. For >> now, porting py3k is this huge thing that nobody can work on for say one >> hour. I would like to make sure we get at that stage, so that many >> people can take part of it, instead of the currently quite few people >> who are deeply intimate with numpy. > > One thing somebody *could* work on rather independently for some hours > is proper PEP 3118 support, as that is available in Python 2.6+ as well > and could be conditionally used on those systems. Yes, this could be done independently. I am not familiar with PEP 3118; from the python-dev ML, it looks like the current buffer API has some serious shortcomings, I don't whether this implies to numpy or not. Do you have more on this ? Another relatively independent thing is to be able to bootstrap our build from py3k. At least, distutils and the code for bootstrapping, so that we can then run 2to3 on the source code from distutils. Not being able to bootstrap our build process under py3k from distutils sounds too much of a pain. The only real alternative I could see is to have two codebases, because 2to3 does not seem able to convert numpy.distutils 100 % automatically. David From charlesr.harris at gmail.com Sun May 3 03:10:32 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 3 May 2009 01:10:32 -0600 Subject: [Numpy-discussion] Structured array with no fields - possible? In-Reply-To: <1e2af89e0905022154o375b48b7u1f27da260f7286eb@mail.gmail.com> References: <1e2af89e0905022154o375b48b7u1f27da260f7286eb@mail.gmail.com> Message-ID: On Sat, May 2, 2009 at 10:54 PM, Matthew Brett wrote: > Hello, > > I'm trying to fix a bug in the scipy matlab loading routines, and this > requires me to somehow represent an empty structured array. > > In matlab this is: > > >> a = struct() > Wouldn't a dictionary fit the matlab structure a bit better? Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
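On the empty-structured-array question: a quick sketch of the distinction David raises, length zero versus no fields. The zero-length case works fine; the zero-field dtype is accepted, but (as Matthew showed) building an array from it fails in numpy 1.3:

import numpy as np

# length 0, but with fields: perfectly legal
a = np.zeros((0,), dtype=[('x', 'f8'), ('y', 'i4')])
print a.shape, a.dtype.names        # (0,) ('x', 'y')

# no fields at all: the dtype itself is accepted ...
dt = np.dtype([])
# ... but np.zeros((), dtype=dt) raises "ValueError: Empty data-type" (numpy 1.3)
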
URL: From dagss at student.matnat.uio.no Sun May 3 12:24:47 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sun, 3 May 2009 18:24:47 +0200 (CEST) Subject: [Numpy-discussion] Porting strategy for py3k In-Reply-To: <5b8d13220905022349v2f103cf5r3793cf8b54cdc405@mail.gmail.com> References: <5b8d13220904230638i553d0e7bx2f3d93572861940c@mail.gmail.com> <5b8d13220904230752u3c8dd7fyb14f937d358472fd@mail.gmail.com> <49F095F0.90604@noaa.gov> <49F13EEF.6060205@ar.media.kyoto-u.ac.jp> <49F16F17.9000303@student.matnat.uio.no> <5b8d13220905022349v2f103cf5r3793cf8b54cdc405@mail.gmail.com> Message-ID: David Cournapeau wrote: > On Fri, Apr 24, 2009 at 4:49 PM, Dag Sverre Seljebotn >> One thing somebody *could* work on rather independently for some hours >> is proper PEP 3118 support, as that is available in Python 2.6+ as well >> and could be conditionally used on those systems. > > Yes, this could be done independently. I am not familiar with PEP > 3118; from the python-dev ML, it looks like the current buffer API has > some serious shortcomings, I don't whether this implies to numpy or > not. Do you have more on this ? Not sure what you refer to ... I'll just write more and hope it answers your question. The difference with PEP 3118 is that many more memory models are supported (beyond 1D contiguous). All of NumPy's strided arrays are very easy to expose (after all, Travis Oliphant did the PEP), with no information lost (i.e. the dtype, shape and strides can be communicated). This means that one should in Python 3/2.6+ be able to use other CPython libraries (like image libraries etc.) in a seamless fashion with NumPy arrays without those libraries having to know about NumPy as such; they can simply request a strided view of the data through PEP 3118. To support clients one mainly has to copy out information that is already there into a struct when requested. The one small challenge is creating a format string for the buffer dtype (which is incompatible with the current string representations of dtype). In addition it would be natural to act as a client, so that calling "np.array(obj)" (and/or np.view?) would acquire the data through PEP 3118. There is a class of buffers which doesn't fit in NumPy's memory model (e.g. pointers to rows of a matrix) and for which a copy would have to be made, but a lot of them could be used through a direct view as well. Dag Sverre From sccolbert at gmail.com Sun May 3 15:30:28 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Sun, 3 May 2009 15:30:28 -0400 Subject: [Numpy-discussion] is there a way to get Nd indices of max of Nd array Message-ID: <7f014ea60905031230l679e61d0h601d25aa29f218eb@mail.gmail.com> my case is only for 2d, but should apply to Nd as well. It would be convienent if np.max would return a tuple of the max value and its Nd location indices. Is there an easier way than just using the 1d flattened array max index (np.argmax) and calculating its corresponding Nd location? Chris -------------- next part -------------- An HTML attachment was scrubbed... 
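For the Nd-index-of-max question just above, the replies that follow boil down to two one-liners; a small sketch of both:

import numpy as np

a = np.random.randint(5, size=(4, 3))

# (1) position of the first occurrence of the maximum
i, j = np.unravel_index(np.argmax(a), a.shape)
print a[i, j] == a.max()                  # True

# (2) positions of every occurrence of the maximum
rows, cols = np.nonzero(a == a.max())
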
URL: From josef.pktd at gmail.com Sun May 3 16:34:23 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 3 May 2009 16:34:23 -0400 Subject: [Numpy-discussion] is there a way to get Nd indices of max of Nd array In-Reply-To: <7f014ea60905031230l679e61d0h601d25aa29f218eb@mail.gmail.com> References: <7f014ea60905031230l679e61d0h601d25aa29f218eb@mail.gmail.com> Message-ID: <1cd32cbb0905031334oee89fccw864e662d3349aabd@mail.gmail.com> On Sun, May 3, 2009 at 3:30 PM, Chris Colbert wrote: > my case is only for 2d, but should apply to Nd as well. > > It would be convienent if np.max would return a tuple of the max value and > its Nd location indices. > > Is there an easier way than just using the 1d flattened array max index > (np.argmax)?and calculating its corresponding Nd location? > >>> factors = np.random.randint(5,size=(4,3)) >>> factors array([[1, 1, 3], [0, 2, 1], [4, 4, 1], [2, 2, 4]]) >>> factors.max() 4 >>> np.argmax(factors) 6 >>> np.nonzero(factors==factors.max()) (array([2, 2, 3]), array([0, 1, 2])) Josef From sccolbert at gmail.com Sun May 3 18:30:37 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Sun, 3 May 2009 18:30:37 -0400 Subject: [Numpy-discussion] is there a way to get Nd indices of max of Nd array In-Reply-To: <1cd32cbb0905031334oee89fccw864e662d3349aabd@mail.gmail.com> References: <7f014ea60905031230l679e61d0h601d25aa29f218eb@mail.gmail.com> <1cd32cbb0905031334oee89fccw864e662d3349aabd@mail.gmail.com> Message-ID: <7f014ea60905031530p6fb47439w4e101bfd4afff1a1@mail.gmail.com> but this gives me just the locations of the column/row maximums. I need the (x,y) location of the array maximum. Chris On Sun, May 3, 2009 at 4:34 PM, wrote: > On Sun, May 3, 2009 at 3:30 PM, Chris Colbert > wrote: > > my case is only for 2d, but should apply to Nd as well. > > > > It would be convienent if np.max would return a tuple of the max value > and > > its Nd location indices. > > > > Is there an easier way than just using the 1d flattened array max index > > (np.argmax) and calculating its corresponding Nd location? > > > > >>> factors = np.random.randint(5,size=(4,3)) > >>> factors > array([[1, 1, 3], > [0, 2, 1], > [4, 4, 1], > [2, 2, 4]]) > >>> factors.max() > 4 > >>> np.argmax(factors) > 6 > >>> np.nonzero(factors==factors.max()) > (array([2, 2, 3]), array([0, 1, 2])) > > Josef > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sccolbert at gmail.com Sun May 3 18:31:37 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Sun, 3 May 2009 18:31:37 -0400 Subject: [Numpy-discussion] is there a way to get Nd indices of max of Nd array In-Reply-To: <7f014ea60905031530p6fb47439w4e101bfd4afff1a1@mail.gmail.com> References: <7f014ea60905031230l679e61d0h601d25aa29f218eb@mail.gmail.com> <1cd32cbb0905031334oee89fccw864e662d3349aabd@mail.gmail.com> <7f014ea60905031530p6fb47439w4e101bfd4afff1a1@mail.gmail.com> Message-ID: <7f014ea60905031531j431a4d92j53303a1c1b45f4dc@mail.gmail.com> wait, nevermind. your're right. Thanks! On Sun, May 3, 2009 at 6:30 PM, Chris Colbert wrote: > but this gives me just the locations of the column/row maximums. > > I need the (x,y) location of the array maximum. > > Chris > > On Sun, May 3, 2009 at 4:34 PM, wrote: > >> On Sun, May 3, 2009 at 3:30 PM, Chris Colbert >> wrote: >> > my case is only for 2d, but should apply to Nd as well. 
>> > >> > It would be convienent if np.max would return a tuple of the max value >> and >> > its Nd location indices. >> > >> > Is there an easier way than just using the 1d flattened array max index >> > (np.argmax) and calculating its corresponding Nd location? >> > >> >> >>> factors = np.random.randint(5,size=(4,3)) >> >>> factors >> array([[1, 1, 3], >> [0, 2, 1], >> [4, 4, 1], >> [2, 2, 4]]) >> >>> factors.max() >> 4 >> >>> np.argmax(factors) >> 6 >> >>> np.nonzero(factors==factors.max()) >> (array([2, 2, 3]), array([0, 1, 2])) >> >> Josef >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sccolbert at gmail.com Sun May 3 19:34:55 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Sun, 3 May 2009 19:34:55 -0400 Subject: [Numpy-discussion] index into a 3d histogram Message-ID: <7f014ea60905031634h35b54586x193e21e5a3a2adab@mail.gmail.com> Lets say I have histogram of a color image that is of size [16, 16, 16]. Now, I have a function that converts my rgb image into the format where each rgb color (i.e. img[x, y, :] = (r, g, b)) is an integer in the range(0, 16) I want create a new 2d array where new2darray[x, y] = hist[img[x,y, :]] that is, for each 3 tuple in img, use that as an index into the 3d histogram and store the value of the histogram into the (x,y) position of the 2d array. I've prototyped all the algorithms using for loops, now im just trying to speed it up. I can't quite wrap my head fully around this fancy indexing yet. Chris -------------- next part -------------- An HTML attachment was scrubbed... URL: From sccolbert at gmail.com Sun May 3 19:55:22 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Sun, 3 May 2009 19:55:22 -0400 Subject: [Numpy-discussion] index into a 3d histogram In-Reply-To: <7f014ea60905031634h35b54586x193e21e5a3a2adab@mail.gmail.com> References: <7f014ea60905031634h35b54586x193e21e5a3a2adab@mail.gmail.com> Message-ID: <7f014ea60905031655k69c81200m85bafb1cbddb2813@mail.gmail.com> solved my own question: for future googlers: newarray = histogram[img[:, :, 0], img[:, :, 1], img[:, :, 2]) where histogram is the 3d histogram and img is the 3d img of (r, g, b) histogram localization bins. This version is also 300x faster than nested for loops. Chris On Sun, May 3, 2009 at 7:34 PM, Chris Colbert wrote: > Lets say I have histogram of a color image that is of size [16, 16, 16]. > > Now, I have a function that converts my rgb image into the format where > each rgb color (i.e. img[x, y, :] = (r, g, b)) is an integer in the range(0, > 16) > > I want create a new 2d array where new2darray[x, y] = hist[img[x,y, :]] > > that is, for each 3 tuple in img, use that as an index into the 3d > histogram and store the value of the histogram into the (x,y) position of > the 2d array. > > I've prototyped all the algorithms using for loops, now im just trying to > speed it up. > > I can't quite wrap my head fully around this fancy indexing yet. > > Chris > -------------- next part -------------- An HTML attachment was scrubbed... 
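A self-contained version of the back-projection one-liner above (note the stray ')' in the original, which should be ']'), with made-up data standing in for the real histogram and image:

import numpy as np

hist = np.random.rand(16, 16, 16)                 # stand-in for the 3d colour histogram
img = np.random.randint(16, size=(480, 640, 3))   # per-pixel (r, g, b) bin indices, 0..15

back_proj = hist[img[:, :, 0], img[:, :, 1], img[:, :, 2]]
print back_proj.shape                             # (480, 640)
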
URL: From sccolbert at gmail.com Sun May 3 20:15:55 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Sun, 3 May 2009 20:15:55 -0400 Subject: [Numpy-discussion] efficient 3d histogram creation Message-ID: <7f014ea60905031715h635a69faof6a06e10c3621ba7@mail.gmail.com> in my endless pursuit of perfomance, i'm searching for a quick way to create a 3d histogram from a 3d rgb image. Here is what I have so far for a (16,16,16) 3d histogram: def hist3d(imgarray): histarray = N.zeros((16, 16, 16)) temp = imgarray.copy() (i, j) = imgarray.shape[0:2] temp = (temp - temp % 16) / 16 for a in range(i): for b in range(j): (b1, b2, b3) = temp[a, b, :] histarray[b1, b2, b3] += 1 return histarray this works, but takes about 4 seconds for a 640x480 image. I tried doing the inverse of my previous post, namely replacing the nested for loop with: histarray[temp[:,:,0], temp[:,:,1], temp[:,:,2]] += 1 but that doesn't work for whatever reason. It gives me number, but they're incorrect. Any ideas? Chris -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Sun May 3 20:31:58 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Mon, 4 May 2009 02:31:58 +0200 Subject: [Numpy-discussion] efficient 3d histogram creation In-Reply-To: <7f014ea60905031715h635a69faof6a06e10c3621ba7@mail.gmail.com> References: <7f014ea60905031715h635a69faof6a06e10c3621ba7@mail.gmail.com> Message-ID: <9457e7c80905031731r7db53e40p45c8a5d6e066b6e8@mail.gmail.com> Hi Chris 2009/5/4 Chris Colbert : > in my endless pursuit of perfomance, i'm searching for a quick way to create > a 3d histogram from a 3d rgb image. Does histogramdd do what you want? Regards St?fan From sccolbert at gmail.com Sun May 3 20:36:04 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Sun, 3 May 2009 20:36:04 -0400 Subject: [Numpy-discussion] efficient 3d histogram creation In-Reply-To: <9457e7c80905031731r7db53e40p45c8a5d6e066b6e8@mail.gmail.com> References: <7f014ea60905031715h635a69faof6a06e10c3621ba7@mail.gmail.com> <9457e7c80905031731r7db53e40p45c8a5d6e066b6e8@mail.gmail.com> Message-ID: <7f014ea60905031736s283d015ev2d4736007fbe5c4a@mail.gmail.com> Stefan, I'm not sure: the docs say the input has to be: sample : array_like Data to histogram passed as a sequence of D arrays of length N, or as an (N,D) array i have an (N,M,D) array and not sure how to get it to conform to input required, mainly because I don't understand what it's asking. Chris 2009/5/3 St?fan van der Walt > Hi Chris > > 2009/5/4 Chris Colbert : > > in my endless pursuit of perfomance, i'm searching for a quick way to > create > > a 3d histogram from a 3d rgb image. > > Does histogramdd do what you want? > > Regards > St?fan > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
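On the "(N, D) array" wording that is causing confusion just above: histogramdd wants one row per observation, so a (rows, cols, 3) image only needs a reshape to (rows*cols, 3). A sketch with a fake image; the explicit range argument is an addition that pins the bin edges at 0, 16, 32, ... regardless of the actual pixel values:

import numpy as np

img = np.random.randint(256, size=(480, 640, 3))   # fake rgb image
sample = img.reshape(-1, 3)                        # (N*M, 3): one pixel per row
hist, edges = np.histogramdd(sample, bins=16, range=[(0, 256)] * 3)
print hist.shape                                   # (16, 16, 16)
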
URL: From josef.pktd at gmail.com Sun May 3 20:36:09 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 3 May 2009 20:36:09 -0400 Subject: [Numpy-discussion] efficient 3d histogram creation In-Reply-To: <7f014ea60905031715h635a69faof6a06e10c3621ba7@mail.gmail.com> References: <7f014ea60905031715h635a69faof6a06e10c3621ba7@mail.gmail.com> Message-ID: <1cd32cbb0905031736n99b1907v13de267be7639f39@mail.gmail.com> On Sun, May 3, 2009 at 8:15 PM, Chris Colbert wrote: > in my endless pursuit of perfomance, i'm searching for a quick way to create > a 3d histogram from a 3d rgb image. > > Here is what I have so far for a (16,16,16) 3d histogram: > > def hist3d(imgarray): > ??? histarray = N.zeros((16, 16, 16)) > ??? temp = imgarray.copy() > ??? (i, j) = imgarray.shape[0:2] > ??? temp = (temp - temp % 16) / 16 > ??? for a in range(i): > ??????? for b in range(j): > ??????????? (b1, b2, b3) = temp[a, b, :] > ??????????? histarray[b1, b2, b3] += 1 > ????return histarray > > this works, but takes about 4 seconds for a 640x480 image. > > I tried doing the inverse of my previous post, namely replacing the nested > for loop with: > histarray[temp[:,:,0], temp[:,:,1], temp[:,:,2]] += 1 > > > but that doesn't work for whatever reason. It gives me number, but they're > incorrect. > > Any ideas? I'm not sure what exactly you need, but did you look at np.histogramdd ? reading the help file, this might work numpy.histogramdd(temp[:,:,0].ravel(), temp[:,:,1].ravel(), temp[:,:,2].ravel(), bins=16) but I never used histogramdd. also looking at the source of numpy is often very instructive, lots of good tricks to find in there: np.source(np.histogramdd). Josef From stefan at sun.ac.za Sun May 3 20:57:23 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Mon, 4 May 2009 02:57:23 +0200 Subject: [Numpy-discussion] efficient 3d histogram creation In-Reply-To: <7f014ea60905031736s283d015ev2d4736007fbe5c4a@mail.gmail.com> References: <7f014ea60905031715h635a69faof6a06e10c3621ba7@mail.gmail.com> <9457e7c80905031731r7db53e40p45c8a5d6e066b6e8@mail.gmail.com> <7f014ea60905031736s283d015ev2d4736007fbe5c4a@mail.gmail.com> Message-ID: <9457e7c80905031757n9605cbka8e44dcf580c42e7@mail.gmail.com> Hi Chris 2009/5/4 Chris Colbert : > I'm not sure: > > the docs say the input has to be: > sample : array_like > ??? Data to histogram passed as a sequence of D arrays of length N, or > ??? as an (N,D) array > > i have an (N,M,D) array and not sure how to get it to conform to input > required, mainly because I don't understand what it's asking. Try count, bins = np.histogramdd(x.reshape((-1,3)), bins=16) Regards St?fan From stefan at sun.ac.za Sun May 3 21:00:45 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Mon, 4 May 2009 03:00:45 +0200 Subject: [Numpy-discussion] is there a way to get Nd indices of max of Nd array In-Reply-To: <1cd32cbb0905031334oee89fccw864e662d3349aabd@mail.gmail.com> References: <7f014ea60905031230l679e61d0h601d25aa29f218eb@mail.gmail.com> <1cd32cbb0905031334oee89fccw864e662d3349aabd@mail.gmail.com> Message-ID: <9457e7c80905031800r7cf83580q78d6b1e41f3d9f86@mail.gmail.com> 2009/5/3 : >>>> factors = np.random.randint(5,size=(4,3)) >>>> factors > array([[1, 1, 3], > ? ? ? [0, 2, 1], > ? ? ? [4, 4, 1], > ? ? ? [2, 2, 4]]) >>>> factors.max() > 4 >>>> np.argmax(factors) > 6 >>>> np.nonzero(factors==factors.max()) > (array([2, 2, 3]), array([0, 1, 2])) Since you have more than one maximum here, you have to do something like Josef outlined above. 
If you only want the indices of the first maximum argument, you can use np.unravel_index(6, factors.shape) Regards St?fan From cournape at gmail.com Mon May 4 00:03:37 2009 From: cournape at gmail.com (David Cournapeau) Date: Mon, 4 May 2009 13:03:37 +0900 Subject: [Numpy-discussion] Porting strategy for py3k In-Reply-To: References: <5b8d13220904230638i553d0e7bx2f3d93572861940c@mail.gmail.com> <5b8d13220904230752u3c8dd7fyb14f937d358472fd@mail.gmail.com> <49F095F0.90604@noaa.gov> <49F13EEF.6060205@ar.media.kyoto-u.ac.jp> <49F16F17.9000303@student.matnat.uio.no> <5b8d13220905022349v2f103cf5r3793cf8b54cdc405@mail.gmail.com> Message-ID: <5b8d13220905032103w5ff20605q7afa936bae05ce3f@mail.gmail.com> On Mon, May 4, 2009 at 1:24 AM, Dag Sverre Seljebotn wrote: > David Cournapeau wrote: >> On Fri, Apr 24, 2009 at 4:49 PM, Dag Sverre Seljebotn >>> One thing somebody *could* work on rather independently for some hours >>> is proper PEP 3118 support, as that is available in Python 2.6+ as well >>> and could be conditionally used on those systems. >> >> Yes, this could be done independently. I am not familiar with PEP >> 3118; from the python-dev ML, it looks like the current buffer API has >> some serious shortcomings, I don't whether this implies to numpy or >> not. Do you have more on this ? > > Not sure what you refer to ... http://mail.python.org/pipermail/python-dev/2009-April/088211.html Thank you for those information. I don't understand what is meant by "not implemented for multi-dimensional array", and the consequences for numpy. Does it mean that PEP 3118 is not fully implemented ? Is the status of the buffer interface the same for python 2.6 and python 3 ? > In addition it would be natural to act as a client, so that calling > "np.array(obj)" (and/or np.view?) would acquire the data through PEP 3118. Yes, it would help making sure we implement the interface correctly for once :) I am almost done having a numpy.distutils which can bootstrap itself to the point of converting the rest of the python code to py3k. With the buffer interface, this should enable moving forward in a piecewise manner. thank you, David From sccolbert at gmail.com Mon May 4 00:31:40 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Mon, 4 May 2009 00:31:40 -0400 Subject: [Numpy-discussion] efficient 3d histogram creation In-Reply-To: <1cd32cbb0905031736n99b1907v13de267be7639f39@mail.gmail.com> References: <7f014ea60905031715h635a69faof6a06e10c3621ba7@mail.gmail.com> <1cd32cbb0905031736n99b1907v13de267be7639f39@mail.gmail.com> Message-ID: <7f014ea60905032131i114b8fdbyb9bfd04ad7d1200a@mail.gmail.com> this actually sort of worked. Thanks for putting me on the right track. Here is what I ended up with. this is what I ended up with: def hist3d(imgarray): histarray = N.zeros((16, 16, 16)) temp = imgarray.copy() bins = N.arange(0, 257, 16) histarray = N.histogramdd((temp[:,:,0].ravel(), temp[:,:,1].ravel(), temp[:,:,2].ravel()), bins=(bins, bins, bins))[0] return histarray this creates a 3d histogram of rgb image values in the range 0,255 using 16 bins per component color. on a 640x480 image, it executes in 0.3 seconds vs 4.5 seconds for a for loop. not quite framerate, but good enough for prototyping. Thanks! Chris On Sun, May 3, 2009 at 8:36 PM, wrote: > On Sun, May 3, 2009 at 8:15 PM, Chris Colbert > wrote: > > in my endless pursuit of perfomance, i'm searching for a quick way to > create > > a 3d histogram from a 3d rgb image. 
> > > > Here is what I have so far for a (16,16,16) 3d histogram: > > > > def hist3d(imgarray): > > histarray = N.zeros((16, 16, 16)) > > temp = imgarray.copy() > > (i, j) = imgarray.shape[0:2] > > temp = (temp - temp % 16) / 16 > > for a in range(i): > > for b in range(j): > > (b1, b2, b3) = temp[a, b, :] > > histarray[b1, b2, b3] += 1 > > return histarray > > > > this works, but takes about 4 seconds for a 640x480 image. > > > > I tried doing the inverse of my previous post, namely replacing the > nested > > for loop with: > > histarray[temp[:,:,0], temp[:,:,1], temp[:,:,2]] += 1 > > > > > > but that doesn't work for whatever reason. It gives me number, but > they're > > incorrect. > > > > Any ideas? > > I'm not sure what exactly you need, but did you look at np.histogramdd ? > > reading the help file, this might work > > numpy.histogramdd(temp[:,:,0].ravel(), temp[:,:,1].ravel(), > temp[:,:,2].ravel(), bins=16) > > but I never used histogramdd. > > also looking at the source of numpy is often very instructive, lots of > good tricks to find in there: np.source(np.histogramdd). > > Josef > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon May 4 07:00:13 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 4 May 2009 07:00:13 -0400 Subject: [Numpy-discussion] efficient 3d histogram creation In-Reply-To: <7f014ea60905032131i114b8fdbyb9bfd04ad7d1200a@mail.gmail.com> References: <7f014ea60905031715h635a69faof6a06e10c3621ba7@mail.gmail.com> <1cd32cbb0905031736n99b1907v13de267be7639f39@mail.gmail.com> <7f014ea60905032131i114b8fdbyb9bfd04ad7d1200a@mail.gmail.com> Message-ID: <1cd32cbb0905040400p4dcd3de7he45b3be942dc2c02@mail.gmail.com> On Mon, May 4, 2009 at 12:31 AM, Chris Colbert wrote: > this actually sort of worked. Thanks for putting me on the right track. > > Here is what I ended up with. > > this is what I ended up with: > > def hist3d(imgarray): > ??? histarray = N.zeros((16, 16, 16)) > ??? temp = imgarray.copy() > ????bins = N.arange(0, 257, 16) > ??? histarray = N.histogramdd((temp[:,:,0].ravel(), temp[:,:,1].ravel(), > temp[:,:,2].ravel()), bins=(bins, bins, bins))[0] > ??? return histarray > > this creates a 3d histogram of rgb image values in the range 0,255 using 16 > bins per component color. > > on a 640x480 image, it executes in 0.3 seconds vs 4.5 seconds for a for > loop. > > not quite framerate, but good enough for prototyping. > I don't think your copy to temp is necessary, and use reshape(-1,3) as in the example of Stefan, which will avoid copying the array 3 times. If you need to gain some more speed, then rewriting histogramdd and removing some of the unnecessary checks and calculations looks possible. Josef From ndbecker2 at gmail.com Mon May 4 07:54:34 2009 From: ndbecker2 at gmail.com (Neal Becker) Date: Mon, 04 May 2009 07:54:34 -0400 Subject: [Numpy-discussion] apply table lookup to each element References: Message-ID: Neal Becker wrote: > Suggestion for efficient way to apply a table lookup to each element of an > integer array? 
> > import numpy as np > _cos = np.empty ((2**rom_in_bits,), dtype=int) > _sin = np.empty ((2**rom_in_bits,), dtype=int) > for address in xrange (2**12): > _cos[address] = nint ((2.0**(rom_out_bits-1)-1) * cos (2 * pi * > address > * (2.0**-rom_in_bits))) > _sin[address] = nint ((2.0**(rom_out_bits-1)-1) * sin (2 * pi * > address > * (2.0**-rom_in_bits))) > > Now _cos, _sin are arrays of integers (quantized sin, cos lookup tables) > > How to apply _cos lookup to each element of an integer array: > > phase = np.array (..., dtype =int) > cos_out = lookup (phase, _cos) ??? Turns out that if A is an np.array and B is an np.array, then A[B] will do exactly what I wanted. Is this mentioned anywhere in the documentation? From stefan at sun.ac.za Mon May 4 08:10:58 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Mon, 4 May 2009 14:10:58 +0200 Subject: [Numpy-discussion] apply table lookup to each element In-Reply-To: References: Message-ID: <9457e7c80905040510v5244d611k5d160ef8bd6ad76d@mail.gmail.com> 2009/5/4 Neal Becker : > Turns out that if A is an np.array and B is an np.array, then > A[B] will do exactly what I wanted. > > Is this mentioned anywhere in the documentation? http://docs.scipy.org/numpy/docs/numpy-docs/reference/arrays.indexing.rst/#arrays-indexing St?fan From dagss at student.matnat.uio.no Mon May 4 08:22:34 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Mon, 04 May 2009 14:22:34 +0200 Subject: [Numpy-discussion] Porting strategy for py3k In-Reply-To: <5b8d13220905032103w5ff20605q7afa936bae05ce3f@mail.gmail.com> References: <5b8d13220904230638i553d0e7bx2f3d93572861940c@mail.gmail.com> <5b8d13220904230752u3c8dd7fyb14f937d358472fd@mail.gmail.com> <49F095F0.90604@noaa.gov> <49F13EEF.6060205@ar.media.kyoto-u.ac.jp> <49F16F17.9000303@student.matnat.uio.no> <5b8d13220905022349v2f103cf5r3793cf8b54cdc405@mail.gmail.com> <5b8d13220905032103w5ff20605q7afa936bae05ce3f@mail.gmail.com> Message-ID: <49FEDE0A.30105@student.matnat.uio.no> David Cournapeau wrote: > On Mon, May 4, 2009 at 1:24 AM, Dag Sverre Seljebotn > wrote: > >> David Cournapeau wrote: >> >>> On Fri, Apr 24, 2009 at 4:49 PM, Dag Sverre Seljebotn >>> >>>> One thing somebody *could* work on rather independently for some hours >>>> is proper PEP 3118 support, as that is available in Python 2.6+ as well >>>> and could be conditionally used on those systems. >>>> >>> Yes, this could be done independently. I am not familiar with PEP >>> 3118; from the python-dev ML, it looks like the current buffer API has >>> some serious shortcomings, I don't whether this implies to numpy or >>> not. Do you have more on this ? >>> >> Not sure what you refer to ... >> > > http://mail.python.org/pipermail/python-dev/2009-April/088211.html > > Thank you for those information. > > I don't understand what is meant by "not implemented for > multi-dimensional array", and the consequences for numpy. Does it mean > that PEP 3118 is not fully implemented ? Is the status of the buffer > interface the same for python 2.6 and python 3 ? > The "memoryview" is not implemented on 2.6, but that's just a utility for being able to acquire a buffer and inspect it from Python-space. From Cython or C one still has access. I think this just refers to there not being any multidimensional consumers nor exporters in the standard library. So from the point of view of the standard library it is a bit useless; but it is not if one uses 3rd party libraries. The API itself is working fine, and you can e.g. 
export a multidimensional buffer and use it in Cython (defined __getbuffer__ for a cdef class and then access it through classname[dtype, ndim=...]), under both 2.6 and 3.0. Dag Sverre From bsouthey at gmail.com Mon May 4 11:14:34 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Mon, 04 May 2009 10:14:34 -0500 Subject: [Numpy-discussion] Really strange result In-Reply-To: References: Message-ID: <49FF065A.6050307@gmail.com> Neal Becker wrote: > Charles R Harris wrote: > > >> On Fri, May 1, 2009 at 7:39 PM, Charles R Harris >> wrote: >> >> >>> On Fri, May 1, 2009 at 7:24 PM, Neal Becker wrote: >>> >>> >>>> Charles R Harris wrote: >>>> >>>> >>>>> On Fri, May 1, 2009 at 1:02 PM, Neal Becker >>>>> >>>> wrote: >>>> >>>>>> In [16]: (np.linspace (0, len (x)-1, len(x)).astype >>>>>> >>>> (np.uint64)*2).dtype >>>> >>>>>> Out[16]: dtype('uint64') >>>>>> >>>>>> In [17]: (np.linspace (0, len (x)-1, len(x)).astype >>>>>> >>>> (np.uint64)*n).dtype >>>> >>>>>> Out[17]: dtype('float64') >>>>>> >>>>>> In [18]: type(n) >>>>>> Out[18]: >>>>>> >>>>>> Now that's just strange. What's going on? >>>>>> >>>>>> >>>>>> >>>>> The n is signed, uint64 is unsigned. So a signed type that can hold >>>>> uint64 is needed. There ain't no such integer, so float64 is used. I >>>>> >>>> think >>>> >>>>> the logic here is a bit goofy myself since float64 doesn't have the >>>>> >>>> needed >>>> >>>>> 64 bit precision and the conversion from int kind to float kind is >>>>> confusing. I think it would be better to raise a NotAvailable error or >>>>> some such. Lest you think this is an isolated oddity, sometimes >>>>> numeric arrays can be converted to object arrays. >>>>> >>>>> Chuck >>>>> >>>> I don't think that any type of integer arithmetic should ever be >>>> automatically promoted to float. >>>> >>>> Besides that, what about the first example? There, I used '2' rather >>>> than >>>> 'n'. Is not '2' also an int? >>>> >>> What version of numpy are you using? >>> >>> >> And what is the value of n? >> >> > > >> Chuck >> > > np.version.version > Out[5]: '1.3.0' > (I think the previous test was on 1.2.0 and did the same thing) > > (np.linspace (0, 1023,1024).astype(np.uint64)*2).dtype > Out[2]: dtype('uint64') > > In [3]: n=-7 > > In [4]: (np.linspace (0, 1023,1024).astype(np.uint64)*n).dtype > Out[4]: dtype('float64') > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > Hi, //I think this behavior has been raised before. IIRC, Numpy is trying to do the operation that is requested by converting the dtype into floats since this is a generic solution that will avoid overflow with any ints not just unsigned ints. Note that you get a different result if you use subtraction than multiplication. 
>>> np.linspace (0, 1023,1024) array([ 0.00000000e+00, 1.00000000e+00, 2.00000000e+00, ..., 1.02100000e+03, 1.02200000e+03, 1.02300000e+03]) >>> np.linspace (0, 1023,1024).astype(np.uint64)*-7 array([ -0.00000000e+00, -7.00000000e+00, -1.40000000e+01, ..., -7.14700000e+03, -7.15400000e+03, -7.16100000e+03]) >>> np.linspace (0, 1023,1024).astype(np.uint64)-7 array([18446744073709551609, 18446744073709551610, 18446744073709551611, ..., 1014, 1015, 1016], dtype=uint64) Bruce From dfnsonfsduifb at gmx.de Mon May 4 11:40:30 2009 From: dfnsonfsduifb at gmx.de (Johannes Bauer) Date: Mon, 04 May 2009 17:40:30 +0200 Subject: [Numpy-discussion] Efficient scaling of array Message-ID: <49FF0C6E.2040609@gmx.de> Hello list, is there a possibility to scale an array by interpolation, automatically? For illustration a 1D-example would be an array of size 5, which is scaled to size 3: before: [ 1, 2, 3, 4, 5 ] 1/1 2/3 1/3 1 1/3 2/3 1 after : [ 2.33, 5, 7.66 ] The same thing should be possible in nD, with the obvious analogy. Is there such a function in numpy? Kind regards, Johannes From zachary.pincus at yale.edu Mon May 4 11:52:20 2009 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Mon, 4 May 2009 11:52:20 -0400 Subject: [Numpy-discussion] Efficient scaling of array In-Reply-To: <49FF0C6E.2040609@gmx.de> References: <49FF0C6E.2040609@gmx.de> Message-ID: scipy.ndimage.zoom (and related interpolation functions) would be a good bet -- different orders of interpolation are available, too, which can be useful. Zach On May 4, 2009, at 11:40 AM, Johannes Bauer wrote: > Hello list, > > is there a possibility to scale an array by interpolation, > automatically? For illustration a 1D-example would be an array of size > 5, which is scaled to size 3: > > before: [ 1, 2, 3, 4, 5 ] > 1/1 2/3 > 1/3 1 1/3 > 2/3 1 > after : [ 2.33, 5, 7.66 ] > > > The same thing should be possible in nD, with the obvious analogy. Is > there such a function in numpy? > > Kind regards, > Johannes > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From taste_of_r at yahoo.com Mon May 4 12:43:24 2009 From: taste_of_r at yahoo.com (Wei Su) Date: Mon, 4 May 2009 09:43:24 -0700 (PDT) Subject: [Numpy-discussion] How to convert a list into a structured array? Message-ID: <41459.25081.qm@web43503.mail.sp1.yahoo.com> Hi,All: ? My first post! I am very excited to find out structured array (record array) in Python. Since I do data manipulation every day, this is truly great. However, I typically download data using pyodbc, the default output is a big list. So I am wondering how to convert that big list into a structured array? using array() will turn it into a text array, afaik. it is even better if anybody can show me some tricks to download the data directly as a structured array. ? Thanks a lot for the help. ? Wei Su ? BTW: I am also interested in Python's ability to handle large data. Any hints or suggestion is welcome. -------------- next part -------------- An HTML attachment was scrubbed... URL: From nwagner at iam.uni-stuttgart.de Mon May 4 12:48:55 2009 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Mon, 04 May 2009 18:48:55 +0200 Subject: [Numpy-discussion] stop criterion for an alternating signal Message-ID: Hi all, How can I define a stop criterion for an alternating series ? Any pointer would be appreciated. 
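Back to Wei Su's question above about turning a pyodbc result list into a structured array: feeding a list of tuples plus an explicit dtype to np.array is usually enough. A sketch with made-up column names and types (pyodbc rows may need an explicit tuple() conversion first):

import numpy as np

rows = [(1, 'AAPL', 127.3), (2, 'MSFT', 28.1)]     # e.g. [tuple(r) for r in cursor.fetchall()]
dt = np.dtype([('id', 'i4'), ('ticker', 'S8'), ('price', 'f8')])
a = np.array(rows, dtype=dt)
print a['price'].mean()

# np.rec.fromrecords(rows, names='id,ticker,price') is an alternative that
# guesses the column types for you.
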
Nils from numpy import loadtxt, arange from pylab import plot, show A = loadtxt('alternate.dat') m = len(A) x = arange(0,m) plot(x,A) show() -------------- next part -------------- A non-text attachment was scrubbed... Name: alternate.dat Type: video/mpeg Size: 210 bytes Desc: not available URL: From charlesr.harris at gmail.com Mon May 4 12:52:59 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 4 May 2009 10:52:59 -0600 Subject: [Numpy-discussion] stop criterion for an alternating signal In-Reply-To: References: Message-ID: On Mon, May 4, 2009 at 10:48 AM, Nils Wagner wrote: > Hi all, > > How can I define a stop criterion for an alternating series ? > > Any pointer would be appreciated. > Where does the series come from and what are you trying to do? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From nwagner at iam.uni-stuttgart.de Mon May 4 12:59:57 2009 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Mon, 04 May 2009 18:59:57 +0200 Subject: [Numpy-discussion] stop criterion for an alternating signal In-Reply-To: References: Message-ID: On Mon, 4 May 2009 10:52:59 -0600 Charles R Harris wrote: > On Mon, May 4, 2009 at 10:48 AM, Nils Wagner > wrote: > >> Hi all, >> >> How can I define a stop criterion for an alternating >>series ? >> >> Any pointer would be appreciated. >> > > Where does the series come from and what are you trying >to do? > > Chuck The data come from an iterative process. I am looking for convergence criteria. It should be possible to stop the process after 10-15 iterations. Nils From dfnsonfsduifb at gmx.de Mon May 4 13:17:47 2009 From: dfnsonfsduifb at gmx.de (Johannes Bauer) Date: Mon, 04 May 2009 19:17:47 +0200 Subject: [Numpy-discussion] Efficient scaling of array In-Reply-To: References: <49FF0C6E.2040609@gmx.de> Message-ID: <49FF233B.6080608@gmx.de> Zachary Pincus schrieb: > scipy.ndimage.zoom (and related interpolation functions) would be a > good bet -- different orders of interpolation are available, too, > which can be useful. Thanks a lot - exactly what I was looking for! Kind regards, Johannes From Chris.Barker at noaa.gov Mon May 4 13:23:43 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 04 May 2009 10:23:43 -0700 Subject: [Numpy-discussion] Really strange result In-Reply-To: References: Message-ID: <49FF249F.1010205@noaa.gov> Neal Becker wrote: > In [3]: n=-7 > > In [4]: (np.linspace (0, 1023,1024).astype(np.uint64)*n).dtype > Out[4]: dtype('float64') what would you like (expect) to happen when you multiply an unsigned type by a negative number? -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From charlesr.harris at gmail.com Mon May 4 13:45:03 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 4 May 2009 11:45:03 -0600 Subject: [Numpy-discussion] stop criterion for an alternating signal In-Reply-To: References: Message-ID: On Mon, May 4, 2009 at 10:59 AM, Nils Wagner wrote: > On Mon, 4 May 2009 10:52:59 -0600 > Charles R Harris wrote: > > On Mon, May 4, 2009 at 10:48 AM, Nils Wagner > > wrote: > > > >> Hi all, > >> > >> How can I define a stop criterion for an alternating > >>series ? > >> > >> Any pointer would be appreciated. > >> > > > > Where does the series come from and what are you trying > >to do? > > > > Chuck > > The data come from an iterative process. 
> I am looking for convergence criteria. Well, the example didn't show convergence. If you are working on the harmonic series it takes an awful lot of iterations to get anywhere. So if your example is representative the algorithm needs fixing to accelerate the convergence. Assuming it actually converges and I'm not convinced of that. > > It should be possible to stop the process after 10-15 > iterations. > When an alternating series converges there is a decreasing upper bound and increasing lower bound and the difference goes to zero. Pick a cutoff and quit when the bounds are less than that apart. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.huard at gmail.com Mon May 4 15:18:20 2009 From: david.huard at gmail.com (David Huard) Date: Mon, 4 May 2009 15:18:20 -0400 Subject: [Numpy-discussion] efficient 3d histogram creation In-Reply-To: <1cd32cbb0905040400p4dcd3de7he45b3be942dc2c02@mail.gmail.com> References: <7f014ea60905031715h635a69faof6a06e10c3621ba7@mail.gmail.com> <1cd32cbb0905031736n99b1907v13de267be7639f39@mail.gmail.com> <7f014ea60905032131i114b8fdbyb9bfd04ad7d1200a@mail.gmail.com> <1cd32cbb0905040400p4dcd3de7he45b3be942dc2c02@mail.gmail.com> Message-ID: <91cf711d0905041218p70bb44ct35a601844c8c262b@mail.gmail.com> On Mon, May 4, 2009 at 7:00 AM, wrote: > On Mon, May 4, 2009 at 12:31 AM, Chris Colbert > wrote: > > this actually sort of worked. Thanks for putting me on the right track. > > > > Here is what I ended up with. > > > > this is what I ended up with: > > > > def hist3d(imgarray): > > histarray = N.zeros((16, 16, 16)) > > temp = imgarray.copy() > > bins = N.arange(0, 257, 16) > > histarray = N.histogramdd((temp[:,:,0].ravel(), temp[:,:,1].ravel(), > > temp[:,:,2].ravel()), bins=(bins, bins, bins))[0] > > return histarray > > > > this creates a 3d histogram of rgb image values in the range 0,255 using > 16 > > bins per component color. > > > > on a 640x480 image, it executes in 0.3 seconds vs 4.5 seconds for a for > > loop. > > > > not quite framerate, but good enough for prototyping. > > > > I don't think your copy to temp is necessary, and use reshape(-1,3) as > in the example of Stefan, which will avoid copying the array 3 times. > > If you need to gain some more speed, then rewriting histogramdd and > removing some of the unnecessary checks and calculations looks > possible. Indeed, the strategy used in the histogram function is faster than the one used in the histogramdd case, so porting one to the other should speed things up. David > > Josef > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sccolbert at gmail.com Mon May 4 16:00:31 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Mon, 4 May 2009 16:00:31 -0400 Subject: [Numpy-discussion] efficient 3d histogram creation In-Reply-To: <91cf711d0905041218p70bb44ct35a601844c8c262b@mail.gmail.com> References: <7f014ea60905031715h635a69faof6a06e10c3621ba7@mail.gmail.com> <1cd32cbb0905031736n99b1907v13de267be7639f39@mail.gmail.com> <7f014ea60905032131i114b8fdbyb9bfd04ad7d1200a@mail.gmail.com> <1cd32cbb0905040400p4dcd3de7he45b3be942dc2c02@mail.gmail.com> <91cf711d0905041218p70bb44ct35a601844c8c262b@mail.gmail.com> Message-ID: <7f014ea60905041300y49b48055i6df8d5e598d0fe80@mail.gmail.com> i'll take a look at them over the next few days and see what i can hack out. 
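A worked toy version of the stopping rule Chuck describes in the alternating-series thread (successive iterates of a convergent alternating sequence bracket the limit, so their gap bounds the remaining error); the Leibniz series here is only a stand-in for the real iteration:

tol = 1e-3
total, prev = 0.0, None
for k in range(100000):
    total += (-1) ** k / (2.0 * k + 1)     # alternating series -> pi/4
    if prev is not None and abs(total - prev) < tol:
        break                              # consecutive iterates bracket the limit
    prev = total
print k, total                             # stops after ~500 terms, within tol of pi/4
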
Chris On Mon, May 4, 2009 at 3:18 PM, David Huard wrote: > > > On Mon, May 4, 2009 at 7:00 AM, wrote: > >> On Mon, May 4, 2009 at 12:31 AM, Chris Colbert >> wrote: >> > this actually sort of worked. Thanks for putting me on the right track. >> > >> > Here is what I ended up with. >> > >> > this is what I ended up with: >> > >> > def hist3d(imgarray): >> > histarray = N.zeros((16, 16, 16)) >> > temp = imgarray.copy() >> > bins = N.arange(0, 257, 16) >> > histarray = N.histogramdd((temp[:,:,0].ravel(), temp[:,:,1].ravel(), >> > temp[:,:,2].ravel()), bins=(bins, bins, bins))[0] >> > return histarray >> > >> > this creates a 3d histogram of rgb image values in the range 0,255 using >> 16 >> > bins per component color. >> > >> > on a 640x480 image, it executes in 0.3 seconds vs 4.5 seconds for a for >> > loop. >> > >> > not quite framerate, but good enough for prototyping. >> > >> >> I don't think your copy to temp is necessary, and use reshape(-1,3) as >> in the example of Stefan, which will avoid copying the array 3 times. >> >> If you need to gain some more speed, then rewriting histogramdd and >> removing some of the unnecessary checks and calculations looks >> possible. > > > Indeed, the strategy used in the histogram function is faster than the one > used in the histogramdd case, so porting one to the other should speed > things up. > > David > > >> >> Josef >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bruno.piguet at gmail.com Mon May 4 16:06:54 2009 From: bruno.piguet at gmail.com (bruno Piguet) Date: Mon, 4 May 2009 22:06:54 +0200 Subject: [Numpy-discussion] loadtxt example problem ? Message-ID: Hello, I'm new to numpy, and considering using loadtxt() to read a data file. As a starter, I tried the example of the doc page ( http://docs.scipy.org/doc/numpy/reference/generated/numpy.loadtxt.html) : >>> from StringIO import StringIO # StringIO behaves like a file object >>> c = StringIO("0 1\n2 3") >>> np.loadtxt(c) I didn't get the expectd answer, but : Traceback (moste recent call last): File"(stdin)", line 1, in File "C:\Python25\lib\sire-packages\numpy\core\numeric.py", line 725, in loadtxt X = array(X, dtype) ValueError: setting an array element with a sequence. I'm using verison 1.0.4 of numpy). I got the same problem on a Ms-Windows and a Linux Machine. I could run the example by adding a \n at the end of c : c = StringIO("0 1\n2 3\n") Is it the normal and expected behaviour ? Bruno. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rmay31 at gmail.com Mon May 4 16:10:25 2009 From: rmay31 at gmail.com (Ryan May) Date: Mon, 4 May 2009 15:10:25 -0500 Subject: [Numpy-discussion] loadtxt example problem ? In-Reply-To: References: Message-ID: On Mon, May 4, 2009 at 3:06 PM, bruno Piguet wrote: > Hello, > > I'm new to numpy, and considering using loadtxt() to read a data file. 
> > As a starter, I tried the example of the doc page ( > http://docs.scipy.org/doc/numpy/reference/generated/numpy.loadtxt.html) : > > > >>> from StringIO import StringIO # StringIO behaves like a file object > >>> c = StringIO("0 1\n2 3") > >>> np.loadtxt(c) > I didn't get the expectd answer, but : > > Traceback (moste recent call last): > File"(stdin)", line 1, in > File "C:\Python25\lib\sire-packages\numpy\core\numeric.py", line 725, in loadtxt > X = array(X, dtype) > ValueError: setting an array element with a sequence. > > > I'm using verison 1.0.4 of numpy). > > I got the same problem on a Ms-Windows and a Linux Machine. > > I could run the example by adding a \n at the end of c : > c = StringIO("0 1\n2 3\n") > > > Is it the normal and expected behaviour ? > > Bruno. > > It's a bug that's been fixed. Numpy 1.0.4 is quite a bit out of date, so I'd recommend updating to the latest (1.3). Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon May 4 16:18:11 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 4 May 2009 16:18:11 -0400 Subject: [Numpy-discussion] efficient 3d histogram creation In-Reply-To: <7f014ea60905041300y49b48055i6df8d5e598d0fe80@mail.gmail.com> References: <7f014ea60905031715h635a69faof6a06e10c3621ba7@mail.gmail.com> <1cd32cbb0905031736n99b1907v13de267be7639f39@mail.gmail.com> <7f014ea60905032131i114b8fdbyb9bfd04ad7d1200a@mail.gmail.com> <1cd32cbb0905040400p4dcd3de7he45b3be942dc2c02@mail.gmail.com> <91cf711d0905041218p70bb44ct35a601844c8c262b@mail.gmail.com> <7f014ea60905041300y49b48055i6df8d5e598d0fe80@mail.gmail.com> Message-ID: <1cd32cbb0905041318g11dea0b9oa61f08bcc380144d@mail.gmail.com> On Mon, May 4, 2009 at 4:00 PM, Chris Colbert wrote: > i'll take a look at them over the next few days and see what i can hack out. > > Chris > > On Mon, May 4, 2009 at 3:18 PM, David Huard wrote: >> >> >> On Mon, May 4, 2009 at 7:00 AM, wrote: >>> >>> On Mon, May 4, 2009 at 12:31 AM, Chris Colbert >>> wrote: >>> > this actually sort of worked. Thanks for putting me on the right track. >>> > >>> > Here is what I ended up with. >>> > >>> > this is what I ended up with: >>> > >>> > def hist3d(imgarray): >>> > ??? histarray = N.zeros((16, 16, 16)) >>> > ??? temp = imgarray.copy() >>> > ????bins = N.arange(0, 257, 16) >>> > ??? histarray = N.histogramdd((temp[:,:,0].ravel(), >>> > temp[:,:,1].ravel(), >>> > temp[:,:,2].ravel()), bins=(bins, bins, bins))[0] >>> > ??? return histarray >>> > >>> > this creates a 3d histogram of rgb image values in the range 0,255 >>> > using 16 >>> > bins per component color. >>> > >>> > on a 640x480 image, it executes in 0.3 seconds vs 4.5 seconds for a for >>> > loop. >>> > >>> > not quite framerate, but good enough for prototyping. >>> > >>> >>> I don't think your copy to temp is necessary, and use reshape(-1,3) as >>> in the example of Stefan, which will avoid copying the array 3 times. >>> >>> If you need to gain some more speed, then rewriting histogramdd and >>> removing some of the unnecessary checks and calculations looks >>> possible. >> >> Indeed, the strategy used in the histogram function is faster than the one >> used in the histogramdd case, so porting one to the other should speed >> things up. >> >> David is searchsorted faster than digitize and bincount ? Using the idea of histogramdd, I get a bit below a tenth of a second, my best for this problem is below. 
I was trying for a while what the fastest way is to convert a two dimensional array into a one dimensional index for bincount. I found that using the return index of unique1d is very slow compared to numeric index calculation. Josef example timed for: nobs = 307200 nbins = 16 factors = np.random.randint(256,size=(nobs,3)).copy() factors2 = factors.reshape(-1,480,3).copy() def hist3(factorsin, nbins): if factorsin.ndim != 2: factors = factorsin.reshape(-1,factorsin.shape[-1]) else: factors = factorsin N, D = factors.shape darr = np.empty(factors.T.shape, dtype=int) nele = np.max(factors)+1 bins = np.arange(0, nele, nele/nbins) bins[-1] += 1 for i in range(D): darr[i] = np.digitize(factors[:,i],bins) - 1 #add weighted rows darrind = darr[D-1] for i in range(D-1): darrind += darr[i]*nbins**(D-i-1) return np.bincount(darrind) # return flat not reshaped From dwf at cs.toronto.edu Mon May 4 16:55:59 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Mon, 4 May 2009 16:55:59 -0400 Subject: [Numpy-discussion] object arrays and == Message-ID: <33C0A55A-45AB-45A5-8F75-DF99DF4182F1@cs.toronto.edu> Hi, Is there a simple way to compare each element of an object array to a single object? objarray == None, for example, gives me a single "False". I couldn't find any reference to it in the documentation, but I'll admit, I wasn't quite sure where to look. David From rmay31 at gmail.com Mon May 4 17:02:52 2009 From: rmay31 at gmail.com (Ryan May) Date: Mon, 4 May 2009 16:02:52 -0500 Subject: [Numpy-discussion] object arrays and == In-Reply-To: <33C0A55A-45AB-45A5-8F75-DF99DF4182F1@cs.toronto.edu> References: <33C0A55A-45AB-45A5-8F75-DF99DF4182F1@cs.toronto.edu> Message-ID: On Mon, May 4, 2009 at 3:55 PM, David Warde-Farley wrote: > Hi, > > Is there a simple way to compare each element of an object array to a > single object? objarray == None, for example, gives me a single > "False". I couldn't find any reference to it in the documentation, but > I'll admit, I wasn't quite sure where to look. > I think it might depend on some factors: In [1]: a = np.array(['a','b'], dtype=np.object) In [2]: a=='a' Out[2]: array([ True, False], dtype=bool) In [3]: a==None Out[3]: False In [4]: a == [] Out[4]: False In [5]: a == '' Out[5]: array([False, False], dtype=bool) In [6]: a == dict() Out[6]: array([False, False], dtype=bool) In [7]: numpy.__version__ Out[7]: '1.4.0.dev6885' In [8]: a == 5 Out[8]: array([False, False], dtype=bool) In [9]: a == 5. Out[9]: array([False, False], dtype=bool) But based on these results, I have no idea what the factors might be. I know this works with datetime objects, but I'm really not sure why None and the empty list don't work. Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma Sent from Norman, Oklahoma, United States -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Mon May 4 17:19:59 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Mon, 4 May 2009 14:19:59 -0700 Subject: [Numpy-discussion] object arrays and == In-Reply-To: References: <33C0A55A-45AB-45A5-8F75-DF99DF4182F1@cs.toronto.edu> Message-ID: On Mon, May 4, 2009 at 2:02 PM, Ryan May wrote: > On Mon, May 4, 2009 at 3:55 PM, David Warde-Farley > wrote: >> >> Hi, >> >> Is there a simple way to compare each element of an object array to a >> single object? objarray == None, for example, gives me a single >> "False". 
I couldn't find any reference to it in the documentation, but >> I'll admit, I wasn't quite sure where to look. > > I think it might depend on some factors: > > In [1]: a = np.array(['a','b'], dtype=np.object) > > In [2]: a=='a' > Out[2]: array([ True, False], dtype=bool) > > In [3]: a==None > Out[3]: False > > In [4]: a == [] > Out[4]: False > > In [5]: a == '' > Out[5]: array([False, False], dtype=bool) > > In [6]: a == dict() > Out[6]: array([False, False], dtype=bool) > > In [7]: numpy.__version__ > Out[7]: '1.4.0.dev6885' > > In [8]: a == 5 > Out[8]: array([False, False], dtype=bool) > > In [9]: a == 5. > Out[9]: array([False, False], dtype=bool) > > But based on these results, I have no idea what the factors might be.? I > know this works with datetime objects, but I'm really not sure why None and > the empty list don't work. Doing a little poking around, I found this: >> a = np.array([True, False]) >> a == None False >> np.equal(a, None) array([False, False], dtype=bool) From python at beyondcode.org Mon May 4 18:19:51 2009 From: python at beyondcode.org (Philipp K. Janert) Date: Mon, 4 May 2009 15:19:51 -0700 Subject: [Numpy-discussion] SVD failure Message-ID: <200905041519.52056.python@beyondcode.org> The following code: from scipy import * from scipy import linalg m = matrix( [ [1,1,0,0], [1,1,0,0], [0,0,1,1], [0,0,1,1] ] ) u,s,v = linalg.svd( m ) fails with the following message: Traceback (most recent call last): File "boo.py", line 10, in u,s,v = linalg.svd( m ) File "/usr/lib64/python2.6/site-packages/scipy/linalg/decomp.py", line 509, in svd lwork = calc_lwork.gesdd(gesdd.prefix,m,n,compute_uv)[1] RuntimeError: more argument specifiers than keyword list entries (remaining format:'|:calc_lwork.gesdd') On the other hand, calculating la, v = eig( m ) works just fine. If I see this correctly, my SciPy version is 0.6.0; running on 64bit Suse 11. Any thoughts? Best, Ph. From bolme1234 at comcast.net Mon May 4 18:58:52 2009 From: bolme1234 at comcast.net (David Bolme) Date: Mon, 4 May 2009 16:58:52 -0600 Subject: [Numpy-discussion] FaceL - Facile Face Labeling Message-ID: I have been working on a open source face recognition demo tool called FaceL for the past few months. FaceL is a simple and fun face processing and labeling tool that labels faces in a live video from an iSight camera or webcam. It uses OpenCV for face detection, ASEF correlation filters for eye localization, and a Support Vector Machine for face classification. FaceL is implemented in python (using PyVision, wxPython, SciPy, OpenCV, libsvm, and PIL libraries) with a binary executable available on Mac OS 10.5 Intel. (Windows and Linux versions expected soon.) FaceL has been fun to work on and I thought that the NumPy community might like to see their software in action. FaceL source code, videos, and executable can be found at: http://pyvision.sourceforge.net/facel -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Mon May 4 19:21:29 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 5 May 2009 01:21:29 +0200 Subject: [Numpy-discussion] SVD failure In-Reply-To: <200905041519.52056.python@beyondcode.org> References: <200905041519.52056.python@beyondcode.org> Message-ID: <9457e7c80905041621n64404f0apeb47260adbd8dc21@mail.gmail.com> Hi Philipp 2009/5/5 Philipp K. Janert : > If I see this correctly, my SciPy version > is 0.6.0; running on 64bit Suse 11. 
SciPy 0.6 is quite old, and it is likely that the problem was fixed in the mean time. On SciPy 0.7 I see: In [31]: u,s,v = linalg.svd(m) In [32]: u Out[32]: array([[-0.70710678, 0. , -0.70710678, 0. ], [-0.70710678, 0. , 0.70710678, 0. ], [ 0. , -0.70710678, 0. , -0.70710678], [ 0. , -0.70710678, 0. , 0.70710678]]) In [33]: s Out[33]: array([ 2., 2., 0., 0.]) In [34]: v Out[34]: array([[-0.70710678, -0.70710678, -0. , -0. ], [-0. , -0. , -0.70710678, -0.70710678], [-0.70710678, 0.70710678, 0. , 0. ], [ 0. , 0. , -0.70710678, 0.70710678]]) Regards St?fan From stefan at sun.ac.za Mon May 4 19:32:28 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 5 May 2009 01:32:28 +0200 Subject: [Numpy-discussion] FaceL - Facile Face Labeling In-Reply-To: References: Message-ID: <9457e7c80905041632x6cd497e8t22d14d1f7b489d9@mail.gmail.com> 2009/5/5 David Bolme : > I have been working on a?open source face recognition demo tool called > FaceL?for the past few months. ? FaceL is a simple and fun face processing > and labeling tool that labels faces in a live video from an iSight camera or > webcam. That's really neat -- I've always wanted to rig our office with biometric access control, and now you've provided a key ingredient! Thanks for sharing, David. Cheers St?fan From python at beyondcode.org Mon May 4 19:39:40 2009 From: python at beyondcode.org (Philipp K. Janert) Date: Mon, 4 May 2009 16:39:40 -0700 Subject: [Numpy-discussion] SVD failure In-Reply-To: <9457e7c80905041621n64404f0apeb47260adbd8dc21@mail.gmail.com> References: <200905041519.52056.python@beyondcode.org> <9457e7c80905041621n64404f0apeb47260adbd8dc21@mail.gmail.com> Message-ID: <200905041639.41083.python@beyondcode.org> Thanks for the quick reply. I'll try upgrading. Best, Ph. On Monday 04 May 2009 04:21:29 pm St?fan van der Walt wrote: > Hi Philipp > > 2009/5/5 Philipp K. Janert : > > If I see this correctly, my SciPy version > > is 0.6.0; running on 64bit Suse 11. > > SciPy 0.6 is quite old, and it is likely that the problem was fixed in > the mean time. > > On SciPy 0.7 I see: > > In [31]: u,s,v = linalg.svd(m) > > In [32]: u > Out[32]: > array([[-0.70710678, 0. , -0.70710678, 0. ], > [-0.70710678, 0. , 0.70710678, 0. ], > [ 0. , -0.70710678, 0. , -0.70710678], > [ 0. , -0.70710678, 0. , 0.70710678]]) > > In [33]: s > Out[33]: array([ 2., 2., 0., 0.]) > > In [34]: v > Out[34]: > array([[-0.70710678, -0.70710678, -0. , -0. ], > [-0. , -0. , -0.70710678, -0.70710678], > [-0.70710678, 0.70710678, 0. , 0. ], > [ 0. , 0. , -0.70710678, 0.70710678]]) > > Regards > St?fan > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From matthew.brett at gmail.com Tue May 5 00:56:22 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 4 May 2009 21:56:22 -0700 Subject: [Numpy-discussion] Structured array with no fields - possible? In-Reply-To: <49FD380A.9080600@ar.media.kyoto-u.ac.jp> References: <1e2af89e0905022154o375b48b7u1f27da260f7286eb@mail.gmail.com> <49FD380A.9080600@ar.media.kyoto-u.ac.jp> Message-ID: <1e2af89e0905042156x6aad087bj19109de1916b299e@mail.gmail.com> Hi, >> I'm trying to fix a bug in the scipy matlab loading routines, and this >> requires me to somehow represent an empty structured array. >> > > Do you need the struct to be empty (size is 0) or to have no fields ? > What would you expect np.zeros((), dtype=np.dtype([])) to return, for > example ? 
Yes, I've got nothing - I have no idea what that might return. I'm afraid what I need is some way of representing the fact that I have read, from matlab, a structure with no fields (and therefore no data) that can be - say - shape (10,2) - or any other. Some time ago we thought of switching to structured arrays to represent matlab structs, but this begins to make me think again. Thanks for the replies, Matthew From sccolbert at gmail.com Tue May 5 01:38:37 2009 From: sccolbert at gmail.com (S. Chris Colbert) Date: Tue, 5 May 2009 01:38:37 -0400 Subject: [Numpy-discussion] Structured array with no fields - possible? In-Reply-To: <1e2af89e0905042156x6aad087bj19109de1916b299e@mail.gmail.com> References: <1e2af89e0905022154o375b48b7u1f27da260f7286eb@mail.gmail.com><49FD380A.9080600@ar.media.kyoto-u.ac.jp> <1e2af89e0905042156x6aad087bj19109de1916b299e@mail.gmail.com> Message-ID: you can create an array without changing the values of the allocated memory by using numpy.empty() or numpy.ndarray() this will allow you to create an array of any size without specifying the contents beforehand. I'm not sure what you mean by "empty", because any memory address will have a value, whether its assigned to or not. Chris -------------------------------------------------- From: "Matthew Brett" Sent: Tuesday, May 05, 2009 12:56 AM To: "Discussion of Numerical Python" Subject: Re: [Numpy-discussion] Structured array with no fields - possible? > Hi, > >>> I'm trying to fix a bug in the scipy matlab loading routines, and this >>> requires me to somehow represent an empty structured array. >>> >> >> Do you need the struct to be empty (size is 0) or to have no fields ? >> What would you expect np.zeros((), dtype=np.dtype([])) to return, for >> example ? > > Yes, I've got nothing - I have no idea what that might return. > > I'm afraid what I need is some way of representing the fact that I > have read, from matlab, a structure with no fields (and therefore no > data) that can be - say - shape (10,2) - or any other. > > Some time ago we thought of switching to structured arrays to > represent matlab structs, but this begins to make me think again. > > Thanks for the replies, > > Matthew > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From faltet at pytables.org Tue May 5 03:10:25 2009 From: faltet at pytables.org (Francesc Alted) Date: Tue, 5 May 2009 09:10:25 +0200 Subject: [Numpy-discussion] How to convert a list into a structured array? In-Reply-To: <41459.25081.qm@web43503.mail.sp1.yahoo.com> References: <41459.25081.qm@web43503.mail.sp1.yahoo.com> Message-ID: <200905050910.25603.faltet@pytables.org> Welcome Wei! A Monday 04 May 2009, Wei Su escrigu?: > Hi,All: > ? > My first post! I am very excited to find out structured array (record > array) in Python. Since I do data manipulation every day, this is > truly great. However, I typically download data using pyodbc, the > default output is a big list. So I am wondering how to convert that > big list into a structured array? using array() will turn it into a > text array, afaik. it is even better if anybody can show me some > tricks to download the data directly as a structured array. > Thanks a lot for the help. Please, could you provide an example of the list that you are getting from your database? With that we can probably figure out your needs much better. > BTW: I am also interested in Python's ability to handle large data. 
> Any hints or suggestion is welcome. This is also a bit generic question. What kind of data you have to deal with? What sort of operations do you want to perform over it? Do you need a lot of speed or flexibility is more important? Some example? Cheers, -- Francesc Alted "One would expect people to feel threatened by the 'giant brains or machines that think'. In fact, the frightening computer becomes less frightening if it is used only to simulate a familiar noncomputer." -- Edsger W. Dykstra "On the cruelty of really teaching computer science" From cournape at gmail.com Tue May 5 05:33:09 2009 From: cournape at gmail.com (David Cournapeau) Date: Tue, 5 May 2009 18:33:09 +0900 Subject: [Numpy-discussion] [review] py3k_bootstrap branch Message-ID: <5b8d13220905050233m14c4a1e1l80dd6f231e100d5a@mail.gmail.com> Hi, I spent some more time on making numpy.distutils runnable under python 3. I finally made up to the point where it breaks at C code compilation, so we can start working on the hard part. The branch is there for review http://github.com/cournape/numpy/commits/py3k_bootstrap The code is quite ugly to be honest, but I have not found a better way; suggestions are welcomed. The biggest pain is by far exception catching (you can't do except IOError, e in python 3), and then print. Most other things can be handled by careful application of 2to3 with the fixers which keep python2 compatibility (print is unfortunately not one of them). There are also a few python 3.* bugs in distutils (I guess few C-based extensions made it for python 3 already). The rationale for making numpy.distutils runnable under both python2 and python3 (instead of just applying 2to3 on it): - it enables us to bootstrap our build process through the distutils 2to3 command (which is supposed to convert code to python 3 from python 2 sources on the fly). - The few informations I found on non trivial port all made sure their setup.py was python 2 and 3 compatible - which means numpy.distutils for us. - 2to3 is very slow (takes 5 minutes for me on numpy), so having to apply it every time from pristine source for python 3 support would be very painful IMHO. cheers, David From timmichelsen at gmx-topmail.de Tue May 5 05:56:55 2009 From: timmichelsen at gmx-topmail.de (Timmie) Date: Tue, 5 May 2009 09:56:55 +0000 (UTC) Subject: [Numpy-discussion] numpy docstrings Message-ID: Hello, is it possible to add sections to the allowed sections in the numpy docstring standard? What to think of a section like: Todo ---- Regards, Timmie From robert.kern at gmail.com Tue May 5 07:34:19 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 5 May 2009 07:34:19 -0400 Subject: [Numpy-discussion] numpy docstrings In-Reply-To: References: Message-ID: <3d375d730905050434l32cec624wc3fb562c3a4e1880@mail.gmail.com> On Tue, May 5, 2009 at 05:56, Timmie wrote: > Hello, > is it possible to add sections to the allowed sections in the numpy docstring > standard? > > What to think of a section like: > > Todo > ---- I prefer to keep such things in comments rather than docstrings, myself. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From david.huard at gmail.com Tue May 5 09:46:38 2009 From: david.huard at gmail.com (David Huard) Date: Tue, 5 May 2009 09:46:38 -0400 Subject: [Numpy-discussion] efficient 3d histogram creation In-Reply-To: <1cd32cbb0905041318g11dea0b9oa61f08bcc380144d@mail.gmail.com> References: <7f014ea60905031715h635a69faof6a06e10c3621ba7@mail.gmail.com> <1cd32cbb0905031736n99b1907v13de267be7639f39@mail.gmail.com> <7f014ea60905032131i114b8fdbyb9bfd04ad7d1200a@mail.gmail.com> <1cd32cbb0905040400p4dcd3de7he45b3be942dc2c02@mail.gmail.com> <91cf711d0905041218p70bb44ct35a601844c8c262b@mail.gmail.com> <7f014ea60905041300y49b48055i6df8d5e598d0fe80@mail.gmail.com> <1cd32cbb0905041318g11dea0b9oa61f08bcc380144d@mail.gmail.com> Message-ID: <91cf711d0905050646q7652eepc55d1aed17b1d1ed@mail.gmail.com> On Mon, May 4, 2009 at 4:18 PM, wrote: > On Mon, May 4, 2009 at 4:00 PM, Chris Colbert wrote: > > i'll take a look at them over the next few days and see what i can hack > out. > > > > Chris > > > > On Mon, May 4, 2009 at 3:18 PM, David Huard > wrote: > >> > >> > >> On Mon, May 4, 2009 at 7:00 AM, wrote: > >>> > >>> On Mon, May 4, 2009 at 12:31 AM, Chris Colbert > >>> wrote: > >>> > this actually sort of worked. Thanks for putting me on the right > track. > >>> > > >>> > Here is what I ended up with. > >>> > > >>> > this is what I ended up with: > >>> > > >>> > def hist3d(imgarray): > >>> > histarray = N.zeros((16, 16, 16)) > >>> > temp = imgarray.copy() > >>> > bins = N.arange(0, 257, 16) > >>> > histarray = N.histogramdd((temp[:,:,0].ravel(), > >>> > temp[:,:,1].ravel(), > >>> > temp[:,:,2].ravel()), bins=(bins, bins, bins))[0] > >>> > return histarray > >>> > > >>> > this creates a 3d histogram of rgb image values in the range 0,255 > >>> > using 16 > >>> > bins per component color. > >>> > > >>> > on a 640x480 image, it executes in 0.3 seconds vs 4.5 seconds for a > for > >>> > loop. > >>> > > >>> > not quite framerate, but good enough for prototyping. > >>> > > >>> > >>> I don't think your copy to temp is necessary, and use reshape(-1,3) as > >>> in the example of Stefan, which will avoid copying the array 3 times. > >>> > >>> If you need to gain some more speed, then rewriting histogramdd and > >>> removing some of the unnecessary checks and calculations looks > >>> possible. > >> > >> Indeed, the strategy used in the histogram function is faster than the > one > >> used in the histogramdd case, so porting one to the other should speed > >> things up. > >> > >> David > > is searchsorted faster than digitize and bincount ? > That depends on the number of bins and whether or not the bin width is uniform. A 1D benchmark I did a while ago showed that if the bin width is uniform, then the best strategy is to create a counter initialized to 0, loop through the data, compute i = (x-bin0) /binwidth and increment counter i by 1 (or by the weight of the data). If the bins are non uniform, then for nbin > 30 you'd better use searchsort, and digitize otherwise. For those interested in speeding up histogram code, I recommend reading a thread started by Cameron Walsh on the 12/12/06 named "Histograms of extremely large data sets" Code and benchmarks were posted. Chris, if your bins all have the same width, then you can certainly write an histogramdd routine that is way faster by using the indexing trick instead of digitize or searchsort. Cheers, David > > Using the idea of histogramdd, I get a bit below a tenth of a second, > my best for this problem is below. 
> I was trying for a while what the fastest way is to convert a two > dimensional array into a one dimensional index for bincount. I found > that using the return index of unique1d is very slow compared to > numeric index calculation. > > Josef > > example timed for: > nobs = 307200 > nbins = 16 > factors = np.random.randint(256,size=(nobs,3)).copy() > factors2 = factors.reshape(-1,480,3).copy() > > def hist3(factorsin, nbins): > if factorsin.ndim != 2: > factors = factorsin.reshape(-1,factorsin.shape[-1]) > else: > factors = factorsin > N, D = factors.shape > darr = np.empty(factors.T.shape, dtype=int) > nele = np.max(factors)+1 > bins = np.arange(0, nele, nele/nbins) > bins[-1] += 1 > for i in range(D): > darr[i] = np.digitize(factors[:,i],bins) - 1 > > #add weighted rows > darrind = darr[D-1] > for i in range(D-1): > darrind += darr[i]*nbins**(D-i-1) > return np.bincount(darrind) # return flat not reshaped > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From malkarouri at yahoo.co.uk Tue May 5 10:36:22 2009 From: malkarouri at yahoo.co.uk (Muhammad Alkarouri) Date: Tue, 5 May 2009 14:36:22 +0000 (GMT) Subject: [Numpy-discussion] linalg.svd not working? Message-ID: <164180.3450.qm@web24203.mail.ird.yahoo.com> Hi everyone, I have installed numpy 1.3.0 on Python 2.5.1 in an x86_64 machine, and it hangs when I do a numpy.test(verbose=10) on test_pinv (test_defmatrix.TestProperties) ... which I believe hangs on a call to numpy.linalg.svd. Can you please help me with this problem? The installation and configuration is probably a bit non-standard, so I am including the output of python -m numpy.distutils.system_info below. In particular, all the installation was done using CC='gcc -m32' to enforce 32 bit executables, as Python is a 32 bit executable here. 
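Picking up the equal-width "indexing trick" described in the histogram thread above (bin index = value // bin_width, then a single bincount), a rough sketch for the RGB case; it assumes integer channel values in [0, vmax), the names are illustrative, and nothing here has been benchmarked:

import numpy as np

def hist3d_uniform(rgb, nbins=16, vmax=256):
    # With equal-width bins the bin index is plain integer division, so
    # digitize/searchsorted are not needed at all.
    rgb = rgb.reshape(-1, 3)
    width = vmax // nbins
    idx = (rgb // width).astype(np.int64)   # per-channel bin index, 0..nbins-1
    # Collapse the three indices into one flat index for bincount.
    flat = (idx[:, 0] * nbins + idx[:, 1]) * nbins + idx[:, 2]
    counts = np.zeros(nbins ** 3, dtype=int)
    b = np.bincount(flat)
    counts[:b.size] = b                     # pad any missing trailing bins
    return counts.reshape(nbins, nbins, nbins)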
Regards, Muhammad Alkarouri lapack_info: libraries lapack not found in /GWD/appbase/common/lib libraries lapack not found in /usr/local/lib FOUND: libraries = ['lapack'] library_dirs = ['/usr/lib'] language = f77 lapack_opt_info: lapack_mkl_info: mkl_info: libraries mkl,vml,guide not found in /GWD/appbase/common/lib libraries mkl,vml,guide not found in /usr/local/lib libraries mkl,vml,guide not found in /usr/lib NOT AVAILABLE NOT AVAILABLE atlas_threads_info: Setting PTATLAS=ATLAS libraries lapack_atlas not found in /users/d88/ma856388/lib __main__.atlas_threads_info Setting PTATLAS=ATLAS Setting PTATLAS=ATLAS FOUND: libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas'] library_dirs = ['/users/d88/ma856388/lib'] language = f77 customize GnuFCompiler Found executable /usr/bin/g77 gnu: no Fortran 90 compiler found gnu: no Fortran 90 compiler found customize GnuFCompiler gnu: no Fortran 90 compiler found gnu: no Fortran 90 compiler found customize GnuFCompiler using config compiling '_configtest.c': /* This file is generated from numpy/distutils/system_info.py */ void ATL_buildinfo(void); int main(void) { ATL_buildinfo(); return 0; } C compiler: gcc -m32 -fno-strict-aliasing -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC compile options: '-c' gcc: _configtest.c gcc -m32 _configtest.o -m32 -fPIC -L/users/d88/ma856388/lib -llapack -lptf77blas -lptcblas -latlas -o _configtest ATLAS version 3.8.3 built by ma856388 on Fri May 1 14:56:25 BST 2009: UNAME : Linux stvwolx028 2.6.9-22.ELsmp #1 SMP Mon Sep 19 18:00:54 EDT 2005 x86_64 x86_64 x86_64 GNU/Linux INSTFLG : -1 0 -a 1 ARCHDEFS : -DATL_OS_Linux -DATL_ARCH_P4E -DATL_CPUMHZ=3200 -DATL_SSE3 -DATL_SSE2 -DATL_SSE1 -DATL_GAS_x8632 F2CDEFS : -DAdd__ -DF77_INTEGER=int -DStringSunStyle CACHEEDGE: 8388608 F77 : g77, version GNU Fortran (GCC) 3.4.4 20050721 (Red Hat 3.4.4-2) F77FLAGS : -fomit-frame-pointer -mfpmath=387 -O2 -falign-loops=4 -fPIC -m32 -fPIC -m32 SMC : gcc, version gcc (GCC) 3.4.4 20050721 (Red Hat 3.4.4-2) SMCFLAGS : -fomit-frame-pointer -mfpmath=387 -O2 -falign-loops=4 -fPIC -m32 SKC : gcc, version gcc (GCC) 3.4.4 20050721 (Red Hat 3.4.4-2) SKCFLAGS : -fomit-frame-pointer -mfpmath=387 -O2 -falign-loops=4 -fPIC -m32 success! removing: _configtest.c _configtest.o _configtest FOUND: libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas'] library_dirs = ['/users/d88/ma856388/lib'] language = f77 define_macros = [('ATLAS_INFO', '"\\"3.8.3\\""')] wx_info: Could not locate executable wx-config File not found: None. Cannot determine wx info. 
NOT AVAILABLE lapack_atlas_info: libraries lapack_atlas,f77blas,cblas,atlas not found in /users/d88/ma856388/lib libraries lapack_atlas not found in /users/d88/ma856388/lib libraries lapack_atlas,f77blas,cblas,atlas not found in /GWD/appbase/common/lib libraries lapack_atlas not found in /GWD/appbase/common/lib libraries lapack_atlas,f77blas,cblas,atlas not found in /usr/local/lib libraries lapack_atlas not found in /usr/local/lib libraries lapack_atlas,f77blas,cblas,atlas not found in /usr/lib/sse2 libraries lapack_atlas not found in /usr/lib/sse2 libraries lapack_atlas,f77blas,cblas,atlas not found in /usr/lib libraries lapack_atlas not found in /usr/lib __main__.lapack_atlas_info NOT AVAILABLE umfpack_info: libraries umfpack not found in /GWD/appbase/common/lib libraries umfpack not found in /usr/local/lib libraries umfpack not found in /usr/lib NOT AVAILABLE _pkg_config_info: Found executable /usr/bin/pkg-config NOT AVAILABLE lapack_atlas_threads_info: Setting PTATLAS=ATLAS libraries lapack_atlas,ptf77blas,ptcblas,atlas not found in /users/d88/ma856388/lib libraries lapack_atlas not found in /users/d88/ma856388/lib libraries lapack_atlas,ptf77blas,ptcblas,atlas not found in /GWD/appbase/common/lib libraries lapack_atlas not found in /GWD/appbase/common/lib libraries lapack_atlas,ptf77blas,ptcblas,atlas not found in /usr/local/lib libraries lapack_atlas not found in /usr/local/lib libraries lapack_atlas,ptf77blas,ptcblas,atlas not found in /usr/lib/sse2 libraries lapack_atlas not found in /usr/lib/sse2 libraries lapack_atlas,ptf77blas,ptcblas,atlas not found in /usr/lib libraries lapack_atlas not found in /usr/lib __main__.lapack_atlas_threads_info NOT AVAILABLE x11_info: FOUND: libraries = ['X11'] library_dirs = ['/usr/X11R6/lib'] blas_info: libraries blas not found in /GWD/appbase/common/lib libraries blas not found in /usr/local/lib FOUND: libraries = ['blas'] library_dirs = ['/usr/lib'] language = f77 fftw_info: libraries fftw3 not found in /GWD/appbase/common/lib libraries fftw3 not found in /usr/local/lib libraries fftw3 not found in /usr/lib fftw3 not found libraries rfftw,fftw not found in /GWD/appbase/common/lib libraries rfftw,fftw not found in /usr/local/lib libraries rfftw,fftw not found in /usr/lib fftw2 not found NOT AVAILABLE f2py_info: FOUND: sources = ['/users/d88/ma856388/lib/python/numpy/f2py/src/fortranobject.c'] include_dirs = ['/users/d88/ma856388/lib/python/numpy/f2py/src'] gdk_pixbuf_xlib_2_info: FOUND: libraries = ['gdk_pixbuf_xlib-2.0', 'gdk_pixbuf-2.0', 'm', 'gobject-2.0', 'gmodule-2.0', 'dl', 'glib-2.0'] extra_link_args = ['-Wl,--export-dynamic'] define_macros = [('GDK_PIXBUF_XLIB_2_INFO', '"\\"2.4.13\\""'), ('GDK_PIXBUF_XLIB_VERSION_2_4_13', None)] include_dirs = ['/usr/include/gtk-2.0', '/usr/include/glib-2.0', '/usr/lib64/glib-2.0/include'] dfftw_threads_info: libraries drfftw_threads,dfftw_threads not found in /GWD/appbase/common/lib libraries drfftw_threads,dfftw_threads not found in /usr/local/lib libraries drfftw_threads,dfftw_threads not found in /usr/lib dfftw threads not found NOT AVAILABLE atlas_blas_info: FOUND: libraries = ['f77blas', 'cblas', 'atlas'] library_dirs = ['/users/d88/ma856388/lib'] language = c fftw3_info: libraries fftw3 not found in /GWD/appbase/common/lib libraries fftw3 not found in /usr/local/lib libraries fftw3 not found in /usr/lib fftw3 not found NOT AVAILABLE blas_opt_info: blas_mkl_info: libraries mkl,vml,guide not found in /GWD/appbase/common/lib libraries mkl,vml,guide not found in /usr/local/lib libraries mkl,vml,guide 
not found in /usr/lib NOT AVAILABLE atlas_blas_threads_info: Setting PTATLAS=ATLAS Setting PTATLAS=ATLAS Setting PTATLAS=ATLAS FOUND: libraries = ['ptf77blas', 'ptcblas', 'atlas'] library_dirs = ['/users/d88/ma856388/lib'] language = c customize GnuFCompiler gnu: no Fortran 90 compiler found gnu: no Fortran 90 compiler found customize GnuFCompiler gnu: no Fortran 90 compiler found gnu: no Fortran 90 compiler found customize GnuFCompiler using config compiling '_configtest.c': /* This file is generated from numpy/distutils/system_info.py */ void ATL_buildinfo(void); int main(void) { ATL_buildinfo(); return 0; } C compiler: gcc -m32 -fno-strict-aliasing -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC compile options: '-c' gcc: _configtest.c gcc -m32 _configtest.o -m32 -fPIC -L/users/d88/ma856388/lib -lptf77blas -lptcblas -latlas -o _configtest ATLAS version 3.8.3 built by ma856388 on Fri May 1 14:56:25 BST 2009: UNAME : Linux stvwolx028 2.6.9-22.ELsmp #1 SMP Mon Sep 19 18:00:54 EDT 2005 x86_64 x86_64 x86_64 GNU/Linux INSTFLG : -1 0 -a 1 ARCHDEFS : -DATL_OS_Linux -DATL_ARCH_P4E -DATL_CPUMHZ=3200 -DATL_SSE3 -DATL_SSE2 -DATL_SSE1 -DATL_GAS_x8632 F2CDEFS : -DAdd__ -DF77_INTEGER=int -DStringSunStyle CACHEEDGE: 8388608 F77 : g77, version GNU Fortran (GCC) 3.4.4 20050721 (Red Hat 3.4.4-2) F77FLAGS : -fomit-frame-pointer -mfpmath=387 -O2 -falign-loops=4 -fPIC -m32 -fPIC -m32 SMC : gcc, version gcc (GCC) 3.4.4 20050721 (Red Hat 3.4.4-2) SMCFLAGS : -fomit-frame-pointer -mfpmath=387 -O2 -falign-loops=4 -fPIC -m32 SKC : gcc, version gcc (GCC) 3.4.4 20050721 (Red Hat 3.4.4-2) SKCFLAGS : -fomit-frame-pointer -mfpmath=387 -O2 -falign-loops=4 -fPIC -m32 success! removing: _configtest.c _configtest.o _configtest FOUND: libraries = ['ptf77blas', 'ptcblas', 'atlas'] library_dirs = ['/users/d88/ma856388/lib'] language = c define_macros = [('ATLAS_INFO', '"\\"3.8.3\\""')] sfftw_info: libraries srfftw,sfftw not found in /GWD/appbase/common/lib libraries srfftw,sfftw not found in /usr/local/lib libraries srfftw,sfftw not found in /usr/lib sfftw not found NOT AVAILABLE xft_info: FOUND: libraries = ['Xft', 'X11', 'freetype', 'Xrender', 'fontconfig'] library_dirs = ['/usr/X11R6/lib64'] define_macros = [('XFT_INFO', '"\\"2.1.2.2\\""'), ('XFT_VERSION_2_1_2_2', None)] include_dirs = ['/usr/X11R6/include', '/usr/include/freetype2', '/usr/include/freetype2/config'] fft_opt_info: fftw2_info: libraries rfftw,fftw not found in /GWD/appbase/common/lib libraries rfftw,fftw not found in /usr/local/lib libraries rfftw,fftw not found in /usr/lib fftw2 not found NOT AVAILABLE dfftw_info: libraries drfftw,dfftw not found in /GWD/appbase/common/lib libraries drfftw,dfftw not found in /usr/local/lib libraries drfftw,dfftw not found in /usr/lib dfftw not found NOT AVAILABLE djbfft_info: NOT AVAILABLE NOT AVAILABLE gdk_x11_2_info: FOUND: libraries = ['gdk-x11-2.0', 'gdk_pixbuf-2.0', 'm', 'pangoxft-1.0', 'pangox-1.0', 'pango-1.0', 'gobject-2.0', 'gmodule-2.0', 'dl', 'glib-2.0'] extra_link_args = ['-Wl,--export-dynamic'] define_macros = [('GDK_X11_2_INFO', '"\\"2.4.13\\""'), ('GDK_X11_VERSION_2_4_13', None), ('XTHREADS', None), ('_REENTRANT', None), ('XUSE_MTSAFE_API', None)] include_dirs = ['/usr/include/gtk-2.0', '/usr/lib64/gtk-2.0/include', '/usr/X11R6/include', '/usr/include/pango-1.0', '/usr/include/freetype2', '/usr/include/freetype2/config', '/usr/include/glib-2.0', '/usr/lib64/glib-2.0/include'] agg2_info: NOT AVAILABLE numarray_info: NOT AVAILABLE blas_src_info: NOT AVAILABLE fftw_threads_info: libraries 
rfftw_threads,fftw_threads not found in /GWD/appbase/common/lib libraries rfftw_threads,fftw_threads not found in /usr/local/lib libraries rfftw_threads,fftw_threads not found in /usr/lib fftw threads not found NOT AVAILABLE _numpy_info: FOUND: define_macros = [('NUMERIC_VERSION', '"\\"23.8\\""'), ('NUMERIC', None)] gdk_info: FOUND: libraries = ['gdk', 'Xi', 'Xext', 'X11', 'm', 'glib'] library_dirs = ['/usr/X11R6/lib64'] define_macros = [('GDK_INFO', '"\\"1.2.10\\""'), ('GDK_VERSION_1_2_10', None)] include_dirs = ['/usr/include/gtk-1.2', '/usr/X11R6/include', '/usr/include/glib-1.2', '/usr/lib64/glib/include'] gtkp_x11_2_info: FOUND: libraries = ['gtk-x11-2.0', 'gdk-x11-2.0', 'atk-1.0', 'gdk_pixbuf-2.0', 'm', 'pangoxft-1.0', 'pangox-1.0', 'pango-1.0', 'gobject-2.0', 'gmodule-2.0', 'dl', 'glib-2.0'] extra_link_args = ['-Wl,--export-dynamic'] define_macros = [('GTKP_X11_2_INFO', '"\\"2.4.13\\""'), ('GTK_X11_VERSION_2_4_13', None), ('XTHREADS', None), ('_REENTRANT', None), ('XUSE_MTSAFE_API', None)] include_dirs = ['/usr/include/gtk-2.0', '/usr/lib64/gtk-2.0/include', '/usr/X11R6/include', '/usr/include/atk-1.0', '/usr/include/pango-1.0', '/usr/include/freetype2', '/usr/include/freetype2/config', '/usr/include/glib-2.0', '/usr/lib64/glib-2.0/include'] sfftw_threads_info: libraries srfftw_threads,sfftw_threads not found in /GWD/appbase/common/lib libraries srfftw_threads,sfftw_threads not found in /usr/local/lib libraries srfftw_threads,sfftw_threads not found in /usr/lib sfftw threads not found NOT AVAILABLE boost_python_info: NOT AVAILABLE freetype2_info: FOUND: libraries = ['freetype', 'z'] define_macros = [('FREETYPE2_INFO', '"\\"9.7.3\\""'), ('FREETYPE2_VERSION_9_7_3', None)] include_dirs = ['/usr/include/freetype2'] gdk_2_info: FOUND: libraries = ['gdk-x11-2.0', 'gdk_pixbuf-2.0', 'm', 'pangoxft-1.0', 'pangox-1.0', 'pango-1.0', 'gobject-2.0', 'gmodule-2.0', 'dl', 'glib-2.0'] extra_link_args = ['-Wl,--export-dynamic'] define_macros = [('GDK_2_INFO', '"\\"2.4.13\\""'), ('GDK_VERSION_2_4_13', None), ('XTHREADS', None), ('_REENTRANT', None), ('XUSE_MTSAFE_API', None)] include_dirs = ['/usr/include/gtk-2.0', '/usr/lib64/gtk-2.0/include', '/usr/X11R6/include', '/usr/include/pango-1.0', '/usr/include/freetype2', '/usr/include/freetype2/config', '/usr/include/glib-2.0', '/usr/lib64/glib-2.0/include'] lapack_src_info: NOT AVAILABLE gtkp_2_info: FOUND: libraries = ['gtk-x11-2.0', 'gdk-x11-2.0', 'atk-1.0', 'gdk_pixbuf-2.0', 'm', 'pangoxft-1.0', 'pangox-1.0', 'pango-1.0', 'gobject-2.0', 'gmodule-2.0', 'dl', 'glib-2.0'] extra_link_args = ['-Wl,--export-dynamic'] define_macros = [('GTKP_2_INFO', '"\\"2.4.13\\""'), ('GTK_VERSION_2_4_13', None), ('XTHREADS', None), ('_REENTRANT', None), ('XUSE_MTSAFE_API', None)] include_dirs = ['/usr/include/gtk-2.0', '/usr/lib64/gtk-2.0/include', '/usr/X11R6/include', '/usr/include/atk-1.0', '/usr/include/pango-1.0', '/usr/include/freetype2', '/usr/include/freetype2/config', '/usr/include/glib-2.0', '/usr/lib64/glib-2.0/include'] gdk_pixbuf_2_info: FOUND: libraries = ['gdk_pixbuf-2.0', 'm', 'gobject-2.0', 'gmodule-2.0', 'dl', 'glib-2.0'] extra_link_args = ['-Wl,--export-dynamic'] define_macros = [('GDK_PIXBUF_2_INFO', '"\\"2.4.13\\""'), ('GDK_PIXBUF_VERSION_2_4_13', None)] include_dirs = ['/usr/include/gtk-2.0', '/usr/include/glib-2.0', '/usr/lib64/glib-2.0/include'] amd_info: libraries amd not found in /GWD/appbase/common/lib libraries amd not found in /usr/local/lib libraries amd not found in /usr/lib NOT AVAILABLE atlas_info: libraries lapack_atlas not found in 
/users/d88/ma856388/lib __main__.atlas_info FOUND: libraries = ['lapack', 'f77blas', 'cblas', 'atlas'] library_dirs = ['/users/d88/ma856388/lib'] language = f77 Numeric_info: FOUND: define_macros = [('NUMERIC_VERSION', '"\\"23.8\\""'), ('NUMERIC', None)] numerix_info: numpy_info: FOUND: define_macros = [('NUMPY_VERSION', '"\\"1.3.0\\""'), ('NUMPY', None)] FOUND: define_macros = [('NUMPY_VERSION', '"\\"1.3.0\\""'), ('NUMPY', None)] From Chris.Barker at noaa.gov Tue May 5 11:20:59 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 05 May 2009 08:20:59 -0700 Subject: [Numpy-discussion] Structured array with no fields - possible? In-Reply-To: <1e2af89e0905042156x6aad087bj19109de1916b299e@mail.gmail.com> References: <1e2af89e0905022154o375b48b7u1f27da260f7286eb@mail.gmail.com> <49FD380A.9080600@ar.media.kyoto-u.ac.jp> <1e2af89e0905042156x6aad087bj19109de1916b299e@mail.gmail.com> Message-ID: <4A00595B.3040109@noaa.gov> Matthew Brett wrote: > I'm afraid what I need is some way of representing the fact that I > have read, from matlab, a structure with no fields (and therefore no > data) that can be - say - shape (10,2) - or any other. how about: >>> a = np.empty(size, dtype=np.object) >>> >>> a array([[None, None, None, None], [None, None, None, None], [None, None, None, None]], dtype=object) I also thinking of putting an empty as the items, but I couldn't figure out how to do that: >>> a[:] = () Traceback (most recent call last): File "", line 1, in ValueError: shape mismatch: objects cannot be broadcast to a single shape >>> a[0] = () Traceback (most recent call last): File "", line 1, in ValueError: shape mismatch: objects cannot be broadcast to a single shape Some folks think the way to spell a struct in python is a clas with only attributes, so: >>> class empty: ... def __repr__(self): ... return "empty class" ... >>> a[:] = empty() >>> a array([[empty class, empty class, empty class, empty class], [empty class, empty class, empty class, empty class], [empty class, empty class, empty class, empty class]], dtype=object) or you may be able to some trick with strides that would give you zero-size elements, though I suppose you'd need at least one byte allocated for the data pointer. Can you have an empty struct in C? -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From charlesr.harris at gmail.com Tue May 5 11:24:53 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 5 May 2009 09:24:53 -0600 Subject: [Numpy-discussion] linalg.svd not working? In-Reply-To: <164180.3450.qm@web24203.mail.ird.yahoo.com> References: <164180.3450.qm@web24203.mail.ird.yahoo.com> Message-ID: On Tue, May 5, 2009 at 8:36 AM, Muhammad Alkarouri wrote: > > Hi everyone, > > I have installed numpy 1.3.0 on Python 2.5.1 in an x86_64 machine, and it > hangs when I do a numpy.test(verbose=10) on > test_pinv (test_defmatrix.TestProperties) ... > which I believe hangs on a call to numpy.linalg.svd. Can you please help me > with this problem? > > The installation and configuration is probably a bit non-standard, so I am > including the output of python -m numpy.distutils.system_info below. In > particular, all the installation was done using CC='gcc -m32' to enforce 32 > bit executables, as Python is a 32 bit executable here. > This is almost always an ATLAS problem. 
Where did your ATLAS come from and what distro are you running? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsouthey at gmail.com Tue May 5 11:26:04 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Tue, 05 May 2009 10:26:04 -0500 Subject: [Numpy-discussion] [review] py3k_bootstrap branch In-Reply-To: <5b8d13220905050233m14c4a1e1l80dd6f231e100d5a@mail.gmail.com> References: <5b8d13220905050233m14c4a1e1l80dd6f231e100d5a@mail.gmail.com> Message-ID: <4A005A8C.4060709@gmail.com> David Cournapeau wrote: > Hi, > > I spent some more time on making numpy.distutils runnable under python > 3. I finally made up to the point where it breaks at C code > compilation, so we can start working on the hard part. The branch is > there for review > > http://github.com/cournape/numpy/commits/py3k_bootstrap > > The code is quite ugly to be honest, but I have not found a better > way; suggestions are welcomed. The biggest pain is by far exception > catching (you can't do except IOError, e in python 3), and then print. > Most other things can be handled by careful application of 2to3 with > the fixers which keep python2 compatibility (print is unfortunately > not one of them). There are also a few python 3.* bugs in distutils (I > guess few C-based extensions made it for python 3 already). > > The rationale for making numpy.distutils runnable under both python2 > and python3 (instead of just applying 2to3 on it): > - it enables us to bootstrap our build process through the distutils > 2to3 command (which is supposed to convert code to python 3 from > python 2 sources on the fly). > - The few informations I found on non trivial port all made sure > their setup.py was python 2 and 3 compatible - which means > numpy.distutils for us. > - 2to3 is very slow (takes 5 minutes for me on numpy), so having to > apply it every time from pristine source for python 3 support would be > very painful IMHO. > > cheers, > > David > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > Hi, This is really impressive! I agree that there should only be one source for Python 2 and Python 3. Although it does mean that any new code must be compatible with both Python 2.4+ and Python 3.+. I have only been browsing some of the code and was wondering about the usage of print. In many cases it seems that the print statements are perhaps warnings. If so, should the print statements be changed to warnings? For example, I think, in setup.py 663d9e7, this clearly should be a warning. http://github.com/cournape/numpy/commit/663d9e7e29bfea0f7adc8de5ff0e9d83264c3962 print(" --- Could not run svn info --- ") Bruce From nwagner at iam.uni-stuttgart.de Tue May 5 11:50:02 2009 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Tue, 05 May 2009 17:50:02 +0200 Subject: [Numpy-discussion] cannot build numpy from trunk Message-ID: ... In file included from numpy/core/src/multiarray/ctors.c:16, from numpy/core/src/multiarray/multiarraymodule_onefile.c:13: numpy/core/src/multiarray/ctors.h: At top level: numpy/core/src/multiarray/ctors.h:68: warning: conflicting types for ?byte_swap_vector? numpy/core/src/multiarray/ctors.h:68: error: static declaration of ?byte_swap_vector? follows non-static declaration numpy/core/src/multiarray/scalarapi.c:640: error: previous implicit declaration of ?byte_swap_vector? 
was here error: Command "/usr/bin/gcc -fno-strict-aliasing -DNDEBUG -fmessage-length=0 -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector -funwind-tables -fasynchronous-unwind-tables -g -fwrapv -fPIC -Inumpy/core/include -Ibuild/src.linux-x86_64-2.6/numpy/core/include/numpy -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/include -I/usr/include/python2.6 -Ibuild/src.linux-x86_64-2.6/numpy/core/src/multiarray -Ibuild/src.linux-x86_64-2.6/numpy/core/src/umath -c numpy/core/src/multiarray/multiarraymodule_onefile.c -o build/temp.linux-x86_64-2.6/numpy/core/src/multiarray/multiarraymodule_onefile.o" failed with exit status 1 Nils From cournape at gmail.com Tue May 5 11:50:43 2009 From: cournape at gmail.com (David Cournapeau) Date: Wed, 6 May 2009 00:50:43 +0900 Subject: [Numpy-discussion] [review] py3k_bootstrap branch In-Reply-To: <4A005A8C.4060709@gmail.com> References: <5b8d13220905050233m14c4a1e1l80dd6f231e100d5a@mail.gmail.com> <4A005A8C.4060709@gmail.com> Message-ID: <5b8d13220905050850i125ebe82x2ec026af4035b11c@mail.gmail.com> On Wed, May 6, 2009 at 12:26 AM, Bruce Southey wrote: > David Cournapeau wrote: >> Hi, >> >> I spent some more time on making numpy.distutils runnable under python >> 3. I finally made up to the point where it breaks at C code >> compilation, so we can start working on the hard part. The branch is >> there for review >> >> http://github.com/cournape/numpy/commits/py3k_bootstrap >> >> The code is quite ugly to be honest, but I have not found a better >> way; suggestions are welcomed. The biggest pain is by far exception >> catching (you can't do except IOError, e in python 3), and then print. >> Most other things can be handled by careful application of 2to3 with >> the fixers which keep python2 compatibility (print is unfortunately >> not one of them). There are also a few python 3.* bugs in distutils (I >> guess few C-based extensions made it for python 3 already). >> >> The rationale for making numpy.distutils runnable under both python2 >> and python3 (instead of just applying 2to3 on it): >> ?- it enables us to bootstrap our build process through the distutils >> 2to3 command (which is supposed to convert code to python 3 from >> python 2 sources on the fly). >> ?- The few informations I found on non trivial port all made sure >> their setup.py was python 2 and 3 compatible - which means >> numpy.distutils for us. >> ?- 2to3 is very slow (takes 5 minutes for me on numpy), so having to >> apply it every time from pristine source for python 3 support would be >> very painful IMHO. >> >> cheers, >> >> David >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > Hi, > This is really impressive! > > I agree that there should only be one source for Python 2 and Python 3. > Although it does mean that any new code must be compatible with both > Python 2.4+ and Python 3.+. That's almost impossible. It would be extremely painful to be source compatible. But we should aim at being able to produce most python 3 code from 2to3. > > I have only been browsing some of the code and was wondering about the > usage of print. In many cases it seems that the print statements are > perhaps warnings. If so, should the print statements be changed to warnings? yes, there are many things which could be done better. 
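As a small aside, one way the print-versus-warnings suggestion above could look in practice (the message text is taken from the quoted example; the choice of warning category is an assumption, not something settled in this thread):

import warnings

# A warning instead of a bare print lets callers filter, log, or escalate
# the message with the standard warnings machinery.
warnings.warn(" --- Could not run svn info --- ", RuntimeWarning)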
Ideally, we should first clean up numpy.distutils code, but that's not a very exciting task :) The goal is more reach something which works as quickly as possible, so that we can focus on the real issues (C code and design decision for strings vs bytes, etc...). David From charlesr.harris at gmail.com Tue May 5 12:04:11 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 5 May 2009 10:04:11 -0600 Subject: [Numpy-discussion] cannot build numpy from trunk In-Reply-To: References: Message-ID: On Tue, May 5, 2009 at 9:50 AM, Nils Wagner wrote: > ... > In file included from > numpy/core/src/multiarray/ctors.c:16, > from > numpy/core/src/multiarray/multiarraymodule_onefile.c:13: > numpy/core/src/multiarray/ctors.h: At top level: > numpy/core/src/multiarray/ctors.h:68: warning: conflicting > types for ?byte_swap_vector? > numpy/core/src/multiarray/ctors.h:68: error: static > declaration of ?byte_swap_vector? follows non-static > declaration > numpy/core/src/multiarray/scalarapi.c:640: error: previous > implicit declaration of ?byte_swap_vector? was here > error: Command "/usr/bin/gcc -fno-strict-aliasing -DNDEBUG > -fmessage-length=0 -O2 -Wall -D_FORTIFY_SOURCE=2 > -fstack-protector -funwind-tables > -fasynchronous-unwind-tables -g -fwrapv -fPIC > -Inumpy/core/include > -Ibuild/src.linux-x86_64-2.6/numpy/core/include/numpy > -Inumpy/core/src/multiarray -Inumpy/core/src/umath > -Inumpy/core/include -I/usr/include/python2.6 > -Ibuild/src.linux-x86_64-2.6/numpy/core/src/multiarray > -Ibuild/src.linux-x86_64-2.6/numpy/core/src/umath -c > numpy/core/src/multiarray/multiarraymodule_onefile.c -o > > build/temp.linux-x86_64-2.6/numpy/core/src/multiarray/multiarraymodule_onefile.o" > failed with exit status 1 > What happens if you delete the build directory first? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From nwagner at iam.uni-stuttgart.de Tue May 5 12:12:36 2009 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Tue, 05 May 2009 18:12:36 +0200 Subject: [Numpy-discussion] cannot build numpy from trunk In-Reply-To: References: Message-ID: On Tue, 5 May 2009 10:04:11 -0600 Charles R Harris wrote: > On Tue, May 5, 2009 at 9:50 AM, Nils Wagner >wrote: > >> ... >> In file included from >> numpy/core/src/multiarray/ctors.c:16, >> from >> numpy/core/src/multiarray/multiarraymodule_onefile.c:13: >> numpy/core/src/multiarray/ctors.h: At top level: >> numpy/core/src/multiarray/ctors.h:68: warning: >>conflicting >> types for ?byte_swap_vector? >> numpy/core/src/multiarray/ctors.h:68: error: static >> declaration of ?byte_swap_vector? follows non-static >> declaration >> numpy/core/src/multiarray/scalarapi.c:640: error: >>previous >> implicit declaration of ?byte_swap_vector? was here >> error: Command "/usr/bin/gcc -fno-strict-aliasing >>-DNDEBUG >> -fmessage-length=0 -O2 -Wall -D_FORTIFY_SOURCE=2 >> -fstack-protector -funwind-tables >> -fasynchronous-unwind-tables -g -fwrapv -fPIC >> -Inumpy/core/include >> -Ibuild/src.linux-x86_64-2.6/numpy/core/include/numpy >> -Inumpy/core/src/multiarray -Inumpy/core/src/umath >> -Inumpy/core/include -I/usr/include/python2.6 >> -Ibuild/src.linux-x86_64-2.6/numpy/core/src/multiarray >> -Ibuild/src.linux-x86_64-2.6/numpy/core/src/umath -c >> numpy/core/src/multiarray/multiarraymodule_onefile.c -o >> >> build/temp.linux-x86_64-2.6/numpy/core/src/multiarray/multiarraymodule_onefile.o" >> failed with exit status 1 >> > > What happens if you delete the build directory first? 
> > Chuck I have done that before ;-) Nils From matthew.brett at gmail.com Tue May 5 12:33:53 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Tue, 5 May 2009 09:33:53 -0700 Subject: [Numpy-discussion] Structured array with no fields - possible? In-Reply-To: <4A00595B.3040109@noaa.gov> References: <1e2af89e0905022154o375b48b7u1f27da260f7286eb@mail.gmail.com> <49FD380A.9080600@ar.media.kyoto-u.ac.jp> <1e2af89e0905042156x6aad087bj19109de1916b299e@mail.gmail.com> <4A00595B.3040109@noaa.gov> Message-ID: <1e2af89e0905050933p40f8ea0fi2726379b28e31ff5@mail.gmail.com> Hi, >> I'm afraid what I need is some way of representing the fact that I >> have read, from matlab, a structure with no fields (and therefore no >> data) that can be - say - shape (10,2) - or any other. > > how about: > ?>>> a = np.empty(size, dtype=np.object) > ?>>> > ?>>> a > array([[None, None, None, None], > ? ? ? ?[None, None, None, None], > ? ? ? ?[None, None, None, None]], dtype=object) Yes, that's the solution I came to in the end, the problem being that it is hard for the roundtrip (matlab->python->matlab) to tell that this Python thing should be converted to an empty struct; normally structs are numpy structured arrays, and I used object arrays for matlab cell arrays. Your empty class idea is good, and more obviously identifiable, at least to the code. But thanks for the thoughts, it's helpful to try and think it through, Matthew From dmitrey.kroshko at scipy.org Tue May 5 12:48:11 2009 From: dmitrey.kroshko at scipy.org (dmitrey) Date: Tue, 5 May 2009 09:48:11 -0700 (PDT) Subject: [Numpy-discussion] error building numpy: no file refecount.c Message-ID: Hi all, I've got the error during building numpy from latest svn snapshot - any ideas? D. ... executing numpy/core/code_generators/generate_numpy_api.py adding 'build/src.linux-x86_64-2.6/numpy/core/include/numpy/ __multiarray_api.h' to sources. numpy.core - nothing done with h_files = ['build/src.linux-x86_64-2.6/ numpy/core/include/numpy/config.h', 'build/src.linux-x86_64-2.6/numpy/ core/include/numpy/numpyconfig.h', 'build/src.linux-x86_64-2.6/numpy/ core/include/numpy/__multiarray_api.h'] building extension "numpy.core.multiarray" sources error: src/multiarray/refecount.c: No such file or directory From charlesr.harris at gmail.com Tue May 5 13:08:54 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 5 May 2009 11:08:54 -0600 Subject: [Numpy-discussion] error building numpy: no file refecount.c In-Reply-To: References: Message-ID: On Tue, May 5, 2009 at 10:48 AM, dmitrey wrote: > Hi all, > I've got the error during building numpy from latest svn snapshot - > any ideas? > D. > I would guess it is a consequence of David's ongoing breakup of the src files. Did you try the usual delete of the build directory? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue May 5 13:12:21 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 5 May 2009 11:12:21 -0600 Subject: [Numpy-discussion] error building numpy: no file refecount.c In-Reply-To: References: Message-ID: On Tue, May 5, 2009 at 11:08 AM, Charles R Harris wrote: > > > On Tue, May 5, 2009 at 10:48 AM, dmitrey wrote: > >> Hi all, >> I've got the error during building numpy from latest svn snapshot - >> any ideas? >> D. >> > > I would guess it is a consequence of David's ongoing breakup of the src > files. Did you try the usual delete of the build directory? 
> And David, it is probably time to slow down the grinding of src into little bits. I don't think it needs to be rushed. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From nmb at wartburg.edu Tue May 5 13:32:09 2009 From: nmb at wartburg.edu (Neil Martinsen-Burrell) Date: Tue, 05 May 2009 12:32:09 -0500 Subject: [Numpy-discussion] error building numpy: no file refecount.c In-Reply-To: References: Message-ID: <4A007819.8050808@wartburg.edu> On 2009-05-05 12:12 , Charles R Harris wrote: > > > On Tue, May 5, 2009 at 11:08 AM, Charles R Harris > > wrote: > > > > On Tue, May 5, 2009 at 10:48 AM, dmitrey > wrote: > > Hi all, > I've got the error during building numpy from latest svn snapshot - > any ideas? > D. > > > I would guess it is a consequence of David's ongoing breakup of the > src files. Did you try the usual delete of the build directory? > > > And David, it is probably time to slow down the grinding of src into > little bits. I don't think it needs to be rushed. Some bisection shows that the problem is not present in r6944, so for now, one can "svn up -r 6944" until David gets the problem resolved. While understanding that making sure the trunk builds on many platforms is a problem, I think that numpy could do better at keeping the trunk buildable and doing disruptive things on long-lived feature branches that could then be merged. -Neil From pav at iki.fi Tue May 5 13:38:37 2009 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 5 May 2009 17:38:37 +0000 (UTC) Subject: [Numpy-discussion] error building numpy: no file refecount.c References: <4A007819.8050808@wartburg.edu> Message-ID: Tue, 05 May 2009 12:32:09 -0500, Neil Martinsen-Burrell wrote: [clip] > While understanding that making sure the trunk builds on many platforms > is a problem, I think that numpy could do better at keeping the trunk > buildable and doing disruptive things on long-lived feature branches > that could then be merged. I don't think broken trunk has often been a significant problem in Numpy in the past. Anyway, feature branches are good, and we have the buildbot.scipy.org, so there's no reason not to check it after committing. -- Pauli Virtanen From taste_of_r at yahoo.com Tue May 5 14:42:04 2009 From: taste_of_r at yahoo.com (Wei Su) Date: Tue, 5 May 2009 11:42:04 -0700 (PDT) Subject: [Numpy-discussion] How to download data directly from SQL into NumPy as a record array or structured array. Message-ID: <910467.47742.qm@web43516.mail.sp1.yahoo.com> ? Hi, Everyone: ? This is what I need to do everyday. Now I have to first save data as .csv file and the use csv2rec() to read the data as a record array. Anybody can give me some advice on how to directly get the data as record arrays? It will save me tons of time. ? Thanks in advance. ? Wei Su -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue May 5 14:44:31 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 5 May 2009 12:44:31 -0600 Subject: [Numpy-discussion] cannot build numpy from trunk In-Reply-To: References: Message-ID: On Tue, May 5, 2009 at 10:12 AM, Nils Wagner wrote: > On Tue, 5 May 2009 10:04:11 -0600 > Charles R Harris wrote: > > On Tue, May 5, 2009 at 9:50 AM, Nils Wagner > >wrote: > > > >> ... 
> >> In file included from > >> numpy/core/src/multiarray/ctors.c:16, > >> from > >> numpy/core/src/multiarray/multiarraymodule_onefile.c:13: > >> numpy/core/src/multiarray/ctors.h: At top level: > >> numpy/core/src/multiarray/ctors.h:68: warning: > >>conflicting > >> types for ?byte_swap_vector? > >> numpy/core/src/multiarray/ctors.h:68: error: static > >> declaration of ?byte_swap_vector? follows non-static > >> declaration > >> numpy/core/src/multiarray/scalarapi.c:640: error: > >>previous > >> implicit declaration of ?byte_swap_vector? was here > >> error: Command "/usr/bin/gcc -fno-strict-aliasing > >>-DNDEBUG > >> -fmessage-length=0 -O2 -Wall -D_FORTIFY_SOURCE=2 > >> -fstack-protector -funwind-tables > >> -fasynchronous-unwind-tables -g -fwrapv -fPIC > >> -Inumpy/core/include > >> -Ibuild/src.linux-x86_64-2.6/numpy/core/include/numpy > >> -Inumpy/core/src/multiarray -Inumpy/core/src/umath > >> -Inumpy/core/include -I/usr/include/python2.6 > >> -Ibuild/src.linux-x86_64-2.6/numpy/core/src/multiarray > >> -Ibuild/src.linux-x86_64-2.6/numpy/core/src/umath -c > >> numpy/core/src/multiarray/multiarraymodule_onefile.c -o > >> > >> > build/temp.linux-x86_64-2.6/numpy/core/src/multiarray/multiarraymodule_onefile.o" > >> failed with exit status 1 > >> > > > > What happens if you delete the build directory first? > > > > Chuck > > I have done that before ;-) > Is this from the latest svn? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From nwagner at iam.uni-stuttgart.de Tue May 5 14:46:20 2009 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Tue, 05 May 2009 20:46:20 +0200 Subject: [Numpy-discussion] cannot build numpy from trunk In-Reply-To: References: Message-ID: On Tue, 5 May 2009 12:44:31 -0600 Charles R Harris wrote: > On Tue, May 5, 2009 at 10:12 AM, Nils Wagner > wrote: > >> On Tue, 5 May 2009 10:04:11 -0600 >> Charles R Harris wrote: >> > On Tue, May 5, 2009 at 9:50 AM, Nils Wagner >> >wrote: >> > >> >> ... >> >> In file included from >> >> numpy/core/src/multiarray/ctors.c:16, >> >> from >> >> >>numpy/core/src/multiarray/multiarraymodule_onefile.c:13: >> >> numpy/core/src/multiarray/ctors.h: At top level: >> >> numpy/core/src/multiarray/ctors.h:68: warning: >> >>conflicting >> >> types for ?byte_swap_vector? >> >> numpy/core/src/multiarray/ctors.h:68: error: static >> >> declaration of ?byte_swap_vector? follows non-static >> >> declaration >> >> numpy/core/src/multiarray/scalarapi.c:640: error: >> >>previous >> >> implicit declaration of ?byte_swap_vector? was here >> >> error: Command "/usr/bin/gcc -fno-strict-aliasing >> >>-DNDEBUG >> >> -fmessage-length=0 -O2 -Wall -D_FORTIFY_SOURCE=2 >> >> -fstack-protector -funwind-tables >> >> -fasynchronous-unwind-tables -g -fwrapv -fPIC >> >> -Inumpy/core/include >> >> -Ibuild/src.linux-x86_64-2.6/numpy/core/include/numpy >> >> -Inumpy/core/src/multiarray -Inumpy/core/src/umath >> >> -Inumpy/core/include -I/usr/include/python2.6 >> >> >>-Ibuild/src.linux-x86_64-2.6/numpy/core/src/multiarray >> >> -Ibuild/src.linux-x86_64-2.6/numpy/core/src/umath -c >> >> numpy/core/src/multiarray/multiarraymodule_onefile.c >>-o >> >> >> >> >> build/temp.linux-x86_64-2.6/numpy/core/src/multiarray/multiarraymodule_onefile.o" >> >> failed with exit status 1 >> >> >> > >> > What happens if you delete the build directory first? >> > >> > Chuck >> >> I have done that before ;-) >> > > Is this from the latest svn? 
> > Chuck ------------------------------------------------------------------------ r6955 | cdavid | 2009-05-05 13:10:29 +0200 (Di, 05. Mai 2009) | 1 line Put buffer protocol in separate file. Nils From dsdale24 at gmail.com Tue May 5 14:57:45 2009 From: dsdale24 at gmail.com (Darren Dale) Date: Tue, 5 May 2009 14:57:45 -0400 Subject: [Numpy-discussion] [review] py3k_bootstrap branch In-Reply-To: <5b8d13220905050850i125ebe82x2ec026af4035b11c@mail.gmail.com> References: <5b8d13220905050233m14c4a1e1l80dd6f231e100d5a@mail.gmail.com> <4A005A8C.4060709@gmail.com> <5b8d13220905050850i125ebe82x2ec026af4035b11c@mail.gmail.com> Message-ID: On Tue, May 5, 2009 at 11:50 AM, David Cournapeau wrote: > On Wed, May 6, 2009 at 12:26 AM, Bruce Southey wrote: > > David Cournapeau wrote: > >> Hi, > >> > >> I spent some more time on making numpy.distutils runnable under python > >> 3. I finally made up to the point where it breaks at C code > >> compilation, so we can start working on the hard part. The branch is > >> there for review > >> > >> http://github.com/cournape/numpy/commits/py3k_bootstrap > >> > >> The code is quite ugly to be honest, but I have not found a better > >> way; suggestions are welcomed. The biggest pain is by far exception > >> catching (you can't do except IOError, e in python 3), and then print. > >> Most other things can be handled by careful application of 2to3 with > >> the fixers which keep python2 compatibility (print is unfortunately > >> not one of them). There are also a few python 3.* bugs in distutils (I > >> guess few C-based extensions made it for python 3 already). > >> > >> The rationale for making numpy.distutils runnable under both python2 > >> and python3 (instead of just applying 2to3 on it): > >> - it enables us to bootstrap our build process through the distutils > >> 2to3 command (which is supposed to convert code to python 3 from > >> python 2 sources on the fly). > >> - The few informations I found on non trivial port all made sure > >> their setup.py was python 2 and 3 compatible - which means > >> numpy.distutils for us. > >> - 2to3 is very slow (takes 5 minutes for me on numpy), so having to > >> apply it every time from pristine source for python 3 support would be > >> very painful IMHO. > >> > >> cheers, > >> > >> David > >> _______________________________________________ > >> Numpy-discussion mailing list > >> Numpy-discussion at scipy.org > >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > >> > > Hi, > > This is really impressive! > > > > I agree that there should only be one source for Python 2 and Python 3. > > Although it does mean that any new code must be compatible with both > > Python 2.4+ and Python 3.+. > > That's almost impossible. It would be extremely painful to be source > compatible. But we should aim at being able to produce most python 3 > code from 2to3. > There is a lot of interest in a 3to2 tool, and I have read speculation ( http://sayspy.blogspot.com/2009/04/pycon-2009-recap-best-pycon-ever.html) that going from 3 to 2 should be easier than the other way around. Maybe it will be worth keeping an eye on. Darren -------------- next part -------------- An HTML attachment was scrubbed... URL: From pgmdevlist at gmail.com Tue May 5 15:15:23 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 5 May 2009 15:15:23 -0400 Subject: [Numpy-discussion] How to download data directly from SQL into NumPy as a record array or structured array. 
In-Reply-To: <910467.47742.qm@web43516.mail.sp1.yahoo.com> References: <910467.47742.qm@web43516.mail.sp1.yahoo.com> Message-ID: <50254212-E8F8-4C79-B30F-8308687EA685@gmail.com> On May 5, 2009, at 2:42 PM, Wei Su wrote: > > Hi, Everyone: > > This is what I need to do everyday. Now I have to first save data > as .csv file and the use csv2rec() to read the data as a record > array. Anybody can give me some advice on how to directly get the > data as record arrays? It will save me tons of time. Wei, Have a look to numpi.lib.io.genfromtxt, that should give you some ideas. From stefan at sun.ac.za Tue May 5 15:35:38 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 5 May 2009 21:35:38 +0200 Subject: [Numpy-discussion] error building numpy: no file refecount.c In-Reply-To: <4A007819.8050808@wartburg.edu> References: <4A007819.8050808@wartburg.edu> Message-ID: <9457e7c80905051235s1afde19ak61179ddfcf66cf84@mail.gmail.com> 2009/5/5 Neil Martinsen-Burrell : > While understanding that making sure the trunk builds on many platforms > is a problem, I think that numpy could do better at keeping the trunk > buildable and doing disruptive things on long-lived feature branches > that could then be merged. David frequently submits his branches for review (although few people take the time to comment). If he breaks the build once in a hundred commits, there really is no reason to complain. If you want to live on the bleeding edge, you must be prepared to bleed a little! Regards St?fan From taste_of_r at yahoo.com Tue May 5 19:39:13 2009 From: taste_of_r at yahoo.com (Wei Su) Date: Tue, 5 May 2009 16:39:13 -0700 (PDT) Subject: [Numpy-discussion] How to convert a list into a structured array? Message-ID: <202381.57946.qm@web43504.mail.sp1.yahoo.com> ? Hi, Francesc: ? Thanks a lot for offering me help. My code is really simple as of now. ? ********************************************************************************** from pyodbc import * from rpy import * cnxn = connect('DRIVER={SQL Server};SERVER=srdata01\\sql2k5;DATABASE=Qai;UID=;PWD=') cursor = cnxn.cursor() cursor.execute("select IsrCode, MstrName from qai..qaiLinkBase") data = cursor.fetchall() cursor.close() *************************************************** The result, data, I got from the above code tends to be a giant list, which is very hard to handle. My goal is to to turn it into a record array so that i can access the field directly by name or by index. My data is typically numerical, character and datetime variables. no other complications. ? >From the above code, you can also see that I used R for some time. But I have to switch to something else because I sometimes cannot even download all my data via R due to its memory limit under windows. I thought NumPy might be the solution. But I am not sure. Anybody can let me know whether Python has a memory limit? or can I use virtual memory by calling some Python module? ? Thanks in advance. ? Wei? Su ? ? --- On Tue, 5/5/09, Francesc Alted wrote: From: Francesc Alted Subject: Re: [Numpy-discussion] How to convert a list into a structured array? To: "Discussion of Numerical Python" Date: Tuesday, May 5, 2009, 7:10 AM Welcome Wei! A Monday 04 May 2009, Wei Su escrigu?: > Hi,All: > ? > My first post! I am very excited to find out structured array (record > array) in Python. Since I do data manipulation every day, this is > truly great. However, I typically download data using pyodbc, the > default output is a big list. 
So I am wondering how to convert that > big list into a structured array? using array() will turn it into a > text array, afaik. it is even better if anybody can show me some > tricks to download the data directly as a structured array. > Thanks a lot for the help. Please, could you provide an example of the list that you are getting from your database?? With that we can probably figure out your needs much better. > BTW: I am also interested in Python's ability to handle large data. > Any hints or suggestion is welcome. This is also a bit generic question.? What kind of data you have to deal with?? What sort of operations do you want to perform over it?? Do you need a lot of speed or flexibility is more important?? Some example? Cheers, -- Francesc Alted "One would expect people to feel threatened by the 'giant brains or machines that think'.? In fact, the frightening computer becomes less frightening if it is used only to simulate a familiar noncomputer." -- Edsger W. Dykstra ???"On the cruelty of really teaching computer science" _______________________________________________ Numpy-discussion mailing list Numpy-discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Tue May 5 20:40:03 2009 From: cournape at gmail.com (David Cournapeau) Date: Wed, 6 May 2009 09:40:03 +0900 Subject: [Numpy-discussion] error building numpy: no file refecount.c In-Reply-To: <4A007819.8050808@wartburg.edu> References: <4A007819.8050808@wartburg.edu> Message-ID: <5b8d13220905051740p202fae1bxa0c023d4f4aaa9a8@mail.gmail.com> On Wed, May 6, 2009 at 2:32 AM, Neil Martinsen-Burrell wrote: > While understanding that making sure the trunk builds on many platforms > is a problem, I think that numpy could do better at keeping the trunk > buildable and doing disruptive things on long-lived feature branches > that could then be merged. The trunk is rarely broken more than a few hours. In this case, it is just a file which was not added to the trunk, hence I did not detect the problem. Using feature branches would not have prevented the problem, cheers, David From cournape at gmail.com Tue May 5 21:01:07 2009 From: cournape at gmail.com (David Cournapeau) Date: Wed, 6 May 2009 10:01:07 +0900 Subject: [Numpy-discussion] error building numpy: no file refecount.c In-Reply-To: References: Message-ID: <5b8d13220905051801wee8fdax3cca34cb407fcd67@mail.gmail.com> On Wed, May 6, 2009 at 1:48 AM, dmitrey wrote: > Hi all, > I've got the error during building numpy from latest svn snapshot - > any ideas? The problem should be fixed now, David From myeates at jpl.nasa.gov Tue May 5 21:01:49 2009 From: myeates at jpl.nasa.gov (Mathew Yeates) Date: Tue, 05 May 2009 18:01:49 -0700 Subject: [Numpy-discussion] difficult optimization problem Message-ID: <4A00E17D.6040203@jpl.nasa.gov> Hi I'm trying to solve an optimization problem where the search domain is limited. Suppose I want to minimize the function f(x,y) but f(x,y) is only valid over a subset (unknown without calling f) of (x,y)? I tried looking at OpenOpt but ... kind of unusable without some documentation. 
Thanks Mathew From cournape at gmail.com Tue May 5 21:02:04 2009 From: cournape at gmail.com (David Cournapeau) Date: Wed, 6 May 2009 10:02:04 +0900 Subject: [Numpy-discussion] cannot build numpy from trunk In-Reply-To: References: Message-ID: <5b8d13220905051802i40d2a8d4n8b611cbc24e0f027@mail.gmail.com> On Wed, May 6, 2009 at 12:50 AM, Nils Wagner wrote: > ... > In file included from > numpy/core/src/multiarray/ctors.c:16, > ? ? ? ? ? ? ? ? ?from > numpy/core/src/multiarray/multiarraymodule_onefile.c:13: > numpy/core/src/multiarray/ctors.h: At top level: > numpy/core/src/multiarray/ctors.h:68: warning: conflicting > types for ?byte_swap_vector? > numpy/core/src/multiarray/ctors.h:68: error: static > declaration of ?byte_swap_vector? follows non-static > declaration > numpy/core/src/multiarray/scalarapi.c:640: error: previous > implicit declaration of ?byte_swap_vector? was here > error: Command "/usr/bin/gcc -fno-strict-aliasing -DNDEBUG > -fmessage-length=0 -O2 -Wall -D_FORTIFY_SOURCE=2 > -fstack-protector -funwind-tables > -fasynchronous-unwind-tables -g -fwrapv -fPIC > -Inumpy/core/include > -Ibuild/src.linux-x86_64-2.6/numpy/core/include/numpy > -Inumpy/core/src/multiarray -Inumpy/core/src/umath > -Inumpy/core/include -I/usr/include/python2.6 > -Ibuild/src.linux-x86_64-2.6/numpy/core/src/multiarray > -Ibuild/src.linux-x86_64-2.6/numpy/core/src/umath -c > numpy/core/src/multiarray/multiarraymodule_onefile.c -o > build/temp.linux-x86_64-2.6/numpy/core/src/multiarray/multiarraymodule_onefile.o" > failed with exit status 1 Should be fixed now, David From cournape at gmail.com Tue May 5 21:12:20 2009 From: cournape at gmail.com (David Cournapeau) Date: Wed, 6 May 2009 10:12:20 +0900 Subject: [Numpy-discussion] [review] py3k_bootstrap branch In-Reply-To: References: <5b8d13220905050233m14c4a1e1l80dd6f231e100d5a@mail.gmail.com> <4A005A8C.4060709@gmail.com> <5b8d13220905050850i125ebe82x2ec026af4035b11c@mail.gmail.com> Message-ID: <5b8d13220905051812o39874428k6077cd5d9f3dc225@mail.gmail.com> On Wed, May 6, 2009 at 3:57 AM, Darren Dale wrote: > > There is a lot of interest in a 3to2 tool, and I have read speculation > (http://sayspy.blogspot.com/2009/04/pycon-2009-recap-best-pycon-ever.html) > that going from 3 to 2 should be easier than the other way around. Maybe it > will be worth keeping an eye on. I can see how this could help people who have a working python 3 implementation, but in numpy's case, I am not so sure. Do you know which version of python is targeted by 3to2 ? 2.6, 2.5 or even below ? cheers, David From mark.wendell at gmail.com Tue May 5 22:37:00 2009 From: mark.wendell at gmail.com (Mark Wendell) Date: Tue, 5 May 2009 20:37:00 -0600 Subject: [Numpy-discussion] array membership test? Message-ID: Is there a numpy equivalent of python's membership test (eg, "5 in [1,3,4,5]" returns True)? I'd like a quick way to test if a given number is in an array, without stepping through the elements individually. I realize this can be tricky with floats, but if there is such a thing for ints, that would be great. thanks Mark -- -- Mark Wendell From josef.pktd at gmail.com Tue May 5 23:42:26 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 5 May 2009 23:42:26 -0400 Subject: [Numpy-discussion] array membership test? In-Reply-To: References: Message-ID: <1cd32cbb0905052042v7725cd97i279936d435aa4ae6@mail.gmail.com> On Tue, May 5, 2009 at 10:37 PM, Mark Wendell wrote: > Is there a numpy equivalent of python's membership test (eg, ?"5 in > [1,3,4,5]" returns True)? 
I'd like a quick way to test if a given > number is in an array, without stepping through the elements > individually. I realize this can be tricky with floats, but if there > is such a thing for ints, that would be great. > >>> import numpy as np >>> (5==np.array([1,3,4,5])).any() True >>> np.setmember1d([5],[1,3,4,5]) array([ True], dtype=bool) >>> np.setmember1d([3,5,6],[1,3,4,5]) array([ True, True, False], dtype=bool) >>> np.setmember1d([3.1,5.3,6.],[1.,3.1,4.,5.3]) array([ True, True, False], dtype=bool) setmember1d requires unique elements in both arrays, but there is a version for non-unique arrays in a trac ticket Josef From mail at stevesimmons.com Wed May 6 01:46:54 2009 From: mail at stevesimmons.com (Stephen Simmons) Date: Wed, 06 May 2009 07:46:54 +0200 Subject: [Numpy-discussion] How to convert a list into a structured array? In-Reply-To: <202381.57946.qm@web43504.mail.sp1.yahoo.com> References: <202381.57946.qm@web43504.mail.sp1.yahoo.com> Message-ID: <4A01244E.8030508@stevesimmons.com> Wei Su wrote: > > Hi, Francesc: > > Thanks a lot for offering me help. My code is really simple as of now. > > ********************************************************************************** > > from pyodbc import * > > from rpy import * > > cnxn = connect(/'DRIVER={SQL > Server};SERVER=srdata01\\sql2k5;DATABASE=_Qai_;UID=;PWD='/) > > cursor = cnxn.cursor() > > cursor.execute(/"select IsrCode, MstrName from _qai_..qaiLinkBase"/) > > data = cursor.fetchall() > > cursor.close() > > *************************************************** > The result, data, I got from the above code tends to be a giant list, > which is very hard to handle. My goal is to to turn it into a record > array so that i can access the field directly by name or by index. My > data is typically numerical, character and datetime variables. no > other complications. > > From the above code, you can also see that I used R for some time. But > I have to switch to something else because I sometimes cannot even > download all my data via R due to its memory limit under windows. I > thought NumPy might be the solution. But I am not sure. Anybody can > let me know whether Python has a memory limit? or can I use virtual > memory by calling some Python module? > > Thanks in advance. > > Wei Su > > > Hi Wei Su, Below is an example from the code I use to read text files into recarrays. The same approach can be used for your SQL data by redefining the inner iterator(path) function to execute your SQL query. If your data is really big, you could also use the PyTables package (written by Francesc actually) to store SQL extracts as numpy-compatible HDF tables. The HDF format can compress the data transparently, so the resulting data files are 1/10 the size of an equivalent text dump. You can then read any or all rows into memory for subsequent process using table.read[row_from, row_to], thereby avoiding running out of memory if your dataset is really big. PyTables/HDF is also really fast for reading. As an example, my three year old laptop with slow hard drive achieves up to 250,000 row per second speeds on GROUP BY-style subtotals. This uses PyTables for storing the data and numpy's bincount() function for doing the aggregation. Stephen def text_file_to_sorted_numpy_array(path, dtype, row_fn, max_rows=None, header=None, order_by=None, min_row_length=None): """ Read a database extract into a numpy recarray, which is possibly sorted then returned. path Path to the text file. dtype String giving column names and numpy data types e.g. 
'COL1,S8 COL2,i4' row_fn Optional function splitting a row into a list that is compatible with the numpy array's dtype. The function can indicate the row should be skipped by returning None. If not given, the row has leading and trailing whitespace removed and then is split on '|'. order_by Optional list of column names used to sort the array. header Optional prefix for a header line. If given, there must be a line with this prefix within the first 20 lines. Any leading whitespace is removed before checking. max_rows Optional maximum number of rows that a file will contain. min_row_length Optional length of row in text file, used to estimate upper bound on size of final array. One or both of max_rows and min_row_length must be given. """ # Create a numpy array large enough to hold the entire file in memory if min_row_length: file_size = os.stat(path).st_size num_rows_upper_bound = file_size/min_row_length else: num_rows_upper_bound = max_rows if num_rows_upper_bound is None: raise ValueError('No information given about size of the final array') if max_rows and num_rows_upper_bound>max_rows: raise ValueError("'%s' is %d bytes long, too large to fit in memory" % (os.path.basename(path), file_size)) # Define an iterator that reads the data file def iterator(path): # Read the file with file(path, 'rU') as fh: ftype, prefix = os.path.splitext(os.path.basename(path))[0].split('-', 2) pb = ProgressBar(prefix=prefix) # Read the data lines ctr = idx = 0 for s in fh: s = s.strip() if s in ('\x1A', '-', '') or s.startswith('-------'): # Empty lines after end of real data continue res = row_fn(s) if res: yield res ctr+=1 if ctr%1000==0: total_rows = float(file_size*ctr)/float(fh.tell()) pb(ctr, total=total_rows) pb(ctr, last=True) # Create an empty array to hold all data, then fill in blocks of 5000 rows # Doing this by blocks is 4x faster than adding one row at a time. dtype = list( tuple(x.split(',')) for x in dtype.split() ) arr = numpy.zeros(num_rows_upper_bound, dtype) def block_iterator(iterator, blk_size): "Group iterator into lists with blk_size elements" res = [] for i in iterator: res.append(i) if len(res)==blk_size: yield res res = [] if res: yield res # Now fill the array i = 0 try: for blk in block_iterator(iterator(path), 5000): b = len(blk) tmp = numpy.rec.fromrecords(blk, dtype=dtype, shape=b) arr[i:i+b] = tmp i+=b except KeyboardInterrupt: pass arr = arr[:i] # Remove unused rows at the end of the array # Sort array if required if order_by: print " Sorting %d-row array on %r" % (len(arr), order_by) arr.sort(order=order_by) # Return the final array return arr -------------- next part -------------- An HTML attachment was scrubbed... URL: From taste_of_r at yahoo.com Wed May 6 01:57:55 2009 From: taste_of_r at yahoo.com (Wei Su) Date: Tue, 5 May 2009 22:57:55 -0700 (PDT) Subject: [Numpy-discussion] How to convert a list into a structured array? Message-ID: <940469.68210.qm@web43513.mail.sp1.yahoo.com> ? Hi, Stephen: ? This is fantastic. I shall read your codes carefully next week. (I am taking the rest of the week off for vacation.) Hopefully I am not?so dumb that I need to ask again. ? Regards, ? Wei Su --- On Wed, 5/6/09, Stephen Simmons wrote: From: Stephen Simmons Subject: Re: [Numpy-discussion] How to convert a list into a structured array? To: "Discussion of Numerical Python" Date: Wednesday, May 6, 2009, 5:46 AM Wei Su wrote: ? Hi, Francesc: ? Thanks a lot for offering me help. My code is really simple as of now. ? 
********************************************************************************** from pyodbc import * from rpy import * cnxn = connect('DRIVER={SQL Server};SERVER=srdata01\\sql2k5;DATABASE=Qai;UID=;PWD=') cursor = cnxn.cursor() cursor.execute("select IsrCode, MstrName from qai..qaiLinkBase") data = cursor.fetchall() cursor.close() *************************************************** The result, data, I got from the above code tends to be a giant list, which is very hard to handle. My goal is to to turn it into a record array so that i can access the field directly by name or by index. My data is typically numerical, character and datetime variables. no other complications. ? >From the above code, you can also see that I used R for some time. But I have to switch to something else because I sometimes cannot even download all my data via R due to its memory limit under windows. I thought NumPy might be the solution. But I am not sure. Anybody can let me know whether Python has a memory limit? or can I use virtual memory by calling some Python module? ? Thanks in advance. ? Wei? Su ? ? Hi Wei Su, Below is an example from the code I use to read text files into recarrays. The same approach can be used for your SQL data by redefining the inner iterator(path) function to execute your SQL query. If your data is really big, you could also use the PyTables package (written by Francesc actually) to store SQL extracts as numpy-compatible HDF tables. The HDF format can compress the data transparently, so the resulting data files are 1/10 the size of an equivalent text dump. You can then read any or all rows into memory for subsequent process using table.read[row_from, row_to], thereby avoiding running out of memory if your dataset is really big. PyTables/HDF is also really fast for reading. As an example, my three year old laptop with slow hard drive achieves up to 250,000 row per second speeds on GROUP BY-style subtotals. This uses PyTables for storing the data and numpy's bincount() function for doing the aggregation. Stephen def text_file_to_sorted_numpy_array(path, dtype, row_fn, max_rows=None, header=None, ?????????????????????????????????????????? order_by=None, min_row_length=None): ??? """ ??? Read a database extract into a numpy recarray, which is possibly sorted then returned. ??????? path??????????? Path to the text file. ??????? dtype?????????? String giving column names and numpy data types ??????????????????????? e.g. 'COL1,S8 COL2,i4' ??????? row_fn????????? Optional function splitting a row into a list that is ??????????????????????? compatible with the numpy array's dtype. The function ??????????????????????? can indicate the row should be skipped by returning ??????????????????????? None. If not given, the row has leading and trailing ??????????????????????? whitespace removed and then is split on '|'. ??????? order_by??????? Optional list of column names used to sort the array. ??????? header????????? Optional prefix for a header line. If given, there ??????????????????????? must be a line with this prefix within the first 20 lines. ??????????????????????? Any leading whitespace is removed before checking. ??????? max_rows??????? Optional maximum number of rows that a file will contain. ??????? min_row_length? Optional length of row in text file, used to estimate ??????????????????????? upper bound on size of final array. One or both of ??????????????????????? max_rows and min_row_length must be given. ??? """ ??? 
# Create a numpy array large enough to hold the entire file in memory ??? if min_row_length: ??????? file_size = os.stat(path).st_size ??????? num_rows_upper_bound = file_size/min_row_length ??? else: ??????? num_rows_upper_bound = max_rows ??? if num_rows_upper_bound is None: ??????? raise ValueError('No information given about size of the final array') ??? if max_rows and num_rows_upper_bound>max_rows: ??????? raise ValueError("'%s' is %d bytes long, too large to fit in memory" % (os.path.basename(path), file_size)) ??? # Define an iterator that reads the data file??? ??? def iterator(path): ??????? # Read the file ??????? with file(path, 'rU') as fh: ??????????? ftype, prefix = os.path.splitext(os.path.basename(path))[0].split('-', 2) ??????????? pb = ProgressBar(prefix=prefix) ??????????? # Read the data lines ??????????? ctr = idx = 0??????????? ??????????? for s in fh: ??????????????? s = s.strip() ??????????????? if s in ('\x1A', '-', '') or s.startswith('-------'): ??????????????????? # Empty lines after end of real data ??????????????????? continue ??????????????? res = row_fn(s) ??????????????? if res: ??????????????????? yield res ??????????????? ctr+=1 ??????????????? if ctr%1000==0: ??????????????????? total_rows = float(file_size*ctr)/float(fh.tell()) ??????????????????? pb(ctr, total=total_rows) ??????????? pb(ctr, last=True) ??? # Create an empty array to hold all data, then fill in blocks of 5000 rows ??? # Doing this by blocks is 4x faster than adding one row at a time.. ??? dtype = list( tuple(x.split(',')) for x in dtype.split() ) ??? arr = numpy.zeros(num_rows_upper_bound, dtype) ??? def block_iterator(iterator, blk_size): ??????? "Group iterator into lists with blk_size elements" ??????? res = [] ??????? for i in iterator: ??????????? res.append(i) ??????????? if len(res)==blk_size: ??????????????? yield res ??????????????? res = [] ??????? if res: ??????????? yield res ??? # Now fill the array??????????? ??? i = 0 ??? try: ??????? for blk in block_iterator(iterator(path), 5000): ??????????? b = len(blk) ??????????? tmp = numpy.rec.fromrecords(blk, dtype=dtype, shape=b) ??????????? arr[i:i+b] = tmp ??????????? i+=b ??? except KeyboardInterrupt: ??????? pass ??? arr = arr[:i]?????? # Remove unused rows at the end of the array ??? # Sort array if required ??? if order_by: ??????? print "? Sorting %d-row array on %r" % (len(arr), order_by) ??????? arr.sort(order=order_by) ??? # Return the final array ??? return arr ??? -----Inline Attachment Follows----- _______________________________________________ Numpy-discussion mailing list Numpy-discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chris.Barker at noaa.gov Wed May 6 02:03:58 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 05 May 2009 23:03:58 -0700 Subject: [Numpy-discussion] OS-X binary name... Message-ID: <4A01284E.6070104@noaa.gov> Hi all, The binary for OS-X on sourceforge is called: numpy-1.3.0-py2.5-macosx10.5.dmg However, as far as I can tell, it works just fine on OS-X 10.4, and maybe even 10.3.9. Perhaps a re-naming is in order? But to what? I'd say: numpy-1.3.0-py2.5-macosx10.4.dmg but would folks think that it's only for 10.4? maybe: numpy-1.3.0-py2.5-macosx-python.org.dmg to indicate that it's for the python.org build of python2.5, though I'v never seen anyone use that convention. -Chris -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From malkarouri at yahoo.co.uk Wed May 6 03:56:48 2009 From: malkarouri at yahoo.co.uk (Muhammad Alkarouri) Date: Wed, 6 May 2009 07:56:48 +0000 (GMT) Subject: [Numpy-discussion] linalg.svd not working? In-Reply-To: Message-ID: <798430.55699.qm@web24204.mail.ird.yahoo.com> > Date: Tue, 5 May 2009 09:24:53 -0600 > From: Charles R Harris ... > This is almost always an ATLAS problem. Where did your > ATLAS come from and > what distro are you running? You are probably right. I compiled and installed ATLAS from source. The distro is Redhat Enterprise Linux 4. I had to because the ones from the distro are compiled targetting 64-bit architecture. I largely followed the instructions at http://www.scipy.org/Installing_SciPy/Linux#head-eecf834fad12bf7a625752528547588a93f8263c . Built lapack 3.1.1, then copied the library to ATLAS and compiled per instructions. The compilers are CC='gcc -m32' version 3.4.4 (enforcing 32 bit compilation), g77 3.4.4. At various points I needed to define flags to ensure 32 bits, though not in numpy compilation. I have gfortran on the system but I didn't use it. liblapack.so and other .so files are linked to libg2c.so not libgfortran. ATLAS (3.8.3) was configured with the additional -32 flag as well. make check works nicely. What should I check in order to find the error with ATLAS configuration and/or installation? Or is there a 32 bit version binary I can download/ use (even if only for testing)? Regards, Muhammad Alkarouri From david at ar.media.kyoto-u.ac.jp Wed May 6 03:45:59 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 06 May 2009 16:45:59 +0900 Subject: [Numpy-discussion] linalg.svd not working? In-Reply-To: <798430.55699.qm@web24204.mail.ird.yahoo.com> References: <798430.55699.qm@web24204.mail.ird.yahoo.com> Message-ID: <4A014037.5080205@ar.media.kyoto-u.ac.jp> Muhammad Alkarouri wrote: >> Date: Tue, 5 May 2009 09:24:53 -0600 >> From: Charles R Harris >> > ... > >> This is almost always an ATLAS problem. Where did your >> ATLAS come from and >> what distro are you running? >> > > You are probably right. I compiled and installed ATLAS from source. The distro is Redhat Enterprise Linux 4. I had to because the ones from the distro are compiled targetting 64-bit architecture. > > I largely followed the instructions at http://www.scipy.org/Installing_SciPy/Linux#head-eecf834fad12bf7a625752528547588a93f8263c . Built lapack 3.1.1, then copied the library to ATLAS and compiled per instructions. The compilers are CC='gcc -m32' version 3.4.4 (enforcing 32 bit compilation), g77 3.4.4. At various points I needed to define flags to ensure 32 bits, though not in numpy compilation. I have gfortran on the system but I didn't use it. > What does ldd lapack_lite.so returns (lapack_lite.so is in numpy/linalg, in your installed directory) ? It may be that numpy uses gfortran, whereas ATLAS is built with g77. gfortran and g77 should not be mixed, cheers, David From sebastian.walter at gmail.com Wed May 6 04:07:12 2009 From: sebastian.walter at gmail.com (Sebastian Walter) Date: Wed, 6 May 2009 10:07:12 +0200 Subject: [Numpy-discussion] difficult optimization problem In-Reply-To: <4A00E17D.6040203@jpl.nasa.gov> References: <4A00E17D.6040203@jpl.nasa.gov> Message-ID: I tried looking at your question but ... kind of unusable without some documentation. 
You need to give at least the following information: what kind of optimization problem? LP,NLP, Mixed Integer LP, Stochastic, semiinfinite, semidefinite? Most solvers require the problem in the following form min_x f(x) subject to g(x)<=0 h(x) = 0 In your case that would mean: g(x) = g(x,f(x)). On Wed, May 6, 2009 at 3:01 AM, Mathew Yeates wrote: > Hi > I'm trying to solve an optimization problem where the search domain is > limited. Suppose I want to minimize the function f(x,y) but f(x,y) is > only valid over a subset (unknown without calling f) of (x,y)? > > I tried looking at OpenOpt but ... kind of unusable without some > documentation. > > Thanks > Mathew > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From malkarouri at yahoo.co.uk Wed May 6 05:15:39 2009 From: malkarouri at yahoo.co.uk (Muhammad Alkarouri) Date: Wed, 6 May 2009 09:15:39 +0000 (GMT) Subject: [Numpy-discussion] linalg.svd not working? In-Reply-To: <4A014037.5080205@ar.media.kyoto-u.ac.jp> Message-ID: <208618.12427.qm@web24203.mail.ird.yahoo.com> --- On Wed, 6/5/09, David Cournapeau wrote: ... > What does ldd lapack_lite.so returns (lapack_lite.so is in > numpy/linalg, > in your installed directory) ? It may be that numpy uses > gfortran, > whereas ATLAS is built with g77. gfortran and g77 should > not be mixed, Thanks David. I went there and found that lapack_lite.so didn't link to ATLAS in the first place. So I rebuilt numpy to ensure that. Now I have: ma856388 at H:linalg>ldd lapack_lite.so linux-gate.so.1 => (0xffffe000) liblapack.so => /users/d88/ma856388/lib/liblapack.so (0xf790f000) libptf77blas.so => /users/d88/ma856388/lib/libptf77blas.so (0xf78f0000) libptcblas.so => /users/d88/ma856388/lib/libptcblas.so (0xf78cf000) libatlas.so => /users/d88/ma856388/lib/libatlas.so (0xf755d000) libf77blas.so => /users/d88/ma856388/lib/libf77blas.so (0xf753e000) libcblas.so => /users/d88/ma856388/lib/libcblas.so (0xf751c000) libg2c.so.0 => /usr/lib/libg2c.so.0 (0xf74d2000) libm.so.6 => /lib/tls/libm.so.6 (0xf74af000) libgcc_s.so.1 => /lib/libgcc_s.so.1 (0xf74a7000) libc.so.6 => /lib/tls/libc.so.6 (0xf737d000) libpthread.so.0 => /lib/tls/libpthread.so.0 (0xf736b000) /lib/ld-linux.so.2 (0x56555000) but still, test_pinv hangs using almost 100% of CPU time. Any suggestions? Regards, Muhammad Alkarouri From david at ar.media.kyoto-u.ac.jp Wed May 6 05:10:20 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 06 May 2009 18:10:20 +0900 Subject: [Numpy-discussion] linalg.svd not working? In-Reply-To: <208618.12427.qm@web24203.mail.ird.yahoo.com> References: <208618.12427.qm@web24203.mail.ird.yahoo.com> Message-ID: <4A0153FC.8000106@ar.media.kyoto-u.ac.jp> Muhammad Alkarouri wrote: > --- On Wed, 6/5/09, David Cournapeau wrote: > ... > >> What does ldd lapack_lite.so returns (lapack_lite.so is in >> numpy/linalg, >> in your installed directory) ? It may be that numpy uses >> gfortran, >> whereas ATLAS is built with g77. gfortran and g77 should >> not be mixed, >> > > Thanks David. I went there and found that lapack_lite.so didn't link to ATLAS in the first place. So I rebuilt numpy to ensure that. Now I have: > Ok, so that's not a gfortran problem. As Chuck, I think that's an atlas problem (you could check by compiling without ATLAS: ATLAS=None python setup.py build after removing the build directory). 
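Once that no-ATLAS build is installed, a quick sanity check in a fresh
interpreter would be something along these lines (a rough sketch only):

    import numpy as np
    a = np.random.randn(50, 50)
    u, s, vt = np.linalg.svd(a)   # lapack_lite path; should return almost immediately
    print np.allclose(a, np.dot(u * s, vt))

If that still hangs, the problem is not in ATLAS.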
Your gcc compiler is quite old, so I would not be surprised it that were related, cheers, David From dsdale24 at gmail.com Wed May 6 07:17:52 2009 From: dsdale24 at gmail.com (Darren Dale) Date: Wed, 6 May 2009 07:17:52 -0400 Subject: [Numpy-discussion] [review] py3k_bootstrap branch In-Reply-To: <5b8d13220905051812o39874428k6077cd5d9f3dc225@mail.gmail.com> References: <5b8d13220905050233m14c4a1e1l80dd6f231e100d5a@mail.gmail.com> <4A005A8C.4060709@gmail.com> <5b8d13220905050850i125ebe82x2ec026af4035b11c@mail.gmail.com> <5b8d13220905051812o39874428k6077cd5d9f3dc225@mail.gmail.com> Message-ID: On Tue, May 5, 2009 at 9:12 PM, David Cournapeau wrote: > On Wed, May 6, 2009 at 3:57 AM, Darren Dale wrote: > > > > > There is a lot of interest in a 3to2 tool, and I have read speculation > > ( > http://sayspy.blogspot.com/2009/04/pycon-2009-recap-best-pycon-ever.html) > > that going from 3 to 2 should be easier than the other way around. Maybe > it > > will be worth keeping an eye on. > > I can see how this could help people who have a working python 3 > implementation, but in numpy's case, I am not so sure. Do you know > which version of python is targeted by 3to2 ? 2.6, 2.5 or even below ? I was thinking further down the road, once numpy has a python-3 implementation. Based on http://wiki.python.org/moin/3to2 , it looks like people are thinking about the possibility of supporting 2.5 and earlier. Darren -------------- next part -------------- An HTML attachment was scrubbed... URL: From schut at sarvision.nl Wed May 6 07:28:11 2009 From: schut at sarvision.nl (Vincent Schut) Date: Wed, 06 May 2009 13:28:11 +0200 Subject: [Numpy-discussion] bitwise view on numpy array Message-ID: Hi, I'm gonna have large (e.g. 2400x2400) arrays of 16 and 32 bit bitfields. I've been searching in vain for an efficient and convenient way to represent these array's individual bit's (or, even better, configureable bitfields of 1-4 bits each). Of course I know I can 'split' the array in its separate bitfields using bitwise operators and shifts, but this will greatly increase the memory usage because it'll create one byte array for each bitfield. So I was looking for a way to create a bitwise view on the original array's data. I've been looking at recarray's, but the smallest element these can use are bytes, correct?. I've been looking at ctypes arrays of Structure subclasses, which can define bitfields. However, these will give me an object array of elements with the Structure class subclass, and only allow me to access the bits per array element instead of for the entire array (or a subset), e.g. data[:].bit17-19 or someting like that. After searching the net in vain for some hours, the list is my last resort :-) Anyone having ideas of how to get both memory-efficient and convenient access to single bits of a numpy array? On a slightly related note, during my search I found some comments saying that numpy.bool arrays use an entire byte for each element. Could someone confirm (or, better, negate) that? Thanks, Vincent. From stefan at sun.ac.za Wed May 6 07:45:09 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 6 May 2009 13:45:09 +0200 Subject: [Numpy-discussion] bitwise view on numpy array In-Reply-To: References: Message-ID: <9457e7c80905060445o54657ba9pc66cfefc99c87c4c@mail.gmail.com> Hi Vincent Take a look at http://pypi.python.org/pypi/bitarray/ I'm not sure if you can initialise bitarrays from NumPy arrays. 
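With plain numpy you can also pull a field out on demand with shifts and
masks, so only the field you are actually looking at gets materialised,
rather than one array per field up front -- a rough, untested sketch
(the field layout below is hypothetical):

    import numpy as np

    def bitfield(arr, start, width):
        # extract `width` bits starting at bit `start` of an unsigned integer array
        mask = (1 << width) - 1
        return (arr >> start) & mask

    flags = np.zeros((2400, 2400), dtype=np.uint16)  # stand-in for the real data
    cloud = bitfield(flags, 0, 2)   # say, a 2-bit field in bits 0-1
    land = bitfield(flags, 2, 1)    # and a 1-bit field in bit 2

Each call creates one temporary of the same shape, but nothing sticks
around unless you keep a reference to it.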
If not, you'll have to implement a conversion scheme, but that can be done without making a copy. Regards St?fan 2009/5/6 Vincent Schut : > Hi, > > I'm gonna have large (e.g. 2400x2400) arrays of 16 and 32 bit bitfields. > ?I've been searching in vain for an efficient and convenient way to > represent these array's individual bit's (or, even better, configureable > bitfields of 1-4 bits each). From malkarouri at yahoo.co.uk Wed May 6 09:31:39 2009 From: malkarouri at yahoo.co.uk (Muhammad Alkarouri) Date: Wed, 6 May 2009 13:31:39 +0000 (GMT) Subject: [Numpy-discussion] linalg.svd not working? In-Reply-To: <4A0153FC.8000106@ar.media.kyoto-u.ac.jp> Message-ID: <881804.58063.qm@web24202.mail.ird.yahoo.com> --- On Wed, 6/5/09, David Cournapeau wrote: ... > Ok, so that's not a gfortran problem. As Chuck, I think > that's an atlas > problem (you could check by compiling without ATLAS: It is an atlas problem. Not that I knew how to correct it, but I was able to build numpy with a standard package blas and lapack, and the tests passed without incident. Many thanks. I guess I will leave the atlas benefits for another day. Cheers, Muhammad Alkarouri From Gerry.Talbot at amd.com Wed May 6 09:44:36 2009 From: Gerry.Talbot at amd.com (Talbot, Gerry) Date: Wed, 6 May 2009 08:44:36 -0500 Subject: [Numpy-discussion] Recurrence relationships Message-ID: Does anyone know how to efficiently implement a recurrence relationship in numpy such as: y[n] = A*x[n] + B*y[n-1] Thanks, Gerry -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Wed May 6 09:53:25 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 6 May 2009 06:53:25 -0700 Subject: [Numpy-discussion] Recurrence relationships In-Reply-To: References: Message-ID: On Wed, May 6, 2009 at 6:44 AM, Talbot, Gerry wrote: > Does anyone know how to efficiently implement a recurrence relationship in > numpy such as: > > > > ???????????? y[n] = A*x[n] + B*y[n-1] On an intel chip I'd use a Monte Carlo simulation. On an amd chip I'd use: >> x = np.array([1,2,3]) >> y = np.array([4,5,6]) >> y = x[1:] + y[:-1] >> y array([6, 8]) From Gerry.Talbot at amd.com Wed May 6 10:00:08 2009 From: Gerry.Talbot at amd.com (Talbot, Gerry) Date: Wed, 6 May 2009 09:00:08 -0500 Subject: [Numpy-discussion] Recurrence relationships In-Reply-To: References: Message-ID: Sorry, I guess I wasn't clear, I meant: for n in xrange(1,N): y[n] = A*x[n] + B*y[n-1] So y[n-1] is the result from the previous loop iteration. Gerry -----Original Message----- From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Keith Goodman Sent: Wednesday, May 06, 2009 9:53 AM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] Recurrence relationships On Wed, May 6, 2009 at 6:44 AM, Talbot, Gerry wrote: > Does anyone know how to efficiently implement a recurrence relationship in > numpy such as: > > > > ???????????? y[n] = A*x[n] + B*y[n-1] On an intel chip I'd use a Monte Carlo simulation. 
On an amd chip I'd use: >> x = np.array([1,2,3]) >> y = np.array([4,5,6]) >> y = x[1:] + y[:-1] >> y array([6, 8]) _______________________________________________ Numpy-discussion mailing list Numpy-discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion From josef.pktd at gmail.com Wed May 6 10:21:13 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 6 May 2009 10:21:13 -0400 Subject: [Numpy-discussion] Recurrence relationships In-Reply-To: References: Message-ID: <1cd32cbb0905060721y4282cb77w4ca460529774c901@mail.gmail.com> On Wed, May 6, 2009 at 10:00 AM, Talbot, Gerry wrote: > Sorry, I guess I wasn't clear, I meant: > > ? ? ? ?for n in xrange(1,N): > ? ? ? ? ?y[n] = A*x[n] + B*y[n-1] > > So y[n-1] is the result from the previous loop iteration. > I was using scipy.signal for this but I have to look up what I did exactly. I think either signal.correlate or using signal.lti. Josef From natachai_w at hotmail.com Wed May 6 10:18:35 2009 From: natachai_w at hotmail.com (natachai wongchavalidkul) Date: Wed, 6 May 2009 07:18:35 -0700 Subject: [Numpy-discussion] ValueError: dimensions too large. Message-ID: Hello alls, I currently have a problem with creating a multi-dimensional array in numpy. The following is what I am trying to do and the error message. >>> test = zeros((3,3,3,3,3,3,10,4,6,2,18,10,11,4,2,2), dtype=float); Traceback (most recent call last): File "", line 1, in test = zeros((3,3,3,3,3,3,10,4,6,2,18,10,11,4,2,2), dtype=float); ValueError: dimensions too large. I haven't sure if they should be something to do with the memory or any other suggestions for the way to solve this problem. Anyway, comments or suggestions will be really appreciate though. Thank you _________________________________________________________________ Hotmail? has a new way to see what's up with your friends. http://windowslive.com/Tutorial/Hotmail/WhatsNew?ocid=TXT_TAGLM_WL_HM_Tutorial_WhatsNew1_052009 -------------- next part -------------- An HTML attachment was scrubbed... URL: From aisaac at american.edu Wed May 6 10:25:17 2009 From: aisaac at american.edu (Alan G Isaac) Date: Wed, 06 May 2009 10:25:17 -0400 Subject: [Numpy-discussion] Recurrence relationships In-Reply-To: References: Message-ID: <4A019DCD.8030705@american.edu> On 5/6/2009 10:00 AM Talbot, Gerry apparently wrote: > for n in xrange(1,N): > y[n] = A*x[n] + B*y[n-1] So, x is known before you start? How big is N? Also, is y.shape (N,)? Do you need all of y or only y[N]? Alan Isaac From silva at lma.cnrs-mrs.fr Wed May 6 10:29:21 2009 From: silva at lma.cnrs-mrs.fr (Fabrice Silva) Date: Wed, 06 May 2009 16:29:21 +0200 Subject: [Numpy-discussion] Recurrence relationships In-Reply-To: <1cd32cbb0905060721y4282cb77w4ca460529774c901@mail.gmail.com> References: <1cd32cbb0905060721y4282cb77w4ca460529774c901@mail.gmail.com> Message-ID: <1241620161.2950.30.camel@localhost.localdomain> Le mercredi 06 mai 2009 ? 10:21 -0400, josef.pktd at gmail.com a ?crit : > On Wed, May 6, 2009 at 10:00 AM, Talbot, Gerry wrote: > > Sorry, I guess I wasn't clear, I meant: > > > > for n in xrange(1,N): > > y[n] = A*x[n] + B*y[n-1] > > > > So y[n-1] is the result from the previous loop iteration. > > > > I was using scipy.signal for this but I have to look up what I did > exactly. I think either signal.correlate or using signal.lti. > > Josef Isn't it what scipy.signal.lfilter does ? y=scipy.signal.lfilter([A],[1,-B],x) You may be careful with initial conditions... 
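For instance, something along these lines (a sketch only -- A, B and the
starting value y0 are made up for illustration):

    import numpy as np
    from scipy import signal

    A, B = 0.5, 0.9
    y0 = 2.0                       # hypothetical y[-1] to continue from
    x = np.random.randn(1000)

    # y[n] = A*x[n] + B*y[n-1]  <=>  b = [A], a = [1, -B]
    # zi is the filter state; for this first-order filter the state that
    # reproduces a known previous output y0 is simply [B * y0]
    y, zf = signal.lfilter([A], [1.0, -B], x, zi=[B * y0])

The returned zf can then be passed back in as zi when filtering the next
block of samples.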
-- Fabrice Silva LMA UPR CNRS 7051 From josef.pktd at gmail.com Wed May 6 10:28:46 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 6 May 2009 10:28:46 -0400 Subject: [Numpy-discussion] Recurrence relationships In-Reply-To: <1cd32cbb0905060721y4282cb77w4ca460529774c901@mail.gmail.com> References: <1cd32cbb0905060721y4282cb77w4ca460529774c901@mail.gmail.com> Message-ID: <1cd32cbb0905060728y6df45c02n83b978e09904a5f6@mail.gmail.com> On Wed, May 6, 2009 at 10:21 AM, wrote: > On Wed, May 6, 2009 at 10:00 AM, Talbot, Gerry wrote: >> Sorry, I guess I wasn't clear, I meant: >> >> ? ? ? ?for n in xrange(1,N): >> ? ? ? ? ?y[n] = A*x[n] + B*y[n-1] >> >> So y[n-1] is the result from the previous loop iteration. >> > > I was using scipy.signal for this but I have to look up what I did > exactly. I think either signal.correlate or using signal.lti. > No, its signal.lfilter, below is a part of a script I used to simulate and estimate an AR(1) process, which is similar to your example. I haven't looked at it in a while but it might give you the general idea. Josef # Simulate AR(1) #-------------- # ar * y = ma * eta ar = [1, -0.8] ma = [1.0] # generate AR data eta = 0.1 * np.random.randn(1000) yar1 = signal.lfilter(ar, ma, eta) etahat = signal.lfilter(ma, ar, y) np.all(etahat == eta) # find error for given filter on data print 'AR(2)' for rho in [0.01, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.79, 0.8, 0.81, 0.9]: etahatr = signal.lfilter(ma, [1, --rho], yar1) print rho,np.sum(etahatr*etahatr) print 'AR(2)' for rho2 in np.linspace(-0.4,0.4,9): etahatr = signal.lfilter(ma, [1, -0.8, -rho2], yar1) print rho2,np.sum(etahatr*etahatr) def errfn(rho): etahatr = signal.lfilter(ma, [1, -rho], yar1) #print rho,np.sum(etahatr*etahatr) return etahatr def errssfn(rho): etahatr = signal.lfilter(ma, [1, -rho], yar1) return np.sum(etahatr*etahatr) resultls = optimize.leastsq(errfn,[0.5]) print 'LS ARMA(1,0)', resultls resultfmin = optimize.fmin(errssfn, 0.5) print 'fminLS ARMA(1,0)', resultfmin From charlesr.harris at gmail.com Wed May 6 10:32:35 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 6 May 2009 08:32:35 -0600 Subject: [Numpy-discussion] ValueError: dimensions too large. In-Reply-To: References: Message-ID: On Wed, May 6, 2009 at 8:18 AM, natachai wongchavalidkul < natachai_w at hotmail.com> wrote: > > Hello alls, > > I currently have a problem with creating a multi-dimensional array in > numpy. The following is what I am trying to do and the error message. > > >>> test = zeros((3,3,3,3,3,3,10,4,6,2,18,10,11,4,2,2), dtype=float); > > Traceback (most recent call last): > File "", line 1, in > test = zeros((3,3,3,3,3,3,10,4,6,2,18,10,11,4,2,2), dtype=float); > ValueError: dimensions too large. > > I haven't sure if they should be something to do with the memory or any > other suggestions for the way to solve this problem. Anyway, comments or > suggestions will be really appreciate though. > There is not enough memory to hold the array. In [3]: prod = 1 In [4]: for i in (3,3,3,3,3,3,10,4,6,2,18,10,11,4,2,2) : ...: prod *= i ...: In [5]: prod Out[5]: 11085465600L That is 11 gigs of floats, each of which is 8 bytes. So you need about 88 gigs for the array. I expect that that is not what you are trying to do. Do you just want an array with the listed values? Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
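(For future reference, the same estimate in a couple of lines -- np.prod
with an int64 accumulator so the product doesn't overflow on a 32-bit
build:

    import numpy as np
    shape = (3,3,3,3,3,3,10,4,6,2,18,10,11,4,2,2)
    print np.prod(shape, dtype=np.int64) * 8 / 1e9   # roughly 88.7, i.e. ~88 GB of float64

No data is allocated, so it is safe to run on any shape.)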
URL: From Gerry.Talbot at amd.com Wed May 6 10:37:09 2009 From: Gerry.Talbot at amd.com (Talbot, Gerry) Date: Wed, 6 May 2009 09:37:09 -0500 Subject: [Numpy-discussion] Recurrence relationships In-Reply-To: <4A019DCD.8030705@american.edu> References: <4A019DCD.8030705@american.edu> Message-ID: The application is essentially filtering 1D arrays, typically N is >20e6, the required result is y[1:N]. Gerry -----Original Message----- From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Alan G Isaac Sent: Wednesday, May 06, 2009 10:25 AM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] Recurrence relationships On 5/6/2009 10:00 AM Talbot, Gerry apparently wrote: > for n in xrange(1,N): > y[n] = A*x[n] + B*y[n-1] So, x is known before you start? How big is N? Also, is y.shape (N,)? Do you need all of y or only y[N]? Alan Isaac _______________________________________________ Numpy-discussion mailing list Numpy-discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion From cournape at gmail.com Wed May 6 10:55:18 2009 From: cournape at gmail.com (David Cournapeau) Date: Wed, 6 May 2009 23:55:18 +0900 Subject: [Numpy-discussion] Recurrence relationships In-Reply-To: References: Message-ID: <5b8d13220905060755y60c05847u1ca8556341e56a34@mail.gmail.com> On Wed, May 6, 2009 at 10:44 PM, Talbot, Gerry wrote: > Does anyone know how to efficiently implement a recurrence relationship in > numpy such as: > > > > ???????????? y[n] = A*x[n] + B*y[n-1] That's the direct implement of a linear filter with an infinite impulse response. That's exactly what scipy.signal.lfilter is for, cheers, David From myeates at jpl.nasa.gov Wed May 6 15:16:12 2009 From: myeates at jpl.nasa.gov (Mathew Yeates) Date: Wed, 06 May 2009 12:16:12 -0700 Subject: [Numpy-discussion] hairy optimization problem Message-ID: <4A01E1FC.8090701@jpl.nasa.gov> I have a function f(x,y) which produces N values [v1,v2,v3 .... vN] where some of the values are None (only found after evaluation) each evaluation of "f" is expensive and N is large. I want N x,y pairs which produce the optimal value in each column. A brute force approach would be to generate [v11,v12,v13,v14 ....] [v21,v22,v23 ....... ] etc then locate the maximum of each column. This is far too slow ......Any other ideas? From dwf at cs.toronto.edu Wed May 6 18:03:24 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Wed, 6 May 2009 18:03:24 -0400 Subject: [Numpy-discussion] OS-X binary name... In-Reply-To: <4A01284E.6070104@noaa.gov> References: <4A01284E.6070104@noaa.gov> Message-ID: On 6-May-09, at 2:03 AM, Christopher Barker wrote: > maybe: > > numpy-1.3.0-py2.5-macosx-python.org.dmg +1 on having python.org in the name. It clarifies and reinforces the case that this isn't for the "Apple-shipped" Python (which I heard comes with NumPy now?). 
David From sccolbert at gmail.com Wed May 6 18:06:09 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Wed, 6 May 2009 18:06:09 -0400 Subject: [Numpy-discussion] efficient 3d histogram creation In-Reply-To: <91cf711d0905050646q7652eepc55d1aed17b1d1ed@mail.gmail.com> References: <7f014ea60905031715h635a69faof6a06e10c3621ba7@mail.gmail.com> <1cd32cbb0905031736n99b1907v13de267be7639f39@mail.gmail.com> <7f014ea60905032131i114b8fdbyb9bfd04ad7d1200a@mail.gmail.com> <1cd32cbb0905040400p4dcd3de7he45b3be942dc2c02@mail.gmail.com> <91cf711d0905041218p70bb44ct35a601844c8c262b@mail.gmail.com> <7f014ea60905041300y49b48055i6df8d5e598d0fe80@mail.gmail.com> <1cd32cbb0905041318g11dea0b9oa61f08bcc380144d@mail.gmail.com> <91cf711d0905050646q7652eepc55d1aed17b1d1ed@mail.gmail.com> Message-ID: <7f014ea60905061506o7253a315p940d6cbd2cc2b420@mail.gmail.com> I decided to hold myself over until being able to take a hard look at the numpy histogramdd code: Here is a quick thing a put together in cython. It's a 40x speedup over histogramdd on Vista 32 using the minGW32 compiler. For a (480, 630, 3) array, this executed in 0.005 seconds on my machine. This only works for arrays with uint8 data types having dimensions (x, y, 3) (common image format). The return array is a (16, 16, 16) equal width bin histogram of the input. If anyone wants the cython C-output, let me know and I will email it to you. If there is interest, I will extend this for different size bins and aliases for different data types. Chris import numpy as np cimport numpy as np DTYPE = np.uint8 DTYPE32 = np.int ctypedef np.uint8_t DTYPE_t ctypedef np.int_t DTYPE_t32 def hist3d(np.ndarray[DTYPE_t, ndim=3] img): cdef int x = img.shape[0] cdef int y = img.shape[1] cdef int z = img.shape[2] cdef int addx cdef int addy cdef int addz cdef np.ndarray[DTYPE_t32, ndim=3] out = np.zeros([16, 16, 16], dtype=DTYPE32) cdef int i, j, v0, v1, v2 for i in range(x): for j in range(y): v0 = img[i, j, 0] v1 = img[i, j, 1] v2 = img[i, j, 2] addx = (v0 - (v0 % 16)) / 16 addy = (v1 - (v1 % 16)) / 16 addz = (v2 - (v2 % 16)) / 16 out[addx, addy, addz] += 1 return out On Tue, May 5, 2009 at 9:46 AM, David Huard wrote: > > > On Mon, May 4, 2009 at 4:18 PM, wrote: > >> On Mon, May 4, 2009 at 4:00 PM, Chris Colbert >> wrote: >> > i'll take a look at them over the next few days and see what i can hack >> out. >> > >> > Chris >> > >> > On Mon, May 4, 2009 at 3:18 PM, David Huard >> wrote: >> >> >> >> >> >> On Mon, May 4, 2009 at 7:00 AM, wrote: >> >>> >> >>> On Mon, May 4, 2009 at 12:31 AM, Chris Colbert >> >>> wrote: >> >>> > this actually sort of worked. Thanks for putting me on the right >> track. >> >>> > >> >>> > Here is what I ended up with. >> >>> > >> >>> > this is what I ended up with: >> >>> > >> >>> > def hist3d(imgarray): >> >>> > histarray = N.zeros((16, 16, 16)) >> >>> > temp = imgarray.copy() >> >>> > bins = N.arange(0, 257, 16) >> >>> > histarray = N.histogramdd((temp[:,:,0].ravel(), >> >>> > temp[:,:,1].ravel(), >> >>> > temp[:,:,2].ravel()), bins=(bins, bins, bins))[0] >> >>> > return histarray >> >>> > >> >>> > this creates a 3d histogram of rgb image values in the range 0,255 >> >>> > using 16 >> >>> > bins per component color. >> >>> > >> >>> > on a 640x480 image, it executes in 0.3 seconds vs 4.5 seconds for a >> for >> >>> > loop. >> >>> > >> >>> > not quite framerate, but good enough for prototyping. 
>> >>> > >> >>> >> >>> I don't think your copy to temp is necessary, and use reshape(-1,3) as >> >>> in the example of Stefan, which will avoid copying the array 3 times. >> >>> >> >>> If you need to gain some more speed, then rewriting histogramdd and >> >>> removing some of the unnecessary checks and calculations looks >> >>> possible. >> >> >> >> Indeed, the strategy used in the histogram function is faster than the >> one >> >> used in the histogramdd case, so porting one to the other should speed >> >> things up. >> >> >> >> David >> >> is searchsorted faster than digitize and bincount ? >> > > That depends on the number of bins and whether or not the bin width is > uniform. A 1D benchmark I did a while ago showed that if the bin width is > uniform, then the best strategy is to create a counter initialized to 0, > loop through the data, compute i = (x-bin0) /binwidth and increment counter > i by 1 (or by the weight of the data). If the bins are non uniform, then for > nbin > 30 you'd better use searchsort, and digitize otherwise. > > For those interested in speeding up histogram code, I recommend reading a > thread started by Cameron Walsh on the 12/12/06 named "Histograms of > extremely large data sets" Code and benchmarks were posted. > > Chris, if your bins all have the same width, then you can certainly write > an histogramdd routine that is way faster by using the indexing trick > instead of digitize or searchsort. > > Cheers, > > David > > > > >> >> Using the idea of histogramdd, I get a bit below a tenth of a second, >> my best for this problem is below. >> I was trying for a while what the fastest way is to convert a two >> dimensional array into a one dimensional index for bincount. I found >> that using the return index of unique1d is very slow compared to >> numeric index calculation. >> >> Josef >> >> example timed for: >> nobs = 307200 >> nbins = 16 >> factors = np.random.randint(256,size=(nobs,3)).copy() >> factors2 = factors.reshape(-1,480,3).copy() >> >> def hist3(factorsin, nbins): >> if factorsin.ndim != 2: >> factors = factorsin.reshape(-1,factorsin.shape[-1]) >> else: >> factors = factorsin >> N, D = factors.shape >> darr = np.empty(factors.T.shape, dtype=int) >> nele = np.max(factors)+1 >> bins = np.arange(0, nele, nele/nbins) >> bins[-1] += 1 >> for i in range(D): >> darr[i] = np.digitize(factors[:,i],bins) - 1 >> >> #add weighted rows >> darrind = darr[D-1] >> for i in range(D-1): >> darrind += darr[i]*nbins**(D-i-1) >> return np.bincount(darrind) # return flat not reshaped >> > >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chris.Barker at noaa.gov Wed May 6 18:10:43 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed, 06 May 2009 15:10:43 -0700 Subject: [Numpy-discussion] OS-X binary name... In-Reply-To: References: <4A01284E.6070104@noaa.gov> Message-ID: <4A020AE3.7090005@noaa.gov> David Warde-Farley wrote: > On 6-May-09, at 2:03 AM, Christopher Barker wrote: >> maybe: >> >> numpy-1.3.0-py2.5-macosx-python.org.dmg > > +1 on having python.org in the name. It clarifies and reinforces the > case that this isn't for the "Apple-shipped" Python exactly. 
> (which I heard comes with NumPy now?). yup, but an old and crusty version... -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From thomas.robitaille at gmail.com Wed May 6 19:19:07 2009 From: thomas.robitaille at gmail.com (Thomas Robitaille) Date: Wed, 6 May 2009 16:19:07 -0700 (PDT) Subject: [Numpy-discussion] Numpy Trac site redirecting in a loop? In-Reply-To: <49E6658A.1090004@jhu.edu> References: <49E64FF7.3050804@jhu.edu> <49E6658A.1090004@jhu.edu> Message-ID: <23417366.post@talk.nabble.com> Hi, I'm having the exact same problem, trying to log in to the trac website for numpy, and getting stuck in a redirect loop. I tried different browsers, and no luck. The browser gets stuck on http://projects.scipy.org/numpy/prefs/account and stops loading after a while because of too many redirects... Is there any way around this? Thanks, Thomas -- View this message in context: http://www.nabble.com/Numpy-Trac-site-redirecting-in-a-loop--tp23067410p23417366.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From robert.kern at gmail.com Wed May 6 19:24:48 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 6 May 2009 19:24:48 -0400 Subject: [Numpy-discussion] Numpy Trac site redirecting in a loop? In-Reply-To: <23417366.post@talk.nabble.com> References: <49E64FF7.3050804@jhu.edu> <49E6658A.1090004@jhu.edu> <23417366.post@talk.nabble.com> Message-ID: <3d375d730905061624k6376e57brfb19d72344b99ec2@mail.gmail.com> On Wed, May 6, 2009 at 19:19, Thomas Robitaille wrote: > > Hi, > > I'm having the exact same problem, trying to log in to the trac website for > numpy, and getting stuck in a redirect loop. I tried different browsers, and > no luck. The browser gets stuck on > > http://projects.scipy.org/numpy/prefs/account > > and stops loading after a while because of too many redirects... > > Is there any way around this? I don't see this on my Mac with Firefox 3. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From josef.pktd at gmail.com Wed May 6 19:30:00 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 6 May 2009 19:30:00 -0400 Subject: [Numpy-discussion] efficient 3d histogram creation In-Reply-To: <7f014ea60905061506o7253a315p940d6cbd2cc2b420@mail.gmail.com> References: <7f014ea60905031715h635a69faof6a06e10c3621ba7@mail.gmail.com> <1cd32cbb0905031736n99b1907v13de267be7639f39@mail.gmail.com> <7f014ea60905032131i114b8fdbyb9bfd04ad7d1200a@mail.gmail.com> <1cd32cbb0905040400p4dcd3de7he45b3be942dc2c02@mail.gmail.com> <91cf711d0905041218p70bb44ct35a601844c8c262b@mail.gmail.com> <7f014ea60905041300y49b48055i6df8d5e598d0fe80@mail.gmail.com> <1cd32cbb0905041318g11dea0b9oa61f08bcc380144d@mail.gmail.com> <91cf711d0905050646q7652eepc55d1aed17b1d1ed@mail.gmail.com> <7f014ea60905061506o7253a315p940d6cbd2cc2b420@mail.gmail.com> Message-ID: <1cd32cbb0905061630t1e73e8a5i4bd454f789e22714@mail.gmail.com> On Wed, May 6, 2009 at 6:06 PM, Chris Colbert wrote: > I decided to hold myself over until being able to take a hard look at the > numpy histogramdd code: > > Here is a quick thing a put together in cython. It's a 40x speedup over > histogramdd on Vista 32 using the minGW32 compiler. 
For a (480, 630, 3) > array, this executed in 0.005 seconds on my machine. > > This only works for arrays with uint8 data types having dimensions (x, y, 3) > (common image format). The return array is a (16, 16, 16) equal width bin > histogram of the input. > > If anyone wants the cython C-output, let me know and I will email it to you. > > If there is interest, I will extend this for different size bins and aliases > for different data types. > > Chris > > import numpy as np > > cimport numpy as np > > DTYPE = np.uint8 > DTYPE32 = np.int > > ctypedef np.uint8_t DTYPE_t > ctypedef np.int_t DTYPE_t32 > > def hist3d(np.ndarray[DTYPE_t, ndim=3] img): > ??? cdef int x = img.shape[0] > ??? cdef int y = img.shape[1] > ??? cdef int z = img.shape[2] > ??? cdef int addx > ??? cdef int addy > ??? cdef int addz > ??? cdef np.ndarray[DTYPE_t32, ndim=3] out = np.zeros([16, 16, 16], > dtype=DTYPE32) > ??? cdef int i, j, v0, v1, v2 > > > ??? for i in range(x): > ??????? for j in range(y): > ??????????? v0 = img[i, j, 0] > ??????????? v1 = img[i, j, 1] > ??????????? v2 = img[i, j, 2] > ??????????? addx = (v0 - (v0 % 16)) / 16 > ??????????? addy = (v1 - (v1 % 16)) / 16 > ??????????? addz = (v2 - (v2 % 16)) / 16 > ??????????? out[addx, addy, addz] += 1 > > ??? return out > Thanks for the example for using cython. Once I figure out what the types are, cython will look very convenient for loops, and pyximport takes care of the compiler. Josef import pyximport; pyximport.install() import hist_rgb #name of .pyx files import numpy as np factors = np.random.randint(256,size=(480, 630, 3)) h = hist_rgb.hist3d(factors.astype(np.uint8)) print h[:,:,0] From thomas.robitaille at gmail.com Wed May 6 19:35:23 2009 From: thomas.robitaille at gmail.com (Thomas Robitaille) Date: Wed, 6 May 2009 16:35:23 -0700 (PDT) Subject: [Numpy-discussion] Numpy Trac site redirecting in a loop? In-Reply-To: <3d375d730905061624k6376e57brfb19d72344b99ec2@mail.gmail.com> References: <49E64FF7.3050804@jhu.edu> <49E6658A.1090004@jhu.edu> <23417366.post@talk.nabble.com> <3d375d730905061624k6376e57brfb19d72344b99ec2@mail.gmail.com> Message-ID: <23417595.post@talk.nabble.com> Could it be linked to specific users, since the problem occurs when loading the account page? I had the same problem on two different computers with two different browsers. Thomas -- View this message in context: http://www.nabble.com/Numpy-Trac-site-redirecting-in-a-loop--tp23067410p23417595.html Sent from the Numpy-discussion mailing list archive at Nabble.com. 
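For reference, the "indexing trick" David Huard mentions for uniform bins can also be written in pure NumPy for the 16-wide RGB bins used in this thread. A rough sketch, not benchmarked here; bincount is padded by hand since the number of occupied bins can be smaller than 16**3:

import numpy as np

img = np.random.randint(0, 256, size=(480, 630, 3)).astype(np.uint8)

rgb = img.reshape(-1, 3).astype(np.intp) // 16        # per-channel bin index, 0..15
flat = (rgb[:, 0] * 16 + rgb[:, 1]) * 16 + rgb[:, 2]  # single index into 16*16*16 bins

counts = np.bincount(flat)
hist = np.zeros(16 ** 3, dtype=counts.dtype)
hist[:counts.size] = counts                           # pad out the unused high bins
hist = hist.reshape(16, 16, 16)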
From sccolbert at gmail.com Wed May 6 19:39:18 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Wed, 6 May 2009 19:39:18 -0400 Subject: [Numpy-discussion] efficient 3d histogram creation In-Reply-To: <1cd32cbb0905061630t1e73e8a5i4bd454f789e22714@mail.gmail.com> References: <7f014ea60905031715h635a69faof6a06e10c3621ba7@mail.gmail.com> <1cd32cbb0905031736n99b1907v13de267be7639f39@mail.gmail.com> <7f014ea60905032131i114b8fdbyb9bfd04ad7d1200a@mail.gmail.com> <1cd32cbb0905040400p4dcd3de7he45b3be942dc2c02@mail.gmail.com> <91cf711d0905041218p70bb44ct35a601844c8c262b@mail.gmail.com> <7f014ea60905041300y49b48055i6df8d5e598d0fe80@mail.gmail.com> <1cd32cbb0905041318g11dea0b9oa61f08bcc380144d@mail.gmail.com> <91cf711d0905050646q7652eepc55d1aed17b1d1ed@mail.gmail.com> <7f014ea60905061506o7253a315p940d6cbd2cc2b420@mail.gmail.com> <1cd32cbb0905061630t1e73e8a5i4bd454f789e22714@mail.gmail.com> Message-ID: <7f014ea60905061639k21dd12b5j4ee6c2738758284@mail.gmail.com> i just realized I don't need the line: cdef int z = img.shape(2) it's left over from tinkering. sorry. And i should probably convert the out array to type float to handle large data sets. Chris On Wed, May 6, 2009 at 7:30 PM, wrote: > On Wed, May 6, 2009 at 6:06 PM, Chris Colbert wrote: > > I decided to hold myself over until being able to take a hard look at the > > numpy histogramdd code: > > > > Here is a quick thing a put together in cython. It's a 40x speedup over > > histogramdd on Vista 32 using the minGW32 compiler. For a (480, 630, 3) > > array, this executed in 0.005 seconds on my machine. > > > > This only works for arrays with uint8 data types having dimensions (x, y, > 3) > > (common image format). The return array is a (16, 16, 16) equal width bin > > histogram of the input. > > > > If anyone wants the cython C-output, let me know and I will email it to > you. > > > > If there is interest, I will extend this for different size bins and > aliases > > for different data types. > > > > Chris > > > > import numpy as np > > > > cimport numpy as np > > > > DTYPE = np.uint8 > > DTYPE32 = np.int > > > > ctypedef np.uint8_t DTYPE_t > > ctypedef np.int_t DTYPE_t32 > > > > def hist3d(np.ndarray[DTYPE_t, ndim=3] img): > > cdef int x = img.shape[0] > > cdef int y = img.shape[1] > > cdef int z = img.shape[2] > > cdef int addx > > cdef int addy > > cdef int addz > > cdef np.ndarray[DTYPE_t32, ndim=3] out = np.zeros([16, 16, 16], > > dtype=DTYPE32) > > cdef int i, j, v0, v1, v2 > > > > > > for i in range(x): > > for j in range(y): > > v0 = img[i, j, 0] > > v1 = img[i, j, 1] > > v2 = img[i, j, 2] > > addx = (v0 - (v0 % 16)) / 16 > > addy = (v1 - (v1 % 16)) / 16 > > addz = (v2 - (v2 % 16)) / 16 > > out[addx, addy, addz] += 1 > > > > return out > > > > Thanks for the example for using cython. Once I figure out what the > types are, cython will look very convenient for loops, and pyximport > takes care of the compiler. > > Josef > > import pyximport; pyximport.install() > import hist_rgb #name of .pyx files > > import numpy as np > factors = np.random.randint(256,size=(480, 630, 3)) > h = hist_rgb.hist3d(factors.astype(np.uint8)) > print h[:,:,0] > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
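One small aside on the binning arithmetic in the Cython code above: for the non-negative 0..255 values involved, the subtract-modulo-divide expression is just a floor division, so each bin index can be computed in one step (an observation only, not a change anyone made in the thread):

v0 = 201                                       # any value in 0..255
assert (v0 - (v0 % 16)) / 16 == v0 // 16       # both give bin index 12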
URL: From josef.pktd at gmail.com Wed May 6 20:21:45 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 6 May 2009 20:21:45 -0400 Subject: [Numpy-discussion] efficient 3d histogram creation In-Reply-To: <7f014ea60905061639k21dd12b5j4ee6c2738758284@mail.gmail.com> References: <7f014ea60905031715h635a69faof6a06e10c3621ba7@mail.gmail.com> <7f014ea60905032131i114b8fdbyb9bfd04ad7d1200a@mail.gmail.com> <1cd32cbb0905040400p4dcd3de7he45b3be942dc2c02@mail.gmail.com> <91cf711d0905041218p70bb44ct35a601844c8c262b@mail.gmail.com> <7f014ea60905041300y49b48055i6df8d5e598d0fe80@mail.gmail.com> <1cd32cbb0905041318g11dea0b9oa61f08bcc380144d@mail.gmail.com> <91cf711d0905050646q7652eepc55d1aed17b1d1ed@mail.gmail.com> <7f014ea60905061506o7253a315p940d6cbd2cc2b420@mail.gmail.com> <1cd32cbb0905061630t1e73e8a5i4bd454f789e22714@mail.gmail.com> <7f014ea60905061639k21dd12b5j4ee6c2738758284@mail.gmail.com> Message-ID: <1cd32cbb0905061721y233aa4c7uf580ad1d99539b1c@mail.gmail.com> On Wed, May 6, 2009 at 7:39 PM, Chris Colbert wrote: > i just realized I don't need the line: > > cdef int z = img.shape(2) > > it's left over from tinkering. sorry. And i should probably convert the out > array to type float to handle large data sets. > > Chris > > On Wed, May 6, 2009 at 7:30 PM, wrote: >> >> On Wed, May 6, 2009 at 6:06 PM, Chris Colbert wrote: >> > I decided to hold myself over until being able to take a hard look at >> > the >> > numpy histogramdd code: >> > >> > Here is a quick thing a put together in cython. It's a 40x speedup over >> > histogramdd on Vista 32 using the minGW32 compiler. For a (480, 630, 3) >> > array, this executed in 0.005 seconds on my machine. >> > >> > This only works for arrays with uint8 data types having dimensions (x, >> > y, 3) >> > (common image format). The return array is a (16, 16, 16) equal width >> > bin >> > histogram of the input. >> > >> > If anyone wants the cython C-output, let me know and I will email it to >> > you. >> > >> > If there is interest, I will extend this for different size bins and >> > aliases >> > for different data types. >> > >> > Chris >> > >> > import numpy as np >> > >> > cimport numpy as np >> > >> > DTYPE = np.uint8 >> > DTYPE32 = np.int >> > >> > ctypedef np.uint8_t DTYPE_t >> > ctypedef np.int_t DTYPE_t32 >> > >> > def hist3d(np.ndarray[DTYPE_t, ndim=3] img): >> > ??? cdef int x = img.shape[0] >> > ??? cdef int y = img.shape[1] >> > ??? cdef int z = img.shape[2] >> > ??? cdef int addx >> > ??? cdef int addy >> > ??? cdef int addz >> > ??? cdef np.ndarray[DTYPE_t32, ndim=3] out = np.zeros([16, 16, 16], >> > dtype=DTYPE32) >> > ??? cdef int i, j, v0, v1, v2 >> > >> > >> > ??? for i in range(x): >> > ??????? for j in range(y): >> > ??????????? v0 = img[i, j, 0] >> > ??????????? v1 = img[i, j, 1] >> > ??????????? v2 = img[i, j, 2] >> > ??????????? addx = (v0 - (v0 % 16)) / 16 >> > ??????????? addy = (v1 - (v1 % 16)) / 16 >> > ??????????? addz = (v2 - (v2 % 16)) / 16 >> > ??????????? out[addx, addy, addz] += 1 >> > >> > ??? return out >> > >> >> Thanks for the example for using cython. Once I figure out what the >> types are, cython will look very convenient for loops, and pyximport >> takes care of the compiler. >> >> Josef >> >> import pyximport; pyximport.install() >> import hist_rgb ? 
?#name of .pyx files >> >> import numpy as np >> factors = np.random.randint(256,size=(480, 630, 3)) >> h = hist_rgb.hist3d(factors.astype(np.uint8)) >> print h[:,:,0] playing some more with cython: here is a baby on the fly code generator input type int, output type float64 a dispatch function by type is missing no segfaults, even though most of the time a call the function with the wrong type. Josef code = ''' import numpy as np cimport numpy as np __all__ = ["hist3d"] DTYPE = ${imgtype} DTYPE32 = ${outtype} ctypedef ${imgtype}_t DTYPE_t ctypedef ${outtype}_t DTYPE_t32 def hist3d(np.ndarray[DTYPE_t, ndim=3] img): cdef int x = img.shape[0] cdef int y = img.shape[1] #cdef int z = img.shape[2] cdef int addx cdef int addy cdef int addz cdef np.ndarray[DTYPE_t32, ndim=3] out = np.zeros([16, 16, 16], dtype=DTYPE32) cdef int i, j, v0, v1, v2 for i in range(x): for j in range(y): v0 = img[i, j, 0] v1 = img[i, j, 1] v2 = img[i, j, 2] addx = (v0 - (v0 % 16)) / 16 addy = (v1 - (v1 % 16)) / 16 addz = (v2 - (v2 % 16)) / 16 out[addx, addy, addz] += 1 return out ''' from string import Template s = Template(code) src = s.substitute({'imgtype': 'np.int', 'outtype': 'np.float64'}) open('histrgbintfl2.pyx','w').write(src) import pyximport; pyximport.install() import histrgbintfl2 import numpy as np factors = np.random.randint(256,size=(480, 630, 3)) h = histrgbintfl2.hist3d(factors)#.astype(np.uint8)) print h[:,:,0] From kbasye1 at jhu.edu Wed May 6 20:32:05 2009 From: kbasye1 at jhu.edu (Ken Basye) Date: Wed, 6 May 2009 17:32:05 -0700 (PDT) Subject: [Numpy-discussion] Numpy Trac site redirecting in a loop? In-Reply-To: <23417366.post@talk.nabble.com> References: <49E64FF7.3050804@jhu.edu> <49E6658A.1090004@jhu.edu> <23417366.post@talk.nabble.com> Message-ID: <23418151.post@talk.nabble.com> I ran into something like this a couple weeks ago. I use Firefox 3 on MacOS. My work-around was to clear all the cookies from scipy.org, clear all authenticated sessions, then register a completely new account name. I never could get my existing account to stop looping. HTH, Ken Thomas Robitaille wrote: > > Hi, > > I'm having the exact same problem, trying to log in to the trac website > for numpy, and getting stuck in a redirect loop. I tried different > browsers, and no luck. The browser gets stuck on > > http://projects.scipy.org/numpy/prefs/account > > and stops loading after a while because of too many redirects... > > Is there any way around this? > > Thanks, > > Thomas > -- View this message in context: http://www.nabble.com/Numpy-Trac-site-redirecting-in-a-loop--tp23067410p23418151.html Sent from the Numpy-discussion mailing list archive at Nabble.com. 
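On the "dispatch function by type is missing" point above, a bare-bones sketch of how such a wrapper might look, assuming a .pyx module has already been written out per dtype with the Template script; module names such as histrgb_uint8 are made up for the example:

import numpy as np
import pyximport; pyximport.install()

_modules = {}                                  # dtype name -> imported extension module

def hist3d_any(img):
    # route to the generated extension that matches img.dtype (sketch only)
    name = 'histrgb_%s' % img.dtype.name       # e.g. 'histrgb_uint8' (hypothetical)
    mod = _modules.get(name)
    if mod is None:
        mod = __import__(name)                 # pyximport builds the .pyx on first import
        _modules[name] = mod
    return mod.hist3d(img)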
From sccolbert at gmail.com Wed May 6 20:34:18 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Wed, 6 May 2009 20:34:18 -0400 Subject: [Numpy-discussion] efficient 3d histogram creation In-Reply-To: <1cd32cbb0905061721y233aa4c7uf580ad1d99539b1c@mail.gmail.com> References: <7f014ea60905031715h635a69faof6a06e10c3621ba7@mail.gmail.com> <1cd32cbb0905040400p4dcd3de7he45b3be942dc2c02@mail.gmail.com> <91cf711d0905041218p70bb44ct35a601844c8c262b@mail.gmail.com> <7f014ea60905041300y49b48055i6df8d5e598d0fe80@mail.gmail.com> <1cd32cbb0905041318g11dea0b9oa61f08bcc380144d@mail.gmail.com> <91cf711d0905050646q7652eepc55d1aed17b1d1ed@mail.gmail.com> <7f014ea60905061506o7253a315p940d6cbd2cc2b420@mail.gmail.com> <1cd32cbb0905061630t1e73e8a5i4bd454f789e22714@mail.gmail.com> <7f014ea60905061639k21dd12b5j4ee6c2738758284@mail.gmail.com> <1cd32cbb0905061721y233aa4c7uf580ad1d99539b1c@mail.gmail.com> Message-ID: <7f014ea60905061734y3dfcb299w14681ff0020a4de1@mail.gmail.com> nice! This was really my first attempt at doing anything constructive with Cython. It was actually unbelievably easy to work with. I think i spent less time working on this, than I did trying to find an optimized solution using pure numpy and python. Chris On Wed, May 6, 2009 at 8:21 PM, wrote: > On Wed, May 6, 2009 at 7:39 PM, Chris Colbert wrote: > > i just realized I don't need the line: > > > > cdef int z = img.shape(2) > > > > it's left over from tinkering. sorry. And i should probably convert the > out > > array to type float to handle large data sets. > > > > Chris > > > > On Wed, May 6, 2009 at 7:30 PM, wrote: > >> > >> On Wed, May 6, 2009 at 6:06 PM, Chris Colbert > wrote: > >> > I decided to hold myself over until being able to take a hard look at > >> > the > >> > numpy histogramdd code: > >> > > >> > Here is a quick thing a put together in cython. It's a 40x speedup > over > >> > histogramdd on Vista 32 using the minGW32 compiler. For a (480, 630, > 3) > >> > array, this executed in 0.005 seconds on my machine. > >> > > >> > This only works for arrays with uint8 data types having dimensions (x, > >> > y, 3) > >> > (common image format). The return array is a (16, 16, 16) equal width > >> > bin > >> > histogram of the input. > >> > > >> > If anyone wants the cython C-output, let me know and I will email it > to > >> > you. > >> > > >> > If there is interest, I will extend this for different size bins and > >> > aliases > >> > for different data types. > >> > > >> > Chris > >> > > >> > import numpy as np > >> > > >> > cimport numpy as np > >> > > >> > DTYPE = np.uint8 > >> > DTYPE32 = np.int > >> > > >> > ctypedef np.uint8_t DTYPE_t > >> > ctypedef np.int_t DTYPE_t32 > >> > > >> > def hist3d(np.ndarray[DTYPE_t, ndim=3] img): > >> > cdef int x = img.shape[0] > >> > cdef int y = img.shape[1] > >> > cdef int z = img.shape[2] > >> > cdef int addx > >> > cdef int addy > >> > cdef int addz > >> > cdef np.ndarray[DTYPE_t32, ndim=3] out = np.zeros([16, 16, 16], > >> > dtype=DTYPE32) > >> > cdef int i, j, v0, v1, v2 > >> > > >> > > >> > for i in range(x): > >> > for j in range(y): > >> > v0 = img[i, j, 0] > >> > v1 = img[i, j, 1] > >> > v2 = img[i, j, 2] > >> > addx = (v0 - (v0 % 16)) / 16 > >> > addy = (v1 - (v1 % 16)) / 16 > >> > addz = (v2 - (v2 % 16)) / 16 > >> > out[addx, addy, addz] += 1 > >> > > >> > return out > >> > > >> > >> Thanks for the example for using cython. Once I figure out what the > >> types are, cython will look very convenient for loops, and pyximport > >> takes care of the compiler. 
> >> > >> Josef > >> > >> import pyximport; pyximport.install() > >> import hist_rgb #name of .pyx files > >> > >> import numpy as np > >> factors = np.random.randint(256,size=(480, 630, 3)) > >> h = hist_rgb.hist3d(factors.astype(np.uint8)) > >> print h[:,:,0] > > > playing some more with cython: here is a baby on the fly code generator > input type int, output type float64 > > a dispatch function by type is missing > > no segfaults, even though most of the time a call the function with > the wrong type. > > Josef > > code = ''' > import numpy as np > cimport numpy as np > > __all__ = ["hist3d"] > DTYPE = ${imgtype} > DTYPE32 = ${outtype} > > ctypedef ${imgtype}_t DTYPE_t > ctypedef ${outtype}_t DTYPE_t32 > > def hist3d(np.ndarray[DTYPE_t, ndim=3] img): > cdef int x = img.shape[0] > cdef int y = img.shape[1] > #cdef int z = img.shape[2] > cdef int addx > cdef int addy > cdef int addz > cdef np.ndarray[DTYPE_t32, ndim=3] out = np.zeros([16, 16, 16], > dtype=DTYPE32) > cdef int i, j, v0, v1, v2 > > > for i in range(x): > for j in range(y): > v0 = img[i, j, 0] > v1 = img[i, j, 1] > v2 = img[i, j, 2] > addx = (v0 - (v0 % 16)) / 16 > addy = (v1 - (v1 % 16)) / 16 > addz = (v2 - (v2 % 16)) / 16 > out[addx, addy, addz] += 1 > > return out > ''' > > from string import Template > s = Template(code) > src = s.substitute({'imgtype': 'np.int', 'outtype': 'np.float64'}) > open('histrgbintfl2.pyx','w').write(src) > > import pyximport; pyximport.install() > import histrgbintfl2 > > import numpy as np > factors = np.random.randint(256,size=(480, 630, 3)) > h = histrgbintfl2.hist3d(factors)#.astype(np.uint8)) > print h[:,:,0] > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.huard at gmail.com Wed May 6 23:59:54 2009 From: david.huard at gmail.com (David Huard) Date: Wed, 6 May 2009 23:59:54 -0400 Subject: [Numpy-discussion] hairy optimization problem In-Reply-To: <4A01E1FC.8090701@jpl.nasa.gov> References: <4A01E1FC.8090701@jpl.nasa.gov> Message-ID: <91cf711d0905062059i57bf84ecxcacbca3f4b1e30f0@mail.gmail.com> Hi Mathew, You could use Newton's method to optimize for each vi sequentially. If you have an expression for the jacobian, it's even better. What I'd do is write a class with a method f(self, x, y) that records the result of f(x,y) each time it is called. I would then sample very coarsely the x,y space where I guess my solutions are. You can then select the x,y where v1 is maximum as your initial point for Newton's method and iterate until you converge to the solution for v1. Since during the search for the optimum your class stores the computed points, your initial guess for v2 should be a bit better than it was for v1, which should speed up the convergence to the solution for v2, etc. If you have multiple processors available, you can scatter function evaluation among them using ipython. It's easier than it looks. Hope someone comes up with a nicer solution, David On Wed, May 6, 2009 at 3:16 PM, Mathew Yeates wrote: > I have a function f(x,y) which produces N values [v1,v2,v3 .... vN] > where some of the values are None (only found after evaluation) > > each evaluation of "f" is expensive and N is large. > I want N x,y pairs which produce the optimal value in each column. > > A brute force approach would be to generate > [v11,v12,v13,v14 ....] > [v21,v22,v23 ....... 
] > etc > > then locate the maximum of each column. > This is far too slow ......Any other ideas? > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian.walter at gmail.com Thu May 7 04:06:41 2009 From: sebastian.walter at gmail.com (Sebastian Walter) Date: Thu, 7 May 2009 10:06:41 +0200 Subject: [Numpy-discussion] hairy optimization problem In-Reply-To: <91cf711d0905062059i57bf84ecxcacbca3f4b1e30f0@mail.gmail.com> References: <4A01E1FC.8090701@jpl.nasa.gov> <91cf711d0905062059i57bf84ecxcacbca3f4b1e30f0@mail.gmail.com> Message-ID: hi mathew, 1) what does it mean if a value is None? I.e., what is larger: None or 3? Then first thing I would do is convert the None to a number. 2) Are your arrays integer arrays or double arrays? It's much easier if they are doubles because then you could use standard methods for NLP problems, as for example Newton's method as suggested above. But of the sound of it you could possibly enumerate over all possible solutions. This may possibly be formulated as a linear mixed integer program. This is also a hard problem, but can be usually solved quite fast nowadays. In the worst case, you might have to use algorithms as genetic algorithms or a stochastic search as e.g. simulated annealing which often do not give good results. 3) It is not clear to me what exactly you are trying to maximize. As far as I understand you actually have N optimization problems. This is very unusual! Typically the problem at hand can be formulated as *one* optimization problem. Could you tell us, what exactly your problem is and why you want to solve it? I am pretty sure that there is a much better approach than solving N optimization problems. It is good practice to first find the category of the optimization problem. There are quite a lot of them: linear programs, nonlinear programs, mixed integer linear programs, .... and they can further be distinguished by the number of constraints, type of constraints, if the objective function is convex, etc... Once you have identified all that for your given problem, you can start looking for a standard solver that can solve your problem. On Thu, May 7, 2009 at 5:59 AM, David Huard wrote: > Hi Mathew, > > You could use Newton's method to optimize for each vi sequentially. If you > have an expression for the jacobian, it's even better. > > What I'd do is write a class with a method f(self, x, y) that records the > result of f(x,y) each time it is called. I would then sample very coarsely > the x,y space where I guess my solutions are. You can then select the x,y > where v1 is maximum as your initial point for Newton's method and iterate > until you converge to the solution for v1. Since during the search for the > optimum your class stores the computed points, your initial guess for v2 > should be a bit better than it was for v1, which should speed up the > convergence to the solution for v2, etc. > > If you have multiple processors available, you can scatter function > evaluation among them using ipython. It's easier than it looks. > > Hope someone comes up with a nicer solution, > > David > > On Wed, May 6, 2009 at 3:16 PM, Mathew Yeates wrote: >> >> I have a function f(x,y) which produces N values [v1,v2,v3 .... 
vN] >> where some of the values are None (only found after evaluation) >> >> each evaluation of "f" is expensive and N is large. >> I want N x,y pairs which produce the optimal value in each column. >> >> A brute force approach would be to generate >> [v11,v12,v13,v14 ....] >> [v21,v22,v23 ....... ] >> etc >> >> then locate the maximum of each column. >> This is far too slow ......Any other ideas? >> >> >> >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From stefan at sun.ac.za Thu May 7 07:11:28 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 7 May 2009 13:11:28 +0200 Subject: [Numpy-discussion] efficient 3d histogram creation In-Reply-To: <7f014ea60905061734y3dfcb299w14681ff0020a4de1@mail.gmail.com> References: <7f014ea60905031715h635a69faof6a06e10c3621ba7@mail.gmail.com> <91cf711d0905041218p70bb44ct35a601844c8c262b@mail.gmail.com> <7f014ea60905041300y49b48055i6df8d5e598d0fe80@mail.gmail.com> <1cd32cbb0905041318g11dea0b9oa61f08bcc380144d@mail.gmail.com> <91cf711d0905050646q7652eepc55d1aed17b1d1ed@mail.gmail.com> <7f014ea60905061506o7253a315p940d6cbd2cc2b420@mail.gmail.com> <1cd32cbb0905061630t1e73e8a5i4bd454f789e22714@mail.gmail.com> <7f014ea60905061639k21dd12b5j4ee6c2738758284@mail.gmail.com> <1cd32cbb0905061721y233aa4c7uf580ad1d99539b1c@mail.gmail.com> <7f014ea60905061734y3dfcb299w14681ff0020a4de1@mail.gmail.com> Message-ID: <9457e7c80905070411x6d51c768g20f4dd0990b843a8@mail.gmail.com> 2009/5/7 Chris Colbert : > This was really my first attempt at doing anything constructive with Cython. > It was actually unbelievably easy to work with. I think i spent less time > working on this, than I did trying to find an optimized solution using pure > numpy and python. One aspect we often overlook is how easy it is to write a for-loop in comparison to vectorisation. Besides, for-loops are sometimes easier to read as well! I think the Cython guys are planning some sort of templating, but I'll CC Dag so that he can tell us more. Regards St?fan From dagss at student.matnat.uio.no Thu May 7 07:32:03 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 07 May 2009 13:32:03 +0200 Subject: [Numpy-discussion] efficient 3d histogram creation In-Reply-To: <9457e7c80905070411x6d51c768g20f4dd0990b843a8@mail.gmail.com> References: <7f014ea60905031715h635a69faof6a06e10c3621ba7@mail.gmail.com> <91cf711d0905041218p70bb44ct35a601844c8c262b@mail.gmail.com> <7f014ea60905041300y49b48055i6df8d5e598d0fe80@mail.gmail.com> <1cd32cbb0905041318g11dea0b9oa61f08bcc380144d@mail.gmail.com> <91cf711d0905050646q7652eepc55d1aed17b1d1ed@mail.gmail.com> <7f014ea60905061506o7253a315p940d6cbd2cc2b420@mail.gmail.com> <1cd32cbb0905061630t1e73e8a5i4bd454f789e22714@mail.gmail.com> <7f014ea60905061639k21dd12b5j4ee6c2738758284@mail.gmail.com> <1cd32cbb0905061721y233aa4c7uf580ad1d99539b1c@mail.gmail.com> <7f014ea60905061734y3dfcb299w14681ff0020a4de1@mail.gmail.com> <9457e7c80905070411x6d51c768g20f4dd0990b843a8@mail.gmail.com> Message-ID: <4A02C6B3.3030404@student.matnat.uio.no> St?fan van der Walt wrote: > 2009/5/7 Chris Colbert : >> This was really my first attempt at doing anything constructive with Cython. >> It was actually unbelievably easy to work with. 
I think i spent less time >> working on this, than I did trying to find an optimized solution using pure >> numpy and python. > > One aspect we often overlook is how easy it is to write a for-loop in > comparison to vectorisation. Besides, for-loops are sometimes easier > to read as well! > > I think the Cython guys are planning some sort of templating, but I'll > CC Dag so that he can tell us more. We were discussing how it would/should look like, but noone's committed to implementing it so it's pretty much up in the blue I think -- someone might jump in and do it next week, or it might go another year, I can't tell. While I'm here, also note in that code Chris wrote that you want to pay attention to the change of default division semantics on Cython 0.12 (especially for speed). http://wiki.cython.org/enhancements/division -- Dag Sverre From dagss at student.matnat.uio.no Thu May 7 07:35:13 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 07 May 2009 13:35:13 +0200 Subject: [Numpy-discussion] efficient 3d histogram creation In-Reply-To: <4A02C6B3.3030404@student.matnat.uio.no> References: <7f014ea60905031715h635a69faof6a06e10c3621ba7@mail.gmail.com> <91cf711d0905041218p70bb44ct35a601844c8c262b@mail.gmail.com> <7f014ea60905041300y49b48055i6df8d5e598d0fe80@mail.gmail.com> <1cd32cbb0905041318g11dea0b9oa61f08bcc380144d@mail.gmail.com> <91cf711d0905050646q7652eepc55d1aed17b1d1ed@mail.gmail.com> <7f014ea60905061506o7253a315p940d6cbd2cc2b420@mail.gmail.com> <1cd32cbb0905061630t1e73e8a5i4bd454f789e22714@mail.gmail.com> <7f014ea60905061639k21dd12b5j4ee6c2738758284@mail.gmail.com> <1cd32cbb0905061721y233aa4c7uf580ad1d99539b1c@mail.gmail.com> <7f014ea60905061734y3dfcb299w14681ff0020a4de1@mail.gmail.com> <9457e7c80905070411x6d51c768g20f4dd0990b843a8@mail.gmail.com> <4A02C6B3.3030404@student.matnat.uio.no> Message-ID: <4A02C771.3040105@student.matnat.uio.no> Dag Sverre Seljebotn wrote: > St?fan van der Walt wrote: >> 2009/5/7 Chris Colbert : >>> This was really my first attempt at doing anything constructive with Cython. >>> It was actually unbelievably easy to work with. I think i spent less time >>> working on this, than I did trying to find an optimized solution using pure >>> numpy and python. >> One aspect we often overlook is how easy it is to write a for-loop in >> comparison to vectorisation. Besides, for-loops are sometimes easier >> to read as well! >> >> I think the Cython guys are planning some sort of templating, but I'll >> CC Dag so that he can tell us more. > > We were discussing how it would/should look like, but noone's committed > to implementing it so it's pretty much up in the blue I think -- someone > might jump in and do it next week, or it might go another year, I can't > tell. BTW the consensus pretty much ended on: cdef class MyClass[T](Ancestor): cdef T evaluate(T x): ... And then instantiate with cdef MyClass[int] obj = MyClass[int]() ... Only class templates would be targeted at first. -- Dag Sverre From myeates at jpl.nasa.gov Thu May 7 09:57:00 2009 From: myeates at jpl.nasa.gov (Mathew Yeates) Date: Thu, 07 May 2009 06:57:00 -0700 Subject: [Numpy-discussion] hairy optimization problem In-Reply-To: <91cf711d0905062059i57bf84ecxcacbca3f4b1e30f0@mail.gmail.com> References: <4A01E1FC.8090701@jpl.nasa.gov> <91cf711d0905062059i57bf84ecxcacbca3f4b1e30f0@mail.gmail.com> Message-ID: <4A02E8AC.7040603@jpl.nasa.gov> David Huard wrote: > Hi Mathew, > > You could use Newton's method to optimize for each vi sequentially. 
If > you have an expression for the jacobian, it's even better. Here's the problem. Every time f is evaluated, it returns a set of values. (a row in the matrix) But if we are trying to find the minimum of the first column, we only care about the first value in the set. This is really N optimization. problems I want to perform simultaneously. Find N (x,y) values where x1,y1 minimizes f in the first column, x2,y2 minimizes f in the second column, etc. And ... doing this a column at a time is too slow (I just did a quick calculation and my brute force method is going to take 30 days!) > > What I'd do is write a class with a method f(self, x, y) that records > the result of f(x,y) each time it is called. I would then sample very > coarsely the x,y space where I guess my solutions are. You can then > select the x,y where v1 is maximum as your initial point for Newton's > method and iterate until you converge to the solution for v1. Since > during the search for the optimum your class stores the computed > points, your initial guess for v2 should be a bit better than it was > for v1, which should speed up the convergence to the solution for v2, > etc. > > If you have multiple processors available, you can scatter function > evaluation among them using ipython. It's easier than it looks. > > Hope someone comes up with a nicer solution, > > David > > On Wed, May 6, 2009 at 3:16 PM, Mathew Yeates > wrote: > > I have a function f(x,y) which produces N values [v1,v2,v3 .... vN] > where some of the values are None (only found after evaluation) > > each evaluation of "f" is expensive and N is large. > I want N x,y pairs which produce the optimal value in each column. > > A brute force approach would be to generate > [v11,v12,v13,v14 ....] > [v21,v22,v23 ....... ] > etc > > then locate the maximum of each column. > This is far too slow ......Any other ideas? > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From myeates at jpl.nasa.gov Thu May 7 10:05:04 2009 From: myeates at jpl.nasa.gov (Mathew Yeates) Date: Thu, 07 May 2009 07:05:04 -0700 Subject: [Numpy-discussion] hairy optimization problem In-Reply-To: References: <4A01E1FC.8090701@jpl.nasa.gov> <91cf711d0905062059i57bf84ecxcacbca3f4b1e30f0@mail.gmail.com> Message-ID: <4A02EA90.30009@jpl.nasa.gov> Sebastian Walter wrote: > N optimization problems. This is very unusual! Typically the problem > at hand can be formulated as *one* optimization problem. > > yes, this is really not so much an optimization problem as it is a vectorization problem. I am trying to avoid 1) Evaluate f over and over and find the maximum in the first column. Store solution 1. 2) Evaluate f over and over and find the max in the second column. Store solution 2. Rinse, Repeat From kbasye1 at jhu.edu Thu May 7 11:35:48 2009 From: kbasye1 at jhu.edu (Ken Basye) Date: Thu, 7 May 2009 08:35:48 -0700 (PDT) Subject: [Numpy-discussion] hairy optimization problem In-Reply-To: <4A02EA90.30009@jpl.nasa.gov> References: <4A01E1FC.8090701@jpl.nasa.gov> <91cf711d0905062059i57bf84ecxcacbca3f4b1e30f0@mail.gmail.com> <4A02EA90.30009@jpl.nasa.gov> Message-ID: <23428578.post@talk.nabble.com> Hi Mathew, Here are some things to think about: First, is there a way to decompose 'f' so that it computes only one or a subset of K values, but in 1/N ( K/N) time? If so, you can decompose your problem into N single optimizations. Presumably not, but I think it's worth asking. 
Second, what method would you use if you were only trying to solve the problem for one column? I'm thinking about a heuristic solution involving caching, which is close to what an earlier poster suggested. The idea is to cache complete (length N) results for each call you make. Whenever you need to compute f(x,y), consult the cache to see if there's a result for any point within D of x,y (look up "nearest neighbor search"). Here D is a configurable parameter which will trade off the accuracy of your optimization against time. If there is, use the cached value instead of calling f. Now you just do the "rinse-repeat" algorithm, but it should get progressively faster (per column) as you get more and more cache hits. Possible augmentations: 1) Within the run for a given column, adjust D downward as the optimization progresses so you don't reach a "fixed-point" to early. Trades time for optimization accuracy. 2) When finished, the cache should have "good" values for each column which were found on the pass for that column, but there's no reason not to scan the entire cache one last time to see if a later pass stumbled on a better value for an earlier column. 3) Iterate the entire procedure, using each iteration to seed the starting locations for the next - might be useful if your function has many local minima in some of the N output dimensions. Mathew Yeates wrote: > > Sebastian Walter wrote: >> N optimization problems. This is very unusual! Typically the problem >> at hand can be formulated as *one* optimization problem. >> >> > yes, this is really not so much an optimization problem as it is a > vectorization problem. > I am trying to avoid > 1) Evaluate f over and over and find the maximum in the first column. > Store solution 1. > 2) Evaluate f over and over and find the max in the second column. Store > solution 2. > Rinse, Repeat > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- View this message in context: http://www.nabble.com/hairy-optimization-problem-tp23413559p23428578.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From myeates at jpl.nasa.gov Thu May 7 12:01:35 2009 From: myeates at jpl.nasa.gov (Mathew Yeates) Date: Thu, 07 May 2009 09:01:35 -0700 Subject: [Numpy-discussion] hairy optimization problem In-Reply-To: <23428578.post@talk.nabble.com> References: <4A01E1FC.8090701@jpl.nasa.gov> <91cf711d0905062059i57bf84ecxcacbca3f4b1e30f0@mail.gmail.com> <4A02EA90.30009@jpl.nasa.gov> <23428578.post@talk.nabble.com> Message-ID: <4A0305DF.3010201@jpl.nasa.gov> Thanks Ken, I was actually thinking about using caching while on my way into work. Might work. Beats the heck out of using brute force. One other question (maybe I should ask in another thread) what is the canonical method for dealing with missing values? Suppose f(x,y) returns None for some (x,y) pairs (unknown until evaluation). I don't like the idea of setting the return to some small value as this may create local maxima in the solution space. Mathew Ken Basye wrote: > Hi Mathew, > Here are some things to think about: First, is there a way to decompose > 'f' so that it computes only one or a subset of K values, but in 1/N ( K/N) > time? If so, you can decompose your problem into N single optimizations. > Presumably not, but I think it's worth asking. Second, what method would > you use > if you were only trying to solve the problem for one column? 
> I'm thinking about a heuristic solution involving caching, which is close > to what an earlier poster suggested. The idea is to cache complete (length > N) results for each call you make. Whenever you need to compute f(x,y), > consult the cache to see if there's a result for any point within D of x,y > (look up "nearest neighbor search"). Here D is a configurable parameter > which will trade off the accuracy of your optimization against time. If > there is, use the cached value instead of calling f. Now you just do the > "rinse-repeat" algorithm, but it should get progressively faster (per > column) as you get more and more cache hits. > Possible augmentations: 1) Within the run for a given column, adjust D > downward as the optimization progresses so you don't reach a "fixed-point" > to early. Trades time for optimization accuracy. 2) When finished, the > cache should have "good" values for each column which were found on the pass > for that column, but there's no reason not to scan the entire cache one last > time to see if a later pass stumbled on a better value for an earlier > column. 3) Iterate the entire procedure, using each iteration to seed the > starting locations for the next - might be useful if your function has many > local minima in some of the N output dimensions. > > > > Mathew Yeates wrote: > >> Sebastian Walter wrote: >> >>> N optimization problems. This is very unusual! Typically the problem >>> at hand can be formulated as *one* optimization problem. >>> >>> >>> >> yes, this is really not so much an optimization problem as it is a >> vectorization problem. >> I am trying to avoid >> 1) Evaluate f over and over and find the maximum in the first column. >> Store solution 1. >> 2) Evaluate f over and over and find the max in the second column. Store >> solution 2. >> Rinse, Repeat >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> > > From myeates at jpl.nasa.gov Thu May 7 12:16:02 2009 From: myeates at jpl.nasa.gov (Mathew Yeates) Date: Thu, 07 May 2009 09:16:02 -0700 Subject: [Numpy-discussion] optimization when there are mssing values Message-ID: <4A030942.1090405@jpl.nasa.gov> What is the canonical method for dealing with missing values? Suppose f(x,y) returns None for some (x,y) pairs (unknown until evaluation). I don't like the idea of setting the return to some small value as this may create local maxima in the solution space. So any of the scipy packages deal with this? Mathew From sccolbert at gmail.com Thu May 7 12:39:38 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Thu, 7 May 2009 12:39:38 -0400 Subject: [Numpy-discussion] element wise help Message-ID: <7f014ea60905070939j2c66ecbbu7f7cbb7f0a27e2d6@mail.gmail.com> suppose i have two arrays: n and t, both are 1-D arrays. for each value in t, I need to use it to perform an element wise scalar operation on every value in n and then sum the results into a single scalar to be stored in the output array. Is there any way to do this without the for loop like below: for val in t_array: out = (n / val).sum() # not the actual function being done, but you get the idea Thanks, Chris -------------- next part -------------- An HTML attachment was scrubbed... 
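Back on the optimization thread, a bare-bones sketch of the nearest-neighbour cache Ken describes; the names are made up, and a real version would use a proper spatial index rather than a linear scan over the stored points:

import numpy as np

class CachedF(object):
    """Reuse a stored length-N result when an earlier evaluation lies within D of (x, y)."""
    def __init__(self, f, D):
        self.f = f                  # the expensive function returning a length-N array
        self.D = D                  # cache radius, trades accuracy against speed
        self.points = []            # (x, y) pairs already evaluated
        self.values = []            # corresponding length-N results
    def __call__(self, x, y):
        if self.points:
            pts = np.asarray(self.points)
            d2 = (pts[:, 0] - x) ** 2 + (pts[:, 1] - y) ** 2
            i = d2.argmin()
            if d2[i] <= self.D ** 2:
                return self.values[i]          # cache hit: skip the expensive call
        v = self.f(x, y)
        self.points.append((x, y))
        self.values.append(v)
        return v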
URL: From josef.pktd at gmail.com Thu May 7 12:56:04 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 7 May 2009 12:56:04 -0400 Subject: [Numpy-discussion] element wise help In-Reply-To: <7f014ea60905070939j2c66ecbbu7f7cbb7f0a27e2d6@mail.gmail.com> References: <7f014ea60905070939j2c66ecbbu7f7cbb7f0a27e2d6@mail.gmail.com> Message-ID: <1cd32cbb0905070956r268dc891jc762e119d69858a@mail.gmail.com> On Thu, May 7, 2009 at 12:39 PM, Chris Colbert wrote: > suppose i have two arrays:? n and t, both are 1-D arrays. > > for each value in t, I need to use it to perform an element wise scalar > operation on every value in n and then sum the results into a single scalar > to be stored in the output array. > > Is there any way to do this without the for loop like below: > > for val in t_array: > > ????????? out = (n / val).sum()? # not the actual function being done, but > you get the idea > broad casting should work, e.g. (n[:,np.newaxis] / val[np.newaxis,:]).sum() but it constructs the full product array, which is memory intensive for a reduce operation, if the 1d arrays are large. another candidate for a cython loop if the arrays are large? Josef From sccolbert at gmail.com Thu May 7 13:04:46 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Thu, 7 May 2009 13:04:46 -0400 Subject: [Numpy-discussion] element wise help In-Reply-To: <1cd32cbb0905070956r268dc891jc762e119d69858a@mail.gmail.com> References: <7f014ea60905070939j2c66ecbbu7f7cbb7f0a27e2d6@mail.gmail.com> <1cd32cbb0905070956r268dc891jc762e119d69858a@mail.gmail.com> Message-ID: <7f014ea60905071004n41aef796s8667c0a9745370f1@mail.gmail.com> unfortunately, the actual function being processes is not so simple, and involves evaluating user functions input from the prompt as strings. So i have no idea how to do it in Cython. Let me look into this broadcasting. Thanks Josef! On Thu, May 7, 2009 at 12:56 PM, wrote: > On Thu, May 7, 2009 at 12:39 PM, Chris Colbert > wrote: > > suppose i have two arrays: n and t, both are 1-D arrays. > > > > for each value in t, I need to use it to perform an element wise scalar > > operation on every value in n and then sum the results into a single > scalar > > to be stored in the output array. > > > > Is there any way to do this without the for loop like below: > > > > for val in t_array: > > > > out = (n / val).sum() # not the actual function being done, > but > > you get the idea > > > > > broad casting should work, e.g. > > (n[:,np.newaxis] / val[np.newaxis,:]).sum() > > but it constructs the full product array, which is memory intensive > for a reduce operation, if the 1d arrays are large. > > another candidate for a cython loop if the arrays are large? > > Josef > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
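A small self-contained version of the broadcasting idea above, keeping one summed value per element of t by reducing over the n axis only:

import numpy as np

n = np.arange(1.0, 6.0)                  # shape (5,)
t = np.array([0.5, 1.0, 2.0])            # shape (3,)

out = (n[:, np.newaxis] / t[np.newaxis, :]).sum(axis=0)    # shape (3,): one scalar per t
assert np.allclose(out, [(n / val).sum() for val in t])    # matches the original loop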
URL: From sccolbert at gmail.com Thu May 7 13:08:43 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Thu, 7 May 2009 13:08:43 -0400 Subject: [Numpy-discussion] element wise help In-Reply-To: <7f014ea60905071004n41aef796s8667c0a9745370f1@mail.gmail.com> References: <7f014ea60905070939j2c66ecbbu7f7cbb7f0a27e2d6@mail.gmail.com> <1cd32cbb0905070956r268dc891jc762e119d69858a@mail.gmail.com> <7f014ea60905071004n41aef796s8667c0a9745370f1@mail.gmail.com> Message-ID: <7f014ea60905071008r39a6a73eie040dce046fe7d31@mail.gmail.com> let me just post my code: t is the time array and n is also an array. For every value of time t, these operations are performed on the entire array n. Then, n is summed to a scalar which represents the system response at time t. I would like to eliminate this for loop if possible. Chris #### code #### b = 4.7 f = [] n = arange(1, N+1, 1) for t in timearray: arg1 = {'S': ((b/t) + (1J*n*pi/t))} exec('from numpy import *', arg1) tempval = eval(transform, arg1)*((-1)**n) rsum = tempval.real.sum() arg2 = {'S': b/t} exec('from numpy import *', arg2) tempval2 = eval(transform, arg2)*0.5 fval = (exp(b) / t) * (tempval2 + rsum) f.append(fval) #### /code ##### On Thu, May 7, 2009 at 1:04 PM, Chris Colbert wrote: > unfortunately, the actual function being processes is not so simple, and > involves evaluating user functions input from the prompt as strings. So i > have no idea how to do it in Cython. > > Let me look into this broadcasting. > > Thanks Josef! > > > On Thu, May 7, 2009 at 12:56 PM, wrote: > >> On Thu, May 7, 2009 at 12:39 PM, Chris Colbert >> wrote: >> > suppose i have two arrays: n and t, both are 1-D arrays. >> > >> > for each value in t, I need to use it to perform an element wise scalar >> > operation on every value in n and then sum the results into a single >> scalar >> > to be stored in the output array. >> > >> > Is there any way to do this without the for loop like below: >> > >> > for val in t_array: >> > >> > out = (n / val).sum() # not the actual function being done, >> but >> > you get the idea >> > >> >> >> broad casting should work, e.g. >> >> (n[:,np.newaxis] / val[np.newaxis,:]).sum() >> >> but it constructs the full product array, which is memory intensive >> for a reduce operation, if the 1d arrays are large. >> >> another candidate for a cython loop if the arrays are large? >> >> Josef >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Thu May 7 13:37:40 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 7 May 2009 13:37:40 -0400 Subject: [Numpy-discussion] element wise help In-Reply-To: <7f014ea60905071008r39a6a73eie040dce046fe7d31@mail.gmail.com> References: <7f014ea60905070939j2c66ecbbu7f7cbb7f0a27e2d6@mail.gmail.com> <1cd32cbb0905070956r268dc891jc762e119d69858a@mail.gmail.com> <7f014ea60905071004n41aef796s8667c0a9745370f1@mail.gmail.com> <7f014ea60905071008r39a6a73eie040dce046fe7d31@mail.gmail.com> Message-ID: <1cd32cbb0905071037h5c9e9754tb4b1346eb53a88cc@mail.gmail.com> On Thu, May 7, 2009 at 1:08 PM, Chris Colbert wrote: > let me just post my code: > > t is the time array and n is also an array. > > For every value of time t, these operations are performed on the entire > array n. Then, n is summed to a scalar which represents the system response > at time t. 
> > I would like to eliminate this for loop if possible. > > Chris > > #### code #### > > b = 4.7 > f = [] > n = arange(1, N+1, 1) > > for t in timearray: > ??????? arg1 = {'S': ((b/t) + (1J*n*pi/t))} > ??????? exec('from numpy import *', arg1) > ??????? tempval = eval(transform, arg1)*((-1)**n) > ??????? rsum = tempval.real.sum() > ??????? arg2 = {'S': b/t} > ??????? exec('from numpy import *', arg2) > ??????? tempval2 = eval(transform, arg2)*0.5 > ??????? fval = (exp(b) / t) * (tempval2 + rsum) > ??????? f.append(fval) > > > #### /code ##### > I don't understand what the exec statements are doing, I never use it. what is transform? Can you use regular functions instead or is there a special reason for the exec and eval? In these expressions ((b/t) + (1J*n*pi/t)), (exp(b) / t) broadcasting can be used. Whats the size of t and n? Josef From sccolbert at gmail.com Thu May 7 13:41:31 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Thu, 7 May 2009 13:41:31 -0400 Subject: [Numpy-discussion] element wise help In-Reply-To: <1cd32cbb0905071037h5c9e9754tb4b1346eb53a88cc@mail.gmail.com> References: <7f014ea60905070939j2c66ecbbu7f7cbb7f0a27e2d6@mail.gmail.com> <1cd32cbb0905070956r268dc891jc762e119d69858a@mail.gmail.com> <7f014ea60905071004n41aef796s8667c0a9745370f1@mail.gmail.com> <7f014ea60905071008r39a6a73eie040dce046fe7d31@mail.gmail.com> <1cd32cbb0905071037h5c9e9754tb4b1346eb53a88cc@mail.gmail.com> Message-ID: <7f014ea60905071041n1b000616nca1ea88ab2330a89@mail.gmail.com> its part of a larger program for designing PID controllers. This particular function numerical calculates the inverse laplace transform using riemann sums. The exec statements, from what i gather, allow the follow eval statement to be executed in the scope of numpy and its functions. I don't get how it works either, but it doesnt work without it. I've just about got something working using broadcasting and will post it soon. chris On Thu, May 7, 2009 at 1:37 PM, wrote: > On Thu, May 7, 2009 at 1:08 PM, Chris Colbert wrote: > > let me just post my code: > > > > t is the time array and n is also an array. > > > > For every value of time t, these operations are performed on the entire > > array n. Then, n is summed to a scalar which represents the system > response > > at time t. > > > > I would like to eliminate this for loop if possible. > > > > Chris > > > > #### code #### > > > > b = 4.7 > > f = [] > > n = arange(1, N+1, 1) > > > > for t in timearray: > > arg1 = {'S': ((b/t) + (1J*n*pi/t))} > > exec('from numpy import *', arg1) > > tempval = eval(transform, arg1)*((-1)**n) > > rsum = tempval.real.sum() > > arg2 = {'S': b/t} > > exec('from numpy import *', arg2) > > tempval2 = eval(transform, arg2)*0.5 > > fval = (exp(b) / t) * (tempval2 + rsum) > > f.append(fval) > > > > > > #### /code ##### > > > > I don't understand what the exec statements are doing, I never use it. > what is transform? > Can you use regular functions instead or is there a special reason for > the exec and eval? > > In these expressions ((b/t) + (1J*n*pi/t)), (exp(b) / t) > broadcasting can be used. > > Whats the size of t and n? > > Josef > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
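A stripped-down illustration of the exec/eval pattern described above; the transform string is made up for the example. The exec call fills the dict with NumPy's names, so the later eval resolves both S and functions such as exp from that same namespace:

import numpy as np

transform = '1.0 / (S**2 + 3*S + 2)'                  # hypothetical user-typed expression

namespace = {'S': np.array([0.5 + 1.0j, 2.0 + 0.0j])}
exec('from numpy import *', namespace)                # injects exp, sin, pi, ... into the dict
values = eval(transform, namespace)                   # evaluated with S and the numpy names in scope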
URL: From sccolbert at gmail.com Thu May 7 14:11:08 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Thu, 7 May 2009 14:11:08 -0400 Subject: [Numpy-discussion] element wise help In-Reply-To: <7f014ea60905071041n1b000616nca1ea88ab2330a89@mail.gmail.com> References: <7f014ea60905070939j2c66ecbbu7f7cbb7f0a27e2d6@mail.gmail.com> <1cd32cbb0905070956r268dc891jc762e119d69858a@mail.gmail.com> <7f014ea60905071004n41aef796s8667c0a9745370f1@mail.gmail.com> <7f014ea60905071008r39a6a73eie040dce046fe7d31@mail.gmail.com> <1cd32cbb0905071037h5c9e9754tb4b1346eb53a88cc@mail.gmail.com> <7f014ea60905071041n1b000616nca1ea88ab2330a89@mail.gmail.com> Message-ID: <7f014ea60905071111q6e27f7b0s98b2d3e210d8dc79@mail.gmail.com> alright I got it working. Thanks! This version is an astonishingly 1900x faster than my original implementation which had two for loops. Both versions are below: thanks again! ### new fast code #### b = 4.7 n = arange(1, N+1, 1.0).reshape(N, -1) n1 = (-1)**n prefix = exp(b) / timearray arg1 = {'S': b / timearray} exec('from numpy import *', arg1) term1 = (0.5) * eval(transform, arg1) temp1 = b + (1J * pi * n) temp2 = temp1 / timearray arg2 = {'S': temp2} exec('from numpy import *', arg2) term2 = (eval(transform, arg2) * n1).sum(axis=0).real f = prefix * (term1 + term2) return f ##### old slow code ###### b = 4.7 f = [] for t in timearray: rsum = 0.0 for n in range(1, N+1): arg1 = {'S': ((b/t) + (1J*n*pi/t))} exec('from numpy import *', arg1) tempval = eval(transform, arg1)*((-1)**n) rsum = rsum + tempval.real arg2 = {'S': b/t} exec('from numpy import *', arg2) tempval2 = eval(transform, arg2)*0.5 fval = (exp(b) / t) * (tempval2 + rsum) f.append(fval) return f On Thu, May 7, 2009 at 1:41 PM, Chris Colbert wrote: > its part of a larger program for designing PID controllers. This particular > function numerical calculates the inverse laplace transform using riemann > sums. > > The exec statements, from what i gather, allow the follow eval statement to > be executed in the scope of numpy and its functions. I don't get how it > works either, but it doesnt work without it. > > I've just about got something working using broadcasting and will post it > soon. > > chris > > > On Thu, May 7, 2009 at 1:37 PM, wrote: > >> On Thu, May 7, 2009 at 1:08 PM, Chris Colbert >> wrote: >> > let me just post my code: >> > >> > t is the time array and n is also an array. >> > >> > For every value of time t, these operations are performed on the entire >> > array n. Then, n is summed to a scalar which represents the system >> response >> > at time t. >> > >> > I would like to eliminate this for loop if possible. >> > >> > Chris >> > >> > #### code #### >> > >> > b = 4.7 >> > f = [] >> > n = arange(1, N+1, 1) >> > >> > for t in timearray: >> > arg1 = {'S': ((b/t) + (1J*n*pi/t))} >> > exec('from numpy import *', arg1) >> > tempval = eval(transform, arg1)*((-1)**n) >> > rsum = tempval.real.sum() >> > arg2 = {'S': b/t} >> > exec('from numpy import *', arg2) >> > tempval2 = eval(transform, arg2)*0.5 >> > fval = (exp(b) / t) * (tempval2 + rsum) >> > f.append(fval) >> > >> > >> > #### /code ##### >> > >> >> I don't understand what the exec statements are doing, I never use it. >> what is transform? >> Can you use regular functions instead or is there a special reason for >> the exec and eval? >> >> In these expressions ((b/t) + (1J*n*pi/t)), (exp(b) / t) >> broadcasting can be used. >> >> Whats the size of t and n? 
>> >> Josef >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsouthey at gmail.com Thu May 7 14:14:16 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Thu, 07 May 2009 13:14:16 -0500 Subject: [Numpy-discussion] FYI: Numpy and Unladen Swallow Message-ID: <4A0324F8.80801@gmail.com> Hi, LWN.net had a article on the development of Unladen Swallow that aims to speed up CPython. http://lwn.net/SubscriberLink/332038/675304c610f0e34a/ The project link is: http://code.google.com/p/unladen-swallow/ At least on my Linux system, Numpy does run without any test failures. It appears slightly faster than my distros Python2.5 (the installation overwrote my local copy of Python2.6 that I built myself so I can not say if is faster than that). Bruce From josef.pktd at gmail.com Thu May 7 14:45:36 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 7 May 2009 14:45:36 -0400 Subject: [Numpy-discussion] element wise help In-Reply-To: <7f014ea60905071111q6e27f7b0s98b2d3e210d8dc79@mail.gmail.com> References: <7f014ea60905070939j2c66ecbbu7f7cbb7f0a27e2d6@mail.gmail.com> <1cd32cbb0905070956r268dc891jc762e119d69858a@mail.gmail.com> <7f014ea60905071004n41aef796s8667c0a9745370f1@mail.gmail.com> <7f014ea60905071008r39a6a73eie040dce046fe7d31@mail.gmail.com> <1cd32cbb0905071037h5c9e9754tb4b1346eb53a88cc@mail.gmail.com> <7f014ea60905071041n1b000616nca1ea88ab2330a89@mail.gmail.com> <7f014ea60905071111q6e27f7b0s98b2d3e210d8dc79@mail.gmail.com> Message-ID: <1cd32cbb0905071145i79bfe627r226a4592f1b80157@mail.gmail.com> On Thu, May 7, 2009 at 2:11 PM, Chris Colbert wrote: > alright I got it working. Thanks! > > This version is an astonishingly 1900x faster than my original > implementation which had two for loops. Both versions are below: > > thanks again! > > ### new fast code #### > > ??? b = 4.7 > ??? n = arange(1, N+1, 1.0).reshape(N, -1) > ??? n1 = (-1)**n > ??? prefix = exp(b) / timearray > > ??? arg1 = {'S': b / timearray} > ??? exec('from numpy import *', arg1) > ??? term1 = (0.5) * eval(transform, arg1) > > ??? temp1 = b + (1J * pi * n) > ??? temp2 = temp1 / timearray > ??? arg2 = {'S': temp2} > ??? exec('from numpy import *', arg2) > ??? term2 = (eval(transform, arg2) * n1).sum(axis=0).real > > ??? f = prefix * (term1 + term2) > > ??? return f If you don't do code generation and have control over transform, then, I think, it would be more readable to replace the exec and eval by a function call to transform. I haven't found a case yet where eval is necessary, except for code generation as in sympy. Josef > > ##### old slow code ###### > ??? b = 4.7 > ??? f = [] > > ??? for t in timearray: > ??????? rsum = 0.0 > ??????? for n in range(1, N+1): > ??????????? arg1 = {'S': ((b/t) + (1J*n*pi/t))} > ??????????? exec('from numpy import *', arg1) > ??????????? tempval = eval(transform, arg1)*((-1)**n) > ??????????? rsum = rsum + tempval.real > ??????? arg2 = {'S': b/t} > ??????? exec('from numpy import *', arg2) > ??????? tempval2 = eval(transform, arg2)*0.5 > ??????? fval = (exp(b) / t) * (tempval2 + rsum) > ??????? f.append(fval) > > ?? return f > > > > > On Thu, May 7, 2009 at 1:41 PM, Chris Colbert wrote: >> >> its part of a larger program for designing PID controllers. This >> particular function numerical calculates the inverse laplace transform using >> riemann sums. 
>> >> The exec statements, from what i gather, allow the follow eval statement >> to be executed in the scope of numpy and its functions. I don't get how it >> works either, but it doesnt work without it. >> >> I've just about got something working using broadcasting and will post it >> soon. >> >> chris >> >> On Thu, May 7, 2009 at 1:37 PM, wrote: >>> >>> On Thu, May 7, 2009 at 1:08 PM, Chris Colbert >>> wrote: >>> > let me just post my code: >>> > >>> > t is the time array and n is also an array. >>> > >>> > For every value of time t, these operations are performed on the entire >>> > array n. Then, n is summed to a scalar which represents the system >>> > response >>> > at time t. >>> > >>> > I would like to eliminate this for loop if possible. >>> > >>> > Chris >>> > >>> > #### code #### >>> > >>> > b = 4.7 >>> > f = [] >>> > n = arange(1, N+1, 1) >>> > >>> > for t in timearray: >>> > ??????? arg1 = {'S': ((b/t) + (1J*n*pi/t))} >>> > ??????? exec('from numpy import *', arg1) >>> > ??????? tempval = eval(transform, arg1)*((-1)**n) >>> > ??????? rsum = tempval.real.sum() >>> > ??????? arg2 = {'S': b/t} >>> > ??????? exec('from numpy import *', arg2) >>> > ??????? tempval2 = eval(transform, arg2)*0.5 >>> > ??????? fval = (exp(b) / t) * (tempval2 + rsum) >>> > ??????? f.append(fval) >>> > >>> > >>> > #### /code ##### >>> > >>> >>> I don't understand what the exec statements are doing, I never use it. >>> what is transform? >>> Can you use regular functions instead or is there a special reason for >>> the exec and eval? >>> >>> In these expressions ((b/t) + (1J*n*pi/t)), ?(exp(b) / t) >>> broadcasting can be used. >>> >>> Whats the size of t and n? >>> >>> Josef >>> _______________________________________________ >>> Numpy-discussion mailing list >>> Numpy-discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From sccolbert at gmail.com Thu May 7 15:10:23 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Thu, 7 May 2009 15:10:23 -0400 Subject: [Numpy-discussion] element wise help In-Reply-To: <1cd32cbb0905071145i79bfe627r226a4592f1b80157@mail.gmail.com> References: <7f014ea60905070939j2c66ecbbu7f7cbb7f0a27e2d6@mail.gmail.com> <1cd32cbb0905070956r268dc891jc762e119d69858a@mail.gmail.com> <7f014ea60905071004n41aef796s8667c0a9745370f1@mail.gmail.com> <7f014ea60905071008r39a6a73eie040dce046fe7d31@mail.gmail.com> <1cd32cbb0905071037h5c9e9754tb4b1346eb53a88cc@mail.gmail.com> <7f014ea60905071041n1b000616nca1ea88ab2330a89@mail.gmail.com> <7f014ea60905071111q6e27f7b0s98b2d3e210d8dc79@mail.gmail.com> <1cd32cbb0905071145i79bfe627r226a4592f1b80157@mail.gmail.com> Message-ID: <7f014ea60905071210k3b4926a6pa99e882e871bac25@mail.gmail.com> the user of the program inputs the transform in a text field. So I have no way of know the function apriori. that doesn't mean I still couldn't throw the exec and eval commands into another function just to clean things up. Chris On Thu, May 7, 2009 at 2:45 PM, wrote: > On Thu, May 7, 2009 at 2:11 PM, Chris Colbert wrote: > > alright I got it working. Thanks! > > > > This version is an astonishingly 1900x faster than my original > > implementation which had two for loops. Both versions are below: > > > > thanks again! 
> > > > ### new fast code #### > > > > b = 4.7 > > n = arange(1, N+1, 1.0).reshape(N, -1) > > n1 = (-1)**n > > prefix = exp(b) / timearray > > > > arg1 = {'S': b / timearray} > > exec('from numpy import *', arg1) > > term1 = (0.5) * eval(transform, arg1) > > > > temp1 = b + (1J * pi * n) > > temp2 = temp1 / timearray > > arg2 = {'S': temp2} > > exec('from numpy import *', arg2) > > term2 = (eval(transform, arg2) * n1).sum(axis=0).real > > > > f = prefix * (term1 + term2) > > > > return f > > If you don't do code generation and have control over transform, then, > I think, it would be more readable to replace the exec and eval by a > function call to transform. > > I haven't found a case yet where eval is necessary, except for code > generation as in sympy. > > Josef > > > > > > > ##### old slow code ###### > > b = 4.7 > > f = [] > > > > for t in timearray: > > rsum = 0.0 > > for n in range(1, N+1): > > arg1 = {'S': ((b/t) + (1J*n*pi/t))} > > exec('from numpy import *', arg1) > > tempval = eval(transform, arg1)*((-1)**n) > > rsum = rsum + tempval.real > > arg2 = {'S': b/t} > > exec('from numpy import *', arg2) > > tempval2 = eval(transform, arg2)*0.5 > > fval = (exp(b) / t) * (tempval2 + rsum) > > f.append(fval) > > > > return f > > > > > > > > > > On Thu, May 7, 2009 at 1:41 PM, Chris Colbert > wrote: > >> > >> its part of a larger program for designing PID controllers. This > >> particular function numerical calculates the inverse laplace transform > using > >> riemann sums. > >> > >> The exec statements, from what i gather, allow the follow eval statement > >> to be executed in the scope of numpy and its functions. I don't get how > it > >> works either, but it doesnt work without it. > >> > >> I've just about got something working using broadcasting and will post > it > >> soon. > >> > >> chris > >> > >> On Thu, May 7, 2009 at 1:37 PM, wrote: > >>> > >>> On Thu, May 7, 2009 at 1:08 PM, Chris Colbert > >>> wrote: > >>> > let me just post my code: > >>> > > >>> > t is the time array and n is also an array. > >>> > > >>> > For every value of time t, these operations are performed on the > entire > >>> > array n. Then, n is summed to a scalar which represents the system > >>> > response > >>> > at time t. > >>> > > >>> > I would like to eliminate this for loop if possible. > >>> > > >>> > Chris > >>> > > >>> > #### code #### > >>> > > >>> > b = 4.7 > >>> > f = [] > >>> > n = arange(1, N+1, 1) > >>> > > >>> > for t in timearray: > >>> > arg1 = {'S': ((b/t) + (1J*n*pi/t))} > >>> > exec('from numpy import *', arg1) > >>> > tempval = eval(transform, arg1)*((-1)**n) > >>> > rsum = tempval.real.sum() > >>> > arg2 = {'S': b/t} > >>> > exec('from numpy import *', arg2) > >>> > tempval2 = eval(transform, arg2)*0.5 > >>> > fval = (exp(b) / t) * (tempval2 + rsum) > >>> > f.append(fval) > >>> > > >>> > > >>> > #### /code ##### > >>> > > >>> > >>> I don't understand what the exec statements are doing, I never use it. > >>> what is transform? > >>> Can you use regular functions instead or is there a special reason for > >>> the exec and eval? > >>> > >>> In these expressions ((b/t) + (1J*n*pi/t)), (exp(b) / t) > >>> broadcasting can be used. > >>> > >>> Whats the size of t and n? 
> >>> > >>> Josef > >>> _______________________________________________ > >>> Numpy-discussion mailing list > >>> Numpy-discussion at scipy.org > >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion > >> > > > > > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Thu May 7 15:39:43 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 7 May 2009 15:39:43 -0400 Subject: [Numpy-discussion] element wise help In-Reply-To: <7f014ea60905071210k3b4926a6pa99e882e871bac25@mail.gmail.com> References: <7f014ea60905070939j2c66ecbbu7f7cbb7f0a27e2d6@mail.gmail.com> <1cd32cbb0905070956r268dc891jc762e119d69858a@mail.gmail.com> <7f014ea60905071004n41aef796s8667c0a9745370f1@mail.gmail.com> <7f014ea60905071008r39a6a73eie040dce046fe7d31@mail.gmail.com> <1cd32cbb0905071037h5c9e9754tb4b1346eb53a88cc@mail.gmail.com> <7f014ea60905071041n1b000616nca1ea88ab2330a89@mail.gmail.com> <7f014ea60905071111q6e27f7b0s98b2d3e210d8dc79@mail.gmail.com> <1cd32cbb0905071145i79bfe627r226a4592f1b80157@mail.gmail.com> <7f014ea60905071210k3b4926a6pa99e882e871bac25@mail.gmail.com> Message-ID: <1cd32cbb0905071239t264ced31m63ff38d4876f9fe6@mail.gmail.com> On Thu, May 7, 2009 at 3:10 PM, Chris Colbert wrote: > the user of the program inputs the transform in a text field. So I have no > way of know the function apriori. > > that doesn't mean I still couldn't throw the exec and eval commands into > another function just to clean things up. > > Chris No, I think this is ok then, it is similar to what sympy.lambdify does. Now you call exec only twice, which might have been the main slowdown in the loop version. In this case, users need to write vectorized transform functions that handle the full n x t array. Josef > > On Thu, May 7, 2009 at 2:45 PM, wrote: >> >> On Thu, May 7, 2009 at 2:11 PM, Chris Colbert wrote: >> > alright I got it working. Thanks! >> > >> > This version is an astonishingly 1900x faster than my original >> > implementation which had two for loops. Both versions are below: >> > >> > thanks again! >> > >> > ### new fast code #### >> > >> > ??? b = 4.7 >> > ??? n = arange(1, N+1, 1.0).reshape(N, -1) >> > ??? n1 = (-1)**n >> > ??? prefix = exp(b) / timearray >> > >> > ??? arg1 = {'S': b / timearray} >> > ??? exec('from numpy import *', arg1) >> > ??? term1 = (0.5) * eval(transform, arg1) >> > >> > ??? temp1 = b + (1J * pi * n) >> > ??? temp2 = temp1 / timearray >> > ??? arg2 = {'S': temp2} >> > ??? exec('from numpy import *', arg2) >> > ??? term2 = (eval(transform, arg2) * n1).sum(axis=0).real >> > >> > ??? f = prefix * (term1 + term2) >> > >> > ??? return f >> >> If you don't do code generation and have control over transform, then, >> I think, it would be more readable to replace the exec and eval by a >> function call to transform. >> >> I haven't found a case yet where eval is necessary, except for code >> generation as in sympy. >> >> Josef >> >> >> >> > >> > ##### old slow code ###### >> > ??? b = 4.7 >> > ??? f = [] >> > >> > ??? for t in timearray: >> > ??????? rsum = 0.0 >> > ??????? for n in range(1, N+1): >> > ??????????? arg1 = {'S': ((b/t) + (1J*n*pi/t))} >> > ??????????? 
exec('from numpy import *', arg1) >> > ??????????? tempval = eval(transform, arg1)*((-1)**n) >> > ??????????? rsum = rsum + tempval.real >> > ??????? arg2 = {'S': b/t} >> > ??????? exec('from numpy import *', arg2) >> > ??????? tempval2 = eval(transform, arg2)*0.5 >> > ??????? fval = (exp(b) / t) * (tempval2 + rsum) >> > ??????? f.append(fval) >> > >> > ?? return f >> > >> > >> > >> > >> > On Thu, May 7, 2009 at 1:41 PM, Chris Colbert >> > wrote: >> >> >> >> its part of a larger program for designing PID controllers. This >> >> particular function numerical calculates the inverse laplace transform >> >> using >> >> riemann sums. >> >> >> >> The exec statements, from what i gather, allow the follow eval >> >> statement >> >> to be executed in the scope of numpy and its functions. I don't get how >> >> it >> >> works either, but it doesnt work without it. >> >> >> >> I've just about got something working using broadcasting and will post >> >> it >> >> soon. >> >> >> >> chris >> >> >> >> On Thu, May 7, 2009 at 1:37 PM, wrote: >> >>> >> >>> On Thu, May 7, 2009 at 1:08 PM, Chris Colbert >> >>> wrote: >> >>> > let me just post my code: >> >>> > >> >>> > t is the time array and n is also an array. >> >>> > >> >>> > For every value of time t, these operations are performed on the >> >>> > entire >> >>> > array n. Then, n is summed to a scalar which represents the system >> >>> > response >> >>> > at time t. >> >>> > >> >>> > I would like to eliminate this for loop if possible. >> >>> > >> >>> > Chris >> >>> > >> >>> > #### code #### >> >>> > >> >>> > b = 4.7 >> >>> > f = [] >> >>> > n = arange(1, N+1, 1) >> >>> > >> >>> > for t in timearray: >> >>> > ??????? arg1 = {'S': ((b/t) + (1J*n*pi/t))} >> >>> > ??????? exec('from numpy import *', arg1) >> >>> > ??????? tempval = eval(transform, arg1)*((-1)**n) >> >>> > ??????? rsum = tempval.real.sum() >> >>> > ??????? arg2 = {'S': b/t} >> >>> > ??????? exec('from numpy import *', arg2) >> >>> > ??????? tempval2 = eval(transform, arg2)*0.5 >> >>> > ??????? fval = (exp(b) / t) * (tempval2 + rsum) >> >>> > ??????? f.append(fval) >> >>> > >> >>> > >> >>> > #### /code ##### >> >>> > >> >>> >> >>> I don't understand what the exec statements are doing, I never use it. >> >>> what is transform? >> >>> Can you use regular functions instead or is there a special reason for >> >>> the exec and eval? >> >>> >> >>> In these expressions ((b/t) + (1J*n*pi/t)), ?(exp(b) / t) >> >>> broadcasting can be used. >> >>> >> >>> Whats the size of t and n? 
>> >>> >> >>> Josef >> >>> _______________________________________________ >> >>> Numpy-discussion mailing list >> >>> Numpy-discussion at scipy.org >> >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> > >> > >> > _______________________________________________ >> > Numpy-discussion mailing list >> > Numpy-discussion at scipy.org >> > http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > >> > >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From josef.pktd at gmail.com Thu May 7 16:22:05 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 7 May 2009 16:22:05 -0400 Subject: [Numpy-discussion] element wise help In-Reply-To: <1cd32cbb0905071239t264ced31m63ff38d4876f9fe6@mail.gmail.com> References: <7f014ea60905070939j2c66ecbbu7f7cbb7f0a27e2d6@mail.gmail.com> <1cd32cbb0905070956r268dc891jc762e119d69858a@mail.gmail.com> <7f014ea60905071004n41aef796s8667c0a9745370f1@mail.gmail.com> <7f014ea60905071008r39a6a73eie040dce046fe7d31@mail.gmail.com> <1cd32cbb0905071037h5c9e9754tb4b1346eb53a88cc@mail.gmail.com> <7f014ea60905071041n1b000616nca1ea88ab2330a89@mail.gmail.com> <7f014ea60905071111q6e27f7b0s98b2d3e210d8dc79@mail.gmail.com> <1cd32cbb0905071145i79bfe627r226a4592f1b80157@mail.gmail.com> <7f014ea60905071210k3b4926a6pa99e882e871bac25@mail.gmail.com> <1cd32cbb0905071239t264ced31m63ff38d4876f9fe6@mail.gmail.com> Message-ID: <1cd32cbb0905071322o19a52d6sb9c41f9e96511999@mail.gmail.com> On Thu, May 7, 2009 at 3:39 PM, wrote: > On Thu, May 7, 2009 at 3:10 PM, Chris Colbert wrote: >> the user of the program inputs the transform in a text field. So I have no >> way of know the function apriori. >> >> that doesn't mean I still couldn't throw the exec and eval commands into >> another function just to clean things up. >> >> Chris > > No, I think this is ok then, it is similar to what sympy.lambdify > does. Now you call exec only twice, which might have been the main > slowdown in the loop version. > > In this case, users need to write vectorized transform functions that > handle the full n x t array. > > Josef this would be an alternative, which might also work better in a loop, requires numpy.* in local scope >>> transform = 'sqrt(x)' >>> exec('def fun(x): return ' + transform) >>> from numpy import * >>> fun(5) 2.2360679774997898 >>> fun(np.arange(5)) array([ 0. , 1. , 1.41421356, 1.73205081, 2. 
]) >>> >>> fun Josef From sccolbert at gmail.com Thu May 7 16:25:19 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Thu, 7 May 2009 16:25:19 -0400 Subject: [Numpy-discussion] element wise help In-Reply-To: <1cd32cbb0905071322o19a52d6sb9c41f9e96511999@mail.gmail.com> References: <7f014ea60905070939j2c66ecbbu7f7cbb7f0a27e2d6@mail.gmail.com> <7f014ea60905071004n41aef796s8667c0a9745370f1@mail.gmail.com> <7f014ea60905071008r39a6a73eie040dce046fe7d31@mail.gmail.com> <1cd32cbb0905071037h5c9e9754tb4b1346eb53a88cc@mail.gmail.com> <7f014ea60905071041n1b000616nca1ea88ab2330a89@mail.gmail.com> <7f014ea60905071111q6e27f7b0s98b2d3e210d8dc79@mail.gmail.com> <1cd32cbb0905071145i79bfe627r226a4592f1b80157@mail.gmail.com> <7f014ea60905071210k3b4926a6pa99e882e871bac25@mail.gmail.com> <1cd32cbb0905071239t264ced31m63ff38d4876f9fe6@mail.gmail.com> <1cd32cbb0905071322o19a52d6sb9c41f9e96511999@mail.gmail.com> Message-ID: <7f014ea60905071325o6e6b10dxd192c644c2f4c4a6@mail.gmail.com> that's essentially what the eval statement does. On Thu, May 7, 2009 at 4:22 PM, wrote: > On Thu, May 7, 2009 at 3:39 PM, wrote: > > On Thu, May 7, 2009 at 3:10 PM, Chris Colbert > wrote: > >> the user of the program inputs the transform in a text field. So I have > no > >> way of know the function apriori. > >> > >> that doesn't mean I still couldn't throw the exec and eval commands into > >> another function just to clean things up. > >> > >> Chris > > > > No, I think this is ok then, it is similar to what sympy.lambdify > > does. Now you call exec only twice, which might have been the main > > slowdown in the loop version. > > > > In this case, users need to write vectorized transform functions that > > handle the full n x t array. > > > > Josef > > this would be an alternative, which might also work better in a loop, > requires numpy.* in local scope > > >>> transform = 'sqrt(x)' > >>> exec('def fun(x): return ' + transform) > >>> from numpy import * > >>> fun(5) > 2.2360679774997898 > >>> fun(np.arange(5)) > array([ 0. , 1. , 1.41421356, 1.73205081, 2. ]) > >>> > >>> fun > > > Josef > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Thu May 7 16:31:06 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 7 May 2009 16:31:06 -0400 Subject: [Numpy-discussion] element wise help In-Reply-To: <7f014ea60905071325o6e6b10dxd192c644c2f4c4a6@mail.gmail.com> References: <7f014ea60905070939j2c66ecbbu7f7cbb7f0a27e2d6@mail.gmail.com> <7f014ea60905071008r39a6a73eie040dce046fe7d31@mail.gmail.com> <1cd32cbb0905071037h5c9e9754tb4b1346eb53a88cc@mail.gmail.com> <7f014ea60905071041n1b000616nca1ea88ab2330a89@mail.gmail.com> <7f014ea60905071111q6e27f7b0s98b2d3e210d8dc79@mail.gmail.com> <1cd32cbb0905071145i79bfe627r226a4592f1b80157@mail.gmail.com> <7f014ea60905071210k3b4926a6pa99e882e871bac25@mail.gmail.com> <1cd32cbb0905071239t264ced31m63ff38d4876f9fe6@mail.gmail.com> <1cd32cbb0905071322o19a52d6sb9c41f9e96511999@mail.gmail.com> <7f014ea60905071325o6e6b10dxd192c644c2f4c4a6@mail.gmail.com> Message-ID: <3d375d730905071331g5a7e23f7mf82cc7a1d687c44b@mail.gmail.com> On Thu, May 7, 2009 at 16:25, Chris Colbert wrote: > that's essentially what the eval statement does. The difference would be performance. Although I wouldn't bet money on the sign of that difference. 
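Whether re-eval'ing the string on every call or exec'ing it once into a def comes out faster is easy to check directly; a rough timing sketch, in which the transform string, the array shape, and the helper names via_eval, via_def and fun are all invented for illustration:

import timeit
import numpy as np

transform = '1.0/(S + 1.0)'   # hypothetical user-entered transform
ns = {}
exec('from numpy import *', ns)

def via_eval(S):
    # evaluate the string on every call, as in the posted code
    ns['S'] = S
    return eval(transform, ns)

# compile the string into a function once, as in the def-based suggestion
exec('def fun(S): return ' + transform, ns)
via_def = ns['fun']

S = (4.7 + 1j * np.pi * np.arange(1, 51)[:, None]) / np.linspace(0.1, 5.0, 200)[None, :]
print(timeit.timeit(lambda: via_eval(S), number=1000))
print(timeit.timeit(lambda: via_def(S), number=1000))

For array arguments of this size most of the time goes into the numpy arithmetic itself, so the two variants usually land close together; only measuring on the real transform and array sizes settles it.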
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Thu May 7 17:35:45 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 7 May 2009 15:35:45 -0600 Subject: [Numpy-discussion] efficient 3d histogram creation In-Reply-To: <4A02C6B3.3030404@student.matnat.uio.no> References: <7f014ea60905031715h635a69faof6a06e10c3621ba7@mail.gmail.com> <1cd32cbb0905041318g11dea0b9oa61f08bcc380144d@mail.gmail.com> <91cf711d0905050646q7652eepc55d1aed17b1d1ed@mail.gmail.com> <7f014ea60905061506o7253a315p940d6cbd2cc2b420@mail.gmail.com> <1cd32cbb0905061630t1e73e8a5i4bd454f789e22714@mail.gmail.com> <7f014ea60905061639k21dd12b5j4ee6c2738758284@mail.gmail.com> <1cd32cbb0905061721y233aa4c7uf580ad1d99539b1c@mail.gmail.com> <7f014ea60905061734y3dfcb299w14681ff0020a4de1@mail.gmail.com> <9457e7c80905070411x6d51c768g20f4dd0990b843a8@mail.gmail.com> <4A02C6B3.3030404@student.matnat.uio.no> Message-ID: 2009/5/7 Dag Sverre Seljebotn > St?fan van der Walt wrote: > > 2009/5/7 Chris Colbert : > >> This was really my first attempt at doing anything constructive with > Cython. > >> It was actually unbelievably easy to work with. I think i spent less > time > >> working on this, than I did trying to find an optimized solution using > pure > >> numpy and python. > > > > One aspect we often overlook is how easy it is to write a for-loop in > > comparison to vectorisation. Besides, for-loops are sometimes easier > > to read as well! > > > > I think the Cython guys are planning some sort of templating, but I'll > > CC Dag so that he can tell us more. > > We were discussing how it would/should look like, but noone's committed > to implementing it so it's pretty much up in the blue I think -- someone > might jump in and do it next week, or it might go another year, I can't > tell. > > While I'm here, also note in that code Chris wrote that you want to pay > attention to the change of default division semantics on Cython 0.12 > (especially for speed). > > http://wiki.cython.org/enhancements/division > Hi Dag, Numpy can now do separate compilations with controlled export of symbols when the object files are linked together to make a module. Does Cython have anyway of controlling the visibility of symbols or should we just include the right files in Numpy to get the needed macros? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From sccolbert at gmail.com Thu May 7 18:20:34 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Thu, 7 May 2009 18:20:34 -0400 Subject: [Numpy-discussion] efficient 3d histogram creation In-Reply-To: References: <7f014ea60905031715h635a69faof6a06e10c3621ba7@mail.gmail.com> <91cf711d0905050646q7652eepc55d1aed17b1d1ed@mail.gmail.com> <7f014ea60905061506o7253a315p940d6cbd2cc2b420@mail.gmail.com> <1cd32cbb0905061630t1e73e8a5i4bd454f789e22714@mail.gmail.com> <7f014ea60905061639k21dd12b5j4ee6c2738758284@mail.gmail.com> <1cd32cbb0905061721y233aa4c7uf580ad1d99539b1c@mail.gmail.com> <7f014ea60905061734y3dfcb299w14681ff0020a4de1@mail.gmail.com> <9457e7c80905070411x6d51c768g20f4dd0990b843a8@mail.gmail.com> <4A02C6B3.3030404@student.matnat.uio.no> Message-ID: <7f014ea60905071520g50de3887q2544de2d98cc1493@mail.gmail.com> after looking at it for a while, I don't see a way to easily speed it up using pure numpy. As a matter of fact, the behavior shown below is a little confusing. 
Using fancy indexing, multiples of the same index are interpreted as a single call to that index, probably this a for a reason that I dont currently understand. I would think multiple calls to the same index would cause multiple increments in the example below. For the life of me, I can't think of how to do this 3d histogram in numpy without a for loop. Chris ########## example code ############# >>> a = np.arange(0,8,1).reshape((2,2,2)) >>> a array([[[0, 1], [2, 3]], [[4, 5], [6, 7]]]) >>> indx = np.array([[0,0,0],[0,0,1],[0,1,0],[0,1,1],[1,0,0],[1,0,1],[1,1,0],[1,1,1]]) >>> indx = indx.reshape(4,2,3) >>> a[indx[:,:,0], indx[:,:,1], indx[:,:,2]]+=1 >>> a array([[[2, 3], [4, 5]], [[6, 7], [8, 9]]]) >>> indx2 = np.zeros((4,2,3)).astype(np.uint8) >>> a[indx2[:,:,0], indx2[:,:,1], indx2[:,:,2]]+=1 >>> a array([[[3, 3], [4, 5]], [[6, 7], [8, 9]]]) >>> On Thu, May 7, 2009 at 5:35 PM, Charles R Harris wrote: > > > 2009/5/7 Dag Sverre Seljebotn > >> St?fan van der Walt wrote: >> > 2009/5/7 Chris Colbert : >> >> This was really my first attempt at doing anything constructive with >> Cython. >> >> It was actually unbelievably easy to work with. I think i spent less >> time >> >> working on this, than I did trying to find an optimized solution using >> pure >> >> numpy and python. >> > >> > One aspect we often overlook is how easy it is to write a for-loop in >> > comparison to vectorisation. Besides, for-loops are sometimes easier >> > to read as well! >> > >> > I think the Cython guys are planning some sort of templating, but I'll >> > CC Dag so that he can tell us more. >> >> We were discussing how it would/should look like, but noone's committed >> to implementing it so it's pretty much up in the blue I think -- someone >> might jump in and do it next week, or it might go another year, I can't >> tell. >> >> While I'm here, also note in that code Chris wrote that you want to pay >> attention to the change of default division semantics on Cython 0.12 >> (especially for speed). >> >> http://wiki.cython.org/enhancements/division >> > > Hi Dag, > > Numpy can now do separate compilations with controlled export of symbols > when the object files are linked together to make a module. Does Cython have > anyway of controlling the visibility of symbols or should we just include > the right files in Numpy to get the needed macros? > > Chuck > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From brennan.williams at visualreservoir.com Thu May 7 19:36:09 2009 From: brennan.williams at visualreservoir.com (Brennan Williams) Date: Fri, 08 May 2009 11:36:09 +1200 Subject: [Numpy-discussion] replacing Nan's in a string array converted from a float array Message-ID: <4A037069.2030204@visualreservoir.com> I've created an array of strings using something like.... stringarray=self.karray.astype("|S8") If the array value is a Nan I get "1.#QNAN" in my string array. For cosmetic reasons I'd like to change this to something else, e.g. "invalid" or "inactive". My string array can be up to 100,000+ values. Is there a fast way to do this? 
Thanks Brennan From robert.kern at gmail.com Thu May 7 20:00:43 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 7 May 2009 20:00:43 -0400 Subject: [Numpy-discussion] replacing Nan's in a string array converted from a float array In-Reply-To: <4A037069.2030204@visualreservoir.com> References: <4A037069.2030204@visualreservoir.com> Message-ID: <3d375d730905071700l667f7ec5l2d37d99c4ea96bbc@mail.gmail.com> On Thu, May 7, 2009 at 19:36, Brennan Williams wrote: > I've created an array of strings using something like.... > > ? ? ? ? ? ? ?stringarray=self.karray.astype("|S8") > > If the array value is a Nan I get "1.#QNAN" in my string array. > > For cosmetic reasons I'd like to change this to something else, e.g. > "invalid" or "inactive". > > My string array can be up to 100,000+ values. > > Is there a fast way to do this? Well, there is a print option that lets you change how nans are represented when arrays are printed. It is possible that this setting should also be used when converting to string arrays. However, it does not do so currently: In [9]: %push_print --nanstr invalid Precision: 8 Threshold: 1000 Edge items: 3 Line width: 75 Suppress: False NaN: invalid Inf: Inf In [10]: a = zeros(10) In [11]: a[5] = nan In [12]: a Out[12]: array([ 0., 0., 0., 0., 0., invalid, 0., 0., 0., 0.]) In [13]: a.astype('|S8') Out[13]: array(['0.0', '0.0', '0.0', '0.0', '0.0', 'nan', '0.0', '0.0', '0.0', '0.0'], dtype='|S8') You will need to use the typical approach: mask = (stringarray == '1.#QNAN') stringarray[mask] = 'invalid' This will be wasteful of memory, so with your large array size, you might want to consider breaking it into chunks and modifying the chunks in this way. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Thu May 7 20:51:17 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 7 May 2009 18:51:17 -0600 Subject: [Numpy-discussion] replacing Nan's in a string array converted from a float array In-Reply-To: <4A037069.2030204@visualreservoir.com> References: <4A037069.2030204@visualreservoir.com> Message-ID: On Thu, May 7, 2009 at 5:36 PM, Brennan Williams < brennan.williams at visualreservoir.com> wrote: > I've created an array of strings using something like.... > > stringarray=self.karray.astype("|S8") > > If the array value is a Nan I get "1.#QNAN" in my string array. > > For cosmetic reasons I'd like to change this to something else, e.g. > "invalid" or "inactive". > > My string array can be up to 100,000+ values. > > Is there a fast way to do this? > I think this is a bug. Making the printing of nans uniform was one of the goals of numpy 1.3, although a few bits were unfixable. However, this looks fixable. If you are using 1.3 please open a ticket and note the OS and numpy version. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
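A related sketch for the NaN-replacement question above: keying the replacement off the original float array with np.isnan avoids matching the platform-specific NaN text ('1.#QNAN' on Windows, 'nan' elsewhere) altogether. karray below is made-up stand-in data:

import numpy as np

karray = np.array([1.25, np.nan, 3.5, np.nan])   # stand-in for the real float array

stringarray = karray.astype('|S8')
stringarray[np.isnan(karray)] = 'inactive'       # mask comes from the floats, not the strings

Because the mask is computed on the float data, the same code behaves identically on every platform; the only constraint is that the replacement text fit the |S8 field ('inactive' is exactly 8 characters).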
URL: From brennan.williams at visualreservoir.com Thu May 7 20:54:54 2009 From: brennan.williams at visualreservoir.com (Brennan Williams) Date: Fri, 08 May 2009 12:54:54 +1200 Subject: [Numpy-discussion] replacing Nan's in a string array converted from a float array In-Reply-To: References: <4A037069.2030204@visualreservoir.com> Message-ID: <4A0382DE.7090202@visualreservoir.com> Charles R Harris wrote: > > > On Thu, May 7, 2009 at 5:36 PM, Brennan Williams > > wrote: > > I've created an array of strings using something like.... > > stringarray=self.karray.astype("|S8") > > If the array value is a Nan I get "1.#QNAN" in my string array. > > For cosmetic reasons I'd like to change this to something else, e.g. > "invalid" or "inactive". > > My string array can be up to 100,000+ values. > > Is there a fast way to do this? > > > I think this is a bug. Making the printing of nans uniform was one of > the goals of numpy 1.3, although a few bits were unfixable. However, > this looks fixable. If you are using 1.3 please open a ticket and note > the OS and numpy version. > ok looks like numpy 1.3.0rc1 on winxp > Chuck > > > ------------------------------------------------------------------------ > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From ia at jscc.ru Fri May 8 03:45:51 2009 From: ia at jscc.ru (Ilya A. Kozyreff) Date: Fri, 8 May 2009 11:45:51 +0400 Subject: [Numpy-discussion] Installation NumPy v.1.2.1 on Linux Message-ID: <007701c9cfb1$05cde370$1169aa50$@ru> From: Ilya A. Kozyreff [mailto:ia at jscc.ru] Sent: Friday, May 08, 2009 11:16 AM To: 'numpy-discussion at scipy.org' Subject: Installation NumPy v.1.2.1 on Linux Hi, all! I try to install NumPy v.1.2.1 on Red Hat Linux as user like this: $ python setup.py install --prefix=/nethome/ia/usr/ $ python -c 'import numpy; numpy.test()' Traceback (most recent call last): File "", line 1, in ? ImportError: No module named numpy What I do wrong? Best regards, Ilya Kozyrev -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Fri May 8 03:58:12 2009 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 8 May 2009 07:58:12 +0000 (UTC) Subject: [Numpy-discussion] Installation NumPy v.1.2.1 on Linux References: <007701c9cfb1$05cde370$1169aa50$@ru> Message-ID: Fri, 08 May 2009 11:45:51 +0400, Ilya A. Kozyreff kirjoitti: > $ python setup.py install --prefix=/nethome/ia/usr/ > > $ python -c 'import numpy; numpy.test()' [clip] > ImportError: No module named numpy [clip] > What I do wrong? http://docs.python.org/install/index.html#modifying-python-s-search-path From stefan at sun.ac.za Fri May 8 05:37:33 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Fri, 8 May 2009 11:37:33 +0200 Subject: [Numpy-discussion] David does C code coverage! Message-ID: <9457e7c80905080237k561f7d9dj46fbbb664521f546@mail.gmail.com> Hi all, David Cournapeau got gcov working with NumPy! Well done, David! http://cournape.wordpress.com/2009/05/08/first-steps-toward-c-code-coverage-in-numpy/ Regards St?fan From malkarouri at yahoo.co.uk Fri May 8 09:33:46 2009 From: malkarouri at yahoo.co.uk (Muhammad Alkarouri) Date: Fri, 8 May 2009 13:33:46 +0000 (GMT) Subject: [Numpy-discussion] linalg.svd not working? 
In-Reply-To: <881804.58063.qm@web24202.mail.ird.yahoo.com> Message-ID: <615547.10687.qm@web24205.mail.ird.yahoo.com> Replying to myself, just that the experience may benefit a later user. --- On Wed, 6/5/09, Muhammad Alkarouri wrote: > From: Muhammad Alkarouri > Subject: Re: [Numpy-discussion] linalg.svd not working? ... > It is an atlas problem. Not that I knew how to correct it, > but I was able to build numpy with a standard package blas > and lapack, and the tests passed without incident. After installing numpy using the standard blas/lapack on red hat enterprise 4, the tests succeeded. Scipy was then compiled and installed, but it failed its test. Specifically, some tests failed with "undefined symbolr: srotmg_". This means that the standard blas library is not complete according to past emails, so I had to go back and try installing Atlas. It turns out that for the compilers I am using (gcc, g77 3.4.4) lapack and atlas must be compiled with the option "-ffloat-store" (https://bugzilla.redhat.com/show_bug.cgi?id=138683) so after doing that, recompiling and installing numpy, numpy tests succeeded (again). The last problem was when installing scipy, I had a series of errors "NoneType object is not callable", and errors with calc_lwork. Turns out they are from a library flinalg. The solution is to compile scipy with the option: python setup.py build_ext -DUNDERSCORE_G77. Probably the reason for the last group of problems would be that scipy used gfortran in some place and that the correct action was to force the g77 compiler, but I didn't check. Anyway, the numpy and scipy tests succeeded after that. For all this installation I used CC='gcc -m32 -ffloat-store' and F77FLAGS='-m32' to get 32 bit installation on an x86_64 machine. Changing most of the other flags will mess up numpy and/or scipy installation. Many thanks, Muhammad Alkarouri From dagss at student.matnat.uio.no Fri May 8 12:52:33 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Fri, 8 May 2009 18:52:33 +0200 (CEST) Subject: [Numpy-discussion] efficient 3d histogram creation In-Reply-To: References: <7f014ea60905031715h635a69faof6a06e10c3621ba7@mail.gmail.com> <1cd32cbb0905041318g11dea0b9oa61f08bcc380144d@mail.gmail.com> <91cf711d0905050646q7652eepc55d1aed17b1d1ed@mail.gmail.com> <7f014ea60905061506o7253a315p940d6cbd2cc2b420@mail.gmail.com> <1cd32cbb0905061630t1e73e8a5i4bd454f789e22714@mail.gmail.com> <7f014ea60905061639k21dd12b5j4ee6c2738758284@mail.gmail.com> <1cd32cbb0905061721y233aa4c7uf580ad1d99539b1c@mail.gmail.com> <7f014ea60905061734y3dfcb299w14681ff0020a4de1@mail.gmail.com> <9457e7c80905070411x6d51c768g20f4dd0990b843a8@mail.gmail.com> <4A02C6B3.3030404@student.matnat.uio.no> Message-ID: <854d01dbb4ab7870839fda52c15e9e0b.squirrel@webmail.uio.no> Charles R Harris wrote: > Hi Dag, > > Numpy can now do separate compilations with controlled export of symbols > when the object files are linked together to make a module. Does Cython > have > anyway of controlling the visibility of symbols or should we just include > the right files in Numpy to get the needed macros? I'll try an answer but in general it's better to ask these kind of questions on the Cython list; I'm not an expert on this part of Cython. If you refer to functions you create in Cython, they are static by default and not exported. If you declare them "public" then they are not made static. 
Finally, if you declare them "api" then they will be made static but a symbol table for the module in which they can be looked up is exported (as a Python variable in the module; __pyx_c_api).

Dag Sverre

From sccolbert at gmail.com Fri May 8 14:26:42 2009
From: sccolbert at gmail.com (Chris Colbert)
Date: Fri, 8 May 2009 14:26:42 -0400
Subject: [Numpy-discussion] David does C code coverage!
In-Reply-To: <9457e7c80905080237k561f7d9dj46fbbb664521f546@mail.gmail.com>
References: <9457e7c80905080237k561f7d9dj46fbbb664521f546@mail.gmail.com>
Message-ID: <7f014ea60905081126w23b15e49k3015d7f3beb2d4b0@mail.gmail.com>

Now man up and buy him his beer!

2009/5/8 Stéfan van der Walt

> Hi all,
>
> David Cournapeau got gcov working with NumPy! Well done, David!
>
>
> http://cournape.wordpress.com/2009/05/08/first-steps-toward-c-code-coverage-in-numpy/
>
> Regards
> Stéfan
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From stefan at sun.ac.za Fri May 8 14:49:28 2009
From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=)
Date: Fri, 8 May 2009 20:49:28 +0200
Subject: [Numpy-discussion] David does C code coverage!
In-Reply-To: <7f014ea60905081126w23b15e49k3015d7f3beb2d4b0@mail.gmail.com>
References: <9457e7c80905080237k561f7d9dj46fbbb664521f546@mail.gmail.com> <7f014ea60905081126w23b15e49k3015d7f3beb2d4b0@mail.gmail.com>
Message-ID: <9457e7c80905081149m21ecd35ax26d0dcb36e7bd2e0@mail.gmail.com>

2009/5/8 Chris Colbert :
> Now man up and buy him his beer!

All debt will be repaid at SciPy09! The tally is currently:

Joe Harrington : 1 ice-cream
I lost a bet on the number of people who would sign up to document NumPy. My bet: 30. Current count: approaching 100.

David Cournapeau: 1 beer
Implemented C code coverage, which is something I've wanted for a very long time.

I am so happy I lost these bets/challenges, and I am willing to lose more :-) This also makes me wonder whether it would be worth keeping a page with SciPy Bounties?

Enjoy your weekend!
Stéfan

From pav at iki.fi Fri May 8 15:05:13 2009
From: pav at iki.fi (Pauli Virtanen)
Date: Fri, 8 May 2009 19:05:13 +0000 (UTC)
Subject: [Numpy-discussion] Numpy Trac site redirecting in a loop?
References: <49E64FF7.3050804@jhu.edu> <49E6658A.1090004@jhu.edu> <23417366.post@talk.nabble.com> <3d375d730905061624k6376e57brfb19d72344b99ec2@mail.gmail.com> <23417595.post@talk.nabble.com>
Message-ID:

Wed, 06 May 2009 16:35:23 -0700, Thomas Robitaille wrote:
> Could it be linked to specific users, since the problem occurs when
> loading the account page? I had the same problem on two different
> computers with two different browsers.

Looks like this bug in TracAccountManager:

http://trac-hacks.org/ticket/3233

I applied the patch from the ticket; I think password resets should work now, so you can try using your old accounts again.

-- Pauli Virtanen

From thomas.robitaille at gmail.com Fri May 8 18:42:07 2009
From: thomas.robitaille at gmail.com (Thomas Robitaille)
Date: Fri, 8 May 2009 15:42:07 -0700 (PDT)
Subject: [Numpy-discussion] Numpy Trac site redirecting in a loop?
In-Reply-To: References: <49E64FF7.3050804@jhu.edu> <49E6658A.1090004@jhu.edu> <23417366.post@talk.nabble.com> <3d375d730905061624k6376e57brfb19d72344b99ec2@mail.gmail.com> <23417595.post@talk.nabble.com> Message-ID: <23454826.post@talk.nabble.com> Pauli Virtanen-3 wrote: > > I applied the patch from the ticket; I think password resets should work > now, so you can try using your old accounts again. > That worked, thanks! Now I think of it, the problem started occurring after I had forgotten my password and had to reset it. Thomas -- View this message in context: http://www.nabble.com/Numpy-Trac-site-redirecting-in-a-loop--tp23067410p23454826.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From ebressert at cfa.harvard.edu Sat May 9 17:22:18 2009 From: ebressert at cfa.harvard.edu (Eli Bressert) Date: Sat, 9 May 2009 17:22:18 -0400 Subject: [Numpy-discussion] Are masked arrays slower for processing than ndarrays? Message-ID: Hi, I'm using masked arrays to compute large-scale standard deviation, multiplication, gaussian, and weighted averages. At first I thought using the masked arrays would be a great way to sidestep looping (which it is), but it's still slower than expected. Here's a snippet of the code that I'm using it for. # Computing nearest neighbor distances. # Output will be about 270,000 rows long for the index # and 270,000x50 for the dist array. tree = ann.kd_tree(np.column_stack([l,b])) index, dist = tree.search(np.column_stack([l,b]),k=nth) # Clipping bad values by replacing them acceptable values av[np.where(av<=-10)] = -10 av[np.where(av>=50)] = 50 # Distance clipping and creating mask dist_arcsec = np.sqrt(dist)*3600 mask = dist_arcsec <= d_thresh # Creating masked array av_good = ma.array(av[index],mask=mask) dist_good = ma.array(dist_arcsec,mask=mask) # Reason why I'm using masked arrays. If these were # ndarrays with nan's, then the output would be nan. Std = np.array(np.std(av_good,axis=1)) Var = Std*Std Rho = np.zeros( (len(av), nth) ) Rho2? = np.zeros( (len(av), nth) ) dist_std = np.std(dist_good,axis=1) for j in range(nth): ??? Rho[:,j] = dist_std ??? Rho2[:,j] = Var # This part takes about 20 seconds to compute for a 270,000x50 masked array. # Using ndarrays of the same size takes about 2 second spatial_weight = 1.0 / (Rho*np.sqrt(2*np.pi)) * np.exp( - dist_good / (2*Rho**2)) # Like the spatial_weight section, this takes about 20 seconds W = spatial_weight / Rho2 # Takes less than one second. Ave = np.average(av_good,axis=1,weights=W) Any ideas on why it would take such a long time for processing? Especially the spatial_weight and W variables? Would there be a faster way to do this? Or is there a way that numpy.std can process ignore nan's when processing? Thanks, Eli Bressert From efiring at hawaii.edu Sat May 9 18:01:31 2009 From: efiring at hawaii.edu (Eric Firing) Date: Sat, 09 May 2009 12:01:31 -1000 Subject: [Numpy-discussion] Are masked arrays slower for processing than ndarrays? In-Reply-To: References: Message-ID: <4A05FD3B.8010205@hawaii.edu> Eli Bressert wrote: > Hi, > > I'm using masked arrays to compute large-scale standard deviation, > multiplication, gaussian, and weighted averages. At first I thought > using the masked arrays would be a great way to sidestep looping > (which it is), but it's still slower than expected. Here's a snippet > of the code that I'm using it for. > > # Computing nearest neighbor distances. > # Output will be about 270,000 rows long for the index > # and 270,000x50 for the dist array. 
> tree = ann.kd_tree(np.column_stack([l,b])) > index, dist = tree.search(np.column_stack([l,b]),k=nth) > > # Clipping bad values by replacing them acceptable values > av[np.where(av<=-10)] = -10 > av[np.where(av>=50)] = 50 > > # Distance clipping and creating mask > dist_arcsec = np.sqrt(dist)*3600 > mask = dist_arcsec <= d_thresh > > # Creating masked array > av_good = ma.array(av[index],mask=mask) > dist_good = ma.array(dist_arcsec,mask=mask) > > # Reason why I'm using masked arrays. If these were > # ndarrays with nan's, then the output would be nan. > Std = np.array(np.std(av_good,axis=1)) > Var = Std*Std > > Rho = np.zeros( (len(av), nth) ) > Rho2 = np.zeros( (len(av), nth) ) > > dist_std = np.std(dist_good,axis=1) > > for j in range(nth): > Rho[:,j] = dist_std > Rho2[:,j] = Var > > # This part takes about 20 seconds to compute for a 270,000x50 masked array. > # Using ndarrays of the same size takes about 2 second > spatial_weight = 1.0 / (Rho*np.sqrt(2*np.pi)) * np.exp( - dist_good / > (2*Rho**2)) > > # Like the spatial_weight section, this takes about 20 seconds > W = spatial_weight / Rho2 The short answer to your subject line is "yes". A simple illustration of division: In [11]:x = np.ones((270000,50), float) In [12]:y = np.ones((270000,50), float) In [13]:timeit x/y 10 loops, best of 3: 199 ms per loop In [14]:x = np.ma.ones((270000,50), float) In [15]:y = np.ma.ones((270000,50), float) In [16]:x[1,1] = np.ma.masked In [17]:y[1,2] = np.ma.masked In [18]:timeit x/y 10 loops, best of 3: 2.45 s per loop So it is slower by more than a factor of 10. That's much worse than I expected for division (and multiplication is similar). It makes me suspect there is might be a simple way to improve it greatly, but I haven't looked. > > # Takes less than one second. > Ave = np.average(av_good,axis=1,weights=W) > > Any ideas on why it would take such a long time for processing? > Especially the spatial_weight and W variables? Would there be a faster > way to do this? Or is there a way that numpy.std can process ignore > nan's when processing? There is a numpy.nansum; and see the following thread: http://www.mail-archive.com/numpy-discussion at scipy.org/msg09407.html Eric > > Thanks, > > Eli Bressert > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From efiring at hawaii.edu Sat May 9 20:06:30 2009 From: efiring at hawaii.edu (Eric Firing) Date: Sat, 09 May 2009 14:06:30 -1000 Subject: [Numpy-discussion] Are masked arrays slower for processing than ndarrays? In-Reply-To: References: Message-ID: <4A061A86.5020207@hawaii.edu> Eli Bressert wrote: > Hi, > > I'm using masked arrays to compute large-scale standard deviation, > multiplication, gaussian, and weighted averages. At first I thought > using the masked arrays would be a great way to sidestep looping > (which it is), but it's still slower than expected. Here's a snippet > of the code that I'm using it for. [...] > # Like the spatial_weight section, this takes about 20 seconds > W = spatial_weight / Rho2 > > # Takes less than one second. > Ave = np.average(av_good,axis=1,weights=W) > > Any ideas on why it would take such a long time for processing? A part of the slowdown is what looks to me like unnecessary copying in _MaskedBinaryOperation.__call__. It is using getdata, which applies numpy.array to its input, forcing a copy. 
I think the copy is actually unintentional, in at least one sense, and possibly two: first, because the default argument of getattr is always evaluated, even if it is not needed; and second, because the call to np.array is used where np.asarray or equivalent would suffice. The first file attached below shows the kernprof in the case of multiplying two masked arrays, shape (100000,50), with no masked elements; 2/3 of the time is taken copying the data. Now, if there are actually masked elements in the arrays, it gets much worse: see the second attachment. The total time has increased by more than a factor of 3, and the culprit is numpy.which(), a very slow function. It looks to me like it is doing nothing useful at all; the numpy binary operation is still being executed for all elements, regardless of mask, contrary to the intention implied by the comment in the code. The third attached file has a patch that fixes the getdata problem and eliminates the which(). With this patch applied we get the profile in the 4th file, to be compared to the second profile. Much better. I am pretty sure it could still be sped up quite a bit, though. It looks like the masks are essentially being calculated twice for no good reason, but I don't completely understand all the mask considerations, so at this point I am not trying to fix that problem. Eric > Especially the spatial_weight and W variables? Would there be a faster > way to do this? Or is there a way that numpy.std can process ignore > nan's when processing? > > Thanks, > > Eli Bressert > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: prof1.txt URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: prof2.txt URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: macore.diff Type: text/x-patch Size: 2285 bytes Desc: not available URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: prof3.txt URL: From efiring at hawaii.edu Sat May 9 20:17:55 2009 From: efiring at hawaii.edu (Eric Firing) Date: Sat, 09 May 2009 14:17:55 -1000 Subject: [Numpy-discussion] Are masked arrays slower for processing than ndarrays? In-Reply-To: <4A061A86.5020207@hawaii.edu> References: <4A061A86.5020207@hawaii.edu> Message-ID: <4A061D33.7080704@hawaii.edu> Eric Firing wrote: Pierre, ... I pressed "send" too soon. There are test failures with the patch I attached to my last message. I think the basic ideas are correct, but evidently there are wrinkles to be worked out. Maybe putmask() has to be used instead of where() (putmask is much faster) to maintain the ability to do *= and similar, and maybe there are other adjustments. Somehow, though, it should be possible to get decent speed for simple multiplication and division; a 10x penalty relative to ndarray operations is just too much. Eric > Eli Bressert wrote: >> Hi, >> >> I'm using masked arrays to compute large-scale standard deviation, >> multiplication, gaussian, and weighted averages. At first I thought >> using the masked arrays would be a great way to sidestep looping >> (which it is), but it's still slower than expected. Here's a snippet >> of the code that I'm using it for. > [...] 
>> # Like the spatial_weight section, this takes about 20 seconds >> W = spatial_weight / Rho2 >> >> # Takes less than one second. >> Ave = np.average(av_good,axis=1,weights=W) >> >> Any ideas on why it would take such a long time for processing? > > A part of the slowdown is what looks to me like unnecessary copying in > _MaskedBinaryOperation.__call__. It is using getdata, which applies > numpy.array to its input, forcing a copy. I think the copy is actually > unintentional, in at least one sense, and possibly two: first, because > the default argument of getattr is always evaluated, even if it is not > needed; and second, because the call to np.array is used where > np.asarray or equivalent would suffice. > > The first file attached below shows the kernprof in the case of > multiplying two masked arrays, shape (100000,50), with no masked > elements; 2/3 of the time is taken copying the data. > > Now, if there are actually masked elements in the arrays, it gets much > worse: see the second attachment. The total time has increased by more > than a factor of 3, and the culprit is numpy.which(), a very slow > function. It looks to me like it is doing nothing useful at all; the > numpy binary operation is still being executed for all elements, > regardless of mask, contrary to the intention implied by the comment in > the code. > > The third attached file has a patch that fixes the getdata problem and > eliminates the which(). > With this patch applied we get the profile in the 4th file, to be > compared to the second profile. Much better. I am pretty sure it could > still be sped up quite a bit, though. It looks like the masks are > essentially being calculated twice for no good reason, but I don't > completely understand all the mask considerations, so at this point I am > not trying to fix that problem. > > Eric > > >> Especially the spatial_weight and W variables? Would there be a faster >> way to do this? Or is there a way that numpy.std can process ignore >> nan's when processing? >> >> Thanks, >> >> Eli Bressert >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > ------------------------------------------------------------------------ > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From pgmdevlist at gmail.com Sat May 9 20:18:49 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Sat, 9 May 2009 20:18:49 -0400 Subject: [Numpy-discussion] Are masked arrays slower for processing than ndarrays? In-Reply-To: References: Message-ID: Short answer to the subject: Oh yes. Basically, MaskedArrays in its current implementation is more of a convenience class than anything. Most of the functions manipulating masked arrays create a lot of temporaries. When performance is needed, I must advise you to work directly on the data and the mask. For example, let's examine the division of 2 MaskedArrays a & b. 
* We take the 2 ndarrays of data (da and db) and the 2 ndarrays of mask (ma and mb) * we create a new array for db using np.where, putting 1 where db==0 and keeping db otherwise (if we were not doing that, we would get some NaNs down the road) * we create a new mask by combining ma and mb * we create the result array using np.where, using da where m is True, da/db otherwise (if we were not doing that, we would be processing the masked data and we may not want that) * Then, we add the mask to the result array. I suspect that the np.where functions are sub-optimal, and there might be a smarter way to achieve the same result while keeping all the functionalities (no NaNs (even masked) in the result, data kept when it should). I agree that these functionalities might be a bit overkill in simpler cases, such as yours. You may then want to use something like >>> ma.masked_array(a.data/b.data, mask=(a.mask | b.mask | (b.data==0)) Using Eric's example, I have 229ms/loop when dividing 2 ndarrays, 2.83s/loop when dividing 2 masked arrays, and down to 493ms/loop when using the quick-and-dirty function above). So anyway, you'll still be slower using MA than ndarrays, but not as slow... On May 9, 2009, at 5:22 PM, Eli Bressert wrote: > Hi, > > I'm using masked arrays to compute large-scale standard deviation, > multiplication, gaussian, and weighted averages. At first I thought > using the masked arrays would be a great way to sidestep looping > (which it is), but it's still slower than expected. Here's a snippet > of the code that I'm using it for. > > # Computing nearest neighbor distances. > # Output will be about 270,000 rows long for the index > # and 270,000x50 for the dist array. > tree = ann.kd_tree(np.column_stack([l,b])) > index, dist = tree.search(np.column_stack([l,b]),k=nth) > > # Clipping bad values by replacing them acceptable values > av[np.where(av<=-10)] = -10 > av[np.where(av>=50)] = 50 > > # Distance clipping and creating mask > dist_arcsec = np.sqrt(dist)*3600 > mask = dist_arcsec <= d_thresh > > # Creating masked array > av_good = ma.array(av[index],mask=mask) > dist_good = ma.array(dist_arcsec,mask=mask) > > # Reason why I'm using masked arrays. If these were > # ndarrays with nan's, then the output would be nan. > Std = np.array(np.std(av_good,axis=1)) > Var = Std*Std > > Rho = np.zeros( (len(av), nth) ) > Rho2 = np.zeros( (len(av), nth) ) > > dist_std = np.std(dist_good,axis=1) > > for j in range(nth): > Rho[:,j] = dist_std > Rho2[:,j] = Var > > # This part takes about 20 seconds to compute for a 270,000x50 > masked array. > # Using ndarrays of the same size takes about 2 second > spatial_weight = 1.0 / (Rho*np.sqrt(2*np.pi)) * np.exp( - dist_good / > (2*Rho**2)) > > # Like the spatial_weight section, this takes about 20 seconds > W = spatial_weight / Rho2 > > # Takes less than one second. > Ave = np.average(av_good,axis=1,weights=W) > > Any ideas on why it would take such a long time for processing? > Especially the spatial_weight and W variables? Would there be a faster > way to do this? Or is there a way that numpy.std can process ignore > nan's when processing? > > Thanks, > > Eli Bressert > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From pgmdevlist at gmail.com Sat May 9 20:37:40 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Sat, 9 May 2009 20:37:40 -0400 Subject: [Numpy-discussion] Are masked arrays slower for processing than ndarrays? 
In-Reply-To: <4A061D33.7080704@hawaii.edu> References: <4A061A86.5020207@hawaii.edu> <4A061D33.7080704@hawaii.edu> Message-ID: <268F602E-4DF0-4F5F-910F-D8C797A62040@gmail.com> On May 9, 2009, at 8:17 PM, Eric Firing wrote: > Eric Firing wrote: > > A part of the slowdown is what looks to me like unnecessary copying > in _MaskedBinaryOperation.__call__. It is using getdata, which > applies numpy.array to its input, forcing a copy. I think the copy > is actually unintentional, in at least one sense, and possibly two: > first, because the default argument of getattr is always evaluated, > even if it is not needed; and second, because the call to np.array > is used where np.asarray or equivalent would suffice. Yep, good call. the try/except should be better, and yes, I forgot to force copy=False (thought it was on by default...). I didn't know that getattr always evaluated the default, the docs are scarce on that subject... > Pierre, > > ... I pressed "send" too soon. There are test failures with the > patch I attached to my last message. I think the basic ideas are > correct, but evidently there are wrinkles to be worked out. Maybe > putmask() has to be used instead of where() (putmask is much faster) > to maintain the ability to do *= and similar, and maybe there are > other adjustments. Somehow, though, it should be possible to get > decent speed for simple multiplication and division; a 10x penalty > relative to ndarray operations is just too much. Quite agreed. It was a shock to realize that we were that slow. I gonna have to start testing w/ large arrays... I'm confident we can significantly speed up the _MaskedOperations without losing any of the features. Yes, putmask may be a better option. We could probably use the following MO: * result = a.data/b.data * putmask(result, m, a) However, I gonna need a good couple of weeks before being able to really look into it... From cournape at gmail.com Sun May 10 01:45:54 2009 From: cournape at gmail.com (David Cournapeau) Date: Sun, 10 May 2009 14:45:54 +0900 Subject: [Numpy-discussion] Detecting C API mismatch (was Managing Python with NumPy and many external libraries on multiple Windows machines) Message-ID: <5b8d13220905092245j3cb89b77wa7794ab138678937@mail.gmail.com> Hi, I worked on some code to detect C API mismatches both for developers and for users: http://github.com/cournape/numpy/tree/runtime_feature It adds the following: - if a numpy or ufunc function is added in the C API without the NPY_FEATURE_VERSION to be updated, a warning is generated at built time (the warning is turned into an exception for release) - I added a function PyArray_GetNDArrayCFeatureVersion which returns the C API version, and the version is checked in import_array. If the compile-time version > import-time version, an import error is raised, so the following happens (assuming the ABI is not changed). So we keep backward compatibility (building an extension with say numpy 1.2.1 will still work after installing numpy 1.3), and forward incompatibility is detected (building an extension with numpy 1.3.0 and importing it with installed numpy 1.2.1 will fail). Ironically, adding the function means that we have to add one function to the C API, so this will not be useful for numpy < 1.4, but I don't think it is possible to do it without modifying the C API. cheers, David From cournape at gmail.com Sun May 10 02:55:30 2009 From: cournape at gmail.com (David Cournapeau) Date: Sun, 10 May 2009 15:55:30 +0900 Subject: [Numpy-discussion] OS-X binary name... 
In-Reply-To: <4A01284E.6070104@noaa.gov> References: <4A01284E.6070104@noaa.gov> Message-ID: <5b8d13220905092355v40875665y89eb0372f4692a8d@mail.gmail.com> On Wed, May 6, 2009 at 3:03 PM, Christopher Barker wrote: > Hi all, > > The binary for OS-X on sourceforge is called: > > numpy-1.3.0-py2.5-macosx10.5.dmg > > However, as far as I can tell, it works just fine on OS-X 10.4, and > maybe even 10.3.9. I have to confess I don't understand mac os x backward compatibility story. Are you sure they are compatible ? Or is it just a happy accident ? > > Perhaps a re-naming is in order? But to what? > > I'd say: > > numpy-1.3.0-py2.5-macosx10.4.dmg > > but would folks think that it's only for 10.4? > > maybe: > > numpy-1.3.0-py2.5-macosx-python.org.dmg > > to indicate that it's for the python.org build of python2.5, though I'v > never seen anyone use that convention. At that point, we could just drop macosx altogether I think. I changed the name convention for scipy build scripts. cheers, David From stefan at sun.ac.za Sun May 10 08:15:17 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Sun, 10 May 2009 14:15:17 +0200 Subject: [Numpy-discussion] Detecting C API mismatch (was Managing Python with NumPy and many external libraries on multiple Windows machines) In-Reply-To: <5b8d13220905092245j3cb89b77wa7794ab138678937@mail.gmail.com> References: <5b8d13220905092245j3cb89b77wa7794ab138678937@mail.gmail.com> Message-ID: <9457e7c80905100515t1f1b7a4ie97360f6f070a053@mail.gmail.com> 2009/5/10 David Cournapeau : > I worked on some code to detect C API mismatches both for developers > and for users: > > http://github.com/cournape/numpy/tree/runtime_feature Great, thanks for taking care of this! I think the message "ABI version %%x of C-API" is unclear, maybe simply use "ABI version %%x" on its own. The hash file can be loaded in one line with np.loadtxt('/tmp/dat.dat', usecols=(0, 2), dtype=[('api', 'S10'), ('hash', 'S32')]) The rest looks good. Cheers St?fan From charlesr.harris at gmail.com Sun May 10 12:13:32 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 10 May 2009 10:13:32 -0600 Subject: [Numpy-discussion] Detecting C API mismatch (was Managing Python with NumPy and many external libraries on multiple Windows machines) In-Reply-To: <5b8d13220905092245j3cb89b77wa7794ab138678937@mail.gmail.com> References: <5b8d13220905092245j3cb89b77wa7794ab138678937@mail.gmail.com> Message-ID: On Sat, May 9, 2009 at 11:45 PM, David Cournapeau wrote: > Hi, > > I worked on some code to detect C API mismatches both for developers > and for users: > > http://github.com/cournape/numpy/tree/runtime_feature > > It adds the following: > - if a numpy or ufunc function is added in the C API without the > NPY_FEATURE_VERSION to be updated, a warning is generated at built > time (the warning is turned into an exception for release) > - I added a function PyArray_GetNDArrayCFeatureVersion which returns > the C API version, and the version is checked in import_array. If the > compile-time version > import-time version, an import error is raised, > so the following happens (assuming the ABI is not changed). > > So we keep backward compatibility (building an extension with say > numpy 1.2.1 will still work after installing numpy 1.3), and forward > incompatibility is detected (building an extension with numpy 1.3.0 > and importing it with installed numpy 1.2.1 will fail). 
> > Ironically, adding the function means that we have to add one function > to the C API, so this will not be useful for numpy < 1.4, but I don't > think it is possible to do it without modifying the C API. > Why not just use the current API to get the number? That looks easy to do, what is the problem? I'll fix it up if you want. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun May 10 12:55:46 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 10 May 2009 10:55:46 -0600 Subject: [Numpy-discussion] Detecting C API mismatch (was Managing Python with NumPy and many external libraries on multiple Windows machines) In-Reply-To: References: <5b8d13220905092245j3cb89b77wa7794ab138678937@mail.gmail.com> Message-ID: On Sun, May 10, 2009 at 10:13 AM, Charles R Harris < charlesr.harris at gmail.com> wrote: > > > On Sat, May 9, 2009 at 11:45 PM, David Cournapeau wrote: > >> Hi, >> >> I worked on some code to detect C API mismatches both for developers >> and for users: >> >> http://github.com/cournape/numpy/tree/runtime_feature >> >> It adds the following: >> - if a numpy or ufunc function is added in the C API without the >> NPY_FEATURE_VERSION to be updated, a warning is generated at built >> time (the warning is turned into an exception for release) >> - I added a function PyArray_GetNDArrayCFeatureVersion which returns >> the C API version, and the version is checked in import_array. If the >> compile-time version > import-time version, an import error is raised, >> so the following happens (assuming the ABI is not changed). >> >> So we keep backward compatibility (building an extension with say >> numpy 1.2.1 will still work after installing numpy 1.3), and forward >> incompatibility is detected (building an extension with numpy 1.3.0 >> and importing it with installed numpy 1.2.1 will fail). >> >> Ironically, adding the function means that we have to add one function >> to the C API, so this will not be useful for numpy < 1.4, but I don't >> think it is possible to do it without modifying the C API. >> > > Why not just use the current API to get the number? That looks easy to do, > what is the problem? I'll fix it up if you want. > As you may have noticed, I really, really, don't like adding functions to the API ;) Especially unneeded ones or ones that could be done at the python level. So I think the thing to do here is split the version into two 16 bit parts, then start with API version 0x000A and ABI version 0x0100 with the version number being 0x01000009. There is nothing sacred about the numbers as long as they remain ordered. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From rgr001 at sbcglobal.net Sun May 10 13:08:57 2009 From: rgr001 at sbcglobal.net (Robert Radocinski) Date: Sun, 10 May 2009 10:08:57 -0700 Subject: [Numpy-discussion] Subcripts for arrays of nested structures Message-ID: <4A070A29.6020705@sbcglobal.net> In order to illustrate my question which involves two related subscript expressions, on arrays of nested structures, I have created a short code example which is given below. In the code example, I create a nested "structure" data type (called dtype3) and two numpy.ndarray's (called arecords and brecords) which are identical in shape and dtype. The remaining code consists of the replacement statement (lines 34 and 49) at the heart of my question and print statements. 
Specifically, my question centers on my expectation that the statement on line 34 (arecords["data"][1] = data_values) would have the same effect on arecords as the statement on line 49 (brecords[1]["data"]=data_values) would have on brecords. From running the code example, this is obviously not the case. In the former case, all four of the "data" arrays of the second record in arecords are set to the values found in data_values. In the latter case, only the first "data" array of the second record in brecords is set to a value in data_values and the other three "data" values remain unchanged. I am baffled by the latter case involving brecords. If you examine the left-hand side and righthand side of the statement on line 49 (brecords[1]["data"]=data_values), both sides have the same shape(1-dim with 4 elements) and dtype. I am having a great deal of difficulty trying to understand why the replacement statement only effects the first "data" array and not all 4 "data" arrays. What am I overlooking? Any help in explaining this behavior would be appreciated. Thanks, RR CODE EXAMPLE ---------------------------------------------------------------------------- import numpy type1 = numpy.dtype([("a", numpy.int32), ("b", numpy.int32)]) type2 = numpy.dtype([("alpha", numpy.float64), ("beta", numpy.float64), ("gamma", numpy.float64)]) type3 = numpy.dtype([("header", type1), ("data", type2, 4)]) header_values = numpy.empty(1, dtype=type1) data_values = numpy.empty(4, dtype=type2) header_values["a"] = 1000 header_values["b"] = 2000 data_values["alpha"] = [11.0, 21.0, 31.0, 41.0] data_values["beta"] = [12.0, 22.0, 32.0, 42.0] data_values["gamma"] = [13.0, 23.0, 33.0, 43.0] arecords = numpy.empty(2, dtype=type3) brecords = numpy.empty(2, dtype=type3) print print "Case A:" print print arecords, ': arecords:'.upper() print data_values, ': data_values'.upper() print arecords["data"][1].shape, ': arecords["data"][1].shape'.upper() print arecords[1]["data"].shape, ': arecords[1]["data"].shape'.upper() print data_values.shape, ': data_values.shape' print arecords["data"][1].dtype, ': arecords["data"][1].dtype'.upper() print arecords[1]["data"].dtype, ': arecords[1]["data"].dtype'.upper() print data_values.dtype, ': data_values.dtype'.upper() arecords["header"][1] = header_values arecords["data"][1] = data_values print arecords, ': arecords:'.upper() print print "Case B:" print print brecords, ': brecords:'.upper() print data_values, ': data_values'.upper() print brecords["data"][1].shape, ': brecords["data"][1].shape'.upper() print brecords[1]["data"].shape, ': brecords[1]["data"].shape'.upper() print data_values.shape, ': data_values.shape' print brecords["data"][1].dtype, ': brecords["data"][1].dtype'.upper() print brecords[1]["data"].dtype, ': brecords[1]["data"].dtype'.upper() print data_values.dtype, ': data_values.dtype'.upper() brecords[1]["header"] = header_values brecords[1]["data"] = data_values print brecords, ': brecords:'.upper() From david at ar.media.kyoto-u.ac.jp Sun May 10 22:30:26 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 11 May 2009 11:30:26 +0900 Subject: [Numpy-discussion] Detecting C API mismatch (was Managing Python with NumPy and many external libraries on multiple Windows machines) In-Reply-To: References: <5b8d13220905092245j3cb89b77wa7794ab138678937@mail.gmail.com> Message-ID: <4A078DC2.5060006@ar.media.kyoto-u.ac.jp> Charles R Harris wrote: > > As you may have noticed, I really, really, don't like adding functions > to the API ;) Me neither :) > 
Especially unneeded ones or ones that could be done at the python > level. So I think the thing to do here is split the version into two > 16 bit parts, then start with API version 0x000A and ABI version > 0x0100 with the version number being 0x01000009. I don't think you can do that: you will break a lot of code if you do so. The currently built extensions will fail to load if the number is any different with a new numpy. We need two independent numbers: one which should prevents loading if the number is any different (the ABI part) and one which should prevents loading if the compile-time number is strictly greater than the runtime one. As the ABI part is already checked in since at least numpy 1.2 and maybe lower, you can't change it easily. They could be part of the same underlying int if we did that from the beginning, but that's not the case. > There is nothing sacred about the numbers as long as they remain ordered. if you change the number NPY_VERSION, you break every single extension already built. With my scheme, you needs one more function, but you don't break backward compatibility. cheers, David From david at ar.media.kyoto-u.ac.jp Sun May 10 22:35:11 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 11 May 2009 11:35:11 +0900 Subject: [Numpy-discussion] Detecting C API mismatch (was Managing Python with NumPy and many external libraries on multiple Windows machines) In-Reply-To: <9457e7c80905100515t1f1b7a4ie97360f6f070a053@mail.gmail.com> References: <5b8d13220905092245j3cb89b77wa7794ab138678937@mail.gmail.com> <9457e7c80905100515t1f1b7a4ie97360f6f070a053@mail.gmail.com> Message-ID: <4A078EDF.8080804@ar.media.kyoto-u.ac.jp> St?fan van der Walt wrote: > 2009/5/10 David Cournapeau : > >> I worked on some code to detect C API mismatches both for developers >> and for users: >> >> http://github.com/cournape/numpy/tree/runtime_feature >> > > Great, thanks for taking care of this! > > I think the message "ABI version %%x of C-API" is unclear, maybe > simply use "ABI version %%x" on its own. > Ok, I changed it. > The hash file can be loaded in one line with > > np.loadtxt('/tmp/dat.dat', usecols=(0, 2), dtype=[('api', 'S10'), > ('hash', 'S32')]) > Well, we need to do this at build time, and we can't assume numpy is already installed when building numpy :) David From nwagner at iam.uni-stuttgart.de Mon May 11 06:28:20 2009 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Mon, 11 May 2009 12:28:20 +0200 Subject: [Numpy-discussion] List of arrays Message-ID: Hi all, How can I convert a list of arrays into one array ? Nils >>> data [array([ 40. , 285.6, 45. , 285.3, 50. , 285.1, 55. , 284.8]), array([ 60. , 284.5, 65. , 282.8, 70. , 281.1, 75. , 280. ]), array([ 80. , 278.8, 85. , 278.1, 90. , 277.4, 95. , 276.9]), array([ 100. , 276.3, 105. , 276.1, 110. , 275.9, 115. , 275.7]), array([ 120. , 275.5, 125. , 275.2, 130. , 274.8, 135. , 274.5]), array([ 140. , 274.1, 145. , 273.7, 150. , 273.2, 155. , 272.7]), array([ 160. , 272.2, 165. , 272.1, 170. , 272. , 175. , 271.8]), array([ 180. , 271.6, 185. , 271. , 190. , 270.3, 195. , 269.5]), array([ 200. , 268.5, 205. , 267.4, 210. , 266.1, 215. , 263.5]), array([ 220. , 260.1, 225. , 256.1, 230. , 249.9, 235. , 239.3]), array([ 238.7, 186.2, 240., 160. , 245. , 119.7, 250. , 111.3])] newdata=array([ 40. , 285.6, 45. , 285.3, 50. , 285.1, 55. , 284.8, 60. , 284.5, 65. 
, 282.8, ..., 111.3]) Nils From faltet at pytables.org Mon May 11 06:40:01 2009 From: faltet at pytables.org (Francesc Alted) Date: Mon, 11 May 2009 12:40:01 +0200 Subject: [Numpy-discussion] List of arrays In-Reply-To: References: Message-ID: <200905111240.01943.faltet@pytables.org> A Monday 11 May 2009, Nils Wagner escrigu?: > Hi all, > > How can I convert a list of arrays into one array ? > > Nils > > >>> data > > [array([ 40. , 285.6, 45. , 285.3, 50. , 285.1, > 55. , 284.8]), array([ 60. , 284.5, 65. , 282.8, > 70. , 281.1, 75. , 280. ]), array([ 80. , 278.8, > 85. , 278.1, 90. , 277.4, 95. , 276.9]), array([ > 100. , 276.3, 105. , 276.1, 110. , 275.9, 115. , > 275.7]), array([ 120. , 275.5, 125. , 275.2, 130. , > 274.8, 135. , 274.5]), array([ 140. , 274.1, 145. , > 273.7, 150. , 273.2, 155. , 272.7]), array([ 160. , > 272.2, 165. , 272.1, 170. , 272. , 175. , 271.8]), > array([ 180. , 271.6, 185. , 271. , 190. , 270.3, > 195. , 269.5]), array([ 200. , 268.5, 205. , 267.4, > 210. , 266.1, 215. , 263.5]), array([ 220. , 260.1, > 225. , 256.1, 230. , 249.9, 235. , 239.3]), array([ > 238.7, 186.2, 240., 160. , 245. , 119.7, 250. , > 111.3])] > > newdata=array([ 40. , 285.6, 45. , 285.3, 50. , > 285.1, 55. , 284.8, 60. , 284.5, 65. , 282.8, ..., > 111.3]) Try np.concatenate: In [9]: a = np.arange(10) In [10]: b = np.arange(10,20) In [11]: np.concatenate(l) Out[11]: array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]) Hope that helps, -- Francesc Alted "One would expect people to feel threatened by the 'giant brains or machines that think'. In fact, the frightening computer becomes less frightening if it is used only to simulate a familiar noncomputer." -- Edsger W. Dykstra "On the cruelty of really teaching computer science" From aisaac at american.edu Mon May 11 06:54:45 2009 From: aisaac at american.edu (Alan G Isaac) Date: Mon, 11 May 2009 06:54:45 -0400 Subject: [Numpy-discussion] List of arrays In-Reply-To: References: Message-ID: <4A0803F5.8010807@american.edu> On 5/11/2009 6:28 AM Nils Wagner apparently wrote: > How can I convert a list of arrays into one array ? Do you mean one long array, so that ``concatenate`` is appropriate, or a 2d array, in which case you can just use ``array``. But your example looks like you should preallocate the larger array and fill it as the data arrive, if that's possible. Alan Isaac From nwagner at iam.uni-stuttgart.de Mon May 11 07:00:12 2009 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Mon, 11 May 2009 13:00:12 +0200 Subject: [Numpy-discussion] List of arrays In-Reply-To: <4A0803F5.8010807@american.edu> References: <4A0803F5.8010807@american.edu> Message-ID: On Mon, 11 May 2009 06:54:45 -0400 Alan G Isaac wrote: > On 5/11/2009 6:28 AM Nils Wagner apparently wrote: >> How can I convert a list of arrays into one array ? > > Do you mean one long array, so that ``concatenate`` > is appropriate, or a 2d array, in which case you > can just use ``array``. > > But your example looks like you should preallocate the > larger array and fill it as the data arrive, > if that's possible. > > Alan Isaac > Hi Alan, concatenate works fine for me. The problem is that the arrays within the list vary in length. Thank you very much. 
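For reference, a tiny self-contained check along the same lines, with pieces of different lengths (the values here are made up and shortened from the data above, so this is only a sketch):

import numpy as np

pieces = [np.array([40.0, 285.6, 45.0, 285.3]),
          np.array([60.0, 284.5]),
          np.array([238.7, 186.2, 240.0, 160.0])]

newdata = np.concatenate(pieces)
# newdata is a single 1-D array of length 10; the pieces are joined in order
# and do not need to have equal lengths.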
Nils From nwagner at iam.uni-stuttgart.de Mon May 11 08:03:45 2009 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Mon, 11 May 2009 14:03:45 +0200 Subject: [Numpy-discussion] String manipulation Message-ID: Hi all, Please consider two strings >>> line_a '12345678abcdefgh12345678' >>> line_b '12345678 abcdefgh 12345678' >>> line_b.split() ['12345678', 'abcdefgh', '12345678'] Is it possible to split line_a such that the output is ['12345678', 'abcdefgh', '12345678'] Nils From nwagner at iam.uni-stuttgart.de Mon May 11 08:06:07 2009 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Mon, 11 May 2009 14:06:07 +0200 Subject: [Numpy-discussion] FAIL: Test bug in reduceat with structured arrays Message-ID: Hi all, Can someone reproduce the following failure ? I am using >>> numpy.__version__ '1.4.0.dev6983' ====================================================================== FAIL: Test bug in reduceat with structured arrays copied for speed. ---------------------------------------------------------------------- Traceback (most recent call last): File "/data/home/nwagner/local/lib/python2.5/site-packages/nose-0.10.4-py2.5.egg/nose/case.py", line 182, in runTest self.test(*self.arg) File "/data/home/nwagner/local/lib/python2.5/site-packages/numpy/core/tests/test_umath.py", line 664, in test_reduceat assert np.all(h1 == h2) AssertionError Nils From faltet at pytables.org Mon May 11 08:25:46 2009 From: faltet at pytables.org (Francesc Alted) Date: Mon, 11 May 2009 14:25:46 +0200 Subject: [Numpy-discussion] String manipulation In-Reply-To: References: Message-ID: <200905111425.46475.faltet@pytables.org> A Monday 11 May 2009, Nils Wagner escrigu?: > Hi all, > > Please consider two strings > > >>> line_a > > '12345678abcdefgh12345678' > > >>> line_b > > '12345678 abcdefgh 12345678' > > >>> line_b.split() > > ['12345678', 'abcdefgh', '12345678'] > > Is it possible to split line_a such that the output > is > > ['12345678', 'abcdefgh', '12345678'] Mmh, your question is a bit too generic. If what you want is to separate the strings made of digits and the ones made of letters, it is worth to use regular expressions: In [22]: re.split("(\d*)", line_a)[1:-1] Out[22]: ['12345678', 'abcdefgh', '12345678'] Although regular expressions seems a bit thought to learn, they will payoff your effort in many occasions. Cheers, -- Francesc Alted "One would expect people to feel threatened by the 'giant brains or machines that think'. In fact, the frightening computer becomes less frightening if it is used only to simulate a familiar noncomputer." -- Edsger W. Dykstra "On the cruelty of really teaching computer science" From faltet at pytables.org Mon May 11 08:28:47 2009 From: faltet at pytables.org (Francesc Alted) Date: Mon, 11 May 2009 14:28:47 +0200 Subject: [Numpy-discussion] String manipulation In-Reply-To: <200905111425.46475.faltet@pytables.org> References: <200905111425.46475.faltet@pytables.org> Message-ID: <200905111428.47887.faltet@pytables.org> A Monday 11 May 2009, Francesc Alted escrigu?: > Although regular expressions seems a bit thought to learn, they will ^^^^^^^ --> tough :-\ -- Francesc Alted "One would expect people to feel threatened by the 'giant brains or machines that think'. In fact, the frightening computer becomes less frightening if it is used only to simulate a familiar noncomputer." -- Edsger W. 
Dykstra "On the cruelty of really teaching computer science" From nwagner at iam.uni-stuttgart.de Mon May 11 08:36:17 2009 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Mon, 11 May 2009 14:36:17 +0200 Subject: [Numpy-discussion] String manipulation In-Reply-To: <200905111425.46475.faltet@pytables.org> References: <200905111425.46475.faltet@pytables.org> Message-ID: On Mon, 11 May 2009 14:25:46 +0200 Francesc Alted wrote: > A Monday 11 May 2009, Nils Wagner escrigu?: >> Hi all, >> >> Please consider two strings >> >> >>> line_a >> >> '12345678abcdefgh12345678' >> >> >>> line_b >> >> '12345678 abcdefgh 12345678' >> >> >>> line_b.split() >> >> ['12345678', 'abcdefgh', '12345678'] >> >> Is it possible to split line_a such that the output >> is >> >> ['12345678', 'abcdefgh', '12345678'] > > Mmh, your question is a bit too generic. Indeed. I would like to split strings made of digits after eight characters each. >>> line_a '111111.1222222.2333333.3' >>> line_b '111111.1 222222.2 333333.3' >>> line_b.split() ['111111.1', '222222.2', '333333.3'] How can I accomplish that ? Nils From seb.binet at gmail.com Mon May 11 09:03:02 2009 From: seb.binet at gmail.com (Sebastien Binet) Date: Mon, 11 May 2009 15:03:02 +0200 Subject: [Numpy-discussion] String manipulation Message-ID: <200905111503.02782.binet@cern.ch> On Monday 11 May 2009 14:36:17 Nils Wagner wrote: > On Mon, 11 May 2009 14:25:46 +0200 > > Francesc Alted wrote: > > A Monday 11 May 2009, Nils Wagner escrigu?: > >> Hi all, > >> > >> Please consider two strings > >> > >> >>> line_a > >> > >> '12345678abcdefgh12345678' > >> > >> >>> line_b > >> > >> '12345678 abcdefgh 12345678' > >> > >> >>> line_b.split() > >> > >> ['12345678', 'abcdefgh', '12345678'] > >> > >> Is it possible to split line_a such that the output > >> is > >> > >> ['12345678', 'abcdefgh', '12345678'] > > > > Mmh, your question is a bit too generic. > > Indeed. > I would like to split strings made of digits after eight > characters each. > > >>> line_a > > '111111.1222222.2333333.3' > > >>> line_b > > '111111.1 222222.2 333333.3' > > >>> line_b.split() > > ['111111.1', '222222.2', '333333.3'] > > How can I accomplish that ? would this suit you ? >>> np.asarray(line_b,dtype=[('hdr','|S8'),('mid','|S8'),('tail','|S8')]) array(('111111.1', '222222.2', '333333.3'), dtype=[('hdr', '|S8'), ('mid', '|S8'), ('tail', '|S8')]) hth, sebastien. -- ######################################### # Dr. 
Sebastien Binet # Laboratoire de l'Accelerateur Lineaire # Universite Paris-Sud XI # Batiment 200 # 91898 Orsay ######################################### From nwagner at iam.uni-stuttgart.de Mon May 11 09:53:39 2009 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Mon, 11 May 2009 15:53:39 +0200 Subject: [Numpy-discussion] String manipulation In-Reply-To: <200905111503.02782.binet@cern.ch> References: <200905111503.02782.binet@cern.ch> Message-ID: On Mon, 11 May 2009 15:03:02 +0200 Sebastien Binet wrote: > On Monday 11 May 2009 14:36:17 Nils Wagner wrote: >> On Mon, 11 May 2009 14:25:46 +0200 >> >> Francesc Alted wrote: >> > A Monday 11 May 2009, Nils Wagner escrigu?: >> >> Hi all, >> >> >> >> Please consider two strings >> >> >> >> >>> line_a >> >> >> >> '12345678abcdefgh12345678' >> >> >> >> >>> line_b >> >> >> >> '12345678 abcdefgh 12345678' >> >> >> >> >>> line_b.split() >> >> >> >> ['12345678', 'abcdefgh', '12345678'] >> >> >> >> Is it possible to split line_a such that the output >> >> is >> >> >> >> ['12345678', 'abcdefgh', '12345678'] >> > >> > Mmh, your question is a bit too generic. >> >> Indeed. >> I would like to split strings made of digits after eight >> characters each. >> >> >>> line_a >> >> '111111.1222222.2333333.3' >> >> >>> line_b >> >> '111111.1 222222.2 333333.3' >> >> >>> line_b.split() >> >> ['111111.1', '222222.2', '333333.3'] >> >> How can I accomplish that ? > would this suit you ? >>>> np.asarray(line_b,dtype=[('hdr','|S8'),('mid','|S8'),('tail','|S8')]) > array(('111111.1', '222222.2', '333333.3'), > dtype=[('hdr', '|S8'), ('mid', '|S8'), ('tail', >'|S8')]) > > hth, > sebastien. > -- > ######################################### > # Dr. Sebastien Binet > # Laboratoire de l'Accelerateur Lineaire > # Universite Paris-Sud XI > # Batiment 200 > # 91898 Orsay > ######################################### > > here is my workaround. from numpy import arange line_a = '111111.1222222.2333333.3' # without separator line_b = '111111.1 222222.2 333333.3' # including space as a delimiter div, mod = divmod(len(line_a),8) liste = [] for j in arange(0,div): liste.append(line_a[j*8:(j+1)*8]) print liste print line_b.split() # Works for line_b but not for line_a Cheers, Nils From pav at iki.fi Mon May 11 10:05:13 2009 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 11 May 2009 14:05:13 +0000 (UTC) Subject: [Numpy-discussion] FAIL: Test bug in reduceat with structured arrays References: Message-ID: Mon, 11 May 2009 14:06:07 +0200, Nils Wagner kirjoitti: > Can someone reproduce the following failure ? I am using >>>> numpy.__version__ > '1.4.0.dev6983' > > ====================================================================== > FAIL: Test bug in reduceat with structured arrays copied for speed. > ---------------------------------------------------------------------- [clip] Buildbot can't. I'd suggest removing your build/ directory and rebuilding, to see if it's caused by some file not rebuilding properly. Otherwise, what is the platform you are using? -- Pauli Virtanen From seb.binet at gmail.com Mon May 11 10:12:10 2009 From: seb.binet at gmail.com (Sebastien Binet) Date: Mon, 11 May 2009 16:12:10 +0200 Subject: [Numpy-discussion] String manipulation In-Reply-To: References: <200905111503.02782.binet@cern.ch> Message-ID: <200905111612.10884.binet@cern.ch> hi, > here is my workaround. 
> > from numpy import arange > line_a = '111111.1222222.2333333.3' # without > separator > line_b = '111111.1 222222.2 333333.3' # including space > as a delimiter > > div, mod = divmod(len(line_a),8) > liste = [] > for j in arange(0,div): > liste.append(line_a[j*8:(j+1)*8]) > > print liste > > > print line_b.split() # Works for line_b > but not for line_a how about this, then: import numpy as np def massage(data): fmt = np.dtype([('hdr', '|S8'), ('mid', '|S8'), ('tail','|S8')]) data = data.replace(' ','') assert len(data)==3*8, "contract failed or invalid assumption" return np.asarray(data,dtype=fmt).tolist() assert(massage(line_a) == massage(line_b)) cheers, sebastien. -- ######################################### # Dr. Sebastien Binet # Laboratoire de l'Accelerateur Lineaire # Universite Paris-Sud XI # Batiment 200 # 91898 Orsay ######################################### From nwagner at iam.uni-stuttgart.de Mon May 11 10:22:37 2009 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Mon, 11 May 2009 16:22:37 +0200 Subject: [Numpy-discussion] FAIL: Test bug in reduceat with structured arrays In-Reply-To: References: Message-ID: On Mon, 11 May 2009 14:05:13 +0000 (UTC) Pauli Virtanen wrote: > Mon, 11 May 2009 14:06:07 +0200, Nils Wagner kirjoitti: >> Can someone reproduce the following failure ? I am using >>>>> numpy.__version__ >> '1.4.0.dev6983' >> >> ====================================================================== >> FAIL: Test bug in reduceat with structured arrays copied >>for speed. >> ---------------------------------------------------------------------- > [clip] > > Buildbot can't. I'd suggest removing your build/ >directory and > rebuilding, to see if it's caused by some file not >rebuilding properly. > > Otherwise, what is the platform you are using? > > -- > Pauli Virtanen > Everytime I rebuild numpy I remove the build directory before. CentOS release 4.6 x86_64 Python 2.5.1 Nils From aisaac at american.edu Mon May 11 10:41:11 2009 From: aisaac at american.edu (Alan G Isaac) Date: Mon, 11 May 2009 10:41:11 -0400 Subject: [Numpy-discussion] String manipulation In-Reply-To: References: Message-ID: <4A083907.1060906@american.edu> On 5/11/2009 8:03 AM Nils Wagner apparently wrote: >>>> line_a > '12345678abcdefgh12345678' > Is it possible to split line_a such that the output > is > > ['12345678', 'abcdefgh', '12345678'] More of a comp.lang.python question, I think: out = list() for k, g in groupby('123abc456',lambda x: x.isalpha()): out.append( ''.join(g) ) fwiw, Alan Isaac From pav at iki.fi Mon May 11 10:55:40 2009 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 11 May 2009 14:55:40 +0000 (UTC) Subject: [Numpy-discussion] FAIL: Test bug in reduceat with structured arrays References: Message-ID: Mon, 11 May 2009 16:22:37 +0200, Nils Wagner kirjoitti: > On Mon, 11 May 2009 14:05:13 +0000 (UTC) > Pauli Virtanen wrote: >> Mon, 11 May 2009 14:06:07 +0200, Nils Wagner kirjoitti: >>> Can someone reproduce the following failure ? I am using >>>>>> numpy.__version__ >>> '1.4.0.dev6983' >>> >>> ====================================================================== >>> FAIL: Test bug in reduceat with structured arrays copied >>>for speed. >>> ---------------------------------------------------------------------- >> [clip] > Everytime I rebuild numpy I remove the build directory > before. > > CentOS release 4.6 x86_64 Python 2.5.1 Ok, I can reproduce this, too. x86_64 Debian etch, Python 2.5. Probably connected to r6977, Travis's changes in reduceat. 
Don't know if the test or code is buggy, though... Wonder why buildbot's 64-bit SPARC boxes don't see this if it's something connected to 64-bitness... -- Pauli Virtanen From aisaac at american.edu Mon May 11 10:48:14 2009 From: aisaac at american.edu (Alan G Isaac) Date: Mon, 11 May 2009 10:48:14 -0400 Subject: [Numpy-discussion] String manipulation In-Reply-To: References: <200905111425.46475.faltet@pytables.org> Message-ID: <4A083AAE.1010209@american.edu> On 5/11/2009 8:36 AM Nils Wagner apparently wrote: > I would like to split strings made of digits after eight > characters each. [l[i*8:(i+1)*8] for i in range(len(l)/8)] Alan Isaac From sccolbert at gmail.com Mon May 11 11:40:35 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Mon, 11 May 2009 11:40:35 -0400 Subject: [Numpy-discussion] strange behavior convolving via fft Message-ID: <7f014ea60905110840n31188285w4b051225034e8e3@mail.gmail.com> at least I think this is strange behavior. When convolving an image with a large kernel, its know that its faster to perform the operation as multiplication in the frequency domain. The below code example shows that the results of my 2d filtering are shifted from the expected value a distance 1/2 the width of the filter in both the x and y directions. Can anyone explain why this occurs? I have been able to find the answer in any of my image processing books. The code sample below is an artificial image of size (100, 100) full of zeros, the center of the image is populated by a (10, 10) square of 1's. The filter kernel is also a (10,10) square of 1's. The expected result of the convolution would therefore be a peak at location (50,50) in the image. Instead, I get (54, 54). The same shifting occurs regardless of the image and filter (assuming the filter is symetric, so flipping isnt necessary). I came across this behavior when filtering actual images, so this is not a byproduct of this example. The same effect also occurs using the full FFT as opposed to RFFT. I have links to the images produced by this process below. Thanks for any insight anyone can give! Chris In [12]: a = np.zeros((100,100)) In [13]: a[45:55,45:55] = 1 In [15]: k = np.ones((10,10)) In [16]: afft = np.fft.rfft2(a, s=(256,256)) In [19]: kfft = np.fft.rfft2(k, s=(256,256)) In [21]: result = np.fft.irfft2(afft*kfft).real[0:100,0:100] In [23]: result.argmax() Out[23]: 5454 www.therealstevencolbert.com/dump/a.png www.therealstevencolbert.com/dump/afft.png www.therealstevencolbert.com/dump/kfft.png www.therealstevencolbert.com/dump/result.png -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chris.Barker at noaa.gov Mon May 11 14:03:26 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 11 May 2009 11:03:26 -0700 Subject: [Numpy-discussion] OS-X binary name... In-Reply-To: <5b8d13220905092355v40875665y89eb0372f4692a8d@mail.gmail.com> References: <4A01284E.6070104@noaa.gov> <5b8d13220905092355v40875665y89eb0372f4692a8d@mail.gmail.com> Message-ID: <4A08686E.9090506@noaa.gov> David Cournapeau wrote: > On Wed, May 6, 2009 at 3:03 PM, Christopher Barker >> The binary for OS-X on sourceforge is called: >> >> numpy-1.3.0-py2.5-macosx10.5.dmg >> >> However, as far as I can tell, it works just fine on OS-X 10.4, and >> maybe even 10.3.9. > > I have to confess I don't understand mac os x backward compatibility > story. Are you sure they are compatible ? Or is it just a happy > accident ? I can't be sure without knowing how it was built, but probably. 
The python.org python was built for 10.3.9 and above, and one of the points of distutils is to pass all the same flags along, so, unless it depends on other libraries built only for 10.5, it should be compatible. $otool -L multiarray.so multiarray.so: /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 88.3.9) I'm pretty sure libSystem.B is shipped with 10.3.9 (or maybe earlier...) I think a coincidence is unlikely. >> numpy-1.3.0-py2.5-macosx-python.org.dmg >> >> to indicate that it's for the python.org build of python2.5, though I'v >> never seen anyone use that convention. > > At that point, we could just drop macosx altogether I think. I guess a dmg wouldn't be anything else, so yes. > I changed > the name convention for scipy build scripts. great, thanks! -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From charlesr.harris at gmail.com Mon May 11 14:41:59 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 11 May 2009 12:41:59 -0600 Subject: [Numpy-discussion] strange behavior convolving via fft In-Reply-To: <7f014ea60905110840n31188285w4b051225034e8e3@mail.gmail.com> References: <7f014ea60905110840n31188285w4b051225034e8e3@mail.gmail.com> Message-ID: On Mon, May 11, 2009 at 9:40 AM, Chris Colbert wrote: > at least I think this is strange behavior. > > When convolving an image with a large kernel, its know that its faster to > perform the operation as multiplication in the frequency domain. The below > code example shows that the results of my 2d filtering are shifted from the > expected value a distance 1/2 the width of the filter in both the x and y > directions. Can anyone explain why this occurs? I have been able to find the > answer in any of my image processing books. > > The code sample below is an artificial image of size (100, 100) full of > zeros, the center of the image is populated by a (10, 10) square of 1's. The > filter kernel is also a (10,10) square of 1's. The expected result of the > convolution would therefore be a peak at location (50,50) in the image. > Instead, I get (54, 54). The same shifting occurs regardless of the image > and filter (assuming the filter is symetric, so flipping isnt necessary). > Your kernel is offset and the result is expected. The kernel needs to be centered on the origin, aliasing will then put parts of it in all four corners of the array *before* you transform it. If you want to keep it simple you can phase shift the transform instead. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From sccolbert at gmail.com Mon May 11 14:45:27 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Mon, 11 May 2009 14:45:27 -0400 Subject: [Numpy-discussion] strange behavior convolving via fft In-Reply-To: References: <7f014ea60905110840n31188285w4b051225034e8e3@mail.gmail.com> Message-ID: <7f014ea60905111145m259fe185m6a255d509c7e0cfa@mail.gmail.com> Ok, that makes sense. Thanks Chuck. On Mon, May 11, 2009 at 2:41 PM, Charles R Harris wrote: > > > On Mon, May 11, 2009 at 9:40 AM, Chris Colbert wrote: > >> at least I think this is strange behavior. >> >> When convolving an image with a large kernel, its know that its faster to >> perform the operation as multiplication in the frequency domain. 
The below >> code example shows that the results of my 2d filtering are shifted from the >> expected value a distance 1/2 the width of the filter in both the x and y >> directions. Can anyone explain why this occurs? I have been able to find the >> answer in any of my image processing books. >> >> The code sample below is an artificial image of size (100, 100) full of >> zeros, the center of the image is populated by a (10, 10) square of 1's. The >> filter kernel is also a (10,10) square of 1's. The expected result of the >> convolution would therefore be a peak at location (50,50) in the image. >> Instead, I get (54, 54). The same shifting occurs regardless of the image >> and filter (assuming the filter is symetric, so flipping isnt necessary). >> > > Your kernel is offset and the result is expected. The kernel needs to be > centered on the origin, aliasing will then put parts of it in all four > corners of the array *before* you transform it. If you want to keep it > simple you can phase shift the transform instead. > > Chuck > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Mon May 11 16:06:33 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Mon, 11 May 2009 22:06:33 +0200 Subject: [Numpy-discussion] strange behavior convolving via fft In-Reply-To: <7f014ea60905110840n31188285w4b051225034e8e3@mail.gmail.com> References: <7f014ea60905110840n31188285w4b051225034e8e3@mail.gmail.com> Message-ID: <9457e7c80905111306o7ef34302x2f2f6760400d4050@mail.gmail.com> Hi Chris 2009/5/11 Chris Colbert : > When convolving an image with a large kernel, its know that its faster to > perform the operation as multiplication in the frequency domain. The below > code example shows that the results of my 2d filtering are shifted from the > expected value a distance 1/2 the width of the filter in both the x and y > directions. Can anyone explain why this occurs? I have been able to find the > answer in any of my image processing books. Just as a reminder, when doing this kind of filtering always pad correctly. Scipy does this in scipy.signal.fftconvolve I've also got some filtering implemented in http://mentat.za.net/cgi-bin/hgwebdir.cgi/filter/file/e97c0a6dd0ea/lpi_filter.py Regards St?fan From sccolbert at gmail.com Mon May 11 16:15:18 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Mon, 11 May 2009 16:15:18 -0400 Subject: [Numpy-discussion] strange behavior convolving via fft In-Reply-To: <9457e7c80905111306o7ef34302x2f2f6760400d4050@mail.gmail.com> References: <7f014ea60905110840n31188285w4b051225034e8e3@mail.gmail.com> <9457e7c80905111306o7ef34302x2f2f6760400d4050@mail.gmail.com> Message-ID: <7f014ea60905111315t4446c6f4r50e0cbd55877962f@mail.gmail.com> Stefan, Did I pad my example incorrectly? Both images were upped to the larger nearest power of 2 (256)... Does the scipy implementation do this differently? I thought that since FFTW support has been dropped, that scipy and numpy use the same routines... Thanks! Chris 2009/5/11 St?fan van der Walt > Hi Chris > > 2009/5/11 Chris Colbert : > > When convolving an image with a large kernel, its know that its faster to > > perform the operation as multiplication in the frequency domain. 
The > below > > code example shows that the results of my 2d filtering are shifted from > the > > expected value a distance 1/2 the width of the filter in both the x and y > > directions. Can anyone explain why this occurs? I have been able to find > the > > answer in any of my image processing books. > > Just as a reminder, when doing this kind of filtering always pad > correctly. Scipy does this in > > scipy.signal.fftconvolve > > I've also got some filtering implemented in > > > http://mentat.za.net/cgi-bin/hgwebdir.cgi/filter/file/e97c0a6dd0ea/lpi_filter.py > > Regards > St?fan > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Mon May 11 16:25:53 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Mon, 11 May 2009 22:25:53 +0200 Subject: [Numpy-discussion] strange behavior convolving via fft In-Reply-To: <7f014ea60905111315t4446c6f4r50e0cbd55877962f@mail.gmail.com> References: <7f014ea60905110840n31188285w4b051225034e8e3@mail.gmail.com> <9457e7c80905111306o7ef34302x2f2f6760400d4050@mail.gmail.com> <7f014ea60905111315t4446c6f4r50e0cbd55877962f@mail.gmail.com> Message-ID: <9457e7c80905111325y34f867f9x806d37aba40f703d@mail.gmail.com> Hi Chris, If you have MxN and PxQ signals, you must pad them to shape M+P-1 x N+Q-1, in order to prevent circular convolution (i.e. values on the one end sliding back in at the other). Regards St?fan 2009/5/11 Chris Colbert : > Stefan, > > Did I pad my example incorrectly? Both images were upped to the larger > nearest power of 2 (256)... > > Does the scipy implementation do this differently? I thought that since FFTW > support has been dropped, that scipy and numpy use the same routines... > > Thanks! > > Chris From stefan at sun.ac.za Mon May 11 16:27:16 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Mon, 11 May 2009 22:27:16 +0200 Subject: [Numpy-discussion] strange behavior convolving via fft In-Reply-To: <7f014ea60905111315t4446c6f4r50e0cbd55877962f@mail.gmail.com> References: <7f014ea60905110840n31188285w4b051225034e8e3@mail.gmail.com> <9457e7c80905111306o7ef34302x2f2f6760400d4050@mail.gmail.com> <7f014ea60905111315t4446c6f4r50e0cbd55877962f@mail.gmail.com> Message-ID: <9457e7c80905111327h10e3e3f0u57a83825da68d68a@mail.gmail.com> 2009/5/11 Chris Colbert : > Does the scipy implementation do this differently? I thought that since FFTW > support has been dropped, that scipy and numpy use the same routines... Just to be clear, I was referring to scipy.signal.fftconvolve, not scipy's FFT (which is the same as NumPy's). Regards St?fan From sccolbert at gmail.com Mon May 11 17:21:22 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Mon, 11 May 2009 17:21:22 -0400 Subject: [Numpy-discussion] strange behavior convolving via fft In-Reply-To: <9457e7c80905111327h10e3e3f0u57a83825da68d68a@mail.gmail.com> References: <7f014ea60905110840n31188285w4b051225034e8e3@mail.gmail.com> <9457e7c80905111306o7ef34302x2f2f6760400d4050@mail.gmail.com> <7f014ea60905111315t4446c6f4r50e0cbd55877962f@mail.gmail.com> <9457e7c80905111327h10e3e3f0u57a83825da68d68a@mail.gmail.com> Message-ID: <7f014ea60905111421j6afebc2ewd89ad3111aad55de@mail.gmail.com> Thanks Stefan. 2009/5/11 St?fan van der Walt > 2009/5/11 Chris Colbert : > > Does the scipy implementation do this differently? 
I thought that since > FFTW > > support has been dropped, that scipy and numpy use the same routines... > > Just to be clear, I was referring to scipy.signal.fftconvolve, not > scipy's FFT (which is the same as NumPy's). > > Regards > St?fan > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From taste_of_r at yahoo.com Mon May 11 17:33:10 2009 From: taste_of_r at yahoo.com (Wei Su) Date: Mon, 11 May 2009 14:33:10 -0700 (PDT) Subject: [Numpy-discussion] List of arrays Message-ID: <705433.40748.qm@web43506.mail.sp1.yahoo.com> Hi, Francesc: ? The codes do not work. Guess you forgot something there. ? Thanks. ? Wei Su --- On Mon, 5/11/09, Francesc Alted wrote: From: Francesc Alted Subject: Re: [Numpy-discussion] List of arrays To: "Discussion of Numerical Python" Date: Monday, May 11, 2009, 10:40 AM A Monday 11 May 2009, Nils Wagner escrigu?: > Hi all, > > How can I convert a list of arrays into one array ? > > Nils > > >>> data > > [array([? 40. ,? 285.6,???45. ,? 285.3,???50. ,? 285.1, >???55. ,? 284.8]), array([? 60. ,? 284.5,???65. ,? 282..8, >???70. ,? 281.1,???75. ,? 280. ]), array([? 80. ,? 278..8, >???85. ,? 278.1,???90. ,? 277.4,???95. ,? 276.9]), array([ > 100. ,? 276.3,? 105. ,? 276.1,? 110. ,? 275.9,? 115. , >? 275.7]), array([ 120. ,? 275.5,? 125. ,? 275.2,? 130. , >? 274.8,? 135. ,? 274.5]), array([ 140. ,? 274.1,? 145. , >? 273.7,? 150. ,? 273.2,? 155. ,? 272.7]), array([ 160. , >? 272.2,? 165. ,? 272.1,? 170. ,? 272. ,? 175. ,? 271.8]), > array([ 180. ,? 271.6,? 185. ,? 271. ,? 190. ,? 270.3, >? 195. ,? 269.5]), array([ 200. ,? 268.5,? 205. ,? 267.4, >? 210. ,? 266.1,? 215. ,? 263.5]), array([ 220. ,? 260.1, >? 225. ,? 256.1,? 230. ,? 249.9,? 235. ,? 239.3]), array([ > 238.7,? 186.2,? 240.,? 160. ,? 245. ,? 119.7,? 250. , >? 111.3])] > > newdata=array([ 40. ,? 285.6,???45. ,? 285.3,???50. , >? 285.1, 55. ,? 284.8, 60. ,? 284.5,???65. ,? 282.8, ..., >? 111.3]) Try np.concatenate: In [9]: a = np.arange(10) In [10]: b = np.arange(10,20) In [11]: np.concatenate(l) Out[11]: array([ 0,? 1,? 2,? 3,? 4,? 5,? 6,? 7,? 8,? 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]) Hope that helps, -- Francesc Alted "One would expect people to feel threatened by the 'giant brains or machines that think'.? In fact, the frightening computer becomes less frightening if it is used only to simulate a familiar noncomputer." -- Edsger W. Dykstra ???"On the cruelty of really teaching computer science" _______________________________________________ Numpy-discussion mailing list Numpy-discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From taste_of_r at yahoo.com Mon May 11 17:44:22 2009 From: taste_of_r at yahoo.com (Wei Su) Date: Mon, 11 May 2009 14:44:22 -0700 (PDT) Subject: [Numpy-discussion] How to merge or SQL join record arrays in Python? Message-ID: <386820.30846.qm@web43514.mail.sp1.yahoo.com> ? ? Hi, All, ? Coming from SAS and R, this is probably the first thing I want to do now that I can convert my data into record arrays. But I could not find any clues after googling for a while. Any hint or suggestions will be great! ? Thanks a lot. ? Wei Su -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From Chris.Barker at noaa.gov Mon May 11 17:52:29 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 11 May 2009 14:52:29 -0700 Subject: [Numpy-discussion] List of arrays In-Reply-To: <705433.40748.qm@web43506.mail.sp1.yahoo.com> References: <705433.40748.qm@web43506.mail.sp1.yahoo.com> Message-ID: <4A089E1D.7040900@noaa.gov> Wei Su wrote: > The codes do not work. Guess you forgot something there. l wasn't defined: In [16]: a = np.arange(10) In [17]: b = np.arange(5) In [20]: l = [a,b] In [21]: l Out[21]: [array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]), array([0, 1, 2, 3, 4])] In [22]: np.concatenate(l) Out[22]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4]) -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From pgmdevlist at gmail.com Mon May 11 18:03:23 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 11 May 2009 18:03:23 -0400 Subject: [Numpy-discussion] How to merge or SQL join record arrays in Python? In-Reply-To: <386820.30846.qm@web43514.mail.sp1.yahoo.com> References: <386820.30846.qm@web43514.mail.sp1.yahoo.com> Message-ID: On May 11, 2009, at 5:44 PM, Wei Su wrote: > > Coming from SAS and R, this is probably the first thing I want to do > now that I can convert my data into record arrays. But I could not > find any clues after googling for a while. Any hint or suggestions > will be great! That depends what you want, actually, ut this should get you started http://docs.scipy.org/doc/numpy/user/basics.rec.html Note the slight difference between a structured array (fields accessible as items) and a recarray (fields accessible as items and attributes). From taste_of_r at yahoo.com Mon May 11 18:18:03 2009 From: taste_of_r at yahoo.com (Wei Su) Date: Mon, 11 May 2009 15:18:03 -0700 (PDT) Subject: [Numpy-discussion] How to merge or SQL join record arrays in Python? Message-ID: <495569.40950.qm@web43501.mail.sp1.yahoo.com> ? Hi, Pierre: ? Thanks for the reply. I can now actually turn a big list into a record array. My question is actually how to join related record arrays in Python. This is done in SAS by MERGE and PROC SQL and by merge() in R. But I have no idea how to do it in Python. ? Thanks. ? Wei Su --- On Mon, 5/11/09, Pierre GM wrote: From: Pierre GM Subject: Re: [Numpy-discussion] How to merge or SQL join record arrays in Python? To: "Discussion of Numerical Python" Date: Monday, May 11, 2009, 10:03 PM On May 11, 2009, at 5:44 PM, Wei Su wrote: > > Coming from SAS and R, this is probably the first thing I want to do? > now that I can convert my data into record arrays. But I could not? > find any clues after googling for a while. Any hint or suggestions? > will be great! That depends what you want, actually, ut this should get you started http://docs.scipy.org/doc/numpy/user/basics.rec.html Note the slight difference between a structured array (fields? accessible as items) and a recarray (fields accessible as items and? attributes). _______________________________________________ Numpy-discussion mailing list Numpy-discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From kxroberto at googlemail.com Mon May 11 18:22:22 2009 From: kxroberto at googlemail.com (Robert) Date: Tue, 12 May 2009 00:22:22 +0200 Subject: [Numpy-discussion] minimal numpy ? 
Message-ID: for use in binary distribution where I need only basics and fast startup/low memory footprint, I try to isolate the minimal ndarray type and what I need.. with "import numpy" or "import numpy.core.multiarray" almost the whole numpy package tree is imported, _dotblas etc. cxFreeze produces some 10MB numpy baggage (4MB zipped) yet when copying and using the multiarray DLL only, I can create arrays, but most things fail: >>> import multiarray >>> x=multiarray.array([5,6]) >>> x+x Traceback (most recent call last): File "", line 1, in TypeError: unsupported operand type(s) for +: 'numpy.ndarray' and 'numpy.ndarray' while this works: >>> b=numpy.core.multiarray.array([3,9]) >>> b+b array([ 6, 18]) I added some things from core.__init__.py like this: import umath import _internal # for freeze programs import numerictypes as nt multiarray.set_typeDict(nt.sctypeDict) .. but the problem of failed type self-recognition remains. What is this? What to do? From pgmdevlist at gmail.com Mon May 11 18:36:08 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 11 May 2009 18:36:08 -0400 Subject: [Numpy-discussion] How to merge or SQL join record arrays in Python? In-Reply-To: <495569.40950.qm@web43501.mail.sp1.yahoo.com> References: <495569.40950.qm@web43501.mail.sp1.yahoo.com> Message-ID: <1A95C113-130D-4257-8EB7-9743A9D11514@gmail.com> On May 11, 2009, at 6:18 PM, Wei Su wrote: > > Thanks for the reply. I can now actually turn a big list into a > record array. My question is actually how to join related record > arrays in Python. This is done in SAS by MERGE and PROC SQL and by > merge() in R. But I have no idea how to do it in Python. OK. Try numpy.lib.recfunctions.join_by, and let me know if you have any problem. It's a rewritten version of an equivalent function in matplotlib (matplotlib.mlab.rec_join), that should work (maybe not, there hasn't been enough testing feedback to judge...) From jsseabold at gmail.com Mon May 11 18:36:14 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Mon, 11 May 2009 18:36:14 -0400 Subject: [Numpy-discussion] How to merge or SQL join record arrays in Python? In-Reply-To: <495569.40950.qm@web43501.mail.sp1.yahoo.com> References: <495569.40950.qm@web43501.mail.sp1.yahoo.com> Message-ID: On Mon, May 11, 2009 at 6:18 PM, Wei Su wrote: > > Hi, Pierre: > > Thanks for the reply. I can now actually turn a big list into a record > array. My question is actually how to join related record arrays in Python. > This is done in SAS by MERGE and PROC SQL and by merge() in R. But I have no > idea how to do it in Python. > > Thanks. > > Wei Su > Does merge_arrays in numpy.lib.recfunctions do what you want? Skipper From pwang at enthought.com Mon May 11 18:49:02 2009 From: pwang at enthought.com (Peter Wang) Date: Mon, 11 May 2009 17:49:02 -0500 Subject: [Numpy-discussion] How to include numpy headers in C across versions 1.1, 1.2, and 1.3 Message-ID: Hey guys, I've got a small C extension that uses isnan() and (in numpy 1.1) had been importing it from ufuncobject.h. I see that it has now moved into npy_math.h in 1.3. What is the best way to ensure that I can reliably include this function across versions 1.1, 1.2, and 1.3? (Checking NPY_FEATURE_VERSION won't work, since it did not change from 1.2 to 1.3, although the location of the function definition did.) My best idea right now is to simply do a numpy version check in my setup.py, and hard-code some macros at the top of my C extension to #include the appropriate headers for each version.
Any help or suggestions would be appreciated! Thanks, Peter From pgmdevlist at gmail.com Mon May 11 18:56:00 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 11 May 2009 18:56:00 -0400 Subject: [Numpy-discussion] How to merge or SQL join record arrays in Python? In-Reply-To: References: <495569.40950.qm@web43501.mail.sp1.yahoo.com> Message-ID: <2EB52889-49FF-4C07-BE31-8F1D614D3408@gmail.com> On May 11, 2009, at 6:36 PM, Skipper Seabold wrote: > On Mon, May 11, 2009 at 6:18 PM, Wei Su wrote: >> >> Hi, Pierre: >> >> Thanks for the reply. I can now actually turn a big list into a >> record >> array. My question is actually how to join related record arrays in >> Python.. >> This is done in SAS by MERGE and PROC SQL and by merge() in R. But >> I have no >> idea how to do it in Python. >> >> Thanks. >> >> Wei Su >> > > Does merge_arrays in numpy.lib.recfunctions do what you want? Probably not. merge_arrays is close to concatenate, and will raise an exception if 2 fields have the same name (in the flattened version). Testing R's merge(), join_by looks like the corresponding function. From charlesr.harris at gmail.com Mon May 11 20:11:30 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 11 May 2009 18:11:30 -0600 Subject: [Numpy-discussion] How to include numpy headers in C across versions 1.1, 1.2, and 1.3 In-Reply-To: References: Message-ID: On Mon, May 11, 2009 at 4:49 PM, Peter Wang wrote: > Hey guys, > > I've got a small C extension that uses isnan() and (in numpy 1.1) had > been importing it from ufuncobject.h. I see that it has now moved > into npy_math.h in 1.3. > > What is the best way to ensure that I can reliably include this > function across versions 1.1, 1.2, and 1.3? (Checking > NPY_FEATURE_VERSION won't work, since it did not change from 1.2 to > 1.3, although the location of the function definition did.) > > My best idea right now is to simply do a numpy version check in my > setup.py, and hard-code some macros at the top of my C extension to > #include the appropriate headers for each version. > > Any help or suggestions would be appreciated! > Oops, looks like we broke the ABI ;) For numpy itself we should fix things by including npy_math in ufuncobject.h. Looks like a fixup release might be in offing here. Otherwise there might be a workaround that would work. In 1.1.x it looks like isnan is defined in ufuncobject iff it is compiled on windows. Try #include ufuncobject.h #ifdef _MSC_VER #ifndef isnan #define isnan(x) ((x) != (x)) #endif #endif Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Mon May 11 23:00:58 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 12 May 2009 12:00:58 +0900 Subject: [Numpy-discussion] How to include numpy headers in C across versions 1.1, 1.2, and 1.3 In-Reply-To: References: Message-ID: <4A08E66A.9020202@ar.media.kyoto-u.ac.jp> Charles R Harris wrote: > > > On Mon, May 11, 2009 at 4:49 PM, Peter Wang > wrote: > > Hey guys, > > I've got a small C extension that uses isnan() and (in numpy 1.1) had > been importing it from ufuncobject.h. I see that it has now moved > into npy_math.h in 1.3. > isnan is a C99 function (more exactly a macro), so we should not have defined it in the first place in public header, strictly speaking. The replacement in numpy 1.3 is npy_nan (and for every math function, replaced with the npy_ prefix). 
> My best idea right now is to simply do a numpy version check in my > setup.py, and hard-code some macros at the top of my C extension to > #include the appropriate headers for each version. > > Any help or suggestions would be appreciated! > You could just reproduce the logic used for numpy 1.3: check whether isnan is declared in math.h, and if not, use a replacement (the replacement are in npy_math.h - they are guaranteed to work on most platforms where numpy runs). It avoids hardcoding versions, which is often problematic if you need to support many platforms. cheers, David From david at ar.media.kyoto-u.ac.jp Mon May 11 23:42:06 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 12 May 2009 12:42:06 +0900 Subject: [Numpy-discussion] minimal numpy ? In-Reply-To: References: Message-ID: <4A08F00E.1040605@ar.media.kyoto-u.ac.jp> Hi Robert, Robert wrote: > for use in binary distribution where I need only basics and fast > startup/low memory footprint, I try to isolate the minimal ndarray > type and what I need.. > > with "import numpy" or "import numpy.core.multiarray" almost the > whole numpy package tree is imported, _dotblas etc. > cxFreeze produces some 10MB numpy baggage (4MB zipped) > Yes, we have some circular import going on. Ideally, numpy.core should be totally independent from the rest of the package. When built with -Os (or the equivalent with non posix compilers), numpy/core is ~ 800 kb zip compressed (2.4 Mb, uncompressed). > yet when copying and using the multiarray DLL only, I can create > arrays, but he most things fail: > > >>> import multiarray > >>> x=multiarray.array([5,6]) > >>> x+x > Traceback (most recent call last): > File "", line 1, in > TypeError: unsupported operand type(s) for +: 'numpy.ndarray' and > 'numpy.ndarray' > I think you need at least umath to make this work: when doing import numpy.core.multiarray, you pull out the whole numpy (because import foo.bar induces import foo I believe), whereas import multiarray just imports the multiarray C extension. So my suggestion would be to modify numpy such as you can do import numpy after having removed most directories inside numpy. The big ones are distutils and f2py, which should already save 2.5 Mb and are not used at all in numpy itself. IIRC, the only problematic package is numpy.lib (we import numpy.lib in numpy.core IIRC). cheers, David From faltet at pytables.org Tue May 12 03:55:38 2009 From: faltet at pytables.org (Francesc Alted) Date: Tue, 12 May 2009 09:55:38 +0200 Subject: [Numpy-discussion] List of arrays In-Reply-To: <4A089E1D.7040900@noaa.gov> References: <705433.40748.qm@web43506.mail.sp1.yahoo.com> <4A089E1D.7040900@noaa.gov> Message-ID: <200905120955.38707.faltet@pytables.org> On Monday 11 May 2009 23:52:29 Christopher Barker wrote: > Wei Su wrote: > > The codes do not work. Guess you forgot something there. > > l wasn't defined: > > In [16]: a = np.arange(10) > > In [17]: b = np.arange(5) > > In [20]: l = [a,b] > > In [21]: l > Out[21]: [array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]), array([0, 1, 2, 3, 4])] > > In [22]: np.concatenate(l) > Out[22]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4]) Oops. That's it. Thanks Chris! 
-- Francesc Alted From robince at gmail.com Tue May 12 09:12:12 2009 From: robince at gmail.com (Robin) Date: Tue, 12 May 2009 14:12:12 +0100 Subject: [Numpy-discussion] copy and paste arrays from matlab Message-ID: [crossposted to numpy-discussion and mlabwrap-user] Hi, I wrote a little utility class in Matlab that inherits from double and overloads the display function so you can easily print matlab arrays of arbitrary dimension in Numpy format for easy copy and pasting. I have to work a lot with other peoples code - and while mlabwrap and reading and writing is great, sometimes I find it easier and quicker just to copy and paste smaller arrays between interactive sessions. Anyway you put it in your Matlab path then you can do x = rand(2,3,4,5); a = array(x) You can specify the fprintf style format string either in the constructor or after: a = array(x,'%2.6f') a.format = '%2.2f' eg: >> x = rand(4,3,2); >> array(x) ans = array([[[2.071566461449581e-01, 3.501602151029837e-02], [1.589135260727248e-01, 3.766891927380323e-01], [8.757206127846399e-01, 7.259276565938600e-01]], [[7.570839415557700e-01, 3.974969411279816e-02], [8.109207856487061e-01, 5.043242527988604e-01], [6.351863794630047e-01, 7.013280585980169e-01]], [[8.863281096304466e-01, 9.885678912262633e-01], [4.765077527169480e-01, 7.634956792870943e-01], [9.728134909163066e-02, 4.588908258125032e-01]], [[4.722298594969571e-01, 6.861815984603373e-01], [1.162875322461844e-01, 4.887479677951201e-02], [9.084394562396312e-01, 5.822948089552498e-01]]]) It's a while since I've tried to do anything like this in Matlab and I must admit I found it pretty painful, so I hope it can be useful to someone else! I will try and do one for Python for copying and pasting to Matlab, but I'm expecting that to be a lot easier! Cheers Robin -------------- next part -------------- A non-text attachment was scrubbed... Name: array.m Type: application/octet-stream Size: 2104 bytes Desc: not available URL: From craig at brechmos.org Tue May 12 15:51:21 2009 From: craig at brechmos.org (brechmos) Date: Tue, 12 May 2009 12:51:21 -0700 (PDT) Subject: [Numpy-discussion] Matlab/Numpy index order Message-ID: <23509178.post@talk.nabble.com> I am very new to Numpy and relatively new to Python. I have used Matlab for 15+ years now. But, I am starting to lean toward using Numpy for all my work. One thing that I am not understanding is the order of data when read in from a file. Let's say I have a 256x256x150 uint16 dataset (MRI, 150 slices). In Matlab I would read it in as: >> fp=fopen(); >> d = fread(fp,256*256*150, 'int16'); >> fclose(fp); >> c = reshape(d, [256 256 150]); >> imagesc(c(:,:,1)); I am very used to having it read it in and doing the reshaping such that it is 256 rows by 256 columns by 150 slices. Now, Numpy, I can read in the binary data using fromfile (or open, read, close): In [85]: a = fromfile(, dtype='int16') In [86]: b = array(struct.unpack('<%dH'%(256*256*150), a)).reshape(150,256,256) So, in Numpy I have to reshape it so the "slices" are in the first dimension. Obviously, I can do a b.transpose( (1,2,0) ) to get it to look like Matlab, but... I don't understand why the index ordering is different between Matlab and Numpy. (It isn't a C/Fortran ordering thing, I don' think). Is the data access faster if I have b without the tranpose, or can I transpose it so it "looks" like Matlab without taking a hit when I do imshow( b[:,:,0] ). Any help for a Numpy newbie would be appreciated. 
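One small point on the reading step itself, before the ordering question taken up in the replies below: if the file really is raw little-endian int16 samples, the struct.unpack pass should not be needed, since np.fromfile already returns the decoded values. A minimal sketch (file name invented here) of the whole read:

In [1]: import numpy as np

In [2]: vol = np.fromfile('mri_volume.raw', dtype='<i2').reshape(150, 256, 256)

In [3]: vol.shape
Out[3]: (150, 256, 256)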
-- View this message in context: http://www.nabble.com/Matlab-Numpy-index-order-tp23509178p23509178.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From rmay31 at gmail.com Tue May 12 15:55:16 2009 From: rmay31 at gmail.com (Ryan May) Date: Tue, 12 May 2009 14:55:16 -0500 Subject: [Numpy-discussion] Matlab/Numpy index order In-Reply-To: <23509178.post@talk.nabble.com> References: <23509178.post@talk.nabble.com> Message-ID: On Tue, May 12, 2009 at 2:51 PM, brechmos wrote: > So, in Numpy I have to reshape it so the "slices" are in the first > dimension. Obviously, I can do a b.transpose( (1,2,0) ) to get it to look > like Matlab, but... > > I don't understand why the index ordering is different between Matlab and > Numpy. (It isn't a C/Fortran ordering thing, I don' think). Actually, that's precisely the reason. > Is the data access faster if I have b without the tranpose, or can I > transpose it so it "looks" like Matlab without taking a hit when I do > imshow( b[:,:,0] ). > It's going to be faster to do it without the transpose. Besides, for numpy, that imshow becomes: imshow(b[0]) Which, IMHO, looks better than Matlab. Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma -------------- next part -------------- An HTML attachment was scrubbed... URL: From craig at brechmos.org Tue May 12 16:02:00 2009 From: craig at brechmos.org (brechmos) Date: Tue, 12 May 2009 13:02:00 -0700 (PDT) Subject: [Numpy-discussion] Matlab/Numpy index order In-Reply-To: References: <23509178.post@talk.nabble.com> Message-ID: <23509345.post@talk.nabble.com> Ah, hah. In [3]: c = b.reshape((256,256,150), order='F') Ok, I needed more coffee. If I do it this way (without the transpose), it should be as fast as c=b.reshape((150,256,256)), right? It is just changing the stride (or something like that)? Or is it going to be faster without changing the order? Thanks for the help. Ryan May-3 wrote: > > On Tue, May 12, 2009 at 2:51 PM, brechmos wrote: > >> So, in Numpy I have to reshape it so the "slices" are in the first >> dimension. Obviously, I can do a b.transpose( (1,2,0) ) to get it to >> look >> like Matlab, but... >> >> I don't understand why the index ordering is different between Matlab and >> Numpy. (It isn't a C/Fortran ordering thing, I don' think). > > > Actually, that's precisely the reason. > > >> Is the data access faster if I have b without the tranpose, or can I >> transpose it so it "looks" like Matlab without taking a hit when I do >> imshow( b[:,:,0] ). >> > > It's going to be faster to do it without the transpose. Besides, for > numpy, > that imshow becomes: > > imshow(b[0]) > > Which, IMHO, looks better than Matlab. > > Ryan > > -- > Ryan May > Graduate Research Assistant > School of Meteorology > University of Oklahoma > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- View this message in context: http://www.nabble.com/Matlab-Numpy-index-order-tp23509178p23509345.html Sent from the Numpy-discussion mailing list archive at Nabble.com. 
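On whether the order='F' reshape costs anything: reshaping the freshly read 1-D buffer does not copy, it only sets different strides. A small sketch with toy shapes (2x3x4 in place of 256x256x150) that can be checked interactively:

>>> import numpy as np
>>> raw = np.arange(24, dtype=np.int16)      # stand-in for the flat data read from disk
>>> a = raw.reshape(2, 3, 4)                 # C order: last axis varies fastest
>>> b = raw.reshape(4, 3, 2, order='F')      # Fortran order, Matlab-style layout
>>> np.all(a.transpose(2, 1, 0) == b)
True
>>> b.flags['OWNDATA'], b.strides            # a view on the same buffer, just new strides
(False, (2, 8, 24))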
From robert.kern at gmail.com Tue May 12 16:02:24 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 12 May 2009 15:02:24 -0500 Subject: [Numpy-discussion] Matlab/Numpy index order In-Reply-To: References: <23509178.post@talk.nabble.com> Message-ID: <3d375d730905121302k54d4472dh39568e6833f2caaa@mail.gmail.com> On Tue, May 12, 2009 at 14:55, Ryan May wrote: > On Tue, May 12, 2009 at 2:51 PM, brechmos wrote: >> >> So, in Numpy I have to reshape it so the "slices" are in the first >> dimension. ?Obviously, I can do a b.transpose( (1,2,0) ) to get it to look >> like Matlab, but... >> >> I don't understand why the index ordering is different between Matlab and >> Numpy. ?(It isn't a C/Fortran ordering thing, I don' think). > > Actually, that's precisely the reason. To expand on this comment, when Matlab was first released, it was basically just an interactive shell on top of FORTRAN routines from LAPACK and other linear algebra *PACKs. Consequently, it standardized on FORTRAN's column-major format. While numpy isn't really beholden to C's ordering for multidimensional arrays (numpy arrays are just blocks of strided memory, not x[i][j][k] arrays of pointers to arrays of pointers to arrays), we do want consistency with the equivalent nested Python lists, and that does imply row-major formatting by default. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From dwf at cs.toronto.edu Tue May 12 16:14:48 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Tue, 12 May 2009 16:14:48 -0400 Subject: [Numpy-discussion] Matlab/Numpy index order In-Reply-To: References: <23509178.post@talk.nabble.com> Message-ID: <030F44C8-36D1-4C82-ADEC-98D6C508AD83@cs.toronto.edu> On 12-May-09, at 3:55 PM, Ryan May wrote: > > It's going to be faster to do it without the transpose. Besides, > for numpy, > that imshow becomes: > > imshow(b[0]) > > Which, IMHO, looks better than Matlab. You're right, that is better, odd how I never thought of doing it like that. I've been stuck in my Matlab-esque world with dstack() as my default mental model of how images/matrices ought to be stacked. Am I right in thinking that b[0] is stored in a big contiguous block of memory, thus making the read marginally faster than slicing on the third? David From sccolbert at gmail.com Tue May 12 16:32:40 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Tue, 12 May 2009 16:32:40 -0400 Subject: [Numpy-discussion] Matlab/Numpy index order In-Reply-To: <030F44C8-36D1-4C82-ADEC-98D6C508AD83@cs.toronto.edu> References: <23509178.post@talk.nabble.com> <030F44C8-36D1-4C82-ADEC-98D6C508AD83@cs.toronto.edu> Message-ID: <7f014ea60905121332u3e1af243u9d645ecde4a7219c@mail.gmail.com> This is interesting. I have always done RGB imaging with numpy using arrays of shape (height, width, 3). In fact, this is the form that PIL gives when calling np.asarray() on a PIL image. It does seem more efficient to be able to do a[0],a[1],a[2] to get the R, G, and B channels respectively. This, obviously is not currently the case. Would it be better for me to switch to this way of doing things and/or work a patch for PIL so that the array is built in the form (3, height, width)? Chris On Tue, May 12, 2009 at 4:14 PM, David Warde-Farley wrote: > > On 12-May-09, at 3:55 PM, Ryan May wrote: > > > > It's going to be faster to do it without the transpose. 
Besides, > > for numpy, > > that imshow becomes: > > > > imshow(b[0]) > > > > Which, IMHO, looks better than Matlab. > > You're right, that is better, odd how I never thought of doing it like > that. I've been stuck in my Matlab-esque world with dstack() as my > default mental model of how images/matrices ought to be stacked. > > Am I right in thinking that b[0] is stored in a big contiguous block > of memory, thus making the read marginally faster than slicing on the > third? > > David > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Tue May 12 16:47:51 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 12 May 2009 15:47:51 -0500 Subject: [Numpy-discussion] Matlab/Numpy index order In-Reply-To: <7f014ea60905121332u3e1af243u9d645ecde4a7219c@mail.gmail.com> References: <23509178.post@talk.nabble.com> <030F44C8-36D1-4C82-ADEC-98D6C508AD83@cs.toronto.edu> <7f014ea60905121332u3e1af243u9d645ecde4a7219c@mail.gmail.com> Message-ID: <3d375d730905121347w5d5940f5w8ad87a638cc4373d@mail.gmail.com> On Tue, May 12, 2009 at 15:32, Chris Colbert wrote: > This is interesting. > > I have always done RGB imaging with numpy using arrays of shape (height, > width, 3). In fact, this is the form that PIL gives when calling > np.asarray() on a PIL image. > > It does seem more efficient to be able to do a[0],a[1],a[2] to get the R, G, > and B channels respectively. This, obviously is not currently the case. It's not *that* much more efficient. > Would it be better for me to switch to this way of doing things? and/or work > a patch for PIL so that the array is built in the form (3, height, width)? Submitting a patch for PIL would neither be successful, nor worth your time. Not to mention breaking existing code. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From james.jackson at cern.ch Tue May 12 17:14:05 2009 From: james.jackson at cern.ch (James Jackson) Date: Tue, 12 May 2009 22:14:05 +0100 Subject: [Numpy-discussion] Problem building 1.3.0 on x86_64 platform Message-ID: <5DE54C2B-B7DD-4A6B-B870-66C0ACB4CB01@cern.ch> Hi, I am attempting (and failing...) to build numpy on a Scientific Linux 4.6 x86_64 (essentially RHEL I believe) with Python 2.4 (i386). The machine has the following Python RPM installed: python2.4-2.4-1pydotorg.i386 python2.4-tools-2.4-1pydotorg.i386 python2.4-devel-2.4-1pydotorg.i386 And also has gcc, g++ and f77 installed. Running python setup.py config appears to be successful, with lots of warnings abougt ATLAS, BLAS etc not being available (this is fine, I just want numpy for the array handling features used in matplotlib). 
However, the build fails (following doesn't show missing library warnings, as above): ------------------------------------------------------------------------------------------------------ running build running config_cc unifing config_cc, config, build_clib, build_ext, build commands -- compiler options running config_fc unifing config_fc, config, build_clib, build_ext, build commands -- fcompiler options running build_src building py_modules sources building library "npymath" sources building extension "numpy.core._sort" sources Generating build/src.linux-x86_64-2.4/numpy/core/include/numpy/config.h customize GnuFCompiler Found executable /usr/bin/g77 gnu: no Fortran 90 compiler found gnu: no Fortran 90 compiler found customize GnuFCompiler gnu: no Fortran 90 compiler found gnu: no Fortran 90 compiler found customize GnuFCompiler using config C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -g -O3 -Wall - Wstrict-prototypes -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -I/usr/include/ python2.4 -c' gcc: _configtest.c In file included from /usr/include/python2.4/Python.h:55, from _configtest.c:1: /usr/include/python2.4/pyport.h:612:2: #error "LONG_BIT definition appears wrong for platform (bad gcc/glibc config?)." In file included from /usr/include/python2.4/Python.h:55, from _configtest.c:1: /usr/include/python2.4/pyport.h:612:2: #error "LONG_BIT definition appears wrong for platform (bad gcc/glibc config?)." failure. removing: _configtest.c _configtest.o Traceback (most recent call last): File "setup.py", line 172, in ? setup_package() File "setup.py", line 165, in setup_package configuration=configuration ) File "/home/castormonitor/PythonLibs/numpy-1.3.0/numpy/distutils/ core.py", line 184, in setup return old_setup(**new_attr) File "/var/tmp/python2.4-2.4-root/usr/lib/python2.4/distutils/ core.py", line 149, in setup File "/var/tmp/python2.4-2.4-root/usr/lib/python2.4/distutils/ dist.py", line 946, in run_commands File "/var/tmp/python2.4-2.4-root/usr/lib/python2.4/distutils/ dist.py", line 966, in run_command File "/home/castormonitor/PythonLibs/numpy-1.3.0/numpy/distutils/ command/build.py", line 37, in run old_build.run(self) File "/var/tmp/python2.4-2.4-root/usr/lib/python2.4/distutils/ command/build.py", line 112, in run File "/usr/lib/python2.4/cmd.py", line 333, in run_command del help[cmd] File "/var/tmp/python2.4-2.4-root/usr/lib/python2.4/distutils/ dist.py", line 966, in run_command File "/home/castormonitor/PythonLibs/numpy-1.3.0/numpy/distutils/ command/build_src.py", line 130, in run self.build_sources() File "/home/castormonitor/PythonLibs/numpy-1.3.0/numpy/distutils/ command/build_src.py", line 147, in build_sources self.build_extension_sources(ext) File "/home/castormonitor/PythonLibs/numpy-1.3.0/numpy/distutils/ command/build_src.py", line 250, in build_extension_sources sources = self.generate_sources(sources, ext) File "/home/castormonitor/PythonLibs/numpy-1.3.0/numpy/distutils/ command/build_src.py", line 307, in generate_sources source = func(extension, build_dir) File "numpy/core/setup.py", line 286, in generate_config_h moredefs, ignored = cocache.check_types(config_cmd, ext, build_dir) File "numpy/core/setup.py", line 30, in check_types out = check_types(*a, **kw) File "numpy/core/setup.py", line 185, in check_types raise SystemError( SystemError: Cannot compiler 'Python.h'. Perhaps you need to install python-dev|python-devel. 
------------------------------------------------------------------------------------------------------ I note that the distribution directory being created is build/ src.linux-x86_64-2.4 - not i386. Can I force the architecture in the configure step, as it appears this would be the problem (hinted at by LONG_BIG wrong for platform error). Any hints gratefully received! Regards, James. From eads at soe.ucsc.edu Tue May 12 21:44:56 2009 From: eads at soe.ucsc.edu (Damian Eads) Date: Tue, 12 May 2009 18:44:56 -0700 Subject: [Numpy-discussion] Distance Formula on an Array In-Reply-To: References: Message-ID: <91b4b1ab0905121844i7b6a345em91abb0ee9f6c041f@mail.gmail.com> Hi Ian, Sorry for responding so late. I've been traveling and I'm just catching up on my e-mail now. This is easily accomplished with the cdist function, which computes the pairwise distances between two sets of vectors. In your case, one of the sets contains only a single vector. In [6]: scipy.spatial.distance.cdist([[0,4,0]],[[0,0,0],[0,1,0],[0,0,3]]) Out[6]: array([[ 4., 3., 5.]]) I hope this helps. Cheers, Damian On Sat, Apr 25, 2009 at 11:50 AM, Ian Mallett wrote: > Hi, > > I have an array sized n*3. Each three-component is a 3D position. Given > another 3D position, how is the distance between it and every > three-component in the array found with NumPy? > > So, for example, if the array is: > [[0,0,0],[0,1,0],[0,0,3]] > And the position is: > [0,4,0] > I need this array out: > [4,3,5] > (Just a simple Pythagorean Distance Formula) > > Ideas? > Thanks, > Ian > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > ----------------------------------------------------- Damian Eads Ph.D. Candidate Jack Baskin School of Engineering, UCSC E2-489 1156 High Street Machine Learning Lab Santa Cruz, CA 95064 http://www.soe.ucsc.edu/~eads From geometrian at gmail.com Tue May 12 21:46:36 2009 From: geometrian at gmail.com (Ian Mallett) Date: Tue, 12 May 2009 18:46:36 -0700 Subject: [Numpy-discussion] Distance Formula on an Array In-Reply-To: <91b4b1ab0905121844i7b6a345em91abb0ee9f6c041f@mail.gmail.com> References: <91b4b1ab0905121844i7b6a345em91abb0ee9f6c041f@mail.gmail.com> Message-ID: Thanks, but I don't want to make SciPy a dependency. NumPy is ok though. -------------- next part -------------- An HTML attachment was scrubbed... URL: From eads at soe.ucsc.edu Tue May 12 21:54:57 2009 From: eads at soe.ucsc.edu (Damian Eads) Date: Tue, 12 May 2009 18:54:57 -0700 Subject: [Numpy-discussion] Distance Formula on an Array In-Reply-To: References: <91b4b1ab0905121844i7b6a345em91abb0ee9f6c041f@mail.gmail.com> Message-ID: <91b4b1ab0905121854x2eae74besc46da280c5c7544f@mail.gmail.com> If you want the distance functionality without the rest of SciPy, you can download the scipy-cluster package (http://scipy-cluster.googlecode.com), which I still maintain. It does not depend on any other libraries except NumPy and is very easy to build. I understand if that's not an option for you. Cheers, Damian On Tue, May 12, 2009 at 6:46 PM, Ian Mallett wrote: > Thanks, but I don't want to make SciPy a dependency. NumPy is ok though. > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion ----------------------------------------------------- Damian Eads Ph.D. 
Candidate Jack Baskin School of Engineering, UCSC E2-489 1156 High Street Machine Learning Lab Santa Cruz, CA 95064 http://www.soe.ucsc.edu/~eads From geometrian at gmail.com Tue May 12 21:59:00 2009 From: geometrian at gmail.com (Ian Mallett) Date: Tue, 12 May 2009 18:59:00 -0700 Subject: [Numpy-discussion] Distance Formula on an Array In-Reply-To: <91b4b1ab0905121854x2eae74besc46da280c5c7544f@mail.gmail.com> References: <91b4b1ab0905121844i7b6a345em91abb0ee9f6c041f@mail.gmail.com> <91b4b1ab0905121854x2eae74besc46da280c5c7544f@mail.gmail.com> Message-ID: Hey, this looks cool! I may use it in the future. The problem has already been solved, though, and I don't think changing it is necessary. I'd also like to keep the dependencies (even packaged ones) to a minimum. -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Tue May 12 22:06:21 2009 From: cournape at gmail.com (David Cournapeau) Date: Wed, 13 May 2009 11:06:21 +0900 Subject: [Numpy-discussion] Problem building 1.3.0 on x86_64 platform In-Reply-To: <5DE54C2B-B7DD-4A6B-B870-66C0ACB4CB01@cern.ch> References: <5DE54C2B-B7DD-4A6B-B870-66C0ACB4CB01@cern.ch> Message-ID: <5b8d13220905121906t387695a5t1f669e55cbd872d@mail.gmail.com> On Wed, May 13, 2009 at 6:14 AM, James Jackson wrote: > > > I note that the distribution directory being created is build/ > src.linux-x86_64-2.4 - not i386. Can I force the architecture in the > configure step, as it appears this would be the problem (hinted at by > LONG_BIG wrong for platform error). You should make sure you are using the 32 bits python, so that 32 bits headers will be used. You could use the following to check: python -c "import platform; print platform.architecture()" cheers, David From cournape at gmail.com Tue May 12 23:02:02 2009 From: cournape at gmail.com (David Cournapeau) Date: Wed, 13 May 2009 12:02:02 +0900 Subject: [Numpy-discussion] Detecting C API mismatch (was Managing Python with NumPy and many external libraries on multiple Windows machines) In-Reply-To: <9457e7c80905100515t1f1b7a4ie97360f6f070a053@mail.gmail.com> References: <5b8d13220905092245j3cb89b77wa7794ab138678937@mail.gmail.com> <9457e7c80905100515t1f1b7a4ie97360f6f070a053@mail.gmail.com> Message-ID: <5b8d13220905122002h71f36476h44ef041ae2fad101@mail.gmail.com> 2009/5/10 St?fan van der Walt : > > I think the message "ABI version %%x of C-API" is unclear, maybe > simply use "ABI version %%x" on its own. > > The hash file can be loaded in one line with > > np.loadtxt('/tmp/dat.dat', usecols=(0, 2), dtype=[('api', 'S10'), > ('hash', 'S32')]) > > The rest looks good. Ok, I committed the branch to numpy trunk. thanks for the review, David From cimrman3 at ntc.zcu.cz Wed May 13 08:48:37 2009 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Wed, 13 May 2009 14:48:37 +0200 Subject: [Numpy-discussion] building inplace with numpy.distutils? Message-ID: <4A0AC1A5.20404@ntc.zcu.cz> Hi (David)! I am evaluating numpy.distutils as a build/install system for my project - is it possible to build the extension modules in-place so that the project can be used without installing it? A pointer to documentation concerning this would be handy... Currently I use a regular Makefile for the build, which works quite well, but is not very portable and does not solve the package installation. Otherwise let me say that numpy.distutils work very well, much better than the plain old distutils. Best regards, r. 
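Back on the distance-formula thread for a moment: since Ian wants to stay NumPy-only, the same result cdist gives can be had with plain broadcasting. A small sketch using the numbers from his original example:

>>> import numpy as np
>>> pts = np.array([[0., 0., 0.], [0., 1., 0.], [0., 0., 3.]])
>>> pos = np.array([0., 4., 0.])
>>> np.sqrt(((pts - pos) ** 2).sum(axis=1))
array([ 4.,  3.,  5.])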
From robince at gmail.com Wed May 13 12:39:01 2009 From: robince at gmail.com (Robin) Date: Wed, 13 May 2009 17:39:01 +0100 Subject: [Numpy-discussion] copy and paste arrays from matlab In-Reply-To: References: Message-ID: [crossposted to numpy-discussion and mlabwrap-user] Hi, Please find attached Python code for the opposite direction - ie format Python arrays for copy and pasting into an interactive Matlab session. It doesn't look as nice because newlines are row seperators in matlab so I put everything on one line. Also theres no way to input >2D arrays in Matlab that I know of without using reshape. In [286]: from mmat import mmat In [289]: x = rand(4,2) In [290]: mmat(x,'%2.3f') [ 0.897 0.074 ; 0.005 0.174 ; 0.207 0.736 ; 0.453 0.111 ] In [287]: mmat(x,'%2.3f') reshape([ [ 0.405 0.361 0.609 ; 0.249 0.275 0.620 ; 0.740 0.754 0.699 ; 0.280 0.053 0.181 ] [ 0.796 0.114 0.720 ; 0.296 0.692 0.352 ; 0.218 0.894 0.818 ; 0.709 0.946 0.860 ] ],[ 4 3 2 ]) In [288]: mmat(x) reshape([ [ 4.046905655728e-01 3.605995195844e-01 6.089653771166e-01 ; 2.491999503702e-01 2.751880043180e-01 6.199629932480e-01 ; 7.401974485581e-01 7.537929345351e-01 6.991798908866e-01 ; 2.800494872019e-01 5.258468515210e-02 1.812706305994e-01 ] [ 7.957907133899e-01 1.144010574386e-01 7.203522053853e-01 ; 2.962977637560e-01 6.920657079182e-01 3.522371076632e-01 ; 2.181950954650e-01 8.936401263709e-01 8.177351741233e-01 ; 7.092517323839e-01 9.458774967489e-01 8.595104463863e-01 ] ],[ 4 3 2 ]) Hope someone else finds it useful. Cheers Robin On Tue, May 12, 2009 at 2:12 PM, Robin wrote: > [crossposted to numpy-discussion and mlabwrap-user] > > Hi, > > I wrote a little utility class in Matlab that inherits from double and > overloads the display function so you can easily print matlab arrays > of arbitrary dimension in Numpy format for easy copy and pasting. > > I have to work a lot with other peoples code - and while mlabwrap and > reading and writing is great, sometimes I find it easier and quicker > just to copy and paste smaller arrays between interactive sessions. > > Anyway you put it in your Matlab path then you can do > x = rand(2,3,4,5); > a = array(x) > > You can specify the fprintf style format string either in the > constructor or after: > a = array(x,'%2.6f') > a.format = '%2.2f' > > eg: >>> x = rand(4,3,2); >>> array(x) > ans = > > array([[[2.071566461449581e-01, 3.501602151029837e-02], > ? ? ? ?[1.589135260727248e-01, 3.766891927380323e-01], > ? ? ? ?[8.757206127846399e-01, 7.259276565938600e-01]], > > ? ? ? [[7.570839415557700e-01, 3.974969411279816e-02], > ? ? ? ?[8.109207856487061e-01, 5.043242527988604e-01], > ? ? ? ?[6.351863794630047e-01, 7.013280585980169e-01]], > > ? ? ? [[8.863281096304466e-01, 9.885678912262633e-01], > ? ? ? ?[4.765077527169480e-01, 7.634956792870943e-01], > ? ? ? ?[9.728134909163066e-02, 4.588908258125032e-01]], > > ? ? ? [[4.722298594969571e-01, 6.861815984603373e-01], > ? ? ? ?[1.162875322461844e-01, 4.887479677951201e-02], > ? ? ? ?[9.084394562396312e-01, 5.822948089552498e-01]]]) > > It's a while since I've tried to do anything like this in Matlab and I > must admit I found it pretty painful, so I hope it can be useful to > someone else! > > I will try and do one for Python for copying and pasting to Matlab, > but I'm expecting that to be a lot easier! > > Cheers > > Robin > -------------- next part -------------- A non-text attachment was scrubbed... 
Name: mmat.py Type: application/octet-stream Size: 1363 bytes Desc: not available URL: From josef.pktd at gmail.com Wed May 13 13:08:36 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 13 May 2009 13:08:36 -0400 Subject: [Numpy-discussion] copy and paste arrays from matlab In-Reply-To: References: Message-ID: <1cd32cbb0905131008j542e6201o25cd77ba097055f3@mail.gmail.com> On Wed, May 13, 2009 at 12:39 PM, Robin wrote: > [crossposted to numpy-discussion and mlabwrap-user] > > Hi, > > Please find attached Python code for the opposite direction - ie > format Python arrays for copy and pasting into an interactive Matlab > session. > > It doesn't look as nice because newlines are row seperators in matlab > so I put everything on one line. Also theres no way to input >2D > arrays in Matlab that I know of without using reshape. You could use ``...`` as row continuation, and the matlab help mentions ``cat`` to build multi dimensional arrays. But cat seems to require nesting for more than 3 dimensions, so is not really an improvement to reshape. >> C = cat(4, cat(3,[1,1;2,3],[1,2;3,3]),cat(3,[1,1;2,3],[1,2;3,3])); >> size(C) ans = 2 2 2 2 Thanks, it will be useful. Josef > > In [286]: from mmat import mmat > In [289]: x = rand(4,2) > In [290]: mmat(x,'%2.3f') > [ 0.897 0.074 ; ? 0.005 0.174 ; ? 0.207 0.736 ; ? 0.453 0.111 ] > In [287]: mmat(x,'%2.3f') > reshape([ ?[ 0.405 0.361 0.609 ; ? 0.249 0.275 0.620 ; ? 0.740 0.754 > 0.699 ; ? 0.280 0.053 0.181 ] [ 0.796 0.114 0.720 ; ? 0.296 0.692 > 0.352 ; ? 0.218 0.894 0.818 ; ? 0.709 0.946 0.860 ] ],[ 4 3 2 ]) > In [288]: mmat(x) > reshape([ ?[ 4.046905655728e-01 3.605995195844e-01 6.089653771166e-01 > ; ? 2.491999503702e-01 2.751880043180e-01 6.199629932480e-01 ; > 7.401974485581e-01 7.537929345351e-01 6.991798908866e-01 ; > 2.800494872019e-01 5.258468515210e-02 1.812706305994e-01 ] [ > 7.957907133899e-01 1.144010574386e-01 7.203522053853e-01 ; > 2.962977637560e-01 6.920657079182e-01 3.522371076632e-01 ; > 2.181950954650e-01 8.936401263709e-01 8.177351741233e-01 ; > 7.092517323839e-01 9.458774967489e-01 8.595104463863e-01 ] ],[ 4 3 2 > ]) > > Hope someone else finds it useful. > > Cheers > > Robin > > On Tue, May 12, 2009 at 2:12 PM, Robin wrote: >> [crossposted to numpy-discussion and mlabwrap-user] >> >> Hi, >> >> I wrote a little utility class in Matlab that inherits from double and >> overloads the display function so you can easily print matlab arrays >> of arbitrary dimension in Numpy format for easy copy and pasting. >> >> I have to work a lot with other peoples code - and while mlabwrap and >> reading and writing is great, sometimes I find it easier and quicker >> just to copy and paste smaller arrays between interactive sessions. >> >> Anyway you put it in your Matlab path then you can do >> x = rand(2,3,4,5); >> a = array(x) >> >> You can specify the fprintf style format string either in the >> constructor or after: >> a = array(x,'%2.6f') >> a.format = '%2.2f' >> >> eg: >>>> x = rand(4,3,2); >>>> array(x) >> ans = >> >> array([[[2.071566461449581e-01, 3.501602151029837e-02], >> ? ? ? ?[1.589135260727248e-01, 3.766891927380323e-01], >> ? ? ? ?[8.757206127846399e-01, 7.259276565938600e-01]], >> >> ? ? ? [[7.570839415557700e-01, 3.974969411279816e-02], >> ? ? ? ?[8.109207856487061e-01, 5.043242527988604e-01], >> ? ? ? ?[6.351863794630047e-01, 7.013280585980169e-01]], >> >> ? ? ? [[8.863281096304466e-01, 9.885678912262633e-01], >> ? ? ? ?[4.765077527169480e-01, 7.634956792870943e-01], >> ? ? ? 
?[9.728134909163066e-02, 4.588908258125032e-01]], >> >> ? ? ? [[4.722298594969571e-01, 6.861815984603373e-01], >> ? ? ? ?[1.162875322461844e-01, 4.887479677951201e-02], >> ? ? ? ?[9.084394562396312e-01, 5.822948089552498e-01]]]) >> >> It's a while since I've tried to do anything like this in Matlab and I >> must admit I found it pretty painful, so I hope it can be useful to >> someone else! >> >> I will try and do one for Python for copying and pasting to Matlab, >> but I'm expecting that to be a lot easier! >> >> Cheers >> >> Robin >> > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From strozzi2 at llnl.gov Wed May 13 16:18:45 2009 From: strozzi2 at llnl.gov (David J Strozzi) Date: Wed, 13 May 2009 13:18:45 -0700 Subject: [Numpy-discussion] (no subject) Message-ID: Hi, [You may want to edit the numpy homepage numpy.scipy.org to tell people they must subscribe to post, and adding a link to http://www.scipy.org/Mailing_Lists] Many of you probably know of the interpreter yorick by Dave Munro. As a Livermoron, I use it all the time. There are some built-in functions there, analogous to but above and beyond numpy's sum() and diff(), which are quite useful for common operations on gridded data. Of course one can write their own, but maybe they should be cleanly canonized? For instance: x = linspace(0,10,10) y = sin(x) It is common, say when integrating y(x), to take "point-centered" data and want to zone-center it: I = sum(zcen(y)*diff(x)) def zcen(x): return 0.5*(x[0:-1]+x[1:]) Besides zcen, yorick has builtins for "point centering", "un-zone centering," etc. Also, due to its slick syntax you can give these things as array "indexes": x(zcen), y(dif), z(:,sum,:) Just some thoughts, David Strozzi From pgmdevlist at gmail.com Wed May 13 18:35:21 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 13 May 2009 18:35:21 -0400 Subject: [Numpy-discussion] Are masked arrays slower for processing than ndarrays? In-Reply-To: <4A061D33.7080704@hawaii.edu> References: <4A061A86.5020207@hawaii.edu> <4A061D33.7080704@hawaii.edu> Message-ID: <44ACCB4C-0BD6-4D82-B052-B779CEB6ED12@gmail.com> All, I just committed (r6994) some modifications to numpy.ma.getdata (Eric Firing's patch) and to the ufunc wrappers that were too slow with large arrays. We're roughly 3 times faster than we used to, but still slower than the equivalent classic ufuncs (no surprise here). Here's the catch: it's basically cheating. I got rid of the pre- processing (where a mask was calculated depending on the domain and the input set to a filling value depending on this mask, before the actual computation). Instead, I force np.seterr(divide='ignore',invalid='ignore') before calling the ufunc on the .data part, then mask the invalid values (if any) and reset the corresponding entries in .data to the input. Finally, I reset the error status. All in all, we're still data-friendly, meaning that the value below a masked entry is the same as the input, but we can't say that values initially masked are discarded (they're used in the computation but reset to their initial value)... This playing around with the error status may (or may not, I don't know) cause some problems down the road. It's still faaar faster than computing the domain (especially _DomainSafeDivide) when the inputs are large... I'd be happy if you could give it a try and send some feedback. Cheers P. 
On May 9, 2009, at 8:17 PM, Eric Firing wrote: > Eric Firing wrote: > > Pierre, > > ... I pressed "send" too soon. There are test failures with the > patch I attached to my last message. I think the basic ideas are > correct, but evidently there are wrinkles to be worked out. Maybe > putmask() has to be used instead of where() (putmask is much faster) > to maintain the ability to do *= and similar, and maybe there are > other adjustments. Somehow, though, it should be possible to get > decent speed for simple multiplication and division; a 10x penalty > relative to ndarray operations is just too much. > > Eric > > >> Eli Bressert wrote: >>> Hi, >>> >>> I'm using masked arrays to compute large-scale standard deviation, >>> multiplication, gaussian, and weighted averages. At first I thought >>> using the masked arrays would be a great way to sidestep looping >>> (which it is), but it's still slower than expected. Here's a snippet >>> of the code that I'm using it for. >> [...] >>> # Like the spatial_weight section, this takes about 20 seconds >>> W = spatial_weight / Rho2 >>> >>> # Takes less than one second. >>> Ave = np.average(av_good,axis=1,weights=W) >>> >>> Any ideas on why it would take such a long time for processing? >> A part of the slowdown is what looks to me like unnecessary copying >> in _MaskedBinaryOperation.__call__. It is using getdata, which >> applies numpy.array to its input, forcing a copy. I think the copy >> is actually unintentional, in at least one sense, and possibly two: >> first, because the default argument of getattr is always evaluated, >> even if it is not needed; and second, because the call to np.array >> is used where np.asarray or equivalent would suffice. >> The first file attached below shows the kernprof in the case of >> multiplying two masked arrays, shape (100000,50), with no masked >> elements; 2/3 of the time is taken copying the data. >> Now, if there are actually masked elements in the arrays, it gets >> much worse: see the second attachment. The total time has >> increased by more than a factor of 3, and the culprit is >> numpy.which(), a very slow function. It looks to me like it is >> doing nothing useful at all; the numpy binary operation is still >> being executed for all elements, regardless of mask, contrary to >> the intention implied by the comment in the code. >> The third attached file has a patch that fixes the getdata problem >> and eliminates the which(). >> With this patch applied we get the profile in the 4th file, to be >> compared to the second profile. Much better. I am pretty sure it >> could still be sped up quite a bit, though. It looks like the >> masks are essentially being calculated twice for no good reason, >> but I don't completely understand all the mask considerations, so >> at this point I am not trying to fix that problem. >> Eric >>> Especially the spatial_weight and W variables? Would there be a >>> faster >>> way to do this? Or is there a way that numpy.std can process ignore >>> nan's when processing? 
>>> >>> Thanks, >>> >>> Eli Bressert >>> _______________________________________________ >>> Numpy-discussion mailing list >>> Numpy-discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> ------------------------------------------------------------------------ >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > From stefan at sun.ac.za Wed May 13 19:12:26 2009 From: stefan at sun.ac.za (Stéfan van der Walt) Date: Thu, 14 May 2009 01:12:26 +0200 Subject: [Numpy-discussion] Are masked arrays slower for processing than ndarrays? In-Reply-To: <44ACCB4C-0BD6-4D82-B052-B779CEB6ED12@gmail.com> References: <4A061A86.5020207@hawaii.edu> <4A061D33.7080704@hawaii.edu> <44ACCB4C-0BD6-4D82-B052-B779CEB6ED12@gmail.com> Message-ID: <9457e7c80905131612m2cd32374gcca2b7da4e415e12@mail.gmail.com> Hi Pierre 2009/5/14 Pierre GM : > This playing around with the error status may (or may not, I don't > know) cause some problems down the road. I see the buildbot is complaining on SPARC. Not sure if it is complaining about your commit, but might be worth checking out nonetheless. Cheers Stéfan From dwf at cs.toronto.edu Wed May 13 19:18:21 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Wed, 13 May 2009 19:18:21 -0400 Subject: [Numpy-discussion] FAIL: Test bug in reduceat with structured arrays In-Reply-To: References: Message-ID: <6D2BEA67-AB1B-4050-846D-072BA4D3118C@cs.toronto.edu> On 11-May-09, at 10:55 AM, Pauli Virtanen wrote: > Wonder why buildbot's 64-bit SPARC boxes don't see this if it's > something > connected to 64-bitness... Different endianness, maybe? That seems even weirder, honestly. David From mattknox.ca at gmail.com Wed May 13 19:36:30 2009 From: mattknox.ca at gmail.com (Matt Knox) Date: Wed, 13 May 2009 23:36:30 +0000 (UTC) Subject: [Numpy-discussion] Are masked arrays slower for processing than ndarrays? References: <4A061A86.5020207@hawaii.edu> <4A061D33.7080704@hawaii.edu> <44ACCB4C-0BD6-4D82-B052-B779CEB6ED12@gmail.com> Message-ID: Hi Pierre, > Here's the catch: it's basically cheating. I got rid of the pre- > processing (where a mask was calculated depending on the domain and > the input set to a filling value depending on this mask, before the > actual computation). Instead, I force > np.seterr(divide='ignore',invalid='ignore') before calling the ufunc This isn't a thread-safe approach and could cause weird side effects in a multi-threaded application. I think modifying global options/variables inside any function where it generally wouldn't be expected by the user is a bad idea. - Matt From pgmdevlist at gmail.com Wed May 13 19:47:24 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 13 May 2009 19:47:24 -0400 Subject: [Numpy-discussion] Are masked arrays slower for processing than ndarrays? In-Reply-To: References: <4A061A86.5020207@hawaii.edu> <4A061D33.7080704@hawaii.edu> <44ACCB4C-0BD6-4D82-B052-B779CEB6ED12@gmail.com> Message-ID: On May 13, 2009, at 7:36 PM, Matt Knox wrote: >
Instead, I force >> np.seterr(divide='ignore',invalid='ignore') before calling the ufunc > > This isn't a thread safe approach and could cause wierd side effects > in a > multi-threaded application. I think modifying global options/ > variables inside > any function where it generally wouldn't be expected by the user is > a bad idea. Whine. I was afraid of something like that... 2 options, then: * We revert to computing a mask beforehand. That looks like the part that takes the most time w/ domained operations (according to Robert K's profiler. Robert, you deserve a statue for this tool). And that doesn't solve the pb of power, anyway: how do you compute the domain of power ? * We reimplement masked versions of the ufuncs in C. Won't happen from me anytime soon (this fall or winter, maybe...) Also, importing numpy.ma currently calls numpy.seterr(all='ignore') anyway... So that's a -1 from Matt. Anybody else ? From matthew.brett at gmail.com Wed May 13 19:53:23 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 13 May 2009 16:53:23 -0700 Subject: [Numpy-discussion] Are masked arrays slower for processing than ndarrays? In-Reply-To: References: <4A061A86.5020207@hawaii.edu> <4A061D33.7080704@hawaii.edu> <44ACCB4C-0BD6-4D82-B052-B779CEB6ED12@gmail.com> Message-ID: <1e2af89e0905131653p50d9b546l9423a7bc47a655fd@mail.gmail.com> Hi, > Whine. I was afraid of something like that... > 2 options, then: > * We revert to computing a mask beforehand. That looks like the part > that takes the most time w/ domained operations (according to Robert > K's profiler. Robert, you deserve a statue for this tool). And that > doesn't solve the pb of power, anyway: how do you compute the domain > of power ? > * We reimplement masked versions of the ufuncs in C. Won't happen from > me anytime soon (this fall or winter, maybe...) > Also, importing numpy.ma currently calls numpy.seterr(all='ignore') > anyway... I'm afraid I don't know the code at all, so count this as seems good, but I had the feeling that the change is good for speed but possibly bad for stability / readability? In that case it seems right not to do that, and wait until someone needs speed enough to write it in C or similar... Best, Matthew From robert.kern at gmail.com Wed May 13 19:53:37 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 13 May 2009 18:53:37 -0500 Subject: [Numpy-discussion] Are masked arrays slower for processing than ndarrays? In-Reply-To: References: <4A061A86.5020207@hawaii.edu> <4A061D33.7080704@hawaii.edu> <44ACCB4C-0BD6-4D82-B052-B779CEB6ED12@gmail.com> Message-ID: <3d375d730905131653l295eea56o8c3e49109c9bbfde@mail.gmail.com> On Wed, May 13, 2009 at 18:36, Matt Knox wrote: > Hi Pierre, > >> Here's the catch: it's basically cheating. I got rid of the pre- >> processing (where a mask was calculated depending on the domain and >> the input set to a filling value depending on this mask, before the >> actual computation). Instead, I ?force >> np.seterr(divide='ignore',invalid='ignore') before calling the ufunc > > This isn't a thread safe approach and could cause wierd side effects in a > multi-threaded application. I think modifying global options/variables inside > any function where it generally wouldn't be expected by the user is a bad idea. seterr() uses thread-local storage. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From mattknox.ca at gmail.com Wed May 13 20:07:03 2009 From: mattknox.ca at gmail.com (Matt Knox) Date: Thu, 14 May 2009 00:07:03 +0000 (UTC) Subject: [Numpy-discussion] =?utf-8?q?Are_masked_arrays_slower_for_process?= =?utf-8?q?ing_than=09ndarrays=3F?= References: <4A061A86.5020207@hawaii.edu> <4A061D33.7080704@hawaii.edu> <44ACCB4C-0BD6-4D82-B052-B779CEB6ED12@gmail.com> <3d375d730905131653l295eea56o8c3e49109c9bbfde@mail.gmail.com> Message-ID: > Robert Kern gmail.com> writes: > > seterr() uses thread-local storage. Oh. I stand corrected. Ignore my earlier objections then. > Pierre GM gmail.com> writes: > > Also, importing numpy.ma currently calls numpy.seterr(all='ignore') > anyway... hmm. While this doesn't affect me personally... I wonder if everyone is aware of this. Importing modules generally shouldn't have side effects either I would think. Has this always been the case for the masked array module? - Matt From pgmdevlist at gmail.com Wed May 13 20:22:42 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 13 May 2009 20:22:42 -0400 Subject: [Numpy-discussion] Are masked arrays slower for processing than ndarrays? In-Reply-To: References: <4A061A86.5020207@hawaii.edu> <4A061D33.7080704@hawaii.edu> <44ACCB4C-0BD6-4D82-B052-B779CEB6ED12@gmail.com> <3d375d730905131653l295eea56o8c3e49109c9bbfde@mail.gmail.com> Message-ID: On May 13, 2009, at 8:07 PM, Matt Knox wrote: > > hmm. While this doesn't affect me personally... I wonder if everyone > is aware of > this. Importing modules generally shouldn't have side effects either > I would > think. Has this always been the case for the masked array module? Well, can't remember, actually... I was indeed surprised to see it was there. I guess I must have added when working on the power section. I will get of rid on the next commit, this is clearly bad practice from my part. Bad, bad Pierre. From charlesr.harris at gmail.com Wed May 13 21:38:20 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 13 May 2009 19:38:20 -0600 Subject: [Numpy-discussion] FAIL: Test bug in reduceat with structured arrays In-Reply-To: <6D2BEA67-AB1B-4050-846D-072BA4D3118C@cs.toronto.edu> References: <6D2BEA67-AB1B-4050-846D-072BA4D3118C@cs.toronto.edu> Message-ID: On Wed, May 13, 2009 at 5:18 PM, David Warde-Farley wrote: > On 11-May-09, at 10:55 AM, Pauli Virtanen wrote: > > > Wonder why buildbot's 64-bit SPARC boxes don't see this if it's > > something > > connected to 64-bitness... > > Different endianness, maybe? That seems even weirder, honestly. > I managed an error on 32 bit fedora, but it was a oneoff sort of thing. I'll see if it shows again. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Wed May 13 21:42:42 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 14 May 2009 10:42:42 +0900 Subject: [Numpy-discussion] building inplace with numpy.distutils? In-Reply-To: <4A0AC1A5.20404@ntc.zcu.cz> References: <4A0AC1A5.20404@ntc.zcu.cz> Message-ID: <4A0B7712.4040600@ar.media.kyoto-u.ac.jp> Robert Cimrman wrote: > Hi (David)! > > I am evaluating numpy.distutils as a build/install system for my project > - is it possible to build the extension modules in-place so that the > project can be used without installing it? A pointer to documentation > concerning this would be handy... Currently I use a regular Makefile for > the build, which works quite well, but is not very portable and does not > solve the package installation. 
> > Otherwise let me say that numpy.distutils work very well, much better > than the plain old distutils. > In-place builds can be setup with the -i option: python setup.py build_ext -i I think it is a plain distutils option. cheers, David From glenn at tarbox.org Thu May 14 00:50:21 2009 From: glenn at tarbox.org (Glenn Tarbox, PhD) Date: Wed, 13 May 2009 21:50:21 -0700 Subject: [Numpy-discussion] numpy slices limited to 32 bit values? Message-ID: I'm using the latest version of Sage (3.4.2) which is python 2.5 and numpy something or other (I will do more digging presently) I'm able to map large files and access all the elements unless I'm using slices so, for example: fp = np.memmap("/mnt/hdd/data/mmap/numpy1e10.mmap", dtype='float64', mode='r+', shape=(10000000000,)) which is 1e10 doubles if you don't wanna count the zeros gives full access to a 75 GB memory image But when I do: fp[:] = 1.0 np.sum(fp) I get 1410065408.0 as the result Interestingly, I can do: fp[9999999999] = 3.0 and get the proper result stored and can read it back. So, it appears to me that slicing is limited to 32 bit values Trying to push it a bit, I tried making my own slice myslice = slice(1410065408, 9999999999) and using it like fp[myslice]=1.0 but it returns immediately having changed nothing. The slice creation "appears" to work in that I can get the values back out and all... but inside numpy it seems to get thrown out. My guess is that internally the python slice in 2.5 is 32 bit even on my 64 bit version of python / numpy. The good news is that it looks like the hard stuff (i.e. very large mmaped files) work... but slicing is, for some reason, limited to 32 bits. Am I missing something? -glenn -- Glenn H. Tarbox, PhD || 206-274-6919 http://www.tarbox.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From cimrman3 at ntc.zcu.cz Thu May 14 01:57:20 2009 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Thu, 14 May 2009 07:57:20 +0200 Subject: [Numpy-discussion] building inplace with numpy.distutils? In-Reply-To: <4A0B7712.4040600@ar.media.kyoto-u.ac.jp> References: <4A0AC1A5.20404@ntc.zcu.cz> <4A0B7712.4040600@ar.media.kyoto-u.ac.jp> Message-ID: <4A0BB2C0.4040001@ntc.zcu.cz> David Cournapeau wrote: > Robert Cimrman wrote: >> Hi (David)! >> >> I am evaluating numpy.distutils as a build/install system for my project >> - is it possible to build the extension modules in-place so that the >> project can be used without installing it? A pointer to documentation >> concerning this would be handy... Currently I use a regular Makefile for >> the build, which works quite well, but is not very portable and does not >> solve the package installation. >> >> Otherwise let me say that numpy.distutils work very well, much better >> than the plain old distutils. >> > > In-place builds can be setup with the -i option: > > python setup.py build_ext -i > > I think it is a plain distutils option. I have tried python setup.py build --inplace which did not work, and --help helped neither, that is why I asked here. But I was close :) thank you! r. From charlesr.harris at gmail.com Thu May 14 02:04:31 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 14 May 2009 00:04:31 -0600 Subject: [Numpy-discussion] numpy slices limited to 32 bit values? 
In-Reply-To: References: Message-ID: On Wed, May 13, 2009 at 10:50 PM, Glenn Tarbox, PhD wrote: > I'm using the latest version of Sage (3.4.2) which is python 2.5 and numpy > something or other (I will do more digging presently) > > I'm able to map large files and access all the elements unless I'm using > slices > > so, for example: > > fp = np.memmap("/mnt/hdd/data/mmap/numpy1e10.mmap", dtype='float64', > mode='r+', shape=(10000000000,)) > > which is 1e10 doubles if you don't wanna count the zeros > > gives full access to a 75 GB memory image > > But when I do: > > fp[:] = 1.0 > np.sum(fp) > > I get 1410065408.0 as the result > As doubles, that is more than 2**33 bytes, so I expect there is something else going on. How much physical memory/swap memory do you have? This could also be a python problem since python does the memmap. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From glenn at tarbox.org Thu May 14 02:22:41 2009 From: glenn at tarbox.org (Glenn Tarbox, PhD) Date: Wed, 13 May 2009 23:22:41 -0700 Subject: [Numpy-discussion] numpy slices limited to 32 bit values? In-Reply-To: References: Message-ID: On Wed, May 13, 2009 at 11:04 PM, Charles R Harris < charlesr.harris at gmail.com> wrote: > > > On Wed, May 13, 2009 at 10:50 PM, Glenn Tarbox, PhD wrote: > >> I'm using the latest version of Sage (3.4.2) which is python 2.5 and numpy >> something or other (I will do more digging presently) >> >> I'm able to map large files and access all the elements unless I'm using >> slices >> >> so, for example: >> >> fp = np.memmap("/mnt/hdd/data/mmap/numpy1e10.mmap", dtype='float64', >> mode='r+', shape=(10000000000,)) >> >> which is 1e10 doubles if you don't wanna count the zeros >> >> gives full access to a 75 GB memory image >> >> But when I do: >> >> fp[:] = 1.0 >> np.sum(fp) >> >> I get 1410065408.0 as the result >> > > As doubles, that is more than 2**33 bytes, so I expect there is something > else going on. How much physical memory/swap memory do you have? This could > also be a python problem since python does the memmap. > I've been working on some other things lately and that number seemed related to 2^32... now that I look more closely, I don't know where that number comes from. To your question, I have 32GB of RAM and virtually nothing else running... Top tells me I'm getting between 96% and 98% for this process which seems about right. Here's the thing. When I create the mmap file, I get the right number of bytes. I can, from what I can tell, update individual values within the array (I'm gonna bang on it a bit more with some other scripts) Its only when using slicing that things get strange (he says having not really done a more thorough test) Of course, I was assuming this is a 32 bit thing... but you're right... where did that result come from??? The other clue here is that when I create my own slice (as described above) it returns instantly... numpy doesn't throw an error but it doesn't do anything with the slice either. Since I'm IO bound anyways, maybe i'll just write a loop and see if I can't set all the values. The machine could use a little exercise anyways. -glenn > > Chuck > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Glenn H. Tarbox, PhD || 206-274-6919 http://www.tarbox.org -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From glenn at tarbox.org Thu May 14 04:31:45 2009 From: glenn at tarbox.org (Glenn Tarbox, PhD) Date: Thu, 14 May 2009 01:31:45 -0700 Subject: [Numpy-discussion] numpy slices limited to 32 bit values? In-Reply-To: References: Message-ID: On Wed, May 13, 2009 at 11:22 PM, Glenn Tarbox, PhD wrote: > > > On Wed, May 13, 2009 at 11:04 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Wed, May 13, 2009 at 10:50 PM, Glenn Tarbox, PhD wrote: >> >>> I'm using the latest version of Sage (3.4.2) which is python 2.5 and >>> numpy something or other (I will do more digging presently) >>> >>> I'm able to map large files and access all the elements unless I'm using >>> slices >>> >>> so, for example: >>> >>> fp = np.memmap("/mnt/hdd/data/mmap/numpy1e10.mmap", dtype='float64', >>> mode='r+', shape=(10000000000,)) >>> >>> which is 1e10 doubles if you don't wanna count the zeros >>> >>> gives full access to a 75 GB memory image >>> >>> But when I do: >>> >>> fp[:] = 1.0 >>> np.sum(fp) >>> >>> I get 1410065408.0 as the result >>> >> >> As doubles, that is more than 2**33 bytes, so I expect there is something >> else going on. How much physical memory/swap memory do you have? This could >> also be a python problem since python does the memmap. >> > > I've been working on some other things lately and that number seemed > related to 2^32... now that I look more closely, I don't know where that > number comes from. > > To your question, I have 32GB of RAM and virtually nothing else running... > Top tells me I'm getting between 96% and 98% for this process which seems > about right. > > Here's the thing. When I create the mmap file, I get the right number of > bytes. I can, from what I can tell, update individual values within the > array (I'm gonna bang on it a bit more with some other scripts) > > Its only when using slicing that things get strange (he says having not > really done a more thorough test) > > Of course, I was assuming this is a 32 bit thing... but you're right... > where did that result come from??? > > The other clue here is that when I create my own slice (as described above) > it returns instantly... numpy doesn't throw an error but it doesn't do > anything with the slice either. > > Since I'm IO bound anyways, maybe i'll just write a loop and see if I can't > set all the values. The machine could use a little exercise anyways. > I ran the following test: import numpy as np size=10000000000 fp = np.memmap("/mnt/hdd/data/mmap/numpy1e10.mmap", dtype='float64', mode='r+', shape=(size,)) for i in xrange(size): fp[i]=1.0 time np.sum(fp) 10000000000.0 Time: CPU 188.36 s, Wall: 884.33 s So, everything seems to be working and it kinda makes sense. The sum should be IO bound which it is. I didn't time the loop but it took a while (maybe 30 minutes) and it was compute bound. To make sure, I exited the program and ran everything but the initialization loop. import numpy as np size=10000000000 fp = np.memmap("/mnt/hdd/data/mmap/numpy1e10.mmap", dtype='float64', mode='r+', shape=(size,) time np.sum(fp) 10000000000.0 Time: CPU 180.02 s, Wall: 854.72 s I was a little surprised that it didn't take longer given almost half of the mmap'ed data should have been resident in the sum performed immediately after initialization, but since it needed to start at the beginning and only had the second half in memory, it makes sense So, it "appears" as though the mmap works but there's something strange with slices going on. 
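In the meantime, a chunked loop might be a usable workaround if it really is
a 32 bit limit somewhere in the slicing path. This is only a sketch (I
haven't run it against the full 75 GB file, and the chunk size is
arbitrary), but every individual slice stays well under 2**31 elements:

import numpy as np

size = 10000000000
chunk = 10**7   # keep every slice well below 2**31 elements

fp = np.memmap("/mnt/hdd/data/mmap/numpy1e10.mmap", dtype='float64',
               mode='r+', shape=(size,))

# fill in chunks instead of one giant fp[:] = 1.0
# (slices past the end just clip, so the last chunk needs no special casing)
for start in xrange(0, size, chunk):
    fp[start:start + chunk] = 1.0

# sum in chunks too, accumulating in a Python float
total = 0.0
for start in xrange(0, size, chunk):
    total += float(np.sum(fp[start:start + chunk]))
print total
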
-glenn > > >> >> Chuck >> >> >> >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > > -- > Glenn H. Tarbox, PhD || 206-274-6919 > http://www.tarbox.org > -- Glenn H. Tarbox, PhD || 206-274-6919 http://www.tarbox.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From gael.varoquaux at normalesup.org Thu May 14 04:43:35 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Thu, 14 May 2009 10:43:35 +0200 Subject: [Numpy-discussion] numpy slices limited to 32 bit values? In-Reply-To: References: Message-ID: <20090514084335.GD32437@phare.normalesup.org> On Thu, May 14, 2009 at 01:31:45AM -0700, Glenn Tarbox, PhD wrote: > I've been working on some other things lately and that number seemed > related to 2^32... now that I look more closely, I don't know where that > number comes from. Is your OS 64bit? Ga?l From glenn at tarbox.org Thu May 14 05:13:23 2009 From: glenn at tarbox.org (Glenn Tarbox, PhD) Date: Thu, 14 May 2009 02:13:23 -0700 Subject: [Numpy-discussion] numpy slices limited to 32 bit values? In-Reply-To: <20090514084335.GD32437@phare.normalesup.org> References: <20090514084335.GD32437@phare.normalesup.org> Message-ID: On Thu, May 14, 2009 at 1:43 AM, Gael Varoquaux < gael.varoquaux at normalesup.org> wrote: > On Thu, May 14, 2009 at 01:31:45AM -0700, Glenn Tarbox, PhD wrote: > > I've been working on some other things lately and that number seemed > > related to 2^32... now that I look more closely, I don't know where > that > > number comes from. > > Is your OS 64bit? Yes, Ubuntu 9.04 x86_64 Linux hq2 2.6.28-11-server #42-Ubuntu SMP Fri Apr 17 02:45:36 UTC 2009 x86_64 GNU/Linux -glenn > > Ga?l > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Glenn H. Tarbox, PhD || 206-274-6919 http://www.tarbox.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From gael.varoquaux at normalesup.org Thu May 14 05:16:17 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Thu, 14 May 2009 11:16:17 +0200 Subject: [Numpy-discussion] numpy slices limited to 32 bit values? In-Reply-To: References: <20090514084335.GD32437@phare.normalesup.org> Message-ID: <20090514091617.GF32437@phare.normalesup.org> On Thu, May 14, 2009 at 02:13:23AM -0700, Glenn Tarbox, PhD wrote: > On Thu, May 14, 2009 at 1:43 AM, Gael Varoquaux > <[1]gael.varoquaux at normalesup.org> wrote: > On Thu, May 14, 2009 at 01:31:45AM -0700, Glenn Tarbox, PhD wrote: > > ? ? ?I've been working on some other things lately and that number > seemed > > ? ? ?related to 2^32... now that I look more closely, I don't know > where that > > ? ? ?number comes from. > Is your OS 64bit? > Yes, Ubuntu 9.04 x86_64 Hum, I am wondering: could it be that Sage has not been compiled in 64bits? That number '32' seems to me to point toward a 32bit pointer issue (I may be wrong). 
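A quick way to rule that out from the Sage prompt, using nothing but the
stdlib and numpy (plain introspection, nothing Sage specific):

import sys, struct, platform
import numpy as np

print sys.maxint                      # 9223372036854775807 on a 64-bit build, 2147483647 on 32-bit
print struct.calcsize("P") * 8        # size of a C pointer, in bits
print platform.architecture()[0]      # '64bit' or '32bit'
print np.dtype(np.intp).itemsize * 8  # width of numpy's index (intp) type

If any of those come back 32, that would explain a 2**31-ish truncation
somewhere.
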
Ga?l From sebastian.walter at gmail.com Thu May 14 05:19:03 2009 From: sebastian.walter at gmail.com (Sebastian Walter) Date: Thu, 14 May 2009 11:19:03 +0200 Subject: [Numpy-discussion] (no subject) In-Reply-To: References: Message-ID: On Wed, May 13, 2009 at 10:18 PM, David J Strozzi wrote: > Hi, > > [You may want to edit the numpy homepage numpy.scipy.org to tell > people they must subscribe to post, and adding a link to > http://www.scipy.org/Mailing_Lists] > > > Many of you probably know of the interpreter yorick by Dave Munro. As > a Livermoron, I use it all the time. Never heard of it... what does it do? By the sound of it, yorick is an interpreted language like Python. > There are some built-in > functions there, analogous to but above and beyond numpy's sum() and > diff(), which are quite useful for common operations on gridded data. > Of course one can write their own, but maybe they should be cleanly > canonized? > > For instance: > > x = linspace(0,10,10) > y = sin(x) > > It is common, say when integrating y(x), to take "point-centered" > data and want to zone-center it: > > I = sum(zcen(y)*diff(x)) > > def zcen(x): return 0.5*(x[0:-1]+x[1:]) > > Besides zcen, yorick has builtins for "point centering", "un-zone > centering," etc. Also, due to its slick syntax you can give these > things as array "indexes": > > x(zcen), y(dif), z(:,sum,:) > > > Just some thoughts, > David Strozzi > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From emerald.jasper at yahoo.co.uk Thu May 14 06:26:25 2009 From: emerald.jasper at yahoo.co.uk (Emerald Jasper) Date: Thu, 14 May 2009 10:26:25 +0000 (GMT) Subject: [Numpy-discussion] Non-linear optimization in python Message-ID: <342303.58705.qm@web23902.mail.ird.yahoo.com> Dear python user! Please, instruct me how to make non-linear optimization using numpy/simpy in python? Thank you very much in the advance, Emerald from Japan -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthieu.brucher at gmail.com Thu May 14 06:51:29 2009 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Thu, 14 May 2009 12:51:29 +0200 Subject: [Numpy-discussion] Non-linear optimization in python In-Reply-To: <342303.58705.qm@web23902.mail.ird.yahoo.com> References: <342303.58705.qm@web23902.mail.ird.yahoo.com> Message-ID: Hi, You have several choices: - using scipy.optimize - openopt - the old openopt scikit that contains a generic optimization framework. Did you try one of these? 2009/5/14 Emerald Jasper : > Dear python user! > Please, instruct me how to make non-linear optimization using numpy/simpy in > python? > Thank you very much in the advance, > Emerald > from Japan > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Information System Engineer, Ph.D. 
Website: http://matthieu-brucher.developpez.com/ Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn: http://www.linkedin.com/in/matthieubrucher From pav at iki.fi Thu May 14 06:54:50 2009 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 14 May 2009 10:54:50 +0000 (UTC) Subject: [Numpy-discussion] Indexing with callables (was: Yorick-like functionality) References: Message-ID: Wed, 13 May 2009 13:18:45 -0700, David J Strozzi kirjoitti: [clip] > Many of you probably know of the interpreter yorick by Dave Munro. As a > Livermoron, I use it all the time. There are some built-in functions > there, analogous to but above and beyond numpy's sum() and diff(), which > are quite useful for common operations on gridded data. Of course one > can write their own, but maybe they should be cleanly canonized? +0 from me for zcen and other, having small functions probably won't hurt much [clip] > Besides zcen, yorick has builtins for "point centering", "un-zone > centering," etc. Also, due to its slick syntax you can give these > things as array "indexes": > > x(zcen), y(dif), z(:,sum,:) I think you can easily subclass numpy.ndarray to offer the same feature, see below. I don't know if we want to add this feature (indexing with callables) to the Numpy's fancy indexing itself. Thoughts? ----- import numpy as np import inspect class YNdarray(np.ndarray): """ A subclass of ndarray that implements Yorick-like indexing with functions. Beware: not adequately tested... """ def __getitem__(self, key_): if not isinstance(key_, tuple): key = (key_,) scalar_key = True else: key = key_ scalar_key = False key = list(key) # expand ellipsis manually while Ellipsis in key: j = key.index(Ellipsis) key[j:j+1] = [slice(None)] * (self.ndim - len(key)) # handle reducing or mutating callables arr = self new_key = [] real_axis = 0 for j, v in enumerate(key): if callable(v): arr2 = self._reduce_axis(arr, v, real_axis) new_key.extend([slice(None)] * (arr2.ndim - arr.ndim + 1)) arr = arr2 elif v is not None: real_axis += 1 new_key.append(v) else: new_key.append(v) # final get if scalar_key: return np.ndarray.__getitem__(arr, new_key[0]) else: return np.ndarray.__getitem__(arr, tuple(new_key)) def _reduce_axis(self, arr, func, axis): return func(arr, axis=axis) x = np.arange(2*3*4).reshape(2,3,4).view(YNdarray) # Now, assert np.allclose(x[np.sum,...], np.sum(x, axis=0)) assert np.allclose(x[:,np.sum,:], np.sum(x, axis=1)) assert np.allclose(x[:,:,np.sum], np.sum(x, axis=2)) assert np.allclose(x[:,np.sum,None,np.sum], x.sum(axis=1).sum(axis=1)[:,None]) def get(v, s, axis=0): """Index `v` with slice `s` along given axis""" ix = [slice(None)] * v.ndim ix[axis] = s return v[ix] def drop_last(v, axis=0): """Remove one element from given array in given dimension""" return get(v, slice(None, -1), axis) assert np.allclose(x[:,drop_last,:], x[:,:-1,:]) def zcen(v, axis=0): return .5*(get(v, slice(None,-1), axis) + get(v, slice(1,None), axis)) assert np.allclose(x[0,1,zcen], .5*(x[0,1,1:] + x[0,1,:-1])) def append_one(v, axis=0): """Append one element to the given array in given dimension, fill with ones""" new_shape = list(v.shape) new_shape[axis] += 1 v2 = np.empty(new_shape, dtype=v.dtype) get(v2, slice(None, -1), axis)[:] = v get(v2, -1, axis)[:] = 1 return v2 assert np.allclose(x[:,np.diff,0], np.diff(x.view(np.ndarray)[:,:,0], axis=1)) assert np.allclose(x[0,append_one,:], [[0,1,2,3], [4,5,6,7], [8,9,10,11], [1,1,1,1]]) assert np.allclose(x[:,append_one,0], [[0,4,8,1], [12,16,20,1]]) From emerald.jasper at 
yahoo.co.uk Thu May 14 07:06:37 2009
From: emerald.jasper at yahoo.co.uk (Emerald Jasper)
Date: Thu, 14 May 2009 11:06:37 +0000 (GMT)
Subject: [Numpy-discussion] Numpy-discussion Digest, Vol 32, Issue 39
Message-ID: <19251.97438.qm@web23908.mail.ird.yahoo.com>

Hi,
Actually, I am quite new in programming, so could you please send me the
syntax so that I can use in my research.
thank you so much.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From glenn at tarbox.org Thu May 14 10:40:58 2009
From: glenn at tarbox.org (Glenn Tarbox, PhD)
Date: Thu, 14 May 2009 07:40:58 -0700
Subject: [Numpy-discussion] numpy slices limited to 32 bit values?
In-Reply-To: <20090514091617.GF32437@phare.normalesup.org>
References: <20090514084335.GD32437@phare.normalesup.org>
	<20090514091617.GF32437@phare.normalesup.org>
Message-ID: 

On Thu, May 14, 2009 at 2:16 AM, Gael Varoquaux <
gael.varoquaux at normalesup.org> wrote:

> On Thu, May 14, 2009 at 02:13:23AM -0700, Glenn Tarbox, PhD wrote:
> >    On Thu, May 14, 2009 at 1:43 AM, Gael Varoquaux
> >    <[1]gael.varoquaux at normalesup.org> wrote:
> >      On Thu, May 14, 2009 at 01:31:45AM -0700, Glenn Tarbox, PhD wrote:
> >      > I've been working on some other things lately and that number
> >      seemed
> >      > related to 2^32... now that I look more closely, I don't know
> >      where that
> >      > number comes from.
> >      Is your OS 64bit?
> >    Yes, Ubuntu 9.04 x86_64
>
> Hum, I am wondering: could it be that Sage has not been compiled in
> 64bits? That number '32' seems to me to point toward a 32bit pointer
> issue (I may be wrong).

The other tests I posted indicate everything else is working... For
example, np.sum(fp) runs over the full set of 1e10 doubles and seems to
work fine.

Also, while my first thought was about 2^32, Chuck Harris's reply kinda put
that to bed. Where 1410065408.0 comes from may involve e or PI (at least
that's how we reverse engineered answers when I was in college :-)

-glenn

>
> Gaël
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

-- 
Glenn H. Tarbox, PhD || 206-274-6919
http://www.tarbox.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From gael.varoquaux at normalesup.org Thu May 14 10:54:06 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Thu, 14 May 2009 16:54:06 +0200 Subject: [Numpy-discussion] numpy slices limited to 32 bit values? In-Reply-To: References: <20090514084335.GD32437@phare.normalesup.org> <20090514091617.GF32437@phare.normalesup.org> Message-ID: <20090514145406.GB3630@phare.normalesup.org> On Thu, May 14, 2009 at 07:40:58AM -0700, Glenn Tarbox, PhD wrote: > Hum, I am wondering: could it be that Sage has not been compiled in > 64bits? That number '32' seems to me to point toward a 32bit pointer > issue (I may be wrong). > The other tests I posted indicate everything else is working... For > example, np.sum(fp) runs over the full set of 1e10 doubes and seems to > work fine.? Correct. I had missed that. Ga?l From aisaac at american.edu Thu May 14 11:10:34 2009 From: aisaac at american.edu (Alan G Isaac) Date: Thu, 14 May 2009 11:10:34 -0400 Subject: [Numpy-discussion] Non-linear optimization in python In-Reply-To: <342303.58705.qm@web23902.mail.ird.yahoo.com> References: <342303.58705.qm@web23902.mail.ird.yahoo.com> Message-ID: <4A0C346A.6010803@american.edu> On 5/14/2009 6:26 AM Emerald Jasper apparently wrote: > Please, instruct me how to make non-linear optimization using > numpy/simpy in python? http://www.scipy.org/SciPyPackages/Optimize http://www.scipy.org/Cookbook/OptimizationDemo1 hth, Alan Isaac From josef.pktd at gmail.com Thu May 14 11:45:57 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 14 May 2009 11:45:57 -0400 Subject: [Numpy-discussion] Non-linear optimization in python In-Reply-To: <4A0C346A.6010803@american.edu> References: <342303.58705.qm@web23902.mail.ird.yahoo.com> <4A0C346A.6010803@american.edu> Message-ID: <1cd32cbb0905140845j7ef6927byafb1c8b3a224707a@mail.gmail.com> On Thu, May 14, 2009 at 11:10 AM, Alan G Isaac wrote: > On 5/14/2009 6:26 AM Emerald Jasper apparently wrote: >> Please, instruct me how to make non-linear optimization using >> numpy/simpy in python? > > > http://www.scipy.org/SciPyPackages/Optimize note: the link there links to the old documentation, the current documentation for scipy.optimize is at http://docs.scipy.org/doc/scipy/reference/tutorial/optimize.html Josef > > http://www.scipy.org/Cookbook/OptimizationDemo1 > > hth, > Alan Isaac > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From strozzi2 at llnl.gov Thu May 14 14:52:17 2009 From: strozzi2 at llnl.gov (David J Strozzi) Date: Thu, 14 May 2009 11:52:17 -0700 Subject: [Numpy-discussion] yorick zcen et al In-Reply-To: References: Message-ID: Hi, Sorry to not include a subject last time, or provide more info on yorick: http://yorick.sourceforge.net/index.php official http://www.maumae.net/yorick/doc/index.php unofficial It's basically an interpreter geared toward doing numerics, with C-like syntax (but Fortran array indexing) and very elegant multi-D array syntax. It's also quite fast, has a parallel MPI package, and is used at LLNL to steer some big numerical codes where the 'guts' are in C. Also has a graphics package called gist. It's a free, open-source, bare-bones matlab, written by David Munro of LLNL starting in I think the late 1980s. 
At the risk of being glib, I find the current science tools in python (numpy, scipy, matplotlib) to be a good beta version of yorick :) Anyway, my point was there are a lot of standard grid gymnastics, of which numpy's diff() and sum() are examples, which don't seem to be in numpy, like yorick's zcen (zone centering) and pcen (point centering). Rather than everyone write their own, perhaps they could be included? Unless they're in numpy and I can't find where. Cheers Dave [Strozzi, not Munro] At 11:19 AM +0200 5/14/09, Sebastian Walter wrote: >On Wed, May 13, 2009 at 10:18 PM, David J Strozzi wrote: >> Hi, >> >> [You may want to edit the numpy homepage numpy.scipy.org to tell >> people they must subscribe to post, and adding a link to >> http:// www. scipy.org/Mailing_Lists] >> >> >> Many of you probably know of the interpreter yorick by Dave Munro. As >> a Livermoron, I use it all the time. > >Never heard of it... what does it do? By the sound of it, yorick is an >interpreted language like Python. > >> There are some built-in >> functions there, analogous to but above and beyond numpy's sum() and >> diff(), which are quite useful for common operations on gridded data. >> Of course one can write their own, but maybe they should be cleanly >> canonized? >> >> For instance: >> >> x = linspace(0,10,10) >> y = sin(x) >> >> It is common, say when integrating y(x), to take "point-centered" >> data and want to zone-center it: >> >> I = sum(zcen(y)*diff(x)) >> >> def zcen(x): return 0.5*(x[0:-1]+x[1:]) >> >> Besides zcen, yorick has builtins for "point centering", "un-zone >> centering," etc. Also, due to its slick syntax you can give these >> things as array "indexes": >> >> x(zcen), y(dif), z(:,sum,:) >> >> >> Just some thoughts, >> David Strozzi >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http:// mail.scipy.org/mailman/listinfo/numpy-discussion >> >_______________________________________________ >Numpy-discussion mailing list >Numpy-discussion at scipy.org >http:// mail.scipy.org/mailman/listinfo/numpy-discussion From aisaac at american.edu Thu May 14 18:29:48 2009 From: aisaac at american.edu (Alan G Isaac) Date: Thu, 14 May 2009 18:29:48 -0400 Subject: [Numpy-discussion] yorick zcen et al In-Reply-To: References: Message-ID: <4A0C9B5C.3050604@american.edu> On 5/14/2009 2:52 PM David J Strozzi apparently wrote: > At the risk of being glib, I find the current science tools in python > (numpy, scipy, matplotlib) to be a good beta version of yorick :) I suspect that is too glib for quite a number of reasons, but just to mention one aside from the very truncated list of science tools in Python, if you really prefer gist to Matplotlib for some reason (?), you can use Pygist. Which allows you to use it cross platform as well. Cheers, Alan Isaac From efiring at hawaii.edu Fri May 15 14:05:18 2009 From: efiring at hawaii.edu (Eric Firing) Date: Fri, 15 May 2009 08:05:18 -1000 Subject: [Numpy-discussion] Are masked arrays slower for processing than ndarrays? In-Reply-To: References: <4A061A86.5020207@hawaii.edu> <4A061D33.7080704@hawaii.edu> <44ACCB4C-0BD6-4D82-B052-B779CEB6ED12@gmail.com> Message-ID: <4A0DAEDE.5070205@hawaii.edu> Pierre GM wrote: > On May 13, 2009, at 7:36 PM, Matt Knox wrote: >>> Here's the catch: it's basically cheating. I got rid of the pre- >>> processing (where a mask was calculated depending on the domain and >>> the input set to a filling value depending on this mask, before the >>> actual computation). 
Instead, I force >>> np.seterr(divide='ignore',invalid='ignore') before calling the ufunc >> This isn't a thread safe approach and could cause wierd side effects >> in a >> multi-threaded application. I think modifying global options/ >> variables inside >> any function where it generally wouldn't be expected by the user is >> a bad idea. > > Whine. I was afraid of something like that... > 2 options, then: > * We revert to computing a mask beforehand. That looks like the part > that takes the most time w/ domained operations (according to Robert > K's profiler. Robert, you deserve a statue for this tool). And that > doesn't solve the pb of power, anyway: how do you compute the domain > of power ? > * We reimplement masked versions of the ufuncs in C. Won't happen from > me anytime soon (this fall or winter, maybe...) Pierre, I have implemented masked versions of all binary ufuncs in C, using slight modifications of the numpy code generation machinery. I suspect that the way I have done it will not be the final method, and as of this moment I have just gotten it compiled and minimally checked (numpy imports, multiply_m(x, y, mask, out) puts x*y in out only where mask is False), but it is enough to make me think that we should be able to make it work in numpy.ma. In the present implementation, the masked versions of the ufuncs take a single mask, and they live in the same namespace as the unmasked versions. Masked versions of the unary ufuncs need to be added. Binary versions taking two masks and returning the resulting mask can also be added, but with considerably more effort, so I view that as something to be done only after all the wrinkles are worked out with the single-mask implementation. I view these masked versions of ufuncs as perfectly good standalone entities, which will enable a huge speedup in numpy.ma, but which may also be useful independently of masked arrays. I have made no attempt at this point to address domain checking, but certainly this needs to be moved into the C stage also, with separate ufuncs while we have only the single-mask binary ufuncs, but directly into the double-mask binary ufuncs whenever those are implemented. Example: In [1]:import numpy as np In [2]:x = np.arange(3) In [3]:y = np.arange(3) + 2 In [4]:x Out[4]:array([0, 1, 2]) In [5]:y Out[5]:array([2, 3, 4]) In [6]:mask = np.array([False, True, False]) In [7]:np.multiply_m(x, y, mask, x) Out[7]:array([0, 1, 8]) In [8]:x = np.arange(1000000, dtype=float) In [9]:y = np.sin(x) In [10]:mask = y > 0 In [11]:z = np.zeros_like(x) In [12]:timeit np.multiply(x,y,z) 100 loops, best of 3: 10.5 ms per loop In [13]:timeit np.multiply_m(x,y,mask,z) 100 loops, best of 3: 12 ms per loop Eric From david.huard at gmail.com Fri May 15 16:09:08 2009 From: david.huard at gmail.com (David Huard) Date: Fri, 15 May 2009 16:09:08 -0400 Subject: [Numpy-discussion] Indexing with callables (was: Yorick-like functionality) In-Reply-To: References: Message-ID: <91cf711d0905151309v5ab033f9r7c3c3e472e74bfff@mail.gmail.com> Pauli and David, Can this indexing syntax do things that are otherwise awkward with the current syntax ? Otherwise, I'm not warm to the idea of making indexing more complex than it is. getv : this is useful but it feels a bit redundant with numpy.take. Is there a reason why take could not support slices ? Drop_last: I don't think it is worth cluttering the namespace with a one liner. append_one: A generalized stack method with broadcasting capability would be more useful in my opinion, eg. 
``np.stack(x, 1., axis=1)`` zcen: This is indeed useful, particulary in its nd form, that is, when it can be applied to multiples axes to find the center of a 2D or 3D cell in one call. I'm appending the version I use below. Cheers, David # This code is released in the public domain. import numpy as np def __midpoints_1d(a): """Return `a` linearly interpolated at the mid-points.""" return (a[:-1] + a[1:])/2. def midpoints(a, axis=None): """Return `a` linearly interpolated at the mid-points. Parameters ---------- a : array-like Input array. axis : int or None Axis along which the interpolation takes place. None stands for all axes. Returns ------- out : ndarray Input array interpolated at the midpoints along the given axis. Examples -------- >>> a = [1,2,3,4] >>> midpoints(a) array([1.5, 2.5, 3.5]) """ x = np.asarray(a) if axis is not None: return np.apply_along_axis(__midpoints_1d, axis, x) else: for i in range(x.ndim): x = midpoints(x, i) return x On Thu, May 14, 2009 at 6:54 AM, Pauli Virtanen wrote: > Wed, 13 May 2009 13:18:45 -0700, David J Strozzi kirjoitti: > [clip] > > Many of you probably know of the interpreter yorick by Dave Munro. As a > > Livermoron, I use it all the time. There are some built-in functions > > there, analogous to but above and beyond numpy's sum() and diff(), which > > are quite useful for common operations on gridded data. Of course one > > can write their own, but maybe they should be cleanly canonized? > > +0 from me for zcen and other, having small functions probably won't hurt > much > > [clip] > > Besides zcen, yorick has builtins for "point centering", "un-zone > > centering," etc. Also, due to its slick syntax you can give these > > things as array "indexes": > > > > x(zcen), y(dif), z(:,sum,:) > > I think you can easily subclass numpy.ndarray to offer the same feature, > see below. I don't know if we want to add this feature (indexing with > callables) to the Numpy's fancy indexing itself. Thoughts? > > ----- > > import numpy as np > import inspect > > class YNdarray(np.ndarray): > """ > A subclass of ndarray that implements Yorick-like indexing with > functions. > > Beware: not adequately tested... 
> """ > > def __getitem__(self, key_): > if not isinstance(key_, tuple): > key = (key_,) > scalar_key = True > else: > key = key_ > scalar_key = False > > key = list(key) > > # expand ellipsis manually > while Ellipsis in key: > j = key.index(Ellipsis) > key[j:j+1] = [slice(None)] * (self.ndim - len(key)) > > # handle reducing or mutating callables > arr = self > new_key = [] > real_axis = 0 > for j, v in enumerate(key): > if callable(v): > arr2 = self._reduce_axis(arr, v, real_axis) > new_key.extend([slice(None)] * (arr2.ndim - arr.ndim + 1)) > arr = arr2 > elif v is not None: > real_axis += 1 > new_key.append(v) > else: > new_key.append(v) > > # final get > if scalar_key: > return np.ndarray.__getitem__(arr, new_key[0]) > else: > return np.ndarray.__getitem__(arr, tuple(new_key)) > > > def _reduce_axis(self, arr, func, axis): > return func(arr, axis=axis) > > x = np.arange(2*3*4).reshape(2,3,4).view(YNdarray) > > # Now, > > assert np.allclose(x[np.sum,...], np.sum(x, axis=0)) > assert np.allclose(x[:,np.sum,:], np.sum(x, axis=1)) > assert np.allclose(x[:,:,np.sum], np.sum(x, axis=2)) > assert np.allclose(x[:,np.sum,None,np.sum], > x.sum(axis=1).sum(axis=1)[:,None]) > > def get(v, s, axis=0): > """Index `v` with slice `s` along given axis""" > ix = [slice(None)] * v.ndim > ix[axis] = s > return v[ix] > > def drop_last(v, axis=0): > """Remove one element from given array in given dimension""" > return get(v, slice(None, -1), axis) > > assert np.allclose(x[:,drop_last,:], x[:,:-1,:]) > > def zcen(v, axis=0): > return .5*(get(v, slice(None,-1), axis) + get(v, slice(1,None), axis)) > > assert np.allclose(x[0,1,zcen], .5*(x[0,1,1:] + x[0,1,:-1])) > > def append_one(v, axis=0): > """Append one element to the given array in given dimension, > fill with ones""" > new_shape = list(v.shape) > new_shape[axis] += 1 > v2 = np.empty(new_shape, dtype=v.dtype) > get(v2, slice(None, -1), axis)[:] = v > get(v2, -1, axis)[:] = 1 > return v2 > > assert np.allclose(x[:,np.diff,0], np.diff(x.view(np.ndarray)[:,:,0], > axis=1)) > assert np.allclose(x[0,append_one,:], [[0,1,2,3], > [4,5,6,7], > [8,9,10,11], > [1,1,1,1]]) > assert np.allclose(x[:,append_one,0], [[0,4,8,1], > [12,16,20,1]]) > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Fri May 15 16:47:46 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 15 May 2009 16:47:46 -0400 Subject: [Numpy-discussion] Indexing with callables (was: Yorick-like functionality) In-Reply-To: <91cf711d0905151309v5ab033f9r7c3c3e472e74bfff@mail.gmail.com> References: <91cf711d0905151309v5ab033f9r7c3c3e472e74bfff@mail.gmail.com> Message-ID: <1cd32cbb0905151347v4e6cb201ld0f4cd0a998e3973@mail.gmail.com> On Fri, May 15, 2009 at 4:09 PM, David Huard wrote: > Pauli and David, > > Can this indexing syntax do things that are otherwise awkward with the > current syntax ? Otherwise, I'm not warm to the idea of making indexing more > complex than it is. > > getv : this is useful but it feels a bit redundant with numpy.take. Is there > a reason why take could not support slices ? > > Drop_last: I don't think it is worth cluttering the namespace with a one > liner. > > append_one: A generalized stack method with broadcasting capability would be > more useful in my opinion, eg. 
``np.stack(x, 1., axis=1)`` > > zcen: This is indeed useful, particulary in its nd form, that is, when it > can be applied to multiples axes to find the center of a 2D or 3D cell in > one call. I'm appending the version I use below. > > Cheers, > > David > > > # This code is released in the public domain. > import numpy as np > def __midpoints_1d(a): > ??? """Return `a` linearly interpolated at the mid-points.""" > ??? return (a[:-1] + a[1:])/2. > > def midpoints(a,? axis=None): > ??? """Return `a` linearly interpolated at the mid-points. > > ??? Parameters > ??? ---------- > ??? a : array-like > ????? Input array. > ??? axis : int or None > ????? Axis along which the interpolation takes place. None stands for all > axes. > > ??? Returns > ??? ------- > ??? out : ndarray > ????? Input array interpolated at the midpoints along the given axis. > > ??? Examples > ??? -------- > ??? >>> a = [1,2,3,4] > ??? >>> midpoints(a) > ??? array([1.5, 2.5, 3.5]) > ??? """ > ??? x = np.asarray(a) > ??? if axis is not None: > ??????? return np.apply_along_axis(__midpoints_1d,? axis, x) > ??? else: > ??????? for i in range(x.ndim): > ??????????? x = midpoints(x,? i) > ??????? return x > zcen is just a moving average, isn't it? For time series (1d), correlate works well, for 2d (nd?), there is >>> a= np.arange(5) >>> b = 1.0*a[:,np.newaxis]*np.arange(4) >>> ndimage.filters.correlate(b,0.5*np.ones((2,1)))[1:,1:] >>> ndimage.filters.correlate(b,0.5*np.ones((2,1)))[1:,1:] Josef From david.huard at gmail.com Fri May 15 17:39:31 2009 From: david.huard at gmail.com (David Huard) Date: Fri, 15 May 2009 17:39:31 -0400 Subject: [Numpy-discussion] Indexing with callables (was: Yorick-like functionality) In-Reply-To: <1cd32cbb0905151347v4e6cb201ld0f4cd0a998e3973@mail.gmail.com> References: <91cf711d0905151309v5ab033f9r7c3c3e472e74bfff@mail.gmail.com> <1cd32cbb0905151347v4e6cb201ld0f4cd0a998e3973@mail.gmail.com> Message-ID: <91cf711d0905151439g6e746a7cr298bfd8b65909a0c@mail.gmail.com> Josef, You're right, you can see it as a moving average. For 1D, correlate(a, [5,.5]) yields what I expect but does not take an axis keyword. For the 2D case, I'm rather looking for >>> ndimage.filters.correlate(b,0.25*np.ones((2,2)))[1:,1:] So another one-liner... maybe not worth adding to the numpy namespace. David On Fri, May 15, 2009 at 4:47 PM, wrote: > On Fri, May 15, 2009 at 4:09 PM, David Huard > wrote: > > Pauli and David, > > > > Can this indexing syntax do things that are otherwise awkward with the > > current syntax ? Otherwise, I'm not warm to the idea of making indexing > more > > complex than it is. > > > > getv : this is useful but it feels a bit redundant with numpy.take. Is > there > > a reason why take could not support slices ? > > > > Drop_last: I don't think it is worth cluttering the namespace with a one > > liner. > > > > append_one: A generalized stack method with broadcasting capability would > be > > more useful in my opinion, eg. ``np.stack(x, 1., axis=1)`` > > > > zcen: This is indeed useful, particulary in its nd form, that is, when it > > can be applied to multiples axes to find the center of a 2D or 3D cell in > > one call. I'm appending the version I use below. > > > > Cheers, > > > > David > > > > > > # This code is released in the public domain. > > import numpy as np > > def __midpoints_1d(a): > > """Return `a` linearly interpolated at the mid-points.""" > > return (a[:-1] + a[1:])/2. > > > > def midpoints(a, axis=None): > > """Return `a` linearly interpolated at the mid-points. 
> > > > Parameters > > ---------- > > a : array-like > > Input array. > > axis : int or None > > Axis along which the interpolation takes place. None stands for all > > axes. > > > > Returns > > ------- > > out : ndarray > > Input array interpolated at the midpoints along the given axis. > > > > Examples > > -------- > > >>> a = [1,2,3,4] > > >>> midpoints(a) > > array([1.5, 2.5, 3.5]) > > """ > > x = np.asarray(a) > > if axis is not None: > > return np.apply_along_axis(__midpoints_1d, axis, x) > > else: > > for i in range(x.ndim): > > x = midpoints(x, i) > > return x > > > > zcen is just a moving average, isn't it? For time series (1d), > correlate works well, for 2d (nd?), there is > > >>> a= np.arange(5) > >>> b = 1.0*a[:,np.newaxis]*np.arange(4) > >>> ndimage.filters.correlate(b,0.5*np.ones((2,1)))[1:,1:] > >>> ndimage.filters.correlate(b,0.5*np.ones((2,1)))[1:,1:] > > Josef > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Fri May 15 19:16:21 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 15 May 2009 19:16:21 -0400 Subject: [Numpy-discussion] Indexing with callables (was: Yorick-like functionality) In-Reply-To: <91cf711d0905151439g6e746a7cr298bfd8b65909a0c@mail.gmail.com> References: <91cf711d0905151309v5ab033f9r7c3c3e472e74bfff@mail.gmail.com> <1cd32cbb0905151347v4e6cb201ld0f4cd0a998e3973@mail.gmail.com> <91cf711d0905151439g6e746a7cr298bfd8b65909a0c@mail.gmail.com> Message-ID: <1cd32cbb0905151616i230f5b29gb3cb249e9bdb68d1@mail.gmail.com> On Fri, May 15, 2009 at 5:39 PM, David Huard wrote: > Josef, > > You're right, you can see it as a moving average. For 1D, correlate(a, > [5,.5]) yields what I expect but does not take an axis keyword. For the 2D > case, I'm rather looking for > >>>> ndimage.filters.correlate(b,0.25*np.ones((2,2)))[1:,1:] > > So another one-liner... maybe not worth adding to the numpy namespace. > I needed some practice with slice handling. This seems to work, but only minimally tested. It would be possible to extend it to axis being a tuple. ndimage is currently very fast if you give it the correct types and crashes for wrong function arguments. 
Josef def movmean(a, k=2, axis=None): '''moving average along axis for window length k''' a = np.asarray(a, dtype=float) # integers don't work because return type is also integer if axis is None: kernshape = [k]*a.ndim kern = 1/float(k)**a.ndim * np.ones(kernshape) #print kern cut = [slice(1,None,None)]*a.ndim return ndimage.filters.correlate(a,kern)[cut] else: kernshape = [1]*a.ndim kernshape[axis] = k kern = 1/float(k) * np.ones(kernshape) #print kern cut = [slice(None)]*a.ndim cut[axis] = slice(1,None,None) return ndimage.filters.correlate(a,kern)[cut] a = np.arange(5) b = 1.0*a[:,np.newaxis]*np.arange(1,6,2) c = b[:,:,np.newaxis]*a print movmean(a) print movmean(b) print "axis=1" print (b[:,:-1]+b[:,1:])/2 print movmean(b, axis=1) print "axis=0" print (b[:-1,:]+b[1:,:])/2 print movmean(b, axis=0) print (c[:-1,:,:]+c[1:,:,:])/2 print movmean(c, axis=0) From efiring at hawaii.edu Fri May 15 21:48:50 2009 From: efiring at hawaii.edu (Eric Firing) Date: Fri, 15 May 2009 15:48:50 -1000 Subject: [Numpy-discussion] masked ufuncs in C: on github Message-ID: <4A0E1B82.9030006@hawaii.edu> http://www.mail-archive.com/numpy-discussion at scipy.org/msg17595.html Prompted by the thread above, I decided to see what it would take to implement ufuncs with masking in C. I described the result here: http://www.mail-archive.com/numpy-discussion at scipy.org/msg17698.html Now I am starting a new thread. The present state of the work is now in github: http://github.com/efiring/numpy-work/tree/cfastma I don't want to do any more until I have gotten some feedback from core developers. (And I would be delighted if someone wants to help with this, or take it over.) 1) The strategy I have started with is to make a full set of masked ufuncs alongside the existing ones, appending "_m" to their names. Only the binary ufuncs are implemented now, but the unary ufuncs can be handled similarly. Example: multiply(x, y, out) # present ufunc: no change multiply_m(x, y, mask, out) # new Where mask is True, the operation is skipped. 2) I have in mind the possibility of supporting two input masks and one output mask for binary operations. This would look like: multiply_mm(x, y, maskx, masky, out, outmask) outmask would be the logical_or of maskx and masky, and in the case of domained operations it would also be True where the arguments are outside the domain. This form would provide the fastest support for masked arrays, but would also take quite a bit more work, and would expand the namespace even more. I'm not sure it's worth it. 3) I have not yet taken any steps to modify numpy.ma to take advantage of the new ufuncs, but I think that will be quite simple. 4) Likewise, to save time, I am now just borrowing the regular ufunc docstrings. 5) No tests yet, Stefan. They can be added as soon as there is agreement on API and general strategy. 6) The present implementation is based on conceptually small modifications of the existing numpy code generation system. It required a lot of cut and paste, and yields a lot of nearly duplicated code. There may be better ways to do it--especially if it turns out it needs to be redone in some modified form. 
Eric From charlesr.harris at gmail.com Fri May 15 23:06:31 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 15 May 2009 21:06:31 -0600 Subject: [Numpy-discussion] masked ufuncs in C: on github In-Reply-To: <4A0E1B82.9030006@hawaii.edu> References: <4A0E1B82.9030006@hawaii.edu> Message-ID: On Fri, May 15, 2009 at 7:48 PM, Eric Firing wrote: > > http://www.mail-archive.com/numpy-discussion at scipy.org/msg17595.html > > Prompted by the thread above, I decided to see what it would take to > implement ufuncs with masking in C. I described the result here: > > http://www.mail-archive.com/numpy-discussion at scipy.org/msg17698.html > > Now I am starting a new thread. The present state of the work is now in > github: http://github.com/efiring/numpy-work/tree/cfastma > > I don't want to do any more until I have gotten some feedback from core > developers. (And I would be delighted if someone wants to help with > this, or take it over.) > Here the if ... continue needs to follow the declaration: if (*mp1) continue; float in1 = *(float *)ip1; float in2 = *(float *)ip2; *(float *)op1 = f(in1, in2); I think this would be better as if (!(*mp1)) { float in1 = *(float *)ip1; float in2 = *(float *)ip2; *(float *)op1 = f(in1, in2); } But since this is actually a ternary function, you could define new functions, something like double npy_add_m(double a, double b, double mask) { if (!mask) { return a + b; else { return a; } } And use the currently existing loops. Well, you would have to add one for ternary functions. Question, what about reduce? I don't think it is defined defined for ternary functions. Apart from reduce, why not just add, you already have the mask to tell you which results are invalid. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From efiring at hawaii.edu Sat May 16 04:02:21 2009 From: efiring at hawaii.edu (Eric Firing) Date: Fri, 15 May 2009 22:02:21 -1000 Subject: [Numpy-discussion] masked ufuncs in C: on github In-Reply-To: References: <4A0E1B82.9030006@hawaii.edu> Message-ID: <4A0E730D.4060902@hawaii.edu> Charles R Harris wrote: > > > On Fri, May 15, 2009 at 7:48 PM, Eric Firing > wrote: > > > http://www.mail-archive.com/numpy-discussion at scipy.org/msg17595.html > > Prompted by the thread above, I decided to see what it would take to > implement ufuncs with masking in C. I described the result here: > > http://www.mail-archive.com/numpy-discussion at scipy.org/msg17698.html > > Now I am starting a new thread. The present state of the work is now in > github: http://github.com/efiring/numpy-work/tree/cfastma > > I don't want to do any more until I have gotten some feedback from core > developers. (And I would be delighted if someone wants to help with > this, or take it over.) Chuck, Thanks very much for the quick response. > > > Here the if ... continue needs to follow the declaration: > > if (*mp1) continue; > float in1 = *(float *)ip1; > float in2 = *(float *)ip2; > *(float *)op1 = f(in1, in2); > I was surprised to see the declarations inside the loop in the first place (this certainly is not ANSI-C), and I was also pleasantly surprised that letting them be after the conditional didn't seem to bother the compiler at all. Maybe that is a gcc extension. 
> I think this would be better as > > if (!(*mp1)) { > float in1 = *(float *)ip1; > float in2 = *(float *)ip2; > *(float *)op1 = f(in1, in2); > } > I agree, and I thought of that originally--I think I did it with continue because it was easier to type it in, and it reduced the difference relative to the non-masked form. > > But since this is actually a ternary function, you could define new > functions, something like > > double npy_add_m(double a, double b, double mask) > { > if (!mask) { > return a + b; > else { > return a; > } > } > > And use the currently existing loops. Well, you would have to add one > for ternary functions. > That would incur the overhead of an extra function call for each element; I suspect it would slow it down a lot. My motivation is to make masked array overhead negligible, at least for medium to large arrays. Also your suggestion above does not handle the case where an output argument is supplied; it would modify the output under the mask. > Question, what about reduce? I don't think it is defined defined for > ternary functions. Apart from reduce, why not just add, you already have > the mask to tell you which results are invalid. > You mean just do the operation and ignore the results under the mask? This is the way Pierre originally did it, if I remember correctly, but fairly recently people started objecting that they didn't want to disturb values in an output argument under a mask. So now ma jumps through hoops to satisfy this requirement, and it is consequently slow. ufunc methods like reduce are supported only for the binary ops with one output, so they are automatically unavailable for the masked versions. To get around this would require subclassing the ufunc to make a masked version. This is probably the best way to go, but I suspect it is much more complicated than I can handle in the amount of time I can spend. So maybe my proposed masked ufuncs are a slight abuse of the ufunc concept, or at least its present implementation. Unary functions with a mask, which I have not yet tried to implement, would actually be binary, so they would have reduce etc. methods that would not make any sense. Is there a way to disable (remove) the methods in this case? Eric > Chuck > > > > ------------------------------------------------------------------------ > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From robert.kern at gmail.com Sat May 16 04:04:46 2009 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 16 May 2009 03:04:46 -0500 Subject: [Numpy-discussion] masked ufuncs in C: on github In-Reply-To: <4A0E730D.4060902@hawaii.edu> References: <4A0E1B82.9030006@hawaii.edu> <4A0E730D.4060902@hawaii.edu> Message-ID: <3d375d730905160104g3eedb936v2d97be231b4218e0@mail.gmail.com> On Sat, May 16, 2009 at 03:02, Eric Firing wrote: > Charles R Harris wrote: >> Here the if ... continue needs to follow the declaration: >> >> ? ? ? ? if (*mp1) continue; >> ? ? ? ? float in1 = *(float *)ip1; >> ? ? ? ? float in2 = *(float *)ip2; >> ? ? ? ? *(float *)op1 = f(in1, in2); >> > > I was surprised to see the declarations inside the loop in the first > place (this certainly is not ANSI-C), and I was also pleasantly > surprised that letting them be after the conditional didn't seem to > bother the compiler at all. ?Maybe that is a gcc extension. I believe they are a part of C99. 
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From matthieu.brucher at gmail.com Sat May 16 04:06:26 2009 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Sat, 16 May 2009 10:06:26 +0200 Subject: [Numpy-discussion] masked ufuncs in C: on github In-Reply-To: <3d375d730905160104g3eedb936v2d97be231b4218e0@mail.gmail.com> References: <4A0E1B82.9030006@hawaii.edu> <4A0E730D.4060902@hawaii.edu> <3d375d730905160104g3eedb936v2d97be231b4218e0@mail.gmail.com> Message-ID: 2009/5/16 Robert Kern : > On Sat, May 16, 2009 at 03:02, Eric Firing wrote: >> Charles R Harris wrote: > >>> Here the if ... continue needs to follow the declaration: >>> >>> ? ? ? ? if (*mp1) continue; >>> ? ? ? ? float in1 = *(float *)ip1; >>> ? ? ? ? float in2 = *(float *)ip2; >>> ? ? ? ? *(float *)op1 = f(in1, in2); >>> >> >> I was surprised to see the declarations inside the loop in the first >> place (this certainly is not ANSI-C), and I was also pleasantly >> surprised that letting them be after the conditional didn't seem to >> bother the compiler at all. ?Maybe that is a gcc extension. > > I believe they are a part of C99. Exactly (so not supported by Visual Studio). Matthieu -- Information System Engineer, Ph.D. Website: http://matthieu-brucher.developpez.com/ Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn: http://www.linkedin.com/in/matthieubrucher From jorgesmbox-ml at yahoo.es Sat May 16 04:42:50 2009 From: jorgesmbox-ml at yahoo.es (jorgesmbox-ml at yahoo.es) Date: Sat, 16 May 2009 08:42:50 +0000 (GMT) Subject: [Numpy-discussion] Question about slicing Message-ID: <519625.91803.qm@web27901.mail.ukl.yahoo.com> Hi, I am just starting with numpy, pyhton and related others. I work on image processing basically. Now, my question is: what is the expected behaviour when slicing a view of an array? The following example might give some background on what I tried to do and the results obatined (which I don't understand): I read an image with PIL but, for whatever reason (different conventions I suppose), it comes upside down. This doesn't change when (I don't know the exact term for this) transforming the image to ndarray with 'array(img)'. I don't feel comfortable working with upside down images, so this had to be fixed. I tried to be smart and avoid copying the whole image: aimg = array(img)[::-1] and it worked!, but I am interested actually in sub-regions of this image, so the next I did was: roi = aimg[10:20,45:50,:] And to my surprise the result was like if I was slicing the original, upside down, image instead of aimg. Can someone explain me what's going on here? I searched and looked at the documentation but I couldn't find an answer. Maybe I am not looking properly. Is the only way to turn the image to perform a copy? Thanks, Jorge From emmanuelle.gouillart at normalesup.org Sat May 16 05:22:45 2009 From: emmanuelle.gouillart at normalesup.org (Emmanuelle Gouillart) Date: Sat, 16 May 2009 11:22:45 +0200 Subject: [Numpy-discussion] Question about slicing In-Reply-To: <519625.91803.qm@web27901.mail.ukl.yahoo.com> References: <519625.91803.qm@web27901.mail.ukl.yahoo.com> Message-ID: <20090516092245.GA27596@phare.normalesup.org> Hi Jorge, > roi = aimg[10:20,45:50,:] are you working with 3-D images? I didn't know PIL was able to handle 3D images. 
I wasn't able to reproduce the behavior you observed with a simple example: In [20]: base = np.arange(25).reshape((5,5)) In [21]: base Out[21]: array([[ 0, 1, 2, 3, 4], [ 5, 6, 7, 8, 9], [10, 11, 12, 13, 14], [15, 16, 17, 18, 19], [20, 21, 22, 23, 24]]) In [22]: flip = base[::-1] In [23]: flip Out[23]: array([[20, 21, 22, 23, 24], [15, 16, 17, 18, 19], [10, 11, 12, 13, 14], [ 5, 6, 7, 8, 9], [ 0, 1, 2, 3, 4]]) In [24]: flip[2:4,2:4] Out[24]: array([[12, 13], [ 7, 8]]) which is what you expect... I also tried the same manipulations as you do starting from a PIL image object, but I also got what I expected (and my image was not flipped vertically by PIL or when transformed into an array). It is quite weird BTW that your images are flipped. How do you visualize PIL image (Image.Image.show?)and arrays (pylab.imshow?) ? Hope someone can help you more than I did :D Cheers, Emmanuelle On Sat, May 16, 2009 at 08:42:50AM +0000, jorgesmbox-ml at yahoo.es wrote: > Hi, > I am just starting with numpy, pyhton and related others. I work on image processing basically. Now, my question is: what is the expected behaviour when slicing a view of an array? The following example might give some background on what I tried to do and the results obatined (which I don't understand): > I read an image with PIL but, for whatever reason (different conventions I suppose), it comes upside down. This doesn't change when (I don't know the exact term for this) transforming the image to ndarray with 'array(img)'. I don't feel comfortable working with upside down images, so this had to be fixed. I tried to be smart and avoid copying the whole image: > aimg = array(img)[::-1] > and it worked!, but I am interested actually in sub-regions of this image, so the next I did was: > roi = aimg[10:20,45:50,:] > And to my surprise the result was like if I was slicing the original, upside down, image instead of aimg. Can someone explain me what's going on here? I searched and looked at the documentation but I couldn't find an answer. Maybe I am not looking properly. Is the only way to turn the image to perform a copy? > Thanks, > Jorge > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From pav at iki.fi Sat May 16 05:29:23 2009 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 16 May 2009 09:29:23 +0000 (UTC) Subject: [Numpy-discussion] Question about slicing References: <519625.91803.qm@web27901.mail.ukl.yahoo.com> Message-ID: Sat, 16 May 2009 08:42:50 +0000, jorgesmbox-ml wrote: [clip] > I don't feel comfortable working with upside down images, so this had to > be fixed. I tried to be smart and avoid copying the whole image: > > aimg = array(img)[::-1] Note that here a copy is made. You can use `asarray` instead of `array` if you want to avoid making a copy. > and it worked!, but I am interested actually in sub-regions of this > image, so the next I did was: > > roi = aimg[10:20,45:50,:] > > And to my surprise the result was like if I was slicing the original, > upside down, image instead of aimg. Can someone explain me what's going > on here? Sounds impossible, and I don't see this: In [1]: import Image In [2]: img = Image.open('foo.png') In [3]: aimg = array(img) In [4]: imshow(aimg) Out[4]: In [5]: imshow(aimg[10:320,5:150]) Out[5]: The image is here right-side up, both in full and the slice (since imshow flips it). 
Also, In [6]: aimg = array(img)[::-1] In [7]: imshow(aimg[10:320,5:150]) Out[7]: Now, the image is upside down, both in full and in the slice. I think you should re-check that you are doing what you think you are doing. Preparing a self-contained code example could help here, at least this would make pinpointing where the error is more easy. -- Pauli Virtanen From pav at iki.fi Sat May 16 05:41:12 2009 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 16 May 2009 09:41:12 +0000 (UTC) Subject: [Numpy-discussion] Indexing with callables (was: Yorick-like functionality) References: <91cf711d0905151309v5ab033f9r7c3c3e472e74bfff@mail.gmail.com> Message-ID: Fri, 15 May 2009 16:09:08 -0400, David Huard wrote: > Can this indexing syntax do things that are otherwise awkward with the > current syntax ? Otherwise, I'm not warm to the idea of making indexing > more complex than it is. I think the indexing with callables is more syntax sugar for nested `func(v, axis=n)` than anything else. It may be more useful interactive use than calling the functions, though. Compare: x[:,sum,mean] mean(sum(x, axis=1), axis=1) It might be useful also for broadcasting functions operating on 1D vectors to the whole array, but here the semantics start getting muddier. [clip] > getv > drop_last > append_one > zcen These, apart from zcen, were just some demo functions I pulled out from nowhere. The actual list of Yorick functions relevant here appears to be here: http://yorick.sourceforge.net/manual/yorick_46.php#SEC46 http://yorick.sourceforge.net/manual/yorick_47.php#SEC47 I must say that I don't see many functions missing in Numpy... David (Strozzi): are these the functions you meant? Are there more? -- Pauli Virtanen From david at ar.media.kyoto-u.ac.jp Sat May 16 05:23:48 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sat, 16 May 2009 18:23:48 +0900 Subject: [Numpy-discussion] masked ufuncs in C: on github In-Reply-To: <4A0E730D.4060902@hawaii.edu> References: <4A0E1B82.9030006@hawaii.edu> <4A0E730D.4060902@hawaii.edu> Message-ID: <4A0E8624.3090001@ar.media.kyoto-u.ac.jp> Eric Firing wrote: > That would incur the overhead of an extra function call for each > element; I suspect it would slow it down a lot. My motivation is to make > masked array overhead negligible, at least for medium to large arrays. > You can use inline in that case - starting with numpy 1.3.0, inline can be used in C code in a portable way (it is a macro which points to compiler specific inline if the C99 inline is not available so it works even on broken compilers). cheers, David From kxroberto at googlemail.com Sat May 16 06:23:23 2009 From: kxroberto at googlemail.com (Robert) Date: Sat, 16 May 2009 12:23:23 +0200 Subject: [Numpy-discussion] minimal numpy ? In-Reply-To: <4A08F00E.1040605@ar.media.kyoto-u.ac.jp> References: <4A08F00E.1040605@ar.media.kyoto-u.ac.jp> Message-ID: David Cournapeau wrote: > Robert wrote: >> for use in binary distribution where I need only basics and fast >> startup/low memory footprint, I try to isolate the minimal ndarray >> type and what I need.. >> [..] > > I think you need at least umath to make this work: when doing import > numpy.core.multiarray, you pull out the whole numpy (because import > foo.bar induces import foo I believe), whereas import multiarray just > imports the multiarray C extension. > > So my suggestion would be to modify numpy such as you can do import > numpy after having removed most directories inside numpy. 
The big ones > are distutils and f2py, which should already save 2.5 Mb and are not > used at all in numpy itself. IIRC, the only problematic package is > numpy.lib (we import numpy.lib in numpy.core IIRC). > Did like this - keeping a /numpy/core folder structure. In attachment is a README-minimal-numpy.txt Maybe thats interesting for many users / inclusion somewhere in the docs. Result is: some 300kB compressed. And startup very fast. Strange: most imports in the package are relative - which is good for (flat) repackaging. Just one absolutue "from numpy.core.multiarray import ..." in a py file. Yet the 2 remaining DLL's obviously contain absolute imports of each other. Just because of that it is not possible to have the "minimal numpy" in a separate package folder with other name (without recompiling), but one needs to rename/remove the original numpy from the PYTHONPATH :-( Maybe the absolute imports could be removed out of the DLLs in future. By #ifdef or so in the C code the newer Pythons also can be forced to do precisely relative import. Robert -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: README-minimal-numpy.txt URL: From quilby at gmail.com Sat May 16 09:01:00 2009 From: quilby at gmail.com (Quilby) Date: Sat, 16 May 2009 16:01:00 +0300 Subject: [Numpy-discussion] linear algebra help Message-ID: Hi- This is what I need to do- I have this equation- Ax = y Where A is a rational m*n matrix (m<=n), and x and y are vectors of the right size. I know A and y, I don't know what x is equal to. I also know that there is no x where Ax equals exactly y. I want to find the vector x' such that Ax' is as close as possible to y. Meaning that (Ax' - y) is as close as possible to (0,0,0,...0). I know that I need to use either the lstsq function: http://www.scipy.org/doc/numpy_api_docs/numpy.linalg.linalg.html#lstsq or the svd function: http://www.scipy.org/doc/numpy_api_docs/numpy.linalg.linalg.html#svd I don't understand the documentation at all. Can someone please show me how to use these functions to solve my problem. Thanks a lot!!! -quilby From nwagner at iam.uni-stuttgart.de Sat May 16 09:15:46 2009 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Sat, 16 May 2009 15:15:46 +0200 Subject: [Numpy-discussion] linear algebra help In-Reply-To: References: Message-ID: On Sat, 16 May 2009 16:01:00 +0300 Quilby wrote: > Hi- > This is what I need to do- > > I have this equation- > > Ax = y > > Where A is a rational m*n matrix (m<=n), and x and y are >vectors of > the right size. I know A and y, I don't know what x is >equal to. I > also know that there is no x where Ax equals exactly y. >I want to find > the vector x' such that Ax' is as close as possible to >y. Meaning that > (Ax' - y) is as close as possible to (0,0,0,...0). > > I know that I need to use either the lstsq function: > http://www.scipy.org/doc/numpy_api_docs/numpy.linalg.linalg.html#lstsq > > or the svd function: > http://www.scipy.org/doc/numpy_api_docs/numpy.linalg.linalg.html#svd > > I don't understand the documentation at all. Can someone >please show > me how to use these functions to solve my problem. > > Thanks a lot!!! 
> > -quilby > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion I guess you meant a rectangular matrix http://mathworld.wolfram.com/RectangularMatrix.html from numpy.random import rand, seed from numpy import dot, shape from numpy.linalg import lstsq, norm seed(1) m = 10 n = 20 A = rand(m,n) # random matrix b = rand(m) # rhs x,residues,rank,s = lstsq(A,b) print 'Singular values',s print 'Numerical rank of A',rank print 'Solution',x r=dot(A,x)-b print 'residual',norm(r) Cheers, Nils From josef.pktd at gmail.com Sat May 16 09:34:15 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 16 May 2009 09:34:15 -0400 Subject: [Numpy-discussion] linear algebra help In-Reply-To: References: Message-ID: <1cd32cbb0905160634y209bdaa4jee0652404ec750c3@mail.gmail.com> On Sat, May 16, 2009 at 9:01 AM, Quilby wrote: > Hi- > This is what I need to do- > > I have this equation- > > Ax = y > > Where A is a rational m*n matrix (m<=n), and x and y are vectors of > the right size. I know A and y, I don't know what x is equal to. I > also know that there is no x where Ax equals exactly y. I want to find > the vector x' such that Ax' is as close as possible to y. Meaning that > (Ax' - y) is as close as possible to (0,0,0,...0). > > I know that I need to use either the lstsq function: > http://www.scipy.org/doc/numpy_api_docs/numpy.linalg.linalg.html#lstsq > > or the svd function: > http://www.scipy.org/doc/numpy_api_docs/numpy.linalg.linalg.html#svd > > I don't understand the documentation at all. Can someone please show > me how to use these functions to solve my problem. > Hi, The new docs are more informative and are being improved in the online editor see http://docs.scipy.org/doc/ and http://docs.scipy.org/doc/numpy/reference/generated/numpy.linalg.lstsq.html#numpy.linalg.lstsq any comments and improvement to the docs are very welcome. Josef From pav at iki.fi Sat May 16 10:02:34 2009 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 16 May 2009 14:02:34 +0000 (UTC) Subject: [Numpy-discussion] Obsolete Endo-generated docs References: Message-ID: Hi, Sat, 16 May 2009 16:01:00 +0300, Quilby wrote: [clip] > http://www.scipy.org/doc/numpy_api_docs/numpy.linalg.linalg.html#lstsq [clip] Could we take these old Endo-generated docs down, and make the URL redirect to docs.scipy.org? I believe they are more harmful than helpful... -- Pauli Virtanen From pav at iki.fi Sat May 16 10:31:40 2009 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 16 May 2009 14:31:40 +0000 (UTC) Subject: [Numpy-discussion] Obsolete Endo-generated docs References: Message-ID: Sat, 16 May 2009 14:02:34 +0000, Pauli Virtanen wrote: > Hi, > > Sat, 16 May 2009 16:01:00 +0300, Quilby wrote: [clip] >> http://www.scipy.org/doc/numpy_api_docs/numpy.linalg.linalg.html#lstsq > [clip] > > Could we take these old Endo-generated docs down, and make the URL > redirect to docs.scipy.org? 
I went and removed the links to them from http://www.scipy.org/Documentation -- Pauli Virtanen From charlesr.harris at gmail.com Sat May 16 10:41:20 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 16 May 2009 08:41:20 -0600 Subject: [Numpy-discussion] masked ufuncs in C: on github In-Reply-To: <4A0E730D.4060902@hawaii.edu> References: <4A0E1B82.9030006@hawaii.edu> <4A0E730D.4060902@hawaii.edu> Message-ID: On Sat, May 16, 2009 at 2:02 AM, Eric Firing wrote: > Charles R Harris wrote: > > > > > > On Fri, May 15, 2009 at 7:48 PM, Eric Firing > > wrote: > > > > > > http://www.mail-archive.com/numpy-discussion at scipy.org/msg17595.html > > > > Prompted by the thread above, I decided to see what it would take to > > implement ufuncs with masking in C. I described the result here: > > > > http://www.mail-archive.com/numpy-discussion at scipy.org/msg17698.html > > > > Now I am starting a new thread. The present state of the work is now > in > > github: http://github.com/efiring/numpy-work/tree/cfastma > > > > I don't want to do any more until I have gotten some feedback from > core > > developers. (And I would be delighted if someone wants to help with > > this, or take it over.) > > Chuck, > > Thanks very much for the quick response. > > > > > > > Here the if ... continue needs to follow the declaration: > > > > if (*mp1) continue; > > float in1 = *(float *)ip1; > > float in2 = *(float *)ip2; > > *(float *)op1 = f(in1, in2); > > > > I was surprised to see the declarations inside the loop in the first > place (this certainly is not ANSI-C), and I was also pleasantly > surprised that letting them be after the conditional didn't seem to > bother the compiler at all. Maybe that is a gcc extension. > Declarations at the top of a block have always been valid C. > > > I think this would be better as > > > > if (!(*mp1)) { > > float in1 = *(float *)ip1; > > float in2 = *(float *)ip2; > > *(float *)op1 = f(in1, in2); > > } > > > > I agree, and I thought of that originally--I think I did it with > continue because it was easier to type it in, and it reduced the > difference relative to the non-masked form. > > > > > But since this is actually a ternary function, you could define new > > functions, something like > > > > double npy_add_m(double a, double b, double mask) > > { > > if (!mask) { > > return a + b; > > else { > > return a; > > } > > } > > > > And use the currently existing loops. Well, you would have to add one > > for ternary functions. > > > That would incur the overhead of an extra function call for each > element; I suspect it would slow it down a lot. My motivation is to make > masked array overhead negligible, at least for medium to large arrays. > It overhead would be the same as it is now, the generic loops all use passed function pointers for functions like sin. Some functions like addition, which is intrinsic and not part of a library, are done in their own special loops that you will find further down in that file. The difficulty I see is that with the current machinery the mask will be converted to the same type as the added numbers and that could add some overhead. > > Also your suggestion above does not handle the case where an output > argument is supplied; it would modify the output under the mask. > > > Question, what about reduce? I don't think it is defined defined for > > ternary functions. Apart from reduce, why not just add, you already have > > the mask to tell you which results are invalid. 
> > > > You mean just do the operation and ignore the results under the mask? > This is the way Pierre originally did it, if I remember correctly, but > fairly recently people started objecting that they didn't want to > disturb values in an output argument under a mask. So now ma jumps > through hoops to satisfy this requirement, and it is consequently slow. > OK. I'm not familiar with the uses of masked arrays. > > ufunc methods like reduce are supported only for the binary ops with one > output, so they are automatically unavailable for the masked versions. > To get around this would require subclassing the ufunc to make a masked > version. This is probably the best way to go, but I suspect it is much > more complicated than I can handle in the amount of time I can spend. > I think reduce could be added for ternary functions, but it is a design decision how it should operate. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From jorgesmbox-ml at yahoo.es Sat May 16 18:05:16 2009 From: jorgesmbox-ml at yahoo.es (Jorge Scandaliaris) Date: Sat, 16 May 2009 22:05:16 +0000 (UTC) Subject: [Numpy-discussion] Question about slicing References: <519625.91803.qm@web27901.mail.ukl.yahoo.com> <20090516092245.GA27596@phare.normalesup.org> Message-ID: Emmanuelle Gouillart normalesup.org> writes: > > Hi Jorge, > > > roi = aimg[10:20,45:50,:] > > are you working with 3-D images? I didn't know PIL was able to handle 3D > images. > Well, if by 3D you mean color images then yes, PIL is able to handle them > I wasn't able to reproduce the behavior you observed with a simple > example: > In [20]: base = np.arange(25).reshape((5,5)) > > In [21]: base > Out[21]: > array([[ 0, 1, 2, 3, 4], > [ 5, 6, 7, 8, 9], > [10, 11, 12, 13, 14], > [15, 16, 17, 18, 19], > [20, 21, 22, 23, 24]]) > > In [22]: flip = base[::-1] > > In [23]: flip > Out[23]: > array([[20, 21, 22, 23, 24], > [15, 16, 17, 18, 19], > [10, 11, 12, 13, 14], > [ 5, 6, 7, 8, 9], > [ 0, 1, 2, 3, 4]]) > > In [24]: flip[2:4,2:4] > Out[24]: > array([[12, 13], > [ 7, 8]]) > which is what you expect... > You're right. I should have done these tests myself. I apologize for jumping to the list so quickly. > I also tried the same manipulations as you do starting from a PIL image > object, but I also got what I expected (and my image was not flipped > vertically by PIL or when transformed into an array). It is quite weird > BTW that your images are flipped. How do you visualize PIL image > (Image.Image.show?)and arrays (pylab.imshow?) ? > It is weird indeed. But no so much that they appear upside down (I do use pylab.imshow() to display images), because at the end of the day it is just a convention, and different things can use different conventions, but because of the fact that the numpy array obtained from the PIL image is not. I downloaded the scipy logo: http://docs.scipy.org/doc/_static/scipyshiny_small.png and did the following: img = Image.open('./scipyshiny_small.png') mpl.pylab.imshow(img) # Comes upside down aimg = asarray(img) mpl.pylab.imshow(aimg) # Comes OK I guess my problem lies with PIL rather than with numpy. I am glad to find that slicing works as I would have expected it to work! > Hope someone can help you more than I did :D You did help, thanks! 
Jorge From pav at iki.fi Sat May 16 18:22:11 2009 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 16 May 2009 22:22:11 +0000 (UTC) Subject: [Numpy-discussion] Question about slicing References: <519625.91803.qm@web27901.mail.ukl.yahoo.com> <20090516092245.GA27596@phare.normalesup.org> Message-ID: Sat, 16 May 2009 22:05:16 +0000, Jorge Scandaliaris wrote: [clip] > I downloaded the scipy logo: > http://docs.scipy.org/doc/_static/scipyshiny_small.png and did the > following: > > img = Image.open('./scipyshiny_small.png') > > mpl.pylab.imshow(img) # Comes upside down > aimg = asarray(img) > mpl.pylab.imshow(aimg) # Comes OK Ah, that was it. Apparently, matplotlib imshow uses a different conversion mechanism if the input is a Image.Image from PIL, than if it is an array. There's a dedicated function matplotlib.image.pil_to_array which seems to work differently from asarray. This may actually be a bug in matplotlib; perhaps you should ask the people on the matplotlib lists if this is really the intended behavior. -- Pauli Virtanen From jorgesmbox-ml at yahoo.es Sat May 16 18:42:16 2009 From: jorgesmbox-ml at yahoo.es (Jorge Scandaliaris) Date: Sat, 16 May 2009 22:42:16 +0000 (UTC) Subject: [Numpy-discussion] Question about slicing References: <519625.91803.qm@web27901.mail.ukl.yahoo.com> Message-ID: Pauli Virtanen iki.fi> writes: > > img = array(img)[::-1] > > Note that here a copy is made. You can use `asarray` instead of `array` > if you want to avoid making a copy. > Thanks, that's good info! > > and it worked!, but I am interested actually in sub-regions of this > > image, so the next I did was: > > > > roi = aimg[10:20,45:50,:] > > > > And to my surprise the result was like if I was slicing the original, > > upside down, image instead of aimg. Can someone explain me what's going > > on here? > > Sounds impossible, and I don't see this: > > In [1]: import Image > In [2]: img = Image.open('foo.png') > In [3]: aimg = array(img) > In [4]: imshow(aimg) > Out[4]: > In [5]: imshow(aimg[10:320,5:150]) > Out[5]: > > The image is here right-side up, both in full and the slice (since imshow > flips it). Also, > > In [6]: aimg = array(img)[::-1] > In [7]: imshow(aimg[10:320,5:150]) > Out[7]: > > Now, the image is upside down, both in full and in the slice. > > I think you should re-check that you are doing what you think you are > doing. Preparing a self-contained code example could help here, at least > this would make pinpointing where the error is more easy. > You're right. I was using imshow to see img (the IPL iamge, not the numpy array), and that comes upside down, at least here. That made me think the numpy array was upside down too when in fact it wasn't, so my 'fix' actually was flipping it. I'll further investigate as why the IPL image appears upside down, but my questions about slicing are answered now. Sorry for mixing things up, and thanks for helping out. Jorge From aisaac at american.edu Sat May 16 18:51:26 2009 From: aisaac at american.edu (Alan G Isaac) Date: Sat, 16 May 2009 18:51:26 -0400 Subject: [Numpy-discussion] linear algebra help In-Reply-To: References: Message-ID: <4A0F436E.4020900@american.edu> On 5/16/2009 9:01 AM Quilby apparently wrote: > Ax = y > Where A is a rational m*n matrix (m<=n), and x and y are vectors of > the right size. I know A and y, I don't know what x is equal to. I > also know that there is no x where Ax equals exactly y. If m<=n, that can only be true if there are not m linearly independent columns of A. Are you sure you have the dimensions right? 
Alan Isaac From ljhardy at gmail.com Sat May 16 20:26:05 2009 From: ljhardy at gmail.com (ljhardy) Date: Sat, 16 May 2009 17:26:05 -0700 (PDT) Subject: [Numpy-discussion] Leopard install In-Reply-To: <0EAFD24D-078E-4EAD-AEF0-BC5B55789CBD@cinci.rr.com> References: <0EAFD24D-078E-4EAD-AEF0-BC5B55789CBD@cinci.rr.com> Message-ID: <23579011.post@talk.nabble.com> I'm continuing to have this problem. I have installed Python 2.6.2 from the source that is found on www.python.org. I'm running Leopard 10.5.7. Entering "python" from the shell shows: Python 2.6.2 (r262:71600, May 16 2009, 19:04:59) [GCC 4.0.1 (Apple Inc. build 5465)] on darwin Type "help", "copyright", "credits" or "license" for more information. >>> Stuart Edwards wrote: > > Hi > > I am trying to install numpy 1.3.0 on Leopard 10.5.6 and at the point > in the install process where I select a destination, my boot disc is > excluded with the message: > > " You cannot install numpy 1.3.0 on this volume. numpy requires > System Python 2.5 to install." > > I'm not sure what 'System Python 2.5' is as compared to 'Python 2.5' > but in the terminal when I type 'python' I get: > > "Python 2.5.1 (r251:54863, Jan 13 2009, 10:26:13) [GCC 4.0.1 (Apple > Inc. build 5465)] on darwin" > > so the Python 2.5 requirement seems to be met. Any ideas on what is > happening here and why the installer can't see my python installation? > > (I notice that this issue was raised 3/28 also, but no resolution yet) > > Thanks for any assistance > > Stu > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- View this message in context: http://www.nabble.com/Leopard-install-tp23012456p23579011.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From robert.kern at gmail.com Sat May 16 20:28:52 2009 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 16 May 2009 19:28:52 -0500 Subject: [Numpy-discussion] Leopard install In-Reply-To: <23579011.post@talk.nabble.com> References: <0EAFD24D-078E-4EAD-AEF0-BC5B55789CBD@cinci.rr.com> <23579011.post@talk.nabble.com> Message-ID: <3d375d730905161728l3f09cfa7q9e80c6b899cb3dcc@mail.gmail.com> On Sat, May 16, 2009 at 19:26, ljhardy wrote: > > I'm continuing to have this problem. ?I have installed Python 2.6.2 from the > source that is found on www.python.org. ?I'm running Leopard 10.5.7. You cannot use a binary of numpy built for Python 2.5 with your Python 2.6. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From sccolbert at gmail.com Sat May 16 23:12:42 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Sat, 16 May 2009 23:12:42 -0400 Subject: [Numpy-discussion] Question about slicing In-Reply-To: References: <519625.91803.qm@web27901.mail.ukl.yahoo.com> Message-ID: <7f014ea60905162012s46755b2ehdc58d763f626f62d@mail.gmail.com> the reason for all this is that the bitmap image format specifies the image origin as the lower left corner. This is the convention used by PIL. The origin of a numpy array is the upper right corner. Matplot lib does not handle this discrepancy in the function pil_to_array, which is called internally when you invoke imshow(img) on a PIL image. Recently, PIL has implemented the array interface for PIL images. 
So if you call asarray(img) on a PIL image, you will get a (height, width, 3) array (for RGB) with the origin in the upper left corner. This is why the image appears right side up in matplotlib doing things this way. The matplot lib code should probably be updated to make use of the array interface. It just reshapes the raw string data currently. Chris On Sat, May 16, 2009 at 6:42 PM, Jorge Scandaliaris wrote: > Pauli Virtanen iki.fi> writes: > > > > > img = array(img)[::-1] > > > > Note that here a copy is made. You can use `asarray` instead of `array` > > if you want to avoid making a copy. > > > > Thanks, that's good info! > > > > and it worked!, but I am interested actually in sub-regions of this > > > image, so the next I did was: > > > > > > roi = aimg[10:20,45:50,:] > > > > > > And to my surprise the result was like if I was slicing the original, > > > upside down, image instead of aimg. Can someone explain me what's going > > > on here? > > > > Sounds impossible, and I don't see this: > > > > In [1]: import Image > > In [2]: img = Image.open('foo.png') > > In [3]: aimg = array(img) > > In [4]: imshow(aimg) > > Out[4]: > > In [5]: imshow(aimg[10:320,5:150]) > > Out[5]: > > > > The image is here right-side up, both in full and the slice (since imshow > > flips it). Also, > > > > In [6]: aimg = array(img)[::-1] > > In [7]: imshow(aimg[10:320,5:150]) > > Out[7]: > > > > Now, the image is upside down, both in full and in the slice. > > > > I think you should re-check that you are doing what you think you are > > doing. Preparing a self-contained code example could help here, at least > > this would make pinpointing where the error is more easy. > > > > You're right. I was using imshow to see img (the IPL iamge, not the numpy > array), and that comes upside down, at least here. That made me think the > numpy > array was upside down too when in fact it wasn't, so my 'fix' actually was > flipping it. > I'll further investigate as why the IPL image appears upside down, but my > questions about slicing are answered now. Sorry for mixing things up, and > thanks > for helping out. > > Jorge > > > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From glenn at tarbox.org Sun May 17 01:24:34 2009 From: glenn at tarbox.org (Glenn Tarbox, PhD) Date: Sat, 16 May 2009 22:24:34 -0700 Subject: [Numpy-discussion] numpy slices limited to 32 bit values? In-Reply-To: <20090514145406.GB3630@phare.normalesup.org> References: <20090514084335.GD32437@phare.normalesup.org> <20090514091617.GF32437@phare.normalesup.org> <20090514145406.GB3630@phare.normalesup.org> Message-ID: Today at Sage Days we tried slices on a few large arrays (no mmap) and found that slicing breaks on arrays somewhere between 2.0e9 and 2.5e9 elements. The failure mode is the same, no error thrown, basically nothing happens This was on one of the big sage machines. I don't know the specific OS / CPU but it was definitely 64 bit and lots of available memory etc. -glenn On Thu, May 14, 2009 at 7:54 AM, Gael Varoquaux < gael.varoquaux at normalesup.org> wrote: > On Thu, May 14, 2009 at 07:40:58AM -0700, Glenn Tarbox, PhD wrote: > > Hum, I am wondering: could it be that Sage has not been compiled in > > 64bits? That number '32' seems to me to point toward a 32bit pointer > > issue (I may be wrong). 
> > > The other tests I posted indicate everything else is working... For > > example, np.sum(fp) runs over the full set of 1e10 doubes and seems to > > work fine. > > Correct. I had missed that. > > Ga?l > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Glenn H. Tarbox, PhD || 206-274-6919 http://www.tarbox.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Sun May 17 06:32:39 2009 From: pav at iki.fi (Pauli Virtanen) Date: Sun, 17 May 2009 10:32:39 +0000 (UTC) Subject: [Numpy-discussion] numpy slices limited to 32 bit values? References: <20090514084335.GD32437@phare.normalesup.org> <20090514091617.GF32437@phare.normalesup.org> <20090514145406.GB3630@phare.normalesup.org> Message-ID: Hi, Sat, 16 May 2009 22:24:34 -0700, Glenn Tarbox, PhD wrote: > Today at Sage Days we tried slices on a few large arrays (no mmap) and > found that slicing breaks on arrays somewhere between 2.0e9 and 2.5e9 > elements. The failure mode is the same, no error thrown, basically > nothing happens > > This was on one of the big sage machines. I don't know the specific OS / > CPU but it was definitely 64 bit and lots of available memory etc. Could you file a bug ticket in the Numpy Trac, http://projects.scipy.org/numpy so that there's a better chance that this doesn't get forgotten. Thanks, -- Pauli Virtanen From charlesr.harris at gmail.com Sun May 17 10:51:27 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 17 May 2009 08:51:27 -0600 Subject: [Numpy-discussion] numpy slices limited to 32 bit values? In-Reply-To: References: <20090514084335.GD32437@phare.normalesup.org> <20090514091617.GF32437@phare.normalesup.org> <20090514145406.GB3630@phare.normalesup.org> Message-ID: Hi Glen, On Sat, May 16, 2009 at 11:24 PM, Glenn Tarbox, PhD wrote: > Today at Sage Days we tried slices on a few large arrays (no mmap) and > found that slicing breaks on arrays somewhere between 2.0e9 and 2.5e9 > elements. The failure mode is the same, no error thrown, basically nothing > happens > > This was on one of the big sage machines. I don't know the specific OS / > CPU but it was definitely 64 bit and lots of available memory etc. > Can you try slicing with an explicit upper bound? Something like a[:n] = 1, where n is the size of the array. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun May 17 11:14:40 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 17 May 2009 09:14:40 -0600 Subject: [Numpy-discussion] numpy slices limited to 32 bit values? In-Reply-To: References: <20090514084335.GD32437@phare.normalesup.org> <20090514091617.GF32437@phare.normalesup.org> <20090514145406.GB3630@phare.normalesup.org> Message-ID: On Sun, May 17, 2009 at 8:51 AM, Charles R Harris wrote: > Hi Glen, > > On Sat, May 16, 2009 at 11:24 PM, Glenn Tarbox, PhD wrote: > >> Today at Sage Days we tried slices on a few large arrays (no mmap) and >> found that slicing breaks on arrays somewhere between 2.0e9 and 2.5e9 >> elements. The failure mode is the same, no error thrown, basically nothing >> happens >> >> This was on one of the big sage machines. I don't know the specific OS / >> CPU but it was definitely 64 bit and lots of available memory etc. >> > > Can you try slicing with an explicit upper bound? Something like a[:n] = 1, > where n is the size of the array. 
> And maybe some things like a[n:n+1] = 1, which should only set a single element and might save some time ;) Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From quilby at gmail.com Sun May 17 12:14:05 2009 From: quilby at gmail.com (Quilby) Date: Sun, 17 May 2009 19:14:05 +0300 Subject: [Numpy-discussion] linear algebra help In-Reply-To: <4A0F436E.4020900@american.edu> References: <4A0F436E.4020900@american.edu> Message-ID: Right the dimensions I gave were wrong. What do I need to do for m>=n (more rows than columns)? Can I use the same function? When I run the script written by Nils (thanks!) I get: from numpy.random import rand, seed ImportError: No module named random But importing numpy works ok. What do I need to install? Thanks again! On Sun, May 17, 2009 at 1:51 AM, Alan G Isaac wrote: > On 5/16/2009 9:01 AM Quilby apparently wrote: >> Ax = y >> Where A is a rational m*n matrix (m<=n), and x and y are vectors of >> the right size. I know A and y, I don't know what x is equal to. I >> also know that there is no x where Ax equals exactly y. > > If m<=n, that can only be true if there are not > m linearly independent columns of A. Are you > sure you have the dimensions right? > > Alan Isaac > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From josef.pktd at gmail.com Sun May 17 13:21:16 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 17 May 2009 13:21:16 -0400 Subject: [Numpy-discussion] linear algebra help In-Reply-To: References: <4A0F436E.4020900@american.edu> Message-ID: <1cd32cbb0905171021x6bbb31a2n818bc25e091a4c9e@mail.gmail.com> On Sun, May 17, 2009 at 12:14 PM, Quilby wrote: > Right the dimensions I gave were wrong. > What do I need to do for m>=n (more rows than columns)? ?Can I use the > same function? > > When I run the script written by Nils (thanks!) I get: > ? ?from numpy.random import rand, seed > ImportError: No module named random > > But importing numpy works ok. What do I need to install? This should be working without extra install. You could run the test suite, numpy.test(), to see whether your install is ok. Otherwise, you would need to provide more information, numpy version, .... np.lstsq works for m>n, mn (more observations than parameters) is the standard least squares estimation problem. Josef > > Thanks again! > > On Sun, May 17, 2009 at 1:51 AM, Alan G Isaac wrote: >> On 5/16/2009 9:01 AM Quilby apparently wrote: >>> Ax = y >>> Where A is a rational m*n matrix (m<=n), and x and y are vectors of >>> the right size. I know A and y, I don't know what x is equal to. I >>> also know that there is no x where Ax equals exactly y. >> >> If m<=n, that can only be true if there are not >> m linearly independent columns of A. ?Are you >> sure you have the dimensions right? >> >> Alan Isaac >> From gokhansever at gmail.com Sun May 17 19:54:33 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_SEVER?=) Date: Sun, 17 May 2009 18:54:33 -0500 Subject: [Numpy-discussion] Savetxt usage question Message-ID: <49d6b3500905171654g6abd4b93x386c9b046ed87bc3@mail.gmail.com> Hello, Is there a way to write a header information to a text file using savetxt command besides dumping arrays in the same file? In little detailed fashion: I have to write a few long column of arrays into a text file. While doing that I need to put some information regarding to the context of the file. 
Like variable names, project date, missing value equivalent etc... So far, I couldn't see that this could be achieved with one savetxt command. However there might be an easy point that I am missing. Thank you. G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From michael.s.gilbert at gmail.com Sun May 17 19:57:37 2009 From: michael.s.gilbert at gmail.com (Michael S. Gilbert) Date: Sun, 17 May 2009 19:57:37 -0400 Subject: [Numpy-discussion] Savetxt usage question In-Reply-To: <49d6b3500905171654g6abd4b93x386c9b046ed87bc3@mail.gmail.com> References: <49d6b3500905171654g6abd4b93x386c9b046ed87bc3@mail.gmail.com> Message-ID: <20090517195737.fc42172d.michael.s.gilbert@gmail.com> fid = open( 'file' , 'w' ) fid.write( 'header\n' ) savetxt( fid , data ) fid.close() On Sun, 17 May 2009 18:54:33 -0500 G?khan SEVER wrote: > Hello, > > Is there a way to write a header information to a text file using savetxt > command besides dumping arrays in the same file? > > In little detailed fashion: I have to write a few long column of arrays into > a text file. While doing that I need to put some information regarding to > the context of the file. Like variable names, project date, missing value > equivalent etc... > > So far, I couldn't see that this could be achieved with one savetxt command. > However there might be an easy point that I am missing. > > Thank you. > > G?khan > From gokhansever at gmail.com Sun May 17 20:06:29 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_SEVER?=) Date: Sun, 17 May 2009 19:06:29 -0500 Subject: [Numpy-discussion] Savetxt usage question In-Reply-To: <20090517195737.fc42172d.michael.s.gilbert@gmail.com> References: <49d6b3500905171654g6abd4b93x386c9b046ed87bc3@mail.gmail.com> <20090517195737.fc42172d.michael.s.gilbert@gmail.com> Message-ID: <49d6b3500905171706p796c9f71j2438d22a333c6d3f@mail.gmail.com> Thanks for the quick reply. Exact solution ! G?khan On Sun, May 17, 2009 at 6:57 PM, Michael S. Gilbert < michael.s.gilbert at gmail.com> wrote: > fid = open( 'file' , 'w' ) > fid.write( 'header\n' ) > savetxt( fid , data ) > fid.close() > > On Sun, 17 May 2009 18:54:33 -0500 G?khan SEVER wrote: > > > Hello, > > > > Is there a way to write a header information to a text file using savetxt > > command besides dumping arrays in the same file? > > > > In little detailed fashion: I have to write a few long column of arrays > into > > a text file. While doing that I need to put some information regarding to > > the context of the file. Like variable names, project date, missing value > > equivalent etc... > > > > So far, I couldn't see that this could be achieved with one savetxt > command. > > However there might be an easy point that I am missing. > > > > Thank you. > > > > G?khan > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian.walter at gmail.com Mon May 18 04:05:21 2009 From: sebastian.walter at gmail.com (Sebastian Walter) Date: Mon, 18 May 2009 10:05:21 +0200 Subject: [Numpy-discussion] linear algebra help In-Reply-To: <1cd32cbb0905171021x6bbb31a2n818bc25e091a4c9e@mail.gmail.com> References: <4A0F436E.4020900@american.edu> <1cd32cbb0905171021x6bbb31a2n818bc25e091a4c9e@mail.gmail.com> Message-ID: Alternatively, to solve A x = b you could do import numpy import numpy.linalg B = numpy.dot(A.T, A) c = numpy.dot(A.T, b) x = numpy.linalg(B,c) This is not the most efficient way to do it but at least you know exactly what's going on in your code. 
On Sun, May 17, 2009 at 7:21 PM, wrote: > On Sun, May 17, 2009 at 12:14 PM, Quilby wrote: >> Right the dimensions I gave were wrong. >> What do I need to do for m>=n (more rows than columns)? Can I use the >> same function? >> >> When I run the script written by Nils (thanks!) I get: >> from numpy.random import rand, seed >> ImportError: No module named random >> >> But importing numpy works ok. What do I need to install? > > This should be working without extra install. You could run the test > suite, numpy.test(), to see whether your install is ok. > > Otherwise, you would need to provide more information, numpy version, .... > > np.lstsq works for m>n, m solution is different in the 3 cases. > m>n (more observations than parameters) is the standard least squares > estimation problem. > > Josef > > >> >> Thanks again! >> >> On Sun, May 17, 2009 at 1:51 AM, Alan G Isaac wrote: >>> On 5/16/2009 9:01 AM Quilby apparently wrote: >>>> Ax = y >>>> Where A is a rational m*n matrix (m<=n), and x and y are vectors of >>>> the right size. I know A and y, I don't know what x is equal to. I >>>> also know that there is no x where Ax equals exactly y. >>> >>> If m<=n, that can only be true if there are not >>> m linearly independent columns of A. Are you >>> sure you have the dimensions right? >>> >>> Alan Isaac >>> > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From stefan at sun.ac.za Mon May 18 04:21:44 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Mon, 18 May 2009 10:21:44 +0200 Subject: [Numpy-discussion] linear algebra help In-Reply-To: References: <4A0F436E.4020900@american.edu> <1cd32cbb0905171021x6bbb31a2n818bc25e091a4c9e@mail.gmail.com> Message-ID: <9457e7c80905180121w68216bb4xdfe0ac2da1ff7bff@mail.gmail.com> 2009/5/18 Sebastian Walter : > B = numpy.dot(A.T, A) This multiplication should be avoided whenever possible -- you are effectively squaring your condition number. In the case where you have more rows than columns, use least squares. For square matrices use solve. For large sparse matrices, use GMRES or any of the others available in scipy.sparse.linalg. Regards St?fan From sebastian.walter at gmail.com Mon May 18 04:35:07 2009 From: sebastian.walter at gmail.com (Sebastian Walter) Date: Mon, 18 May 2009 10:35:07 +0200 Subject: [Numpy-discussion] linear algebra help In-Reply-To: <9457e7c80905180121w68216bb4xdfe0ac2da1ff7bff@mail.gmail.com> References: <4A0F436E.4020900@american.edu> <1cd32cbb0905171021x6bbb31a2n818bc25e091a4c9e@mail.gmail.com> <9457e7c80905180121w68216bb4xdfe0ac2da1ff7bff@mail.gmail.com> Message-ID: 2009/5/18 St?fan van der Walt : > 2009/5/18 Sebastian Walter : >> B = numpy.dot(A.T, A) > > This multiplication should be avoided whenever possible -- you are > effectively squaring your condition number. Indeed. > > In the case where you have more rows than columns, use least squares. > For square matrices use solve. For large sparse matrices, use GMRES > or any of the others available in scipy.sparse.linalg. It is my impression that this is a linear algebra and not a numerics question. 
> > Regards > St?fan > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From rjsteed at talk21.com Mon May 18 06:48:04 2009 From: rjsteed at talk21.com (rob steed) Date: Mon, 18 May 2009 10:48:04 +0000 (GMT) Subject: [Numpy-discussion] Problem with correlate Message-ID: <294586.51207.qm@web86006.mail.ird.yahoo.com> Hi all, I have been using numpy.correlate and was finding something weird. I now think that there might be a bug. Correlations should be order dependent eg. correlate(x,y) != correlate(y,x) in general (whereas convolutions are symmetric) >>> import numpy as N >>> x = N.array([1,0,0]) >>> y = N.array([0,0,1]) >>> N.correlate(x,y,'full') array([1, 0, 0, 0, 0]) >>> N.correlate(y,x,'full') array([0, 0, 0, 0, 1]) This works fine. However, if the arrays have different lengths, we get a problem. >>> y2=N.array([0,0,0,1]) >>> N.correlate(x,y2,'full') array([0, 0, 0, 0, 0, 1]) >>> N.correlate(y2,x,'full') array([0, 0, 0, 0, 0, 1]) I believe that somewhere in the code, the arrays are re-ordered by their length. Initially I thought that this was because correlate was deriving from convolution but looking at numpy.core, I can see that in fact convolution derives from correlate. After that, it becomes C code which I haven't managed to look at yet. Am I correct, is this a bug? regards Rob Steed From stefan at sun.ac.za Mon May 18 08:38:37 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Mon, 18 May 2009 14:38:37 +0200 Subject: [Numpy-discussion] Problem with correlate In-Reply-To: <294586.51207.qm@web86006.mail.ird.yahoo.com> References: <294586.51207.qm@web86006.mail.ird.yahoo.com> Message-ID: <9457e7c80905180538j287740eak2f30f6381992b4c8@mail.gmail.com> 2009/5/18 rob steed : > This works fine. However, if the arrays have different lengths, we get a problem. > >>>> y2=N.array([0,0,0,1]) >>>> N.correlate(x,y2,'full') This looks like a bug to me. In [54]: N.correlate([1, 0, 0, 0], [0, 0, 0, 1],'full') Out[54]: array([1, 0, 0, 0, 0, 0, 0]) In [55]: N.correlate([1, 0, 0, 0, 0], [0, 0, 0, 1],'full') Out[55]: array([1, 0, 0, 0, 0, 0, 0, 0]) In [56]: N.correlate([1, 0, 0, 0, 0], [0, 0, 0, 0, 1],'full') Out[56]: array([1, 0, 0, 0, 0, 0, 0, 0, 0]) In [57]: N.correlate([1, 0, 0, 0, 0], [0, 0, 0, 0, 0, 1],'full') Out[57]: array([0, 0, 0, 0, 0, 0, 0, 0, 0, 1]) Regards St?fan From darkgl0w at yahoo.com Mon May 18 08:37:09 2009 From: darkgl0w at yahoo.com (Cristi Constantin) Date: Mon, 18 May 2009 05:37:09 -0700 (PDT) Subject: [Numpy-discussion] Overlap arrays with "transparency" Message-ID: <739378.72909.qm@web52104.mail.re2.yahoo.com> Good day. I am working on this algorithm for a few weeks now, so i tried almost everything... I want to overlap / overwrite 2 matrices, but completely ignore some values (in this case ignore 0) Let me explain: a = [ [1, 2, 3, 4, 5], [9,7], [0,0,0,0,0], [5,5,5] ] b = [ [0,0,9,9], [1,1,1,1], [2,2,2,2] ] Then, we have: a over b = [ [1,2,3,4,5], [9,7,1,1], [1,1,1,1,0], [5,5,5,2] ] b over a = [ [0,0,9,9,5], 1,1,1,1], 2,2,2,2,0], 5,5,5] ] That means, completely overwrite one list of arrays over the other, not matter what values one has, not matter the size, just ignore 0 values on overwriting. I checked the documentation, i just need some tips. TempA = [[]] # One For Cicle in here to get the Element data... ??? Data = vElem.data???????????????? # This is a list of numpy ndarrays. ??? # ??? 
for nr_row in range( len(Data) ): # For each numpy ndarray (row) in Data. ??????? # ??????? NData = Data[nr_row]?????????????????? # New data, to be written over old data. ??????? OData = TempA[nr_row:nr_row+1] or [[]] # This is old data. Can be numpy ndarray, or empty list. ??????? OData = OData[0] ??????? # ??????? # NData must completely eliminate transparent pixels... here comes the algorithm... No algorithm yet. ??????? # ??????? if len(NData) >= len(OData): ??????????? # If new data is longer than old data, old data will be completely overwritten. ??????????? TempA[nr_row:nr_row+1] = [NData] ??????? else: # Old data is longer than new data ; old data cannot be null. ??????????? TempB = np.copy(OData) ??????????? TempB.put( range(len(NData)), NData ) ??????????? #TempB[0:len(NData)-1] = NData # This returns "ValueError: shape mismatch: objects cannot be broadcast to a single shape" ??????????? TempA[nr_row:nr_row+1] = [TempB] ??????????? del TempB ??????? # ??? # # The result is stored inside TempA as list of numpy arrays. I would use 2D arrays, but they are slower than Python Lists containing Numpy arrays. I need to do this overwrite in a very big loop and every delay is very important. I tried to create a masked array where all "zero" values are ignored on overlap, but it doesn't work. Masked or not, the "transparent" values are still overwritten. Please, any suggestion is useful. Thank you. -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon May 18 10:21:20 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 18 May 2009 10:21:20 -0400 Subject: [Numpy-discussion] Problem with correlate In-Reply-To: <9457e7c80905180538j287740eak2f30f6381992b4c8@mail.gmail.com> References: <294586.51207.qm@web86006.mail.ird.yahoo.com> <9457e7c80905180538j287740eak2f30f6381992b4c8@mail.gmail.com> Message-ID: <1cd32cbb0905180721u5426cea2ibc79cc4655f679fc@mail.gmail.com> 2009/5/18 St?fan van der Walt : > 2009/5/18 rob steed : >> This works fine. However, if the arrays have different lengths, we get a problem. >> >>>>> y2=N.array([0,0,0,1]) >>>>> N.correlate(x,y2,'full') > > This looks like a bug to me. > > In [54]: N.correlate([1, 0, 0, 0], [0, 0, 0, 1],'full') > Out[54]: array([1, 0, 0, 0, 0, 0, 0]) > > In [55]: N.correlate([1, 0, 0, 0, 0], [0, 0, 0, 1],'full') > Out[55]: array([1, 0, 0, 0, 0, 0, 0, 0]) > > In [56]: N.correlate([1, 0, 0, 0, 0], [0, 0, 0, 0, 1],'full') > Out[56]: array([1, 0, 0, 0, 0, 0, 0, 0, 0]) > > In [57]: N.correlate([1, 0, 0, 0, 0], [0, 0, 0, 0, 0, 1],'full') > Out[57]: array([0, 0, 0, 0, 0, 0, 0, 0, 0, 1]) > comparing with scipy: signal.correlate behaves the same "flipping" way as np.correlate, ndimage.correlate keeps the orientation. 
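
One way to sidestep the length-dependent reordering while the bug is open, assuming (as Rob suspects) the swap only happens when the inputs have different lengths: zero-pad the shorter input so both sequences are equally long before calling correlate. The interactive comparisons that follow continue the report above; the sketch below is only an illustration of the padding idea.

import numpy as np

x = np.array([1, 0, 0])
y2 = np.array([0, 0, 0, 1])

# pad the shorter input with zeros so both have the same length;
# with equal lengths np.correlate keeps the argument order
n = max(len(x), len(y2))
xp = np.concatenate((x, np.zeros(n - len(x), dtype=x.dtype)))
y2p = np.concatenate((y2, np.zeros(n - len(y2), dtype=y2.dtype)))

c_xy = np.correlate(xp, y2p, 'full')   # order-dependent again
c_yx = np.correlate(y2p, xp, 'full')
# the padding only adds zero-valued lags at the ends of the output
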
>>> np.correlate([1, 2, 0, 0, 0], [0, 0, 1, 0, 0,0],'same') array([0, 0, 0, 2, 1, 0]) >>> np.correlate([1, 2, 0, 0, 0], [0, 0, 1, 0, 0],'same') array([1, 2, 0, 0, 0]) >>> np.correlate([1, 2, 0, 0, 0], [0, 0, 1, 0, 0, 0],'full') array([0, 0, 0, 0, 0, 2, 1, 0, 0, 0]) >>> np.correlate([1, 2, 0, 0, 0], [0, 0, 1, 0, 0],'full') array([0, 0, 1, 2, 0, 0, 0, 0, 0]) >>> >>> signal.correlate([1, 2, 0, 0, 0], [0, 0, 1, 0, 0, 0]) array([0, 0, 0, 0, 0, 2, 1, 0, 0, 0]) >>> signal.correlate([1, 2, 0, 0, 0], [0, 0, 1, 0, 0]) array([0, 0, 1, 2, 0, 0, 0, 0, 0]) >>> ndimage.filters.correlate([1, 2, 0, 0, 0], [0, 0, 1, 0, 0, 0],mode='constant') array([0, 1, 2, 0, 0]) >>> ndimage.filters.correlate([1, 2, 0, 0, 0], [0, 0, 1, 0, 0],mode='constant') array([1, 2, 0, 0, 0]) Josef From millman at berkeley.edu Mon May 18 10:47:45 2009 From: millman at berkeley.edu (Jarrod Millman) Date: Mon, 18 May 2009 07:47:45 -0700 Subject: [Numpy-discussion] SciPy 2009 Call for Papers Message-ID: ========================== SciPy 2009 Call for Papers ========================== SciPy 2009, the 8th Python in Science conference, will be held from August 18-23, 2009 at Caltech in Pasadena, CA, USA. Each year SciPy attracts leading figures in research and scientific software development with Python from a wide range of scientific and engineering disciplines. The focus of the conference is both on scientific libraries and tools developed with Python and on scientific or engineering achievements using Python. We welcome contributions from the industry as well as the academic world. Indeed, industrial research and development as well academic research face the challenge of mastering IT tools for exploration, modeling and analysis. We look forward to hearing your recent breakthroughs using Python! Submission of Papers ==================== The program features tutorials, contributed papers, lightning talks, and bird-of-a-feather sessions. We are soliciting talks and accompanying papers (either formal academic or magazine-style articles) that discuss topics which center around scientific computing using Python. These include applications, teaching, future development directions, and research. A collection of peer-reviewed articles will be published as part of the proceedings. Proposals for talks are submitted as extended abstracts. There are two categories of talks: Paper presentations These talks are 35 minutes in duration (including questions). A one page abstract of no less than 500 words (excluding figures and references) should give an outline of the final paper. Proceeding papers are due two weeks after the conference, and may be in a formal academic style, or in a more relaxed magazine-style format. Rapid presentations These talks are 10 minutes in duration. An abstract of between 300 and 700 words should describe the topic and motivate its relevance to scientific computing. In addition, there will be an open session for lightning talks during which any attendee willing to do so is invited to do a couple-of-minutes-long presentation. If you wish to present a talk at the conference, please create an account on the website (http://conference.scipy.org). You may then submit an abstract by logging in, clicking on your profile and following the "Submit an abstract" link. Submission Guidelines --------------------- * Submissions should be uploaded via the online form. * Submissions whose main purpose is to promote a commercial product or service will be refused. 
* All accepted proposals must be presented at the SciPy conference by at least one author. * Authors of an accepted proposal can provide a final paper for publication in the conference proceedings. Final papers are limited to 7 pages, including diagrams, figures, references, and appendices. The papers will be reviewed to help ensure the high-quality of the proceedings. For further information, please visit the conference homepage: http://conference.scipy.org. Important Dates =============== * Friday, June 26: Abstracts Due * Saturday, July 4: Announce accepted talks, post schedule * Friday, July 10: Early Registration ends * Tuesday-Wednesday, August 18-19: Tutorials * Thursday-Friday, August 20-21: Conference * Saturday-Sunday, August 22-23: Sprints * Friday, September 4: Papers for proceedings due Tutorials ========= Two days of tutorials to the scientific Python tools will precede the conference. There will be two tracks: one for introduction of the basic tools to beginners and one for more advanced tools. Tutorials will be announced later. Birds of a Feather Sessions =========================== If you wish to organize a birds-of-a-feather session to discuss some specific area of scientific development with Python, please contact the organizing committee. Executive Committee =================== * Jarrod Millman, UC Berkeley, USA (Conference Chair) * Ga?l Varoquaux, INRIA Saclay, France (Program Co-Chair) * St?fan van der Walt, University of Stellenbosch, South Africa (Program Co-Chair) * Fernando P?rez, UC Berkeley, USA (Tutorial Chair) From charlesr.harris at gmail.com Mon May 18 10:55:30 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 18 May 2009 08:55:30 -0600 Subject: [Numpy-discussion] linear algebra help In-Reply-To: <9457e7c80905180121w68216bb4xdfe0ac2da1ff7bff@mail.gmail.com> References: <4A0F436E.4020900@american.edu> <1cd32cbb0905171021x6bbb31a2n818bc25e091a4c9e@mail.gmail.com> <9457e7c80905180121w68216bb4xdfe0ac2da1ff7bff@mail.gmail.com> Message-ID: 2009/5/18 St?fan van der Walt > 2009/5/18 Sebastian Walter : > > B = numpy.dot(A.T, A) > > This multiplication should be avoided whenever possible -- you are > effectively squaring your condition number. > Although the condition number doesn't mean much unless the columns are normalized. Having badly scaled columns can lead to problems with lstsq because of its default cutoff based on the condition number. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon May 18 11:35:41 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 18 May 2009 11:35:41 -0400 Subject: [Numpy-discussion] linear algebra help In-Reply-To: References: <4A0F436E.4020900@american.edu> <1cd32cbb0905171021x6bbb31a2n818bc25e091a4c9e@mail.gmail.com> <9457e7c80905180121w68216bb4xdfe0ac2da1ff7bff@mail.gmail.com> Message-ID: <1cd32cbb0905180835x7002255ev3fe53ae9ff1b2f0b@mail.gmail.com> On Mon, May 18, 2009 at 10:55 AM, Charles R Harris wrote: > > > 2009/5/18 St?fan van der Walt >> >> 2009/5/18 Sebastian Walter : >> > B = numpy.dot(A.T, A) >> >> This multiplication should be avoided whenever possible -- you are >> effectively squaring your condition number. > > Although the condition number doesn't mean much unless the columns are > normalized. Having badly scaled columns can lead to problems with lstsq > because of its default cutoff based on the condition number. 
> > Chuck Do you know if any of the linalg methods, np.linalg.lstsq or scipy.linalg.lstsq, do any normalization internally to improve numerical accuracy? I saw automatic internal normalization (e.g. rescaling) for some econometrics methods, and was wondering whether we should do this also in stats.models or whether scipy.linalg is already taking care of this. I have only vague knowledge of the numerical precision of different linear algebra methods. Thanks, Josef From michael.s.gilbert at gmail.com Mon May 18 11:38:25 2009 From: michael.s.gilbert at gmail.com (Michael S. Gilbert) Date: Mon, 18 May 2009 11:38:25 -0400 Subject: [Numpy-discussion] Overlap arrays with "transparency" In-Reply-To: <739378.72909.qm@web52104.mail.re2.yahoo.com> References: <739378.72909.qm@web52104.mail.re2.yahoo.com> Message-ID: <20090518113825.2e360839.michael.s.gilbert@gmail.com> On Mon, 18 May 2009 05:37:09 -0700 (PDT), Cristi Constantin wrote: > Good day. > I am working on this algorithm for a few weeks now, so i tried almost everything... > I want to overlap / overwrite 2 matrices, but completely ignore some values (in this case ignore 0) > Let me explain: > > a = [ > [1, 2, 3, 4, 5], > [9,7], > [0,0,0,0,0], > [5,5,5] ] > > b = [ > [0,0,9,9], > [1,1,1,1], > [2,2,2,2] ] > > Then, we have: > > a over b = [ > [1,2,3,4,5], > [9,7,1,1], > [1,1,1,1,0], > [5,5,5,2] ] > > b over a = [ > [0,0,9,9,5], > 1,1,1,1], > 2,2,2,2,0], > 5,5,5] ] > > That means, completely overwrite one list of arrays over the other, not matter what values one has, not matter the size, just ignore 0 values on overwriting. > I checked the documentation, i just need some tips. > > TempA = [[]] > # > One For Cicle in here to get the Element data... > ??? Data = vElem.data???????????????? # This is a list of numpy ndarrays. > ??? # > ??? for nr_row in range( len(Data) ): # For each numpy ndarray (row) in Data. > ??????? # > ??????? NData = Data[nr_row]?????????????????? # New data, to be written over old data. > ??????? OData = TempA[nr_row:nr_row+1] or [[]] # This is old data. Can be numpy ndarray, or empty list. > ??????? OData = OData[0] > ??????? # > ??????? # NData must completely eliminate transparent pixels... here comes the algorithm... No algorithm yet. > ??????? # > ??????? if len(NData) >= len(OData): > ??????????? # If new data is longer than old data, old data will be completely overwritten. > ??????????? TempA[nr_row:nr_row+1] = [NData] > ??????? else: # Old data is longer than new data ; old data cannot be null. > ??????????? TempB = np.copy(OData) > ??????????? TempB.put( range(len(NData)), NData ) > ??????????? #TempB[0:len(NData)-1] = NData # This returns "ValueError: shape mismatch: objects cannot be broadcast to a single shape" > ??????????? TempA[nr_row:nr_row+1] = [TempB] > ??????????? del TempB > ??????? # > ??? # > # > The result is stored inside TempA as list of numpy arrays. > > I would use 2D arrays, but they are slower than Python Lists containing Numpy arrays. I need to do this overwrite in a very big loop and every delay is very important. > I tried to create a masked array where all "zero" values are ignored on overlap, but it doesn't work. Masked or not, the "transparent" values are still overwritten. > Please, any suggestion is useful. your code will certainly be slow if you do no preallocate memory for your arrays. and i would suggest using numpy's array class instead of lists. 
a = numpy.array( a ) b = numpy.array( b ) c = numpy.zeros( ( max( ( len(a[:,0]) , len(b[:,0]) ) ) , max( ( len(a[0,:]) , len(b[0,:]) ) ) ) , int ) From charlesr.harris at gmail.com Mon May 18 11:49:17 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 18 May 2009 09:49:17 -0600 Subject: [Numpy-discussion] linear algebra help In-Reply-To: <1cd32cbb0905180835x7002255ev3fe53ae9ff1b2f0b@mail.gmail.com> References: <4A0F436E.4020900@american.edu> <1cd32cbb0905171021x6bbb31a2n818bc25e091a4c9e@mail.gmail.com> <9457e7c80905180121w68216bb4xdfe0ac2da1ff7bff@mail.gmail.com> <1cd32cbb0905180835x7002255ev3fe53ae9ff1b2f0b@mail.gmail.com> Message-ID: On Mon, May 18, 2009 at 9:35 AM, wrote: > On Mon, May 18, 2009 at 10:55 AM, Charles R Harris > wrote: > > > > > > 2009/5/18 Stéfan van der Walt > >> > >> 2009/5/18 Sebastian Walter : > >> > B = numpy.dot(A.T, A) > >> > >> This multiplication should be avoided whenever possible -- you are > >> effectively squaring your condition number. > > > > Although the condition number doesn't mean much unless the columns are > > normalized. Having badly scaled columns can lead to problems with lstsq > > because of its default cutoff based on the condition number. > > > > Chuck > > Do you know if any of the linalg methods, np.linalg.lstsq or > scipy.linalg.lstsq, do any normalization internally to improve > numerical accuracy? > - They don't. Although, IIRC, lapack provides routines for doing so. Maybe there is another least squares routine that does the scaling. > > I saw automatic internal normalization (e.g. rescaling) for some > econometrics methods, and was wondering whether we should do this also > in stats.models or whether scipy.linalg is already taking care of > this. I have only vague knowledge of the numerical precision of > different linear algebra methods. > It's a good idea. Otherwise the condition number depends on choice of units and other such extraneous things. Chuck From josef.pktd at gmail.com Mon May 18 11:50:50 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 18 May 2009 11:50:50 -0400 Subject: [Numpy-discussion] Overlap arrays with "transparency" In-Reply-To: <20090518113825.2e360839.michael.s.gilbert@gmail.com> References: <739378.72909.qm@web52104.mail.re2.yahoo.com> <20090518113825.2e360839.michael.s.gilbert@gmail.com> Message-ID: <1cd32cbb0905180850v2b0b2dedr71051f855463ddac@mail.gmail.com> On Mon, May 18, 2009 at 11:38 AM, Michael S. Gilbert wrote: > On Mon, 18 May 2009 05:37:09 -0700 (PDT), Cristi Constantin wrote: >> Good day. >> I am working on this algorithm for a few weeks now, so i tried almost everything... >> I want to overlap / overwrite 2 matrices, but completely ignore some values (in this case ignore 0) >> Let me explain: >> >> a = [ >> [1, 2, 3, 4, 5], >> [9,7], >> [0,0,0,0,0], >> [5,5,5] ] >> >> b = [ >> [0,0,9,9], >> [1,1,1,1], >> [2,2,2,2] ] >> >> Then, we have: >> >> a over b = [ >> [1,2,3,4,5], >> [9,7,1,1], >> [1,1,1,1,0], >> [5,5,5,2] ] >> >> b over a = [ >> [0,0,9,9,5], >> 1,1,1,1], >> 2,2,2,2,0], >> 5,5,5] ] >> If you can convert the list of lists to a common rectangular shape (masking missing values or assigning nans), then conditional overwriting is very easy, something like mask = a>0 a[mask] = b[mask] but for lists of lists with unequal shape, there might not be anything faster than looping.
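
A sketch combining the two suggestions above: pad the ragged rows into one rectangular array (the zeros() snippet earlier in the thread looks truncated) and then overwrite through a boolean mask, as josef describes. The value 0 is treated as the "transparent" value, following the original post; whether this matches the poster's exact "a over b" semantics for rows of different length is an assumption.

import numpy as np

a_rows = [[1, 2, 3, 4, 5], [9, 7], [0, 0, 0, 0, 0], [5, 5, 5]]
b_rows = [[0, 0, 9, 9], [1, 1, 1, 1], [2, 2, 2, 2]]

def to_rect(rows, shape):
    # pad ragged rows with the transparent value 0 into a single 2-D array
    out = np.zeros(shape, dtype=int)
    for i, row in enumerate(rows):
        out[i, :len(row)] = row
    return out

nrows = max(len(a_rows), len(b_rows))
ncols = max(max(len(r) for r in a_rows), max(len(r) for r in b_rows))
a = to_rect(a_rows, (nrows, ncols))
b = to_rect(b_rows, (nrows, ncols))

# "b over a": take b wherever b is not transparent, keep a elsewhere
mask = b != 0
result = a.copy()
result[mask] = b[mask]
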
Josef From strozzi2 at llnl.gov Mon May 18 12:21:39 2009 From: strozzi2 at llnl.gov (David J Strozzi) Date: Mon, 18 May 2009 09:21:39 -0700 Subject: [Numpy-discussion] Indexing with callables (was: Yorick-like functionality) In-Reply-To: References: <91cf711d0905151309v5ab033f9r7c3c3e472e74bfff@mail.gmail.com> Message-ID: > >The actual list of Yorick functions relevant here appears to be here: > > http:// yorick.sourceforge.net/manual/yorick_46.php#SEC46 > http:// yorick.sourceforge.net/manual/yorick_47.php#SEC47 > >I must say that I don't see many functions missing in Numpy... > >David (Strozzi): are these the functions you meant? Are there more? > >-- >Pauli Virtanen Paul et al, I see the numpy list is quite active, and I appreciate the discussion I started - and then went silent about! foo(zcen,dif) is indeed syntax sugar, but it can be quite sweet! Haven't you seen how lab rats react to feedings of aspartame or other sweeteners?? Anyway, the list above is indeed what I had in mind. It seems a few are missing, like zcen, and perhaps it's worth it to add to numpy proper rather than have everyone write their own one-liners (and then have to deal w/ it when sharing code). I leave it to the community's wisdom. It seems enough smart people have thought about the issue. I also like pointing out that Yorick was a fast, free environment developed by ~1990, when matlab/IDL were probably the only comparable games in town, but very few people ever used it. I think this is a case study in the triumph of marketing over substance. It looks like num/sci py are gaining enough momentum and visibility. Hopefully the numerical science community won't be re-inventing this same wheel in 5 years.... Dave From pav at iki.fi Mon May 18 14:23:02 2009 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 18 May 2009 18:23:02 +0000 (UTC) Subject: [Numpy-discussion] Indexing with callables (was: Yorick-like functionality) References: <91cf711d0905151309v5ab033f9r7c3c3e472e74bfff@mail.gmail.com> Message-ID: Mon, 18 May 2009 09:21:39 -0700, David J Strozzi wrote: [clip] > I also like pointing out that Yorick was a fast, free environment > developed by ~1990, when matlab/IDL were probably the only comparable > games in town, but very few people ever used it. I think this is a case > study in the triumph of marketing over substance. It looks like num/sci > py are gaining enough momentum and visibility. Hopefully the numerical > science community won't be re-inventing this same wheel in 5 years.... Well, GNU Octave has been around about the same time, and the same for Scilab. Curiously enough, first public version >= 1.0 of all the three seem to have appeared around 1994. [1,2,3] (Maybe something was in the air that year...) So I'd claim this particular wheel has already been reinvented pretty thoroughly :) .. [1] http://ftp.lanet.lv/ftp/mirror/x2ftp/msdos/programming/news/yorick.10 .. [2] http://www.scilab.org/platform/index_platform.php?page=history .. 
[3] http://en.wikipedia.org/wiki/GNU_Octave#History -- Pauli Virtanen From robert.kern at gmail.com Mon May 18 18:22:35 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 18 May 2009 17:22:35 -0500 Subject: [Numpy-discussion] Indexing with callables (was: Yorick-like functionality) In-Reply-To: References: <91cf711d0905151309v5ab033f9r7c3c3e472e74bfff@mail.gmail.com> Message-ID: <3d375d730905181522oc1a868dq5e8f8bef1676990a@mail.gmail.com> On Mon, May 18, 2009 at 13:23, Pauli Virtanen wrote: > Mon, 18 May 2009 09:21:39 -0700, David J Strozzi wrote: > [clip] >> I also like pointing out that Yorick was a fast, free environment >> developed by ~1990, when matlab/IDL were probably the only comparable >> games in town, but very few people ever used it. ?I think this is a case >> study in the triumph of marketing over substance. ?It looks like num/sci >> py are gaining enough momentum and visibility. ?Hopefully the numerical >> science community won't be re-inventing this same wheel in 5 years.... > > Well, GNU Octave has been around about the same time, and the same for > Scilab. Curiously enough, first public version >= 1.0 of all the three > seem to have appeared around 1994. [1,2,3] (Maybe something was in > the air that year...) > > So I'd claim this particular wheel has already been reinvented pretty > thoroughly :) It's worth noting that most of numpy's indexing functionality was stol^H^H^H^Hborrowed from Yorick in ages past: http://mail.python.org/pipermail/matrix-sig/1995-November/000143.html -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From josef.pktd at gmail.com Mon May 18 21:12:04 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 18 May 2009 21:12:04 -0400 Subject: [Numpy-discussion] Indexing with callables (was: Yorick-like functionality) In-Reply-To: <3d375d730905181522oc1a868dq5e8f8bef1676990a@mail.gmail.com> References: <91cf711d0905151309v5ab033f9r7c3c3e472e74bfff@mail.gmail.com> <3d375d730905181522oc1a868dq5e8f8bef1676990a@mail.gmail.com> Message-ID: <1cd32cbb0905181812l276d5db7lf1254139188cf22@mail.gmail.com> On Mon, May 18, 2009 at 6:22 PM, Robert Kern wrote: > On Mon, May 18, 2009 at 13:23, Pauli Virtanen wrote: >> Mon, 18 May 2009 09:21:39 -0700, David J Strozzi wrote: >> [clip] >>> I also like pointing out that Yorick was a fast, free environment >>> developed by ~1990, when matlab/IDL were probably the only comparable >>> games in town, but very few people ever used it. ?I think this is a case >>> study in the triumph of marketing over substance. ?It looks like num/sci >>> py are gaining enough momentum and visibility. ?Hopefully the numerical >>> science community won't be re-inventing this same wheel in 5 years.... >> >> Well, GNU Octave has been around about the same time, and the same for >> Scilab. Curiously enough, first public version >= 1.0 of all the three >> seem to have appeared around 1994. [1,2,3] (Maybe something was in >> the air that year...) >> >> So I'd claim this particular wheel has already been reinvented pretty >> thoroughly :) > > It's worth noting that most of numpy's indexing functionality was > stol^H^H^H^Hborrowed from Yorick in ages past: > > ?http://mail.python.org/pipermail/matrix-sig/1995-November/000143.html > Thanks for the link, an interesting discussion on the origin of array/matrices in python. 
also the end of matrix-sig is interesting http://mail.python.org/pipermail/matrix-sig/2000-February/003292.html I needed to check some history: Gauss and Matlab are more than 10 years older, and S is ancient, way ahead of Python. Josef From Klaus.Noekel at gmx.de Tue May 19 13:07:45 2009 From: Klaus.Noekel at gmx.de (Klaus Noekel) Date: Tue, 19 May 2009 19:07:45 +0200 Subject: [Numpy-discussion] numpy failure under Windows Vista 64 bit Message-ID: <4A12E761.5020801@gmx.de> Today I wanted to experiment with the AMD64 version of numpy. I am using Windows Vista 64 bit. I downloaded and installed today's Python 2.6.2 AMD64 and then numpy 1.3.0 AMD64. Executing "import numpy" (nothing else) yielded the following error message: IDLE 2.6.2 >>> import numpy Warning (from warnings module): File "C:\Python26\lib\site-packages\numpy\core\__init__.py", line 5 import multiarray Warning: Windows 64 bits support is experimental, and only available for testing. You are advised not to use it for production. CRASHES ARE TO BE EXPECTED - PLEASE REPORT THEM TO NUMPY DEVELOPERS Traceback (most recent call last): File "", line 1, in import numpy File "C:\Python26\lib\site-packages\numpy\__init__.py", line 130, in import add_newdocs File "C:\Python26\lib\site-packages\numpy\add_newdocs.py", line 9, in from lib import add_newdoc File "C:\Python26\lib\site-packages\numpy\lib\__init__.py", line 13, in from polynomial import * File "C:\Python26\lib\site-packages\numpy\lib\polynomial.py", line 18, in from numpy.linalg import eigvals, lstsq File "C:\Python26\lib\site-packages\numpy\linalg\__init__.py", line 47, in from linalg import * File "C:\Python26\lib\site-packages\numpy\linalg\linalg.py", line 22, in from numpy.linalg import lapack_lite ImportError: DLL load failed: Das angegebene Modul wurde nicht gefunden. I doubt that the DLL was not physically present and rather suspect a dependency on some other DLL that was missing. The INSTALL.TXT unfortunately was not helpful. Can anybody please explain what other dependencies exist? Anything else I need to install? Thanks a lot! Klaus Noekel Karlsruhe, Germany From david at ar.media.kyoto-u.ac.jp Tue May 19 13:02:42 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 20 May 2009 02:02:42 +0900 Subject: [Numpy-discussion] numpy failure under Windows Vista 64 bit In-Reply-To: <4A12E761.5020801@gmx.de> References: <4A12E761.5020801@gmx.de> Message-ID: <4A12E632.5010109@ar.media.kyoto-u.ac.jp> Klaus Noekel wrote: > I doubt that the DLL was not physically present and rather suspect a > dependency on some other DLL that was missing. The INSTALL.TXT > unfortunately was not helpful. Can anybody please explain what other > dependencies exist? Anything else I need to install? > This exact problem is specific to IDLE - I don't know what triggers it. Today, the best solution for a 64 bits numpy on windows is to built it yourself with MS compilers - the distributed one is built with mingw compilers, and there still seems to be some stability problems with those. Unfortunately, as the mingw debugger does not work either on 64 bits archs, finding the problem is quite hard. cheers, David From dwf at cs.toronto.edu Tue May 19 20:26:02 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Tue, 19 May 2009 20:26:02 -0400 Subject: [Numpy-discussion] build failure with macosx-10.5-fat64-2.6 Message-ID: I've just tried a "fat 64" build (with a Python 2.6.2 that had been built similarly), and I'm getting this weird behaviour. 
The command I used was: CFLAGS="-O3 -Wall -DNDEBUG -g -fwrapv -Wstrict-prototypes -arch x86_64 -arch ppc64" python setup.py build It looks as though for some reason, numpy distutils is executing gcc without any -arch flags, and falling back to the guessed architecture ppc (which in my cases is not even partially correct, as I was building x86_64 and ppc64 only). Building with CC="gcc -arch x86_64 -arch ppc64" fixes things, but I guess this is a bug in numpy distutils if it's not respecting CFLAGS during these config tests. Output is below. Cheers, David ------------ C compiler: gcc -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -O3 - Wall -DNDEBUG -g -fwrapv -Wstrict-prototypes -arch x86_64 -arch ppc64 compile options: '-Inumpy/core/src -Inumpy/core/src/multiarray -Inumpy/ core/src/umath -Inumpy/core/include -I/Library/Frameworks/ Python64.framework/Versions/2.6/include/python2.6 -c' gcc: _configtest.c _configtest.c:1: warning: conflicting types for built-in function ?exp? _configtest.c:1: warning: conflicting types for built-in function ?exp? gcc _configtest.o -o _configtest ld warning: in _configtest.o, missing required architecture ppc in file Undefined symbols: "_main", referenced from: start in crt1.10.5.o ld: symbol(s) not found collect2: ld returned 1 exit status ld warning: in _configtest.o, missing required architecture ppc in file Undefined symbols: "_main", referenced from: start in crt1.10.5.o ld: symbol(s) not found collect2: ld returned 1 exit status failure. removing: _configtest.c _configtest.o C compiler: gcc -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -O3 - Wall -DNDEBUG -g -fwrapv -Wstrict-prototypes -arch x86_64 -arch ppc64 compile options: '-Inumpy/core/src -Inumpy/core/src/multiarray -Inumpy/ core/src/umath -Inumpy/core/include -I/Library/Frameworks/ Python64.framework/Versions/2.6/include/python2.6 -c' gcc: _configtest.c _configtest.c:1: warning: conflicting types for built-in function ?exp? _configtest.c:1: warning: conflicting types for built-in function ?exp? gcc _configtest.o -lm -o _configtest ld warning: in _configtest.o, missing required architecture ppc in file Undefined symbols: "_main", referenced from: start in crt1.10.5.o ld: symbol(s) not found collect2: ld returned 1 exit status ld warning: in _configtest.o, missing required architecture ppc in file Undefined symbols: "_main", referenced from: start in crt1.10.5.o ld: symbol(s) not found collect2: ld returned 1 exit status failure. removing: _configtest.c _configtest.o C compiler: gcc -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -O3 - Wall -DNDEBUG -g -fwrapv -Wstrict-prototypes -arch x86_64 -arch ppc64 compile options: '-Inumpy/core/src -Inumpy/core/src/multiarray -Inumpy/ core/src/umath -Inumpy/core/include -I/Library/Frameworks/ Python64.framework/Versions/2.6/include/python2.6 -c' gcc: _configtest.c _configtest.c:1: warning: conflicting types for built-in function ?exp? _configtest.c:1: warning: conflicting types for built-in function ?exp? gcc _configtest.o -lcpml -o _configtest ld: library not found for -lcpml collect2: ld returned 1 exit status ld: library not found for -lcpml collect2: ld returned 1 exit status failure. 
removing: _configtest.c _configtest.o Traceback (most recent call last): File "setup.py", line 172, in setup_package() File "setup.py", line 165, in setup_package configuration=configuration ) File "/Users/dwf/src/numpy-svn/numpy/distutils/core.py", line 184, in setup return old_setup(**new_attr) File "/Library/Frameworks/Python64.framework/Versions/2.6/lib/ python2.6/distutils/core.py", line 152, in setup dist.run_commands() File "/Library/Frameworks/Python64.framework/Versions/2.6/lib/ python2.6/distutils/dist.py", line 975, in run_commands self.run_command(cmd) File "/Library/Frameworks/Python64.framework/Versions/2.6/lib/ python2.6/distutils/dist.py", line 995, in run_command cmd_obj.run() File "/Users/dwf/src/numpy-svn/numpy/distutils/command/build.py", line 37, in run old_build.run(self) File "/Library/Frameworks/Python64.framework/Versions/2.6/lib/ python2.6/distutils/command/build.py", line 134, in run self.run_command(cmd_name) File "/Library/Frameworks/Python64.framework/Versions/2.6/lib/ python2.6/distutils/cmd.py", line 333, in run_command self.distribution.run_command(command) File "/Library/Frameworks/Python64.framework/Versions/2.6/lib/ python2.6/distutils/dist.py", line 995, in run_command cmd_obj.run() File "/Users/dwf/src/numpy-svn/numpy/distutils/command/ build_src.py", line 130, in run self.build_sources() File "/Users/dwf/src/numpy-svn/numpy/distutils/command/ build_src.py", line 147, in build_sources self.build_extension_sources(ext) File "/Users/dwf/src/numpy-svn/numpy/distutils/command/ build_src.py", line 250, in build_extension_sources sources = self.generate_sources(sources, ext) File "/Users/dwf/src/numpy-svn/numpy/distutils/command/ build_src.py", line 307, in generate_sources source = func(extension, build_dir) File "numpy/core/setup.py", line 337, in generate_config_h mathlibs = check_mathlib(config_cmd) File "numpy/core/setup.py", line 281, in check_mathlib raise EnvironmentError("math library missing; rerun " EnvironmentError: math library missing; rerun setup.py after setting the MATHLIB env variable From darkgl0w at yahoo.com Wed May 20 07:24:43 2009 From: darkgl0w at yahoo.com (Cristi Constantin) Date: Wed, 20 May 2009 04:24:43 -0700 (PDT) Subject: [Numpy-discussion] Overlap arrays with "transparency" Message-ID: <307808.38061.qm@web52101.mail.re2.yahoo.com> Thank you for your help. :) I used this : try: NData[ (NData==transparent)[:len(OData)] ] = OData[ (NData==transparent)[:len(OData)] ] except: pass That means overwrite all "transparent" data from NData with valid data from OData. I am sure it's not the best method yet, but it's the only one that works. --- On Mon, 5/18/09, Cristi Constantin wrote: From: Cristi Constantin Subject: [Numpy-discussion] Overlap arrays with "transparency" To: "Numpy Discussion" Date: Monday, May 18, 2009, 5:37 AM Good day. I am working on this algorithm for a few weeks now, so i tried almost everything... I want to overlap / overwrite 2 matrices, but completely ignore some values (in this case ignore 0) Let me explain: a = [ [1, 2, 3, 4, 5], [9,7], [0,0,0,0,0], [5,5,5] ] b = [ [0,0,9,9], [1,1,1,1], [2,2,2,2] ] Then, we have: a over b = [ [1,2,3,4,5], [9,7,1,1], [1,1,1,1,0], [5,5,5,2] ] b over a = [ [0,0,9,9,5], 1,1,1,1], 2,2,2,2,0], 5,5,5] ] That means, completely overwrite one list of arrays over the other, not matter what values one has, not matter the size, just ignore 0 values on overwriting. I checked the documentation, i just need some tips. TempA = [[]] # One For Cicle in here to get the Element data... ??? 
Data = vElem.data???????????????? # This is a list of numpy ndarrays. ??? # ??? for nr_row in range( len(Data) ): # For each numpy ndarray (row) in Data. ??????? # ??????? NData = Data[nr_row]?????????????????? # New data, to be written over old data. ??????? OData = TempA[nr_row:nr_row+1] or [[]] # This is old data. Can be numpy ndarray, or empty list. ??????? OData = OData[0] ??????? # ??????? # NData must completely eliminate transparent pixels... here comes the algorithm... No algorithm yet. ??????? # ??????? if len(NData) >= len(OData): ??????????? # If new data is longer than old data, old data will be completely overwritten. ??????????? TempA[nr_row:nr_row+1] = [NData] ??????? else: # Old data is longer than new data ; old data cannot be null. ??????????? TempB = np.copy(OData) ??????????? TempB.put( range(len(NData)), NData ) ??????????? #TempB[0:len(NData)-1] = NData # This returns "ValueError: shape mismatch: objects cannot be broadcast to a single shape" ??????????? TempA[nr_row:nr_row+1] = [TempB] ??????????? del TempB ??????? # ??? # # The result is stored inside TempA as list of numpy arrays. I would use 2D arrays, but they are slower than Python Lists containing Numpy arrays. I need to do this overwrite in a very big loop and every delay is very important. I tried to create a masked array where all "zero" values are ignored on overlap, but it doesn't work. Masked or not, the "transparent" values are still overwritten. Please, any suggestion is useful. Thank you. -------------- next part -------------- An HTML attachment was scrubbed... URL: From darkgl0w at yahoo.com Wed May 20 07:53:25 2009 From: darkgl0w at yahoo.com (Cristi Constantin) Date: Wed, 20 May 2009 04:53:25 -0700 (PDT) Subject: [Numpy-discussion] Rotate Left and Rotate Right Message-ID: <381631.79275.qm@web52107.mail.re2.yahoo.com> Good day, me again. I have this string of data : String = 'i want\nto go\nto the\nbeach'. I want to rotate this data to left, or to right, after is split it after '\n'. Note that it's important to use 'U' array, because i might have unicode characters in this string. So normal, the text is: i want to go to the beach Rotated right would be: btti eoo a? w ctga hhon ?e t Rotated left would be: t e nohh agtc w? a ?ooe ittb There are a few methods i guess could be used. Split like: [np.array([j for j in i],'U') for i in String.split('\n')] => ??? [ array([u'i', u' ', u'w', u'a', u'n', u't'], dtype=' array([u'i want', u'to go', u'to the', u'beach'], dtype=' From josef.pktd at gmail.com Wed May 20 09:11:53 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 20 May 2009 09:11:53 -0400 Subject: [Numpy-discussion] Rotate Left and Rotate Right In-Reply-To: <381631.79275.qm@web52107.mail.re2.yahoo.com> References: <381631.79275.qm@web52107.mail.re2.yahoo.com> Message-ID: <1cd32cbb0905200611k7cce660ao2869eaaf74b598a9@mail.gmail.com> On Wed, May 20, 2009 at 7:53 AM, Cristi Constantin wrote: > Good day, me again. > I have this string of data : > String = 'i want\nto go\nto the\nbeach'. > > I want to rotate this data to left, or to right, after is split it after > '\n'. > Note that it's important to use 'U' array, because i might have unicode > characters in this string. > > So normal, the text is: > i want > to go > to the > beach > > Rotated right would be: > btti > eoo > a? w > ctga > hhon > ?e t > > Rotated left would be: > t e > nohh > agtc > w? a > ?ooe > ittb > > There are a few methods i guess could be used. 
> > Split like: [np.array([j for j in i],'U') for i in String.split('\n')] => > ??? [ array([u'i', u' ', u'w', u'a', u'n', u't'], dtype=' ??? array([u't', u'o', u' ', u'g', u'o'], dtype=' ??? array([u't', u'o', u' ', u't', u'h', u'e'], dtype=' ??? array([u'b', u'e', u'a', u'c', u'h'], dtype=' > Or split like: np.array( String.split('\n'), 'U' ), => array([u'i want', > u'to go', u'to the', u'beach'], dtype=' Both methods are impossible to use with my knowledge. > > Without Numpy, if i want to rotate to right: i should reverse a list of > splited by '\n' strings, save max length for all lines, align all lines to > max len, rotate the square with map( lambda *row: [elem for elem in row], > *Content ), then join the resulted sub-lists, unite with '\n' and return. > > I need to know if there is a faster Numpy approach... maybe explode the > string by '\n' and rotate fast, or something? > > Thank you very much, in advance. > If you have many lines of different length, then using sparse matrices might be useful. scipy\maxentropy\examples might be useful to look at. I don't know much about language processing but the following seems to work, using 1 character arrays: Josef >>> ss = u'i want\nto go\nto the\nbeach' >>> lines = ss.split('\n') >>> max(len(line) for line in lines) 6 >>> maxl = max(len(line) for line in lines) >>> uarr = np.zeros((len(lines),maxl),dtype='>> uarr array([[u'', u'', u'', u'', u'', u''], [u'', u'', u'', u'', u'', u''], [u'', u'', u'', u'', u'', u''], [u'', u'', u'', u'', u'', u'']], dtype='>> for i,line in enumerate(lines): uarr[i,:len(line)] = [j for j in line] >>> uarr array([[u'i', u' ', u'w', u'a', u'n', u't'], [u't', u'o', u' ', u'g', u'o', u''], [u't', u'o', u' ', u't', u'h', u'e'], [u'b', u'e', u'a', u'c', u'h', u'']], dtype='>> uarr[::-1,::-1] array([[u'', u'h', u'c', u'a', u'e', u'b'], [u'e', u'h', u't', u' ', u'o', u't'], [u'', u'o', u'g', u' ', u'o', u't'], [u't', u'n', u'a', u'w', u' ', u'i']], dtype='>> uarr[:,::-1].T # rotate left ? array([[u't', u'', u'e', u''], [u'n', u'o', u'h', u'h'], [u'a', u'g', u't', u'c'], [u'w', u' ', u' ', u'a'], [u' ', u'o', u'o', u'e'], [u'i', u't', u't', u'b']], dtype='>> uarr[::-1,:].T # rotate right ? array([[u'b', u't', u't', u'i'], [u'e', u'o', u'o', u' '], [u'a', u' ', u' ', u'w'], [u'c', u't', u'g', u'a'], [u'h', u'h', u'o', u'n'], [u'', u'e', u'', u't']], dtype=' References: <307808.38061.qm@web52101.mail.re2.yahoo.com> Message-ID: <1cd32cbb0905200624o458dfa4ao7c1f77dd390bffcb@mail.gmail.com> On Wed, May 20, 2009 at 7:24 AM, Cristi Constantin wrote: > Thank you for your help. :) > > I used this : > try: NData[ (NData==transparent)[:len(OData)] ] = OData[ (NData==transparent)[:len(OData)] ] > except: pass > > That means overwrite all "transparent" data from NData with valid data from > OData. > I am sure it's not the best method yet, but it's the only one that works. > just some quick comments: If you assign (NData==transparent)[:len(OData)] to a temporary variable, then you don't need to calculate this twice. It would save a bit of time. Catching all exceptions with ``except: pass`` is definitely discouraged. 
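
A small sketch of the two comments above, assuming NData and OData are 1-D numpy arrays and transparent is the value to ignore; the boolean mask is built once, and the bare except is replaced by an explicit length check.

import numpy as np

# placeholder arrays standing in for the poster's NData / OData
NData = np.array([0, 5, 0, 7, 0, 9])
OData = np.array([1, 2, 3, 4])
transparent = 0

n = min(len(NData), len(OData))        # guard against unequal lengths
mask = NData[:n] == transparent        # build the mask only once
NData[:n][mask] = OData[:n][mask]      # old values survive only where new data is transparent
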
Josef From klaus.noekel at gmx.de Wed May 20 09:36:27 2009 From: klaus.noekel at gmx.de (=?iso-8859-1?Q?=22Klaus_N=F6kel=22?=) Date: Wed, 20 May 2009 15:36:27 +0200 Subject: [Numpy-discussion] numpy failure under Windows Vista 64 bit In-Reply-To: References: Message-ID: <20090520133627.148820@gmx.net> David, > > Klaus Noekel wrote: > > I doubt that the DLL was not physically present and rather suspect a > > dependency on some other DLL that was missing. The INSTALL.TXT > > unfortunately was not helpful. Can anybody please explain what other > > dependencies exist? Anything else I need to install? > > > > This exact problem is specific to IDLE - I don't know what triggers it. > Today, the best solution for a 64 bits numpy on windows is to built it > yourself with MS compilers - the distributed one is built with mingw > compilers, and there still seems to be some stability problems with > those. Unfortunately, as the mingw debugger does not work either on 64 > bits archs, finding the problem is quite hard. > I don't believe that the problem is specific to IDLE. Python also crashes when I put nothing but "import numpy" in a file and execute it with python.exe. Regarding the note on building numpy myself: the discussion in this forum scared me a little, because of the challenge to build LAPACK with a compatible Fortran compiler etc. That and the fact that I do not have MSVC 2008 (only 2005) keeps me from trying it. Any chance that a MS-based installer will materialize soon? Or are there any mingw-specific runtime libraries that I need to install so that the mingw-based numpy works? Thanks for your help! Klaus Noekel -- Neu: GMX FreeDSL Komplettanschluss mit DSL 6.000 Flatrate + Telefonanschluss f?r nur 17,95 Euro/mtl.!* http://dslspecial.gmx.de/freedsl-aktionspreis/?ac=OM.AD.PD003K11308T4569a From bsouthey at gmail.com Wed May 20 09:42:21 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Wed, 20 May 2009 08:42:21 -0500 Subject: [Numpy-discussion] numpy failure under Windows Vista 64 bit In-Reply-To: <20090520133627.148820@gmx.net> References: <20090520133627.148820@gmx.net> Message-ID: <4A1408BD.8080105@gmail.com> Klaus N?kel wrote: > David, > > >> Klaus Noekel wrote: >> >>> I doubt that the DLL was not physically present and rather suspect a >>> dependency on some other DLL that was missing. The INSTALL.TXT >>> unfortunately was not helpful. Can anybody please explain what other >>> dependencies exist? Anything else I need to install? >>> >>> >> This exact problem is specific to IDLE - I don't know what triggers it. >> Today, the best solution for a 64 bits numpy on windows is to built it >> yourself with MS compilers - the distributed one is built with mingw >> compilers, and there still seems to be some stability problems with >> those. Unfortunately, as the mingw debugger does not work either on 64 >> bits archs, finding the problem is quite hard. >> >> > > I don't believe that the problem is specific to IDLE. Python also crashes when I put nothing but "import numpy" in a file and execute it with python.exe. > > Regarding the note on building numpy myself: the discussion in this forum scared me a little, because of the challenge to build LAPACK with a compatible Fortran compiler etc. That and the fact that I do not have MSVC 2008 (only 2005) keeps me from trying it. Any chance that a MS-based installer will materialize soon? Or are there any mingw-specific runtime libraries that I need to install so that the mingw-based numpy works? > > Thanks for your help! 
> Klaus Noekel > > > > Hi, I also see this. What version of Python specific of Python are you using? I got the same with Python 2.6.1 so perhaps I need to downgrade to Python 2.6.0? Bruce From oliphant at enthought.com Wed May 20 10:45:12 2009 From: oliphant at enthought.com (Travis Oliphant) Date: Wed, 20 May 2009 09:45:12 -0500 Subject: [Numpy-discussion] Join us for "Scientific Computing with Python Webinar" References: <1437076956.5204661242825355676.JavaMail.root@g2mp1br2.las.expertcity.com> Message-ID: <2355F1D0-DD01-4BD1-8482-FDDC6FEE6C91@enthought.com> Hello all Python users: I am pleased to announce the beginning of a free Webinar series that discusses using Python for scientific computing. Enthought will host this free series which will take place once a month for 30-45 minutes. The schedule and length may change based on participation feedback, but for now it is scheduled for the fourth Friday of every month. This free webinar should not be confused with the EPD webinar on the first Friday of each month which is open only to subscribers to the Enthought Python Distribution. I (Travis Oliphant) will be the first speaker at this continuing series. I plan to present a brief (10-15) minute talk on reading binary files with NumPy using memory mapped arrays and structured data- types. This talk will be followed by a demonstration of Chaco for interactive 2-d visualization and Mayavi for interactive 3-d visualization. Both Chaco and Mayavi are open-source tools and part of the Enthought Tool Suite. They can be conveniently installed using the Enthought Python Distribution. Topics for future webinars will be chosen later based on participant feedback. This event will take place on Friday at 3:00pm CDT and will last 30 to 45 minutes depending on questions asked. Space is limited at this event. If you would like to participate, please register by going to https://www1.gotomeeting.com/register/422340144 or by clicking on the appropriate link in the attached announcement. There will be a 10 minute technical help session prior to the on-line meeting which you should plan to use if you have never participated in a GoToWebinar previously. During this time you can test your connection and audio equipment as well as familiarize yourself with the GoTo Meeting software. I am looking forward to interacting with many of you this Friday. Best regards, Travis Oliphant Enthought, Inc. Enthought is the company that sponsored the creation of SciPy and the Enthought Tool Suite. It continues to sponsor the SciPy community by hosting the SciPy mailing list and website and participating in the development of SciPy and NumPy. Enthought creates custom scientific and technical software applications and provides training on using Python for technical computing. Enthought also provides the Enthought Python Distribution. Learn more at http://www.enthought.com Travis Oliphant's bio can be read at http://www.enthought.com/company/executive-team.php > > > > > > Scientific Computing with Python Webinar > > > > > > Each webinar in this continuing series will demonstrate the use of > some aspect of Python to assist with scientific, engineering, and > technical computing. 
Enthought will host each meeting and select a > specific topic based on feedback from participants > Register for a session now by clicking a date below: > Fri, May 22, 2009 3:00 PM - 3:30 PM CDT > Fri, Jun 19, 2009 1:00 PM - 1:30 PM CDT > Fri, Jul 17, 2009 1:00 PM - 1:30 PM CDT > Once registered you will receive an email confirming your registration > with information you need to join the Webinar. > System Requirements > PC-based attendees > Required: Windows? 2000, XP Home, XP Pro, 2003 Server, Vista > Macintosh?-based attendees > Required: Mac OS? X 10.4 (Tiger?) or newer > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nwagner at iam.uni-stuttgart.de Wed May 20 11:04:20 2009 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Wed, 20 May 2009 17:04:20 +0200 Subject: [Numpy-discussion] skiprows option in loadtxt Message-ID: Hi all, Is the value of skiprows in loadtxt restricted to values in [0-10] ? It doesn't work for skiprows=11. Nils From klaus.noekel at gmx.de Wed May 20 11:09:01 2009 From: klaus.noekel at gmx.de (=?iso-8859-1?Q?=22Klaus_N=F6kel=22?=) Date: Wed, 20 May 2009 17:09:01 +0200 Subject: [Numpy-discussion] numpy failure under Windows Vista 64 bit In-Reply-To: References: Message-ID: <20090520150901.123590@gmx.net> > > > > I don't believe that the problem is specific to IDLE. Python also > crashes when I put nothing but "import numpy" in a file and execute it with > python.exe. > > > > Regarding the note on building numpy myself: the discussion in this > forum scared me a little, because of the challenge to build LAPACK with a > compatible Fortran compiler etc. That and the fact that I do not have MSVC 2008 > (only 2005) keeps me from trying it. Any chance that a MS-based installer > will materialize soon? Or are there any mingw-specific runtime libraries > that I need to install so that the mingw-based numpy works? > > > > Thanks for your help! > > Klaus Noekel > > > > > > > > > Hi, > I also see this. > What version of Python specific of Python are you using? > I got the same with Python 2.6.1 so perhaps I need to downgrade to > Python 2.6.0? > > Bruce I got it with a fresh 2.6.2 install on Tuesday. Klaus -- Neu: GMX FreeDSL Komplettanschluss mit DSL 6.000 Flatrate + Telefonanschluss f?r nur 17,95 Euro/mtl.!* http://dslspecial.gmx.de/freedsl-aktionspreis/?ac=OM.AD.PD003K11308T4569a From pgmdevlist at gmail.com Wed May 20 11:14:37 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 20 May 2009 11:14:37 -0400 Subject: [Numpy-discussion] skiprows option in loadtxt In-Reply-To: References: Message-ID: <955153E7-F7E7-4385-90D2-3ECFECECFDD6@gmail.com> On May 20, 2009, at 11:04 AM, Nils Wagner wrote: > Hi all, > > Is the value of skiprows in loadtxt restricted to values > in [0-10] ? > > It doesn't work for skiprows=11. Please post an example From stefan at sun.ac.za Wed May 20 11:15:49 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 20 May 2009 17:15:49 +0200 Subject: [Numpy-discussion] skiprows option in loadtxt In-Reply-To: References: Message-ID: <9457e7c80905200815m158f2861kc6d14bc633b0349f@mail.gmail.com> Hi Nils 2009/5/20 Nils Wagner : > Is the value of skiprows in loadtxt restricted to values > in [0-10] ? > > It doesn't work for skiprows=11. I don't see this behaviour. Could you provide a code snippet? 
Thanks St?fan From rmay31 at gmail.com Wed May 20 11:16:08 2009 From: rmay31 at gmail.com (Ryan May) Date: Wed, 20 May 2009 10:16:08 -0500 Subject: [Numpy-discussion] skiprows option in loadtxt In-Reply-To: References: Message-ID: On Wed, May 20, 2009 at 10:04 AM, Nils Wagner wrote: > Hi all, > > Is the value of skiprows in loadtxt restricted to values > in [0-10] ? > > It doesn't work for skiprows=11. Works for me: s = '\n'.join(map(str,range(20))) from StringIO import StringIO np.loadtxt(StringIO(s), skiprows=11) The last line yields, as expected: array([ 11., 12., 13., 14., 15., 16., 17., 18., 19.]) This is with 1.4.0.dev6983. Can we see code and data file? Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma Sent from Norman, Oklahoma, United States -------------- next part -------------- An HTML attachment was scrubbed... URL: From nwagner at iam.uni-stuttgart.de Wed May 20 11:31:38 2009 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Wed, 20 May 2009 17:31:38 +0200 Subject: [Numpy-discussion] skiprows option in loadtxt In-Reply-To: References: Message-ID: On Wed, 20 May 2009 10:16:08 -0500 Ryan May wrote: > On Wed, May 20, 2009 at 10:04 AM, Nils Wagner > wrote: > >> Hi all, >> >> Is the value of skiprows in loadtxt restricted to values >> in [0-10] ? >> >> It doesn't work for skiprows=11. > > > Works for me: > > s = '\n'.join(map(str,range(20))) > from StringIO import StringIO > np.loadtxt(StringIO(s), skiprows=11) > > The last line yields, as expected: > array([ 11., 12., 13., 14., 15., 16., 17., 18., > 19.]) > > This is with 1.4.0.dev6983. Can we see code and data >file? > > Ryan > > -- > Ryan May > Graduate Research Assistant > School of Meteorology > University of Oklahoma > Sent from Norman, Oklahoma, United States Hi all, My fault. Sorry for the noise. Nils From dagss at student.matnat.uio.no Wed May 20 11:47:05 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Wed, 20 May 2009 17:47:05 +0200 Subject: [Numpy-discussion] ANN: Cython 0.11.2 released Message-ID: <4A1425F9.9030100@student.matnat.uio.no> I'm happy to announce the release of Cython 0.11.2. http://sage.math.washington.edu/home/dagss/Cython-0.11.2.tar.gz http://sage.math.washington.edu/home/dagss/Cython-0.11.2.zip (Will be present on the Cython front page in a few days.) New features: * There's now native complex floating point support! C99 complex will be used if complex.h is included, otherwise explicit complex arithmetic working on all C compilers is used. [Robert Bradshaw] cdef double complex a = 1 + 0.3j cdef np.ndarray[np.complex128_t, ndim=2] arr = \ np.zeros(10, np.complex128) * Cython can now generate a main()-method for embedding of the Python interpreter into an executable (see #289) [Robert Bradshaw] * @wraparound directive (another way to disable arr[idx] for negative idx) [Dag Sverre Seljebotn] * Correct support for NumPy record dtypes with different alignments, and "cdef packed struct" support [Dag Sverre Seljebotn] * @callspec directive, allowing custom calling convention macros [Lisandro Dalcin] * Bug fixes and smaller improvements. For the full list, see [1]. Contributors to this release: - Stefan Behnel - Robert Bradshaw - Lisandro Dalcin - Dag Sverre Seljebotn Thanks also to everybody who's helping us out in our discussions on the mailing list. 
[1] -- Dag Sverre From jdh2358 at gmail.com Wed May 20 15:07:43 2009 From: jdh2358 at gmail.com (John Hunter) Date: Wed, 20 May 2009 14:07:43 -0500 Subject: [Numpy-discussion] binary builds against older numpys Message-ID: <88e473830905201207hb37d94ehef2ed12cedca50b@mail.gmail.com> We are trying to build and test mpl installers for python2.4, 2.5 and 2.6. What we are finding is that if we build mpl against a more recent numpy than the installed numpy on a test machine, the import of mpl extension modules which depend on numpy trigger a segfault. Eg, on python2.5 and python2.6, we build the mpl installers against the numpy-1.3.0-win32.superpack installation, and if I test the installer on a python2.5 machine with numpy-1.2.1-win32.superpack installed, I get the segfault. If I install numpy-1.3.0-win32.superpack on the test machine, then the mpl binaries work fine. Is there an known binary incompatibly between 1.2.1 and 1.3.0? One solution we may consider is building our 2.5 binaries against 1.2.1 and seeing if they work with both 1.2.1 and 1.3.0 installations, but wanted to check in here to see if there were known issues or solutions we should be considering. Our test installers are at http://drop.io/rlck8ph if you are interested. Thanks, JDH From gkelly at gmail.com Wed May 20 15:08:46 2009 From: gkelly at gmail.com (Grant Kelly) Date: Wed, 20 May 2009 12:08:46 -0700 Subject: [Numpy-discussion] wiki page correction Message-ID: <646de7490905201208r273d1acch1868f70f9929b482@mail.gmail.com> I believe there is an error on this wiki page: http://www.scipy.org/NumPy_for_Matlab_Users MATLAB y=x(2,:) PYTHON y = x[2,:].copy() shouldn't the Python version be: y = x[1,:].copy() If not, please advise. Thanks, Grant From josef.pktd at gmail.com Wed May 20 15:17:50 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 20 May 2009 15:17:50 -0400 Subject: [Numpy-discussion] binary builds against older numpys In-Reply-To: <88e473830905201207hb37d94ehef2ed12cedca50b@mail.gmail.com> References: <88e473830905201207hb37d94ehef2ed12cedca50b@mail.gmail.com> Message-ID: <1cd32cbb0905201217g69fa750dqaa8eb7a10ab5450a@mail.gmail.com> On Wed, May 20, 2009 at 3:07 PM, John Hunter wrote: > We are trying to build and test mpl installers for python2.4, 2.5 and > 2.6. ?What we are finding is that if we build mpl against a more > recent numpy than the installed numpy on a test machine, the import of > mpl extension modules which depend on numpy trigger a segfault. > > Eg, on python2.5 and python2.6, we build the mpl installers against > the numpy-1.3.0-win32.superpack installation, and if I test the > installer on a python2.5 machine with numpy-1.2.1-win32.superpack > installed, I get the segfault. ?If I install > numpy-1.3.0-win32.superpack on the test machine, then the mpl binaries > work fine. > > Is there an known binary incompatibly between 1.2.1 and 1.3.0? ?One > solution we may consider is building our 2.5 binaries against 1.2.1 > and seeing if they work with both 1.2.1 and 1.3.0 installations, but > wanted to check in here to see if there were known issues or solutions > we should be considering. > > Our test installers are at http://drop.io/rlck8ph if you are interested. > > Thanks, > JDH A few days ago, I asked the same question for scipy in scipy-dev, since I got segfaults using scipy against an older numpy version than the one it was build against. Davids reply is that there is no forward compatibility. 
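
For packages shipping binaries built against a newer numpy, an import-time guard can at least turn the segfault into a readable error. This is only a sketch of such a check (it catches version mismatches, not every possible ABI problem); the 1.2.1 minimum mirrors the plan discussed above, the rest is illustrative.

import numpy
from distutils.version import LooseVersion

_MIN_NUMPY = '1.2.1'   # oldest numpy the extension modules are known to work with

if LooseVersion(numpy.__version__) < LooseVersion(_MIN_NUMPY):
    raise ImportError("this package requires numpy >= %s, found %s"
                      % (_MIN_NUMPY, numpy.__version__))
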
Josef From dmitrey.kroshko at scipy.org Wed May 20 15:24:57 2009 From: dmitrey.kroshko at scipy.org (dmitrey) Date: Wed, 20 May 2009 12:24:57 -0700 (PDT) Subject: [Numpy-discussion] binary shift for ndarray Message-ID: <65509afd-d77c-49b6-be29-49bdb3573415@n4g2000vba.googlegroups.com> hi all, suppose I have A that is numpy ndarray of floats, with shape n x n. I want to obtain dot(A, b), b is vector of length n and norm(b)=1, but instead of exact multiplication I want to approximate b as a vector [+/- 2^m0, ? 2^m1, ? 2^m2 ,,, ? 2^m_n], m_i are integers, and then invoke left_shift(vector_m) for rows of A. So, what is the simplest way to do it, without cycles of course? Or it cannot be implemented w/o cycles with current numpy version? Thank you in advance, D. From robert.kern at gmail.com Wed May 20 15:34:55 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 20 May 2009 14:34:55 -0500 Subject: [Numpy-discussion] binary shift for ndarray In-Reply-To: <65509afd-d77c-49b6-be29-49bdb3573415@n4g2000vba.googlegroups.com> References: <65509afd-d77c-49b6-be29-49bdb3573415@n4g2000vba.googlegroups.com> Message-ID: <3d375d730905201234q5632a9efhd3c4d45ed8a68025@mail.gmail.com> On Wed, May 20, 2009 at 14:24, dmitrey wrote: > hi all, > > suppose I have A that is numpy ndarray of floats, with shape n x n. > > I want to obtain dot(A, b), b is vector of length n and norm(b)=1, but > instead of exact multiplication I want to approximate b as a vector > [+/- 2^m0, ? 2^m1, ? 2^m2 ,,, ? 2^m_n], m_i are integers, and then > invoke left_shift(vector_m) for rows of A. You don't shift floats. You only shift integers. For floats, multiplying by an integer power of 2 should be fast because of the floating point representation (the exponent just gets incremented or decremented), so just do the multiplication. > So, what is the simplest way to do it, without cycles of course? Or it > cannot be implemented w/o cycles with current numpy version? It might help if you showed us an example of an actual b vector decomposed the way you describe. Your description is ambiguous. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From dmitrey.kroshko at scipy.org Wed May 20 15:46:22 2009 From: dmitrey.kroshko at scipy.org (dmitrey) Date: Wed, 20 May 2009 12:46:22 -0700 (PDT) Subject: [Numpy-discussion] binary shift for ndarray In-Reply-To: <3d375d730905201234q5632a9efhd3c4d45ed8a68025@mail.gmail.com> References: <65509afd-d77c-49b6-be29-49bdb3573415@n4g2000vba.googlegroups.com> <3d375d730905201234q5632a9efhd3c4d45ed8a68025@mail.gmail.com> Message-ID: <2c23d401-0f2c-4a98-9d9b-1b485482bd8b@q2g2000vbr.googlegroups.com> On May 20, 10:34?pm, Robert Kern wrote: > On Wed, May 20, 2009 at 14:24, dmitrey wrote: > > hi all, > > > suppose I have A that is numpy ndarray of floats, with shape n x n. > > > I want to obtain dot(A, b), b is vector of length n and norm(b)=1, but > > instead of exact multiplication I want to approximate b as a vector > > [+/- 2^m0, ? 2^m1, ? 2^m2 ,,, ? 2^m_n], m_i are integers, and then > > invoke left_shift(vector_m) for rows of A. > > You don't shift floats. You only shift integers. For floats, > multiplying by an integer power of 2 should be fast because of the > floating point representation (the exponent just gets incremented or > decremented), so just do the multiplication. > > > So, what is the simplest way to do it, without cycles of course? 
Or it > > cannot be implemented w/o cycles with current numpy version? > > It might help if you showed us an example of an actual b vector > decomposed the way you describe. Your description is ambiguous. > > -- > Robert Kern For the task involved (I intend to try using it for speed up ralg solver) it doesn't matter essentially (using ceil, floor or round), but for example let m_i is floor(log2(b_i)) for b_i > 1e-15, ceil(log2(-b_i)) for b_i < - 1e-15, for - 1e-15 <= b_i <= 1e-15 - don't modify the elements of A related to the b_i at all. D. From stefan at sun.ac.za Wed May 20 16:10:24 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 20 May 2009 22:10:24 +0200 Subject: [Numpy-discussion] binary builds against older numpys In-Reply-To: <1cd32cbb0905201217g69fa750dqaa8eb7a10ab5450a@mail.gmail.com> References: <88e473830905201207hb37d94ehef2ed12cedca50b@mail.gmail.com> <1cd32cbb0905201217g69fa750dqaa8eb7a10ab5450a@mail.gmail.com> Message-ID: <9457e7c80905201310u5db00eaaja65efab6496dd2e@mail.gmail.com> Hi John 2009/5/20 : > On Wed, May 20, 2009 at 3:07 PM, John Hunter wrote: >> We are trying to build and test mpl installers for python2.4, 2.5 and >> 2.6. ?What we are finding is that if we build mpl against a more >> recent numpy than the installed numpy on a test machine, the import of >> mpl extension modules which depend on numpy trigger a segfault. I think we accidentally forgot to increase the API version at some stage (bad), but we now have checks in place to catch these mismatches. Specifically, import_array makes sure that the ABI versions agree, and that the API is the same or newer: if (NPY_VERSION != PyArray_GetNDArrayCVersion()) { PyErr_Format(PyExc_RuntimeError, "module compiled against "\ "ABI version %%x but this version of numpy is %%x", \ (int) NPY_VERSION, (int) PyArray_GetNDArrayCVersion()); return -1; } if (NPY_FEATURE_VERSION > PyArray_GetNDArrayCFeatureVersion()) { PyErr_Format(PyExc_RuntimeError, "module compiled against "\ "API version %%x but this version of numpy is %%x", \ (int) NPY_FEATURE_VERSION, (int) PyArray_GetNDArrayCFeatureVersion()); return -1; } David Cournapeau also put a check in place so that the NumPy build will break if we forget to update the API version again. So, while we can't change the releases of NumPy out there already, we can at least ensure that this won't happen again. Regards St?fan From jdh2358 at gmail.com Wed May 20 16:17:42 2009 From: jdh2358 at gmail.com (John Hunter) Date: Wed, 20 May 2009 15:17:42 -0500 Subject: [Numpy-discussion] binary builds against older numpys In-Reply-To: <9457e7c80905201310u5db00eaaja65efab6496dd2e@mail.gmail.com> References: <88e473830905201207hb37d94ehef2ed12cedca50b@mail.gmail.com> <1cd32cbb0905201217g69fa750dqaa8eb7a10ab5450a@mail.gmail.com> <9457e7c80905201310u5db00eaaja65efab6496dd2e@mail.gmail.com> Message-ID: <88e473830905201317x2acb7f5ew293e59a7bae075bd@mail.gmail.com> 2009/5/20 St?fan van der Walt : > David Cournapeau also put a check in place so that the NumPy build > will break if we forget to update the API version again. > > So, while we can't change the releases of NumPy out there already, we > can at least ensure that this won't happen again. OK, great -- thanks for th info. From reading David's comments in the earlier thread: David > - Backward compatibility means that you can build something against David > numpy version M, later update numpy to version N >M, and it still works. 
David > numpy 1.3.0 is backward compatible with 1.2.1 it looks like our best bet will be to build our python2.4 and python2.5 binaries against 1.2.1 and our python2.6 binaries against 1.3.0 (since there are no older python2.6 numpy builds on the sf site anyhow). I'll post on the mpl list and site that anyone using the new mpl installers needs to be on numpy 1.2.1 or later. JDH From robert.kern at gmail.com Wed May 20 16:18:49 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 20 May 2009 15:18:49 -0500 Subject: [Numpy-discussion] binary shift for ndarray In-Reply-To: <2c23d401-0f2c-4a98-9d9b-1b485482bd8b@q2g2000vbr.googlegroups.com> References: <65509afd-d77c-49b6-be29-49bdb3573415@n4g2000vba.googlegroups.com> <3d375d730905201234q5632a9efhd3c4d45ed8a68025@mail.gmail.com> <2c23d401-0f2c-4a98-9d9b-1b485482bd8b@q2g2000vbr.googlegroups.com> Message-ID: <3d375d730905201318l78a64fddybe47ce4464b410cd@mail.gmail.com> On Wed, May 20, 2009 at 14:46, dmitrey wrote: > On May 20, 10:34?pm, Robert Kern wrote: >> On Wed, May 20, 2009 at 14:24, dmitrey wrote: >> > hi all, >> >> > suppose I have A that is numpy ndarray of floats, with shape n x n. >> >> > I want to obtain dot(A, b), b is vector of length n and norm(b)=1, but >> > instead of exact multiplication I want to approximate b as a vector >> > [+/- 2^m0, ? 2^m1, ? 2^m2 ,,, ? 2^m_n], m_i are integers, and then >> > invoke left_shift(vector_m) for rows of A. >> >> You don't shift floats. You only shift integers. For floats, >> multiplying by an integer power of 2 should be fast because of the >> floating point representation (the exponent just gets incremented or >> decremented), so just do the multiplication. >> >> > So, what is the simplest way to do it, without cycles of course? Or it >> > cannot be implemented w/o cycles with current numpy version? >> >> It might help if you showed us an example of an actual b vector >> decomposed the way you describe. Your description is ambiguous. >> >> -- >> Robert Kern > > For the task involved (I intend to try using it for speed up ralg > solver) it doesn't matter essentially (using ceil, floor or round), > but for example let m_i is > floor(log2(b_i)) for b_i > 1e-15, > ceil(log2(-b_i)) for b_i < - 1e-15, > for - 1e-15 <= b_i <= 1e-15 - don't modify the elements of A related > to the b_i at all. I strongly suspect that a plain dot(A, b) will be faster than doing all of that. With a little bit of work with frexp() and ldexp(), you could probably do those floor(log2())'s cheaply, but ultimately, you will still need to do a dot(A, b_prime) at the end. There is no bit-shift operation available for floats (anywhere, not just numpy); you have to form the float corresponding to +-2**m and multiply. If you had many A-matrices and a static b-vector, you might see a tiny improvement because all of the b_prime elements were exactly of the form +-2**m, but I doubt it. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From pav at iki.fi Wed May 20 17:36:43 2009 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 20 May 2009 21:36:43 +0000 (UTC) Subject: [Numpy-discussion] wiki page correction References: <646de7490905201208r273d1acch1868f70f9929b482@mail.gmail.com> Message-ID: Wed, 20 May 2009 12:08:46 -0700, Grant Kelly wrote: > I believe there is an error on this wiki page: > > http://www.scipy.org/NumPy_for_Matlab_Users > > > MATLAB > y=x(2,:) > PYTHON > y = x[2,:].copy() > > shouldn't the Python version be: > y = x[1,:].copy() > > If not, please advise. Yes, it should be x[1,:].copy(). Please feel free to correct it. -- Pauli Virtanen From cycomanic at gmail.com Wed May 20 17:51:25 2009 From: cycomanic at gmail.com (Jochen Schroeder) Date: Thu, 21 May 2009 09:51:25 +1200 Subject: [Numpy-discussion] view takes no keyword arguments exception Message-ID: <20090520215124.GA6548@jochen-laptop> Hi all, I'm trying to help someone out with some problems with pyfftw (pyfftw.berlios.de). He is seeing an exception, TypeError: view() takes no keyword arguments This doesn't only happen when he uses pyfftw but also when he does the following: >>> import numpy as np >>> a=np.arange(10) >>> print a.view(dtype='float') Traceback (most recent call last): File "", line 1, in TypeError: view() takes no keyword arguments I he's on Windows and sees this error both with numpy 1.1.1 and 1.3. I'm a bit lost anybody have an idea what could be the problem? Cheers Jochen From stefan at sun.ac.za Wed May 20 18:20:42 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 21 May 2009 00:20:42 +0200 Subject: [Numpy-discussion] view takes no keyword arguments exception In-Reply-To: <20090520215124.GA6548@jochen-laptop> References: <20090520215124.GA6548@jochen-laptop> Message-ID: <9457e7c80905201520o171f05dfr49da938d281861d8@mail.gmail.com> Hi Jochen 2009/5/20 Jochen Schroeder : > I'm trying to help someone out with some problems with pyfftw > (pyfftw.berlios.de). He is seeing an exception, > > TypeError: view() takes no keyword arguments > > This doesn't only happen when he uses pyfftw but also when he does the > following: > >>>> import numpy as np >>>> a=np.arange(10) >>>> print a.view(dtype='float') > Traceback (most recent call last): > ?File "", line 1, in > TypeError: view() takes no keyword arguments > > I he's on Windows and sees this error both with numpy 1.1.1 and 1.3. > I'm a bit lost anybody have an idea what could be the problem? In the older versions of numpy, a.view(float) should work (float is preferable above 'float' as well), but I would guess that you are really looking for a.astype(float) Regards St?fan From stefan at sun.ac.za Wed May 20 20:02:02 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 21 May 2009 02:02:02 +0200 Subject: [Numpy-discussion] Implementing ufuncs in Cython Message-ID: <9457e7c80905201702pd5c7697u5b4c8d888bf399a8@mail.gmail.com> Hi all, Mark Lodato outlines how to write ufuncs in Cython at http://wiki.cython.org/MarkLodato/CreatingUfuncs This is also a great way of adding generalised ufuncs: http://projects.scipy.org/numpy/wiki/GeneralLoopingFunctions Super useful! 
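For anyone who wants the flavour without clicking through, here is a stripped-down, untested sketch of the pattern (the wiki page is the authoritative version; the ufunc name and loop below are made up, and under Cython 3 the inner loop would also need a trailing 'noexcept'):

# build with numpy.get_include() on the include path
cimport numpy as cnp

# declare the bits of the ufunc C API we need straight from the header
cdef extern from "numpy/ufuncobject.h":
    ctypedef void (*PyUFuncGenericFunction)(char **args,
                                            cnp.npy_intp *dimensions,
                                            cnp.npy_intp *steps,
                                            void *data)
    object PyUFunc_FromFuncAndData(PyUFuncGenericFunction *func,
                                   void **data, char *types, int ntypes,
                                   int nin, int nout, int identity,
                                   char *name, char *doc, int unused)
    int PyUFunc_None

cnp.import_array()
cnp.import_ufunc()

# inner loop: walk the (possibly strided) input and double each value
cdef void double_loop(char **args, cnp.npy_intp *dimensions,
                      cnp.npy_intp *steps, void *data):
    cdef cnp.npy_intp i
    cdef char *ip = args[0]
    cdef char *op = args[1]
    for i in range(dimensions[0]):
        (<double *> op)[0] = 2.0 * (<double *> ip)[0]
        ip += steps[0]
        op += steps[1]

cdef PyUFuncGenericFunction loops[1]
cdef char sigs[2]
loops[0] = double_loop
sigs[0] = <char> cnp.NPY_DOUBLE   # input: float64
sigs[1] = <char> cnp.NPY_DOUBLE   # output: float64

# one registered loop, no user data, 1 input, 1 output, no identity
double_it = PyUFunc_FromFuncAndData(loops, NULL, sigs, 1, 1, 1,
                                    PyUFunc_None, b"double_it",
                                    b"double each element", 0)

Once compiled, double_it(arr) broadcasts, honours out= and so on, just like the builtin ufuncs.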
Regards St?fan From charlesr.harris at gmail.com Wed May 20 20:48:41 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 20 May 2009 18:48:41 -0600 Subject: [Numpy-discussion] Implementing ufuncs in Cython In-Reply-To: <9457e7c80905201702pd5c7697u5b4c8d888bf399a8@mail.gmail.com> References: <9457e7c80905201702pd5c7697u5b4c8d888bf399a8@mail.gmail.com> Message-ID: 2009/5/20 St?fan van der Walt > Hi all, > > Mark Lodato outlines how to write ufuncs in Cython at > > http://wiki.cython.org/MarkLodato/CreatingUfuncs > > This is also a great way of adding generalised ufuncs: > > http://projects.scipy.org/numpy/wiki/GeneralLoopingFunctions > > Super useful! > That's cool! Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndbecker2 at gmail.com Wed May 20 21:20:06 2009 From: ndbecker2 at gmail.com (Neal Becker) Date: Wed, 20 May 2009 21:20:06 -0400 Subject: [Numpy-discussion] ANN: Cython 0.11.2 released References: <4A1425F9.9030100@student.matnat.uio.no> Message-ID: Where can I find release notes? (It would be helpful if I can point to a URL as part of the fedora release) From cycomanic at gmail.com Wed May 20 21:57:55 2009 From: cycomanic at gmail.com (Jochen Schroeder) Date: Thu, 21 May 2009 13:57:55 +1200 Subject: [Numpy-discussion] view takes no keyword arguments exception In-Reply-To: <9457e7c80905201520o171f05dfr49da938d281861d8@mail.gmail.com> References: <20090520215124.GA6548@jochen-laptop> <9457e7c80905201520o171f05dfr49da938d281861d8@mail.gmail.com> Message-ID: <20090521015754.GA28026@jochen.schroeder.phy.auckland.ac.nz> On 21/05/09 00:20, St?fan van der Walt wrote: > Hi Jochen > > 2009/5/20 Jochen Schroeder : > > I'm trying to help someone out with some problems with pyfftw > > (pyfftw.berlios.de). He is seeing an exception, > > > > TypeError: view() takes no keyword arguments > > > > This doesn't only happen when he uses pyfftw but also when he does the > > following: > > > >>>> import numpy as np > >>>> a=np.arange(10) > >>>> print a.view(dtype='float') > > Traceback (most recent call last): > > ?File "", line 1, in > > TypeError: view() takes no keyword arguments > > > > I he's on Windows and sees this error both with numpy 1.1.1 and 1.3. > > I'm a bit lost anybody have an idea what could be the problem? > > In the older versions of numpy, a.view(float) should work (float is > preferable above 'float' as well), but I would guess that you are > really looking for > > a.astype(float) Sorry maybe I phrased my question wrongly. I don't want to change the code (This was just a short example). I just want to know why it is failing on his system and what he can do so that a.view(dtype='...') is working. I suspected it was an old numpy installation but the person is saying that he installed a new version and is still seeing the same problem (or does he just have an old version of numpy floating around). Cheers Jochen From pgmdevlist at gmail.com Wed May 20 22:09:08 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 20 May 2009 22:09:08 -0400 Subject: [Numpy-discussion] view takes no keyword arguments exception In-Reply-To: <20090521015754.GA28026@jochen.schroeder.phy.auckland.ac.nz> References: <20090520215124.GA6548@jochen-laptop> <9457e7c80905201520o171f05dfr49da938d281861d8@mail.gmail.com> <20090521015754.GA28026@jochen.schroeder.phy.auckland.ac.nz> Message-ID: On May 20, 2009, at 9:57 PM, Jochen Schroeder wrote: > Sorry maybe I phrased my question wrongly. I don't want to change > the code (This was just a short example). 
> I just want to know why it is failing on his system and what he > can do so that a.view(dtype='...') is working. I suspected it was an > old > numpy installation but the person is saying that he installed a new > version and is still seeing the same problem (or does he just have an > old version of numpy floating around). Likely to be the second possibiity, the ghost of a previous installation. AFAIR, the keywords in .view were introduced in 1.2 or just after. A safe way to check would be to install numpy 1.3 in a virtualenv and check that it works. If it does (expected), then you may want to ask your user to start afresh (remove 1.1.1 and 1.3 and then reinstall 1.3 from a clean slate). My 2c. P. From charlesr.harris at gmail.com Wed May 20 22:10:32 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 20 May 2009 20:10:32 -0600 Subject: [Numpy-discussion] view takes no keyword arguments exception In-Reply-To: <20090520215124.GA6548@jochen-laptop> References: <20090520215124.GA6548@jochen-laptop> Message-ID: On Wed, May 20, 2009 at 3:51 PM, Jochen Schroeder wrote: > Hi all, > > I'm trying to help someone out with some problems with pyfftw > (pyfftw.berlios.de). He is seeing an exception, > > TypeError: view() takes no keyword arguments > > This doesn't only happen when he uses pyfftw but also when he does the > following: > > >>> import numpy as np > >>> a=np.arange(10) > >>> print a.view(dtype='float') > Traceback (most recent call last): > File "", line 1, in > TypeError: view() takes no keyword arguments > > I he's on Windows and sees this error both with numpy 1.1.1 and 1.3. > I'm a bit lost anybody have an idea what could be the problem? > I don't see this error on linux: In [3]: a.view(dtype=double) Out[3]: array([ 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.]) What version of python do you have installed? Did try deleting the previous version of numpy from site-packages before install? Windows 32 or 64 bit? Etc. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Thu May 21 01:11:49 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 21 May 2009 14:11:49 +0900 Subject: [Numpy-discussion] numpy failure under Windows Vista 64 bit In-Reply-To: <20090520133627.148820@gmx.net> References: <20090520133627.148820@gmx.net> Message-ID: <4A14E295.2060509@ar.media.kyoto-u.ac.jp> Klaus N?kel wrote: > David, > > >> Klaus Noekel wrote: >> >>> I doubt that the DLL was not physically present and rather suspect a >>> dependency on some other DLL that was missing. The INSTALL.TXT >>> unfortunately was not helpful. Can anybody please explain what other >>> dependencies exist? Anything else I need to install? >>> >>> >> This exact problem is specific to IDLE - I don't know what triggers it. >> Today, the best solution for a 64 bits numpy on windows is to built it >> yourself with MS compilers - the distributed one is built with mingw >> compilers, and there still seems to be some stability problems with >> those. Unfortunately, as the mingw debugger does not work either on 64 >> bits archs, finding the problem is quite hard. >> >> > > I don't believe that the problem is specific to IDLE. Python also crashes when I put nothing but "import numpy" in a file and execute it with python.exe. > That's not the same problem - in one case, you have a dll not found, and in another case, a crash. I am sorry I can't tell more, but I have no idea about what's going on: sometimes, it works, sometimes, it does not. 
When it works, it runs the full test, and when it does not, it crashes at import - but before even initializing the first numpy extension ! The crash always happen in some conditions, and seldom in others (executing in a cmd shell vs being executed by nosetests, for example). The problem is difficult to track without a debugger, I am afraid (mingw compilers do not seem to generate debugging symbols usable by MS debugger). > Regarding the note on building numpy myself: the discussion in this forum scared me a little, because of the challenge to build LAPACK with a compatible Fortran compiler etc. That and the fact that I do not have MSVC 2008 (only 2005) keeps me from trying it. Any chance that a MS-based installer will materialize soon? I don't intend on doing one myself, no. Note that you don't need blas/lapack to build numpy - it is required for scipy. That's why I am interested in making numpy work with the mingw toolchain: once it works reliably, it will give scipy as well. Actually, I managed to build scipy for windows 64, but as for numpy, it sometimes crash. > Or are there any mingw-specific runtime libraries that I need to install so that the mingw-based numpy works? > No, there should not be anything else to install. There is a bug somewhere, which needs to be found. David From david at ar.media.kyoto-u.ac.jp Thu May 21 01:15:12 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 21 May 2009 14:15:12 +0900 Subject: [Numpy-discussion] binary builds against older numpys In-Reply-To: <9457e7c80905201310u5db00eaaja65efab6496dd2e@mail.gmail.com> References: <88e473830905201207hb37d94ehef2ed12cedca50b@mail.gmail.com> <1cd32cbb0905201217g69fa750dqaa8eb7a10ab5450a@mail.gmail.com> <9457e7c80905201310u5db00eaaja65efab6496dd2e@mail.gmail.com> Message-ID: <4A14E360.2000806@ar.media.kyoto-u.ac.jp> St?fan van der Walt wrote: > Hi John > > 2009/5/20 : > >> On Wed, May 20, 2009 at 3:07 PM, John Hunter wrote: >> >>> We are trying to build and test mpl installers for python2.4, 2.5 and >>> 2.6. What we are finding is that if we build mpl against a more >>> recent numpy than the installed numpy on a test machine, the import of >>> mpl extension modules which depend on numpy trigger a segfault. >>> > > I think we accidentally forgot to increase the API version at some > stage (bad), but we now have checks in place to catch these > mismatches. > Yes, we did forget this, but it would not have solve the problem. Building against a version N of numpy and running under a version M < N is not supported at all. We explicitly check for it starting for numpy 1.4.0, but it does not affect the solution, that is building against the oldest version possible. This rule is the same for almost any library, and not specific to numpy, or even python extensions. David From dmitrey.kroshko at scipy.org Thu May 21 03:50:45 2009 From: dmitrey.kroshko at scipy.org (dmitrey) Date: Thu, 21 May 2009 00:50:45 -0700 (PDT) Subject: [Numpy-discussion] how to use ldexp? 
Message-ID: <3a43192a-3907-4f62-8434-bf27cd243b76@r34g2000vba.googlegroups.com> hi all, I have tried the example from numpy/add_newdocs.py np.ldexp(5., 2) but instead of the 20 declared there it yields TypeError: function not supported for these types, and can't coerce safely to supported types I have tried arrays but it yields same error >>> np.ldexp(np.array([5., 2.]), np.array([2, 1])) Traceback (innermost last): File "", line 1, in TypeError: function not supported for these types, and can't coerce safely to supported types So, how can I use ldexp? np.__version__ = '1.4.0.dev6972' Thank you in advance, D. From dmitrey.kroshko at scipy.org Thu May 21 03:57:15 2009 From: dmitrey.kroshko at scipy.org (dmitrey) Date: Thu, 21 May 2009 00:57:15 -0700 (PDT) Subject: [Numpy-discussion] numpy ufuncs and COREPY - any info? Message-ID: hi all, has anyone already tried to compare using an ordinary numpy ufunc vs that one from corepy, first of all I mean the project http://socghop.appspot.com/student_project/show/google/gsoc2009/python/t124024628235 It would be interesting to know what is speedup for (eg) vec ** 0.5 or (if it's possible - it isn't pure ufunc) numpy.dot(Matrix, vec). Or any another example. From stefan at sun.ac.za Thu May 21 04:35:54 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 21 May 2009 10:35:54 +0200 Subject: [Numpy-discussion] how to use ldexp? In-Reply-To: <3a43192a-3907-4f62-8434-bf27cd243b76@r34g2000vba.googlegroups.com> References: <3a43192a-3907-4f62-8434-bf27cd243b76@r34g2000vba.googlegroups.com> Message-ID: <9457e7c80905210135g20c0826av1768f8193b44bfc3@mail.gmail.com> Hi Dmitrey 2009/5/21 dmitrey : > hi all, > I have tried the example from numpy/add_newdocs.py > > np.ldexp(5., 2) > but instead of the 20 declared there it yields > TypeError: function not supported for these types, and can't coerce > safely to supported types I could not reproduce the problem on current SVN: In [6]: np.ldexp(5., 2) Out[6]: 20.0 In [7]: np.ldexp(np.array([5., 2.]), np.array([2, 1])) Out[7]: array([ 20., 4.]) Regards St?fan From david at ar.media.kyoto-u.ac.jp Thu May 21 04:21:06 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 21 May 2009 17:21:06 +0900 Subject: [Numpy-discussion] how to use ldexp? In-Reply-To: <3a43192a-3907-4f62-8434-bf27cd243b76@r34g2000vba.googlegroups.com> References: <3a43192a-3907-4f62-8434-bf27cd243b76@r34g2000vba.googlegroups.com> Message-ID: <4A150EF2.4030804@ar.media.kyoto-u.ac.jp> dmitrey wrote: > hi all, > I have tried the example from numpy/add_newdocs.py > > np.ldexp(5., 2) > but instead of the 20 declared there it yields > TypeError: function not supported for these types, and can't coerce > safely to supported types > > Which OS/Compiler are you using ? David From dmitrey.kroshko at scipy.org Thu May 21 04:45:54 2009 From: dmitrey.kroshko at scipy.org (dmitrey) Date: Thu, 21 May 2009 01:45:54 -0700 (PDT) Subject: [Numpy-discussion] how to use ldexp? In-Reply-To: <4A150EF2.4030804@ar.media.kyoto-u.ac.jp> References: <3a43192a-3907-4f62-8434-bf27cd243b76@r34g2000vba.googlegroups.com> <4A150EF2.4030804@ar.media.kyoto-u.ac.jp> Message-ID: I have updated numpy to latest '1.4.0.dev7008', but the bug still remains. I use KUBUNTU 9.04, compilers - gcc (using build-essential), gfortran. D. 
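Assuming the failure comes from the plain Python int exponent being promoted to a 64-bit integer, for which ldexp has no loop, passing an explicit int32 exponent may work around it:

>>> import numpy as np
>>> np.ldexp(5., np.int32(2))
20.0
>>> np.ldexp(np.array([5., 2.]), np.array([2, 1], dtype=np.int32))
array([ 20.,   4.])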
On May 21, 11:21?am, David Cournapeau wrote: > dmitrey wrote: > > hi all, > > I have tried the example from numpy/add_newdocs.py > > > np.ldexp(5., 2) > > but instead of the 20 declared there it yields > > TypeError: function not supported for these types, and can't coerce > > safely to supported types > > Which OS/Compiler are you using ? > > David > _______________________________________________ > Numpy-discussion mailing list > Numpy-discuss... at scipy.orghttp://mail.scipy.org/mailman/listinfo/numpy-discussion From david at ar.media.kyoto-u.ac.jp Thu May 21 04:29:34 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 21 May 2009 17:29:34 +0900 Subject: [Numpy-discussion] how to use ldexp? In-Reply-To: References: <3a43192a-3907-4f62-8434-bf27cd243b76@r34g2000vba.googlegroups.com> <4A150EF2.4030804@ar.media.kyoto-u.ac.jp> Message-ID: <4A1510EE.1020601@ar.media.kyoto-u.ac.jp> dmitrey wrote: > I have updated numpy to latest '1.4.0.dev7008', but the bug still > remains. > I use KUBUNTU 9.04, compilers - gcc (using build-essential), gfortran. > D. > Can you post the build output (after having removed the build directory : rm -rf build && python setup.py build &> build.log) ? David From dmitrey.kroshko at scipy.org Thu May 21 04:55:59 2009 From: dmitrey.kroshko at scipy.org (dmitrey) Date: Thu, 21 May 2009 01:55:59 -0700 (PDT) Subject: [Numpy-discussion] how to use ldexp? In-Reply-To: <4A1510EE.1020601@ar.media.kyoto-u.ac.jp> References: <3a43192a-3907-4f62-8434-bf27cd243b76@r34g2000vba.googlegroups.com> <4A150EF2.4030804@ar.media.kyoto-u.ac.jp> <4A1510EE.1020601@ar.media.kyoto-u.ac.jp> Message-ID: <5351f1fc-5e25-46aa-915a-e674dd914f11@g19g2000vbi.googlegroups.com> On May 21, 11:29?am, David Cournapeau wrote: > dmitrey wrote: > > I have updated numpy to latest '1.4.0.dev7008', but the bug still > > remains. > > I use KUBUNTU 9.04, compilers - gcc (using build-essential), gfortran. > > D. > > Can you post the build output (after having removed the build directory > : rm -rf build && python setup.py build &> build.log) ? > > David ok, it's here http://pastebin.com/mb021e11 D. From pav at iki.fi Thu May 21 05:26:18 2009 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 21 May 2009 09:26:18 +0000 (UTC) Subject: [Numpy-discussion] how to use ldexp? References: <3a43192a-3907-4f62-8434-bf27cd243b76@r34g2000vba.googlegroups.com> <4A150EF2.4030804@ar.media.kyoto-u.ac.jp> Message-ID: Thu, 21 May 2009 01:45:54 -0700, dmitrey wrote: > I have updated numpy to latest '1.4.0.dev7008', but the bug still > remains. > I use KUBUNTU 9.04, compilers - gcc (using build-essential), gfortran. Worksforme on Ubuntu 9.04, on python2.6 and python2.5. Should be the same platform. -- Pauli Virtanen From pav at iki.fi Thu May 21 05:41:06 2009 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 21 May 2009 09:41:06 +0000 (UTC) Subject: [Numpy-discussion] how to use ldexp? References: <3a43192a-3907-4f62-8434-bf27cd243b76@r34g2000vba.googlegroups.com> <4A150EF2.4030804@ar.media.kyoto-u.ac.jp> Message-ID: Thu, 21 May 2009 09:26:18 +0000, Pauli Virtanen wrote: > Thu, 21 May 2009 01:45:54 -0700, dmitrey wrote: > >> I have updated numpy to latest '1.4.0.dev7008', but the bug still >> remains. >> I use KUBUNTU 9.04, compilers - gcc (using build-essential), gfortran. > > Worksforme on Ubuntu 9.04, on python2.6 and python2.5. Should be the > same platform. This was on 32-bit machine. 
I can reproduce this on a 64-bit platform, current SVN head: >>> np.ldexp(5, 2) Traceback (most recent call last): File "", line 1, in TypeError: function not supported for these types, and can't coerce safely to supported types >>> np.ldexp(5, np.int32(2)) 20.0 >>> np.ldexp.types ['fi->f', 'di->d', 'gi->g'] So for some reason the second argument tries to cast Python int to int64, and there's no loop to handle this. -- Pauli Virtanen From jrennie at gmail.com Thu May 21 09:10:14 2009 From: jrennie at gmail.com (Jason Rennie) Date: Thu, 21 May 2009 09:10:14 -0400 Subject: [Numpy-discussion] matrix default to column vector? Message-ID: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> By default, it looks like a 1-dim ndarray gets converted to a row vector by the matrix constructor. This seems to lead to some odd behavior such as a[1] yielding the 2nd element as an ndarray and throwing an IndexError as a matrix. Is it possible to set a flag to make the default be a column vector? Thanks, Jason -- Jason Rennie Research Scientist, ITA Software http://www.itasoftware.com/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From dmitrey.kroshko at scipy.org Thu May 21 11:26:45 2009 From: dmitrey.kroshko at scipy.org (dmitrey) Date: Thu, 21 May 2009 08:26:45 -0700 (PDT) Subject: [Numpy-discussion] where are the benefits of ldexp and/or "array times 2"? Message-ID: Hi all, I expected to have some speedup via using ldexp or multiplying an array by a power of 2 (doesn't it have to perform a simple shift of mantissa?), but I don't see the one. Have I done something wrong? See the code below. from scipy import rand from numpy import dot, ones, zeros, array, ldexp from time import time N = 1500 A = rand(N, N) b = rand(N) b2 = 2*ones(A.shape, 'int32') I = 100 t = time() for i in xrange(I): dot(A, b) # N^2 multiplications + some sum operations #A * 2.1 # N^2 multiplications, so it should consume no greater than 1st line time #ldexp(A, b2) # it should consume no greater than prev line time, isn't it? print 'time elapsed:', time() - t # 1st case: 0.62811088562 # 2nd case: 2.00850605965 # 3rd case: 6.79027700424 # Let me also note - # 1) using b = 2 * ones(N) or b = zeros(N) doesn't yield any speedup vs b = rand() # 2) using A * 2.0 (or mere 2) instead of 2.1 doesn't yield any speedup, despite it is exact integer power of 2. From robert.kern at gmail.com Thu May 21 11:46:12 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 21 May 2009 10:46:12 -0500 Subject: [Numpy-discussion] where are the benefits of ldexp and/or "array times 2"? In-Reply-To: References: Message-ID: <3d375d730905210846n5393a22emc2596501231d6e0f@mail.gmail.com> On Thu, May 21, 2009 at 10:26, dmitrey wrote: > Hi all, > I expected to have some speedup via using ldexp or multiplying an > array by a power of 2 (doesn't it have to perform a simple shift of > mantissa?), Addition of the exponent, not shift of the mantissa. > but I don't see the one. I said there *might* be a speedup, but it was probably going to be insignificant. The overhead of using frexp and ldexp probably outweighs any benefits. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From dagss at student.matnat.uio.no Thu May 21 11:59:32 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 21 May 2009 17:59:32 +0200 (CEST) Subject: [Numpy-discussion] ANN: Cython 0.11.2 released In-Reply-To: References: <4A1425F9.9030100@student.matnat.uio.no> Message-ID: <2418c278951e6461a58d1baea226bf95.squirrel@webmail.uio.no> > Where can I find release notes? (It would be helpful if I can point to a > URL as part of the fedora release) > OK, I put my email announcement up here: http://wiki.cython.org/ReleaseNotes-0.11.2 Tell me if you need something else (different format or level of detail -- the list of tickets on trac is always the most accurate thing). Dag Sverre From mhearne at usgs.gov Thu May 21 12:31:28 2009 From: mhearne at usgs.gov (Michael Hearne) Date: Thu, 21 May 2009 10:31:28 -0600 Subject: [Numpy-discussion] memoryerror with numpy.fromfile Message-ID: <6876BB83-8DF6-4588-89F9-9BB2E1CCCDB3@usgs.gov> I am getting a MemoryError from a numpy.fromfile() call in an application I am trying to deploy. Normally I would assume that this would mean that I don't have enough memory available on the system. However, if I run vmstat (Linux) at the same time as my process, I see that I have 3+ Gigabytes of memory free, and no swap space being used. I can't think of a way to track down this problem, so I'm punting to the list. The only thing I can imagine is that someone Python has been allocated X amount of space (very small relative to the memory actually available), and is asking for more than X. I don't know if this is true, or if there is even a way to check it. Sigh. Version info: Linux RedHat 5 kernel 2.6.18-128.1.10.el5PAE Python 2.5.4 -- EPD_Py25 4.3.0 numpy 1.3.0 Can anyone suggest more tests that I can do? Thanks, Mike From pav at iki.fi Thu May 21 13:23:17 2009 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 21 May 2009 17:23:17 +0000 (UTC) Subject: [Numpy-discussion] memoryerror with numpy.fromfile References: <6876BB83-8DF6-4588-89F9-9BB2E1CCCDB3@usgs.gov> Message-ID: Thu, 21 May 2009 10:31:28 -0600, Michael Hearne wrote: > I am getting a MemoryError from a numpy.fromfile() call in an > application I am trying to deploy. Normally I would assume that this > would mean that I don't have enough memory available on the system. > However, if I run vmstat (Linux) at the same time as my process, I see > that I have 3+ Gigabytes of memory free, and no swap space being used. If you are on a 32-bit platform, the maximum addressable memory for a single process is limited to 3 GB, and what can be allocated can be less than this because of memory fragmentation. Also, you should check that you don't have an ulimit set for virtual/RSS memory. -- Pauli Virtanen From charlesr.harris at gmail.com Thu May 21 13:46:11 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 21 May 2009 11:46:11 -0600 Subject: [Numpy-discussion] memoryerror with numpy.fromfile In-Reply-To: <6876BB83-8DF6-4588-89F9-9BB2E1CCCDB3@usgs.gov> References: <6876BB83-8DF6-4588-89F9-9BB2E1CCCDB3@usgs.gov> Message-ID: On Thu, May 21, 2009 at 10:31 AM, Michael Hearne wrote: > I am getting a MemoryError from a numpy.fromfile() call in an > application I am trying to deploy. Normally I would assume that this > would mean that I don't have enough memory available on the system. > However, if I run vmstat (Linux) at the same time as my process, I see > that I have 3+ Gigabytes of memory free, and no swap space being > used. 
I can't think of a way to track down this problem, so I'm > punting to the list. The only thing I can imagine is that someone > Python has been allocated X amount of space (very small relative to > the memory actually available), and is asking for more than X. I > don't know if this is true, or if there is even a way to check it. > Sigh. > > Version info: > Linux RedHat 5 kernel 2.6.18-128.1.10.el5PAE > Python 2.5.4 -- EPD_Py25 4.3.0 > numpy 1.3.0 > > Can anyone suggest more tests that I can do? > How big is the file and what type are you importing to? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From gkelly at gmail.com Thu May 21 13:54:21 2009 From: gkelly at gmail.com (Grant Kelly) Date: Thu, 21 May 2009 10:54:21 -0700 Subject: [Numpy-discussion] wiki page correction In-Reply-To: References: <646de7490905201208r273d1acch1868f70f9929b482@mail.gmail.com> Message-ID: <646de7490905211054q59f8e216t1ae925be56043034@mail.gmail.com> It's an immutable page. Can someone who already has access make the edit? On Wed, May 20, 2009 at 2:36 PM, Pauli Virtanen wrote: > Wed, 20 May 2009 12:08:46 -0700, Grant Kelly wrote: > >> I believe there is an error on this wiki page: >> >> http://www.scipy.org/NumPy_for_Matlab_Users >> >> >> MATLAB >> ? y=x(2,:) >> PYTHON >> ? y = x[2,:].copy() >> >> shouldn't the Python version be: >> ? y = x[1,:].copy() >> >> If not, please advise. > > Yes, it should be x[1,:].copy(). Please feel free to correct it. > > -- > Pauli Virtanen > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From mhearne at usgs.gov Thu May 21 14:14:18 2009 From: mhearne at usgs.gov (Michael Hearne) Date: Thu, 21 May 2009 12:14:18 -0600 Subject: [Numpy-discussion] memoryerror with numpy.fromfile In-Reply-To: References: <6876BB83-8DF6-4588-89F9-9BB2E1CCCDB3@usgs.gov> Message-ID: <072ED70C-6EA9-4AB0-88AE-946E96C107AF@usgs.gov> All: Never mind! The file I was attempting to read was part of an (apparently) silently incomplete download, and as such, the file size didn't match the metadata in the header file describing the data file, and I was reading beyond the end of the file. I would submit that MemoryError is perhaps a little misleading for this particular case, but oh well. Thanks for the comments! --Mike On May 21, 2009, at 11:46 AM, Charles R Harris wrote: > > > On Thu, May 21, 2009 at 10:31 AM, Michael Hearne > wrote: > I am getting a MemoryError from a numpy.fromfile() call in an > application I am trying to deploy. Normally I would assume that this > would mean that I don't have enough memory available on the system. > However, if I run vmstat (Linux) at the same time as my process, I see > that I have 3+ Gigabytes of memory free, and no swap space being > used. I can't think of a way to track down this problem, so I'm > punting to the list. The only thing I can imagine is that someone > Python has been allocated X amount of space (very small relative to > the memory actually available), and is asking for more than X. I > don't know if this is true, or if there is even a way to check it. > Sigh. > > Version info: > Linux RedHat 5 kernel 2.6.18-128.1.10.el5PAE > Python 2.5.4 -- EPD_Py25 4.3.0 > numpy 1.3.0 > > Can anyone suggest more tests that I can do? > > How big is the file and what type are you importing to? 
> > Chuck > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Thu May 21 14:36:24 2009 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 21 May 2009 18:36:24 +0000 (UTC) Subject: [Numpy-discussion] memoryerror with numpy.fromfile References: <6876BB83-8DF6-4588-89F9-9BB2E1CCCDB3@usgs.gov> <072ED70C-6EA9-4AB0-88AE-946E96C107AF@usgs.gov> Message-ID: Thu, 21 May 2009 12:14:18 -0600, Michael Hearne wrote: > All: Never mind! The file I was attempting to read was part of an > (apparently) silently incomplete download, and as such, the file size > didn't match the metadata in the header file describing the data file, > and I was reading beyond the end of the file. > > I would submit that MemoryError is perhaps a little misleading for this > particular case, but oh well. Well, that it raises MemoryError in this case is a bug, so maybe a ticket should be filed. There's another bug associated with reading empty files: http://projects.scipy.org/numpy/ticket/1115 which is maybe related. -- Pauli Virtanen From pav at iki.fi Thu May 21 14:39:12 2009 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 21 May 2009 18:39:12 +0000 (UTC) Subject: [Numpy-discussion] wiki page correction References: <646de7490905201208r273d1acch1868f70f9929b482@mail.gmail.com> <646de7490905211054q59f8e216t1ae925be56043034@mail.gmail.com> Message-ID: Thu, 21 May 2009 10:54:21 -0700, Grant Kelly wrote: > It's an immutable page. Can someone who already has access make the > edit? Umm, are you sure? I don't see any ACL's on the page. (Though you need to register an account on the wiki before editing.) -- Pauli Virtanen From albert.thuswaldner at gmail.com Thu May 21 16:04:00 2009 From: albert.thuswaldner at gmail.com (albert.thuswaldner at gmail.com) Date: Thu, 21 May 2009 20:04:00 +0000 Subject: [Numpy-discussion] Home for pyhdf5io? Message-ID: <000e0cd2a07010d73d046a71a597@google.com> Dear list, I'm writing this because i have developed a small python module that might be of interest to you the readers of this list: http://code.google.com/p/pyhdf5io/ It basically implements load/save functions that mimic the behaviour of those found in Matlab, ie with them you can store your variables from within the interactive shell (IPython, python) or from within a function, and then load them back in again. One important difference is that the hdf5 format is used to store the variables, which comes with aa number of benefits: - a open standard file format which is supported by many applications. - completely portable file format across different platforms. Read more here: http://www.hdfgroup.org/HDF5/whatishdf5.html And now to the question: I think that this module is to small to be developed and maintained on its on, I think It would be better if it could be part of some larger project. So where would pyhdf5io fit in? Any tips and ideas are highly appreciated. Thanks. /Albert -------------- next part -------------- An HTML attachment was scrubbed... URL: From dagss at student.matnat.uio.no Thu May 21 16:09:27 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 21 May 2009 22:09:27 +0200 Subject: [Numpy-discussion] Home for pyhdf5io? 
In-Reply-To: <000e0cd2a07010d73d046a71a597@google.com> References: <000e0cd2a07010d73d046a71a597@google.com> Message-ID: <4A15B4F7.4040906@student.matnat.uio.no> albert.thuswaldner at gmail.com wrote: > Dear list, > I'm writing this because i have developed a small python module that > might be of interest to you the readers of this list: > > http://code.google.com/p/pyhdf5io/ > > It basically implements load/save functions that mimic the behaviour of > those found in Matlab, i.e. with them you can store your variables from > within the interactive shell (IPython, python) or from within a > function, and then load them back in again. One important difference is > that the hdf5 format is used to store the variables, which comes with a > a number of benefits: > - a open standard file format which is supported by many applications. > - completely portable file format across different platforms. > > Read more here: http://www.hdfgroup.org/HDF5/whatishdf5.html > > And now to the question: > > I think that this module is to small to be developed and maintained on > its on, I think It would be better if it could be part of some larger > project. So where would pyhdf5io fit in? > Any tips and ideas are highly appreciate I'd expect to find it in http://h5py.alfven.org/ I think... -- Dag Sverre From robert.kern at gmail.com Thu May 21 16:07:34 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 21 May 2009 15:07:34 -0500 Subject: [Numpy-discussion] Home for pyhdf5io? In-Reply-To: <000e0cd2a07010d73d046a71a597@google.com> References: <000e0cd2a07010d73d046a71a597@google.com> Message-ID: <3d375d730905211307q45450ffn7ebbf8c78b6f3713@mail.gmail.com> On Thu, May 21, 2009 at 15:04, wrote: > Dear list, > I'm writing this because i have developed a small python module that might > be of interest to you the readers of this list: > > http://code.google.com/p/pyhdf5io/ > > It basically implements load/save functions that mimic the behaviour of > those found in Matlab, i.e. with them you can store your variables from > within the interactive shell (IPython, python) or from within a function, > and then load them back in again. One important difference is that the hdf5 > format is used to store the variables, which comes with a a number of > benefits: > - a open standard file format which is supported by many applications. > - completely portable file format across different platforms. > > Read more here: http://www.hdfgroup.org/HDF5/whatishdf5.html > > And now to the question: > > I think that this module is to small to be developed and maintained on its > on, I think It would be better if it could be part of some larger project. > So where would pyhdf5io fit in? PyTables, probably. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From dagss at student.matnat.uio.no Thu May 21 16:18:26 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 21 May 2009 22:18:26 +0200 Subject: [Numpy-discussion] Home for pyhdf5io? 
In-Reply-To: <4A15B4F7.4040906@student.matnat.uio.no> References: <000e0cd2a07010d73d046a71a597@google.com> <4A15B4F7.4040906@student.matnat.uio.no> Message-ID: <4A15B712.7010901@student.matnat.uio.no> Dag Sverre Seljebotn wrote: > albert.thuswaldner at gmail.com wrote: >> Dear list, >> I'm writing this because i have developed a small python module that >> might be of interest to you the readers of this list: >> >> http://code.google.com/p/pyhdf5io/ >> >> It basically implements load/save functions that mimic the behaviour of >> those found in Matlab, i.e. with them you can store your variables from >> within the interactive shell (IPython, python) or from within a >> function, and then load them back in again. One important difference is >> that the hdf5 format is used to store the variables, which comes with a >> a number of benefits: >> - a open standard file format which is supported by many applications. >> - completely portable file format across different platforms. >> >> Read more here: http://www.hdfgroup.org/HDF5/whatishdf5.html >> >> And now to the question: >> >> I think that this module is to small to be developed and maintained on >> its on, I think It would be better if it could be part of some larger >> project. So where would pyhdf5io fit in? >> Any tips and ideas are highly appreciate > > I'd expect to find it in > > http://h5py.alfven.org/ > > I think... > Please disregard this, I didn't notice you had a PyTable dependency. -- Dag Sverre From dwf at cs.toronto.edu Thu May 21 16:38:23 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Thu, 21 May 2009 16:38:23 -0400 Subject: [Numpy-discussion] Home for pyhdf5io? In-Reply-To: <000e0cd2a07010d73d046a71a597@google.com> References: <000e0cd2a07010d73d046a71a597@google.com> Message-ID: <8A557145-6C11-4843-A75C-B46ED8F6C496@cs.toronto.edu> Hi Albert, So this is a wrapper on top of PyTables to implement load() and save()? Neat. Obviously if you're installing PyTables, you can do a lot better and organize your data hierarchically without the messiness of Matlab structures, walk the node tree, all kinds of fun stuff, but if you're an expatriate matlab user and just want to save some matrices... this is great. Notably, that was one of my gripes about ipython+numpy+scipy +matplotlib when I first came from Matlab. I think you should send a message to the PyTables list, ask Francesc if he thinks it has a place in PyTables for it as a 'lite' wrapper or something, for people who need to save data but don't need/are intimidated by all the features that PyTables provides. David On 21-May-09, at 4:04 PM, albert.thuswaldner at gmail.com wrote: > Dear list, > I'm writing this because i have developed a small python module that > might be of interest to you the readers of this list: > > http://code.google.com/p/pyhdf5io/ > > It basically implements load/save functions that mimic the behaviour > of those found in Matlab, ie with them you can store your variables > from within the interactive shell (IPython, python) or from within a > function, and then load them back in again. One important difference > is that the hdf5 format is used to store the variables, which comes > with aa number of benefits: > - a open standard file format which is supported by many applications. > - completely portable file format across different platforms. 
> > Read more here: http://www.hdfgroup.org/HDF5/whatishdf5.html > > And now to the question: > > I think that this module is to small to be developed and maintained > on its on, I think It would be better if it could be part of some > larger project. So where would pyhdf5io fit in? > Any tips and ideas are highly appreciated. > > Thanks. > > /Albert > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From albert.thuswaldner at gmail.com Thu May 21 16:54:08 2009 From: albert.thuswaldner at gmail.com (Albert Thuswaldner) Date: Thu, 21 May 2009 22:54:08 +0200 Subject: [Numpy-discussion] Home for pyhdf5io? In-Reply-To: <8A557145-6C11-4843-A75C-B46ED8F6C496@cs.toronto.edu> References: <000e0cd2a07010d73d046a71a597@google.com> <8A557145-6C11-4843-A75C-B46ED8F6C496@cs.toronto.edu> Message-ID: On Thu, May 21, 2009 at 22:38, David Warde-Farley wrote: > Hi Albert, > > So this is a wrapper on top of PyTables to implement load() and > save()? Neat. Yes, you got the idea. in its most simplest form you can type: hdf5save() And all your local variables are saved to a file with the default file name "hdf5io.h5". Of course it also allows you to specify a file name and what variables you would like to save. As it is based on hdf5 you can also store the variables to a certain group within the file (If you know how hdf5 works, you probably know what I'm talking about). Appending data to existing hdf5-files is also possible. > Obviously if you're installing PyTables, you can do a lot better and > organize your data hierarchically without the messiness of Matlab > structures, walk the node tree, all kinds of fun stuff, but if ?you're > an expatriate matlab user and just want to save some matrices... this > is great. Notably, that was one of my gripes about ipython+numpy+scipy > +matplotlib when I first came from Matlab. Exactly! > I think you should send a message to the PyTables list, ask Francesc > if he thinks it has a place in PyTables for it as a 'lite' wrapper or > something, for people who need to save data but don't need/are > intimidated by all the features that PyTables provides. Actually, I just e-mail Francesc, see what he thinks. Thanks for your reply. Also thanks to the others who also have replied /Albert > David > > On 21-May-09, at 4:04 PM, albert.thuswaldner at gmail.com wrote: > >> Dear list, >> I'm writing this because i have developed a small python module that >> might be of interest to you the readers of this list: >> >> http://code.google.com/p/pyhdf5io/ >> >> It basically implements load/save functions that mimic the behaviour >> of those found in Matlab, ie with them you can store your variables >> from within the interactive shell (IPython, python) or from within a >> function, and then load them back in again. One important difference >> is that the hdf5 format is used to store the variables, which comes >> with aa number of benefits: >> - a open standard file format which is supported by many applications. >> - completely portable file format across different platforms. >> >> Read more here: http://www.hdfgroup.org/HDF5/whatishdf5.html >> >> And now to the question: >> >> I think that this module is to small to be developed and maintained >> on its on, I think It would be better if it could be part of some >> larger project. So where would pyhdf5io fit in? >> Any tips and ideas are highly appreciated. >> >> Thanks. 
>> >> /Albert >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From faltet at pytables.org Fri May 22 04:00:56 2009 From: faltet at pytables.org (Francesc Alted) Date: Fri, 22 May 2009 10:00:56 +0200 Subject: [Numpy-discussion] Home for pyhdf5io? In-Reply-To: References: <000e0cd2a07010d73d046a71a597@google.com> Message-ID: <200905221000.56593.faltet@pytables.org> Hello Albert, A Thursday 21 May 2009 22:32:10 escrigu?reu: > Hi, > First of all thanks for your work on PyTables! I think it is excellent > and it has been really nice working with it. > > I'm writing this because i have developed a small python module that > uses pyTables: > > http://code.google.com/p/pyhdf5io/ > > It basically implements load/save functions that mimic the behaviour > of those found in Matlab, i.e. with them you can store your variables > from within the interactive shell (IPython, python) or from within a > function, and then load them back in again. I've been having a look at your module and seems pretty cute. Incidentally, there is another module module that does similar things: http://www.elisanet.fi/ptvirtan/software/hdf5pickle/index.html However, I do like your package better in the sense that it adds more 'magic' to the load/save routines. But maybe you want to have a look at the above: it can give you more ideas, like for example, using CArrays and compression for very large arrays, or Tables for structured arrays. > And now to the question: > > I think that this module is to small to be developed and maintained on > its own. I think It would be better if it could be part of some larger > project, maybe pyTables, I don't know. Sure. I think it could fit perfectly as a module inside PyTables, in the same wave than 'filenode' and 'netcdf3'. Most of your module can be dropped as-is into the PyTables module hierarchy. However, it would be nice if you can write some documentation following the format of User's Guide chapters like the ones about 'filenode' or 'netcdf3' modules. Please, let's continue the discussion in the PyTables list in case we need to. Thanks for contributing! -- Francesc Alted From gregor.thalhammer at gmail.com Fri May 22 05:42:56 2009 From: gregor.thalhammer at gmail.com (Gregor Thalhammer) Date: Fri, 22 May 2009 11:42:56 +0200 Subject: [Numpy-discussion] numpy ufuncs and COREPY - any info? In-Reply-To: References: Message-ID: <4A1673A0.2080503@googlemail.com> dmitrey schrieb: > hi all, > has anyone already tried to compare using an ordinary numpy ufunc vs > that one from corepy, first of all I mean the project > http://socghop.appspot.com/student_project/show/google/gsoc2009/python/t124024628235 > > It would be interesting to know what is speedup for (eg) vec ** 0.5 or > (if it's possible - it isn't pure ufunc) numpy.dot(Matrix, vec). Or > any another example. > I have no experience with the mentioned CorePy, but recently I was playing around with accelerated ufuncs using Intels Math Kernel Library (MKL). These improvements are now part of the numexpr package http://code.google.com/p/numexpr/ Some remarks on possible speed improvements on recent Intel x86 processors. 1) basic arithmetic ufuncs (add, sub, mul, ...) 
in standard numpy are fast (SSE is used) and speed is limited by memory bandwidth. 2) the speed of many transcendental functions (exp, sin, cos, pow, ...) can be improved by _roughly_ a factor of five (single core) by using the MKL. Most of the improvements stem from using faster algorithms with a vectorized implementation. Note: the speed improvement depends on a _lot_ of other circumstances. 3) Improving performance by using multi cores is much more difficult. Only for sufficiently large (>1e5) arrays a significant speedup is possible. Where a speed gain is possible, the MKL uses several cores. Some experimentation showed that adding a few OpenMP constructs you could get a similar speedup with numpy. 4) numpy.dot uses optimized implementations. Gregor From gregor.thalhammer at gmail.com Fri May 22 05:55:31 2009 From: gregor.thalhammer at gmail.com (Gregor Thalhammer) Date: Fri, 22 May 2009 11:55:31 +0200 Subject: [Numpy-discussion] where are the benefits of ldexp and/or "array times 2"? In-Reply-To: References: Message-ID: <4A167693.1090901@googlemail.com> dmitrey schrieb: > Hi all, > I expected to have some speedup via using ldexp or multiplying an > array by a power of 2 (doesn't it have to perform a simple shift of > mantissa?), but I don't see the one. > > # Let me also note - > # 1) using b = 2 * ones(N) or b = zeros(N) doesn't yield any speedup > vs b = rand() > # 2) using A * 2.0 (or mere 2) instead of 2.1 doesn't yield any > speedup, despite it is exact integer power of 2. > On recent processors multiplication is very fast and takes 1.5 clock cycles (float, double precision), independent of the values. There is very little gain by using bit shift operators. Gregor From faltet at pytables.org Fri May 22 06:08:21 2009 From: faltet at pytables.org (Francesc Alted) Date: Fri, 22 May 2009 12:08:21 +0200 Subject: [Numpy-discussion] numpy ufuncs and COREPY - any info? In-Reply-To: <4A1673A0.2080503@googlemail.com> References: <4A1673A0.2080503@googlemail.com> Message-ID: <200905221208.21567.faltet@pytables.org> A Friday 22 May 2009 11:42:56 Gregor Thalhammer escrigu?: > dmitrey schrieb: > > hi all, > > has anyone already tried to compare using an ordinary numpy ufunc vs > > that one from corepy, first of all I mean the project > > http://socghop.appspot.com/student_project/show/google/gsoc2009/python/t1 > >24024628235 > > > > It would be interesting to know what is speedup for (eg) vec ** 0.5 or > > (if it's possible - it isn't pure ufunc) numpy.dot(Matrix, vec). Or > > any another example. > > I have no experience with the mentioned CorePy, but recently I was > playing around with accelerated ufuncs using Intels Math Kernel Library > (MKL). These improvements are now part of the numexpr package > http://code.google.com/p/numexpr/ > Some remarks on possible speed improvements on recent Intel x86 processors. > 1) basic arithmetic ufuncs (add, sub, mul, ...) in standard numpy are > fast (SSE is used) and speed is limited by memory bandwidth. > 2) the speed of many transcendental functions (exp, sin, cos, pow, ...) > can be improved by _roughly_ a factor of five (single core) by using the > MKL. Most of the improvements stem from using faster algorithms with a > vectorized implementation. Note: the speed improvement depends on a > _lot_ of other circumstances. > 3) Improving performance by using multi cores is much more difficult. > Only for sufficiently large (>1e5) arrays a significant speedup is > possible. Where a speed gain is possible, the MKL uses several cores. 
> Some experimentation showed that adding a few OpenMP constructs you > could get a similar speedup with numpy. > 4) numpy.dot uses optimized implementations. Good points Gregor. However, I wouldn't say that improving performance by using multi cores is *that* difficult, but rather that multi cores can only be used efficiently *whenever* the memory bandwith is not a limitation. An example of this is the computation of transcendental functions, where, even using vectorized implementations, the computation speed is still CPU-bounded in many cases. And you have experimented yourself very good speed-ups for these cases with your implementation of numexpr/MKL :) Cheers, -- Francesc Alted From faltet at pytables.org Fri May 22 06:16:55 2009 From: faltet at pytables.org (Francesc Alted) Date: Fri, 22 May 2009 12:16:55 +0200 Subject: [Numpy-discussion] where are the benefits of ldexp and/or "array times 2"? In-Reply-To: <4A167693.1090901@googlemail.com> References: <4A167693.1090901@googlemail.com> Message-ID: <200905221216.55371.faltet@pytables.org> A Friday 22 May 2009 11:55:31 Gregor Thalhammer escrigu?: > dmitrey schrieb: > > Hi all, > > I expected to have some speedup via using ldexp or multiplying an > > array by a power of 2 (doesn't it have to perform a simple shift of > > mantissa?), but I don't see the one. > > > > # Let me also note - > > # 1) using b = 2 * ones(N) or b = zeros(N) doesn't yield any speedup > > vs b = rand() > > # 2) using A * 2.0 (or mere 2) instead of 2.1 doesn't yield any > > speedup, despite it is exact integer power of 2. > > On recent processors multiplication is very fast and takes 1.5 clock > cycles (float, double precision), independent of the values. There is > very little gain by using bit shift operators. ...unless you use the vectorization capabilities of modern Intel-compatible processors and shift data in bunches of up to 4 elements (i.e. the number of floats that fits on a 128-bit SSE2 register), in which case you can perform operations up to a speed of 0.25 cycles/element. Indeed, that requires dealing with SSE2 instructions in your code, but using latest GCC, ICC or MSVC implementations, this is not that difficult. Cheers, -- Francesc Alted From pav at iki.fi Fri May 22 06:24:50 2009 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 22 May 2009 10:24:50 +0000 (UTC) Subject: [Numpy-discussion] Home for pyhdf5io? References: <000e0cd2a07010d73d046a71a597@google.com> <200905221000.56593.faltet@pytables.org> Message-ID: Fri, 22 May 2009 10:00:56 +0200, Francesc Alted kirjoitti: [clip: pyhdf5io] > I've been having a look at your module and seems pretty cute. > Incidentally, there is another module module that does similar things: > > http://www.elisanet.fi/ptvirtan/software/hdf5pickle/index.html > > However, I do like your package better in the sense that it adds more > 'magic' to the load/save routines. But maybe you want to have a look at > the above: it can give you more ideas, like for example, using CArrays > and compression for very large arrays, or Tables for structured arrays. I don't think these two are really comparable. The significant difference appears to be that pyhdf5io is a thin wrapper for File.createArray, so when it encounters non-array objects, it will pickle them to strings, and save the strings to the HDF5 file. Hdf5pickle, OTOH, implements the pickle protocol, and will unwrap non- array objects so that all their attributes etc. are exposed in the hdf5 file and can be read by non-Python applications. 
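Roughly, the thin-wrapper approach amounts to something like the following (only a sketch of the idea, not the actual pyhdf5io code; it assumes PyTables' openFile/createArray is happy to store a plain pickled string):

import cPickle
import numpy as np
import tables

def save_object(h5file, name, obj):
    if isinstance(obj, np.ndarray):
        h5file.createArray(h5file.root, name, obj)
    else:
        # non-array objects end up as one opaque pickled string per node
        h5file.createArray(h5file.root, name, cPickle.dumps(obj))

h5file = tables.openFile('data.h5', 'w')
save_object(h5file, 'x', np.arange(10))
save_object(h5file, 'meta', {'gain': 1.5, 'label': 'run 1'})
h5file.close()

A non-Python (or non-pickle-aware) reader then sees 'meta' only as an opaque string blob, whereas hdf5pickle would expose the dictionary contents as separate, individually readable pieces of the HDF5 file.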
-- Pauli Virtanen From afriedle at indiana.edu Fri May 22 07:52:46 2009 From: afriedle at indiana.edu (Andrew Friedley) Date: Fri, 22 May 2009 07:52:46 -0400 Subject: [Numpy-discussion] numpy ufuncs and COREPY - any info? In-Reply-To: References: Message-ID: <4A16920E.8060805@indiana.edu> (sending again) Hi, I'm the student doing the project. I have a blog here, which contains some initial performance numbers for a couple test ufuncs I did: http://numcorepy.blogspot.com It's really too early yet to give definitive results though; GSoC officially starts in two days :) What I'm finding is that the existing ufuncs are already pretty fast; it appears right now that the main limitation is memory bandwidth. If that's really the case, the performance gains I'll get will be through cache tricks (non-temporal loads/stores), reducing memory accesses and using multiple cores to get more bandwidth. Another alternative we've talked about, and I (more and more likely) may look into is composing multiple operations together into a single ufunc. Again the main idea being that memory accesses can be reduced/eliminated. Andrew dmitrey wrote: > hi all, > has anyone already tried to compare using an ordinary numpy ufunc vs > that one from corepy, first of all I mean the project > http://socghop.appspot.com/student_project/show/google/gsoc2009/python/t124024628235 > > It would be interesting to know what is speedup for (eg) vec ** 0.5 or > (if it's possible - it isn't pure ufunc) numpy.dot(Matrix, vec). Or > any another example. > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From afriedle at indiana.edu Fri May 22 07:59:17 2009 From: afriedle at indiana.edu (Andrew Friedley) Date: Fri, 22 May 2009 07:59:17 -0400 Subject: [Numpy-discussion] numpy ufuncs and COREPY - any info? In-Reply-To: <200905221208.21567.faltet@pytables.org> References: <4A1673A0.2080503@googlemail.com> <200905221208.21567.faltet@pytables.org> Message-ID: <4A169395.6020604@indiana.edu> Francesc Alted wrote: > A Friday 22 May 2009 11:42:56 Gregor Thalhammer escrigu?: >> dmitrey schrieb: >> 3) Improving performance by using multi cores is much more difficult. >> Only for sufficiently large (>1e5) arrays a significant speedup is >> possible. Where a speed gain is possible, the MKL uses several cores. >> Some experimentation showed that adding a few OpenMP constructs you >> could get a similar speedup with numpy. >> 4) numpy.dot uses optimized implementations. > > Good points Gregor. However, I wouldn't say that improving performance by > using multi cores is *that* difficult, but rather that multi cores can only be > used efficiently *whenever* the memory bandwith is not a limitation. An > example of this is the computation of transcendental functions, where, even > using vectorized implementations, the computation speed is still CPU-bounded > in many cases. And you have experimented yourself very good speed-ups for > these cases with your implementation of numexpr/MKL :) Using multiple cores is pretty easy for element-wise ufuncs; no communication needs to occur and the work partitioning is trivial. And actually I've found with some initial testing that multiple cores does still help when you are memory bound. I don't fully understand why yet, though I have some ideas. One reason is multiple memory controllers due to multiple sockets (ie opteron). 
Another is that each thread is pulling memory from a different bank, utilizing more bandwidth than a single sequential thread could. However if that's the case, we could possibly come up with code for a single thread that achieves (nearly) the same additional throughput.. Andrew From faltet at pytables.org Fri May 22 08:17:50 2009 From: faltet at pytables.org (Francesc Alted) Date: Fri, 22 May 2009 14:17:50 +0200 Subject: [Numpy-discussion] numpy ufuncs and COREPY - any info? In-Reply-To: <4A169395.6020604@indiana.edu> References: <200905221208.21567.faltet@pytables.org> <4A169395.6020604@indiana.edu> Message-ID: <200905221417.50613.faltet@pytables.org> A Friday 22 May 2009 13:59:17 Andrew Friedley escrigu?: > Using multiple cores is pretty easy for element-wise ufuncs; no > communication needs to occur and the work partitioning is trivial. And > actually I've found with some initial testing that multiple cores does > still help when you are memory bound. I don't fully understand why yet, > though I have some ideas. One reason is multiple memory controllers due > to multiple sockets (ie opteron). Yeah. I think this must likely be the reason. If, as in your case, you have several independent paths from different processors to your data, then you can achieve speed-ups even if you are having a memory bound in a one-processor scenario. > Another is that each thread is > pulling memory from a different bank, utilizing more bandwidth than a > single sequential thread could. However if that's the case, we could > possibly come up with code for a single thread that achieves (nearly) > the same additional throughput.. Well, I don't think you can achieve important speed-ups in this case, but experimenting never hurts :) Good luck! -- Francesc Alted From faltet at pytables.org Fri May 22 08:33:18 2009 From: faltet at pytables.org (Francesc Alted) Date: Fri, 22 May 2009 14:33:18 +0200 Subject: [Numpy-discussion] numpy ufuncs and COREPY - any info? In-Reply-To: <4A16920E.8060805@indiana.edu> References: <4A16920E.8060805@indiana.edu> Message-ID: <200905221433.18551.faltet@pytables.org> A Friday 22 May 2009 13:52:46 Andrew Friedley escrigu?: > (sending again) > > Hi, > > I'm the student doing the project. I have a blog here, which contains > some initial performance numbers for a couple test ufuncs I did: > > http://numcorepy.blogspot.com > > It's really too early yet to give definitive results though; GSoC > officially starts in two days :) What I'm finding is that the existing > ufuncs are already pretty fast; it appears right now that the main > limitation is memory bandwidth. If that's really the case, the > performance gains I'll get will be through cache tricks (non-temporal > loads/stores), reducing memory accesses and using multiple cores to get > more bandwidth. > > Another alternative we've talked about, and I (more and more likely) may > look into is composing multiple operations together into a single ufunc. > Again the main idea being that memory accesses can be reduced/eliminated. IMHO, composing multiple operations together is the most promising venue for leveraging current multicore systems. Another interesting approach is to implement costly operations (from the point of view of CPU resources), namely, transcendental functions like sin, cos or tan, but also others like sqrt or pow) in a parallel way. 
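To see what composing buys you, here is a toy comparison using numexpr (just an illustration; the same idea would apply to a CorePy-generated kernel). The whole expression is evaluated block-wise in a single pass over the operands, instead of materializing a temporary array for every intermediate ufunc:

import numpy as np
import numexpr as ne

a = np.random.rand(1000000)
b = np.random.rand(1000000)

# plain numpy: temporaries are allocated for a**2, b**2 and their sum
c1 = np.sqrt(a**2 + b**2)

# composed expression: one blocked pass over a and b, no large temporaries
c2 = ne.evaluate("sqrt(a**2 + b**2)")

assert np.allclose(c1, c2)

And for the costly transcendental functions, implementing them in a parallel way is, as said, the other promising route.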
If besides, you can combine this with vectorized versions of them (by using the well spread SSE2 instruction set, see [1] for an example), then you would be able to achieve really good results for sure (at least Intel did with its VML library ;) [1] http://gruntthepeon.free.fr/ssemath/ Cheers, -- Francesc Alted From andrea.gavana at gmail.com Fri May 22 12:31:19 2009 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Fri, 22 May 2009 17:31:19 +0100 Subject: [Numpy-discussion] List/location of consecutive integers Message-ID: Hi All, this should be a very easy question but I am trying to make a script run as fast as possible, so please bear with me if the solution is easy and I just overlooked it. I have a list of integers, like this one: indices = [1,2,3,4,5,6,7,8,9,255,256,257,258,10001,10002,10003,10004] >From this list, I would like to find out which values are consecutive and store them in another list of tuples (begin_consecutive, end_consecutive) or a simple list: as an example, the previous list will become: new_list = [(1, 9), (255, 258), (10001, 10004)] I can do it with for loops, but I am trying to speed up a fotran-based routine which I wrap with f2py (ideally I would like to do this step in Fortran too, so if you have a suggestion on how to do it also in Fortran it would be more than welcome). Do you have any suggestions? Thank you for your time. Andrea. "Imagination Is The Only Weapon In The War Against Reality." http://xoomer.alice.it/infinity77/ http://thedoomedcity.blogspot.com/ From josef.pktd at gmail.com Fri May 22 12:53:19 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 22 May 2009 12:53:19 -0400 Subject: [Numpy-discussion] List/location of consecutive integers In-Reply-To: References: Message-ID: <1cd32cbb0905220953u279f0303m59da1535921f8544@mail.gmail.com> On Fri, May 22, 2009 at 12:31 PM, Andrea Gavana wrote: > Hi All, > > ? ?this should be a very easy question but I am trying to make a > script run as fast as possible, so please bear with me if the solution > is easy and I just overlooked it. > > I have a list of integers, like this one: > > indices = [1,2,3,4,5,6,7,8,9,255,256,257,258,10001,10002,10003,10004] > > >From this list, I would like to find out which values are consecutive > and store them in another list of tuples (begin_consecutive, > end_consecutive) or a simple list: as an example, the previous list > will become: > > new_list = [(1, 9), (255, 258), (10001, 10004)] > > I can do it with for loops, but I am trying to speed up a fotran-based > routine which I wrap with f2py (ideally I would like to do this step > in Fortran too, so if you have a suggestion on how to do it also in > Fortran it would be more than welcome). Do you have any suggestions? > > Thank you for your time. 
> something along the line of: >>> indices = np.array([1,2,3,4,5,6,7,8,9,255,256,257,258,10001,10002,10003,10004]) >>> idx = (np.diff(indices) != 1).nonzero()[0] >>> idx array([ 8, 12]) >>> idxf = np.hstack((-1,idx,len(indices)-1)) >>> vmin = indices[idxf[:-1]+1] >>> vmax = indices[idxf[1:]] >>> zip(vmin,vmax) [(1, 9), (255, 258), (10001, 10004)] Josef From Chris.Barker at noaa.gov Fri May 22 13:03:05 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Fri, 22 May 2009 10:03:05 -0700 Subject: [Numpy-discussion] List/location of consecutive integers In-Reply-To: References: Message-ID: <4A16DAC9.2090004@noaa.gov> Andrea Gavana wrote: > I have a list of integers, like this one: > > indices = [1,2,3,4,5,6,7,8,9,255,256,257,258,10001,10002,10003,10004] > >>From this list, I would like to find out which values are consecutive > and store them in another list of tuples (begin_consecutive, > end_consecutive) or a simple list: as an example, the previous list > will become: > > new_list = [(1, 9), (255, 258), (10001, 10004)] Is this faster? In [102]: indices = np.array([1,2,3,4,5,6,7,8,9,255,256,257,258,10001,10002,10003,10004,sys.maxint]) In [103]: breaks = np.diff(indices) != 1 In [104]: zip(indices[np.r_[True, breaks[:-1]]], indices[breaks]) Out[104]: [(1, 9), (255, 258), (10001, 10004)] -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From dwf at cs.toronto.edu Fri May 22 15:59:48 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Fri, 22 May 2009 15:59:48 -0400 Subject: [Numpy-discussion] List/location of consecutive integers In-Reply-To: <4A16DAC9.2090004@noaa.gov> References: <4A16DAC9.2090004@noaa.gov> Message-ID: <237A4B0E-E8EC-401F-A8C9-D7A24F9DC47A@cs.toronto.edu> On 22-May-09, at 1:03 PM, Christopher Barker wrote: > In [104]: zip(indices[np.r_[True, breaks[:-1]]], indices[breaks]) I don't think this is very general: In [53]: indices Out[53]: array([ -3, 1, 2, 3, 4, 5, 6, 7, 8, 9, 255, 256, 257, 258, 10001, 10002, 10003, 10004]) In [54]: breaks = diff(indices) != 1 In [55]: zip(indices[np.r_[True, breaks[:-1]]], indices[breaks]) Out[55]: [(-3, -3), (1, 9), (255, 258)] From faltet at pytables.org Fri May 22 16:04:38 2009 From: faltet at pytables.org (Francesc Alted) Date: Fri, 22 May 2009 22:04:38 +0200 Subject: [Numpy-discussion] Join us for "Scientific Computing with Python Webinar" In-Reply-To: <2355F1D0-DD01-4BD1-8482-FDDC6FEE6C91@enthought.com> References: <1437076956.5204661242825355676.JavaMail.root@g2mp1br2.las.expertcity.com> <2355F1D0-DD01-4BD1-8482-FDDC6FEE6C91@enthought.com> Message-ID: <200905222204.38273.faltet@pytables.org> A Wednesday 20 May 2009 16:45:12 Travis Oliphant escrigu?: > Hello all Python users: > > I am pleased to announce the beginning of a free Webinar series that > discusses using Python for scientific computing. Enthought will host > this free series which will take place once a month for 30-45 > minutes. The schedule and length may change based on participation > feedback, but for now it is scheduled for the fourth Friday of every > month. This free webinar should not be confused with the EPD > webinar on the first Friday of each month which is open only to > subscribers to the Enthought Python Distribution. 
Mmh, I'm trying to connect, but it seems that Linux is not supported for this sort of webinars: To join the Webinar, please use one of the following supported operating systems: ? Windows? 2000, XP Pro, XP Home, 2003 Server, Vista ? Mac OS? X, Panther? 10.3.9, Tiger? 10.4.5 or higher Too bad :-/ -- Francesc Alted From efiring at hawaii.edu Fri May 22 16:09:52 2009 From: efiring at hawaii.edu (Eric Firing) Date: Fri, 22 May 2009 10:09:52 -1000 Subject: [Numpy-discussion] Join us for "Scientific Computing with Python Webinar" In-Reply-To: <200905222204.38273.faltet@pytables.org> References: <1437076956.5204661242825355676.JavaMail.root@g2mp1br2.las.expertcity.com> <2355F1D0-DD01-4BD1-8482-FDDC6FEE6C91@enthought.com> <200905222204.38273.faltet@pytables.org> Message-ID: <4A170690.1000500@hawaii.edu> Francesc Alted wrote: > A Wednesday 20 May 2009 16:45:12 Travis Oliphant escrigu?: >> Hello all Python users: >> >> I am pleased to announce the beginning of a free Webinar series that >> discusses using Python for scientific computing. Enthought will host >> this free series which will take place once a month for 30-45 >> minutes. The schedule and length may change based on participation >> feedback, but for now it is scheduled for the fourth Friday of every >> month. This free webinar should not be confused with the EPD >> webinar on the first Friday of each month which is open only to >> subscribers to the Enthought Python Distribution. > > Mmh, I'm trying to connect, but it seems that Linux is not supported for this > sort of webinars: > > To join the Webinar, please use one of the following supported operating > systems: > ? Windows? 2000, XP Pro, XP Home, 2003 Server, Vista > ? Mac OS? X, Panther? 10.3.9, Tiger? 10.4.5 or higher > > Too bad :-/ > See comments 3, 4, and 5 in the blog: http://blog.enthought.com/?p=116 Eric From pgmdevlist at gmail.com Fri May 22 16:15:19 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Fri, 22 May 2009 16:15:19 -0400 Subject: [Numpy-discussion] List/location of consecutive integers In-Reply-To: References: Message-ID: <841DC85A-3B49-4D8A-8D73-BF2AAD18D645@gmail.com> On May 22, 2009, at 12:31 PM, Andrea Gavana wrote: > Hi All, > > this should be a very easy question but I am trying to make a > script run as fast as possible, so please bear with me if the solution > is easy and I just overlooked it. > > I have a list of integers, like this one: > > indices = [1,2,3,4,5,6,7,8,9,255,256,257,258,10001,10002,10003,10004] > >> From this list, I would like to find out which values are consecutive > and store them in another list of tuples (begin_consecutive, > end_consecutive) or a simple list: as an example, the previous list > will become: > > new_list = [(1, 9), (255, 258), (10001, 10004)] Josef's and Chris's solutions are pretty neat in this case. I've been recently working on a more generic case where integers are grouped depending on some condition (equals, differing by 1 or 2...). A version in pure Python/numpy, the `Cluster` class is available in scikits.hydroclimpy.core.tools (hydroclimpy.sourceforge.net). Otherwise, here's a Cython version of the same class. Let me know if it works. And I'm not ultra happy with the name, so if you have any suggestions... cdef class Brackets: """ Groups consecutive data from an array according to a clustering condition. A cluster is defined as a group of consecutive values differing by at most the increment value. Missing values are **not** handled: the input sequence must therefore be free of missing values. 
Parameters ---------- darray : ndarray Input data array to clusterize. increment : {float}, optional Increment between two consecutive values to group. By default, use a value of 1. operator : {function}, optional Comparison operator for the definition of clusters. By default, use :func:`numpy.less_equal`. Attributes ---------- inishape Shape of the argument array (stored for resizing). inisize Size of the argument array. uniques : sequence List of unique cluster values, as they appear in chronological order. slices : sequence List of the slices corresponding to each cluster of data. starts : ndarray Lists of the indices at which the clusters start. ends : ndarray Lists of the indices at which the clusters end. clustered : list List of clustered data. Examples -------- >>> A = [0, 0, 1, 2, 2, 2, 3, 4, 3, 4, 4, 4] >>> klust = cluster(A,0) >>> [list(_) for _ in klust.clustered] [[0, 0], [1], [2, 2, 2], [3], [4], [3], [4, 4, 4]] >>> klust.uniques array([0, 1, 2, 3, 4, 3, 4]) >>> x = [ 1.8, 1.3, 2.4, 1.2, 2.5, 3.9, 1. , 3.8, 4.2, 3.3, ... 1.2, 0.2, 0.9, 2.7, 2.4, 2.8, 2.7, 4.7, 4.2, 0.4] >>> Brackets(x,1).starts array([ 0, 2, 3, 4, 5, 6, 7, 10, 11, 13, 17, 19]) >>> Brackets(x,1.5).starts array([ 0, 6, 7, 10, 13, 17, 19]) >>> Brackets(x,2.5).starts array([ 0, 6, 7, 19]) >>> Brackets(x,2.5,greater).starts array([ 0, 1, 2, 3, 4, 5, 8, 9, 10, ... 11, 12, 13, 14, 15, 16, 17, 18]) >>> y = [ 0, -1, 0, 0, 0, 1, 1, -1, -1, -1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0] >>> Brackets(y,1).starts array([ 0, 1, 2, 5, 7, 10, 12, 16, 18]) """ cdef readonly double increment cdef readonly np.ndarray data cdef readonly list _starts cdef readonly list _ends def __init__(Brackets self, object data, double increment=1, object operator=np.less_equal): """ """ cdef int i, n, ifirst, ilast, test cdef double last cdef list starts, ends # self.increment = increment self.data = np.asanyarray(data) data = np.asarray(data) # n = len(data) starts = [] ends = [] # last = data[0] ifirst = 0 ilast = 0 for 1 <= i < n: test = operator(abs(data[i] - last), increment) ilast = i if not test: starts.append(ifirst) ends.append(ilast-1) ifirst = i last = data[i] starts.append(ifirst) ends.append(n-1) self._starts = starts self._ends = ends def __len__(self): return len(self.starts) property starts: # def __get__(Brackets self): return np.asarray(self._starts) property ends: # def __get__(Brackets self): return np.asarray(self._ends) property sizes: # def __get__(Brackets self): return np.asarray(self._ends) - np.asarray(self._firsts) property slices: # def __get__(Brackets self): cdef int i cdef list starts = self._starts, ends = self._ends cdef list slices = [] for 0 <= i < len(starts): slices.append(slice(starts[i], ends[i]+1)) return slices property clustered: # def __get__(self): cdef int i cdef list starts = self._starts, ends = self._ends cdef list groups = [] cdef np.ndarray data = self.data for 0 <= i < len(starts): groups.append(data[starts[i]:ends[i]+1]) return groups property uniques: def __get__(self): return self.data[self.starts] def grouped_slices(self): """ Returns a dictionary with the unique values of ``self`` as keys, and a list of slices for the corresponding values. 
See Also -------- Brackets.grouped_limits that does the same thing """ # Define shortcuts cdef int i, ifirst, n = len(self) cdef list starts = self._starts, ends = self._ends cdef np.ndarray data = self.data # Define new variables cdef list seen = [] cdef double value cdef dict grouped = {} for 0 <= i < n: ifirst = starts[i] value = data[ifirst] if not (value in seen): grouped[value] = [] seen.append(value) grouped[value].append(slice(ifirst, ends[i]+1)) return grouped def grouped_limits(self): """ Returns a dictionary with the unique values of ``self`` as keys, and a list of tuples (starting index, ending index) for the corresponding values. See Also -------- Cluster.grouped_slices """ # Define shortcuts cdef int i, ifirst, n = len(self) cdef list starts = self.starts, ends = self.ends cdef np.ndarray data = self.data # Define new variables cdef list seen = [] cdef double value cdef dict grouped = {} for 0 <= i < n: ifirst = starts[i] value = data[ifirst] if not (value in seen): grouped[value] = [] seen.append(value) grouped[value].append((ifirst, ends[i]+1)) for k in grouped: grouped[k] = np.asarray(grouped[k]) return grouped From josef.pktd at gmail.com Fri May 22 16:27:38 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 22 May 2009 16:27:38 -0400 Subject: [Numpy-discussion] List/location of consecutive integers In-Reply-To: <237A4B0E-E8EC-401F-A8C9-D7A24F9DC47A@cs.toronto.edu> References: <4A16DAC9.2090004@noaa.gov> <237A4B0E-E8EC-401F-A8C9-D7A24F9DC47A@cs.toronto.edu> Message-ID: <1cd32cbb0905221327m13295edk199eaa4aa3978870@mail.gmail.com> On Fri, May 22, 2009 at 3:59 PM, David Warde-Farley wrote: > On 22-May-09, at 1:03 PM, Christopher Barker wrote: > >> In [104]: zip(indices[np.r_[True, breaks[:-1]]], indices[breaks]) > > > > I don't think this is very general: > > In [53]: indices > Out[53]: > array([ ? -3, ? ? 1, ? ? 2, ? ? 3, ? ? 4, ? ? 5, ? ? 6, ? ? 7, ? ? 8, > ? ? ? ? ? ?9, ? 255, ? 256, ? 257, ? 258, 10001, 10002, 10003, 10004]) > > In [54]: breaks = diff(indices) != 1 > > In [55]: zip(indices[np.r_[True, breaks[:-1]]], indices[breaks]) > Out[55]: [(-3, -3), (1, 9), (255, 258)] > this still works: >>> indices = np.array([-5,-4,-3,1,1,2,3,4,5,6,7,8,9,255,256,257,258,10001,10002,10003,10004]) >>> idx = (np.diff(indices) != 1).nonzero()[0] >>> idxf = np.hstack((-1,idx,len(indices)-1)) >>> vmin = indices[idxf[:-1]+1] >>> vmax = indices[idxf[1:]] >>> zip(vmin,vmax) [(-5, -3), (1, 1), (1, 9), (255, 258), (10001, 10004)] Josef From Chris.Barker at noaa.gov Fri May 22 18:13:00 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Fri, 22 May 2009 15:13:00 -0700 Subject: [Numpy-discussion] List/location of consecutive integers In-Reply-To: <237A4B0E-E8EC-401F-A8C9-D7A24F9DC47A@cs.toronto.edu> References: <4A16DAC9.2090004@noaa.gov> <237A4B0E-E8EC-401F-A8C9-D7A24F9DC47A@cs.toronto.edu> Message-ID: <4A17236C.2020000@noaa.gov> David Warde-Farley wrote: > I don't think this is very general: > > In [53]: indices > Out[53]: > array([ -3, 1, 2, 3, 4, 5, 6, 7, 8, > 9, 255, 256, 257, 258, 10001, 10002, 10003, 10004]) > > In [54]: breaks = diff(indices) != 1 > > In [55]: zip(indices[np.r_[True, breaks[:-1]]], indices[breaks]) > Out[55]: [(-3, -3), (1, 9), (255, 258)] that's why I put a sys.maxint at the end of the series... 
In [13]: indices = np.array([ -3, 1, 2, 3, 4, 5, 6, 7, 8, 9, 255, 256, 257, 258, 10001, 10002, 10003, 10004, sys.maxint]) In [15]: breaks = np.diff(indices) != 1 In [16]: zip(indices[np.r_[True, breaks[:-1]]], indices[breaks]) Out[16]: [(-3, -3), (1, 9), (255, 258), (10001, 10004)] Though that's probably not very robust! -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Chris.Barker at noaa.gov Fri May 22 18:15:15 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Fri, 22 May 2009 15:15:15 -0700 Subject: [Numpy-discussion] List/location of consecutive integers In-Reply-To: <841DC85A-3B49-4D8A-8D73-BF2AAD18D645@gmail.com> References: <841DC85A-3B49-4D8A-8D73-BF2AAD18D645@gmail.com> Message-ID: <4A1723F3.7010905@noaa.gov> Pierre GM wrote: > scikits.hydroclimpy.core.tools (hydroclimpy.sourceforge.net). whoa! Why didn't I ever see that before. Here I am , witting a whole bunch of my own code to deal with time series of meteorological data.... argg! Now I need to go dig into that more. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From pgmdevlist at gmail.com Fri May 22 18:17:36 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Fri, 22 May 2009 18:17:36 -0400 Subject: [Numpy-discussion] List/location of consecutive integers In-Reply-To: <4A1723F3.7010905@noaa.gov> References: <841DC85A-3B49-4D8A-8D73-BF2AAD18D645@gmail.com> <4A1723F3.7010905@noaa.gov> Message-ID: <517D5658-B7D6-46D8-A202-C0B2B2DB6513@gmail.com> On May 22, 2009, at 6:15 PM, Christopher Barker wrote: > Pierre GM wrote: >> scikits.hydroclimpy.core.tools (hydroclimpy.sourceforge.net). > > whoa! Why didn't I ever see that before. Here I am , witting a whole > bunch of my own code to deal with time series of meteorological > data.... > argg! > > Now I need to go dig into that more. You def'ny need to send me some feedback as well. Let's chat about it offline and whip up something... I'm sure that the Powers-That-Pay-Me would be happy to know that the code I've been writing is used by the Powers-That-Pay-Them... From dwf at cs.toronto.edu Fri May 22 18:28:05 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Fri, 22 May 2009 18:28:05 -0400 Subject: [Numpy-discussion] List/location of consecutive integers In-Reply-To: <4A17236C.2020000@noaa.gov> References: <4A16DAC9.2090004@noaa.gov> <237A4B0E-E8EC-401F-A8C9-D7A24F9DC47A@cs.toronto.edu> <4A17236C.2020000@noaa.gov> Message-ID: <37C782A0-2BF8-442F-9741-003536F1FB7F@cs.toronto.edu> On 22-May-09, at 6:13 PM, Christopher Barker wrote: > that's why I put a sys.maxint at the end of the series... Oops! I foolishly assumed the sequence was unaltered. That makes a lot more sense. David From charlesr.harris at gmail.com Fri May 22 23:19:26 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 22 May 2009 21:19:26 -0600 Subject: [Numpy-discussion] Inconsistent error messages. 
Message-ID: Hi All, Currently fromfile prints a message and raises a MemoryError when more items are requested than can be read, but fromstring raises a ValueError: In [8]: fromstring("", count=10) --------------------------------------------------------------------------- ValueError Traceback (most recent call last) /home/charris/ in () ValueError: string is smaller than requested size In [9]: fromfile("empty.dat", count=10) 10 items requested but only 0 read --------------------------------------------------------------------------- MemoryError Traceback (most recent call last) /home/charris/ in () MemoryError: I think fromfile should also raise a ValueError in this case. Thoughts? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From albert.thuswaldner at gmail.com Sat May 23 05:36:23 2009 From: albert.thuswaldner at gmail.com (Albert Thuswaldner) Date: Sat, 23 May 2009 11:36:23 +0200 Subject: [Numpy-discussion] Home for pyhdf5io? In-Reply-To: References: <000e0cd2a07010d73d046a71a597@google.com> <200905221000.56593.faltet@pytables.org> Message-ID: Thank you, Pauli, for your input. I agree with you that our projects have different goals, even if they touch on the same subject. The goal of pyhdf5io is to provide a very simple interface, so that the user can save his/her data. The reason for picking HDF5 was of course also to enable the use of the data in other programs. Here I was mostly thinking of simple numerical data, arrays etc., and not so much of the complex datatypes that Python provides. So I guess in the long term I will also have to add pickling support. In the short term I will add warnings for the data types that are not supported. /Albert 2009/5/22, Pauli Virtanen : > Fri, 22 May 2009 10:00:56 +0200, Francesc Alted kirjoitti: > [clip: pyhdf5io] >> I've been having a look at your module and seems pretty cute. >> Incidentally, there is another module module that does similar things: >> >> http://www.elisanet.fi/ptvirtan/software/hdf5pickle/index.html >> >> However, I do like your package better in the sense that it adds more >> 'magic' to the load/save routines. But maybe you want to have a look at >> the above: it can give you more ideas, like for example, using CArrays >> and compression for very large arrays, or Tables for structured arrays. > > I don't think these two are really comparable. The significant difference > appears to be that pyhdf5io is a thin wrapper for File.createArray, so > when it encounters non-array objects, it will pickle them to strings, and > save the strings to the HDF5 file. > > Hdf5pickle, OTOH, implements the pickle protocol, and will unwrap non- > array objects so that all their attributes etc. are exposed in the hdf5 > file and can be read by non-Python applications. > > -- > Pauli Virtanen > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Sent from my mobile device From stefan at sun.ac.za Sat May 23 05:49:00 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Sat, 23 May 2009 11:49:00 +0200 Subject: [Numpy-discussion] Inconsistent error messages. In-Reply-To: References: Message-ID: <9457e7c80905230249k39cd9965q784b80b5afa0f7dc@mail.gmail.com> 2009/5/23 Charles R Harris : > In [9]: fromfile("empty.dat", count=10) > 10 items requested but only 0 read > --------------------------------------------------------------------------- > MemoryError
Traceback (most recent call last) > > /home/charris/ in () > > MemoryError: > > I think fromfile should also raise a ValueError in this case. Thoughts? I'm also wondering why the MemoryError has an empty message string -- that should be fixed. Instead of throwing errors in these scenarios, we could just return the elements read and raise a warning? This is consistent with most other file APIs I know and allows you to read blocks of data until the data runs out. Regards St?fan From dwf at cs.toronto.edu Sat May 23 07:47:46 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Sat, 23 May 2009 07:47:46 -0400 Subject: [Numpy-discussion] Home for pyhdf5io? In-Reply-To: References: <000e0cd2a07010d73d046a71a597@google.com> <200905221000.56593.faltet@pytables.org> Message-ID: On 23-May-09, at 5:36 AM, Albert Thuswaldner wrote: > So i guess in the long term i have to also add pickling support. In > the short term i will add warnings for the data types that are not > supported. In order to ensure optimal division of labour, I'd suggest simply basing your pickling support on hdf5pickle, and including it as an optional dependency, that you detect at runtime (just put the import in a try block and catch the ImportError). If you have hdf5pickle installed, pyhdf5io will pickle any objects you try to use save() with, etc. Otherwise it will just work the way it does now. I think that satisfies the goals of your project as being a thin wrapper that provides a simple interface, rather than reinventing the wheel by re-implementing hdf5 pickling. It also means that there aren't two, maybe-incompatible ways to pickle an object in HDF5 -- just one (even if you write your implementation to be compatible with Pauli's, there's opportunity for the codebases to diverge over time). David From dwf at cs.toronto.edu Sat May 23 08:00:27 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Sat, 23 May 2009 08:00:27 -0400 Subject: [Numpy-discussion] Ticket #1113, change title? Message-ID: <98BDD1CA-65DE-4300-98AE-54F2D9A6EDFA@cs.toronto.edu> Can someone with the requisite permissions change the title of ticket #1113 to reflect the fact that it affects both ppc and ppc64? Alternately, if you know why the bug is happening, you could file a patch ;) David From david at ar.media.kyoto-u.ac.jp Sat May 23 08:54:41 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sat, 23 May 2009 21:54:41 +0900 Subject: [Numpy-discussion] Ticket #1113, change title? In-Reply-To: <98BDD1CA-65DE-4300-98AE-54F2D9A6EDFA@cs.toronto.edu> References: <98BDD1CA-65DE-4300-98AE-54F2D9A6EDFA@cs.toronto.edu> Message-ID: <4A17F211.9030209@ar.media.kyoto-u.ac.jp> David Warde-Farley wrote: > Can someone with the requisite permissions change the title of ticket > #1113 to reflect the fact that it affects both ppc and ppc64? > Done. > Alternately, if you know why the bug is happening, you could file a > patch ;) > I have not looked at the code, but if the precision is indeed single precision, a tolerance of 1e-15 may not make much sense (single precision has 7 significant digits in normal representation) cheers, David From dwf at cs.toronto.edu Sat May 23 09:28:08 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Sat, 23 May 2009 09:28:08 -0400 Subject: [Numpy-discussion] Ticket #1113, change title? 
In-Reply-To: <4A17F211.9030209@ar.media.kyoto-u.ac.jp> References: <98BDD1CA-65DE-4300-98AE-54F2D9A6EDFA@cs.toronto.edu> <4A17F211.9030209@ar.media.kyoto-u.ac.jp> Message-ID: <39664C87-125A-40C3-8627-24812DE2F92B@cs.toronto.edu> On 23-May-09, at 8:54 AM, David Cournapeau wrote: > I have not looked at the code, but if the precision is indeed single > precision, a tolerance of 1e-15 may not make much sense (single > precision has 7 significant digits in normal representation) Yes, I was wondering about that too, though notably the tests pass on x86, and in fact the result on ppc was nowhere near 0 when I checked it. David From mark.wendell at gmail.com Sat May 23 11:14:04 2009 From: mark.wendell at gmail.com (Mark Wendell) Date: Sat, 23 May 2009 09:14:04 -0600 Subject: [Numpy-discussion] numpy.choose question Message-ID: I have a question about the numpy.choose method. I'm working with rgb image arrays (converted using PIL's 'asarray'), and would like to combine data from multiple images. The choose method seemed just the thing for what i want to do: creating new images from multiple source images and an index mask. However, it looks like the choose method chokes on anything but single-value data arrays. As soon as I give it source arrays made up of RGB int triplets (e.g., [0,128,255]), it complains with a "ValueError: too many dimensions" error. My hope was that the choose method would return an array made up of the same kind of elements that the source arrays are made up of (in this case, RGB triplets), but it seems to only like scalars. I guess it sees my image arrays at 3D arrays, and doesn't know what to do with them. Is that right? I suppose I could pre-split my images into separate R, G, and B arrays, run them through np.choose, and then recombine the results, but that seems clunky. Any advice welcome. Thanks Mark -- -- Mark Wendell From pav at iki.fi Sat May 23 14:33:37 2009 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 23 May 2009 18:33:37 +0000 (UTC) Subject: [Numpy-discussion] Ticket #1113, change title? References: <98BDD1CA-65DE-4300-98AE-54F2D9A6EDFA@cs.toronto.edu> <4A17F211.9030209@ar.media.kyoto-u.ac.jp> <39664C87-125A-40C3-8627-24812DE2F92B@cs.toronto.edu> Message-ID: Sat, 23 May 2009 09:28:08 -0400, David Warde-Farley wrote: > On 23-May-09, at 8:54 AM, David Cournapeau wrote: > >> I have not looked at the code, but if the precision is indeed single >> precision, a tolerance of 1e-15 may not make much sense (single >> precision has 7 significant digits in normal representation) Yes, it should be `max(5*eps, 1e-15)`, and not 1e-15. It just happens that on x86 the code computes the correct value down to machine precision for complex64. Actually, I'm a bit perplexed about why this doesn't upcast: >>> p = np.complex128(9.999999999333333333e-6 + 1.000000000066666666e-5j) >>> np.arctanh(np.array([1e-5 + 1e-5j], dtype=np.complex64))/p array([ 1.+0.j], dtype=complex64) > Yes, I was wondering about that too, though notably the tests pass on > x86, and in fact the result on ppc was nowhere near 0 when I checked it. What do you mean by "nowhere near"? What does the following output for you: >>> np.arctanh(np.array([1e-5 + 1e-5j], np.complex64)) array([ 9.99999975e-06 +9.99999975e-06j], dtype=complex64) -- Pauli Virtanen From dwf at cs.toronto.edu Sat May 23 15:37:19 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Sat, 23 May 2009 15:37:19 -0400 Subject: [Numpy-discussion] Ticket #1113, change title? 
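For reference, the check I would put in the test is something along these lines (a sketch; p is the high-precision reference value quoted above):

import numpy as np

z = np.array([1e-5 + 1e-5j], dtype=np.complex64)
p = 9.999999999333333333e-6 + 1.000000000066666666e-5j  # reference for arctanh(1e-5 + 1e-5j)
err = np.absolute(1 - np.arctanh(z)/p)
tol = max(5*np.finfo(np.complex64).eps, 1e-15)
assert err[0] < tol

On x86 err happens to come out around machine precision even for complex64, which is why the bare 1e-15 tolerance passes there.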
In-Reply-To: References: <98BDD1CA-65DE-4300-98AE-54F2D9A6EDFA@cs.toronto.edu> <4A17F211.9030209@ar.media.kyoto-u.ac.jp> <39664C87-125A-40C3-8627-24812DE2F92B@cs.toronto.edu> Message-ID: <249F193A-0369-4BA5-A899-B6007F5BF6C6@cs.toronto.edu> On 23-May-09, at 2:33 PM, Pauli Virtanen wrote: >> Yes, I was wondering about that too, though notably the tests pass on >> x86, and in fact the result on ppc was nowhere near 0 when I >> checked it. > > What do you mean by "nowhere near"? What does the following output for > you: > >>>> np.arctanh(np.array([1e-5 + 1e-5j], np.complex64)) > array([ 9.99999975e-06 +9.99999975e-06j], dtype=complex64) I must've been hallucinating: this is on ppc64. I remembered it being closer to 1e-1, >>> dtype = np.complex64 >>> z = np.array([1e-5*(1+1j)], dtype=dtype) >>> p = 9.999999999333333333e-6 + 1.000000000066666666e-5j >>> d = np.absolute(1-np.arctanh(z)/p) >>> d array([ 2.75662915e-09], dtype=float32) >>> np.finfo('complex64').eps 1.1920929e-07 >>> _ * 5 5.9604644775390625e-07 So I guess it is pretty small, and small enough to pass the revised test condition you gave above. David From Chris.Barker at noaa.gov Sat May 23 15:51:24 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Sat, 23 May 2009 12:51:24 -0700 Subject: [Numpy-discussion] numpy.choose question In-Reply-To: References: Message-ID: <4A1853BC.1000105@noaa.gov> Mark Wendell wrote: > I'm working with rgb image arrays (converted using PIL's 'asarray'), > and would like to combine data from multiple images. The choose method > seemed just the thing for what i want to do: creating new images from > multiple source images and an index mask. However, it looks like the > choose method chokes on anything but single-value data arrays. As soon > as I give it source arrays made up of RGB int triplets (e.g., > [0,128,255]), it complains with a "ValueError: too many dimensions" > error. you might try making a 2d array of a a datatype that fits the rgb triples: >>> rgb = np.dtype([('r', np.uint8),('g', np.uint8),('b',np.uint8)]) >>> rgb dtype([('r', '|u1'), ('g', '|u1'), ('b', '|u1')]) >>> image = np.ones((5,6,3),dtype=np.uint8) (that's a 5x3 rgb image, as PIL should have created it...) >>> image2 = image.view(dtype=rgb).reshape((5,6)) >>> image2.shape (5, 6) now it's a 2-d array, and you may be able to use choose, like you expected. -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception From Chris.Barker at noaa.gov Sat May 23 16:02:04 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Sat, 23 May 2009 13:02:04 -0700 Subject: [Numpy-discussion] Inconsistent error messages. In-Reply-To: <9457e7c80905230249k39cd9965q784b80b5afa0f7dc@mail.gmail.com> References: <9457e7c80905230249k39cd9965q784b80b5afa0f7dc@mail.gmail.com> Message-ID: <4A18563C.3010705@noaa.gov> St?fan van der Walt wrote: > 2009/5/23 Charles R Harris : >> In [9]: fromfile("empty.dat", count=10) > Instead of throwing errors in these scenarios, we could just return > the elements read and raise a warning? This is consistent with most > other file APIs I know and allows you to read blocks of data until the > data runs out. 
This was just discussed a week or two ago (look for messages by me and Robert Kern). fromfile needs some attention in general, but I think Robert and I (the only two that I know were paying attention to that discussion) agreed that there should be an API that says: "read up to n items from the file". Robert thought that should be the default, but I think that means everyone would be forced to check how many items they got every time they read, which is too much code and likely to be forgotten and lead to errors. So I think that an exception should be raised if you ask for n and get less than n, but that there should be a flag that says something like "max_items=n", that indicates that you'll take what you get. I don't like a warning -- it's unconventional to catch those like exceptions -- if you want n items, and you haven't written code to handle less than n, you should get an exception. If you have written code to handle that, then you can use the flag. -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception From albert.thuswaldner at gmail.com Sat May 23 16:25:37 2009 From: albert.thuswaldner at gmail.com (Albert Thuswaldner) Date: Sat, 23 May 2009 22:25:37 +0200 Subject: [Numpy-discussion] Home for pyhdf5io? In-Reply-To: References: <000e0cd2a07010d73d046a71a597@google.com> <200905221000.56593.faltet@pytables.org> Message-ID: Actually, my vision with pyhdf5io is to have HDF5 replace numpy's own binary file format (.npy, .npz). Pyhdf5io (or an incarnation of it) should be the standard (binary) way to store data in scipy/numpy. A bold statement, I know, but I think that it would be an improvement, especially for those users who are replacing Matlab with scipy/numpy. I don't know if this vision of mine is possible to realize, or even something that is shared by anyone else in the community. So, list, what are your thoughts? Are there some people out there who like the idea and want to cooperate in realizing it? /Albert On Sat, May 23, 2009 at 13:47, David Warde-Farley wrote: > On 23-May-09, at 5:36 AM, Albert Thuswaldner wrote: > >> So i guess in the long term i have to also add pickling support. In >> the short term i will add warnings for the data types that are not >> supported. > > In order to ensure optimal division of labour, I'd suggest simply > basing your pickling support on hdf5pickle, and including it as an > optional dependency, that you detect at runtime (just put the import > in a try block and catch the ImportError). If you have hdf5pickle > installed, pyhdf5io will pickle any objects you try to use save() > with, etc. Otherwise it will just work the way it does now. > > I think that satisfies the goals of your project as being a thin > wrapper that provides a simple interface, rather than reinventing the > wheel by re-implementing hdf5 pickling. It also means that there > aren't two, maybe-incompatible ways to pickle an object in HDF5 -- > just one (even if you write your implementation to be compatible with > Pauli's, there's opportunity for the codebases to diverge over time).
> > David > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From charlesr.harris at gmail.com Sat May 23 16:27:12 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 23 May 2009 14:27:12 -0600 Subject: [Numpy-discussion] Inconsistent error messages. In-Reply-To: <4A18563C.3010705@noaa.gov> References: <9457e7c80905230249k39cd9965q784b80b5afa0f7dc@mail.gmail.com> <4A18563C.3010705@noaa.gov> Message-ID: On Sat, May 23, 2009 at 2:02 PM, Christopher Barker wrote: > St?fan van der Walt wrote: > > 2009/5/23 Charles R Harris : > >> In [9]: fromfile("empty.dat", count=10) > > > Instead of throwing errors in these scenarios, we could just return > > the elements read and raise a warning? This is consistent with most > > other file APIs I know and allows you to read blocks of data until the > > data runs out. > > This was just discussed a week or two ago (look for messaged by me and > Robert Kern) > > fromfile needs some attention in general, but I think Robert an I (the > only two that I know were paying attention to that discussion) agreed > that there should be an API that says: > > read up to n items from the file > > Robert thought that should be the default, but I think that means > everyone would be forced to check how many items they got every time > they read, which is too much code and likely to be forgotten and lead to > errors. So I think that an exception should be raised if you ask for n > and get less than n, but that there should be a flag that says something > like "max_items=n", that indicates that you'll take what you get. > > I don't like a warning -- it's unconventional to catch those like > exceptions -- if you want n items, and you haven't written code to > handle less than n, you should get an exception. If you have written > code to handle that, that you can use the flag. > I don't like the idea of a warning here either. How about adding a keyword 'strict' so that strict=1 means an error is raised if the count isn't reached, and strict=0 means any count is acceptable? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat May 23 16:29:24 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 23 May 2009 14:29:24 -0600 Subject: [Numpy-discussion] Home for pyhdf5io? In-Reply-To: References: <000e0cd2a07010d73d046a71a597@google.com> <200905221000.56593.faltet@pytables.org> Message-ID: On Sat, May 23, 2009 at 2:25 PM, Albert Thuswaldner < albert.thuswaldner at gmail.com> wrote: > Actually my vision with pyhdf5io is to have hdf5 to replace numpy's > own binary file format (.npy, npz). Pyhdf5io (or an incarnation of it) > should be the standard (binary) way to store data in scipy/numpy. A > bold statement, I know, but I think that it would be an improvement, > especially for those users how are replacing Matlab with sicpy/numpy. > > I don't know if this vision of mine is possible to realize, or even > something that is shared by anyone else in the community. So list what > are your thoughts? Are there some people out there, who like the idea > and want to cooperate in realizing it? > I rather like the idea, but having a dependency on hdf5/pytables might be a bit much. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From dwf at cs.toronto.edu Sat May 23 16:41:40 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Sat, 23 May 2009 16:41:40 -0400 Subject: [Numpy-discussion] Home for pyhdf5io? In-Reply-To: References: <000e0cd2a07010d73d046a71a597@google.com> <200905221000.56593.faltet@pytables.org> Message-ID: On 23-May-09, at 4:25 PM, Albert Thuswaldner wrote: > Actually my vision with pyhdf5io is to have hdf5 to replace numpy's > own binary file format (.npy, npz). Pyhdf5io (or an incarnation of it) > should be the standard (binary) way to store data in scipy/numpy. A > bold statement, I know, but I think that it would be an improvement, > especially for those users how are replacing Matlab with sicpy/numpy. In that it introduces a dependency on pytables (and the hdf5 C library) I doubt it would be something the numpy core developers would be eager to adopt. The npy and npz formats (as best I can gather) exist so that there is _some_ way of persisting data to disk that ships with numpy. It's not meant necessarily as the best way, or as an interchange format, just as something that works "out of the box", the code for which is completely contained within numpy. It might be worth mentioning the limitations of numpy's built-in save(), savez() and load() in the docstrings and recommending more portable alternatives, though. David From robert.kern at gmail.com Sat May 23 16:59:00 2009 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 23 May 2009 15:59:00 -0500 Subject: [Numpy-discussion] Home for pyhdf5io? In-Reply-To: References: <000e0cd2a07010d73d046a71a597@google.com> <200905221000.56593.faltet@pytables.org> Message-ID: <3d375d730905231359g4595bf65mb54d551d9c8a1c86@mail.gmail.com> On Sat, May 23, 2009 at 06:47, David Warde-Farley wrote: > On 23-May-09, at 5:36 AM, Albert Thuswaldner wrote: > >> So i guess in the long term i have to also add pickling support. In >> the short term i will add warnings for the data types that are not >> supported. > > In order to ensure optimal division of labour, I'd suggest simply > basing your pickling support on hdf5pickle, and including it as an > optional dependency, that you detect at runtime ?(just put the import > in a try block and catch the ImportError). If you have hdf5pickle > installed, pyhdf5io will pickle any objects you try to use save() > with, etc. Otherwise it will just work the way it does now. That would cause difficulties. Now the format of your data depends on whether or not you have a package installed. That's not a very good level of control. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Sat May 23 17:02:48 2009 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 23 May 2009 16:02:48 -0500 Subject: [Numpy-discussion] Home for pyhdf5io? In-Reply-To: References: <000e0cd2a07010d73d046a71a597@google.com> <200905221000.56593.faltet@pytables.org> Message-ID: <3d375d730905231402l6799c9ffjf5131995f11b85aa@mail.gmail.com> On Sat, May 23, 2009 at 15:41, David Warde-Farley wrote: > On 23-May-09, at 4:25 PM, Albert Thuswaldner wrote: > >> Actually my vision with pyhdf5io is to have hdf5 to replace numpy's >> own binary file format (.npy, npz). Pyhdf5io (or an incarnation of it) >> should be the standard (binary) way to store data in scipy/numpy. 
A >> bold statement, I know, but I think that it would be an improvement, >> especially for those users how are replacing Matlab with sicpy/numpy. > > In that it introduces a dependency on pytables (and the hdf5 C > library) I doubt it would be something the numpy core developers would > be eager to adopt. > > The npy and npz formats (as best I can gather) exist so that there is > _some_ way of persisting data to disk that ships with numpy. It's not > meant necessarily as the best way, or as an interchange format, just > as something that works "out of the box", the code for which is > completely contained within numpy. Yes. The full set of use cases and design constraints are considered here: http://svn.scipy.org/svn/numpy/trunk/doc/neps/npy-format.txt -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Sat May 23 17:09:25 2009 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 23 May 2009 16:09:25 -0500 Subject: [Numpy-discussion] numpy.choose question In-Reply-To: References: Message-ID: <3d375d730905231409h634e4ed2ue6f498ae77f62ea4@mail.gmail.com> On Sat, May 23, 2009 at 10:14, Mark Wendell wrote: > I have a question about the numpy.choose method. > > I'm working with rgb image arrays (converted using PIL's 'asarray'), > and would like to combine data from multiple images. The choose method > seemed just the thing for what i want to do: creating new images from > multiple source images and an index mask. However, it looks like the > choose method chokes on anything but single-value data arrays. As soon > as I give it source arrays made up of RGB int triplets (e.g., > [0,128,255]), it complains with a "ValueError: too many dimensions" > error. > > My hope was that the choose method would return an array made up of > the same kind of elements that the source arrays are made up of (in > this case, RGB triplets), but it seems to only like scalars. I guess > it sees my image arrays at 3D arrays, and doesn't know what to do with > them. Is that right? Yes. They are 3D arrays. Try not to think of them as 2D arrays of RGB triplets. You can check these things by looking at the .shape attribute of the array. > I suppose I could pre-split my images into > separate R, G, and B arrays, run them through np.choose, and then > recombine the results, but that seems clunky. Make your index array the same shape as the image array. dstack([indices] * 3) should do the trick. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From dwf at cs.toronto.edu Sat May 23 17:47:16 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Sat, 23 May 2009 17:47:16 -0400 Subject: [Numpy-discussion] Home for pyhdf5io? In-Reply-To: <3d375d730905231359g4595bf65mb54d551d9c8a1c86@mail.gmail.com> References: <000e0cd2a07010d73d046a71a597@google.com> <200905221000.56593.faltet@pytables.org> <3d375d730905231359g4595bf65mb54d551d9c8a1c86@mail.gmail.com> Message-ID: On 23-May-09, at 4:59 PM, Robert Kern wrote: >> Otherwise it will just work the way it does now. > > That would cause difficulties. Now the format of your data depends on > whether or not you have a package installed. That's not a very good > level of control. Sorry, I wasn't clear. 
What I meant was, if hdf5pickle isn't detected you could just refuse to save anything that's not a numpy array. David From robert.kern at gmail.com Sat May 23 17:56:10 2009 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 23 May 2009 16:56:10 -0500 Subject: [Numpy-discussion] Home for pyhdf5io? In-Reply-To: References: <000e0cd2a07010d73d046a71a597@google.com> <200905221000.56593.faltet@pytables.org> <3d375d730905231359g4595bf65mb54d551d9c8a1c86@mail.gmail.com> Message-ID: <3d375d730905231456p25eb1a15jc4022ad0ba7e54ed@mail.gmail.com> On Sat, May 23, 2009 at 16:47, David Warde-Farley wrote: > > On 23-May-09, at 4:59 PM, Robert Kern wrote: > >>> ?Otherwise it will just work the way it does now. >> >> That would cause difficulties. Now the format of your data depends on >> whether or not you have a package installed. That's not a very good >> level of control. > > Sorry, I wasn't clear. What I meant was, if hdf5pickle isn't detected > you could just refuse to save anything that's not a numpy array. Ah, good. That makes much more sense. :-) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From stefan at sun.ac.za Sat May 23 18:56:36 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Sun, 24 May 2009 00:56:36 +0200 Subject: [Numpy-discussion] Inconsistent error messages. In-Reply-To: References: <9457e7c80905230249k39cd9965q784b80b5afa0f7dc@mail.gmail.com> <4A18563C.3010705@noaa.gov> Message-ID: <9457e7c80905231556o45eba72eu39b555f687ed2fb5@mail.gmail.com> Hi Chris and Charles 2009/5/23 Charles R Harris : >> Robert thought that should be the default, but I think that means >> everyone would be forced to check how many items they got every time >> they read, which is too much code and likely to be forgotten and lead to >> errors. So I think that an exception should be raised if you ask for n >> and get less than n, but that there should be a flag that says something >> like "max_items=n", that indicates that you'll take what you get. >> >> I don't like a warning -- it's unconventional to catch those like >> exceptions -- if you want n items, and you haven't written code to >> handle less than n, you should get an exception. If you have written >> code to handle that, that you can use the flag. > > I don't like the idea of a warning here either. How about adding a keyword > 'strict' so that strict=1 means an error is raised if the count isn't > reached, and strict=0 means any count is acceptable? The reason I much prefer a warning is that you always get data back, whether things went wrong or not. If you throw an error, then you can't get hold of the last read blocks at all. I guess a strict flag is OK, but why, if you've got a warning in place? Warnings are easy to catch (and this can be documented in fromfile's docstring): warnings.simplefilter('error', np.lib.IOWarning) In Python 2.6 you can use "catch_warnings": with warnings.catch_warnings(True) as w: np.fromfile(...) if w: print "Error trying to load file" Regards St?fan From charlesr.harris at gmail.com Sat May 23 19:43:13 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 23 May 2009 17:43:13 -0600 Subject: [Numpy-discussion] Inconsistent error messages. 
In-Reply-To: <9457e7c80905231556o45eba72eu39b555f687ed2fb5@mail.gmail.com> References: <9457e7c80905230249k39cd9965q784b80b5afa0f7dc@mail.gmail.com> <4A18563C.3010705@noaa.gov> <9457e7c80905231556o45eba72eu39b555f687ed2fb5@mail.gmail.com> Message-ID: 2009/5/23 St?fan van der Walt > Hi Chris and Charles > > 2009/5/23 Charles R Harris : > >> Robert thought that should be the default, but I think that means > >> everyone would be forced to check how many items they got every time > >> they read, which is too much code and likely to be forgotten and lead to > >> errors. So I think that an exception should be raised if you ask for n > >> and get less than n, but that there should be a flag that says something > >> like "max_items=n", that indicates that you'll take what you get. > >> > >> I don't like a warning -- it's unconventional to catch those like > >> exceptions -- if you want n items, and you haven't written code to > >> handle less than n, you should get an exception. If you have written > >> code to handle that, that you can use the flag. > > > > I don't like the idea of a warning here either. How about adding a > keyword > > 'strict' so that strict=1 means an error is raised if the count isn't > > reached, and strict=0 means any count is acceptable? > > The reason I much prefer a warning is that you always get data back, > whether things went wrong or not. If you throw an error, then you > can't get hold of the last read blocks at all. > > I guess a strict flag is OK, but why, if you've got a warning in > place? Warnings are easy to catch (and this can be documented in > fromfile's docstring): > > warnings.simplefilter('error', np.lib.IOWarning) > > In Python 2.6 you can use "catch_warnings": > > with warnings.catch_warnings(True) as w: > np.fromfile(...) > if w: print "Error trying to load file" > Warnings used to warn once and then never again. I once hacked the module to make it work right but I don't know if it has been officially fixed. Do you know if it's been fixed? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Sat May 23 19:51:48 2009 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 23 May 2009 18:51:48 -0500 Subject: [Numpy-discussion] Inconsistent error messages. In-Reply-To: References: <9457e7c80905230249k39cd9965q784b80b5afa0f7dc@mail.gmail.com> <4A18563C.3010705@noaa.gov> <9457e7c80905231556o45eba72eu39b555f687ed2fb5@mail.gmail.com> Message-ID: <3d375d730905231651j2804850bj1849cb0f2b583212@mail.gmail.com> On Sat, May 23, 2009 at 18:43, Charles R Harris wrote: > Warnings used to warn once and then never again. I once hacked the module to > make it work right but I don't know if it has been officially fixed. Do you > know if it's been fixed? Warning once per location then never again is, and always has been, the documented default behavior. Are you referring to a more particular bug, like not being able to reset this, or that the location checking failed? What did you consider to be "working" behavior? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Sat May 23 19:57:11 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 23 May 2009 17:57:11 -0600 Subject: [Numpy-discussion] Inconsistent error messages. 
In-Reply-To: <3d375d730905231651j2804850bj1849cb0f2b583212@mail.gmail.com> References: <9457e7c80905230249k39cd9965q784b80b5afa0f7dc@mail.gmail.com> <4A18563C.3010705@noaa.gov> <9457e7c80905231556o45eba72eu39b555f687ed2fb5@mail.gmail.com> <3d375d730905231651j2804850bj1849cb0f2b583212@mail.gmail.com> Message-ID: On Sat, May 23, 2009 at 5:51 PM, Robert Kern wrote: > On Sat, May 23, 2009 at 18:43, Charles R Harris > wrote: > > Warnings used to warn once and then never again. I once hacked the module > to > > make it work right but I don't know if it has been officially fixed. Do > you > > know if it's been fixed? > > Warning once per location then never again is, and always has been, > the documented default behavior. Are you referring to a more > particular bug, like not being able to reset this, or that the > location checking failed? What did you consider to be "working" > behavior? > You were supposed to be able to change the default behaviour, but it didn't used to work. I think if you are going to use a warning as a flag then it has to always be raised when a failure occurs, not just the first time. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Sat May 23 19:59:12 2009 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 23 May 2009 18:59:12 -0500 Subject: [Numpy-discussion] Inconsistent error messages. In-Reply-To: References: <9457e7c80905230249k39cd9965q784b80b5afa0f7dc@mail.gmail.com> <4A18563C.3010705@noaa.gov> <9457e7c80905231556o45eba72eu39b555f687ed2fb5@mail.gmail.com> <3d375d730905231651j2804850bj1849cb0f2b583212@mail.gmail.com> Message-ID: <3d375d730905231659r41798868wa82fb8fe6980478b@mail.gmail.com> On Sat, May 23, 2009 at 18:57, Charles R Harris wrote: > > On Sat, May 23, 2009 at 5:51 PM, Robert Kern wrote: >> >> On Sat, May 23, 2009 at 18:43, Charles R Harris >> wrote: >> > Warnings used to warn once and then never again. I once hacked the >> > module to >> > make it work right but I don't know if it has been officially fixed. Do >> > you >> > know if it's been fixed? >> >> Warning once per location then never again is, and always has been, >> the documented default behavior. Are you referring to a more >> particular bug, like not being able to reset this, or that the >> location checking failed? What did you consider to be "working" >> behavior? > > You were supposed to be able to change the default behaviour, but it didn't > used to work. I think if you are going to use a warning as a flag then it > has to always be raised when a failure occurs, not just the first time. Did you ever report this bug? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Sat May 23 20:03:14 2009 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 23 May 2009 19:03:14 -0500 Subject: [Numpy-discussion] Inconsistent error messages. In-Reply-To: References: <9457e7c80905230249k39cd9965q784b80b5afa0f7dc@mail.gmail.com> <4A18563C.3010705@noaa.gov> <9457e7c80905231556o45eba72eu39b555f687ed2fb5@mail.gmail.com> <3d375d730905231651j2804850bj1849cb0f2b583212@mail.gmail.com> Message-ID: <3d375d730905231703r3eb6c0baxbad49df2e9dd7caa@mail.gmail.com> On Sat, May 23, 2009 at 18:57, Charles R Harris wrote: > You were supposed to be able to change the default behaviour, but it didn't > used to work. 
I think if you are going to use a warning as a flag then it > has to always be raised when a failure occurs, not just the first time. A brief test suggest that in Python 2.5.4, at least, as long as you set the action to be 'always' before the warning is first issued, it works. We can do this just after the IOWarning (or whatever) gets defined. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Sat May 23 20:25:09 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 23 May 2009 18:25:09 -0600 Subject: [Numpy-discussion] Inconsistent error messages. In-Reply-To: <3d375d730905231703r3eb6c0baxbad49df2e9dd7caa@mail.gmail.com> References: <9457e7c80905230249k39cd9965q784b80b5afa0f7dc@mail.gmail.com> <4A18563C.3010705@noaa.gov> <9457e7c80905231556o45eba72eu39b555f687ed2fb5@mail.gmail.com> <3d375d730905231651j2804850bj1849cb0f2b583212@mail.gmail.com> <3d375d730905231703r3eb6c0baxbad49df2e9dd7caa@mail.gmail.com> Message-ID: On Sat, May 23, 2009 at 6:03 PM, Robert Kern wrote: > On Sat, May 23, 2009 at 18:57, Charles R Harris > wrote: > > You were supposed to be able to change the default behaviour, but it > didn't > > used to work. I think if you are going to use a warning as a flag then it > > has to always be raised when a failure occurs, not just the first time. > > A brief test suggest that in Python 2.5.4, at least, as long as you > set the action to be 'always' before the warning is first issued, it > works. We can do this just after the IOWarning (or whatever) gets > defined. > OK, that would work. Although I think a named argument might be a more transparent way to specify behaviour than setting the warnings. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From efiring at hawaii.edu Sat May 23 20:37:25 2009 From: efiring at hawaii.edu (Eric Firing) Date: Sat, 23 May 2009 14:37:25 -1000 Subject: [Numpy-discussion] Inconsistent error messages. In-Reply-To: References: <9457e7c80905230249k39cd9965q784b80b5afa0f7dc@mail.gmail.com> <4A18563C.3010705@noaa.gov> <9457e7c80905231556o45eba72eu39b555f687ed2fb5@mail.gmail.com> <3d375d730905231651j2804850bj1849cb0f2b583212@mail.gmail.com> <3d375d730905231703r3eb6c0baxbad49df2e9dd7caa@mail.gmail.com> Message-ID: <4A1896C5.2080408@hawaii.edu> Charles R Harris wrote: > > > On Sat, May 23, 2009 at 6:03 PM, Robert Kern > wrote: > > On Sat, May 23, 2009 at 18:57, Charles R Harris > > wrote: > > You were supposed to be able to change the default behaviour, but > it didn't > > used to work. I think if you are going to use a warning as a flag > then it > > has to always be raised when a failure occurs, not just the first > time. > > A brief test suggest that in Python 2.5.4, at least, as long as you > set the action to be 'always' before the warning is first issued, it > works. We can do this just after the IOWarning (or whatever) gets > defined. > > > OK, that would work. Although I think a named argument might be a more > transparent way to specify behaviour than setting the warnings. I agree; using a warning strikes me as an abuse of the warnings mechanism. Instead of a "strict" flag, which I find not particularly expressive--what is it being "strict" about?--how about a "min_count" kwarg to go with the existing "count" kwarg? 
min_count=None # default; raise ValueError instead of the present warning if fewer than count are found. min_count=0 # Accept whatever you get; no warning, no error. min_count=N # raise ValueError if fewer than N are found. This is more flexible than using a "strict" flag, and the kwarg is more descriptive. Eric > > Chuck > > > > ------------------------------------------------------------------------ > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From stefan at sun.ac.za Sat May 23 21:06:28 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Sun, 24 May 2009 03:06:28 +0200 Subject: [Numpy-discussion] Inconsistent error messages. In-Reply-To: <4A1896C5.2080408@hawaii.edu> References: <4A18563C.3010705@noaa.gov> <9457e7c80905231556o45eba72eu39b555f687ed2fb5@mail.gmail.com> <3d375d730905231651j2804850bj1849cb0f2b583212@mail.gmail.com> <3d375d730905231703r3eb6c0baxbad49df2e9dd7caa@mail.gmail.com> <4A1896C5.2080408@hawaii.edu> Message-ID: <9457e7c80905231806vb6b887aia582a3a7db8dc23a@mail.gmail.com> 2009/5/24 Eric Firing : >> OK, that would work. Although I think a named argument might be a more >> transparent way to specify behaviour than setting the warnings. > > I agree; using a warning strikes me as an abuse of the warnings > mechanism. ?Instead of a "strict" flag, which I find not particularly > expressive--what is it being "strict" about?--how about a "min_count" > kwarg to go with the existing "count" kwarg? Warnings are a great way of telling the user that a non-fatal problem cropped up. It isn't easy to send information along with an exception, so I don't think raising an error here is ever particularly useful. Maybe we should provide tools in NumPy to handle warnings more easily? Something like with no_warnings: np.fromfile('x') or with raise_warnings: np.fromfile('x') ? Regards St?fan From charlesr.harris at gmail.com Sat May 23 21:06:59 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 23 May 2009 19:06:59 -0600 Subject: [Numpy-discussion] Inconsistent error messages. In-Reply-To: <4A1896C5.2080408@hawaii.edu> References: <4A18563C.3010705@noaa.gov> <9457e7c80905231556o45eba72eu39b555f687ed2fb5@mail.gmail.com> <3d375d730905231651j2804850bj1849cb0f2b583212@mail.gmail.com> <3d375d730905231703r3eb6c0baxbad49df2e9dd7caa@mail.gmail.com> <4A1896C5.2080408@hawaii.edu> Message-ID: On Sat, May 23, 2009 at 6:37 PM, Eric Firing wrote: > Charles R Harris wrote: > > > > > > On Sat, May 23, 2009 at 6:03 PM, Robert Kern > > wrote: > > > > On Sat, May 23, 2009 at 18:57, Charles R Harris > > > > wrote: > > > You were supposed to be able to change the default behaviour, but > > it didn't > > > used to work. I think if you are going to use a warning as a flag > > then it > > > has to always be raised when a failure occurs, not just the first > > time. > > > > A brief test suggest that in Python 2.5.4, at least, as long as you > > set the action to be 'always' before the warning is first issued, it > > works. We can do this just after the IOWarning (or whatever) gets > > defined. > > > > > > OK, that would work. Although I think a named argument might be a more > > transparent way to specify behaviour than setting the warnings. > > I agree; using a warning strikes me as an abuse of the warnings > mechanism. 
Instead of a "strict" flag, which I find not particularly > expressive--what is it being "strict" about?--how about a "min_count" > kwarg to go with the existing "count" kwarg? > I didn't like the fact that it overlaps with count. Although I suppose it could be the minimum and count the maximum if we enforce min_count <= count. But that still seems a bit clumsy. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From albert.thuswaldner at gmail.com Sun May 24 04:53:50 2009 From: albert.thuswaldner at gmail.com (Albert Thuswaldner) Date: Sun, 24 May 2009 10:53:50 +0200 Subject: [Numpy-discussion] Home for pyhdf5io? In-Reply-To: <3d375d730905231456p25eb1a15jc4022ad0ba7e54ed@mail.gmail.com> References: <000e0cd2a07010d73d046a71a597@google.com> <200905221000.56593.faltet@pytables.org> <3d375d730905231359g4595bf65mb54d551d9c8a1c86@mail.gmail.com> <3d375d730905231456p25eb1a15jc4022ad0ba7e54ed@mail.gmail.com> Message-ID: Ok, I understand that my thought on making hdf5 the standard save/load format for numpy was a bit naive. If it would have been easy it would already have been done. Thanks for the insights Robert. Well anyhow, I will continue with my little module and see where it goes. I will start a new thread in the pyTables list to discuss the steps needed to be taken to add pyhdf5io to the pyTables project. Thanks to everyone who took part in this discussion. /Albert On Sat, May 23, 2009 at 23:56, Robert Kern wrote: > On Sat, May 23, 2009 at 16:47, David Warde-Farley wrote: >> >> On 23-May-09, at 4:59 PM, Robert Kern wrote: >> >>>> ?Otherwise it will just work the way it does now. >>> >>> That would cause difficulties. Now the format of your data depends on >>> whether or not you have a package installed. That's not a very good >>> level of control. >> >> Sorry, I wasn't clear. What I meant was, if hdf5pickle isn't detected >> you could just refuse to save anything that's not a numpy array. > > Ah, good. That makes much more sense. :-) > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > ?-- Umberto Eco > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From mail at stevesimmons.com Sun May 24 08:23:22 2009 From: mail at stevesimmons.com (Stephen Simmons) Date: Sun, 24 May 2009 14:23:22 +0200 Subject: [Numpy-discussion] Home for pyhdf5io? In-Reply-To: References: <000e0cd2a07010d73d046a71a597@google.com> <200905221000.56593.faltet@pytables.org> Message-ID: <4A193C3A.1080309@stevesimmons.com> David Warde-Farley wrote: > On 23-May-09, at 4:25 PM, Albert Thuswaldner wrote: >> Actually my vision with pyhdf5io is to have hdf5 to replace numpy's >> own binary file format (.npy, npz). Pyhdf5io (or an incarnation of it) >> should be the standard (binary) way to store data in scipy/numpy. A >> bold statement, I know, but I think that it would be an improvement, >> especially for those users how are replacing Matlab with sicpy/numpy. >> > In that it introduces a dependency on pytables (and the hdf5 C > library) I doubt it would be something the numpy core developers would > be eager to adopt. > > The npy and npz formats (as best I can gather) exist so that there is > _some_ way of persisting data to disk that ships with numpy. 
It's not > meant necessarily as the best way, or as an interchange format, just > as something that works "out of the box", the code for which is > completely contained within numpy. > > It might be worth mentioning the limitations of numpy's built-in > save(), savez() and load() in the docstrings and recommending more > portable alternatives, though. > > David > I tend to agree with David that PyTables is too big a dependency for inclusion in core Numpy. It does a lot more than simply loading and saving arrays. While I haven't tried Andrew Collette's h5py (http://code.google.com/p/h5py), it looks like a very 'thin' wrapper around the HDF5 C libraries. Maybe numpy's save(), savez(), load(), memmap() could be enhanced so that saving/loading files with HDF5-like file extensions used the HDF5 format, with code based on h5py and pyhdf5io. This could, I imagine, be a relatively small/simple addition to numpy, with the only external dependency being the HDF5 libraries themselves. Stephen From tpk at kraussfamily.org Sun May 24 08:32:22 2009 From: tpk at kraussfamily.org (Tom K.) Date: Sun, 24 May 2009 05:32:22 -0700 (PDT) Subject: [Numpy-discussion] matrix default to column vector? In-Reply-To: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> Message-ID: <23693116.post@talk.nabble.com> Jason Rennie-2 wrote: > > By default, it looks like a 1-dim ndarray gets converted to a row vector > by > the matrix constructor. This seems to lead to some odd behavior such as > a[1] yielding the 2nd element as an ndarray and throwing an IndexError as > a > matrix. Is it possible to set a flag to make the default be a column > vector? > I'm not aware of any option that changes the construction behavior of "matrix". I honestly don't work with matrices very often - I just stick with arrays. ".T" will do what you want of course: x = matrix(range(10)).T As for odd behavior, can you say what in particular is odd about it? For the example that you mention, let's consider it in greater depth: x=np.array(range(5)) x[1] --> 1 x=np.matrix(range(5)) x[1] --> raises IndexError "index out of bounds" So it appears that with *matrices* as opposed to arrays, indexing with just 1 index has an implicit ":" as a 2nd index. This is distinctly different than the behavior for *arrays* where indexing with just 1 index returns a new nd-array with one fewer dimensions. I think it boils down to this: a matrix always has 2 dimensions, so indexing into it (whether with one or both indices specified) returns another 2D matrix; with arrays which are ND, indexing with just a single (integer) index returns an N-1 D array. When I look at it like this, np's indexing error is just exactly what I would expect (given that it creates a row vector). If you are going to use matrix in numpy, you must realize that x[n] for matrix x is equivalent to x[n,:]. Maybe my reluctance to work with matrices stems from this kind of inconsistency. It seems like your code has to be all matrix, or all array - and if you mix them, you need to be very careful about which is which. It is just easier for me to work only with array, and when needing to do matrix stuff just call "dot". Come to think of it, why doesn't 1/x for matrix x invert the matrix like in MATLAB? How about x\1? 
(Yeah, I know I'm pushing it now :-) As for adding an option that changes to behavior of matrix creation to be "transposed" I don't like it since the code will now do different things depending on whether the option is set. So, to write "safe" code you would have to check the option each time. -- View this message in context: http://www.nabble.com/matrix-default-to-column-vector--tp23652920p23693116.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From ralf.gommers at googlemail.com Sun May 24 14:29:30 2009 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sun, 24 May 2009 14:29:30 -0400 Subject: [Numpy-discussion] documenting index_tricks Message-ID: Hi all, I'm documenting the index_tricks module and am unsure how to handle r_ and related objects. r_ is an instance of RClass, therefore ipython shows the RClass docstring for "r_?". RClass in the doc editor is also marked as "needs editing", and the page for r_ is completely empty (see http://docs.scipy.org/numpy/docs/numpy.r_/#numpy-r). It seems like only one of the two should be edited and then the docstring should be copied over. In Travis Oliphant's Guide to Numpy r_ is extensively documented, should I integrate part of that with one of the docstrings? About the code itself, is there a use case for having two RClass instances at the same time? If not maybe RClass should be a private class. Also a name like RowConcatenator (it is a subclass of AxisConcatenator) would make more sense. I'm not sure this change is worth the trouble though. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Sun May 24 15:21:59 2009 From: pav at iki.fi (Pauli Virtanen) Date: Sun, 24 May 2009 19:21:59 +0000 (UTC) Subject: [Numpy-discussion] documenting index_tricks References: Message-ID: Sun, 24 May 2009 14:29:30 -0400, Ralf Gommers wrote: > Hi all, > > I'm documenting the index_tricks module and am unsure how to handle r_ > and related objects. r_ is an instance of RClass, therefore ipython > shows the RClass docstring for "r_?". RClass in the doc editor is also > marked as "needs editing", and the page for r_ is completely empty (see > http://docs.scipy.org/numpy/docs/numpy.r_/#numpy-r). It seems like only > one of the two should be edited and then the docstring should be copied > over. In Travis Oliphant's Guide to Numpy r_ is extensively documented, > should I integrate part of that with one of the docstrings? Yes, please reuse the r_ documentation, and write it on the numpy.lib.index_tricks.r_ page. We already do special tricks to get the mgrid and ogrid docs to work as they should, and it's simple to do the same for r_ and c_. There are many of the r_ object pages in the doc wiki: this should be fixed, but I don't have time to do much this in the following couple of weeks. > About the code itself, is there a use case for having two RClass > instances at the same time? If not maybe RClass should be a private > class. Also a name like RowConcatenator (it is a subclass of > AxisConcatenator) would make more sense. I'm not sure this change is > worth the trouble though. I think the RClass is a private implementation detail, and shouldn't be documented in the reference guide. 
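For context, the kind of usage that docstring has to describe is just the behaviour of the existing np.r_ instance (two standard examples, shown purely for illustration):

>>> import numpy as np
>>> np.r_[1, 2, 3, np.array([4, 5])]    # scalars and arrays concatenated along the first axis
array([1, 2, 3, 4, 5])
>>> np.r_[0:5]                          # slice notation expands like arange
array([0, 1, 2, 3, 4])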
-- Pauli Virtanen From josef.pktd at gmail.com Sun May 24 15:22:37 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 24 May 2009 15:22:37 -0400 Subject: [Numpy-discussion] documenting index_tricks In-Reply-To: References: Message-ID: <1cd32cbb0905241222n518ea288s7bdddce1eea1e553@mail.gmail.com> On Sun, May 24, 2009 at 2:29 PM, Ralf Gommers wrote: > Hi all, > > I'm documenting the index_tricks module and am unsure how to handle r_ and > related objects. r_ is an instance of RClass, therefore ipython shows the > RClass docstring for "r_?". RClass in the doc editor is also marked as > "needs editing", and the page for r_ is completely empty (see > http://docs.scipy.org/numpy/docs/numpy.r_/#numpy-r). It seems like only one > of the two should be edited and then the docstring should be copied over. it looks like this is already the case, the htmlhelp and the sphinx generated docs http://docs.scipy.org/doc/numpy/reference/generated/numpy.r_.html#numpy.r_ show the docstring that I get with help(np.r_) for RClass on the python prompt, even though in the doc editor it is empty. So editing RClass seems to be the right thing to do. Josef >In > Travis Oliphant's Guide to Numpy r_ is extensively documented, should I > integrate part of that with one of the docstrings? > > About the code itself, is there a use case for having two RClass instances > at the same time? If not maybe RClass should be a private class. Also a name > like RowConcatenator (it is a subclass of AxisConcatenator) would make more > sense. I'm not sure this change is worth the trouble though. > > Cheers, > Ralf > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From pav at iki.fi Sun May 24 15:48:48 2009 From: pav at iki.fi (Pauli Virtanen) Date: Sun, 24 May 2009 19:48:48 +0000 (UTC) Subject: [Numpy-discussion] documenting index_tricks References: Message-ID: Sun, 24 May 2009 19:21:59 +0000, Pauli Virtanen wrote: > Sun, 24 May 2009 14:29:30 -0400, Ralf Gommers wrote: > >> Hi all, >> >> I'm documenting the index_tricks module and am unsure how to handle r_ >> and related objects. r_ is an instance of RClass, therefore ipython >> shows the RClass docstring for "r_?". RClass in the doc editor is also >> marked as "needs editing", and the page for r_ is completely empty (see >> http://docs.scipy.org/numpy/docs/numpy.r_/#numpy-r). It seems like only >> one of the two should be edited and then the docstring should be copied >> over. In Travis Oliphant's Guide to Numpy r_ is extensively documented, >> should I integrate part of that with one of the docstrings? > > Yes, please reuse the r_ documentation, and write it on the > numpy.lib.index_tricks.r_ page. We already do special tricks to get the > mgrid and ogrid docs to work as they should, and it's simple to do the > same for r_ and c_. I'm wrong: the correct place is RClass, since its only instance is r_. -- Pauli Virtanen From bryan at cole.uklinux.net Sun May 24 15:55:50 2009 From: bryan at cole.uklinux.net (Bryan Cole) Date: Sun, 24 May 2009 20:55:50 +0100 Subject: [Numpy-discussion] Generalised Ufunc list Message-ID: <1243194950.6207.5.camel@pc2.cole.uklinux.net> Which (if any) existing ufuncs support the new generalised looping system? I'm particularly interested in a "vectorised" matrix multiply. 
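To be concrete, the behaviour being asked about is what this Python-level loop does, but as a single generalised-ufunc call (just a sketch of the intended semantics, not an existing numpy function):

import numpy as np
a = np.random.rand(100, 3, 3)    # a stack of one hundred 3x3 matrices
b = np.random.rand(100, 3, 3)
c = np.empty_like(a)
for i in xrange(len(a)):         # the loop a (m,n),(n,p)->(m,p) gufunc would push into C
    c[i] = np.dot(a[i], b[i])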
BC From pav at iki.fi Sun May 24 16:02:48 2009 From: pav at iki.fi (Pauli Virtanen) Date: Sun, 24 May 2009 20:02:48 +0000 (UTC) Subject: [Numpy-discussion] Generalised Ufunc list References: <1243194950.6207.5.camel@pc2.cole.uklinux.net> Message-ID: Sun, 24 May 2009 20:55:50 +0100, Bryan Cole wrote: > Which (if any) existing ufuncs support the new generalised looping > system? I'm particularly interested in a "vectorised" matrix multiply. None so far, to my knowledge. Also, a Python API has not been decided on, although the C-side stuff is in place. -- Pauli Virtanen From charlesr.harris at gmail.com Sun May 24 16:29:42 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 24 May 2009 14:29:42 -0600 Subject: [Numpy-discussion] parsing text strings/files in fromfile, fromstring Message-ID: Hi All, I am trying to put together some rule for parsing text strings/files in fromfile, fromstring so that the two are consistent. Tickets relevant to this are #1116 and #883. The question here is the interpretation of the separators, not the parsing of the numbers themselves. Below is the current behavior of fromstring, fromfile, and python split for content of "", "1", "1 1", " " respectively. fromstring : In [5]: fromstring("", sep=" ") Out[5]: array([ 0.]) In [6]: fromstring("1", sep=" ") Out[6]: array([ 1.]) In [7]: fromstring("1 1", sep=" ") Out[7]: array([ 1., 1.]) In [8]: fromstring(" ", sep=" ") Out[8]: array([ 0.]) fromfile: In [1]: fromfile("tmp", sep=" ") Out[1]: array([], dtype=float64) In [2]: fromfile("tmp", sep=" ") Out[2]: array([ 1.]) In [3]: fromfile("tmp", sep=" ") Out[3]: array([ 1., 1.]) In [4]: fromfile("tmp", sep=" ") Out[4]: array([ 0.]) split: In [9]: "".split(" ") Out[9]: [''] In [10]: "1".split(" ") Out[10]: ['1'] In [11]: "1 1".split(" ") Out[11]: ['1', '1'] In [12]: " ".split(" ") Out[12]: ['', ''] Differences: 1) When the string/file is empty fromfile returns and empty array, split returns an empty string, and fromstring converts the empty string to a default value. Which should we use? 2) When the string/file contains only a single seperator fromfile/fromstring both return a single value, while split returns two empty strings. Which should we use? My preferences would be to return empty arrays whenever the string/file is empty, but I don't feel strongly about that. I think the single separator should definitely produce two values. Also, wouldn't a missing value be better interpreted as nan than zero in the float case? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From jh at physics.ucf.edu Sun May 24 16:33:07 2009 From: jh at physics.ucf.edu (Joe Harrington) Date: Sun, 24 May 2009 16:33:07 -0400 Subject: [Numpy-discussion] add xirr to numpy financial functions? Message-ID: I hate to ask for another function in numpy, but there's an obvious one missing in the financial group: xirr. It could be done as a new function or as an extension to the existing np.irr. The internal rate of return (np.irr) is defined as the growth rate that would give you a zero balance at the end of a period of investment given a series of cash flows into or out of the investment at regular intervals (the first and last cash flows are usually an initial deposit and a withdrawal of the current balance). 
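In code, that evenly spaced case is what the existing function already handles (an ordinary np.irr call, shown purely for illustration):

import numpy as np
# deposit 100 now, get 60 back after one period and 55 after two;
# np.irr solves -100 + 60/(1+r) + 55/(1+r)**2 == 0 for r
print np.irr([-100.0, 60.0, 55.0])    # roughly 0.1, i.e. 10% per period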
This is useful in academics, but if you're tracking a real investment, you don't just withdraw or add money on a perfectly annual basis, nor do you want a calc with thousands of days of zero entries just so you can handle the uneven intervals by evening them out. Both excel and openoffice define a "xirr" function that pairs each cash flow with a date. Would there be an objection to either a xirr or adding an optional second arg (or a keyword arg) to np.irr in numpy? Who writes the code is a different question, but that part isn't hard. --jh-- From stefan at sun.ac.za Sun May 24 16:43:46 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Sun, 24 May 2009 22:43:46 +0200 Subject: [Numpy-discussion] parsing text strings/files in fromfile, fromstring In-Reply-To: References: Message-ID: <9457e7c80905241343k14a3dd4bia438f7b6bfb2cbd2@mail.gmail.com> 2009/5/24 Charles R Harris : > fromstring : > > In [5]: fromstring("", sep=" ") > Out[5]: array([ 0.]) I would expect an empty array here. St?fan From ralf.gommers at googlemail.com Sun May 24 16:47:41 2009 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sun, 24 May 2009 16:47:41 -0400 Subject: [Numpy-discussion] documenting index_tricks In-Reply-To: References: Message-ID: On Sun, May 24, 2009 at 3:48 PM, Pauli Virtanen wrote: > Sun, 24 May 2009 19:21:59 +0000, Pauli Virtanen wrote: > > > Sun, 24 May 2009 14:29:30 -0400, Ralf Gommers wrote: > > > >> Hi all, > >> > >> I'm documenting the index_tricks module and am unsure how to handle r_ > >> and related objects. r_ is an instance of RClass, therefore ipython > >> shows the RClass docstring for "r_?". RClass in the doc editor is also > >> marked as "needs editing", and the page for r_ is completely empty (see > >> http://docs.scipy.org/numpy/docs/numpy.r_/#numpy-r). It seems like only > >> one of the two should be edited and then the docstring should be copied > >> over. In Travis Oliphant's Guide to Numpy r_ is extensively documented, > >> should I integrate part of that with one of the docstrings? > > > > Yes, please reuse the r_ documentation, and write it on the > > numpy.lib.index_tricks.r_ page. We already do special tricks to get the > > mgrid and ogrid docs to work as they should, and it's simple to do the > > same for r_ and c_. > > I'm wrong: the correct place is RClass, since its only instance is r_. > Thanks Pauli and Josef. I'll edit RClass and put a comment on the r_ page that it should not be edited and has to be removed from the doc editor at some point. Josef is right about the r_ page already showing up correctly in the sphinx generated docs. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Sun May 24 17:22:23 2009 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 24 May 2009 16:22:23 -0500 Subject: [Numpy-discussion] Home for pyhdf5io? In-Reply-To: <4A193C3A.1080309@stevesimmons.com> References: <000e0cd2a07010d73d046a71a597@google.com> <200905221000.56593.faltet@pytables.org> <4A193C3A.1080309@stevesimmons.com> Message-ID: <3d375d730905241422j29b11d6dn46212a78c7ddfd56@mail.gmail.com> On Sun, May 24, 2009 at 07:23, Stephen Simmons wrote: > While I haven't tried Andrew Collette's h5py > (http://code.google.com/p/h5py), it looks like a very 'thin' wrapper > around the HDF5 C libraries. Maybe numpy's save(), savez(), load(), > memmap() could be enhanced so that saving/loading files with HDF5-like > file extensions used the HDF5 format, with code based on h5py and > pyhdf5io. 
This could, I imagine, be a relatively small/simple addition > to numpy, with the only external dependency being the HDF5 libraries > themselves. *libhdf5* is too big, not PyTables. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From josef.pktd at gmail.com Sun May 24 18:14:42 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 24 May 2009 18:14:42 -0400 Subject: [Numpy-discussion] add xirr to numpy financial functions? In-Reply-To: References: Message-ID: <1cd32cbb0905241514i20d8191fv1f9453f97ab58ec3@mail.gmail.com> On Sun, May 24, 2009 at 4:33 PM, Joe Harrington wrote: > I hate to ask for another function in numpy, but there's an obvious > one missing in the financial group: xirr. It could be done as a new > function or as an extension to the existing np.irr. > > The internal rate of return (np.irr) is defined as the growth rate > that would give you a zero balance at the end of a period of > investment given a series of cash flows into or out of the investment > at regular intervals (the first and last cash flows are usually an > initial deposit and a withdrawal of the current balance). > > This is useful in academics, but if you're tracking a real investment, > you don't just withdraw or add money on a perfectly annual basis, nor > do you want a calc with thousands of days of zero entries just so you > can handle the uneven intervals by evening them out. Both excel and > openoffice define a "xirr" function that pairs each cash flow with a > date. Would there be an objection to either a xirr or adding an > optional second arg (or a keyword arg) to np.irr in numpy? Who writes > the code is a different question, but that part isn't hard. >

3 comments:

* open office has also the other function in an x??? version, so it might be good to add it consistently to all functions

* date type: scikits.timeseries and the gsoc for implementing a date type would be useful to have a clear date type, or would you want to base it only on python standard library

* real life accuracy: given that there are large differences in the definition of a year for financial calculations, any simple implementation would be only approximately accurate. for example in the open office help, oddlyield list the following option

Basis is chosen from a list of options and indicates how the year is to be calculated.

Basis          Calculation
0 or missing   US method (NASD), 12 months of 30 days each
1              Exact number of days in months, exact number of days in year
2              Exact number of days in month, year has 360 days
3              Exact number of days in month, year has 365 days
4              European method, 12 months of 30 days each

So, my question: what's the purpose of the financial function in numpy? Currently it provides convenient functions for (approximate) interest calculations. If they get expanded to a "serious" implementation of, for example, the main financial functions listed in the open office help (just for reference) then maybe numpy is not the right location for it. I started to do something similar in matlab, and once I tried to use real dates instead of just counting months, the accounting rules get quickly very messy. Using dates as you propose would be very convenient, but the users shouldn't be surprised that their actual payments at the end of the year don't fully match up with what numpy told them.
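To make the day-count issue concrete, here is a minimal sketch of what a date-aware xirr could look like; the name, the signature and the actual/365 convention are all illustrative assumptions, not an existing numpy (or scipy) API:

import datetime
import numpy as np

def xirr(cashflows, dates, guess=0.1):
    # Hypothetical helper, not an existing numpy function: find the rate r
    # with sum(cf / (1 + r)**t) == 0, where t is measured in years from the
    # first date using an assumed actual/365 day count.
    t = np.array([(d - dates[0]).days for d in dates]) / 365.0
    cf = np.asarray(cashflows, dtype=float)
    r = guess
    for _ in range(50):                               # plain Newton iteration
        f = np.sum(cf / (1.0 + r) ** t)
        df = np.sum(-t * cf / (1.0 + r) ** (t + 1.0))
        r, r_old = r - f / df, r
        if abs(r - r_old) < 1e-10:
            break
    return r

flows = [-1000.0, 300.0, 200.0, 600.0]
when = [datetime.date(2009, 1, 1), datetime.date(2009, 7, 15),
        datetime.date(2010, 3, 1), datetime.date(2011, 1, 1)]
print xirr(flows, when)   # result shifts if 365.0 above is swapped for a 360-day basis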
my 3cents Josef From dwf at cs.toronto.edu Sun May 24 18:31:43 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Sun, 24 May 2009 18:31:43 -0400 Subject: [Numpy-discussion] Home for pyhdf5io? In-Reply-To: <3d375d730905241422j29b11d6dn46212a78c7ddfd56@mail.gmail.com> References: <000e0cd2a07010d73d046a71a597@google.com> <200905221000.56593.faltet@pytables.org> <4A193C3A.1080309@stevesimmons.com> <3d375d730905241422j29b11d6dn46212a78c7ddfd56@mail.gmail.com> Message-ID: <4488200F-9E58-4844-88D5-08C03CDAB713@cs.toronto.edu> On 24-May-09, at 5:22 PM, Robert Kern wrote: >> While I haven't tried Andrew Collette's h5py >> (http://code.google.com/p/h5py), it looks like a very 'thin' wrapper >> around the HDF5 C libraries. Maybe numpy's save(), savez(), load(), >> memmap() could be enhanced so that saving/loading files with HDF5- >> like >> file extensions used the HDF5 format, with code based on h5py and >> pyhdf5io. This could, I imagine, be a relatively small/simple >> addition >> to numpy, with the only external dependency being the HDF5 libraries >> themselves. > > *libhdf5* is too big, not PyTables. Yup. According to sloccount, numpy is roughly ~210,000 lines of code. The hdf5 library is ~385,000 lines. Including even a small part of libhdf5 would grow the code base significantly, and requiring it as a dependency isn't a good idea since libhdf5 can be tricky to build right. As Robert's design document for the NPY format says, one option would be to implement a minimal subset of the HDF5 protocol *from scratch* (that would be required for saving NumPy arrays as top-level leaf nodes, for example). This would also sidestep any tricky licensing issues (I don't know what the HDF5 license is in particular, I know it's fairly permissive but still might not be suitable for including any of it in NumPy). David From dwf at cs.toronto.edu Sun May 24 18:45:46 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Sun, 24 May 2009 18:45:46 -0400 Subject: [Numpy-discussion] matrix default to column vector? In-Reply-To: <23693116.post@talk.nabble.com> References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> <23693116.post@talk.nabble.com> Message-ID: On 24-May-09, at 8:32 AM, Tom K. wrote: > Maybe my reluctance to work with matrices stems from this kind of > inconsistency. It seems like your code has to be all matrix, or all > array - > and if you mix them, you need to be very careful about which is which. Also, functions called on things of type matrix may not return a matrix as expected, but rather an array. Anecdotally, it seems to me that lots of people (myself included) seem to go through a phase early in their use of NumPy where they try to use matrix(), but most seem to end up switching to using 2D arrays for all the aforementioned reasons. David From pav at iki.fi Sun May 24 19:28:05 2009 From: pav at iki.fi (Pauli Virtanen) Date: Sun, 24 May 2009 23:28:05 +0000 (UTC) Subject: [Numpy-discussion] parsing text strings/files in fromfile, fromstring References: Message-ID: Sun, 24 May 2009 14:29:42 -0600, Charles R Harris wrote: > I am trying to put together some rule for parsing text strings/files in > fromfile, fromstring so that the two are consistent. Tickets relevant to > this are #1116 and > #883. The question here is > the interpretation of the separators, not the parsing of the numbers > themselves. Below is the current behavior of fromstring, fromfile, and > python split for content of "", "1", "1 1", " " respectively. 
It should return only the data that's in the file, no extra elements. The current behavior is a bug, IMHO, especially so since the default value is uninitialized IIRC. So, fromstring("", sep=" ") -> [] fromstring(" ", sep=" ") -> [] fromstring("1 ", sep=" ") -> [1] fromfile should behave identically. Another question is perhaps what to do with malformed input: whether to try best-efforts parsing, or bail out. I'd suggest bailing out when encountering bad data rather than guessing: fromstring("1,2,,3", sep=",") -> [1,2] or ValueError Currently, something horrible happens: >>> np.fromstring('1,2,,3,,,6', sep=',') array([ 1., 2., -1., 3., -1., -1., 6.]) Also, on second thoughts, the idea about raising a warning on malformed input seems more repulsive the more I think about it. Warnings are a bit nasty to catch, spam stderr if uncaught, and IMHO should not be a part of "business as usual" code paths. Having malformed input is business as usual :) In some sense, it would be simpler if `fromfile` and `fromstring` would be defined so that they read *at most* `count` entries, and return what they got by parsing the leftmost valid part. This could be implemented by fixing the current bugs and removing the fprintf that currently prints to stderr there. As an addition, a flag could be added that forces them to raise a ValueError on malformed input (eg. EOF when `count` was given, or bad separator encountered). Ideally, the exceptions flag would be True by default both for fromfile and fromstring, but I guess some legacy applications might rely on the current behavior... Also, one could envision a "default" value that would denote a batch of malformed input... *** So, I see a couple of alternatives (some already suggested): a) fromstring("1,2,x,4", sep=",") -> [1,2] fromstring("1,2,x,4", sep=",", strict=True) -> ValueError fromstring("1,2,x,4", sep=",", count=5) -> [1,2] fromstring("1,2,x,4", sep=",", count=5, strict=True) -> ValueError b) fromstring("1,2,x,4", sep=",") -> [1,2] fromstring("1,2,x,4", sep=",", strict=True) -> ValueError fromstring("1,2,x,4", sep=",", default=3) -> [1,2,3,4] fromstring("1,2,x,4", sep=",", count=5) -> [1,2] fromstring("1,2,x,4", sep=",", count=5, strict=True) -> ValueError c) fromstring("1,2,x,4", sep=",") -> [1,2] + SomeWarning fromstring("1,2,x,4", sep=",", count=5) -> [1,2] + SomeWarning d) fromstring("1,2,x,4", sep=",") -> [1,2] + SomeWarning fromstring("1,2,x,4", sep=",", default=3) -> [1,2,3,4] fromstring("1,2,x,4", sep=",", default=3, count=5) -> [1,2,3,4] + SomeWarning e) fromstring("1,2,x,4", sep=",") -> ValueError fromstring("1,2,x,4", sep=",", strict=False) -> [1,2] fromstring("1,2,x,4", sep=",", count=5) -> ValueError fromstring("1,2,x,4", sep=",", count=5, strict=False) -> [1,2] Fromfile would always behave the same way as `fromstring(file.read())`. In the above, " " in sep would equal the regexp \w+, and binary data implied by sep='' would be interpreted in the same way it would if first converted to comma-separated text. Can you think of any other alternatives? (Let's forget the names of the new keyword arguments for the present, and assume they have perfectly fitting names.) 
I'd vote for (e) if the slate was clean, but since it's not: +1 for (a) or (b) -- Pauli Virtanen From charlesr.harris at gmail.com Sun May 24 23:07:11 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 24 May 2009 21:07:11 -0600 Subject: [Numpy-discussion] parsing text strings/files in fromfile, fromstring In-Reply-To: References: Message-ID: On Sun, May 24, 2009 at 5:28 PM, Pauli Virtanen wrote: > Sun, 24 May 2009 14:29:42 -0600, Charles R Harris wrote: > > I am trying to put together some rule for parsing text strings/files in > > fromfile, fromstring so that the two are consistent. Tickets relevant to > > this are #1116 and > > #883. The question here is > > the interpretation of the separators, not the parsing of the numbers > > themselves. Below is the current behavior of fromstring, fromfile, and > > python split for content of "", "1", "1 1", " " respectively. > > It should return only the data that's in the file, no extra elements. The > current behavior is a bug, IMHO, especially so since the default value is > uninitialized IIRC. > > So, > > fromstring("", sep=" ") -> [] > fromstring(" ", sep=" ") -> [] > fromstring("1 ", sep=" ") -> [1] > > fromfile should behave identically. > > Another question is perhaps what to do with malformed input: whether > to try best-efforts parsing, or bail out. I'd suggest bailing out > when encountering bad data rather than guessing: > > fromstring("1,2,,3", sep=",") -> [1,2] or ValueError > > Currently, something horrible happens: > > >>> np.fromstring('1,2,,3,,,6', sep=',') > array([ 1., 2., -1., 3., -1., -1., 6.]) > > > Also, on second thoughts, the idea about raising a warning on malformed > input seems more repulsive the more I think about it. Warnings are a bit > nasty to catch, spam stderr if uncaught, and IMHO should not be a part of > "business as usual" code paths. Having malformed input is business as > usual :) > > In some sense, it would be simpler if `fromfile` and `fromstring` would > be defined so that they read *at most* `count` entries, and return what > they got by parsing the leftmost valid part. This could be implemented by > fixing the current bugs and removing the fprintf that currently prints to > stderr there. > > As an addition, a flag could be added that forces them to raise a > ValueError on malformed input (eg. EOF when `count` was given, or bad > separator encountered). Ideally, the exceptions flag would be True by > default both for fromfile and fromstring, but I guess some legacy > applications might rely on the current behavior... > > Also, one could envision a "default" value that would denote a batch of > malformed input... 
> > *** > > So, I see a couple of alternatives (some already suggested): > > a) fromstring("1,2,x,4", sep=",") -> [1,2] > fromstring("1,2,x,4", sep=",", strict=True) -> ValueError > fromstring("1,2,x,4", sep=",", count=5) -> [1,2] > fromstring("1,2,x,4", sep=",", count=5, strict=True) -> ValueError > > b) fromstring("1,2,x,4", sep=",") -> [1,2] > fromstring("1,2,x,4", sep=",", strict=True) -> ValueError > fromstring("1,2,x,4", sep=",", default=3) -> [1,2,3,4] > fromstring("1,2,x,4", sep=",", count=5) -> [1,2] > fromstring("1,2,x,4", sep=",", count=5, strict=True) -> ValueError > > c) fromstring("1,2,x,4", sep=",") -> [1,2] + SomeWarning > fromstring("1,2,x,4", sep=",", count=5) -> [1,2] + SomeWarning > > d) fromstring("1,2,x,4", sep=",") -> [1,2] + SomeWarning > fromstring("1,2,x,4", sep=",", default=3) -> [1,2,3,4] > fromstring("1,2,x,4", sep=",", default=3, count=5) -> [1,2,3,4] + > SomeWarning > > e) fromstring("1,2,x,4", sep=",") -> ValueError > fromstring("1,2,x,4", sep=",", strict=False) -> [1,2] > fromstring("1,2,x,4", sep=",", count=5) -> ValueError > fromstring("1,2,x,4", sep=",", count=5, strict=False) -> [1,2] > > Fromfile would always behave the same way as `fromstring(file.read())`. I think a common behavior is basic to whatever we end up with. > > In the above, " " in sep would equal the regexp \w+, and binary data > implied by sep='' would be interpreted in the same way it would if first > converted to comma-separated text. > > Can you think of any other alternatives? (Let's forget the names of > the new keyword arguments for the present, and assume they have > perfectly fitting names.) > > > I'd vote for (e) if the slate was clean, but since it's not: > > +1 for (a) or (b) > (a) and (e) are the simplest and just differ in the default, so that would be the shortest path. OTOH, (b) is the most general and the default is a nice idea. Hmm... Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From faltet at pytables.org Mon May 25 02:57:47 2009 From: faltet at pytables.org (Francesc Alted) Date: Mon, 25 May 2009 08:57:47 +0200 Subject: [Numpy-discussion] Home for pyhdf5io? In-Reply-To: <4488200F-9E58-4844-88D5-08C03CDAB713@cs.toronto.edu> References: <000e0cd2a07010d73d046a71a597@google.com> <3d375d730905241422j29b11d6dn46212a78c7ddfd56@mail.gmail.com> <4488200F-9E58-4844-88D5-08C03CDAB713@cs.toronto.edu> Message-ID: <200905250857.47279.faltet@pytables.org> A Monday 25 May 2009 00:31:43 David Warde-Farley escrigu?: > As Robert's design document for the NPY format says, one option would > be to implement a minimal subset of the HDF5 protocol *from scratch* > (that would be required for saving NumPy arrays as top-level leaf > nodes, for example). This would also sidestep any tricky licensing > issues (I don't know what the HDF5 license is in particular, I know > it's fairly permissive but still might not be suitable for including > any of it in NumPy). The license for HDF5 is BSD-based and apparently permissive enough, as can be seen in: http://www.hdfgroup.org/HDF5/doc/Copyright.html The problem is to select such a desired minimal protocol subset. In addition, this implementation may require quite a bit of work (but I've never had an in- deep look at the guts of the HDF5 library, so I may be wrong). 
Cheers, -- Francesc Alted From andrea.gavana at gmail.com Mon May 25 04:34:11 2009 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Mon, 25 May 2009 09:34:11 +0100 Subject: [Numpy-discussion] List/location of consecutive integers In-Reply-To: <37C782A0-2BF8-442F-9741-003536F1FB7F@cs.toronto.edu> References: <4A16DAC9.2090004@noaa.gov> <237A4B0E-E8EC-401F-A8C9-D7A24F9DC47A@cs.toronto.edu> <4A17236C.2020000@noaa.gov> <37C782A0-2BF8-442F-9741-003536F1FB7F@cs.toronto.edu> Message-ID: Hi All, On Fri, May 22, 2009 at 11:28 PM, David Warde-Farley wrote: > On 22-May-09, at 6:13 PM, Christopher Barker wrote: > >> that's why I put a sys.maxint at the end of the series... > > Oops! I foolishly assumed the sequence was unaltered. That makes a lot > more sense. Thank you guys for your help, I'll implement your solutions and will report back about the timings :-D Thank you again. Andrea. "Imagination Is The Only Weapon In The War Against Reality." http://xoomer.alice.it/infinity77/ http://thedoomedcity.blogspot.com/ From afriedle at indiana.edu Mon May 25 06:59:31 2009 From: afriedle at indiana.edu (Andrew Friedley) Date: Mon, 25 May 2009 06:59:31 -0400 Subject: [Numpy-discussion] numpy ufuncs and COREPY - any info? In-Reply-To: <200905221433.18551.faltet@pytables.org> References: <4A16920E.8060805@indiana.edu> <200905221433.18551.faltet@pytables.org> Message-ID: <4A1A7A13.6080405@indiana.edu> For some reason the list seems to occasionally drop my messages... Francesc Alted wrote: > A Friday 22 May 2009 13:52:46 Andrew Friedley escrigu?: >> I'm the student doing the project. I have a blog here, which contains >> some initial performance numbers for a couple test ufuncs I did: >> >> http://numcorepy.blogspot.com >> Another alternative we've talked about, and I (more and more likely) may >> look into is composing multiple operations together into a single ufunc. >> Again the main idea being that memory accesses can be reduced/eliminated. > > IMHO, composing multiple operations together is the most promising venue for > leveraging current multicore systems. Agreed -- our concern when considering for the project was to keep the scope reasonable so I can complete it in the GSoC timeframe. If I have time I'll definitely be looking into this over the summer; if not later. > Another interesting approach is to implement costly operations (from the point > of view of CPU resources), namely, transcendental functions like sin, cos or > tan, but also others like sqrt or pow) in a parallel way. If besides, you can > combine this with vectorized versions of them (by using the well spread SSE2 > instruction set, see [1] for an example), then you would be able to achieve > really good results for sure (at least Intel did with its VML library ;) > > [1] http://gruntthepeon.free.fr/ssemath/ I've seen that page before. Using another source [1] I came up with a quick/dirty cos ufunc. Performance is crazy good compared to NumPy (100x); see the latest post on my blog for a little more info. I'll look at the source myself when I get time again, but is NumPy using a Python-based cos function, a C implementation, or something else? As I wrote in my blog, the performance gain is almost too good to believe. [1] http://www.devmaster.net/forums/showthread.php?t=5784 Andrew From charlesr.harris at gmail.com Mon May 25 10:51:26 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 25 May 2009 08:51:26 -0600 Subject: [Numpy-discussion] numpy ufuncs and COREPY - any info? 
In-Reply-To: <4A1A7A13.6080405@indiana.edu> References: <4A16920E.8060805@indiana.edu> <200905221433.18551.faltet@pytables.org> <4A1A7A13.6080405@indiana.edu> Message-ID: On Mon, May 25, 2009 at 4:59 AM, Andrew Friedley wrote: > For some reason the list seems to occasionally drop my messages... > > Francesc Alted wrote: > > A Friday 22 May 2009 13:52:46 Andrew Friedley escrigu?: > >> I'm the student doing the project. I have a blog here, which contains > >> some initial performance numbers for a couple test ufuncs I did: > >> > >> http://numcorepy.blogspot.com > > >> Another alternative we've talked about, and I (more and more likely) may > >> look into is composing multiple operations together into a single ufunc. > >> Again the main idea being that memory accesses can be > reduced/eliminated. > > > > IMHO, composing multiple operations together is the most promising venue > for > > leveraging current multicore systems. > > Agreed -- our concern when considering for the project was to keep the > scope reasonable so I can complete it in the GSoC timeframe. If I have > time I'll definitely be looking into this over the summer; if not later. > > > Another interesting approach is to implement costly operations (from the > point > > of view of CPU resources), namely, transcendental functions like sin, cos > or > > tan, but also others like sqrt or pow) in a parallel way. If besides, > you can > > combine this with vectorized versions of them (by using the well spread > SSE2 > > instruction set, see [1] for an example), then you would be able to > achieve > > really good results for sure (at least Intel did with its VML library ;) > > > > [1] http://gruntthepeon.free.fr/ssemath/ > > I've seen that page before. Using another source [1] I came up with a > quick/dirty cos ufunc. Performance is crazy good compared to NumPy > (100x); see the latest post on my blog for a little more info. I'll > look at the source myself when I get time again, but is NumPy using a > Python-based cos function, a C implementation, or something else? As I > wrote in my blog, the performance gain is almost too good to believe. > Numpy uses the C library version. If long double and float aren't available the double version is used with number conversions, but that shouldn't give a factor of 100x. Something else is going on. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From faltet at pytables.org Mon May 25 11:10:05 2009 From: faltet at pytables.org (Francesc Alted) Date: Mon, 25 May 2009 17:10:05 +0200 Subject: [Numpy-discussion] numpy ufuncs and COREPY - any info? In-Reply-To: <4A1A7A13.6080405@indiana.edu> References: <200905221433.18551.faltet@pytables.org> <4A1A7A13.6080405@indiana.edu> Message-ID: <200905251710.05135.faltet@pytables.org> A Monday 25 May 2009 12:59:31 Andrew Friedley escrigu?: > For some reason the list seems to occasionally drop my messages... > > Francesc Alted wrote: > > A Friday 22 May 2009 13:52:46 Andrew Friedley escrigu?: > >> I'm the student doing the project. I have a blog here, which contains > >> some initial performance numbers for a couple test ufuncs I did: > >> > >> http://numcorepy.blogspot.com > >> > >> Another alternative we've talked about, and I (more and more likely) may > >> look into is composing multiple operations together into a single ufunc. > >> Again the main idea being that memory accesses can be > >> reduced/eliminated. 
> > > > IMHO, composing multiple operations together is the most promising venue > > for leveraging current multicore systems. > > Agreed -- our concern when considering for the project was to keep the > scope reasonable so I can complete it in the GSoC timeframe. If I have > time I'll definitely be looking into this over the summer; if not later. You should know that Numexpr has already started this path for some time now. The fact that it already can evaluate complex array expressions like 'a+b*cos(c)' without using temporaries (like NumPy does) should allow it to use multiple cores without stressing the memory bus too much. I'm planning to implement such parallelism in Numexpr for some time now, but not there yet. > I've seen that page before. Using another source [1] I came up with a > quick/dirty cos ufunc. Performance is crazy good compared to NumPy > (100x); see the latest post on my blog for a little more info. I'll > look at the source myself when I get time again, but is NumPy using a > Python-based cos function, a C implementation, or something else? As I > wrote in my blog, the performance gain is almost too good to believe. > > [1] http://www.devmaster.net/forums/showthread.php?t=5784 100x? Uh, sounds really impressing... -- Francesc Alted From jh at physics.ucf.edu Mon May 25 11:50:16 2009 From: jh at physics.ucf.edu (Joe Harrington) Date: Mon, 25 May 2009 11:50:16 -0400 Subject: [Numpy-discussion] add xirr to numpy financial functions? In-Reply-To: (numpy-discussion-request@scipy.org) References: Message-ID: On Sun, 24 May 2009 18:14:42 -0400 josef.pktd at gmail.com wrote: > On Sun, May 24, 2009 at 4:33 PM, Joe Harrington wrote: > > I hate to ask for another function in numpy, but there's an obvious > > one missing in the financial group: xirr. ?It could be done as a new > > function or as an extension to the existing np.irr. > > > > The internal rate of return (np.irr) is defined as the growth rate > > that would give you a zero balance at the end of a period of > > investment given a series of cash flows into or out of the investment > > at regular intervals (the first and last cash flows are usually an > > initial deposit and a withdrawal of the current balance). > > > > This is useful in academics, but if you're tracking a real investment, > > you don't just withdraw or add money on a perfectly annual basis, nor > > do you want a calc with thousands of days of zero entries just so you > > can handle the uneven intervals by evening them out. ?Both excel and > > openoffice define a "xirr" function that pairs each cash flow with a > > date. ?Would there be an objection to either a xirr or adding an > > optional second arg (or a keyword arg) to np.irr in numpy? ?Who writes > > the code is a different question, but that part isn't hard. > > > > > > 3 comments: > > * open office has also the other function in an x??? version, so it > might be good to add it consistently to all functions > > * date type: scikits.timeseries and the gsoc for implementing a date > type would be useful to have a clear date type, or would you want to > base it only on python standard library > > * real life accuracy: given that there are large differences in the > definition of a year for financial calculations, any simple > implementation would be only approximately accurate. for example in > the open office help, oddlyield list the following option > > Basis is chosen from a list of options and indicates how the year is > to be calculated. 
> Basis Calculation > 0 or missing US method (NASD), 12 months of 30 days each > 1 Exact number of days in months, exact number of days in year > 2 Exact number of days in month, year has 360 days > 3 Exact number of days in month, year has 365 days > 4 European method, 12 months of 30 days each > > So, my question: what's the purpose of the financial function in numpy? > Currently it provides convenient functions for (approximate) interest > calculations. > If they get expanded to a "serious" implementation of, for example, > the main financial functions listed in the open office help (just for > reference) then maybe numpy is not the right location for it. > > I started to do something similar in matlab, and once I tried to use > real dates instead of just counting months, the accounting rules get > quickly very messy. > > Using dates as you propose would be very convenient, but the users > shouldn't be surprised that their actual payments at the end of the > year don't fully match up with what numpy told them. > > my 3cents > > Josef First point: agreed. I wish this community had a design review process for numpy and scipy, so that these things could get properly hashed out, and not just one person (even Travis) suggesting something and everyone else saying yeah-sure-whatever. Does anyone on the list have the financial background to suggest what functions "should" be included in a basic set of financial routines? xirr is the only one I've ever used in a spreadsheet, myself. Other points: Yuk. You're right. When these first came up for discussion, I had a Han Solo moment ("I've got a baaad feeling about this...") but I couldn't put my finger on why. They seemed like simple and limited functions with high utility. Certainly anything as open-ended as financial-industry rules should go elsewhere (scikits, scipy, monpy, whatever). But, that doesn't prevent a user-supplied, floating-point time array from going into a function in numpy. The rate of return would be in units of that array. Functions that convert date/time in some format (or many) and following some rule (or one of many) to such a floating array can still go elsewhere, maintained by people who know the definitions, if they have interest (pun intended). That would make the functions in numpy much more useful without bloating them or making them a maintenance nightmare. --jh-- From jsseabold at gmail.com Mon May 25 12:29:55 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Mon, 25 May 2009 12:29:55 -0400 Subject: [Numpy-discussion] add xirr to numpy financial functions? In-Reply-To: References: Message-ID: On Mon, May 25, 2009 at 11:50 AM, Joe Harrington wrote: > On Sun, 24 May 2009 18:14:42 -0400 josef.pktd at gmail.com wrote: >> On Sun, May 24, 2009 at 4:33 PM, Joe Harrington wrote: >> > I hate to ask for another function in numpy, but there's an obvious >> > one missing in the financial group: xirr. ?It could be done as a new >> > function or as an extension to the existing np.irr. >> > >> > The internal rate of return (np.irr) is defined as the growth rate >> > that would give you a zero balance at the end of a period of >> > investment given a series of cash flows into or out of the investment >> > at regular intervals (the first and last cash flows are usually an >> > initial deposit and a withdrawal of the current balance). 
>> > >> > This is useful in academics, but if you're tracking a real investment, >> > you don't just withdraw or add money on a perfectly annual basis, nor >> > do you want a calc with thousands of days of zero entries just so you >> > can handle the uneven intervals by evening them out. ?Both excel and >> > openoffice define a "xirr" function that pairs each cash flow with a >> > date. ?Would there be an objection to either a xirr or adding an >> > optional second arg (or a keyword arg) to np.irr in numpy? ?Who writes >> > the code is a different question, but that part isn't hard. >> > >> >> >> >> 3 comments: >> >> * open office has also the other function in an x??? version, so it >> might be good to add it consistently to all functions >> >> * date type: scikits.timeseries and the gsoc for implementing a date >> type would be useful to have a clear date type, or would you want to >> base it only on python standard library >> >> * real life accuracy: given that there are large differences in the >> definition of a year for financial calculations, any simple >> implementation would be only approximately accurate. for example in >> the open office help, oddlyield list the following option >> >> Basis is chosen from a list of options and indicates how the year is >> to be calculated. >> Basis Calculation >> 0 or missing US method (NASD), 12 months of 30 days each >> 1 Exact number of days in months, exact number of days in year >> 2 Exact number of days in month, year has 360 days >> 3 Exact number of days in month, year has 365 days >> 4 European method, 12 months of 30 days each >> >> So, my question: what's the purpose of the financial function in numpy? >> Currently it provides convenient functions for (approximate) interest >> calculations. >> If they get expanded to a "serious" implementation of, for example, >> the main financial functions listed in the open office help (just for >> reference) then maybe numpy is not the right location for it. >> >> I started to do something similar in matlab, and once I tried to use >> real dates instead of just counting months, the accounting rules get >> quickly very messy. >> >> Using dates as you propose would be very convenient, but the users >> shouldn't be surprised that their actual payments at the end of the >> year don't fully match up with what numpy told them. >> >> my 3cents >> >> Josef > > First point: agreed. ?I wish this community had a design review > process for numpy and scipy, so that these things could get properly > hashed out, and not just one person (even Travis) suggesting something > and everyone else saying yeah-sure-whatever. > > Does anyone on the list have the financial background to suggest what > functions "should" be included in a basic set of financial routines? > xirr is the only one I've ever used in a spreadsheet, myself. > Again, I think it depends on what exactly you want to do. While I've certainly never worked in a quant shop, I am familiar with some of the academic/CFA-type usages. On my todo list for the summer is to provide some Cookbook examples for some options pricing and yield curve models (some of which will be based on past work), so I might be in a somewhat better position to answer this later. There is always quantlib over here , which is certainly a good place to look for what *could* be included... but this is of course much too field-specific to go into Numpy or Scipy. > Other points: Yuk. ?You're right. 
> > When these first came up for discussion, I had a Han Solo moment > ("I've got a baaad feeling about this...") but I couldn't put my > finger on why. ?They seemed like simple and limited functions with > high utility. ?Certainly anything as open-ended as financial-industry > rules should go elsewhere (scikits, scipy, monpy, whatever). > But remember. Han shot first ;) > But, that doesn't prevent a user-supplied, floating-point time array > from going into a function in numpy. ?The rate of return would be in > units of that array. ?Functions that convert date/time in some format > (or many) and following some rule (or one of many) to such a floating > array can still go elsewhere, maintained by people who know the > definitions, if they have interest (pun intended). ?That would make > the functions in numpy much more useful without bloating them or > making them a maintenance nightmare. > This seems like a good direction, if I understand you correctly. That way the user could just supply a list of trading days or whatever for the instrument they're interested in. Anything else could be maintained elsewhere, and I think this would be an interesting project personally. Cheers, Skipper From efiring at hawaii.edu Mon May 25 13:19:50 2009 From: efiring at hawaii.edu (Eric Firing) Date: Mon, 25 May 2009 07:19:50 -1000 Subject: [Numpy-discussion] Home for pyhdf5io? In-Reply-To: <200905250857.47279.faltet@pytables.org> References: <000e0cd2a07010d73d046a71a597@google.com> <3d375d730905241422j29b11d6dn46212a78c7ddfd56@mail.gmail.com> <4488200F-9E58-4844-88D5-08C03CDAB713@cs.toronto.edu> <200905250857.47279.faltet@pytables.org> Message-ID: <4A1AD336.6040705@hawaii.edu> Francesc Alted wrote: > A Monday 25 May 2009 00:31:43 David Warde-Farley escrigu?: >> As Robert's design document for the NPY format says, one option would >> be to implement a minimal subset of the HDF5 protocol *from scratch* >> (that would be required for saving NumPy arrays as top-level leaf >> nodes, for example). This would also sidestep any tricky licensing >> issues (I don't know what the HDF5 license is in particular, I know >> it's fairly permissive but still might not be suitable for including >> any of it in NumPy). > > The license for HDF5 is BSD-based and apparently permissive enough, as can be > seen in: > > http://www.hdfgroup.org/HDF5/doc/Copyright.html > > The problem is to select such a desired minimal protocol subset. In addition, > this implementation may require quite a bit of work (but I've never had an in- > deep look at the guts of the HDF5 library, so I may be wrong). > > Cheers, > If the aim is to come up with a method of saving numpy arrays that uses a standard protocol and does not introduce large dependencies, then could this be accomplished using netcdf instead of hdf5, specifically Roberto De Almeida's pupynere, which is already in scipy.io as netcdf.py? Or does hdf5 have essential characteristics for this purpose that netcdf lacks? Eric From albert.thuswaldner at gmail.com Mon May 25 13:39:52 2009 From: albert.thuswaldner at gmail.com (Albert Thuswaldner) Date: Mon, 25 May 2009 19:39:52 +0200 Subject: [Numpy-discussion] Home for pyhdf5io? 
In-Reply-To: <4A1AD336.6040705@hawaii.edu> References: <000e0cd2a07010d73d046a71a597@google.com> <3d375d730905241422j29b11d6dn46212a78c7ddfd56@mail.gmail.com> <4488200F-9E58-4844-88D5-08C03CDAB713@cs.toronto.edu> <200905250857.47279.faltet@pytables.org> <4A1AD336.6040705@hawaii.edu> Message-ID: >From what I understand, netCFD is based on on HDF5, at least as of the version 4 release. On Mon, May 25, 2009 at 19:19, Eric Firing wrote: > Francesc Alted wrote: >> A Monday 25 May 2009 00:31:43 David Warde-Farley escrigu?: >>> As Robert's design document for the NPY format says, one option would >>> be to implement a minimal subset of the HDF5 protocol *from scratch* >>> (that would be required for saving NumPy arrays as top-level leaf >>> nodes, for example). This would also sidestep any tricky licensing >>> issues (I don't know what the HDF5 license is in particular, I know >>> it's fairly permissive but still might not be suitable for including >>> any of it in NumPy). >> >> The license for HDF5 is BSD-based and apparently permissive enough, as can be >> seen in: >> >> http://www.hdfgroup.org/HDF5/doc/Copyright.html >> >> The problem is to select such a desired minimal protocol subset. In addition, >> this implementation may require quite a bit of work (but I've never had an in- >> deep look at the guts of the HDF5 library, so I may be wrong). >> >> Cheers, >> > > If the aim is to come up with a method of saving numpy arrays that uses > a standard protocol and does not introduce large dependencies, then > could this be accomplished using netcdf instead of hdf5, specifically > Roberto De Almeida's pupynere, which is already in scipy.io as > netcdf.py? ?Or does hdf5 have essential characteristics for this purpose > that netcdf lacks? > > Eric > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From josef.pktd at gmail.com Mon May 25 13:51:38 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 25 May 2009 13:51:38 -0400 Subject: [Numpy-discussion] add xirr to numpy financial functions? In-Reply-To: References: Message-ID: <1cd32cbb0905251051r7c2669ceqafaaa04373d242f5@mail.gmail.com> On Mon, May 25, 2009 at 11:50 AM, Joe Harrington wrote: > On Sun, 24 May 2009 18:14:42 -0400 josef.pktd at gmail.com wrote: >> On Sun, May 24, 2009 at 4:33 PM, Joe Harrington wrote: >> > I hate to ask for another function in numpy, but there's an obvious >> > one missing in the financial group: xirr. ?It could be done as a new >> > function or as an extension to the existing np.irr. >> > >> > The internal rate of return (np.irr) is defined as the growth rate >> > that would give you a zero balance at the end of a period of >> > investment given a series of cash flows into or out of the investment >> > at regular intervals (the first and last cash flows are usually an >> > initial deposit and a withdrawal of the current balance). >> > >> > This is useful in academics, but if you're tracking a real investment, >> > you don't just withdraw or add money on a perfectly annual basis, nor >> > do you want a calc with thousands of days of zero entries just so you >> > can handle the uneven intervals by evening them out. ?Both excel and >> > openoffice define a "xirr" function that pairs each cash flow with a >> > date. ?Would there be an objection to either a xirr or adding an >> > optional second arg (or a keyword arg) to np.irr in numpy? 
?Who writes >> > the code is a different question, but that part isn't hard. >> > >> >> >> >> 3 comments: >> >> * open office has also the other function in an x??? version, so it >> might be good to add it consistently to all functions >> >> * date type: scikits.timeseries and the gsoc for implementing a date >> type would be useful to have a clear date type, or would you want to >> base it only on python standard library >> >> * real life accuracy: given that there are large differences in the >> definition of a year for financial calculations, any simple >> implementation would be only approximately accurate. for example in >> the open office help, oddlyield list the following option >> >> Basis is chosen from a list of options and indicates how the year is >> to be calculated. >> Basis Calculation >> 0 or missing US method (NASD), 12 months of 30 days each >> 1 Exact number of days in months, exact number of days in year >> 2 Exact number of days in month, year has 360 days >> 3 Exact number of days in month, year has 365 days >> 4 European method, 12 months of 30 days each >> >> So, my question: what's the purpose of the financial function in numpy? >> Currently it provides convenient functions for (approximate) interest >> calculations. >> If they get expanded to a "serious" implementation of, for example, >> the main financial functions listed in the open office help (just for >> reference) then maybe numpy is not the right location for it. >> >> I started to do something similar in matlab, and once I tried to use >> real dates instead of just counting months, the accounting rules get >> quickly very messy. >> >> Using dates as you propose would be very convenient, but the users >> shouldn't be surprised that their actual payments at the end of the >> year don't fully match up with what numpy told them. >> >> my 3cents >> >> Josef > > First point: agreed. ?I wish this community had a design review > process for numpy and scipy, so that these things could get properly > hashed out, and not just one person (even Travis) suggesting something > and everyone else saying yeah-sure-whatever. > > Does anyone on the list have the financial background to suggest what > functions "should" be included in a basic set of financial routines? > xirr is the only one I've ever used in a spreadsheet, myself. > > Other points: Yuk. ?You're right. > > When these first came up for discussion, I had a Han Solo moment > ("I've got a baaad feeling about this...") but I couldn't put my > finger on why. ?They seemed like simple and limited functions with > high utility. ?Certainly anything as open-ended as financial-industry > rules should go elsewhere (scikits, scipy, monpy, whatever). > > But, that doesn't prevent a user-supplied, floating-point time array > from going into a function in numpy. ?The rate of return would be in > units of that array. ?Functions that convert date/time in some format > (or many) and following some rule (or one of many) to such a floating > array can still go elsewhere, maintained by people who know the > definitions, if they have interest (pun intended). ?That would make > the functions in numpy much more useful without bloating them or > making them a maintenance nightmare. > If you think of time just as a regularly spaced, e.g. days, but with sparse points on it, or as a continuous variable, then extending the current functions should be relatively easy. 
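As an illustration of that, a bare-bones version taking a floating-point time array could look something like the sketch below (the names npv_t and irr_t are made up, it uses the same (1+rate)**t discounting that np.irr solves for, and it ignores every day-count convention):

import numpy as np
from scipy.optimize import newton

def npv_t(rate, values, times):
    # net present value of cash flows 'values' occurring at (float) 'times'
    values = np.asarray(values, dtype=float)
    times = np.asarray(times, dtype=float)
    return np.sum(values / (1.0 + rate)**times)

def irr_t(values, times, guess=0.1):
    # rate per time unit that makes the net present value zero
    return newton(lambda r: npv_t(r, values, times), guess)

print irr_t([-100., 39., 59., 55., 20.], [0., 1., 2., 3., 4.])   # ~0.281, same root as np.irr

With evenly spaced integer times this should reproduce np.irr, and with irregular times it is essentially xirr in whatever unit the times are given in.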
I guess the only questions are compounding, annual, quarterly or at each payment, and whether the annual rate is calculated as real compounded annualized rate or as accounting annual rate, e.g. quarterlyrate*4. This leaves "What is the present value, if you get 100 Dollars at the 10th day of each month (or at the next working day if the 10th day is a holiday or a weekend) for the next 5 years and the monthly interest rate is 5/12%?" for another day. Initially I understood you wanted the date as a string or date type as in e.g open office. What would be the units of the user-supplied, floating-point time array? It is still necessary to know the time units to provide an annualized rate, unless the rate is in continuous time, exp(r*t). I don't know whether this would apply to all functions in numpy.finance, it's a while since I looked at the code. Maybe there are some standard simplifications in open office or excel. I briefly skimmed the list of function in the open office help, and it would be useful to have them available, e.g. as a package in scipy. But my google searches in the past for applications in finance with a compatible license didn't provide much useful code that could form the basis of a finance package. Adding more convenience and functionality to numpy.finance is useful, but if they get extended with slow feature creep, then another location (scipy) might be more appropriate and would be more expandable, even if it happens only slowly. That's just my opinion (obviously), I'm a relative newbie to numpy/scipy and still working my way through all the different subpackages. Josef From efiring at hawaii.edu Mon May 25 13:55:28 2009 From: efiring at hawaii.edu (Eric Firing) Date: Mon, 25 May 2009 07:55:28 -1000 Subject: [Numpy-discussion] Home for pyhdf5io? In-Reply-To: References: <000e0cd2a07010d73d046a71a597@google.com> <3d375d730905241422j29b11d6dn46212a78c7ddfd56@mail.gmail.com> <4488200F-9E58-4844-88D5-08C03CDAB713@cs.toronto.edu> <200905250857.47279.faltet@pytables.org> <4A1AD336.6040705@hawaii.edu> Message-ID: <4A1ADB90.5020509@hawaii.edu> Albert Thuswaldner wrote: >>From what I understand, netCFD is based on on HDF5, at least as of the > version 4 release. Netcdf4 is indeed built on hdf5, but netcdf3 is not, and netcdf3 format is likely to stick around for a *very* long time. The netcdf4 library is backwards-compatible with netcdf3. Eric > > On Mon, May 25, 2009 at 19:19, Eric Firing wrote: >> Francesc Alted wrote: >>> A Monday 25 May 2009 00:31:43 David Warde-Farley escrigu?: >>>> As Robert's design document for the NPY format says, one option would >>>> be to implement a minimal subset of the HDF5 protocol *from scratch* >>>> (that would be required for saving NumPy arrays as top-level leaf >>>> nodes, for example). This would also sidestep any tricky licensing >>>> issues (I don't know what the HDF5 license is in particular, I know >>>> it's fairly permissive but still might not be suitable for including >>>> any of it in NumPy). >>> The license for HDF5 is BSD-based and apparently permissive enough, as can be >>> seen in: >>> >>> http://www.hdfgroup.org/HDF5/doc/Copyright.html >>> >>> The problem is to select such a desired minimal protocol subset. In addition, >>> this implementation may require quite a bit of work (but I've never had an in- >>> deep look at the guts of the HDF5 library, so I may be wrong). 
>>> >>> Cheers, >>> >> If the aim is to come up with a method of saving numpy arrays that uses >> a standard protocol and does not introduce large dependencies, then >> could this be accomplished using netcdf instead of hdf5, specifically >> Roberto De Almeida's pupynere, which is already in scipy.io as >> netcdf.py? Or does hdf5 have essential characteristics for this purpose >> that netcdf lacks? >> >> Eric >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From neilcrighton at gmail.com Mon May 25 13:57:56 2009 From: neilcrighton at gmail.com (Neil Crighton) Date: Mon, 25 May 2009 17:57:56 +0000 (UTC) Subject: [Numpy-discussion] List/location of consecutive integers References: Message-ID: Andrea Gavana gmail.com> writes: > this should be a very easy question but I am trying to make a > script run as fast as possible, so please bear with me if the solution > is easy and I just overlooked it. That's weird, I was trying to solve exactly the same problem a couple of weeks ago for a program I was working on. It must come up a lot. I ended up with a similar solution to Josef's, but it took me more than an hour to work it out - I should have asked here first! Neil From faltet at pytables.org Mon May 25 14:05:16 2009 From: faltet at pytables.org (Francesc Alted) Date: Mon, 25 May 2009 20:05:16 +0200 Subject: [Numpy-discussion] Home for pyhdf5io? In-Reply-To: <4A1ADB90.5020509@hawaii.edu> References: <000e0cd2a07010d73d046a71a597@google.com> <4A1ADB90.5020509@hawaii.edu> Message-ID: <200905252005.17149.faltet@pytables.org> A Monday 25 May 2009 19:55:28 Eric Firing escrigu?: > >> If the aim is to come up with a method of saving numpy arrays that uses > >> a standard protocol and does not introduce large dependencies, then > >> could this be accomplished using netcdf instead of hdf5, specifically > >> Roberto De Almeida's pupynere, which is already in scipy.io as > >> netcdf.py? Or does hdf5 have essential characteristics for this purpose > >> that netcdf lacks? After looking a bit at the code of pupynere, there is the next line: assert magic == 'CDF', "Error: %s is not a valid NetCDF 3 file" % self.filename So, the current version of pupynere is definitely for version 3 of NetCDF, not version 4. > >>From what I understand, netCFD is based on on HDF5, at least as of the > > > > version 4 release. > > Netcdf4 is indeed built on hdf5, but netcdf3 is not, and netcdf3 format > is likely to stick around for a *very* long time. The netcdf4 library > is backwards-compatible with netcdf3. NetCDF4 is backwards-compatible with NetCDF3 just at API level, not the file format. NetCDF3 has a much more simple format, and completely different from NetCDF4, which is based on HDF5. Cheers, -- Francesc Alted From josef.pktd at gmail.com Mon May 25 14:13:53 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 25 May 2009 14:13:53 -0400 Subject: [Numpy-discussion] List/location of consecutive integers In-Reply-To: References: Message-ID: <1cd32cbb0905251113q649ab0f8i40c918907497e928@mail.gmail.com> On Mon, May 25, 2009 at 1:57 PM, Neil Crighton wrote: > Andrea Gavana gmail.com> writes: > >> ? ? 
this should be a very easy question but I am trying to make a >> script run as fast as possible, so please bear with me if the solution >> is easy and I just overlooked it. > > That's weird, I was trying to solve exactly the same problem a couple of weeks > ago for a program I was working on. It must come up a lot. > > I ended up with a similar solution to Josef's, but it took me more than an hour > to work it out - I should have asked here first! Actually, I got the hint for something similar from Ray Jones earlier in the discussion on the ndimage.measurement rewrite (find min, max per label). It took me also more time to get the solution to work correctly in the first place. And reading the (python part of the) numpy source can be very instructive. Josef From efiring at hawaii.edu Mon May 25 14:29:25 2009 From: efiring at hawaii.edu (Eric Firing) Date: Mon, 25 May 2009 08:29:25 -1000 Subject: [Numpy-discussion] Home for pyhdf5io? In-Reply-To: <200905252005.17149.faltet@pytables.org> References: <000e0cd2a07010d73d046a71a597@google.com> <4A1ADB90.5020509@hawaii.edu> <200905252005.17149.faltet@pytables.org> Message-ID: <4A1AE385.60908@hawaii.edu> Francesc Alted wrote: > A Monday 25 May 2009 19:55:28 Eric Firing escrigu?: >>>> If the aim is to come up with a method of saving numpy arrays that uses >>>> a standard protocol and does not introduce large dependencies, then >>>> could this be accomplished using netcdf instead of hdf5, specifically >>>> Roberto De Almeida's pupynere, which is already in scipy.io as >>>> netcdf.py? Or does hdf5 have essential characteristics for this purpose >>>> that netcdf lacks? > > After looking a bit at the code of pupynere, there is the next line: > > assert magic == 'CDF', "Error: %s is not a valid NetCDF 3 file" % > self.filename > > So, the current version of pupynere is definitely for version 3 of NetCDF, not > version 4. Yes, and I presume it will stay that way--which is fine for the question I am asking above. I should have said "netcdf3" explicitly. Its simplicity compared to hdf5 and netcdf4 is potentially a virtue. The question is, is it *too* simple for the intended purpose? > >>> >From what I understand, netCFD is based on on HDF5, at least as of the >>> >>> version 4 release. >> Netcdf4 is indeed built on hdf5, but netcdf3 is not, and netcdf3 format >> is likely to stick around for a *very* long time. The netcdf4 library >> is backwards-compatible with netcdf3. > > NetCDF4 is backwards-compatible with NetCDF3 just at API level, not the file > format. NetCDF3 has a much more simple format, and completely different from > NetCDF4, which is based on HDF5. Yes, but the netcdf4 *library* includes full netcdf3 compatibility; you can read and write netcdf3 using the netcdf4 library. For example, you can build Jeff Whitaker's http://code.google.com/p/netcdf4-python/ with all the hdf5 bells and whistles, and it will still happily read and, upon request, write netcdf3 files. Eric > > Cheers, > From faltet at pytables.org Mon May 25 15:17:50 2009 From: faltet at pytables.org (Francesc Alted) Date: Mon, 25 May 2009 21:17:50 +0200 Subject: [Numpy-discussion] Home for pyhdf5io? 
In-Reply-To: <4A1AE385.60908@hawaii.edu> References: <000e0cd2a07010d73d046a71a597@google.com> <200905252005.17149.faltet@pytables.org> <4A1AE385.60908@hawaii.edu> Message-ID: <200905252117.50796.faltet@pytables.org> A Monday 25 May 2009 20:29:25 Eric Firing escrigu?: > Francesc Alted wrote: > > A Monday 25 May 2009 19:55:28 Eric Firing escrigu?: > >>>> If the aim is to come up with a method of saving numpy arrays that > >>>> uses a standard protocol and does not introduce large dependencies, > >>>> then could this be accomplished using netcdf instead of hdf5, > >>>> specifically Roberto De Almeida's pupynere, which is already in > >>>> scipy.io as netcdf.py? Or does hdf5 have essential characteristics > >>>> for this purpose that netcdf lacks? > > > > After looking a bit at the code of pupynere, there is the next line: > > > > assert magic == 'CDF', "Error: %s is not a valid NetCDF 3 file" % > > self.filename > > > > So, the current version of pupynere is definitely for version 3 of > > NetCDF, not version 4. > > Yes, and I presume it will stay that way--which is fine for the question > I am asking above. I should have said "netcdf3" explicitly. Its > simplicity compared to hdf5 and netcdf4 is potentially a virtue. > > The question is, is it *too* simple for the intended purpose? I don't think the question is whether a format would be too simple or not, but rather about file compatibility. In that sense HDF5 is emerging as a standard de facto, and many tools are acquiring the capability to read/write this format (e.g. Matlab, IDL, Octave, Mathematica, R, NetCDF4-based apps and many others). Having this interchange capability is what should be seen as desirable, IMO. > > >>> >From what I understand, netCFD is based on on HDF5, at least as of the > >>> > >>> version 4 release. > >> > >> Netcdf4 is indeed built on hdf5, but netcdf3 is not, and netcdf3 format > >> is likely to stick around for a *very* long time. The netcdf4 library > >> is backwards-compatible with netcdf3. > > > > NetCDF4 is backwards-compatible with NetCDF3 just at API level, not the > > file format. NetCDF3 has a much more simple format, and completely > > different from NetCDF4, which is based on HDF5. > > Yes, but the netcdf4 *library* includes full netcdf3 compatibility; you > can read and write netcdf3 using the netcdf4 library. For example, you > can build Jeff Whitaker's http://code.google.com/p/netcdf4-python/ with > all the hdf5 bells and whistles, and it will still happily read and, > upon request, write netcdf3 files. Again, I think that the issue is compatibility with other tools, not just between NetCDF3/NetCDF4 worlds. -- Francesc Alted From jh at physics.ucf.edu Mon May 25 15:40:09 2009 From: jh at physics.ucf.edu (Joe Harrington) Date: Mon, 25 May 2009 15:40:09 -0400 Subject: [Numpy-discussion] add xirr to numpy financial functions? In-Reply-To: (numpy-discussion-request@scipy.org) References: Message-ID: On Mon, 25 May 2009 13:51:38 -0400, josef.pktd at gmail.com wrote: > On Mon, May 25, 2009 at 11:50 AM, Joe Harrington wrote: > > On Sun, 24 May 2009 18:14:42 -0400 josef.pktd at gmail.com wrote: > >> On Sun, May 24, 2009 at 4:33 PM, Joe Harrington wrote: > >> > I hate to ask for another function in numpy, but there's an obvious > >> > one missing in the financial group: xirr. ?It could be done as a new > >> > function or as an extension to the existing np.irr. 
> >> > > >> > The internal rate of return (np.irr) is defined as the growth rate > >> > that would give you a zero balance at the end of a period of > >> > investment given a series of cash flows into or out of the investment > >> > at regular intervals (the first and last cash flows are usually an > >> > initial deposit and a withdrawal of the current balance). > >> > > >> > This is useful in academics, but if you're tracking a real investment, > >> > you don't just withdraw or add money on a perfectly annual basis, nor > >> > do you want a calc with thousands of days of zero entries just so you > >> > can handle the uneven intervals by evening them out. ?Both excel and > >> > openoffice define a "xirr" function that pairs each cash flow with a > >> > date. ?Would there be an objection to either a xirr or adding an > >> > optional second arg (or a keyword arg) to np.irr in numpy? ?Who writes > >> > the code is a different question, but that part isn't hard. > >> > > >> > >> > >> > >> 3 comments: > >> > >> * open office has also the other function in an x??? version, so it > >> might be good to add it consistently to all functions > >> > >> * date type: scikits.timeseries and the gsoc for implementing a date > >> type would be useful to have a clear date type, or would you want to > >> base it only on python standard library > >> > >> * real life accuracy: given that there are large differences in the > >> definition of a year for financial calculations, any simple > >> implementation would be only approximately accurate. for example in > >> the open office help, oddlyield list the following option > >> > >> Basis is chosen from a list of options and indicates how the year is > >> to be calculated. > >> Basis Calculation > >> 0 or missing US method (NASD), 12 months of 30 days each > >> 1 Exact number of days in months, exact number of days in year > >> 2 Exact number of days in month, year has 360 days > >> 3 Exact number of days in month, year has 365 days > >> 4 European method, 12 months of 30 days each > >> > >> So, my question: what's the purpose of the financial function in numpy? > >> Currently it provides convenient functions for (approximate) interest > >> calculations. > >> If they get expanded to a "serious" implementation of, for example, > >> the main financial functions listed in the open office help (just for > >> reference) then maybe numpy is not the right location for it. > >> > >> I started to do something similar in matlab, and once I tried to use > >> real dates instead of just counting months, the accounting rules get > >> quickly very messy. > >> > >> Using dates as you propose would be very convenient, but the users > >> shouldn't be surprised that their actual payments at the end of the > >> year don't fully match up with what numpy told them. > >> > >> my 3cents > >> > >> Josef > > > > First point: agreed. ?I wish this community had a design review > > process for numpy and scipy, so that these things could get properly > > hashed out, and not just one person (even Travis) suggesting something > > and everyone else saying yeah-sure-whatever. > > > > Does anyone on the list have the financial background to suggest what > > functions "should" be included in a basic set of financial routines? > > xirr is the only one I've ever used in a spreadsheet, myself. > > > > Other points: Yuk. ?You're right. > > > > When these first came up for discussion, I had a Han Solo moment > > ("I've got a baaad feeling about this...") but I couldn't put my > > finger on why. 
?They seemed like simple and limited functions with > > high utility. ?Certainly anything as open-ended as financial-industry > > rules should go elsewhere (scikits, scipy, monpy, whatever). > > > > But, that doesn't prevent a user-supplied, floating-point time array > > from going into a function in numpy. ?The rate of return would be in > > units of that array. ?Functions that convert date/time in some format > > (or many) and following some rule (or one of many) to such a floating > > array can still go elsewhere, maintained by people who know the > > definitions, if they have interest (pun intended). ?That would make > > the functions in numpy much more useful without bloating them or > > making them a maintenance nightmare. > > > > If you think of time just as a regularly spaced, e.g. days, but with > sparse points on it, or as a continuous variable, then extending the > current functions should be relatively easy. I guess the only > questions are compounding, annual, quarterly or at each payment, and > whether the annual rate is calculated as real compounded annualized > rate or as accounting annual rate, e.g. quarterlyrate*4. > > This leaves "What is the present value, if you get 100 Dollars at the > 10th day of each month (or at the next working day if the 10th day is > a holiday or a weekend) for the next 5 years and the monthly interest > rate is 5/12%?" for another day. > > Initially I understood you wanted the date as a string or date type as > in e.g open office. What would be the units of the user-supplied, > floating-point time array? > It is still necessary to know the time units to provide an annualized > rate, unless the rate is in continuous time, exp(r*t). I don't know > whether this would apply to all functions in numpy.finance, it's a > while since I looked at the code. Maybe there are some standard > simplifications in open office or excel. > > I briefly skimmed the list of function in the open office help, and it > would be useful to have them available, e.g. as a package in scipy. > But my google searches in the past for applications in finance with a > compatible license didn't provide much useful code that could form the > basis of a finance package. > > Adding more convenience and functionality to numpy.finance is useful, > but if they get extended with slow feature creep, then another > location (scipy) might be more appropriate and would be more > expandable, even if it happens only slowly. > > That's just my opinion (obviously), I'm a relative newbie to > numpy/scipy and still working my way through all the different > subpackages. np.irr is defined on (anonymous) constant time intervals and gives you the growth per time interval. The code is very short, basically a call to np.roots(values): def irr(values): """ Return the Internal Rate of Return (IRR). This is the rate of return that gives a net present value of 0.0. Parameters ---------- values : array_like, shape(N,) Input cash flows per time period. At least the first value would be negative to represent the investment in the project. Returns ------- out : float Internal Rate of Return for periodic input values. 
Examples -------- >>> np.irr([-100, 39, 59, 55, 20]) 0.2809484211599611 """ res = np.roots(values[::-1]) # Find the root(s) between 0 and 1 mask = (res.imag == 0) & (res.real > 0) & (res.real <= 1) res = res[mask].real if res.size == 0: return np.nan rate = 1.0/res - 1 if rate.size == 1: rate = rate.item() return rate So, I think this is a continuous definition of growth, not some periodic compounding. I'd propose the time array would be in anonymous units, and the result would be in terms of those units. For example, if an interval of 1.0 in the time array were one fortnight, it would give interest in units of continuous growth per fortnight, etc. Anything with many more options than that does not belong in numpy (but it would be interesting to have elsewhere). --jh-- From jsseabold at gmail.com Mon May 25 16:27:49 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Mon, 25 May 2009 16:27:49 -0400 Subject: [Numpy-discussion] add xirr to numpy financial functions? In-Reply-To: References: Message-ID: On Mon, May 25, 2009 at 3:40 PM, Joe Harrington wrote: > On Mon, 25 May 2009 13:51:38 -0400, josef.pktd at gmail.com wrote: >> On Mon, May 25, 2009 at 11:50 AM, Joe Harrington wrote: >> > On Sun, 24 May 2009 18:14:42 -0400 josef.pktd at gmail.com wrote: >> >> On Sun, May 24, 2009 at 4:33 PM, Joe Harrington wrote: >> >> > I hate to ask for another function in numpy, but there's an obvious >> >> > one missing in the financial group: xirr. ?It could be done as a new >> >> > function or as an extension to the existing np.irr. >> >> > >> >> > The internal rate of return (np.irr) is defined as the growth rate >> >> > that would give you a zero balance at the end of a period of >> >> > investment given a series of cash flows into or out of the investment >> >> > at regular intervals (the first and last cash flows are usually an >> >> > initial deposit and a withdrawal of the current balance). >> >> > >> >> > This is useful in academics, but if you're tracking a real investment, >> >> > you don't just withdraw or add money on a perfectly annual basis, nor >> >> > do you want a calc with thousands of days of zero entries just so you >> >> > can handle the uneven intervals by evening them out. ?Both excel and >> >> > openoffice define a "xirr" function that pairs each cash flow with a >> >> > date. ?Would there be an objection to either a xirr or adding an >> >> > optional second arg (or a keyword arg) to np.irr in numpy? ?Who writes >> >> > the code is a different question, but that part isn't hard. >> >> > >> >> >> >> >> >> >> >> 3 comments: >> >> >> >> * open office has also the other function in an x??? version, so it >> >> might be good to add it consistently to all functions >> >> >> >> * date type: scikits.timeseries and the gsoc for implementing a date >> >> type would be useful to have a clear date type, or would you want to >> >> base it only on python standard library >> >> >> >> * real life accuracy: given that there are large differences in the >> >> definition of a year for financial calculations, any simple >> >> implementation would be only approximately accurate. for example in >> >> the open office help, oddlyield list the following option >> >> >> >> Basis is chosen from a list of options and indicates how the year is >> >> to be calculated. 
>> >> Basis Calculation >> >> 0 or missing US method (NASD), 12 months of 30 days each >> >> 1 Exact number of days in months, exact number of days in year >> >> 2 Exact number of days in month, year has 360 days >> >> 3 Exact number of days in month, year has 365 days >> >> 4 European method, 12 months of 30 days each >> >> >> >> So, my question: what's the purpose of the financial function in numpy? >> >> Currently it provides convenient functions for (approximate) interest >> >> calculations. >> >> If they get expanded to a "serious" implementation of, for example, >> >> the main financial functions listed in the open office help (just for >> >> reference) then maybe numpy is not the right location for it. >> >> >> >> I started to do something similar in matlab, and once I tried to use >> >> real dates instead of just counting months, the accounting rules get >> >> quickly very messy. >> >> >> >> Using dates as you propose would be very convenient, but the users >> >> shouldn't be surprised that their actual payments at the end of the >> >> year don't fully match up with what numpy told them. >> >> >> >> my 3cents >> >> >> >> Josef >> > >> > First point: agreed. ?I wish this community had a design review >> > process for numpy and scipy, so that these things could get properly >> > hashed out, and not just one person (even Travis) suggesting something >> > and everyone else saying yeah-sure-whatever. >> > >> > Does anyone on the list have the financial background to suggest what >> > functions "should" be included in a basic set of financial routines? >> > xirr is the only one I've ever used in a spreadsheet, myself. >> > >> > Other points: Yuk. ?You're right. >> > >> > When these first came up for discussion, I had a Han Solo moment >> > ("I've got a baaad feeling about this...") but I couldn't put my >> > finger on why. ?They seemed like simple and limited functions with >> > high utility. ?Certainly anything as open-ended as financial-industry >> > rules should go elsewhere (scikits, scipy, monpy, whatever). >> > >> > But, that doesn't prevent a user-supplied, floating-point time array >> > from going into a function in numpy. ?The rate of return would be in >> > units of that array. ?Functions that convert date/time in some format >> > (or many) and following some rule (or one of many) to such a floating >> > array can still go elsewhere, maintained by people who know the >> > definitions, if they have interest (pun intended). ?That would make >> > the functions in numpy much more useful without bloating them or >> > making them a maintenance nightmare. >> > >> >> If you think of time just as a regularly spaced, e.g. days, but with >> sparse points on it, or as a continuous variable, then extending the >> current functions should be relatively easy. I guess the only >> questions are compounding, annual, quarterly or at each payment, and >> whether the annual rate is calculated as real compounded annualized >> rate or as accounting annual rate, e.g. quarterlyrate*4. >> >> This leaves "What is the present value, if you get 100 Dollars at the >> 10th day of each month (or at the next working day if the 10th day is >> a holiday or a weekend) for the next 5 years and the monthly interest >> rate is 5/12%?" ? for another day. >> >> Initially I understood you wanted the date as a string or date type as >> in e.g open office. What would be the units of the user-supplied, >> floating-point time array? 
>> It is still necessary to know the time units to provide an annualized >> rate, unless the rate is in continuous time, exp(r*t). I don't know >> whether this would apply to all functions in numpy.finance, it's a >> while since I looked at the code. Maybe there are some standard >> simplifications in open office or excel. >> >> I briefly skimmed the list of function in the open office help, and it >> would be useful to have them available, e.g. as a package in scipy. >> But my google searches in the past for applications in finance with a >> compatible license didn't provide much useful code that could form the >> basis of a finance package. >> >> Adding more convenience and functionality to numpy.finance is useful, >> but if they get extended with slow feature creep, then another >> location (scipy) might be more appropriate and would be more >> expandable, even if it happens only slowly. >> >> That's just my opinion (obviously), I'm a relative newbie to >> numpy/scipy and still working my way through all the different >> subpackages. > > np.irr is defined on (anonymous) constant time intervals and gives you > the growth per time interval. ?The code is very short, basically a > call to np.roots(values): > > def irr(values): > ? ?""" > ? ?Return the Internal Rate of Return (IRR). > > ? ?This is the rate of return that gives a net present value of 0.0. > > ? ?Parameters > ? ?---------- > ? ?values : array_like, shape(N,) > ? ? ? ?Input cash flows per time period. ?At least the first value would be > ? ? ? ?negative to represent the investment in the project. > > ? ?Returns > ? ?------- > ? ?out : float > ? ? ? ?Internal Rate of Return for periodic input values. > > ? ?Examples > ? ?-------- > ? ?>>> np.irr([-100, 39, 59, 55, 20]) > ? ?0.2809484211599611 > > ? ?""" > ? ?res = np.roots(values[::-1]) > ? ?# Find the root(s) between 0 and 1 > ? ?mask = (res.imag == 0) & (res.real > 0) & (res.real <= 1) > ? ?res = res[mask].real > ? ?if res.size == 0: > ? ? ? ?return np.nan > ? ?rate = 1.0/res - 1 > ? ?if rate.size == 1: > ? ? ? ?rate = rate.item() > ? ?return rate > > So, I think this is a continuous definition of growth, not some > periodic compounding. > > I'd propose the time array would be in anonymous units, and the result > would be in terms of those units. ?For example, if an interval of 1.0 > in the time array were one fortnight, it would give interest in units > of continuous growth per fortnight, etc. ?Anything with many more > options than that does not belong in numpy (but it would be > interesting to have elsewhere). > Here is my stab at xirr. It depends on the python datetime module and the Newton - Raphson algorithm in scipy.optimize, but it could be taken as a starting point if someone wants to get rid of the dependencies (I haven't worked too much with dates or NR before). The reference for the open office version is here , and it performs in exactly the same way (assumes 365 days a year). It also doesn't take a 'begin' or 'end' argument for when the payments are made. but this is already in the numpy.financial and could be added easily. 
def _discf(rate, pmts, dates): import numpy as np dcf=[] for i,cf in enumerate(pmts): d=dates[i]-dates[0] dcf.append(cf*(1+rate)**(-d.days/365.)) return np.add.reduce(dcf) def xirr(pmts, dates, guess=.10): ''' IRR function that accepts irregularly spaced cash flows Parameters ---------- values: array_like Contains the cash flows including the initial investment dates: array_like Contains the dates of payments as in the form (year, month, day) Returns: Float Internal Rate of Return Notes ---------- In general the xirr is the solution to .. math:: \sum_{t=0}^M{\frac{v_t}{(1+xirr)^{(date_t-date_0)/365}}} = 0 Examples -------------- dates=[[2008,2,5],[2008,7,5],[2009,1,5]] pmts=[-2750,1000,2000] print xirr(pmts,dates) ''' from datetime import date from scipy.optimize import newton for i,dt in enumerate(dates): dates[i]=date(*dt) f = lambda x: _discf(x, pmts, dates) return newton(f, guess) if __name__=="__main__": dates=[[2008,2,5],[2008,7,5],[2009,1,5]] pmts=[-2750,1000,2000] print xirr(pmts,dates) Cheers, Skipper From josef.pktd at gmail.com Mon May 25 18:29:55 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 25 May 2009 18:29:55 -0400 Subject: [Numpy-discussion] add xirr to numpy financial functions? In-Reply-To: References: Message-ID: <1cd32cbb0905251529y4b042216he726dcbf2982d29d@mail.gmail.com> On Mon, May 25, 2009 at 4:27 PM, Skipper Seabold wrote: > On Mon, May 25, 2009 at 3:40 PM, Joe Harrington wrote: >> On Mon, 25 May 2009 13:51:38 -0400, josef.pktd at gmail.com wrote: >>> On Mon, May 25, 2009 at 11:50 AM, Joe Harrington wrote: >>> > On Sun, 24 May 2009 18:14:42 -0400 josef.pktd at gmail.com wrote: >>> >> On Sun, May 24, 2009 at 4:33 PM, Joe Harrington wrote: >>> >> > I hate to ask for another function in numpy, but there's an obvious >>> >> > one missing in the financial group: xirr. ?It could be done as a new >>> >> > function or as an extension to the existing np.irr. >>> >> > >>> >> > The internal rate of return (np.irr) is defined as the growth rate >>> >> > that would give you a zero balance at the end of a period of >>> >> > investment given a series of cash flows into or out of the investment >>> >> > at regular intervals (the first and last cash flows are usually an >>> >> > initial deposit and a withdrawal of the current balance). >>> >> > >>> >> > This is useful in academics, but if you're tracking a real investment, >>> >> > you don't just withdraw or add money on a perfectly annual basis, nor >>> >> > do you want a calc with thousands of days of zero entries just so you >>> >> > can handle the uneven intervals by evening them out. ?Both excel and >>> >> > openoffice define a "xirr" function that pairs each cash flow with a >>> >> > date. ?Would there be an objection to either a xirr or adding an >>> >> > optional second arg (or a keyword arg) to np.irr in numpy? ?Who writes >>> >> > the code is a different question, but that part isn't hard. >>> >> > >>> >> >>> >> >>> >> >>> >> 3 comments: >>> >> >>> >> * open office has also the other function in an x??? version, so it >>> >> might be good to add it consistently to all functions >>> >> >>> >> * date type: scikits.timeseries and the gsoc for implementing a date >>> >> type would be useful to have a clear date type, or would you want to >>> >> base it only on python standard library >>> >> >>> >> * real life accuracy: given that there are large differences in the >>> >> definition of a year for financial calculations, any simple >>> >> implementation would be only approximately accurate. 
for example in >>> >> the open office help, oddlyield list the following option >>> >> >>> >> Basis is chosen from a list of options and indicates how the year is >>> >> to be calculated. >>> >> Basis Calculation >>> >> 0 or missing US method (NASD), 12 months of 30 days each >>> >> 1 Exact number of days in months, exact number of days in year >>> >> 2 Exact number of days in month, year has 360 days >>> >> 3 Exact number of days in month, year has 365 days >>> >> 4 European method, 12 months of 30 days each >>> >> >>> >> So, my question: what's the purpose of the financial function in numpy? >>> >> Currently it provides convenient functions for (approximate) interest >>> >> calculations. >>> >> If they get expanded to a "serious" implementation of, for example, >>> >> the main financial functions listed in the open office help (just for >>> >> reference) then maybe numpy is not the right location for it. >>> >> >>> >> I started to do something similar in matlab, and once I tried to use >>> >> real dates instead of just counting months, the accounting rules get >>> >> quickly very messy. >>> >> >>> >> Using dates as you propose would be very convenient, but the users >>> >> shouldn't be surprised that their actual payments at the end of the >>> >> year don't fully match up with what numpy told them. >>> >> >>> >> my 3cents >>> >> >>> >> Josef >>> > >>> > First point: agreed. ?I wish this community had a design review >>> > process for numpy and scipy, so that these things could get properly >>> > hashed out, and not just one person (even Travis) suggesting something >>> > and everyone else saying yeah-sure-whatever. >>> > >>> > Does anyone on the list have the financial background to suggest what >>> > functions "should" be included in a basic set of financial routines? >>> > xirr is the only one I've ever used in a spreadsheet, myself. >>> > >>> > Other points: Yuk. ?You're right. >>> > >>> > When these first came up for discussion, I had a Han Solo moment >>> > ("I've got a baaad feeling about this...") but I couldn't put my >>> > finger on why. ?They seemed like simple and limited functions with >>> > high utility. ?Certainly anything as open-ended as financial-industry >>> > rules should go elsewhere (scikits, scipy, monpy, whatever). >>> > >>> > But, that doesn't prevent a user-supplied, floating-point time array >>> > from going into a function in numpy. ?The rate of return would be in >>> > units of that array. ?Functions that convert date/time in some format >>> > (or many) and following some rule (or one of many) to such a floating >>> > array can still go elsewhere, maintained by people who know the >>> > definitions, if they have interest (pun intended). ?That would make >>> > the functions in numpy much more useful without bloating them or >>> > making them a maintenance nightmare. >>> > >>> >>> If you think of time just as a regularly spaced, e.g. days, but with >>> sparse points on it, or as a continuous variable, then extending the >>> current functions should be relatively easy. I guess the only >>> questions are compounding, annual, quarterly or at each payment, and >>> whether the annual rate is calculated as real compounded annualized >>> rate or as accounting annual rate, e.g. quarterlyrate*4. >>> >>> This leaves "What is the present value, if you get 100 Dollars at the >>> 10th day of each month (or at the next working day if the 10th day is >>> a holiday or a weekend) for the next 5 years and the monthly interest >>> rate is 5/12%?" ? for another day. 
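(For a sense of scale, the arithmetic behind that present-value question is a one-liner once the weekend/holiday rule is dropped; the calendar handling is the part that gets messy. A rough sketch, assuming exactly 60 equal monthly periods, not part of the quoted message:)

import numpy as np

r = 0.05 / 12                      # "5/12 %" per month as a decimal
k = np.arange(1, 5 * 12 + 1)       # 60 monthly payments
pv = np.sum(100.0 / (1.0 + r)**k)
print(pv)                          # roughly 5300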
>>> >>> Initially I understood you wanted the date as a string or date type as >>> in e.g open office. What would be the units of the user-supplied, >>> floating-point time array? >>> It is still necessary to know the time units to provide an annualized >>> rate, unless the rate is in continuous time, exp(r*t). I don't know >>> whether this would apply to all functions in numpy.finance, it's a >>> while since I looked at the code. Maybe there are some standard >>> simplifications in open office or excel. >>> >>> I briefly skimmed the list of function in the open office help, and it >>> would be useful to have them available, e.g. as a package in scipy. >>> But my google searches in the past for applications in finance with a >>> compatible license didn't provide much useful code that could form the >>> basis of a finance package. >>> >>> Adding more convenience and functionality to numpy.finance is useful, >>> but if they get extended with slow feature creep, then another >>> location (scipy) might be more appropriate and would be more >>> expandable, even if it happens only slowly. >>> >>> That's just my opinion (obviously), I'm a relative newbie to >>> numpy/scipy and still working my way through all the different >>> subpackages. >> >> np.irr is defined on (anonymous) constant time intervals and gives you >> the growth per time interval. ?The code is very short, basically a >> call to np.roots(values): >> >> def irr(values): >> ? ?""" >> ? ?Return the Internal Rate of Return (IRR). >> >> ? ?This is the rate of return that gives a net present value of 0.0. >> >> ? ?Parameters >> ? ?---------- >> ? ?values : array_like, shape(N,) >> ? ? ? ?Input cash flows per time period. ?At least the first value would be >> ? ? ? ?negative to represent the investment in the project. >> >> ? ?Returns >> ? ?------- >> ? ?out : float >> ? ? ? ?Internal Rate of Return for periodic input values. >> >> ? ?Examples >> ? ?-------- >> ? ?>>> np.irr([-100, 39, 59, 55, 20]) >> ? ?0.2809484211599611 >> >> ? ?""" >> ? ?res = np.roots(values[::-1]) >> ? ?# Find the root(s) between 0 and 1 >> ? ?mask = (res.imag == 0) & (res.real > 0) & (res.real <= 1) >> ? ?res = res[mask].real >> ? ?if res.size == 0: >> ? ? ? ?return np.nan >> ? ?rate = 1.0/res - 1 >> ? ?if rate.size == 1: >> ? ? ? ?rate = rate.item() >> ? ?return rate >> >> So, I think this is a continuous definition of growth, not some >> periodic compounding. >> >> I'd propose the time array would be in anonymous units, and the result >> would be in terms of those units. ?For example, if an interval of 1.0 >> in the time array were one fortnight, it would give interest in units >> of continuous growth per fortnight, etc. ?Anything with many more >> options than that does not belong in numpy (but it would be >> interesting to have elsewhere). >> > > Here is my stab at xirr. ?It depends on the python datetime module and > the Newton - Raphson algorithm in scipy.optimize, but it could be > taken as a starting point if someone wants to get rid of the > dependencies (I haven't worked too much with dates or NR before). ?The > reference for the open office version is here > , > and it performs in exactly the same way (assumes 365 days a year). ?It > also doesn't take a 'begin' or 'end' argument for when the payments > are made. but this is already in the numpy.financial and could be > added easily. > > def _discf(rate, pmts, dates): > ? ?import numpy as np > ? ?dcf=[] > ? ?for i,cf in enumerate(pmts): > ? ? ? ?d=dates[i]-dates[0] > ? ? ? 
?dcf.append(cf*(1+rate)**(-d.days/365.)) > ? ?return np.add.reduce(dcf) > > def xirr(pmts, dates, guess=.10): > ? ?''' > ? ?IRR function that accepts irregularly spaced cash flows > > ? ?Parameters > ? ?---------- > ? ?values: array_like > ? ? ? ? ?Contains the cash flows including the initial investment > ? ?dates: array_like > ? ? ? ? ?Contains the dates of payments as in the form (year, month, day) > > ? ?Returns: Float > ? ? ? ? ?Internal Rate of Return > > ? ?Notes > ? ?---------- > ? ?In general the xirr is the solution to > > ? ?.. math:: \sum_{t=0}^M{\frac{v_t}{(1+xirr)^{(date_t-date_0)/365}}} = 0 > > > ? ?Examples > ? ?-------------- > ? ?dates=[[2008,2,5],[2008,7,5],[2009,1,5]] > ? ?pmts=[-2750,1000,2000] > ? ?print xirr(pmts,dates) > ? ?''' > ? ?from datetime import date > ? ?from scipy.optimize import newton > > ? ?for i,dt in enumerate(dates): > ? ? ? ?dates[i]=date(*dt) > > ? ?f = lambda x: _discf(x, pmts, dates) > > ? ?return newton(f, guess) > > if __name__=="__main__": > ? ?dates=[[2008,2,5],[2008,7,5],[2009,1,5]] > ? ?pmts=[-2750,1000,2000] > ? ?print xirr(pmts,dates) While I was still trying to think about the general problem, Skipper already implemented a solution. The advantage of Skippers implementation using actual dates instead of just an array of numbers is that it is possible to directly calculate the annual irr, since the time units are well specified. The only problem is the need for an equation solver in numpy. Just using a date tuple would remove the problem of string parsing, and it might be possible to extend it later to a date array. So, I think it would be possible to include Skippers solution, with some cleanup and testing, if an equation solver can be found or if np.roots can handle high order (sparse) polynomials. Below is my original message, which is based on the assumption of a date array that is just an array of numbers without any time units associated with it. Josef """ >From my reading of the current irr, you have compounding of the interest rate at the given time interval. So if your data is daily data, you would get a daily interest rate with daily compounded rates, which might not be the most interesting number that the user wants. For the iir function it would still be very easy for the user to convert the daily or monthly rate to the annualized rate, (1+r_d)**365 -1 (1+r_m)**12 -1 (?). For the implementation, would np.roots still work if you have 1000 days for example, or 360 months, or a few hundred fortnights? What would be the alternative in numpy for finding the root? equation solvers are in scipy. For arbitrary time units with possible large numbers, working with exp should be easier . In this case the exponent would be floats and not integers, so not a polynomial. I think in the continuous time version, we need to solve for r in sum(values*exp(-r*dates)) = 0 Can this be done in numpy? If dates are floats where the unit is one year, then this would give the continuously compounded annual rate, I think. Another property of the current function, that I just realized, is, that it doesn't allow for negative interest rates. This might not be a problem for the intended use, but if you look at real, i.e. inflation adjusted, interest rates then it happens often enough. Other options that might work if np.roots can handle it, would be to use integer time internally but fractional time from the user where the integer unit would be the reference period and the fractions would be for example 2/12 for the second month. 
I never tried this but using fractional units has a long enough tradition in finance. Or that the user optionally specifies the time units ("y" or "m" or "d") or number of periods per year (365, 12, 52, 26) """ From pgmdevlist at gmail.com Mon May 25 18:36:19 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 25 May 2009 18:36:19 -0400 Subject: [Numpy-discussion] add xirr to numpy financial functions? In-Reply-To: <1cd32cbb0905251529y4b042216he726dcbf2982d29d@mail.gmail.com> References: <1cd32cbb0905251529y4b042216he726dcbf2982d29d@mail.gmail.com> Message-ID: Sorry to jump in a conversation I haven't followed too deep in details, but I'm sure you're all aware of the scikits.timeseries package by now. This should at least help you manage the dates operations in a straightforward manner. I think that could be a nice extension to the package: after all, half of the core developers is a financial analyst... From josef.pktd at gmail.com Mon May 25 19:02:09 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 25 May 2009 19:02:09 -0400 Subject: [Numpy-discussion] add xirr to numpy financial functions? In-Reply-To: References: <1cd32cbb0905251529y4b042216he726dcbf2982d29d@mail.gmail.com> Message-ID: <1cd32cbb0905251602n1f8bfff3l71a8d2e4bd470769@mail.gmail.com> On Mon, May 25, 2009 at 6:36 PM, Pierre GM wrote: > Sorry to jump in a conversation I haven't followed too deep in > details, but I'm sure you're all aware of the scikits.timeseries > package by now. This should at least help you manage the dates > operations in a straightforward manner. I think that could be a nice > extension to the package: after all, half of the core developers is a > financial analyst... The problem is, if the functions are enhanced in the current numpy, then scikits.timeseries is not (yet) available. I agree that for any more extended finance package, the handling of "time"series (in calender time) should make use of scikits.timeseries (and possibly the new datetime array type.) Pierre, your not already hiding by chance any finance code in your timeseries scikit? :) Josef BTW: here are the formulas for the NotImplementedError http://wiki.services.openoffice.org/wiki/Documentation/How_Tos/Calc:_Derivation_of_Financial_Formulas#IMPT.2C_PPMT From josef.pktd at gmail.com Mon May 25 19:27:46 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 25 May 2009 19:27:46 -0400 Subject: [Numpy-discussion] add xirr to numpy financial functions? In-Reply-To: <1cd32cbb0905251529y4b042216he726dcbf2982d29d@mail.gmail.com> References: <1cd32cbb0905251529y4b042216he726dcbf2982d29d@mail.gmail.com> Message-ID: <1cd32cbb0905251627t2ea151ddye095df8c775bbee@mail.gmail.com> > The advantage of Skippers implementation using actual dates instead of > just an array of numbers is that it is possible to directly calculate > the annual irr, since the time units are well specified. The only > problem is the need for an equation solver in numpy. Just using a date > tuple would remove the problem of string parsing, and it might be > possible to extend it later to a date array. > > So, I think it would be possible to include Skippers solution, with > some cleanup and testing, if an equation solver can be found or if > np.roots can handle high order (sparse) polynomials. > I looked a bit more: the current implementation of ``rate`` uses it's own iterative (Newton) solver, and in a similar way this could be done for a more general xirr. 
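To make that concrete, a rough sketch of what a self-contained Newton iteration for an irregular-cashflow IRR could look like, along the lines of the iteration inside ``rate`` (the function name, the 365-day year and the convergence settings are illustrative assumptions, not a proposed API):

from datetime import date
import numpy as np

def xirr_newton(values, dates, guess=0.1, tol=1e-10, maxiter=100):
    # sketch only: dates are (year, month, day) tuples, times measured
    # in years of 365 days from the first cash flow
    values = np.asarray(values, dtype=float)
    d0 = date(*dates[0])
    t = np.array([(date(*d) - d0).days for d in dates]) / 365.0
    r = guess
    for _ in range(maxiter):
        f = np.sum(values / (1.0 + r)**t)                 # net present value
        df = np.sum(-t * values / (1.0 + r)**(t + 1.0))   # d(NPV)/d(rate)
        step = f / df
        r = r - step
        if abs(step) < tol:
            return r
    return np.nan   # no convergence

print(xirr_newton([-2750, 1000, 2000],
                  [(2008, 2, 5), (2008, 7, 5), (2009, 1, 5)]))   # about 0.12

Unlike the np.roots-based irr, nothing here restricts the result to positive rates, although Newton can of course fail if the starting guess is poor.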
So with a bit of work this doesn't seem to be a problem and the only question that remains is the specification of the dates. Josef From pgmdevlist at gmail.com Mon May 25 19:37:33 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 25 May 2009 19:37:33 -0400 Subject: [Numpy-discussion] add xirr to numpy financial functions? In-Reply-To: <1cd32cbb0905251602n1f8bfff3l71a8d2e4bd470769@mail.gmail.com> References: <1cd32cbb0905251529y4b042216he726dcbf2982d29d@mail.gmail.com> <1cd32cbb0905251602n1f8bfff3l71a8d2e4bd470769@mail.gmail.com> Message-ID: On May 25, 2009, at 7:02 PM, josef.pktd at gmail.com wrote: > On Mon, May 25, 2009 at 6:36 PM, Pierre GM > wrote: >> Sorry to jump in a conversation I haven't followed too deep in >> details, but I'm sure you're all aware of the scikits.timeseries >> package by now. This should at least help you manage the dates >> operations in a straightforward manner. I think that could be a nice >> extension to the package: after all, half of the core developers is a >> financial analyst... > > The problem is, if the functions are enhanced in the current numpy, > then scikits.timeseries is not (yet) available. Mmh, I'm not following you here... > > Pierre, your not already hiding by chance any finance code in your > timeseries scikit? :) Ah, you should ask Matt, he's the financial analyst, I'm the hydrologist... Would moving_funcs.mov_average_expw do something you'd find useful ? Anyhow, if the pb you have are just to specify dates, I really think you should give the scikits a try. And send feedback, of course... From josef.pktd at gmail.com Mon May 25 20:06:13 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 25 May 2009 20:06:13 -0400 Subject: [Numpy-discussion] add xirr to numpy financial functions? In-Reply-To: References: <1cd32cbb0905251529y4b042216he726dcbf2982d29d@mail.gmail.com> <1cd32cbb0905251602n1f8bfff3l71a8d2e4bd470769@mail.gmail.com> Message-ID: <1cd32cbb0905251706j41f6be2eg21802c32412e6c1e@mail.gmail.com> On Mon, May 25, 2009 at 7:37 PM, Pierre GM wrote: > > On May 25, 2009, at 7:02 PM, josef.pktd at gmail.com wrote: > >> On Mon, May 25, 2009 at 6:36 PM, Pierre GM >> wrote: >>> Sorry to jump in a conversation I haven't followed too deep in >>> details, but I'm sure you're all aware of the scikits.timeseries >>> package by now. This should at least help you manage the dates >>> operations in a straightforward manner. I think that could be a nice >>> extension to the package: after all, half of the core developers is a >>> financial analyst... >> >> The problem is, if the functions are enhanced in the current numpy, >> then scikits.timeseries is not (yet) available. > > Mmh, I'm not following you here... The original question was how we can enhance numpy.financial, eg. np.irr So we are restricted to use only what is available in numpy and in standard python. > >> >> Pierre, your not already hiding by chance any finance code in your >> timeseries scikit? :) > > Ah, you should ask Matt, he's the financial analyst, I'm the > hydrologist... Would moving_funcs.mov_average_expw do something you'd > find useful ? I looked at your moving functions, autocorrelation function and so on a while ago. That's were I learned how to use np.correlate or the scipy versions of it, and the filter functions. I've written the standard array versions for the moving functions and acf, ccf, in one of my experiments. 
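For readers following along, the kind of "plain ndarray" helpers being referred to can be quite small; a rough sketch (illustrative only, not the scikits.timeseries code nor the experimental code mentioned above):

import numpy as np

def mov_average(x, n):
    # trailing moving average; returns len(x) - n + 1 values
    return np.convolve(x, np.ones(n) / n, mode='valid')

def acf(x, maxlag):
    # sample autocorrelation for lags 0..maxlag of a 1-d array
    x = np.asarray(x, dtype=float) - np.mean(x)
    c = np.correlate(x, x, mode='full')[len(x) - 1:]
    return c[:maxlag + 1] / c[0]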
If Skipper has enough time in his google summer of code, we would like to include some basic timeseries econometrics (ARMA, VAR, ...?) however most likely only for regularly spaced data. > Anyhow, if the pb you have are just to specify dates, I really think > you should give the scikits a try. And send feedback, of course... Skipper intends to write some examples to show how to work with the extensions to scipy.stats, which, I think, will include examples using time series, besides recarrays, and other array types. Is there a time line for including the timeseries scikits in numpy/scipy? With code that is intended for incorporation in numpy/scipy, we are restricted in our external dependencies. Josef From pgmdevlist at gmail.com Mon May 25 20:30:10 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 25 May 2009 20:30:10 -0400 Subject: [Numpy-discussion] add xirr to numpy financial functions? In-Reply-To: <1cd32cbb0905251706j41f6be2eg21802c32412e6c1e@mail.gmail.com> References: <1cd32cbb0905251529y4b042216he726dcbf2982d29d@mail.gmail.com> <1cd32cbb0905251602n1f8bfff3l71a8d2e4bd470769@mail.gmail.com> <1cd32cbb0905251706j41f6be2eg21802c32412e6c1e@mail.gmail.com> Message-ID: <2B446CC8-7960-4274-8A0E-B4E69D8C5A59@gmail.com> On May 25, 2009, at 8:06 PM, josef.pktd at gmail.com wrote: >>> >>> The problem is, if the functions are enhanced in the current numpy, >>> then scikits.timeseries is not (yet) available. >> >> Mmh, I'm not following you here... > > The original question was how we can enhance numpy.financial, eg. > np.irr > So we are restricted to use only what is available in numpy and in > standard python. Ah OK. But it seems that you're now running into a pb w/ dates handling, which might be a bit too specialized for numpy. Anyway, the call isn't mine. >> > I looked at your moving functions, autocorrelation function and so on > a while ago. That's were I learned how to use np.correlate or the > scipy versions of it, and the filter functions. I've written the > standard array versions for the moving functions and acf, ccf, in one > of my experiments. The moving functions were written in C and they work even w/ timeseries (they work quite OK w/ pure MaskedArraysP. We put them in scikits.timeseries because it was easier to have them there than in scipy, for example. > If Skipper has enough time in his google summer of code, we would like > to include some basic timeseries econometrics (ARMA, VAR, ...?) > however most likely only for regularly spaced data. Well, we can easily restrict the functions to the case were there's no missing data nor missing dates. Checking the mask is easy, and we have a method to chek the dates (is_valid) >> Anyhow, if the pb you have are just to specify dates, I really think >> you should give the scikits a try. And send feedback, of course... > > Skipper intends to write some examples to show how to work with the > extensions to scipy.stats, which, I think, will include examples using > time series, besides recarrays, and other array types. Dealing with TimeSeries is pretty much the same thing as dealing with MaskedArray, with the extra convenience of converting from one frequency to another and so forth.... Quite often, an analysis can be performed by dropping the .dates part, working on the .series part (the underlying MaskedArray), and repatching the dates at the end... > > Is there a time line for including the timeseries scikits in numpy/ > scipy? 
> With code that is intended for incorporation in numpy/scipy, we are
> restricted in our external dependencies.

I can't tell, because the decision is not mine. For what I understood,
there could be an inclusion in scipy if there's a need for it. For
that, we need more users and more feedback.... If you catch my drift...

> Josef
>

From josef.pktd at gmail.com Mon May 25 20:55:04 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Mon, 25 May 2009 20:55:04 -0400
Subject: [Numpy-discussion] add xirr to numpy financial functions?
In-Reply-To: <1cd32cbb0905251627t2ea151ddye095df8c775bbee@mail.gmail.com>
References: <1cd32cbb0905251529y4b042216he726dcbf2982d29d@mail.gmail.com>
	<1cd32cbb0905251627t2ea151ddye095df8c775bbee@mail.gmail.com>
Message-ID: <1cd32cbb0905251755w41239410he4dd8053e898c8ea@mail.gmail.com>

On Mon, May 25, 2009 at 7:27 PM, wrote:
>> The advantage of Skippers implementation using actual dates instead of
>> just an array of numbers is that it is possible to directly calculate
>> the annual irr, since the time units are well specified. The only
>> problem is the need for an equation solver in numpy. Just using a date
>> tuple would remove the problem of string parsing, and it might be
>> possible to extend it later to a date array.
>>
>> So, I think it would be possible to include Skippers solution, with
>> some cleanup and testing, if an equation solver can be found or if
>> np.roots can handle high order (sparse) polynomials.
>>
>
> I looked a bit more: the current implementation of ``rate`` uses it's
> own iterative (Newton) solver, and in a similar way this could be done
> for a more general xirr.
>
> So with a bit of work this doesn't seem to be a problem and the only
> question that remains is the specification of the dates.

Here is a solver using the polynomial class, or is there something
like this already in numpy

Josef

'''
Newton solver for value of a polynomial equal to zero
works also for negative rate of return
'''
import numpy as np

nper = 30 #Number of periods
freq = 5 #frequency of payment
val = np.zeros(nper)
val[1:nper+1:freq] = 1 # periodic payment
val[0]=-4 # initial investment

p = np.poly1d(val[::-1])
#print p.roots # very slow for array with 1000 periods
pd1 = np.polyder(p)
#print p(0.95) # net present value
#print pd1(0.95) # derivative of polynomial

rv = np.linspace(0.9,1.05,16)
for v,i in zip(rv, p(rv)):print v,i
for v,i in zip(rv, pd1(rv)):print v,i

# Newton iteration
r = 0.95 # starting value, find polynomial root in neighborhood
for i in range(10):
    r = r - p(r)/pd1(r)
    print r, p(r)

print 'interest rate irr is', 1/r - 1

From josef.pktd at gmail.com Mon May 25 21:06:48 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Mon, 25 May 2009 21:06:48 -0400
Subject: [Numpy-discussion] add xirr to numpy financial functions?
In-Reply-To: <2B446CC8-7960-4274-8A0E-B4E69D8C5A59@gmail.com>
References: <1cd32cbb0905251529y4b042216he726dcbf2982d29d@mail.gmail.com>
	<1cd32cbb0905251602n1f8bfff3l71a8d2e4bd470769@mail.gmail.com>
	<1cd32cbb0905251706j41f6be2eg21802c32412e6c1e@mail.gmail.com>
	<2B446CC8-7960-4274-8A0E-B4E69D8C5A59@gmail.com>
Message-ID: <1cd32cbb0905251806p6b208ac9xbee39478eb0dca67@mail.gmail.com>

On Mon, May 25, 2009 at 8:30 PM, Pierre GM wrote:
>
> On May 25, 2009, at 8:06 PM, josef.pktd at gmail.com wrote:
>>>>
>>>> The problem is, if the functions are enhanced in the current numpy,
>>>> then scikits.timeseries is not (yet) available.
>>>
>>> Mmh, I'm not following you here...
>> >> The original question was how we can enhance numpy.financial, eg. >> np.irr >> So we are restricted to use only what is available in numpy and in >> standard python. > > Ah OK. But it seems that you're now running into a pb w/ dates > handling, which might be a bit too specialized for numpy. Anyway, the > call isn't mine. > >>> >> I looked at your moving functions, autocorrelation function and so on >> a while ago. That's were I learned how to use np.correlate or the >> scipy versions of it, and the filter functions. I've written the >> standard array versions for the moving functions and acf, ccf, in one >> of my experiments. > > The moving functions were written in C and they work even w/ > timeseries (they work quite OK w/ pure MaskedArraysP. We put them in > scikits.timeseries because it was easier to have them there than in > scipy, for example. > > >> If Skipper has enough time in his google summer of code, we would like >> to include some basic timeseries econometrics (ARMA, VAR, ...?) >> however most likely only for regularly spaced data. > > Well, we can easily restrict the functions to the case were there's no > missing data nor missing dates. Checking the mask is easy, and we have > a method to chek the dates (is_valid) > > >>> Anyhow, if the pb you have are just to specify dates, I really think >>> you should give the scikits a try. And send feedback, of course... >> >> Skipper intends to write some examples to show how to work with the >> extensions to scipy.stats, which, I think, will include examples using >> time series, besides recarrays, and other array types. > > > Dealing with TimeSeries is pretty much the same thing as dealing with > MaskedArray, with the extra convenience of converting from one > frequency to another and so forth.... Quite often, an analysis can be > performed by dropping the .dates part, ?working on the .series part > (the underlying MaskedArray), and repatching the dates at the end... > > >> >> Is there a time line for including the timeseries scikits in numpy/ >> scipy? >> With code that is intended for incorporation in numpy/scipy, we are >> restricted in our external dependencies. > > I can't tell, because the decision is not mine. For what I understood, > there could be an inclusion in scipy if there's a need for it. For > that, we need more users end more feedback.... If you catch my drift... Thanks for the info, we will keep this in mind. Personally, I still think of data just as an array or matrix of numbers, when they still have dates and units attached to them, they are usually a pain. And I'm only slowly getting used to the possibility that it doesn't necessarily need to be so painful. (I didn't know you moved the moving functions to C, I thought I saw them in python.) Josef From mattknox.ca at gmail.com Mon May 25 21:18:25 2009 From: mattknox.ca at gmail.com (Matt Knox) Date: Tue, 26 May 2009 01:18:25 +0000 (UTC) Subject: [Numpy-discussion] add xirr to numpy financial functions? References: <1cd32cbb0905251529y4b042216he726dcbf2982d29d@mail.gmail.com> Message-ID: forgive me for jumping in on this thread and playing devil's advocate here, but I am a natural pessimist so please bear with me :) ... I think as this discussion has already demonstrated, it is *extremely* difficult to build a solid general purpose API for financial functions (even seemingly simple ones like an IRR calculation) because of the endless amount of possible permutations and interpretations. 
I think it would be a big mistake to add more financial functions to numpy directly without having them mature independently in a separate (scikits) package first. It is virtually guaranteed that you won't get the API right on the first try and adding the functions to numpy locks you into an API commitment because numpy is supposed to be a stable package with certain guarantees for backwards compatibility. And as for a more fully featured finance/quant module in Python... someone has already mentioned the C++ library, QuantLib - which I use extensively at work - and I think any serious effort to improve Python's capabilities in this area would be best spent on building a good Python/numpy interface to QuantLib rather than reimplementing its very substantial functionality (which is probably an impossible task realistically). - Matt From david at ar.media.kyoto-u.ac.jp Mon May 25 21:11:56 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 26 May 2009 10:11:56 +0900 Subject: [Numpy-discussion] numpy ufuncs and COREPY - any info? In-Reply-To: References: <4A16920E.8060805@indiana.edu> <200905221433.18551.faltet@pytables.org> <4A1A7A13.6080405@indiana.edu> Message-ID: <4A1B41DC.4060006@ar.media.kyoto-u.ac.jp> Charles R Harris wrote: > > > On Mon, May 25, 2009 at 4:59 AM, Andrew Friedley > wrote: > > For some reason the list seems to occasionally drop my messages... > > Francesc Alted wrote: > > A Friday 22 May 2009 13:52:46 Andrew Friedley escrigu?: > >> I'm the student doing the project. I have a blog here, which > contains > >> some initial performance numbers for a couple test ufuncs I did: > >> > >> http://numcorepy.blogspot.com > > >> Another alternative we've talked about, and I (more and more > likely) may > >> look into is composing multiple operations together into a > single ufunc. > >> Again the main idea being that memory accesses can be > reduced/eliminated. > > > > IMHO, composing multiple operations together is the most > promising venue for > > leveraging current multicore systems. > > Agreed -- our concern when considering for the project was to keep the > scope reasonable so I can complete it in the GSoC timeframe. If I > have > time I'll definitely be looking into this over the summer; if not > later. > > > Another interesting approach is to implement costly operations > (from the point > > of view of CPU resources), namely, transcendental functions like > sin, cos or > > tan, but also others like sqrt or pow) in a parallel way. If > besides, you can > > combine this with vectorized versions of them (by using the well > spread SSE2 > > instruction set, see [1] for an example), then you would be able > to achieve > > really good results for sure (at least Intel did with its VML > library ;) > > > > [1] http://gruntthepeon.free.fr/ssemath/ > > I've seen that page before. Using another source [1] I came up with a > quick/dirty cos ufunc. Performance is crazy good compared to NumPy > (100x); see the latest post on my blog for a little more info. I'll > look at the source myself when I get time again, but is NumPy using a > Python-based cos function, a C implementation, or something else? > As I > wrote in my blog, the performance gain is almost too good to believe. > > > Numpy uses the C library version. If long double and float aren't > available the double version is used with number conversions, but that > shouldn't give a factor of 100x. Something else is going on. 
I think something is wrong with the measurement method - on my machine, computing the cos of an array of double takes roughly ~400 cycles/item for arrays with a reasonable size (> 1e3 items). Taking 4 cycles/item for cos would be very impressive :) David From josef.pktd at gmail.com Mon May 25 22:00:46 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 25 May 2009 22:00:46 -0400 Subject: [Numpy-discussion] add xirr to numpy financial functions? In-Reply-To: References: <1cd32cbb0905251529y4b042216he726dcbf2982d29d@mail.gmail.com> Message-ID: <1cd32cbb0905251900g4fe889c7mec35a2f30cf4536@mail.gmail.com> On Mon, May 25, 2009 at 9:18 PM, Matt Knox wrote: > forgive me for jumping in on this thread and playing devil's advocate here, but > I am a natural pessimist so please bear with me :) ... It's good to hear from a real finance person. > > I think as this discussion has already demonstrated, it is *extremely* > difficult to build a solid general purpose API for financial functions (even > seemingly simple ones like an IRR calculation) because of the endless amount > of possible permutations and interpretations. I think it would be a big > mistake to add more financial functions to numpy directly without having them > mature independently in a separate (scikits) package first. It is virtually > guaranteed that you won't get the API right on the first try and adding the > functions to numpy locks you into an API commitment because numpy is supposed > to be a stable package with certain guarantees for backwards compatibility. > > And as for a more fully featured finance/quant module in Python... someone has > already mentioned the C++ library, QuantLib - which I use extensively at work > - and I think any serious effort to improve Python's capabilities in this area > would be best spent on building a good Python/numpy interface to QuantLib > rather than reimplementing its very substantial functionality (which is > probably an impossible task realistically). > Quantlib might be good for heavy duty work, but when I looked at their code, I wouldn't know where to start if I want to rewrite any algorithm. My benchmark is more scripting with matlab, where maybe some pieces are readily available, but where the code needs also to be strongly adjusted, or we want to implement a new method or prototype for one. I hadn't tried very hard, but I didn't manage to get Boost and quantlib correctly compiled with the python bindings with MingW. So, while python won't get any "industrial strength" finance package, a more modest "designer package" would be feasible, if there were any interest in it (which I haven't seen). It is similar with statistics, there is no way to achieve the same coverage of statistics as R for example, but still I find in many different python packages many of the basic statistics functions are implemented, without running immediately to R, not to mention the multitude (and multiplicity) of available machine learning packages in python. The other group of python packages cover very specialized requirement of the statistical analysis, as for example the neuroimaging groups. The even more modest question is whether we would want to match open office in it's finance part. These are pretty different use cases from those use cases where you have quantlib all set up and running. (I also saw a book announcement for Finance with Python, I don't remember the exact title.) 
Josef From mattknox.ca at gmail.com Mon May 25 23:15:41 2009 From: mattknox.ca at gmail.com (Matt Knox) Date: Tue, 26 May 2009 03:15:41 +0000 (UTC) Subject: [Numpy-discussion] add xirr to numpy financial functions? References: <1cd32cbb0905251529y4b042216he726dcbf2982d29d@mail.gmail.com> <1cd32cbb0905251900g4fe889c7mec35a2f30cf4536@mail.gmail.com> Message-ID: gmail.com> writes: > So, while python won't get any "industrial strength" finance package, > a more modest "designer package" would be feasible, if there were any > interest in it (which I haven't seen). > > ... > > The even more modest question is whether we would want to match open > office in it's finance part. > > These are pretty different use cases from those use cases where you > have quantlib all set up and running. > As you have hinted, the scope of what will/should be covered with numpy financial functions needs to be defined better before putting more such functions into numpy. If that scope turns out to be something comparable to what excel or openoffice offers, that's fine, but I think a maturation period outside the numpy core (in the form of a scikit or otherwise) would be still be a good idea to avoid getting stuck with a poorly thought out API. As for my personal feelings on how much financial functionality numpy/scipy should offer... I would agree that QuantLib-like functionality is far beyond what numpy can/should try to achieve. More basic functionality like OpenOffice or Excel probably seems about right. Although maybe it is more appropriate for scipy than numpy. - Matt From ferrell at diablotech.com Mon May 25 23:29:02 2009 From: ferrell at diablotech.com (Robert Ferrell) Date: Mon, 25 May 2009 21:29:02 -0600 Subject: [Numpy-discussion] add xirr to numpy financial functions? In-Reply-To: <1cd32cbb0905251529y4b042216he726dcbf2982d29d@mail.gmail.com> References: <1cd32cbb0905251529y4b042216he726dcbf2982d29d@mail.gmail.com> Message-ID: <7F532E89-6260-4FB3-8D95-AA69849865D4@diablotech.com> I haven't read all the messages in detail, and I'm a consumer not a producer, but I'll comment anyways. I'd love to see additional "financial" functionality, but I'd like to see them in a scikit, not in numpy. I think to be useful they are too complicated to go into numpy. A couple of my many reasons: 1. Doing a precise, bang-up job with dates is paramount to any interesting implementation of many financial functions. I've found timeseries to be a great package - there are some things I'd like to see, but overall it is at the foundation of all of my financial analysis. Any moderately interesting extension of the current capabilities would rapidly end up trying to duplicate much of the timeseries functionality, IMO. Rather than partially re-implement the wheel in numpy, as a consumer I'd like to see financial stuff built on a common basis, and timeseries would be a great start. 2. I've read enough of this discussion to hear a requirement for both good date handling and capable solvers - just for xirr. To do a really interesting job on an interesting amount of capability requires even more dependencies, I think. Although it might be tempting to include a few more "lightweight" financial functions in numpy, I doubt they will be that useful. Most of the lightweight ones are easy enough to whip up when you need them. Also, an approximation that's good today isn't the right one tomorrow - only the really robust stuff seems to survive the test of time, in my limited experience. 
A start on a really solid scikits financial package would be awesome, though. A few months ago, when the open source software for pricing CDS's was released (http://www.cdsmodel.com/information/cds-model) I took a look and noticed that it had a ton of code for dealing with dates. (I also didn't see any tests in the code. I wonder what that means. Scary for anybody that might want to modify it.) I thought if I had an extra 100 hours in every day it would be fun to re-write that code in numpy/scipy and release it. -r From ferrell at diablotech.com Mon May 25 23:33:13 2009 From: ferrell at diablotech.com (Robert Ferrell) Date: Mon, 25 May 2009 21:33:13 -0600 Subject: [Numpy-discussion] add xirr to numpy financial functions? In-Reply-To: References: <1cd32cbb0905251529y4b042216he726dcbf2982d29d@mail.gmail.com> <1cd32cbb0905251900g4fe889c7mec35a2f30cf4536@mail.gmail.com> Message-ID: <59184CA1-45AC-4B38-96DF-09311608C13A@diablotech.com> On May 25, 2009, at 9:15 PM, Matt Knox wrote: > gmail.com> writes: > >> So, while python won't get any "industrial strength" finance package, >> a more modest "designer package" would be feasible, if there were any >> interest in it (which I haven't seen). >> >> ... >> >> The even more modest question is whether we would want to match open >> office in it's finance part. >> >> These are pretty different use cases from those use cases where you >> have quantlib all set up and running. >> > > As you have hinted, the scope of what will/should be covered with > numpy > financial functions needs to be defined better before putting more > such > functions into numpy. If that scope turns out to be something > comparable to > what excel or openoffice offers, that's fine, but I think a > maturation period > outside the numpy core (in the form of a scikit or otherwise) would > be still > be a good idea to avoid getting stuck with a poorly thought out API. +1 for a maturation period outside the numpy core. > > > As for my personal feelings on how much financial functionality > numpy/scipy > should offer... I would agree that QuantLib-like functionality is > far beyond > what numpy can/should try to achieve. More basic functionality like > OpenOffice > or Excel probably seems about right. Although maybe it is more > appropriate for > scipy than numpy. +1 for something outside numpy. Even OpenOffice or Excel financial capability might, perhaps, go into scipy, but why not have it optional? -r From josef.pktd at gmail.com Tue May 26 00:40:01 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 26 May 2009 00:40:01 -0400 Subject: [Numpy-discussion] add xirr to numpy financial functions? In-Reply-To: <7F532E89-6260-4FB3-8D95-AA69849865D4@diablotech.com> References: <1cd32cbb0905251529y4b042216he726dcbf2982d29d@mail.gmail.com> <7F532E89-6260-4FB3-8D95-AA69849865D4@diablotech.com> Message-ID: <1cd32cbb0905252140u23364c9dh7869ca4e07c92e17@mail.gmail.com> On Mon, May 25, 2009 at 11:29 PM, Robert Ferrell wrote: > I haven't read all the messages in detail, and I'm a consumer not a > producer, but I'll comment anyways. > > I'd love to see additional "financial" functionality, but I'd like to > see them in a scikit, not in numpy. ?I think to be useful they are too > complicated to go into numpy. ?A couple of my many reasons: > > 1. Doing a precise, bang-up job with dates is paramount to any > interesting implementation of many financial functions. 
?I've found > timeseries to be a great package - there are some things I'd like to > see, but overall it is at the foundation of all of my financial > analysis. ?Any moderately interesting extension of the current > capabilities would rapidly end up trying to duplicate much of the > timeseries functionality, IMO. ?Rather than partially re-implement the > wheel in numpy, as a consumer I'd like to see financial stuff built on > a common basis, and timeseries would be a great start. > > 2. I've read enough of this discussion to hear a requirement for both > good date handling and capable solvers - just for xirr. ?To do a > really interesting job on an interesting amount of capability requires > even more dependencies, I think. > > Although it might be tempting to include a few more "lightweight" > financial functions in numpy, I doubt they will be that useful. ?Most > of the lightweight ones are easy enough to whip up when you need > them. ?Also, an approximation that's good today isn't the right one > tomorrow - only the really robust stuff seems to survive the test of > time, in my limited experience. ?A start on a really solid scikits > financial package would be awesome, though. > > A few months ago, when the open source software for pricing CDS's was > released (http://www.cdsmodel.com/information/cds-model) I took a look > and noticed that it had a ton of code for dealing with dates. ?(I also > didn't see any tests in the code. ?I wonder what that means. ?Scary > for anybody that might want to modify it.) ?I thought if I had an > extra 100 hours in every day it would be fun to re-write that code in > numpy/scipy and release it. > I was looking at mortgage backed securities before the current crisis hit, and I realized that when I use real dates and real payment schedules then taking actual accounting rules into account, my work and code size would strongly increase. Since it was a semi-theoretic application, sticking to months and ignoring actual calender dates was a useful simplification. As Matt argued it is not possible (or maybe just unrealistic) to write a full finance package in python from scratch. As far as I understand, for example the time series scikits cannot handle business holidays. So some simplification will be necessary. But, I agree, that even for an "approximate" finance package, handling dates and timeseries without a corresponding array type will soon get very tedious or duplicative. One additional advantage of a scikits, besides more freedom for dependencies, would be that models can be incrementally added as contributers find time and interest, and gain more experience with the API and the appropriate abstraction, and to collect hacked up scripts before they get a common structure and implementation. If the only crucial dependency is the time series package, it could go possibly into scipy together with the time series scikits. Also targeting scipy, makes a lot of code available, e.g. the problem with the solver and including statistics. "A sparrow in the hand is better than a pigeon on the roof." (German Proverb) On the other hand, I have seen many plans on the mailing list for great new packages or extensions to existing packages without many results. So maybe an incremental inclusion of the functions and API of open office, excel or similar, now, instead of hoping for a "real" finance package is the more realistic approach, especially, because I haven't found any source where we could "steal" wholesale. 
(for example http://www.cdsmodel.com/information/cds-model doesn't look compatible with BSD) Josef From jh at physics.ucf.edu Tue May 26 00:59:23 2009 From: jh at physics.ucf.edu (Joe Harrington) Date: Tue, 26 May 2009 00:59:23 -0400 Subject: [Numpy-discussion] add xirr to numpy financial functions? In-Reply-To: (numpy-discussion-request@scipy.org) References: Message-ID: Let's keep this thread focussed on the original issue: just add a floating array of times to irr or a new xirr continuous interest no more Anyone can use the timeseries package to produce a floating array of times from normal dates, if those are the dates they want. If they want some specialized financial date, they may want a different conversion, however. All we should provide in NumPy would be the simplest tool. Specialized dates and date-time conversion belong elsewhere. If we're *not* skipping dates, there is no need for xirr, just use irr, which exists. scikits.financial seems like a great idea, and then knock yourselves out for date conversions and definitions of compounding. Just think big and design it first. But let's keep this thread on the simple question for NumPy. --jh-- From Chris.Barker at noaa.gov Tue May 26 01:11:21 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 25 May 2009 22:11:21 -0700 Subject: [Numpy-discussion] Home for pyhdf5io? In-Reply-To: <4488200F-9E58-4844-88D5-08C03CDAB713@cs.toronto.edu> References: <000e0cd2a07010d73d046a71a597@google.com> <200905221000.56593.faltet@pytables.org> <4A193C3A.1080309@stevesimmons.com> <3d375d730905241422j29b11d6dn46212a78c7ddfd56@mail.gmail.com> <4488200F-9E58-4844-88D5-08C03CDAB713@cs.toronto.edu> Message-ID: <4A1B79F9.8010203@noaa.gov> David Warde-Farley wrote: > As Robert's design document for the NPY format says, one option would > be to implement a minimal subset of the HDF5 protocol *from scratch* That would be really cool -- I wonder how hard it would be to implement just the current NPY features? Judging from this: http://www.hdfgroup.org/HDF5/doc/H5.format.html It's far from trivial! -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception From robert.kern at gmail.com Tue May 26 01:12:33 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 26 May 2009 00:12:33 -0500 Subject: [Numpy-discussion] add xirr to numpy financial functions? In-Reply-To: References: Message-ID: <3d375d730905252212q6df32cddy66e5dad9a1ccae8c@mail.gmail.com> On Mon, May 25, 2009 at 23:59, Joe Harrington wrote: > Let's keep this thread focussed on the original issue: > > just add a floating array of times to irr or a new xirr > continuous interest > no more > > Anyone can use the timeseries package to produce a floating array of > times from normal dates, if those are the dates they want. ?If they > want some specialized financial date, they may want a different > conversion, however. ?All we should provide in NumPy would be the > simplest tool. ?Specialized dates and date-time conversion belong > elsewhere. > > If we're *not* skipping dates, there is no need for xirr, just use > irr, which exists. > > scikits.financial seems like a great idea, and then knock yourselves > out for date conversions and definitions of compounding. ?Just think > big and design it first. ?But let's keep this thread on the simple > question for NumPy. Then let's just say "No" and move on. 
I see no compelling reason to extend numpy's financial capabilities (of course, I spoke against their original addition in the first place, so take that as you will). Handling this by asking, "here are the constraints for numpy; what can we shoehorn in there?" is the wrong approach. Figure out what you want to achieve, then figure out what you need to solve the problem best. I don't think that including xirr in numpy, with its constraints, serves the problem best. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Tue May 26 01:15:12 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 26 May 2009 00:15:12 -0500 Subject: [Numpy-discussion] Home for pyhdf5io? In-Reply-To: <4A1B79F9.8010203@noaa.gov> References: <000e0cd2a07010d73d046a71a597@google.com> <4A193C3A.1080309@stevesimmons.com> <3d375d730905241422j29b11d6dn46212a78c7ddfd56@mail.gmail.com> <4488200F-9E58-4844-88D5-08C03CDAB713@cs.toronto.edu> <4A1B79F9.8010203@noaa.gov> Message-ID: <3d375d730905252215p4ccf309ard7367fe7d39ef7d7@mail.gmail.com> On Tue, May 26, 2009 at 00:11, Christopher Barker wrote: > David Warde-Farley wrote: >> As Robert's design document for the NPY format says, one option would >> be to implement a minimal subset of the HDF5 protocol *from scratch* > > That would be really cool -- I wonder how hard it would be to implement > just the current NPY features? Judging from this: > > http://www.hdfgroup.org/HDF5/doc/H5.format.html > > It's far from trivial! Yes. That's why I wrote the NPY format instead. I *did* do some due diligence before I designed a new binary format. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From jsseabold at gmail.com Tue May 26 01:20:58 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Tue, 26 May 2009 01:20:58 -0400 Subject: [Numpy-discussion] add xirr to numpy financial functions? In-Reply-To: <3d375d730905252212q6df32cddy66e5dad9a1ccae8c@mail.gmail.com> References: <3d375d730905252212q6df32cddy66e5dad9a1ccae8c@mail.gmail.com> Message-ID: On Tue, May 26, 2009 at 1:12 AM, Robert Kern wrote: > On Mon, May 25, 2009 at 23:59, Joe Harrington wrote: >> Let's keep this thread focussed on the original issue: >> >> just add a floating array of times to irr or a new xirr >> continuous interest >> no more >> >> Anyone can use the timeseries package to produce a floating array of >> times from normal dates, if those are the dates they want. ?If they >> want some specialized financial date, they may want a different >> conversion, however. ?All we should provide in NumPy would be the >> simplest tool. ?Specialized dates and date-time conversion belong >> elsewhere. >> >> If we're *not* skipping dates, there is no need for xirr, just use >> irr, which exists. >> >> scikits.financial seems like a great idea, and then knock yourselves >> out for date conversions and definitions of compounding. ?Just think >> big and design it first. ?But let's keep this thread on the simple >> question for NumPy. > > Then let's just say "No" and move on. I see no compelling reason to > extend numpy's financial capabilities (of course, I spoke against > their original addition in the first place, so take that as you will). 
> Handling this by asking, "here are the constraints for numpy; what can > we shoehorn in there?" is the wrong approach. Figure out what you want > to achieve, then figure out what you need to solve the problem best. I > don't think that including xirr in numpy, with its constraints, serves > the problem best. > My only question then would be why have numpy.financials in the first place? I was pretty surprised to find it. Maybe it should be in scipy.financials, so it can take advantage of the solvers? There's already the three line Newton method implementation that can only be used for rates(), which did seem "shoehorned" to me already. I changed mine to rely on a few lines of the Newton secant method (that could be general enough for rates() ) and to work without any internal date dependencies, but I too get the sense that it shouldn't be there, but these type of "expected" spreadsheet-like functions (with a set API and accepted usage behavior) seem to be of interest to some. Not trying to beat a dead horse. Just a thought, because I do have some interest in expanding these kinds of functions wherever they could end up. -Skipper From Chris.Barker at noaa.gov Tue May 26 01:23:14 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 25 May 2009 22:23:14 -0700 Subject: [Numpy-discussion] Inconsistent error messages. In-Reply-To: References: <9457e7c80905230249k39cd9965q784b80b5afa0f7dc@mail.gmail.com> <4A18563C.3010705@noaa.gov> Message-ID: <4A1B7CC2.3060904@noaa.gov> Charles R Harris wrote: > I don't like the idea of a warning here either. How about adding a > keyword 'strict' so that strict=1 means an error is raised if the count > isn't reached, and strict=0 means any count is acceptable? I'd prefer a more meaningful name than "strict" -- you'd have absolutely no idea what that meant without reading the docs -- maybe allow_partial_read? (and please, True and False, rather than 1 or 0) I also STRONGLY prefer that it be default to raise an exception. I am convinced that a Warning is NOT the right way to handle this. While it does appear that a warning can be caught if need be (though it's not totally clear whether the "raise this every time" option works..), it's not default behavior, and it's not a well know feature. This discussion makes that absolutely clear. I may or may not be representative, but I am no python newbie, and I had no idea before this discussion how one would handle a warning in this case. As far as I can see, there are only two possibilities: 1) The writer of the code has anticipated that not all the items requested might be read in, and has written the code to handle that case. In this case s/he would use strict=False 2) The code does not handle that case, either because writer of the code did not anticipate that it would ever happen, or because it really would be a failure. In that case, an Exception is the only correct result -- anything else is inviting hidden bugs. I completely fail to see the logic on a warning here, and I have never seen warnings used it this way. I also don't see what the objection to an exception is if there is a flag that can turn it off. St?fan van der Walt wrote: > The reason I much prefer a warning is that you always get data back, > whether things went wrong or not. If you throw an error, then you > can't get hold of the last read blocks at all. > > I guess a strict flag is OK exactly. > but why, if you've got a warning in > place? 
Warnings are easy to catch (and this can be documented in > fromfile's docstring): so are exceptions -- and even more so, so are flags. > warnings.simplefilter('error', np.lib.IOWarning) a pain, and no well known. > In Python 2.6 you can use "catch_warnings": much better, but still not used much. The key issue here is that a lot of folks WILL NOT HAVE THOUGHT about what they want to do if not all the items are read -- in that case, it is an error. If they have thought about it, they can turn off the strict (or whatever) flag. St?fan van der Walt wrote: > Warnings are a great way of telling the user that a non-fatal problem > cropped up. but this is only a non-fatal problem if the code has planned for it -- if not it is likely to be fatal. > Maybe we should provide tools in NumPy to handle warnings more easily? > Something like > > with no_warnings: > np.fromfile('x') nice, but only if this becomes a standard, common paradigm that is widely known -- IT IS NOT NOW. It would be interesting to survey how warnings are used in python these days -- most of the ones I've seen are deprecation warnings -- that's the kind of thing they should be used for -- you want people to know, but it doesn't indicate anything even possibly fatal at this point. Sorry to be so strong in my opinion here, but I know this is something I'm going to screw up given the chance. I really want that exception! -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception From robert.kern at gmail.com Tue May 26 01:22:24 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 26 May 2009 00:22:24 -0500 Subject: [Numpy-discussion] add xirr to numpy financial functions? In-Reply-To: References: <3d375d730905252212q6df32cddy66e5dad9a1ccae8c@mail.gmail.com> Message-ID: <3d375d730905252222nae68c3fl647bf42c6b4fa93a@mail.gmail.com> On Tue, May 26, 2009 at 00:20, Skipper Seabold wrote: > My only question then would be why have numpy.financials in the first > place? You can go through the old threads for the arguments. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Tue May 26 01:38:26 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 25 May 2009 23:38:26 -0600 Subject: [Numpy-discussion] add xirr to numpy financial functions? In-Reply-To: <1cd32cbb0905251755w41239410he4dd8053e898c8ea@mail.gmail.com> References: <1cd32cbb0905251529y4b042216he726dcbf2982d29d@mail.gmail.com> <1cd32cbb0905251627t2ea151ddye095df8c775bbee@mail.gmail.com> <1cd32cbb0905251755w41239410he4dd8053e898c8ea@mail.gmail.com> Message-ID: On Mon, May 25, 2009 at 6:55 PM, wrote: > On Mon, May 25, 2009 at 7:27 PM, wrote: > >> The advantage of Skippers implementation using actual dates instead of > >> just an array of numbers is that it is possible to directly calculate > >> the annual irr, since the time units are well specified. The only > >> problem is the need for an equation solver in numpy. Just using a date > >> tuple would remove the problem of string parsing, and it might be > >> possible to extend it later to a date array. > >> > >> So, I think it would be possible to include Skippers solution, with > >> some cleanup and testing, if an equation solver can be found or if > >> np.roots can handle high order (sparse) polynomials. 
> >> > > > > I looked a bit more: the current implementation of ``rate`` uses it's > > own iterative (Newton) solver, and in a similar way this could be done > > for a more general xirr. > > > > So with a bit of work this doesn't seem to be a problem and the only > > question that remains is the specification of the dates. > > > Here is a solver using the polynomial class, or is there something > like this already in numpy > No. But I think numpy might be a good place for one of the simple 1D solvers. The Brent one would be a good choice as it includes bisection as a fallback strategy. Simple bisection might also be worth adding. The current location of these solvers in scipy.optimize is somewhat obscure and they are the sort of function that gets used often. They don't really fit if we stick to an "arrays only" straight jacket in numpy, but polynomials and financial functions seem to me even further from the core. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chris.Barker at noaa.gov Tue May 26 01:50:00 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 25 May 2009 22:50:00 -0700 Subject: [Numpy-discussion] Home for pyhdf5io? In-Reply-To: <3d375d730905252215p4ccf309ard7367fe7d39ef7d7@mail.gmail.com> References: <000e0cd2a07010d73d046a71a597@google.com> <4A193C3A.1080309@stevesimmons.com> <3d375d730905241422j29b11d6dn46212a78c7ddfd56@mail.gmail.com> <4488200F-9E58-4844-88D5-08C03CDAB713@cs.toronto.edu> <4A1B79F9.8010203@noaa.gov> <3d375d730905252215p4ccf309ard7367fe7d39ef7d7@mail.gmail.com> Message-ID: <4A1B8308.4050604@noaa.gov> Robert Kern wrote: > Yes. That's why I wrote the NPY format instead. I *did* do some due > diligence before I designed a new binary format. I assumed so, and I also assume you took a look at netcdf3, but since it's been brought up here, I take it it dint fit the bill? Even if it did, while it will be around for a LONG time, it is an out-of-date format. -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception From dwf at cs.toronto.edu Tue May 26 02:02:43 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Tue, 26 May 2009 02:02:43 -0400 Subject: [Numpy-discussion] Home for pyhdf5io? In-Reply-To: <3d375d730905252215p4ccf309ard7367fe7d39ef7d7@mail.gmail.com> References: <000e0cd2a07010d73d046a71a597@google.com> <4A193C3A.1080309@stevesimmons.com> <3d375d730905241422j29b11d6dn46212a78c7ddfd56@mail.gmail.com> <4488200F-9E58-4844-88D5-08C03CDAB713@cs.toronto.edu> <4A1B79F9.8010203@noaa.gov> <3d375d730905252215p4ccf309ard7367fe7d39ef7d7@mail.gmail.com> Message-ID: <610EC657-E59D-4D8B-9F78-2EB6E63E814E@cs.toronto.edu> On 26-May-09, at 1:15 AM, Robert Kern wrote: > I *did* do some due diligence before I designed a new binary format. Uh oh, I feel this might've taken a sharp turn towards another "of course Robert is right, Robert is always right" threads. :) David From Chris.Barker at noaa.gov Tue May 26 02:30:09 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 25 May 2009 23:30:09 -0700 Subject: [Numpy-discussion] parsing text strings/files in fromfile, fromstring In-Reply-To: References: Message-ID: <4A1B8C71.2000206@noaa.gov> Charles R Harris wrote: > I am trying to put together some rule for parsing text strings/files in > fromfile, fromstring so that the two are consistent. Thanks for giving these some attention -- they've needed it for a while! 
> 1) When the string/file is empty fromfile returns and empty array, split > returns an empty string, I think the behavior of split() is irrelevant here -- fromstring/file is about reading numbers from text -- while split()- is very helpful for that, it's not what it's specifically for. > and fromstring converts the empty string to a > default value. Which should we use? they should NEVER return a number when there isn't one in the source. > 2) When the string/file contains only a single separator > fromfile/fromstring both return a single value, while split returns two > empty strings. Which should we use? neither -- see above. > My preferences would be to return empty arrays whenever the string/file > is empty, but I don't feel strongly about that. yup. > Also, wouldn't a missing value be better interpreted as nan than zero in > the float case? yes, but since I don't think missing values should be returned at all, it doesn't matter. I do think the more interesting case might be a csv or tab-delimited file with a line like: 34, 5, 4.6, , , 45, 32 In this case, I suppose it is clear that this is a row in a table that is supposed to be 7 items long. With floats, it would be pretty rational to put NaNs in there, but without an equivalent for integers, I'd say go with an error. Two other options: 1) a "missing_value" keyword -- the user explicitly says what they want put in for missing values. 2) return a masked array -- also as a keyword option. Masked arrays are supposed to be the numpy way to express missing values. and yes, fromfile( a_file ) should always return the same thing as fromstring( a_file.read() ) Pauli Virtanen wrote: > a) fromstring("1,2,x,4", sep=",") -> [1,2] > fromstring("1,2,x,4", sep=",", strict=True) -> ValueError > fromstring("1,2,x,4", sep=",", count=5) -> [1,2] > fromstring("1,2,x,4", sep=",", count=5, strict=True) -> ValueError > > b) fromstring("1,2,x,4", sep=",") -> [1,2] > fromstring("1,2,x,4", sep=",", strict=True) -> ValueError > fromstring("1,2,x,4", sep=",", default=3) -> [1,2,3,4] > fromstring("1,2,x,4", sep=",", count=5) -> [1,2] > fromstring("1,2,x,4", sep=",", count=5, strict=True) -> ValueError > > c) fromstring("1,2,x,4", sep=",") -> [1,2] + SomeWarning > fromstring("1,2,x,4", sep=",", count=5) -> [1,2] + SomeWarning > > d) fromstring("1,2,x,4", sep=",") -> [1,2] + SomeWarning > fromstring("1,2,x,4", sep=",", default=3) -> [1,2,3,4] > fromstring("1,2,x,4", sep=",", default=3, count=5) -> [1,2,3,4] + SomeWarning > > e) fromstring("1,2,x,4", sep=",") -> ValueError > fromstring("1,2,x,4", sep=",", strict=False) -> [1,2] > fromstring("1,2,x,4", sep=",", count=5) -> ValueError > fromstring("1,2,x,4", sep=",", count=5, strict=False) -> [1,2] (c) and (d) are out, as I don't think Warnings are the right thing here (see my earlier rant). I don't like (a) and (b), as I think "strict" (with a better name...) should be True be default. What I want is (b) with a different default, which would be (e) with a "default" (or, maybe "missing"). Those seem to have defined "strict" in two ways: both number of elements, and what to do with non-numerical input, I wonder if those should be merged? Also, I wonder if setting a "missing" should work for any non-numerical entires, or only empty space? 
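For concreteness, here is a minimal pure-Python sketch of the whitespace-only reading -- illustration only, using a hypothetical ``missing=`` keyword, not proposed implementation code. An empty field between separators takes the default, while any other non-numeric text still raises:

import numpy as np

def fromstring_sketch(s, sep=',', dtype=float, missing=None):
    # Illustration: empty fields use `missing`; anything else that
    # fails to parse raises ValueError.
    out = []
    for field in s.split(sep):
        field = field.strip()
        if not field:
            if missing is None:
                raise ValueError("empty field and no missing= given")
            out.append(missing)
        else:
            out.append(dtype(field))   # dtype('x') raises ValueError
    return np.array(out, dtype=dtype)

# fromstring_sketch("1,2, ,4", missing=3)  ->  array([ 1.,  2.,  3.,  4.])
# fromstring_sketch("1,2,x,4", missing=3)  ->  ValueError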
I think I'd go with: f) fromstring("1,2,x,4", sep=",") -> [1,2] fromstring("1,2,x,4", sep=",", count=4) -> ValueError fromstring("1,2,3,4", sep=",", count=5) -> ValueError fromstring("1,2,3,4", sep=",", count=5, strict=False) -> [1,2,3,4] fromstring("1,2, ,4", sep=",", missing=3) -> [1,2,3,4] fromstring("1,2,x,4", sep=",", missing=3) -> ValueError I THINK we can break it down into these distinct questions: (1) What should be returned if there is a non-number between separators and there is no default value specified? a) ValueError b) a default value (2) If a default value was specified: a) the default value b) if it is whitespace: the default else: ValueError (3) What should be returned if EOF is reached before count is reached? (a) a warning (b) just the numbers read so far (c) if strict: an exception else: just the numbers read so far (4) Should any non-numeric text behave the same as EOF when count is not specified? (a) yes (b) no (5) what should "strict" default to? (a) True (b) False (6) Should \n be interpreted as a sep along with the specified sep? (a) yes (b) no [OK, I added that one as my pet desire...) I vote: (1) a (2) b (3) c (4) b (5) a (6) a Does that cover it? > and binary data implied by sep='' would be interpreted in the same > way it would if first converted to comma-separated text. Only with regard to less than count numbers read -- I don't think any of the rest applies -- though I'm still for splitting binary and text file reading anyway. > I'd vote for (e) if the slate was clean, but since it's not: I think the slate is clean enough, given that the current implementation is buggy. While you are digging into this code, we did have a discussion a while back, captured in this ticket: http://projects.scipy.org/numpy/ticket/909 Any chance you could address any of that, too? -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception From faltet at pytables.org Tue May 26 02:32:10 2009 From: faltet at pytables.org (Francesc Alted) Date: Tue, 26 May 2009 08:32:10 +0200 Subject: [Numpy-discussion] Home for pyhdf5io? In-Reply-To: <610EC657-E59D-4D8B-9F78-2EB6E63E814E@cs.toronto.edu> References: <000e0cd2a07010d73d046a71a597@google.com> <3d375d730905252215p4ccf309ard7367fe7d39ef7d7@mail.gmail.com> <610EC657-E59D-4D8B-9F78-2EB6E63E814E@cs.toronto.edu> Message-ID: <200905260832.10338.faltet@pytables.org> A Tuesday 26 May 2009 08:02:43 David Warde-Farley escrigu?: > On 26-May-09, at 1:15 AM, Robert Kern wrote: > > I *did* do some due diligence before I designed a new binary format. > > Uh oh, I feel this might've taken a sharp turn towards another "of > course Robert is right, Robert is always right" threads. :) Agreed :) -- Francesc Alted From brennan.williams at visualreservoir.com Tue May 26 02:47:25 2009 From: brennan.williams at visualreservoir.com (Brennan Williams) Date: Tue, 26 May 2009 18:47:25 +1200 Subject: [Numpy-discussion] CUDA Message-ID: <4A1B907D.8040807@visualreservoir.com> Not a question really but just for discussion/pie-in-the-sky etc.... This is a news item on vizworld about getting Matlab code to run on a CUDA enabled GPU. http://www.vizworld.com/2009/05/cuda-enable-matlab-with-gpumat/ If the use of GPU's for numerical tasks takes off (has it already?) then Id be interested to know the views of the numpy experts out there. 
Cheers Brennan From david at ar.media.kyoto-u.ac.jp Tue May 26 02:35:28 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 26 May 2009 15:35:28 +0900 Subject: [Numpy-discussion] CUDA In-Reply-To: <4A1B907D.8040807@visualreservoir.com> References: <4A1B907D.8040807@visualreservoir.com> Message-ID: <4A1B8DB0.3050309@ar.media.kyoto-u.ac.jp> Brennan Williams wrote: > Not a question really but just for discussion/pie-in-the-sky etc.... > > This is a news item on vizworld about getting Matlab code to run on a > CUDA enabled GPU. > > http://www.vizworld.com/2009/05/cuda-enable-matlab-with-gpumat/ > There is this which looks similar for numpy: http://kered.org/blog/2009-04-13/easy-python-numpy-cuda-cublas/ I have never used it, just saw it mentioned somewhere, cheers, David From faltet at pytables.org Tue May 26 02:56:25 2009 From: faltet at pytables.org (Francesc Alted) Date: Tue, 26 May 2009 08:56:25 +0200 Subject: [Numpy-discussion] numpy ufuncs and COREPY - any info? In-Reply-To: <4A1B41DC.4060006@ar.media.kyoto-u.ac.jp> References: <4A1B41DC.4060006@ar.media.kyoto-u.ac.jp> Message-ID: <200905260856.26001.faltet@pytables.org> A Tuesday 26 May 2009 03:11:56 David Cournapeau escrigu?: > Charles R Harris wrote: > > On Mon, May 25, 2009 at 4:59 AM, Andrew Friedley > > wrote: > > > > For some reason the list seems to occasionally drop my messages... > > > > Francesc Alted wrote: > > > A Friday 22 May 2009 13:52:46 Andrew Friedley escrigu?: > > >> I'm the student doing the project. I have a blog here, which > > > > contains > > > > >> some initial performance numbers for a couple test ufuncs I did: > > >> > > >> http://numcorepy.blogspot.com > > >> > > >> Another alternative we've talked about, and I (more and more > > > > likely) may > > > > >> look into is composing multiple operations together into a > > > > single ufunc. > > > > >> Again the main idea being that memory accesses can be > > > > reduced/eliminated. > > > > > IMHO, composing multiple operations together is the most > > > > promising venue for > > > > > leveraging current multicore systems. > > > > Agreed -- our concern when considering for the project was to keep > > the scope reasonable so I can complete it in the GSoC timeframe. If I > > have > > time I'll definitely be looking into this over the summer; if not > > later. > > > > > Another interesting approach is to implement costly operations > > > > (from the point > > > > > of view of CPU resources), namely, transcendental functions like > > > > sin, cos or > > > > > tan, but also others like sqrt or pow) in a parallel way. If > > > > besides, you can > > > > > combine this with vectorized versions of them (by using the well > > > > spread SSE2 > > > > > instruction set, see [1] for an example), then you would be able > > > > to achieve > > > > > really good results for sure (at least Intel did with its VML > > > > library ;) > > > > > [1] http://gruntthepeon.free.fr/ssemath/ > > > > I've seen that page before. Using another source [1] I came up with > > a quick/dirty cos ufunc. Performance is crazy good compared to NumPy > > (100x); see the latest post on my blog for a little more info. I'll look > > at the source myself when I get time again, but is NumPy using a > > Python-based cos function, a C implementation, or something else? As I > > wrote in my blog, the performance gain is almost too good to believe. > > > > > > Numpy uses the C library version. 
If long double and float aren't > > available the double version is used with number conversions, but that > > shouldn't give a factor of 100x. Something else is going on. > > I think something is wrong with the measurement method - on my machine, > computing the cos of an array of double takes roughly ~400 cycles/item > for arrays with a reasonable size (> 1e3 items). Taking 4 cycles/item > for cos would be very impressive :) Well, it is Andrew who should demonstrate that his measurement is correct, but in principle, 4 cycles/item *should* be feasible when using 8 cores in parallel. In [1] one can see how Intel achieves (with his VML kernel) to compute a cos() in less than 23 cycles in one single core. Having 8 cores in parallel would allow, in theory, reach 3 cycles/item. [1]http://www.intel.com/software/products/mkl/data/vml/functions/_performanceall.html -- Francesc Alted From david at ar.media.kyoto-u.ac.jp Tue May 26 02:58:52 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 26 May 2009 15:58:52 +0900 Subject: [Numpy-discussion] numpy ufuncs and COREPY - any info? In-Reply-To: <200905260856.26001.faltet@pytables.org> References: <4A1B41DC.4060006@ar.media.kyoto-u.ac.jp> <200905260856.26001.faltet@pytables.org> Message-ID: <4A1B932C.30504@ar.media.kyoto-u.ac.jp> Francesc Alted wrote: > > Well, it is Andrew who should demonstrate that his measurement is correct, but > in principle, 4 cycles/item *should* be feasible when using 8 cores in > parallel. But the 100x speed increase is for one core only unless I misread the table. And I should have mentioned that 400 cycles/item for cos is on a pentium 4, which has dreadful performances (defective L1). On a much better core duo extreme something, I get 100 cycles / item (on a 64 bits machines, though, and not same compiler, although I guess the libm version is what matters the most here). And let's not forget that there is the python wrapping cost: by doing everything in C, I got ~ 200 cycle/cos on the PIV, and ~60 cycles/cos on the core 2 duo (for double), using the rdtsc performance counter. All this for 1024 items in the array, so very optimistic usecase (everything in cache 2 if not 1). This shows that python wrapping cost is not so high, making the 100x claim a bit doubtful without more details on the way to measure speed. cheers, David From olivier.grisel at ensta.org Tue May 26 03:31:24 2009 From: olivier.grisel at ensta.org (Olivier Grisel) Date: Tue, 26 May 2009 09:31:24 +0200 Subject: [Numpy-discussion] CUDA In-Reply-To: <4A1B8DB0.3050309@ar.media.kyoto-u.ac.jp> References: <4A1B907D.8040807@visualreservoir.com> <4A1B8DB0.3050309@ar.media.kyoto-u.ac.jp> Message-ID: Also note: nvidia is about to release the first implementation of an OpenCL runtime based on cuda. OpenCL is an open standard such as OpenGL but for numerical computing on stream platforms (GPUs, Cell BE, Larrabee, ...). -- Olivier On May 26, 2009 8:54 AM, "David Cournapeau" wrote: Brennan Williams wrote: > Not a question really but just for discussion/pie-in-the-sky etc.... > > T... There is this which looks similar for numpy: http://kered.org/blog/2009-04-13/easy-python-numpy-cuda-cublas/ I have never used it, just saw it mentioned somewhere, cheers, David _______________________________________________ Numpy-discussion mailing list Numpy-discussion at scipy... -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From Nicolas.Rougier at loria.fr Tue May 26 03:55:58 2009 From: Nicolas.Rougier at loria.fr (Nicolas Rougier) Date: Tue, 26 May 2009 09:55:58 +0200 Subject: [Numpy-discussion] Segmentation fault on large arrays Message-ID: <1243324558.14931.14.camel@sulfur.loria.fr> Hello, I've come across what is probably a bug in size check for large arrays: >>> import numpy >>> z1 = numpy.zeros((255*256,256*256)) Traceback (most recent call last): File "", line 1, in ValueError: dimensions too large. >>> z2 = numpy.zeros((256*256,256*256)) >>> z2.shape (65536, 65536) >>> z2[0] = 0 Segmentation fault Note that z1 size is smaller than z2 but z2 is not told that its dimensions are too large. This has been tested with numpy 1.3.0. Nicolas From ndbecker2 at gmail.com Tue May 26 07:43:02 2009 From: ndbecker2 at gmail.com (Neal Becker) Date: Tue, 26 May 2009 07:43:02 -0400 Subject: [Numpy-discussion] CUDA References: <4A1B907D.8040807@visualreservoir.com> <4A1B8DB0.3050309@ar.media.kyoto-u.ac.jp> Message-ID: Olivier Grisel wrote: > Also note: nvidia is about to release the first implementation of an > OpenCL runtime based on cuda. OpenCL is an open standard such as OpenGL > but for numerical computing on stream platforms (GPUs, Cell BE, Larrabee, > ...). > You might be interested in pycuda. From gael.varoquaux at normalesup.org Tue May 26 08:04:19 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Tue, 26 May 2009 14:04:19 +0200 Subject: [Numpy-discussion] CUDA In-Reply-To: References: <4A1B907D.8040807@visualreservoir.com> <4A1B8DB0.3050309@ar.media.kyoto-u.ac.jp> Message-ID: <20090526120419.GA6537@phare.normalesup.org> On Tue, May 26, 2009 at 07:43:02AM -0400, Neal Becker wrote: > Olivier Grisel wrote: > > Also note: nvidia is about to release the first implementation of an > > OpenCL runtime based on cuda. OpenCL is an open standard such as OpenGL > > but for numerical computing on stream platforms (GPUs, Cell BE, Larrabee, > > ...). > You might be interested in pycuda. I am sure Olivier knows about pycuda :). However, the big deal with OpenCL, compared to CUDA, is that it is an open standard. With CUDA, you are bound to nvidia's future policies. Ga?l From matthieu.brucher at gmail.com Tue May 26 08:08:32 2009 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Tue, 26 May 2009 14:08:32 +0200 Subject: [Numpy-discussion] CUDA In-Reply-To: <20090526120419.GA6537@phare.normalesup.org> References: <4A1B907D.8040807@visualreservoir.com> <4A1B8DB0.3050309@ar.media.kyoto-u.ac.jp> <20090526120419.GA6537@phare.normalesup.org> Message-ID: 2009/5/26 Gael Varoquaux : > On Tue, May 26, 2009 at 07:43:02AM -0400, Neal Becker wrote: >> Olivier Grisel wrote: > >> > Also note: nvidia is about to release the first implementation of an >> > OpenCL runtime based on cuda. OpenCL is an open standard such as OpenGL >> > but for numerical computing on stream platforms (GPUs, Cell BE, Larrabee, >> > ...). > > >> You might be interested in pycuda. > > I am sure Olivier knows about pycuda :). However, the big deal with > OpenCL, compared to CUDA, is that it is an open standard. With CUDA, you > are bound to nvidia's future policies. > > Ga?l The issue with OpenCL is that there will be some extensions for each supported architecture, which means that the generic OpenCL will never be very fast or more exactly near the optimum. Matthieu -- Information System Engineer, Ph.D. 
Website: http://matthieu-brucher.developpez.com/ Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn: http://www.linkedin.com/in/matthieubrucher From seb.binet at gmail.com Tue May 26 08:17:58 2009 From: seb.binet at gmail.com (Sebastien Binet) Date: Tue, 26 May 2009 14:17:58 +0200 Subject: [Numpy-discussion] CUDA In-Reply-To: References: <4A1B907D.8040807@visualreservoir.com> <20090526120419.GA6537@phare.normalesup.org> Message-ID: <200905261417.58631.binet@cern.ch> On Tuesday 26 May 2009 14:08:32 Matthieu Brucher wrote: > 2009/5/26 Gael Varoquaux : > > On Tue, May 26, 2009 at 07:43:02AM -0400, Neal Becker wrote: > >> Olivier Grisel wrote: > >> > Also note: nvidia is about to release the first implementation of an > >> > OpenCL runtime based on cuda. OpenCL is an open standard such as > >> > OpenGL but for numerical computing on stream platforms (GPUs, Cell BE, > >> > Larrabee, ...). > >> > >> You might be interested in pycuda. > > > > I am sure Olivier knows about pycuda :). However, the big deal with > > OpenCL, compared to CUDA, is that it is an open standard. With CUDA, you > > are bound to nvidia's future policies. > > > > Ga?l > > The issue with OpenCL is that there will be some extensions for each > supported architecture, which means that the generic OpenCL will never > be very fast or more exactly near the optimum. what's the difference w/ OpenGL ? i.e. isn't the job of the "underlying" library to provide the best algorithm- freakingly-optimized-bare-to-the-metal-whatever-opcode, hidden away from the user's face ? OpenCL is just an API (modeled after the CUDA one AFAICT) so implementers can use whatever trick they want, right ? my 2 euro-cents. cheers, sebastien. -- ######################################### # Dr. Sebastien Binet # Laboratoire de l'Accelerateur Lineaire # Universite Paris-Sud XI # Batiment 200 # 91898 Orsay ######################################### From matthieu.brucher at gmail.com Tue May 26 08:29:51 2009 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Tue, 26 May 2009 14:29:51 +0200 Subject: [Numpy-discussion] CUDA In-Reply-To: <200905261417.58631.binet@cern.ch> References: <4A1B907D.8040807@visualreservoir.com> <20090526120419.GA6537@phare.normalesup.org> <200905261417.58631.binet@cern.ch> Message-ID: >> The issue with OpenCL is that there will be some extensions for each >> supported architecture, which means that the generic OpenCL will never >> be very fast or more exactly near the optimum. > > what's the difference w/ OpenGL ? > i.e. isn't the job of the "underlying" library to provide the best algorithm- > freakingly-optimized-bare-to-the-metal-whatever-opcode, hidden away from the > user's face ? It's like OpenGL: you have to fall back to more simple functions if you want to support every platform. If you target only one specific platform, you can use custom optimized functions. > OpenCL is just an API (modeled after the CUDA one AFAICT) so implementers can > use whatever trick they want, right ? Implementers can't know for instance how the data-domain must be split (1D, 2D, 3D, ... ? what if the underlying tool doesn't provide all of them?). OpenCL will have ways to tell that some data must be stored in the local or shared memory (for the GPU), ... There are some companies that provide ways to do this with pragmas ion C and Fortran (i.e. CAPS), but even if there are pragmas dedicated to CUDA, the generated code is not optimal. 
So I don't think it is reasonable to expect the implementers to provide in the common API the tools to make a really optimal code. You will have to use additional, manufacturer-related API, like what you do for state-of-the-art OpenGL. > my 2 euro-cents. my 2 euro-cents ;) Matthieu -- Information System Engineer, Ph.D. Website: http://matthieu-brucher.developpez.com/ Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn: http://www.linkedin.com/in/matthieubrucher From afriedle at indiana.edu Tue May 26 09:14:39 2009 From: afriedle at indiana.edu (Andrew Friedley) Date: Tue, 26 May 2009 09:14:39 -0400 Subject: [Numpy-discussion] numpy ufuncs and COREPY - any info? In-Reply-To: <4A1B932C.30504@ar.media.kyoto-u.ac.jp> References: <4A1B41DC.4060006@ar.media.kyoto-u.ac.jp> <200905260856.26001.faltet@pytables.org> <4A1B932C.30504@ar.media.kyoto-u.ac.jp> Message-ID: <4A1BEB3F.2020809@indiana.edu> David Cournapeau wrote: > Francesc Alted wrote: >> Well, it is Andrew who should demonstrate that his measurement is correct, but >> in principle, 4 cycles/item *should* be feasible when using 8 cores in >> parallel. > > But the 100x speed increase is for one core only unless I misread the > table. And I should have mentioned that 400 cycles/item for cos is on a > pentium 4, which has dreadful performances (defective L1). On a much > better core duo extreme something, I get 100 cycles / item (on a 64 bits > machines, though, and not same compiler, although I guess the libm > version is what matters the most here). > > And let's not forget that there is the python wrapping cost: by doing > everything in C, I got ~ 200 cycle/cos on the PIV, and ~60 cycles/cos on > the core 2 duo (for double), using the rdtsc performance counter. All > this for 1024 items in the array, so very optimistic usecase (everything > in cache 2 if not 1). > > This shows that python wrapping cost is not so high, making the 100x > claim a bit doubtful without more details on the way to measure speed. I appreciate all the discussion this is creating. I wish I could work on this more right now; I have a big paper deadline coming up June 1 that I need to focus on. Yes, you're reading the table right. I should have been more clear on what my implementation is doing. It's using SIMD, so performing 4 cosine's at a time where a libm cosine is only doing one. Also I don't think libm trancendentals are known for being fast; I'm also likely gaining performance by using a well-optimized but less accurate approximation. In fact a little more inspection shows my accuracy decreases as the input values increase; I will probably need to take a performance hit to fix this. I went and wrote code to use the libm fcos() routine instead of my cos code. Performance is equivalent to numpy, plus an overhead: inp sizes 1024 10240 102400 1024000 3072000 numpy 0.7282 9.6278 115.5976 993.5738 3017.3680 lmcos 1 0.7594 9.7579 116.7135 1039.5783 3156.8371 lmcos 2 0.5274 5.7885 61.8052 537.8451 1576.2057 lmcos 4 0.5172 5.1240 40.5018 313.2487 791.9730 corepy 1 0.0142 0.0880 0.9566 9.6162 28.4972 corepy 2 0.0342 0.0754 0.6991 6.1647 15.3545 corepy 4 0.0596 0.0963 0.5671 4.9499 13.8784 The times I show are in milliseconds; the system used is a dual-socket dual-core 2ghz opteron. I'm testing at the ufunc level, like this: def benchmark(fn, args): avgtime = 0 fn(*args) for i in xrange(7): t1 = time.time() fn(*args) t2 = time.time() tm = t2 - t1 avgtime += tm return avgtime / 7 Where fn is a ufunc, ie numpy.cos. 
So I prime the execution once, then do 7 timings and take the average. I always appreciate suggestions on better way to benchmark things. Andrew From aisaac at american.edu Tue May 26 09:22:24 2009 From: aisaac at american.edu (Alan G Isaac) Date: Tue, 26 May 2009 09:22:24 -0400 Subject: [Numpy-discussion] add xirr to numpy financial functions? In-Reply-To: References: Message-ID: <4A1BED10.7050801@american.edu> Would you like to put xirr in econpy until it finds a home in SciPy? (Might as well make it available.) Cheers, Alan Isaac From charlesr.harris at gmail.com Tue May 26 09:22:54 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 26 May 2009 07:22:54 -0600 Subject: [Numpy-discussion] Segmentation fault on large arrays In-Reply-To: <1243324558.14931.14.camel@sulfur.loria.fr> References: <1243324558.14931.14.camel@sulfur.loria.fr> Message-ID: On Tue, May 26, 2009 at 1:55 AM, Nicolas Rougier wrote: > > Hello, > > I've come across what is probably a bug in size check for large arrays: > > >>> import numpy > >>> z1 = numpy.zeros((255*256,256*256)) > Traceback (most recent call last): > File "", line 1, in > ValueError: dimensions too large. > >>> z2 = numpy.zeros((256*256,256*256)) > >>> z2.shape > (65536, 65536) > >>> z2[0] = 0 > Segmentation fault > This one has been fixed. See ticket #1080 . Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Tue May 26 11:07:21 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 26 May 2009 11:07:21 -0400 Subject: [Numpy-discussion] add xirr to numpy financial functions? In-Reply-To: <4A1BED10.7050801@american.edu> References: <4A1BED10.7050801@american.edu> Message-ID: <1cd32cbb0905260807y63d6f999r33c3182658f18ed7@mail.gmail.com> I rewrote irr to use the iterative solver instead of polynomial roots so that it can also handle large arrays. For 3000 values, I had to kill the current np.irr since I didn't want to wait longer than 10 minutes When writing the test, I found that npv is missing a "when" keyword, for the case when the first payment is immediate, i.e. in the present, and that broadcasting has problems: >>> np.npv(0.05, np.array([[1,1],[1,1]])) array([ 1.9047619 , 1.81405896]) >>> np.npv(0.05, np.array([[1,1],[1,1],[1,1]])) Traceback (most recent call last): File "", line 1, in np.npv(0.05, np.array([[1,1],[1,1],[1,1]])) File "C:\Programs\Python25\Lib\site-packages\numpy\lib\financial.py", line 449, in npv return (values / (1+rate)**np.arange(1,len(values)+1)).sum(axis=0) ValueError: shape mismatch: objects cannot be broadcast to a single shape -------------------------- Here is the changed version, that only looks for one root. I added an optional starting value as keyword argument (as in open office) but didn't make any other changes: def irr(values, start=None): """ Return the Internal Rate of Return (IRR). This is the rate of return that gives a net present value of 0.0. Parameters ---------- values : array_like, shape(N,) Input cash flows per time period. At least the first value would be negative to represent the investment in the project. Returns ------- out : float Internal Rate of Return for periodic input values. 
Examples -------- >>> np.irr([-100, 39, 59, 55, 20]) 0.2809484211599611 """ p = np.poly1d(values[::-1]) pd1 = np.polyder(p) if start is None: r = 0.99 # starting value, find polynomial root in neighborhood else: r = start # iterative solver for discount factor for i in range(10): r = r - p(r)/pd1(r) ## #res = np.roots(values[::-1]) ## # Find the root(s) between 0 and 1 ## mask = (res.imag == 0) & (res.real > 0) & (res.real <= 1) ## res = res[mask].real ## if res.size == 0: ## return np.nan rate = 1.0/r - 1 if rate.size == 1: rate = rate.item() return rate def test_irr(): v = [-150000, 15000, 25000, 35000, 45000, 60000] assert_almost_equal(irr(v), 0.0524, 2) nper = 300 #Number of periods freq = 5 #frequency of payment v = np.zeros(nper) v[1:nper+1:freq] = 1 # periodic payment v[0] = -4.3995180296393199 assert_almost_equal(irr(v), 0.05, 10) nper = 3000 #Number of periods freq = 5 #frequency of payment v = np.zeros(nper) v[1:nper+1:freq] = 1 # periodic payment v[0] = -4.3995199643479603 assert_almost_equal(irr(v), 0.05, 10) If this looks ok, I can write a proper patch. Josef From ferrell at diablotech.com Tue May 26 12:28:31 2009 From: ferrell at diablotech.com (Robert Ferrell) Date: Tue, 26 May 2009 10:28:31 -0600 Subject: [Numpy-discussion] add xirr to numpy financial functions? In-Reply-To: References: Message-ID: <3E23FC31-07B2-4373-9CE7-BB0412D9F281@diablotech.com> On May 25, 2009, at 10:59 PM, Joe Harrington wrote: > Let's keep this thread focussed on the original issue: > > just add a floating array of times to irr or a new xirr > continuous interest > no more > > Anyone can use the timeseries package to produce a floating array of > times from normal dates, if those are the dates they want. If they > want some specialized financial date, they may want a different > conversion, however. All we should provide in NumPy would be the > simplest tool. Specialized dates and date-time conversion belong > elsewhere. > > If we're *not* skipping dates, there is no need for xirr, just use > irr, which exists. > > scikits.financial seems like a great idea, and then knock yourselves > out for date conversions and definitions of compounding. Just think > big and design it first. But let's keep this thread on the simple > question for NumPy. My vote is against adding xirr to NumPy. In my experience, if you want internal rate of return, then you also want time weighted return, for instance, and all of sudden it becomes surprising that NumPy tantalizes with a some of the needed capability but not all of it. I read in an old thread that irr was included partly because OLPC was including NumPy and it was great that kids would have a tool to help them understand the present value of money. In my opinion, cumprod() is an even better teaching tool for that. I'm not advocating reducing functionality in NumPy, but I prefer the idea of keeping NumPy as an array core, and having higher-level capability available as add-ons (scipy, scikit, etc...) -r From andrea.gavana at gmail.com Tue May 26 15:27:01 2009 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Tue, 26 May 2009 20:27:01 +0100 Subject: [Numpy-discussion] List/location of consecutive integers (2) Message-ID: Hi All, I have tried the solutions proposed in the previous thread and it looks like Chris' one is the fastest for my purposes. Now, I have a question which is probably more conceptual than implementation-related. 
I started this little thread as my task is to read medium to (relatively) big unformatted binary files written by another (black-box) software (which is written in Fortran). These files can range from 10 MB to 200 MB, more or less, and I read them using a f2py-wrapped Fortran subroutine. I got a stupendous speed improvement when I switched from Compaq Visual Fortran to G95 with "STREAM" access (from 8% to 90% faster, depending on the infamous "indices" I was talking about). Now, I was thinking about using the multiprocessing module in Python, as we have 4-cpus PCs at work and I could try to call my subroutine using multiple Python processes. I *really* should do this in Fortran directly but I haven't found any reference on how to do file I/O in parallel in Fortran and I haven't got any help from comp.lang.fortran in that sense (only a warning that I may slow down everything by using multiple processes). Splitting the reading process between 4 processes will require the exchange of 5-20 MB from the child processes to the main one: do you think my script will benefit from using multiprocessing? Is there any drawback in using Numpy arrays in multiple processes? If using multiprocessing in Python will create too much overhead, does anyone have any suggestion/reference/link/code on how to handle parallel I/O in Fortran directly? Should I try another approach? Thank you a lot for your suggestions. Andrea. "Imagination Is The Only Weapon In The War Against Reality." http://xoomer.alice.it/infinity77/ http://thedoomedcity.blogspot.com/ From doutriaux1 at llnl.gov Tue May 26 19:17:08 2009 From: doutriaux1 at llnl.gov (=?UTF-8?Q?Charles_=D8=B3=D9=85=D9=8A=D8=B1_Doutriaux?=) Date: Tue, 26 May 2009 16:17:08 -0700 Subject: [Numpy-discussion] casting bug Message-ID: <91B9B859-AB9D-4097-A5D2-236E10C9D308@llnl.gov> Hi there, One of our users just found a bug in numpy that has to do with casting. Consider the attached example. The difference at the end should be 0 (zero) everywhere. But it's not by default. Casting the data to 'float64' at reading and assiging to the arrays works Defining the arrays "at creation time" as "float32" works But simply reading in the data (flaot32) and putting them into the default arrays (float64) leads to differences! Thanks for looking into this, C. -------------- next part -------------- A non-text attachment was scrubbed... Name: squares.py Type: text/x-python-script Size: 507 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: numpy_test_data.nc Type: application/octet-stream Size: 1580 bytes Desc: not available URL: From robert.kern at gmail.com Tue May 26 19:28:08 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 26 May 2009 18:28:08 -0500 Subject: [Numpy-discussion] Home for pyhdf5io? In-Reply-To: <4A1B8308.4050604@noaa.gov> References: <000e0cd2a07010d73d046a71a597@google.com> <4A193C3A.1080309@stevesimmons.com> <3d375d730905241422j29b11d6dn46212a78c7ddfd56@mail.gmail.com> <4488200F-9E58-4844-88D5-08C03CDAB713@cs.toronto.edu> <4A1B79F9.8010203@noaa.gov> <3d375d730905252215p4ccf309ard7367fe7d39ef7d7@mail.gmail.com> <4A1B8308.4050604@noaa.gov> Message-ID: <3d375d730905261628o989afd3ia1383dd5f2fdbb77@mail.gmail.com> On Tue, May 26, 2009 at 00:50, Christopher Barker wrote: > Robert Kern wrote: >> Yes. That's why I wrote the NPY format instead. I *did* do some due >> diligence before I designed a new binary format. 
> > I assumed so, and I also assume you took a look at netcdf3, but since > it's been brought up here, I take it it dint fit the bill? Even if it > did, while it will be around for a LONG time, it is an out-of-date format. Lack of unsigned and 64-bit integers for the most part. But even if they were supported, I didn't see much point in using a standard that is being replaced by its own community. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Tue May 26 19:50:09 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 26 May 2009 18:50:09 -0500 Subject: [Numpy-discussion] casting bug In-Reply-To: <91B9B859-AB9D-4097-A5D2-236E10C9D308@llnl.gov> References: <91B9B859-AB9D-4097-A5D2-236E10C9D308@llnl.gov> Message-ID: <3d375d730905261650ob777821ob0a9a64495043e6e@mail.gmail.com> 2009/5/26 Charles ???? Doutriaux : > Hi there, > > One of our users just found a bug in numpy that has to do with casting. > > Consider the attached example. > > The difference at the end should be ?0 (zero) everywhere. > > But it's not by default. > > Casting the data to 'float64' at reading and assiging to the arrays works > Defining the arrays "at creation time" as "float32" works > > But simply reading in the data (flaot32) and putting them into the default > arrays (float64) leads to differences! That is probably not a good characterization of the code. You are doing a summation of squares of floats two different ways. The data is stored in the file with dtype=float32. You are reading in the data a "row" at a time (with the name "tmp" for future reference). You are keeping an accumulator ("ex2" with dtype=float64) and storing each row into an array shaped like the whole data in the file ("a" with dtype=float64). You calculate the sum of squares one way by "ex2 = ex2 + power(tmp, 2.0)". In this case, the square is computed, then stuffed back into a float32 intermediate result, *then* added to "ex2". You calculate the sum of squares the other way by "numpy.sum(power(a, 2.0))". In this case, the square is computed in full float64 precision, then added up. The difference between the two results is just that of using a low precision intermediate for the square. You have fairly large inputs so you are losing a good number of digits by using a float32 intermediate before the summation. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From Chris.Barker at noaa.gov Tue May 26 20:05:34 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 26 May 2009 17:05:34 -0700 Subject: [Numpy-discussion] Home for pyhdf5io? 
In-Reply-To: <3d375d730905261628o989afd3ia1383dd5f2fdbb77@mail.gmail.com> References: <000e0cd2a07010d73d046a71a597@google.com> <4A193C3A.1080309@stevesimmons.com> <3d375d730905241422j29b11d6dn46212a78c7ddfd56@mail.gmail.com> <4488200F-9E58-4844-88D5-08C03CDAB713@cs.toronto.edu> <4A1B79F9.8010203@noaa.gov> <3d375d730905252215p4ccf309ard7367fe7d39ef7d7@mail.gmail.com> <4A1B8308.4050604@noaa.gov> <3d375d730905261628o989afd3ia1383dd5f2fdbb77@mail.gmail.com> Message-ID: <4A1C83CE.3000506@noaa.gov> Robert Kern wrote: > On Tue, May 26, 2009 at 00:50, Christopher Barker wrote: >> I assumed so, and I also assume you took a look at netcdf3, but since >> it's been brought up here, I take it it didn't fit the bill? > Lack of unsigned and 64-bit integers for the most part. But even if > they were supported, I didn't see much point in using a standard that > is being replaced by its own community. I agree -- I, and many others, are using netcdf4 libs to work with netcdf3 files, but the change will come. So we have a technical and social reason not to use it. Good enough for me. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Chris.Barker at noaa.gov Tue May 26 20:07:39 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 26 May 2009 17:07:39 -0700 Subject: [Numpy-discussion] List/location of consecutive integers (2) In-Reply-To: References: Message-ID: <4A1C844B.8040307@noaa.gov> Andrea Gavana wrote: > I have tried the solutions proposed in the previous thread and it > looks like Chris' one is the fastest for my purposes. whoo hoo! What do I win? ;-) > Splitting the reading process between 4 processes will require the > exchange of 5-20 MB from the child processes to the main one: do you > think my script will benefit from using multiprocessing? If you are talking about multiprocessing to read the data in -- I don't think so -- that's probably IO bound anyway. You can't make your disks faster with multiple processors. > Should I try another approach? I don't know it will do anything for performance, but you might want to look at memory mapped arrays -- it's a very cool way to work with data files too big to want to bring into memory all at once. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From pchaow at gmail.com Wed May 27 01:55:09 2009 From: pchaow at gmail.com (chaow porkaew) Date: Wed, 27 May 2009 12:55:09 +0700 Subject: [Numpy-discussion] Can you teach me how to used array api in C/C++? Message-ID: Can you teach me how to used array api in C/C++? 1.How to get a data in co-ordinate i,j , example a = array([[1,2,3],[4,5,6]]) how do i get the value of 5 in c/c++ or 2.How i sum all of data in arrays in c/c++ -------------- next part -------------- An HTML attachment was scrubbed... URL: From pchaow at gmail.com Wed May 27 02:14:49 2009 From: pchaow at gmail.com (chaow porkaew) Date: Wed, 27 May 2009 13:14:49 +0700 Subject: [Numpy-discussion] help Can you teach me how to used array api in C/C+ Message-ID: Can you teach me how to used array api in C/C++? 1.How to get a data in co-ordinate i,j , example a = array([[1,2,3],[4,5,6]]) how do i get the value of 5 in c/c++ or 2.How i sum all of data in arrays in c/c++ Best regards. 
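For reference, a minimal sketch (not from the original thread) of one way to do both with the NumPy C API. It assumes the input has already been converted to a 2-D double array, e.g. with PyArray_FROM_OTF(obj, NPY_DOUBLE, NPY_IN_ARRAY), and that import_array() was called during module initialization:

#include <Python.h>
#include <numpy/arrayobject.h>

/* Read element (i, j); PyArray_GETPTR2 accounts for strides. */
static double
get_element(PyArrayObject *arr, npy_intp i, npy_intp j)
{
    return *(double *)PyArray_GETPTR2(arr, i, j);
}

/* Sum every element of a 2-D array with a plain double loop. */
static double
sum_all(PyArrayObject *arr)
{
    double total = 0.0;
    npy_intp i, j;
    for (i = 0; i < PyArray_DIM(arr, 0); i++) {
        for (j = 0; j < PyArray_DIM(arr, 1); j++) {
            total += *(double *)PyArray_GETPTR2(arr, i, j);
        }
    }
    return total;
}

For a = array([[1., 2., 3.], [4., 5., 6.]]), get_element(a, 1, 1) returns 5.0 and sum_all(a) returns 21.0.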
-------------- next part -------------- An HTML attachment was scrubbed... URL: From bsouthey at gmail.com Wed May 27 09:24:06 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Wed, 27 May 2009 08:24:06 -0500 Subject: [Numpy-discussion] List/location of consecutive integers (2) In-Reply-To: <4A1C844B.8040307@noaa.gov> References: <4A1C844B.8040307@noaa.gov> Message-ID: <4A1D3EF6.70601@gmail.com> Christopher Barker wrote: > Andrea Gavana wrote: > >> I have tried the solutions proposed in the previous thread and it >> looks like Chris' one is the fastest for my purposes. >> > > whoo hoo! What do I win? ;-) > > >> Splitting the reading process between 4 processes will require the >> exchange of 5-20 MB from the child processes to the main one: do you >> think my script will benefit from using multiprocessing? >> > > If you are talking about multiprocessing to read the data in -- I don't > think so -- that's probably IO bound anyway. You can't make your disks > faster with multiple processors. > > >> Should I try another approach? >> > > I don't know it will do anything for performance, but you might want to > look at memory mapped arrays -- it's a very cool way to work with data > files too big to want to bring into memory all at once. > > -Chris > > > Depending on your system and OS, I would agree with Chris that you are most likely to be I/O bound. If so, you have to look at a different approach to overcome that barrier. If you are not I/O bound then you need to find out what is the limiting your performance (like using Robert Kern's line_profiler http://pypi.python.org/pypi/line_profiler/). If you find it CPU-bound then you might you gain benefits from multiple cpu's - of which has been addressed in multiple times on the list. Cython is probably a very viable option for what you have described. Bruce From david at ar.media.kyoto-u.ac.jp Wed May 27 09:08:00 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 27 May 2009 22:08:00 +0900 Subject: [Numpy-discussion] Best way to inherit from PyArrayIterObject at the C level ? Message-ID: <4A1D3B30.80304@ar.media.kyoto-u.ac.jp> Hi, I have been scratching my head on the following problem. I am designing a new array iterator, in C, to walk into a neighborhood of an array. I would like this iterator to 'inherit' from PyArrayIterObject, so that I can design some API which accept both PyArrayIterObject and PyArrayNeighIterObject (through pointer casting). I have tried something like: typedef struct { /* first item is the base class, so that casting a PyArrayNeighIterObject* to PyArrayIterObject* works */ PyArrayIterObject base; /* PyArrayNeighIterObject specific members */ .... } PyArrayNeighIterObject; But this forces me to cast a PyArrayNeighIterObject* to PyArrayIterObject whenever I want to access members of the base instance. The alternative is to copy the PyArrayIterObject members by hand, as is currently done in numpy itself (for broadcasting iterator PyArrayMapIterObject). But since my iterator lives outside numpy, this is really error-prone IMHO - there will be crashes whenever the PyArrayIterObject struct changes in an ABI incompatible way, and this may be quite hard to debug Is there a better way ? 
David From matthieu.brucher at gmail.com Wed May 27 09:51:15 2009 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Wed, 27 May 2009 15:51:15 +0200 Subject: [Numpy-discussion] Failure with 1.3 Message-ID: Hi, I've just tested the latest numpy with my new configuration (Opteron 2220, 64bits with RH5.2, compiled with ICC 10.1.018) and I got this failure. ====================================================================== FAIL: test_umath.TestLogAddExp2.test_logaddexp2_values [...] assert_almost_equal(np.logaddexp2(xf, yf), zf, decimal=dec) [...] raise AssertionError(msg) AssertionError: Arrays are not almost equal (mismatch 80.0%) x: array([ 2.32090838, 2.00127574, 2.5849625 , 2.00127574, 2.32090838]) y: array([ 2.5849625, 2.5849625, 2.5849625, 2.5849625, 2.5849625]) Is it fixed in the current trunk? Matthieu -- Information System Engineer, Ph.D. Website: http://matthieu-brucher.developpez.com/ Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn: http://www.linkedin.com/in/matthieubrucher From charlesr.harris at gmail.com Wed May 27 10:32:15 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 27 May 2009 08:32:15 -0600 Subject: [Numpy-discussion] Failure with 1.3 In-Reply-To: References: Message-ID: On Wed, May 27, 2009 at 7:51 AM, Matthieu Brucher < matthieu.brucher at gmail.com> wrote: > Hi, > > I've just tested the latest numpy with my new configuration (Opteron > 2220, 64bits with RH5.2, compiled with ICC 10.1.018) and I got this > failure. > > ====================================================================== > FAIL: test_umath.TestLogAddExp2.test_logaddexp2_values > [...] > assert_almost_equal(np.logaddexp2(xf, yf), zf, decimal=dec) > [...] > raise AssertionError(msg) > AssertionError: > Arrays are not almost equal > > (mismatch 80.0%) > x: array([ 2.32090838, 2.00127574, 2.5849625 , 2.00127574, > 2.32090838]) > y: array([ 2.5849625, 2.5849625, 2.5849625, 2.5849625, 2.5849625]) > > Is it fixed in the current trunk? > This is the first report. I'll guess it is related to icc. What happens if you use gcc? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From doutriaux1 at llnl.gov Wed May 27 11:14:53 2009 From: doutriaux1 at llnl.gov (=?UTF-8?Q?Charles_=D8=B3=D9=85=D9=8A=D8=B1_Doutriaux?=) Date: Wed, 27 May 2009 08:14:53 -0700 Subject: [Numpy-discussion] casting bug In-Reply-To: <3d375d730905261650ob777821ob0a9a64495043e6e@mail.gmail.com> References: <91B9B859-AB9D-4097-A5D2-236E10C9D308@llnl.gov> <3d375d730905261650ob777821ob0a9a64495043e6e@mail.gmail.com> Message-ID: <1619CC0D-DFE7-4167-856C-72D563D61EAB@llnl.gov> Thanks Robert, I thought it was something like that but couldn't figure it out. C. On May 26, 2009, at 4:50 PM, Robert Kern wrote: > 2009/5/26 Charles ???? Doutriaux : >> Hi there, >> >> One of our users just found a bug in numpy that has to do with >> casting. >> >> Consider the attached example. >> >> The difference at the end should be 0 (zero) everywhere. >> >> But it's not by default. >> >> Casting the data to 'float64' at reading and assiging to the arrays >> works >> Defining the arrays "at creation time" as "float32" works >> >> But simply reading in the data (flaot32) and putting them into the >> default >> arrays (float64) leads to differences! > > That is probably not a good characterization of the code. You are > doing a summation of squares of floats two different ways. The data is > stored in the file with dtype=float32. 
You are reading in the data a > "row" at a time (with the name "tmp" for future reference). You are > keeping an accumulator ("ex2" with dtype=float64) and storing each row > into an array shaped like the whole data in the file ("a" with > dtype=float64). > > You calculate the sum of squares one way by "ex2 = ex2 + power(tmp, > 2.0)". In this case, the square is computed, then stuffed back into a > float32 intermediate result, *then* added to "ex2". > > You calculate the sum of squares the other way by "numpy.sum(power(a, > 2.0))". In this case, the square is computed in full float64 > precision, then added up. > > The difference between the two results is just that of using a low > precision intermediate for the square. You have fairly large inputs so > you are losing a good number of digits by using a float32 intermediate > before the summation. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http:// mail.scipy.org/mailman/listinfo/numpy-discussion From lubensch.proletariat.inc at gmail.com Wed May 27 11:12:44 2009 From: lubensch.proletariat.inc at gmail.com (cp) Date: Wed, 27 May 2009 15:12:44 +0000 (UTC) Subject: [Numpy-discussion] asarray() and PIL Message-ID: Hi, I'm using PIL for image processing, but lately I also try numpy for the flexibility and superior speed it offers. The first thing I noticed is that for an RGB image with height=1600 and width=1900 while img=Image.open('something.tif') img.size (1900,1600) then arr=asarray(img) arr.shape (1600,1900,3) This means that the array-image has 1600 color channels, 1900 image pixel rows and 3 image pixel columns. Why is that? if I reshape with arr.reshape(3,1900,1600) will there be a mix-up in pixel values and coordinates when compared to the initial PIL image? From stefan at sun.ac.za Wed May 27 11:19:20 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 27 May 2009 17:19:20 +0200 Subject: [Numpy-discussion] asarray() and PIL In-Reply-To: References: Message-ID: <9457e7c80905270819j362f1d55y81402df38267c169@mail.gmail.com> 2009/5/27 cp : > img=Image.open('something.tif') > img.size > (1900,1600) > > then > > arr=asarray(img) > arr.shape > (1600,1900,3) > > This means that the array-image has 1600 color channels, 1900 image pixel rows > and 3 image pixel columns. Why is that? No, it means that you have 1600 rows, 1900 columns and 3 colour channels. > if I reshape with > > arr.reshape(3,1900,1600) > > will there be a mix-up in pixel values and coordinates when compared to the > initial PIL image? You'll have to use np.rollaxis(img, -1) to get the shape you want. Regards St?fan From seb.haase at gmail.com Wed May 27 11:20:46 2009 From: seb.haase at gmail.com (Sebastian Haase) Date: Wed, 27 May 2009 17:20:46 +0200 Subject: [Numpy-discussion] asarray() and PIL In-Reply-To: References: Message-ID: On Wed, May 27, 2009 at 5:12 PM, cp wrote: > Hi, > I'm using PIL for image processing, but lately I also try numpy for the > flexibility and superior speed it offers. 
The first thing I noticed is that for > an RGB image with height=1600 and width=1900 while > > img=Image.open('something.tif') > img.size > (1900,1600) > > then > > arr=asarray(img) > arr.shape > (1600,1900,3) > > This means that the array-image has 1600 color channels, 1900 image pixel rows > and 3 image pixel columns. Why is that? > if I reshape with > > arr.reshape(3,1900,1600) > You should look into transpose if you prefer the the colors to be on the first axis instead of the last one -- that's what I like to do. Regards, Sebastian Haase From dsdale24 at gmail.com Wed May 27 11:30:52 2009 From: dsdale24 at gmail.com (Darren Dale) Date: Wed, 27 May 2009 11:30:52 -0400 Subject: [Numpy-discussion] suggestion for generalizing numpy functions In-Reply-To: References: Message-ID: Now that numpy-1.3 has been released, I was hoping I could engage the numpy developers and community concerning my suggestion to improve the ufunc wrapping mechanism. Currently, ufuncs call, on the way out, the __array_wrap__ method of the input array with the highest __array_priority__. There are use cases, like masked arrays or arrays with units, where it is imperative to run some code on the way in to the ufunc as well. MaskedArrays do this by reimplementing or wrapping ufuncs, but this approach puts some pretty severe constraints on subclassing. For example, in my Quantities package I have a Quantity object that derives from ndarray. It has been suggested that in order to make ufuncs work with Quantity, I should wrap numpy's built-in ufuncs. But I intend to make a MaskedQuantity object as well, deriving from MaskedArray, and would therefore have to wrap the MaskedArray ufuncs as well. If ufuncs would simply call a method both on the way in and on the way out, I think this would go a long way to improving this situation. I whipped up a simple proof of concept and posted it in this thread a while back. For example, a MaskedQuantity would implement a method like __gfunc_pre__ to check the validity of the units operation etc, and would then call MaskedArray.__gfunc_pre__ (if defined) to determine the domain etc. __gfunc_pre__ would return a dict containing any metadata the subclasses wish to provide based on the inputs, and that dict would be passed along with the inputs, output and context to __gfunc_post__, so postprocessing can be done (__gfunc_post__ replacing __array_wrap__). Of course, packages like MaskedArray may still wish to reimplement ufuncs, like Eric Firing is investigating right now. The point is that classes that dont care about the implementation of ufuncs, that only need to provide metadata based on the inputs and the output, can do so using this mechanism and can build upon other specialized arrays. I would really appreciate input from numpy developers and other interested parties. I would like to continue developing the Quantities package this summer, and have been approached by numerous people interested in using Quantities with sage, sympy, matplotlib. But I would prefer to improve the ufunc mechanism (or establish that there is no interest among the community to do so) so I can improve the package (or limit its scope) before making an official announcement. Thank you, Darren On Mon, Mar 9, 2009 at 5:37 PM, Darren Dale wrote: > On Mon, Mar 9, 2009 at 9:50 AM, Darren Dale wrote: > >> I spent some time over the weekend fixing a few bugs in numpy that were >> exposed when attempting to use ufuncs with ndarray subclasses. 
It got me >> thinking that, with relatively little work, numpy's functions could be made >> to be more general. For example, the numpy.ma module redefines many of >> the standard ufuncs in order to do some preprocessing before the builtin >> ufunc is called. Likewise, in the units/quantities package I have been >> working on, I would like to perform a dimensional analysis to make sure an >> operation is allowed before I call a ufunc that might change data in place. >> >> Imagine an ndarray subclass with methods like __gfunc_pre__ and >> __gfunc_post__. __gfunc_pre__ could accept the context that is currently >> provided to __array_wrap__ (the inputs and the function called), perform >> whatever preprocessing is desired, and maybe return a dictionary containing >> metadata. Numpy functions could then be wrapped with a decorator that 1) >> calls __gfunc_pre__ and obtain any metadata that is returned 2) calls the >> wrapped functions, and then 3) calls __gfunc_post__, which might be very >> similar to __array_wrap__ except that it would also accept the metadata >> created by __gfunc_pre__. >> >> In cases where the routines to be called by __gfunc_pre__ and _post__ >> depend on what function is called, the the subclass could implement routines >> and store them in a dictionary-like object that is keyed using the function >> called. I have been exploring this approach with Quantities and it seems to >> work well. For example: >> >> def __gfunc_pre__(self, gfunc, *args): >> try: >> return gfunc_pre_registry[gfunc](*args) >> except KeyError: >> return {} >> >> I think such an approach for generalizing numpy's functions could be >> implemented without being disruptive to the existing __array_wrap__ >> framework. The decorator would attempt to identify an input or output array >> to use to call __gfunc_pre__ and _post__. If it finds them, it uses them. If >> it doesnt find them, no harm done, the existing __array_wrap__ mechanisms >> are still in place if the wrapped function is a ufunc. >> >> One other nice feature: the metadata that is returned by __gfunc_pre__ >> could contain an optional flag that the decorator attempts to pass to the >> wrapped function so that __gfunc_pre__ and _post are not called for any >> decorated internal functions. That way the subclass could specify that >> __gfunc_pre__ and _post should be called only for the outer-most function. >> >> Comments? >> > > I'm attaching a proof of concept script, maybe it will better illustrate > what I am talking about. > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From Nicolas.Rougier at loria.fr Wed May 27 11:31:20 2009 From: Nicolas.Rougier at loria.fr (Nicolas Rougier) Date: Wed, 27 May 2009 17:31:20 +0200 Subject: [Numpy-discussion] Benchmak on record arrays Message-ID: <1243438280.14931.30.camel@sulfur.loria.fr> Hi, I've written a very simple benchmark on recarrays: import numpy, time Z = numpy.zeros((100,100), dtype=numpy.float64) Z_fast = numpy.zeros((100,100), dtype=[('x',numpy.float64), ('y',numpy.int32)]) Z_slow = numpy.zeros((100,100), dtype=[('x',numpy.float64), ('y',numpy.bool)]) t = time.clock() for i in range(10000): Z*Z print time.clock()-t t = time.clock() for i in range(10000): Z_fast['x']*Z_fast['x'] print time.clock()-t t = time.clock() for i in range(10000): Z_slow['x']*Z_slow['x'] print time.clock()-t And got the following results: 0.23 0.37 3.96 Am I right in thinking that the last case is quite slow because of some memory misalignment between float64 and bool or is there some machinery behind that makes things slow in this case ? Should this be mentioned somewhere in the recarray documentation ? Nicolas From lubensch.proletariat.inc at gmail.com Wed May 27 11:33:56 2009 From: lubensch.proletariat.inc at gmail.com (cp) Date: Wed, 27 May 2009 15:33:56 +0000 (UTC) Subject: [Numpy-discussion] Numpy vs PIL in image statistics Message-ID: Testing the PIL vs numpy in calculating the mean value of each color channel of an image I timed the following. impil = Image.open("10.tif") imnum = asarray(impil) #in PIL for i in range(1,10): stats = ImageStat.Stat(impil) stats.mean # for numpy for i in range(1,10): imnum.reshape(-1,3).mean(axis=0) The image I tested initially is 2000x2000 RGB tif ~11mb in size. I set a timer in each for loop and measured the performance of numpy 7 times slower than PIL. When I did the the same with an 10x10 RGB tif and with 1000 cycles in for, numpy was 25 times faster than PIL. Why is that? Does mean or reshape, make a copy? From lubensch.proletariat.inc at gmail.com Wed May 27 11:50:13 2009 From: lubensch.proletariat.inc at gmail.com (cp) Date: Wed, 27 May 2009 15:50:13 +0000 (UTC) Subject: [Numpy-discussion] asarray() and PIL References: <9457e7c80905270819j362f1d55y81402df38267c169@mail.gmail.com> Message-ID: > > arr=asarray(img) > > arr.shape > > (1600,1900,3) > No, it means that you have 1600 rows, 1900 columns and 3 colour channels. According to scipy documentation at http://pages.physics.cornell.edu/~myers/teaching/ComputationalMethods/python/arrays.html you are right. In this case I import numpy where according to http://www.scipy.org/Tentative_NumPy_Tutorial and the Printing Arrays paragraph (also in http://www.scipy.org/Numpy_Example_List#reshape reshape example) the first number is the layer, the second the rows and the last the columns. Are all the above valid or am I missing something? 
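For reference, a minimal self-contained sketch (using a small synthetic array in place of the TIFF, so none of the values below come from the files discussed) of the distinction drawn in this thread: transpose/rollaxis only relabel the axes of the (rows, columns, channels) array, while reshape reinterprets the flat buffer and mixes pixel values:

import numpy as np

# stand-in for asarray(img): 2 rows, 3 columns, 3 colour channels
arr = np.arange(2*3*3).reshape(2, 3, 3)

# red value of the pixel at row 1, column 2
print arr[1, 2, 0]

# rollaxis (or transpose) moves the channel axis to the front without
# touching which value belongs to which pixel
chan_first = np.rollaxis(arr, -1)          # shape (3, 2, 3)
print chan_first[0, 1, 2] == arr[1, 2, 0]  # True

# reshape keeps the same flat buffer and just reinterprets it, so the
# "red plane" no longer corresponds to the red channel
wrong = arr.reshape(3, 2, 3)
print wrong[0, 1, 2] == arr[1, 2, 0]       # False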
Thanks From charlesr.harris at gmail.com Wed May 27 12:01:51 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 27 May 2009 10:01:51 -0600 Subject: [Numpy-discussion] Benchmak on record arrays In-Reply-To: <1243438280.14931.30.camel@sulfur.loria.fr> References: <1243438280.14931.30.camel@sulfur.loria.fr> Message-ID: On Wed, May 27, 2009 at 9:31 AM, Nicolas Rougier wrote: > > Hi, > > I've written a very simple benchmark on recarrays: > > import numpy, time > > Z = numpy.zeros((100,100), dtype=numpy.float64) > Z_fast = numpy.zeros((100,100), dtype=[('x',numpy.float64), > ('y',numpy.int32)]) > Z_slow = numpy.zeros((100,100), dtype=[('x',numpy.float64), > ('y',numpy.bool)]) > > t = time.clock() > for i in range(10000): Z*Z > print time.clock()-t > > t = time.clock() > for i in range(10000): Z_fast['x']*Z_fast['x'] > print time.clock()-t > > t = time.clock() > for i in range(10000): Z_slow['x']*Z_slow['x'] > print time.clock()-t > > > And got the following results: > 0.23 > 0.37 > 3.96 > > Am I right in thinking that the last case is quite slow because of some > memory misalignment between float64 and bool or is there some machinery > behind that makes things slow in this case ? Probably. Record arrays are stored like packed c structures and need to be unpacked by copying the bytes to aligned data types. > Should this be mentioned somewhere in the recarray documentation ? A note would be appropriate, yes. You should be able to do that, do you have edit permissions for the documentation? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chris.Barker at noaa.gov Wed May 27 12:43:55 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed, 27 May 2009 09:43:55 -0700 Subject: [Numpy-discussion] asarray() and PIL In-Reply-To: References: <9457e7c80905270819j362f1d55y81402df38267c169@mail.gmail.com> Message-ID: <4A1D6DCB.6030404@noaa.gov> cp wrote: >>> arr=asarray(img) >>> arr.shape >>> (1600,1900,3) > >> No, it means that you have 1600 rows, 1900 columns and 3 colour channels. > > According to scipy documentation at > http://pages.physics.cornell.edu/~myers/teaching/ComputationalMethods/python/arrays.html > you are right. > > In this case I import numpy where according to > http://www.scipy.org/Tentative_NumPy_Tutorial and the Printing Arrays paragraph > (also in http://www.scipy.org/Numpy_Example_List#reshape reshape example) the > first number is the layer, the second the rows and the last the columns. > > Are all the above valid or am I missing something? I'm not sure what part of those docs you're referring to -- but they are probably both right. What you are missing is that numpy doesn't define for you what the axis mean, they just are: the zeroth axis is of length 1600 elements the first axis is of length 1900 elements the second axis is of length 3 elements what they represent is up to your application. In the case of importing from PIL, it is (height, width, rgb) (I think height and width get swapped due to how memory is laid out in PIL vs numpy) by default, numpy arrays are stored and worked with in C-array order, so that array layout has the pixels together in memory as rgb triples. Depending on what you are doing you may not want that. 
You can work with them any way you want: sub_region = arr[r1:r2, c1:c2, :] all_red = arr[:,:,0] but if you tend to work with all the red, or all the blue, etc, then it might be easier to re-arrange it to: (rgb, height, width) If you google a bit, you'll find various notes about working with image data in numpy. HTH, -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Nicolas.Rougier at loria.fr Wed May 27 15:21:37 2009 From: Nicolas.Rougier at loria.fr (Nicolas Rougier) Date: Wed, 27 May 2009 21:21:37 +0200 Subject: [Numpy-discussion] Benchmak on record arrays In-Reply-To: References: <1243438280.14931.30.camel@sulfur.loria.fr> Message-ID: <90DEC74F-9C1A-4E12-8416-B16827934A1F@loria.fr> No, I don't have permission to edit. Nicolas On 27 May, 2009, at 18:01 , Charles R Harris wrote: > > > On Wed, May 27, 2009 at 9:31 AM, Nicolas Rougier > wrote: > > Hi, > > I've written a very simple benchmark on recarrays: > > import numpy, time > > Z = numpy.zeros((100,100), dtype=numpy.float64) > Z_fast = numpy.zeros((100,100), dtype=[('x',numpy.float64), > ('y',numpy.int32)]) > Z_slow = numpy.zeros((100,100), dtype=[('x',numpy.float64), > ('y',numpy.bool)]) > > t = time.clock() > for i in range(10000): Z*Z > print time.clock()-t > > t = time.clock() > for i in range(10000): Z_fast['x']*Z_fast['x'] > print time.clock()-t > > t = time.clock() > for i in range(10000): Z_slow['x']*Z_slow['x'] > print time.clock()-t > > > And got the following results: > 0.23 > 0.37 > 3.96 > > Am I right in thinking that the last case is quite slow because of > some > memory misalignment between float64 and bool or is there some > machinery > behind that makes things slow in this case ? > > Probably. Record arrays are stored like packed c structures and need > to be unpacked by copying the bytes to aligned data types. > > Should this be mentioned somewhere in the recarray documentation ? > > A note would be appropriate, yes. You should be able to do that, do > you have edit permissions for the documentation? > > Chuck > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From Nicolas.Rougier at loria.fr Wed May 27 15:26:08 2009 From: Nicolas.Rougier at loria.fr (Nicolas Rougier) Date: Wed, 27 May 2009 21:26:08 +0200 Subject: [Numpy-discussion] arrray/matrix nonzero() type Message-ID: <71AA343B-BC43-4A39-BF46-AE8D4A8746C7@loria.fr> Hi again, I have a problem with the nonzero() function for matrix. 
The following test program: import numpy, scipy.sparse Z = numpy.zeros((10,10)) Z[0,0] = Z[1,1] = 1 i = Z.nonzero() print i Zc = scipy.sparse.coo_matrix((Z[i],i)) Z = numpy.matrix(Z) i = Z.nonzero() print i Zc = scipy.sparse.coo_matrix((Z[i],i)) gives me: (array([0, 1]), array([0, 1])) (matrix([[0, 1]]), matrix([[0, 1]])) Traceback (most recent call last): File "test.py", line 13, in Zc = scipy.sparse.coo_matrix((Z[i],i)) File "/Volumes/Data/Local/lib/python2.6/site-packages/scipy/sparse/ coo.py", line 179, in __init__ self._check() File "/Volumes/Data/Local/lib/python2.6/site-packages/scipy/sparse/ coo.py", line 194, in _check nnz = self.nnz File "/Volumes/Data/Local/lib/python2.6/site-packages/scipy/sparse/ coo.py", line 187, in getnnz raise ValueError('row, column, and data arrays must have rank 1') ValueError: row, column, and data arrays must have rank 1 Is that the intended behavior ? How can I use nonzero with matrix to build the coo one ? Nicolas From wnbell at gmail.com Wed May 27 17:47:21 2009 From: wnbell at gmail.com (Nathan Bell) Date: Wed, 27 May 2009 17:47:21 -0400 Subject: [Numpy-discussion] arrray/matrix nonzero() type In-Reply-To: <71AA343B-BC43-4A39-BF46-AE8D4A8746C7@loria.fr> References: <71AA343B-BC43-4A39-BF46-AE8D4A8746C7@loria.fr> Message-ID: On Wed, May 27, 2009 at 3:26 PM, Nicolas Rougier wrote: > > Hi again, > > I ?have a problem with the nonzero() function for matrix. > > The following test program: > > import numpy, scipy.sparse > > Z = numpy.zeros((10,10)) > > i = Z.nonzero() > print i > Zc = scipy.sparse.coo_matrix((Z[i],i)) > > Z = numpy.matrix(Z) > i = Z.nonzero() > print i > Zc = scipy.sparse.coo_matrix((Z[i],i)) > > > Is that the intended behavior ? How can I use nonzero with matrix to > build the coo one ? > Even simpler, just do Zc = scipy.sparse.coo_matrix(Z) As of SciPy 0.7, all the sparse matrix constructors accept dense matrices and array-like objects. The problem with the matrix case is that Z[i] is rank-2 when a rank-1 array is expected. -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From fperez.net at gmail.com Wed May 27 17:53:20 2009 From: fperez.net at gmail.com (Fernando Perez) Date: Wed, 27 May 2009 14:53:20 -0700 Subject: [Numpy-discussion] Nested recarrays with subarrays and loadtxt: a bug in loadtxt? Message-ID: Howdy, I'm wondering if the code below illustrates a bug in loadtxt, or just a 'live with it' limitation. I'm inlining it for ease of discussion, but the same code is attached to ensure that anyone willing to look at this can just download and run without pasting/whitespace issues. The code is, I hope, sufficiently commented to explain the problem in full detail. Thanks for any input! Cheers, f ### """Simple illustration of nested record arrays. Note: possible numpy.loadtxt bug?""" from StringIO import StringIO import numpy as np from numpy import array, dtype, loadtxt, recarray # Consider the task of loading data that is stored in plain text in a file such # as the string below, where the last block of numbers is meant to be # interpreted as a single 2x3 int array, whose field name in the resulting # structured array will be 'block'. 
txtdata = StringIO(""" # name x y block - 2x3 ints aaaa 1.0 8.0 1 2 3 4 5 6 aaaa 2.0 7.4 2 11 22 3 4 5 6 bbbb 3.5 8.5 3 0 22 44 5 6 aaaa 6.4 4.0 4 1 3 33 54 65 aaaa 8.8 4.1 5 5 3 4 44 77 bbbb 5.5 9.1 6 3 4 5 0 55 bbbb 7.7 8.5 7 2 3 4 5 66 """) # We make the dtype for it: dt = dtype(dict(names=['name','x','y','block'], formats=['S4',float,float,(int,(2,3))])) # And we load it with loadtxt and make a recarray version for convenience data = loadtxt(txtdata,dt) rdata = data.view(recarray) # Unfortunately, if we look at the block data, it repeats the first number # found. This seems to be a loadtxt bug: # In [176]: rdata.block[0,1] # Out[176]: array([1, 1, 1]) # we'd expect array([4, 5, 6]) if np.any(rdata.block[0,1] != array([4, 5, 6])): print 'WARNING: loadtxt bug??' # A workaround can be used by doing a second pass on the file, loading the # columns corresponding to the block as plain ints and doing a reassignment of # that data into the original data. # Rewind the data and reload only the 'block' of ints: txtdata.seek(0) block_data = loadtxt(txtdata,int,usecols=range(3,9)) # Let's work with a copy of the original so we can compare interactively... rdata2 = rdata.copy() # We assign to the block field in our real array the block_data one, # appropriately reshaped rdata2.block[:] = block_data.reshape(rdata.block.shape) # Same check as before, with the new one if np.any(rdata2.block[0,1] != array([4, 5, 6])): print 'WARNING: loadtxt bug??' else: print 'Second pass - data loaded OK.' -------------- next part -------------- A non-text attachment was scrubbed... Name: rec_nested.py Type: text/x-python Size: 2001 bytes Desc: not available URL: From pgmdevlist at gmail.com Wed May 27 18:01:30 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 27 May 2009 18:01:30 -0400 Subject: [Numpy-discussion] Nested recarrays with subarrays and loadtxt: a bug in loadtxt? In-Reply-To: References: Message-ID: <59D32033-6501-4837-97D0-DE5EFCB03FDE@gmail.com> On May 27, 2009, at 5:53 PM, Fernando Perez wrote: > Howdy, > > I'm wondering if the code below illustrates a bug in loadtxt, or just > a 'live with it' limitation. Have you tried np.lib.io.genfromtxt ? dt = dtype(dict(names=['name','x','y','block'], formats=['S4',float,float,(int,(2,3))])) txtdata = StringIO(""" # name x y block - 2x3 ints aaaa 1.0 8.0 1 2 3 4 5 6 aaaa 2.0 7.4 2 11 22 3 4 5 6 bbbb 3.5 8.5 3 0 22 44 5 6 aaaa 6.4 4.0 4 1 3 33 54 65 aaaa 8.8 4.1 5 5 3 4 44 77 bbbb 5.5 9.1 6 3 4 5 0 55 bbbb 7.7 8.5 7 2 3 4 5 66 """) alt_data = np.lib.io.genfromtxt(txtdata,dtype=dt) array([('aaaa', 1.0, 8.0, [[1, 1, 1], [1, 1, 1]]), ('aaaa', 2.0, 7.4000000000000004, [[2, 2, 2], [2, 2, 2]]), ('bbbb', 3.5, 8.5, [[3, 3, 3], [3, 3, 3]]), ('aaaa', 6.4000000000000004, 4.0, [[4, 4, 4], [4, 4, 4]]), ('aaaa', 8.8000000000000007, 4.0999999999999996, [[5, 5, 5], [5, 5, 5]]), ('bbbb', 5.5, 9.0999999999999996, [[6, 6, 6], [6, 6, 6]]), ('bbbb', 7.7000000000000002, 8.5, [[7, 7, 7], [7, 7, 7]])], dtype=[('name', '|S4'), ('x', ' References: <59D32033-6501-4837-97D0-DE5EFCB03FDE@gmail.com> Message-ID: Hi Pierre, On Wed, May 27, 2009 at 3:01 PM, Pierre GM wrote: > Have you tried np.lib.io.genfromtxt ? > I didn't know about it, but it has the same problem as loadtxt: In [5]: rdata.block[0,1] # incorrect Out[5]: array([1, 1, 1]) In [6]: alt_data.block[0,1] # same thing, still wrong Out[6]: array([1, 1, 1]) In [7]: rdata2.block[0,1] # with my manual workaround, this is right Out[7]: array([4, 5, 6]) The data is: # name x y block - 2x3 ints aaaa 1.0 8.0 1 2 3 4 5 6 ... 
so only rdata2 is correct, the others are repeating the first '1' throughout the entire block, which is the problem. Cheers, f From robert.kern at gmail.com Wed May 27 18:56:39 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 27 May 2009 17:56:39 -0500 Subject: [Numpy-discussion] Numpy vs PIL in image statistics In-Reply-To: References: Message-ID: <3d375d730905271556n3f5b1b9aw551e485a0ab0192@mail.gmail.com> On Wed, May 27, 2009 at 10:33, cp wrote: > Testing the PIL vs numpy in calculating the mean value of each color channel of > an image I timed the following. > > impil = Image.open("10.tif") > imnum = asarray(impil) > > #in PIL > for i in range(1,10): > ? ?stats = ImageStat.Stat(impil) > ? ?stats.mean > > # for numpy > for i in range(1,10): > ? ?imnum.reshape(-1,3).mean(axis=0) > > The image I tested initially is 2000x2000 RGB tif ~11mb in size. I set a timer > in each for loop and measured the performance of numpy 7 times slower than PIL. > When I did the the same with an 10x10 RGB tif and with 1000 cycles in for, numpy > was 25 times faster than PIL. Why is that? Does mean or reshape, make a copy? reshape() might if the array wasn't contiguous. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From stefan at sun.ac.za Wed May 27 18:57:12 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 28 May 2009 00:57:12 +0200 Subject: [Numpy-discussion] Nested recarrays with subarrays and loadtxt: a bug in loadtxt? In-Reply-To: References: Message-ID: <9457e7c80905271557l4eb5e413t966d9f6beee05538@mail.gmail.com> Hi Fernando 2009/5/27 Fernando Perez : > I'm wondering if the code below illustrates a bug in loadtxt, or just > a 'live with it' limitation. I'm not sure whether this is a bug or not. By specifying the dtype > dt = dtype(dict(names=['name','x','y','block'], > formats=['S4',float,float,(int,(2,3))])) you are saying "column four contains 6 integers", which is a bit of a strange notion. If you want this to be interpreted as "the last 6 columns should be stored in block", then a simple modification to flatten_dtype should do the trick. Cheers St?fan From pgmdevlist at gmail.com Wed May 27 19:03:11 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 27 May 2009 19:03:11 -0400 Subject: [Numpy-discussion] Nested recarrays with subarrays and loadtxt: a bug in loadtxt? In-Reply-To: References: <59D32033-6501-4837-97D0-DE5EFCB03FDE@gmail.com> Message-ID: <736CE0B3-2F42-4861-9745-131EAC5DB5EA@gmail.com> On May 27, 2009, at 6:15 PM, Fernando Perez wrote: > Hi Pierre, > > On Wed, May 27, 2009 at 3:01 PM, Pierre GM > wrote: >> Have you tried np.lib.io.genfromtxt ? >> > > I didn't know about it, but it has the same problem as loadtxt: Oh yes indeed. Yet another case of "I-opened-my-mouth-too-soon'... OK, so there's a trick. Kinda: * Define a specific converter: def block_converter(values): # Convert the strings to int val = [int(_) for _ in values.split()] new = np.array(val, dtype=int).reshape(2,3) out = tuple([tuple(_) for _ in new]) return out * Now, make sure that the column-delimiter is set to '\t' and use the new converter data = genfromtxt(txtdata,dt, delimiter="\t", converters={3:block_converter}) That works if your second line is "aaaa 2.0 7.4 2 11 22 3 4 56" instead of "aaaa 2.0 7.4 2 11 22 3 4 5 6" (that is, if you have exactly 6 ints in the last entry, not 7). 
Note that you could modify the converter to deal with that if needed. From fperez.net at gmail.com Wed May 27 19:09:08 2009 From: fperez.net at gmail.com (Fernando Perez) Date: Wed, 27 May 2009 16:09:08 -0700 Subject: [Numpy-discussion] Nested recarrays with subarrays and loadtxt: a bug in loadtxt? In-Reply-To: <9457e7c80905271557l4eb5e413t966d9f6beee05538@mail.gmail.com> References: <9457e7c80905271557l4eb5e413t966d9f6beee05538@mail.gmail.com> Message-ID: Hi Stefan, 2009/5/27 Stéfan van der Walt : > Hi Fernando > > 2009/5/27 Fernando Perez : >> I'm wondering if the code below illustrates a bug in loadtxt, or just >> a 'live with it' limitation. > > I'm not sure whether this is a bug or not. > > By specifying the dtype > >> dt = dtype(dict(names=['name','x','y','block'], >> formats=['S4',float,float,(int,(2,3))])) > > you are saying "column four contains 6 integers", which is a bit of a > strange notion. If you want this to be interpreted as "the last 6 > columns should be stored in block", then a simple modification to > flatten_dtype should do the trick. Well, since dtypes allow for nesting full arrays in this fashion, where I can say that the 'block' field can have (2,3) shape, it seems like it would be nice to be able to express this nesting into loading of plain text files as well. The idea would be that any nested dtype like the above would be expanded out for reading purposes into columns, so that the dt spec is interpreted in the second form you provided. So I'd give it a mild +0.5 for this modification if it's indeed easy, since it seems to make loadtxt more convenient to use for this class of uses. But if people feel it's stretching things too far, there's always either the two-pass hack I used or the custom converter Pierre suggested... Thanks for the feedback! Cheers, f From fperez.net at gmail.com Wed May 27 19:10:16 2009 From: fperez.net at gmail.com (Fernando Perez) Date: Wed, 27 May 2009 16:10:16 -0700 Subject: [Numpy-discussion] Nested recarrays with subarrays and loadtxt: a bug in loadtxt? In-Reply-To: <736CE0B3-2F42-4861-9745-131EAC5DB5EA@gmail.com> References: <59D32033-6501-4837-97D0-DE5EFCB03FDE@gmail.com> <736CE0B3-2F42-4861-9745-131EAC5DB5EA@gmail.com> Message-ID: Hi Pierre, On Wed, May 27, 2009 at 4:03 PM, Pierre GM wrote: > Oh yes indeed. Yet another case of "I-opened-my-mouth-too-soon'... > > OK, so there's a trick. Kinda: > * Define a specific converter: > Thanks, that's an alternative, though I think I prefer my two-pass hack, though I can't quite really say why... Cheers, f From pgmdevlist at gmail.com Wed May 27 19:15:40 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 27 May 2009 19:15:40 -0400 Subject: [Numpy-discussion] Nested recarrays with subarrays and loadtxt: a bug in loadtxt? In-Reply-To: References: <59D32033-6501-4837-97D0-DE5EFCB03FDE@gmail.com> <736CE0B3-2F42-4861-9745-131EAC5DB5EA@gmail.com> Message-ID: <7679EB5C-D5B8-480D-BDF2-EBAC9347A0BA@gmail.com> On May 27, 2009, at 7:10 PM, Fernando Perez wrote: > Hi Pierre, > > On Wed, May 27, 2009 at 4:03 PM, Pierre GM > wrote: >> Oh yes indeed. Yet another case of "I-opened-my-mouth-too-soon'... >> >> OK, so there's a trick. Kinda: >> * Define a specific converter: >> > > Thanks, that's an alternative, though I think I prefer my two-pass > hack, though I can't quite really say why... Funny, I prefer mine ;) Seriously: there might be some overhead in your 2-pass method that might be inconvenient. Some timing would be needed...
From stefan at sun.ac.za Wed May 27 19:29:14 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 28 May 2009 01:29:14 +0200 Subject: [Numpy-discussion] Nested recarrays with subarrays and loadtxt: a bug in loadtxt? In-Reply-To: References: <9457e7c80905271557l4eb5e413t966d9f6beee05538@mail.gmail.com> Message-ID: <9457e7c80905271629x3ef89819s163f7f700966721d@mail.gmail.com> Hi Fernando 2009/5/28 Fernando Perez : > Well, since dtypes allow for nesting full arrays in this fashion, > where I can say that the 'block' field can have (2,3) shape, it seems > like it would be nice to be able to express this nesting into loading > of plain text files as well. I think that would be very useful. Please verify whether http://projects.scipy.org/numpy/changeset/7022 does the trick! Cheers St?fan From fperez.net at gmail.com Wed May 27 19:41:01 2009 From: fperez.net at gmail.com (Fernando Perez) Date: Wed, 27 May 2009 16:41:01 -0700 Subject: [Numpy-discussion] Nested recarrays with subarrays and loadtxt: a bug in loadtxt? In-Reply-To: <9457e7c80905271629x3ef89819s163f7f700966721d@mail.gmail.com> References: <9457e7c80905271557l4eb5e413t966d9f6beee05538@mail.gmail.com> <9457e7c80905271629x3ef89819s163f7f700966721d@mail.gmail.com> Message-ID: 2009/5/27 St?fan van der Walt : > Hi Fernando > > 2009/5/28 Fernando Perez : >> Well, since dtypes allow for nesting full arrays in this fashion, >> where I can say that the 'block' field can have (2,3) shape, it seems >> like it would be nice to be able to express this nesting into loading >> of plain text files as well. > > I think that would be very useful. ?Please verify whether > > http://projects.scipy.org/numpy/changeset/7022 > > does the trick! beeooteefool! No warnings now: uqbar[recarray]> python rec_nested.py Second pass - data loaded OK. This is great, many thanks :) Cheers, f From charlesr.harris at gmail.com Wed May 27 21:38:21 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 27 May 2009 19:38:21 -0600 Subject: [Numpy-discussion] Benchmak on record arrays In-Reply-To: <90DEC74F-9C1A-4E12-8416-B16827934A1F@loria.fr> References: <1243438280.14931.30.camel@sulfur.loria.fr> <90DEC74F-9C1A-4E12-8416-B16827934A1F@loria.fr> Message-ID: On Wed, May 27, 2009 at 1:21 PM, Nicolas Rougier wrote: > > > No, I don't have permission to edit. > Nicolas > You should ask for it then. Email stephan at . The docs are here . Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From faltet at pytables.org Thu May 28 02:44:48 2009 From: faltet at pytables.org (Francesc Alted) Date: Thu, 28 May 2009 08:44:48 +0200 Subject: [Numpy-discussion] numpy ufuncs and COREPY - any info? In-Reply-To: <4A1BEB3F.2020809@indiana.edu> References: <4A1B932C.30504@ar.media.kyoto-u.ac.jp> <4A1BEB3F.2020809@indiana.edu> Message-ID: <200905280844.48853.faltet@pytables.org> A Tuesday 26 May 2009 15:14:39 Andrew Friedley escrigu?: > David Cournapeau wrote: > > Francesc Alted wrote: > >> Well, it is Andrew who should demonstrate that his measurement is > >> correct, but in principle, 4 cycles/item *should* be feasible when using > >> 8 cores in parallel. > > > > But the 100x speed increase is for one core only unless I misread the > > table. And I should have mentioned that 400 cycles/item for cos is on a > > pentium 4, which has dreadful performances (defective L1). 
On a much > > better core duo extreme something, I get 100 cycles / item (on a 64 bits > > machines, though, and not same compiler, although I guess the libm > > version is what matters the most here). > > > > And let's not forget that there is the python wrapping cost: by doing > > everything in C, I got ~ 200 cycle/cos on the PIV, and ~60 cycles/cos on > > the core 2 duo (for double), using the rdtsc performance counter. All > > this for 1024 items in the array, so very optimistic usecase (everything > > in cache 2 if not 1). > > > > This shows that python wrapping cost is not so high, making the 100x > > claim a bit doubtful without more details on the way to measure speed. > > I appreciate all the discussion this is creating. I wish I could work > on this more right now; I have a big paper deadline coming up June 1 > that I need to focus on. > > Yes, you're reading the table right. I should have been more clear on > what my implementation is doing. It's using SIMD, so performing 4 > cosine's at a time where a libm cosine is only doing one. Also I don't > think libm trancendentals are known for being fast; I'm also likely > gaining performance by using a well-optimized but less accurate > approximation. In fact a little more inspection shows my accuracy > decreases as the input values increase; I will probably need to take a > performance hit to fix this. > > I went and wrote code to use the libm fcos() routine instead of my cos > code. Performance is equivalent to numpy, plus an overhead: > > inp sizes 1024 10240 102400 1024000 3072000 > numpy 0.7282 9.6278 115.5976 993.5738 3017.3680 > > lmcos 1 0.7594 9.7579 116.7135 1039.5783 3156.8371 > lmcos 2 0.5274 5.7885 61.8052 537.8451 1576.2057 > lmcos 4 0.5172 5.1240 40.5018 313.2487 791.9730 > > corepy 1 0.0142 0.0880 0.9566 9.6162 28.4972 > corepy 2 0.0342 0.0754 0.6991 6.1647 15.3545 > corepy 4 0.0596 0.0963 0.5671 4.9499 13.8784 > > > The times I show are in milliseconds; the system used is a dual-socket > dual-core 2ghz opteron. I'm testing at the ufunc level, like this: > > def benchmark(fn, args): > avgtime = 0 > fn(*args) > > for i in xrange(7): > t1 = time.time() > fn(*args) > t2 = time.time() > > tm = t2 - t1 > avgtime += tm > > return avgtime / 7 > > Where fn is a ufunc, ie numpy.cos. So I prime the execution once, then > do 7 timings and take the average. I always appreciate suggestions on > better way to benchmark things. No, that seems good enough. But maybe you can present results in cycles/item. This is a relatively common unit and has the advantage that it does not depend on the frequency of your cores. -- Francesc Alted From david at ar.media.kyoto-u.ac.jp Thu May 28 03:02:12 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 28 May 2009 16:02:12 +0900 Subject: [Numpy-discussion] numpy ufuncs and COREPY - any info? In-Reply-To: <200905280844.48853.faltet@pytables.org> References: <4A1B932C.30504@ar.media.kyoto-u.ac.jp> <4A1BEB3F.2020809@indiana.edu> <200905280844.48853.faltet@pytables.org> Message-ID: <4A1E36F4.2010209@ar.media.kyoto-u.ac.jp> Francesc Alted wrote: > A Tuesday 26 May 2009 15:14:39 Andrew Friedley escrigu?: > >> David Cournapeau wrote: >> >>> Francesc Alted wrote: >>> >>>> Well, it is Andrew who should demonstrate that his measurement is >>>> correct, but in principle, 4 cycles/item *should* be feasible when using >>>> 8 cores in parallel. >>>> >>> But the 100x speed increase is for one core only unless I misread the >>> table. 
And I should have mentioned that 400 cycles/item for cos is on a >>> pentium 4, which has dreadful performances (defective L1). On a much >>> better core duo extreme something, I get 100 cycles / item (on a 64 bits >>> machines, though, and not same compiler, although I guess the libm >>> version is what matters the most here). >>> >>> And let's not forget that there is the python wrapping cost: by doing >>> everything in C, I got ~ 200 cycle/cos on the PIV, and ~60 cycles/cos on >>> the core 2 duo (for double), using the rdtsc performance counter. All >>> this for 1024 items in the array, so very optimistic usecase (everything >>> in cache 2 if not 1). >>> >>> This shows that python wrapping cost is not so high, making the 100x >>> claim a bit doubtful without more details on the way to measure speed. >>> >> I appreciate all the discussion this is creating. I wish I could work >> on this more right now; I have a big paper deadline coming up June 1 >> that I need to focus on. >> >> Yes, you're reading the table right. I should have been more clear on >> what my implementation is doing. It's using SIMD, so performing 4 >> cosine's at a time where a libm cosine is only doing one. Also I don't >> think libm trancendentals are known for being fast; I'm also likely >> gaining performance by using a well-optimized but less accurate >> approximation. In fact a little more inspection shows my accuracy >> decreases as the input values increase; I will probably need to take a >> performance hit to fix this. >> >> I went and wrote code to use the libm fcos() routine instead of my cos >> code. Performance is equivalent to numpy, plus an overhead: >> >> inp sizes 1024 10240 102400 1024000 3072000 >> numpy 0.7282 9.6278 115.5976 993.5738 3017.3680 >> >> lmcos 1 0.7594 9.7579 116.7135 1039.5783 3156.8371 >> lmcos 2 0.5274 5.7885 61.8052 537.8451 1576.2057 >> lmcos 4 0.5172 5.1240 40.5018 313.2487 791.9730 >> >> corepy 1 0.0142 0.0880 0.9566 9.6162 28.4972 >> corepy 2 0.0342 0.0754 0.6991 6.1647 15.3545 >> corepy 4 0.0596 0.0963 0.5671 4.9499 13.8784 >> >> >> The times I show are in milliseconds; the system used is a dual-socket >> dual-core 2ghz opteron. I'm testing at the ufunc level, like this: >> >> def benchmark(fn, args): >> avgtime = 0 >> fn(*args) >> >> for i in xrange(7): >> t1 = time.time() >> fn(*args) >> t2 = time.time() >> >> tm = t2 - t1 >> avgtime += tm >> >> return avgtime / 7 >> >> Where fn is a ufunc, ie numpy.cos. So I prime the execution once, then >> do 7 timings and take the average. I always appreciate suggestions on >> better way to benchmark things. >> > > No, that seems good enough. But maybe you can present results in cycles/item. > This is a relatively common unit and has the advantage that it does not depend > on the frequency of your cores. > (it seems that I do not receive all emails - I never get the emails from Andrew ?) Concerning the timing: I think generally, you should report the minimum, not the average. The numbers for numpy are strange: 3s to compute 3e6 cos on a 2Ghz core duo (~2000 cycles/item) is very slow. In that sense, taking 20 cycles/item for your optimized version is much more believable, though :) I know the usual libm functions are not super fast, specially if high accuracy is not needed. Music softwares and games usually go away with approximations which are quite fast (.e.g using cos+sin evaluation at the same time), but those are generally unacceptable for scientific usage. 
I think it is critical to always check the result of your implementation, because getting something fast but wrong can waste a lot of your time :) One thing which may be hard to do is correct nan/inf handling. I don't know how SIMD extensions handle this. cheers, David From matthieu.brucher at gmail.com Thu May 28 03:49:31 2009 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Thu, 28 May 2009 09:49:31 +0200 Subject: [Numpy-discussion] Failure with 1.3 In-Reply-To: References: Message-ID: > This is the first report. I'll guess it is related to icc. What happens if > you use gcc? Indeed, with gcc4.1, the error isn't there. Matthieu -- Information System Engineer, Ph.D. Website: http://matthieu-brucher.developpez.com/ Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn: http://www.linkedin.com/in/matthieubrucher From lubensch.proletariat.inc at gmail.com Thu May 28 04:51:09 2009 From: lubensch.proletariat.inc at gmail.com (cp) Date: Thu, 28 May 2009 08:51:09 +0000 (UTC) Subject: [Numpy-discussion] Numpy vs PIL in image statistics References: <3d375d730905271556n3f5b1b9aw551e485a0ab0192@mail.gmail.com> Message-ID: >> The image I tested initially is 2000x2000 RGB tif ~11mb in size. I continued testing, with the initial PIL approach and 3 alternative numpy scripts: #Script 1 - indexing for i in range(10): imarr[:,:,0].mean() imarr[:,:,1].mean() imarr[:,:,2].mean() #Script 2 - slicing for i in range(10): imarr[:,:,0:1].mean() imarr[:,:,1:2].mean() imarr[:,:,2:3].mean() #Script 3 - reshape for i in range(10): imarr.reshape(-1,3).mean(axis=0) #Script 4 - PIL for i in range(10): stats = ImageStat.stat(img) stats.mean After profiling the four scripts separately I got the following script 1: 5.432sec script 2: 10.234sec script 3: 4.980sec script 4: 0.741sec when I profiled scripts 1-3 without calculating the mean, I got similar results of about 0.45sec for 1000 cycles, meaning that even if there is a copy involved the time required is only a small fraction of the whole procedure.Getting back to my initial statement I cannot explain why PIL is very fast in calculations for whole images, but very slow in calculations of small sub-images. From david at ar.media.kyoto-u.ac.jp Thu May 28 04:40:04 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 28 May 2009 17:40:04 +0900 Subject: [Numpy-discussion] Numpy vs PIL in image statistics In-Reply-To: References: <3d375d730905271556n3f5b1b9aw551e485a0ab0192@mail.gmail.com> Message-ID: <4A1E4DE4.6090200@ar.media.kyoto-u.ac.jp> cp wrote: >>> The image I tested initially is 2000x2000 RGB tif ~11mb in size. 
>>> > I continued testing, with the initial PIL approach > and 3 alternative numpy scripts: > > #Script 1 - indexing > for i in range(10): > imarr[:,:,0].mean() > imarr[:,:,1].mean() > imarr[:,:,2].mean() > > #Script 2 - slicing > for i in range(10): > imarr[:,:,0:1].mean() > imarr[:,:,1:2].mean() > imarr[:,:,2:3].mean() > > #Script 3 - reshape > for i in range(10): > imarr.reshape(-1,3).mean(axis=0) > > #Script 4 - PIL > for i in range(10): > stats = ImageStat.stat(img) > stats.mean > > After profiling the four scripts separately I got the following > script 1: 5.432sec > script 2: 10.234sec > script 3: 4.980sec > script 4: 0.741sec > > when I profiled scripts 1-3 without calculating the mean, I got similar > results of about 0.45sec for 1000 cycles, meaning that even if there > is a copy involved the time required is only a small fraction of the whole > procedure.Getting back to my initial statement I cannot explain why PIL > is very fast in calculations for whole images, but very slow in > calculations of small sub-images. > I don't know anything about PIL and its implementation, but I would not be surprised if the cost is mostly accessing items which are not contiguous in memory and bounds checking ( to check where you are in the subimage). Conditional inside loops often kills performances, and the actual computation (one addition/item for naive average implementation) is negligeable in this case. cheers, David From lubensch.proletariat.inc at gmail.com Thu May 28 05:10:28 2009 From: lubensch.proletariat.inc at gmail.com (cp) Date: Thu, 28 May 2009 09:10:28 +0000 (UTC) Subject: [Numpy-discussion] Numpy vs PIL in image statistics References: <3d375d730905271556n3f5b1b9aw551e485a0ab0192@mail.gmail.com> <4A1E4DE4.6090200@ar.media.kyoto-u.ac.jp> Message-ID: > I don't know anything about PIL and its implementation, but I would not > be surprised if the cost is mostly accessing items which are not > contiguous in memory and bounds checking ( to check where you are in the > subimage). Conditional inside loops often kills performances, and the > actual computation (one addition/item for naive average implementation) > is negligeable in this case. This would definitely be the case in sub-images. However, coming back to my original question, how do you explain that although PIL is extremely fast in large images (2000x2000), numpy is much faster when it comes down to very small images (I tested with 10x10 image files)? From stefan at sun.ac.za Thu May 28 05:21:32 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 28 May 2009 11:21:32 +0200 Subject: [Numpy-discussion] Benchmak on record arrays In-Reply-To: <90DEC74F-9C1A-4E12-8416-B16827934A1F@loria.fr> References: <1243438280.14931.30.camel@sulfur.loria.fr> <90DEC74F-9C1A-4E12-8416-B16827934A1F@loria.fr> Message-ID: <9457e7c80905280221p75b8d3d6m9fdf96e908e2b5ff@mail.gmail.com> Hi Nicolas 2009/5/27 Nicolas Rougier : > No, I don't have permission to edit. Thanks for helping out with the docs! Please create an account on docs.scipy.org and give me a shout when you're done. Cheers St?fan From afriedle at indiana.edu Thu May 28 08:34:11 2009 From: afriedle at indiana.edu (Andrew Friedley) Date: Thu, 28 May 2009 08:34:11 -0400 Subject: [Numpy-discussion] numpy ufuncs and COREPY - any info? 
In-Reply-To: <4A1E36F4.2010209@ar.media.kyoto-u.ac.jp> References: <4A1B932C.30504@ar.media.kyoto-u.ac.jp> <4A1BEB3F.2020809@indiana.edu> <200905280844.48853.faltet@pytables.org> <4A1E36F4.2010209@ar.media.kyoto-u.ac.jp> Message-ID: <4A1E84C3.1040909@indiana.edu> David Cournapeau wrote: > Francesc Alted wrote: >> No, that seems good enough. But maybe you can present results in cycles/item. >> This is a relatively common unit and has the advantage that it does not depend >> on the frequency of your cores. Sure, cycles is fine, but I'll argue that in this case the number still does depend on the frequency of the cores, particularly as it relates to the frequency of the memory bus/controllers. A processor with a higher clock rate and higher multiplier may show lower performance when measuring in cycles because the memory bandwidth has not necessarily increased, only the CPU clock rate. Plus between say a xeon and opteron you will have different SSE performance characteristics. So really, any sole number/unit is not sufficient without also describing the system it was obtained on :) > (it seems that I do not receive all emails - I never get the emails from > Andrew ?) I seem to have issues with my emails just disappearing; sometimes they never appear on the list and I have to re-send them. > Concerning the timing: I think generally, you should report the minimum, > not the average. The numbers for numpy are strange: 3s to compute 3e6 > cos on a 2Ghz core duo (~2000 cycles/item) is very slow. In that sense, > taking 20 cycles/item for your optimized version is much more > believable, though :) I can do minimum. My motivation for average was to show a common-case performance an application might see. If that application executes the ufunc many times, the performance will tend towards the average. > I know the usual libm functions are not super fast, specially if high > accuracy is not needed. Music softwares and games usually go away with > approximations which are quite fast (.e.g using cos+sin evaluation at > the same time), but those are generally unacceptable for scientific > usage. I think it is critical to always check the result of your > implementation, because getting something fast but wrong can waste a lot > of your time :) One thing which may be hard to do is correct nan/inf > handling. I don't know how SIMD extensions handle this. I was waiting for someone to bring this up :) I used an implementation that I'm now thinking is not accurate enough for scientific use. But the question is, what is a concrete measure for determining whether some cosine (or other function) implementation is accurate enough? I guess we have precedent in the form of libm's implementation/accuracy tradeoffs, but is that precedent correct? Really answering that question, and coming up with the best possible implementations that meet the requirements, is probably a GSoC project on its own. 
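For concreteness, a small self-contained sketch of both points raised in this thread: take the minimum over repetitions when timing, and attach a concrete accuracy figure to any fast approximation by comparing it against numpy.cos. The fast_cos below is a deliberately crude truncated-Taylor stand-in, not the CorePy implementation:

import time
import numpy as np

def fast_cos(x):
    # crude 4th-order Taylor approximation after range reduction;
    # only a placeholder for a real SIMD/approximate implementation
    x = np.mod(x + np.pi, 2*np.pi) - np.pi
    x2 = x*x
    return 1 - x2/2 + x2*x2/24

def benchmark(fn, args, reps=7):
    fn(*args)                      # prime the first call
    times = []
    for i in xrange(reps):
        t1 = time.time()
        fn(*args)
        times.append(time.time() - t1)
    return min(times)              # report the minimum, not the average

x = np.linspace(0, 2*np.pi, 1024000)
print 'numpy.cos:', benchmark(np.cos, (x,))
print 'fast_cos :', benchmark(fast_cos, (x,))

# accuracy figure relative to the libm-backed reference
err = np.abs(fast_cos(x) - np.cos(x))
print 'max abs error:', err.max()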
Andrew From Nicolas.Rougier at loria.fr Thu May 28 10:14:12 2009 From: Nicolas.Rougier at loria.fr (Nicolas Rougier) Date: Thu, 28 May 2009 16:14:12 +0200 Subject: [Numpy-discussion] sparse matrix dot product Message-ID: <1243520052.14931.36.camel@sulfur.loria.fr> Hi, I'm now testing dot product and using the following: import numpy as np, scipy.sparse as sp A = np.matrix(np.zeros((5,10))) B = np.zeros((10,1)) print (A*B).shape print np.dot(A,B).shape A = sp.csr_matrix(np.zeros((5,10))) B = sp.csr_matrix((10,1)) print (A*B).shape print np.dot(A,B).shape A = sp.csr_matrix(np.zeros((5,10))) B = np.zeros((10,1)) print (A*B).shape print np.dot(A,B).shape I got: (5, 1) (5, 1) (5, 1) (5, 1) (5, 1) (10, 1) Obviously, the last computation is not a dot product, but I got no warning at all. Is that the expected behavior ? By the way, I wrote a speed benchmark for dot product using the different flavors of sparse matrices and I wonder if it should go somewhere in documentation (in anycase, if anyone interested, I can post the benchmark program and result). Nicolas From Nicolas.Rougier at loria.fr Thu May 28 10:14:57 2009 From: Nicolas.Rougier at loria.fr (Nicolas Rougier) Date: Thu, 28 May 2009 16:14:57 +0200 Subject: [Numpy-discussion] Benchmak on record arrays In-Reply-To: <9457e7c80905280221p75b8d3d6m9fdf96e908e2b5ff@mail.gmail.com> References: <1243438280.14931.30.camel@sulfur.loria.fr> <90DEC74F-9C1A-4E12-8416-B16827934A1F@loria.fr> <9457e7c80905280221p75b8d3d6m9fdf96e908e2b5ff@mail.gmail.com> Message-ID: <1243520097.14931.37.camel@sulfur.loria.fr> I just created the account. Nicolas On Thu, 2009-05-28 at 11:21 +0200, St?fan van der Walt wrote: > Hi Nicolas > > 2009/5/27 Nicolas Rougier : > > No, I don't have permission to edit. > > Thanks for helping out with the docs! Please create an account on > docs.scipy.org and give me a shout when you're done. > > Cheers > St?fan > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From sebastian.walter at gmail.com Thu May 28 10:26:28 2009 From: sebastian.walter at gmail.com (Sebastian Walter) Date: Thu, 28 May 2009 16:26:28 +0200 Subject: [Numpy-discussion] sparse matrix dot product In-Reply-To: <1243520052.14931.36.camel@sulfur.loria.fr> References: <1243520052.14931.36.camel@sulfur.loria.fr> Message-ID: I'd be interested to see the benchmark ;) On Thu, May 28, 2009 at 4:14 PM, Nicolas Rougier wrote: > > Hi, > > I'm now testing dot product and using the following: > > import numpy as np, scipy.sparse as sp > > A = np.matrix(np.zeros((5,10))) > B = np.zeros((10,1)) > print (A*B).shape > print np.dot(A,B).shape > > A = sp.csr_matrix(np.zeros((5,10))) > B = sp.csr_matrix((10,1)) > print (A*B).shape > print np.dot(A,B).shape > > A = sp.csr_matrix(np.zeros((5,10))) > B = np.zeros((10,1)) > print (A*B).shape > print np.dot(A,B).shape > > > I got: > > (5, 1) > (5, 1) > (5, 1) > (5, 1) > (5, 1) > (10, 1) > > > Obviously, the last computation is not a dot product, but I got no > warning at all. Is that the expected behavior ? > > > By the way, I wrote a speed benchmark for dot product using the > different flavors of sparse matrices and I wonder if it should go > somewhere in documentation (in anycase, if anyone interested, I can post > the benchmark program and result). 
> > Nicolas > > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From david.froger.info at gmail.com Thu May 28 10:32:49 2009 From: david.froger.info at gmail.com (David Froger) Date: Thu, 28 May 2009 16:32:49 +0200 Subject: [Numpy-discussion] example reading binary Fortran file In-Reply-To: References: <6df541a5c8d6ecf26b6e38f404958401.squirrel@webmail.uio.no> <200902031015.17490.faltet@pytables.org> Message-ID: Hy Neil Martinsen-Burrell, I'm trying the FortranFile class, http://www.scipy.org/Cookbook/FortranIO/FortranFile It looks like there are some bug in the last revision (7): * There are errors cause by lines 60,61,63 in * There are indentation errors on lines 97 and 113. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rogpeppe at gmail.com Thu May 28 10:39:01 2009 From: rogpeppe at gmail.com (roger peppe) Date: Thu, 28 May 2009 15:39:01 +0100 Subject: [Numpy-discussion] convert between structure arrays with different record orderings? Message-ID: hi, sorry, i'm new to the list, and if this is a frequently asked question, please point me in the right direction. say, for some reason i've got two numpy structure arrays that both contain the same fields with the same types but in a different order, is there a simple way to convert one to the other type? i'd have thought that astype() might do the job, but it seems to ignore the field names entirely, thus values are lost in the following transcript: > t1 = np.dtype([('foo', np.float64), ('bar', '?')]) > t2 = np.dtype([('bar', '?'), ('foo', np.float64)]) > a1 = np.array([(123, False), (654, True)], dtype=t1) > a1.astype(t2) array([(True, 0.0), (True, 1.0)], dtype=[('bar', '|b1'), ('foo', ' thanks for any help, rog. From nmb at wartburg.edu Thu May 28 12:27:49 2009 From: nmb at wartburg.edu (Neil Martinsen-Burrell) Date: Thu, 28 May 2009 11:27:49 -0500 Subject: [Numpy-discussion] example reading binary Fortran file In-Reply-To: References: <6df541a5c8d6ecf26b6e38f404958401.squirrel@webmail.uio.no> <200902031015.17490.faltet@pytables.org> Message-ID: <4A1EBB85.5080706@wartburg.edu> On 2009-05-28 09:32 , David Froger wrote: > Hy Neil Martinsen-Burrell, > > I'm trying the FortranFile class, > http://www.scipy.org/Cookbook/FortranIO/FortranFile > > It looks like there are some bug in the last revision (7): > > * There are errors cause by lines 60,61,63 in > > * There are indentation errors on lines 97 and 113. There seem to have been some problems in putting the file on the wiki ("Proxy-Connection: keep-alive\nCache-Control: max-age=0" seems to come from an HTML communication). I've attached my current version of the file to this email. Let me know if you have problems with this. I will try to get the working version up on the wiki. Peace, -Neil -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
Name: fortranfile.py URL: From david.froger.info at gmail.com Thu May 28 13:11:31 2009 From: david.froger.info at gmail.com (David Froger) Date: Thu, 28 May 2009 19:11:31 +0200 Subject: [Numpy-discussion] example reading binary Fortran file In-Reply-To: <4A1EBB85.5080706@wartburg.edu> References: <6df541a5c8d6ecf26b6e38f404958401.squirrel@webmail.uio.no> <200902031015.17490.faltet@pytables.org> <4A1EBB85.5080706@wartburg.edu> Message-ID: Thank you very much :-) 2009/5/28 Neil Martinsen-Burrell > On 2009-05-28 09:32 , David Froger wrote: > >> Hy Neil Martinsen-Burrell, >> >> I'm trying the FortranFile class, >> http://www.scipy.org/Cookbook/FortranIO/FortranFile >> >> It looks like there are some bug in the last revision (7): >> >> * There are errors cause by lines 60,61,63 in >> >> * There are indentation errors on lines 97 and 113. >> > > There seem to have been some problems in putting the file on the wiki > ("Proxy-Connection: keep-alive\nCache-Control: max-age=0" seems to come from > an HTML communication). I've attached my current version of the file to > this email. Let me know if you have problems with this. I will try to get > the working version up on the wiki. Peace, > > -Neil > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nmb at wartburg.edu Thu May 28 13:13:04 2009 From: nmb at wartburg.edu (Neil Martinsen-Burrell) Date: Thu, 28 May 2009 12:13:04 -0500 Subject: [Numpy-discussion] example reading binary Fortran file In-Reply-To: References: <6df541a5c8d6ecf26b6e38f404958401.squirrel@webmail.uio.no> <200902031015.17490.faltet@pytables.org> <4A1EBB85.5080706@wartburg.edu> Message-ID: <4A1EC620.3070407@wartburg.edu> On 2009-05-28 12:11 , David Froger wrote: > Thank you very much :-) Things should be cleared up now on the wiki as well. Peace, -Neil From faltet at pytables.org Thu May 28 13:25:42 2009 From: faltet at pytables.org (Francesc Alted) Date: Thu, 28 May 2009 19:25:42 +0200 Subject: [Numpy-discussion] Benchmak on record arrays In-Reply-To: <1243438280.14931.30.camel@sulfur.loria.fr> References: <1243438280.14931.30.camel@sulfur.loria.fr> Message-ID: <200905281925.42827.faltet@pytables.org> A Wednesday 27 May 2009 17:31:20 Nicolas Rougier escrigu?: > Hi, > > I've written a very simple benchmark on recarrays: > > import numpy, time > > Z = numpy.zeros((100,100), dtype=numpy.float64) > Z_fast = numpy.zeros((100,100), dtype=[('x',numpy.float64), > ('y',numpy.int32)]) > Z_slow = numpy.zeros((100,100), dtype=[('x',numpy.float64), > ('y',numpy.bool)]) > > t = time.clock() > for i in range(10000): Z*Z > print time.clock()-t > > t = time.clock() > for i in range(10000): Z_fast['x']*Z_fast['x'] > print time.clock()-t > > t = time.clock() > for i in range(10000): Z_slow['x']*Z_slow['x'] > print time.clock()-t > > > And got the following results: > 0.23 > 0.37 > 3.96 > > Am I right in thinking that the last case is quite slow because of some > memory misalignment between float64 and bool or is there some machinery > behind that makes things slow in this case ? Should this be mentioned > somewhere in the recarray documentation ? Yes, I can reproduce your results, and I must admit that a 10x slowdown is a lot. However, I think that this affects mostly to small record arrays (i.e. 
those that fit in CPU cache), and mainly in benchmarks (precisely because they fit well in cache). You can simulate a more real-life scenario by defining a large recarray that do not fit in CPU's cache. For example: In [17]: Z = np.zeros((1000,1000), dtype=np.float64) # 8 MB object In [18]: Z_fast = np.zeros((1000,1000), dtype=[('x',np.float64), ('y',np.int64)]) # 16 MB object In [19]: Z_slow = np.zeros((1000,1000), dtype=[('x',np.float64), ('y',np.bool)]) # 9 MB object In [20]: x_fast = Z_fast['x'] In [21]: timeit x_fast * x_fast 100 loops, best of 3: 5.48 ms per loop In [22]: x_slow = Z_slow['x'] In [23]: timeit x_slow * x_slow 100 loops, best of 3: 14.4 ms per loop So, the slowdown is less than 3x, which is a more reasonable figure. If you need optimal speed for operating with unaligned columns, you can use numexpr. Here it is an example of what you can expect from it: In [24]: import numexpr as nx In [25]: timeit nx.evaluate('x_slow * x_slow') 100 loops, best of 3: 11.1 ms per loop So, the slowdown is just 2x instead of 3x, which is near optimal for the unaligned case. Numexpr also seems to help for small recarrays that fits in cache (i.e. for benchmarking purposes ;) : # Create a 160 KB object In [26]: Z_fast = np.zeros((100,100), dtype=[('x',np.float64),('y',np.int64)]) # Create a 110 KB object In [27]: Z_slow = np.zeros((100,100), dtype=[('x',np.float64),('y',np.bool)]) In [28]: x_fast = Z_fast['x'] In [29]: timeit x_fast * x_fast 10000 loops, best of 3: 20.7 ?s per loop In [30]: x_slow = Z_slow['x'] In [31]: timeit x_slow * x_slow 10000 loops, best of 3: 149 ?s per loop In [32]: timeit nx.evaluate('x_slow * x_slow') 10000 loops, best of 3: 45.3 ?s per loop Hope that helps, -- Francesc Alted From david.froger.info at gmail.com Thu May 28 13:48:51 2009 From: david.froger.info at gmail.com (David Froger) Date: Thu, 28 May 2009 19:48:51 +0200 Subject: [Numpy-discussion] example reading binary Fortran file In-Reply-To: References: <6df541a5c8d6ecf26b6e38f404958401.squirrel@webmail.uio.no> <200902031015.17490.faltet@pytables.org> <4A1EBB85.5080706@wartburg.edu> Message-ID: Sorry, I still don't understand how to use FortranFile ... ============ The fortran code ============ program writeArray implicit none integer,parameter:: nx=2,ny=5 real(4),dimension(nx,ny):: ux,uy,p integer :: i,j do i = 1,nx do j = 1,ny ux(i,j) = 100. + j+(i-1.)*10. uy(i,j) = 200. + j+(i-1.)*10. p(i,j) = 300. + j+(i-1.)*10. enddo enddo open(11,file='uxuyp.bin',form='unformatted') write(11) ux,uy write(11) p close(11) end program writeArray ============= The Python script ============= from fortranfile import FortranFile f = FortranFile('uxuyp.bin') x = f.readReals() ============= The output ============= Traceback (most recent call last): File "readArray.py", line 5, in x = f.readReals() File "/home/users/redone/file2/froger/travail/codes/lib/Tests/fortranread/fortranfile.py", line 181, in readReals data_str = self.readRecord() File "/home/users/redone/file2/froger/travail/codes/lib/Tests/fortranread/fortranfile.py", line 128, in readRecord raise IOError('Could not read enough data') IOError: Could not read enough dat => How to read the file 'uxuyp.bin' ? 
2009/5/28 David Froger > Thank you very much :-) > > 2009/5/28 Neil Martinsen-Burrell > >> On 2009-05-28 09:32 , David Froger wrote: >> >>> Hy Neil Martinsen-Burrell, >>> >>> I'm trying the FortranFile class, >>> http://www.scipy.org/Cookbook/FortranIO/FortranFile >>> >>> It looks like there are some bug in the last revision (7): >>> >>> * There are errors cause by lines 60,61,63 in >>> >>> * There are indentation errors on lines 97 and 113. >>> >> >> There seem to have been some problems in putting the file on the wiki >> ("Proxy-Connection: keep-alive\nCache-Control: max-age=0" seems to come from >> an HTML communication). I've attached my current version of the file to >> this email. Let me know if you have problems with this. I will try to get >> the working version up on the wiki. Peace, >> >> -Neil >> >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wnbell at gmail.com Thu May 28 16:45:01 2009 From: wnbell at gmail.com (Nathan Bell) Date: Thu, 28 May 2009 16:45:01 -0400 Subject: [Numpy-discussion] sparse matrix dot product In-Reply-To: <1243520052.14931.36.camel@sulfur.loria.fr> References: <1243520052.14931.36.camel@sulfur.loria.fr> Message-ID: On Thu, May 28, 2009 at 10:14 AM, Nicolas Rougier wrote: > > Obviously, the last computation is not a dot product, but I got no > warning at all. Is that the expected behavior ? > Sparse matrices make no attempt to work with numpy functions like dot(), so I'm not sure what is happening there. > > By the way, I wrote a speed benchmark for dot product using the > different flavors of sparse matrices and I wonder if it should go > somewhere in documentation (in anycase, if anyone interested, I can post > the benchmark program and result). > Sparse dot products will ultimately map to sparse matrix multiplication, so I'd imagine your best bet is to use A.T * B (for column matrices A and B in csc_matrix format). -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From pgmdevlist at gmail.com Thu May 28 18:15:44 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Thu, 28 May 2009 18:15:44 -0400 Subject: [Numpy-discussion] Nested recarrays with subarrays and loadtxt: a bug in loadtxt? In-Reply-To: <9457e7c80905271629x3ef89819s163f7f700966721d@mail.gmail.com> References: <9457e7c80905271557l4eb5e413t966d9f6beee05538@mail.gmail.com> <9457e7c80905271629x3ef89819s163f7f700966721d@mail.gmail.com> Message-ID: On May 27, 2009, at 7:29 PM, St?fan van der Walt wrote: > Hi Fernando > > 2009/5/28 Fernando Perez : >> Well, since dtypes allow for nesting full arrays in this fashion, >> where I can say that the 'block' field can have (2,3) shape, it seems >> like it would be nice to be able to express this nesting into loading >> of plain text files as well. > > I think that would be very useful. Please verify whether > > http://projects.scipy.org/numpy/changeset/7022 I fixed it for genfromtxt as well (r7023). Should we backport the changes ? From david at ar.media.kyoto-u.ac.jp Thu May 28 21:56:15 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Fri, 29 May 2009 10:56:15 +0900 Subject: [Numpy-discussion] numpy ufuncs and COREPY - any info? 
In-Reply-To: <4A1E84C3.1040909@indiana.edu> References: <4A1B932C.30504@ar.media.kyoto-u.ac.jp> <4A1BEB3F.2020809@indiana.edu> <200905280844.48853.faltet@pytables.org> <4A1E36F4.2010209@ar.media.kyoto-u.ac.jp> <4A1E84C3.1040909@indiana.edu> Message-ID: <4A1F40BF.8080807@ar.media.kyoto-u.ac.jp> Andrew Friedley wrote: > David Cournapeau wrote: > >> Francesc Alted wrote: >> >>> No, that seems good enough. But maybe you can present results in cycles/item. >>> This is a relatively common unit and has the advantage that it does not depend >>> on the frequency of your cores. >>> > > Sure, cycles is fine, but I'll argue that in this case the number still > does depend on the frequency of the cores, particularly as it relates to > the frequency of the memory bus/controllers. A processor with a higher > clock rate and higher multiplier may show lower performance when > measuring in cycles because the memory bandwidth has not necessarily > increased, only the CPU clock rate. Plus between say a xeon and opteron > you will have different SSE performance characteristics. So really, any > sole number/unit is not sufficient without also describing the system it > was obtained on :) > Yes, that's why people usually add the CPU type with the cycles/operation count :) It makes comparison easier. Sure, the comparison is not accurate because differences in CPU may make a difference. But with cycles/computation, we could see right away that something was strange with the numpy timing, so I think it is a better representation for discussion/comoparison. > > I can do minimum. My motivation for average was to show a common-case > performance an application might see. If that application executes the > ufunc many times, the performance will tend towards the average. > The rationale for minimum is to remove external factors like other tasks taking CPU, etc... > > I was waiting for someone to bring this up :) I used an implementation > that I'm now thinking is not accurate enough for scientific use. But > the question is, what is a concrete measure for determining whether some > cosine (or other function) implementation is accurate enough? Nan/inf/zero handling should be tested for every function (the exact behavior for standard functions is part of the C standard), and then, the particular values depend on the function and implementation. If your implementation has several codepath, each codepath should be tested. But really, most implementations just test for a few more or less random known values. I know the GNU libc has some tests for the math library, for example. For single precision, brute force testing against a reference implementation for every possible input is actually feasible, too :) David From stefan at sun.ac.za Fri May 29 03:14:15 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Fri, 29 May 2009 09:14:15 +0200 Subject: [Numpy-discussion] Benchmak on record arrays In-Reply-To: <1243520097.14931.37.camel@sulfur.loria.fr> References: <1243438280.14931.30.camel@sulfur.loria.fr> <90DEC74F-9C1A-4E12-8416-B16827934A1F@loria.fr> <9457e7c80905280221p75b8d3d6m9fdf96e908e2b5ff@mail.gmail.com> <1243520097.14931.37.camel@sulfur.loria.fr> Message-ID: <9457e7c80905290014q700fa0acm34826a1e22c896a1@mail.gmail.com> Thanks, Nicolas. Your username has been changed to "NicolasRougier" and you can now edit the docs. Regards St?fan 2009/5/28 Nicolas Rougier : > > > I just created the account. 
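Coming back to the brute-force accuracy test mentioned above: a rough sketch of sweeping every finite float32 input and comparing a candidate cosine against numpy's double-precision cos as the reference (the function name and chunk size are arbitrary; this measures maximum absolute error rather than ulps, and it takes a while in pure numpy):

import numpy as np

def brute_force_max_err(fast_cos, chunk=1 << 22):
    # sweep all ~4.3e9 float32 bit patterns in chunks, skip inf/nan inputs,
    # and compare against numpy's float64 cos as the reference
    worst = 0.0
    for start in range(0, 1 << 32, chunk):
        bits = np.arange(start, start + chunk, dtype=np.int64).astype(np.uint32)
        x = bits.view(np.float32)
        x = x[np.isfinite(x)]
        if x.size == 0:
            continue
        ref = np.cos(x.astype(np.float64))
        worst = max(worst, float(np.max(np.abs(fast_cos(x) - ref))))
    return worst

# e.g. measure numpy's own single-precision cos against its double-precision one
print(brute_force_max_err(np.cos))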
> > Nicolas From dagss at student.matnat.uio.no Fri May 29 04:36:41 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Fri, 29 May 2009 10:36:41 +0200 Subject: [Numpy-discussion] Numpy vs PIL in image statistics In-Reply-To: References: <3d375d730905271556n3f5b1b9aw551e485a0ab0192@mail.gmail.com> <4A1E4DE4.6090200@ar.media.kyoto-u.ac.jp> Message-ID: <4A1F9E99.6040206@student.matnat.uio.no> cp wrote: >> I don't know anything about PIL and its implementation, but I would not >> be surprised if the cost is mostly accessing items which are not >> contiguous in memory and bounds checking ( to check where you are in the >> subimage). Conditional inside loops often kills performances, and the >> actual computation (one addition/item for naive average implementation) >> is negligeable in this case. >> > > This would definitely be the case in sub-images. However, coming back to my > original question, how do you explain that although PIL is extremely fast in > large images (2000x2000), numpy is much faster when it comes down to very small > images (I tested with 10x10 image files)? > I've heard rumors that PIL stores it's images as pointers-to-rows (or cols), i.e. to access a new row you need to dereference a pointer. NumPy on the other hand always stores its memory in blocks. When N grows larger, the N pointer lookups needed in PIL doesn't matter, but they do for low N. Dag Sverre From Nicolas.Rougier at loria.fr Fri May 29 07:52:16 2009 From: Nicolas.Rougier at loria.fr (Nicolas Rougier) Date: Fri, 29 May 2009 13:52:16 +0200 Subject: [Numpy-discussion] sparse matrix dot product In-Reply-To: References: <1243520052.14931.36.camel@sulfur.loria.fr> Message-ID: <1243597936.14931.66.camel@sulfur.loria.fr> Hi, I tried to post results but the file is too big, anyway, here is the benchmark program if you want to run it: Nicolas ----- import time import numpy from scipy import sparse def benchmark(xtype = 'numpy.array', xdensity = 0.1, ytype = 'numpy.array', ydensity = 1.0, n = 1000): x = numpy.zeros((n,n), dtype = numpy.float64) xi = int(n*n*xdensity) x.reshape(n*n)[0:xi] = numpy.random.random((xi,)) y = numpy.zeros((n,1), dtype = numpy.float64) yi = int(n*ydensity) y.reshape(n)[0:yi] = numpy.random.random((yi,)) x = eval('%s(x)' % xtype) y = eval('%s(y)' % ytype) t0 = time.clock() if xtype == 'numpy.array' and ytype == 'numpy.array': for i in range(1000): z = numpy.dot(x,y) else: for i in range(1000): z = x*y tf = time.clock() - t0 text = '' text += (xtype + ' '*20)[0:20] text += (ytype + ' '*20)[0:20] text += '%4dx%4d %4dx%4d %.2f %.2f %.2f' % (n,n,n,1,xdensity, ydensity, tf) return text xtypes = ['numpy.array', 'numpy.matrix', 'sparse.lil_matrix', 'sparse.csr_matrix', 'sparse.csc_matrix'] ytypes = ['numpy.array', 'numpy.matrix', 'sparse.lil_matrix', 'sparse.csr_matrix', 'sparse.csc_matrix'] xdensities = [0.01, 0.10, 0.25, 0.50, 1.00] ydensities = [1.00] print '=================== =================== =========== =========== =========== =========== =======' print 'X type Y type X size Y size X density Y density Time ' print '------------------- ------------------- ----------- ----------- ----------- ----------- -------' n = 100 for xdensity in xdensities: for ydensity in ydensities: for xtype in xtypes: for ytype in ytypes: print benchmark(xtype, xdensity, ytype, ydensity, n) print '------------------- ------------------- ----------- ----------- ----------- ----------- -------' n = 1000 for xdensity in xdensities: for ydensity in ydensities: for xtype in xtypes: for ytype in ytypes: 
print benchmark(xtype, xdensity, ytype, ydensity, n) print '------------------- ------------------- ----------- ----------- ----------- ----------- -------' print '=================== =================== =========== =========== =========== =========== =======' -------------- next part -------------- An HTML attachment was scrubbed... URL: From Nicolas.Rougier at loria.fr Fri May 29 07:53:12 2009 From: Nicolas.Rougier at loria.fr (Nicolas Rougier) Date: Fri, 29 May 2009 13:53:12 +0200 Subject: [Numpy-discussion] Benchmak on record arrays In-Reply-To: <200905281925.42827.faltet@pytables.org> References: <1243438280.14931.30.camel@sulfur.loria.fr> <200905281925.42827.faltet@pytables.org> Message-ID: <1243597992.14931.67.camel@sulfur.loria.fr> Thank for the clear answer, it definitely helps. Nicolas On Thu, 2009-05-28 at 19:25 +0200, Francesc Alted wrote: > A Wednesday 27 May 2009 17:31:20 Nicolas Rougier escrigu?: > > Hi, > > > > I've written a very simple benchmark on recarrays: > > > > import numpy, time > > > > Z = numpy.zeros((100,100), dtype=numpy.float64) > > Z_fast = numpy.zeros((100,100), dtype=[('x',numpy.float64), > > ('y',numpy.int32)]) > > Z_slow = numpy.zeros((100,100), dtype=[('x',numpy.float64), > > ('y',numpy.bool)]) > > > > t = time.clock() > > for i in range(10000): Z*Z > > print time.clock()-t > > > > t = time.clock() > > for i in range(10000): Z_fast['x']*Z_fast['x'] > > print time.clock()-t > > > > t = time.clock() > > for i in range(10000): Z_slow['x']*Z_slow['x'] > > print time.clock()-t > > > > > > And got the following results: > > 0.23 > > 0.37 > > 3.96 > > > > Am I right in thinking that the last case is quite slow because of some > > memory misalignment between float64 and bool or is there some machinery > > behind that makes things slow in this case ? Should this be mentioned > > somewhere in the recarray documentation ? > > Yes, I can reproduce your results, and I must admit that a 10x slowdown is a > lot. However, I think that this affects mostly to small record arrays (i.e. > those that fit in CPU cache), and mainly in benchmarks (precisely because they > fit well in cache). You can simulate a more real-life scenario by defining a > large recarray that do not fit in CPU's cache. For example: > > In [17]: Z = np.zeros((1000,1000), dtype=np.float64) # 8 MB object > > In [18]: Z_fast = np.zeros((1000,1000), dtype=[('x',np.float64), > ('y',np.int64)]) # 16 MB object > > In [19]: Z_slow = np.zeros((1000,1000), dtype=[('x',np.float64), > ('y',np.bool)]) # 9 MB object > > In [20]: x_fast = Z_fast['x'] > In [21]: timeit x_fast * x_fast > 100 loops, best of 3: 5.48 ms per loop > > In [22]: x_slow = Z_slow['x'] > > In [23]: timeit x_slow * x_slow > 100 loops, best of 3: 14.4 ms per loop > > So, the slowdown is less than 3x, which is a more reasonable figure. If you > need optimal speed for operating with unaligned columns, you can use numexpr. > Here it is an example of what you can expect from it: > > In [24]: import numexpr as nx > > In [25]: timeit nx.evaluate('x_slow * x_slow') > 100 loops, best of 3: 11.1 ms per loop > > So, the slowdown is just 2x instead of 3x, which is near optimal for the > unaligned case. > > Numexpr also seems to help for small recarrays that fits in cache (i.e. 
for > benchmarking purposes ;) : > > # Create a 160 KB object > In [26]: Z_fast = np.zeros((100,100), dtype=[('x',np.float64),('y',np.int64)]) > # Create a 110 KB object > In [27]: Z_slow = np.zeros((100,100), dtype=[('x',np.float64),('y',np.bool)]) > > In [28]: x_fast = Z_fast['x'] > > In [29]: timeit x_fast * x_fast > 10000 loops, best of 3: 20.7 ?s per loop > > In [30]: x_slow = Z_slow['x'] > > In [31]: timeit x_slow * x_slow > 10000 loops, best of 3: 149 ?s per loop > > In [32]: timeit nx.evaluate('x_slow * x_slow') > 10000 loops, best of 3: 45.3 ?s per loop > > Hope that helps, > From david.froger.info at gmail.com Fri May 29 11:12:06 2009 From: david.froger.info at gmail.com (David Froger) Date: Fri, 29 May 2009 17:12:06 +0200 Subject: [Numpy-discussion] example reading binary Fortran file In-Reply-To: References: <6df541a5c8d6ecf26b6e38f404958401.squirrel@webmail.uio.no> <200902031015.17490.faltet@pytables.org> <4A1EBB85.5080706@wartburg.edu> Message-ID: I think the FortranFile class is not intended to read arrays written with the syntax 'write(11) array1, array2, array3' (correct me if I'm wrong). This is the use in the laboratory where I'm currently completing a phd. I'm going to dive into struc, FotranFile etc.. to propose something convenient for people who have to read unformatted binary fortran file very often. 2009/5/28 David Froger > Sorry, I still don't understand how to use FortranFile ... > > ============ > The fortran code > ============ > > program writeArray > > implicit none > integer,parameter:: nx=2,ny=5 > real(4),dimension(nx,ny):: ux,uy,p > integer :: i,j > > do i = 1,nx > do j = 1,ny > ux(i,j) = 100. + j+(i-1.)*10. > uy(i,j) = 200. + j+(i-1.)*10. > p(i,j) = 300. + j+(i-1.)*10. > enddo > enddo > > open(11,file='uxuyp.bin',form='unformatted') > write(11) ux,uy > write(11) p > close(11) > > end program writeArray > > ============= > The Python script > ============= > > from fortranfile import FortranFile > > f = FortranFile('uxuyp.bin') > > x = f.readReals() > > ============= > The output > ============= > > Traceback (most recent call last): > File "readArray.py", line 5, in > x = f.readReals() > File > "/home/users/redone/file2/froger/travail/codes/lib/Tests/fortranread/fortranfile.py", > line 181, in readReals > data_str = self.readRecord() > File > "/home/users/redone/file2/froger/travail/codes/lib/Tests/fortranread/fortranfile.py", > line 128, in readRecord > raise IOError('Could not read enough data') > IOError: Could not read enough dat > > > => How to read the file 'uxuyp.bin' ? > > > 2009/5/28 David Froger > > Thank you very much :-) >> >> 2009/5/28 Neil Martinsen-Burrell >> >>> On 2009-05-28 09:32 , David Froger wrote: >>> >>>> Hy Neil Martinsen-Burrell, >>>> >>>> I'm trying the FortranFile class, >>>> http://www.scipy.org/Cookbook/FortranIO/FortranFile >>>> >>>> It looks like there are some bug in the last revision (7): >>>> >>>> * There are errors cause by lines 60,61,63 in >>>> >>>> * There are indentation errors on lines 97 and 113. >>>> >>> >>> There seem to have been some problems in putting the file on the wiki >>> ("Proxy-Connection: keep-alive\nCache-Control: max-age=0" seems to come from >>> an HTML communication). I've attached my current version of the file to >>> this email. Let me know if you have problems with this. I will try to get >>> the working version up on the wiki. 
Peace, >>> >>> -Neil >>> >>> _______________________________________________ >>> Numpy-discussion mailing list >>> Numpy-discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dineshbvadhia at hotmail.com Fri May 29 19:02:06 2009 From: dineshbvadhia at hotmail.com (Dinesh B Vadhia) Date: Fri, 29 May 2009 16:02:06 -0700 Subject: [Numpy-discussion] Installation problem with Numpy 1.3 on Windows AMD64 Message-ID: Hi! I just upgraded to Python 2.6.2 (from 2.5) on Windows AMD64 in order to use Numpy 1.3 for AMD64 and got the following error: - pythonw.exe has stopped working Numpy was installed both per-machine and per-user but the error persists. Python 2.6.2 works without Numpy. Any ideas? Dinesh -------------- next part -------------- An HTML attachment was scrubbed... URL: From nmb at wartburg.edu Fri May 29 22:55:50 2009 From: nmb at wartburg.edu (Neil Martinsen-Burrell) Date: Fri, 29 May 2009 21:55:50 -0500 Subject: [Numpy-discussion] example reading binary Fortran file In-Reply-To: References: <6df541a5c8d6ecf26b6e38f404958401.squirrel@webmail.uio.no> <200902031015.17490.faltet@pytables.org> <4A1EBB85.5080706@wartburg.edu> Message-ID: <4A20A036.3030804@wartburg.edu> On 2009-05-29 10:12 , David Froger wrote: > I think the FortranFile class is not intended to read arrays written > with the syntax 'write(11) array1, array2, array3' (correct me if I'm > wrong). This is the use in the laboratory where I'm currently > completing a phd. You're half wrong. FortranFile can read arrays written as above, but it sees them as a single real array. So, with the attached Fortran program:: In [1]: from fortranfile import FortranFile In [2]: f = FortranFile('uxuyp.bin', endian='<') # Original bug was incorrect byte order In [3]: u = f.readReals() In [4]: u.shape Out[4]: (20,) In [5]: u Out[5]: array([ 101., 111., 102., 112., 103., 113., 104., 114., 105., 115., 201., 211., 202., 212., 203., 213., 204., 214., 205., 215.], dtype=float32) In [6]: ux = u[:10].reshape(2,5); uy = u[10:].reshape(2,5) In [7]: p = f.readReals().reshape(2,5) In [8]: ux, uy, p Out[8]: (array([[ 101., 111., 102., 112., 103.], [ 113., 104., 114., 105., 115.]], dtype=float32), array([[ 201., 211., 202., 212., 203.], [ 213., 204., 214., 205., 215.]], dtype=float32), array([[ 301., 311., 302., 312., 303.], [ 313., 304., 314., 305., 315.]], dtype=float32)) What doesn't currently work is to have arrays of mixed types in the same write statement, e.g. integer :: index(10) real :: x(10,10) ... write(13) x, index To address the original problem, I've changed the code to default to the native byte-ordering (f.ENDIAN='@') and to be more informative about what happened in the error. In the latest version (attached): In [1]: from fortranfile import FortranFile In [2]: f = FortranFile('uxuyp.bin', endian='>') # incorrect endian-ness In [3]: u = f.readReals() IOError: Could not read enough data. Wanted 1342177280 bytes, got 132 and hopefully when people see crazy big numbers like 1.34e9 they will think of byte order problems. > I'm going to dive into struc, FotranFile etc.. to propose something > convenient for people who have to read unformatted binary fortran file > very often. Awesome! 
The thoughts banging around in my head right now are that some sort of mini-language that encapsulates the content of the declarations and the write statements should allow one to tease out exactly which struct call will unpack the right information. f2py has some fortran parsing capabilities, so you might be able to use the fortran itself as the mini-language. Something like spec = fortranfile.OutputSpecification(\ """real(4),dimension(2,5):: ux,uy write(11) ux,uy""") ux, uy = fortranfile.FortranFile('uxuyp.bin').readSpec(spec) Best of luck. Peace, -Neil -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: fortranfile.py URL: From cournape at gmail.com Sat May 30 05:14:52 2009 From: cournape at gmail.com (David Cournapeau) Date: Sat, 30 May 2009 18:14:52 +0900 Subject: [Numpy-discussion] Installation problem with Numpy 1.3 on Windows AMD64 In-Reply-To: References: Message-ID: <5b8d13220905300214k1ad98fbbj8a2d2bdb69958b38@mail.gmail.com> On Sat, May 30, 2009 at 8:02 AM, Dinesh B Vadhia wrote: > Hi!? I just upgraded to Python 2.6.2 (from 2.5) on Windows AMD64 in order to > use Numpy 1.3 for AMD64 and got the following error: > > - pythonw.exe has stopped working > > Numpy was installed both per-machine and per-user but the error persists. > Python 2.6.2 works without Numpy. The 64 bits build is experimental on windows - there is one fundamental issue which I have not been able to nail down yet. It may be a numpy bug, a python bug, or a mingw-w64 bug, I am not sure yet. But it means that under some conditions, numpy crashes before import. If you need a working 64 bits numpy, the best bet is to build it by yourself using the MS compilers (the 64 bits compilers are available for free if you install the platform SDK 6.0a or later). But then you won't be able to build scipy on top of it unless you manage to use a 64 bits fortran compiler. David From dineshbvadhia at hotmail.com Sat May 30 16:36:52 2009 From: dineshbvadhia at hotmail.com (Dinesh B Vadhia) Date: Sat, 30 May 2009 13:36:52 -0700 Subject: [Numpy-discussion] Installation problem with Numpy 1.3 onWindows AMD64 In-Reply-To: <5b8d13220905300214k1ad98fbbj8a2d2bdb69958b38@mail.gmail.com> References: <5b8d13220905300214k1ad98fbbj8a2d2bdb69958b38@mail.gmail.com> Message-ID: re: "But it means that under some conditions, numpy crashes before import." It it helps the debugging, I have a standard Windows 64-bit configuration. Please let me know when this build is fixed ... Cheers Dinesh From: David Cournapeau Sent: Saturday, May 30, 2009 2:14 AM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] Installation problem with Numpy 1.3 onWindows AMD64 On Sat, May 30, 2009 at 8:02 AM, Dinesh B Vadhia wrote: > Hi! I just upgraded to Python 2.6.2 (from 2.5) on Windows AMD64 in order to > use Numpy 1.3 for AMD64 and got the following error: > > - pythonw.exe has stopped working > > Numpy was installed both per-machine and per-user but the error persists. > Python 2.6.2 works without Numpy. The 64 bits build is experimental on windows - there is one fundamental issue which I have not been able to nail down yet. It may be a numpy bug, a python bug, or a mingw-w64 bug, I am not sure yet. But it means that under some conditions, numpy crashes before import. If you need a working 64 bits numpy, the best bet is to build it by yourself using the MS compilers (the 64 bits compilers are available for free if you install the platform SDK 6.0a or later). 
But then you won't be able to build scipy on top of it unless you manage to use a 64 bits fortran compiler. David -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.froger.info at gmail.com Sat May 30 18:49:00 2009 From: david.froger.info at gmail.com (David Froger) Date: Sun, 31 May 2009 00:49:00 +0200 Subject: [Numpy-discussion] example reading binary Fortran file In-Reply-To: <4A20A036.3030804@wartburg.edu> References: <200902031015.17490.faltet@pytables.org> <4A1EBB85.5080706@wartburg.edu> <4A20A036.3030804@wartburg.edu> Message-ID: > > You're half wrong. FortranFile can read arrays written as above, but it > sees them as a single real array. So, with the attached Fortran program:: > > In [1]: from fortranfile import FortranFile > > In [2]: f = FortranFile('uxuyp.bin', endian='<') # Original bug was > incorrect byte order > > In [3]: u = f.readReals() > > In [4]: u.shape > Out[4]: (20,) > > In [5]: u > Out[5]: > array([ 101., 111., 102., 112., 103., 113., 104., 114., 105., > 115., 201., 211., 202., 212., 203., 213., 204., 214., > 205., 215.], dtype=float32) > > In [6]: ux = u[:10].reshape(2,5); uy = u[10:].reshape(2,5) > > In [7]: p = f.readReals().reshape(2,5) > > In [8]: ux, uy, p > Out[8]: > (array([[ 101., 111., 102., 112., 103.], > [ 113., 104., 114., 105., 115.]], dtype=float32), > array([[ 201., 211., 202., 212., 203.], > [ 213., 204., 214., 205., 215.]], dtype=float32), > array([[ 301., 311., 302., 312., 303.], > [ 313., 304., 314., 305., 315.]], dtype=float32)) ok! That's exactlly what I was looking for, thank you. Awesome! The thoughts banging around in my head right now are that some > sort of mini-language that encapsulates the content of the declarations and > the write statements should allow one to tease out exactly which struct call > will unpack the right information. f2py has some fortran parsing > capabilities, so you might be able to use the fortran itself as the > mini-language. Something like > > spec = fortranfile.OutputSpecification(\ > """real(4),dimension(2,5):: ux,uy > write(11) ux,uy""") > ux, uy = fortranfile.FortranFile('uxuyp.bin').readSpec(spec) > whouhou, I'm really enthusiastic, I love this solution!!! I begin to code it... I'll give news around 1 june, (somethings to finish before). One more time, thanks for this help! best, David -------------- next part -------------- An HTML attachment was scrubbed... URL: From rjsteed at talk21.com Sun May 31 13:54:48 2009 From: rjsteed at talk21.com (rob steed) Date: Sun, 31 May 2009 17:54:48 +0000 (GMT) Subject: [Numpy-discussion] Problem with correlate In-Reply-To: References: Message-ID: <871697.92371.qm@web86003.mail.ird.yahoo.com> Hi, After my previous email, I have opened a ticket #1117 (correlate not order dependent) I have found that the correlate function is defined in multiarraymodule.c and that inputs are being swapped using the following code n1 = ap1->dimensions[0]; n2 = ap2->dimensions[0]; if (n1 < n2) { ret = ap1; ap1 = ap2; ap2 = ret; ret = NULL; i = n1; n1 = n2; n2 = i; } I do not know the code well enough to see whether this could just be removed (I don't know c either). Maybe the algorithmn requires the inputs to be length ordered? I will try to work it out. 
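For what it's worth, the asymmetry can be checked from Python without touching the C code. Mathematically, a full cross-correlation of real inputs should satisfy correlate(a, b) == correlate(b, a)[::-1], so a quick script along these lines (illustrative only; the exact output depends on the numpy version) shows whether the internal swap is being compensated for:

import numpy as np

a = np.array([1., 2., 3., 4., 5.])
b = np.array([0., 1., 0.5])   # deliberately shorter than a, to trigger the swap

ab = np.correlate(a, b, 'full')
ba = np.correlate(b, a, 'full')

print(ab)
print(ba[::-1])
print(np.allclose(ab, ba[::-1]))   # should be True if the argument order is handled correctly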
Regards Rob From charlesr.harris at gmail.com Sun May 31 15:47:55 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 31 May 2009 13:47:55 -0600 Subject: [Numpy-discussion] Problem with correlate In-Reply-To: <871697.92371.qm@web86003.mail.ird.yahoo.com> References: <871697.92371.qm@web86003.mail.ird.yahoo.com> Message-ID: On Sun, May 31, 2009 at 11:54 AM, rob steed wrote: > > Hi, > After my previous email, I have opened a ticket #1117 (correlate not order > dependent) > > I have found that the correlate function is defined in multiarraymodule.c > and > that inputs are being swapped using the following code > > n1 = ap1->dimensions[0]; > n2 = ap2->dimensions[0]; > if (n1 < n2) { > ret = ap1; > ap1 = ap2; > ap2 = ret; > ret = NULL; > i = n1; > n1 = n2; > n2 = i; > } > > I do not know the code well enough to see whether this could just be > removed (I don't know c either). > Maybe the algorithmn requires the inputs to be length ordered? I will try > to work it out. > If the correlation algorithm doesn't use an fft and is done explicitly, then the maximum overlap for any shift is the length of the shortest input. Swapping the arrays makes that logic easier to implement, but it isn't necessary. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Sun May 31 20:19:37 2009 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sun, 31 May 2009 20:19:37 -0400 Subject: [Numpy-discussion] resetting set_string_function Message-ID: Hi, There seems to be a bug in set_string_function when resetting the formatting function to the default. After doing that the dtype of the array that is printed is the character string, not the numpy type. Example: In [1]: a=arange(10, dtype=uint16) In [2]: a Out[2]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=uint16) In [3]: set_string_function(lambda x: str(x*2)) In [4]: a Out[4]: [ 0 2 4 6 8 10 12 14 16 18] In [5]: set_string_function(None) # reset to default In [6]: a Out[6]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], 'H') The functionality to reset to default was introduced here: http://projects.scipy.org/numpy/ticket/351 Should I open a trac ticket, or am I missing something here? Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas.robitaille at gmail.com Sun May 31 20:39:06 2009 From: thomas.robitaille at gmail.com (Thomas Robitaille) Date: Sun, 31 May 2009 20:39:06 -0400 Subject: [Numpy-discussion] Rasterizing points onto an array Message-ID: <96F50D68-C7F4-4F27-92B4-4CFE7ED01CBA@gmail.com> Hi, I have a set of n points with real coordinates between 0 and 1, given by two numpy arrays x and y, with a value at each point represented by a third array z. I am trying to then rasterize the points onto a grid of size npix*npix. So I can start by converting x and y to integer pixel coordinates ix and iy. But my question is, is there an efficient way to add z[i] to the pixel given by (xi[i],yi[i])? Below is what I am doing at the moment, but the for loop becomes very inefficient for large n. I would imagine that there is a way to do this without using a loop? 
--- import numpy as np n = 10000000 x = np.random.random(n) y = np.random.random(n) z = np.random.random(n) npix = 100 ix = np.array(x*float(npix),int) iy = np.array(y*float(npix),int) image = np.zeros((npix,npix)) for i in range(len(ix)): image[ix[i],iy[i]] = image[ix[i],iy[i]] + z[i] --- Thanks for any advice, Thomas From david at ar.media.kyoto-u.ac.jp Sun May 31 21:18:31 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 01 Jun 2009 10:18:31 +0900 Subject: [Numpy-discussion] Problem with correlate In-Reply-To: References: <871697.92371.qm@web86003.mail.ird.yahoo.com> Message-ID: <4A232C67.2080206@ar.media.kyoto-u.ac.jp> Charles R Harris wrote: > > > On Sun, May 31, 2009 at 11:54 AM, rob steed > wrote: > > > Hi, > After my previous email, I have opened a ticket #1117 (correlate > not order dependent) > > I have found that the correlate function is defined in > multiarraymodule.c and > that inputs are being swapped using the following code > > n1 = ap1->dimensions[0]; > n2 = ap2->dimensions[0]; > if (n1 < n2) { > ret = ap1; > ap1 = ap2; > ap2 = ret; > ret = NULL; > i = n1; > n1 = n2; > n2 = i; > } > > I do not know the code well enough to see whether this could just > be removed (I don't know c either). > Maybe the algorithmn requires the inputs to be length ordered? I > will try to work it out. > > > If the correlation algorithm doesn't use an fft and is done > explicitly, then the maximum overlap for any shift is the length of > the shortest input. Swapping the arrays makes that logic easier to > implement, but it isn't necessary. But this logic is also wrong if the swapping is not taken into account - as the OP mentioned, correlate(a, b) is not equal to correlate(b, a) in the general case. The output is reversed in the second case compared to the first case. cheers, David From charlesr.harris at gmail.com Sun May 31 21:45:56 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 31 May 2009 19:45:56 -0600 Subject: [Numpy-discussion] Problem with correlate In-Reply-To: <4A232C67.2080206@ar.media.kyoto-u.ac.jp> References: <871697.92371.qm@web86003.mail.ird.yahoo.com> <4A232C67.2080206@ar.media.kyoto-u.ac.jp> Message-ID: On Sun, May 31, 2009 at 7:18 PM, David Cournapeau < david at ar.media.kyoto-u.ac.jp> wrote: > Charles R Harris wrote: > > > > > > On Sun, May 31, 2009 at 11:54 AM, rob steed > > wrote: > > > > > > Hi, > > After my previous email, I have opened a ticket #1117 (correlate > > not order dependent) > > > > I have found that the correlate function is defined in > > multiarraymodule.c and > > that inputs are being swapped using the following code > > > > n1 = ap1->dimensions[0]; > > n2 = ap2->dimensions[0]; > > if (n1 < n2) { > > ret = ap1; > > ap1 = ap2; > > ap2 = ret; > > ret = NULL; > > i = n1; > > n1 = n2; > > n2 = i; > > } > > > > I do not know the code well enough to see whether this could just > > be removed (I don't know c either). > > Maybe the algorithmn requires the inputs to be length ordered? I > > will try to work it out. > > > > > > If the correlation algorithm doesn't use an fft and is done > > explicitly, then the maximum overlap for any shift is the length of > > the shortest input. Swapping the arrays makes that logic easier to > > implement, but it isn't necessary. > > But this logic is also wrong if the swapping is not taken into account - > as the OP mentioned, correlate(a, b) is not equal to correlate(b, a) in > the general case. The output is reversed in the second case compared to > the first case. 
> I didn't say it was *correctly* implemented ;)

:) So I gave it a shot: http://github.com/cournape/numpy/commits/fix_correlate

(It took me a while to realize that PyArray_ISFLEXIBLE returns false for object arrays. Is this expected? The documentation concerning copyswap says that it is necessary for flexible arrays, but I think it is necessary for object arrays as well).

It still bothers me that correlate does not conjugate the second argument for complex arrays...

cheers,

David
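For reference, the conjugation in question is the one in the textbook definition of cross-correlation, r[k] = sum_n a[n+k] * conj(v[n]). A small sketch to compare that definition against whatever the installed numpy does (no particular output is assumed here, since the behaviour has differed between versions):

import numpy as np

a = np.array([1+2j, 3+4j, 5+6j])
v = np.array([0+1j, 1-1j])

# textbook full cross-correlation, written as a convolution with the
# reversed and conjugated second input
ref = np.convolve(a, np.conj(v)[::-1], 'full')
out = np.correlate(a, v, 'full')

print(ref)
print(out)
print(np.allclose(ref, out))   # False on versions that skip the conjugate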