From ddrake at brontes3d.com Tue Sep 4 14:26:32 2007 From: ddrake at brontes3d.com (Daniel Drake) Date: Tue, 04 Sep 2007 14:26:32 -0400 Subject: [Numpy-discussion] Numeric 64 bit issues with python 2.5 -- fixed Message-ID: <1188930392.10589.3.camel@localhost> I realise Numeric is a dead project, but in case it is useful for anyone else: I wrote a patch to solve the issue reported here: http://projects.scipy.org/pipermail/numpy-discussion/2007-May/027616.html The patch can be found here: http://bugs.gentoo.org/attachment.cgi?id=129716&action=view S?bastien Fabbro performed a more comprehensive review and fixed up some other potential issues alongside my fix. His patch is here: http://sources.gentoo.org/viewcvs.py/*checkout*/gentoo-x86/dev-python/numeric/files/numeric-24.2-python25.patch This patch has been included in Gentoo's package tree. -- Daniel Drake Brontes Technologies, A 3M Company http://www.brontes3d.com/opensource From David.L.Goldsmith at noaa.gov Tue Sep 4 20:24:20 2007 From: David.L.Goldsmith at noaa.gov (David Goldsmith) Date: Tue, 04 Sep 2007 17:24:20 -0700 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? Message-ID: <46DDF734.7070708@noaa.gov> Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? Thanks! DG -- ERD/ORR/NOS/NOAA From broman at spawar.navy.mil Tue Sep 4 20:52:07 2007 From: broman at spawar.navy.mil (Vincent Broman) Date: Tue, 4 Sep 2007 17:52:07 -0700 Subject: [Numpy-discussion] numpy build fails on powerpc ydl In-Reply-To: References: Message-ID: <200709041752.08069@b00d61a8cecf8b2266f81358fd170621.navy.mil> Oliphant: > Would you be willing to help get the config.h file set up correctly? Yes. I thought it was automatic, tho. What to do? me: > Its kernel is 2.4.19-Asmp tailored by the vendor. Harris: > Which vendor? Curtiss-Wright Controls, back when they were called Synergy. Harris: > Ancient.... Curtiss-Wright now supports Linux and kernel 2.6.16 on > some of their newer hardware The military ends up supporting ancient systems for a long time. I don't think I have the option of upgrading the OS on these beasts, especially when it involves a new binary format. That would be nice, tho, especially if it made it possible to run subversion. Harris: > Any more detail on these? What causes the conflict. I've got to wonder about > the the libc/libm versions also. Does the include file math.h say anything > about the prototypes for these functions? I expect cosl et.al. to be > potential problems on the PPC anyway due to the way long doubles were > implemented. "grep -r sinl /usr/include" finds nothing. math.h defines sin and sinf with various underscores attached, using token pasting, but no mention of sinl. Similarly for fabs and cos which I checked. Harris: > What is the PPC model number? The CWC VSS4 contains four powerpc G4's, the 7400. Vincent Broman From charlesr.harris at gmail.com Tue Sep 4 23:41:37 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 4 Sep 2007 21:41:37 -0600 Subject: [Numpy-discussion] numpy build fails on powerpc ydl In-Reply-To: <200709041752.08069@b00d61a8cecf8b2266f81358fd170621.navy.mil> References: <200709041752.08069@b00d61a8cecf8b2266f81358fd170621.navy.mil> Message-ID: Hi Vincent, On 9/4/07, Vincent Broman wrote: > > Oliphant: > > Would you be willing to help get the config.h file set up correctly? > > Yes. I thought it was automatic, tho. > What to do? > > me: > > Its kernel is 2.4.19-Asmp tailored by the vendor. 
> > Harris: > > Which vendor? > > Curtiss-Wright Controls, back when they were called Synergy. > > Harris: > > Ancient.... Curtiss-Wright now supports Linux and kernel 2.6.16 on > > some of their newer hardware > > The military ends up supporting ancient systems for a long time. > I don't think I have the option of upgrading the OS on these beasts, > especially when it involves a new binary format. > That would be nice, tho, especially if it made it possible to run > subversion. Too bad. It looks to me like gcc2.95.3 should work just fine. Could you run $ cat ./build/src.linux-i686-2.5/numpy/core/config.h in your numpy source directory, with the appropriate changes of course, and send the result? I'm curious about the value of SIZEOF_LONG_DOUBLE. Then could you also compile and run #include int main(int argc, char** argv) { printf("size of long double: %d\n", sizeof(long double)); return 1; } And see what it prints? The two should be the same. Thanks, Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From dannoritzer at web.de Wed Sep 5 03:48:45 2007 From: dannoritzer at web.de (=?ISO-8859-1?Q?G=FCnter_Dannoritzer?=) Date: Wed, 05 Sep 2007 09:48:45 +0200 Subject: [Numpy-discussion] Use my own data type with NumPy Message-ID: <46DE5F5D.8060204@web.de> Hi, I am trying to use my own data type with NumPy, but get some funny result when creating a NumPy array with it. My data type is indexable and sliceable and what happens now is when I create an array, NumPy is adding the instance as a list of the indexed values. How can I force NumPy to handle my data type as an 'Object' and use the value of __repr__ to display in the array? Maybe I am handling that too simple, as I had another data type before, -- not indexable though --, that just works fine with NumPy. Do I have to worry about a dtype object with my new data type or how can I use my new data type with NumPy? Thanks for your help. Cheers, Guenter From Marc.Poinot at onera.fr Wed Sep 5 05:05:00 2007 From: Marc.Poinot at onera.fr (Marc POINOT) Date: Wed, 05 Sep 2007 11:05:00 +0200 Subject: [Numpy-discussion] Change array memory ownership status Message-ID: <46DE713C.1080805@onera.fr> I want to change the "status" of a numpy array. I mean this array was created by a server application using PyArray_FromDimsAndData that sets the NPY_OWNDATA flag to False. The server application believes the client would free the memory. But there are more than one client application and none knows who is in charge of freeing this memory. Then I want to set the flag NPY_OWNDATA to True to tell the server to do the job when it finishes the script. How can I do that, I mean at the Python interface level, not the C API. >>> print a.flags.owndata True >>> >>> a.flags.owndata=False Traceback (most recent call last): File "", line 1, in ? TypeError: attribute 'owndata' of 'numpy.flagsobj' objects is not writable >>> Or should I have to write my own function set/get for OWNDATA ? The get is there but I can't get the set... -MP- ----------------------------------------------------------------------- Marc POINOT [ONERA/DSNA] Tel:+33.1.46.73.42.84 Fax:+33.1.46.73.41.66 Avertissement/disclaimer http://www.onera.fr/onera-en/emails-terms From david at ar.media.kyoto-u.ac.jp Wed Sep 5 07:01:11 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 05 Sep 2007 20:01:11 +0900 Subject: [Numpy-discussion] Improving bug triage in trac ? 
Message-ID: <46DE8C77.10808@ar.media.kyoto-u.ac.jp> Hi there, I am personnally a bit annoyed by the way trac handle bug reports, and would like to know if there is space for improvement. I do not know much about bug tracking systems, so maybe I just don't know how to use it, though. The main thing I dislike is the status of tickets and reports. In particular: - I don't know how other feel, but I am rather confused by the meta data of a ticket, and do not find them really useful from a developer point of view. It would be quite helpful to have information whether the bug is confirmed or not, another one which says wether a patch is available or not. This would make bug triage much easier, IMHO. This should be possible, since I have seen some trac installation with such features (wordpad trac, for example). - This one maybe a bit more difficult to implement I guess (I don't know anything about trac internals): I find the general view of bugs for a given repository really helpful in launchpad, in perticular, you can easily view the percentage of bugs wrt their status (eg 30 % bugs unconfirmed, etc...); see for example https://bugs.launchpad.net/+about. This gives a pretty good idea of what needs to be done for a particular release. How do other people feel about those suggestions ? cheers, David From Joris.DeRidder at ster.kuleuven.be Wed Sep 5 07:36:13 2007 From: Joris.DeRidder at ster.kuleuven.be (Joris De Ridder) Date: Wed, 5 Sep 2007 13:36:13 +0200 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? In-Reply-To: <46DDF734.7070708@noaa.gov> References: <46DDF734.7070708@noaa.gov> Message-ID: <32632B72-2B1A-4AC2-AA04-288696E37A46@ster.kuleuven.be> A related question, just out of curiosity: is there a technical reason why Numpy has been coded in C rather than C++? Joris On 05 Sep 2007, at 02:24, David Goldsmith wrote: > Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array > typemap to share? Thanks! > > DG > -- > ERD/ORR/NOS/NOAA emergencyresponse/> > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm From gnata at obs.univ-lyon1.fr Wed Sep 5 08:08:16 2007 From: gnata at obs.univ-lyon1.fr (Xavier Gnata) Date: Wed, 05 Sep 2007 14:08:16 +0200 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? In-Reply-To: <32632B72-2B1A-4AC2-AA04-288696E37A46@ster.kuleuven.be> References: <46DDF734.7070708@noaa.gov> <32632B72-2B1A-4AC2-AA04-288696E37A46@ster.kuleuven.be> Message-ID: <46DE9C30.8020300@obs.univ-lyon1.fr> I'm using the numpy C API (PyArray_SimpleNewFromData) to perform the conversion but my code is written by hands. I would like to simplify it using SWIG but I also would like to see a good typemap valarray <=> numpy.array :) Joris : Historical ones? Maybe also the fact that distutils has some small pb with C++ module. To sum up : You have to compile you module with the same compiler options you used to compile Python. Python is coded in C so some options does not make sense in C++. As a result, you get warnings at compile time (see pylab compiled with gcc for instance). Xavier > A related question, just out of curiosity: is there a technical > reason why Numpy has been coded in C rather than C++? 
> > Joris > > > > On 05 Sep 2007, at 02:24, David Goldsmith wrote: > > >> Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array >> typemap to share? Thanks! >> >> DG >> -- >> ERD/ORR/NOS/NOAA > emergencyresponse/> >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://projects.scipy.org/mailman/listinfo/numpy-discussion >> > > > Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > -- ############################################ Xavier Gnata CRAL - Observatoire de Lyon 9, avenue Charles Andr? 69561 Saint Genis Laval cedex Phone: +33 4 78 86 85 28 Fax: +33 4 78 86 83 86 E-mail: gnata at obs.univ-lyon1.fr ############################################ From travis at enthought.com Wed Sep 5 10:07:14 2007 From: travis at enthought.com (Travis Vaught) Date: Wed, 5 Sep 2007 09:07:14 -0500 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? In-Reply-To: <46DE9C30.8020300@obs.univ-lyon1.fr> References: <46DDF734.7070708@noaa.gov> <32632B72-2B1A-4AC2-AA04-288696E37A46@ster.kuleuven.be> <46DE9C30.8020300@obs.univ-lyon1.fr> Message-ID: Have you seen this? http://www.scipy.org/Cookbook/SWIG_and_NumPy Also, the numpy/doc/swig directory has the simple typemaps. Travis On Sep 5, 2007, at 7:08 AM, Xavier Gnata wrote: > I'm using the numpy C API (PyArray_SimpleNewFromData) to perform the > conversion but my code is written by hands. I would like to > simplify it > using SWIG but I also would like to see a good typemap valarray <=> > numpy.array :) > > Joris : Historical ones? Maybe also the fact that distutils has > some small pb with C++ module. To sum up : You have to compile you > module with the same compiler options you used to compile Python. > Python is coded in C so some options does not make sense in C++. > As a result, you get warnings at compile time (see pylab compiled > with gcc for instance). > > > Xavier > > > >> A related question, just out of curiosity: is there a technical >> reason why Numpy has been coded in C rather than C++? >> >> Joris >> >> >> >> On 05 Sep 2007, at 02:24, David Goldsmith wrote: >> >> >>> Anyone have a well-tested SWIG-based C++ STL valarray <=> >>> numpy.array >>> typemap to share? Thanks! >>> >>> DG >>> -- >>> ERD/ORR/NOS/NOAA >> emergencyresponse/> >>> _______________________________________________ >>> Numpy-discussion mailing list >>> Numpy-discussion at scipy.org >>> http://projects.scipy.org/mailman/listinfo/numpy-discussion >>> >> >> >> Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm >> >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://projects.scipy.org/mailman/listinfo/numpy-discussion >> >> > > > -- > ############################################ > Xavier Gnata > CRAL - Observatoire de Lyon > 9, avenue Charles Andr? 
> 69561 Saint Genis Laval cedex > Phone: +33 4 78 86 85 28 > Fax: +33 4 78 86 83 86 > E-mail: gnata at obs.univ-lyon1.fr > ############################################ > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From george.sakkis at gmail.com Wed Sep 5 10:22:46 2007 From: george.sakkis at gmail.com (George Sakkis) Date: Wed, 5 Sep 2007 10:22:46 -0400 Subject: [Numpy-discussion] 2-d in-place operation performance vs 1-d non in-place Message-ID: <91ad5bf80709050722n67ef53b3l6902aa03eff95a66@mail.gmail.com> I was surprised to see that an in-place modification of a 2-d array turns out to be slower from the respective non-mutating operation on 1- d arrays, although the latter creates new array objects. Here is the benchmarking code: import timeit for n in 10,100,1000,10000: setup = 'from numpy.random import random;' \ 'm=random((%d,2));' \ 'u1=random(%d);' \ 'u2=u1.reshape((u1.size,1))' % (n,n) timers = [timeit.Timer(stmt,setup) for stmt in # 1-d operations; create new arrays 'a0 = m[:,0]-u1; a1 = m[:,1]-u1', # 2-d in place operation 'm -= u2' ] print n, [min(timer.repeat(3,1000)) for timer in timers] And some results (Python 2.5, WinXP): 10 [0.010832382327921563, 0.0045706926438974782] 100 [0.010882668048592767, 0.021704993232380093] 1000 [0.018272154701226007, 0.19477587235249172] 10000 [0.073787590322233698, 1.9234369172618306] So the 2-d in-place modification time grows linearly with the array size but the 1-d operations are much more efficient, despite allocating new arrays while doing so. What gives ? George From wfspotz at sandia.gov Wed Sep 5 10:45:40 2007 From: wfspotz at sandia.gov (Bill Spotz) Date: Wed, 5 Sep 2007 08:45:40 -0600 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? In-Reply-To: <46DDF734.7070708@noaa.gov> References: <46DDF734.7070708@noaa.gov> Message-ID: <4CED078D-B8C7-4917-B0DE-4ED60BDFE015@sandia.gov> I have been considering adding some C++ STL support to numpy/doc/swig/ numpy.i. Probably std::vector <=> PyArrayObject (and some std::complex support as well). Is this what you had in mind? On Sep 4, 2007, at 6:24 PM, David Goldsmith wrote: > Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array > typemap to share? Thanks! > > DG > -- > ERD/ORR/NOS/NOAA emergencyresponse/> > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion ** Bill Spotz ** ** Sandia National Laboratories Voice: (505)845-0170 ** ** P.O. Box 5800 Fax: (505)284-5451 ** ** Albuquerque, NM 87185-0370 Email: wfspotz at sandia.gov ** From David.L.Goldsmith at noaa.gov Wed Sep 5 12:01:50 2007 From: David.L.Goldsmith at noaa.gov (David Goldsmith) Date: Wed, 05 Sep 2007 09:01:50 -0700 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? In-Reply-To: <46DDF734.7070708@noaa.gov> References: <46DDF734.7070708@noaa.gov> Message-ID: <46DED2EE.6070400@noaa.gov> Point of clarification: below "well-tested" = "well-use-tested," not (necessarily) "well-unit-tested". DG David Goldsmith wrote: > Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array > typemap to share? Thanks! 
> > DG > -- ERD/ORR/NOS/NOAA From David.L.Goldsmith at noaa.gov Wed Sep 5 12:16:24 2007 From: David.L.Goldsmith at noaa.gov (David Goldsmith) Date: Wed, 05 Sep 2007 09:16:24 -0700 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? In-Reply-To: References: <46DDF734.7070708@noaa.gov> <32632B72-2B1A-4AC2-AA04-288696E37A46@ster.kuleuven.be> <46DE9C30.8020300@obs.univ-lyon1.fr> Message-ID: <46DED658.9070206@noaa.gov> No I hadn't - thanks! (Probably should have Google-d first, huh. :-[ ) DG Travis Vaught wrote: > Have you seen this? > > http://www.scipy.org/Cookbook/SWIG_and_NumPy > > Also, the numpy/doc/swig directory has the simple typemaps. > > Travis > > On Sep 5, 2007, at 7:08 AM, Xavier Gnata wrote: > > >> I'm using the numpy C API (PyArray_SimpleNewFromData) to perform the >> conversion but my code is written by hands. I would like to >> simplify it >> using SWIG but I also would like to see a good typemap valarray <=> >> numpy.array :) >> >> Joris : Historical ones? Maybe also the fact that distutils has >> some small pb with C++ module. To sum up : You have to compile you >> module with the same compiler options you used to compile Python. >> Python is coded in C so some options does not make sense in C++. >> As a result, you get warnings at compile time (see pylab compiled >> with gcc for instance). >> >> >> Xavier >> >> >> >> >>> A related question, just out of curiosity: is there a technical >>> reason why Numpy has been coded in C rather than C++? >>> >>> Joris >>> >>> >>> >>> On 05 Sep 2007, at 02:24, David Goldsmith wrote: >>> >>> >>> >>>> Anyone have a well-tested SWIG-based C++ STL valarray <=> >>>> numpy.array >>>> typemap to share? Thanks! >>>> >>>> DG >>>> -- >>>> ERD/ORR/NOS/NOAA >>> emergencyresponse/> >>>> _______________________________________________ >>>> Numpy-discussion mailing list >>>> Numpy-discussion at scipy.org >>>> http://projects.scipy.org/mailman/listinfo/numpy-discussion >>>> >>>> >>> Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm >>> >>> _______________________________________________ >>> Numpy-discussion mailing list >>> Numpy-discussion at scipy.org >>> http://projects.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >>> >> -- >> ############################################ >> Xavier Gnata >> CRAL - Observatoire de Lyon >> 9, avenue Charles Andr? >> 69561 Saint Genis Laval cedex >> Phone: +33 4 78 86 85 28 >> Fax: +33 4 78 86 83 86 >> E-mail: gnata at obs.univ-lyon1.fr >> ############################################ >> >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://projects.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- ERD/ORR/NOS/NOAA From David.L.Goldsmith at noaa.gov Wed Sep 5 12:20:30 2007 From: David.L.Goldsmith at noaa.gov (David Goldsmith) Date: Wed, 05 Sep 2007 09:20:30 -0700 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? 
In-Reply-To: <4CED078D-B8C7-4917-B0DE-4ED60BDFE015@sandia.gov> References: <46DDF734.7070708@noaa.gov> <4CED078D-B8C7-4917-B0DE-4ED60BDFE015@sandia.gov> Message-ID: <46DED74E.4090506@noaa.gov> Not presently, as the C++ code I need to wrap now is using the valarray class template (largely at my behest), though I (and I imagine others) might find this useful in the future. DG Bill Spotz wrote: > I have been considering adding some C++ STL support to > numpy/doc/swig/numpy.i. Probably std::vector <=> PyArrayObject > (and some std::complex support as well). Is this what you had > in mind? > > On Sep 4, 2007, at 6:24 PM, David Goldsmith wrote: > >> Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array >> typemap to share? Thanks! >> >> DG >> -- >> ERD/ORR/NOS/NOAA >> >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://projects.scipy.org/mailman/listinfo/numpy-discussion > > ** Bill Spotz ** > ** Sandia National Laboratories Voice: (505)845-0170 ** > ** P.O. Box 5800 Fax: (505)284-5451 ** > ** Albuquerque, NM 87185-0370 Email: wfspotz at sandia.gov ** > > > -- ERD/ORR/NOS/NOAA From faltet at carabos.com Wed Sep 5 12:29:24 2007 From: faltet at carabos.com (Francesc Altet) Date: Wed, 5 Sep 2007 18:29:24 +0200 Subject: [Numpy-discussion] 2-d in-place operation performance vs 1-d non in-place In-Reply-To: <91ad5bf80709050722n67ef53b3l6902aa03eff95a66@mail.gmail.com> References: <91ad5bf80709050722n67ef53b3l6902aa03eff95a66@mail.gmail.com> Message-ID: <200709051829.24448.faltet@carabos.com> A Wednesday 05 September 2007, George Sakkis escrigu?: > I was surprised to see that an in-place modification of a 2-d array > turns out to be slower from the respective non-mutating operation on > 1- d arrays, although the latter creates new array objects. Here is > the benchmarking code: > > import timeit > > for n in 10,100,1000,10000: > ? ?setup = 'from numpy.random import random;' \ > ? ? ? ? ? ?'m=random((%d,2));' \ > ? ? ? ? ? ?'u1=random(%d);' \ > ? ? ? ? ? ?'u2=u1.reshape((u1.size,1))' % (n,n) > ? ?timers = [timeit.Timer(stmt,setup) for stmt in > ? ? ? ?# 1-d operations; create new arrays > ? ? ? ?'a0 = m[:,0]-u1; a1 = m[:,1]-u1', > ? ? ? ?# 2-d in place operation > ? ? ? ?'m -= u2' > ? ?] > ? ?print n, [min(timer.repeat(3,1000)) for timer in timers] > > > And some results (Python 2.5, WinXP): > > 10 [0.010832382327921563, 0.0045706926438974782] > 100 [0.010882668048592767, 0.021704993232380093] > 1000 [0.018272154701226007, 0.19477587235249172] > 10000 [0.073787590322233698, 1.9234369172618306] > > So the 2-d in-place modification time grows linearly with the array > size but the 1-d operations are much more efficient, despite > allocating new arrays while doing so. What gives ? This seems the effect of broadcasting u2. 
If you were to use a pre-computed broadcasted, you would get rid of such bottleneck: for n in 10,100,1000,10000: setup = 'import numpy;' \ 'm=numpy.random.random((%d,2));' \ 'u1=numpy.random.random(%d);' \ 'u2=u1[:, numpy.newaxis];' \ 'u3=numpy.array([u1,u1]).transpose()' % (n,n) timers = [timeit.Timer(stmt,setup) for stmt in # 1-d operations; create new arrays 'a0 = m[:,0]-u1; a1 = m[:,1]-u1', # 2-d in place operation (using broadcasting) 'm -= u2', # 2-d in-place operation (not forcing broadcasting) 'm -= u3' ] print n, [min(timer.repeat(3,1000)) for timer in timers] gives in my machine: 10 [0.03213191032409668, 0.012019872665405273, 0.0068600177764892578] 100 [0.033048152923583984, 0.06542205810546875, 0.0076580047607421875] 1000 [0.040294170379638672, 0.59892702102661133, 0.014600992202758789] 10000 [0.32667303085327148, 5.9721651077270508, 0.10261106491088867] HTH, -- >0,0< Francesc Altet ? ? http://www.carabos.com/ V V C?rabos Coop. V. ??Enjoy Data "-" From rcdailey at gmail.com Wed Sep 5 12:55:36 2007 From: rcdailey at gmail.com (Robert Dailey) Date: Wed, 5 Sep 2007 11:55:36 -0500 Subject: [Numpy-discussion] Vector magnitude? Message-ID: <496954360709050955t64ad9edclc18b0fa5d71acb9b@mail.gmail.com> Hi, I have two questions: 1) Is there any way in numpy to represent vectors? Currently I'm using 'array' for vectors. 2) Is there a way to calculate the magnitude (length) of a vector in numpy? Thanks. -------------- next part -------------- An HTML attachment was scrubbed... URL: From broman at spawar.navy.mil Wed Sep 5 13:13:50 2007 From: broman at spawar.navy.mil (Vincent Broman) Date: Wed, 5 Sep 2007 10:13:50 -0700 Subject: [Numpy-discussion] numpy build fails on powerpc ydl In-Reply-To: References: Message-ID: <200709051013.50786@b00d61a8cecf8b2266f81358fd170621.navy.mil> Harris asked about long doubles. On my YDL, SIZEOF_LONG_DOUBLE and sizeof( long double) were both 8. Vincent Broman From Chris.Barker at noaa.gov Wed Sep 5 13:19:43 2007 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed, 05 Sep 2007 10:19:43 -0700 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? In-Reply-To: <4CED078D-B8C7-4917-B0DE-4ED60BDFE015@sandia.gov> References: <46DDF734.7070708@noaa.gov> <4CED078D-B8C7-4917-B0DE-4ED60BDFE015@sandia.gov> Message-ID: <46DEE52F.8050000@noaa.gov> Bill Spotz wrote: > I have been considering adding some C++ STL support to numpy/doc/swig/ > numpy.i. Probably std::vector <=> PyArrayObject (and some > std::complex support as well). Is this what you had in mind? well, std::valarray is not the same as std::vector, though there are similarities, so the example would be helpful. Of greatest concern to me is the memory management issue -- it would be nice to be able to share the data black between the valarray and the numpy array (rather than copying back and forth), but as the both valarrays and vectors are re-sizeable, that might get tricky. I'm assuming that you can get the pointer to the data block from both, but it might change on you. If you solve that for std::vector, the solution is likely to be similar for std:valarray. (I hope). David Goldsmith wrote: > Point of clarification: below "well-tested" = "well-use-tested," not > (necessarily) "well-unit-tested". Of course, the better tested the better, but anything is probably better than starting from scratch! Travis Vaught wrote: > Have you seen this? 
> > http://www.scipy.org/Cookbook/SWIG_and_NumPy > > Also, the numpy/doc/swig directory has the simple typemaps. Looked at both, and they are a great starting point, but only deal with plain old C arrays, as far as I've seen. Unless someone speaks up, it sounds like it's not been done, but there are at least a few of us that are interested, so maybe we can start a collaboration. -Chris By the way, David G. and I are working on the same project, so we're kind of like one person.... -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From robert.kern at gmail.com Wed Sep 5 13:18:57 2007 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 05 Sep 2007 12:18:57 -0500 Subject: [Numpy-discussion] Use my own data type with NumPy In-Reply-To: <46DE5F5D.8060204@web.de> References: <46DE5F5D.8060204@web.de> Message-ID: <46DEE501.7080609@gmail.com> G?nter Dannoritzer wrote: > Hi, > > I am trying to use my own data type with NumPy, but get some funny > result when creating a NumPy array with it. > > My data type is indexable and sliceable and what happens now is when I > create an array, NumPy is adding the instance as a list of the indexed > values. How can I force NumPy to handle my data type as an 'Object' and > use the value of __repr__ to display in the array? > > Maybe I am handling that too simple, as I had another data type before, > -- not indexable though --, that just works fine with NumPy. Do I have > to worry about a dtype object with my new data type or how can I use my > new data type with NumPy? I presume that by "new data type" you mean some Python class that you've made, not some C data type. Creating an object array can be a bit tricky. array() has to make some guesses as to which objects are containers and which are elements. Sometimes, it guesses wrong (at least measured against what the user actually desired). The most robust way to construct an object array is to create an empty() object array of the shape you want, and then assign the contents. In [1]: from numpy import * In [2]: a = empty((3,), dtype=object) In [3]: a Out[3]: array([None, None, None], dtype=object) In [4]: a[:] = [[1], [2], [3]] In [5]: a Out[5]: array([[1], [2], [3]], dtype=object) In [6]: a.shape Out[6]: (3,) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From gael.varoquaux at normalesup.org Wed Sep 5 13:20:45 2007 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Wed, 5 Sep 2007 19:20:45 +0200 Subject: [Numpy-discussion] Vector magnitude? In-Reply-To: <496954360709050955t64ad9edclc18b0fa5d71acb9b@mail.gmail.com> References: <496954360709050955t64ad9edclc18b0fa5d71acb9b@mail.gmail.com> Message-ID: <20070905172044.GI15889@clipper.ens.fr> On Wed, Sep 05, 2007 at 11:55:36AM -0500, Robert Dailey wrote: > 1) Is there any way in numpy to represent vectors? Currently I'm using > 'array' for vectors. What do you call a vector ? For me a vector is an element of an linear space. In numerical methods what is comonly called a vector is a 1D array of arbitrary length. I suspect you mean something different, given your question > 2) Is there a way to calculate the magnitude (length) of a vector in > numpy? I am being dumb. What do you mean by magnitude (or length) ? 
Maybe it is just because I am not a native English speaker. If you are talking about the euclidien norm, I don't know a built in way of doing it, but it is very easy to define a norm function: import numpy as N a = N.arange(3) norm = lambda x: N.sqrt(N.square(x).sum()) norm(a) -> 2.2360679775 From matthieu.brucher at gmail.com Wed Sep 5 13:21:05 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Wed, 5 Sep 2007 19:21:05 +0200 Subject: [Numpy-discussion] Vector magnitude? In-Reply-To: <496954360709050955t64ad9edclc18b0fa5d71acb9b@mail.gmail.com> References: <496954360709050955t64ad9edclc18b0fa5d71acb9b@mail.gmail.com> Message-ID: 2007/9/5, Robert Dailey : > > Hi, > > I have two questions: > > 1) Is there any way in numpy to represent vectors? Currently I'm using > 'array' for vectors. A vector is an array with one dimension, it's OK. You could use a matrix of dimension 1xn or nx1 as well. 2) Is there a way to calculate the magnitude (length) of a vector in numpy? Yes, len(a) Matthieu -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chris.Barker at noaa.gov Wed Sep 5 13:24:27 2007 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed, 05 Sep 2007 10:24:27 -0700 Subject: [Numpy-discussion] Use my own data type with NumPy In-Reply-To: <46DE5F5D.8060204@web.de> References: <46DE5F5D.8060204@web.de> Message-ID: <46DEE64B.4010100@noaa.gov> G?nter Dannoritzer wrote: > My data type is indexable and sliceable and what happens now is when I > create an array, NumPy is adding the instance as a list of the indexed > values. How can I force NumPy to handle my data type as an 'Object' Object arrays are tricky, 'cause it's hard for numpy to know how you want to unpack arbitrary objects. The solution is to make an empty object array first, then populate it. For example: >>> import numpy as N >>> MyData = [[1,2,3], ... [4,5,6], ... [7,8,9]] This is a list or lists, so numpy.array would unpack it into a 2-d array: >>> N.array(MyData) array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]) However, let's say what I want is a 1-d object array. I create the object array empty: >>> OA = N.empty((3,), dtype=N.object) >>> OA array([None, None, None], dtype=object) Then populate it: >>> OA[:] = MyData >>> OA array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=object) Does that help? -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From rowen at cesmail.net Wed Sep 5 13:31:46 2007 From: rowen at cesmail.net (Russell E. Owen) Date: Wed, 05 Sep 2007 10:31:46 -0700 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? References: <46DDF734.7070708@noaa.gov> <4CED078D-B8C7-4917-B0DE-4ED60BDFE015@sandia.gov> Message-ID: In article <4CED078D-B8C7-4917-B0DE-4ED60BDFE015 at sandia.gov>, "Bill Spotz" wrote: > I have been considering adding some C++ STL support to numpy/doc/swig/ > numpy.i. Probably std::vector <=> PyArrayObject (and some > std::complex support as well). Is this what you had in mind? That sounds very useful, but how did you get it to work? std::vectors are resizable and numpy arrays are not. However, much of the time I want std::vectors of a particular size -- in which case numpy would be a great match. (Speaking of which, do you happen to know of any good std::vector variant that has fixed length?) 
-- Russell From Chris.Barker at noaa.gov Wed Sep 5 13:38:04 2007 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed, 05 Sep 2007 10:38:04 -0700 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? In-Reply-To: <32632B72-2B1A-4AC2-AA04-288696E37A46@ster.kuleuven.be> References: <46DDF734.7070708@noaa.gov> <32632B72-2B1A-4AC2-AA04-288696E37A46@ster.kuleuven.be> Message-ID: <46DEE97C.5070008@noaa.gov> Joris De Ridder wrote: > A related question, just out of curiosity: is there a technical > reason why Numpy has been coded in C rather than C++? There was a fair bit of discussion about this back when the numarray project started, which was a re-implementation of the original Numeric. IIRC, one of the drivers was that C++ support was still pretty inconsistent across compilers and OSs, particularly if you wanted to really get the advantages of C++, by using templates and the like. It was considered very important that the numpy code base be very portable. C++ compilers have gotten better an more standards compliant, but as a recent thread shows, folks still want to build numpy with older compilers and libs, so the reasoning still holds. Too bad, in a way, I suspect a well-designed C++ numpy could make it easier to write compiled extensions, which would be pretty nice. Of course, it should be possible to write C++ wrappers around the core ND-array object, if anyone wants to take that on! -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From charlesr.harris at gmail.com Wed Sep 5 13:50:14 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 5 Sep 2007 11:50:14 -0600 Subject: [Numpy-discussion] numpy build fails on powerpc ydl In-Reply-To: <200709051013.50786@b00d61a8cecf8b2266f81358fd170621.navy.mil> References: <200709051013.50786@b00d61a8cecf8b2266f81358fd170621.navy.mil> Message-ID: On 9/5/07, Vincent Broman wrote: > > Harris asked about long doubles. > On my YDL, SIZEOF_LONG_DOUBLE and sizeof( long double) were both 8. Hmm, so long doubles are just doubles, I kinda suspected that. I'm not really familiar with this code, but what happens if you go to numpy/numpy/core/include/numpy/ndarrayobject.h and change the double in line 84 to long double? Chuck. -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Wed Sep 5 13:53:26 2007 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 05 Sep 2007 12:53:26 -0500 Subject: [Numpy-discussion] Improving bug triage in trac ? In-Reply-To: <46DE8C77.10808@ar.media.kyoto-u.ac.jp> References: <46DE8C77.10808@ar.media.kyoto-u.ac.jp> Message-ID: <46DEED16.9010802@gmail.com> David Cournapeau wrote: > Hi there, > > I am personnally a bit annoyed by the way trac handle bug reports, > and would like to know if there is space for improvement. I do not know > much about bug tracking systems, so maybe I just don't know how to use > it, though. The main thing I dislike is the status of tickets and > reports. In particular: > - I don't know how other feel, but I am rather confused by the meta > data of a ticket, and do not find them really useful from a developer > point of view. It would be quite helpful to have information whether the > bug is confirmed or not, another one which says wether a patch is > available or not. 
This would make bug triage much easier, IMHO. This > should be possible, since I have seen some trac installation with such > features (wordpad trac, for example). Did you mean "WordPress Trac"? (luckily, the first Google hit for "wordpad trac" happens to be the WordPress Trac) They seem to do this with a standard lexicon of keywords and custom queries. http://codex.wordpress.org/Reporting_Bugs#Trac_Keywords > - This one maybe a bit more difficult to implement I guess (I don't > know anything about trac internals): I find the general view of bugs for > a given repository really helpful in launchpad, in perticular, you can > easily view the percentage of bugs wrt their status (eg 30 % bugs > unconfirmed, etc...); see for example https://bugs.launchpad.net/+about. > This gives a pretty good idea of what needs to be done for a particular > release. > How do other people feel about those suggestions ? We can certainly add custom fields or start using keywords like WordPress. We can't change the status field (new, assigned, closed, reopened), though. That workflow is hardcoded in Trac 0.10, which we are currently using. Getting the summary (30% unconfirmed) may be a bit more difficult. For more reading: http://scipy.org/scipy/scipy/wiki/TracReports http://scipy.org/scipy/scipy/wiki/TracQuery -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From wfspotz at sandia.gov Wed Sep 5 14:04:37 2007 From: wfspotz at sandia.gov (Bill Spotz) Date: Wed, 5 Sep 2007 12:04:37 -0600 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? In-Reply-To: <46DEE52F.8050000@noaa.gov> References: <46DDF734.7070708@noaa.gov> <4CED078D-B8C7-4917-B0DE-4ED60BDFE015@sandia.gov> <46DEE52F.8050000@noaa.gov> Message-ID: <609FEB05-86EB-4FE2-867E-4A1C84105951@sandia.gov> On Sep 5, 2007, at 11:19 AM, Christopher Barker wrote: > Bill Spotz wrote: >> I have been considering adding some C++ STL support to numpy/doc/ >> swig/ >> numpy.i. Probably std::vector <=> PyArrayObject (and some >> std::complex support as well). Is this what you had in mind? > > well, std::valarray is not the same as std::vector, though there are > similarities, so the example would be helpful. Ah, silly me. Back when I first learned C++, valarray wasn't around yet (or at least it wasn't taught to me), and it is not in the (clearly outdated) references I use. But it is a more logical choice than vector. > Of greatest concern to me is the memory management issue -- it > would be > nice to be able to share the data black between the valarray and the > numpy array (rather than copying back and forth), but as the both > valarrays and vectors are re-sizeable, that might get tricky. I'm > assuming that you can get the pointer to the data block from both, but > it might change on you. If you solve that for std::vector, the > solution > is likely to be similar for std:valarray. (I hope). Yes, this resizing memory management issue is the main reason I haven't tried to implement it in numpy.i yet. A possibly better solution would be to develop a class that inherits from std::valarray but also implements the array interface attributes (these attributes would have to be dynamic, in that they check the std::valarray attributes when accessed rather than storing copies). We could then write typemaps that utilize the array interface. 
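To make that idea concrete, here is a rough Python-level sketch of just the
array-interface part (this is not the actual numpy.i design; a ctypes buffer
stands in for the std::valarray storage, and the names ValarrayProxy/_buf are
invented for illustration). The point is that __array_interface__ is computed
on access, so numpy.asarray() always sees the wrapper's current pointer and
size and shares the memory rather than copying:

import ctypes
import numpy as np

class ValarrayProxy(object):
    """Stand-in for a SWIG-generated wrapper holding a C buffer of doubles."""
    def __init__(self, n):
        self._n = n
        self._buf = (ctypes.c_double * n)()   # pretend this is the valarray storage

    @property
    def __array_interface__(self):
        # Built dynamically, so it always reflects the current buffer.
        return {'shape': (self._n,),
                'typestr': np.dtype('f8').str,
                'data': (ctypes.addressof(self._buf), False),
                'version': 3}

v = ValarrayProxy(5)
a = np.asarray(v)         # wraps the existing buffer, no copy
a[:] = np.arange(5.0)
print(v._buf[3])          # 3.0 -- the change is visible through the C buffer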
So pure input arguments could be numpy.ndarrays (or any reasonable sequence, really), but output arrays would be a wrapped version of this new class. (Which would behave both like std::valarrays and like numpy.ndarrays. I think...) ** Bill Spotz ** ** Sandia National Laboratories Voice: (505)845-0170 ** ** P.O. Box 5800 Fax: (505)284-5451 ** ** Albuquerque, NM 87185-0370 Email: wfspotz at sandia.gov ** From wfspotz at sandia.gov Wed Sep 5 14:07:31 2007 From: wfspotz at sandia.gov (Bill Spotz) Date: Wed, 5 Sep 2007 12:07:31 -0600 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? In-Reply-To: <46DEE97C.5070008@noaa.gov> References: <46DDF734.7070708@noaa.gov> <32632B72-2B1A-4AC2-AA04-288696E37A46@ster.kuleuven.be> <46DEE97C.5070008@noaa.gov> Message-ID: On Sep 5, 2007, at 11:38 AM, Christopher Barker wrote: > Of course, it should be possible to write C++ wrappers around the core > ND-array object, if anyone wants to take that on! boost::python has done this for Numeric, but last I checked, they have not upgraded to numpy. ** Bill Spotz ** ** Sandia National Laboratories Voice: (505)845-0170 ** ** P.O. Box 5800 Fax: (505)284-5451 ** ** Albuquerque, NM 87185-0370 Email: wfspotz at sandia.gov ** From rcdailey at gmail.com Wed Sep 5 14:40:19 2007 From: rcdailey at gmail.com (Robert Dailey) Date: Wed, 5 Sep 2007 13:40:19 -0500 Subject: [Numpy-discussion] Vector magnitude? In-Reply-To: References: <496954360709050955t64ad9edclc18b0fa5d71acb9b@mail.gmail.com> Message-ID: <496954360709051140k40f1b5cen7cdfafd9a9913873@mail.gmail.com> Thanks for your response. I was not able to find len() in the numpy documentation at the following link: http://www.scipy.org/doc/numpy_api_docs/namespace_index.html Perhaps I'm looking in the wrong location? On 9/5/07, Matthieu Brucher wrote: > > > > 2007/9/5, Robert Dailey : > > > > Hi, > > > > I have two questions: > > > > 1) Is there any way in numpy to represent vectors? Currently I'm using > > 'array' for vectors. > > > > A vector is an array with one dimension, it's OK. You could use a matrix > of dimension 1xn or nx1 as well. > > > 2) Is there a way to calculate the magnitude (length) of a vector in > > numpy? > > > Yes, len(a) > > Matthieu > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chris.Barker at noaa.gov Wed Sep 5 14:45:57 2007 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed, 05 Sep 2007 11:45:57 -0700 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? In-Reply-To: <609FEB05-86EB-4FE2-867E-4A1C84105951@sandia.gov> References: <46DDF734.7070708@noaa.gov> <4CED078D-B8C7-4917-B0DE-4ED60BDFE015@sandia.gov> <46DEE52F.8050000@noaa.gov> <609FEB05-86EB-4FE2-867E-4A1C84105951@sandia.gov> Message-ID: <46DEF965.1000308@noaa.gov> Bill Spotz wrote: > Yes, this resizing memory management issue is the main reason I haven't > tried to implement it in numpy.i yet. > > A possibly better solution would be to develop a class that inherits > from std::valarray but also implements the array interface > attributes (these attributes would have to be dynamic, in that they > check the std::valarray attributes when accessed rather than storing > copies). I like that, though it's way over my head to implement. 
However, I'm beginning to have my doubts about valarrays. I'm reading: Josuttis, Nicolai M. 1999. "The C+= Standard Library: A Tutorial and Reference" It's 8 years old now, but he writes: "The valarray classes were not designed very well. In fact, nobody tried to determine if the specification worked" He goes on to suggest that Blitz++ might have more of a future. (though it looks like he's involved with the Boost project now) He also points out some major flaws in the text. In reading, I also see that while valarrays can be used for multidimensional arrays, the semantics are pretty ugly. However, he also says: "In principle...you can change their size. However, changing the size of a valarray is provided only to make a two-step initialization (creating and setting the size)" So maybe the memory re-allocation is such an issue. Does anyone know the status of support for valarrays now? Is there another alternative? At the moment, all we need is a one-d fixed size array. There is Boost::array (and MultiArray), but Boost has always seemed like a big, ugly, hard to build dependency. Can you just grab the code for some of these small pieces by themselves? -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From zpincus at stanford.edu Wed Sep 5 14:49:45 2007 From: zpincus at stanford.edu (Zachary Pincus) Date: Wed, 5 Sep 2007 14:49:45 -0400 Subject: [Numpy-discussion] Vector magnitude? In-Reply-To: <496954360709051140k40f1b5cen7cdfafd9a9913873@mail.gmail.com> References: <496954360709050955t64ad9edclc18b0fa5d71acb9b@mail.gmail.com> <496954360709051140k40f1b5cen7cdfafd9a9913873@mail.gmail.com> Message-ID: <445A7F45-C4A7-41C4-8FDC-A7FBD8AA2D53@stanford.edu> Hello, 'len' is a (pretty basic) python builtin function for getting the length of anything with a list-like interface. (Or more generally, getting the size of anything that is sized, e.g. a set or dictionary.) Numpy arrays offer a list-like interface allowing you to iterate along their first dimension, etc. (*) Thus, len(numpy_array) is equivalent to numpy_array.shape[0], which is the number of elements along the first dimension of the array. Zach (*) For example, this is useful if you've got numerous data vectors packed into an array along the first dimension, and want to iterate across the different vectors. a = numpy.ones((number_of_data_elements, size_of_data_element)) for element in a: # element is a 1-D array with a length of 'size_of_data_element' Note further that this works even if your data elements are multi- dimensional; i.e. the above works the same if: element_shape = (x,y,z) a = numpy.ones((number_of_data_elements,)+element_shape) for element in a: # element is a 3-D array with a shape of (x,y,z) On Sep 5, 2007, at 2:40 PM, Robert Dailey wrote: > Thanks for your response. > > I was not able to find len() in the numpy documentation at the > following link: > http://www.scipy.org/doc/numpy_api_docs/namespace_index.html > > Perhaps I'm looking in the wrong location? > > On 9/5/07, Matthieu Brucher wrote: > > 2007/9/5, Robert Dailey < rcdailey at gmail.com>: Hi, > > I have two questions: > > 1) Is there any way in numpy to represent vectors? Currently I'm > using 'array' for vectors. > > > A vector is an array with one dimension, it's OK. You could use a > matrix of dimension 1xn or nx1 as well. 
> > > 2) Is there a way to calculate the magnitude (length) of a vector > in numpy? > > Yes, len(a) > > Matthieu > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion From matthieu.brucher at gmail.com Wed Sep 5 14:50:13 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Wed, 5 Sep 2007 20:50:13 +0200 Subject: [Numpy-discussion] Vector magnitude? In-Reply-To: <496954360709051140k40f1b5cen7cdfafd9a9913873@mail.gmail.com> References: <496954360709050955t64ad9edclc18b0fa5d71acb9b@mail.gmail.com> <496954360709051140k40f1b5cen7cdfafd9a9913873@mail.gmail.com> Message-ID: 2007/9/5, Robert Dailey : > > Thanks for your response. > > I was not able to find len() in the numpy documentation at the following > link: > http://www.scipy.org/doc/numpy_api_docs/namespace_index.html > > Perhaps I'm looking in the wrong location? Yes, it's a Python function ;) Matthieu -------------- next part -------------- An HTML attachment was scrubbed... URL: From rcdailey at gmail.com Wed Sep 5 14:52:53 2007 From: rcdailey at gmail.com (Robert Dailey) Date: Wed, 5 Sep 2007 13:52:53 -0500 Subject: [Numpy-discussion] Vector magnitude? In-Reply-To: <496954360709051140k40f1b5cen7cdfafd9a9913873@mail.gmail.com> References: <496954360709050955t64ad9edclc18b0fa5d71acb9b@mail.gmail.com> <496954360709051140k40f1b5cen7cdfafd9a9913873@mail.gmail.com> Message-ID: <496954360709051152v30ed8ff6td831ba5922ed469e@mail.gmail.com> Oh I think I get it. You mean the built-in len() function? This isn't what I am looking for. len() returns the number of components in the vector (e.g. whether it is a 2D, 3D, etc vector). I found that magnitude can be calculated using hypot() in the math module that comes with python. However, this method only appears to work with 2D vectors. And yes, by magnitude I mean euclidean norm: sqrt( x*x + y*y ) = magnitude (length) of a vector On 9/5/07, Robert Dailey wrote: > > Thanks for your response. > > I was not able to find len() in the numpy documentation at the following > link: > http://www.scipy.org/doc/numpy_api_docs/namespace_index.html > > Perhaps I'm looking in the wrong location? > > On 9/5/07, Matthieu Brucher wrote: > > > > > > > 2007/9/5, Robert Dailey < rcdailey at gmail.com>: > > > > > > Hi, > > > > > > I have two questions: > > > > > > 1) Is there any way in numpy to represent vectors? Currently I'm using > > > 'array' for vectors. > > > > > > > > A vector is an array with one dimension, it's OK. You could use a matrix > > of dimension 1xn or nx1 as well. > > > > > > 2) Is there a way to calculate the magnitude (length) of a vector in > > > numpy? > > > > > > Yes, len(a) > > > > Matthieu > > > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at scipy.org > > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthieu.brucher at gmail.com Wed Sep 5 14:53:38 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Wed, 5 Sep 2007 20:53:38 +0200 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? 
In-Reply-To: <46DEF965.1000308@noaa.gov> References: <46DDF734.7070708@noaa.gov> <4CED078D-B8C7-4917-B0DE-4ED60BDFE015@sandia.gov> <46DEE52F.8050000@noaa.gov> <609FEB05-86EB-4FE2-867E-4A1C84105951@sandia.gov> <46DEF965.1000308@noaa.gov> Message-ID: > > He goes on to suggest that Blitz++ might have more of a future. (though > it looks like he's involved with the Boost project now) Blitz++ is more or less avandoned. It uses indexes than can be not-portable between 32bits platforms and 64bits ones. Is there another alternative? At the moment, all we need is a one-d > fixed size array. There is Boost::array (and MultiArray), but Boost has > always seemed like a big, ugly, hard to build dependency. Can you just > grab the code for some of these small pieces by themselves? > The Boost.Array is a fixed-size array, determined at compile-time, not interesting there, I suppose. Multiarrays are what you're looking for. Besides, it is not needed to build Boost to use them (Boost needs compilation for some libraries like Regex, program_options or Python). Matthieu -------------- next part -------------- An HTML attachment was scrubbed... URL: From wbaxter at gmail.com Wed Sep 5 14:55:12 2007 From: wbaxter at gmail.com (Bill Baxter) Date: Thu, 6 Sep 2007 03:55:12 +0900 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? In-Reply-To: <46DEF965.1000308@noaa.gov> References: <46DDF734.7070708@noaa.gov> <4CED078D-B8C7-4917-B0DE-4ED60BDFE015@sandia.gov> <46DEE52F.8050000@noaa.gov> <609FEB05-86EB-4FE2-867E-4A1C84105951@sandia.gov> <46DEF965.1000308@noaa.gov> Message-ID: On 9/6/07, Christopher Barker wrote: > Bill Spotz wrote: > However, I'm beginning to have my doubts about valarrays. I'm reading: > > Josuttis, Nicolai M. 1999. "The C+= Standard Library: A Tutorial and > Reference" > > It's 8 years old now, but he writes: > > "The valarray classes were not designed very well. In fact, nobody tried > to determine if the specification worked" I've never read that particular book, but I've also read somewhere that valarray is pretty useless for serious work. The timeframe is probably about the same, though -- I think the last time I used valarray was about 7-8 years ago. --bb From robert.kern at gmail.com Wed Sep 5 14:57:17 2007 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 05 Sep 2007 13:57:17 -0500 Subject: [Numpy-discussion] Vector magnitude? In-Reply-To: <496954360709051140k40f1b5cen7cdfafd9a9913873@mail.gmail.com> References: <496954360709050955t64ad9edclc18b0fa5d71acb9b@mail.gmail.com> <496954360709051140k40f1b5cen7cdfafd9a9913873@mail.gmail.com> Message-ID: <46DEFC0D.8040207@gmail.com> Robert Dailey wrote: > Thanks for your response. > > I was not able to find len() in the numpy documentation at the following > link: > http://www.scipy.org/doc/numpy_api_docs/namespace_index.html > > > Perhaps I'm looking in the wrong location? It's a Python builtin function, but it doesn't do what you want. It returns the number of elements in a sequence (any sequence, not just arrays) not the magnitude of the vector. Besides constructing the Euclidean norm itself (as shown by others here), you can also use numpy.linalg.norm() to calculate any of several different norms of a vector or a matrix: In [7]: numpy.linalg.norm? 
Type: function Base Class: Namespace: Interactive File: /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy-1.0.4.dev4025-py2.5-macosx-10.3-fat.egg/numpy/linalg/linalg.py Definition: numpy.linalg.norm(x, ord=None) Docstring: norm(x, ord=None) -> n Matrix or vector norm. Inputs: x -- a rank-1 (vector) or rank-2 (matrix) array ord -- the order of the norm. Comments: For arrays of any rank, if ord is None: calculate the square norm (Euclidean norm for vectors, Frobenius norm for matrices) For vectors ord can be any real number including Inf or -Inf. ord = Inf, computes the maximum of the magnitudes ord = -Inf, computes minimum of the magnitudes ord is finite, computes sum(abs(x)**ord,axis=0)**(1.0/ord) For matrices ord can only be one of the following values: ord = 2 computes the largest singular value ord = -2 computes the smallest singular value ord = 1 computes the largest column sum of absolute values ord = -1 computes the smallest column sum of absolute values ord = Inf computes the largest row sum of absolute values ord = -Inf computes the smallest row sum of absolute values ord = 'fro' computes the frobenius norm sqrt(sum(diag(X.H * X),axis=0)) For values ord < 0, the result is, strictly speaking, not a mathematical 'norm', but it may still be useful for numerical purposes. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From lbolla at gmail.com Wed Sep 5 14:59:17 2007 From: lbolla at gmail.com (lorenzo bolla) Date: Wed, 5 Sep 2007 20:59:17 +0200 Subject: [Numpy-discussion] Vector magnitude? In-Reply-To: <496954360709051152v30ed8ff6td831ba5922ed469e@mail.gmail.com> References: <496954360709050955t64ad9edclc18b0fa5d71acb9b@mail.gmail.com> <496954360709051140k40f1b5cen7cdfafd9a9913873@mail.gmail.com> <496954360709051152v30ed8ff6td831ba5922ed469e@mail.gmail.com> Message-ID: <80c99e790709051159ie84cc7wdf79205119e933e6@mail.gmail.com> maybe numpy.vdot is good for you. In [3]: x = numpy.random.rand(4) In [4]: x Out[4]: array([ 0.45426898, 0.22369238, 0.98731244, 0.7758774 ]) In [5]: numpy.sqrt(numpy.vdot(x,x)) Out[5]: 1.35394615117 hth, lorenzo On 9/5/07, Robert Dailey wrote: > > Oh I think I get it. > > You mean the built-in len() function? This isn't what I am looking for. > len() returns the number of components in the vector (e.g. whether it is a > 2D, 3D, etc vector). I found that magnitude can be calculated using hypot() > in the math module that comes with python. However, this method only appears > to work with 2D vectors. And yes, by magnitude I mean euclidean norm: > > sqrt( x*x + y*y ) = magnitude (length) of a vector > > On 9/5/07, Robert Dailey wrote: > > > > Thanks for your response. > > > > I was not able to find len() in the numpy documentation at the following > > link: > > http://www.scipy.org/doc/numpy_api_docs/namespace_index.html > > > > Perhaps I'm looking in the wrong location? > > > > On 9/5/07, Matthieu Brucher < matthieu.brucher at gmail.com > wrote: > > > > > > > > > > > 2007/9/5, Robert Dailey < rcdailey at gmail.com>: > > > > > > > > Hi, > > > > > > > > I have two questions: > > > > > > > > 1) Is there any way in numpy to represent vectors? Currently I'm > > > > using 'array' for vectors. > > > > > > > > > > > > A vector is an array with one dimension, it's OK. You could use a > > > matrix of dimension 1xn or nx1 as well. 
> > > > > > > > > 2) Is there a way to calculate the magnitude (length) of a vector in > > > > numpy? > > > > > > > > > Yes, len(a) > > > > > > Matthieu > > > > > > _______________________________________________ > > > Numpy-discussion mailing list > > > Numpy-discussion at scipy.org > > > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lou_boog2000 at yahoo.com Wed Sep 5 15:02:58 2007 From: lou_boog2000 at yahoo.com (Lou Pecora) Date: Wed, 5 Sep 2007 12:02:58 -0700 (PDT) Subject: [Numpy-discussion] Vector magnitude? In-Reply-To: <46DEFC0D.8040207@gmail.com> Message-ID: <870179.54459.qm@web34409.mail.mud.yahoo.com> --- Robert Kern wrote: > > Besides constructing the Euclidean norm itself (as > shown by others here), you > can also use numpy.linalg.norm() to calculate any of > several different norms of > a vector or a matrix: Right. linalg.norm also gives the proper magnitude of complex vectors -- Lou Pecora, my views are my own. --------------- Great spirits have always encountered violent opposition from mediocre minds. -Albert Einstein ____________________________________________________________________________________ Moody friends. Drama queens. Your life? Nope! - their life, your story. Play Sims Stories at Yahoo! Games. http://sims.yahoo.com/ From rcdailey at gmail.com Wed Sep 5 15:05:15 2007 From: rcdailey at gmail.com (Robert Dailey) Date: Wed, 5 Sep 2007 14:05:15 -0500 Subject: [Numpy-discussion] How-to: Uniform vector scale Message-ID: <496954360709051205g60433bfcndb5144c7bb1a3867@mail.gmail.com> Hi, I have a scalar value S. I want to perform the following math on vectors A and B (both of type array): A + B * S By order of operations, B * S should be done first. This is a vector multiplied by a real number and should be valid. However, the interpreter outputs: "ValueError: shape mismatch: objects cannot be broadcast to a single shape" I am not sure how I am supposed to multiply a vector with a scalar value. For example: array([5,2]) * 2 = [10,4] The above should happen. However, I get the error message instead. Any ideas? Thanks. -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Wed Sep 5 15:18:24 2007 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 05 Sep 2007 14:18:24 -0500 Subject: [Numpy-discussion] How-to: Uniform vector scale In-Reply-To: <496954360709051205g60433bfcndb5144c7bb1a3867@mail.gmail.com> References: <496954360709051205g60433bfcndb5144c7bb1a3867@mail.gmail.com> Message-ID: <46DF0100.7080905@gmail.com> Robert Dailey wrote: > Hi, > > I have a scalar value S. I want to perform the following math on vectors > A and B (both of type array): > > A + B * S > > By order of operations, B * S should be done first. This is a vector > multiplied by a real number and should be valid. However, the > interpreter outputs: > "ValueError: shape mismatch: objects cannot be broadcast to a single > shape" Most likely, the + is failing, not the multiplication. However, I can't tell from your description. It is much more helpful for us to help you if you give us a self-contained, small example that demonstrates the problem along with the output (copy-and-paste, no summaries) and the output that you expected. 
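For illustration, a minimal sketch of that failure mode (the shapes here are hypothetical, since the original arrays were not posted); the scalar multiply itself succeeds, and it is the addition that raises the broadcasting error:

import numpy

# Hypothetical shapes -- the arrays from the original report were not shown.
A = numpy.array([1.0, 2.0])        # shape (2,)
B = numpy.array([3.0, 4.0, 5.0])   # shape (3,)
S = 2

C = B * S    # fine: multiplying by a scalar gives array([ 6., 8., 10.])
D = A + C    # this is the line that fails: (2,) and (3,) cannot be broadcast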
> I am not sure how I am supposed to multiply a vector with a scalar > value. For example: > > array([5,2]) * 2 = [10,4] > > The above should happen. However, I get the error message instead. Any > ideas? Thanks. In [3]: from numpy import * In [4]: array([5,2]) * 2 Out[4]: array([10, 4]) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From dannoritzer at web.de Wed Sep 5 15:19:58 2007 From: dannoritzer at web.de (=?ISO-8859-1?Q?G=FCnter_Dannoritzer?=) Date: Wed, 05 Sep 2007 21:19:58 +0200 Subject: [Numpy-discussion] Use my own data type with NumPy In-Reply-To: <46DEE64B.4010100@noaa.gov> References: <46DE5F5D.8060204@web.de> <46DEE64B.4010100@noaa.gov> Message-ID: <46DF015E.3020502@web.de> Christopher Barker wrote: > [...] > The solution is to make an empty object array first, then populate it. [...] > > Does that help? Robert, Chris, thanks for that explanation. I understand that now. The purpose of my (Python) class is to model a fixed point data type. So I can specify how many bits are used for integer and how many bits are used for fractional representation. Then it should be possible to assign a value and do basic arithmetic with an instance of that class. The idea is that based on fixed point arithmetic rules, each operation tracks changes of bit width. I would now like to use that class in connection with numpy and my question is, whether there is a way to make its use as intuitive as possible for the user. Meaning that it would be possible to create a list of my FixedPoint instances and then assign that list to a numpy array. I created some minimal code that shows the behavior: import numpy class FixPoint(object): def __repr__(self): return "Hello" def __len__(self): return 3 def __getitem__(self, key): return 7 if __name__ == '__main__': a = numpy.array([FixPoint(), FixPoint()]) print "a: ", a b = [FixPoint(), FixPoint()] print "b: ", b When running that code, the output is: a: [[7 7 7] [7 7 7]] b: [Hello, Hello] What is interesting, the list uses the representation of the class, whereas array changes it to a list of the indexed values. Note that when changing the __len__ function to something else, the array also uses the __repr__ output. Would creating my own dType solve that problem? Cheers, Guenter From robert.kern at gmail.com Wed Sep 5 15:28:49 2007 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 05 Sep 2007 14:28:49 -0500 Subject: [Numpy-discussion] Use my own data type with NumPy In-Reply-To: <46DF015E.3020502@web.de> References: <46DE5F5D.8060204@web.de> <46DEE64B.4010100@noaa.gov> <46DF015E.3020502@web.de> Message-ID: <46DF0371.1050108@gmail.com> G?nter Dannoritzer wrote: > Christopher Barker wrote: > [...] >> The solution is to make an empty object array first, then populate it. > [...] >> Does that help? > > Robert, Chris, thanks for that explanation. I understand that now. > > The purpose of my (Python) class is to model a fixed point data type. So > I can specify how many bits are used for integer and how many bits are > used for fractional representation. Then it should be possible to assign > a value and do basic arithmetic with an instance of that class. The idea > is that based on fixed point arithmetic rules, each operation tracks > changes of bit width. 
> > I would now like to use that class in connection with numpy and my > question is, whether there is a way to make its use as intuitive as > possible for the user. Meaning that it would be possible to create a > list of my FixedPoint instances and then assign that list to a numpy array. > > I created some minimal code that shows the behavior: > > import numpy > > class FixPoint(object): > def __repr__(self): > return "Hello" > > def __len__(self): > return 3 > > def __getitem__(self, key): > return 7 > > > > if __name__ == '__main__': > a = numpy.array([FixPoint(), FixPoint()]) > print "a: ", a > > b = [FixPoint(), FixPoint()] > print "b: ", b > > > When running that code, the output is: > > a: [[7 7 7] > [7 7 7]] > b: [Hello, Hello] > > What is interesting, the list uses the representation of the class, > whereas array changes it to a list of the indexed values. > > Note that when changing the __len__ function to something else, the > array also uses the __repr__ output. Yes, I believe we've explained why this is the case and how to work around it. You can encapsulate that workaround into a function specifically for making arrays of FixedPoint objects, if you like. > Would creating my own dType solve that problem? No. That's only useful for C data types, not Python instances. You're pretty much stuck with object arrays. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From bryanv at enthought.com Wed Sep 5 15:35:05 2007 From: bryanv at enthought.com (Bryan Van de Ven) Date: Wed, 05 Sep 2007 14:35:05 -0500 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? In-Reply-To: <46DEF965.1000308@noaa.gov> References: <46DDF734.7070708@noaa.gov> <4CED078D-B8C7-4917-B0DE-4ED60BDFE015@sandia.gov> <46DEE52F.8050000@noaa.gov> <609FEB05-86EB-4FE2-867E-4A1C84105951@sandia.gov> <46DEF965.1000308@noaa.gov> Message-ID: <46DF04E9.2020501@enthought.com> Christopher Barker wrote: > Does anyone know the status of support for valarrays now? I used std::valarray to implement a variant of the example Matrix class in Stroustrup's book (2D only) about two years ago. I was aware that is in disuse, by and large, but it worked well enough for my purposes and I was happy with it. I'm sure it could have been done differently/better. Bryan From zpincus at stanford.edu Wed Sep 5 16:10:38 2007 From: zpincus at stanford.edu (Zachary Pincus) Date: Wed, 5 Sep 2007 16:10:38 -0400 Subject: [Numpy-discussion] numpy.rot90 bug with >2D arrays Message-ID: Hello, numpy.rot90 (from twodim_base.by) says it works only on the first two axes of an array, but due to its use of the transpose method (which reverses the shape tuple), can affect other axes. For example: a = numpy.ones((50,40,3)) b = numpy.rot90(a) assert(b.shape == (3,40,50)) # desired result is b.shape == (40,50,3) I believe that the fix is to replace the two calls to 'transpose()' in the function with 'swapaxes(0,1)'. This would allow rotating an array along just the first two axes. Does this fix look right? Should I file a bug or can someone just check that in? 
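For reference, a minimal sketch of the behaviour being proposed (the helper name is made up for illustration; this is not the patched numpy function itself):

import numpy

def rot90_first_two_axes(m, k=1):
    # Illustrative only: rotate in the plane of the first two axes by
    # swapping axes 0 and 1, instead of reversing the whole shape tuple
    # with transpose().
    m = numpy.asarray(m)
    k = k % 4
    if k == 0:
        return m
    elif k == 1:
        return m.swapaxes(0, 1)[::-1]
    elif k == 2:
        return m[::-1, ::-1]
    else:
        return m.swapaxes(0, 1)[:, ::-1]

a = numpy.ones((50, 40, 3))
b = rot90_first_two_axes(a)
assert b.shape == (40, 50, 3)   # the trailing (e.g. color) axis is untouched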
Zach From Chris.Barker at noaa.gov Wed Sep 5 16:22:29 2007 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed, 05 Sep 2007 13:22:29 -0700 Subject: [Numpy-discussion] How-to: Uniform vector scale In-Reply-To: <496954360709051205g60433bfcndb5144c7bb1a3867@mail.gmail.com> References: <496954360709051205g60433bfcndb5144c7bb1a3867@mail.gmail.com> Message-ID: <46DF1005.6090808@noaa.gov> Robert Dailey wrote: > The > interpreter outputs: > "ValueError: shape mismatch: objects cannot be broadcast to a single > shape" You need to post your actually input and output. The above works fine for me, just as you'd expect: >>> A = N.array([2,3,4]) >>> B = N.array([5,6,7]) >>> S = 5 >>> A+B*S array([27, 33, 39]) # I like to be explicit about order of operations, though: >>> A+(B*S) array([27, 33, 39]) -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Chris.Barker at noaa.gov Wed Sep 5 16:28:28 2007 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed, 05 Sep 2007 13:28:28 -0700 Subject: [Numpy-discussion] Use my own data type with NumPy In-Reply-To: <46DF015E.3020502@web.de> References: <46DE5F5D.8060204@web.de> <46DEE64B.4010100@noaa.gov> <46DF015E.3020502@web.de> Message-ID: <46DF116C.5040604@noaa.gov> G?nter Dannoritzer wrote: > The purpose of my (Python) class is to model a fixed point data type. So > I can specify how many bits are used for integer and how many bits are > used for fractional representation. > it would be possible to create a > list of my FixedPoint instances and then assign that list to a numpy array. If your class looks like a sequence, it's going to confuse numpy.array() > I created some minimal code that shows the behavior: > > import numpy > > class FixPoint(object): > def __repr__(self): > return "Hello" > > def __len__(self): > return 3 > > def __getitem__(self, key): > return 7 Why would a FixPoint object have to look like a sequence, with a length and a _getitem_? That's where the confusion is coming from. If I understand your needs, a FixPoint object is a number -- you'll want to override __add__ __mult__, and all those sorts of things, but there should be no need to make it look like a sequence. What does the "length" of a fixed point number mean? What does it meant to get the third element of a fixed point object? I'm guessing that maybe you're using __len__ to mean bit-width, but I'd just have a property for that, and call it BitWidth or something. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Chris.Barker at noaa.gov Wed Sep 5 17:08:07 2007 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed, 05 Sep 2007 14:08:07 -0700 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? In-Reply-To: References: <46DDF734.7070708@noaa.gov> <4CED078D-B8C7-4917-B0DE-4ED60BDFE015@sandia.gov> <46DEE52F.8050000@noaa.gov> <609FEB05-86EB-4FE2-867E-4A1C84105951@sandia.gov> <46DEF965.1000308@noaa.gov> Message-ID: <46DF1AB7.5030900@noaa.gov> Matthieu Brucher wrote: > Blitz++ is more or less avandoned. It uses indexes than can be > not-portable between 32bits platforms and 64bits ones. 
Oh well -- that seems remarkably short sited, but would I have done better? > The Boost.Array is a fixed-size array, determined at compile-time, Ah, I had gotten the wrong impression -- I thought it was fixed at construction time, not compile time. > not interesting there, I suppose. I agree, I kind of wonder what the point is. > Multiarrays are what you're looking for. Even if I just want 1-d? though I guess a 1-d multiarray is pretty simple. > Besides, it is not needed to build Boost to use them I've seen that -- it does look like all we'd need is the header. So, can one: - create a Multiarray from an existing data pointer? - get the data pointer for an existing Multiarray? I think that's what I'd need to make the numpy array <-> Multiarray transition without any copying. ( I know those are really questions that are best asked on the boost list, and should be in the docs, but you folks are so helpful...) Maybe this is the class to wrap with the array interface, though maybe that's exactly what Boost::python::array does (though, AFAICT, still not for numpy). -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From paustin at eos.ubc.ca Wed Sep 5 17:32:13 2007 From: paustin at eos.ubc.ca (Philip Austin) Date: Wed, 5 Sep 2007 14:32:13 -0700 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? In-Reply-To: <46DF1AB7.5030900@noaa.gov> References: <46DDF734.7070708@noaa.gov> <4CED078D-B8C7-4917-B0DE-4ED60BDFE015@sandia.gov> <46DEE52F.8050000@noaa.gov> <609FEB05-86EB-4FE2-867E-4A1C84105951@sandia.gov> <46DEF965.1000308@noaa.gov> <46DF1AB7.5030900@noaa.gov> Message-ID: <18143.8285.670787.980433@owl.eos.ubc.ca> Christopher Barker writes: > > I've seen that -- it does look like all we'd need is the header. > > So, can one: > > - create a Multiarray from an existing data pointer? > > - get the data pointer for an existing Multiarray? > > I think that's what I'd need to make the numpy array <-> Multiarray > transition without any copying. Albert Strasheim has done some work on this: http://thread.gmane.org/gmane.comp.python.c++/11559/focus=11560 > > ( I know those are really questions that are best asked on the boost > list, and should be in the docs, but you folks are so helpful...) > > Maybe this is the class to wrap with the array interface, though maybe > that's exactly what Boost::python::array does (though, AFAICT, still not > for numpy). > From Chris.Barker at noaa.gov Wed Sep 5 17:57:01 2007 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed, 05 Sep 2007 14:57:01 -0700 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? In-Reply-To: <18143.8285.670787.980433@owl.eos.ubc.ca> References: <46DDF734.7070708@noaa.gov> <4CED078D-B8C7-4917-B0DE-4ED60BDFE015@sandia.gov> <46DEE52F.8050000@noaa.gov> <609FEB05-86EB-4FE2-867E-4A1C84105951@sandia.gov> <46DEF965.1000308@noaa.gov> <46DF1AB7.5030900@noaa.gov> <18143.8285.670787.980433@owl.eos.ubc.ca> Message-ID: <46DF262D.5080001@noaa.gov> Philip Austin wrote: > Albert Strasheim has done some work on this: > http://thread.gmane.org/gmane.comp.python.c++/11559/focus=11560 Thanks for the pointer. Not a lot of docs, and it looks like he's using boost::python, and I want to use SWIG, but I'm sure there's something useful in there. 
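On the numpy side of such a no-copy hand-off, the raw data pointer and type description are already reachable from Python; a minimal sketch (the array here is just a placeholder):

import numpy

a = numpy.zeros((10,), dtype=numpy.float64)
ptr, readonly = a.__array_interface__['data']   # integer address of the buffer
typestr = a.__array_interface__['typestr']      # e.g. '<f8' for little-endian float64
shape = a.__array_interface__['shape']          # (10,)
# ptr, shape and typestr are what a C++ wrapper (e.g. a MultiArray view)
# would need in order to look at the same memory without copying.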
-Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From David.L.Goldsmith at noaa.gov Wed Sep 5 17:57:42 2007 From: David.L.Goldsmith at noaa.gov (David Goldsmith) Date: Wed, 05 Sep 2007 14:57:42 -0700 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? In-Reply-To: References: <46DDF734.7070708@noaa.gov> <32632B72-2B1A-4AC2-AA04-288696E37A46@ster.kuleuven.be> <46DE9C30.8020300@obs.univ-lyon1.fr> Message-ID: <46DF2656.9010605@noaa.gov> Travis Vaught wrote: > Have you seen this? > > http://www.scipy.org/Cookbook/SWIG_and_NumPy > Unclear (to me): precisely what does one get from running python numpy/docs/swig/setup.py install, and is the product necessary, and if so, which other components rely on the product? I ask 'cause I'm getting the following error trying to do so: [Parallels emulating] Microsoft Windows XP [Version 5.1.2600] (C) Copyright 1985-2001 Microsoft Corp. C:\Python25\Lib\site-packages\numpy\doc\swig>python setup.py install running install running build running build_py file Vector.py (for module Vector) not found file Matrix.py (for module Matrix) not found file Tensor.py (for module Tensor) not found file Vector.py (for module Vector) not found file Matrix.py (for module Matrix) not found file Tensor.py (for module Tensor) not found running build_ext building '_Vector' extension creating build creating build\temp.win32-2.5 creating build\temp.win32-2.5\Release C:\Program Files\Microsoft Visual Studio .NET 2003\Vc7\bin\cl.exe /c /nologo /Ox /MD /W3 /GX /DNDEBUG -IC:\Python25\lib\site-packages\numpy\core\include -IC:\Python25\include -IC:\Python25\PC /TpVector_wrap.cxx /Fobuild\temp.win32-2.5\Release\Vector_wrap.obj cl : Command line warning D4029 : optimization is not available in the standard edition compiler Vector_wrap.cxx c1xx : fatal error C1083: Cannot open source file: 'Vector_wrap.cxx': No such file or directory error: command '"C:\Program Files\Microsoft Visual Studio .NET 2003\Vc7\bin\cl.exe"' failed with exit status 2 I have Python 2.5.1 installed: C:\Python25\Lib\site-packages\numpy\doc\swig>python Python 2.5.1 (r251:54863, Apr 18 2007, 08:51:08) [MSC v.1310 32 bit (Intel)] on win32 and numpy 1.0.3: >>> import numpy >>> numpy.__version__ '1.0.3' I have the swig-1.3.31 exe: Directory of C:\SWIG\swigwin-1.3.31 11/21/2006 12:07 AM 1,190,652 swig.exe and it runs (take my word for it) and the VC++ compiler via Visual Studio .NET 2003 (this I know 'cause I use it frequently). So, if I don't need the product of python numpy/doc/swig/setup.py install, please explain why I don't, but if I do need it, please help me figure out why I can't build it. Thanks! DG > Also, the numpy/doc/swig directory has the simple typemaps. > > Travis > > On Sep 5, 2007, at 7:08 AM, Xavier Gnata wrote: > > >> I'm using the numpy C API (PyArray_SimpleNewFromData) to perform the >> conversion but my code is written by hands. I would like to >> simplify it >> using SWIG but I also would like to see a good typemap valarray <=> >> numpy.array :) >> >> Joris : Historical ones? Maybe also the fact that distutils has >> some small pb with C++ module. To sum up : You have to compile you >> module with the same compiler options you used to compile Python. >> Python is coded in C so some options does not make sense in C++. 
>> As a result, you get warnings at compile time (see pylab compiled >> with gcc for instance). >> >> >> Xavier >> >> >> >> >>> A related question, just out of curiosity: is there a technical >>> reason why Numpy has been coded in C rather than C++? >>> >>> Joris >>> >>> >>> >>> On 05 Sep 2007, at 02:24, David Goldsmith wrote: >>> >>> >>> >>>> Anyone have a well-tested SWIG-based C++ STL valarray <=> >>>> numpy.array >>>> typemap to share? Thanks! >>>> >>>> DG >>>> -- >>>> ERD/ORR/NOS/NOAA >>> emergencyresponse/> >>>> _______________________________________________ >>>> Numpy-discussion mailing list >>>> Numpy-discussion at scipy.org >>>> http://projects.scipy.org/mailman/listinfo/numpy-discussion >>>> >>>> >>> Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm >>> >>> _______________________________________________ >>> Numpy-discussion mailing list >>> Numpy-discussion at scipy.org >>> http://projects.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >>> >> -- >> ############################################ >> Xavier Gnata >> CRAL - Observatoire de Lyon >> 9, avenue Charles Andr? >> 69561 Saint Genis Laval cedex >> Phone: +33 4 78 86 85 28 >> Fax: +33 4 78 86 83 86 >> E-mail: gnata at obs.univ-lyon1.fr >> ############################################ >> >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://projects.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From dannoritzer at web.de Wed Sep 5 17:59:23 2007 From: dannoritzer at web.de (=?ISO-8859-1?Q?G=FCnter_Dannoritzer?=) Date: Wed, 05 Sep 2007 23:59:23 +0200 Subject: [Numpy-discussion] Use my own data type with NumPy In-Reply-To: <46DF116C.5040604@noaa.gov> References: <46DE5F5D.8060204@web.de> <46DEE64B.4010100@noaa.gov> <46DF015E.3020502@web.de> <46DF116C.5040604@noaa.gov> Message-ID: <46DF26BB.8090107@web.de> Christopher Barker wrote: [...] > > Why would a FixPoint object have to look like a sequence, with a length > and a _getitem_? That's where the confusion is coming from. > That allows me to slice bits. > If I understand your needs, a FixPoint object is a number -- you'll want > to override __add__ __mult__, and all those sorts of things, but there > should be no need to make it look like a sequence. > > What does the "length" of a fixed point number mean? What does it meant > to get the third element of a fixed point object? > Yes it returns the bit width. When indexing you can get individual bits or a bit range. That is good for modeling hardware behavior. > I'm guessing that maybe you're using __len__ to mean bit-width, but I'd > just have a property for that, and call it BitWidth or something. That solved it. I took out __len__ and added a width property. Now array is happy. Guenter From wfspotz at sandia.gov Wed Sep 5 18:13:18 2007 From: wfspotz at sandia.gov (Bill Spotz) Date: Wed, 5 Sep 2007 16:13:18 -0600 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? In-Reply-To: <46DF2656.9010605@noaa.gov> References: <46DDF734.7070708@noaa.gov> <32632B72-2B1A-4AC2-AA04-288696E37A46@ster.kuleuven.be> <46DE9C30.8020300@obs.univ-lyon1.fr> <46DF2656.9010605@noaa.gov> Message-ID: The setup.py script in numpy/doc/swig is for compiling test code for numpy.i. 
It is properly invoked by the Makefile, which will first run swig to generate the wrapper code for the test classes. All a developer, who is using swig to interface some code with numpy in python, needs is numpy.i. The setup.py script could possibly work as a template for whatever they are wrapping, I guess. On Sep 5, 2007, at 3:57 PM, David Goldsmith wrote: > Travis Vaught wrote: >> Have you seen this? >> >> http://www.scipy.org/Cookbook/SWIG_and_NumPy >> > Unclear (to me): precisely what does one get from running python > numpy/docs/swig/setup.py install, and is the product necessary, and if > so, which other components rely on the product? I ask 'cause I'm > getting the following error trying to do so: > > [Parallels emulating] Microsoft Windows XP [Version 5.1.2600] > (C) Copyright 1985-2001 Microsoft Corp. > > C:\Python25\Lib\site-packages\numpy\doc\swig>python setup.py install > running install > running build > running build_py > file Vector.py (for module Vector) not found > file Matrix.py (for module Matrix) not found > file Tensor.py (for module Tensor) not found > file Vector.py (for module Vector) not found > file Matrix.py (for module Matrix) not found > file Tensor.py (for module Tensor) not found > running build_ext > building '_Vector' extension > creating build > creating build\temp.win32-2.5 > creating build\temp.win32-2.5\Release > C:\Program Files\Microsoft Visual Studio .NET 2003\Vc7\bin\cl.exe /c > /nologo /Ox > /MD /W3 /GX /DNDEBUG -IC:\Python25\lib\site-packages\numpy\core > \include > -IC:\Python25\include -IC:\Python25\PC /TpVector_wrap.cxx > /Fobuild\temp.win32-2.5\Release\Vector_wrap.obj > cl : Command line warning D4029 : optimization is not available in the > standard edition compiler > Vector_wrap.cxx > c1xx : fatal error C1083: Cannot open source file: > 'Vector_wrap.cxx': No > such file or directory > error: command '"C:\Program Files\Microsoft Visual Studio .NET > 2003\Vc7\bin\cl.exe"' failed with exit status 2 > > I have Python 2.5.1 installed: > > C:\Python25\Lib\site-packages\numpy\doc\swig>python > Python 2.5.1 (r251:54863, Apr 18 2007, 08:51:08) [MSC v.1310 32 bit > (Intel)] on win32 > > and numpy 1.0.3: >>>> import numpy >>>> numpy.__version__ > '1.0.3' > > I have the swig-1.3.31 exe: > > Directory of C:\SWIG\swigwin-1.3.31 > > 11/21/2006 12:07 AM 1,190,652 swig.exe > > and it runs (take my word for it) > > and the VC++ compiler via Visual Studio .NET 2003 (this I know > 'cause I > use it frequently). > > So, if I don't need the product of python numpy/doc/swig/setup.py > install, please explain why I don't, but if I do need it, please > help me > figure out why I can't build it. Thanks! > > DG > > >> Also, the numpy/doc/swig directory has the simple typemaps. >> >> Travis >> >> On Sep 5, 2007, at 7:08 AM, Xavier Gnata wrote: >> >> >>> I'm using the numpy C API (PyArray_SimpleNewFromData) to perform the >>> conversion but my code is written by hands. I would like to >>> simplify it >>> using SWIG but I also would like to see a good typemap valarray <=> >>> numpy.array :) >>> >>> Joris : Historical ones? Maybe also the fact that distutils has >>> some small pb with C++ module. To sum up : You have to compile you >>> module with the same compiler options you used to compile Python. >>> Python is coded in C so some options does not make sense in C++. >>> As a result, you get warnings at compile time (see pylab compiled >>> with gcc for instance). 
>>> >>> >>> Xavier >>> >>> >>> >>> >>>> A related question, just out of curiosity: is there a technical >>>> reason why Numpy has been coded in C rather than C++? >>>> >>>> Joris >>>> >>>> >>>> >>>> On 05 Sep 2007, at 02:24, David Goldsmith wrote: >>>> >>>> >>>> >>>>> Anyone have a well-tested SWIG-based C++ STL valarray <=> >>>>> numpy.array >>>>> typemap to share? Thanks! >>>>> >>>>> DG >>>>> -- >>>>> ERD/ORR/NOS/NOAA >>>> emergencyresponse/> >>>>> _______________________________________________ >>>>> Numpy-discussion mailing list >>>>> Numpy-discussion at scipy.org >>>>> http://projects.scipy.org/mailman/listinfo/numpy-discussion >>>>> >>>>> >>>> Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm >>>> >>>> _______________________________________________ >>>> Numpy-discussion mailing list >>>> Numpy-discussion at scipy.org >>>> http://projects.scipy.org/mailman/listinfo/numpy-discussion >>>> >>>> >>>> >>> -- >>> ############################################ >>> Xavier Gnata >>> CRAL - Observatoire de Lyon >>> 9, avenue Charles Andr? >>> 69561 Saint Genis Laval cedex >>> Phone: +33 4 78 86 85 28 >>> Fax: +33 4 78 86 83 86 >>> E-mail: gnata at obs.univ-lyon1.fr >>> ############################################ >>> >>> _______________________________________________ >>> Numpy-discussion mailing list >>> Numpy-discussion at scipy.org >>> http://projects.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://projects.scipy.org/mailman/listinfo/numpy-discussion >> > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > ** Bill Spotz ** ** Sandia National Laboratories Voice: (505)845-0170 ** ** P.O. Box 5800 Fax: (505)284-5451 ** ** Albuquerque, NM 87185-0370 Email: wfspotz at sandia.gov ** From jbednar at inf.ed.ac.uk Wed Sep 5 18:37:06 2007 From: jbednar at inf.ed.ac.uk (James A. Bednar) Date: Wed, 5 Sep 2007 23:37:06 +0100 Subject: [Numpy-discussion] Use my own data type with NumPy In-Reply-To: References: Message-ID: <18143.12178.421607.971471@lodestar.inf.ed.ac.uk> | Date: Wed, 05 Sep 2007 21:19:58 +0200 | From: G?nter Dannoritzer | Subject: Re: [Numpy-discussion] Use my own data type with NumPy | | The purpose of my (Python) class is to model a fixed point data | type. So I can specify how many bits are used for integer and how | many bits are used for fractional representation. Then it should be | possible to assign a value and do basic arithmetic with an instance | of that class. The idea is that based on fixed point arithmetic | rules, each operation tracks changes of bit width. You may already be aware of this, but there is a package available that does exactly what you describe: http://fixedpoint.sourceforge.net It is not currently actively aintained, and it's quite slow, but it does work reliably, and we use it daily at my site. It might be best for all concerned if you simply took over that project, making it faster and well supported, rather than creating a competing one... Jim From matthieu.brucher at gmail.com Thu Sep 6 04:06:35 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Thu, 6 Sep 2007 10:06:35 +0200 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? 
In-Reply-To: <46DF1AB7.5030900@noaa.gov> References: <46DDF734.7070708@noaa.gov> <4CED078D-B8C7-4917-B0DE-4ED60BDFE015@sandia.gov> <46DEE52F.8050000@noaa.gov> <609FEB05-86EB-4FE2-867E-4A1C84105951@sandia.gov> <46DEF965.1000308@noaa.gov> <46DF1AB7.5030900@noaa.gov> Message-ID: 2007/9/5, Christopher Barker : > > Matthieu Brucher wrote: > > Blitz++ is more or less avandoned. It uses indexes than can be > > not-portable between 32bits platforms and 64bits ones. > > Oh well -- that seems remarkably short sited, but would I have done > better? Well, it's too bad the mainteners used int instead of long or somthing like that, but at the time, 64bits platforms did not exist. > The Boost.Array is a fixed-size array, determined at compile-time, > > Ah, I had gotten the wrong impression -- I thought it was fixed at > construction time, not compile time. According to the doc, it's fixed at compile-time. > not interesting there, I suppose. > > I agree, I kind of wonder what the point is. In some case you might want them, but not very often, only to speed up computation. > Multiarrays are what you're looking for. > > Even if I just want 1-d? though I guess a 1-d multiarray is pretty simple. > > > Besides, it is not needed to build Boost to use them > > I've seen that -- it does look like all we'd need is the header. > > So, can one: > > - create a Multiarray from an existing data pointer? > > - get the data pointer for an existing Multiarray? > > I think that's what I'd need to make the numpy array <-> Multiarray > transition without any copying. > I have the same problem at my job, but I don't think SWIG will suit me, although I use it for simpler wrappers. Like Philip said, there are some trials, I hope someone (or I) will come up with a Python array <-> C++ array wrapper without copy. What you really need is that the multiarray must be able to use the data pointer, it's a special policy, and then you must be able to register a shared pointer to Python, that is if the original container use shared pointer. If the last part is not possible, you will need to create first a Python array and then make a view of it in C++, even for a result array. Matthieu -------------- next part -------------- An HTML attachment was scrubbed... URL: From george.sakkis at gmail.com Thu Sep 6 07:30:43 2007 From: george.sakkis at gmail.com (George Sakkis) Date: Thu, 06 Sep 2007 11:30:43 -0000 Subject: [Numpy-discussion] 2-d in-place operation performance vs 1-d non in-place In-Reply-To: <200709051829.24448.faltet@carabos.com> References: <91ad5bf80709050722n67ef53b3l6902aa03eff95a66@mail.gmail.com> <200709051829.24448.faltet@carabos.com> Message-ID: <1189078243.699888.8790@k79g2000hse.googlegroups.com> On Sep 5, 12:29 pm, Francesc Altet wrote: > A Wednesday 05 September 2007, George Sakkis escrigu?: > > > > > I was surprised to see that an in-place modification of a 2-d array > > turns out to be slower from the respective non-mutating operation on > > 1- d arrays, although the latter creates new array objects. 
Here is > > the benchmarking code: > > > import timeit > > > for n in 10,100,1000,10000: > > setup = 'from numpy.random import random;' \ > > 'm=random((%d,2));' \ > > 'u1=random(%d);' \ > > 'u2=u1.reshape((u1.size,1))' % (n,n) > > timers = [timeit.Timer(stmt,setup) for stmt in > > # 1-d operations; create new arrays > > 'a0 = m[:,0]-u1; a1 = m[:,1]-u1', > > # 2-d in place operation > > 'm -= u2' > > ] > > print n, [min(timer.repeat(3,1000)) for timer in timers] > > > And some results (Python 2.5, WinXP): > > > 10 [0.010832382327921563, 0.0045706926438974782] > > 100 [0.010882668048592767, 0.021704993232380093] > > 1000 [0.018272154701226007, 0.19477587235249172] > > 10000 [0.073787590322233698, 1.9234369172618306] > > > So the 2-d in-place modification time grows linearly with the array > > size but the 1-d operations are much more efficient, despite > > allocating new arrays while doing so. What gives ? > > This seems the effect of broadcasting u2. If you were to use a > pre-computed broadcasted, you would get rid of such bottleneck: > > for n in 10,100,1000,10000: > setup = 'import numpy;' \ > 'm=numpy.random.random((%d,2));' \ > 'u1=numpy.random.random(%d);' \ > 'u2=u1[:, numpy.newaxis];' \ > 'u3=numpy.array([u1,u1]).transpose()' % (n,n) > timers = [timeit.Timer(stmt,setup) for stmt in > # 1-d operations; create new arrays > 'a0 = m[:,0]-u1; a1 = m[:,1]-u1', > # 2-d in place operation (using broadcasting) > 'm -= u2', > # 2-d in-place operation (not forcing broadcasting) > 'm -= u3' > ] > print n, [min(timer.repeat(3,1000)) for timer in timers] > > gives in my machine: > > 10 [0.03213191032409668, 0.012019872665405273, 0.0068600177764892578] > 100 [0.033048152923583984, 0.06542205810546875, 0.0076580047607421875] > 1000 [0.040294170379638672, 0.59892702102661133, 0.014600992202758789] > 10000 [0.32667303085327148, 5.9721651077270508, 0.10261106491088867] Thank you, indeed broadcasting is the bottleneck here. I had the impression that broadcasting was a fast operation, i.e. it doesn't require allocating physically the target array of the broadcast but it seems this is not the case. George From tim.hochberg at ieee.org Thu Sep 6 11:07:19 2007 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Thu, 6 Sep 2007 08:07:19 -0700 Subject: [Numpy-discussion] 2-d in-place operation performance vs 1-d non in-place In-Reply-To: <1189078243.699888.8790@k79g2000hse.googlegroups.com> References: <91ad5bf80709050722n67ef53b3l6902aa03eff95a66@mail.gmail.com> <200709051829.24448.faltet@carabos.com> <1189078243.699888.8790@k79g2000hse.googlegroups.com> Message-ID: On 9/6/07, George Sakkis wrote: > > On Sep 5, 12:29 pm, Francesc Altet wrote: > > A Wednesday 05 September 2007, George Sakkis escrigu?: > > > > > > > > > I was surprised to see that an in-place modification of a 2-d array > > > turns out to be slower from the respective non-mutating operation on > > > 1- d arrays, although the latter creates new array objects. 
Here is > > > the benchmarking code: > > > > > import timeit > > > > > for n in 10,100,1000,10000: > > > setup = 'from numpy.random import random;' \ > > > 'm=random((%d,2));' \ > > > 'u1=random(%d);' \ > > > 'u2=u1.reshape((u1.size,1))' % (n,n) > > > timers = [timeit.Timer(stmt,setup) for stmt in > > > # 1-d operations; create new arrays > > > 'a0 = m[:,0]-u1; a1 = m[:,1]-u1', > > > # 2-d in place operation > > > 'm -= u2' > > > ] > > > print n, [min(timer.repeat(3,1000)) for timer in timers] > > > > > And some results (Python 2.5, WinXP): > > > > > 10 [0.010832382327921563, 0.0045706926438974782] > > > 100 [0.010882668048592767, 0.021704993232380093] > > > 1000 [0.018272154701226007, 0.19477587235249172] > > > 10000 [0.073787590322233698, 1.9234369172618306] > > > > > So the 2-d in-place modification time grows linearly with the array > > > size but the 1-d operations are much more efficient, despite > > > allocating new arrays while doing so. What gives ? > > > > This seems the effect of broadcasting u2. If you were to use a > > pre-computed broadcasted, you would get rid of such bottleneck: > > > > for n in 10,100,1000,10000: > > setup = 'import numpy;' \ > > 'm=numpy.random.random((%d,2));' \ > > 'u1=numpy.random.random(%d);' \ > > 'u2=u1[:, numpy.newaxis];' \ > > 'u3=numpy.array([u1,u1]).transpose()' % (n,n) > > timers = [timeit.Timer(stmt,setup) for stmt in > > # 1-d operations; create new arrays > > 'a0 = m[:,0]-u1; a1 = m[:,1]-u1', > > # 2-d in place operation (using broadcasting) > > 'm -= u2', > > # 2-d in-place operation (not forcing broadcasting) > > 'm -= u3' > > ] > > print n, [min(timer.repeat(3,1000)) for timer in timers] > > > > gives in my machine: > > > > 10 [0.03213191032409668, 0.012019872665405273, 0.0068600177764892578] > > 100 [0.033048152923583984, 0.06542205810546875, 0.0076580047607421875] > > 1000 [0.040294170379638672, 0.59892702102661133, 0.014600992202758789] > > 10000 [0.32667303085327148, 5.9721651077270508, 0.10261106491088867] > > Thank you, indeed broadcasting is the bottleneck here. I had the > impression that broadcasting was a fast operation, I'm fairly certain that this is correct. > i.e. it doesn't > require allocating physically the target array of the broadcast but it > seems this is not the case. I don't think it does. However it does change the way the ufuncs stride over the array when they do the operation, which I suspect is the root of the problem. First note that all of these operations are (and should be) linear in n. I played around this a little bit and found a couple of things; first the axis along which you broadcast matters a lot. That's not surprising; when you broadcast along the short axis I believe that the inner loop ends up very short and thus grows a lot of overhead. The other is that in-place versus not in place makes a big difference. I'm not sure why that is. Here's the amended benchmark; I use larger values of N so that the linear behavior of all the examples is clearer. 
import timeit

for n in 10000,20000,30000,40000,50000:
    setup = """
import numpy
m = numpy.random.random((%d,2))
u1 = numpy.random.random(%d)
u2 = u1[:, numpy.newaxis]
u3 = numpy.array([u1,u1]).transpose()
m2 = numpy.array(m.transpose())
u4 = u1[numpy.newaxis, :]
""" % (n,n)
    timers = [timeit.Timer(stmt,setup) for stmt in
              # 1-d operations; create new arrays
              'a0 = m[:,0]-u1; a1 = m[:,1]-u1',
              # 2-d in place operation (using broadcasting)
              'x = m - u2',
              # 2-d in-place operation (not forcing broadcasting)
              'x = m - u3',
              # transposed using broadcasting
              'x = m2 - u4'
              ]
    print n, [min(timer.repeat(3,100)) for timer in timers]

and the results:

10000 [0.0061748071333088406, 0.091492354475219639, 0.0063329277883082957, 0.0055317086389471415]
20000 [0.01121259824921883, 0.18455949010279238, 0.013805665245163912, 0.010877918841640355]
30000 [0.018362668998434195, 0.27472122853602898, 0.026029285844988426, 0.01910075163184155]
40000 [0.029564092643059592, 0.4027084447883742, 0.12735166061608361, 0.10206883835794844]
50000 [0.059520972637583824, 0.50685380404492797, 0.17739796538386798, 0.13967277964098823]

-- . __ . |-\ . . tim.hochberg at ieee.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jdh2358 at gmail.com Thu Sep 6 14:06:50 2007
From: jdh2358 at gmail.com (John Hunter)
Date: Thu, 6 Sep 2007 13:06:50 -0500
Subject: [Numpy-discussion] corrcoef
Message-ID: <88e473830709061106w634e997s7dc027da040ba260@mail.gmail.com>

Is it desirable that numpy.corrcoef for two arrays returns a 2x2 array
rather than a scalar

In [10]: npy.corrcoef(npy.random.rand(10), npy.random.rand(10))
Out[10]:
array([[ 1. , -0.16088728],
       [-0.16088728, 1. ]])

I always end up extracting the 0,1 element anyway. What is the
advantage, aside from backwards compatibility, for returning a 2x2?
JDH

From svetosch at gmx.net Thu Sep 6 13:13:24 2007
From: svetosch at gmx.net (Sven Schreiber)
Date: Thu, 06 Sep 2007 18:13:24 +0100
Subject: [Numpy-discussion] corrcoef
In-Reply-To: <88e473830709061106w634e997s7dc027da040ba260@mail.gmail.com>
References: <88e473830709061106w634e997s7dc027da040ba260@mail.gmail.com>
Message-ID: <46E03534.9000309@gmx.net>

John Hunter schrieb:
> Is it desirable that numpy.corrcoef for two arrays returns a 2x2 array
> rather than a scalar
>
> In [10]: npy.corrcoef(npy.random.rand(10), npy.random.rand(10))
> Out[10]:
> array([[ 1. , -0.16088728],
>        [-0.16088728, 1. ]])
>
>
> I always end up extracting the 0,1 element anyway. What is the
> advantage, aside from backwards compatibility, for returning a 2x2?
> JDH

Forgive me if my answer is trivial and misses your point, but I guess
because for more than two data series one also gets the entire
correlation matrix and that is desirable.

However, I would agree that the name "corrcoef" is unfortunate,
suggesting a scalar value (at least to me).
-sven

From Chris.Barker at noaa.gov Thu Sep 6 18:10:15 2007
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Thu, 06 Sep 2007 15:10:15 -0700
Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share?
In-Reply-To: <46DE9C30.8020300@obs.univ-lyon1.fr>
References: <46DDF734.7070708@noaa.gov> <32632B72-2B1A-4AC2-AA04-288696E37A46@ster.kuleuven.be> <46DE9C30.8020300@obs.univ-lyon1.fr>
Message-ID: <46E07AC7.2010001@noaa.gov>

Xavier Gnata wrote:
> I'm using the numpy C API (PyArray_SimpleNewFromData) to perform the
> conversion but my code is written by hands.

I'd like to see that.
How are you getting the pointer to pass in to PyArray_SimpleNewFromData? It looks like you can do something like: (VA is a valarray) npy_intp *dims dims[0] = VA.size() NPA = PyArray_SimpleNewFromData(1, dims, typenum, &VA[0]); Is that what you're doing? Is there any guarantee that &VA[0] won't change? In any case, I assume that you have to make sure that VA doesn't get deleted while the array is still around. > I would like to simplify it using SWIG but I also would like to see a good typemap valarray <=> > numpy.array :) In principle, if you know how to write the code by hand, you know how to write the typemap. -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From david at ar.media.kyoto-u.ac.jp Fri Sep 7 05:36:50 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Fri, 07 Sep 2007 18:36:50 +0900 Subject: [Numpy-discussion] why system_info.check_libs does not look for dll on windows ? Message-ID: <46E11BB2.4060203@ar.media.kyoto-u.ac.jp> Hi, I would like to know if there is a reason why system_info does not look for dll on windows ? I think it would make sense to look for dll when you want to use an external lib through ctypes, for example. cheers, David From nardei at infinito.it Fri Sep 7 06:11:10 2007 From: nardei at infinito.it (ale) Date: Fri, 07 Sep 2007 12:11:10 +0200 Subject: [Numpy-discussion] Importing data from html tables Message-ID: Hi, I'm trying to import into array the data contained in a html table. I use BeautifulSoup as html parser html = open('T0015.html','r') bs = BeautifulSoup(html) for tr in bs.findAll('tr')[1:]: table.append([td.p.string for td in tr.findAll('td')]) and I get this: print table [[u'1925', u'--', u'--', u'--', u'--', u'--', u'--', u'--', u'--', u'105.0'] [u'1926', u'--', u'--', u'--', u'--', u'--', u'--', u'--', u'--', u'136.0'] [u'1927', u'--', u'--', u'--', u'--', u'--', u'--', u'--', u'--', u'51.0'] [u'1928', u'--', u'--', u'--', u'nn', u'--', u'--', u'--', u'--', u'104.0'] ,.......and so on] How to put this list of list of strings in a numpy array, and set '--' and 'nn' as NaN? Thank you Alessio From cookedm at physics.mcmaster.ca Fri Sep 7 09:59:50 2007 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Fri, 07 Sep 2007 09:59:50 -0400 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? In-Reply-To: (Bill Spotz's message of "Wed\, 5 Sep 2007 12\:07\:31 -0600") References: <46DDF734.7070708@noaa.gov> <32632B72-2B1A-4AC2-AA04-288696E37A46@ster.kuleuven.be> <46DEE97C.5070008@noaa.gov> Message-ID: "Bill Spotz" writes: > On Sep 5, 2007, at 11:38 AM, Christopher Barker wrote: > >> Of course, it should be possible to write C++ wrappers around the core >> ND-array object, if anyone wants to take that on! > > boost::python has done this for Numeric, but last I checked, they > have not upgraded to numpy. Even then, their wrappers went through the Python interface, not the C API. So, it's no faster than using Python straight. -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From cookedm at physics.mcmaster.ca Fri Sep 7 10:03:47 2007 From: cookedm at physics.mcmaster.ca (David M. 
Cooke) Date: Fri, 07 Sep 2007 10:03:47 -0400 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? In-Reply-To: <46DEE97C.5070008@noaa.gov> (Christopher Barker's message of "Wed\, 05 Sep 2007 10\:38\:04 -0700") References: <46DDF734.7070708@noaa.gov> <32632B72-2B1A-4AC2-AA04-288696E37A46@ster.kuleuven.be> <46DEE97C.5070008@noaa.gov> Message-ID: Christopher Barker writes: > Joris De Ridder wrote: >> A related question, just out of curiosity: is there a technical >> reason why Numpy has been coded in C rather than C++? > > There was a fair bit of discussion about this back when the numarray > project started, which was a re-implementation of the original Numeric. > > IIRC, one of the drivers was that C++ support was still pretty > inconsistent across compilers and OSs, particularly if you wanted to > really get the advantages of C++, by using templates and the like. > > It was considered very important that the numpy code base be very portable. One of the big problems has always been that the C++ application binary interface (ABI) has historically not been all that stable: all the C++ libraries your program used would have to be compiled by the same version of the compiler. That includes Python. You couldn't import an extension module written in C++ compiled with g++ 3.3, say, at the same time as one compiled with g++ 4.0, and your Python would have to been linked with the same version. While the ABI issues (at least on Linux with GCC) are better now, it's still something of a quagmire. -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From robert.kern at gmail.com Fri Sep 7 14:16:05 2007 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 07 Sep 2007 13:16:05 -0500 Subject: [Numpy-discussion] why system_info.check_libs does not look for dll on windows ? In-Reply-To: <46E11BB2.4060203@ar.media.kyoto-u.ac.jp> References: <46E11BB2.4060203@ar.media.kyoto-u.ac.jp> Message-ID: <46E19565.1060407@gmail.com> David Cournapeau wrote: > Hi, > > I would like to know if there is a reason why system_info does not > look for dll on windows ? I think it would make sense to look for dll > when you want to use an external lib through ctypes, for example. Because it was designed to find libraries that the compiler can link against. Most Windows compilers require a .lib or a .a "import library" in order to link with the DLL. Making system_info find .dlls would give false positives for its intended use. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Fri Sep 7 14:17:44 2007 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 07 Sep 2007 13:17:44 -0500 Subject: [Numpy-discussion] Importing data from html tables In-Reply-To: References: Message-ID: <46E195C8.3030507@gmail.com> ale wrote: > Hi, > I'm trying to import into array the data contained in a html table. 
> I use BeautifulSoup as html parser > > html = open('T0015.html','r') > bs = BeautifulSoup(html) > for tr in bs.findAll('tr')[1:]: > table.append([td.p.string for td in tr.findAll('td')]) > > and I get this: > > print table > > [[u'1925', u'--', u'--', u'--', u'--', u'--', u'--', u'--', u'--', u'105.0'] > [u'1926', u'--', u'--', u'--', u'--', u'--', u'--', u'--', u'--', u'136.0'] > [u'1927', u'--', u'--', u'--', u'--', u'--', u'--', u'--', u'--', u'51.0'] > [u'1928', u'--', u'--', u'--', u'nn', u'--', u'--', u'--', u'--', u'104.0'] > ,.......and so on] > > How to put this list of list of strings in a numpy array, and set '--' > and 'nn' as NaN? from numpy import array, nan def myfloat(x): if x == '--': return nan else: return float(x) arr = array([map(myfloat, row) for row in table]) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From dalcinl at gmail.com Fri Sep 7 20:39:16 2007 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Fri, 7 Sep 2007 21:39:16 -0300 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? In-Reply-To: <46DDF734.7070708@noaa.gov> References: <46DDF734.7070708@noaa.gov> Message-ID: David, I'll try to show you what I do for a custom C++ class, of course this does not solve the issue resizing (my class does not actually support resizing, so this is fine for me): My custom class is a templatized one called DTable (is like a 2d contiguous array), but currently I only instantiate it for 'int' and 'double'. This class have two relevant methods: getShape(), returning a std::pair, and getArray(), returning a reference to an underliing std::vector. So I am able to automatically support array interface by using this: First, I define a templatized utility function 'array_interface' %header %{ namespace numpy { template static char typechar() { return '\0'; } template<> static char typechar() { return 'i'; } template<> static char typechar() { return 'f'; } template static PyObject* array_interface(const DTable* self) { const std::pair& shape = self->getShape(); const std::vector& data = self->getArray(); void* array = const_cast(&data[0]); char endian = PyArray_NATIVE; char kind = typechar(); int elsize = sizeof(T); return Py_BuildValue("{sNsNsNsN}", "shape", Py_BuildValue("ii", shape.first, shape.second), "typestr", PyString_FromFormat("%c%c%d", endian, kind, elsize), "data", Py_BuildValue("NO", PyLong_FromVoidPtr(array), Py_False), "version", PyInt_FromLong(3)); } } %} Now define a SWIG macro to apply it to instantiations of my class %define %array_interface(Class) %extend Class { PyObject* __array_interface__; } %{ #define %mangle(Class) ##_## __array_interface__ ## _get(_t) \ numpy::array_interface(_t) #define %mangle(Class) ##_## __array_interface__ ## _set(_t, _val) \ SWIG_exception_fail(SWIG_AttributeError, "read-only attribute") %} %enddef and finally instantiate my class with different names for 'int' and 'double' in SWIG and finally apply previous macro %template(DTableI) DTable; %template(DTableS) DTable; %array_interface( DTable ); %array_interface( DTable ); I think you can implement someting similar for std::vector, std::valarray, or whatever... For use other data types, the only you need is to specialize the 'typecode' function. Hope this help you. 
On 9/4/07, David Goldsmith wrote: > Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array > typemap to share? Thanks! > > DG > -- > ERD/ORR/NOS/NOAA > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From David.L.Goldsmith at noaa.gov Fri Sep 7 21:23:26 2007 From: David.L.Goldsmith at noaa.gov (David Goldsmith) Date: Fri, 07 Sep 2007 18:23:26 -0700 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? In-Reply-To: References: <46DDF734.7070708@noaa.gov> Message-ID: <46E1F98E.6070408@noaa.gov> Thanks! DG Lisandro Dalcin wrote: > David, I'll try to show you what I do for a custom C++ class, of > course this does not solve the issue resizing (my class does not > actually support resizing, so this is fine for me): > > My custom class is a templatized one called DTable (is like a 2d > contiguous array), but currently I only instantiate it for 'int' and > 'double'. This class have two relevant methods: getShape(), returning > a std::pair, and getArray(), returning a reference to an underliing > std::vector. So I am able to automatically support array interface by > using this: > > First, I define a templatized utility function 'array_interface' > > %header %{ > namespace numpy { > > template static char typechar() { return '\0'; } > template<> static char typechar() { return 'i'; } > template<> static char typechar() { return 'f'; } > > template > static PyObject* > array_interface(const DTable* self) > { > const std::pair& shape = self->getShape(); > const std::vector& data = self->getArray(); > void* array = const_cast(&data[0]); > char endian = PyArray_NATIVE; > char kind = typechar(); > int elsize = sizeof(T); > return Py_BuildValue("{sNsNsNsN}", > "shape", Py_BuildValue("ii", shape.first, shape.second), > "typestr", PyString_FromFormat("%c%c%d", endian, kind, elsize), > "data", Py_BuildValue("NO", PyLong_FromVoidPtr(array), Py_False), > "version", PyInt_FromLong(3)); > } > } > %} > > Now define a SWIG macro to apply it to instantiations of my class > > %define %array_interface(Class) > %extend Class { PyObject* __array_interface__; } > %{ > #define %mangle(Class) ##_## __array_interface__ ## _get(_t) \ > numpy::array_interface(_t) > #define %mangle(Class) ##_## __array_interface__ ## _set(_t, _val) \ > SWIG_exception_fail(SWIG_AttributeError, "read-only attribute") > %} > %enddef > > and finally instantiate my class with different names for 'int' and > 'double' in SWIG and finally apply previous macro > > > %template(DTableI) DTable; > %template(DTableS) DTable; > > %array_interface( DTable ); > %array_interface( DTable ); > > I think you can implement someting similar for std::vector, > std::valarray, or whatever... For use other data types, the only you > need is to specialize the 'typecode' function. > > Hope this help you. > > > On 9/4/07, David Goldsmith wrote: > >> Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array >> typemap to share? Thanks! 
>> >> DG >> -- >> ERD/ORR/NOS/NOAA >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://projects.scipy.org/mailman/listinfo/numpy-discussion >> >> > > > From david at ar.media.kyoto-u.ac.jp Sat Sep 8 07:59:02 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sat, 08 Sep 2007 20:59:02 +0900 Subject: [Numpy-discussion] why system_info.check_libs does not look for dll on windows ? In-Reply-To: <46E19565.1060407@gmail.com> References: <46E11BB2.4060203@ar.media.kyoto-u.ac.jp> <46E19565.1060407@gmail.com> Message-ID: <46E28E86.1010904@ar.media.kyoto-u.ac.jp> Robert Kern wrote: > David Cournapeau wrote: > >> Hi, >> >> I would like to know if there is a reason why system_info does not >> look for dll on windows ? I think it would make sense to look for dll >> when you want to use an external lib through ctypes, for example. >> > > Because it was designed to find libraries that the compiler can link against. > Most Windows compilers require a .lib or a .a "import library" in order to link > with the DLL. Making system_info find .dlls would give false positives for its > intended use. > I see, thanks for the explanation. cheers, David From gnata at obs.univ-lyon1.fr Sat Sep 8 15:20:37 2007 From: gnata at obs.univ-lyon1.fr (Xavier Gnata) Date: Sat, 08 Sep 2007 21:20:37 +0200 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? In-Reply-To: <46E07AC7.2010001@noaa.gov> References: <46DDF734.7070708@noaa.gov> <32632B72-2B1A-4AC2-AA04-288696E37A46@ster.kuleuven.be> <46DE9C30.8020300@obs.univ-lyon1.fr> <46E07AC7.2010001@noaa.gov> Message-ID: <46E2F605.1020103@obs.univ-lyon1.fr> Christopher Barker wrote: > Xavier Gnata wrote: > >> I'm using the numpy C API (PyArray_SimpleNewFromData) to perform the >> conversion but my code is written by hands. >> > > I'd like to see that. How are you getting the pointer to pass in to > PyArray_SimpleNewFromData? It looks like you can do something like: > > (VA is a valarray) > npy_intp *dims > dims[0] = VA.size() > > NPA = PyArray_SimpleNewFromData(1, dims, typenum, &VA[0]); > > Is that what you're doing? Is there any guarantee that &VA[0] won't > change? In any case, I assume that you have to make sure that VA doesn't > get deleted while the array is still around. > > >> I would like to simplify it using SWIG but I also would like to see a good typemap valarray <=> >> numpy.array :) >> > > In principle, if you know how to write the code by hand, you know how to > write the typemap. > > > Here it is :) OK it is only a smal code but I often use something like that to debug my C++ code (the goal is just to have a quick look to larrge array to unnderstand what is going wrong). #include #include #include using namespace std; int main (int argc, char *argv[]) { PyObject *pylab_module, *pylab_module_dict, *func_imshow, *func_show, *args, *result_imshow, *result_show, *array; int NbDims = 2; int *Dims; long NbData; Dims = new int[NbDims]; Dims[0] = 10; Dims[1] = 10; NbData = Dims[0] * Dims[1]; valarray < double >Data (NbData); for (long i = 0; i < NbData; i++) { Data[i] = (double) i / (NbData - 1); } Py_Initialize (); // Needed before any call to PyArray_foo function. 
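// import_array1() loads numpy's C-API function table; it must run after Py_Initialize() and before any PyArray_* call.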
import_array1 (-1); // New reference to a numpy array array = PyArray_SimpleNewFromData (NbDims, Dims, PyArray_DOUBLE, &Data[0]); pylab_module = PyImport_Import (PyString_FromString ("pylab")); if (pylab_module) { pylab_module_dict = PyModule_GetDict (pylab_module); if (pylab_module_dict) { func_imshow = PyDict_GetItemString (pylab_module_dict, "imshow"); if (func_imshow) { func_show = PyDict_GetItemString (pylab_module_dict, "show"); if (func_show) { args = PyTuple_New (1); PyTuple_SetItem (args, 0, array); result_imshow = PyObject_CallObject (func_imshow, args); Py_XDECREF (result_imshow); // We dont use the result... Py_XDECREF (args); result_show = PyObject_CallObject (func_show, NULL); Py_XDECREF (result_show); // We dont use the result... Py_XDECREF (array); } } } Py_XDECREF (pylab_module); } Py_Finalize (); return 0; } "In principle, if you know how to write the code by hand, you know how to write the typemap." Yes but I had to spend a bit more time on that. Hand written code I have posted fit my (debug) needs so I decided not to use SWIG. "Is that what you're doing? Is there any guarantee that &VA[0] won't change? In any case, I assume that you have to make sure that VA doesn't get deleted while the array is still around. " Yes I have but it is not a problem in my simple use cases. Xavier -- ############################################ Xavier Gnata CRAL - Observatoire de Lyon 9, avenue Charles Andr? 69561 Saint Genis Laval cedex Phone: +33 4 78 86 85 28 Fax: +33 4 78 86 83 86 E-mail: gnata at obs.univ-lyon1.fr ############################################ From gnata at obs.univ-lyon1.fr Sat Sep 8 15:33:58 2007 From: gnata at obs.univ-lyon1.fr (Xavier Gnata) Date: Sat, 08 Sep 2007 21:33:58 +0200 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? In-Reply-To: <46DF04E9.2020501@enthought.com> References: <46DDF734.7070708@noaa.gov> <4CED078D-B8C7-4917-B0DE-4ED60BDFE015@sandia.gov> <46DEE52F.8050000@noaa.gov> <609FEB05-86EB-4FE2-867E-4A1C84105951@sandia.gov> <46DEF965.1000308@noaa.gov> <46DF04E9.2020501@enthought.com> Message-ID: <46E2F926.1000400@obs.univ-lyon1.fr> Bryan Van de Ven wrote: > Christopher Barker wrote: > > >> Does anyone know the status of support for valarrays now? >> > > I used std::valarray to implement a variant of the example Matrix class in > Stroustrup's book (2D only) about two years ago. I was aware that is in disuse, > by and large, but it worked well enough for my purposes and I was happy with it. > I'm sure it could have been done differently/better. > > Bryan > Hi, std:valarray are quite strange containers because they are not well integrated in the STL. For instance, you cannot play with maps of valarrays : http://gcc.gnu.org/ml/gcc-bugs/2006-05/msg02397.html Note that it is *not* a gcc bug but a design choice. As a result and beacuse I'm an STL happy user, I always use vector when I have to deal with arrays (of course matrix lib are great as log as you have to do some more complicated maths or as long as you have matrix and not only arrays). Hum...we are on a python list ;) so I would be very happy to see some support of std:vector <--> numpy. Xavier ps : There are no real performance differences between vector and valarray (in my use cases...) -- ############################################ Xavier Gnata CRAL - Observatoire de Lyon 9, avenue Charles Andr? 
69561 Saint Genis Laval cedex Phone: +33 4 78 86 85 28 Fax: +33 4 78 86 83 86 E-mail: gnata at obs.univ-lyon1.fr ############################################ From koepsell at gmail.com Sat Sep 8 18:39:23 2007 From: koepsell at gmail.com (killian koepsell) Date: Sat, 8 Sep 2007 15:39:23 -0700 Subject: [Numpy-discussion] von mises distribution in numpy.random biased Message-ID: hi, the von mises distribution in numpy.random seems to be biased towards a higher concentration (kappa). given a concentration of 2, it produces data that has a concentration of 2.36. i compared the distribution to the one produced by the CircStats[1] package of R[2] using RPy [3] and created a figure here: http://redwood.berkeley.edu/kilian/vonmises.png the script i used is attached to this email. i don't know what algorithm NumPy uses, so i can't tell if it is a real bug or some sort of rounding error. the CircStats package uses the algorithm by Best and Fisher [4]. kilian [1] http://cran.r-project.org/src/contrib/Descriptions/CircStats.html [2] http://www.r-project.org/ [3] http://rpy.sf.net/ [4] Best, D. and Fisher, N. (1979). Efficient simulation of the von Mises distribution. Applied Statistics, 24, 152-157. ---------------------- vonmises.py ---------------------- from numpy import array,pi,exp,mod,random from pylab import hist,linspace,figure,clf,plot,subplot,xlim,title,legend from rpy import r as R R.library("CircStats") # compute von Mises distribuion fo two different values of kappa size = 1e5 kappa1 = 1 x1 = random.vonmises(0,kappa1,size=size) y1 = mod(R.rvm(size,0,kappa1),2*pi) kappa2 = 2 x2 = random.vonmises(0,kappa2,size=size) y2 = mod(R.rvm(size,0,kappa2),2*pi) def phasedist(x,kappa): ph = linspace(-pi,pi,50) vM = R.dvm(ph,0,kappa) kest = R.A1inv(abs(exp(1j*array(x)).mean())) vMest = R.dvm(ph,0,kest) plot(ph,vM,'r',linewidth=2) plot(ph,vMest,'k:',linewidth=2) legend(['$\\kappa=%d$'%kappa,'$\\hat{\\kappa}=%1.2f$'%kest]) hist(x,ph,normed=1) xlim((-pi,pi)) # plot figure fig=figure(1) clf() ax1=subplot(2,2,1) title('NumPy.random') phasedist(x1,kappa1) ax2=subplot(2,2,3) phasedist(x2,kappa2) subplot(2,2,2,sharey=ax1) title('RPy.CircStats') phasedist(y1,kappa1) subplot(2,2,4,sharey=ax2) phasedist(y2,kappa2) From oliphant at enthought.com Sun Sep 9 00:44:36 2007 From: oliphant at enthought.com (Travis E. Oliphant) Date: Sat, 08 Sep 2007 23:44:36 -0500 Subject: [Numpy-discussion] von mises distribution in numpy.random biased In-Reply-To: References: Message-ID: <46E37A34.9080103@enthought.com> killian koepsell wrote: > hi, > > the von mises distribution in numpy.random seems to be biased towards > a higher concentration (kappa). given a concentration of 2, it > produces data that has a concentration of 2.36. i compared the > distribution to the one produced by the CircStats[1] package of R[2] > using RPy [3] and created a figure here: > > http://redwood.berkeley.edu/kilian/vonmises.png > > the script i used is attached to this email. i don't know what > algorithm NumPy uses, so i can't tell if it is a real bug or some sort > of rounding error. the CircStats package uses the algorithm by Best > and Fisher [4]. > Interesting. The algorithm in NumPy is basically the same algorithm. However, two things are different about it and the one used in R (and in Python itself by the way). 1) The two random variates used in the rejection algorithm are drawn from a uniform on [-1,1] 2) The testing for the sign is done with the same random variable instead of a new one. 
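(Editor's sketch -- not the thread's script -- of a quick way to re-check the concentration without RPy, estimating kappa from the mean resultant length via Fisher's standard approximation:)

import numpy as np

def est_kappa(theta):
    # concentration estimate from the mean resultant length (Fisher, 1993)
    R = np.abs(np.exp(1j * theta).mean())
    if R < 0.53:
        return 2*R + R**3 + 5*R**5/6.0
    elif R < 0.85:
        return -0.4 + 1.39*R + 0.43/(1.0 - R)
    else:
        return 1.0/(R**3 - 4*R**2 + 3*R)

samples = np.random.vonmises(0.0, 2.0, size=100000)
print(est_kappa(samples))   # should come out near 2.0 once the bias is fixed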
I've updated the source to change both of these behaviors. I suspect the bias is coming from 1) Perhaps you could re-run your tests. -Travis From orest.kozyar at gmail.com Sun Sep 9 22:36:11 2007 From: orest.kozyar at gmail.com (Orest Kozyar) Date: Sun, 9 Sep 2007 22:36:11 -0400 Subject: [Numpy-discussion] odd behavior for numpy.ndarray index? Message-ID: <000401c7f353$5a2a3860$4e07fa12@issphoenix> In the following output (see below), why would x[1,None] work, but x[1,None,2] or even x[1,2,None] not work? Incidentally, I would be very interested in a solution that allows me to index numpy arrays using a list/iterable that might contain None values. Is there a straightforward way to do so, or should I just use list comprehensions (i.e. [x[a] for a in indices if a])? Thanks! Orest In [122]: x = arange(5) In [123]: x[1] Out[123]: 1 In [124]: x[None] Out[124]: array([[0, 1, 2, 3, 4]]) In [125]: x[1,None] Out[125]: array([1]) In [126]: x[1,None,2] --------------------------------------------------------------------------- Traceback (most recent call last) c:\documents\research\programs\python\epl\src\ in () : invalid inde From koepsell at gmail.com Sun Sep 9 22:47:03 2007 From: koepsell at gmail.com (killian koepsell) Date: Sun, 9 Sep 2007 19:47:03 -0700 Subject: [Numpy-discussion] von mises distribution in numpy.random biased In-Reply-To: <46E37A34.9080103@enthought.com> References: <46E37A34.9080103@enthought.com> Message-ID: On 9/8/07, Travis E. Oliphant wrote: > killian koepsell wrote: > > hi, > > > > the von mises distribution in numpy.random seems to be biased towards > > a higher concentration (kappa). given a concentration of 2, it > > produces data that has a concentration of 2.36. i compared the > > distribution to the one produced by the CircStats[1] package of R[2] > > using RPy [3] and created a figure here: > > > > http://redwood.berkeley.edu/kilian/vonmises.png > > > > Interesting. > > The algorithm in NumPy is basically the same algorithm. However, two > things are different about it and the one used in R (and in Python > itself by the way). > > 1) The two random variates used in the rejection algorithm are drawn > from a uniform on [-1,1] > 2) The testing for the sign is done with the same random variable > instead of a new one. > > I've updated the source to change both of these behaviors. I suspect > the bias is coming from 1) > > Perhaps you could re-run your tests. > > -Travis hi travis, thanks for fixing this so fast. it looks very good now: http://redwood.berkeley.edu/kilian/vonmises_new.png kilian From tim.hochberg at ieee.org Sun Sep 9 22:47:49 2007 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Sun, 9 Sep 2007 19:47:49 -0700 Subject: [Numpy-discussion] odd behavior for numpy.ndarray index? In-Reply-To: <000401c7f353$5a2a3860$4e07fa12@issphoenix> References: <000401c7f353$5a2a3860$4e07fa12@issphoenix> Message-ID: On 9/9/07, Orest Kozyar wrote: > > In the following output (see below), why would x[1,None] work, but > x[1,None,2] or even x[1,2,None] not work? None is the same thing as newaxis (newaxis is just an alias for None). Armed with that tidbit, a little perusing of the docs should quickly explain the behaviour you are seeing. Incidentally, I would be very interested in a solution that allows me to > index numpy arrays using a list/iterable that might contain None > values. Is > there a straightforward way to do so, or should I just use list > comprehensions (i.e. [x[a] for a in indices if a])? 
You could try something using compress x[compress(indices, indices)] for example. Whether that's superior to a iterator solution would probably depend on the problem. I'd probably use fromiter instead of a list comprehension if I went that route though. There are probably other ways too. It very much depends on the details and performance requirements of what you are trying to do. -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chris.Barker at noaa.gov Mon Sep 10 12:30:05 2007 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 10 Sep 2007 09:30:05 -0700 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? In-Reply-To: <46E2F926.1000400@obs.univ-lyon1.fr> References: <46DDF734.7070708@noaa.gov> <4CED078D-B8C7-4917-B0DE-4ED60BDFE015@sandia.gov> <46DEE52F.8050000@noaa.gov> <609FEB05-86EB-4FE2-867E-4A1C84105951@sandia.gov> <46DEF965.1000308@noaa.gov> <46DF04E9.2020501@enthought.com> <46E2F926.1000400@obs.univ-lyon1.fr> Message-ID: <46E5710D.4010208@noaa.gov> Thanks for you input Xavier. Xavier Gnata wrote: > std:valarray are quite strange containers because they are not well > integrated in the STL. > I always use vector when I have to deal with arrays. > ps : There are no real performance differences between vector and > valarray (in my use cases...) Probably not for our use cases either. However, we're not doing a lot of STL either, so I'm not sure there is any downside to valarray. It looks like neither one supports any kind of "view" semantics, so for the purposes of numpy array wrapping, they really aren't any different. On the other hand, I don't know if any of the other common array implementations do either -- boost::multiarray, blitz++, etc. So I guess we just need to copy data s we move back and forth (which may not even be a problem -- we haven't gotten it working yet, so have no idea if there are any performance issues for us) A nice numpy-array compatible array for C++ would be nice -- but I know I'm not going to write it! -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Chris.Barker at noaa.gov Mon Sep 10 12:33:44 2007 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 10 Sep 2007 09:33:44 -0700 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? In-Reply-To: <46E2F605.1020103@obs.univ-lyon1.fr> References: <46DDF734.7070708@noaa.gov> <32632B72-2B1A-4AC2-AA04-288696E37A46@ster.kuleuven.be> <46DE9C30.8020300@obs.univ-lyon1.fr> <46E07AC7.2010001@noaa.gov> <46E2F605.1020103@obs.univ-lyon1.fr> Message-ID: <46E571E8.1070908@noaa.gov> Xavier Gnata wrote: > Here it is :) Thanks, that's helpful. Am I reading it right? Are you running the python process embedded in your C++ app? (rather than extending?) > valarray < double >Data (NbData); > array = PyArray_SimpleNewFromData (NbDims, Dims, PyArray_DOUBLE, > &Data[0]); OK, so you've now got a view of the data from the valarray. Nice to know this works, but, of course, fragile if the valarray is re-sized or anything, so it probably won't work for us. -Chris -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From matthieu.brucher at gmail.com Mon Sep 10 14:59:30 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Mon, 10 Sep 2007 20:59:30 +0200 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? In-Reply-To: <46E571E8.1070908@noaa.gov> References: <46DDF734.7070708@noaa.gov> <32632B72-2B1A-4AC2-AA04-288696E37A46@ster.kuleuven.be> <46DE9C30.8020300@obs.univ-lyon1.fr> <46E07AC7.2010001@noaa.gov> <46E2F605.1020103@obs.univ-lyon1.fr> <46E571E8.1070908@noaa.gov> Message-ID: > > OK, so you've now got a view of the data from the valarray. Nice to know > this works, but, of course, fragile if the valarray is re-sized or > anything, so it probably won't work for us. > Unless you use a special allocator/desallocator (I don't know if the latter is possible), I don't know how you could first correctly share the pointer (as you said, if the valarray is resized or deleted, you lose your memory). If you want to have a correct implementation (robust to array deletion, resizing in Python and C++), you should use shared pointers, and I don't know if you can build a vector with a shared pointer (perhaps with the special allocator/desallocator, but then it does not work with std::set, std::list or std::map). Matthieu -------------- next part -------------- An HTML attachment was scrubbed... URL: From gnata at obs.univ-lyon1.fr Mon Sep 10 16:11:28 2007 From: gnata at obs.univ-lyon1.fr (Xavier Gnata) Date: Mon, 10 Sep 2007 22:11:28 +0200 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? In-Reply-To: <46E571E8.1070908@noaa.gov> References: <46DDF734.7070708@noaa.gov> <32632B72-2B1A-4AC2-AA04-288696E37A46@ster.kuleuven.be> <46DE9C30.8020300@obs.univ-lyon1.fr> <46E07AC7.2010001@noaa.gov> <46E2F605.1020103@obs.univ-lyon1.fr> <46E571E8.1070908@noaa.gov> Message-ID: <46E5A4F0.1040605@obs.univ-lyon1.fr> Christopher Barker wrote: > Xavier Gnata wrote: > >> Here it is :) >> > > Thanks, that's helpful. Am I reading it right? Are you running the > python process embedded in your C++ app? (rather than extending?) > > Yes! The point is this way I'm able to debug my C++ code plotting the array using matplotlib :) That is cool ;). >> valarray < double >Data (NbData); >> > > >> array = PyArray_SimpleNewFromData (NbDims, Dims, PyArray_DOUBLE, >> &Data[0]); >> > > OK, so you've now got a view of the data from the valarray. Nice to know > this works, but, of course, fragile if the valarray is re-sized or > anything, so it probably won't work for us. > > Yep it is not robust at all because the valarray can be modify. However, it is a a quite great way to plot an array in a C++ code. Nothing more. If you want to try with shared pointers, maybe you should have a look at the boost lib. Xavier -- ############################################ Xavier Gnata CRAL - Observatoire de Lyon 9, avenue Charles Andr? 
69561 Saint Genis Laval cedex Phone: +33 4 78 86 85 28 Fax: +33 4 78 86 83 86 E-mail: gnata at obs.univ-lyon1.fr ############################################ From gnata at obs.univ-lyon1.fr Mon Sep 10 16:20:06 2007 From: gnata at obs.univ-lyon1.fr (Xavier Gnata) Date: Mon, 10 Sep 2007 22:20:06 +0200 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? In-Reply-To: <46E1F98E.6070408@noaa.gov> References: <46DDF734.7070708@noaa.gov> <46E1F98E.6070408@noaa.gov> Message-ID: <46E5A6F6.9090706@obs.univ-lyon1.fr> yeah! Looks good! Thanks a lot. Xavier > Thanks! > > DG > > Lisandro Dalcin wrote: > >> David, I'll try to show you what I do for a custom C++ class, of >> course this does not solve the issue resizing (my class does not >> actually support resizing, so this is fine for me): >> >> My custom class is a templatized one called DTable (is like a 2d >> contiguous array), but currently I only instantiate it for 'int' and >> 'double'. This class have two relevant methods: getShape(), returning >> a std::pair, and getArray(), returning a reference to an underliing >> std::vector. So I am able to automatically support array interface by >> using this: >> >> First, I define a templatized utility function 'array_interface' >> >> %header %{ >> namespace numpy { >> >> template static char typechar() { return '\0'; } >> template<> static char typechar() { return 'i'; } >> template<> static char typechar() { return 'f'; } >> >> template >> static PyObject* >> array_interface(const DTable* self) >> { >> const std::pair& shape = self->getShape(); >> const std::vector& data = self->getArray(); >> void* array = const_cast(&data[0]); >> char endian = PyArray_NATIVE; >> char kind = typechar(); >> int elsize = sizeof(T); >> return Py_BuildValue("{sNsNsNsN}", >> "shape", Py_BuildValue("ii", shape.first, shape.second), >> "typestr", PyString_FromFormat("%c%c%d", endian, kind, elsize), >> "data", Py_BuildValue("NO", PyLong_FromVoidPtr(array), Py_False), >> "version", PyInt_FromLong(3)); >> } >> } >> %} >> >> Now define a SWIG macro to apply it to instantiations of my class >> >> %define %array_interface(Class) >> %extend Class { PyObject* __array_interface__; } >> %{ >> #define %mangle(Class) ##_## __array_interface__ ## _get(_t) \ >> numpy::array_interface(_t) >> #define %mangle(Class) ##_## __array_interface__ ## _set(_t, _val) \ >> SWIG_exception_fail(SWIG_AttributeError, "read-only attribute") >> %} >> %enddef >> >> and finally instantiate my class with different names for 'int' and >> 'double' in SWIG and finally apply previous macro >> >> >> %template(DTableI) DTable; >> %template(DTableS) DTable; >> >> %array_interface( DTable ); >> %array_interface( DTable ); >> >> I think you can implement someting similar for std::vector, >> std::valarray, or whatever... For use other data types, the only you >> need is to specialize the 'typecode' function. >> >> Hope this help you. >> >> >> On 9/4/07, David Goldsmith wrote: >> >> >>> Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array >>> typemap to share? Thanks! 
>>> >>> DG >>> -- >>> ERD/ORR/NOS/NOAA >>> _______________________________________________ >>> Numpy-discussion mailing list >>> Numpy-discussion at scipy.org >>> http://projects.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >>> >> >> > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > -- ############################################ Xavier Gnata CRAL - Observatoire de Lyon 9, avenue Charles Andr? 69561 Saint Genis Laval cedex Phone: +33 4 78 86 85 28 Fax: +33 4 78 86 83 86 E-mail: gnata at obs.univ-lyon1.fr ############################################ From Chris.Barker at noaa.gov Mon Sep 10 19:56:28 2007 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 10 Sep 2007 16:56:28 -0700 Subject: [Numpy-discussion] how to include numpy headers when building an extension? Message-ID: <46E5D9AC.8040206@noaa.gov> Hi all, I'm porting an extension from Numeric. At the top, I've changed: #include to #include But distutils can't find numpy/arrayobject.h -- how do I tell distutils where to look for it? I've gotten it to work by hard-coding the entire path, but that's not very portable. It looks like Numeric used to put its headers in a standard location, but that numpy buries them deeper. thanks, -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From robert.kern at gmail.com Mon Sep 10 20:03:26 2007 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 10 Sep 2007 19:03:26 -0500 Subject: [Numpy-discussion] how to include numpy headers when building an extension? In-Reply-To: <46E5D9AC.8040206@noaa.gov> References: <46E5D9AC.8040206@noaa.gov> Message-ID: <46E5DB4E.3090008@gmail.com> Christopher Barker wrote: > Hi all, > > I'm porting an extension from Numeric. At the top, I've changed: > > #include > > to > > #include > > But distutils can't find numpy/arrayobject.h -- how do I tell distutils > where to look for it? I've gotten it to work by hard-coding the entire > path, but that's not very portable. numpy.get_include() -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From Chris.Barker at noaa.gov Tue Sep 11 01:57:22 2007 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 10 Sep 2007 22:57:22 -0700 Subject: [Numpy-discussion] how to include numpy headers when building an extension? In-Reply-To: <46E5DB4E.3090008@gmail.com> References: <46E5D9AC.8040206@noaa.gov> <46E5DB4E.3090008@gmail.com> Message-ID: <46E62E42.7030308@noaa.gov> Robert Kern wrote: > Christopher Barker wrote: >> how do I tell distutils >> where to look for it? > numpy.get_include() Ah, got it. thanks. I know this has been discussed, but why doesn't numpy put its includes somewhere that distutils would know where to find it? -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception From oliphant at enthought.com Tue Sep 11 02:35:49 2007 From: oliphant at enthought.com (Travis E. Oliphant) Date: Tue, 11 Sep 2007 01:35:49 -0500 Subject: [Numpy-discussion] how to include numpy headers when building an extension? 
In-Reply-To: <46E62E42.7030308@noaa.gov> References: <46E5D9AC.8040206@noaa.gov> <46E5DB4E.3090008@gmail.com> <46E62E42.7030308@noaa.gov> Message-ID: <46E63745.1060306@enthought.com> Christopher Barker wrote: > > I know this has been discussed, but why doesn't numpy put its includes > somewhere that distutils would know where to find it? > I think one answer is because distutils doesn't have defaults that play well with eggs. NumPy provides very nice extensions to distutils which will correctly add the include directories you need. Look at any of the setup.py files for scipy for examples of how to use numpy.distutils. Once you convert your setup.py files to use numpy.distutils there really isn't any problem anymore about where numpy puts its include files and the benefit gained by being able to use versioned eggs is worth it. Yes, the transition is a little messy (especially if the build is not using distutils at all). But, numpy.get_include() really helps mitigate that pain. Thanks for the comments, -Travis > > From david at ar.media.kyoto-u.ac.jp Tue Sep 11 03:13:00 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 11 Sep 2007 16:13:00 +0900 Subject: [Numpy-discussion] how to include numpy headers when building an extension? In-Reply-To: <46E63745.1060306@enthought.com> References: <46E5D9AC.8040206@noaa.gov> <46E5DB4E.3090008@gmail.com> <46E62E42.7030308@noaa.gov> <46E63745.1060306@enthought.com> Message-ID: <46E63FFC.2020900@ar.media.kyoto-u.ac.jp> Travis E. Oliphant wrote: > Christopher Barker wrote: > >> I know this has been discussed, but why doesn't numpy put its includes >> somewhere that distutils would know where to find it? >> >> > I think one answer is because distutils doesn't have defaults that play > well with eggs. NumPy provides very nice extensions to distutils which > will correctly add the include directories you need. > Concerning numpy.distutils, is anyone working on improving it ? There was some work started, but I have not seen any news on this front: am I missing something ? I would really like to have the possibility to compile custom extensions to be used through ctypes, and didn't go very far just by myself, unfortunately (understanding distutils is not trivial, to say the least). David From Chris.Barker at noaa.gov Tue Sep 11 11:53:36 2007 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 11 Sep 2007 08:53:36 -0700 Subject: [Numpy-discussion] how to include numpy headers when building an extension? In-Reply-To: <46E63745.1060306@enthought.com> References: <46E5D9AC.8040206@noaa.gov> <46E5DB4E.3090008@gmail.com> <46E62E42.7030308@noaa.gov> <46E63745.1060306@enthought.com> Message-ID: <46E6BA00.4000603@noaa.gov> Travis E. Oliphant wrote: > I think one answer is because distutils doesn't have defaults that play > well with eggs. Fair enough -- anyone know about the progress of integrating setuptool into the standard library? > Look at any of the setup.py files for scipy for examples of how to use > numpy.distutils. I'll do that. I guess I was under the impression that numpy.distutils was fro building numpy itself. I don't know why it didn't dawn on my to give it a try. > Yes, the transition is a little messy (especially if the build is not > using distutils at all). Well, yes, but I learned that lesson a long time ago. [OT: why the heck do none of the SWIG docs give examples (or even suggest) using distutils for building SWIG extensions???] -Chris -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From a.schmolck at gmx.net Tue Sep 11 12:41:17 2007 From: a.schmolck at gmx.net (Alexander Schmolck) Date: Tue, 11 Sep 2007 17:41:17 +0100 Subject: [Numpy-discussion] numpy.ndarrays as C++ arrays (wrapped with boost) Message-ID: Hi, I've sent pretty much the same email to c++sig, but I thought I'd also try my luck here, especially since I just saw a closely related question posted one week ago here (albeit mostly from a swig context). I'm working working on an existing scientific code base that's mostly C++ and I'm currently interfacing it to python with boost.python with a view to doing non-performance critical things in python. The code currently mostly just uses plain C double arrays passed around by pointers and I'd like to encapsulate this at least with something like stl::vector (or maybe valarray), but I've been wondering whether it might not make sense to use (slightly wrapped) numpy ndarrays -- since I eventually plan to make fairly heavy use of existing python infrastructure like matplotlib and scipy where possible. Also, ndarrays provide fairly rich functionality even at the C-API-level and I'm pretty familiar with numpy. Furthermore I can't find something equivalent to numpy for C++ -- there's ublas as well as several other matrix libs and a couple of array ones (like blitz++), but there doesn't seem to be one obvious choice, as there is for python. I think I will mostly use double arrays of fairly large size, so having really low overhead operations on small arrays with more or less exotic types is not important to me. Things that would eventually come in handy, although they're not needed yet, are basic linear algebra and maybe two or three LAPACK-level functions (I can think of cholesky decomposition and SVD right now) as well as possibly wavelets (DWT). I think I could get all these things (and more) from scipy (and kin) with too much fuzz (although I haven't tried wavelet support yet) and it seems like picking together the same functionality from different C++ libs would require considerably more work. So my question is: might it make sense to use (a slightly wrapped) numpy.ndarray, and if so is some code already floating around for that (on first glance it seems like there's a bit of support for the obsolete Numeric package in boost, but none for the newer numpy that supercedes it); if not is my impression correct that making the existing code numpy compatible shouldn't be too much work. Provided this route doesn't make much sense, I'd also be curious what people would recommend doing instead. In last week's thread mentioned above I found the following link which looks pretty relevant, albeit essentially undocumented and possibly pre-alpha -- has anyone here tried it out? 
many thanks, 'as From paustin at eos.ubc.ca Tue Sep 11 13:46:31 2007 From: paustin at eos.ubc.ca (Philip Austin) Date: Tue, 11 Sep 2007 10:46:31 -0700 Subject: [Numpy-discussion] numpy.ndarrays as C++ arrays (wrapped with boost) In-Reply-To: References: Message-ID: <18150.54391.274768.497392@owl.eos.ubc.ca> Alexander Schmolck writes: > So my question is: might it make sense to use (a slightly wrapped) > numpy.ndarray, and if so is some code already floating around for that (on > first glance it seems like there's a bit of support for the obsolete Numeric > package in boost, but none for the newer numpy that supercedes it); if not is > my impression correct that making the existing code numpy compatible shouldn't > be too much work. Right, it should work more or less as is if you just do: set_module_and_type("numpy", "ArrayType"); in the examples. Some tests will fail because of numpy changes to function signatures, etc. The current library doesn't wrap numpy.zeros, numpy.ones or numpy.empty constructors, so the only way to construct an empty is to pass the constructor a tuple and then resize. Because it (by design) doesn't include arrayobject.h, there's also no clean way to get at the underlying data pointer to share the memory. You can use helper functions like this, though: //Create a one-dimensional numpy array of length n and type t boost::python::numeric::array makeNum(intp n, PyArray_TYPES t=PyArray_DOUBLE){ object obj(handle<>(PyArray_FromDims(1, &n, t))); return extract(obj); } http://www.eos.ubc.ca/research/clouds/software/pythonlibs/num_util/num_util_release2/Readme.html has more examples -- Phil From oliphant at enthought.com Tue Sep 11 14:43:36 2007 From: oliphant at enthought.com (Travis E. Oliphant) Date: Tue, 11 Sep 2007 13:43:36 -0500 Subject: [Numpy-discussion] how to include numpy headers when building an extension? In-Reply-To: <46E6BA00.4000603@noaa.gov> References: <46E5D9AC.8040206@noaa.gov> <46E5DB4E.3090008@gmail.com> <46E62E42.7030308@noaa.gov> <46E63745.1060306@enthought.com> <46E6BA00.4000603@noaa.gov> Message-ID: <46E6E1D8.3040707@enthought.com> Christopher Barker wrote: > [OT: why the heck do none of the SWIG docs give examples (or even > suggest) using distutils for building SWIG extensions???] > I'm pretty sure this is because the people who contributed to the SWIG docs the most are not currently using numpy.distutils (but probably should be) --- I suspect they were in your same boat of not realizing that numpy.distutils makes it easier to build packages that depend on numpy in general. -Travis From Chris.Barker at noaa.gov Tue Sep 11 15:02:14 2007 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 11 Sep 2007 12:02:14 -0700 Subject: [Numpy-discussion] how to include numpy headers when building an extension? In-Reply-To: <46E6E1D8.3040707@enthought.com> References: <46E5D9AC.8040206@noaa.gov> <46E5DB4E.3090008@gmail.com> <46E62E42.7030308@noaa.gov> <46E63745.1060306@enthought.com> <46E6BA00.4000603@noaa.gov> <46E6E1D8.3040707@enthought.com> Message-ID: <46E6E636.8060300@noaa.gov> Travis E. Oliphant wrote: > Christopher Barker wrote: >> [OT: why the heck do none of the SWIG docs give examples (or even >> suggest) using distutils for building SWIG extensions???] > > I'm pretty sure this is because the people who contributed to the SWIG > docs the most are not currently using numpy.distutils Actually, I'm talking about the general SWIG docs -- that have nothing to do with numpy at all. 
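(Editor's sketch, not from the thread: the distutils route for a numpy-aware SWIG module can be this short; the module and file names are hypothetical.)

# setup.py
from distutils.core import setup, Extension
import numpy

setup(name='example',
      ext_modules=[Extension('_example',
                             sources=['example.i', 'example.c'],
                             include_dirs=[numpy.get_include()])],
      py_modules=['example'])

Running "python setup.py build_ext --inplace" then drives SWIG, the compiler and the linker with the numpy headers on the include path.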
They've got you calling gcc on the command line, which I haven't done for years! But while we're on the topic, I started this be looking at the swig examples that come with numpy. It would be really great if there was a complete, very simple, sample or two complete with c code, *.i files, and setup.py. There are the tests, which built and ran fine for me, but that's a bit of a complicated mess. Anyway, I'm working on some simple samples -- I'll contribute them when I get them working! OT: The struggle at the moment is building a C++ extension -- if I use straight C, a trivial sample works, if I call the file *.cpp, swig it with -c++, I get a *_wrap.cxx file that won't build. Do I have to do something different in the setup.py? -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From wfspotz at sandia.gov Tue Sep 11 15:53:15 2007 From: wfspotz at sandia.gov (Bill Spotz) Date: Tue, 11 Sep 2007 13:53:15 -0600 Subject: [Numpy-discussion] how to include numpy headers when building an extension? In-Reply-To: <46E6E636.8060300@noaa.gov> References: <46E5D9AC.8040206@noaa.gov> <46E5DB4E.3090008@gmail.com> <46E62E42.7030308@noaa.gov> <46E63745.1060306@enthought.com> <46E6BA00.4000603@noaa.gov> <46E6E1D8.3040707@enthought.com> <46E6E636.8060300@noaa.gov> Message-ID: On Sep 11, 2007, at 1:02 PM, Christopher Barker wrote: > There are the tests, which built and ran fine for me, but > that's a bit of a complicated mess. There is a combinatorics problem there that make the tests difficult to follow. Lots of nested macros and such. > Anyway, I'm working on some simple samples -- I'll contribute them > when > I get them working! Great. I have been considering changing the directory structure of numpy/doc/swig to include subdirectories: doc/, test/, and now example/, if there are some. Some people were confused by its current structure, which just lumps everything together. Let me know when you have a working example. > OT: > The struggle at the moment is building a C++ extension -- if I use > straight C, a trivial sample works, if I call the file *.cpp, swig it > with -c++, I get a *_wrap.cxx file that won't build. Do I have to do > something different in the setup.py? I use numpy.i with C++ all the time. Let me know the specific nature of your problem and I'll look into it. ** Bill Spotz ** ** Sandia National Laboratories Voice: (505)845-0170 ** ** P.O. Box 5800 Fax: (505)284-5451 ** ** Albuquerque, NM 87185-0370 Email: wfspotz at sandia.gov ** From cookedm at physics.mcmaster.ca Tue Sep 11 16:49:59 2007 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Tue, 11 Sep 2007 16:49:59 -0400 Subject: [Numpy-discussion] how to include numpy headers when building an extension? In-Reply-To: <46E63FFC.2020900@ar.media.kyoto-u.ac.jp> (David Cournapeau's message of "Tue\, 11 Sep 2007 16\:13\:00 +0900") References: <46E5D9AC.8040206@noaa.gov> <46E5DB4E.3090008@gmail.com> <46E62E42.7030308@noaa.gov> <46E63745.1060306@enthought.com> <46E63FFC.2020900@ar.media.kyoto-u.ac.jp> Message-ID: David Cournapeau writes: > Travis E. Oliphant wrote: >> Christopher Barker wrote: >> >>> I know this has been discussed, but why doesn't numpy put its includes >>> somewhere that distutils would know where to find it? >>> >>> >> I think one answer is because distutils doesn't have defaults that play >> well with eggs. 
NumPy provides very nice extensions to distutils which >> will correctly add the include directories you need. >> > Concerning numpy.distutils, is anyone working on improving it ? There > was some work started, but I have not seen any news on this front: am I > missing something ? I would really like to have the possibility to > compile custom extensions to be used through ctypes, and didn't go very > far just by myself, unfortunately (understanding distutils is not > trivial, to say the least). I work on it off and on. As you say, it's not trivial :-) It also has a tendency to be fragile, so large changes are harder. Something will work for me, then I merge it into the trunk, and it breaks on half-a-dozen platforms that I can't test on :-) So, it's slow going. I've got a list of my current goals at http://scipy.org/scipy/numpy/wiki/DistutilsRevamp. -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From mike.ressler at alum.mit.edu Tue Sep 11 18:11:53 2007 From: mike.ressler at alum.mit.edu (Mike Ressler) Date: Tue, 11 Sep 2007 15:11:53 -0700 Subject: [Numpy-discussion] Slicing/selection in multiple dimensions simultaneously Message-ID: <268febdf0709111511n3ca15d42o85d31831178d96a@mail.gmail.com> The following seems to be a wart: is it expected? Set up a 10x10 array and some indexing arrays: a=arange(100) a.shape=(10,10) q=array([0,2,4,6,8]) r=array([0,5]) Suppose I want to extract only the "even" numbered rows from a - then print a[q,:] Every fifth column: print a[:,r] Only the even rows of every fifth column: print a[q,r] --------------------------------------------------------------------------- Traceback (most recent call last) /.../.../.../ in () : shape mismatch: objects cannot be broadcast to a single shape But, this works: print a[q,:][:,r] [[ 0 5] [20 25] [40 45] [60 65] [80 85]] So why does the a[q,r] form have problems? Thanks for your insights. Mike -- mike.ressler at alum.mit.edu From robert.kern at gmail.com Tue Sep 11 18:24:17 2007 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 11 Sep 2007 17:24:17 -0500 Subject: [Numpy-discussion] Slicing/selection in multiple dimensions simultaneously In-Reply-To: <268febdf0709111511n3ca15d42o85d31831178d96a@mail.gmail.com> References: <268febdf0709111511n3ca15d42o85d31831178d96a@mail.gmail.com> Message-ID: <46E71591.20802@gmail.com> Mike Ressler wrote: > The following seems to be a wart: is it expected? > > Set up a 10x10 array and some indexing arrays: > > a=arange(100) > a.shape=(10,10) > q=array([0,2,4,6,8]) > r=array([0,5]) > > Suppose I want to extract only the "even" numbered rows from a - then > > print a[q,:] > > > > Every fifth column: > > print a[:,r] > > > > Only the even rows of every fifth column: > > print a[q,r] > > --------------------------------------------------------------------------- > Traceback (most recent call last) > > /.../.../.../ in () > > : shape mismatch: objects cannot be > broadcast to a single shape > > But, this works: > > print a[q,:][:,r] > > [[ 0 5] > [20 25] > [40 45] > [60 65] > [80 85]] > > So why does the a[q,r] form have problems? Thanks for your insights. It is intended that the form a[q,r] be the general case: q and r are broadcasted against each other to a single shape. The result of the indexing is an array of that broadcasted shape with elements found by using each pair of elements in the broadcasted q and r arrays as indices. 
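(To make the pairing concrete -- an editor's illustration reusing the same 10x10 a: when the index arrays already share a shape, element i of the result is a[q[i], r[i]].)

import numpy as np

a = np.arange(100).reshape(10, 10)
i = np.array([0, 2, 4])
j = np.array([0, 5, 0])
print(a[i, j])        # [ 0 25 40], i.e. a[0,0], a[2,5], a[4,0]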
There are operations you can express with this form that you couldn't if the behavior that you expected were the case whereas you can get the result you want relatively straightforwardly. In [6]: a[q[:,newaxis], r] Out[6]: array([[ 0, 5], [20, 25], [40, 45], [60, 65], [80, 85]]) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From a.schmolck at gmx.net Tue Sep 11 18:25:45 2007 From: a.schmolck at gmx.net (Alexander Schmolck) Date: Tue, 11 Sep 2007 23:25:45 +0100 Subject: [Numpy-discussion] numpy.ndarrays as C++ arrays (wrapped with boost) In-Reply-To: <18150.54391.274768.497392@owl.eos.ubc.ca> (Philip Austin's message of "Tue\, 11 Sep 2007 10\:46\:31 -0700") References: <18150.54391.274768.497392@owl.eos.ubc.ca> Message-ID: Philip Austin writes: > Alexander Schmolck writes: > > > So my question is: might it make sense to use (a slightly wrapped) > > numpy.ndarray, and if so is some code already floating around for that (on > > first glance it seems like there's a bit of support for the obsolete Numeric > > package in boost, but none for the newer numpy that supercedes it); if not is > > my impression correct that making the existing code numpy compatible shouldn't > > be too much work. > > Right, it should work more or less as is if you just do: > > set_module_and_type("numpy", "ArrayType"); Ah, I guess that's the advantage of going via python, rather than calling the C-api directly (although I assume it must be rather costly). > > in the examples. Some tests will fail because of numpy changes to > function signatures, etc. > > The current library doesn't wrap numpy.zeros, numpy.ones or > numpy.empty constructors, so the only way to construct an empty > is to pass the constructor a tuple and then resize. Because > it (by design) doesn't include arrayobject.h, there's also no clean > way to get at the underlying data pointer to share the memory. Not being able to get at the data-pointer sounds like a show-stopper for this purpose -- I will almost certainly need to interface to exisitng C and C++ code, and I do not intend to copy hundres of MB around unnecessarily. I think it is a real shame that boost currently doesn't properly support numpy out of the box, although numpy has long obsoleted both numarray and Numeric (which is both buggy and completely unsupported). All the more so since writing multimedial or scientific extensions (in which numpy's array interface is very natural to figure prominently) would seem such an ideal use for boost.python, as soon as complex classes or compound structures that need to efficiently support several (primitive) datatypes are involved, boost.python could really play its strenghts compared to Fortran/C based extensions. I've since stumbled (in the numpy-list) upon which seems like it could offer a great fit for my needs, but unfortunately it's not really documented and there's also no indication how ready for use it is -- I'd be interested to hear if anyone has an experience-report to offer; if not I guess I might just end up settling for std::vector for the time being, I need something workable soon, and it doesn't look like it'd be able to figure it out and verify that it works for me without a substantial time investment. 
> You can use helper functions like this, though: > > //Create a one-dimensional numpy array of length n and type t > boost::python::numeric::array makeNum(intp n, PyArray_TYPES t=PyArray_DOUBLE){ > object obj(handle<>(PyArray_FromDims(1, &n, t))); > return extract(obj); > } > > http://www.eos.ubc.ca/research/clouds/software/pythonlibs/num_util/num_util_release2/Readme.html Thanks! This looks rather useful, I will try it on some data I need to convert to pass to python tomorrow. alex From tim.hochberg at ieee.org Tue Sep 11 18:42:07 2007 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Tue, 11 Sep 2007 15:42:07 -0700 Subject: [Numpy-discussion] Slicing/selection in multiple dimensions simultaneously In-Reply-To: <46E71591.20802@gmail.com> References: <268febdf0709111511n3ca15d42o85d31831178d96a@mail.gmail.com> <46E71591.20802@gmail.com> Message-ID: On 9/11/07, Robert Kern wrote: > > Mike Ressler wrote: > > The following seems to be a wart: is it expected? > > > > Set up a 10x10 array and some indexing arrays: > > > > a=arange(100) > > a.shape=(10,10) > > q=array([0,2,4,6,8]) > > r=array([0,5]) > > > > Suppose I want to extract only the "even" numbered rows from a - then > > > > print a[q,:] > > > > > > > > Every fifth column: > > > > print a[:,r] > > > > > > > > Only the even rows of every fifth column: > > > > print a[q,r] > > > > > --------------------------------------------------------------------------- > > Traceback (most recent call > last) > > > > /.../.../.../ in () > > > > : shape mismatch: objects cannot be > > broadcast to a single shape > > > > But, this works: > > > > print a[q,:][:,r] > > > > [[ 0 5] > > [20 25] > > [40 45] > > [60 65] > > [80 85]] > > > > So why does the a[q,r] form have problems? Thanks for your insights. > > It is intended that the form a[q,r] be the general case: q and r are > broadcasted > against each other to a single shape. The result of the indexing is an > array of > that broadcasted shape with elements found by using each pair of elements > in the > broadcasted q and r arrays as indices. > > There are operations you can express with this form that you couldn't if > the > behavior that you expected were the case whereas you can get the result > you want > relatively straightforwardly. > > In [6]: a[q[:,newaxis], r] > Out[6]: > array([[ 0, 5], > [20, 25], > [40, 45], > [60, 65], > [80, 85]]) At the risk of making Robert grumpy: while it is true the form we ended up with is more general I've come to the conclusion that it was a bit of a mistake. In the spirit of making simple things simple and complex things possible, I suspect that having fancy-indexing do the obvious thing here[1] and delegating the more powerful but also more difficult to understand case to a function or method would have been overall more useful. Cases where the multidimensional features of fancy-indexing get used are messy enough that they don't benefit much from the conciseness of the indexing notation, at least in my experience. -- . __ . |-\ . . tim.hochberg at ieee.org [1] Just in case the 'obvious' thing isn't all that obvious: I mean restrict index-arrays to one dimension and have them simply select the given values along the axis. Hmmm. Without giving examples, which I have not time for right now, that's probably not any clearer than saying nothing. Ah well. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mike.ressler at alum.mit.edu Tue Sep 11 18:49:41 2007 From: mike.ressler at alum.mit.edu (Mike Ressler) Date: Tue, 11 Sep 2007 15:49:41 -0700 Subject: [Numpy-discussion] Slicing/selection in multiple dimensions simultaneously In-Reply-To: <46E71591.20802@gmail.com> References: <268febdf0709111511n3ca15d42o85d31831178d96a@mail.gmail.com> <46E71591.20802@gmail.com> Message-ID: <268febdf0709111549gf3996b4re10109c6d0a233d1@mail.gmail.com> Thanks, Robert, for the quick response. On 9/11/07, Robert Kern wrote: > There are operations you can express with this form that you couldn't if the > behavior that you expected were the case whereas you can get the result you want > relatively straightforwardly. > > In [6]: a[q[:,newaxis], r] Ah, yes, of course. I forgot my broadcasting rules. However, I also thank Timothy Hochberg for the moral support of agreeing that the obvious way (a[q,r]) should be reasonably expected to work. Thanks again for your help - the routine is now working nicely. Mike -- mike.ressler at alum.mit.edu From oliphant at enthought.com Tue Sep 11 19:10:27 2007 From: oliphant at enthought.com (Travis E. Oliphant) Date: Tue, 11 Sep 2007 18:10:27 -0500 Subject: [Numpy-discussion] numpy.ndarrays as C++ arrays (wrapped with boost) In-Reply-To: References: <18150.54391.274768.497392@owl.eos.ubc.ca> Message-ID: <46E72063.1060408@enthought.com> > nd to copy hundres of MB around unnecessarily. > > I think it is a real shame that boost currently doesn't properly support numpy > out of the box, although numpy has long obsoleted both numarray and Numeric > (which is both buggy and completely unsupported). All the more so since > writing multimedial or scientific extensions (in which numpy's array interface > is very natural to figure prominently) would seem such an ideal use for > boost.python, as soon as complex classes or compound structures that need to > efficiently support several (primitive) datatypes are involved, boost.python > could really play its strenghts compared to Fortran/C based extensions. > > I think it could be that boost.python is waiting for the extended buffer interface which is coming in Python 3.0 and Python 2.6. This would really be ideal for wrapping external code in a form that plays well with other libraries. -Travis O. From oliphant at enthought.com Tue Sep 11 19:13:26 2007 From: oliphant at enthought.com (Travis E. Oliphant) Date: Tue, 11 Sep 2007 18:13:26 -0500 Subject: [Numpy-discussion] Slicing/selection in multiple dimensions simultaneously In-Reply-To: References: <268febdf0709111511n3ca15d42o85d31831178d96a@mail.gmail.com> <46E71591.20802@gmail.com> Message-ID: <46E72116.8040408@enthought.com> Timothy Hochberg wrote: > > > On 9/11/07, *Robert Kern* > wrote: > > Mike Ressler wrote: > > The following seems to be a wart: is it expected? 
> > > > Set up a 10x10 array and some indexing arrays: > > > > a=arange(100) > > a.shape=(10,10) > > q=array([0,2,4,6,8]) > > r=array([0,5]) > > > > Suppose I want to extract only the "even" numbered rows from a - > then > > > > print a[q,:] > > > > > > > > Every fifth column: > > > > print a[:,r] > > > > > > > > Only the even rows of every fifth column: > > > > print a[q,r] > > > > > --------------------------------------------------------------------------- > > > Traceback (most recent > call last) > > > > /.../.../.../ in () > > > > : shape mismatch: objects cannot be > > broadcast to a single shape > > > > But, this works: > > > > print a[q,:][:,r] > > > > [[ 0 5] > > [20 25] > > [40 45] > > [60 65] > > [80 85]] > > > > So why does the a[q,r] form have problems? Thanks for your insights. > > It is intended that the form a[q,r] be the general case: q and r > are broadcasted > against each other to a single shape. The result of the indexing > is an array of > that broadcasted shape with elements found by using each pair of > elements in the > broadcasted q and r arrays as indices. > > There are operations you can express with this form that you > couldn't if the > behavior that you expected were the case whereas you can get the > result you want > relatively straightforwardly. > > In [6]: a[q[:,newaxis], r] > Out[6]: > array([[ 0, 5], > [20, 25], > [40, 45], > [60, 65], > [80, 85]]) > > > > At the risk of making Robert grumpy: while it is true the form we > ended up with is more general I've come to the conclusion that it was > a bit of a mistake. In the spirit of making simple things simple and > complex things possible, I suspect that having fancy-indexing do the > obvious thing here[1] and delegating the more powerful but also more > difficult to understand case to a function or method would have been > overall more useful. Cases where the multidimensional features of > fancy-indexing get used are messy enough that they don't benefit much > from the conciseness of the indexing notation, at least in my experience. This is a reasonable argument. It is reasonable enough that I intentionally made an ix_ function to do what you want. a[ix_(q,r)] does as originally expected if a bit more line-noise. -Travis From david at ar.media.kyoto-u.ac.jp Wed Sep 12 01:55:09 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 12 Sep 2007 14:55:09 +0900 Subject: [Numpy-discussion] how to include numpy headers when building an extension? In-Reply-To: References: <46E5D9AC.8040206@noaa.gov> <46E5DB4E.3090008@gmail.com> <46E62E42.7030308@noaa.gov> <46E63745.1060306@enthought.com> <46E63FFC.2020900@ar.media.kyoto-u.ac.jp> Message-ID: <46E77F3D.1010409@ar.media.kyoto-u.ac.jp> David M. Cooke wrote: > > I work on it off and on. As you say, it's not trivial :-) It also has > a tendency to be fragile, so large changes are harder. Something will work > for me, then I merge it into the trunk, and it breaks on half-a-dozen > platforms that I can't test on :-) So, it's slow going. > > I've got a list of my current goals at > http://scipy.org/scipy/numpy/wiki/DistutilsRevamp. > Would some contribution help ? Or is distutils such a beast that working together on it would be counter productive ? 
The things I had in mind were: - add the possibility to build shared libraries usable by ctypes (I just have a problem to start: I do not know how to add a new command to numpy.distutils) - provides an interface to be able to know how numpy/scipy was configured (for example: is numpy compiled with ATLAS, Apple perf libraries, GOTO, fftw, etc...) - add the possibility to build several object files differently (one of your goal if I remember correctly) cheers, David From Chris.Barker at noaa.gov Wed Sep 12 02:03:46 2007 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 11 Sep 2007 23:03:46 -0700 Subject: [Numpy-discussion] numpy.ndarrays as C++ arrays (wrapped with boost) In-Reply-To: References: Message-ID: <46E78142.7080500@noaa.gov> Alexander Schmolck wrote: > I just saw a closely related question posted one > week ago here (albeit mostly from a swig context). SWIG, Boost, whatever, the issues are similar. I guess what I'd love to find is an array implementation that plays well with modern C++, and also numpy. > The code currently mostly just uses > plain C double arrays passed around by pointers and I'd like to encapsulate > this at least with something like stl::vector (or maybe valarray), but I've > been wondering whether it might not make sense to use (slightly wrapped) numpy > ndarrays -- Well, you can go back and forth between pointers to data blacks and numpy arrays pretty easily. Where you thinking of doing this at the python-C++ interface, or where you looking for something you could use throughout your code. If the later, then I expect you don't want to use a Python Object (unless you're using your code only from Python). Our case is such: We want to have a nice array-like container that we can use in C++ code that makes sense both for pure C++, and interacts well with numpy arrays, as the code may be used in pure C++ app, but also want to test it, script it, etc from Python. > Also, ndarrays > provide fairly rich functionality even at the C-API-level Yes, the more I look into this, the more I'm impressed with numpy's design. > but there doesn't seem to be one obvious choice, as > there is for python. Though there may be more than one good choice -- did you check out boost::multiarray ? I didn't see that on your list. > Things that would eventually come in handy, although they're not needed yet, > are basic linear algebra and maybe two or three LAPACK-level functions (I can > think of cholesky decomposition and SVD right now) It would be nice to just have that (is MTL viable?), but writing connection code to LAPACK for a few functions is not too bad. > I think I could get all these things (and more) from scipy > (and kin) with too much fuzz (although I haven't tried wavelet support yet) > and it seems like picking together the same functionality from different C++ > libs would require considerably more work. True -- do-able, but you'd have to do it! > So my question is: might it make sense to use (a slightly wrapped) > numpy.ndarray, I guess what I'd like is a C++ array that was essentially an ndarray without the pyobject stuff -- it could then be useful for C++, but also easy to go back and forth between numpy and C++. Ideally, there'd be something that already fits that bill. I see a couple design issues that are key: "View" semantics: numpy arrays have the idea of "views" of data built in to them -- a given array can have it's own data block, or a be a view onto another. This is quite powerful and flexible, and can save a lot a data copying. 
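(A minimal numpy illustration of the view semantics being described -- editor's example:)

import numpy as np

a = np.arange(12).reshape(3, 4)
v = a[:, 1:3]          # a view: no data is copied
v[0, 0] = 99
print(a[0, 1])         # 99 -- the view writes into a's data block
print(v.base is a)     # True: v does not own its memory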
The STL containers don't seem to have that concept at all. std::valarray has utility classes that are views of a valarray, but they really only useful as temporaries - they are not full-blown valarrays. It looks like boost::multiarrays have a similar concept though """ The MultiArray concept defines an interface to hierarchically nested containers. It specifies operations for accessing elements, traversing containers, and creating views of array data. """ Another issue is dynamic typing. Templates provide a way to do generic programming, but it's only generic at the code level. At compile time, types are fixed, so you have a valarray, for instance. numpy arrays, on the other hand are of only one type - with the data type specified as meta-data essentially. I don't know what mismatch this may cause, but it's a pretty different way to structure things. (Side note: I used this feature once to re-type an array in place, using the same data block -- it was a nifty hack used to unpack an odd binary format). Would it make sense to use this approach in C++? I suspect not -- all your computational code would have to deal with it. There is also the re-sizing issue. It's pretty handy to be able to re-size arrays -- but then the data pointer can change, making it pretty impossible to share the data. Maybe it would be helpful to have a pointer-to-a-pointer instead, so that the shared pointer wouldn't change. However, there could be uglyness with the pointer changing while some other view is working with it. > That does look promising -- and it used boost::multiarrays The more I look at boost::multiarray, the better I like it (and the more it looks like numpy) -- does anyone here have experience (good or bad) with it? -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception From david at ar.media.kyoto-u.ac.jp Wed Sep 12 06:19:59 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 12 Sep 2007 19:19:59 +0900 Subject: [Numpy-discussion] numpy.ndarrays as C++ arrays (wrapped with boost) In-Reply-To: <46E78142.7080500@noaa.gov> References: <46E78142.7080500@noaa.gov> Message-ID: <46E7BD4F.90001@ar.media.kyoto-u.ac.jp> Christopher Barker wrote: > Alexander Schmolck wrote: >> I just saw a closely related question posted one >> week ago here (albeit mostly from a swig context). > > SWIG, Boost, whatever, the issues are similar. I guess what I'd love to > find is an array implementation that plays well with modern C++, and > also numpy. > Maybe I am naive, but I think a worthy goal would be a minimal C++ library which wraps ndarray, without thinking about SWIG, boost and co first. I don't know what other people are looking for, but for me, the interesting things with using C++ for ndarrays would be (in this order of importance): 1 much less error prone memory management 2 a bit more high level than plain C ndarrays (syntactic sugar mostly: keyword args, overloaded methods and so on) 3 more high level things for views 4 actual computation (linear algebra, SVD, etc...) 1 and 2 can "easily" be implemented without anything but plain C++. 3 would be a pain to do right without e.g boost::multiarray; 4 obviously needs external libraries anyway. My guess is that most people need mostly those 4 points, but not in the same order. Am I right ? 
One huge advantage of being independant of external libraries would be that the wrapper could then be included in numpy, and you could expect it everywhere. > > I guess what I'd like is a C++ array that was essentially an ndarray > without the pyobject stuff -- it could then be useful for C++, but also > easy to go back and forth between numpy and C++. > > Ideally, there'd be something that already fits that bill. I see a > couple design issues that are key: > > "View" semantics: numpy arrays have the idea of "views" of data built in > to them -- a given array can have it's own data block, or a be a view > onto another. This is quite powerful and flexible, and can save a lot a > data copying. The STL containers don't seem to have that concept at all. > std::valarray has utility classes that are views of a valarray, but they > really only useful as temporaries - they are not full-blown valarrays. What do you mean by STL does not have the concept of view ? Do you mean vector ? > > It looks like boost::multiarrays have a similar concept though > """ > The MultiArray concept defines an interface to hierarchically nested > containers. It specifies operations for accessing elements, traversing > containers, and creating views of array data. > """ > > Another issue is dynamic typing. Templates provide a way to do generic > programming, but it's only generic at the code level. At compile time, > types are fixed, so you have a valarray, for instance. numpy > arrays, on the other hand are of only one type - with the data type > specified as meta-data essentially. I don't know what mismatch this may > cause, but it's a pretty different way to structure things. (Side note: > I used this feature once to re-type an array in place, using the same > data block -- it was a nifty hack used to unpack an odd binary format). > Would it make sense to use this approach in C++? I suspect not -- all > your computational code would have to deal with it. Why not making one non template class, and having all the work done inside the class instead ? class ndarray { private: ndarray_imp a; }; > > There is also the re-sizing issue. It's pretty handy to be able to > re-size arrays -- but then the data pointer can change, making it pretty > impossible to share the data. Maybe it would be helpful to have a > pointer-to-a-pointer instead, so that the shared pointer wouldn't > change. However, there could be uglyness with the pointer changing while > some other view is working with it. If you have an array with several views on it, why not just enforcing that the block data address cannot change as long as you have a view ? This should not be too complicated, right ? I don't use views that much myself in numpy (other than implicetely, of course), so I may missing something important here cheers, David From a.schmolck at gmx.net Wed Sep 12 06:31:39 2007 From: a.schmolck at gmx.net (Alexander Schmolck) Date: Wed, 12 Sep 2007 11:31:39 +0100 Subject: [Numpy-discussion] numpy.ndarrays as C++ arrays (wrapped with boost) In-Reply-To: <46E78142.7080500@noaa.gov> (Christopher Barker's message of "Tue\, 11 Sep 2007 23\:03\:46 -0700") References: <46E78142.7080500@noaa.gov> Message-ID: Christopher Barker writes: > Alexander Schmolck wrote: >> I just saw a closely related question posted one >> week ago here (albeit mostly from a swig context). > > SWIG, Boost, whatever, the issues are similar. I guess what I'd love to > find is an array implementation that plays well with modern C++, and > also numpy. 
> > >> The code currently mostly just uses plain C double arrays passed around by >> pointers and I'd like to encapsulate this at least with something like >> stl::vector (or maybe valarray), but I've been wondering whether it might >> not make sense to use (slightly wrapped) numpy ndarrays -- > > Well, you can go back and forth between pointers to data blacks and > numpy arrays pretty easily. Where you thinking of doing this at the > python-C++ interface, or where you looking for something you could use > throughout your code. The latter -- I'd ideally like something that I can more or less transparently pass and return data between python and C++ and I want to use numpy arrays on the python side. It'd also be nice to have reference semantics and reference counting working fairly painlessly between both sides. > If the later, then I expect you don't want to use a Python Object (unless > you're using your code only from Python). Yup; that would be somewhat perverse -- although as I said I expect that most data I deal with will be pretty large, so overheads from creating python objects aren't likely to matter that much. > Our case is such: We want to have a nice array-like container that we > can use in C++ code that makes sense both for pure C++, and interacts > well with numpy arrays, as the code may be used in pure C++ app, but > also want to test it, script it, etc from Python. Yes, that's exactly what I'm after. What's your current solution for this? >> Also, ndarrays >> provide fairly rich functionality even at the C-API-level > > Yes, the more I look into this, the more I'm impressed with numpy's design. > > >> but there doesn't seem to be one obvious choice, as >> there is for python. > > Though there may be more than one good choice -- did you check out > boost::multiarray ? I didn't see that on your list. No, I hadn't looked at that -- thanks. It looks like a raw, stripped down version of a multidimensional array -- no . Since I'm mostly going to use matrices (and vectors, here and there), maybe ublas, which does provide useful numeric functionality is a better choice. I must say I find it fairly painful to figure out how to do things I consider quite basic with the matrix/array classes I come accross in C++ (I'm not exactly a C++ expert, but still); I also can't seem to find a way to construct an ublas matrix or vector from existing C-array data. >> Things that would eventually come in handy, although they're not needed yet, >> are basic linear algebra and maybe two or three LAPACK-level functions (I can >> think of cholesky decomposition and SVD right now) > > It would be nice to just have that (is MTL viable?) No idea -- as far as I can tell the webpage is broken, so I can't look at the examples (http://osl.iu.edu/research/mtl/examples.php3). It doesn't seem to provide SVD out of th box either though -- and since I've already got a boost dependency my first instinct would be to use something from there. What's the advantage of MTL over ublas? > but writing connection code to LAPACK for a few functions is not too bad. > >> I think I could get all these things (and more) from scipy >> (and kin) with too much fuzz (although I haven't tried wavelet support yet) >> and it seems like picking together the same functionality from different C++ >> libs would require considerably more work. > > True -- do-able, but you'd have to do it! 
> >> So my question is: might it make sense to use (a slightly wrapped) >> numpy.ndarray, > > I guess what I'd like is a C++ array that was essentially an ndarray > without the pyobject stuff -- it could then be useful for C++, but also > easy to go back and forth between numpy and C++. Indeed. > Ideally, there'd be something that already fits that bill. I see a > couple design issues that are key: > > "View" semantics: numpy arrays have the idea of "views" of data built in > to them -- a given array can have it's own data block, or a be a view > onto another. This is quite powerful and flexible, and can save a lot a > data copying. The STL containers don't seem to have that concept at all. Yes. C++ copying semantics seem completely braindamaged to me. > std::valarray has utility classes that are views of a valarray, but they > really only useful as temporaries - they are not full-blown valarrays. > > It looks like boost::multiarrays have a similar concept though > """ > The MultiArray concept defines an interface to hierarchically nested > containers. It specifies operations for accessing elements, traversing > containers, and creating views of array data. > """ > > Another issue is dynamic typing. Templates provide a way to do generic > programming, but it's only generic at the code level. At compile time, > types are fixed, so you have a valarray, for instance. > numpy arrays, on the other hand are of only one type - with the data type > specified as meta-data essentially. I don't know what mismatch this may > cause, but it's a pretty different way to structure things. (Side note: I > used this feature once to re-type an array in place, using the same data > block -- it was a nifty hack used to unpack an odd binary format). Would it > make sense to use this approach in C++? I suspect not -- all your > computational code would have to deal with it. Any solution that just works fine for doubles as element type would perfectly suffice for me, but yes, I'm sure compile time vs. run-time element-type-specification causes impedance mismatch. > > There is also the re-sizing issue. It's pretty handy to be able to > re-size arrays -- but then the data pointer can change, making it pretty > impossible to share the data. Maybe it would be helpful to have a > pointer-to-a-pointer instead, so that the shared pointer wouldn't > change. However, there could be uglyness with the pointer changing while > some other view is working with it. > >> > > That does look promising -- and it used boost::multiarrays Yes (and also ublas vectors and matrices). Unfortunately, the author just wrote in the c++-sig noted that he's unlikely to work on the code again -- but it might still make a good starting point for someone looking into creating nice-seamless integration between numpy and a decent C++ matrix/array type. Unfortunately I haven't time for this; I might start out just using multiarray or ublas matrices/vectors and use some primitive explicit hack to convert. > The more I look at boost::multiarray, the better I like it (and the more > it looks like numpy) -- does anyone here have experience (good or bad) > with it I'd be interested to hear about that too. cheers, 'as From kurdt.bane at gmail.com Wed Sep 12 06:43:56 2007 From: kurdt.bane at gmail.com (Kurdt Bane) Date: Wed, 12 Sep 2007 12:43:56 +0200 Subject: [Numpy-discussion] Optimizing similarity matrix algorithm Message-ID: Hi to all! For reverse engineering purposes, I need to find where every possible chunk of bytes in file A is contained in file B. 
Obviously, if a chunk of length n is contained in B, I dont' want my script to recognize also all the subchunks of size < n contained in the chunk. I coded a naive implementation of a similarity algorithm: scan the similarity algorithm and find all the diagonals. Here's the code: file_a = open(argv[1]) file_b = open(argv[2]) a = numpy.fromstring(file_a.read(),'c') b = numpy.fromstring(file_b.read(),'c') tolerance = int(argv[3]) chunks = [] valid = True count_xcent = 0 for x in xrange(len(a)): for y in xrange(len(b)): count = 0 if (a[x] == b[y]): x_cnt, y_cnt = x,y if (a[x_cnt] == b[y_cnt]): try: while (a[x_cnt+1] == b[y_cnt+1]): count += 1 x_cnt += 1 y_cnt += 1 except IndexError: pass if ((count > tolerance) or (count == tolerance)): for tuple in chunks: if (((x >= tuple[0]) and (x_cnt <= tuple[1])) and ((y >= tuple[2]) and (y_cnt <= tuple[3]))): valid = False if __debug__: print "Found an already processed subchunk" break if (valid): chunks.append ((x, x_cnt, y, y_cnt)) print "Corresponding chunk found. List 1: from " + str(x) + " to " + str(x_cnt) +". List 2: from " + str(y) + " to " + str(y_cnt) print "with length of " + str (x_cnt + 1 - x) else: valid = True It simply scans the similarity matrix, finds the diagonals in wich a[x] == a[y] and interprets the diagonal as a chunk. Then, it stores the chunk in a list and determines if it's a subchunk of another greater chunk already found. The problem is: this implementation is very slow, imho because of three factors: 1. I use a nested for loop to scan the matrix 2. When the start of a diagonal is found, the program scans the diagonal with another additional loop. Maybe it would be faster to use a function such as diagonal() (but I can't actually _create_ the boolean similarity matrix, as it gets way too big for files big enough (we're talking about ~ 100 Kb files) - I am forced to compute it "on the way". 3. When I find a chunk, I compute all its subchunks with this approach, and compare them with the "big" chunk I stored in the list. Maybe there's a better algorithmical way. What do you think about these issues? Is there a way to optimize them? And are there other issues I didn't take in account? Thanks in advance, regards, Chris. -------------- next part -------------- An HTML attachment was scrubbed... URL: From kurdt.bane at gmail.com Wed Sep 12 06:48:24 2007 From: kurdt.bane at gmail.com (Kurdt Bane) Date: Wed, 12 Sep 2007 12:48:24 +0200 Subject: [Numpy-discussion] Optimizing similarity matrix algorithm In-Reply-To: References: Message-ID: Er.. obviousl, when i wrote : "scan the similarity algorithm and find all the diagonals", I meant scan the "similarity matrix and find all the diagonals". "Similarity matrix" should really be called "Equality matrix", as I imagine it as a matrix with dimensions len(a) x len(b) where M[x][y] = (a[x] == b[y]) On 9/12/07, Kurdt Bane wrote: > > Hi to all! > For reverse engineering purposes, I need to find where every possible > chunk of bytes in file A is contained in file B. Obviously, if a chunk of > length n is contained in B, I dont' want my script to recognize also all the > subchunks of size < n contained in the chunk. > > I coded a naive implementation of a similarity algorithm: scan the > similarity algorithm and find all the diagonals. 
Here's the code: > > file_a = open(argv[1]) > file_b = open(argv[2]) > > a = numpy.fromstring (file_a.read(),'c') > b = numpy.fromstring(file_b.read(),'c') > > tolerance = int(argv[3]) > > chunks = [] > valid = True > > count_xcent = 0 > for x in xrange(len(a)): > for y in xrange(len(b)): > count = 0 > if (a[x] == b[y]): > x_cnt, y_cnt = x,y > if (a[x_cnt] == b[y_cnt]): > try: > while (a[x_cnt+1] == b[y_cnt+1]): > count += 1 > x_cnt += 1 > y_cnt += 1 > except IndexError: > pass > if ((count > tolerance) or (count == tolerance)): > for tuple in chunks: > if (((x >= tuple[0]) and (x_cnt <= tuple[1])) and > ((y >= tuple[2]) and (y_cnt <= tuple[3]))): > valid = False > if __debug__: > print "Found an already processed > subchunk" > break > if (valid): > chunks.append ((x, x_cnt, y, y_cnt)) > print "Corresponding chunk found. List 1: from " + > str(x) + " to " + str(x_cnt) +". List 2: from " + str(y) + " to " + > str(y_cnt) > print "with length of " + str (x_cnt + 1 - x) > else: > valid = True > > > It simply scans the similarity matrix, finds the diagonals in wich a[x] == > a[y] and interprets the diagonal as a chunk. Then, it stores the chunk in a > list and determines if it's a subchunk of another greater chunk already > found. > > The problem is: this implementation is very slow, imho because of three > factors: > > 1. I use a nested for loop to scan the matrix > > 2. When the start of a diagonal is found, the program scans the diagonal > with another additional loop. Maybe it would be faster to use a function > such as diagonal() (but I can't actually _create_ the boolean similarity > matrix, as it gets way too big for files big enough (we're talking about ~ > 100 Kb files) - I am forced to compute it "on the way". > > 3. When I find a chunk, I compute all its subchunks with this approach, > and compare them with the "big" chunk I stored in the list. Maybe there's a > better algorithmical way. > > What do you think about these issues? Is there a way to optimize them? And > are there other issues I didn't take in account? > > Thanks in advance, > > regards, > > Chris. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chris.Barker at noaa.gov Wed Sep 12 12:01:38 2007 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed, 12 Sep 2007 09:01:38 -0700 Subject: [Numpy-discussion] numpy.ndarrays as C++ arrays (wrapped with boost) In-Reply-To: <46E7BD4F.90001@ar.media.kyoto-u.ac.jp> References: <46E78142.7080500@noaa.gov> <46E7BD4F.90001@ar.media.kyoto-u.ac.jp> Message-ID: <46E80D62.6000003@noaa.gov> David Cournapeau wrote: > Maybe I am naive, but I think a worthy goal would be a minimal C++ > library which wraps ndarray, without thinking about SWIG, boost and co > first. That's exactly what I had in mind. If you have something that works well with ndarray -- then SWIG et al. can work with it. In principle, if you can do the transition nicely with hand-written wrappers, then you can do it with the automated tools too. > I don't know what other people are looking for, but for me, the > interesting things with using C++ for ndarrays would be (in this order > of importance): > 1 much less error prone memory management less than what? std:valarray, etc. all help with this. > 2 a bit more high level than plain C ndarrays (syntactic sugar > mostly: keyword args, overloaded methods and so on) Yes. > 3 more high level things for views I think views are key. > 4 actual computation (linear algebra, SVD, etc...) This is last on my list -- key is the core data type. 
I may be an unusual user, but what I expect is that in a given pile of code, we need one or two linear algebra routines, so I don't mind hand-wrapping LAPACK. Not that it wouldn't be nice to have it built in, but it's not a deal breaker. In any case, it should be separate: a core set of array objects, an a linear algebra (or whatever else) package built on top of it. > 3 > would be a pain to do right without e.g boost::multiarray; Yes, it sure would be nice to build it on an existing code base, and boost::multiarray seems to fit. > One huge advantage of being independant of external libraries would be > that the wrapper could then be included in numpy, and you could expect > it everywhere. That would be nice, but may be too much work. I"m really a C++ newbie, but it seems like the key here is the view semantics -- and perhaps the core solution is to have a "data block" class -- all it would have is a pointer to a block of data, and a reference counter. Then each array object would have a view of one of those -- each new array object that used a given instance would increase the ref count, and decrease it on deletion. The view would destroy itself when its refcount went to zero. (is this how numpy works now?) Even if this makes sense, I have no idea how compatible it would be with numpy and/or python. boost:multiarray does not seem to take this approach. Rather it has two classes: a multi_array: responsible for its own data block, and a multi_array_ref: which uses a view on another multiarray's data block. This is getting close, but it means that when you create a multi_array_ref, the original multi_array needs to stay around. I'd rather have much more flexible system,where you could create an array, create a view of that array, then destroy the original, then have the data block go away when you destroy the view. This could cause little complications if you started with a huge array, made a view into a tiny piece of it, then the whole data block would stick around -- but that would be up to the user to think about. >> Would it make sense to use this approach in C++? I suspect not -- all >> your computational code would have to deal with it. > Why not making one non template class, and having all the work done > inside the class instead ? > > class ndarray { > private: > ndarray_imp a; > }; hm. that could work (as far as my limited C++ knowledge tells me),b ut it's still static at run time -- which may be OK -- and is C++-is anyway. > If you have an array with several views on it, why not just enforcing > that the block data address cannot change as long as you have a view ? Maybe I"m missing what you're suggesting but this would lock in the original array once any views were on it -- that would greatly restrict flexibility. My suggestion above may help, but I think maybe I could just live without re-sizing. > This should not be too complicated, right ? I don't use views that much > myself in numpy (other than implicitly, of course), so I may missing > something important here Implicitly, we're all using them all the time -- which is why I think views are key. Alexander Schmolck wrote: > I'd ideally like something that I can more or less transparently > pass and return data between python and C++ and I want to use numpy arrays on > the python side. It'd also be nice to have reference semantics and reference > counting working fairly painlessly between both sides. Can the python gurus here comment on how possible that is? 
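For what it's worth, on the Python side numpy already handles the two issues raised here: a view holds a reference that keeps the underlying data block alive, and resize() refuses to move a block while other arrays reference it. A minimal sketch (run as a script; interactive shells hold extra references of their own):

import numpy as np

a = np.arange(10)
v = a[2:5]        # view referencing a's data block
del a             # the block stays alive as long as v does
v[:] = -1         # still valid
print(v)

b = np.arange(10)
w = b[::2]
try:
    b.resize(20)  # refused: another array references b's data
except ValueError as err:
    print(err)
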
> as I said I expect that most > data I deal with will be pretty large, so overheads from creating python > objects aren't likely to matter that much. I"m not so much worried about the overhead as the dependency -- to use your words, it would feel perverse to by including python.h for a program that wasn't using python at all. >> Our case is such: We want to have a nice array-like container that we >> can use in C++ code that makes sense both for pure C++, and interacts >> well with numpy arrays, as the code may be used in pure C++ app, but >> also want to test it, script it, etc from Python. > > Yes, that's exactly what I'm after. What's your current solution for this? We're trying to build it now. The old code used Mac-OS Handles, those have been converted to std::valarrays, and we're working on wrapping those for with numpy arrays -- which, at the moment looks like copying the data back and forth -- fine for testing code, but maybe not OK for production work. >> did you check out >> boost::multiarray ? I didn't see that on your list. > Since I'm mostly going to use > matrices (and vectors, here and there), maybe ublas, which does provide useful > numeric functionality is a better choice. Well, one of the lesson's I learned from numpy is that I'm much happier with a general purpose n-d array than with a "matrix" and "vector". the latter can be built on top of the former if you want (like it is in numpy). How compatible are multiarray and ublas matrices? It kind of looks like boost isn't really a single project, so things that could be related may not be. Hmmm -- if my concept above works, then all you need is for your n-d arrays and your matrices and vectors to all share the data "data block" class. > I must say I find it fairly painful > to figure out how to do things I consider quite basic with the matrix/array > classes I come accross in C++ (I'm not exactly a C++ expert, but still); neither am I -- but I think it's the nature of C++! > I > also can't seem to find a way to construct an ublas matrix or vector from > existing C-array data. This functionality seems to be missing from many (moat) of these C++ containers. I suspect that it's the memory management issue. One of the points of these containers it to take care of memory management for you -- if you pass in a pointer to an existing data block -- it's not managing your memory any more. >> It would be nice to just have that (is MTL viable?) > > No idea -- as far as I can tell the webpage is broken, so I can't look at the > examples (http://osl.iu.edu/research/mtl/examples.php3). Too many dead or sleeping projects.... > Yes. C++ copying semantics seem completely braindamaged to me. It's the memory management issue again -- C++ doesn't have it built in -- so it's built in to each class instead. >>> >> That does look promising -- and it used boost::multiarrays > > Yes (and also ublas vectors and matrices). Unfortunately, the author just > wrote in the c++-sig noted that he's unlikely to work on the code again -- Darn but > it might still make a good starting point for someone The advantage of open source! Full Disclosure: I have neither the skills nor the time to actually implement any of these ideas. If no one else does, then I guess we're just blabbing -- not that there is anything wrong with blabbing! 
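One concrete possibility for the Python side of that boundary: an existing data block can be exposed to numpy without copying, through the buffer interface. A minimal sketch, using a ctypes array purely as a stand-in for memory that would really be owned by C++ code:

import ctypes
import numpy as np

n = 8
cbuf = (ctypes.c_double * n)(*range(n))   # pretend this lives in a C++ container

arr = np.frombuffer(cbuf, dtype=np.float64)  # no copy: numpy borrows the memory
arr[0] = 42.0
print(cbuf[0])   # 42.0 -- the buffer and the ndarray share the same storage
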
-Chris From travis at enthought.com Wed Sep 12 12:05:15 2007 From: travis at enthought.com (Travis Vaught) Date: Wed, 12 Sep 2007 11:05:15 -0500 Subject: [Numpy-discussion] ANN: Reminder - Texas Python Regional Unconference Message-ID: Greetings, Just a reminder for those in the area... http://pycamp.python.org/Texas/HomePage The Unconference is to be held this weekend (Saturday and Sunday, September 15, 16) at the Texas Learning & Computing Center at the University of Houston main campus. It's free. Sign up by adding your name to the wiki page. Travis From myeates at jpl.nasa.gov Wed Sep 12 14:03:17 2007 From: myeates at jpl.nasa.gov (Mathew Yeates) Date: Wed, 12 Sep 2007 11:03:17 -0700 Subject: [Numpy-discussion] pycdf probs Message-ID: <46E829E5.7080500@jpl.nasa.gov> Anybody know how to contact the pycdf author? His name is Gosselin I think. There are hardcoded values that cause pycdf to segfault when using large strings. Mathew From matthieu.brucher at gmail.com Wed Sep 12 14:10:39 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Wed, 12 Sep 2007 20:10:39 +0200 Subject: [Numpy-discussion] numpy.ndarrays as C++ arrays (wrapped with boost) In-Reply-To: <46E80D62.6000003@noaa.gov> References: <46E78142.7080500@noaa.gov> <46E7BD4F.90001@ar.media.kyoto-u.ac.jp> <46E80D62.6000003@noaa.gov> Message-ID: > > less than what? std:valarray, etc. all help with this. I do not agree with this statement. A correct memory managed array would increment and decrement a reference counter somewhere. Yes, it sure would be nice to build it on an existing code base, and > boost::multiarray seems to fit. The problem with multiarray is that the dimension of the array are fixed at the compilation. Although one could use 1 for the size in the remaning dimension, I don't think it's the best choice, given that a real dynamic-dimension array is not much complicated than a static-dimension one, perhaps a little more slower. boost:multiarray does not seem to take this approach. Rather it has two > classes: a multi_array: responsible for its own data block, and a > multi_array_ref: which uses a view on another multiarray's data block. > This is getting close, but it means that when you create a > multi_array_ref, the original multi_array needs to stay around. I'd > rather have much more flexible system,where you could create an array, > create a view of that array, then destroy the original, then have the > data block go away when you destroy the view. This could cause little > complications if you started with a huge array, made a view into a tiny > piece of it, then the whole data block would stick around -- but that > would be up to the user to think about. I don't know ho numpy does it either, but a view on an view of an array may be a view on an array, so in C++, an view should only reference the data, not the real view, so when the array is destroyed, the view is still correct, as it has a reference on the data and not the original array. hm. that could work (as far as my limited C++ knowledge tells me),b ut > it's still static at run time -- which may be OK -- and is C++-is anyway. I've done this before, with type traits and a multi-dispatch method, you can instantiate several functions with the correct type. It's a classic approach that is used in plugins, and it does not use RTTI, and it is compatible across C++ compilers. Can the python gurus here comment on how possible that is? 
Once you have the Python object, increment the reference counter when you wrap the data in C++ for a real array or for a view, and decrement it in the destructor of your C++ object, is that what you mean ? If the C++ object can directly use a PyObject, it's very simple to use. It perhaps could be done by a policy class, so that temporary C++ object would use a default policy that does not rely on a Python object. I"m not so much worried about the overhead as the dependency -- to use > your words, it would feel perverse to by including python.h for a > program that wasn't using python at all. This is solved if one can use policy classes. This functionality seems to be missing from many (moat) of these C++ > containers. I suspect that it's the memory management issue. One of the > points of these containers it to take care of memory management for you > -- if you pass in a pointer to an existing data block -- it's not > managing your memory any more. What Albert did for hs wrapper is this : provide an adaptator that can use the data pointer. It's only a policy (but not the default one). Full Disclosure: I have neither the skills nor the time to actually > implement any of these ideas. If no one else does, then I guess we're > just blabbing -- not that there is anything wrong with blabbing! > I know that in my lab, we intend to wrap numpy arrays in a C++ multi array, but not the boost one. It will be for array that have more than 3 dimensions, and for less than 2 dimensions, we will use our own matrix library, as it is "simple" to wrap array with it. The most complicated thing will be the automatic conversion. It will most likely be Open Source (GPL), but I don't know when we will be able to have time to do it and then when we will make it available... Matthieu -------------- next part -------------- An HTML attachment was scrubbed... URL: From lawtrevor at gmail.com Wed Sep 12 20:24:59 2007 From: lawtrevor at gmail.com (Trevor Law) Date: Wed, 12 Sep 2007 17:24:59 -0700 Subject: [Numpy-discussion] pycdf probs In-Reply-To: <46E829E5.7080500@jpl.nasa.gov> References: <46E829E5.7080500@jpl.nasa.gov> Message-ID: <61e67ee50709121724o14693eb3je4d505e1696f0a04@mail.gmail.com> I believe I have contacted them before at this address: gosselina at dfo-mpo.gc.ca Trevor Law UC Irvine Undergraduate Student On 9/12/07, Mathew Yeates wrote: > Anybody know how to contact the pycdf author? His name is Gosselin I > think. There are hardcoded values that cause pycdf to segfault when > using large strings. > > Mathew > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From fperez.net at gmail.com Wed Sep 12 21:14:07 2007 From: fperez.net at gmail.com (Fernando Perez) Date: Wed, 12 Sep 2007 19:14:07 -0600 Subject: [Numpy-discussion] Docstring improvements for numpy.where? Message-ID: Hi all, A couple of times I've been confused by numpy.where(), and I think part of it comes from the docstring. Searching my gmail archive seems to indicate I'm not the only one bitten by this. Compare: In [14]: pdoc numpy.where Class Docstring: where(condition, | x, y) The result is shaped like condition and has elements of x and y where condition is respectively true or false. If x or y are not given, then it is equivalent to condition.nonzero(). To group the indices by element, rather than dimension, use transpose(where(condition, | x, y)) instead. 
This always results in a 2d array, with a row of indices for each element that satisfies the condition. with (b is just any array): In [17]: pdoc b.nonzero Class Docstring: a.nonzero() returns a tuple of arrays Returns a tuple of arrays, one for each dimension of a, containing the indices of the non-zero elements in that dimension. The corresponding non-zero values can be obtained with a[a.nonzero()]. To group the indices by element, rather than dimension, use transpose(a.nonzero()) instead. The result of this is always a 2d array, with a row for each non-zero element.; The sentence "The result is shaped like condition" in the where() docstring is misleading, since the behavior is really that of nonzero(). Where() *always* returns a tuple, not an array shaped like condition. If this were more clearly explained, along with a simple example for the usual case that seems to trip everyone: In [21]: a=arange(10) In [22]: N.where(a>5) Out[22]: (array([6, 7, 8, 9]),) In [23]: N.where(a>5)[0] Out[23]: array([6, 7, 8, 9]) I think we'd get a lot less confusion. Or am I missing something, or just being dense (quite likely)? Cheers, f From robert.kern at gmail.com Wed Sep 12 22:16:09 2007 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 12 Sep 2007 21:16:09 -0500 Subject: [Numpy-discussion] Docstring improvements for numpy.where? In-Reply-To: References: Message-ID: <46E89D69.5080804@gmail.com> Fernando Perez wrote: > Hi all, > > A couple of times I've been confused by numpy.where(), and I think > part of it comes from the docstring. Searching my gmail archive seems > to indicate I'm not the only one bitten by this. > > Compare: > > In [14]: pdoc numpy.where > Class Docstring: > where(condition, | x, y) > > The result is shaped like condition and has elements of x and y where > condition is respectively true or false. If x or y are not given, > then it is equivalent to condition.nonzero(). > > To group the indices by element, rather than dimension, use > > transpose(where(condition, | x, y)) > > instead. This always results in a 2d array, with a row of indices for > each element that satisfies the condition. > > with (b is just any array): > > In [17]: pdoc b.nonzero > Class Docstring: > a.nonzero() returns a tuple of arrays > > Returns a tuple of arrays, one for each dimension of a, > containing the indices of the non-zero elements in that > dimension. The corresponding non-zero values can be obtained > with > a[a.nonzero()]. > > To group the indices by element, rather than dimension, use > transpose(a.nonzero()) > instead. The result of this is always a 2d array, with a row for > each non-zero element.; > > > The sentence "The result is shaped like condition" in the where() > docstring is misleading, since the behavior is really that of > nonzero(). Where() *always* returns a tuple, not an array shaped like > condition. If this were more clearly explained, along with a simple > example for the usual case that seems to trip everyone: > > In [21]: a=arange(10) > > In [22]: N.where(a>5) > Out[22]: (array([6, 7, 8, 9]),) > > In [23]: N.where(a>5)[0] > Out[23]: array([6, 7, 8, 9]) > > I think we'd get a lot less confusion. > > Or am I missing something, or just being dense (quite likely)? That sentence applies to the 3-argument form, which has nothing to do with nonzero() and does not yield a tuple. But in general, yes, the docstring leaves much to be desired. 
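A short illustration of the two forms just described:

import numpy as np

a = np.arange(10)

# 1-argument form: behaves like nonzero(), always returns a tuple of index arrays
print(np.where(a > 5))          # (array([6, 7, 8, 9]),)

# 3-argument form: the result is shaped like the condition
print(np.where(a > 5, a, -1))   # array([-1, -1, -1, -1, -1, -1,  6,  7,  8,  9])
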
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From ndbecker2 at gmail.com Wed Sep 12 22:30:11 2007 From: ndbecker2 at gmail.com (Neal Becker) Date: Wed, 12 Sep 2007 22:30:11 -0400 Subject: [Numpy-discussion] numpy.ndarrays as C++ arrays (wrapped with boost) References: <18150.54391.274768.497392@owl.eos.ubc.ca> <46E72063.1060408@enthought.com> Message-ID: Travis E. Oliphant wrote: > >> nd to copy hundres of MB around unnecessarily. >> >> I think it is a real shame that boost currently doesn't properly support >> numpy out of the box, although numpy has long obsoleted both numarray and >> Numeric (which is both buggy and completely unsupported). All the more so >> since writing multimedial or scientific extensions (in which numpy's >> array interface is very natural to figure prominently) would seem such an >> ideal use for boost.python, as soon as complex classes or compound >> structures that need to efficiently support several (primitive) datatypes >> are involved, boost.python could really play its strenghts compared to >> Fortran/C based extensions. >> >> > I think it could be that boost.python is waiting for the extended buffer > interface which is coming in Python 3.0 and Python 2.6. This would > really be ideal for wrapping external code in a form that plays well > with other libraries. > > > -Travis O. I've spent a lot of time on this issue as well. There have been a few efforts, and at present I've followed my own path. My interest is in exposing algorithms that are written in generic c++ style to python. The requirement here is to find containers (vectors, n-dim arrays) that are friendly to the generic c++ side and can be used from python. My opinion is that Numeric and all it's descendants aren't what I want on the c++ interface side. Also, I've stumbled over trying to grok ndarrayobject.h, and I haven't had much success finding docs. What I've done instead is to basically write all that I need from numpy myself. I've usually used ublas vector and matrix to do this. I've also used boost::multi_array at times (and found it quite good), and fixed_array from stlsoft. I implemented all the arithmetic I need and many functions that operate on vectors (mostly I'm interested in vectors to represent signals - not so much higher dimen arrays). As far as views of arrays, ref counting, etc. I have not worried much about it. I thought it would be a very elegant idea, but in practice I don't really need it. The most common thing I'd do with a view is to operate on a slice. Python supports this via __setitem__. For example: u[4:10:3] += 2 works. There is no need for python to hold a reference to a vector slice to do this. Probably the biggest problem I've encountered is that there is not any perfect c++ array container. For 1-dimen, std::vector is pretty good - and the interface is reasonable. For more dimen, there doesn't seem to be any perfect solution or general agreement on interface (or semantics). One of my favorite ideas related to this. I've gotten a lot of mileage out of moving from the pair-of-iterator interface featured by stl to the boost::range. I believe it would be useful to consider a multi-dimen extension of this idea. Perhaps this could present some unifying interface to different underlying array libraries. 
For example, maybe something like: template void F (in_t const& in) { typename row_iterator::type r = row_begin (in); for (; r != row_end (in); ++r) { typename col_iterator::type c = col_begin (r); ... The idea is that even though multi_array and ublas::matrix present very different interfaces, they can be adapted to a 2-dimen range abstraction. Anyway, that's a different issue. From peridot.faceted at gmail.com Wed Sep 12 22:53:24 2007 From: peridot.faceted at gmail.com (Anne Archibald) Date: Wed, 12 Sep 2007 22:53:24 -0400 Subject: [Numpy-discussion] Docstring improvements for numpy.where? In-Reply-To: <46E89D69.5080804@gmail.com> References: <46E89D69.5080804@gmail.com> Message-ID: On 12/09/2007, Robert Kern wrote: > That sentence applies to the 3-argument form, which has nothing to do with > nonzero() and does not yield a tuple. But in general, yes, the docstring leaves > much to be desired. Well, here's what I hope is a step in the right direction. Anne -------------- next part -------------- A non-text attachment was scrubbed... Name: where-docstring.patch Type: text/x-patch Size: 1321 bytes Desc: not available URL: From Chris.Barker at noaa.gov Thu Sep 13 01:23:50 2007 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed, 12 Sep 2007 22:23:50 -0700 Subject: [Numpy-discussion] Docstring improvements for numpy.where? In-Reply-To: References: <46E89D69.5080804@gmail.com> Message-ID: <46E8C966.2050408@noaa.gov> Yes, the docs could be clearer (and thanks Ann, that's better), but I'm not sure that's the core problem.... > + If x and y are not given, condition.nonzero() is returned. This has > + the effect of returning a tuple suitable for fancy indexing. Why is this a special case of where? This just seems weird to me. Shouldn't that live somewhere else? -Chris From charlesr.harris at gmail.com Thu Sep 13 01:44:32 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 12 Sep 2007 23:44:32 -0600 Subject: [Numpy-discussion] Documentation Message-ID: Hi All, While doing documentation, I've run into the need for a simple word to express the fact that a variable can be an array, or anything that can be converted to an array using asarray(). I have been using array_like for this kind of variable, but I am open to suggestions. Also, the current version of epydoc doesn't like None as a default keyword when doing module introspection. On my machine this yields [{'bogomips': '5066.01', 'cache size': '4096 KB', 'clflush siz... as the value. I suspect this can be fixed in epydoc. There are a few other minor glitches as well. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Thu Sep 13 01:44:35 2007 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 13 Sep 2007 00:44:35 -0500 Subject: [Numpy-discussion] Docstring improvements for numpy.where? In-Reply-To: <46E8C966.2050408@noaa.gov> References: <46E89D69.5080804@gmail.com> <46E8C966.2050408@noaa.gov> Message-ID: <46E8CE43.6050900@gmail.com> Christopher Barker wrote: > Yes, the docs could be clearer (and thanks Ann, that's better), but I'm > not sure that's the core problem.... > >> + If x and y are not given, condition.nonzero() is returned. This has >> + the effect of returning a tuple suitable for fancy indexing. > > Why is this a special case of where? This just seems weird to me. It was introduced in numarray. I don't know why. > Shouldn't that live somewhere else? And it does: nonzero(). 
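For completeness, a small sketch of what nonzero() itself gives you on a 2-d array, and how the tuple is used:

import numpy as np

b = np.array([[1, 0, 2],
              [0, 3, 0]])

idx = b.nonzero()
print(idx)                 # (array([0, 0, 1]), array([0, 2, 1])) -- one array per dimension
print(b[idx])              # array([1, 2, 3]) -- the tuple is directly usable for fancy indexing
print(np.transpose(idx))   # indices grouped by element, one row per non-zero entry
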
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From ndbecker2 at gmail.com Thu Sep 13 05:26:33 2007 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 13 Sep 2007 05:26:33 -0400 Subject: [Numpy-discussion] numpy.ndarrays as C++ arrays (wrapped with boost) References: <18150.54391.274768.497392@owl.eos.ubc.ca> <46E72063.1060408@enthought.com> Message-ID: Travis E. Oliphant wrote: > >> nd to copy hundres of MB around unnecessarily. >> >> I think it is a real shame that boost currently doesn't properly support >> numpy out of the box, although numpy has long obsoleted both numarray and >> Numeric (which is both buggy and completely unsupported). All the more so >> since writing multimedial or scientific extensions (in which numpy's >> array interface is very natural to figure prominently) would seem such an >> ideal use for boost.python, as soon as complex classes or compound >> structures that need to efficiently support several (primitive) datatypes >> are involved, boost.python could really play its strenghts compared to >> Fortran/C based extensions. >> >> > I think it could be that boost.python is waiting for the extended buffer > interface which is coming in Python 3.0 and Python 2.6. This would > really be ideal for wrapping external code in a form that plays well > with other libraries. > > > -Travis O. I should also mention a couple of other efforts. IIRC, blitz++ is very close to what I wanted. It has views of arrays with ref counting. I did make some simple demo interface of blitz++ to python. I don't recall why I abandoned this approach, but I think it's because blitz++ has reached it's end-of-life. There are several promising efforts on the horizon. There is mtl4, which despite much promise has so far not delivered. There is glas. Also, there is D, which has strong native array support and PyD. This latter approach seems very interesting, but is also very immature. From steve at shrogers.com Thu Sep 13 21:13:13 2007 From: steve at shrogers.com (Steven H. Rogers) Date: Thu, 13 Sep 2007 19:13:13 -0600 Subject: [Numpy-discussion] APL2007 - Arrays and Objects - Early Bird Registration and Preliminary Program Message-ID: <46E9E029.5020004@shrogers.com> This is the last day for early bird registration for APL2007, 21-23 Oct in Montreal. It's co-located with OOPSLA2007 and sharing registration services at: http://www.regmaster.com/conf/oopsla2007.html =============== Preliminary Program =============== Tutorials and workshops ================= Introduction to APL (Ray Polivka) Object Oriented for APLers, APL for OOers (Dan Baronet) ... others in the works Presentations ========= No Experience Necessary: Hire for Aptitude - Train for Skills (Brooke Allen) Compiling APL with APEX (Robert Bernecky) APL, Bioinformatics, Cancer Research (Ken Fordyce) Generic Programming on Nesting Structure (Stephan Herhut, Sven-Bodo Scholz, Clemens Grelck) Interactive Array-Based Languages and Financial Research (Devon McCormick) Array vs Non-Array Approaches to Programming Problems (Devon McCormick) Design Issues in APL/OO Interfacing (Richard Nabavi) Arrays of Objects, or Arrays within Objects (Richard Nabavi) Competing, with J (John Randall) ... others in the works There is still room for oral or poster presentations that will not be contributed papers (published in a special issue of APL Quote Quad). 
If you would like to make an oral presentation or a poster, contact Lynne Shaw (Shaw at acm.org). ACM SIGAPL has broadened it's scope to all Array Programming Languages and NumPy/SciPy representation would be welcome. From raphael.langella at steria.cnes.fr Fri Sep 14 03:27:38 2007 From: raphael.langella at steria.cnes.fr (Langella Raphael) Date: Fri, 14 Sep 2007 09:27:38 +0200 Subject: [Numpy-discussion] Compiling numpy with 64 bits support under Solaris Message-ID: <092785B790DCD043BA45401EDA43D9B50121E05B@cst-xch-003.cnesnet.ad.cnes.fr> Hi, I'm trying to compile numpy with 64 bits support under Sparc/Solaris 8. I've already compiled Python 2.5.1 with 64 bits. I've set up my environnement with : export CC="gcc -mcpu=v9 -m64 -D_LARGEFILE64_SOURCE=1" export CXX="g++ -mcpu=v9 -m64 -D_LARGEFILE64_SOURCE=1" export LDFLAGS='-mcpu=v9 -m64' export LDDFLAGS='-mcpu=v9 -m64 -G' I also compiled blas and lapack in 64 bits. I know I don't need them for numpy, but I will soon when I'll compile scipy. I've tried to set up my site.cfg, tu use libfblas and libflapack and it didn't work. I tried libsunperf and got the same result : /outils_std/csw/gcc3/bin/g77 -mcpu=v9 -m64 build/temp.solaris-2.8-sun4u-2.5/numpy/core/blasdot/_dotblas.o -L/outils_std/SUNS11/SUNWspro/lib/v9 -L/outils_std/csw/gcc3/bin/../lib/gcc/sparc-sun-solaris2.8/3.4 .4 -lsunperf -lg2c -o build/lib.solaris-2.8-sun4u-2.5/numpy/core/_dotblas.so Undefined first referenced symbol in file PyExc_ImportError build/temp.solaris-2.8-sun4u-2.5/numpy/core/blasdot/_dotblas.o PyCObject_AsVoidPtr build/temp.solaris-2.8-sun4u-2.5/numpy/core/blasdot/_dotblas.o PyArg_ParseTuple build/temp.solaris-2.8-sun4u-2.5/numpy/core/blasdot/_dotblas.o PyExc_RuntimeError build/temp.solaris-2.8-sun4u-2.5/numpy/core/blasdot/_dotblas.o PyEval_SaveThread build/temp.solaris-2.8-sun4u-2.5/numpy/core/blasdot/_dotblas.o PyObject_GetAttrString build/temp.solaris-2.8-sun4u-2.5/numpy/core/blasdot/_dotblas.o PyExc_ValueError build/temp.solaris-2.8-sun4u-2.5/numpy/core/blasdot/_dotblas.o MAIN__ /outils_std/csw/gcc3/bin/../lib/gcc/sparc-sun-solaris2.8/3.4.4 /../../../sparcv9/libfrtbegin.a(frtbegin.o) PyErr_SetString build/temp.solaris-2.8-sun4u-2.5/numpy/core/blasdot/_dotblas.o PyErr_Format build/temp.solaris-2.8-sun4u-2.5/numpy/core/blasdot/_dotblas.o PyCObject_Type build/temp.solaris-2.8-sun4u-2.5/numpy/core/blasdot/_dotblas.o PyTuple_New build/temp.solaris-2.8-sun4u-2.5/numpy/core/blasdot/_dotblas.o PyErr_Print build/temp.solaris-2.8-sun4u-2.5/numpy/core/blasdot/_dotblas.o PyImport_ImportModule build/temp.solaris-2.8-sun4u-2.5/numpy/core/blasdot/_dotblas.o _Py_NoneStruct build/temp.solaris-2.8-sun4u-2.5/numpy/core/blasdot/_dotblas.o Py_InitModule4_64 build/temp.solaris-2.8-sun4u-2.5/numpy/core/blasdot/_dotblas.o PyEval_RestoreThread build/temp.solaris-2.8-sun4u-2.5/numpy/core/blasdot/_dotblas.o ld: fatal: Symbol referencing errors. 
No output written to build/lib.solaris-2.8-sun4u-2.5/numpy/core/_dotblas.so collect2: ld returned 1 exit status Undefined first referenced symbol in file PyExc_ImportError build/temp.solaris-2.8-sun4u-2.5/numpy/core/blasdot/_dotblas.o PyCObject_AsVoidPtr build/temp.solaris-2.8-sun4u-2.5/numpy/core/blasdot/_dotblas.o PyArg_ParseTuple build/temp.solaris-2.8-sun4u-2.5/numpy/core/blasdot/_dotblas.o PyExc_RuntimeError build/temp.solaris-2.8-sun4u-2.5/numpy/core/blasdot/_dotblas.o PyEval_SaveThread build/temp.solaris-2.8-sun4u-2.5/numpy/core/blasdot/_dotblas.o PyObject_GetAttrString build/temp.solaris-2.8-sun4u-2.5/numpy/core/blasdot/_dotblas.o PyExc_ValueError build/temp.solaris-2.8-sun4u-2.5/numpy/core/blasdot/_dotblas.o MAIN__ /outils_std/csw/gcc3/bin/../lib/gcc/sparc-sun-solaris2.8/3.4.4 /../../../sparcv9/libfrtbegin.a(frtbegin.o) PyErr_SetString build/temp.solaris-2.8-sun4u-2.5/numpy/core/blasdot/_dotblas.o PyErr_Format build/temp.solaris-2.8-sun4u-2.5/numpy/core/blasdot/_dotblas.o PyCObject_Type build/temp.solaris-2.8-sun4u-2.5/numpy/core/blasdot/_dotblas.o PyTuple_New build/temp.solaris-2.8-sun4u-2.5/numpy/core/blasdot/_dotblas.o PyErr_Print build/temp.solaris-2.8-sun4u-2.5/numpy/core/blasdot/_dotblas.o PyImport_ImportModule build/temp.solaris-2.8-sun4u-2.5/numpy/core/blasdot/_dotblas.o _Py_NoneStruct build/temp.solaris-2.8-sun4u-2.5/numpy/core/blasdot/_dotblas.o Py_InitModule4_64 build/temp.solaris-2.8-sun4u-2.5/numpy/core/blasdot/_dotblas.o PyEval_RestoreThread build/temp.solaris-2.8-sun4u-2.5/numpy/core/blasdot/_dotblas.o ld: fatal: Symbol referencing errors. No output written to build/lib.solaris-2.8-sun4u-2.5/numpy/core/_dotblas.so collect2: ld returned 1 exit status error: Command "/outils_std/csw/gcc3/bin/g77 -mcpu=v9 -m64 build/temp.solaris-2.8-sun4u-2.5/numpy/core/blasdot/_dotblas.o -L/outils_std/SUNS11/SUNWspro/lib/v9 -L/outils_std/csw/gcc3/bin/../lib/gcc/sparc-sun-solaris2.8/3.4 .4 -lsunperf -lg2c -o build/lib.solaris-2.8-sun4u-2.5/numpy/core/_dotblas.so" failed with exit status 1 Does numpy and scipy support 64 bits under Sparc/Solaris? Thanks. Regards, Rapha?l Langella From david at ar.media.kyoto-u.ac.jp Fri Sep 14 03:26:56 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Fri, 14 Sep 2007 16:26:56 +0900 Subject: [Numpy-discussion] Compiling numpy with 64 bits support under Solaris In-Reply-To: <092785B790DCD043BA45401EDA43D9B50121E05B@cst-xch-003.cnesnet.ad.cnes.fr> References: <092785B790DCD043BA45401EDA43D9B50121E05B@cst-xch-003.cnesnet.ad.cnes.fr> Message-ID: <46EA37C0.7030109@ar.media.kyoto-u.ac.jp> Langella Raphael wrote: > Hi, > I'm trying to compile numpy with 64 bits support under > Sparc/Solaris 8. I've already compiled Python 2.5.1 with 64 > bits. I've set up my environnement with : > > export CC="gcc -mcpu=v9 -m64 -D_LARGEFILE64_SOURCE=1" > export CXX="g++ -mcpu=v9 -m64 -D_LARGEFILE64_SOURCE=1" > export LDFLAGS='-mcpu=v9 -m64' > export LDDFLAGS='-mcpu=v9 -m64 -G' > > I am afraid this won't work really well, because it overwrites LDFLAGS. Unfortunately, AFAIK, there is no easy way to change flags used for compilation and linking. I don't think this is linked to 32 vs 64 bits problem (though I may be wrong; I don't know much about solaris). > I also compiled blas and lapack in 64 bits. I know I don't > need them for numpy, but I will soon when I'll compile scipy. > I've tried to set up my site.cfg, tu use libfblas and > libflapack and it didn't work. 
I tried libsunperf and got the > same result : > See http://projects.scipy.org/pipermail/scipy-user/2007-September/013580.html (the problem being about the sun compilers, I think this applies to sparc as well). cheers, David From edschofield at gmail.com Fri Sep 14 05:37:53 2007 From: edschofield at gmail.com (Ed Schofield) Date: Fri, 14 Sep 2007 10:37:53 +0100 Subject: [Numpy-discussion] arange and floating point arguments Message-ID: <1b5a37350709140237r4267e344wda309f315812d723@mail.gmail.com> Hi everyone, This was reported yesterday as a bug in Debian's numpy package: >>> len(numpy.arange(0, 0.6, 0.1)) == len(numpy.arange(0, 0.4+0.2, 0.1)) False The cause is this: >>> ceil((0.4+0.2)/0.1) 7.0 >>> ceil(0.6/0.1) 6.0 which holds for both numpy's and the standard library's ceil(). Using arange in this way is a fundamentally unreliable thing to do, but is there anything we want to do about this? Should numpy emit a warning when using arange with floating point values when (stop-start)/step is close to an integer? -- Ed From lbolla at gmail.com Fri Sep 14 05:46:49 2007 From: lbolla at gmail.com (lorenzo bolla) Date: Fri, 14 Sep 2007 11:46:49 +0200 Subject: [Numpy-discussion] arange and floating point arguments In-Reply-To: <1b5a37350709140237r4267e344wda309f315812d723@mail.gmail.com> References: <1b5a37350709140237r4267e344wda309f315812d723@mail.gmail.com> Message-ID: <80c99e790709140246r184a605ar14023d16a8d5a00a@mail.gmail.com> this is really annoying. Matlab handles the "ceil" weirdness quite well, though. -------------------------------------------------------------- >> ceil(0.6/0.1) ans = 6 >> ceil((0.4+0.2)/0.1) ans = 7 >> 0:0.1:0.6 ans = 0 1.000000000000000e-001 2.000000000000000e-001 3.000000000000000e-001 4.000000000000000e-001 5.000000000000000e-001 6.000000000000000e-001 >> 0:0.1:(0.4+0.2) ans = 0 1.000000000000000e-001 2.000000000000000e-001 3.000000000000000e-001 4.000000000000001e-001 5.000000000000001e-001 6.000000000000001e-001 >> length(0:0.1:0.6) == length(0:0.1:(0.4+0.2)) ans = 1 -------------------------------------------------------------- hth, L. On 9/14/07, Ed Schofield wrote: > > Hi everyone, > > This was reported yesterday as a bug in Debian's numpy package: > > >>> len(numpy.arange(0, 0.6, 0.1)) == len(numpy.arange(0, 0.4+0.2, 0.1)) > False > > The cause is this: > > >>> ceil((0.4+0.2)/0.1) > 7.0 > > >>> ceil(0.6/0.1) > 6.0 > > which holds for both numpy's and the standard library's ceil(). > > Using arange in this way is a fundamentally unreliable thing to do, > but is there anything we want to do about this? Should numpy emit a > warning when using arange with floating point values when > (stop-start)/step is close to an integer? > > -- Ed > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
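A minimal script showing the discrepancy, and the usual workaround of asking for a fixed number of points instead:

import numpy as np

# the two calls look identical, but 0.4 + 0.2 is slightly larger than 0.6
# in binary floating point, so the computed length differs by one
print(len(np.arange(0, 0.6, 0.1)))        # 6
print(len(np.arange(0, 0.4 + 0.2, 0.1)))  # 7

# specifying the number of points sidesteps the rounding issue entirely
print(np.linspace(0, 0.6, 7))             # 7 evenly spaced points, endpoint included
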
URL: From Joris.DeRidder at ster.kuleuven.be Fri Sep 14 08:07:25 2007 From: Joris.DeRidder at ster.kuleuven.be (Joris De Ridder) Date: Fri, 14 Sep 2007 14:07:25 +0200 Subject: [Numpy-discussion] arange and floating point arguments In-Reply-To: <1b5a37350709140237r4267e344wda309f315812d723@mail.gmail.com> References: <1b5a37350709140237r4267e344wda309f315812d723@mail.gmail.com> Message-ID: <05561D7A-86D9-4478-ADDF-E5222107E911@ster.kuleuven.be> Might using min(ceil((stop-start)/step), ceil((stop-start)/step-r)) with r = finfo(double).resolution instead of ceil((stop-start)/step) perhaps be useful? Joris On 14 Sep 2007, at 11:37, Ed Schofield wrote: > Hi everyone, > > This was reported yesterday as a bug in Debian's numpy package: > >>>> len(numpy.arange(0, 0.6, 0.1)) == len(numpy.arange(0, 0.4+0.2, >>>> 0.1)) > False > > The cause is this: > >>>> ceil((0.4+0.2)/0.1) > 7.0 > >>>> ceil(0.6/0.1) > 6.0 > > which holds for both numpy's and the standard library's ceil(). > > Using arange in this way is a fundamentally unreliable thing to do, > but is there anything we want to do about this? Should numpy emit a > warning when using arange with floating point values when > (stop-start)/step is close to an integer? > > -- Ed Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm From david at ar.media.kyoto-u.ac.jp Fri Sep 14 08:19:34 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Fri, 14 Sep 2007 21:19:34 +0900 Subject: [Numpy-discussion] Requesting svn write access to numpy ? Message-ID: <46EA7C56.5090506@ar.media.kyoto-u.ac.jp> Hi, I would like to know whether I could request svn write access to numpy svn. There are several things I would like to work on which are big enough so that just patch would be difficult, and branches more appropriate, and my understanding is that svn branches requires write access. The things I would like to work on are: - in the immediate futur: support for sunperf, and solaris compilers - some work on numpy FFT (using KISS FFT, which is a BSD, really small library for FFT, see my other email) - some work on distutils for ctypes support (eg being able to build extension to be used with ctypes). Thank you, David From lou_boog2000 at yahoo.com Fri Sep 14 09:54:26 2007 From: lou_boog2000 at yahoo.com (Lou Pecora) Date: Fri, 14 Sep 2007 06:54:26 -0700 (PDT) Subject: [Numpy-discussion] arange and floating point arguments In-Reply-To: <05561D7A-86D9-4478-ADDF-E5222107E911@ster.kuleuven.be> Message-ID: <48959.82257.qm@web34401.mail.mud.yahoo.com> I thought this is what the linspace function was written for in numpy. Why not use that? It works just like you would want always including the final point. --- Joris De Ridder wrote: > Might using > > min(ceil((stop-start)/step), > ceil((stop-start)/step-r)) > > with r = finfo(double).resolution instead of > ceil((stop-start)/step) > perhaps be useful? > > Joris -- Lou Pecora, my views are my own. ____________________________________________________________________________________ Catch up on fall's hot new shows on Yahoo! TV. Watch previews, get listings, and more! 
http://tv.yahoo.com/collections/3658

From Joris.DeRidder at ster.kuleuven.be  Fri Sep 14 10:48:42 2007
From: Joris.DeRidder at ster.kuleuven.be (Joris De Ridder)
Date: Fri, 14 Sep 2007 16:48:42 +0200
Subject: [Numpy-discussion] arange and floating point arguments
In-Reply-To: <48959.82257.qm@web34401.mail.mud.yahoo.com>
References: <48959.82257.qm@web34401.mail.mud.yahoo.com>
Message-ID: <048509B7-B16B-4489-9335-28E6125ADE84@ster.kuleuven.be>

On 14 Sep 2007, at 15:54, Lou Pecora wrote:
> I thought this is what the linspace function was
> written for in numpy.  Why not use that?

AFAIK, linspace() is written to generate N evenly spaced numbers between start and stop inclusive. Similar, but not quite the same as arange().

> It works just like you would want always including the final point.

The example I gave was actually meant to _avoid_ inclusion of the last point. E.g.

In [93]: arange(0.0, 0.4+0.2, 0.1)
Out[93]: array([ 0. ,  0.1,  0.2,  0.3,  0.4,  0.5,  0.6])

In [94]: myrange(0.0, 0.4+0.2, 0.1)
Out[94]: array([ 0. ,  0.1,  0.2,  0.3,  0.4,  0.5])

where myrange() is an ad hoc replacement for arange():

def myrange(start, stop, step):
    r = finfo(double).resolution
    N = min(ceil((stop-start)/step), ceil((stop-start)/step-r))
    return start + arange(N) * step

I'm not 100% sure that the above version of myrange() wouldn't generate surprising results in some cases. If it doesn't, why not include it in (the C version of) arange()? I don't think users actually count on the occasional inclusion of the end point, so it would not break code. It would, however, avoid some surprises from time to time. From the example of Lorenzo, it seems that Matlab always includes the endpoint. How exactly is their equivalent of arange() defined?

Joris

From oliphant at enthought.com  Fri Sep 14 11:01:38 2007
From: oliphant at enthought.com (Travis E. Oliphant)
Date: Fri, 14 Sep 2007 10:01:38 -0500
Subject: [Numpy-discussion] Requesting svn write access to numpy ?
In-Reply-To: <46EA7C56.5090506@ar.media.kyoto-u.ac.jp>
References: <46EA7C56.5090506@ar.media.kyoto-u.ac.jp>
Message-ID: <46EAA252.2000908@enthought.com>

David Cournapeau wrote:
> Hi,
>
> I would like to know whether I could request svn write access to
> numpy svn. There are several things I would like to work on which are
> big enough so that just patch would be difficult, and branches more
> appropriate, and my understanding is that svn branches requires write
> access. The things I would like to work on are:
> - in the immediate futur: support for sunperf, and solaris compilers
> - some work on numpy FFT (using KISS FFT, which is a BSD, really
> small library for FFT, see my other email)
> - some work on distutils for ctypes support (eg being able to build
> extension to be used with ctypes).
>
Yes, I'll try and set you up.  What username would you like?
-Travis From robert.kern at gmail.com Fri Sep 14 11:54:19 2007 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 14 Sep 2007 10:54:19 -0500 Subject: [Numpy-discussion] arange and floating point arguments In-Reply-To: <1b5a37350709140237r4267e344wda309f315812d723@mail.gmail.com> References: <1b5a37350709140237r4267e344wda309f315812d723@mail.gmail.com> Message-ID: <46EAAEAB.2090204@gmail.com> Ed Schofield wrote: > Hi everyone, > > This was reported yesterday as a bug in Debian's numpy package: > >>>> len(numpy.arange(0, 0.6, 0.1)) == len(numpy.arange(0, 0.4+0.2, 0.1)) > False > > The cause is this: > >>>> ceil((0.4+0.2)/0.1) > 7.0 > >>>> ceil(0.6/0.1) > 6.0 > > which holds for both numpy's and the standard library's ceil(). >>> 0.6 == (0.4+0.2) False Consequently, not a bug. > Using arange in this way is a fundamentally unreliable thing to do, > but is there anything we want to do about this? Tell people to use linspace(). Yes, it does a slightly different thing; that's why it works. Most uses of floating point arange() can be cast using linspace() more reliably. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From peridot.faceted at gmail.com Fri Sep 14 12:01:34 2007 From: peridot.faceted at gmail.com (Anne Archibald) Date: Fri, 14 Sep 2007 12:01:34 -0400 Subject: [Numpy-discussion] arange and floating point arguments In-Reply-To: <46EAAEAB.2090204@gmail.com> References: <1b5a37350709140237r4267e344wda309f315812d723@mail.gmail.com> <46EAAEAB.2090204@gmail.com> Message-ID: On 14/09/2007, Robert Kern wrote: > Ed Schofield wrote: > > Using arange in this way is a fundamentally unreliable thing to do, > > but is there anything we want to do about this? > > Tell people to use linspace(). Yes, it does a slightly different thing; that's > why it works. Most uses of floating point arange() can be cast using linspace() > more reliably. I would like to point out in particular that numpy's linspace can leave out the last point (something I often want to do): Definition: linspace(start, stop, num=50, endpoint=True, retstep=False) Docstring: Return evenly spaced numbers. Return num evenly spaced samples from start to stop. If endpoint is True, the last sample is stop. If retstep is True then return the step value used. This is one of those cases where "from pylab import *" is going to bite you, though, because its linspace doesn't. You can always fake it with linspace(a,b,N+1)[:-1]. Anne From millman at berkeley.edu Fri Sep 14 13:45:53 2007 From: millman at berkeley.edu (Jarrod Millman) Date: Fri, 14 Sep 2007 10:45:53 -0700 Subject: [Numpy-discussion] Requesting svn write access to numpy ? In-Reply-To: <46EAA252.2000908@enthought.com> References: <46EA7C56.5090506@ar.media.kyoto-u.ac.jp> <46EAA252.2000908@enthought.com> Message-ID: Hey Travis (and David), Since you (Travis) approved, I went ahead and gave David (cdavid) svn commit access to numpy. If you (David) have any difficulties, feel free to email me directly and I will take care of it. 
Cheers,

-- 
Jarrod Millman
Computational Infrastructure for Research Labs
10 Giannini Hall, UC Berkeley
phone: 510.643.4014
http://cirl.berkeley.edu/

From charlesr.harris at gmail.com  Fri Sep 14 14:12:43 2007
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 14 Sep 2007 12:12:43 -0600
Subject: [Numpy-discussion] arange and floating point arguments
In-Reply-To: <1b5a37350709140237r4267e344wda309f315812d723@mail.gmail.com>
References: <1b5a37350709140237r4267e344wda309f315812d723@mail.gmail.com>
Message-ID: 

On 9/14/07, Ed Schofield wrote:
>
> Hi everyone,
>
> This was reported yesterday as a bug in Debian's numpy package:
>
> >>> len(numpy.arange(0, 0.6, 0.1)) == len(numpy.arange(0, 0.4+0.2, 0.1))
> False
>
> The cause is this:
>
> >>> ceil((0.4+0.2)/0.1)
> 7.0
>
> >>> ceil(0.6/0.1)
> 6.0
>
> which holds for both numpy's and the standard library's ceil().

Since none of the numbers are exactly represented in IEEE floating point, this sort of oddity is expected. If you look at the exact values, (.4 + .2)/.1 > 6 and .6/.1 < 6. That said, I would expect something like ceil(interval/delta - relatively_really_small_number) would generally return the expected result. Matlab probably plays these sorts of games. The downside is encouraging bad programming habits. In this case, the programmer should be using linspace.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From peridot.faceted at gmail.com  Fri Sep 14 14:35:14 2007
From: peridot.faceted at gmail.com (Anne Archibald)
Date: Fri, 14 Sep 2007 14:35:14 -0400
Subject: [Numpy-discussion] arange and floating point arguments
In-Reply-To: 
References: <1b5a37350709140237r4267e344wda309f315812d723@mail.gmail.com>
Message-ID: 

On 14/09/2007, Charles R Harris wrote:
> Since none of the numbers are exactly represented in IEEE floating point,
> this sort of oddity is expected. If you look at the exact values, (.4 +
> .2)/.1 > 6 and .6/.1 < 6 . That said, I would expect something like
> ceil(interval/delta - relatively_really_small_number) would generally return
> the expected result. Matlab probably plays these sort of games. The downside
> is encouraging bad programming habits. In this case, the programmer should
> be using linspace..

There is actually a context in which floating-point arange makes sense: when you want evenly-spaced points and don't much care how many there are. No reason to play games in this context of course; the question is how to reduce user astonishment.

Anne

From haase at msg.ucsf.edu  Fri Sep 14 15:14:47 2007
From: haase at msg.ucsf.edu (Sebastian Haase)
Date: Fri, 14 Sep 2007 21:14:47 +0200
Subject: [Numpy-discussion] arange and floating point arguments
In-Reply-To: 
References: <1b5a37350709140237r4267e344wda309f315812d723@mail.gmail.com>
Message-ID: 

On 9/14/07, Charles R Harris wrote:
>
> Since none of the numbers are exactly represented in IEEE floating point,
> this sort of oddity is expected. If you look at the exact values, (.4 +
> .2)/.1 > 6 and .6/.1 < 6 .

Just for my own benefit (and the pastime) here are the actual numbers I get in my PyShell:

>>> 0.6 == (0.4+0.2)
False
>>> `.6`
'0.59999999999999998'
>>> `.4`
'0.40000000000000002'
>>> `.2`
'0.20000000000000001'
>>> `.2+.4`
'0.60000000000000009'

To my naive eye this is just "fantastic" ...
;-) -Sebastian Haase PS:you might even notice that "1+2 = 9" ;-) From eike.welk at gmx.net Fri Sep 14 15:49:49 2007 From: eike.welk at gmx.net (Eike Welk) Date: Fri, 14 Sep 2007 21:49:49 +0200 Subject: [Numpy-discussion] arange and floating point arguments In-Reply-To: References: <1b5a37350709140237r4267e344wda309f315812d723@mail.gmail.com> Message-ID: <200709142149.49976.eike.welk@gmx.net> On Friday 14 September 2007 20:12, Charles R Harris wrote: > Since none of the numbers are exactly represented in IEEE floating > point, this sort of oddity is expected. If you look at the exact > values, (.4 + .2)/.1 > 6 and .6/.1 < 6 . That said, I would expect You hit send too fast! The fractions that can be represented exactly in binary are: 1/2, 1/4, 1/8, ... and not 2/10, 4/10, 8/10 .... See here: In [1]:0.5 == .25+.25 Out[1]:True In [2]:.5 Out[2]:0.5 In [3]:.25 Out[3]:0.25 In [4]:.125 Out[4]:0.125 In [8]:.375 == .25 + .125 Out[8]:True Regards, Eike. From charlesr.harris at gmail.com Fri Sep 14 16:10:28 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 14 Sep 2007 14:10:28 -0600 Subject: [Numpy-discussion] binomial, multinomial coefficients Message-ID: Does anyone know if there are routines in scipy to compute these numbers? If not, I could code some up if there is any interest. As a related question, are there routines for returning the probabilities (as opposed to random number generators) for the various distributions? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Fri Sep 14 16:24:44 2007 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 14 Sep 2007 15:24:44 -0500 Subject: [Numpy-discussion] binomial, multinomial coefficients In-Reply-To: References: Message-ID: <46EAEE0C.2010304@gmail.com> Charles R Harris wrote: > Does anyone know if there are routines in scipy to compute these > numbers? scipy.misc.comb() will handle the binomial coefficients. A ufunc or an implementation that would broadcast would be welcome, though. I don't think we have one for multinomial coefficients. > If not, I could code some up if there is any interest. As a > related question, are there routines for returning the probabilities (as > opposed to random number generators) for the various distributions? scipy.stats should have all of the 1D pdfs though not the multinomial. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From Joris.DeRidder at ster.kuleuven.be Fri Sep 14 16:36:46 2007 From: Joris.DeRidder at ster.kuleuven.be (Joris De Ridder) Date: Fri, 14 Sep 2007 22:36:46 +0200 Subject: [Numpy-discussion] arange and floating point arguments In-Reply-To: <46EAAEAB.2090204@gmail.com> References: <1b5a37350709140237r4267e344wda309f315812d723@mail.gmail.com> <46EAAEAB.2090204@gmail.com> Message-ID: > the question is how to reduce user astonishment. IMHO this is exactly the point. There seems to be two questions here: 1) do we want to reduce user astonishment, and 2) if yes, how could we do this? Not everyone seems to be convinced of the first question, replying that in many cases linspace() could well replace arange(). In many cases, yes, but not all. For some cases arange() has its legitimate use, even for floating point, and in these cases you may get bitten by the inexact number representation. If Matlab seems to be able to avoid surprises, why not numpy? 
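As a rough illustration of the kind of tolerance-based rule being discussed, one could treat the endpoint as part of the grid whenever it lands within a few eps of a multiple of the step. The helper below is a sketch only, not an existing numpy function; its name and the choice of a 4*eps tolerance are made up for the example:

import numpy as np

def matlab_like_range(start, stop, step):
    # include the endpoint when it is within a few eps of a grid point,
    # which is roughly how Matlab's colon operator avoids the surprise above
    eps = np.finfo(float).eps
    n = int(np.floor((stop - start) / step * (1 + 4 * eps))) + 1
    return start + step * np.arange(n)

# both spellings of the same range now give 7 points ending near 0.6
assert len(matlab_like_range(0.0, 0.6, 0.1)) == 7
assert len(matlab_like_range(0.0, 0.4 + 0.2, 0.1)) == 7

Whether a rule of this kind belongs in arange() itself is exactly the open question in this thread.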
Joris Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm From tom.denniston at alum.dartmouth.org Fri Sep 14 16:48:48 2007 From: tom.denniston at alum.dartmouth.org (Tom Denniston) Date: Fri, 14 Sep 2007 15:48:48 -0500 Subject: [Numpy-discussion] NotImplementedType Message-ID: Sometimes numpy operationrs result in NotImplementedType. It makes it a little hard to debug because the problem then crops up later when you try to do an operation with the NotImplementedType. Does anyone know of a way to get numpy to raise instead of returning not implemented type? (Pydb) other.value NotImplemented (Pydb) print other.value NotImplemented (Pydb) type(other.value) --Tom From tim.hochberg at ieee.org Fri Sep 14 16:59:48 2007 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Fri, 14 Sep 2007 13:59:48 -0700 Subject: [Numpy-discussion] arange and floating point arguments In-Reply-To: References: <1b5a37350709140237r4267e344wda309f315812d723@mail.gmail.com> <46EAAEAB.2090204@gmail.com> Message-ID: On 9/14/07, Joris De Ridder wrote: > > > > > the question is how to reduce user astonishment. > > IMHO this is exactly the point. There seems to be two questions here: > 1) do we want to reduce user astonishment, and 2) if yes, how could > we do this? Not everyone seems to be convinced of the first question, > replying that in many cases linspace() could well replace arange(). > In many cases, yes, but not all. For some cases arange() has its > legitimate use, even for floating point, and in these cases you may > get bitten by the inexact number representation. If Matlab seems to > be able to avoid surprises, why not numpy? Perhaps because it's a bad idea? This case may be different, but in general in cases where you try to sweep the surprising nature of floating point under the rug, you are never entirely successful. The end result is that, although surprises crop up with less regularity, they are much, much harder to diagnose and understand when they do crop up. If arange can be "fixed" in a way that's easy to understand, then great. However, if the algorithm for deciding the points is anything but dirt simple, leave it alone. Or, perhaps, deprecate floating point values as arguments. I'm not very convinced by the arguments advanced thus far that arange with floating point has legitimate uses. I've certainly used it this way myself, but I believe that all of my uses could easily be replaced with either linspace or arange with integer arguments. I suspect that cases where the exact properties of arange are required are far between and it's easy enough to simulate the current behaviour if needed. An advantage to that is that the potential pitfalls become obvious when you roll your own version. -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From oliphant at enthought.com Fri Sep 14 17:21:40 2007 From: oliphant at enthought.com (Travis E. Oliphant) Date: Fri, 14 Sep 2007 16:21:40 -0500 Subject: [Numpy-discussion] NotImplementedType In-Reply-To: References: Message-ID: <46EAFB64.5090601@enthought.com> Tom Denniston wrote: > Sometimes numpy operationrs result in NotImplementedType. It makes it > a little hard to debug because the problem then crops up later when > you try to do an operation with the NotImplementedType. Does anyone > know of a way to get numpy to raise instead of returning not > implemented type? > Part of the issue is that this is what Python expects when it does it's mixed-type operations. 
So, which operators are you referring to exactly? -Travis From tom.denniston at alum.dartmouth.org Fri Sep 14 17:40:50 2007 From: tom.denniston at alum.dartmouth.org (Tom Denniston) Date: Fri, 14 Sep 2007 16:40:50 -0500 Subject: [Numpy-discussion] NotImplementedType In-Reply-To: <46EAFB64.5090601@enthought.com> References: <46EAFB64.5090601@enthought.com> Message-ID: The hitch is the error is in the bowels of the Scientific Python so I was trying to get it to throw an exception to see what was going on. It's while Scientific Python is trying to take a derivate. It's further aggrevated by the fact that due to some bug in pdb or pydb, i'm unable to get up the stack in the debuger and look at the data that caused the NotImplemntedType result. I can always put breakpoints in the Scientific Python code I just thought it would be easier if I could easily cause it to throw and exception when the error occurs. If that's hard i'll just set breakpoints and dig. --Tom On 9/14/07, Travis E. Oliphant wrote: > Tom Denniston wrote: > > Sometimes numpy operationrs result in NotImplementedType. It makes it > > a little hard to debug because the problem then crops up later when > > you try to do an operation with the NotImplementedType. Does anyone > > know of a way to get numpy to raise instead of returning not > > implemented type? > > > Part of the issue is that this is what Python expects when it does it's > mixed-type operations. So, which operators are you referring to exactly? > > -Travis > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From charlesr.harris at gmail.com Fri Sep 14 17:48:00 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 14 Sep 2007 15:48:00 -0600 Subject: [Numpy-discussion] arange and floating point arguments In-Reply-To: References: <1b5a37350709140237r4267e344wda309f315812d723@mail.gmail.com> <46EAAEAB.2090204@gmail.com> Message-ID: On 9/14/07, Timothy Hochberg wrote: > > > > On 9/14/07, Joris De Ridder wrote: > > > > > > > > > the question is how to reduce user astonishment. > > > > IMHO this is exactly the point. There seems to be two questions here: > > 1) do we want to reduce user astonishment, and 2) if yes, how could > > we do this? Not everyone seems to be convinced of the first question, > > replying that in many cases linspace() could well replace arange(). > > In many cases, yes, but not all. For some cases arange() has its > > legitimate use, even for floating point, and in these cases you may > > get bitten by the inexact number representation. If Matlab seems to > > be able to avoid surprises, why not numpy? > > > Perhaps because it's a bad idea? This case may be different, but in > general in cases where you try to sweep the surprising nature of floating > point under the rug, you are never entirely successful. The end result is > that, although surprises crop up with less regularity, they are much, much > harder to diagnose and understand when they do crop up. > Exactly. The problem becomes even more dependent on particular circumstance. For instance, if (.2 + .2 + .1) is used instead of (.2 + .4). If arange can be "fixed" in a way that's easy to understand, then great. > However, if the algorithm for deciding the points is anything but dirt > simple, leave it alone. Or, perhaps, deprecate floating point values as > arguments. 
I'm not very convinced by the arguments advanced thus far that > arange with floating point has legitimate uses. I've certainly used it this > way myself, but I believe that all of my uses could easily be replaced with > either linspace or arange with integer arguments. I suspect that cases > where the exact properties of arange are required are far between and it's > easy enough to simulate the current behaviour if needed. An advantage to > that is that the potential pitfalls become obvious when you roll your own > version. > In the case of arange it should be possible to determine when the result is potentially ambiguous and issue a warning. For instance, if the argument of the ceil function is close to its rounded value. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Fri Sep 14 17:51:27 2007 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 14 Sep 2007 16:51:27 -0500 Subject: [Numpy-discussion] arange and floating point arguments In-Reply-To: References: <1b5a37350709140237r4267e344wda309f315812d723@mail.gmail.com> <46EAAEAB.2090204@gmail.com> Message-ID: <46EB025F.4070205@gmail.com> Joris De Ridder wrote: > >> the question is how to reduce user astonishment. > > IMHO this is exactly the point. There seems to be two questions here: > 1) do we want to reduce user astonishment, and 2) if yes, how could > we do this? Not everyone seems to be convinced of the first question, > replying that in many cases linspace() could well replace arange(). > In many cases, yes, but not all. For some cases arange() has its > legitimate use, even for floating point, and in these cases you may > get bitten by the inexact number representation. If Matlab seems to > be able to avoid surprises, why not numpy? Here's the thing: binary floating point is intrinsically surprising to people who are only accustomed to decimal. The way to not be surprised is to not use binary floating point. You can hide some of the surprises, but not all of them. When you do try to hide them, all you are doing is creating complicated, ad hoc behavior that is also difficult to predict; for those who have become accustomed to binary floating point's behavior, it's not clear what the "unastonishing" behavior is supposed to be, but binary floating point's is well-defined. Binary floating point is a useful tool for many things. I'm not interested in making numpy something that hides that tool's behavior in order to force it into a use it is not appropriate for. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Fri Sep 14 18:09:26 2007 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 14 Sep 2007 17:09:26 -0500 Subject: [Numpy-discussion] arange and floating point arguments In-Reply-To: References: <1b5a37350709140237r4267e344wda309f315812d723@mail.gmail.com> <46EAAEAB.2090204@gmail.com> Message-ID: <46EB0696.70404@gmail.com> Charles R Harris wrote: > In the case of arange it should be possible to determine when the result > is potentially ambiguous and issue a warning. For instance, if the > argument of the ceil function is close to its rounded value. What's "close"? The appropriate tolerance depends on the operations that would cause error. For literal inputs, where the only source of error is representation error, 1 eps would suffice, but then so would linspace(). 
For results of other computations, you might need more than 1 eps. But if you're doing computations, then it oughtn't to matter whether you get the endpoint or not (since you don't know what the values are anyway). -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Fri Sep 14 18:29:22 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 14 Sep 2007 16:29:22 -0600 Subject: [Numpy-discussion] arange and floating point arguments In-Reply-To: <46EB0696.70404@gmail.com> References: <1b5a37350709140237r4267e344wda309f315812d723@mail.gmail.com> <46EAAEAB.2090204@gmail.com> <46EB0696.70404@gmail.com> Message-ID: On 9/14/07, Robert Kern wrote: > > Charles R Harris wrote: > > > In the case of arange it should be possible to determine when the result > > is potentially ambiguous and issue a warning. For instance, if the > > argument of the ceil function is close to its rounded value. > > What's "close"? The appropriate tolerance depends on the operations that > would > cause error. For literal inputs, where the only source of error is > representation error, 1 eps would suffice, but then so would linspace(). > For > results of other computations, you might need more than 1 eps. But if > you're > doing computations, then it oughtn't to matter whether you get the > endpoint or > not (since you don't know what the values are anyway). I would make 'close' very rough, maybe a relative 100*eps. The point would be to warn of *potential* problems and suggest linspace or some other approach, not to warn on only real problems. My guess is that most uses of arange are either well defined or such that a less ambiguous approach should be used. In a way, the warning would be a guess at programmer intent and a gentler solution than making arange integer only. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Sep 14 18:38:48 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 14 Sep 2007 16:38:48 -0600 Subject: [Numpy-discussion] Buildbot errors Message-ID: I got another buildbot notification and as far as I can tell it has nothing to do with my last commit. The stdio output is at http://buildbot.scipy.org/MacOSX%20x86/builds/49/step-shell/0 And the errors seem to be of this sort: _configtest.c: In function 'main': _configtest.c:4: error: 'isnan' undeclared (first use in this function) _configtest.c:4: error: (Each undeclared identifier is reported only once _configtest.c:4: error: for each function it appears in.) _configtest.c: In function 'main': _configtest.c:4: error: 'isnan' undeclared (first use in this function) _configtest.c:4: error: (Each undeclared identifier is reported only once _configtest.c:4: error: for each function it appears in.) _configtest.c: In function 'main': _configtest.c:4: error: 'isinf' undeclared (first use in this function) _configtest.c:4: error: (Each undeclared identifier is reported only once _configtest.c:4: error: for each function it appears in.) _configtest.c: In function 'main': _configtest.c:4: error: 'isinf' undeclared (first use in this function) _configtest.c:4: error: (Each undeclared identifier is reported only once _configtest.c:4: error: for each function it appears in.) I haven't had any problems compiling on my own machine. 
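Not something raised in the thread itself, but a generic first check when a build passes locally and fails on a bot is to have the failing step report which interpreter and search path it actually uses, for example:

import sys

print(sys.version)      # which Python the test step really runs
print(sys.executable)
print(sys.path)         # where it will look for the freshly installed numpy

Any mismatch between these and the paths used by the install step is a likely explanation for a failure of this kind.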
Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Fri Sep 14 19:06:42 2007 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 14 Sep 2007 18:06:42 -0500 Subject: [Numpy-discussion] Buildbot errors In-Reply-To: References: Message-ID: <46EB1402.10403@gmail.com> Charles R Harris wrote: > I got another buildbot notification and as far as I can tell it has > nothing to do with my last commit. The stdio output is at > > http://buildbot.scipy.org/MacOSX%20x86/builds/49/step-shell/0 > That's not the output of the thing that failed. The build succeeded, but the following step, trying to run the unit tests, failed. http://buildbot.scipy.org/MacOSX%20x86/builds/49/step-shell_2/0 It appears to me that there is some confusion about which Python is being executed. The tests seem to expect Python 2.5: sys.path=["numpy-install/lib/python2.5/site-packages"] whereas the install is picking up Python 2.3: byte-compiling ../numpy-install/lib/python2.3/site-packages/numpy/dual.py to dual.pyc Barry, can you check this and make sure that the correct Python gets picked up during the build? Thanks. > And the errors seem to be of this sort: > > _configtest.c: In function 'main': > _configtest.c:4: error: 'isnan' undeclared (first use in this function) > _configtest.c:4: error: (Each undeclared identifier is reported only once > > _configtest.c:4: error: for each function it appears in.) > _configtest.c: In function 'main': > _configtest.c:4: error: 'isnan' undeclared (first use in this function) > _configtest.c:4: error: (Each undeclared identifier is reported only once > > _configtest.c:4: error: for each function it appears in.) > > > _configtest.c: In function 'main': > _configtest.c:4: error: 'isinf' undeclared (first use in this function) > > _configtest.c:4: error: (Each undeclared identifier is reported only once > _configtest.c:4: error: for each function it appears in.) > _configtest.c: In function 'main': > _configtest.c:4: error: 'isinf' undeclared (first use in this function) > > _configtest.c:4: error: (Each undeclared identifier is reported only once > _configtest.c:4: error: for each function it appears in.) > > I haven't had any problems compiling on my own machine. Those aren't errors that would stop the build; they just tell the config command that isnan() and isinf() aren't available on the platform. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Fri Sep 14 20:21:48 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 14 Sep 2007 18:21:48 -0600 Subject: [Numpy-discussion] arange and floating point arguments In-Reply-To: <80c99e790709140246r184a605ar14023d16a8d5a00a@mail.gmail.com> References: <1b5a37350709140237r4267e344wda309f315812d723@mail.gmail.com> <80c99e790709140246r184a605ar14023d16a8d5a00a@mail.gmail.com> Message-ID: On 9/14/07, lorenzo bolla wrote: > > this is really annoying. > Matlab handles the "ceil" weirdness quite well, though. 
> > -------------------------------------------------------------- > > >> ceil(0.6/0.1) > > ans = > > 6 > > >> ceil((0.4+0.2)/0.1) > > ans = > > 7 > > >> 0:0.1:0.6 > > ans = > > 0 1.000000000000000e-001 > 2.000000000000000e-001 3.000000000000000e-001 4.000000000000000e-001 > 5.000000000000000e-001 6.000000000000000e-001 > > >> 0:0.1:(0.4+0.2) > > ans = > > 0 1.000000000000000e-001 > 2.000000000000000e-001 3.000000000000000e-001 4.000000000000001e-001 > 5.000000000000001e-001 6.000000000000001e-001 > Well, in Matlab the end point is specified and the result of the division is probably rounded, so in order to have problems you might need to use something like .55 as the endpoint. In Numpy's arange an upper bound is used instead, so roundoff is a problem, but the 5.5 case would be handled easily. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From barrywark at gmail.com Fri Sep 14 21:53:23 2007 From: barrywark at gmail.com (Barry Wark) Date: Fri, 14 Sep 2007 18:53:23 -0700 Subject: [Numpy-discussion] Buildbot errors In-Reply-To: <46EB1402.10403@gmail.com> References: <46EB1402.10403@gmail.com> Message-ID: Sorry about the hassle. It was working fine before a reboot. I'll try to fix it this evening. barry On 9/14/07, Robert Kern wrote: > Charles R Harris wrote: > > I got another buildbot notification and as far as I can tell it has > > nothing to do with my last commit. The stdio output is at > > > > http://buildbot.scipy.org/MacOSX%20x86/builds/49/step-shell/0 > > > > That's not the output of the thing that failed. The build succeeded, but the > following step, trying to run the unit tests, failed. > > http://buildbot.scipy.org/MacOSX%20x86/builds/49/step-shell_2/0 > > It appears to me that there is some confusion about which Python is being > executed. The tests seem to expect Python 2.5: > > sys.path=["numpy-install/lib/python2.5/site-packages"] > > whereas the install is picking up Python 2.3: > > byte-compiling ../numpy-install/lib/python2.3/site-packages/numpy/dual.py to > dual.pyc > > Barry, can you check this and make sure that the correct Python gets picked up > during the build? Thanks. > > > And the errors seem to be of this sort: > > > > _configtest.c: In function 'main': > > _configtest.c:4: error: 'isnan' undeclared (first use in this function) > > _configtest.c:4: error: (Each undeclared identifier is reported only once > > > > _configtest.c:4: error: for each function it appears in.) > > _configtest.c: In function 'main': > > _configtest.c:4: error: 'isnan' undeclared (first use in this function) > > _configtest.c:4: error: (Each undeclared identifier is reported only once > > > > _configtest.c:4: error: for each function it appears in.) > > > > > > _configtest.c: In function 'main': > > _configtest.c:4: error: 'isinf' undeclared (first use in this function) > > > > _configtest.c:4: error: (Each undeclared identifier is reported only once > > _configtest.c:4: error: for each function it appears in.) > > _configtest.c: In function 'main': > > _configtest.c:4: error: 'isinf' undeclared (first use in this function) > > > > _configtest.c:4: error: (Each undeclared identifier is reported only once > > _configtest.c:4: error: for each function it appears in.) > > > > I haven't had any problems compiling on my own machine. > > Those aren't errors that would stop the build; they just tell the config command > that isnan() and isinf() aren't available on the platform. 
> > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless enigma > that is made terrible by our own mad attempt to interpret it as though it had > an underlying truth." > -- Umberto Eco > From Chris.Barker at noaa.gov Sat Sep 15 01:12:34 2007 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Fri, 14 Sep 2007 22:12:34 -0700 Subject: [Numpy-discussion] arange and floating point arguments In-Reply-To: <46EB025F.4070205@gmail.com> References: <1b5a37350709140237r4267e344wda309f315812d723@mail.gmail.com> <46EAAEAB.2090204@gmail.com> <46EB025F.4070205@gmail.com> Message-ID: <46EB69C2.90508@noaa.gov> Robert Kern wrote: > Here's the thing: binary floating point is intrinsically surprising to people > who are only accustomed to decimal. Very good point. Binary arithmetic is NOT less accurate that decimal arithmetic, it just has different values that it can't represent exactly. So one is surprised that 1.0/3.0 isn't represented exactly! The confusion stems from th fact that we use decimal literals, even when using binary arithmetic, but you just need to learn to get used to it. For what it's worth, the MATLAB mailing list has a constant trickle of notes from new users along the lines of "MATLAB is broken!" when they have encountered binary-decimal issues like these. It is inescapable. Binary representation was one of the first things I learned in my first computer class , using Basic, over 25 years ago (am I really that old!). You really need to learn at least a tiny bit about binary if you're going to do math with computers. Oh, and could someone post an actual example of a use for which FP arange is required (with fudges to try to accommodate decimal to binary conversion errors), and linspace won't do? -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception From peridot.faceted at gmail.com Sat Sep 15 01:34:22 2007 From: peridot.faceted at gmail.com (Anne Archibald) Date: Sat, 15 Sep 2007 01:34:22 -0400 Subject: [Numpy-discussion] arange and floating point arguments In-Reply-To: <46EB69C2.90508@noaa.gov> References: <1b5a37350709140237r4267e344wda309f315812d723@mail.gmail.com> <46EAAEAB.2090204@gmail.com> <46EB025F.4070205@gmail.com> <46EB69C2.90508@noaa.gov> Message-ID: On 15/09/2007, Christopher Barker wrote: > Oh, and could someone post an actual example of a use for which FP > arange is required (with fudges to try to accommodate decimal to binary > conversion errors), and linspace won't do? Well, here's one: evaluating a function we know to be bandlimited to N harmonics and positive trying to bracket a maximum. We know it doesn't change much faster than T/N, so I might use xs = arange(0,T,1/float(4*N)) and then evaluate the function there. Of course, I don't care how many points there are, so no fudges please. But floating-point arange is certainly useful here; to use linspace or integer arange I'd have to write it in a much clumsier way. (Okay, a little clumsier.) In fact, reluctant as I am to provide arguments in favour of godawful floating-point fudges, if I have the harmonics I can use irfft to evaluate my function. I'll then have to carefully calculate the x-values where irfft evaluates, and an off-by-one problem is going to cause my program to fail. I would use integer arange and scale as appropriate, but there's something to be said for using floating-point arange. linspace(...,endpoint=False) is fine, though. 
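To make the comparison concrete, here is a small sketch of that use case; it is not from the original message, and T and N are made-up values, the point being only the two ways of building the same grid:

import numpy as np

T, N = 1.0, 32                     # hypothetical period and number of harmonics
step = T / (4 * N)

xs_arange = np.arange(0.0, T, step)                         # count can be off by one in unlucky cases
xs_linspace = np.linspace(0.0, T, 4 * N, endpoint=False)    # always exactly 4*N points, same spacing

assert len(xs_linspace) == 4 * N

For the irfft case, the guaranteed point count (and hence guaranteed alignment with the inverse transform's output) is what makes the linspace spelling the safer one.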
Anne From Joris.DeRidder at ster.kuleuven.be Sat Sep 15 06:52:40 2007 From: Joris.DeRidder at ster.kuleuven.be (Joris De Ridder) Date: Sat, 15 Sep 2007 12:52:40 +0200 Subject: [Numpy-discussion] arange and floating point arguments In-Reply-To: <46EB025F.4070205@gmail.com> References: <1b5a37350709140237r4267e344wda309f315812d723@mail.gmail.com> <46EAAEAB.2090204@gmail.com> <46EB025F.4070205@gmail.com> Message-ID: <8D1A2678-3327-445C-BD88-7F6EDDF6A480@ster.kuleuven.be> On 14 Sep 2007, at 23:51, Robert Kern wrote: > You can hide some of the surprises, but not all of them. I guess it's impossible to make a bullet-proof "fix". When arange() gets a 'stop' value of 0.60000000000000009, it cannot possibly know whether this stop value is supposed to be 0.6, or whether this value is the result of a genuine computation that has nothing to do with inexact number representation. In the latter case, I would definitely want arange() to be working as it does now. It seems, though, if linspace() is the better equivalent of arange(), its default endpoint=True option seems a little bit inconvenient (but by no means a problem), as you would always have to reset it to emulate arange() behaviour. Joris Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm From b3i4old02 at sneakemail.com Sat Sep 15 11:20:49 2007 From: b3i4old02 at sneakemail.com (Michael Hoffman) Date: Sat, 15 Sep 2007 16:20:49 +0100 Subject: [Numpy-discussion] SOLVED Re: Building numpy 1.0.3-2 on Linux 2.6.8 i686 (Debian 3.1) In-Reply-To: References: <46863ADC.6010409@ar.media.kyoto-u.ac.jp> <46879518.8020809@ar.media.kyoto-u.ac.jp> Message-ID: Michael Hoffman wrote: I have finally solved this problem. > """ > $ CFLAGS= LDFLAGS= > LD_LIBRARY_PATH=/nfs/acari/mh5/arch/Linux-x86_64/opt/python-2.5.1/lib > /nfs/acari/mh5/arch/Linux-x86_64/opt/python-2.5.1/bin/python setup.py -v > build "LDFLAGS= python setup.py" is not sufficient. This sets LDFLAGS to an empty string. It is instead necessary to run "unset LDFLAGS" on a previous command line. > [...] > > g77 build/temp.linux-x86_64-2.5/numpy/core/blasdot/_dotblas.o -L/usr/lib > -L/nfs/acari/mh5/arch/Linux-x86_64/opt/python-2.5.1/lib -lblas > -lpython2.5 -lg2c-pic -o build/lib.linux-x86_64-2.5/numpy/core/_dotblas.so > /usr/lib/libfrtbegin.a(frtbegin.o)(.text+0x1e): In function `main': > : undefined reference to `MAIN__' > collect2: ld returned 1 exit status > /usr/lib/libfrtbegin.a(frtbegin.o)(.text+0x1e): In function `main': > : undefined reference to `MAIN__' > collect2: ld returned 1 exit status > error: Command "g77 > build/temp.linux-x86_64-2.5/numpy/core/blasdot/_dotblas.o -L/usr/lib > -L/nfs/acari/mh5/arch/Linux-x86_64/opt/python-2.5.1/lib -lblas > -lpython2.5 -lg2c-pic -o > build/lib.linux-x86_64-2.5/numpy/core/_dotblas.so" failed with exit status 1 > """ From cournape at gmail.com Sat Sep 15 14:41:41 2007 From: cournape at gmail.com (David Cournapeau) Date: Sun, 16 Sep 2007 03:41:41 +0900 Subject: [Numpy-discussion] Bitting the bullet: using scons to build extensions inside distutils ? Message-ID: <5b8d13220709151141n7fb78554p1c3c809d5cfa3022@mail.gmail.com> Hi, Starting thinking over the whole distutils thing, I was thinking what people would think about using scons inside distutils to build extension. The more I think about it, the more I think than distutils not being maintained, and numpy/scipy building needs being much more complicated (at least different) than usual python extension, trying to circumvent distutils problems is an ever ending fight. 
Scons, being developped as a Make replacement, can do all we would like to be able to do with distutils, including: - building shared or static libraries, with dependencies (knows how to do it on many platforms). - can build each object file independently (e.g different compiler options) - is much much friendlier than distutils. - can handle external tools like swig, etc... - have basic facility to look for libraries (ala autoconf. By basic, I mean it is far from being as complete as autoconf, but is much better than distutils). Scons has also the following advantages: - written in python, can be distributed with numpy (by this, I mean AFAIK, license-wise, it is ok, and its size is not big): does not add additional dependency. - can be called within distutils quite easily. That is, I don't see big disadvantage to use it with distutils. It would give use some wanted features out of the box (building extensions based on ctypes, much friendlier way to customize building option). There are some things I am not sure about : - how to build python extension with it: this is of course mandatory - what is required for a "bi-directional" communication with distutils: for this to work, distutils needs to be aware of what scons builds (for things like bdist to work, for example). There is no question this will require some work. But anyway, my feeling is there is a need to improve the distutils thing, and I feel like this may be an easier path than patching over distutils defficiencies. I know scons quite a bit, and am willing to develop at least a prototype to see the feasibility of the whole thing. But before starting, I would like to know whether other find the idea attractive, dumb, is a waste of time, etc... cheers, David From charlesr.harris at gmail.com Sat Sep 15 15:02:53 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 15 Sep 2007 13:02:53 -0600 Subject: [Numpy-discussion] Bitting the bullet: using scons to build extensions inside distutils ? In-Reply-To: <5b8d13220709151141n7fb78554p1c3c809d5cfa3022@mail.gmail.com> References: <5b8d13220709151141n7fb78554p1c3c809d5cfa3022@mail.gmail.com> Message-ID: Hi David, On 9/15/07, David Cournapeau wrote: > > Hi, > > Starting thinking over the whole distutils thing, I was thinking > what people would think about using scons inside distutils to build > extension. The more I think about it, the more I think than distutils > not being maintained, and numpy/scipy building needs being much more > complicated (at least different) than usual python extension, trying > to circumvent distutils problems is an ever ending fight. Scons, being > developped as a Make replacement, can do all we would like to be able > to do with distutils, including: > - building shared or static libraries, with dependencies (knows how > to do it on many platforms). > - can build each object file independently (e.g different compiler > options) > - is much much friendlier than distutils. > - can handle external tools like swig, etc... > - have basic facility to look for libraries (ala autoconf. By basic, > I mean it is far from being as complete as autoconf, but is much > better than distutils). > > Scons has also the following advantages: > - written in python, can be distributed with numpy (by this, I mean > AFAIK, license-wise, it is ok, and its size is not big): does not add > additional dependency. > - can be called within distutils quite easily. > > That is, I don't see big disadvantage to use it with distutils. 
It > would give use some wanted features out of the box (building > extensions based on ctypes, much friendlier way to customize building > option). I think there was a thread on this subject before, although I may be thinking of another project. I would certainly welcome anything that made it easier to understand the setup and configuration of numpy, but I am not one of the build guys. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From pearu at cens.ioc.ee Sat Sep 15 15:59:25 2007 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Sat, 15 Sep 2007 22:59:25 +0300 (EEST) Subject: [Numpy-discussion] Bitting the bullet: using scons to build extensions inside distutils ? In-Reply-To: <5b8d13220709151141n7fb78554p1c3c809d5cfa3022@mail.gmail.com> References: <5b8d13220709151141n7fb78554p1c3c809d5cfa3022@mail.gmail.com> Message-ID: <64518.85.166.14.172.1189886365.squirrel@cens.ioc.ee> On Sat, September 15, 2007 9:41 pm, David Cournapeau wrote: > Starting thinking over the whole distutils thing, I was thinking > what people would think about using scons inside distutils to build > extension. Are you thinking of Python distutils or numpy.distutils? In the case of Python distutils, then I think this is a wrong list to discuss this subject. > The more I think about it, the more I think than distutils > not being maintained, numpy.distutils is being maintained. > and numpy/scipy building needs being much more > complicated (at least different) than usual python extension, trying > to circumvent distutils problems is an ever ending fight. why would be scons be different when new architectures and platforms that need to be supported are developed now and in future? > Scons, being > developped as a Make replacement, can do all we would like to be able > to do with distutils, including: > - building shared or static libraries, with dependencies (knows how > to do it on many platforms). > - can build each object file independently (e.g different compiler > options) IMHO, this is the only important feature that distutils is missing and would be difficult to add to numpy.distutils because of distutils design. > - is much much friendlier than distutils. > - can handle external tools like swig, etc... numpy.distutils can handle swig, psyco,.., also, it can generate sources on-fly. It's not so difficult to add support for some external tool to numpy.distutils. > - have basic facility to look for libraries (ala autoconf. By basic, > I mean it is far from being as complete as autoconf, but is much > better than distutils). Can it achive the same as numpy/distutils/system_info.py? > Scons has also the following advantages: > - written in python, can be distributed with numpy (by this, I mean > AFAIK, license-wise, it is ok, and its size is not big): does not add > additional dependency. > - can be called within distutils quite easily. Scons does not have Fortran compiler support, at least not as advanced as numpy.distutils.fcompiler provides, afaik. I think this is the main reason why numpy.distutils cannot be replaced with scons without having much of effort adding this feature to scons. > That is, I don't see big disadvantage to use it with distutils. It > would give use some wanted features out of the box (building > extensions based on ctypes, much friendlier way to customize building > option). I agree that distutils is a difficult code. 
But over the years lots of fixes and useful features have been implemented in numpy.distutils that has made using numpy.distutils easier and less errorprone than Python distutils. I don't even remember the last serious issue with numpy.distutils what comes to building scipy or numpy packages. Adding new features is another story of course. > There are some things I am not sure about : > - how to build python extension with it: this is of course mandatory > - what is required for a "bi-directional" communication with > distutils: for this to work, distutils needs to be aware of what scons > builds (for things like bdist to work, for example). It's fine to add new commands to numpy.distutils. > There is no question this will require some work. But anyway, my > feeling is there is a need to improve the distutils thing, and I feel > like this may be an easier path than patching over distutils > defficiencies. I know scons quite a bit, and am willing to develop at > least a prototype to see the feasibility of the whole thing. > > But before starting, I would like to know whether other find the idea > attractive, dumb, is a waste of time, etc... I think the idea of replacing distutils with some other tool which would be easier to use and to extend has brought up every one or two years. I think it hasn't happened because it would require not just some work but lots of it. Whatever people say about distutils, it is still a good tool that has lots of know-how in it for doing its job in a large variety of platforms we have today. But if you think that you can make scons to support the numpy.distutils features that are important for building numpy and scipy, then go for it. Pearu From cournape at gmail.com Sat Sep 15 16:59:49 2007 From: cournape at gmail.com (David Cournapeau) Date: Sun, 16 Sep 2007 05:59:49 +0900 Subject: [Numpy-discussion] Bitting the bullet: using scons to build extensions inside distutils ? In-Reply-To: <64518.85.166.14.172.1189886365.squirrel@cens.ioc.ee> References: <5b8d13220709151141n7fb78554p1c3c809d5cfa3022@mail.gmail.com> <64518.85.166.14.172.1189886365.squirrel@cens.ioc.ee> Message-ID: <5b8d13220709151359j4f76fcb4x302e7b13f459f9f6@mail.gmail.com> On 9/16/07, Pearu Peterson wrote: > On Sat, September 15, 2007 9:41 pm, David Cournapeau wrote: > > > Starting thinking over the whole distutils thing, I was thinking > > what people would think about using scons inside distutils to build > > extension. > > Are you thinking of Python distutils or numpy.distutils? > In the case of Python distutils, then I think this is a > wrong list to discuss this subject. > First, let's get one thing straight: this is not about replacing distutils, but about using scons for what it is good at (building code), instead of extending distutils for what it is was not conceived for in the first place (building compiled code in a flexible way). I certainly do not intend to replace distutils, but to use scons a a build tool for extensions (that is, scons would not be aware of python code, but would be used to compile all the extensions; for pure python code, I don't think there are major issues related to distutils ?). > > The more I think about it, the more I think than distutils > > not being maintained, > > numpy.distutils is being maintained. I was speaking about python distutils, of course. > > and numpy/scipy building needs being much more > > complicated (at least different) than usual python extension, trying > > to circumvent distutils problems is an ever ending fight. 
> > why would be scons be different when new architectures and platforms > that need to be supported are developed now and in future? scons already knows about many platforms/compiler combinations: except fortran, I think scons supports more than numpy.distutils now. And it is much less fragile: you can extend it much more easily, which is a key point IMHO. > > > Scons, being > > developped as a Make replacement, can do all we would like to be able > > to do with distutils, including: > > - building shared or static libraries, with dependencies (knows how > > to do it on many platforms). > > - can build each object file independently (e.g different compiler > > options) > > IMHO, this is the only important feature that distutils is missing > and would be difficult to add to numpy.distutils because of distutils > design. Whereas it would be trivial with scons :) > > - have basic facility to look for libraries (ala autoconf. By basic, > > I mean it is far from being as complete as autoconf, but is much > > better than distutils). > > Can it achive the same as numpy/distutils/system_info.py? I think so. Finding functions, headers, types, is available, and you have tools like TryBuild, TryLink and so on which make customizing our own really easy. That is, instead of digging into undocumented python distutils api, you have basic, abstracted tools, on which you can build more high level things. > > > Scons has also the following advantages: > > - written in python, can be distributed with numpy (by this, I mean > > AFAIK, license-wise, it is ok, and its size is not big): does not add > > additional dependency. > > - can be called within distutils quite easily. > > Scons does not have Fortran compiler support, at least not as advanced > as numpy.distutils.fcompiler provides, afaik. I think this is the main > reason why numpy.distutils cannot be replaced with scons without > having much of effort adding this feature to scons. I don't know much about fortran, so what are the specific needs for numpy ? scons has a fortran builder, and can be customized. When I take a look at the fcompiler module in numpy.distutils, I get the feeling that most of the hard work would be unnecessary for scons, no ? > > I think the idea of replacing distutils with some other tool which > would be easier to use and to extend has brought up every one or two years. > I think it hasn't happened because it would require not just some > work but lots of it. Whatever people say about distutils, it is still > a good tool that has lots of know-how in it for doing its job in a large > variety of platforms we have today. I agree, and I do not believe much is throwing code away. Here, I am merely suggesting to extend distutils using scons for extension. Actually, some people got the idea of what I have in mind before me: http://openalea.gforge.inria.fr/dokuwiki/doku.php?id=packages:compilation_installation:deploy:deploy (I didn't look too much at the code, because this is GPL, and we cannot just copy it without infringing copyright; I just checked it was doing what I thought it was doing). To sum up: keep distutils for what it is not too bad at, and use scons instead of extending distutils for what it was not conceived for and where scons shines. 
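For readers who have not used scons, a minimal sketch of what such an SConstruct could look like is shown below. The file and library names are made up, and it assumes only scons's stock Configure and LoadableModule machinery, not anything that exists in numpy.distutils today:

# SConstruct -- Environment() and Configure() are provided by scons itself
import distutils.sysconfig

env = Environment()
env.Append(CPPPATH=[distutils.sysconfig.get_python_inc()])

# autoconf-style checks of the CheckLib/TryLink variety mentioned above;
# CheckLib adds the library to the build when it is found
conf = Configure(env)
if not conf.CheckLib('blas'):
    print('building without an optimized BLAS')
env = conf.Finish()

# build example.c into a Python-loadable module (no 'lib' prefix)
env.LoadableModule('_example', ['example.c'], LDMODULEPREFIX='')

Whether a file like this should be driven from a distutils command, or the other way around, is the "bi-directional" communication question raised earlier in the thread.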
David From bronto at pobox.com Sun Sep 16 03:50:20 2007 From: bronto at pobox.com (Anton Sherwood) Date: Sun, 16 Sep 2007 00:50:20 -0700 Subject: [Numpy-discussion] arange and floating point arguments In-Reply-To: <46EB69C2.90508@noaa.gov> References: <1b5a37350709140237r4267e344wda309f315812d723@mail.gmail.com> <46EAAEAB.2090204@gmail.com> <46EB025F.4070205@gmail.com> <46EB69C2.90508@noaa.gov> Message-ID: <46ECE03B.2060308@pobox.com> Christopher Barker wrote: > Very good point. Binary arithmetic is NOT less accurate that decimal > arithmetic, it just has different values that it can't represent > exactly. . . . Quibble: any number that can be represented exactly in binary can also be represented in decimal, but not vice versa, so binary can indeed be less accurate for some numbers. -- Anton Sherwood, http://www.ogre.nu/ "How'd ya like to climb this high *without* no mountain?" --Porky Pine From david at ar.media.kyoto-u.ac.jp Sun Sep 16 06:05:10 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sun, 16 Sep 2007 19:05:10 +0900 Subject: [Numpy-discussion] [distutils] Best way to add compiler specific library path ? Message-ID: <46ECFFD6.6070901@ar.media.kyoto-u.ac.jp> Hi, While trying to add support for sunperf to numpy.distutils, I came across a simple problem I am not sure how to solve best. I installed the sun compilers for Linux, they are somewhere in my $HOME directory ($HOME/opt/sunstudio/bin). The problem is, when using the fortran compiler, some libraries are automatically linked (the ones in fcompiler/sun.py), which causes the build to fail by default because the libraries are in installation path of sun compilers ($HOME/opt/sunstudio/lib). Wouldn't it make sense to add the suncompiler/lib as a library path automatically for the compiler suncompiler/bin ? If yes, how should I do it ? cheers, David From jorgen.stenarson at bostream.nu Sun Sep 16 07:03:50 2007 From: jorgen.stenarson at bostream.nu (=?ISO-8859-1?Q?J=F6rgen_Stenarson?=) Date: Sun, 16 Sep 2007 13:03:50 +0200 Subject: [Numpy-discussion] Compilation failure on python2.4 win32 In-Reply-To: <46ECFFD6.6070901@ar.media.kyoto-u.ac.jp> References: <46ECFFD6.6070901@ar.media.kyoto-u.ac.jp> Message-ID: <46ED0D96.2080509@bostream.nu> Hi, I cannot compile numpy (rev 2042) for python2.4 on win32, it works on python2.5. It looks like the call to function get_build_architecture in distutils.misc_util.py is python2.5 specific. /J?rgen C:\python\external\numpy>python setup.py config --compiler=mingw32 build --compiler=mingw32 install Running from numpy source directory. Traceback (most recent call last): File "setup.py", line 90, in ? setup_package() File "setup.py", line 62, in setup_package from numpy.distutils.core import setup File "c:\python24\lib\site-packages\PIL\__init__.py", line 5, in ? # package placeholder File "C:\python\external\numpy\numpy\distutils\ccompiler.py", line 451, in ? msvc_on_amd64() File "C:\python\numeric-related\numpy\numpy\distutils\misc_util.py", line 324, in msvc_on_amd64 ImportError: cannot import name get_build_architecture From david at ar.media.kyoto-u.ac.jp Sun Sep 16 07:39:15 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sun, 16 Sep 2007 20:39:15 +0900 Subject: [Numpy-discussion] Initial support for sunperf and sun compilers (linux + solaris) Message-ID: <46ED15E3.2050507@ar.media.kyoto-u.ac.jp> Ok, I created a numpy branch to implement this, and get something working. This is still really rough, though. 
Please check out the numpy.sunperf branch: svn co http://svn.scipy.org/svn/numpy/branches/numpy.sunperf And then, compile it with the following: SUNPERF=SUNPERFROOT python setup.py build --compiler=sun --fcompiler=sun where SUNPERFROOT is the root of your sunperf installation (for me: /opt/sun/sunstudio12/; SUNPERFROOT/lib/ shoudl contain libsunperf.so). If this works, I will then polish the code to avoid having to set SUNPERF variable. On Linux, this seems to work (all test pass), but there may be some things on solaris which prevent it from working. I have not tested scipy either, so that something you could try also if numpy works. cheers, David From matthieu.brucher at gmail.com Sun Sep 16 08:21:45 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Sun, 16 Sep 2007 14:21:45 +0200 Subject: [Numpy-discussion] Bitting the bullet: using scons to build extensions inside distutils ? In-Reply-To: <5b8d13220709151141n7fb78554p1c3c809d5cfa3022@mail.gmail.com> References: <5b8d13220709151141n7fb78554p1c3c809d5cfa3022@mail.gmail.com> Message-ID: > > There are some things I am not sure about : > - how to build python extension with it: this is of course mandatory > We use Scons at the labs for the next version of the tool we use, and it is very simple to buil extensions, at least SWIG ones, for Python 2.5 on Windows, there is the need of adding one more line, but it is very straightforward. Matthieu -------------- next part -------------- An HTML attachment was scrubbed... URL: From eike.welk at gmx.net Sun Sep 16 11:19:00 2007 From: eike.welk at gmx.net (Eike Welk) Date: Sun, 16 Sep 2007 17:19:00 +0200 Subject: [Numpy-discussion] arange and floating point arguments In-Reply-To: <200709142149.49976.eike.welk@gmx.net> References: <1b5a37350709140237r4267e344wda309f315812d723@mail.gmail.com> <200709142149.49976.eike.welk@gmx.net> Message-ID: <200709161719.00912.eike.welk@gmx.net> Ok, I hit the send button too fast. Sorry! Eike. From matthew.brett at gmail.com Sun Sep 16 14:17:54 2007 From: matthew.brett at gmail.com (Matthew Brett) Date: Sun, 16 Sep 2007 11:17:54 -0700 Subject: [Numpy-discussion] Bitting the bullet: using scons to build extensions inside distutils ? In-Reply-To: <5b8d13220709151141n7fb78554p1c3c809d5cfa3022@mail.gmail.com> References: <5b8d13220709151141n7fb78554p1c3c809d5cfa3022@mail.gmail.com> Message-ID: <1e2af89e0709161117j41321264h8cfd69d405de8258@mail.gmail.com> Hi, > Starting thinking over the whole distutils thing, I was thinking > what people would think about using scons inside distutils to build > extension. In general this seems like an excellent idea. If we can contribute what we need to scons, that would greatly ease the burden of maintenance, and benefit both projects. The key problem will be support. At the moment Pearu maintains and owns numpy.distutils. Will we have the same level of commitment and support for this alternative do you think? How easy would it be to throw up a prototype for the rest of us to look at and get a feel for what the benefits would be? Matthew From charlesr.harris at gmail.com Sun Sep 16 15:11:43 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 16 Sep 2007 13:11:43 -0600 Subject: [Numpy-discussion] Is this a bug? 
Message-ID: I note a small inconsistency in the use of the out keyword in some functions: >>> a=array(0) >>> sometrue([1],out=a).shape () >>> a=array([0]) >>> sometrue([1],out=a).shape (1,) >>> a=array([[0]]) >>> sometrue([1],out=a).shape (1, 1) >>> a=array([[0,0]]) >>> sometrue(eye(2),axis=1,out=a).shape (1, 2) It seems to me that all but the first case should raise an error, as the shape of the output array is not the same as the expected output. I know this looks picky, but 0d arrays can't be indexed, whereas 1d and 2d arrays can, so they aren't quite compatible. Besides, the current behavior is difficult to describe for the documentation. If the current behavior is the rule, what is that rule? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun Sep 16 15:36:19 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 16 Sep 2007 13:36:19 -0600 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? In-Reply-To: <46E5710D.4010208@noaa.gov> References: <46DDF734.7070708@noaa.gov> <4CED078D-B8C7-4917-B0DE-4ED60BDFE015@sandia.gov> <46DEE52F.8050000@noaa.gov> <609FEB05-86EB-4FE2-867E-4A1C84105951@sandia.gov> <46DEF965.1000308@noaa.gov> <46DF04E9.2020501@enthought.com> <46E2F926.1000400@obs.univ-lyon1.fr> <46E5710D.4010208@noaa.gov> Message-ID: On 9/10/07, Christopher Barker wrote: > > Thanks for you input Xavier. > > Xavier Gnata wrote: > > std:valarray are quite strange containers because they are not well > > integrated in the STL. > > > I always use vector when I have to deal with arrays. > > > ps : There are no real performance differences between vector and > > valarray (in my use cases...) > > Probably not for our use cases either. However, we're not doing a lot of > STL either, so I'm not sure there is any downside to valarray. It looks > like neither one supports any kind of "view" semantics, so for the > purposes of numpy array wrapping, they really aren't any different. I think that the originator of valarray saying it was misguided might be considered a downside. Now on to your other issue: HOPKINS: > As I see it, *valarray* was designed by BS [Bjarne Stroustrup] BUDGE: No, Bjarne should not be blamed for *valarray*. If any one person should be blamed for *valarray*, it's me. That's why I tend to sign my postings to this reflector as Kent "Frankenstein" Budge. -- Oops, I guess I've tipped my hand already ... HOPKINS: exactly for this purpose; to > allow aggressive optimisation by compilers in various ways e.g. because > it is guaranteed to be alias-free. In theory, this should make it > potentially faster than ANSI C until the restrict keyword is > implemented, should it not? BUDGE: Yes, that was the idea. I wanted *valarray* to provide a mechanism for expressing any one-loop operation as a single expression which could be highly optimized. I also had a vague notion that nested-loop expressions could in turn be expressed as single expressions on nested template classes, but the experience just wasn't there to see all the implications -- you should know that *valarray* was originally *not* a class template, but a pair of classes based on int and double for which there *was* some experience. This is because implementations of templates were not widely available at the time *valarray* was first proposed. HOPKINS: > > However, there does seem to be a body of (informed?) 
opinion that > *valarray* is 'broken' or at least not working well enough to be worth the > effort. BUDGE: Yes, that is probably a fair assessment, and probably the assessment of the vast majority (though not a unanimous view.) It has become fairly clear that the aliasing guarantees provided by *valarray* simply aren't strong enough to be useful, and that the market incentives for taking advantage of them aren't strong enough even if they were. *valarray* was written at a time when vector supercomputers were still the sexy leading edge of computing. Unfortunately, the best optimization strategy for a vector supercomputer is almost the opposite of the best optimization strategy for modern hierarchical-storage machines. On a vector supercomputer, you wanted to run the largest possible data set past each instruction, so that the vector pipeline remained full. On a hierarchical-memory machine, you want to throw the largest possible number of instructions at a particular working set of data, so that you keep your data in cache (or paged into memory or on processor, depending on which level of the memory hierarchy you are concerned with.) *valarray* might conceivably have been helpful for optimization on vector machines, because it assumes operations are best treated atomically. It's hopeless on modern machines. In any case, it's not at all clear that *valarray* is the right philosophy. *valarray* was meant to replace loops with expressions, but *STL* has shown that loops can be beautiful. HOPKINS: > Does anyone here have experience of the runtime speed of a > carefully constructed (i.e. avoiding the C++ efficiency pitfalls) > *valarray*-based set of BLAS, LAPACK or whatever? BUDGE: I'm not aware of anyone attempting to implement BLAS or LAPACK using *valarray*. HOPKINS: If so, are you using > slices, iterators etc and doing it fully *STL*-style or doing access with > more traditional Fortran-style (i,j) operators. > > Compiler writers; are you taking full advantage of *valarray*? Does it > offer what it suggests? BUDGE: Arch Robison can answer this better than I, but the short answer to both questions is No. HOPKINS: > Numerical methods people; have you compared it (on a level playing > field) with C, Fortran, C++ template-based libraries (Blitz, MTL, > POOMA)? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.hochberg at ieee.org Sun Sep 16 16:02:23 2007 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Sun, 16 Sep 2007 13:02:23 -0700 Subject: [Numpy-discussion] Is this a bug? In-Reply-To: References: Message-ID: On 9/16/07, Charles R Harris wrote: > > I note a small inconsistency in the use of the out keyword in some > functions: > > >>> a=array(0) > >>> sometrue([1],out=a).shape > () > >>> a=array([0]) > >>> sometrue([1],out=a).shape > (1,) > >>> a=array([[0]]) > >>> sometrue([1],out=a).shape > (1, 1) > >>> a=array([[0,0]]) > >>> sometrue(eye(2),axis=1,out=a).shape > (1, 2) > > It seems to me that all but the first case should raise an error, as the > shape of the output array is not the same as the expected output. I know > this looks picky, but 0d arrays can't be indexed, whereas 1d and 2d arrays > can, so they aren't quite compatible. Besides, the current behavior is > difficult to describe for the documentation. If the current behavior is the > rule, what is that rule? I'm not sure what the rule is. FWIW, *a* rule that fits the behavior described above and is pretty easy to explain is: FUNC(..., out=a) <=> a[...] = FUNC(...) 
To work with the above example: >>> a = array(0) >>> a[...] = sometrue([1]) >>> a array(1) >>> a = array([0]) >>> a[...] = sometrue([1]) >>> a array([1]) >>> a = array([[0]]) >>> a[...] = sometrue([1]) >>> a array([[1]]) >>> a = array([[0,0]]) >>> a[...] = sometrue([1]) >>> a array([[1, 1]]) I'm not sure if other functions are consistent with this rule, however. Nor have I thought it through enough to convince myself that this is the best rule there is, although at first glance it seems fairly reasonable. Chuck > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chris.Barker at noaa.gov Sun Sep 16 19:26:44 2007 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Sun, 16 Sep 2007 16:26:44 -0700 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? In-Reply-To: References: <46DDF734.7070708@noaa.gov> <4CED078D-B8C7-4917-B0DE-4ED60BDFE015@sandia.gov> <46DEE52F.8050000@noaa.gov> <609FEB05-86EB-4FE2-867E-4A1C84105951@sandia.gov> <46DEF965.1000308@noaa.gov> <46DF04E9.2020501@enthought.com> <46E2F926.1000400@obs.univ-lyon1.fr> <46E5710D.4010208@noaa.gov> Message-ID: <46EDBBB4.7030202@noaa.gov> Charles R Harris wrote: > On 9/10/07, *Christopher Barker* STL either, so I'm not sure there is any downside to valarray. It looks > like neither one [vector or valarray] supports any kind of "view" semantics, so for the > purposes of numpy array wrapping, they really aren't any different. > > I think that the originator of valarray saying it was misguided might be > considered a downside. I had read that, though interestingly, I haven't seen any more recent commentary about the issues at all. In any case, it appears that what Budge is saying is that the original goal of valarray being well used for optimized numerical routines isn't going to happen (I don't think it has, though there is a PPC altivec version out there). However std::vector doesn't have any numerical optimizations either, so I don't see any reason to choose std::vector over std:valarray. My real question is what compiler and library writers are doing -- has anyone (OK, I guess MS and gcc are all I care about anyway) built anything optimized for them? Are they going to dump them? Who knows? -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception From david at ar.media.kyoto-u.ac.jp Mon Sep 17 00:50:45 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 17 Sep 2007 13:50:45 +0900 Subject: [Numpy-discussion] Bitting the bullet: using scons to build extensions inside distutils ? In-Reply-To: <1e2af89e0709161117j41321264h8cfd69d405de8258@mail.gmail.com> References: <5b8d13220709151141n7fb78554p1c3c809d5cfa3022@mail.gmail.com> <1e2af89e0709161117j41321264h8cfd69d405de8258@mail.gmail.com> Message-ID: <46EE07A5.60002@ar.media.kyoto-u.ac.jp> Matthew Brett wrote: > Hi, > >> Starting thinking over the whole distutils thing, I was thinking >> what people would think about using scons inside distutils to build >> extension. > > In general this seems like an excellent idea. 
If we can contribute > what we need to scons, that would greatly ease the burden of > maintenance, and benefit both projects. The key problem will be > support. At the moment Pearu maintains and owns numpy.distutils. > Will we have the same level of commitment and support for this > alternative do you think? I have started to ask some questions related to fortran to the scons ML. At least one guy reports using scons for complex fortran builds (with pre processing, modules, etc...). There are many tools already available for scons. Taking the sources, here is the function which defines the default tools for supported platforms: if str(platform) == 'win32': "prefer Microsoft tools on Windows" linkers = ['mslink', 'gnulink', 'ilink', 'linkloc', 'ilink32' ] c_compilers = ['msvc', 'mingw', 'gcc', 'intelc', 'icl', 'icc', 'cc', 'bcc32' ] cxx_compilers = ['msvc', 'intelc', 'icc', 'g++', 'c++', 'bcc32' ] assemblers = ['masm', 'nasm', 'gas', '386asm' ] fortran_compilers = ['g77', 'ifl', 'cvf', 'f95', 'f90', 'fortran'] ars = ['mslib', 'ar', 'tlib'] elif str(platform) == 'os2': "prefer IBM tools on OS/2" linkers = ['ilink', 'gnulink', 'mslink'] c_compilers = ['icc', 'gcc', 'msvc', 'cc'] cxx_compilers = ['icc', 'g++', 'msvc', 'c++'] assemblers = ['nasm', 'masm', 'gas'] fortran_compilers = ['ifl', 'g77'] ars = ['ar', 'mslib'] elif str(platform) == 'irix': "prefer MIPSPro on IRIX" linkers = ['sgilink', 'gnulink'] c_compilers = ['sgicc', 'gcc', 'cc'] cxx_compilers = ['sgic++', 'g++', 'c++'] assemblers = ['as', 'gas'] fortran_compilers = ['f95', 'f90', 'f77', 'g77', 'fortran'] ars = ['sgiar'] elif str(platform) == 'sunos': "prefer Forte tools on SunOS" linkers = ['sunlink', 'gnulink'] c_compilers = ['suncc', 'gcc', 'cc'] cxx_compilers = ['sunc++', 'g++', 'c++'] assemblers = ['as', 'gas'] fortran_compilers = ['f95', 'f90', 'f77', 'g77', 'fortran'] ars = ['sunar'] elif str(platform) == 'hpux': "prefer aCC tools on HP-UX" linkers = ['hplink', 'gnulink'] c_compilers = ['hpcc', 'gcc', 'cc'] cxx_compilers = ['hpc++', 'g++', 'c++'] assemblers = ['as', 'gas'] fortran_compilers = ['f95', 'f90', 'f77', 'g77', 'fortran'] ars = ['ar'] elif str(platform) == 'aix': "prefer AIX Visual Age tools on AIX" linkers = ['aixlink', 'gnulink'] c_compilers = ['aixcc', 'gcc', 'cc'] cxx_compilers = ['aixc++', 'g++', 'c++'] assemblers = ['as', 'gas'] fortran_compilers = ['f95', 'f90', 'aixf77', 'g77', 'fortran'] ars = ['ar'] elif str(platform) == 'darwin': "prefer GNU tools on Mac OS X, except for some linkers and IBM tools" linkers = ['applelink', 'gnulink'] c_compilers = ['gcc', 'cc'] cxx_compilers = ['g++', 'c++'] assemblers = ['as'] fortran_compilers = ['f95', 'f90', 'g77'] ars = ['ar'] else: "prefer GNU tools on all other platforms" linkers = ['gnulink', 'mslink', 'ilink'] c_compilers = ['gcc', 'msvc', 'intelc', 'icc', 'cc'] cxx_compilers = ['g++', 'msvc', 'intelc', 'icc', 'c++'] assemblers = ['gas', 'nasm', 'masm'] I don't see important platforms missing: all commercial Unices which matter are there with their default compiler, the "big 3" are there too (Mac Os X, Windows, Linux/*BSD). On all those platforms, scons knows how to build static and shared libraries, support rpath on the combinations platform/tools which support it, etc... And adding new tools is much easier than with distutils, I think. Support-wise, scons is used by many project, both open source and commercial. 
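As a concrete taste of what the extension-building side could look like, here
is a minimal SConstruct sketch (hypothetical: the module name and source file
are invented, and real code would need to pick the platform specific
extension suffix instead of hardcoding .so):

    # SConstruct -- toy example, run with "scons"
    import distutils.sysconfig

    env = Environment()   # Environment is provided by scons itself
    env.Append(CPPPATH=[distutils.sysconfig.get_python_inc()])

    # a python extension is just a shared library without the "lib" prefix
    env.SharedLibrary(target='spam', source=['spammodule.c'],
                      SHLIBPREFIX='', SHLIBSUFFIX='.so')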
Although not extremely knowledgeable about it, I have done non trivial things with it (including the equivalent of autoconf macro to look for BLAS/LAPACK on many platforms, cross compilation, convertion of some projects from autotools to scons), so I think I know where its default are (e.g. it is terrible for deployment, compared to autotools; as we would drive scons from distutils, this does not matter, though). > > How easy would it be to throw up a prototype for the rest of us to > look at and get a feel for what the benefits would be? I don't intend to do everything at once. I was thinking about first getting a new command scons for numpy.distutils: distutils would simply launch scons with all the necessary arguments (compilers and so on); this would make compiling ctypes extension possible at last, without touching much of the code. I have already asked the openalea people if I could borrow some code from them under acceptable license for us; if they accept, it should take only a few days before having something to show. cheers, David From david at ar.media.kyoto-u.ac.jp Mon Sep 17 01:04:22 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 17 Sep 2007 14:04:22 +0900 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? In-Reply-To: <46EDBBB4.7030202@noaa.gov> References: <46DDF734.7070708@noaa.gov> <4CED078D-B8C7-4917-B0DE-4ED60BDFE015@sandia.gov> <46DEE52F.8050000@noaa.gov> <609FEB05-86EB-4FE2-867E-4A1C84105951@sandia.gov> <46DEF965.1000308@noaa.gov> <46DF04E9.2020501@enthought.com> <46E2F926.1000400@obs.univ-lyon1.fr> <46E5710D.4010208@noaa.gov> <46EDBBB4.7030202@noaa.gov> Message-ID: <46EE0AD6.5000304@ar.media.kyoto-u.ac.jp> Christopher Barker wrote: > Charles R Harris wrote: >> On 9/10/07, *Christopher Barker* > STL either, so I'm not sure there is any downside to valarray. It looks >> like neither one [vector or valarray] supports any kind of "view" semantics, so for the >> purposes of numpy array wrapping, they really aren't any different. >> >> I think that the originator of valarray saying it was misguided might be >> considered a downside. > > I had read that, though interestingly, I haven't seen any more recent > commentary about the issues at all. > > In any case, it appears that what Budge is saying is that the original > goal of valarray being well used for optimized numerical routines isn't > going to happen (I don't think it has, though there is a PPC altivec > version out there). However std::vector doesn't have any numerical > optimizations either, so I don't see any reason to choose std::vector > over std:valarray. > > My real question is what compiler and library writers are doing -- has > anyone (OK, I guess MS and gcc are all I care about anyway) built > anything optimized for them? Are they going to dump them? Who knows? What do you mean by optimization ? I think this question is the key. I remember having used blitz at some point, and I thought it was terrible. It is really complicated, and to get good performances was really difficult. Maybe I used it wrongly, I don't know (this was a few years ago). But at some point, I decided to just use plain C arrays instead: the code was much faster, and actually much easier (I really hate template syntax). I personnally don't think all the template things worth it for optimizing temporaries (which was the goal of blitz): the complexity cost is enormous, for not much benefit. 
I think C++ is much more useful for the automatic memory management through RAII, which is what std::vector gives you. As long as you think about setting the right size to avoid resizes, all other considerations are not worthwhile IMHO. C speed is already quite good on modern CPU, and std::vector gives you that. If your compiler supports restrict, use it (http://www.cellperformance.com/mike_acton/2006/05/demystifying_the_restrict_keyw.html), this will give you "Fortran speed". The fact that, while C++ being a popular language, a standard class for matrix algebra does not exist yet shows me that this is not that useful, or too complicate to develop. cheers, David From david at ar.media.kyoto-u.ac.jp Mon Sep 17 03:21:02 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 17 Sep 2007 16:21:02 +0900 Subject: [Numpy-discussion] A first proposal for dataset organization Message-ID: <46EE2ADE.2050602@ar.media.kyoto-u.ac.jp> Hi there, A few months ago, we started to discuss about various issues about dataset for numpy/scipy. In the context of my Summer Of Code for machine learning tools in python, I had the possibility to tackle concretely the issue. Before announcing a first alpha version of my work, I would like to gather comments, critics about the following proposal for dataset organization. The following proposal is also available in svn: http://projects.scipy.org/scipy/scikits/browser/trunk/learn/scikits/learn/datasets/DATASET_PROPOSAL.txt Dataset for scipy: design proposal ================================== One of the thing numpy/scipy is missing now is a set of datasets, available for demo, courses, etc. For example, R has a set of dataset available at the core. The expected usage of the datasets are the following: - machine learning: eg the data contain also class information (discrete or continuous) - descriptive statistics - others ? That is, a dataset is not only data, but also some meta-data. The goal of this proposal is to propose common practices for organizing the data, in a way which is both straightforward, and does not prevent specific usage of the data. Organization ------------ A preliminary set of datasets is available at the following address: http://projects.scipy.org/scipy/scikits/browser/trunk/learn/scikits/learn/datasets Each dataset is a directory and defines a python package (e.g. has the __init__.py file). Each package is expected to define the function load, returning the corresponding data. For example, to access datasets data1, you should be able to do: >>> from datasets.data1 import load >>> d = load() # -> d contains the data. load can do whatever it wants: fetching data from a file (python script, csv file, etc...), from the internet, etc... Some special variables must be defined for each package, containing a python string: - COPYRIGHT: copyright informations - SOURCE: where the data are coming from - DESCHOSRT: short description - DESCLONG: long description - NOTE: some notes on the datasets. Format of the data ------------------ Here, I suggest a common practice for the returned value by the load function. Instead of using classes to provide meta-data, I propose to use a dictionnary of arrays, with some values mandatory. The key goals are: - for people who just want the data, there is no extra burden ("just give me the data !" MOTO). - for people who need more, they can easily extract what they need from the returned values. More high level abstractions can be built easily from this model. 
- all possible dataset should fit into this model. - In particular, I want to be able to be able to convert our dataset to Orange Dataset representation (or other machine learning tool), and vice-versa. For the datasets to be useful in the learn scikits, which is the project which initiated this datasets package, the data returned by load has to be a dict with the following conventions: - 'data': this value should be a record array containing the actual data. - 'label': this value should be a rank 1 array of integers, contains the label index for each sample, that is label[i] should be the label index of data[i]. If it contains float values, it is used for regression instead. - 'class': a record array such as class[i] is the class name. In other words, this makes the correspondance label name > label index. As an example, I use the famouse IRIS dataset: the dataset contains 3 classes of flowers, and for each flower, 4 measures (called attributes in machine learning vocabulary) are available (sepal width and length, petal width and length). In this case, the values returned by load would be: - 'data': a record array containing all the flowers' measurements. For descriptive statistics, that's all you may need. You can easily find the attributes from the dtype (a function to find the attributes is also available: it returns a list of the attributes). - 'labels': an array of integers (for class information) or float (for regression). each class is encoded as an integer, and labels[i] returns this integer for the sample i. - 'class': a record array, which returns the integer code for each class. For example, class['Iris-versicolor'] will return the integer used in label, and all samples i such as label[i] == class['Iris-versicolor'] are of the class 'Iris-versicolor'. This contains enough information to get all useful information through introspection and simple functions. I already implemented a small module to do basic things such as: - selecting only a subset of all samples. - selecting only a subset of the attributes (only sepal length and width, for example). - selecting only the samples of a given class. - small summary of the dataset. This is implemented in less than 100 lines, which tends to show that the above design is not too simplistic. Remaining problems: ------------------- I see mainly two big problems: - if the dataset is big and cannot fit into memory, what kind of API do we want to avoid loading all the data in memory ? Can we use memory mapped arrays ? - Missing data: I thought about subclassing both record arrays and masked arrays classes, but I don't know if this is feasable, or even makes sense. I have the feeling that some Data mining software use Nan (for example, weka seems to use float internally), but this prevents them from representing integer data. Current implementation ---------------------- An implementation following the above design is available in scikits.learn.datasets. If you installed scikits.learn, you can execute the file learn/utils/attrselect.py, which shows the information you can easily extract for now from this model. 
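To make the conventions above concrete, here is a toy load function following
the proposal. The attribute names and numbers are made up and this is not the
actual scikits.learn code; only the layout of the returned dict follows the
format described above:

    import numpy as N

    def load():
        # 'data': a record array, one field per attribute
        data = N.array([(5.1, 3.5), (4.9, 3.0), (6.3, 3.3), (5.8, 2.7)],
                       dtype=[('sepal_length', N.float64),
                              ('sepal_width', N.float64)])
        # 'label': label[i] is the class index of sample data[i]
        label = N.array([0, 0, 1, 1])
        # 'class': maps a class name to the integer code used in label
        klass = N.array((0, 1), dtype=[('setosa', N.int32),
                                       ('versicolor', N.int32)])
        return {'data': data, 'label': label, 'class': klass}

With this, d = load(); d['data']['sepal_width'] gives one attribute, and
d['data'][d['label'] == d['class']['setosa']] extracts the samples of one
class, which is the kind of introspection the small utility module mentioned
above relies on.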
Also, once the above problems are solved, an arff converter will be available: arff is the format used by WEKA, and many datasets are available at this format: http://weka.sourceforge.net/wekadoc/index.php/en:ARFF_%283.5.4%29 http://www.cs.waikato.ac.nz/ml/weka/index_datasets.html Note ---- Although the datasets package emerged from the learn package, I try to keep it independant from everything else, that is once we agree on the remaining problems and where the package should go, it can easily be put elsewhere without too much trouble. cheers, David From kurdt.bane at gmail.com Mon Sep 17 06:18:51 2007 From: kurdt.bane at gmail.com (Kurdt Bane) Date: Mon, 17 Sep 2007 12:18:51 +0200 Subject: [Numpy-discussion] =?windows-1256?q?=FD_Find_contiguous_sequences?= =?windows-1256?q?_of_booleans_in_arrays?= Message-ID: Hi to all, I've got an 1-D array of bools and I'd like to find the length of the first contiguous sequence of True values, starting from position [0] of the array. (That's equivalent to find the position of the first occurrence of False in the array). The problem is trivial, but I was wondering: what's the best (fastest, cleanest, most pythonesque) way to do it in numpy? And what if I want to get a list of all the contiguous sequences of True values, above a given threshold? Thanks in advance for your advices, Chris. -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Mon Sep 17 06:33:03 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 17 Sep 2007 19:33:03 +0900 Subject: [Numpy-discussion] =?utf-8?q?=E2=80=8E_Find_contiguous_sequences_?= =?utf-8?q?of_booleans_in_arrays?= In-Reply-To: References: Message-ID: <46EE57DF.3040208@ar.media.kyoto-u.ac.jp> Kurdt Bane wrote: > Hi to all, > > I've got an 1-D array of bools and I'd like to find the length of the > first contiguous sequence of True values, starting from position [0] > of the array. > (That's equivalent to find the position of the first occurrence of > False in the array). One possibility would be to find the first True and then use cumproduct to get the length, and then create a view with the remaining elements, recursively. But this may be slow. You may be a bit smarter to get the index of the first True value for each sub-sequence by multiplying the array by itself, shifted by one, and using xor. Whether this is pythonic and clean, I don't know :) cheers, David From stefan at sun.ac.za Mon Sep 17 07:15:07 2007 From: stefan at sun.ac.za (stefan) Date: Mon, 17 Sep 2007 12:15:07 +0100 Subject: [Numpy-discussion] Find contiguous sequences of booleans in arrays In-Reply-To: References: Message-ID: <12a39703fe8c7f6f301a55b51d58d0f1@zaphod.lagged.za.net> Hi Kurdt, On Mon, 17 Sep 2007 12:18:51 +0200, "Kurdt Bane" wrote: > I've got an 1-D array of bools and I'd like to find the length of the > first > contiguous sequence of True values, starting from position [0] of the > array. One way would be: x = N.array([True,True,False,True]) x.argmin() > And what if I want to > get > a list of all the contiguous sequences of True values, above a given > threshold? 
x = N.array([True,True,False,False,False,True,False,False]) jumps, = N.where(N.diff(x)) idx = N.concatenate(([0],jumps+1,[-1])) print [x[a:b] for (a,b) in zip(idx[:-1],idx[1:]) if x[a]] Cheers St?fan From arnar.flatberg at gmail.com Mon Sep 17 11:10:19 2007 From: arnar.flatberg at gmail.com (Arnar Flatberg) Date: Mon, 17 Sep 2007 17:10:19 +0200 Subject: [Numpy-discussion] Segfault on wrong use of indexing Message-ID: <5d3194020709170810h6b3fb98fj90835b1645e668a9@mail.gmail.com> Hi list A pretty common use (for me) is to create arrays at the beginning of my code and then fill in parts as I go along. Today, my code hit a segmentation fault. The error was that an index-vector ,that usually is a list with several members, now contained only one member and I tried to insert a 1-dim vector. Here is a short example of what went wrong: Python 2.5.1 (r251:54863, May 2 2007, 16:56:35) [GCC 4.1.2 (Ubuntu 4.1.2-0ubuntu4)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy as n >>> n.__version__ '1.0.1' >>> a = n.random.rand(4,2) >>> a array([[ 0.79561255, 0.4932522 ], [ 0.84008255, 0.88258233], [ 0.35143166, 0.82214487], [ 0.12490114, 0.15799074]]) Let us insert [2,2] into the first row. Correct use: >>> a[0,:] = [2,2] >>> a array([[ 2. , 2. ], [ 0.84008255, 0.88258233], [ 0.35143166, 0.82214487], [ 0.12490114, 0.15799074]]) or: a[[0],:] = [[2,2]] However, when I tried to to insert a 1-dim array with a 'two-dim' index things went wrong: >>> a[[0],:] = [2,2] Segmentation fault Sometimes, (with some lucky pointers, I guess) the latter code will run, with incorrect values in the array. With incorrect, I mean compared to the correct use of indices :-) My fix was to call "atleast_2d" on the parts I wanted to insert into the array. Am I just a horrible user of indices, or is this a bug? The situation were incorrect use of indices just passes silently is what I find a little disturbing. Thanks, Arnar From charlesr.harris at gmail.com Mon Sep 17 12:43:03 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 17 Sep 2007 10:43:03 -0600 Subject: [Numpy-discussion] Segfault on wrong use of indexing In-Reply-To: <5d3194020709170810h6b3fb98fj90835b1645e668a9@mail.gmail.com> References: <5d3194020709170810h6b3fb98fj90835b1645e668a9@mail.gmail.com> Message-ID: Hi Arnar, On 9/17/07, Arnar Flatberg wrote: > > Hi list > A pretty common use (for me) is to create arrays at the beginning of > my code and then fill in parts as I go along. Today, my code hit a > segmentation fault. The error was that an index-vector ,that usually > is a list with several members, now contained only one member and I > tried to insert a 1-dim vector. > > Here is a short example of what went wrong: > > Python 2.5.1 (r251:54863, May 2 2007, 16:56:35) > [GCC 4.1.2 (Ubuntu 4.1.2-0ubuntu4)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > >>> import numpy as n > >>> n.__version__ > '1.0.1' > >>> a = n.random.rand(4,2) > >>> a > array([[ 0.79561255, 0.4932522 ], > [ 0.84008255, 0.88258233], > [ 0.35143166, 0.82214487], > [ 0.12490114, 0.15799074]]) > > Let us insert [2,2] into the first row. > Correct use: > >>> a[0,:] = [2,2] > >>> a > array([[ 2. , 2. ], > [ 0.84008255, 0.88258233], > [ 0.35143166, 0.82214487], > [ 0.12490114, 0.15799074]]) > or: > a[[0],:] = [[2,2]] > > However, when I tried to to insert a 1-dim array with a 'two-dim' > index things went wrong: > >>> a[[0],:] = [2,2] > Segmentation fault I don't see that here. 
Can you try a more recent version of Numpy? 1.0.1 is pretty old. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From oliphant at enthought.com Mon Sep 17 12:44:45 2007 From: oliphant at enthought.com (Travis E. Oliphant) Date: Mon, 17 Sep 2007 11:44:45 -0500 Subject: [Numpy-discussion] Segfault on wrong use of indexing In-Reply-To: <5d3194020709170810h6b3fb98fj90835b1645e668a9@mail.gmail.com> References: <5d3194020709170810h6b3fb98fj90835b1645e668a9@mail.gmail.com> Message-ID: <46EEAEFD.8000501@enthought.com> Arnar Flatberg wrote: > However, when I tried to to insert a 1-dim array with a 'two-dim' > index things went wrong: > >>>> a[[0],:] = [2,2] >>>> > Segmentation fault > > I do not see this error in latest trunk of numpy (I suspect it's also not there in the latest release). > Am I just a horrible user of indices, or is this a bug? > It was a bug. A segfault with pure Python is *always* a bug (unless you are using ctypes). > The situation were incorrect use of indices just passes silently is > what I find a little disturbing. > Which situation is this exactly? Thanks for the report. Don't forget about the Trac pages where bugs (filed as "tickets") can be reported, checked, and fixed. See the www.scipy.org site under the Developer tab for more details. Best regards, -Travis From charlesr.harris at gmail.com Mon Sep 17 13:34:48 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 17 Sep 2007 11:34:48 -0600 Subject: [Numpy-discussion] Find contiguous sequences of booleans in arrays Message-ID: Hi Kurdt, On 9/17/07, Kurdt Bane wrote: > > Hi to all, > > I've got an 1-D array of bools and I'd like to find the length of the > first contiguous sequence of True values, starting from position [0] of the > array. > (That's equivalent to find the position of the first occurrence of False > in the array). > The problem is trivial, but I was wondering: what's the best (fastest, > cleanest, most pythonesque) way to do it in numpy? And what if I want to get > a list of all the contiguous sequences of True values, above a given > threshold? You can find the start of all runs after the first by In [1]: a = array([1,1,1,0,1,1,0,0,1], dtype=bool) In [2]: s = arange(1,len(a))[a[0:-1] ^ a[1:]] In [3]: s Out[3]: array([3, 4, 6, 8]) i.e. The first run is a[0:3], the second a[3:4], etc., and the runs alternate between true and false. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From arnar.flatberg at gmail.com Mon Sep 17 13:35:51 2007 From: arnar.flatberg at gmail.com (Arnar Flatberg) Date: Mon, 17 Sep 2007 19:35:51 +0200 Subject: [Numpy-discussion] Segfault on wrong use of indexing In-Reply-To: <46EEAEFD.8000501@enthought.com> References: <5d3194020709170810h6b3fb98fj90835b1645e668a9@mail.gmail.com> <46EEAEFD.8000501@enthought.com> Message-ID: <5d3194020709171035n2bee886er4a3f1269cae01a68@mail.gmail.com> Thanks for the fast reply. Everything works fine with a repos checkout ('1.0.4.dev4045'). I was using the default package of Ubuntu Feisty, and must admit that I like to stay close to the package manager, especially with python stuff. With Gutsy on the doorstep, perhaps this not a serious issue (I see Gutsy uses 1.0.3). I have tested this behavior on three machines running Ubunut Feisty with similar result. On 9/17/07, Travis E. 
Oliphant wrote: > Arnar Flatberg wrote: > > However, when I tried to to insert a 1-dim array with a 'two-dim' > > index things went wrong: > > > >>>> a[[0],:] = [2,2] > >>>> > > Segmentation fault > > > > > > I do not see this error in latest trunk of numpy (I suspect it's also > not there in the latest release). > > Am I just a horrible user of indices, or is this a bug? > > > It was a bug. A segfault with pure Python is *always* a bug (unless you > are using ctypes). > > The situation were incorrect use of indices just passes silently is > > what I find a little disturbing. > > > Which situation is this exactly? sometimes a segfault does not occur, that is: >>> import numpy as n >>> a = n.random.rand(4,2) >>> a[[0],:] = [2,2] >>> a array([[ 2. , 0. ], [ 0.23502786, 0.13993698], [ 0.9598841 , 0.74795152], [ 0.95212598, 0.52698868]]) , which is quite spooky. (This is with old numpy of course) > Thanks for the report. Don't forget about the Trac pages where bugs > (filed as "tickets") can be reported, checked, and fixed. See the > www.scipy.org site under the Developer tab for more details. > > Best regards, > > > -Travis > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > Again Thank you Arnar From Chris.Barker at noaa.gov Mon Sep 17 16:24:27 2007 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 17 Sep 2007 13:24:27 -0700 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? In-Reply-To: <46EE0AD6.5000304@ar.media.kyoto-u.ac.jp> References: <46DDF734.7070708@noaa.gov> <4CED078D-B8C7-4917-B0DE-4ED60BDFE015@sandia.gov> <46DEE52F.8050000@noaa.gov> <609FEB05-86EB-4FE2-867E-4A1C84105951@sandia.gov> <46DEF965.1000308@noaa.gov> <46DF04E9.2020501@enthought.com> <46E2F926.1000400@obs.univ-lyon1.fr> <46E5710D.4010208@noaa.gov> <46EDBBB4.7030202@noaa.gov> <46EE0AD6.5000304@ar.media.kyoto-u.ac.jp> Message-ID: <46EEE27B.7070500@noaa.gov> David Cournapeau wrote: > Christopher Barker wrote: >> My real question is what compiler and library writers are doing -- has >> anyone (OK, I guess MS and gcc are all I care about anyway) built >> anything optimized for them? Are they going to dump them? Who knows? > What do you mean by optimization ? Well, I'm quite specifically not being precise about that. It appears the POINT of valarray was to provide a way to do computation that compiler(library) writers could optimize in various ways for the system at hand. The one example I have seen is someone that wrote a version that takes advantage of the PPC altivec instructions: (http://www.pixelglow.com/stories/altivec-valarray-2/) Anyway, at this point I'm far less concerned about optimization that just a more robust and convenient way to deal with data that raw pointers. > I > remember having used blitz at some point, and I thought it was terrible. Darn -- it looks so promising. > I think C++ is much more useful > for the automatic memory management through RAII, which is what > std::vector gives you. and std::valarray not? I guess where I'm at now is deciding if there is any advantage or disadvantage to using std::valarray vs. std::vector. The other option is to go with something else: boost::multiarray, blitz++, etc. However, at least in term of how well they might p;lay with numpy arrays, I don't see a reason to do so. 
> If your compiler supports restrict, use it > (http://www.cellperformance.com/mike_acton/2006/05/demystifying_the_restrict_keyw.html), Thanks for that link -- I'll keep that in mind. And now I finally think I understand what is meant by "aliased" pointer - which explains why, quite deliberately, you can't create a valarray from an existing pointer to a data block. > The fact that, while C++ being a popular language, a standard class for > matrix algebra does not exist yet shows me that this is not that useful, > or too complicate to develop. Could be. Personally, I'm not looking for matrix algebra, rather a generic nd-array class - but the argument is the same. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From david at ar.media.kyoto-u.ac.jp Tue Sep 18 00:35:08 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 18 Sep 2007 13:35:08 +0900 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? In-Reply-To: <46EEE27B.7070500@noaa.gov> References: <46DDF734.7070708@noaa.gov> <4CED078D-B8C7-4917-B0DE-4ED60BDFE015@sandia.gov> <46DEE52F.8050000@noaa.gov> <609FEB05-86EB-4FE2-867E-4A1C84105951@sandia.gov> <46DEF965.1000308@noaa.gov> <46DF04E9.2020501@enthought.com> <46E2F926.1000400@obs.univ-lyon1.fr> <46E5710D.4010208@noaa.gov> <46EDBBB4.7030202@noaa.gov> <46EE0AD6.5000304@ar.media.kyoto-u.ac.jp> <46EEE27B.7070500@noaa.gov> Message-ID: <46EF557C.2020602@ar.media.kyoto-u.ac.jp> Christopher Barker wrote: > David Cournapeau wrote: >> Christopher Barker wrote: >>> My real question is what compiler and library writers are doing -- has >>> anyone (OK, I guess MS and gcc are all I care about anyway) built >>> anything optimized for them? Are they going to dump them? Who knows? >> What do you mean by optimization ? > > Well, I'm quite specifically not being precise about that. It appears > the POINT of valarray was to provide a way to do computation that > compiler(library) writers could optimize in various ways for the system > at hand. The one example I have seen is someone that wrote a version > that takes advantage of the PPC altivec instructions: > > (http://www.pixelglow.com/stories/altivec-valarray-2/) > > Anyway, at this point I'm far less concerned about optimization that > just a more robust and convenient way to deal with data that raw pointers. > >> I >> remember having used blitz at some point, and I thought it was terrible. > > Darn -- it looks so promising. I realize that I sounded more convinced than I really am. First, to make my perspective more obvious, let me say that I generally hate template. I think the syntax is terrible, and make the code totally unreadable for everything but simple cases (simple container, for example); I think it is a wrong solution for a broken language. So I prefer to avoid them if I can. My understanding of blitz is that it is supposed to be faster mainly because it can avoid temporaries thanks to expression template. So if you don't need this feature, you don't gain much. But when you think about it, avoiding temporaries is done by symbolic computation at the compiler level through template; the idea is to make expressions such as A = B * C + D * E^-1 * F where everything is a matrix the most efficient possible. C/C++ makes it hard because it needs to use binary operations with a returned value. 
So in the end, this is really a parsing problem; if so, why not use a language which can do symbolic computation, and convert them into a compiled language ? By using expression template, you use a totally broken syntax to do things which are much more easily done by a language easy to parse (say LISP). When you take a look at http://ubiety.uwaterloo.ca/~tveldhui/papers/DrDobbs2/drdobbs2.html, you also realize that the tests are done on architectures/compilers which are different from the ones available now. The only way to really know is to do your own tests: have a reasonable example of the kind of operations you intend to do, benchmark it, and see the differences. My experience says it definitely does not worth it for my problems. Maybe yours will be different. > >> I think C++ is much more useful >> for the automatic memory management through RAII, which is what >> std::vector gives you. > > and std::valarray not? I guess where I'm at now is deciding if there is > any advantage or disadvantage to using std::valarray vs. std::vector. > The other option is to go with something else: boost::multiarray, > blitz++, etc. However, at least in term of how well they might p;lay > with numpy arrays, I don't see a reason to do so. Valarray and vector give you more or less the same here concerning RAII. But vector really is more common, so I would rather pick up vector instead of valarray unless there is a good reason not to do so, not the contrary. I don't know much boost::multiarray (I tried a bit ublas, and found the performances quite bad compared to C, using gcc; again, this was a few years ago, it may have changed since). I almost never used more than rank 2 arrays, so I don't know much about multi_array. cheers, David From markbak at gmail.com Tue Sep 18 06:33:29 2007 From: markbak at gmail.com (mark) Date: Tue, 18 Sep 2007 10:33:29 -0000 Subject: [Numpy-discussion] confusion about min/max Message-ID: <1190111609.880922.173350@n39g2000hsh.googlegroups.com> Hello - When I am doing from numpy import * It does not import the min() function, but when I do from numpy import min it does import the min() function Does that make sense? I know, I should probably use a.min() rather than min(a), but why does min() not get imported on an import * ? Thanks, Mark From gael.varoquaux at normalesup.org Tue Sep 18 07:07:29 2007 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Tue, 18 Sep 2007 13:07:29 +0200 Subject: [Numpy-discussion] confusion about min/max In-Reply-To: <1190111609.880922.173350@n39g2000hsh.googlegroups.com> References: <1190111609.880922.173350@n39g2000hsh.googlegroups.com> Message-ID: <20070918110729.GH27741@clipper.ens.fr> On Tue, Sep 18, 2007 at 10:33:29AM -0000, mark wrote: > Does that make sense? I know, I should probably use a.min() rather > than min(a), but why does min() not get imported on an import * ? Because min isn't in numpy.__all__. Python imports only identifiers listed in __all__ if __all__ is present. Ga?l From stefan at sun.ac.za Tue Sep 18 07:30:18 2007 From: stefan at sun.ac.za (stefan) Date: Tue, 18 Sep 2007 12:30:18 +0100 Subject: [Numpy-discussion] confusion about min/max In-Reply-To: <20070918110729.GH27741@clipper.ens.fr> References: <20070918110729.GH27741@clipper.ens.fr> Message-ID: <5cd26c80df8a9a57c711c74830ff3119@zaphod.lagged.za.net> On Tue, 18 Sep 2007 13:07:29 +0200, Gael Varoquaux wrote: > On Tue, Sep 18, 2007 at 10:33:29AM -0000, mark wrote: >> Does that make sense? 
I know, I should probably use a.min() rather >> than min(a), but why does min() not get imported on an import * ? > > Because min isn't in numpy.__all__. Python imports only identifiers > listed in __all__ if __all__ is present. The rationale behind this is to prevent you from overwriting the built-in max function. Sooner or later that would cause trouble. Use import numpy as N N.max(...) or, as you said, import max explicitly. Cheers St?fan From alexandre.fayolle at logilab.fr Tue Sep 18 08:07:14 2007 From: alexandre.fayolle at logilab.fr (Alexandre Fayolle) Date: Tue, 18 Sep 2007 14:07:14 +0200 Subject: [Numpy-discussion] FPE on tensordot Message-ID: <20070918120714.GC14956@logilab.fr> Hi, A user of some code I've written is experiencing some strange behaviour with numpy.tensordot. I have unfortunately no access to his computer and cannot reproduce the crash on my machine. The short way of reproducing this is: import numpy a=numpy.array([0.5,0.5]) b=numpy.array([[0.,1.],[2.,3.]]) numpy.tensordot(a,b,axes=(0,0)) This works and returns array([ 1., 2.]) import numpy a=numpy.array([0.4,0.5]) b=numpy.array([[0.,1.],[2.,3.]]) numpy.tensordot(a,b,axes=(0,0)) This crashes with an FPE. He is running Linux (Debian sarge based) with python 2.3.5. Numpy 1.0.3.1 was compiled manually by the admins of the lab (and I don't know which options they used, probably the default). The machine is a 32bit Intel. I'm puzzled, and welcome any insight on this. Regards -- Alexandre Fayolle LOGILAB, Paris (France) -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 481 bytes Desc: Digital signature URL: From matthieu.brucher at gmail.com Tue Sep 18 08:21:36 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Tue, 18 Sep 2007 14:21:36 +0200 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? In-Reply-To: <46EF557C.2020602@ar.media.kyoto-u.ac.jp> References: <46DDF734.7070708@noaa.gov> <46DEF965.1000308@noaa.gov> <46DF04E9.2020501@enthought.com> <46E2F926.1000400@obs.univ-lyon1.fr> <46E5710D.4010208@noaa.gov> <46EDBBB4.7030202@noaa.gov> <46EE0AD6.5000304@ar.media.kyoto-u.ac.jp> <46EEE27B.7070500@noaa.gov> <46EF557C.2020602@ar.media.kyoto-u.ac.jp> Message-ID: > > My understanding of blitz is that it is supposed to be faster mainly > because it can avoid temporaries thanks to expression template. In fact, Boost.uBLAS uses expression templates as well. Matthieu -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Sep 18 10:24:42 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 18 Sep 2007 08:24:42 -0600 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? 
In-Reply-To: <46EF557C.2020602@ar.media.kyoto-u.ac.jp> References: <46DDF734.7070708@noaa.gov> <46DEF965.1000308@noaa.gov> <46DF04E9.2020501@enthought.com> <46E2F926.1000400@obs.univ-lyon1.fr> <46E5710D.4010208@noaa.gov> <46EDBBB4.7030202@noaa.gov> <46EE0AD6.5000304@ar.media.kyoto-u.ac.jp> <46EEE27B.7070500@noaa.gov> <46EF557C.2020602@ar.media.kyoto-u.ac.jp> Message-ID: On 9/17/07, David Cournapeau wrote: > > Christopher Barker wrote: > > David Cournapeau wrote: > >> Christopher Barker wrote: > >>> My real question is what compiler and library writers are doing -- has > >>> anyone (OK, I guess MS and gcc are all I care about anyway) built > >>> anything optimized for them? Are they going to dump them? Who knows? > >> What do you mean by optimization ? > > > > Well, I'm quite specifically not being precise about that. It appears > > the POINT of valarray was to provide a way to do computation that > > compiler(library) writers could optimize in various ways for the system > > at hand. The one example I have seen is someone that wrote a version > > that takes advantage of the PPC altivec instructions: > > > > (http://www.pixelglow.com/stories/altivec-valarray-2/) > > > > Anyway, at this point I'm far less concerned about optimization that > > just a more robust and convenient way to deal with data that raw > pointers. > > > >> I > >> remember having used blitz at some point, and I thought it was > terrible. > > > > Darn -- it looks so promising. > I realize that I sounded more convinced than I really am. First, to make > my perspective more obvious, let me say that I generally hate template. > I think the syntax is terrible, and make the code totally unreadable for > everything but simple cases (simple container, for example); I think it > is a wrong solution for a broken language. So I prefer to avoid them if > I can. > > My understanding of blitz is that it is supposed to be faster mainly > because it can avoid temporaries thanks to expression template. So if > you don't need this feature, you don't gain much. But when you think > about it, avoiding temporaries is done by symbolic computation at the > compiler level through template; the idea is to make expressions such as > A = B * C + D * E^-1 * F where everything is a matrix the most efficient > possible. C/C++ makes it hard because it needs to use binary operations > with a returned value. So in the end, this is really a parsing problem; > if so, why not use a language which can do symbolic computation, and > convert them into a compiled language ? By using expression template, > you use a totally broken syntax to do things which are much more easily > done by a language easy to parse (say LISP). Templates are a godsend for containers and such using multiple types. I think that much of Numpy could be naturally written up that way. Template programming, on the other hand, seems to me an attempt to use the template mechanism as a compiler. So you are probably right that a different language would handle that more easily. When you take a look at > http://ubiety.uwaterloo.ca/~tveldhui/papers/DrDobbs2/drdobbs2.html, you > also realize that the tests are done on architectures/compilers which > are different from the ones available now. > > The only way to really know is to do your own tests: have a reasonable > example of the kind of operations you intend to do, benchmark it, and > see the differences. My experience says it definitely does not worth it > for my problems. Maybe yours will be different. 
> > > > >> I think C++ is much more useful > >> for the automatic memory management through RAII, which is what > >> std::vector gives you. > > > > and std::valarray not? I guess where I'm at now is deciding if there is > > any advantage or disadvantage to using std::valarray vs. std::vector. > > The other option is to go with something else: boost::multiarray, > > blitz++, etc. However, at least in term of how well they might p;lay > > with numpy arrays, I don't see a reason to do so. > Valarray and vector give you more or less the same here concerning RAII. > But vector really is more common, so I would rather pick up vector That is my general feeling too, valarrays are the red-headed stepchildren of the stl. instead of valarray unless there is a good reason not to do so, not the > contrary. I don't know much boost::multiarray (I tried a bit ublas, and > found the performances quite bad compared to C, using gcc; again, this > was a few years ago, it may have changed since). I almost never used > more than rank 2 arrays, so I don't know much about multi_array. I found the performance of ublas to be pretty good for small arrays when it was compiled with the -NODEBUG option, the assembly code looked pretty good too. The default with all the bounds checking and such is terrible and the assembly code practically unreadable (180 character function identifiers, etc), but for debugging it did its job. The main virtue of ublas is compactness and readability of code expression, which is far better than writing out endless numbers of loops. The automatic handling of pointers for the default allocation type is also convenient and makes it reasonable to have functions return matrices and vectors. I still think FORTRAN might be a better choice than C++ for these sort of problems, it is just that C++ has become the default for (too) many things. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From efiring at hawaii.edu Tue Sep 18 13:21:28 2007 From: efiring at hawaii.edu (Eric Firing) Date: Tue, 18 Sep 2007 07:21:28 -1000 Subject: [Numpy-discussion] confusion about min/max In-Reply-To: <5cd26c80df8a9a57c711c74830ff3119@zaphod.lagged.za.net> References: <20070918110729.GH27741@clipper.ens.fr> <5cd26c80df8a9a57c711c74830ff3119@zaphod.lagged.za.net> Message-ID: <46F00918.7050805@hawaii.edu> stefan wrote: > On Tue, 18 Sep 2007 13:07:29 +0200, Gael Varoquaux > wrote: >> On Tue, Sep 18, 2007 at 10:33:29AM -0000, mark wrote: >>> Does that make sense? I know, I should probably use a.min() rather >>> than min(a), but why does min() not get imported on an import * ? >> Because min isn't in numpy.__all__. Python imports only identifiers >> listed in __all__ if __all__ is present. > > The rationale behind this is to prevent you from overwriting > the built-in max function. Sooner or later that would cause trouble. > > Use > > import numpy as N > N.max(...) > > or, as you said, import max explicitly. Or use numpy.amin and numpy.amax, which are included in numpy.__all__, just as you use numpy.arange in place of range. Eric From wfspotz at sandia.gov Wed Sep 19 01:41:37 2007 From: wfspotz at sandia.gov (Bill Spotz) Date: Tue, 18 Sep 2007 23:41:37 -0600 Subject: [Numpy-discussion] ANN: Trilinos 8.0, including PyTrilinos 4.0 Message-ID: Version 8.0 of Trilinos has been released: http://trilinos.sandia.gov Trilinos is a collection of scientific, object-oriented solver packages. 
These packages cover linear algebra services, preconditioners, linear solvers, nonlinear solvers, eigensolvers, and a wide range of related utilities. Trilinos supports serial and parallel architectures, as well as dense or sparse problem formulations. Included in Trilinos release 8.0 is PyTrilinos version 4.0, http://trilinos.sandia.gov/packages/pytrilinos a set of python interfaces to selected Trilinos packages. New in version 4.0 of PyTrilinos is an interface to Anasazi, the eigensolver package, and the re-enabling of NOX, the nonlinear solver package. The primary Trilinos linear algebra services package is Epetra, which provides Vector and MultiVector classes, as well as hierarchies of operator, communicator and domain decomposition classes. The PyTrilinos interface to Epetra has been designed with a high degree of compatibility with numpy, with the hope of complementing the SciPy development efforts. ** Bill Spotz ** ** Sandia National Laboratories Voice: (505)845-0170 ** ** P.O. Box 5800 Fax: (505)284-5451 ** ** Albuquerque, NM 87185-0370 Email: wfspotz at sandia.gov ** From markbak at gmail.com Wed Sep 19 01:52:54 2007 From: markbak at gmail.com (mark) Date: Wed, 19 Sep 2007 05:52:54 -0000 Subject: [Numpy-discussion] confusion about min/max In-Reply-To: <46F00918.7050805@hawaii.edu> References: <20070918110729.GH27741@clipper.ens.fr> <5cd26c80df8a9a57c711c74830ff3119@zaphod.lagged.za.net> <46F00918.7050805@hawaii.edu> Message-ID: <1190181174.757369.48430@y42g2000hsy.googlegroups.com> Thanks guys. I personally like the amin, amax idea best. But that is just me, Mark > > or, as you said, import max explicitly. > > Or use numpy.amin and numpy.amax, which are included in numpy.__all__, > just as you use numpy.arange in place of range. > > Eric > _______________________________________________ > Numpy-discussion mailing list > Numpy-discuss... at scipy.orghttp://projects.scipy.org/mailman/listinfo/numpy-discussion From zpincus at stanford.edu Wed Sep 19 20:37:43 2007 From: zpincus at stanford.edu (Zachary Pincus) Date: Wed, 19 Sep 2007 17:37:43 -0700 Subject: [Numpy-discussion] Find first occurrence? Message-ID: <7343410E-0F8C-4643-9BD6-D2FEA7A5A5E3@stanford.edu> Hello all, On several occasions, I've had the need to find only the first occurrence of a value in an unsorted numpy array. I usually use numpy.where(arr==val)[0] or similar, and don't worry about the fact that I'm iterating across the entire array. However, sometimes the arrays are pretty big and the find operation is in an inner loop, so I was wondering if there's already a C extension function somewhere in numpy or scipy that does a fast find first operation, or anything similar. (Easy enough to write my own, and maybe given the issues inherent in comparing float equality, etc., it doesn't belong in the core numpy anyway...) Thanks, Zach Pincus Program in Biomedical Informatics and Department of Biochemistry From oliphant at enthought.com Wed Sep 19 21:10:12 2007 From: oliphant at enthought.com (Travis E. Oliphant) Date: Wed, 19 Sep 2007 20:10:12 -0500 Subject: [Numpy-discussion] "Extended" Outer Product In-Reply-To: References: Message-ID: <46F1C874.5000900@enthought.com> Anne Archibald wrote: > On 21/08/07, Timothy Hochberg wrote: > > >> This is just a general comment on recent threads of this type and not >> directed specifically at Chuck or anyone else. >> >> IMO, the emphasis on avoiding FOR loops at all costs is misplaced. 
It is >> often more memory friendly and thus faster to vectorize only the inner loop >> and leave outer loops alone. Everything varies with the specific case of >> course, but trying to avoid FOR loops on principle is not a good strategy. >> > > Yes and no. From a performance point of view, you are certainly right; > vectorizing is definitely not always a speedup. But for me, the main > advantage of vectorized operations is generally clarity: C = A*B is > clearer and simpler than C = [a*b for (a,b) in zip(A,B)]. When it's > not clearer and simpler, I feel no compunction about falling back to > list comprehensions and for loops. > > That said, it would often be nice to have something like > map(f,arange(10)) for arrays; the best I've found is > vectorize(f)(arange(10)). > > vectorize, of course, is a good example of my point above: it really > just loops, in python IIRC, but conceptually it's extremely handy for > doing exactly what the OP wanted. Unfortunately vectorize() does not > yield a sufficiently ufunc-like object to support .outer(), as that > would be extremely tidy. > I'm not sure what you mean by sufficiently ufunc-like. In fact, vectorize is a ufunc (it's just an object-based one). Thus, it should produce what you want (as long as you use newaxis so that the broadcasting is done). If you just want it to support the .outer method that could be easily done (as under the covers is a real ufunc). I just over-looked adding these methods to the result of vectorize. The purpose of vectorize is to create a ufunc out of a scalar-based function, so I don't see any problem in giving them the methods of ufuncs as well (as long as the signature is right --- 2 inputs and 1 output). -Travis From oliphant at enthought.com Wed Sep 19 21:11:26 2007 From: oliphant at enthought.com (Travis E. Oliphant) Date: Wed, 19 Sep 2007 20:11:26 -0500 Subject: [Numpy-discussion] latest svn version fails on Solaris In-Reply-To: <46CC7186.7010109@stsci.edu> References: <46CC7186.7010109@stsci.edu> Message-ID: <46F1C8BE.7030800@enthought.com> Christopher Hanley wrote: > Hi, > > The latest version of numpy has a unit test failure on big endian machines. > Thanks Chris, Do you know which version last succeeded? -Travis From oliphant at enthought.com Wed Sep 19 22:26:29 2007 From: oliphant at enthought.com (Travis E. Oliphant) Date: Wed, 19 Sep 2007 21:26:29 -0500 Subject: [Numpy-discussion] Bitting the bullet: using scons to build extensions inside distutils ? In-Reply-To: <5b8d13220709151141n7fb78554p1c3c809d5cfa3022@mail.gmail.com> References: <5b8d13220709151141n7fb78554p1c3c809d5cfa3022@mail.gmail.com> Message-ID: <46F1DA55.6030701@enthought.com> David Cournapeau wrote: > Hi, > > Starting thinking over the whole distutils thing, I was thinking > what people would think about using scons inside distutils to build > extension. The more I think about it, the more I think than distutils > not being maintained, and numpy/scipy building needs being much more > complicated (at least different) than usual python extension, trying > to circumvent distutils problems is an ever ending fight. Scons, being > developped as a Make replacement, can do all we would like to be able > to do with distutils, including: > Pearu is correct that numpy.distutils has quite a few features that are not easily replaced and so there will be some skepticism initially in doing things differently. 
However, I don't think there will be much resistance to distributing scons along with numpy so that it may be used to supplement what is done in numpy.distutils as long as it is integrated into the numpy.distutils chain some-how. In short, I'm all for integrating scons with numpy if possible as long as we don't just toss what numpy.distutils has done. -Travis > - building shared or static libraries, with dependencies (knows how > to do it on many platforms). > - can build each object file independently (e.g different compiler options) > - is much much friendlier than distutils. > - can handle external tools like swig, etc... > - have basic facility to look for libraries (ala autoconf. By basic, > I mean it is far from being as complete as autoconf, but is much > better than distutils). > > Scons has also the following advantages: > - written in python, can be distributed with numpy (by this, I mean > AFAIK, license-wise, it is ok, and its size is not big): does not add > additional dependency. > - can be called within distutils quite easily. > > That is, I don't see big disadvantage to use it with distutils. It > would give use some wanted features out of the box (building > extensions based on ctypes, much friendlier way to customize building > option). > > There are some things I am not sure about : > - how to build python extension with it: this is of course mandatory > - what is required for a "bi-directional" communication with > distutils: for this to work, distutils needs to be aware of what scons > builds (for things like bdist to work, for example). > > There is no question this will require some work. But anyway, my > feeling is there is a need to improve the distutils thing, and I feel > like this may be an easier path than patching over distutils > defficiencies. I know scons quite a bit, and am willing to develop at > least a prototype to see the feasibility of the whole thing. > > But before starting, I would like to know whether other find the idea > attractive, dumb, is a waste of time, etc... > > cheers, > > David > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > From oliphant at enthought.com Wed Sep 19 22:47:59 2007 From: oliphant at enthought.com (Travis E. Oliphant) Date: Wed, 19 Sep 2007 21:47:59 -0500 Subject: [Numpy-discussion] Compilation failure on python2.4 win32 In-Reply-To: <46ED0D96.2080509@bostream.nu> References: <46ECFFD6.6070901@ar.media.kyoto-u.ac.jp> <46ED0D96.2080509@bostream.nu> Message-ID: <46F1DF5F.4010803@enthought.com> J?rgen Stenarson wrote: > Hi, > > I cannot compile numpy (rev 2042) for python2.4 on win32, it works on > python2.5. It looks like the call to function get_build_architecture in > distutils.misc_util.py is python2.5 specific. > Yes. This needs to be fixed. I'll do it. Can you try the current trunk? -Travis From aisaac at american.edu Wed Sep 19 23:00:15 2007 From: aisaac at american.edu (Alan G Isaac) Date: Wed, 19 Sep 2007 23:00:15 -0400 Subject: [Numpy-discussion] unwanted cpuinfo output Message-ID: Suppose I import cpuinfo on a Win32 platform:: Python 2.5.1 (r251:54863, Apr 18 2007, 08:51:08) [MSC v.1310 32 bit (Intel)] on win32 Type "help", "copyright", "credits" or "license" for more information. 
>>> import numpy >>> numpy.__version__ '1.0.3.1' >>> from numpy.distutils import cpuinfo 0 That unwanted '0' is due to a print statement on line 527:: print proc This is being printed is the **absence** of an exception, so I believe this is unwanted behavior. Cheers, Alan Isaac From peridot.faceted at gmail.com Thu Sep 20 02:16:11 2007 From: peridot.faceted at gmail.com (Anne Archibald) Date: Thu, 20 Sep 2007 02:16:11 -0400 Subject: [Numpy-discussion] "Extended" Outer Product In-Reply-To: <46F1C874.5000900@enthought.com> References: <46F1C874.5000900@enthought.com> Message-ID: On 19/09/2007, Travis E. Oliphant wrote: > Anne Archibald wrote: > > vectorize, of course, is a good example of my point above: it really > > just loops, in python IIRC, but conceptually it's extremely handy for > > doing exactly what the OP wanted. Unfortunately vectorize() does not > > yield a sufficiently ufunc-like object to support .outer(), as that > > would be extremely tidy. > > > I'm not sure what you mean by sufficiently ufunc-like. In fact, > vectorize is a ufunc (it's just an object-based one). Thus, it should > produce what you want (as long as you use newaxis so that the > broadcasting is done). If you just want it to support the .outer > method that could be easily done (as under the covers is a real ufunc). > > I just over-looked adding these methods to the result of vectorize. > The purpose of vectorize is to create a ufunc out of a scalar-based > function, so I don't see any problem in giving them the methods of > ufuncs as well (as long as the signature is right --- 2 inputs and 1 > output). Ah. You got it in one: I was missing the methods. It would be handy to have them back, not least because then I could just remember the rule "all binary ufuncs have .outer()". Do ternary ufuncs support outer()? It would presumably just generate a higher-rank array, for example U.outer(arange(10),arange(11),arange(12)) would produce an array of shape (10,11,12)... maybe there aren't any ternary ufuncs yet, apart from the ones that are generated by vectorize(). I suppose ix_ provides an alternative, so that you could have def outer(self,*args): return self(ix_(*args)) Still, I think for conceptual tidiness it would be nice if the ufuncs vectorize() makes supported the methods. Thanks, Anne From faltet at carabos.com Thu Sep 20 04:27:53 2007 From: faltet at carabos.com (Francesc Altet) Date: Thu, 20 Sep 2007 10:27:53 +0200 Subject: [Numpy-discussion] ANN: PyTables and PyTables Pro 2.0.1 available Message-ID: <200709201027.54280.faltet@carabos.com> ============================================ Announcing PyTables and PyTables Pro 2.0.1 ============================================ PyTables is a library for managing hierarchical datasets and designed to efficiently cope with extremely large amounts of data with support for full 64-bit file addressing. PyTables runs on top of the HDF5 library and NumPy package for achieving maximum throughput and convenient use. This is a maintenance release that mainly fixes (quite a few of) bugs, as well as some small enhancements (support for accessing table rows beyond 2**31 rows in 32-bit platforms and reduced memory footprint in table I/O). Also, binaries have been compiled against the latest stable version of HDF5, 1.6.6, released during the past August. Thanks to the broadening PyTables community for all the valuable feedback. Moreover, the Pro version has received an optimization in the node cache that allows for a 2x improvement in time retrieval of nodes in cache. 
With this, PyTables Pro can now be up to 20x faster than regular PyTables when handling a large number of nodes simultaneously. In case you want to know more in detail what has changed in this version, have a look at ``RELEASE_NOTES.txt``. Find the HTML version for this document at: http://www.pytables.org/moin/ReleaseNotes/Release_2.0.1 You can download a source package of the version 2.0.1 with generated PDF and HTML docs and binaries for Windows from http://www.pytables.org/download/stable/ For an on-line version of the manual, visit: http://www.pytables.org/docs/manual-2.0.1

Migration Notes for PyTables 1.x users
======================================
If you are a user of PyTables 1.x, it is probably worth your while to look at the ``MIGRATING_TO_2.x.txt`` file, where you will find directions on how to migrate your existing PyTables 1.x apps to the 2.x versions. You can find an HTML version of this document at http://www.pytables.org/moin/ReleaseNotes/Migrating_To_2.x

Resources
=========
Go to the PyTables web site for more details: http://www.pytables.org About the HDF5 library: http://hdfgroup.org/HDF5/ About NumPy: http://numpy.scipy.org/ To know more about the company behind the development of PyTables, see: http://www.carabos.com/

Acknowledgments
===============
Thanks to many users who provided feature improvements, patches, bug reports, support and suggestions. See the ``THANKS`` file in the distribution package for an (incomplete) list of contributors. Many thanks also to SourceForge who have helped to make and distribute this package! And last, but not least, thanks a lot to the HDF5 and NumPy (and numarray!) makers. Without them, PyTables simply would not exist.

Share your experience
=====================
Let us know of any bugs, suggestions, gripes, kudos, etc. you may have. -- >0,0< Francesc Altet http://www.carabos.com/ V V Cárabos Coop. V. Enjoy Data "-"

From cournape at gmail.com Thu Sep 20 04:56:22 2007 From: cournape at gmail.com (David Cournapeau) Date: Thu, 20 Sep 2007 17:56:22 +0900 Subject: [Numpy-discussion] Bitting the bullet: using scons to build extensions inside distutils ? In-Reply-To: <46F1DA55.6030701@enthought.com> References: <5b8d13220709151141n7fb78554p1c3c809d5cfa3022@mail.gmail.com> <46F1DA55.6030701@enthought.com> Message-ID: <5b8d13220709200156y4b417509u9bd92498b1579719@mail.gmail.com>

On 9/20/07, Travis E. Oliphant wrote: > David Cournapeau wrote: > > Hi, > > > > Starting thinking over the whole distutils thing, I was thinking > > what people would think about using scons inside distutils to build > > extension. The more I think about it, the more I think than distutils > > not being maintained, and numpy/scipy building needs being much more > > complicated (at least different) than usual python extension, trying > > to circumvent distutils problems is an ever ending fight. Scons, being > > developped as a Make replacement, can do all we would like to be able > > to do with distutils, including: > > > Pearu is correct that numpy.distutils has quite a few features that are > not easily replaced and so there will be some skepticism initially in > doing things differently. Now I understand the skepticism, my email was really not clear wrt my intention. I do not intend to replace numpy.distutils by writing everything from scratch: that would be stupid, it would require several weeks of full time programming and testing on various platforms, for no good reasons.
Keeping distutils is anyway more or less mandatory because of eggs, which (at least now), require setuptools, which requires distutils. Being incompatible with everybody would also be stupid. For those reasons, I never intended to replace everything, but gradually replacing the parts of distutils/numpy.distutils which are limitating today (no custom library to be used with ctypes, no per extension compilation options) by scons. If this is successfull and people are satisfied with it, we may then gradually start to build numpy extension themselves using scons instead of numpy.distutils where it makes sense. I started some work for a scons command inside numpy.distutils in numpy.scon branch. The idea is that if you want to use scons instead of distutils for one extension, you add the scons script in the setup.py, and distutils then calls scons to build the extension. I still have to figure how to get the various build directories, and how to pass them from distutils to scons, but once this is done, there will be a first prototype which gives an idea on how I expect to make things work. This alone will enable different compilation environments (under scons, an environment is an object which keeps all the build tools names and options, so that you can build different targets with different environments: for example pyrex C code will have no warnings, but by default, python extension will have full warning enabled under gcc, etc...) and ctypes support in a cross platform way (to build our own C libraries callable from ctypes). David From david at ar.media.kyoto-u.ac.jp Thu Sep 20 05:46:46 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 20 Sep 2007 18:46:46 +0900 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? In-Reply-To: References: <46DDF734.7070708@noaa.gov> <46DEF965.1000308@noaa.gov> <46DF04E9.2020501@enthought.com> <46E2F926.1000400@obs.univ-lyon1.fr> <46E5710D.4010208@noaa.gov> <46EDBBB4.7030202@noaa.gov> <46EE0AD6.5000304@ar.media.kyoto-u.ac.jp> <46EEE27B.7070500@noaa.gov> <46EF557C.2020602@ar.media.kyoto-u.ac.jp> Message-ID: <46F24186.4020404@ar.media.kyoto-u.ac.jp> Charles R Harris wrote: > > > Templates are a godsend for containers and such using multiple types. > I think that much of Numpy could be naturally written up that way. Using template makes wrapping the API by another language almost impossible. This alone is a pretty good argument against using C++ at all (this is actually one of the reason I stopped using it for most of my projects). The numpy approach is in my opinion much better: provides a good and complete C API. You can always put a C++ api on top of it, which is much easier than the contrary (putting a C api around a C++ API). > > I found the performance of ublas to be pretty good for small arrays > when it was compiled with the -NODEBUG option, the assembly code > looked pretty good too. The default with all the bounds checking and > such is terrible and the assembly code practically unreadable (180 > character function identifiers, etc), but for debugging it did its job. I am sure I tested with the -NODEBUG, because I remembered having looked for an array container which does bound checking using a compilation option. It is quit likely that the library was much improved since I looked the last time. 
cheers, David From a.schmolck at gmx.net Thu Sep 20 08:57:06 2007 From: a.schmolck at gmx.net (Alexander Schmolck) Date: Thu, 20 Sep 2007 13:57:06 +0100 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? In-Reply-To: (Charles R. Harris's message of "Tue\, 18 Sep 2007 08\:24\:42 -0600") References: <46DDF734.7070708@noaa.gov> <46DEF965.1000308@noaa.gov> <46DF04E9.2020501@enthought.com> <46E2F926.1000400@obs.univ-lyon1.fr> <46E5710D.4010208@noaa.gov> <46EDBBB4.7030202@noaa.gov> <46EE0AD6.5000304@ar.media.kyoto-u.ac.jp> <46EEE27B.7070500@noaa.gov> <46EF557C.2020602@ar.media.kyoto-u.ac.jp> Message-ID: "Charles R Harris" writes: > The automatic handling of pointers for the default allocation type is also > convenient and makes it reasonable to have functions return matrices and > vectors. Hmm, I wonder whether I missed something when I read the manual. I didn't see anything in the docs that suggests that ublas matrices do COW, reference semantics or anything else to make C++'s horrible pass-by-value semantics bearable performancewise, so I return and pass in shared_ptr's to matrices, which is syntactically ugly but avoids the need to write a (reference semantics) wrapper class for matrix. Am I missing some easier way to efficiently return and pass large matrices? 'as From david at ar.media.kyoto-u.ac.jp Thu Sep 20 09:01:13 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 20 Sep 2007 22:01:13 +0900 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? In-Reply-To: References: <46DDF734.7070708@noaa.gov> <46DEF965.1000308@noaa.gov> <46DF04E9.2020501@enthought.com> <46E2F926.1000400@obs.univ-lyon1.fr> <46E5710D.4010208@noaa.gov> <46EDBBB4.7030202@noaa.gov> <46EE0AD6.5000304@ar.media.kyoto-u.ac.jp> <46EEE27B.7070500@noaa.gov> <46EF557C.2020602@ar.media.kyoto-u.ac.jp> Message-ID: <46F26F19.3020600@ar.media.kyoto-u.ac.jp> Alexander Schmolck wrote: > "Charles R Harris" writes: > > >> The automatic handling of pointers for the default allocation type is also >> convenient and makes it reasonable to have functions return matrices and >> vectors. >> > > Hmm, I wonder whether I missed something when I read the manual. I didn't see > anything in the docs that suggests that ublas matrices do COW, reference > semantics or anything else to make C++'s horrible pass-by-value semantics > bearable performancewise, so I return and pass in shared_ptr's to matrices, > which is syntactically ugly but avoids the need to write a (reference > semantics) wrapper class for matrix. Am I missing some easier way to > efficiently return and pass large matrices? > If ublas is using expression template, shouldn't it alleviate somewhat this problem ? cheers, David From tmp-gmane at tschreiner.org Thu Sep 20 09:51:20 2007 From: tmp-gmane at tschreiner.org (Thomas Schreiner) Date: Thu, 20 Sep 2007 13:51:20 +0000 (UTC) Subject: [Numpy-discussion] Floating point exception with numpy and embedded python interpreter References: <81934aa60604191029h4d8a8d9bl550fa58cc67d3d5e@mail.gmail.com> <44467576.1020708@astraw.com> <4446819D.3030401@astraw.com> <35791.134.226.38.190.1145701016.squirrel@webmail.cs.tcd.ie> <444A8026.3030307@astraw.com> Message-ID: Arkaitz Bitorika gmail.com> writes: > I've verified that the function causes the exception when embedded in > the program but not when used from a simple C program with just a main > () function. 
The successful version iterates 31 times over the for > loop while the crashing one fails the 30th time that it does "pinf *= > mul". > > Now we know exactly where the crash is, but no idea how to fix it ;). Hi, I just found this old thread, and it looks like I've got the very same problem: Turns out that Borland C++ Builder (which I'm using, and you are most probably as well) can't get to infinity by multiplying a number by 1E10 over and over, but throws an exception instead when exceeding the number range: On 22 Apr 2006, at 20:12, Andrew Straw wrote: > static double > pinf_init(void) > { > double mul = 1e10; > double tmp = 0.0; > double pinf; > > pinf = mul; > for (;;) { > pinf *= mul; > if (pinf == tmp) break; > tmp = pinf; > } > return pinf; > } My proposal is to ask the numpy people to patch numpy as follows: Don't multiply, but instead create pinf according to IEEE 754 specifications: char inf_string[9] = "\x00\x00\x00\x00\x00\x00\xF0\x7F"; double pinf = ((double*)inf_string)[0]; This will get rid of the overflow for little endian machines. For big endian architectures, just reverse the byte order in inf_string. I already submitted a bug report to their bugtracker. Cheers, Thomas

From jorgen.stenarson at bostream.nu Thu Sep 20 10:32:32 2007 From: jorgen.stenarson at bostream.nu (=?ISO-8859-1?Q?J=F6rgen_Stenarson?=) Date: Thu, 20 Sep 2007 16:32:32 +0200 Subject: [Numpy-discussion] Compilation failure on python2.4 win32 In-Reply-To: <46F1DF5F.4010803@enthought.com> References: <46ECFFD6.6070901@ar.media.kyoto-u.ac.jp> <46ED0D96.2080509@bostream.nu> <46F1DF5F.4010803@enthought.com> Message-ID: <46F28480.6070901@bostream.nu>

Travis E. Oliphant skrev: > Jörgen Stenarson wrote: >> Hi, >> >> I cannot compile numpy (rev 2042) for python2.4 on win32, it works on >> python2.5. It looks like the call to function get_build_architecture in >> distutils.misc_util.py is python2.5 specific. >> > Yes. This needs to be fixed. I'll do it. > > Can you try the current trunk? > Will do tomorrow /Jörgen

From tmp-numpy at tschreiner.org Thu Sep 20 10:55:23 2007 From: tmp-numpy at tschreiner.org (Thomas Schreiner) Date: Thu, 20 Sep 2007 16:55:23 +0200 Subject: [Numpy-discussion] Crash on "import numpy" [including fix] Message-ID: <46F289DB.9090602@tschreiner.org>

Hi, I just posted a mail via gmane - I didn't expect it to hit the list, so here's the full context: Using NumPy in an embedded scenario sometimes leads to crashes (floating point exception) depending on the compiler used. If you compile your C++ application with some non-g++ compilers (e.g. Borland C++ Builder), it will crash as soon as you try to run "import numpy". The reason is the way in which numpy initializes the double value for infinity in umathmodule.c:

static double
pinf_init(void)
{
    double mul = 1e10;
    double tmp = 0.0;
    double pinf;

    pinf = mul;
    for (;;) {
        pinf *= mul;
        if (pinf == tmp) break;
        tmp = pinf;
    }
    return pinf;
}

Whereas g++ compilers will interpret this intentional overflow as 1.#INF, applications built with Borland C++ Builder will crash, throwing an overflow exception. My proposal is to replace this method by creating the correct double value for infinity according to the IEEE 754 specification:

static double
pinf_init(void)
{
    char inf_string[9] = "\x00\x00\x00\x00\x00\x00\xF0\x7F";
    double pinf = ((double*)inf_string)[0];
    return pinf;
}

or, for big endian machines: char inf_string[9] = "\x7F\xF0\x00\x00\x00\x00\x00\x00"; This will always lead to the correct value for infinity, no matter what compiler is being used.
I already opened a ticket for this issue on http://scipy.org/scipy/numpy/ticket/582 Can you imagine incorporating this fix into trunk? I'm having some trouble recompiling numpy myself, so I would really appreciate this fix to be applied upstream so that the windows installer includes it. Cheers, Thomas -- Thomas Schreiner, AGBS Max Planck Institute for Biological Cybernetics 72076 Tuebingen / Germany +49 7071 601 536 From tom.denniston at alum.dartmouth.org Thu Sep 20 16:23:25 2007 From: tom.denniston at alum.dartmouth.org (Tom Denniston) Date: Thu, 20 Sep 2007 15:23:25 -0500 Subject: [Numpy-discussion] interesect1d bug? Message-ID: Is this the expected behavior of the numpy.intersect1d funciton: In [8]: numpy.intersect1d([3766, 9583, 17220, 40048, 50909, 52241, 62494, 828525, 20548728, 14874, 320256, 12795, 2223137, 16554, 27901, 2031774, 13610, 1592688, 13585, 16205, 1181652, 37177, 828525, 52241, 113826, 285236, 19475, 933518, 3114301, 38132], [35971, 28972, 4126, 212327]) Out[8]: array([ 52241, 828525]) Note that neither 52241 nor 828525 are in the second arg. It doesn't even work if I sort them: In [9]: numpy.intersect1d(sorted([3766, 9583, 17220, 40048, 50909, 52241, 62494, 828525, 20548728, 14874, 320256, 12795, 2223137, 16554, 27901, 2031774, 13610, 1592688, 13585, 16205, 1181652, 37177, 828525, 52241, 113826, 285236, 19475, 933518, 3114301, 38132]), sorted([35971, 28972, 4126, 212327])) Out[9]: array([ 52241, 828525]) --Tom From aisaac at american.edu Thu Sep 20 16:55:10 2007 From: aisaac at american.edu (Alan G Isaac) Date: Thu, 20 Sep 2007 16:55:10 -0400 Subject: [Numpy-discussion] interesect1d bug? In-Reply-To: References: Message-ID: On Thu, 20 Sep 2007, Tom Denniston apparently wrote: > Is this the expected behavior of the numpy.intersect1d funciton: > In [8]: numpy.intersect1d([3766, 9583, 17220, 40048, 50909, 52241, > 62494, 828525, 20548728, 14874, 320256, 12795, 2223137, 16554, 27901, > 2031774, 13610, 1592688, 13585, 16205, 1181652, 37177, 828525, 52241, > 113826, 285236, 19475, 933518, 3114301, 38132], [35971, 28972, 4126, > 212327]) > Out[8]: array([ 52241, 828525]) There are duplicates in your first array. (See the docs.) hth, Alan Isaac From chanley at stsci.edu Thu Sep 20 18:56:46 2007 From: chanley at stsci.edu (Christopher Hanley) Date: Thu, 20 Sep 2007 18:56:46 -0400 Subject: [Numpy-discussion] [Fwd: Re: numpy revision 2680 causes segfault on Solaris] Message-ID: <46F2FAAE.2030406@stsci.edu> Hi Travis, The test failure was caused by a new test being added to the test suite to catch an existing problem. It was not a new code change that caused the problem. Chris -- Christopher Hanley Systems Software Engineer Space Telescope Science Institute 3700 San Martin Drive Baltimore MD, 21218 (410) 338-4338 -------------- next part -------------- An embedded message was scrubbed... From: Stefan van der Walt Subject: Re: [Numpy-discussion] numpy revision 2680 causes segfault on Solaris Date: Mon, 26 Jun 2006 15:38:06 +0200 Size: 1853 URL: From stefan at sun.ac.za Thu Sep 20 19:12:38 2007 From: stefan at sun.ac.za (Stefan van der Walt) Date: Fri, 21 Sep 2007 01:12:38 +0200 Subject: [Numpy-discussion] [Fwd: Re: numpy revision 2680 causes segfault on Solaris] In-Reply-To: <46F2FAAE.2030406@stsci.edu> References: <46F2FAAE.2030406@stsci.edu> Message-ID: <20070920231237.GK712@mentat.za.net> Hi Chris Does this problem persist? I thought Eric's patch fixed it. Goes to show, we really need a Big Endian buildbot client. 
Cheers Stéfan

On Thu, Sep 20, 2007 at 06:56:46PM -0400, Christopher Hanley wrote: > Hi Travis, > > The test failure was caused by a new test being added to the test suite > to catch an existing problem. It was not a new code change that caused > the problem. > > Chris

From chanley at stsci.edu Thu Sep 20 21:30:18 2007 From: chanley at stsci.edu (Christopher Hanley) Date: Thu, 20 Sep 2007 21:30:18 -0400 Subject: [Numpy-discussion] [Fwd: Re: numpy revision 2680 causes segfault on Solaris] In-Reply-To: <20070920231237.GK712@mentat.za.net> References: <46F2FAAE.2030406@stsci.edu> <20070920231237.GK712@mentat.za.net> Message-ID: <46F31EAA.2030503@stsci.edu>

We have not seen any test failures on our big-endian Solaris system. Did you re-implement the unit test that was failing? I was under the impression that the fix had been to comment out the test that was failing. I was unaware that any patch was in place. Chris

From stefan at sun.ac.za Thu Sep 20 21:46:41 2007 From: stefan at sun.ac.za (Stefan van der Walt) Date: Fri, 21 Sep 2007 03:46:41 +0200 Subject: [Numpy-discussion] [Fwd: Re: numpy revision 2680 causes segfault on Solaris] In-Reply-To: <46F31EAA.2030503@stsci.edu> References: <46F2FAAE.2030406@stsci.edu> <20070920231237.GK712@mentat.za.net> <46F31EAA.2030503@stsci.edu> Message-ID: <20070921014641.GL712@mentat.za.net>

Hi Chris On Thu, Sep 20, 2007 at 09:30:18PM -0400, Christopher Hanley wrote: > We have not seen any test failures on our big-endian Solaris system. > Did you re-implement the unit test that was failing? I was under the > impression that the fix had been to comment out the test that was > failing. I was unaware that any patch was in place. We (mostly Eric) fixed the problem. The test was then re-activated. Cheers Stéfan

From chanley at stsci.edu Thu Sep 20 21:57:19 2007 From: chanley at stsci.edu (Christopher Hanley) Date: Thu, 20 Sep 2007 21:57:19 -0400 Subject: [Numpy-discussion] [Fwd: Re: numpy revision 2680 causes segfault on Solaris] In-Reply-To: <20070921014641.GL712@mentat.za.net> References: <46F2FAAE.2030406@stsci.edu> <20070920231237.GK712@mentat.za.net> <46F31EAA.2030503@stsci.edu> <20070921014641.GL712@mentat.za.net> Message-ID: <46F324FF.3000801@stsci.edu>

Cool! Thank you Stefan and mostly Eric. Cheers, Chris

From stefan at sun.ac.za Fri Sep 21 05:27:45 2007 From: stefan at sun.ac.za (Stefan van der Walt) Date: Fri, 21 Sep 2007 11:27:45 +0200 Subject: [Numpy-discussion] ANN: SciPy 0.6.0 Message-ID: <20070921092745.GQ712@mentat.za.net>

----- Forwarded message from Jarrod Millman ----- From: Jarrod Millman To: SciPy Users List Subject: [SciPy-user] ANN: SciPy 0.6.0 Reply-To: SciPy Users List Date: Fri, 21 Sep 2007 02:04:32 -0700 Message-ID:

I'm pleased to announce the release of SciPy 0.6.0: http://scipy.org/Download SciPy is a package of tools for science and engineering for Python. It includes modules for statistics, optimization, integration, linear algebra, Fourier transforms, signal and image processing, ODE solvers, and more. This release brings many bugfixes and speed improvements.
Major changes since 0.5.2.1: * cluster o cleaned up kmeans code and added a kmeans2 function that adds several initialization methods * fftpack o fft speedups for complex data o major overhaul of fft source code for easier maintenance * interpolate o add Lagrange interpolating polynomial o fix interp1d so that it works for higher order splines * io o read and write basic .wav files * linalg o add Cholesky decomposition and solution of banded linear systems with Hermitian or symmetric matrices o add RQ decomposition * ndimage o port to NumPy API o fix byteswapping problem in rotate o better support for 64-bit platforms * optimize o nonlinear solvers module added o a lot of bugfixes and modernization * signal o add complex Morlet wavelet * sparse o significant performance improvements Thank you to everybody who contributed to the recent release. Enjoy, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ _______________________________________________ SciPy-user mailing list SciPy-user at scipy.org http://projects.scipy.org/mailman/listinfo/scipy-user ----- End forwarded message ----- From a.schmolck at gmx.net Fri Sep 21 06:57:01 2007 From: a.schmolck at gmx.net (Alexander Schmolck) Date: Fri, 21 Sep 2007 11:57:01 +0100 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? In-Reply-To: <46F26F19.3020600@ar.media.kyoto-u.ac.jp> (David Cournapeau's message of "Thu\, 20 Sep 2007 22\:01\:13 +0900") References: <46DDF734.7070708@noaa.gov> <46DEF965.1000308@noaa.gov> <46DF04E9.2020501@enthought.com> <46E2F926.1000400@obs.univ-lyon1.fr> <46E5710D.4010208@noaa.gov> <46EDBBB4.7030202@noaa.gov> <46EE0AD6.5000304@ar.media.kyoto-u.ac.jp> <46EEE27B.7070500@noaa.gov> <46EF557C.2020602@ar.media.kyoto-u.ac.jp> <46F26F19.3020600@ar.media.kyoto-u.ac.jp> Message-ID: David Cournapeau writes: > Alexander Schmolck wrote: >> "Charles R Harris" writes: >> >> >>> The automatic handling of pointers for the default allocation type is also >>> convenient and makes it reasonable to have functions return matrices and >>> vectors. >>> >> >> Hmm, I wonder whether I missed something when I read the manual. I didn't see >> anything in the docs that suggests that ublas matrices do COW, reference >> semantics or anything else to make C++'s horrible pass-by-value semantics >> bearable performancewise, so I return and pass in shared_ptr's to matrices, >> which is syntactically ugly but avoids the need to write a (reference >> semantics) wrapper class for matrix. Am I missing some easier way to >> efficiently return and pass large matrices? >> > If ublas is using expression template, shouldn't it alleviate somewhat > this problem ? I don't think so, but then I'm hardly a C++ whizz. As far as I can tell the point of expression tempaltes is just to provide syntactic sugar so that one can write fairly complex in-place computations as a normal mathematical expression. But say I want to pass a big matrix of datapoints to a classifier -- how would expression templates help here? Ublas does have various view objects, but they're of limited usefulness, because they don't provide the same functionality as the matrix class itself. 'as From markbak at gmail.com Fri Sep 21 09:37:49 2007 From: markbak at gmail.com (mark) Date: Fri, 21 Sep 2007 13:37:49 -0000 Subject: [Numpy-discussion] What approach is used in linalg.solve ? 
Message-ID: <1190381869.884798.325560@g4g2000hsf.googlegroups.com> Hello, anybody know what approach is used in linalg.solve? I used it in a paper and some reviewer wants to know. Some Gaussian elimination with pivoting or something more fancy? Thanks, Mark From tim.hochberg at ieee.org Fri Sep 21 10:37:42 2007 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Fri, 21 Sep 2007 07:37:42 -0700 Subject: [Numpy-discussion] What approach is used in linalg.solve ? In-Reply-To: <1190381869.884798.325560@g4g2000hsf.googlegroups.com> References: <1190381869.884798.325560@g4g2000hsf.googlegroups.com> Message-ID: If you take a look at the source of numpy's linalg.py, you'll see that solves uses dgesv /zgesv for real /complex solves. If you Google dgesv, you get: DGESV computes the solution to a real system of linear equations A * X = B, where A is an N-by-N matrix and X and B are N-by-NRHS matrices. The LU decomposition with partial pivoting and row interchanges is used to factor A as A = P * L * U, where P is a permutation matrix, L is unit lower triangular, and U is upper triangular. The factored form of A is then used to solve the system of equations A * X = B. Don't take my word for it though; that was just the first google hit I found. Also, I don't know if scipy solve differs from numpy.solve here, nor which you are using, so I recommend that you repeat the exercise on your own. -tim On 9/21/07, mark wrote: > > Hello, anybody know what approach is used in linalg.solve? > > I used it in a paper and some reviewer wants to know. > > Some Gaussian elimination with pivoting or something more fancy? > > Thanks, > > Mark > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From sdb at cloud9.net Fri Sep 21 14:49:23 2007 From: sdb at cloud9.net (Stuart Brorson) Date: Fri, 21 Sep 2007 14:49:23 -0400 (EDT) Subject: [Numpy-discussion] Question about numpy.max() In-Reply-To: <46EE2ADE.2050602@ar.media.kyoto-u.ac.jp> References: <46EE2ADE.2050602@ar.media.kyoto-u.ac.jp> Message-ID: Hi guys, As a NumPy newbie, I am still learning things about NumPy which I didn't expect. Today I learned that for a matrix of complex numbers, numpy.max() returns the element with the largest *real* part, not the element with the largest *magnitude*. Is this the desired behavior? Here's an example: ------------------ ----------------- >>> a = numpy.array([ [1+1j, 1+2j], [2+1j, 1.9+1.9j] ]) >>> a array([[ 1. +1.j , 1. +2.j ], [ 2. +1.j , 1.9+1.9j]]) >>> numpy.max(a) (2+1j) >>> math.sqrt(2**2 + 1**2) 2.2360679774997898 >>> math.sqrt(1.9**2 + 1.9**2) 2.6870057685088806 ------------------ ----------------- FWIW, Matlab does this: --------------------- ------------------- >> a = [ 1+1*i, 1+2*i; 2+1*i, 1.9+1.9*i ] a = 1.0000 + 1.0000i 1.0000 + 2.0000i 2.0000 + 1.0000i 1.9000 + 1.9000i >> max(a) ans = 2.0000 + 1.0000i 1.9000 + 1.9000i >> max(max(a)) ans = 1.9000 + 1.9000i ------------------- ------------------ Do I misunderstand what NumPy is supposed to return? Should NumPy even necessarily do the same thing as Matlab? Or does this resemble a bug? 
Cheers, Stuart

From robert.kern at gmail.com Fri Sep 21 15:02:52 2007 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 21 Sep 2007 14:02:52 -0500 Subject: [Numpy-discussion] Question about numpy.max() In-Reply-To: References: <46EE2ADE.2050602@ar.media.kyoto-u.ac.jp> Message-ID: <46F4155C.8010002@gmail.com>

Stuart Brorson wrote: > Hi guys, > > As a NumPy newbie, I am still learning things about NumPy which I didn't > expect. Today I learned that for a matrix of complex numbers, > numpy.max() returns the element with the largest *real* part, not the > element with the largest *magnitude*. There isn't a single, well-defined (partial) ordering of complex numbers. Both the lexicographical ordering (numpy) and the magnitude (Matlab) are useful, but the lexicographical ordering has the feature that (not (a < b)) and (not (b < a)) implies (a == b) This is not the case for using the magnitude. You can get the element of maximum magnitude like so: a.flat[absolute(a.flat).argmax()] -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From gael.varoquaux at normalesup.org Fri Sep 21 15:19:59 2007 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Fri, 21 Sep 2007 21:19:59 +0200 Subject: [Numpy-discussion] Extracting all the possible combinations of a grid Message-ID: <20070921191959.GA9624@clipper.ens.fr>

Hi all, I want to generate all the possible triplets of integers in [0, n]. I am wondering what the best possible way to do this is. To make things clearer, I could generate i, j, k using indices: i, j, k = indices((n, n, n)) But I will have several times the same triplet with different orderings. I am looking for a loop-free way of creating three arrays i, j, k with all the triplets present once, and only once. Any hint appreciated. Cheers, Gaël PS: I am having problems with my mail, so excuse me if this is a dup

From jorgen.stenarson at bostream.nu Fri Sep 21 15:23:36 2007 From: jorgen.stenarson at bostream.nu (=?ISO-8859-1?Q?J=F6rgen_Stenarson?=) Date: Fri, 21 Sep 2007 21:23:36 +0200 Subject: [Numpy-discussion] Compilation failure on python2.4 win32 In-Reply-To: <46F1DF5F.4010803@enthought.com> References: <46ECFFD6.6070901@ar.media.kyoto-u.ac.jp> <46ED0D96.2080509@bostream.nu> <46F1DF5F.4010803@enthought.com> Message-ID: <46F41A38.4030707@bostream.nu>

Travis E. Oliphant skrev: > Jörgen Stenarson wrote: >> Hi, >> >> I cannot compile numpy (rev 2042) for python2.4 on win32, it works on >> python2.5. It looks like the call to function get_build_architecture in >> distutils.misc_util.py is python2.5 specific. >> > Yes. This needs to be fixed. I'll do it. > > Can you try the current trunk? > I still see problems /Jörgen . . . Traceback (most recent call last): File "setup.py", line 90, in ?
setup_package() File "setup.py", line 83, in setup_package configuration=configuration ) File "C:\python\numeric-related\numpy\numpy\distutils\core.py", line 176, in setup File "C:\Python24\lib\distutils\core.py", line 149, in setup dist.run_commands() File "C:\Python24\lib\distutils\dist.py", line 946, in run_commands self.run_command(cmd) File "C:\Python24\lib\distutils\dist.py", line 966, in run_command cmd_obj.run() File "C:\Python24\lib\distutils\command\build.py", line 112, in run self.run_command(cmd_name) File "C:\Python24\lib\distutils\cmd.py", line 333, in run_command self.distribution.run_command(command) File "C:\Python24\lib\distutils\dist.py", line 966, in run_command cmd_obj.run() File "C:\python\numeric-related\numpy\numpy\distutils\command\build_src.py", line 130, in run File "C:\python\numeric-related\numpy\numpy\distutils\command\build_src.py", line 147, in build_so urces File "C:\python\numeric-related\numpy\numpy\distutils\command\build_src.py", line 250, in build_ex tension_sources File "C:\python\numeric-related\numpy\numpy\distutils\command\build_src.py", line 307, in generate _sources File "numpy\core\setup.py", line 109, in generate_config_h from distutils.msvccompiler import get_build_architecture ImportError: cannot import name get_build_architecture From charlesr.harris at gmail.com Fri Sep 21 15:52:31 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 21 Sep 2007 13:52:31 -0600 Subject: [Numpy-discussion] Extracting all the possible combinations of a grid In-Reply-To: <20070921191959.GA9624@clipper.ens.fr> References: <20070921191959.GA9624@clipper.ens.fr> Message-ID: On 9/21/07, Gael Varoquaux wrote: > > Hi all, > > I want to generate all the possible triplets of integers in [0, n]. I am > wondering want the best possible way to do this is. > > To make things clearer, I could generate i, j, k using indices: > > i, j, k = indices((n, n, n)) > > But I will have several times the same triplet with different ordenings. > I am looking for a loop-free way of creating three arrays i, j, k with > all the triplets present once, and only once. > > Any hint appreciated. Go here, http://www.cs.utsa.edu/~wagner/knuth/. I think you want fascicle 4A, http://www.cs.utsa.edu/~wagner/knuth/fasc4a.pdf. Some of the fascicles from Vol 4 of TAOCP are now in print, http://tinyurl.com/2goxpr. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Sep 21 15:57:25 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 21 Sep 2007 13:57:25 -0600 Subject: [Numpy-discussion] Extracting all the possible combinations of a grid In-Reply-To: References: <20070921191959.GA9624@clipper.ens.fr> Message-ID: On 9/21/07, Charles R Harris wrote: > > > > On 9/21/07, Gael Varoquaux wrote: > > > > Hi all, > > > > I want to generate all the possible triplets of integers in [0, n]. I am > > wondering want the best possible way to do this is. > > > > To make things clearer, I could generate i, j, k using indices: > > > > i, j, k = indices((n, n, n)) > > > > But I will have several times the same triplet with different ordenings. > > I am looking for a loop-free way of creating three arrays i, j, k with > > all the triplets present once, and only once. > > > > Any hint appreciated. > > > Go here, http://www.cs.utsa.edu/~wagner/knuth/. > I think you want fascicle 4A, > http://www.cs.utsa.edu/~wagner/knuth/fasc4a.pdf. 
> Some of the fascicles from Vol 4 of TAOCP are now in print, > http://tinyurl.com/2goxpr. > Oops, make that fascicle 3A, not 4A. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL:

From gael.varoquaux at normalesup.org Fri Sep 21 16:08:10 2007 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Fri, 21 Sep 2007 22:08:10 +0200 Subject: [Numpy-discussion] Extracting all the possible combinations of a grid In-Reply-To: References: <20070921191959.GA9624@clipper.ens.fr> Message-ID: <20070921200810.GB9624@clipper.ens.fr>

On Fri, Sep 21, 2007 at 01:52:31PM -0600, Charles R Harris wrote: > Go here, http://www.cs.utsa.edu/~wagner/knuth/. I think you want fascicle > 4A, http://www.cs.utsa.edu/~wagner/knuth/fasc4a.pdf. Some of the fascicles > from Vol 4 of TAOCP are now in print, http://tinyurl.com/2goxpr. :->. That's the best answer I have ever had so far: RTFAOCP ! OK, I'll have a look, but I'd be surprised he talks about loop free ways. Anyhow, I have kludged a solution that seems to be working. Not the most beautiful ever, but it seems to be working. I will need to time it, but first I need to ask the end user what the actual numbers are. Here is the kludge, feel free to laugh:

+++++++++++++++++++++++++++++++++++++++++++++++++++++
from numpy import reshape, indices, arange, take, diff

def unique_nplets(size, dims):
    """ Generate all the possible unique combinations of n=dims integers
    below size. """
    # Generate all the possible nplets
    ##triplets = reshape(mgrid[0:2, 0:2, 0:2], (3, -1))
    nplets = reshape(indices(dims*[size,]), (dims, -1))
    # Sort them
    nplets.sort(axis=0)
    nplets.sort(axis=-1)
    # Compare successive columns using diff, then sum the diff (all positive
    # numbers); if the sum is not zero, the columns are different. Make a
    # mask out of this condition, and apply it to arange to retrieve the
    # indices of these columns
    unique_indices = arange(size**dims)[(diff(nplets).sum(axis=0) > 0)]
    # Retrieve the unique columns
    unique_nplets = take(nplets, unique_indices, axis=-1)
    return unique_nplets
+++++++++++++++++++++++++++++++++++++++++++++++++++++

I was actually expecting numpy (or scipy) to have functions built-in for these kinds of problems. Or to have people on the list having already done this. Thanks for the reply, I will indeed look it up. Gaël

From charlesr.harris at gmail.com Fri Sep 21 16:33:42 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 21 Sep 2007 14:33:42 -0600 Subject: [Numpy-discussion] Extracting all the possible combinations of a grid In-Reply-To: <20070921200810.GB9624@clipper.ens.fr> References: <20070921191959.GA9624@clipper.ens.fr> <20070921200810.GB9624@clipper.ens.fr> Message-ID:

On 9/21/07, Gael Varoquaux wrote: > > On Fri, Sep 21, 2007 at 01:52:31PM -0600, Charles R Harris wrote: > > Go here, http://www.cs.utsa.edu/~wagner/knuth/. I think you want > fascicle > > 4A, http://www.cs.utsa.edu/~wagner/knuth/fasc4a.pdf. Some of the > fascicles > > from Vol 4 of TAOCP are now in print, http://tinyurl.com/2goxpr. > > :->. That's the best answer I have ever had so far: RTFAOCP ! > > OK, I'll have a look, but I'd be surprised he talks about loop free ways. > > Anyhow, I have kludged a solution that seems to be working. Not the most > beautiful ever, but it seems to be working. I will need to time it, but > first I need to ask the end user what the actual numbers are. I was actually expecting numpy (or scipy) to have functions built-in for > these kinds of problems.
Or to have people on the list having already done > this. I wrote up some of the combinatorial algorithms in python a few years ago for my own use in writing a paper, (*Harris*, C. R. Solution of the aliasing and least squares problems of spaced antenna interferometric measurements using lattice methods, Radio Sci. 38, 2003). I even thought I had found an error and have a letter from Knuth pointing out that I was mistaken ;) Anyway, there are a lot of neat things in volume 4 and it is well worth the read. As to putting these things in scipy, I wouldn't mind at all if there was a cs kit with various trees, union-find (equivalence relation) structures, indexing, combinatorial generation, and graph algorithms, but I am not sure how well they would fit in. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From wbaxter at gmail.com Fri Sep 21 16:35:49 2007 From: wbaxter at gmail.com (Bill Baxter) Date: Sat, 22 Sep 2007 05:35:49 +0900 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? In-Reply-To: References: <46DDF734.7070708@noaa.gov> <46EDBBB4.7030202@noaa.gov> <46EE0AD6.5000304@ar.media.kyoto-u.ac.jp> <46EEE27B.7070500@noaa.gov> <46EF557C.2020602@ar.media.kyoto-u.ac.jp> <46F26F19.3020600@ar.media.kyoto-u.ac.jp> Message-ID: On 9/21/07, Alexander Schmolck wrote: > > David Cournapeau writes: > > > Alexander Schmolck wrote: > >> "Charles R Harris" writes: > >> > >> > >>> The automatic handling of pointers for the default allocation type is > also > >>> convenient and makes it reasonable to have functions return matrices > and > >>> vectors. > >>> > >> > >> Hmm, I wonder whether I missed something when I read the manual. I > didn't see > >> anything in the docs that suggests that ublas matrices do COW, > reference > >> semantics or anything else to make C++'s horrible pass-by-value > semantics > >> bearable performancewise, so I return and pass in shared_ptr's to > matrices, > >> which is syntactically ugly but avoids the need to write a (reference > >> semantics) wrapper class for matrix. Am I missing some easier way to > >> efficiently return and pass large matrices? > >> > > If ublas is using expression template, shouldn't it alleviate somewhat > > this problem ? > > I don't think so, but then I'm hardly a C++ whizz. As far as I can tell > the > point of expression tempaltes is just to provide syntactic sugar so that > one > can write fairly complex in-place computations as a normal mathematical > expression. > > But say I want to pass a big matrix of datapoints to a classifier -- how > would > expression templates help here? Ublas does have various view objects, but > they're of limited usefulness, because they don't provide the same > functionality as the matrix class itself. What's wrong with using references? void my_classifier(BigMatrix& datapoints) { ... } --bb -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Sep 21 16:43:43 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 21 Sep 2007 14:43:43 -0600 Subject: [Numpy-discussion] Extracting all the possible combinations of a grid In-Reply-To: <20070921200810.GB9624@clipper.ens.fr> References: <20070921191959.GA9624@clipper.ens.fr> <20070921200810.GB9624@clipper.ens.fr> Message-ID: On 9/21/07, Gael Varoquaux wrote: > > On Fri, Sep 21, 2007 at 01:52:31PM -0600, Charles R Harris wrote: > > Go here, http://www.cs.utsa.edu/~wagner/knuth/. 
I think you want > fascicle > > 4A, http://www.cs.utsa.edu/~wagner/knuth/fasc4a.pdf. Some of the > fascicles > > from Vol 4 of TAOCP are now in print, http://tinyurl.com/2goxpr. > > :->. That's the best answer I have ever had so far: RTFAOCP ! :o) OK, I'll have a look, but I'd be surprised he talks about loop free ways. It's Knuth, he will talk about all ways known to man. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From gael.varoquaux at normalesup.org Fri Sep 21 16:43:54 2007 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Fri, 21 Sep 2007 22:43:54 +0200 Subject: [Numpy-discussion] Extracting all the possible combinations of a grid In-Reply-To: References: <20070921191959.GA9624@clipper.ens.fr> <20070921200810.GB9624@clipper.ens.fr> Message-ID: <20070921204354.GF31641@clipper.ens.fr> On Fri, Sep 21, 2007 at 02:33:42PM -0600, Charles R Harris wrote: > I wrote up some of the combinatorial algorithms in python a few years ago > for my own use in writing a paper, ( Harris, C. R. Solution of the > aliasing and least squares problems of spaced antenna interferometric > measurements using lattice methods, Radio Sci. 38, 2003). I even thought I > had found an error and have a letter from Knuth pointing out that I was > mistaken ;) Anyway, there are a lot of neat things in volume 4 and it is > well worth the read. It is. I did my homework, and now I understand why you point this out. Basically array programming is really not suited for these kind of things. The problem with my solution is that is blows up the memory really quickly. It is actually a pretty poor solution. It is obvious when you think about this a bit that the problem diverges really quickly if you try the brute force approach. It is actually really quick, until it blows the memory. I'll wait to know exactly what the numbers are (I am doing this to help a friend), I see if I keep my kludge, if I use one a Knuth's nice algorithms or if I simply implement a for loop in C + weave inline. > As to putting these things in scipy, I wouldn't mind at all if there > was a cs kit with various trees, union-find (equivalence relation) > structures, indexing, combinatorial generation, and graph > algorithms, but I am not sure how well they would fit in. That would be great. I have wanted these things a few times. I am not sure either how to fit them in. I don't really know how to fit graphs and trees with arrays. Thanks for your wise words, Ga?l From sdb at cloud9.net Fri Sep 21 16:47:00 2007 From: sdb at cloud9.net (Stuart Brorson) Date: Fri, 21 Sep 2007 16:47:00 -0400 (EDT) Subject: [Numpy-discussion] Question about numpy.max() In-Reply-To: <46F4155C.8010002@gmail.com> References: <46EE2ADE.2050602@ar.media.kyoto-u.ac.jp> <46F4155C.8010002@gmail.com> Message-ID: Thank you for your answer! >> As a NumPy newbie, I am still learning things about NumPy which I didn't >> expect. Today I learned that for a matrix of complex numbers, >> numpy.max() returns the element with the largest *real* part, not the >> element with the largest *magnitude*. > > There isn't a single, well-defined (partial) ordering of complex numbers. Both > the lexicographical ordering (numpy) and the magnitude (Matlab) are useful [... snip ...] Yeah, I know this. In fact, one can think of zillions of way to induce an ordering on the complex numbers, like Hamming distance, ordering via size of imaginary component, etc. And each might have some advantages in a particular problem domain. 
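[Illustrative aside, not part of the original message: the two conventions under discussion really do pick different elements. A minimal sketch with an arbitrary 2x2 complex array (output abbreviated; assumes a NumPy of roughly that era):

>>> import numpy
>>> a = numpy.array([[1 + 1j, 1 + 2j], [2 + 1j, 1.9 + 1.9j]])
>>> numpy.max(a)              # numpy: lexicographic ordering, real part compared first
(2+1j)
>>> a.flat[abs(a).argmax()]   # Matlab-style: element with the largest magnitude
(1.9+1.9j)
]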
Therefore, perhaps I need to refocus, or perhaps sharpen my question: Is it NumPy's goal to be as compatible with Matlab as possible? Or when questions of mathematical ambiguity arise (like how to order a sequence of complex numbers), does NumPy chose its own way? Cheers, Stuart From robert.kern at gmail.com Fri Sep 21 16:57:33 2007 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 21 Sep 2007 15:57:33 -0500 Subject: [Numpy-discussion] Question about numpy.max() In-Reply-To: References: <46EE2ADE.2050602@ar.media.kyoto-u.ac.jp> <46F4155C.8010002@gmail.com> Message-ID: <46F4303D.6060602@gmail.com> Stuart Brorson wrote: > Thank you for your answer! > >>> As a NumPy newbie, I am still learning things about NumPy which I didn't >>> expect. Today I learned that for a matrix of complex numbers, >>> numpy.max() returns the element with the largest *real* part, not the >>> element with the largest *magnitude*. >> There isn't a single, well-defined (partial) ordering of complex numbers. Both >> the lexicographical ordering (numpy) and the magnitude (Matlab) are useful > > [... snip ...] > > Yeah, I know this. In fact, one can think of zillions of way to > induce an ordering on the complex numbers, like Hamming distance, > ordering via size of imaginary component, etc. And each might have > some advantages in a particular problem domain. > > Therefore, perhaps I need to refocus, or perhaps sharpen my question: > Is it NumPy's goal to be as compatible with Matlab as possible? No. > Or > when questions of mathematical ambiguity arise (like how to order a > sequence of complex numbers), does NumPy chose its own way? Yes. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Fri Sep 21 16:58:43 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 21 Sep 2007 14:58:43 -0600 Subject: [Numpy-discussion] Extracting all the possible combinations of a grid In-Reply-To: <20070921204354.GF31641@clipper.ens.fr> References: <20070921191959.GA9624@clipper.ens.fr> <20070921200810.GB9624@clipper.ens.fr> <20070921204354.GF31641@clipper.ens.fr> Message-ID: On 9/21/07, Gael Varoquaux wrote: > > On Fri, Sep 21, 2007 at 02:33:42PM -0600, Charles R Harris wrote: > > I wrote up some of the combinatorial algorithms in python a few years > ago > > for my own use in writing a paper, ( Harris, C. R. Solution of the > > aliasing and least squares problems of spaced antenna interferometric > > measurements using lattice methods, Radio Sci. 38, 2003). I even > thought I > > had found an error and have a letter from Knuth pointing out that I > was > > mistaken ;) Anyway, there are a lot of neat things in volume 4 and it > is > > well worth the read. > > It is. I did my homework, and now I understand why you point this out. > Basically array programming is really not suited for these kind of > things. The problem with my solution is that is blows up the memory > really quickly. It is actually a pretty poor solution. It is obvious when > you think about this a bit that the problem diverges really quickly if > you try the brute force approach. It is actually really quick, until it > blows the memory. I found generators a good approach to this sort of thing: for (i,j,k) in triplets(n) : Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From Chris.Barker at noaa.gov Fri Sep 21 17:13:17 2007 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Fri, 21 Sep 2007 14:13:17 -0700 Subject: [Numpy-discussion] Question about numpy.max() In-Reply-To: References: <46EE2ADE.2050602@ar.media.kyoto-u.ac.jp> <46F4155C.8010002@gmail.com> Message-ID: <46F433ED.3050008@noaa.gov> Stuart Brorson wrote: > Is it NumPy's goal to be as compatible with Matlab as possible? No. Whatever gave you that idea? That's what Octave is for. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From gael.varoquaux at normalesup.org Fri Sep 21 17:17:17 2007 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Fri, 21 Sep 2007 23:17:17 +0200 Subject: [Numpy-discussion] Extracting all the possible combinations of a grid In-Reply-To: References: <20070921191959.GA9624@clipper.ens.fr> <20070921200810.GB9624@clipper.ens.fr> <20070921204354.GF31641@clipper.ens.fr> Message-ID: <20070921211717.GG31641@clipper.ens.fr> On Fri, Sep 21, 2007 at 02:58:43PM -0600, Charles R Harris wrote: > I found generators a good approach to this sort of thing: > for (i,j,k) in triplets(n) : That's what we where doing so far, but in the code itself. The result was unbearably slow. I think for our purposes we should build a precalculated table of these nuplets, and then do array calculations on them. I was just wondering what the good way to build this table was. And I immediately thought of using tricks on arrays without realizing the speed at which the combination diverged. In the worst case, a set of nested for loops in a weave.inline wrapped C code populating an array of (n**d/n!, d) size would do the trick and be real fast. Something like (implemented in C, rather than Python): index = 0 for i in xrange(n): for j in xrange(i): for k in xrange(j): index += 1 out_array(index, 0) = i out_array(index, 1) = j out_array(index, 2) = k This would scale pretty well IMHO. Now if this array does not fit in memory, then we have a problem. It means we need to generate a set of smaller arrays. Which should not be a tremendous problem: just extract the outer loop from the code in C, and produce one by one the arrays (this can be implement with a generator, I agree). I have already done this trick of grouping operations on arrays slightly smaller than the max memory on other code and it worked very well, much better than not using arrays. Now we need some numbers on the size of the problem to proceed. But it's Friday night in France, and the kid must be out having fun :->. Thanks for you advice, Ga?l From sdb at cloud9.net Fri Sep 21 20:27:51 2007 From: sdb at cloud9.net (Stuart Brorson) Date: Fri, 21 Sep 2007 20:27:51 -0400 (EDT) Subject: [Numpy-discussion] Question about numpy.max() In-Reply-To: <46F433ED.3050008@noaa.gov> References: <46EE2ADE.2050602@ar.media.kyoto-u.ac.jp> <46F4155C.8010002@gmail.com> <46F433ED.3050008@noaa.gov> Message-ID: >> Is it NumPy's goal to be as compatible with Matlab as possible? > > No. OK, so that's fair enough. But how about self-consistency? I was thinking about this issue as I was biking home this evening. To review my question: >>> a array([[ 1. +1.j , 1. +2.j ], [ 2. +1.j , 1.9+1.9j]]) >>> numpy.max(a) (2+1j) Why does NumPy return 2+1j, which is the element with the largest real part? Why not return the element with the largest *magnitude*? 
Here's an answer from the list:

> There isn't a single, well-defined (partial) ordering of complex numbers. Both
> the lexicographical ordering (numpy) and the magnitude (Matlab) are useful, but
> the lexicographical ordering has the feature that
>
> (not (a < b)) and (not (b < a)) implies (a == b)

[snip]

Sounds good, but actually NumPy is a little schizophrenic when it comes to defining an order for complex numbers. Here's another NumPy session log:

>>> a = 2+1j
>>> b = 2+2j
>>> a>b
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: no ordering relation is defined for complex numbers
>>> a<b
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: no ordering relation is defined for complex numbers

Hmmmmmm, so no ordering is defined for complex numbers using the > and < operators, but ordering *is* defined for finding the max of a complex matrix.

I am not trying to be a pain, but now I wonder if there is a philosophy behind the way complex numbers are ordered, or if different developers did their own thing while adding features to NumPy? If so, that's cool. But it begs the question: will somebody decide to unify the behaviors at some point in the future? Or are NumPy behaviors -- once defined -- never changed? I ask because I am writing NumPy code, and I want to know whether the behaviors will change at some point in the future.

Stuart

From gael.varoquaux at normalesup.org Fri Sep 21 20:33:16 2007
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Sat, 22 Sep 2007 02:33:16 +0200
Subject: [Numpy-discussion] Question about numpy.max()
In-Reply-To:
References: <46EE2ADE.2050602@ar.media.kyoto-u.ac.jp> <46F4155C.8010002@gmail.com> <46F433ED.3050008@noaa.gov>
Message-ID: <20070922003316.GA3061@clipper.ens.fr>

On Fri, Sep 21, 2007 at 08:27:51PM -0400, Stuart Brorson wrote:
> >>> a
> array([[ 1. +1.j , 1. +2.j ],
>        [ 2. +1.j , 1.9+1.9j]])
> >>> numpy.max(a)
> (2+1j)

This is numpy. You are dealing with arrays, so it's numpy.

> >>> a = 2+1j
> >>> b = 2+2j
> >>> a>b
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: no ordering relation is defined for complex numbers
> >>> a<b
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: no ordering relation is defined for complex numbers

This is plain Python. You are not dealing with arrays. So you are saying numpy is inconsistent with Python. Fair enough, numpy's goals aren't really the same as Python's.

Gaël

From robert.kern at gmail.com Fri Sep 21 20:36:09 2007
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 21 Sep 2007 19:36:09 -0500
Subject: [Numpy-discussion] Question about numpy.max()
In-Reply-To:
References: <46EE2ADE.2050602@ar.media.kyoto-u.ac.jp> <46F4155C.8010002@gmail.com> <46F433ED.3050008@noaa.gov>
Message-ID: <46F46379.5040401@gmail.com>

Stuart Brorson wrote:
>>> Is it NumPy's goal to be as compatible with Matlab as possible?
>> No.
>
> OK, so that's fair enough. But how about self-consistency?
> I was thinking about this issue as I was biking home this evening.
>
> To review my question:
>
> >>> a
> array([[ 1. +1.j , 1. +2.j ],
>        [ 2. +1.j , 1.9+1.9j]])
> >>> numpy.max(a)
> (2+1j)
>
> Why does NumPy return 2+1j, which is the element with the largest real
> part? Why not return the element with the largest *magnitude*?
> Here's an answer from the list:
>
>> There isn't a single, well-defined (partial) ordering of complex numbers.
Both >> the lexicographical ordering (numpy) and the magnitude (Matlab) are useful, but >> the lexicographical ordering has the feature that >> >> (not (a < b)) and (not (b < a)) implies (a == b) > [snip] > > Sounds good, but actually NumPy is a little schizophrenic when it > comes to defining an order for complex numbers. Here's another NumPy > session log: > > >>> a = 2+1j > >>> b = 2+2j > >>> a>b > Traceback (most recent call last): > File "", line 1, in > TypeError: no ordering relation is defined for complex numbers > >>> a Traceback (most recent call last): > File "", line 1, in > TypeError: no ordering relation is defined for complex numbers No, that's a Python session log and the objects you are comparing are Python complex objects. No numpy in sight. Here is what numpy does for its complex scalar objects: >>> from numpy import * >>> a = complex64(2+1j) >>> b = complex64(2+2j) >>> a < b True > Hmmmmmm, so no ordering is defined for complex numbers using the > and > < operators, but ordering *is* defined for finding the max of a > complex matrix. > > I am not trying to be a pain, but now I wonder if there is a > philosophy behind the way complex numbers are ordered, or if different > developers did their own thing while adding features to NumPy? If so, > that's cool. But it begs the question: will somebody decide to unify > the behaviors at some point in the future? Or are NumPy behaviors -- > once defined -- never changed? We do try to keep backwards compatibility. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Fri Sep 21 20:40:52 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 21 Sep 2007 18:40:52 -0600 Subject: [Numpy-discussion] Extracting all the possible combinations of a grid In-Reply-To: <20070921211717.GG31641@clipper.ens.fr> References: <20070921191959.GA9624@clipper.ens.fr> <20070921200810.GB9624@clipper.ens.fr> <20070921204354.GF31641@clipper.ens.fr> <20070921211717.GG31641@clipper.ens.fr> Message-ID: On 9/21/07, Gael Varoquaux wrote: > > On Fri, Sep 21, 2007 at 02:58:43PM -0600, Charles R Harris wrote: > > I found generators a good approach to this sort of thing: > > > for (i,j,k) in triplets(n) : > > That's what we where doing so far, but in the code itself. The result was > unbearably slow. > I think for our purposes we should build a precalculated table of these > nuplets, and then do array calculations on them. > > I was just wondering what the good way to build this table was. And I > immediately thought of using tricks on arrays without realizing the speed > at which the combination diverged. > > In the worst case, a set of nested for loops in a weave.inline wrapped C > code populating an array of (n**d/n!, d) size would do the trick and be > real fast. Something like (implemented in C, rather than Python): > > index = 0 > for i in xrange(n): > for j in xrange(i): > for k in xrange(j): > index += 1 > out_array(index, 0) = i > out_array(index, 1) = j > out_array(index, 2) = k > > This would scale pretty well IMHO. Try def triplet(n) : out = [] for i in xrange(2,n) : for j in xrange(1,i) : for k in xrange(0,j) : out.append((i,j,k)) return out It's the first algorithm in Knuth, mentioned in passing. In [7]: time a = triplet(100) CPU times: user 0.14 s, sys: 0.01 s, total: 0.14 s Wall time: 0.15 Which isn't too bad for 161700 combinations. 
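[Illustrative aside, not part of the original message: 161700 is exactly C(100, 3) = 100*99*98/6, and since the discussion keeps returning to doing array computations on the result, note that the list built by triplet() above converts directly to an array:

>>> import numpy
>>> trips = numpy.array(triplet(100))   # one row per triplet
>>> trips.shape
(161700, 3)
]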
However, I still prefer the generator form if you want to save memory for large n. In [2]: def triplet(n) : ...: for i in xrange(2,n) : ...: for j in xrange(1,i) : ...: for k in xrange(1,j) : ...: yield i,j,k ...: It's a bit slower for making lists, however. In [24]: time for i,j,k in triplet(100) : a.append((i,j,k)) CPU times: user 0.24 s, sys: 0.01 s, total: 0.25 s Wall time: 0.25 The more general algorithm L isn't too bad either In [29]: def combination(n,t) : ....: c = arange(t + 2) ....: c[-1] = 0 ....: c[-2] = n ....: while 1 : ....: yield c[:t] ....: j = 0 ....: while c[j] + 1 == c[j+1] : ....: c[j] = j ....: j += 1 ....: if j >= t : ....: return ....: c[j] += 1 In [30]: for i in combination(5,3) : print i ....: [0 1 2] [0 1 3] [0 2 3] [1 2 3] [0 1 4] [0 2 4] [1 2 4] [0 3 4] [1 3 4] [2 3 4] In [31]: time for i in combination(100,3) : pass CPU times: user 0.58 s, sys: 0.02 s, total: 0.60 s Wall time: 0.60 Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From sdb at cloud9.net Fri Sep 21 20:45:28 2007 From: sdb at cloud9.net (Stuart Brorson) Date: Fri, 21 Sep 2007 20:45:28 -0400 (EDT) Subject: [Numpy-discussion] Question about numpy.max() In-Reply-To: <46F46379.5040401@gmail.com> References: <46EE2ADE.2050602@ar.media.kyoto-u.ac.jp> <46F4155C.8010002@gmail.com> <46F433ED.3050008@noaa.gov> <46F46379.5040401@gmail.com> Message-ID: On Fri, 21 Sep 2007, Robert Kern wrote: > Stuart Brorson wrote: >>>> Is it NumPy's goal to be as compatible with Matlab as possible? >>> No. >> >> OK, so that's fair enough. But how about self-consistency? >> I was thinking about this issue as I was biking home this evening. >> >> To review my question: >> >> >>> a >> array([[ 1. +1.j , 1. +2.j ], >> [ 2. +1.j , 1.9+1.9j]]) >> >>> numpy.max(a) >> (2+1j) >> >> Why does NumPy return 2+1j, which is the element with the largest real >> part? Why not return the element with the largest *magnitude*? >> Here's an answer from the list: >> >>> There isn't a single, well-defined (partial) ordering of complex numbers. Both >>> the lexicographical ordering (numpy) and the magnitude (Matlab) are useful, but >>> the lexicographical ordering has the feature that >>> >>> (not (a < b)) and (not (b < a)) implies (a == b) >> [snip] >> >> Sounds good, but actually NumPy is a little schizophrenic when it >> comes to defining an order for complex numbers. Here's another NumPy >> session log: >> >> >>> a = 2+1j >> >>> b = 2+2j >> >>> a>b >> Traceback (most recent call last): >> File "", line 1, in >> TypeError: no ordering relation is defined for complex numbers >> >>> a> Traceback (most recent call last): >> File "", line 1, in >> TypeError: no ordering relation is defined for complex numbers > > No, that's a Python session log and the objects you are comparing are Python > complex objects. No numpy in sight. Here is what numpy does for its complex > scalar objects: > >>>> from numpy import * >>>> a = complex64(2+1j) >>>> b = complex64(2+2j) >>>> a < b > True OK, fair enough. I was wrong. But, ummmmm, in my example above, when you find the max of a complex array, you compare based upon the *real* part of each element. Here, you compare based upon complex *magnitude*. Again, I wonder about self-consistency. I guess the thing which bothers me is that finding the max of a complex array by finding the element with the largest *real* part seems..... well..... ummmm, like a bug. Or at least rather non-intuitive. 
Yes, you can use any ordering relationship for complex numbers you want, but, gee, it seems to me that once you choose one then you should stick to it. >> Or are NumPy behaviors -- >> once defined -- never changed? > > We do try to keep backwards compatibility. Great! Thank you! Stuart From efiring at hawaii.edu Fri Sep 21 21:11:03 2007 From: efiring at hawaii.edu (Eric Firing) Date: Fri, 21 Sep 2007 15:11:03 -1000 Subject: [Numpy-discussion] Question about numpy.max() In-Reply-To: References: <46EE2ADE.2050602@ar.media.kyoto-u.ac.jp> <46F4155C.8010002@gmail.com> <46F433ED.3050008@noaa.gov> <46F46379.5040401@gmail.com> Message-ID: <46F46BA7.9030902@hawaii.edu> Stuart Brorson wrote: > On Fri, 21 Sep 2007, Robert Kern wrote: >> Stuart Brorson wrote: >>>>> Is it NumPy's goal to be as compatible with Matlab as possible? >>>> No. >>> OK, so that's fair enough. But how about self-consistency? >>> I was thinking about this issue as I was biking home this evening. >>> >>> To review my question: >>> >>> >>> a >>> array([[ 1. +1.j , 1. +2.j ], >>> [ 2. +1.j , 1.9+1.9j]]) >>> >>> numpy.max(a) >>> (2+1j) >>> >>> Why does NumPy return 2+1j, which is the element with the largest real >>> part? Why not return the element with the largest *magnitude*? >>> Here's an answer from the list: >>> >>>> There isn't a single, well-defined (partial) ordering of complex numbers. Both >>>> the lexicographical ordering (numpy) and the magnitude (Matlab) are useful, but >>>> the lexicographical ordering has the feature that >>>> >>>> (not (a < b)) and (not (b < a)) implies (a == b) >>> [snip] >>> >>> Sounds good, but actually NumPy is a little schizophrenic when it >>> comes to defining an order for complex numbers. Here's another NumPy >>> session log: >>> >>> >>> a = 2+1j >>> >>> b = 2+2j >>> >>> a>b >>> Traceback (most recent call last): >>> File "", line 1, in >>> TypeError: no ordering relation is defined for complex numbers >>> >>> a>> Traceback (most recent call last): >>> File "", line 1, in >>> TypeError: no ordering relation is defined for complex numbers >> No, that's a Python session log and the objects you are comparing are Python >> complex objects. No numpy in sight. Here is what numpy does for its complex >> scalar objects: >> >>>>> from numpy import * >>>>> a = complex64(2+1j) >>>>> b = complex64(2+2j) >>>>> a < b >> True > > OK, fair enough. I was wrong. But, ummmmm, in my example above, when > you find the max of a complex array, you compare based upon the *real* > part of each element. Here, you compare based upon complex > *magnitude*. No. It is a matter of sorting first on the real part, and then resolving duplicates by sorting on the imaginary part. The magnitude is not used: In [21]:a = complex64(2+1j) In [22]:c = complex64(1.99+2j) In [23]:a>c Out[23]:True In [24]:amin([a,c]) Out[24]:(1.99000000954+2j) In [25]:amax([a,c]) Out[25]:(2+1j) In [26]:a) In-Reply-To: <46F46BA7.9030902@hawaii.edu> References: <46EE2ADE.2050602@ar.media.kyoto-u.ac.jp> <46F4155C.8010002@gmail.com> <46F433ED.3050008@noaa.gov> <46F46379.5040401@gmail.com> <46F46BA7.9030902@hawaii.edu> Message-ID: > No. It is a matter of sorting first on the real part, and then resolving > duplicates by sorting on the imaginary part. The magnitude is not used: [snip] Oh, OK. So the ordering algorithm for complex numbers is: 1. First sort on real part. 2. Then sort on imag part. Right? 
Stuart From gruben at bigpond.net.au Fri Sep 21 21:33:39 2007 From: gruben at bigpond.net.au (Gary Ruben) Date: Sat, 22 Sep 2007 11:33:39 +1000 Subject: [Numpy-discussion] Extracting all the possible combinations of a grid In-Reply-To: <20070921211717.GG31641@clipper.ens.fr> References: <20070921191959.GA9624@clipper.ens.fr> <20070921200810.GB9624@clipper.ens.fr> <20070921204354.GF31641@clipper.ens.fr> <20070921211717.GG31641@clipper.ens.fr> Message-ID: <46F470F3.802@bigpond.net.au> Gael Varoquaux wrote: > On Fri, Sep 21, 2007 at 02:58:43PM -0600, Charles R Harris wrote: >> I found generators a good approach to this sort of thing: > >> for (i,j,k) in triplets(n) : > > That's what we where doing so far, but in the code itself. The result was > unbearably slow. > I think for our purposes we should build a precalculated table of these > nuplets, and then do array calculations on them. I'm not sure whether this helps, but I have found this generator-based recipe very useful in the past. The comment "If you require the complete list of permutations, just use the built-in list() operator" may apply to your situation, Gary R. From David.L.Goldsmith at noaa.gov Fri Sep 21 21:48:12 2007 From: David.L.Goldsmith at noaa.gov (David Goldsmith) Date: Fri, 21 Sep 2007 18:48:12 -0700 Subject: [Numpy-discussion] Question about numpy.max() In-Reply-To: References: <46EE2ADE.2050602@ar.media.kyoto-u.ac.jp> <46F4155C.8010002@gmail.com> <46F433ED.3050008@noaa.gov> <46F46379.5040401@gmail.com> <46F46BA7.9030902@hawaii.edu> Message-ID: <46F4745C.8030203@noaa.gov> Apparently so. Not to be snide, but I found this thread very "entertaining," as, precisely because there is no single, well-defined (partial) ordering of C, I regard it as poor coding practice to rely on whatever partial ordering the language you're using may (IMO unwisely) provide: if you want max(abs(complex_array)), then you should write that so that future people reading your code have no doubt that that's what you intended; likewise, even if numpy provides it as a default, IMO, if you want max(real(complex_array)), then you should write that, and if you want max(imag(complex_array[where(complex_array == max(real(complex_array))])) (sorry if my numpy is bad) then you should write that (yes, I know it's not very readable, but it accurately portrays your intent and to me, that's paramount.) JMO, DG Stuart Brorson wrote: >> No. It is a matter of sorting first on the real part, and then resolving >> duplicates by sorting on the imaginary part. The magnitude is not used: >> > [snip] > > Oh, OK. So the ordering algorithm for complex numbers is: > > 1. First sort on real part. > 2. Then sort on imag part. > > Right? 
> > Stuart > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- ERD/ORR/NOS/NOAA From charlesr.harris at gmail.com Fri Sep 21 22:15:26 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 21 Sep 2007 20:15:26 -0600 Subject: [Numpy-discussion] Extracting all the possible combinations of a grid In-Reply-To: <46F470F3.802@bigpond.net.au> References: <20070921191959.GA9624@clipper.ens.fr> <20070921200810.GB9624@clipper.ens.fr> <20070921204354.GF31641@clipper.ens.fr> <20070921211717.GG31641@clipper.ens.fr> <46F470F3.802@bigpond.net.au> Message-ID: On 9/21/07, Gary Ruben wrote: > > Gael Varoquaux wrote: > > On Fri, Sep 21, 2007 at 02:58:43PM -0600, Charles R Harris wrote: > >> I found generators a good approach to this sort of thing: > > > >> for (i,j,k) in triplets(n) : > > > > That's what we where doing so far, but in the code itself. The result > was > > unbearably slow. > > I think for our purposes we should build a precalculated table of these > > nuplets, and then do array calculations on them. > > I'm not sure whether this helps, but I have found this generator-based > recipe > very useful in the past. The comment "If you require the complete list > of permutations, just use the built-in list() operator" may apply to > your situation, Not bad for the triplets, I would have thought the cost of recursion would be greater. The use of lists slows it down a bit for bigger combinations. In [50]: time for i in xuniqueCombinations(range(100),3) : pass CPU times: user 0.56 s, sys: 0.02 s, total: 0.58 s Wall time: 0.57 In [51]: time for i in xuniqueCombinations(range(20),10) : pass CPU times: user 2.17 s, sys: 0.07 s, total: 2.23 s Wall time: 2.23 In [52]: time for i in combination(20,10) : pass CPU times: user 0.96 s, sys: 0.01 s, total: 0.97 s Wall time: 0.97 Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From sdb at cloud9.net Fri Sep 21 22:33:14 2007 From: sdb at cloud9.net (Stuart Brorson) Date: Fri, 21 Sep 2007 22:33:14 -0400 (EDT) Subject: [Numpy-discussion] Question about numpy.max() In-Reply-To: <46F4745C.8030203@noaa.gov> References: <46EE2ADE.2050602@ar.media.kyoto-u.ac.jp> <46F4155C.8010002@gmail.com> <46F433ED.3050008@noaa.gov> <46F46379.5040401@gmail.com> <46F46BA7.9030902@hawaii.edu> <46F4745C.8030203@noaa.gov> Message-ID: On Fri, 21 Sep 2007, David Goldsmith wrote: > Not to be snide, but I found this thread very "entertaining," as, > precisely because there is no single, well-defined (partial) ordering of > C, I regard it as poor coding practice to rely on whatever partial > ordering the language you're using may (IMO unwisely) provide: if you > want max(abs(complex_array)), then you should write that so that future > people reading your code have no doubt that that's what you intended; > likewise, even if numpy provides it as a default, IMO, if you want > max(real(complex_array)), then you should write that, [snip] Yea, I kind of thought that too. However, the problem with that is: max(real(complex_array)) returns only the *real* part of the max value found. Numpy returns the *complex* value with the largest *real* part. So the return is conceptually muddled. More specifically, Numpy doesn't return max(real(complex_array)). 
Rather, it does something like (in pseudocode) idx1 = index( max_all(real(complex_array)) ) idx2 = index( max(imag(complex_array[idx1])) ) return complex_array[idx2] Stuart From David.L.Goldsmith at noaa.gov Fri Sep 21 23:01:36 2007 From: David.L.Goldsmith at noaa.gov (David Goldsmith) Date: Fri, 21 Sep 2007 20:01:36 -0700 Subject: [Numpy-discussion] Question about numpy.max() In-Reply-To: References: <46EE2ADE.2050602@ar.media.kyoto-u.ac.jp> <46F4155C.8010002@gmail.com> <46F433ED.3050008@noaa.gov> <46F46379.5040401@gmail.com> <46F46BA7.9030902@hawaii.edu> Message-ID: <46F48590.1000804@noaa.gov> In reply to your reply (which I inadvertently deleted :-[ ) to my reply: Point well-made and taken. :-) DG Stuart Brorson wrote: >> No. It is a matter of sorting first on the real part, and then resolving >> duplicates by sorting on the imaginary part. The magnitude is not used: >> > [snip] > > Oh, OK. So the ordering algorithm for complex numbers is: > > 1. First sort on real part. > 2. Then sort on imag part. > > Right? > > Stuart > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- ERD/ORR/NOS/NOAA From spmcinerney at hotmail.com Fri Sep 21 23:47:57 2007 From: spmcinerney at hotmail.com (Stephen McInerney) Date: Fri, 21 Sep 2007 20:47:57 -0700 Subject: [Numpy-discussion] Extracting all the possible combinations of a grid In-Reply-To: References: Message-ID: Gael, The pure Pythonic solution is a list comprehension involving multiple sequences: x = range(0,n) y = x z = x t = [(xx,yy,zz) for xx in x for yy in y for zz in z] You don't need subscripting, or recursive fns, or Knuth. Runtime is almost instant (for n=10). All the NumPy solutions look more painful, so if needs be, transform the NumPy sequences to Python ones. Regards, Stephen _________________________________________________________________ Can you find the hidden words?? Take a break and play Seekadoo! http://club.live.com/seekadoo.aspx?icid=seek_wlmailtextlink From aisaac at american.edu Sat Sep 22 00:09:16 2007 From: aisaac at american.edu (Alan G Isaac) Date: Sat, 22 Sep 2007 00:09:16 -0400 Subject: [Numpy-discussion] =?iso-8859-1?q?Extracting_all_the_possible_com?= =?iso-8859-1?q?binations_of=09a_grid?= In-Reply-To: References: Message-ID: On Fri, 21 Sep 2007, Stephen McInerney apparently wrote: > The pure Pythonic solution is a list comprehension involving multiple sequences: > x = range(0,n) > y = x > z = x > t = [(xx,yy,zz) for xx in x for yy in y for zz in z] Different question ... Cheers, Alan Isaac From pearu at cens.ioc.ee Sat Sep 22 03:31:13 2007 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Sat, 22 Sep 2007 10:31:13 +0300 (EEST) Subject: [Numpy-discussion] Compilation failure on python2.4 win32 In-Reply-To: <46F41A38.4030707@bostream.nu> References: <46ECFFD6.6070901@ar.media.kyoto-u.ac.jp> <46ED0D96.2080509@bostream.nu> <46F1DF5F.4010803@enthought.com> <46F41A38.4030707@bostream.nu> Message-ID: <57979.85.166.14.172.1190446273.squirrel@cens.ioc.ee> On Fri, September 21, 2007 10:23 pm, J?rgen Stenarson wrote: > Travis E. Oliphant skrev: >> J?rgen Stenarson wrote: >>> Hi, >>> >>> I cannot compile numpy (rev 2042) for python2.4 on win32, it works on >>> python2.5. It looks like the call to function get_build_architecture in >>> distutils.misc_util.py is python2.5 specific. >>> >> Yes. This needs to be fixed. I'll do it. >> >> Can you try the current trunk? 
>> > > I still see problems > > File "numpy\core\setup.py", line 109, in generate_config_h > from distutils.msvccompiler import get_build_architecture > ImportError: cannot import name get_build_architecture Try again the latest trunk. Pearu From millman at berkeley.edu Sat Sep 22 04:32:11 2007 From: millman at berkeley.edu (Jarrod Millman) Date: Sat, 22 Sep 2007 01:32:11 -0700 Subject: [Numpy-discussion] New scipy 0.6.0 uploads Message-ID: There is a new scipy-0.6.0.tar.gz on the sourceforge page, which contains the missing scipy/linalg/src/fblaswrap_veclib_c.c. There is also now a scipy-0.6.0-py2.4-win32.egg and scipy-0.6.0-py2.5-win32.egg. Enjoy, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From gael.varoquaux at normalesup.org Sat Sep 22 04:35:16 2007 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sat, 22 Sep 2007 10:35:16 +0200 Subject: [Numpy-discussion] Extracting all the possible combinations of a grid In-Reply-To: References: <20070921191959.GA9624@clipper.ens.fr> <20070921200810.GB9624@clipper.ens.fr> <20070921204354.GF31641@clipper.ens.fr> <20070921211717.GG31641@clipper.ens.fr> Message-ID: <20070922083516.GA16271@clipper.ens.fr> On Fri, Sep 21, 2007 at 06:40:52PM -0600, Charles R Harris wrote: > def triplet(n) : > out = [] > for i in xrange(2,n) : > for j in xrange(1,i) : > for k in xrange(0,j) : > out.append((i,j,k)) > return out I need quadruplets, numbers going up first to 120 then to 200. I tried this, but I can't even go to 120, I simply flood my system (1GB of RAM, AMD64) by using all my RAM. > Which isn't too bad for 161700 combinations. However, I still prefer the > generator form if you want to save memory for large n. > In [2]: def triplet(n) : > ...: for i in xrange(2,n) : > ...: for j in xrange(1,i) : > ...: for k in xrange(1,j) : > ...: yield i,j,k That's the way we where doing it, but you really want to build arrays when doing this, because the operations afterwards take ages. Maybe I could build arrays by chunk of say 10^6. And keep itering until I run out. > In [29]: def combination(n,t) : > ....: c = arange(t + 2) > ....: c[-1] = 0 > ....: c[-2] = n > ....: while 1 : > ....: yield c[:t] > ....: j = 0 > ....: while c[j] + 1 == c[j+1] : > ....: c[j] = j > ....: j += 1 > ....: if j >= t : > ....: return > ....: c[j] += 1 I didn't try that one... Wow, that one is pretty good ! 35s for quadruplets going up to 120, 270s going up to 200s. I can use something like itertools.groupby to build arrays by grouping the results. I have implementations in C with weave inline that work pretty well: * For numbers up to 120, this brute force approach is just fine: ############################################################################## def unique_triplets(size): """ Returns the arrays all the possible unique combinations of four integers below size.""" C_code = """ int index = 0; for (int j=0; j References: <20070921191959.GA9624@clipper.ens.fr> <20070921200810.GB9624@clipper.ens.fr> <20070921204354.GF31641@clipper.ens.fr> <20070921211717.GG31641@clipper.ens.fr> <20070922083516.GA16271@clipper.ens.fr> Message-ID: <20070922092552.GB16271@clipper.ens.fr> On Sat, Sep 22, 2007 at 10:35:16AM +0200, Gael Varoquaux wrote: > I would go for the "generate_fourplets" solution if I had a way to > calculate the binomial coefficient without overflows. Sorry, premature optimisation is the root of all evil, but turning ones brain on early is good. 
""" ############################################################################## # Some routines for calculation of binomial coefficients def gcd(m,n): while n: m,n=n,m%n return m def binom_(n,k): if k==0: return 1 else: g = gcd(n,k) return binomial(n-1, k-1)/(k/g)*(n/g) def binomial(n,k): if k > n/2: # Limit recursion depth k=n-k return binom_(n,k) """ This is surprisingly fast (surprising for me, at least). Using this and the C code I have, I can generate the quadruplets of 200 integers quite quickly: In [5]: %timeit b = [1 for i in generate_quadruplets(200)] 10 loops, best of 3: 1.61 s per loop With generate_quadruplets given by: """ ############################################################################## def generate_quadruplets(size): """ Returns an iterator on tables listing all the possible unique combinations of four integers below size. """ C_code = """ int index = 0; for (int j=0; j References: <5b8d13220709151141n7fb78554p1c3c809d5cfa3022@mail.gmail.com> <1e2af89e0709161117j41321264h8cfd69d405de8258@mail.gmail.com> Message-ID: <5b8d13220709220856h9926e03q8785dfeaca3e5717@mail.gmail.com> On 9/17/07, Matthew Brett wrote: > Hi, > > > Starting thinking over the whole distutils thing, I was thinking > > what people would think about using scons inside distutils to build > > extension. > > In general this seems like an excellent idea. If we can contribute > what we need to scons, that would greatly ease the burden of > maintenance, and benefit both projects. The key problem will be > support. At the moment Pearu maintains and owns numpy.distutils. > Will we have the same level of commitment and support for this > alternative do you think? > > How easy would it be to throw up a prototype for the rest of us to > look at and get a feel for what the benefits would be? Here we are for those interested: all the work is in the branch numpy.scons. Basically, I added a command scons to numpy.distutils, and add a add_sconscript hook to the config class. An example can be found in numpy/scons_fake. You are expected to do python setup.py scons To build them (I have not found a way yet to tell the install command to call scons command; but the current implementation put the built code where distutils expects it, so if you call scons and then install, it should be ok). This is only a proof of concept, but as an example, I started to implement a small support library for sconscript to be used inside numpy, and got ctypes extension building working on both windows and linux. The Sconscript file: # vim:syntax=python from numpy.distutils.scons import GetNumpyEnvironment env = GetNumpyEnvironment(ARGUMENTS) config.CheckHeader('stdio.h') config.CheckLib('c', 'printf') config.Finish() source = ['foo.c'] import sys if sys.platform == 'win32': env.AppendUnique(CPPDEFINES = 'WIN32') env.NumpyCTypes('foo', source) And in the setup.py, you just do: config.add_sconscript('SConstruct') Once numpy with scons support is installed, partial build is possible (by partial I mean only using python setup.py scons in a subpackage). This should give a feel on how it would behave for users. If people think this is the right direction, then I can add more complete support (mostly making scons behave as distutils with respect to how to find extension using setup.cfg, implementing the checks in numpy/distutils/system_info for scons, passing the compilers/linkers to scons, etc...). 
cheers, David From charlesr.harris at gmail.com Sat Sep 22 11:58:33 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 22 Sep 2007 09:58:33 -0600 Subject: [Numpy-discussion] Extracting all the possible combinations of a grid In-Reply-To: <20070922092552.GB16271@clipper.ens.fr> References: <20070921191959.GA9624@clipper.ens.fr> <20070921200810.GB9624@clipper.ens.fr> <20070921204354.GF31641@clipper.ens.fr> <20070921211717.GG31641@clipper.ens.fr> <20070922083516.GA16271@clipper.ens.fr> <20070922092552.GB16271@clipper.ens.fr> Message-ID: On 9/22/07, Gael Varoquaux wrote: > > On Sat, Sep 22, 2007 at 10:35:16AM +0200, Gael Varoquaux wrote: > > I would go for the "generate_fourplets" solution if I had a way to > > calculate the binomial coefficient without overflows. > > Sorry, premature optimisation is the root of all evil, but turning ones > brain on early is good. > > """ > > ############################################################################## > # Some routines for calculation of binomial coefficients > def gcd(m,n): > while n: > m,n=n,m%n > return m > > def binom_(n,k): > if k==0: > return 1 > else: > g = gcd(n,k) > return binomial(n-1, k-1)/(k/g)*(n/g) > > def binomial(n,k): > if k > n/2: # Limit recursion depth > k=n-k > return binom_(n,k) > """ > > This is surprisingly fast (surprising for me, at least). > > Using this and the C code I have, I can generate the quadruplets of 200 > integers quite quickly: > > In [5]: %timeit b = [1 for i in generate_quadruplets(200)] > 10 loops, best of 3: 1.61 s per loop > > With generate_quadruplets given by: > > """ > > ############################################################################## > def generate_quadruplets(size): > """ Returns an iterator on tables listing all the possible unique > combinations of four integers below size. """ > > C_code = """ > int index = 0; > for (int j=0; j for (int k=0; k for (int l=0; l quadruplets(index, 0) = i; > quadruplets(index, 1) = j; > quadruplets(index, 2) = k; > quadruplets(index, 3) = l; > index++ ; > } > } > } > """ > > for i in xrange(size): > multiset_coef = binomial(i+3, 3) > quadruplets = empty((multiset_coef, 4), dtype=uint32) > inline(C_code, ['quadruplets', 'i'], > type_converters=converters.blitz) > > yield quadruplets > """ Umm... that doesn't look quite right. Shouldn't it be something like def generate_quadruplets(size): """ Returns an iterator on tables listing all the possible unique combinations of four integers below size. """ C_code = """ int index = 0; for (int j=2; j= t : break c[j] += 1 yield out[:i+1] if j >= t : return I think this would go well as a C++ function object. The python wrapper would look something like def combination(n,t,chunk) : next = cpp_combination(n,t,chunk) while 1 : out = next() if len(out) > 0 : yield out else : return Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From gael.varoquaux at normalesup.org Sat Sep 22 12:11:25 2007 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sat, 22 Sep 2007 18:11:25 +0200 Subject: [Numpy-discussion] Extracting all the possible combinations of a grid In-Reply-To: References: <20070921200810.GB9624@clipper.ens.fr> <20070921204354.GF31641@clipper.ens.fr> <20070921211717.GG31641@clipper.ens.fr> <20070922083516.GA16271@clipper.ens.fr> <20070922092552.GB16271@clipper.ens.fr> Message-ID: <20070922161125.GF7179@clipper.ens.fr> On Sat, Sep 22, 2007 at 09:58:33AM -0600, Charles R Harris wrote: > Umm... that doesn't look quite right. 
Shouldn't it be something like Puzzling. My implementation works, as far as I could test it, which I did as much as I could. Maybe the two are equivalent. > Algorithm L can be chunked pretty easily also: Yes, I think this would scale even better for really big sets, as the size of the chunk can be more easily controlled. The good think is that with chunks like this the calculation is fully "local", so if the result of the calculation does not depend on the number of points (eg its a sum of the calculation for eache point) then the calculation can go on for ever, there is no limit to the number of points other than the patience of the operator. So if the algorythm is right, they can leave it running for two days, and get great results for a paper. Thanks for your input, Ga?l From aisaac at american.edu Sat Sep 22 20:15:34 2007 From: aisaac at american.edu (Alan G Isaac) Date: Sat, 22 Sep 2007 20:15:34 -0400 Subject: [Numpy-discussion] combinations anomaly In-Reply-To: <20070922092552.GB16271@clipper.ens.fr> References: <20070921191959.GA9624@clipper.ens.fr><20070921200810.GB9624@clipper.ens.fr><20070921204354.GF31641@clipper.ens.fr><20070921211717.GG31641@clipper.ens.fr><20070922083516.GA16271@clipper.ens.fr><20070922092552.GB16271@clipper.ens.fr> Message-ID: Charles harris posted a generator function for generating combinations (based on Knuth's fascicle). I get the expected results by iterating over the resulting generator, but not if I let ``list`` do that for me. What is more, changing ``arange`` to ``range`` in the code eliminates this anomaly. What am I failing to understand here? Thank you, Alan Isaac ################ example ############################### def combinations(n,t): c = arange(t + 2) c[-2] = n c[-1] = 0 while 1: yield c[:t] j = 0 while c[j] + 1 == c[j+1] : c[j] = j j += 1 if j >= t : return c[j] += 1 print "I get the expected results by iterating:" for tpl in combinations(5,3): print tpl print "However list conversion fails:" print list(combinations(5,3)) ############# end example ############################### From charlesr.harris at gmail.com Sat Sep 22 21:00:30 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 22 Sep 2007 19:00:30 -0600 Subject: [Numpy-discussion] combinations anomaly In-Reply-To: References: <20070921191959.GA9624@clipper.ens.fr> <20070921200810.GB9624@clipper.ens.fr> <20070921204354.GF31641@clipper.ens.fr> <20070921211717.GG31641@clipper.ens.fr> <20070922083516.GA16271@clipper.ens.fr> <20070922092552.GB16271@clipper.ens.fr> Message-ID: On 9/22/07, Alan G Isaac wrote: > > Charles harris posted a generator function for generating > combinations (based on Knuth's fascicle). I get the > expected results by iterating over the resulting generator, > but not if I let ``list`` do that for me. What is more, > changing ``arange`` to ``range`` in the code eliminates > this anomaly. > > What am I failing to understand here? > > Thank you, > Alan Isaac There are a couple of potential problems if you want to make a list. Because an array view is returned, and the data is updated in the loop, all the views will end up with the same content. I used arrays and views for speed. To fix that, you need to return a copy, i.e., yield c[:t].copy(). That way you will end up with a list of arrays. If you do yield list(c[:t]), you will get a list of lists. Or, you can do as you did and just use range instead of arange because a slice of a list returns a copy. I admit I'm curious about the speed of the two approaches, lists may be faster than arrays. 
Lets see.... combination returns array copies, combinaion1 uses range. In [7]: time for i in combination(100,3) : pass CPU times: user 0.89 s, sys: 0.07 s, total: 0.96 s Wall time: 0.96 In [8]: time for i in combination1(100,3) : pass CPU times: user 0.17 s, sys: 0.01 s, total: 0.18 s Wall time: 0.18 Wow, massive speed up using lists, almost as fast as nested loops. I expect lists benefit from faster indexing and faster creation. I think your range fix is the way to go. Things slow down a bit for the full list treatment, but not too much: In [13]: time a = list(combination1(100,3)) CPU times: user 0.26 s, sys: 0.00 s, total: 0.27 s Wall time: 0.27 In [14]: time a = [i for i in combination1(100,3)] CPU times: user 0.35 s, sys: 0.01 s, total: 0.36 s Wall time: 0.36 Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From aisaac at american.edu Sat Sep 22 22:01:14 2007 From: aisaac at american.edu (Alan G Isaac) Date: Sat, 22 Sep 2007 22:01:14 -0400 Subject: [Numpy-discussion] combinations anomaly In-Reply-To: References: <20070921191959.GA9624@clipper.ens.fr><20070921200810.GB9624@clipper.ens.fr><20070921204354.GF31641@clipper.ens.fr><20070921211717.GG31641@clipper.ens.fr><20070922083516.GA16271@clipper.ens.fr><20070922092552.GB16271@clipper.ens.fr> Message-ID: On Sat, 22 Sep 2007, Charles R Harris apparently wrote: > an array view is returned, and the data is updated in the > loop ... I think your range fix is the way to go. Got it. Thanks! Alan From charlesr.harris at gmail.com Sat Sep 22 21:59:02 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 22 Sep 2007 19:59:02 -0600 Subject: [Numpy-discussion] combinations anomaly In-Reply-To: References: <20070921191959.GA9624@clipper.ens.fr> <20070921204354.GF31641@clipper.ens.fr> <20070921211717.GG31641@clipper.ens.fr> <20070922083516.GA16271@clipper.ens.fr> <20070922092552.GB16271@clipper.ens.fr> Message-ID: On 9/22/07, Charles R Harris wrote: > > > > On 9/22/07, Alan G Isaac wrote: > > > > Charles harris posted a generator function for generating > > combinations (based on Knuth's fascicle). I get the > > expected results by iterating over the resulting generator, > > but not if I let ``list`` do that for me. What is more, > > changing ``arange`` to ``range`` in the code eliminates > > this anomaly. > > > > What am I failing to understand here? > > > > Thank you, > > Alan Isaac > > > There are a couple of potential problems if you want to make a list. > Because an array view is returned, and the data is updated in the loop, all > the views will end up with the same content. I used arrays and views for > speed. To fix that, you need to return a copy, i.e., yield c[:t].copy(). > That way you will end up with a list of arrays. If you do yield list(c[:t]), > you will get a list of lists. Or, you can do as you did and just use range > instead of arange because a slice of a list returns a copy. I admit I'm > curious about the speed of the two approaches, lists may be faster than > arrays. Lets see.... combination returns array copies, combinaion1 uses > range. > > In [7]: time for i in combination(100,3) : pass > CPU times: user 0.89 s, sys: 0.07 s, total: 0.96 s > Wall time: 0.96 > > In [8]: time for i in combination1(100,3) : pass > CPU times: user 0.17 s, sys: 0.01 s, total: 0.18 s > Wall time: 0.18 > > Wow, massive speed up using lists, almost as fast as nested loops. I > expect lists benefit from faster indexing and faster creation. I think your > range fix is the way to go. 
Things slow down a bit for the full list > treatment, but not too much: > > In [13]: time a = list(combination1(100,3)) > CPU times: user 0.26 s, sys: 0.00 s, total: 0.27 s > Wall time: 0.27 > > In [14]: time a = [i for i in combination1(100,3)] > CPU times: user 0.35 s, sys: 0.01 s, total: 0.36 s > Wall time: 0.36 > It's even faster in C++ In [1]: from _combination import combination In [2]: time a = combination(100,3) CPU times: user 0.09 s, sys: 0.01 s, total: 0.09 s Wall time: 0.09 That's for the whole nested list. Arrays would probably be faster in this case because I am calling the python functions to add objects to the lists. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From jorgen.stenarson at bostream.nu Sun Sep 23 07:38:13 2007 From: jorgen.stenarson at bostream.nu (=?ISO-8859-1?Q?J=F6rgen_Stenarson?=) Date: Sun, 23 Sep 2007 13:38:13 +0200 Subject: [Numpy-discussion] Compilation failure on python2.4 win32 In-Reply-To: <57979.85.166.14.172.1190446273.squirrel@cens.ioc.ee> References: <46ECFFD6.6070901@ar.media.kyoto-u.ac.jp> <46ED0D96.2080509@bostream.nu> <46F1DF5F.4010803@enthought.com> <46F41A38.4030707@bostream.nu> <57979.85.166.14.172.1190446273.squirrel@cens.ioc.ee> Message-ID: <46F65025.3020403@bostream.nu> Pearu Peterson skrev: > On Fri, September 21, 2007 10:23 pm, J?rgen Stenarson wrote: >> Travis E. Oliphant skrev: >>> J?rgen Stenarson wrote: >>>> Hi, >>>> >>>> I cannot compile numpy (rev 2042) for python2.4 on win32, it works on >>>> python2.5. It looks like the call to function get_build_architecture in >>>> distutils.misc_util.py is python2.5 specific. >>>> >>> Yes. This needs to be fixed. I'll do it. >>> >>> Can you try the current trunk? >>> >> I still see problems >> > >> File "numpy\core\setup.py", line 109, in generate_config_h >> from distutils.msvccompiler import get_build_architecture >> ImportError: cannot import name get_build_architecture > > Try again the latest trunk. > No luck. The build works for me with the attached patch. /J?rgen -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/applefile Size: 1500 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: get_build_architecture.patch Type: application/octet-stream Size: 1094 bytes Desc: not available URL: From a.schmolck at gmx.net Sun Sep 23 13:37:26 2007 From: a.schmolck at gmx.net (Alexander Schmolck) Date: Sun, 23 Sep 2007 18:37:26 +0100 Subject: [Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share? In-Reply-To: (Bill Baxter's message of "Sat\, 22 Sep 2007 05\:35\:49 +0900") References: <46DDF734.7070708@noaa.gov> <46EDBBB4.7030202@noaa.gov> <46EE0AD6.5000304@ar.media.kyoto-u.ac.jp> <46EEE27B.7070500@noaa.gov> <46EF557C.2020602@ar.media.kyoto-u.ac.jp> <46F26F19.3020600@ar.media.kyoto-u.ac.jp> Message-ID: "Bill Baxter" writes: >> But say I want to pass a big matrix of datapoints to a classifier -- how >> would >> expression templates help here? Ublas does have various view objects, but >> they're of limited usefulness, because they don't provide the same >> functionality as the matrix class itself. > > > What's wrong with using references? > > void my_classifier(BigMatrix& datapoints) { > ... > } Nothing, at least as long as its a const reference (I'm not sure the syntactic pleasantness of non-const references are worth the additional semantics-obscurification). 
But since this won't work for return values and since having life-time management sounded useful I just pretty much went with shared_ptr throughout. If even some semi-manual pseudo-gc scheme, like "smart" pointers will essentially be zero performance overhead and no additional headaches, because there are a few big matrices and no danger of cycic structures, there didn't seem much reason not to use it. I'd just have preferred it if I could have had this (i.e. memory managed reference semantics) wrapped up more nicely without doing it myself (like in Qt's container classes, which if I'm not mistaken either are reference semantics or COW). 'as From david at ar.media.kyoto-u.ac.jp Mon Sep 24 06:45:46 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 24 Sep 2007 19:45:46 +0900 Subject: [Numpy-discussion] Using the numpy buildbot for svn branches ? Message-ID: <46F7955A.7030301@ar.media.kyoto-u.ac.jp> Hi, I have a first prototype ready for scons support in numpy.distutils in a numpy svn branch, and would like to test it on as many configurations as possible to see whether my design is OK. Is it possible to add a branch to the buildbot ? Thanks, David From pearu at cens.ioc.ee Mon Sep 24 07:24:23 2007 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Mon, 24 Sep 2007 13:24:23 +0200 Subject: [Numpy-discussion] Using the numpy buildbot for svn branches ? In-Reply-To: <46F7955A.7030301@ar.media.kyoto-u.ac.jp> References: <46F7955A.7030301@ar.media.kyoto-u.ac.jp> Message-ID: <46F79E67.3020501@cens.ioc.ee> David Cournapeau wrote: > Hi, > > I have a first prototype ready for scons support in numpy.distutils > in a numpy svn branch, and would like to test it on as many > configurations as possible to see whether my design is OK. Is it > possible to add a branch to the buildbot ? I think you can run build in buildbot by specifying branch and revision number directly in buildbot.scipy.org site. Just click on the name of a buildbot machine (eg `Linux x86_64 Ubuntu`) and you should get the corresponing dialog. HTH, Pearu From david at ar.media.kyoto-u.ac.jp Mon Sep 24 07:44:42 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 24 Sep 2007 20:44:42 +0900 Subject: [Numpy-discussion] Using the numpy buildbot for svn branches ? In-Reply-To: <46F79E67.3020501@cens.ioc.ee> References: <46F7955A.7030301@ar.media.kyoto-u.ac.jp> <46F79E67.3020501@cens.ioc.ee> Message-ID: <46F7A32A.80503@ar.media.kyoto-u.ac.jp> Pearu Peterson wrote: > David Cournapeau wrote: > >> Hi, >> >> I have a first prototype ready for scons support in numpy.distutils >> in a numpy svn branch, and would like to test it on as many >> configurations as possible to see whether my design is OK. Is it >> possible to add a branch to the buildbot ? >> > > I think you can run build in buildbot by specifying branch and revision > number directly in buildbot.scipy.org site. Just click on the name of > a buildbot machine (eg `Linux x86_64 Ubuntu`) and you should get > the corresponing dialog. > Thanks Pearu, there is indeed such a dialog, I just didn't think it would be possible to use the buildbot without any kind of login/pass. But unfortunately, I don't seem to get how to set up a different url: if I use branches/numpy.scons as a branch name, it does not work (there are two failed builds because of unfound url), and if I set the whole url (http://svn.scipy.org/svn/numpy/branches/numpy.scons), nothing seems to happen. 
cheers, David From matthieu.brucher at gmail.com Mon Sep 24 08:01:17 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Mon, 24 Sep 2007 14:01:17 +0200 Subject: [Numpy-discussion] Using the numpy buildbot for svn branches ? In-Reply-To: <46F7A32A.80503@ar.media.kyoto-u.ac.jp> References: <46F7955A.7030301@ar.media.kyoto-u.ac.jp> <46F79E67.3020501@cens.ioc.ee> <46F7A32A.80503@ar.media.kyoto-u.ac.jp> Message-ID: > > Thanks Pearu, there is indeed such a dialog, I just didn't think it > would be possible to use the buildbot without any kind of login/pass. That's one of the drawbacks of buildbot. But unfortunately, I don't seem to get how to set up a different url: if > I use branches/numpy.scons as a branch name, it does not work (there are > two failed builds because of unfound url), and if I set the whole url > (http://svn.scipy.org/svn/numpy/branches/numpy.scons), nothing seems to > happen. It should have worked with the first solution. Did you try "trunk", to see if it works ? Is there somewhere the configuration file of the buildbot ? with this line for the SVN step, it should work : factory.addStep(SVN, baseURL="http://svn.scipy.org/svn/numpy/", defaultBranch="trunk") Matthieu -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Mon Sep 24 08:05:57 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 24 Sep 2007 21:05:57 +0900 Subject: [Numpy-discussion] Using the numpy buildbot for svn branches ? In-Reply-To: References: <46F7955A.7030301@ar.media.kyoto-u.ac.jp> <46F79E67.3020501@cens.ioc.ee> <46F7A32A.80503@ar.media.kyoto-u.ac.jp> Message-ID: <46F7A825.4020901@ar.media.kyoto-u.ac.jp> Matthieu Brucher wrote: > > Thanks Pearu, there is indeed such a dialog, I just didn't think it > would be possible to use the buildbot without any kind of login/pass. > > > > That's one of the drawbacks of buildbot. > > > But unfortunately, I don't seem to get how to set up a different > url: if > I use branches/numpy.scons as a branch name, it does not work > (there are > two failed builds because of unfound url), and if I set the whole url > (http://svn.scipy.org/svn/numpy/branches/numpy.scons > ), nothing > seems to > happen. > > > It should have worked with the first solution. Did you try "trunk", to > see if it works ? It does not seem to work with only trunk. > Is there somewhere the configuration file of the buildbot ? > with this line for the SVN step, it should work : > > factory.addStep(SVN, baseURL="http://svn.scipy.org/svn/numpy/", > defaultBranch="trunk") I don't know if this is relevant, but in the html generated from the trace, there is the following (http://buildbot.scipy.org/Linux%20x86%20Ubuntu/builds/131/step-svn/0) svnurl 'http://scipy.org/svn/numpy/trunk' Locals branch 'branches/numpy.scons' Which may indicate that trunk is hardcoded in the svn url ? From matthieu.brucher at gmail.com Mon Sep 24 08:22:47 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Mon, 24 Sep 2007 14:22:47 +0200 Subject: [Numpy-discussion] Using the numpy buildbot for svn branches ? In-Reply-To: <46F7A825.4020901@ar.media.kyoto-u.ac.jp> References: <46F7955A.7030301@ar.media.kyoto-u.ac.jp> <46F79E67.3020501@cens.ioc.ee> <46F7A32A.80503@ar.media.kyoto-u.ac.jp> <46F7A825.4020901@ar.media.kyoto-u.ac.jp> Message-ID: > > > It should have worked with the first solution. Did you try "trunk", to > > see if it works ? It does not seem to work with only trunk. 
This should have worked if the buildbot was setup to work with branches (or even tags, for that matter). > I don't know if this is relevant, but in the html generated from the > trace, there is the following > (http://buildbot.scipy.org/Linux%20x86%20Ubuntu/builds/131/step-svn/0) > > svnurl 'http://scipy.org/svn/numpy/trunk' > Locals > branch 'branches/numpy.scons' > > Which may indicate that trunk is hardcoded in the svn url ? > According to the documentation, either svnurl or baseURL should be used, not both. And to use branches, baseURL must be used, so we have our answer :( Matthieu -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Mon Sep 24 10:00:15 2007 From: stefan at sun.ac.za (Stefan van der Walt) Date: Mon, 24 Sep 2007 16:00:15 +0200 Subject: [Numpy-discussion] Using the numpy buildbot for svn branches ? In-Reply-To: <46F7A825.4020901@ar.media.kyoto-u.ac.jp> References: <46F7955A.7030301@ar.media.kyoto-u.ac.jp> <46F79E67.3020501@cens.ioc.ee> <46F7A32A.80503@ar.media.kyoto-u.ac.jp> <46F7A825.4020901@ar.media.kyoto-u.ac.jp> Message-ID: <20070924140015.GC13649@mentat.za.net> On Mon, Sep 24, 2007 at 09:05:57PM +0900, David Cournapeau wrote: > > It should have worked with the first solution. Did you try "trunk", to > > see if it works ? > It does not seem to work with only trunk. > > Is there somewhere the configuration file of the buildbot ? > > with this line for the SVN step, it should work : > > > > factory.addStep(SVN, baseURL="http://svn.scipy.org/svn/numpy/", > > defaultBranch="trunk") > I don't know if this is relevant, but in the html generated from the > trace, there is the following > (http://buildbot.scipy.org/Linux%20x86%20Ubuntu/builds/131/step-svn/0) > > svnurl 'http://scipy.org/svn/numpy/trunk' > Locals > branch 'branches/numpy.scons' > > Which may indicate that trunk is hardcoded in the svn url ? It is numpy's buildmaster, so nothing is set in stone :) I modified the configuration file; looks like it is working. I left the debug interface online, to see how things go. If we have abuse from outside, we shall have to switch it off again, but in the meantime it remains a very useful tool. Also, remember that some of the buildclients are personal workstations, so be conservative in triggering builds. Cheers St?fan From tmp-numpy at tschreiner.org Mon Sep 24 10:07:01 2007 From: tmp-numpy at tschreiner.org (Thomas Schreiner) Date: Mon, 24 Sep 2007 16:07:01 +0200 Subject: [Numpy-discussion] C-API Documentation? Message-ID: <46F7C485.40702@tschreiner.org> Hi, is there any more documentation about the numpy C API than the one at http://projects.scipy.org/scipy/numpy/wiki/NumPyCAPI ? This one deals mostly with creating NumPy arrays in C, but I'm more interested in manually filling the arrays with actual data, because wrapping of memory is not possible in my scenario. Thanks, Thomas From dmitrey.kroshko at scipy.org Mon Sep 24 11:07:29 2007 From: dmitrey.kroshko at scipy.org (dmitrey) Date: Mon, 24 Sep 2007 18:07:29 +0300 Subject: [Numpy-discussion] C-API Documentation? In-Reply-To: <46F7C485.40702@tschreiner.org> References: <46F7C485.40702@tschreiner.org> Message-ID: <46F7D2B1.5080004@scipy.org> I don't know anything about C API, but scipy documentation from the website http://www.scipy.org/doc/api_docs/ is dated 14 August 2007, so scipy 0.6.0 doc differs significantly. D. 
Thomas Schreiner wrote: > Hi, > > is there any more documentation about the numpy C API than the one at > http://projects.scipy.org/scipy/numpy/wiki/NumPyCAPI > ? > > This one deals mostly with creating NumPy arrays in C, but I'm more > interested in manually filling the arrays with actual data, because > wrapping of memory is not possible in my scenario. > > Thanks, > > Thomas > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > From tmp-numpy at tschreiner.org Mon Sep 24 12:04:18 2007 From: tmp-numpy at tschreiner.org (Thomas Schreiner) Date: Mon, 24 Sep 2007 18:04:18 +0200 Subject: [Numpy-discussion] C-API Documentation? In-Reply-To: <46F7D2B1.5080004@scipy.org> References: <46F7C485.40702@tschreiner.org> <46F7D2B1.5080004@scipy.org> Message-ID: <46F7E002.9010502@tschreiner.org> Hi. dmitrey wrote: > Thomas Schreiner wrote: >> is there any more documentation about the numpy C API than the one >> at http://projects.scipy.org/scipy/numpy/wiki/NumPyCAPI ? > > I don't know anything about C API, but scipy documentation from the > website > http://www.scipy.org/doc/api_docs/ is dated 14 August 2007, so scipy > 0.6.0 doc differs significantly. Thanks for the link. I just found out that there is a problem with running even the most simple example code. Am I doing anything wrong in this program? It's crashing immediately after the "before" line, using Borland C++ Builder 6 and numpy-1.0.3.1.win32-py2.4. #include #include #include #include "numpy/ufuncobject.h" #include int main() { _control87(MCW_EM, MCW_EM); Py_Initialize(); PyImport_ImportModule("numpy"); PyObject *array; int dim[1] = { 5 }; std::cout << "before" << std::endl; array = PyArray_SimpleNew(1, dim, PyArray_SHORT); std::cout << "after" << std::endl; Py_Finalize(); return 0; } Thanks a lot, Thomas From strawman at astraw.com Mon Sep 24 13:10:58 2007 From: strawman at astraw.com (Andrew Straw) Date: Mon, 24 Sep 2007 10:10:58 -0700 Subject: [Numpy-discussion] C-API Documentation? In-Reply-To: <46F7E002.9010502@tschreiner.org> References: <46F7C485.40702@tschreiner.org> <46F7D2B1.5080004@scipy.org> <46F7E002.9010502@tschreiner.org> Message-ID: <46F7EFA2.2040801@astraw.com> Thomas Schreiner wrote: > Am I doing anything wrong in this program? It's crashing immediately > after the "before" line, using Borland C++ Builder 6 and > numpy-1.0.3.1.win32-py2.4. You have to call import_array() before using the C API. From tmp-numpy at tschreiner.org Mon Sep 24 13:56:32 2007 From: tmp-numpy at tschreiner.org (Thomas Schreiner) Date: Mon, 24 Sep 2007 19:56:32 +0200 Subject: [Numpy-discussion] C-API Documentation? In-Reply-To: <46F7EFA2.2040801@astraw.com> References: <46F7C485.40702@tschreiner.org> <46F7D2B1.5080004@scipy.org> <46F7E002.9010502@tschreiner.org> <46F7EFA2.2040801@astraw.com> Message-ID: <46F7FA50.7050100@tschreiner.org> Andrew Straw schrieb: > Thomas Schreiner wrote: >> Am I doing anything wrong in this program? It's crashing immediately >> after the "before" line, using Borland C++ Builder 6 and >> numpy-1.0.3.1.win32-py2.4. > You have to call import_array() before using the C API. Thanks a lot, that was the problem. There still seems to be a bug in #define import_array(): In my version of __multiarray_api.h, this line ends with "return;", which my compiler doesn't accept. If I replace it by "return(0);" it's perfectly fine. 
Thomas From efiring at hawaii.edu Mon Sep 24 14:03:37 2007 From: efiring at hawaii.edu (Eric Firing) Date: Mon, 24 Sep 2007 08:03:37 -1000 Subject: [Numpy-discussion] C-API Documentation? In-Reply-To: <46F7C485.40702@tschreiner.org> References: <46F7C485.40702@tschreiner.org> Message-ID: <46F7FBF9.8050107@hawaii.edu> Thomas Schreiner wrote: > Hi, > > is there any more documentation about the numpy C API than the one at > http://projects.scipy.org/scipy/numpy/wiki/NumPyCAPI If you have not already done so, I recommend following the suggestion at the bottom of that page, and buying Travis's book (http://www.trelgol.com/). It has a large section on the C API. Eric > ? > > This one deals mostly with creating NumPy arrays in C, but I'm more > interested in manually filling the arrays with actual data, because > wrapping of memory is not possible in my scenario. > > Thanks, > > Thomas > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion From wbaxter at gmail.com Mon Sep 24 14:03:48 2007 From: wbaxter at gmail.com (Bill Baxter) Date: Tue, 25 Sep 2007 03:03:48 +0900 Subject: [Numpy-discussion] C-API Documentation? In-Reply-To: <46F7FA50.7050100@tschreiner.org> References: <46F7C485.40702@tschreiner.org> <46F7D2B1.5080004@scipy.org> <46F7E002.9010502@tschreiner.org> <46F7EFA2.2040801@astraw.com> <46F7FA50.7050100@tschreiner.org> Message-ID: On 9/25/07, Thomas Schreiner wrote: > > Andrew Straw schrieb: > > Thomas Schreiner wrote: > >> Am I doing anything wrong in this program? It's crashing immediately > >> after the "before" line, using Borland C++ Builder 6 and > >> numpy-1.0.3.1.win32-py2.4. > > You have to call import_array() before using the C API. > > Thanks a lot, that was the problem. There still seems to be a bug in > #define import_array(): > In my version of __multiarray_api.h, this line ends with "return;", > which my compiler doesn't accept. If I replace it by "return(0);" it's > perfectly fine. Look for import_array1 instead then. --bb -------------- next part -------------- An HTML attachment was scrubbed... URL: From cookedm at physics.mcmaster.ca Tue Sep 25 10:23:32 2007 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Tue, 25 Sep 2007 10:23:32 -0400 Subject: [Numpy-discussion] C-API Documentation? In-Reply-To: <46F7D2B1.5080004@scipy.org> References: <46F7C485.40702@tschreiner.org> <46F7D2B1.5080004@scipy.org> Message-ID: <20070925142332.GA16431@arbutus.physics.mcmaster.ca> On Mon, Sep 24, 2007 at 06:07:29PM +0300, dmitrey wrote: > I don't know anything about C API, but scipy documentation from the website > http://www.scipy.org/doc/api_docs/ > is dated 14 August 2007, so scipy 0.6.0 doc differs significantly. > D. I was confused until I checked that page -- you mean 2006, not 2007 :-) So yes, a year out of date instead of a month. -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From cookedm at physics.mcmaster.ca Tue Sep 25 10:28:53 2007 From: cookedm at physics.mcmaster.ca (David M. 
Cooke) Date: Tue, 25 Sep 2007 10:28:53 -0400 Subject: [Numpy-discussion] As seen on PyPI -- a new bindings generator Message-ID: <20070925142853.GB16431@arbutus.physics.mcmaster.ca> Just saw this come up on PyPI: https://launchpad.net/pybindgen/ """ Python Bindings Generator PyBindGen is a Python module that is geared to generating C/C++ code that binds a C/C++ library for Python. It does so without extensive use of either C++ templates or C pre-processor macros. It has modular handling of C/C++ types, and can be easily extended with Python plugins. The generated code is almost as clean as what a human programmer would write. """ Looks interesting, esp. for C++ wrappers that Pyrex can't do. Also, it uses Waf (http://code.google.com/p/waf/) instead of distutils, which looks like an interesting alternative. -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From cournape at gmail.com Tue Sep 25 12:25:40 2007 From: cournape at gmail.com (David Cournapeau) Date: Wed, 26 Sep 2007 01:25:40 +0900 Subject: [Numpy-discussion] Using the numpy buildbot for svn branches ? In-Reply-To: <20070924140015.GC13649@mentat.za.net> References: <46F7955A.7030301@ar.media.kyoto-u.ac.jp> <46F79E67.3020501@cens.ioc.ee> <46F7A32A.80503@ar.media.kyoto-u.ac.jp> <46F7A825.4020901@ar.media.kyoto-u.ac.jp> <20070924140015.GC13649@mentat.za.net> Message-ID: <5b8d13220709250925v3e0e7393sb74d4fa1fcff069a@mail.gmail.com> On 9/24/07, Stefan van der Walt wrote: > > It is numpy's buildmaster, so nothing is set in stone :) I modified > the configuration file; looks like it is working. > Great, thank you. David From cournape at gmail.com Tue Sep 25 12:31:32 2007 From: cournape at gmail.com (David Cournapeau) Date: Wed, 26 Sep 2007 01:31:32 +0900 Subject: [Numpy-discussion] As seen on PyPI -- a new bindings generator In-Reply-To: <20070925142853.GB16431@arbutus.physics.mcmaster.ca> References: <20070925142853.GB16431@arbutus.physics.mcmaster.ca> Message-ID: <5b8d13220709250931g20df8645m75c4505f32803daf@mail.gmail.com> > Looks interesting, esp. for C++ wrappers that Pyrex can't do. > Also, it uses Waf (http://code.google.com/p/waf/) instead of distutils, > which looks like an interesting alternative. Note that waf is more or less a fork of scons, which I suggested using inside distutils a few days ago :) Waf was started as a project to build kde, using scons code and forking it, but has a much smaller userbase than scons, which is the main reason why I did not look too carefully at it (if you want to see how this work for numpy, you can take a look at the branch numpy.scons). cheers, From David.Sallis at noaa.gov Tue Sep 25 16:20:39 2007 From: David.Sallis at noaa.gov (David E. Sallis) Date: Tue, 25 Sep 2007 15:20:39 -0500 Subject: [Numpy-discussion] MD5 signature for Numpy distro file Message-ID: <46F96D97.8000207@noaa.gov> I originally posted this to the list without joining and the moderator bounced it (I know it may still be posted, so sorry for the repetition). My security officer and sysadmins require MD5 checksums (or some other digital signature) for every piece of software that we request for installation, and I can't find such in any of the Numpy web pages or repositories. Is that available in a place I haven't looked or can the Numpy author(s) provide one? Many thanks. -- David E. 
Sallis, Software Architect General Dynamics Information Technology NOAA Coastal Data Development Center Stennis Space Center, Mississippi 228.688.3805 david.sallis at gdit.com david.sallis at noaa.gov -------------------------------------------- "Better Living Through Software Engineering" -------------------------------------------- From millman at berkeley.edu Tue Sep 25 16:40:29 2007 From: millman at berkeley.edu (Jarrod Millman) Date: Tue, 25 Sep 2007 13:40:29 -0700 Subject: [Numpy-discussion] MD5 signature for Numpy distro file In-Reply-To: <46F96D97.8000207@noaa.gov> References: <46F96D97.8000207@noaa.gov> Message-ID: Hey David, I have added checksums to the release notes for NumPy 1.0.3.1: https://sourceforge.net/project/shownotes.php?group_id=1369&release_id=533615 Please let me know if this works. I will add checksums to the release notes of SciPy 0.6.0 later today. Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From David.Sallis at noaa.gov Tue Sep 25 16:56:10 2007 From: David.Sallis at noaa.gov (David E. Sallis) Date: Tue, 25 Sep 2007 15:56:10 -0500 Subject: [Numpy-discussion] MD5 signature for Numpy distro file In-Reply-To: References: <46F96D97.8000207@noaa.gov> Message-ID: <46F975EA.90304@noaa.gov> Jarrod Millman said the following on 9/25/2007 3:40 PM: > Hey David, > > I have added checksums to the release notes for NumPy 1.0.3.1: > https://sourceforge.net/project/shownotes.php?group_id=1369&release_id=533615 > > Please let me know if this works. I will add checksums to the release > notes of SciPy 0.6.0 later today. Jarrod, outstanding. This is just what we needed. Many thanks! -- David E. Sallis, Software Architect General Dynamics Information Technology NOAA Coastal Data Development Center Stennis Space Center, Mississippi 228.688.3805 david.sallis at gdit.com david.sallis at noaa.gov -------------------------------------------- "Better Living Through Software Engineering" -------------------------------------------- From dirk.zickermann at googlemail.com Wed Sep 26 12:05:46 2007 From: dirk.zickermann at googlemail.com (Dirk Zickermann) Date: Wed, 26 Sep 2007 18:05:46 +0200 Subject: [Numpy-discussion] matplotlib - representation of nan values in 2D Message-ID: <511d47bb0709260905n5375601bxca64b829adbef38d@mail.gmail.com> Dear pylab group, for the represenation of 2d measurement data I use the contourplot function from matplotlib. Some points in this map are not measurabel, therefore I get a non numerical value (nan) output. >From this data I want to generate a map and a histogram plot. This works fine, as long as I use regular matrix/array data structure without any voids. My questions are: (1) How can I make use of plotting data with NAN values as contour and asigns such values eg as black points? (2) How can I use the matplotlib hist() function with this data, that also include such missing data points? Maybe there is an easy workaround for this. Thanks a lot for your support, Dirk (python2.51/matplotlib-0.90.1, win32) my code: ... import matplotlib ... my2dData=[[1,2,3,4,5.0 ,NaN,7,8,9,10],[10,9,8,7,6,5,4,3,2,1]] ... figure(1) imshow(my2dData) pylab.show() .. -------------- next part -------------- An HTML attachment was scrubbed... 
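A numpy-side approach to Dirk's NaN question above (a sketch, not from this thread; the small array below is made up for illustration) is to mask the unmeasurable points before plotting or histogramming. Matplotlib's image and contour functions generally leave masked cells unpainted, and the masked array's compressed() method returns only the valid values, which is what a histogram needs:

    import numpy as np
    import numpy.ma as ma

    data = np.array([[1.0, 2.0, np.nan, 4.0],
                     [5.0, np.nan, 7.0, 8.0]])      # NaN marks points that could not be measured

    masked = ma.masked_where(np.isnan(data), data)  # mask the invalid entries
    valid = masked.compressed()                     # 1-d array of the measured values only

The maskedarray package announced further down in this archive adds masked_invalid and fix_invalid helpers for exactly this situation.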
URL: From robert.kern at gmail.com Wed Sep 26 12:57:36 2007 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 26 Sep 2007 11:57:36 -0500 Subject: [Numpy-discussion] matplotlib - representation of nan values in 2D In-Reply-To: <511d47bb0709260905n5375601bxca64b829adbef38d@mail.gmail.com> References: <511d47bb0709260905n5375601bxca64b829adbef38d@mail.gmail.com> Message-ID: <46FA8F80.4050105@gmail.com> Dirk Zickermann wrote: > Dear pylab group, Sorry, this isn't the matplotlib mailing list. You want this one: https://lists.sourceforge.net/lists/listinfo/matplotlib-users -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From dirk.zickermann at googlemail.com Wed Sep 26 13:05:20 2007 From: dirk.zickermann at googlemail.com (Dirk Zickermann) Date: Wed, 26 Sep 2007 19:05:20 +0200 Subject: [Numpy-discussion] matplotlib - representation of nan values in 2D In-Reply-To: <46FA8F80.4050105@gmail.com> References: <511d47bb0709260905n5375601bxca64b829adbef38d@mail.gmail.com> <46FA8F80.4050105@gmail.com> Message-ID: <511d47bb0709261005x24630940r74d06ea06752409f@mail.gmail.com> sorry and thanks for that hint! Dirk 2007/9/26, Robert Kern : > > Dirk Zickermann wrote: > > Dear pylab group, > > Sorry, this isn't the matplotlib mailing list. You want this one: > > https://lists.sourceforge.net/lists/listinfo/matplotlib-users > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma > that is made terrible by our own mad attempt to interpret it as though it > had > an underlying truth." > -- Umberto Eco > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jdh2358 at gmail.com Wed Sep 26 13:40:29 2007 From: jdh2358 at gmail.com (John Hunter) Date: Wed, 26 Sep 2007 12:40:29 -0500 Subject: [Numpy-discussion] adding field to rec array Message-ID: <88e473830709261040g65cfa145w464c94b891c16591@mail.gmail.com> I have a record array r and I want to add a new field to it. I have been looking at setfield but I am not sure how to use it for this purpose. Eg # r is some npy record array N = len(r) x = npy.zeros(N) # add array of floats x to r with dtype name 'jdh' and type ' References: <88e473830709261040g65cfa145w464c94b891c16591@mail.gmail.com> Message-ID: <46FA9FF5.7070108@gmail.com> John Hunter wrote: > I have a record array r and I want to add a new field to it. I have > been looking at setfield but I am not sure how to use it for this > purpose. Eg > > # r is some npy record array > N = len(r) > x = npy.zeros(N) > # add array of floats x to r with dtype name 'jdh' and type ' > Any suggestions? Here is the straightforward way: In [15]: import numpy as np In [16]: dt = np.dtype([('foo', int), ('bar', float)]) In [17]: r = np.zeros((3,3), dtype=dt) In [18]: r Out[18]: array([[(0, 0.0), (0, 0.0), (0, 0.0)], [(0, 0.0), (0, 0.0), (0, 0.0)], [(0, 0.0), (0, 0.0), (0, 0.0)]], dtype=[('foo', ' All, The latest version of maskedarray has just been released on the scipy SVN sandbox. This version fixes the inconsistencies in filling (see below) and introduces some minor modifications for optimization purposes (see below as well). 
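Returning to John's question about adding a field to an existing record array: one straightforward way (a sketch, not from the thread; the field name 'jdh' follows his example, and the starting array is made up) is to build a new dtype with the extra column and copy the old fields across:

    import numpy as np

    r = np.zeros(5, dtype=[('foo', '<i4'), ('bar', '<f8')])  # stand-in for the existing record array
    x = np.zeros(len(r))                                      # the new column of floats

    new_dt = np.dtype(r.dtype.descr + [('jdh', '<f8')])       # old fields plus the new one
    r2 = np.empty(r.shape, dtype=new_dt)
    for name in r.dtype.names:                                # copy the existing columns field by field
        r2[name] = r[name]
    r2['jdh'] = x                                             # fill the added column

If attribute-style access is wanted, r2 can then be viewed as a recarray with r2.view(np.recarray).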
Many thanks to Eric Firing and Matt Knox for the fruitful discussions at the origin of this release! In addition, a bench.py file has been introduced, to compare the speed of numpy.ma and maskedarray. Once again, thanks to Eric for his first draft. Please feel free to try it and send me some feedback. Modifications: * Consistent filling ! In numpy.ma, the division of array A by array B works in several steps: - A is filled w/ 0 - B is filled w/ 1 - A/B is computed - the output mask is updated as the combination of A.mask, B.mask and the domain mask (B==0) The problems with this approach are that (i) it's not useful to fill A and B beforehand if the values will be masked anyway; (ii) nothing prevents infs to show up, as the domain is taken into account at the end only. In this latest version of maskedarray, the same division is decomposed as: - a copy of B._data is filled with 1 with the domain (B==0) - the division of A._data by this copy is computed - the output mask is updated as the combination of A.mask, B.mask and the domain mask (B==0). Prefilling on the domain avoids the presence of nans/infs. However, this comes with the price of making some functions and methods slower than their numpy.ma counterparts, as you'll be able to observe for sqrt and log with the bench.py file. An alternative would be to avoid filling at all, at the risk of leaving nans and infs. * masked_invalid / fix_invalid Two new functions are introduced. masked_invalid(x) masks x where x is nan or inf. fix_invalid(x) returns (a copy of) x, where invalid values (nans & infs) are replaced by fill_value. * No mask shrinking Following Paul Dubois and Sasha's example, I eventually had to get rid of the semi-automatic shrinking of the mask in __getitem__, which appeared to be a major bottleneck. In other words, one can end up with an array full of False instead of nomask, which may slow things down a bit. You can force a mask back to nomask with the new shrink_mask method. *_sharedmask Here again, I followed Paul and Sasha's ideas and reintroduce the _sharedmask flag to prevent inadequate propagation of the mask. When creating a new array with x=masked_array(data, mask=m), x._mask is initially a reference to m and x._sharedmask is True. When x is modified, x._mask is copied to prevent a propagation back to m. From dencheva at stsci.edu Fri Sep 28 14:07:43 2007 From: dencheva at stsci.edu (Nadia Dencheva) Date: Fri, 28 Sep 2007 14:07:43 -0400 Subject: [Numpy-discussion] indexing bug? Message-ID: <46FD42EF.3020306@stsci.edu> Is indexing with floats really allowed in numpy? >>> a=numpy.array([1,2,3,4]) >>> a[2.99] 3 Nadia Dencheva From lou_boog2000 at yahoo.com Fri Sep 28 14:59:33 2007 From: lou_boog2000 at yahoo.com (Lou Pecora) Date: Fri, 28 Sep 2007 11:59:33 -0700 (PDT) Subject: [Numpy-discussion] indexing bug? In-Reply-To: <46FD42EF.3020306@stsci.edu> Message-ID: <235015.97371.qm@web34405.mail.mud.yahoo.com> Looks like it truncates to an int. But I wouldn't do it especially if you use floating operations (+,*,/, etc.) since what you get may not truncate to the integer you expect. Remember floats are approximate representations for many rational numbers. Stick with integers, IMHO. --- Nadia Dencheva wrote: > Is indexing with floats really allowed in numpy? > > > >>> a=numpy.array([1,2,3,4]) > >>> a[2.99] > 3 -- Lou Pecora, my views are my own. ____________________________________________________________________________________ Be a better Globetrotter. Get better travel answers from someone who knows. Yahoo! 
Answers - Check it out. http://answers.yahoo.com/dir/?link=list&sid=396545469 From dencheva at stsci.edu Fri Sep 28 15:07:30 2007 From: dencheva at stsci.edu (Nadia Dencheva) Date: Fri, 28 Sep 2007 15:07:30 -0400 Subject: [Numpy-discussion] indexing bug? In-Reply-To: <235015.97371.qm@web34405.mail.mud.yahoo.com> References: <235015.97371.qm@web34405.mail.mud.yahoo.com> Message-ID: <46FD50F2.6050609@stsci.edu> This should return an error and not silently truncate to int. Nadia Lou Pecora wrote: > Looks like it truncates to an int. But I wouldn't do > it especially if you use floating operations (+,*,/, > etc.) since what you get may not truncate to the > integer you expect. Remember floats are approximate > representations for many rational numbers. Stick with > integers, IMHO. > > --- Nadia Dencheva wrote: > >> Is indexing with floats really allowed in numpy? >> >> >> >>> a=numpy.array([1,2,3,4]) >> >>> a[2.99] >> 3 > > > > -- Lou Pecora, my views are my own. > > > > ____________________________________________________________________________________ > Be a better Globetrotter. Get better travel answers from someone who knows. Yahoo! Answers - Check it out. > http://answers.yahoo.com/dir/?link=list&sid=396545469 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion From stefan at sun.ac.za Fri Sep 28 16:23:42 2007 From: stefan at sun.ac.za (Stefan van der Walt) Date: Fri, 28 Sep 2007 22:23:42 +0200 Subject: [Numpy-discussion] indexing bug? In-Reply-To: <46FD50F2.6050609@stsci.edu> References: <235015.97371.qm@web34405.mail.mud.yahoo.com> <46FD50F2.6050609@stsci.edu> Message-ID: <20070928202342.GB32704@mentat.za.net> On Fri, Sep 28, 2007 at 03:07:30PM -0400, Nadia Dencheva wrote: > This should return an error and not silently truncate to int. Why do you say that? The current behaviour is consistent and well defined: a[x] == a[int(x)] We certainly can't change it now (just imagine all the code out there that will break); but I personally don't think it is a big problem. On a somewhat related note, you may also be interested in the PEP at http://docs.python.org/whatsnew/pep-357.html Regards St?fan From jelleferinga at gmail.com Sat Sep 29 05:03:48 2007 From: jelleferinga at gmail.com (jelle) Date: Sat, 29 Sep 2007 09:03:48 +0000 (UTC) Subject: [Numpy-discussion] setting the attributes of an array of object Message-ID: Hi, I'm wondering whether i can re-write the following idiom with numpy arrays: for i in some_list: i.some_attr = some_value it would be wonderful if one was able to write this idiom as arr[just_these].some_attr = some_value or setattr(arr[just_these], 'some_attr', some_value) since often expensive loops through lists of object could be avoided. any thoughts on this are much appreciated, thanks, -jelle From dencheva at stsci.edu Sat Sep 29 15:21:41 2007 From: dencheva at stsci.edu (Nadezhda Dencheva) Date: Sat, 29 Sep 2007 15:21:41 -0400 (EDT) Subject: [Numpy-discussion] indexing bug? Message-ID: <20070929152141.CPO44345@comet.stsci.edu> >> This should return an error and not silently truncate to int. > >Why do you say that? I see now this was a Friday afternoon lack of imagination :) The current behaviour is consistent and well >defined: > >On a somewhat related note, you may also be interested in the PEP at > >http://docs.python.org/whatsnew/pep-357.html > Interesting. Thanks for explaining this. Nadia
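To make the indexing behaviour concrete (a small sketch, not from the thread, assuming Python 2.5 or later): under PEP 357 any object that implements __index__ can serve as an array index, whereas a plain float only works because it is silently truncated -- behaviour that later numpy releases turned into an error -- so an explicit conversion is the safe spelling:

    import numpy as np

    a = np.array([1, 2, 3, 4])

    class Two(object):
        def __index__(self):    # PEP 357 hook: "this object can act as an integer index"
            return 2

    print(a[Two()])             # 3 -- the object is converted through __index__
    print(a[int(2.99)])         # 3 -- explicit truncation, no surprises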